Saturday, December 20, 2025

How to integrate standard LLMs with custom data to create RAG applications

 Integrating standard Large Language Models (LLMs) with custom data to build Retrieval-Augmented Generation (RAG) applications involves a multi-stage pipeline: ingestion, retrieval, and generation. This process enables the LLM to access and utilize information not present in its original training data [1, 2]. 

Here is a step-by-step guide on how to create RAG applications: 
1. Data Preparation and Ingestion 
The first step is to get your custom data ready for the system to read and understand [1, 2]. 
  • Load and Parse Data: Collect your custom data from various sources (e.g., PDFs, websites, databases). Use a data loading library (like LangChain or LlamaIndex) to ingest and format the data into a usable structure [2].
  • Chunking: LLMs and vector databases have limits on the amount of text they can process at once. Divide your data into smaller, manageable "chunks" while maintaining sufficient context (e.g., paragraphs or a few sentences) [1, 2].
  • Embedding: Convert each text chunk into a numerical representation called a vector embedding using an embedding model (e.g., OpenAI's text-embedding-ada-002, or open-source models like sentence-transformers). These embeddings capture the semantic meaning of the text [2].
  • Indexing: Store these vector embeddings in a specialized database, a vector store (e.g., Pinecone, Weaviate, Chroma, or pgvector). This database is optimized for quick similarity searches [1, 2]. 
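The chunking step above can be sketched in plain JavaScript. This is only a minimal illustration, not how LangChain or LlamaIndex implement their splitters: the function name chunkText and the chunkSize and overlap parameters are hypothetical, and it splits on raw character counts rather than on sentence or paragraph boundaries.

```javascript
// Hypothetical helper: split text into fixed-size chunks that overlap,
// so content cut at a boundary still appears intact in a neighboring chunk.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

// Usage: 10-character windows that share 3 characters with their neighbor.
const sample = "abcdefghijklmnopqrstuvwxyz";
const chunks = chunkText(sample, 10, 3);
console.log(chunks);
```

The overlap is a trade-off: a larger overlap keeps more context together at the cost of storing and embedding more duplicated text.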
2. Retrieval 
When a user asks a question, the RAG system needs to find the most relevant information from your custom data [1, 2]. 
  • Embed User Query: The incoming user question is converted into a vector embedding using the same embedding model used during ingestion [2].
  • Vector Search: The system performs a similarity search in the vector store to find the top K (e.g., top 4) data chunks whose embeddings are most similar to the user query embedding [1].
  • Retrieve Context: The actual text content of the most relevant chunks is retrieved [2]. 
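To make the similarity search concrete, here is a small sketch that ranks stored chunks by cosine similarity against a query embedding. It assumes an in-memory array instead of a real vector store, and the names cosineSimilarity, topK, and store are hypothetical. Production vector databases typically use approximate nearest-neighbor indexes rather than this brute-force scan.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical brute-force search: score every stored chunk, keep the best k.
function topK(queryEmbedding, store, k = 4) {
  return store
    .map((item) => ({
      ...item,
      score: cosineSimilarity(queryEmbedding, item.embedding),
    }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}

// Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
const store = [
  { text: "Refund policy", embedding: [1, 0, 0] },
  { text: "Shipping times", embedding: [0, 1, 0] },
  { text: "Returns and refunds", embedding: [0.9, 0.1, 0] },
];
const results = topK([1, 0, 0], store, 2);
console.log(results.map((r) => r.text));
```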
3. Generation 
The retrieved context is then combined with the original user query and sent to the LLM to generate an informed answer [1, 2]. 
  • Prompt Construction: A prompt is dynamically created for the LLM. This prompt typically includes a set of instructions, the user's question, and the retrieved context [1].
  • LLM Generation: The constructed prompt is sent to a standard LLM (e.g., GPT-4, Llama 3). The LLM uses the provided context to formulate an accurate and relevant answer, ensuring the response is grounded in your custom data rather than just its internal knowledge [2].
  • Response to User: The final, generated answer is delivered to the user. 
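The prompt construction step might look like the sketch below. The helper name buildRagPrompt and the exact instruction wording are assumptions chosen for illustration; real applications usually tune the instructions and the way context is formatted.

```javascript
// Hypothetical helper: combine instructions, retrieved chunks, and the
// user's question into a single prompt string for the LLM.
function buildRagPrompt(question, contextChunks) {
  // Number each chunk so the model (and a reader) can tell them apart.
  const context = contextChunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join("\n");
  return [
    "Answer the question using only the context below.",
    "If the context does not contain the answer, say you don't know.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const prompt = buildRagPrompt("What is the refund window?", [
  "Refunds are accepted within 30 days of purchase.",
  "Shipping takes 3 to 5 business days.",
]);
console.log(prompt);
```

Telling the model to admit when the context lacks the answer is one common way to keep responses grounded in the retrieved data.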
Tools and Platforms 
Several frameworks and platforms streamline the development of RAG applications: 
  • Frameworks: Libraries like LangChain and LlamaIndex provide abstractions and pre-built components for managing the entire RAG pipeline [2].
  • Vector Databases: Specialized databases for storing and searching vector embeddings include Pinecone, Weaviate, Chroma, and Qdrant [1].
  • Cloud Platforms: Major cloud providers offer managed services that simplify RAG implementation, such as Amazon Bedrock, Google Cloud Vertex AI, and Azure AI Studio [2]. 

How to integrate LLMs with backend using LangChain.js

Integrating LLMs with a backend using LangChain.js involves creating a Node.js API that uses the LangChain library to communicate with an LLM provider (like OpenAI), process requests, and return responses to the client. The backend acts as a secure intermediary between your frontend application and the LLM API. [1, 2]

Core Concepts

LangChain.js simplifies the process by providing abstractions and components:
  • LLMs & Chat Models: Classes for connecting to various language models (e.g., ChatOpenAI).
  • Prompt Templates: Reusable structures to format user input for the model.
  • Chains: Workflows that combine prompts, models, and other logic into a single sequence of calls.
  • Memory: Components that allow chains to remember past interactions for conversational context. [8, 9, 10]
Step-by-Step Integration Guide (Node.js/Express Backend) [11]

This guide assumes you have a Node.js project initialized and a frontend (e.g., React) that sends requests to your backend API. [12, 13, 14]

1. Set Up Your Backend Project
Initialize your Node.js project and install necessary packages: [15, 16, 17, 18]
mkdir llm-backend
cd llm-backend
npm init -y
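# Use ES modules so that import statements work in Node.js
npm pkg set type=module
npm install express cors dotenv langchain @langchain/core @langchain/openai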


2. Secure Your API Key

Store your LLM provider's API key in a .env file in the project root, and keep that file out of version control: [1]

# .env file
OPENAI_API_KEY="your_api_key_here"

3. Define the LLM Logic in the Backend
Create a file (e.g., llmService.js) to handle the LangChain logic. This code defines how the prompt is structured and sent to the LLM. [1, 10]

// llmService.js
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
import * as dotenv from "dotenv";

dotenv.config();

// Initialize the model (the API key is read from the environment)
const model = new ChatOpenAI({
  temperature: 0.7, // Higher values give more varied output
  apiKey: process.env.OPENAI_API_KEY,
});

// Define a prompt template
const promptTemplate = new PromptTemplate({
  template: "Generate a fun fact about {topic}",
  inputVariables: ["topic"],
});

// Create a chain: prompt -> model -> plain string.
// pipe() is used here instead of the older LLMChain class, which is deprecated.
export const factChain = promptTemplate
  .pipe(model)
  .pipe(new StringOutputParser());


4. Create a Backend API Endpoint
Create your main server file (e.g., server.js) using Express to receive requests from the frontend and interact with the factChain. [1, 15]

// server.js
import express from 'express';
import cors from 'cors';
import { factChain } from './llmService.js';

const app = express();
const port = 3001;

// Enable CORS and parse JSON bodies
app.use(cors());
app.use(express.json());

// API endpoint to process user requests
app.post('/api/generate-fact', async (req, res) => {
  // Fall back to an empty object if no JSON body was sent
  const { topic } = req.body ?? {};

  if (typeof topic !== 'string' || !topic.trim()) {
    return res.status(400).json({ error: 'Topic is required' });
  }

  try {
    // Run the chain with the user input; the output parser returns a string
    const fact = await factChain.invoke({ topic });
    res.json({ fact });
  } catch (error) {
    console.error("Error calling LangChain:", error);
    res.status(500).json({ error: 'Failed to generate fact' });
  }
});

app.listen(port, () => {
  console.log(`Backend server listening at http://localhost:${port}`);
});


5. Integrate with Your Frontend
From your frontend application (e.g., in a React component), you can use fetch or axios to make a POST request to the backend endpoint: [19, 20, 21]

// Example Frontend JS (runs in the browser)
const getFact = async (topic) => {
  const response = await fetch('http://localhost:3001/api/generate-fact', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ topic }),
  });

  const data = await response.json();
  console.log(data.fact);
};

// Usage:
getFact('the moon');

Wednesday, December 17, 2025

Adding new files to remote github repository

 To add new files to a remote GitHub repository, you can use the command line (Git Bash, Terminal) for a local workflow or the GitHub website for a quick web interface upload. [1, 2]


Method 1: Using the Command Line (Recommended for projects)

This method assumes you have Git installed, the repository cloned locally, and are in the local repository's directory.
  1. Move the new file(s) into your local repository's directory.
  2. Stage the files for the next commit using the command:
    • To add a specific file: git add <filename>
    • To add all new and modified files: git add .
  3. Commit the staged files to your local repository with a descriptive message: git commit -m "Add new files"
  4. Push the changes from your local repository to the remote GitHub repository: git push origin main
    • The default branch name is typically main or master. If it's your first push, you might use git push -u origin main to set the upstream branch. [1, 4, 5, 6, 7]
Method 2: Using the GitHub Web Interface (For small additions)

This method is useful for adding a few small files without using the command line.
  1. Navigate to your repository on the GitHub website.
  2. Above the list of files, select the Add file dropdown menu and click Upload files.
  3. Drag and drop your file or folder into the browser window, or click choose your files to browse your local machine.
  4. Type a short, meaningful commit message in the "Commit changes" field at the bottom of the page.
  5. Click Commit changes (or Propose changes if you are working on a new branch) to finalize the upload. [1, 8]
The actions and results:



Error you get when running "npm create vite@latest" in VS Code: "npm.ps1 cannot be loaded because running scripts is disabled on this system."

 

The error "running scripts is disabled on this system" occurs because the PowerShell execution policy, a Windows security feature, is set to Restricted by default. To fix this, you need to change the execution policy to RemoteSigned for the current user, which allows your locally created scripts to run while still requiring downloaded scripts to be digitally signed. [1, 2, 3]


Steps to Resolve the Error
  1. Open PowerShell as Administrator.
    • Search for "PowerShell" in the Windows Start menu.
    • Right-click on Windows PowerShell (or PowerShell).
    • Select Run as administrator.
  2. Change the execution policy for the current user.
    • In the Administrator PowerShell window, run the following command: Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
  3. Confirm the change.
    • You will be prompted to confirm if you want to change the execution policy. Type Y and press Enter.
  4. Verify the new policy (Optional).
    • You can check the current effective execution policy by running: Get-ExecutionPolicy
    • The output should be RemoteSigned.
  5. Run your npm command again.
    • You should now be able to run your commands without encountering the script execution error. You may need to restart your terminal or VS Code for the changes to take effect. [2, 4, 5, 6, 7]
For more detailed information on PowerShell execution policies, you can refer to the official Microsoft Learn documentation. [3]



The error:

The solution:

The effect







How to link my local git repo with VS code IDE?

 To link your local Git repository with VS Code, you need to first ensure Git is installed and initialized in your project folder, and then use the integrated Source Control features in VS Code. [1, 2, 3, 4]


Here is a step-by-step guide:

Prerequisites
  • Install Git on your computer.
  • Install VS Code.
Step 1: Open your project folder in VS Code

Launch VS Code and open the folder that contains your project files. You can do this in two ways:
  • Go to File > Open Folder... and select your project directory.
  • Open your terminal, navigate to your project directory, and type: code . [5, 6, 7, 8, 9]
Step 2: Initialize Git in your project folder (if not already done) [10, 11]

If you haven't already set up a Git repository in this folder, you need to initialize it:
  1. Open the integrated terminal in VS Code by going to Terminal > New Terminal (or pressing Ctrl + `).
  2. In the terminal, run the command: git init
  3. This command creates a hidden .git folder in your project directory, which tracks changes. [13, 14, 15, 16, 17]
Step 3: Use the Source Control panel in VS Code [18, 19]

VS Code automatically detects the initialized Git repository and integrates it into its Source Control UI:
  1. Click the Source Control icon in the Activity Bar on the left side (it looks like a three-pronged fork with circles).
  2. You will now see all your local file changes listed in the Source Control panel. [23, 24, 25]
Step 4: Make your first commit (optional but recommended)
  1. In the Source Control panel, hover over the Changes section and click the + (Stage All Changes) button to stage all your current files for commit.
  2. Type a commit message in the text box above the changes list (e.g., "Initial commit").
  3. Click the Commit button (checkmark icon) or press Ctrl + Enter to record these changes locally. [29, 30, 31, 32, 33]
Step 5: Link to a remote repository (e.g., GitHub, GitLab, Bitbucket) [34, 35, 36, 37, 38]

If you want to sync your local repository with an online service (like GitHub):
  1. Create a new, empty repository on your preferred hosting service (e.g., GitHub).
  2. Copy the remote URL for that new repository (usually an HTTPS URL).
  3. In your VS Code terminal, link your local repository to the remote one using the command: git remote add origin <REMOTE_URL>
     (Replace <REMOTE_URL> with the actual URL.)
  4. Push your local commits to the online repository using the command: git push -u origin main
     (Depending on your hosting service, the default branch might be called master instead of main.) [39, 40, 41, 42, 43]
Your local VS Code environment is now fully linked and synchronized with both your local Git tracking and a remote online repository. [44, 45, 46]

How to bring changes in your Github remote repository to your local git repository?

 To bring changes from a remote GitHub repository into your local Git repository, you can use git pull, which combines git fetch and git merge, or run git fetch and git merge (or git rebase) as separate steps. [1, 2, 3]


Here is a step-by-step guide:

Prerequisites
  • Ensure you have a local Git repository already cloned from the remote GitHub repository.
  • Make sure your local repository has the correct remote configured (usually named origin). You can check this using: git remote -v [4, 5, 6, 7, 8]
Steps to Update Your Local Repository

The most common workflow involves switching to your main development branch (often main or master) and pulling the latest changes. [9, 10, 11, 12]

1. Navigate to your local repository directory
Open your terminal or command prompt and change the current directory to your project folder. [13, 14, 15, 16]

2. Ensure your working directory is clean
Before pulling changes, it's best practice to commit or stash any local changes you have made. [17, 18]

3. Switch to the target branch
Switch to the branch you want to update (e.g., main): git checkout main [19, 20, 21]

4. Pull the changes from GitHub
Run git pull origin main. The git pull command is a shortcut that performs a git fetch (downloads the changes) followed by a git merge (integrates the changes into your current branch).
  • origin is the default name for your remote GitHub repository.
  • main is the name of the branch you are pulling from. [27, 28, 29]
This command fetches the changes from the remote branch and automatically merges them into your local branch, updating your local files to match the GitHub repository's current state. [30, 31, 32, 33]

Alternative: Fetch and Merge Manually (More Control) [34]

If you prefer more control over the integration process, you can separate the steps:

1. Fetch the remote changes
Run git fetch origin. This downloads the changes into your local repository without immediately updating your working files. The changes are stored in a "remote-tracking" branch (e.g., origin/main). [35, 36, 37, 38, 39]

2. Merge the fetched changes into your current branch
Once the changes are fetched, you can integrate them into your current local branch using git merge: git merge origin/main [40]

This approach allows you to inspect the incoming changes before applying them locally, which can be useful in complex workflows. [41, 42]

Saturday, December 13, 2025

High Level Design (system design) of Instagram.

 Flow chart of an instant messaging app like Instagram. 

SRS (Software Requirements Specification) of the Instagram newsfeed:




How to clone your remote GitHub repository to your local git:



Follow these steps:
  1. Generate a new SSH key (if needed): In Git Bash, run the following command, replacing the text in quotes with the primary email address of your GitHub account:
    $ ssh-keygen -t ed25519 -C "your primary email at GitHub"

  2. When prompted to "Enter a file in which to save the key", just press Enter to accept the default location. You can also press Enter for an empty passphrase.
  3. Add your public key to GitHub:
    • Copy the content of your public key file to your clipboard. On Windows (Git Bash), use this command:
       $ clip < ~/.ssh/id_ed25519.pub 
    • In your browser, go to the GitHub repository that you want to clone.
      Click the 'Settings' tab and, in the left-hand menu, scroll down to "Deploy keys".
    • Click Add deploy key, give it a descriptive title, and paste the copied key into the "Key" field. Click Add key.
  4. Make sure your SSH agent is running and has the key loaded. Search for Git in the search bar in the taskbar of your PC and, in the pop-up window, choose "Run as administrator". Then, in the Git Bash console, cd to your project folder and run these two commands:
    First start the 'ssh-agent' in the background:
    $ eval "$(ssh-agent -s)"
    Then add your private SSH key to the agent: 
    $ ssh-add ~/.ssh/id_ed25519

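  5. To clone your remote GitHub repository into your local project directory, cd to your project directory and run this command. Use the repository's SSH URL (it starts with git@github.com:), because the key you added authenticates over SSH: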
    $ git clone <URL of your remote repo copied from GitHub>



Can anyone edit my public github repository?

 By default, no, a random person cannot directly edit (push changes to) your public GitHub repository. Only the repository owner and explicitly added collaborators have direct push access. 


GitHub is designed around a collaboration model that requires permission to modify a repository's main codebase. 

How Others Can Contribute (Standard Process)

The standard way for external users to contribute to a public repository without direct edit permissions is through the fork and pull request workflow:
  • Forking: A user creates their own copy (a fork) of your repository in their GitHub account.
  • Making Changes: They make their edits, commits, and additions within their own fork, where they have full control.
  • Proposing Changes (Pull Request): Once they are ready, they send a "pull request" to your original repository.
  • Review and Merge: You, as the repository owner, then review the proposed changes and decide whether to accept and "merge" them into your main repository's codebase. 
Granting Direct Edit Access

If you want specific individuals to be able to push changes directly to your repository without requiring pull requests, you must explicitly add them as collaborators (for personal accounts) or as members of a team with appropriate permissions (for organization accounts). 

To add a collaborator:
  1. Navigate to the main page of your repository.
  2. Click on the Settings tab.
  3. In the left sidebar, click on Collaborators & teams (or Manage access).
  4. Under "Manage access", click Add people.
  5. Search for their GitHub username and select the appropriate permission level (e.g., "Write").
Special Cases
  • Wikis: By default, only collaborators can edit a public repository's wiki, but you can change a setting to allow anyone with a GitHub account to edit it.
  • Branch Protection Rules: You can set up rules to enforce specific checks (like requiring reviews or signed commits) even for collaborators on important branches (like the default branch), adding another layer of control over the editing process.
