Saturday, December 27, 2025

What are the best RAG strategies?

The best RAG strategies focus on improving data quality, smarter retrieval, and better context handling. Key techniques include context-aware chunking, hybrid search (keyword + vector), reranking of top results, query expansion/rewriting, and metadata filtering, often combined in architectures like Agentic RAG or Graph RAG to reduce hallucinations and boost accuracy on complex, real-world queries. Start simple (chunking, reranking) and progressively add complexity such as hybrid search and agents for multi-hop questions. [1, 2, 3, 4]



Foundational Strategies (Start Here)
  • Context-Aware Chunking: Don't just split by fixed length; use sentence/paragraph boundaries or semantic chunking to keep related ideas together, potentially with overlap (sliding window).
  • Reranking: Use a more capable model to reorder the initial top results from the vector store for better relevance before sending them to the LLM (see the sketch after this list).
  • Data Cleaning & Metadata: Remove noise, fix errors, and use metadata (dates, types) for effective filtering to narrow down search results. [1, 3, 5, 6, 7, 8]
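
To make the reranking idea concrete, here is a minimal JavaScript sketch. Everything in it is an illustration: scoreRelevance is a hypothetical callback standing in for whatever cross-encoder or LLM-based judge you actually use.

// Minimal reranking sketch: score each top-k candidate from the vector store
// against the query with a stronger model, then reorder and trim.
// `scoreRelevance(query, text)` is a hypothetical async helper that returns a
// numeric relevance score (e.g., from a cross-encoder or an LLM judge).
async function rerank(query, candidates, scoreRelevance, keep = 5) {
  const scored = await Promise.all(
    candidates.map(async (doc) => ({
      doc,
      score: await scoreRelevance(query, doc.pageContent),
    }))
  );
  return scored
    .sort((a, b) => b.score - a.score) // highest relevance first
    .slice(0, keep)                    // keep only the best few for the prompt
    .map((item) => item.doc);
}
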
Intermediate Strategies (Improve Retrieval)
  • Hybrid Search: Combine sparse (keyword, BM25) and dense (vector) retrieval to capture both exact terms and semantic meaning (see the sketch after this list).
  • Query Expansion/Rewriting: Use the LLM to generate alternative queries or hypothetical documents (HyDE) to cover phrasing gaps.
  • Parent Document Retrieval: Index small child chunks for precise matching, but return the larger parent document or section to the LLM for fuller context in large documents. [2, 6, 8, 9, 10]
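
As a concrete illustration of hybrid search, the sketch below merges a keyword ranking and a vector ranking with Reciprocal Rank Fusion (RRF). The document IDs and the constant k = 60 are illustrative assumptions; a real system would obtain the two rankings from BM25 and a vector store.

// Hybrid search sketch: fuse a keyword (BM25) ranking and a vector ranking
// with Reciprocal Rank Fusion. Inputs are arrays of document IDs ordered
// from most to least relevant; k = 60 is a commonly used RRF constant.
function reciprocalRankFusion(keywordRanking, vectorRanking, k = 60) {
  const scores = new Map();
  for (const ranking of [keywordRanking, vectorRanking]) {
    ranking.forEach((id, rank) => {
      // Documents near the top of either list earn the largest contribution.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "doc3" ranks well in both lists, so it ends up first overall.
console.log(reciprocalRankFusion(["doc1", "doc3", "doc2"], ["doc3", "doc4", "doc1"]));
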

Advanced Strategies (Complex Use Cases)
  • Agentic RAG/Multi-Agent: Employ agents to break down complex, multi-step questions, use multiple tools (like search, graph lookups), and verify answers.
  • Graph RAG: Use Knowledge Graphs for structured data and relationships, ideal for complex domains like finance or medicine.
  • Context Distillation: Summarize retrieved chunks to fit more relevant information into the LLM's context window (see the sketch after this list).
  • Fine-Tuning: Fine-tune embedding models or the LLM itself for specialized domain language or specific output formats (e.g., compliance). [1, 2, 4, 8]
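
As one concrete example, here is a minimal context-distillation sketch using the same LangChain.js Gemini integration set up later in these notes. The model name is an assumption, and it presupposes @langchain/google-genai is installed and GOOGLE_API_KEY is set.

// Context distillation sketch: compress each retrieved chunk down to only the
// facts relevant to the question before adding it to the prompt.
// Assumes @langchain/google-genai is installed and GOOGLE_API_KEY is set;
// the model name is an assumption — use whichever Gemini chat model you have.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({ model: "gemini-1.5-flash", temperature: 0 });

async function distillChunks(question, chunks) {
  const summaries = await Promise.all(
    chunks.map(async (text) => {
      const res = await model.invoke(
        `Extract only the facts from this passage that help answer "${question}". ` +
          `If nothing is relevant, reply with the single word IRRELEVANT.\n\n${text}`
      );
      return String(res.content).trim();
    })
  );
  return summaries.filter((s) => s !== "IRRELEVANT").join("\n---\n");
}
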

Key Takeaway

A robust RAG system often combines 3-5 strategies: start with solid fundamentals (chunking, reranking, data prep) and layer on more advanced techniques (hybrid search, agents) as accuracy and complexity demand. The goal is to deliver grounded, high-quality answers. [2, 3, 11]



Does it matter in which folder you pip install packages?

 




Does it matter in which folder you npm install packages?





Friday, December 26, 2025

JS prerequisites for learning React JS

This course assumes the following prerequisites, as they will not be re-explained (a short snippet exercising several of them follows the list):

  • const and let
  • Template strings
  • Arrays and Objects
  • Array methods (filter, find, etc.)
  • Array and object destructuring
  • ES Modules (import and export)
  • Dynamic imports
  • Arrow functions
  • Promises
  • The Fetch API
  • Basic experience with the DOM (Document Object Model)
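
As a quick self-check, the snippet below touches several of these prerequisites at once: const, arrow functions, destructuring, array methods, template strings, the Fetch API, and promises. The URL points at a public placeholder API and is only for illustration.

// Self-check: const, arrow functions, destructuring, array methods,
// template strings, the Fetch API, and promises all appear below.
const API_URL = "https://jsonplaceholder.typicode.com/users"; // public placeholder API

const listBizUsers = async () => {
  const response = await fetch(API_URL);       // Fetch API + await (promises)
  const users = await response.json();
  return users
    .filter(({ email }) => email.endsWith(".biz"))   // array method + destructuring
    .map(({ name, email }) => `${name} <${email}>`); // template strings
};

listBizUsers().then((lines) => lines.forEach((line) => console.log(line)));
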

Thursday, December 25, 2025

Installing LangChain.js in VS Code

Installing LangChain.js in VS Code involves setting up a Node.js project and using npm (or yarn) to add the necessary packages. The process is the same as installing any other JavaScript library.

Prerequisites

Before you begin, ensure you have the following installed on your system:
  • Node.js: LangChain.js runs on Node.js, which includes the npm (Node Package Manager) command-line tool.
  • VS Code: Your preferred code editor. [3]
Step-by-Step Installation
  1. Open VS Code and create a new project folder.
  2. Open the integrated terminal in VS Code (Terminal > New Terminal, or Ctrl+`).
  3. Initialize a new Node.js project by running npm init -y in the terminal. This creates a package.json file.
  4. Install the main LangChain.js package using npm (or yarn or pnpm):
              $ npm install langchain

     Alternatively, use yarn:
            $ yarn add langchain
  5. Install specific integrations as needed. LangChain.js uses a modular design, so install packages for the specific Large Language Models (LLMs) or tools you plan to use. For Google Gemini models:

           $ npm install @langchain/google-genai

  6. Get a Google API key: obtain a Gemini API key from Google AI Studio.

  7. Set the API key as an environment variable named GOOGLE_API_KEY (then restart the terminal so the new variable is picked up):

      > setx GOOGLE_API_KEY "your-google-gemini-api-key"     (Windows Command Prompt)

       OR

      > [Environment]::SetEnvironmentVariable("GOOGLE_API_KEY", "<YOUR_API_KEY_VALUE>", "User")     (PowerShell)

  8. For managing API keys securely through environment variables, install dotenv:

           $ npm install dotenv

  9. Configure the project for ES Modules (ESM) by adding "type": "module" to the package.json file. This allows the use of import statements:

To configure a project for ES Modules (ESM), the primary method is adding "type": "module" to your package.json, which makes all .js files use ESM syntax (import/export). Alternatively, use the .mjs extension for specific files or the --input-type=module flag for code passed directly to Node. Either way, update CommonJS patterns (such as require and module.exports) to their ESM equivalents (import and export), and replace require-based path resolution with import.meta.url or dynamic import(), as W3Schools, YouTube tutorials, and DEV Community explain. [1, 2, 3, 4]

Common Configuration Methods
  1. "type": "module" in package.json (Recommended for ESM-first)
    • Add "type": "module" to your package.json file.
    • This makes .js files ESM by default.
    • For any files that must remain CommonJS (CJS), rename them to .cjs, or use .mjs for ESM files.
// package.json
{
  "name": "my-esm-project",
  "version": "1.0.0",
  "type": "module" // <--- This line enables ESM
}

  2. .mjs Extension (For mixed projects)
    • Use the .mjs extension for individual files you want to run as ES Modules.
    • This works even if package.json is set to "type": "commonjs".
  3. Node.js Command-Line Flag
    • Use the --input-type=module flag to treat code passed via --eval or STDIN as ESM; for script files, Node goes by the .mjs extension or the package.json "type" field instead. [1, 2, 5, 6]
Code Migration Steps
  • Imports: Change const fs = require('fs')  to  import fs from 'fs' or import { readFile } from 'fs'. 
  • Exports: Replace module.exports = { ... }  with export { ... } or export default { ... } .
  • Paths: Add file extensions to relative imports, e.g., import util from './util.js' .
  • Globals: Remove __dirname and __filename; use import.meta.url to get the current module's URL and construct paths (see the sketch after this list).
  • 'use strict': Can be removed, as ESM files are strict by default. [2, 4, 6, 7]
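
To illustrate the Globals step above, here is the standard pattern for recreating __dirname and __filename inside an ES Module:

// ESM replacement for __dirname / __filename using import.meta.url.
import { fileURLToPath } from "url";
import path from "path";

const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

console.log(`This module lives in ${__dirname}`);
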
Web Browser Configuration
  • Add type="module"  to your <script>  tag to load an ES Module.
  • Use the nomodule attribute for fallback scripts in older browsers. [3]
<script type="module" src="app.js"></script>
<script nomodule src="legacy-app.js"></script>






Example Usage (TypeScript recommended)
For the best development experience with LangChain.js, using TypeScript is recommended.
  1. Install TypeScript and Node.js type definitions (npm install -D typescript @types/node).
  2. Create a tsconfig.json file in the project root (npx tsc --init).
  3. Create a source file in a new directory, for example, src/index.ts.
  4. Add sample code to the file and run it with a TypeScript runner such as tsx or ts-node. [3, 8, 9, 10, 11]
Once these steps are completed, the VS Code environment is set up and ready for LangChain.js development.
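
As a quick smoke test of the setup above, a minimal script might look like the sketch below. It assumes "type": "module" is set in package.json, GOOGLE_API_KEY is available (directly or via a .env file loaded by dotenv), and that the chosen model name is accessible with your key.

// index.js (or src/index.ts) — minimal smoke test for the setup above.
import "dotenv/config";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({
  model: "gemini-1.5-flash", // assumption: any Gemini chat model your key can access
  temperature: 0,
});

const response = await model.invoke("Say hello from LangChain.js in one sentence.");
console.log(response.content);
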


  


Now run the usual 'add, commit, and push' Git Bash operations to push this change to your remote GitHub repo:

$ git add .




    $ git commit -m "comment"



Finally,
$ git push -u origin main


Now check that the updates actually show up in your GitHub repo.









How to chunk PDF files for RAG using LangChain.js?

 Chunking a PDF in LangChain.js for a RAG pipeline involves three main steps: loading the document, splitting the text into manageable chunks, and then embedding and storing the chunks. [1, 2]


Step 1: Install Necessary Packages [3]

You will need the LangChain libraries and a PDF parser (the community PDFLoader relies on pdf-parse). [4, 5, 6]

npm install langchain @langchain/community @langchain/textsplitters pdf-parse


Step 2: Load the PDF Document

Use the PDFLoader from @langchain/community to read the content of your PDF file. This loader extracts the text and, by default, produces one document per page. [2, 7]

import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";

// Define the file path to your PDF document
const filePath = "path/to/your/document.pdf";

// Create a PDF loader
const loader = new PDFLoader(filePath);

// Load the documents
const rawDocs = await loader.load();
console.log(`Loaded ${rawDocs.length} pages/documents`);


Step 3: Split the Documents into Chunks

The initial pages may still be too large to fit into an LLM's context window. Use a text splitter, such as the RecursiveCharacterTextSplitter, to break the text into smaller, contextually relevant chunks. This splitter tries to split on paragraphs first, then sentences, then words, to maintain semantic coherence. [2, 8, 9, 10]

import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// Initialize the text splitter with a specified chunk size and overlap
const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000, // Maximum size of each chunk in characters
  chunkOverlap: 200, // Number of characters to overlap between adjacent chunks to preserve context
});

// Split the loaded documents into smaller chunks
const splitDocs = await textSplitter.splitDocuments(rawDocs);
console.log(`Split into ${splitDocs.length} chunks`);


Step 4: Embed and Store the Chunks


The resulting splitDocs are an array of Document objects, each representing a manageable chunk of text with associated metadata. These documents are ready to be converted into embeddings and stored in a vector database for use in a RAG pipeline. [2, 11, 12]

// Example of accessing a chunk's content and metadata
console.log("Example chunk content:", splitDocs[0].pageContent);
console.log("Example chunk metadata:", splitDocs[0].metadata);

// These chunks can now be embedded and stored in a vector store
// (e.g., Chroma, Pinecone, etc.) as the next step in your RAG pipeline.
/*
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";
const vectorStore = await MemoryVectorStore.fromDocuments(
  splitDocs,
  new OpenAIEmbeddings()
);
*/
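
For a runnable variant of that commented sketch using the Gemini stack from the earlier setup notes, something like the following should work. It assumes @langchain/google-genai is installed, GOOGLE_API_KEY is set, the embedding model name is available to you, and splitDocs comes from Step 3 above.

// Runnable sketch: embed the chunks with Gemini embeddings, store them in an
// in-memory vector store, and run a similarity search.
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";

const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "text-embedding-004", // assumption: any embedding model your key can access
});

// Embed the chunks and hold them in memory (fine for small experiments).
const vectorStore = await MemoryVectorStore.fromDocuments(splitDocs, embeddings);

// Retrieve the 4 chunks most similar to a sample question.
const results = await vectorStore.similaritySearch("What does the document cover?", 4);
results.forEach((doc, i) => console.log(`#${i + 1}:`, doc.pageContent.slice(0, 120)));
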

By following these steps, you can effectively process PDF files in LangChain.js and prepare the data for an efficient and accurate RAG system. Experiment with different chunkSize and chunkOverlap values to find the optimal configuration for your specific documents. [2]



Saturday, December 20, 2025

How to integrate standard LLMs with custom data to create RAG applications

 Integrating standard Large Language Models (LLMs) with custom data to build Retrieval-Augmented Generation (RAG) applications involves a multi-stage pipeline: ingestion, retrieval, and generation. This process enables the LLM to access and utilize information not present in its original training data [1, 2]. 

Here is a step-by-step guide on how to create RAG applications: 
1. Data Preparation and Ingestion 
The first step is to get your custom data ready for the system to read and understand [1, 2]. 
  • Load and Parse Data: Collect your custom data from various sources (e.g., PDFs, websites, databases). Use a data loading library (like LangChain or LlamaIndex) to ingest and format the data into a usable structure [2].
  • Chunking: LLMs and vector databases have limits on the amount of text they can process at once. Divide your data into smaller, manageable "chunks" while maintaining sufficient context (e.g., paragraphs or a few sentences) [1, 2].
  • Embedding: Convert each text chunk into a numerical representation called a vector embedding using an embedding model (e.g., OpenAI's text-embedding-ada-002, or open-source models like sentence-transformers). These embeddings capture the semantic meaning of the text [2].
  • Indexing: Store these vector embeddings in a specialized database, a vector store (e.g., Pinecone, Weaviate, Chroma, or pgvector). This database is optimized for quick similarity searches [1, 2]. 
2. Retrieval 
When a user asks a question, the RAG system needs to find the most relevant information from your custom data [1, 2]. 
  • Embed User Query: The incoming user question is converted into a vector embedding using the same embedding model used during ingestion [2].
  • Vector Search: The system performs a similarity search in the vector store to find the top K (e.g., top 4) data chunks whose embeddings are most similar to the user query embedding [1].
  • Retrieve Context: The actual text content of the most relevant chunks is retrieved [2]. 
3. Generation 
The retrieved context is then combined with the original user query and sent to the LLM to generate an informed answer [1, 2]. 
  • Prompt Construction: A prompt is dynamically created for the LLM. This prompt typically includes a set of instructions, the user's question, and the retrieved context (see the sketch after this list) [1].
  • LLM Generation: The constructed prompt is sent to a standard LLM (e.g., GPT-4, Llama 3). The LLM uses the provided context to formulate an accurate and relevant answer, ensuring the response is grounded in your custom data rather than just its internal knowledge [2].
  • Response to User: The final, generated answer is delivered to the user. 
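
As a minimal sketch of this generation step in LangChain.js (matching the Gemini setup elsewhere in these notes), the function below builds a grounded prompt from retrieved chunks and calls the model. The model name and the retrievedDocs variable are assumptions for illustration.

// Generation sketch: build a grounded prompt from retrieved chunks and ask the LLM.
// `retrievedDocs` stands in for the chunks returned by your vector search.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const llm = new ChatGoogleGenerativeAI({ model: "gemini-1.5-flash", temperature: 0 });

async function answerWithContext(question, retrievedDocs) {
  const context = retrievedDocs
    .map((doc, i) => `[${i + 1}] ${doc.pageContent}`)
    .join("\n\n");
  const prompt =
    `Answer the question using ONLY the context below. ` +
    `If the answer is not in the context, say you don't know.\n\n` +
    `Context:\n${context}\n\nQuestion: ${question}`;
  const response = await llm.invoke(prompt);
  return response.content;
}
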
Tools and Platforms 
Several frameworks and platforms streamline the development of RAG applications: 
  • Frameworks: Libraries like LangChain and LlamaIndex provide abstractions and pre-built components for managing the entire RAG pipeline [2].
  • Vector Databases: Specialized databases for storing and searching vector embeddings include Pinecone, Weaviate, Chroma, and Qdrant [1].
  • Cloud Platforms: Major cloud providers offer managed services that simplify RAG implementation, such as AWS Bedrock, Google Cloud AI Platform, and Azure AI Studio [2]. 
