concept

RAG Pipeline

created 2026-04-19 ai · rag · embeddings · vector-search

Retrieval-Augmented Generation (RAG): fetch relevant context from a vector store and pass it to the LLM before it generates a response.

Our Implementation

Documents → Chunking → Embeddings → pgvector → Similarity Search → Reranking → LLM

Chunking

  • RecursiveCharacterTextSplitter (chunk size 1000 characters, overlap 200 characters)
  • Document loaders: PDF, DOCX, Excel, PPTX via LangChain
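
To make the chunk size and overlap settings concrete, here is a minimal sketch of a fixed-window splitter. It is a simplified stand-in, not LangChain's RecursiveCharacterTextSplitter, which additionally tries to break on separators such as paragraph and line breaks. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # each chunk starts this far after the previous one
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the defaults, consecutive chunks start 800 characters apart, so the last 200 characters of one chunk reappear at the start of the next. That repetition is what keeps a sentence cut at a chunk boundary retrievable from either side.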

Embeddings

  • fajb-next: OpenAI text-embedding-3-large
  • manzas: Google Gemini embeddings

Vector Storage

  • pgvector extension in PostgreSQL
  • manzas: SupabaseVectorStore with semantic filtering by org/department
  • fajb-next: Custom VectorDBService with article chunks + transcript segments

Reranking

  • fajb-next: RerankerService reorders the candidates returned by the initial vector search
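
The two-stage pattern (broad vector search first, then a more careful rescoring of the shortlist) can be sketched as follows. This does not reproduce RerankerService. The `rerank` function is hypothetical, and `overlap_score` is a deliberately crude term-overlap placeholder standing in for whatever scoring model a real reranker would use.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric terms in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, passage: str) -> float:
    """Fraction of query terms found in the passage (placeholder scorer)."""
    query_terms = tokenize(query)
    if not query_terms:
        return 0.0
    return len(query_terms & tokenize(passage)) / len(query_terms)


def rerank(
    query: str,
    candidates: list[str],
    top_n: int = 3,
    score: Callable[[str, str], float] = overlap_score,
) -> list[str]:
    """Rescore vector-search candidates against the query and keep the best top_n."""
    scored = [(score(query, passage), passage) for passage in candidates]
    # The sort is stable, so ties keep their original vector-search order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_n]]
```

Passing the scorer as a parameter keeps the pipeline shape independent of the scoring method, so the placeholder can be swapped for a model-based scorer without touching the calling code.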

When to Use RAG vs Wiki

Approach  | Best For
RAG       | Querying large document collections, real-time retrieval
LLM Wiki  | Persistent, curated knowledge that compounds over time

Both can coexist: the Second Brain is a wiki, but individual projects may use RAG for domain-specific document search.