concept
RAG Pipeline
created 2026-04-19 ai · rag · embeddings · vector-search
Retrieval-Augmented Generation — fetch relevant context from a vector store before generating LLM responses.
Our Implementation
Documents → Chunking → Embeddings → pgvector → Similarity Search → Reranking → LLM
Chunking
- RecursiveCharacterTextSplitter (1000 chars, 200 overlap)
- Document loaders: PDF, DOCX, Excel, PPTX via LangChain
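To make the 1000/200 setting concrete, here is a minimal sketch of overlapping chunking. It is a plain sliding window, not LangChain's recursive splitter (which also tries to break on separators such as paragraphs and sentences first); the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters.

    Simplified stand-in for illustration: it cuts at exact character offsets
    and ignores paragraph or sentence boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

With these defaults each chunk repeats the final 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.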
Embeddings
Vector Storage
- pgvector extension in PostgreSQL
- manzas: SupabaseVectorStore with semantic filtering by org/department
- fajb-next: Custom VectorDBService with article chunks + transcript segments
Reranking
- fajb-next: RerankerService for result optimization after initial vector search
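Reranking is a second pass: the vector search returns a generous candidate set, and a more precise scorer reorders it before the top few go to the LLM. The sketch below shows that shape only. How RerankerService actually scores results is not described here, so `keyword_overlap` is a deliberately simple, hypothetical stand-in (a production reranker would more likely use a cross-encoder or a hosted rerank model), and `rerank` is an assumed name.

```python
from typing import Callable


def keyword_overlap(query: str, passage: str) -> float:
    """Fraction of distinct query terms that also occur in the passage (toy scorer)."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    passage_terms = set(passage.lower().split())
    return len(query_terms & passage_terms) / len(query_terms)


def rerank(
    query: str,
    candidates: list[str],
    top_n: int = 3,
    score: Callable[[str, str], float] = keyword_overlap,
) -> list[str]:
    """Reorder vector-search candidates by `score` and keep the best top_n.

    sorted() is stable, so candidates with equal scores keep the order
    the initial vector search gave them.
    """
    ranked = sorted(candidates, key=lambda passage: score(query, passage), reverse=True)
    return ranked[:top_n]
```

Because the scorer is passed in as a parameter, the toy function can be swapped for a real model without touching the calling code.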
When to Use RAG vs Wiki
| Approach | Best For |
|---|---|
| RAG | Querying large document collections, real-time retrieval |
| LLM Wiki | Persistent, curated knowledge that compounds over time |
Both can coexist — the Second Brain is a wiki, but individual projects may use RAG for domain-specific document search.
Related
- LangGraph Agent Pattern — agents that consume RAG results
- manzas, fajb-next — implementations
- Supabase — pgvector hosting