RAG System Side Project
I've started building a Retrieval-Augmented Generation (RAG) system from scratch as a side project.
What is RAG?
RAG combines a retrieval step (fetching relevant documents from a knowledge base) with a generation step (using an LLM to produce an answer grounded in those documents). It's one of the most practical ways to give LLMs access to up-to-date or domain-specific information.
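To make the two steps concrete, here is a minimal sketch of the retrieve-then-generate flow. It is not this project's actual code: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are hypothetical, and word-count cosine similarity stands in for a real embedding model. The generation step is reduced to building the prompt that would be passed to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Retrieval step: rank documents by similarity to the query, keep the top k."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Generation step (input side): ground the question in the retrieved text."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}"
    )


docs = [
    "Paris is the capital of France.",
    "The mitochondria is the powerhouse of the cell.",
    "Python is a programming language.",
]
question = "What is the capital of France?"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, top)  # this string would be sent to the LLM
```

In a real pipeline, the word-count vectors would be replaced by embeddings and `prompt` would be passed to whichever LLM client is in use.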
Goals
- Build a clean, modular implementation I can reuse across projects
- Experiment with different embedding models and vector stores
- Explore chunking strategies and their impact on retrieval quality
- Open-source it once it's polished enough
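As an illustration of the chunking goal above, one common baseline is fixed-size chunks with overlap. The sketch below is an assumption about what such a strategy could look like, not the project's implementation; `chunk_words` and its `size` and `overlap` parameters are hypothetical names, and sizes are counted in whitespace-separated words rather than model tokens.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to `size` words, where each chunk
    repeats the last `overlap` words of the previous one."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once this chunk has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


example = chunk_words("a b c d e f g h i j", size=4, overlap=2)
```

Varying `size` and `overlap` and re-running the same retrieval queries is one simple way to compare chunking strategies.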
Current Status
Early stages: the basic pipeline is working with a simple in-memory vector store. Next up is plugging in a proper vector database and evaluating retrieval quality more rigorously.
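For reference, a simple in-memory vector store and a basic retrieval metric could look roughly like the sketch below. This is a hedged illustration, not the code in this project: `InMemoryVectorStore`, `cosine`, and `recall_at_k` are hypothetical names, vectors are plain Python lists, and search is a brute-force scan over every stored vector.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Brute-force store: keeps all vectors in a list and scans them per query."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._items.append((doc_id, vector))

    def search(self, query: list[float], k: int = 3) -> list[str]:
        """Return the ids of the k stored vectors most similar to the query."""
        scored = [(cosine(query, vec), doc_id) for doc_id, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc_id for _, doc_id in scored[:k]]


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


store = InMemoryVectorStore()
store.add("a", [1.0, 0.0])
store.add("b", [0.0, 1.0])
store.add("c", [0.9, 0.1])

results = store.search([1.0, 0.0], k=2)
score = recall_at_k(results, relevant={"a", "b"}, k=2)
```

Averaging a metric like `recall_at_k` over a small set of labeled queries gives a repeatable number to compare against once a proper vector database is plugged in.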
Will post updates here as it progresses.