RAG System Side Project

I've started building a Retrieval-Augmented Generation (RAG) system from scratch as a side project.

What is RAG?

RAG combines a retrieval step (fetching relevant documents from a knowledge base) with a generation step (using an LLM to produce an answer grounded in those documents). It's one of the most practical ways to give LLMs access to up-to-date or domain-specific information, because the knowledge base can be updated without retraining the model.
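
To make the two steps concrete, here is a minimal sketch, not the project's actual code. It assumes a toy bag-of-words similarity in place of a real embedding model, and it stops at building the prompt, since the LLM call depends on whichever provider is used. The names tokenize, cosine, retrieve, build_prompt and the DOCS sample data are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Retrieval step: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Generation step input: ground the question in the retrieved text.

    A real system would send this prompt to an LLM; that call is omitted
    here because it is provider-specific.
    """
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}"
    )


# Hypothetical sample knowledge base, for illustration only.
DOCS = [
    "The office is closed on public holidays.",
    "Refunds are processed within five business days.",
    "Support is available by email on weekdays.",
]

if __name__ == "__main__":
    question = "How long do refunds take?"
    print(build_prompt(question, retrieve(question, DOCS, k=1)))
```

Swapping the bag-of-words similarity for embeddings from a real model changes only the body of retrieve; the prompt-building step stays the same.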

Goals

Build a clean, modular implementation I can reuse across projects
Experiment with different embedding models and vector stores
Explore chunking strategies and their impact on retrieval quality
Open-source it once it's polished enough
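
As one example of a chunking strategy worth comparing, here is a rough sketch of fixed-size chunking with overlap. It assumes whitespace-separated words as the unit (a real pipeline might count model tokens or split on sentences instead), and the function name chunk_words and its default sizes are hypothetical.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to `size` words.

    Consecutive chunks share `overlap` words, so a sentence that straddles
    a boundary still appears whole in at least one chunk.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a chunk reaches the end, to avoid a redundant tail chunk.
        if start + size >= len(words):
            break
    return chunks
```

Larger chunks carry more context per hit but dilute the similarity signal, while more overlap reduces boundary losses at the cost of a bigger index. Those are the trade-offs a chunking experiment would measure.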

Current Status

The project is in its early stages: the basic pipeline works with a simple in-memory vector store. Next up is plugging in a proper vector database and evaluating retrieval quality more rigorously.
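
For a sense of scale, here is a minimal sketch of what a simple in-memory vector store and a first retrieval metric could look like. It is an illustration under assumptions, not the project's code: the class InMemoryVectorStore, its add and search methods, and the helper recall_at_k are hypothetical names, and the search is a brute-force cosine scan in pure Python.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class InMemoryVectorStore:
    """Keeps ids and vectors in parallel lists; search scans all of them."""

    ids: list[str] = field(default_factory=list)
    vectors: list[list[float]] = field(default_factory=list)

    def add(self, doc_id: str, vector: list[float]) -> None:
        self.ids.append(doc_id)
        self.vectors.append(vector)

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        """Return the k (id, score) pairs with the highest cosine similarity."""
        scored = [
            (doc_id, cosine_similarity(query, vector))
            for doc_id, vector in zip(self.ids, self.vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)
```

A brute-force scan costs time linear in the number of stored vectors per query, which is fine for small experiments and is the main thing a dedicated vector database improves on. Averaging recall_at_k over a small set of hand-labelled queries gives a baseline number to compare embedding models and chunking settings against.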

I'll post updates here as it progresses.