RAG Implementation

Open-source RAG projects split into three layers: indexing, retrieval evaluation, and answer generation.


Table of Contents

1. About
    1.1. Why split the pipeline?
2. Architecture
3. Repositories
4. Where to Start
    4.1. Full path
    4.2. Shortcuts
5. Production-style practices
6. Who this is for
7. Tech Stack
8. Quick Start
9. Contact
10. License

1. About

This organization hosts three repositories that cover different parts of a Retrieval-Augmented Generation pipeline. Each repo has a narrow scope, its own FastAPI service, Docker setup, and notebooks.

The projects share the BEIR SciFact dataset and Qdrant as the vector store. They are built as separate services so each layer can be developed, tested, and replaced on its own.

1.1. Why split the pipeline?

  • RAG quality depends heavily on indexing and retrieval — not just the language model.
  • Each layer lives in its own repo with a clear boundary and its own tests.
  • SciFact gives a fixed public dataset so results are comparable across repos.
  • The same patterns show up in all three projects: FastAPI, Docker, config files, and step-by-step notebooks.

2. Architecture

Raw documents
      │
      ▼
┌──────────────────────────────┐
│  rag-data-indexing-service   │  load → clean → chunk → embed → Qdrant
└──────────────────────────────┘
      │  output: searchable vector index
      ▼
┌──────────────────────────────┐
│  rag-retrieval-benchmark     │  dense / BM25 / hybrid / RRF → metrics
└──────────────────────────────┘
      │  output: retrieval scores and benchmark reports
      ▼
┌──────────────────────────────┐
│  production-rag-answering-api│  retrieve → generate → cite → validate
└──────────────────────────────┘
      │  output: grounded answers with citations
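
The benchmark stage in the diagram lists RRF (Reciprocal Rank Fusion) as one way to combine dense and BM25 results. The following is a minimal sketch of the standard RRF formula, not the code from rag-retrieval-benchmark: the function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed default, not a repo setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense retriever and a BM25 retriever.
dense = ["d3", "d1", "d2"]
bm25 = ["d1", "d4", "d3"]
fused = reciprocal_rank_fusion([dense, bm25])
print(fused)
```

RRF uses only rank positions, so the dense and BM25 scores do not need to be on the same scale before fusion.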

3. Repositories

| Repository | Layer | Description | Docs |
|---|---|---|---|
| rag-data-indexing-service | Indexing | Ingests raw documents, cleans and chunks text, generates embeddings, and stores indexed chunks in Qdrant. | README |
| rag-retrieval-benchmark | Retrieval | Benchmarks retrieval strategies on SciFact using Recall@k, MRR, and related metrics. No answer generation. | README |
| production-rag-answering-api | Answering | Full RAG answering API: retrieval, context building, grounded generation, citations, validation, caching, and tracing. | README |
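
For readers unfamiliar with the metrics named for the retrieval benchmark, here is a minimal sketch of Recall@k and MRR using their standard definitions. The function names and the toy document IDs are hypothetical; the benchmark repo may compute these differently, for example through an evaluation library.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant doc IDs that appear in the top-k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


# Two hypothetical queries: (ranked results, set of relevant doc IDs).
runs = [
    (["d2", "d7", "d1"], {"d1", "d9"}),
    (["d4", "d5", "d6"], {"d4"}),
]
recall = recall_at_k(runs[0][0], runs[0][1], k=3)  # 1 of 2 relevant docs found
mrr = mean_reciprocal_rank(runs)                   # (1/3 + 1/1) / 2
print(recall, mrr)
```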

4. Where to Start

4.1. Full path

If you are new to the stack, work through the repos in this order:

  1. Indexing (rag-data-indexing-service): build a searchable vector index from documents.
  2. Retrieval (rag-retrieval-benchmark): compare retrieval methods before touching generation.
  3. Answering (production-rag-answering-api): run the end-to-end answering pipeline through the API.

4.2. Shortcuts

| Goal | Repository |
|---|---|
| Document ingestion and Qdrant indexing only | rag-data-indexing-service |
| Retrieval metrics and strategy comparison only | rag-retrieval-benchmark |
| End-to-end RAG answering API | production-rag-answering-api |

Each repository README has a Quick Start section with make commands and notebook order.

5. Production-style practices

These are not toy scripts. Each repo includes:

  • HTTP API — FastAPI endpoints for the main workflow.
  • CLI — command-line entry points where a service layer is not enough.
  • Configuration — YAML files and environment variables instead of hard-coded values.
  • Docker Compose — local stack with Qdrant and related services.
  • Tests — pytest suites; the retrieval benchmark also ships evaluation metrics.
  • Notebooks — phased walkthroughs that call the same code as the API.
  • Observability — Phoenix tracing in the answering API.
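
As an illustration of the configuration practice listed above, the sketch below layers environment variables over values that would normally come from a YAML file. It assumes the YAML has already been parsed into a plain dict, and every name in it (Settings, load_settings, QDRANT_URL, QDRANT_COLLECTION, TOP_K) is hypothetical rather than taken from the repos.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    collection: str
    top_k: int


def load_settings(file_values, env=None):
    """Build settings from file values, letting environment variables win.

    file_values stands in for a parsed YAML config; the variable names
    below are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    return Settings(
        qdrant_url=env.get("QDRANT_URL", file_values["qdrant_url"]),
        collection=env.get("QDRANT_COLLECTION", file_values["collection"]),
        top_k=int(env.get("TOP_K", file_values["top_k"])),
    )


# Hypothetical file defaults; 6333 is Qdrant's default HTTP port.
file_values = {
    "qdrant_url": "http://localhost:6333",
    "collection": "scifact",
    "top_k": 10,
}
settings = load_settings(file_values, env={"TOP_K": "5"})
print(settings)
```

Passing env explicitly keeps the function easy to test without touching the real process environment.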

6. Who this is for

  • Developers who want to see RAG split into clear, testable layers.
  • Engineers who need retrieval metrics before adding an LLM.
  • Anyone working through SciFact as a small, reproducible RAG dataset.

7. Tech Stack

| Component | Used in |
|---|---|
| Python, FastAPI | All three repos |
| Qdrant | All three repos |
| Docker / Docker Compose | All three repos |
| sentence-transformers | Indexing, retrieval benchmark, answering API |
| SciFact (BEIR) | All three repos |
| Phoenix / OpenTelemetry | Answering API |
| SQLite cache | Answering API |
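
The table lists a SQLite cache for the answering API. One plausible shape for such a cache is sketched below with Python's standard sqlite3 module. The class name, table schema, and key normalization are assumptions for illustration and do not describe the schema used in production-rag-answering-api.

```python
import hashlib
import sqlite3


class AnswerCache:
    """Tiny question-to-answer cache backed by SQLite (hypothetical schema)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS answers ("
            "key TEXT PRIMARY KEY, answer TEXT NOT NULL)"
        )

    @staticmethod
    def _key(question):
        # Normalize case and whitespace so trivially different questions share a key.
        normalized = " ".join(question.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, question):
        row = self.conn.execute(
            "SELECT answer FROM answers WHERE key = ?", (self._key(question),)
        ).fetchone()
        return row[0] if row else None

    def put(self, question, answer):
        # The connection context manager commits on success, rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO answers (key, answer) VALUES (?, ?)",
                (self._key(question), answer),
            )


cache = AnswerCache()
cache.put("What is the claim about?", "placeholder answer [doc-1]")
hit = cache.get("  what IS the claim about? ")  # same key after normalization
miss = cache.get("an unseen question")
print(hit, miss)
```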

8. Quick Start

Clone the repo you need, copy the env file, and start the stack. Details differ per project — follow the linked README for notebook order and data download steps.

# Indexing
git clone https://github.com/RAG-Implementation/rag-data-indexing-service.git
cd rag-data-indexing-service
cp .env.example .env
make up

# Retrieval benchmark
git clone https://github.com/RAG-Implementation/rag-retrieval-benchmark.git
cd rag-retrieval-benchmark
cp .env.example .env
make up

# Answering API
git clone https://github.com/RAG-Implementation/production-rag-answering-api.git
cd production-rag-answering-api
cp .env.example .env
make build && make up

9. Contact

Questions or collaboration: Max Ghadri on LinkedIn.

10. License

All repositories in this organization are released under the MIT License.
