CodeRAG

A standalone, local-first semantic code-search engine for large and custom codebases.

CodeRAG indexes a whole codebase into a hybrid (vector + keyword) search index and answers questions like "where is retry/backoff handled?" with the exact functions, classes, and files that matter — ranked by meaning, not just string match.

It runs entirely on your machine with no API key (a local ONNX embedding model is the default), keeps its index up to date as you edit, and is built to stay fast on large codebases. Use it from the CLI, embed it as a Python library, self-host it as an HTTP service, or browse with the web UI.

Built for the cases off-the-shelf IDE assistants don't cover well: a codebase that's too big, too private, or too custom — or a search/RAG capability you want to own and embed in your own tools.

✨ Highlights

Local-first, zero-key. Default embeddings run locally via fastembed (ONNX, no PyTorch). Self-hosted, OpenAI, and Anthropic backends are all optional add-ons.
Bring your own model platform. Built for self-hosted and local models first (any OpenAI-compatible server — Ollama, vLLM, LM Studio, LocalAI), with first-class OpenAI API and Anthropic API support when you want it.
Symbol-aware chunking. Indexes functions, classes, and methods (Python via ast; JS/TS/Go/Rust/Java via tree-sitter), not crude fixed-size blocks — so results point at real code units with file:line citations.
Hybrid retrieval. Dense vector search + BM25 keyword search, fused with Reciprocal Rank Fusion. Great at both "what does this mean" and exact-identifier lookups.
Incremental & live. Content-hashed indexing only re-embeds files that changed; a debounced watcher keeps the index current as you code. No duplicate or stale vectors.
Built to scale. Exact Flat search for small repos, with an automatic switch to approximate IVF once the index passes a configurable threshold (CODERAG_IVF_THRESHOLD, default 50000 vectors), so it stays fast at 100k+ chunks.
Four surfaces, one engine. CLI · Python library · HTTP/REST · web UI — all thin wrappers over the same CodeRAG object.
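
To make "symbol-aware chunking" concrete, here is a minimal sketch of how Python source could be split into function, class, and method chunks with the standard-library ast module. It is an illustration only, not CodeRAG's actual chunker: the Chunk record and the chunk_python_source function are hypothetical names, and nested functions are deliberately ignored.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical record for this sketch, not CodeRAG's real schema.
    path: str
    symbol: str
    kind: str  # "function", "class", or "method"
    start_line: int
    end_line: int
    text: str


def chunk_python_source(path: str, source: str) -> list[Chunk]:
    """Return one chunk per top-level function, class, and class method."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[Chunk] = []

    def add(node: ast.AST, symbol: str, kind: str) -> None:
        # lineno / end_lineno are 1-based and inclusive (Python 3.8+).
        text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        chunks.append(Chunk(path, symbol, kind, node.lineno, node.end_lineno, text))

    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            add(node, node.name, "function")
        elif isinstance(node, ast.ClassDef):
            add(node, node.name, "class")
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    add(item, f"{node.name}.{item.name}", "method")
    return chunks
```

Each chunk carries its own line range, which is what makes file:line citations possible without re-parsing the file at query time.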

🚀 Quick start

pip install -e .            # core engine (local embeddings included)
# optional extras:
pip install -e ".[server]"     # HTTP/REST API
pip install -e ".[ui]"         # built-in web UI (FastAPI + Jinja + Pygments)
pip install -e ".[openai]"     # OpenAI (or self-hosted OpenAI-compatible) embeddings / answers
pip install -e ".[anthropic]"  # Anthropic (Claude) LLM answers
pip install -e ".[all]"        # everything above

Index a codebase and search it — no configuration, no API key:

coderag index --watched-dir /path/to/your/repo
coderag search "where are duplicate vectors removed on file change" --watched-dir /path/to/your/repo

1. coderag/indexer.py:141 (Indexer._index_file)  [method, sim=0.70]
   def _index_file(self, item): removed = 0; existing = self.store.get_file(item.rel) …
2. coderag/indexer.py:1  [window, sim=0.74]
   """Incremental indexing orchestration. ...the critical correctness property…"""

By default the index lives in ./.coderag/. Set CODERAG_WATCHED_DIR / CODERAG_STORE_DIR (or copy example.env to .env) to avoid repeating flags.

🧑‍💻 The four surfaces

CLI

coderag index [PATH] [--full]     # build / incrementally update the index
coderag search "QUERY" [-k 8]     # hybrid search; add --json or --answer
coderag watch                     # index, then keep it live as files change
coderag serve --port 8000         # run the HTTP API  (needs [server])
coderag ui                        # launch the web UI (needs [ui])
coderag status                    # index stats (files, chunks, model, index type)

Python library

from coderag import CodeRAG, Config

cr = CodeRAG(Config.from_env(watched_dir="/path/to/repo"))
cr.index()

for hit in cr.search("how is the FAISS index persisted?"):
    print(f"{hit.location}  {hit.symbol}  (sim={hit.similarity:.2f})")
    print(hit.text)

HTTP / REST (`coderag serve`)

curl "http://127.0.0.1:8000/search?q=token%20validation&k=5"
curl -X POST http://127.0.0.1:8000/index -d '{"full": false}' -H 'content-type: application/json'
curl "http://127.0.0.1:8000/status"
curl "http://127.0.0.1:8000/file?path=coderag/api.py&start_line=1&end_line=40"

Self-host it once and point any number of custom apps or teammates at a big shared codebase.

Security. The API is unauthenticated by default and can read indexed source and file contents. Keep it on 127.0.0.1 for local use, or set CODERAG_API_KEY (sent as Authorization: Bearer <key> or X-API-Key) and front it with TLS / an authenticating proxy before exposing it. CORS stays off unless you set CODERAG_CORS_ORIGINS. The /file endpoint only serves files that are actually indexed.
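
As a rough sketch of calling a key-protected server from Python, the snippet below builds a /search request with the Authorization: Bearer header described above, using only the standard library. The helper names build_search_request and search are hypothetical, and the assumption that the response body is JSON is mine; check the actual response shape against your running server.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(base_url, query, k=5, api_key=None):
    """Build (but do not send) a GET /search request."""
    # quote_via=quote encodes spaces as %20, matching the curl examples.
    params = urllib.parse.urlencode({"q": query, "k": k}, quote_via=urllib.parse.quote)
    request = urllib.request.Request(f"{base_url.rstrip('/')}/search?{params}")
    if api_key:
        # Same header the README documents for CODERAG_API_KEY.
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


def search(base_url, query, k=5, api_key=None, timeout=10.0):
    """Send the request and decode the body, assuming it is JSON."""
    request = build_search_request(base_url, query, k, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the header logic easy to check without a running server; search("http://127.0.0.1:8000", "token validation", api_key="...") would then perform the call.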

Web UI (`coderag ui`)

A built-in, server-rendered web UI (FastAPI + Jinja, syntax highlighting via Pygments): a search box with language/kind/path filters, results with path:line citations and similarity scores, an in-browser file viewer (cited lines highlighted), a file browser, index status, a one-click Reindex, and an optional streamed LLM answer (when an OpenAI/Anthropic key or a self-hosted endpoint is configured). It is progressively enhanced — every page works with JavaScript disabled, and there's no CDN/runtime network dependency, so it stays local-first.

🐳 Docker (beta)

Prebuilt multi-arch images (linux/amd64 + linux/arm64) are published to GHCR on every push to master. Beta — interfaces and tags may change.

# HTTP/REST API on :8000 — mount a repo to index, persist the index in a named volume
docker run --rm -p 8000:8000 \
  -v "$PWD:/workspace:ro" -v coderag-index:/data \
  ghcr.io/neverdecel/coderag:beta

# build the index once, then query the running server
curl -X POST localhost:8000/index -H 'content-type: application/json' -d '{"full": true}'
curl "localhost:8000/search?q=where%20is%20retry%20handled&k=5"

# Web UI on :8501
docker run --rm -p 8501:8501 \
  -v "$PWD:/workspace:ro" -v coderag-index:/data \
  ghcr.io/neverdecel/coderag:beta-ui

Tags: :beta (latest master), :edge (alias), :sha-<commit> (immutable); the UI image adds a -ui suffix. The container indexes /workspace and stores its index in /data (CODERAG_WATCHED_DIR / CODERAG_STORE_DIR). For OpenAI embeddings/answers, add -e OPENAI_API_KEY=…. The container binds 0.0.0.0, so set -e CODERAG_API_KEY=… and keep the port on a trusted network (or behind an authenticating proxy) when exposing it.

☸️ Kubernetes (Helm)

For teams who want a shared, always-on deployment, a Helm chart self-hosts the HTTP API (and optional UI) with a persistent index, scheduled re-indexing, and hardened defaults (non-root, read-only rootfs, single-writer-safe). It runs standalone with zero config on your cluster's default storage:

helm install coderag ./deploy/helm/coderag --namespace coderag --create-namespace

Then point it at your code (a git repo, or a PVC you already have):

helm upgrade coderag ./deploy/helm/coderag -n coderag --reuse-values \
  --set workspace.source=git \
  --set workspace.git.repository=https://github.com/Neverdecel/CodeRAG.git

It provisions the index volume, clones the repo into the pod, and builds the index automatically. Not a Helm user? helm template … | kubectl apply -f - works too. See the full guide — storage options, private repos, OpenAI/Anthropic keys, ingress, the UI, scheduled reindex — in deploy/README.md.

🏗️ How it works

graph LR
    A[Source files] --> B[Symbol-aware chunking<br/>ast / tree-sitter]
    B --> C[Embeddings<br/>fastembed · OpenAI · self-hosted]
    C --> D[(SQLite store<br/>chunks + vectors + FTS5)]
    D --> E[FAISS index<br/>Flat → IVF]
    Q[Query] --> F[Dense + BM25]
    E --> F
    D --> F
    F --> G[Reciprocal Rank Fusion]
    G --> H[Ranked hits<br/>path:line + score]
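
The fusion step in the diagram can be illustrated with a short sketch of Reciprocal Rank Fusion: each retriever contributes 1 / (k + rank) for every item it returns, and the summed scores decide the final order. The function name and the constant k=60 (a commonly used default for RRF) are assumptions here; this README does not state which constant CodeRAG uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list of (id, score) pairs.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant; larger values flatten the gap between ranks.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


dense_hits = ["a", "b", "c"]   # e.g. from vector search
keyword_hits = ["b", "d", "a"]  # e.g. from BM25
fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
```

Because only ranks are combined, the dense and keyword scores never need to be on the same scale, and an item that appears in both lists tends to outrank one that is strong in only one of them.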

SQLite is the source of truth (chunk text, line ranges, symbols, content hashes, and the raw vectors). The FAISS index is a rebuildable cache — it can always be reconstructed from SQLite, so switching models or index types never corrupts your data.
Each file's content is hashed; unchanged files are skipped on re-index. A changed file's old chunks are removed from both the store and the vector index before new ones are added — so editing never accumulates stale or duplicate vectors.
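
The hash-then-replace behavior described above can be sketched in a few lines. This is a simplified model, not CodeRAG's indexer: InMemoryStore stands in for the SQLite store plus vector index, SHA-256 is an assumed hash function, and the chunker argument is any callable that turns text into a list of chunks.

```python
import hashlib


class InMemoryStore:
    """Hypothetical stand-in for the SQLite store and vector index."""

    def __init__(self):
        self.file_hashes = {}  # path -> content hash of the indexed version
        self.chunks = {}       # path -> list of chunks for that file


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def index_file(store, path, data, chunker):
    """Index one file; return "skipped" if unchanged, else "indexed"."""
    digest = content_hash(data)
    if store.file_hashes.get(path) == digest:
        return "skipped"  # identical content: nothing to re-embed
    # Drop every chunk from the previous version before adding new ones,
    # so stale or duplicate entries cannot accumulate.
    store.chunks.pop(path, None)
    store.chunks[path] = chunker(data.decode("utf-8"))
    store.file_hashes[path] = digest
    return "indexed"
```

The important ordering is remove first, then add, keyed by file path; a real implementation would apply the same removal to both the database rows and the vector index.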

⚙️ Configuration

Everything is configurable via CODERAG_* environment variables or a .env file (see example.env). Common ones:

| Variable | Default | Meaning |
| --- | --- | --- |
| `CODERAG_PROVIDER` | `fastembed` | `fastembed` (local) · `openai` · `fake` |
| `CODERAG_MODEL` | `BAAI/bge-small-en-v1.5` | Local embedding model |
| `CODERAG_WATCHED_DIR` | cwd | Codebase to index |
| `CODERAG_STORE_DIR` | `./.coderag` | Where the DB + index live |
| `CODERAG_INDEX_TYPE` | `auto` | `auto` · `flat` · `ivf` |
| `CODERAG_IVF_THRESHOLD` | `50000` | Vectors before switching Flat → IVF |
| `CODERAG_TOP_K` | `8` | Results returned |
| `OPENAI_API_KEY` | – | OpenAI embeddings / answers |
| `OPENAI_BASE_URL` | – | Point at a self-hosted / local OpenAI-compatible server (Ollama, vLLM, …) |
| `CODERAG_LLM_PROVIDER` | `openai` | Answer backend: `openai` · `anthropic` |
| `CODERAG_CHAT_MODEL` | `gpt-4o-mini` | OpenAI (or self-hosted) chat model for answers |
| `ANTHROPIC_API_KEY` | – | Anthropic (Claude) answers |
| `CODERAG_ANTHROPIC_MODEL` | `claude-opus-4-8` | Anthropic chat model for answers |
| `CODERAG_API_KEY` | – | If set, the HTTP API requires it (`Authorization: Bearer <key>` or `X-API-Key`). Set whenever the server is reachable beyond localhost. |
| `CODERAG_CORS_ORIGINS` | – | Comma-separated CORS allowlist for the HTTP API (never `*`). Empty ⇒ no cross-origin browser access. |
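
To show how a few of these variables and their defaults fit together, here is a small sketch that reads them from an environment mapping and resolves the `auto` index type. Settings, load_settings, and resolve_index_type are hypothetical names for illustration, not CodeRAG's Config API, and treating the threshold as inclusive (switching at exactly 50000 vectors) is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Hypothetical subset of the variables in the table above.
    provider: str
    model: str
    watched_dir: str
    store_dir: str
    index_type: str
    ivf_threshold: int
    top_k: int


def load_settings(env=None):
    """Read CODERAG_* values from a mapping, falling back to the defaults."""
    env = os.environ if env is None else env
    return Settings(
        provider=env.get("CODERAG_PROVIDER", "fastembed"),
        model=env.get("CODERAG_MODEL", "BAAI/bge-small-en-v1.5"),
        watched_dir=env.get("CODERAG_WATCHED_DIR", os.getcwd()),
        store_dir=env.get("CODERAG_STORE_DIR", "./.coderag"),
        index_type=env.get("CODERAG_INDEX_TYPE", "auto"),
        ivf_threshold=int(env.get("CODERAG_IVF_THRESHOLD", "50000")),
        top_k=int(env.get("CODERAG_TOP_K", "8")),
    )


def resolve_index_type(settings, vector_count):
    """Pick "flat" or "ivf"; an explicit setting always wins over auto."""
    if settings.index_type != "auto":
        return settings.index_type
    # Assumption: the switch happens once the count reaches the threshold.
    return "ivf" if vector_count >= settings.ivf_threshold else "flat"
```

Passing a plain dict instead of os.environ, as in load_settings({"CODERAG_INDEX_TYPE": "flat"}), is a convenient way to try different configurations without touching the real environment.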

🧩 Supported languages

Symbol-aware (function/class/method level): Python, JavaScript, TypeScript/TSX, Go, Rust, Java. Many other languages and docs (C/C++, Ruby, PHP, Markdown, YAML, …) are indexed with a line-window fallback, so they remain searchable.
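
A line-window fallback can be pictured as sliding a fixed-size window over the file with some overlap, so every line lands in at least one chunk. The sketch below is one possible shape of that idea; the function name, the 40-line window, and the 10-line overlap are assumptions, not values taken from CodeRAG.

```python
def line_windows(text, window=40, overlap=10):
    """Split text into overlapping (start_line, end_line, chunk_text) windows.

    Line numbers are 1-based and inclusive, suitable for path:line citations.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    lines = text.splitlines()
    step = window - overlap
    chunks = []
    for start in range(0, len(lines), step):
        end = min(start + window, len(lines))
        chunks.append((start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break  # the last window already reaches the end of the file
    return chunks
```

The overlap keeps a passage that straddles a window boundary intact in at least one chunk, at the cost of indexing some lines twice.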

🛠️ Development

python -m venv venv && source venv/bin/activate
pip install -e ".[dev,server,openai]"

pytest -m "not integration"     # fast, offline (uses a deterministic fake embedder)
pytest -m integration           # exercises the real local model (downloads once)
ruff check . && ruff format --check . && mypy coderag   # ruff = lint + import-sort + format

See DEVELOPMENT.md and AGENTS.md for architecture and contribution details.

📄 License

Apache License 2.0 — see LICENSE.

🙏 Acknowledgments

FAISS · fastembed · tree-sitter · FastAPI · Jinja · Pygments · watchdog

⭐ If CodeRAG helps you, please give it a star!
