codebase-memory-mcp
by DeusData
The fastest, most efficient code intelligence engine for AI coding agents. It indexes any repository into a persistent knowledge graph — full-indexing an average repo in seconds and the Linux kernel in 3 minutes — so your agent answers structural questions with ~120x fewer tokens. Tree-sitter parsing across 159 languages, Hybrid LSP type resolution, single static C binary.
Built-in 3D graph visualization (UI variant) — explore your knowledge graph at localhost:9749.
What is codebase-memory-mcp?
codebase-memory-mcp is an open-source Model Context Protocol (MCP) server that indexes a codebase into a persistent knowledge graph of functions, classes, call chains, HTTP routes, and cross-service links. Instead of reading files one at a time, an AI coding agent queries the graph — answering structural questions with roughly 120x fewer tokens. It parses 159 languages and ships as a single static C binary with zero runtime dependencies.
It is a structural-analysis backend, not a chatbot: there is no embedded LLM and no API key. Your MCP client (Claude Code, or any MCP-compatible agent) is the intelligence layer; codebase-memory-mcp builds and serves the graph. All processing happens locally — your code never leaves your machine.
Why do AI agents waste tokens exploring code?
AI coding agents explore codebases by reading files one at a time. Every structural question triggers a cascade of grep → read file → grep again → read more files. The cost compounds fast.
Across five structural questions about a real codebase, file-by-file search consumed ~412,000 tokens; the same questions answered from the knowledge graph took ~3,400 tokens — a ~120x reduction.
The win is not about fitting the context window. It is cost (at $3–15 per million tokens, exploration adds up), latency (sub-millisecond graph queries versus seconds of file reading), and accuracy (less noise means better answers and no "lost in the middle" problem).
Source: project benchmark, 5 structural queries — see the full benchmark report.
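To make the cost argument concrete, the arithmetic can be sketched as below. The token totals and the $3–15 per million price range come from the text above; the function name is illustrative, and real pricing varies by model and by input versus output tokens.

```python
# Token totals from the five-query benchmark quoted above.
FILE_BY_FILE_TOKENS = 412_000
GRAPH_TOKENS = 3_400


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of `tokens` at a flat per-million-token price (a simplification)."""
    return tokens / 1_000_000 * price_per_million


# Ratio of tokens consumed by file-by-file search versus graph queries.
reduction = FILE_BY_FILE_TOKENS / GRAPH_TOKENS

for price in (3.0, 15.0):  # low and high ends of the quoted price range
    before = cost_usd(FILE_BY_FILE_TOKENS, price)
    after = cost_usd(GRAPH_TOKENS, price)
    print(f"${price:.0f}/M tokens: ${before:.2f} file-by-file vs ${after:.4f} via graph")

print(f"reduction: ~{reduction:.0f}x")
```

At these prices the five questions cost roughly $1.24 to $6.18 by file-by-file search and about one to five cents from the graph. The gap repeats on every exploration round an agent makes.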
How do I install codebase-memory-mcp?
Install with a single command, then tell your agent to index the project. It is a single static C binary for macOS, Linux, and Windows — no Docker, no runtime dependencies, no API key.
# 1. Run the installer:
curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash
# 2. The installer auto-detects and configures every installed agent.
# 3. Restart your agent, then say:
"Index this project"
One command configures all 11 supported agents: Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode,
Antigravity, Aider, KiloCode, VS Code, OpenClaw, and Kiro — with MCP entries, instruction files, and
pre-tool hooks for each. Windows users run install.ps1. Also available via
npm, pip, Homebrew, Scoop, Winget, Chocolatey, AUR, and go install.
What is Hybrid LSP?
Hybrid LSP is a semantic type-resolution layer that goes beyond what tree-sitter can see. Tree-sitter alone produces a syntactic
AST — it handles naming, structure, and call sites, but it cannot tell that
user.profile.display_name() resolves to Profile.display_name declared
three modules away, because it does not track imports, generics, inheritance, or stdlib types.
codebase-memory-mcp ships a clean-room re-implementation of the type-resolution algorithms
used by real language servers — tsserver/typescript-go, pyright, gopls, intelephense, and
Roslyn — embedded directly into the single static C binary. There is no language-server process, no
per-project setup, and no API key. This layer runs alongside tree-sitter on every parse and refines
CALLS, USAGE, and RESOLVED_CALLS edges with type information,
so the graph mirrors what an IDE "Go to Definition" would resolve.
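The gap between a syntactic parse and a type-aware one can be illustrated with a small sketch. It uses Python's built-in `ast` module as a stand-in for tree-sitter, and the `VARIABLE_TYPES` and `MEMBERS` tables are hypothetical toy registries, not the project's actual data structures. The syntactic pass yields only a chain of names; resolving the call target needs type information for each step.

```python
import ast

# Syntactic view: the parser sees an attribute chain, nothing about types.
source = "user.profile.display_name()"
call = ast.parse(source).body[0].value  # the ast.Call node


def attribute_chain(node):
    """Flatten nested ast.Attribute nodes into ['user', 'profile', 'display_name']."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return list(reversed(parts))


chain = attribute_chain(call.func)

# Hypothetical type registry (an assumption for illustration only).
VARIABLE_TYPES = {"user": "User"}
MEMBERS = {
    ("User", "profile"): "Profile",      # field User.profile has type Profile
    ("Profile", "display_name"): "str",  # method Profile.display_name returns str
}


def resolve(chain):
    """Walk the chain through the registry and name the class that owns the final method."""
    current = VARIABLE_TYPES[chain[0]]
    for name in chain[1:-1]:
        current = MEMBERS[(current, name)]
    return f"{current}.{chain[-1]}"


print(chain)
print(resolve(chain))
```

Without the registry, the only available fact is that something named `display_name` is called; with it, the call resolves to `Profile.display_name`.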
Languages with full Hybrid LSP
| Language | What it resolves |
|---|---|
| Python | Imports and dotted submodule walks, dataclasses, Self return types, generics, @property, match/case patterns, SQLAlchemy 2.0 Mapped[T], Pydantic models, typing annotations, async/await, isinstance/walrus narrowing, and common stdlib. |
| TypeScript / JavaScript / JSX / TSX | Generics, JSX component dispatch, JSDoc inference for plain JS, .d.ts declarations, module re-exports, and method chaining via return-type propagation across a shared cross-file registry. |
| PHP | Namespaces, traits, late-static-binding, PHPDoc inference, parameter binding, and return-type inference. |
| C# | Global usings, file-scoped namespaces, records (incl. C# 12 primary constructors), LINQ method syntax, async Task<T>/ValueTask<T> unwrap, generic methods, var inference, and common BCL stdlib. |
| Go | Pre-built per-package cross-file registry, generics, embedded structs, interface satisfaction, and package-aware import resolution. |
| C / C++ | Shared cross-language registry: macros, typedef chains, and header-vs-source linking on the C side; templates, namespaces, auto inference, and class-hierarchy method resolution on the C++ side. |
The two-layer pipeline runs a fast syntactic tree-sitter pass for every one of the 159 languages, then a type-aware Hybrid LSP pass on top for the families above. Languages that do not yet have a Hybrid LSP pass fall back to textual resolution, so you always get an answer.
Can it do semantic and natural-language code search?
Yes. Beyond structural and full-text search, codebase-memory-mcp performs semantic
vector search across the whole graph — so you can find code by meaning, not just by
name. A search for send surfaces functions named publish,
emit, or dispatch.
It is powered by nomic-embed-code embeddings compiled directly into the binary (768-dimensional, int8). There is no API key, no Ollama, and no Docker — the embeddings run on-device, so semantic search stays 100% local like everything else. Results combine 11 signals (TF-IDF, API/type/decorator signatures, AST profiles, data flow, Halstead-lite complexity, MinHash, module proximity, and graph diffusion) into one relevance score.
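As a rough illustration of how int8 vectors can still rank results by meaning, here is a minimal sketch. The 4-dimensional toy vectors, the quantization scheme, and the use of cosine similarity are all assumptions made for the example; the project's actual embeddings are 768-dimensional and its scoring combines many more signals.

```python
import math


def quantize_int8(vec):
    """Scale a float vector so its largest magnitude maps to 127, then round to integers."""
    peak = max(abs(x) for x in vec) or 1.0
    return [round(x / peak * 127) for x in vec]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy embeddings (hypothetical values, chosen so 'send' and 'publish' point the same way).
EMBEDDINGS = {
    "send": quantize_int8([0.9, 0.8, 0.1, 0.0]),
    "publish": quantize_int8([0.8, 0.9, 0.2, 0.1]),
    "parse": quantize_int8([0.0, 0.1, 0.9, 0.8]),
}


def rank(query, candidates):
    """Order candidate names by similarity to the query name, best first."""
    q = EMBEDDINGS[query]
    return sorted(candidates, key=lambda name: cosine(q, EMBEDDINGS[name]), reverse=True)


print(rank("send", ["parse", "publish"]))
```

Quantizing to int8 loses some precision but keeps the direction of each vector, which is what cosine similarity measures.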
Meaning-aware edges in the graph
The indexer also writes two kinds of meaning-aware edges, queryable like any other relationship:
| Edge | What it captures |
|---|---|
| SEMANTICALLY_RELATED | Conceptually similar functions whose names and tokens differ (vocabulary-mismatch matches), scored ≥ 0.80, within the same language. |
| SIMILAR_TO | Near-duplicate and copy-pasted code, detected with MinHash + LSH and Jaccard scoring; ideal for finding clones and refactor candidates. |
search_graph(semantic_query=["retry", "backoff", "exponential"])
Semantic and similarity edges are computed in full and moderate index
modes; fast mode skips them for the lowest-latency indexing.
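The MinHash and Jaccard scoring behind SIMILAR_TO edges can be sketched as follows. This is a generic textbook version, not the project's implementation: the whitespace tokenization, the SHA-1 based hash family, and the 256-hash signature size are assumptions, and the LSH banding step that avoids all-pairs comparison is omitted.

```python
import hashlib


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity: size of the intersection over size of the union."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def _hash(token: str, seed: int) -> int:
    """One member of a seeded hash family, built on SHA-1 for determinism."""
    digest = hashlib.sha1(f"{seed}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(tokens: set, num_hashes: int = 256) -> list:
    """For each seeded hash function, keep the minimum hash over a non-empty token set."""
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]


def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """The fraction of matching signature slots estimates the Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


a = set("def retry request with exponential backoff and jitter".split())
b = set("def retry call with exponential backoff and sleep".split())

exact = jaccard(a, b)  # 6 shared tokens out of 10 distinct ones
approx = estimated_jaccard(minhash_signature(a), minhash_signature(b))
print(f"exact={exact:.2f} estimated={approx:.2f}")
```

Signatures have a fixed size regardless of function length, so near-duplicates can be compared cheaply. The estimate converges on the exact Jaccard value as the number of hash functions grows.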
How much does the knowledge graph save?
Against the graph, each common structural question costs a few hundred to roughly 1,500 tokens; file-by-file search costs 45,000 to 120,000 tokens for the same question. Totals across five queries: ~3,400 vs ~412,000 tokens.
| Question type | Graph | File-by-file | Savings |
|---|---|---|---|
| Find function by pattern | ~200 | ~45,000 | 225x |
| Trace call chain (depth 3) | ~800 | ~120,000 | 150x |
| Dead code detection | ~500 | ~85,000 | 170x |
| List all routes | ~400 | ~62,000 | 155x |
| Architecture overview | ~1,500 | ~100,000 | 67x |
| Total | ~3,400 | ~412,000 | ~121x |
A separate evaluation across 31 real-world repositories, described in the preprint, reported 83% answer quality, 10x fewer tokens, and 2.1x fewer tool calls versus file-by-file exploration.
Source: “Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP”, arXiv:2603.27277 — and the project benchmark report.
How fast is it?
Indexing is RAM-first (LZ4 compression, in-memory SQLite, single dump at end), and memory is released to the OS afterward. Cypher queries run in under a millisecond; name search and call-path tracing finish in under 10 ms.
| Operation | Time | Notes |
|---|---|---|
| Linux kernel full index | 3 min | 28M LOC, 75K files → 4.81M nodes, 7.72M edges |
| Django full index | ~6 s | 49K nodes, 196K edges |
| Cypher query | <1 ms | Relationship traversal |
| Name search (regex) | <10 ms | SQL LIKE pre-filtering |
| Trace call path (depth 5) | <10 ms | BFS traversal |
Source: project Performance benchmarks, measured on Apple M3 Pro.
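The depth-limited BFS traversal listed in the table can be sketched in a few lines. The call graph below is a hypothetical in-memory dictionary used for illustration; the real tool runs the traversal against its stored graph, and the function name `trace_callees` is not part of its API.

```python
from collections import deque

# Hypothetical call graph: caller -> list of callees.
CALLS = {
    "handle_request": ["authenticate", "load_user"],
    "authenticate": ["verify_token"],
    "load_user": ["query_db"],
    "verify_token": ["decode_jwt"],
    "query_db": [],
    "decode_jwt": [],
}


def trace_callees(graph, start, max_depth=5):
    """Return {function: depth} for every callee reachable from `start` within `max_depth` hops."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if depths[current] == max_depth:
            continue  # do not expand beyond the depth limit
        for callee in graph.get(current, []):
            if callee not in depths:  # first visit is the shortest path in BFS
                depths[callee] = depths[current] + 1
                queue.append(callee)
    del depths[start]  # report only callees, not the starting function
    return depths


print(trace_callees(CALLS, "handle_request", max_depth=2))
```

Because BFS visits nodes in order of distance, each function is recorded at its shortest call depth, and the visited map also keeps recursive call cycles from looping forever.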
Features
159 languages
Python, Go, JS, TS, TSX, Rust, Java, C++, C#, C, PHP, Ruby, Kotlin, Scala, Zig, Elixir, Haskell, OCaml, Swift, Dart, Lean 4, and many more via vendored tree-sitter grammars compiled into the binary.
Hybrid LSP type resolution
Language-server-grade type inference for Python, TS/JS, PHP, C#, Go, and C/C++ — embedded in the binary, no server process or per-project setup.
Pure C, zero dependencies
A single static C binary for macOS, Linux, and Windows. No Docker, no runtime, no API keys. Download, run install, done.
Call-graph tracing
Trace callers and callees across files and packages with import-aware, type-inferred resolution. BFS traversal up to depth 5.
Dead-code detection
Find functions with zero callers, with smart filtering that excludes entry points like route handlers, main(), and framework decorators.
Cross-service linking
Matches REST routes to HTTP call sites across services with confidence scoring — and detects gRPC, GraphQL, and tRPC services plus pub/sub channels (EMITS/LISTENS_ON for Socket.IO, EventEmitter, and message buses) and async queue dispatch.
Infrastructure-as-code indexing
Dockerfiles, Kubernetes manifests, and Kustomize overlays become graph nodes with cross-references to the resources they configure.
Auto-sync
A background watcher detects changes and re-indexes incrementally. No manual reindex after editing files.
Team-shared graph artifact
Commit one zstd-compressed snapshot (.codebase-memory/graph.db.zst); teammates bootstrap from it and skip the full reindex.
3D graph visualization
An optional UI binary serves an interactive 3D graph at localhost:9749 to explore nodes, edges, and clusters visually.
14 MCP tools
search_graph, trace_path, detect_changes, query_graph (Cypher), get_architecture, get_code_snippet, manage_adr, and 7 more.
Cypher graph queries
Run read-only Cypher-style queries against the graph for multi-hop patterns that structured search can't express.
Semantic code search
Find code by meaning, not just name, via semantic_query vector search — powered by nomic-embed-code embeddings baked into the binary. No API key, fully local.
Clone & similarity detection
SIMILAR_TO edges (MinHash + LSH) surface near-duplicate code; SEMANTICALLY_RELATED edges link conceptually similar functions across the graph.
Cross-repo intelligence
Index multiple repositories in one store and link them with CROSS_* edges. A multi-galaxy 3D layout and cross-repo architecture summary span the whole fleet.
Data-flow tracing
DATA_FLOWS edges follow values from argument to parameter, with field-access chains — trace how data moves, not just who calls whom.
Change-impact analysis
detect_changes maps an uncommitted git diff to affected symbols and their blast radius, with risk classification — see what a change touches before you ship it.
Architecture Decision Records
manage_adr persists architectural decisions alongside the graph, so design rationale survives across sessions and teammates.
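The dead-code detection feature listed above reduces to a simple set operation once call edges exist. The sketch below is illustrative only: the function names, the edge list, and the entry-point set are hypothetical, and the real tool derives entry points from route handlers, main(), and framework decorators rather than from a hand-written set.

```python
# Hypothetical graph data: all known functions and (caller, callee) edges.
FUNCTIONS = {"main", "run_server", "handle_login", "hash_password", "legacy_export", "health_check"}
CALL_EDGES = [
    ("main", "run_server"),
    ("run_server", "handle_login"),
    ("handle_login", "hash_password"),
]
# Entry points are invoked by the runtime or a framework, so zero callers is expected.
ENTRY_POINTS = {"main", "health_check"}


def find_dead_code(functions, call_edges, entry_points):
    """Return functions that nothing calls and that are not known entry points."""
    called = {callee for _, callee in call_edges}
    return sorted(f for f in functions if f not in called and f not in entry_points)


print(find_dead_code(FUNCTIONS, CALL_EDGES, ENTRY_POINTS))
```

Here `health_check` has no callers but is excluded as an entry point, so only `legacy_export` is reported. The quality of the result depends on how accurately calls are resolved, which is where type-aware resolution helps.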
What are the recent releases?
The latest release notes are loaded live from GitHub. Each entry links to its full changelog.
Frequently asked questions
Does codebase-memory-mcp send my code anywhere?
No. All indexing and querying happen 100% locally. There is no embedded LLM and no API key. Release binaries are signed, checksummed, and scanned by 70+ antivirus engines.
Does it support semantic or natural-language code search?
Yes. Alongside structural and full-text search, search_graph's semantic_query
parameter runs vector search over the whole graph, powered by nomic-embed-code embeddings compiled
into the binary — so it finds publish when you search send. No API key, no
Ollama, no Docker; the embeddings run on-device. The indexer also builds SEMANTICALLY_RELATED
edges between similar functions and SIMILAR_TO edges for near-clone detection.
Do I need Docker or a runtime?
No. It is a single static C binary with zero runtime dependencies for macOS (arm64/amd64), Linux (arm64/amd64), and Windows (amd64).
How does it stay up to date as I edit code?
A background watcher detects file changes and re-indexes incrementally — typically a sub-millisecond
no-op when nothing changed. You only run a manual index for the first build or after a large
git pull.
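One common way to make re-indexing incremental is to compare content digests between runs, sketched below. This is an assumption about the general technique, not a description of the project's watcher, which may rely on modification times or OS file notifications instead; the helper names are hypothetical.

```python
import hashlib


def snapshot(files: dict) -> dict:
    """Map each path to a SHA-256 digest of its content (bytes)."""
    return {path: hashlib.sha256(content).hexdigest() for path, content in files.items()}


def diff_snapshots(old: dict, new: dict) -> dict:
    """Classify paths as added, removed, or modified between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


before = snapshot({"app.py": b"print('v1')", "util.py": b"def helper(): pass"})
after = snapshot({"app.py": b"print('v2')", "util.py": b"def helper(): pass", "new.py": b""})

changes = diff_snapshots(before, after)
print(changes)

# Nothing changed: every bucket is empty, so the re-index is a no-op.
unchanged = diff_snapshots(before, before)
print(any(unchanged.values()))
```

Only the files in `added` and `modified` need to be re-parsed, and nodes from `removed` files can be dropped. An unchanged tree produces empty buckets, which corresponds to the no-op case described above.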
Why is there no built-in LLM?
Other code-graph tools embed an LLM to translate natural language into graph queries, which means extra API keys and cost. With MCP, the agent you are already talking to is the query translator — codebase-memory-mcp just builds and serves the graph.
Is it free and open source?
Yes. It is MIT licensed. The full source, signed release binaries, and checksums are on GitHub.