The Problem
Agents explore the same codebase, blind, every session
Measured on a 200-file Python project. Results vary by codebase size and structure.
When an AI agent opens your repository, it sees files.
Not structure.
It doesn't know that auth_service.py is load-bearing and utils.py is mostly dead code.
It doesn't know which symbols are imported by 40 other files and which are imported by none.
It doesn't know that the last three bugs all touched the same module.
It finds out by reading. Session after session. That's expensive -- in tokens, in time, and in mistakes made before the picture comes into focus.
roam pre-computes the picture. One index, one query, complete structural understanding.
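As a rough illustration of the kind of fact an agent otherwise rediscovers by reading, the sketch below counts how many files import each module. It is not roam's implementation: it uses Python's standard `ast` module, and the three-file project and the `import_fan_in` helper are hypothetical.

```python
import ast
from collections import Counter


def import_fan_in(sources):
    """Count how many files import each module.

    `sources` maps a file name to its Python source text.
    """
    fan_in = Counter()
    for filename, text in sources.items():
        imported = set()
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.Import):
                imported.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                imported.add(node.module)
        # Count each importing file once per module, however often it imports it.
        fan_in.update(imported)
    return fan_in


# Hypothetical three-file project, inlined so the example is self-contained.
sources = {
    "api.py": "import auth_service\nfrom auth_service import login\n",
    "jobs.py": "from auth_service import login\nimport os\n",
    "utils.py": "import os\n",
}
fan_in = import_fan_in(sources)
```

Here `auth_service` is imported by two files and `utils` by none, which is the load-bearing versus dead-code distinction described above. Computing this once and storing it is cheaper than re-deriving it every session.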
What it knows
roam models your codebase as a graph -- not text, but a network of what calls what, what changes together, and what breaks what.
- Structural graph
- Symbols, call edges, inheritance, dependency layers. PageRank identifies the most important code. Tarjan SCC finds circular dependency clusters. Louvain detects natural module boundaries.
- Git archaeology
- Churn rates, co-change coupling, bus factor, authorship entropy. Which files change together? Which developer owns what? Where are the hidden couplings?
- Architecture metrics
- Health scores (0-100), cognitive complexity per function, tangle ratios, god component detection. CI quality gates with SARIF output for GitHub Code Scanning.
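Two of the graph algorithms named above can be sketched in plain Python. This is a minimal textbook version of Tarjan's SCC algorithm and PageRank by power iteration, not roam's own code; the `calls` graph and its module names are hypothetical, and Louvain is left out for brevity.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.

    `graph` maps every node to a list of the nodes it points to.
    """
    index = {}      # discovery order of each visited node
    lowlink = {}    # smallest index reachable from the node's subtree
    stack, on_stack = [], set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph[node]:
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(sorted(component))

    for node in graph:
        if node not in index:
            visit(node)
    return components


def pagerank(graph, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is shared evenly.
        dangling = sum(rank[node] for node in nodes if not graph[node])
        new_rank = {node: (1 - damping + damping * dangling) / n for node in nodes}
        for node in nodes:
            for succ in graph[node]:
                new_rank[succ] += damping * rank[node] / len(graph[node])
        rank = new_rank
    return rank


# Hypothetical call graph: an edge A -> B means "A calls B".
calls = {
    "api": ["auth", "orders"],
    "jobs": ["auth"],
    "orders": ["billing", "auth"],
    "billing": ["orders"],
    "auth": [],
}
cycles = [c for c in tarjan_scc(calls) if len(c) > 1]
ranks = pagerank(calls)
most_important = max(ranks, key=ranks.get)
```

On this toy graph the only circular cluster is `billing` and `orders`, and `auth` ranks highest because three callers depend on it. A real index would run the same idea over symbols rather than five module names.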
How it stays safe
Deliberately local. No cloud. No API keys. No telemetry. No network calls. Everything runs on your machine in a SQLite database.
- Read-only by default
- The index is a queryable snapshot of your codebase structure. Your source code is never modified unless you explicitly run roam mutate --apply.
- Works air-gapped
- Once installed, no internet access is required. Safe for proprietary codebases, security-sensitive environments, and offline development.
- MIT licensed
- Open source, no vendor lock-in, no data leaving your machine. Inspect the code yourself.
What it does and doesn't do
What it does:
- Maps code structure: symbols, call graphs, dependency layers, git history, architecture metrics
- Resolves cross-file references using name-based heuristics -- works well for most codebases
- Supports 27 languages via tree-sitter, all with full symbol extraction
- Detects 23 algorithm anti-patterns with Big-O improvements and language-aware fix suggestions
What it doesn't do:
- Trace runtime behavior, dynamic dispatch, reflection, or eval'd code -- for that, use a profiler
- Run a full type system -- a small percentage of call edges may be imprecise in dynamic languages
- Replace your linter or security scanner -- roam understands architecture; SonarQube finds bugs; use both
- Work magic on tiny projects -- below 10 files, just read them directly
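To make "name-based heuristics" concrete, here is one possible shape such a resolver could take. It is an assumption-laden sketch, not roam's actual rules: the preference order (same file, then imported file, then unique name), the confidence labels, and all file and symbol names are hypothetical.

```python
def build_symbol_table(definitions):
    """Map each symbol name to the files that define it."""
    table = {}
    for filename, names in definitions.items():
        for name in names:
            table.setdefault(name, []).append(filename)
    return table


def resolve_call(name, caller_file, imports, table):
    """Guess which file defines `name` when it is called from `caller_file`.

    Returns a (file, confidence) pair; file is None when nothing matches.
    """
    candidates = table.get(name, [])
    if not candidates:
        return None, "unresolved"
    if caller_file in candidates:
        return caller_file, "same-file"
    imported = [f for f in candidates if f in imports.get(caller_file, ())]
    if len(imported) == 1:
        return imported[0], "imported"
    if len(candidates) == 1:
        return candidates[0], "unique-name"
    # Several definitions share the name and nothing disambiguates them:
    # pick one deterministically and flag the edge as imprecise.
    return sorted(candidates)[0], "ambiguous"


# Hypothetical project: which file defines which symbols, and who imports whom.
definitions = {
    "auth_service.py": ["login", "validate"],
    "forms.py": ["validate"],
    "api.py": ["handle"],
}
imports = {"api.py": {"auth_service.py"}, "jobs.py": set()}
table = build_symbol_table(definitions)

by_import = resolve_call("validate", "api.py", imports, table)
by_unique_name = resolve_call("login", "jobs.py", imports, table)
imprecise = resolve_call("validate", "jobs.py", imports, table)
```

The last call shows where imprecision comes from without a type system: two files define `validate`, the caller imports neither, so the resolver can only guess and mark the edge as ambiguous.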
In Practice
Five questions, five commands
Most developers use about 10 commands regularly. Here are the five that matter most.
Quick Start
Working in under a minute
Install
pip install roam-code
Python 3.9+. Also works with pipx and uv tool install.
Index
cd your-project && roam init
Creates a local SQLite database. ~5s for 200 files. Incremental after that.
Query
roam understand
Full architectural briefing: tech stack, key abstractions, health score, entry points.
Connect your agent
roam mcp-setup claude-code
Generates MCP config for Claude Code, Cursor, Windsurf, VS Code, Gemini CLI, and more. For full client workflows, see integration tutorials.
When not to use roam
The right tool for the job. roam earns its weight at scale -- not everywhere.
For projects under 10 files, just read the files directly. roam adds overhead without value at this scale.
For raw string matching, ripgrep is faster. roam is structural, not textual.
Use an LSP (pyright, gopls, tsserver). roam is static and offline.
Works with your tools
Via MCP: Claude Code, Cursor, Windsurf, VS Code + Copilot, Cline, Continue.dev, Gemini CLI. Via CLI: any agent that can run a shell command -- Aider, Codex CLI, custom scripts. Via CI: GitHub Actions, GitLab CI, Azure DevOps. JSON + SARIF output.
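For readers unfamiliar with SARIF, the sketch below shows roughly what a CI quality gate has to produce. It builds a minimal SARIF 2.1.0 log and a pass/fail exit code in plain Python. It does not reproduce roam's output: the finding fields, the `god-component` rule id, the tool name, and the threshold of 70 are all hypothetical.

```python
import json


def to_sarif(findings, tool_name="example-gate"):
    """Wrap findings in a minimal SARIF 2.1.0 log."""
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": [
                {
                    "ruleId": f["rule"],
                    "level": f["level"],  # "note", "warning", or "error"
                    "message": {"text": f["message"]},
                    "locations": [{
                        "physicalLocation": {
                            "artifactLocation": {"uri": f["file"]},
                            "region": {"startLine": f["line"]},
                        }
                    }],
                }
                for f in findings
            ],
        }],
    }


def gate(health_score, minimum=70):
    """Return a process exit code: 0 passes the gate, 1 fails it."""
    return 0 if health_score >= minimum else 1


# Hypothetical finding and health score, standing in for real tool output.
findings = [{
    "rule": "god-component",
    "level": "warning",
    "message": "Module has too many responsibilities.",
    "file": "src/auth_service.py",
    "line": 1,
}]
sarif_text = json.dumps(to_sarif(findings), indent=2)
exit_code = gate(health_score=64)
```

A CI job would write `sarif_text` to a file for upload to GitHub Code Scanning and exit with `exit_code`, so a health score below the chosen minimum fails the build.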
27 languages via tree-sitter: Python, TypeScript, JavaScript, Java, Go, Rust, C, C++, C#, PHP, Ruby, Kotlin, and 15 more. All 27 languages have full symbol extraction via native extractors or YAML-defined grammars.
The full surface
roam has 139 CLI commands and 101 MCP tools. That's a lot for a human to memorize. It's not designed for that. It's designed for AI agents, which use commands as a vocabulary -- calling whatever they need, when they need it.
Most developers use about 10 commands directly. The rest are there for your agent. See the full command reference.