Contextful is a local-first MCP context engine for AI coding agents. It indexes a project into searchable files, chunks, symbols, graph edges, evidence packs, and memory records so agents can retrieve compact context at runtime.

Does Contextful upload source code?

No. Version 1 is local-first and does not call hosted embedding APIs, upload source code, edit source files, or install dependencies inside the target workspace.

What is a context pack?

A context pack is a ranked, cited, token-budgeted bundle of files, symbols, graph paths, memory hits, and summaries returned by the context_pack MCP tool.

grep isn't enough

Give coding agents the context they need, efficiently.

Contextful combines a context-management harness, a search engine, and cross-session memory for AI coding agents. It works with Codex, Claude Code, Cursor, Windsurf, GitHub Copilot, VS Code, Cline, Roo Code, Continue, Zed, and any MCP-compatible coding tool. It indexes your workspace once, then returns ranked, cited, token-budgeted evidence packs instead of making agents re-read the same files in every session.

1 tool call: context_pack replaces broad grep, glob, and read-file loops.

Local-first: SQLite, FTS5, symbols, graph tables, and memory stay in your project.

Cited output: Every useful answer can point back to files, symbols, graph paths, or packs.

Session memory: Agents can write lessons only when they include valid evidence references.

context_pack("where are MCP tools registered", budget: 600)

intent: exact
7 cited hits across 2 files
graph paths: mcp-server.ts -> tool handlers -> search.ts
memory hits: 1 evidence-backed lesson
estimated result size: 560 tokens
saved roughly 18 follow-up file reads

AI coding agents do not need a bigger pile of files. They need a context engine.

Large context windows help, but they do not solve retrieval. A real project has source files, docs, migrations, tests, configs, commit history, prior agent discoveries, and architectural relationships. The agent needs the right slice, with citations, inside a token budget.

Repeated context is expensive

Agents often spend the first several minutes reading the same files they read yesterday. Contextful keeps searchable project state so repeated work can start from a compact evidence pack.

Vague queries are common

Real prompts sound like "resources for billing webhooks" or "what owns onboarding state". Contextful classifies intent, searches code and docs, then expands through symbols and graph edges.

Memory needs receipts

Loose "remember this" notes decay into stale advice. Contextful stores lessons in a memory ledger linked to files, symbols, commits, or prior context packs.

A solid search engine for agentic code understanding

Contextful combines practical retrieval techniques that work well in codebases: lexical search, BM25 ranking, identifier subtokens, symbol extraction, doc chunks, query intent classification, graph traversal, and deterministic reranking. The goal is simple: ask once and get the right context back.
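
To make "identifier subtokens" concrete, here is a minimal sketch of one common way to split identifiers so that a query for "http response" can match `parseHTTPResponse`. The function name `subtokens` and the exact splitting rules are assumptions for illustration, not Contextful's documented behavior.

```python
import re

# Hypothetical splitting rules: acronym runs, capitalized words, lowercase
# words, and digit runs each become one subtoken. Separators such as "_",
# "-", "." and "/" are skipped because no alternative matches them.
_SUBTOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def subtokens(identifier: str) -> list[str]:
    """Break an identifier into lowercase subtokens for lexical indexing."""
    return [m.group(0).lower() for m in _SUBTOKEN.finditer(identifier)]


print(subtokens("parseHTTPResponse"))
print(subtokens("billing_webhook_handler"))
```

Indexing the subtokens next to the original identifier keeps exact matches working while letting looser, natural-language queries reach the same symbol.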

Lexical and BM25 search

SQLite FTS5 indexes code, docs, symbols, and memories. Exact identifiers, file paths, config keys, Markdown headings, and natural-language queries all share one fast retrieval path.
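
As a rough illustration of the retrieval path described above, the sketch below builds a tiny FTS5 table and ranks matches with SQLite's built-in `bm25()` function. The table name `chunks`, its columns, and the sample rows are assumptions, not Contextful's actual schema, and the example assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column index: file path plus chunk text.
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
conn.executemany(
    "INSERT INTO chunks (path, body) VALUES (?, ?)",
    [
        ("src/mcp-server.ts", "register MCP tools and tool handlers on the stdio server"),
        ("src/search.ts", "run lexical search and rerank hits"),
        ("docs/setup.md", "how to register the server with an MCP client"),
    ],
)


def search(query: str, limit: int = 5) -> list[tuple[str, float]]:
    """Return (path, score) rows; bm25() yields lower values for better matches."""
    return conn.execute(
        "SELECT path, bm25(chunks) AS score FROM chunks "
        "WHERE chunks MATCH ? ORDER BY score LIMIT ?",
        (query, limit),
    ).fetchall()


for path, score in search("register tools"):
    print(path, round(score, 3))
```

FTS5 treats space-separated terms as an implicit AND, so "register tools" only returns chunks containing both words. Queries containing punctuation such as `-` or `.` would need quoting before being passed to `MATCH`.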

Symbol and graph expansion

The index records files, chunks, symbols, imports, tests, config references, and typed edges such as DEFINES, IMPORTS, TESTS, CONFIGURES, and MENTIONS.
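
A typed edge list like this can be walked with an ordinary breadth-first search. The sketch below is an assumption-laden illustration: the `Edge` tuple, the sample edges, and the `trace_path` function are hypothetical stand-ins, and only the edge type names come from the list above.

```python
from collections import deque
from typing import NamedTuple, Optional


class Edge(NamedTuple):
    src: str
    kind: str  # one of DEFINES, IMPORTS, TESTS, CONFIGURES, MENTIONS
    dst: str


# Hypothetical sample graph.
EDGES = [
    Edge("README.md", "MENTIONS", "src/mcp-server.ts"),
    Edge("src/mcp-server.ts", "IMPORTS", "src/search.ts"),
    Edge("src/search.ts", "DEFINES", "symbol:searchCode"),
    Edge("test/search.test.ts", "TESTS", "src/search.ts"),
]


def trace_path(edges: list[Edge], start: str, goal: str) -> Optional[list[Edge]]:
    """Breadth-first search; returns the shortest typed path or None."""
    outgoing: dict[str, list[Edge]] = {}
    for edge in edges:
        outgoing.setdefault(edge.src, []).append(edge)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for edge in outgoing.get(node, []):
            if edge.dst not in seen:
                seen.add(edge.dst)
                queue.append((edge.dst, path + [edge]))
    return None


path = trace_path(EDGES, "README.md", "symbol:searchCode")
print(" -> ".join(f"[{e.kind}] {e.dst}" for e in path))
```

Keeping the edge type on every hop is what lets the result explain why two pieces are connected, not just that they are.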

Token-budgeted evidence packs

The main output is a compact pack with citations, excerpts, graph paths, memory hits, confidence, and a token estimate. Agents get evidence instead of bulk context.
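
One simple way to fit ranked hits into a token budget is a greedy pass in score order. The sketch below is an illustration only: `Hit`, `estimate_tokens`, `build_pack`, and the four-characters-per-token heuristic are all assumptions, not Contextful's actual packing logic.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    ref: str       # citation, e.g. "file:src/search.ts:10-40"
    excerpt: str
    score: float


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumed): about four characters per token."""
    return max(1, len(text) // 4)


def build_pack(hits: list[Hit], budget: int) -> dict:
    """Greedily keep the highest-scoring hits that still fit the budget."""
    chosen, used = [], 0
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        cost = estimate_tokens(hit.excerpt)
        if used + cost > budget:
            continue  # skip this hit, a smaller one may still fit
        chosen.append(hit)
        used += cost
    return {"citations": [h.ref for h in chosen], "token_estimate": used}


hits = [
    Hit("file:src/mcp-server.ts:1-30", "x" * 400, 0.9),
    Hit("file:src/search.ts:1-200", "y" * 2400, 0.8),
    Hit("file:docs/setup.md:1-40", "z" * 800, 0.5),
]
print(build_pack(hits, budget=400))
```

Skipping an oversized hit instead of stopping lets lower-ranked but smaller excerpts use the remaining budget.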

MCP server, Model Context Protocol, SQLite FTS5, BM25, query intent classification, symbol search, typed graph retrieval, adjacency cache, AST-path fingerprints, Code2Vec-inspired reranking, agent memory ledger

Built as the runtime layer agents actually use

Contextful is not a hosted code search product and not a vector-only RAG demo. It is a small local system: a CLI for setup and search, an indexer for the workspace, a SQLite state store, generated agent instructions, and an MCP server for runtime agent calls.

1. Index

Parse supported files, respect ignore rules, chunk docs, extract symbols, and store hashes.

2. Store

Write files, chunks, symbols, graph nodes, edges, packs, fingerprints, and memory to SQLite.

3. Retrieve

Classify the query, search FTS tables, expand graph context, and deterministically rerank the evidence.

4. Return

Send a cited context pack that fits the requested budget and can be reused across sessions.
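
The index step stores a hash per file. One plausible use, assumed here rather than documented, is to skip unchanged files when the workspace is re-indexed. In this sketch, `content_hash`, `files_to_reindex`, and the choice of SHA-256 are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Stable fingerprint of a file's bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def files_to_reindex(current: dict[str, bytes], stored: dict[str, str]) -> list[str]:
    """Return paths that are new or whose content hash differs from the stored one."""
    return sorted(
        path for path, data in current.items()
        if stored.get(path) != content_hash(data)
    )


stored = {"a.ts": content_hash(b"one"), "b.md": content_hash(b"two")}
current = {"a.ts": b"one", "b.md": b"two changed", "c.ts": b"new file"}
print(files_to_reindex(current, stored))
```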

Efficient context storage without shipping code to a cloud service

V1 uses local SQLite as the default context store. It is boring in the right way: easy to inspect, easy to back up, fast enough for one workspace, and friendly to deterministic tests.

What ships now

SQLite tables for files, chunks, symbols, nodes, edges, node props, edge props, adjacency cache, fingerprints, evidence packs, queries, and memories. FTS5 provides lexical search over chunks, symbols, and memory claims.

Where the design can grow

The schema leaves room for local vectors through sqlite-vec, LanceDB, or HNSW, plus compressed adjacency lists with Roaring bitmaps or CSR arrays for larger graph traversal workloads.
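
For readers unfamiliar with CSR (compressed sparse row) adjacency, the sketch below shows the general idea with plain Python lists. It describes the data layout only and does not reflect any shipped Contextful code. Node IDs are assumed to be dense integers starting at 0.

```python
def build_csr(num_nodes: int, edges: list[tuple[int, int]]) -> tuple[list[int], list[int]]:
    """Pack (src, dst) pairs into two flat arrays: offsets and targets."""
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1            # count out-degree per node
    for i in range(1, num_nodes + 1):
        offsets[i] += offsets[i - 1]     # prefix sum gives slice boundaries
    targets = [0] * len(edges)
    cursor = offsets[:-1]                # next free slot per node (a copy)
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets: list[int], targets: list[int], node: int) -> list[int]:
    """All outgoing neighbors of a node are one contiguous slice."""
    return targets[offsets[node]:offsets[node + 1]]


offsets, targets = build_csr(4, [(0, 1), (0, 2), (2, 1), (1, 3)])
print(offsets, targets, neighbors(offsets, targets, 0))
```

Two flat arrays replace a table of per-node lists, which keeps traversal cache-friendly and makes the structure easy to store as a single blob.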

Small MCP surface, high signal

Agents should not need 20 retrieval tools to understand a repo. Contextful keeps the tool surface focused and makes context_pack the first call for broad questions.

Tool	Use it for	What the agent gets back
`context_pack`	Broad codebase questions, vague prompts, architecture discovery, and first-pass context gathering.	Ranked citations, files, symbols, graph paths, memory hits, confidence, and token estimate.
`search_code`	Lexical, symbol, docs, and memory search with filters for focused lookup.	Scored hits with file refs, titles, excerpts, line ranges, and memory status.
`trace_path`	Relationship traversal between files, symbols, modules, tests, docs, and config references.	Typed graph paths that explain how project pieces connect.
`impact_analysis`	Reverse dependency checks before edits to shared modules or test-sensitive files.	Likely dependents and test impact around a symbol or file.
`why_changed`	Questions about rationale, history, and current evidence around a file or symbol.	Current context plus lightweight git-history signals when available.
`recall_memory`	Cross-session project lessons, decisions, gotchas, and useful agent discoveries.	Evidence-backed memory hits scoped to the current workspace.
`write_lesson`	Saving durable lessons after an agent finds real evidence.	A memory record only if the claim includes valid evidence refs.
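
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `context_pack`. The envelope follows the MCP specification, but the argument names `query` and `budget` are assumptions inferred from the example earlier on this page. Check the tool's published input schema before relying on them.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a single-line JSON message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Argument names below are hypothetical.
message = tool_call_request(
    1, "context_pack", {"query": "where are MCP tools registered", "budget": 600}
)
print(message)
```

In practice an MCP client library handles the initialize handshake and message framing, so agents rarely build these messages by hand.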

Where Contextful helps immediately

The strongest use cases are the moments where an agent would normally spend half the task gathering context before it can write or reason.

Onboard to a large codebase

Ask "how does auth work" or "resources for onboarding flows" and get docs, symbols, tests, and file citations without opening the whole repository.

Plan safer code changes

Run impact analysis before editing a shared module. The agent can inspect reverse dependencies and likely tests before it touches code.
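
Conceptually, a reverse dependency check walks IMPORTS and TESTS edges backwards from the file being edited. The sketch below illustrates that idea under assumed data. The `impact` function and the sample edges are hypothetical, and the real `impact_analysis` tool may weigh or filter results differently.

```python
from collections import deque

# Hypothetical (source, edge type, target) triples.
EDGES = [
    ("src/api/routes.ts", "IMPORTS", "src/billing/service.ts"),
    ("src/billing/service.ts", "IMPORTS", "src/shared/clock.ts"),
    ("test/billing.test.ts", "TESTS", "src/billing/service.ts"),
    ("test/clock.test.ts", "TESTS", "src/shared/clock.ts"),
    ("src/onboarding/state.ts", "IMPORTS", "src/shared/logger.ts"),
]


def impact(edges, target: str) -> tuple[list[str], list[str]]:
    """Return (dependents, tests) that reach target through reversed edges."""
    incoming: dict[str, list[tuple[str, str]]] = {}
    for src, kind, dst in edges:
        if kind in ("IMPORTS", "TESTS"):
            incoming.setdefault(dst, []).append((src, kind))
    dependents, tests = set(), set()
    queue, seen = deque([target]), {target}
    while queue:
        node = queue.popleft()
        for src, kind in incoming.get(node, []):
            (tests if kind == "TESTS" else dependents).add(src)
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return sorted(dependents), sorted(tests)


print(impact(EDGES, "src/shared/clock.ts"))
```

The transitive walk surfaces `test/billing.test.ts` even though it never touches the clock module directly, which is the kind of indirect test impact a plain text search tends to miss.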

Reuse session learning

Store lessons such as "payments tests require the fixture clock" with evidence refs, then recall that lesson in future Codex, Claude Code, or Cursor sessions.

Explain architecture paths

Trace how a route reaches a service, how a config key is used, or where a module is imported through typed graph edges instead of plain text search alone.

Test retrieval before the agent uses it

Run cxf search locally to inspect the same cited evidence pack the agent will receive through MCP.

Keep context local

Use local retrieval for private repositories, internal docs, and proprietary code. V1 does not upload source or call external embedding APIs.

How it fits with the tools developers already use

Contextful is meant to sit beside your editor, coding agent, and normal search tools. It gives the agent a prepared retrieval layer so the agent can spend less time hunting and more time reasoning.

Approach	Good at	Contextful adds
grep, ripgrep, file reads	Exact search, quick inspection, and human debugging.	Intent classification, ranked packs, citations, graph edges, memory, and token budgets.
IDE symbol search	Finding definitions and references inside a developer workflow.	An MCP-native interface that agents can call directly from Codex, Claude Code, Cursor, and other clients.
Generic vector RAG	Semantic matching over prose-heavy documents.	Code-aware lexical search, FTS5/BM25, symbols, typed graph traversal, and evidence-required memory.
Long context windows	Holding more material after it has already been selected.	A retrieval step that decides what belongs in the window before tokens are spent.

Compatible with MCP-aware coding tools

Contextful runs as a standard MCP stdio server, so the same local context engine can support multiple agentic coding workflows.

Codex, Claude Code, Claude Desktop, Cursor, Windsurf, GitHub Copilot, VS Code, Cline, Roo Code, Continue, Zed, any MCP stdio client

Install Contextful

The npm package ships the cxf CLI and the MCP server. Start with init, which indexes the workspace and writes .contextful/AGENT_INSTRUCTIONS.md for your coding agent.

npx @inferensys/contextful init --workspace .
npx @inferensys/contextful search "where is auth handled" --workspace . --budget 2000
npx @inferensys/contextful memory add --workspace . --claim "..." --evidence file:src/example.ts:1-20

codex mcp add contextful -- \
  npx -y @inferensys/contextful server

# Generic MCP stdio command:
npx -y @inferensys/contextful server

Primary package: @inferensys/contextful. Primary CLI: cxf. MCP name: io.github.Inferensys/contextful.

FAQ

What is Contextful?

Contextful is a local-first MCP context engine for coding agents. It builds a searchable project index and returns compact evidence packs with citations, graph paths, symbols, and memory hits.

Is this only for Claude Code?

No. Contextful works through MCP stdio, so it can be used by Codex, Claude Code, Cursor, Windsurf, GitHub Copilot in VS Code, Cline, Roo Code, Continue, Zed, and other compatible clients.

Does Contextful use embeddings?

V1 does not call hosted embedding APIs or upload source code. The current retrieval layer uses local lexical search, BM25, symbols, graph edges, and deterministic reranking. Local vectors are a natural extension point, not a requirement.

Why require evidence for memory?

Agent memory is useful only when it can be checked. Contextful rejects memory writes without valid refs from files, symbols, commits, or prior evidence packs, and stale evidence can mark a lesson stale later.
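
A minimal sketch of the evidence gate, assuming file refs shaped like the `file:src/example.ts:1-20` form shown in the install commands. The names `validate_file_ref` and `write_lesson`, the in-memory `indexed_files` map, and the ledger list are hypothetical. Symbol, commit, and pack refs are left out for brevity.

```python
import re

FILE_REF = re.compile(r"^file:(?P<path>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def validate_file_ref(ref: str, indexed_files: dict[str, int]) -> bool:
    """Accept a ref only if the file is indexed and the line range exists."""
    match = FILE_REF.match(ref)
    if match is None:
        return False
    start, end = int(match["start"]), int(match["end"])
    line_count = indexed_files.get(match["path"])
    return line_count is not None and 1 <= start <= end <= line_count


def write_lesson(claim: str, evidence: list[str],
                 indexed_files: dict[str, int], ledger: list[dict]) -> bool:
    """Append the lesson only when every evidence ref validates."""
    if not evidence or not all(validate_file_ref(r, indexed_files) for r in evidence):
        return False
    ledger.append({"claim": claim, "evidence": list(evidence)})
    return True


indexed = {"src/example.ts": 40}  # path -> line count (assumed index shape)
ledger: list[dict] = []
print(write_lesson("payments tests require the fixture clock",
                   ["file:src/example.ts:1-20"], indexed, ledger))
print(write_lesson("unsupported claim", [], indexed, ledger))
```

Re-running the same validation later against a fresh index is one way a lesson could be flagged stale once its cited lines no longer exist.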

How should agents use Contextful?

Agents should call context_pack before broad file exploration. Humans can run cxf search to preview the same kind of ready-to-use bundle: summary, citations, files, symbols, graph paths, memory hits, confidence, and token estimate.

Can I inspect the stored data?

Yes. Contextful stores local state in the project under .contextful/, with SQLite tables for files, chunks, symbols, graph nodes, graph edges, evidence packs, queries, and memories.