Showcase

Mapped in seconds, on a laptop

We pointed IngestMD at six of the most-used open-source codebases. Six languages, 67,301 files, each turned into a named map in single-digit seconds, with nothing leaving the machine.

6 repositories
67,301 source files
6 languages
353 named areas
0 name collisions

Big or small, all in single-digit seconds

Source files per repository, with the first run measured on a laptop. The biggest, Node.js at 33,429 files, finished fastest because a tree that large is grouped structurally; the smaller ones lean on semantic clustering, which costs a little more. Either way, none took longer than about nine seconds.

Source files mapped, and the first-run time

Node.js: 33,429 files, ~3s
Kubernetes: 17,212 files, ~6s
VS Code: 11,452 files, ~9s
PostgreSQL: 2,574 files, ~4s
Zulip: 2,497 files, ~9s
Clean Architecture: 137 files, ~9s

Real IngestMD area-discovery runs, on a laptop, fully on-device. Times are first-run approximations.

Name and describe areas with your own model

The structural map is the start. Point a local model at an area and it writes a readable name and a short description, on your machine, with no cloud. Here is a real one from Clean Architecture, named by a 6 GB model running on the laptop:

Structural name: CreateTodoItem
Named by your local model: Command Handling and Persistence

This area contains the core logic for performing CRUD operations on todo lists and individual items. It implements the Command pattern, defining specific requests (CreateTodoItemCommand) and their handlers. Start with the command files in src/Application/TodoItems/Commands. All persistence relies on the injected IApplicationDbContext interface.

Honest note: this works best on focused areas, where the whole area fits the model's view. On a very large area, say a 900-file backend, the model reads a sample and describes that slice, so reach for it on coherent areas rather than whole subtrees. It is opt-in, and the model is one you run yourself.

After the map

A map is step one. Here is what you do with it.

Focused context for your assistant

Describe a task and IngestMD assembles just the files it needs, semantically ranked and packed. On a 2,800-file repo, one real task pulled 5 files with the right one on top.

Fits your model's window

It packs context to a token budget you set, so the result fits whatever model you run, where the full repo would overflow a context window dozens of times over.
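To make the idea of packing to a budget concrete, here is a minimal sketch of one common approach: greedily keep the highest-ranked files that still fit. It is an illustration only, not IngestMD's implementation. The `RankedFile` type, the `score` field, and the four-characters-per-token estimate are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RankedFile:
    path: str
    score: float  # hypothetical relevance score, higher means more relevant
    text: str


def estimate_tokens(text: str) -> int:
    # Rough assumption: about four characters per token.
    return max(1, len(text) // 4)


def pack_to_budget(files: list[RankedFile], budget: int) -> list[RankedFile]:
    """Greedily keep the highest-scoring files that still fit the budget."""
    packed = []
    used = 0
    for f in sorted(files, key=lambda f: f.score, reverse=True):
        cost = estimate_tokens(f.text)
        if used + cost <= budget:
            packed.append(f)
            used += cost
    return packed
```

A greedy pass like this skips a file that is too large and keeps looking for smaller ones, so a single oversized file does not block everything ranked below it.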

Review the change, not the repo

From your git changes it builds a review bundle: the files you touched, their tests, and what depends on them, behind a short review prompt.
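As a rough sketch of what such a bundle could contain, the function below collects the changed files, their likely tests, and the files that import them. It is not IngestMD's code. The `imports` map and the `test_<name>` naming convention are assumptions for the example, and the list of changed paths is assumed to come from something like `git diff --name-only`.

```python
from pathlib import PurePosixPath


def build_review_bundle(changed, imports, all_files):
    """Return changed files, their likely tests, and direct dependents.

    changed:   paths touched by the change (e.g. from `git diff --name-only`)
    imports:   hypothetical map of file -> set of files it imports
    all_files: every path in the repository
    """
    changed = set(changed)
    # Assumed convention: the test for foo.py is named test_foo.py.
    test_names = {"test_" + PurePosixPath(p).name for p in changed}
    tests = {p for p in all_files if PurePosixPath(p).name in test_names}
    # A dependent is any unchanged file that imports a changed one.
    dependents = {
        f for f, deps in imports.items()
        if f not in changed and deps & changed
    }
    return {
        "changed": sorted(changed),
        "tests": sorted(tests - changed),
        "dependents": sorted(dependents - tests),
    }
```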

Curate and reuse

Drop areas and files into a collection, then export it as a folder, a zip, or a single file with an index. Reusable context, saved by name.

Not just code

It turns PDFs, Word docs, spreadsheets, and Confluence pages into the same clean Markdown, so the workflow covers documents and confidential folders too.

Scrubbed, reproducible, yours

Secrets are redacted before anything is bundled, the output is the same every run, and it works in your editor, the terminal, or as an MCP server your assistant talks to.
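For readers curious what redaction before bundling can look like, here is a minimal, deterministic sketch built on two regular expressions. The patterns are illustrative assumptions, not the rules IngestMD applies, and a real scanner would cover many more formats.

```python
import re

# Illustrative patterns only (assumptions for this example).
AWS_KEY = re.compile(r"AKIA[0-9A-Z]{16}")  # shape of an AWS access key ID
ASSIGNMENT = re.compile(r"(?i)\b(password|secret|token)(\s*[:=]\s*)(\S+)")


def redact(text: str) -> str:
    """Replace matched secrets with a fixed marker; same input, same output."""
    text = AWS_KEY.sub("[REDACTED]", text)
    # Keep the key name and separator, replace only the value.
    return ASSIGNMENT.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)
```

Because the function uses only fixed patterns and a fixed marker, running it twice on the same input yields the same text.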

Why this matters

One tool, every stack

C#, Go, JavaScript, C, TypeScript, Python. The same command read all of them. IngestMD is not a language-specific tool.

It does not choke at scale

Node.js is 33,429 files. Kubernetes is 17,212. Both mapped on a laptop in single-digit seconds, no cloud anywhere.

Nothing left the machine

Discovery runs on-device. We measured zero non-loopback network connections during it. Your code is none of our business.
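If you want to repeat a check like this yourself, one way is to capture the remote addresses a process connected to and filter out loopback ones. The sketch below covers only the filtering step with Python's standard `ipaddress` module. How the addresses are captured is left open, and the function name is hypothetical; this is not the measurement harness used for the figures above.

```python
import ipaddress


def non_loopback(remote_addrs):
    """Return the addresses that are not loopback (127.0.0.0/8 or ::1)."""
    return [
        addr for addr in remote_addrs
        if not ipaddress.ip_address(addr).is_loopback
    ]
```

An empty result means every observed connection stayed on the machine.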

Honest note: on the largest monorepos the biggest area can stay coarse, and on flat trees the area names follow the directories. A local model renames them in seconds. Each repository page shows exactly what IngestMD produced, warts and all.

Run it on your own codebase

Free to download, with a 3-day full Pro trial. No account, no card, nothing leaves your machine.

Download IngestMD