
How I reduced my AI context from 30K tokens to 3.5K without losing precision.

The context problem

Every AI conversation needs context. Your projects, your decisions, your preferences. Without it, the AI starts from zero every time.

The typical approach is to paste a context file at the start of each session. The problem: context files grow. You add projects, update information, include more details. Before long you're pasting thousands of tokens' worth of material into every conversation, burning context window on information the AI may not even need for that specific task.

I was maintaining dozens of knowledge files across 4 businesses. Pasting them manually into every conversation. Watching tokens burn on redundant context. Something had to give.

The compilation approach

Instead of pasting raw source files, I built a compiler that reads all my knowledge files and produces a single optimized output.

The process is straightforward. You write in markdown, each topic in its own file: one project, one file; one tool, one file. Flat structure, no folders. Then a Python compiler (1,173 lines, zero external dependencies) validates the format, deduplicates overlapping content, auto-fixes broken references, strips noise, and generates a single BRAIN.md. Any AI reads that one file: always current, verified with a SHA256 checksum.
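
To make that concrete, here is a minimal sketch of what such a compile step can look like. This is not EIDARA's actual code: the knowledge/ folder name, the compile_brain function, the exact →brain: convention, and the checksum footer format are all assumptions for illustration.

```python
# Minimal compile-step sketch, standard library only. Folder name, function
# name, and the checksum footer format are illustrative assumptions.
import hashlib
from pathlib import Path

SOURCE_DIR = Path("knowledge")   # flat folder: one topic per markdown file
OUTPUT = Path("BRAIN.md")

def compile_brain() -> str:
    entries, seen = [], set()
    for path in sorted(SOURCE_DIR.glob("*.md")):       # flat, no recursion
        text = path.read_text(encoding="utf-8")
        # Validate: every file must carry a "→brain:" summary line.
        summary = next((line for line in text.splitlines()
                        if line.startswith("→brain:")), None)
        if summary is None:
            raise ValueError(f"{path.name}: missing →brain: summary line")
        summary = summary.removeprefix("→brain:").strip()
        # Deduplicate: skip files whose summary repeats one already emitted.
        if summary in seen:
            continue
        seen.add(summary)
        # Compress: one line per topic, plus a pointer to the full source file.
        entries.append(f"- {summary} (detail: {path.name})")
    body = "\n".join(entries) + "\n"
    # Checksum: append SHA256 so consumers can detect stale or edited copies.
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    OUTPUT.write_text(body + f"\n<!-- sha256:{digest} -->\n", encoding="utf-8")
    return digest

if __name__ == "__main__":
    print("compiled BRAIN.md, sha256 =", compile_brain())
```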

The numbers

From ~114K characters of scattered markdown (roughly 30K tokens), the compiler produces ~3.5K tokens of compiled output. In token terms that's better than an 8:1 reduction, or roughly 30:1 if you compare raw source characters against output tokens, with near-zero precision loss. No hallucinated summaries, no data dropped.
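
A quick sanity check on those figures, using the common rough heuristic of ~4 characters per token (an approximation, not a real tokenizer count):

```python
# Back-of-the-envelope check, assuming ~4 characters per token for English
# text. A rough heuristic only, not an exact tokenizer.
source_chars = 114_000
compiled_tokens = 3_500

source_tokens = source_chars / 4   # ≈ 28,500, i.e. the ~30K in the title
print(f"tokens:           ~{source_tokens / compiled_tokens:.0f}:1")   # ≈ 8:1
print(f"chars to tokens:  ~{source_chars / compiled_tokens:.0f}:1")    # ≈ 33:1
```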

It achieves this through three layers. First, each source file (a "neuron," in EIDARA's terms) starts with a →brain: summary line; the compiler emits that summary in the output and adds a pointer to the full file for when depth is needed. Second, tool and agent definitions get compressed to single-line entries with detail pointers. Third, sensitive patterns (API keys, file paths, personal IDs) are automatically redacted from the compiled output.
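
The third layer is the easiest to illustrate. Here is a sketch of what a redaction pass can look like; the patterns below are examples I chose, not EIDARA's actual rule set.

```python
# Illustrative redaction pass (layer three). These patterns are examples,
# not EIDARA's actual rules.
import re

REDACTIONS = [
    (re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"), "[REDACTED_API_KEY]"),  # OpenAI-style keys
    (re.compile(r"(?:/Users|/home)/[^\s]+"), "[REDACTED_PATH]"),     # absolute home paths
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_ID]"),         # SSN-shaped IDs
]

def redact(line: str) -> str:
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line

print(redact("key=sk-abcDEF1234567890abcDEF in /home/alex/notes.md"))
# -> key=[REDACTED_API_KEY] in [REDACTED_PATH]
```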

The result

The AI consumes ~3.5K tokens instead of the full source material. It gets the same information, optimized for its context window. No precision lost.
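
If you want to confirm the copy you're pasting is current, the checksum makes that mechanical. This sketch assumes the footer format from the compile example above:

```python
# Consumer-side freshness check before pasting BRAIN.md into a session.
# Assumes the "<!-- sha256:... -->" footer format from the compile sketch.
import hashlib
import re
from pathlib import Path

text = Path("BRAIN.md").read_text(encoding="utf-8")
match = re.search(r"<!-- sha256:([0-9a-f]{64}) -->", text)
if match is None:
    raise ValueError("BRAIN.md has no checksum footer")
body = text[:match.start()].rstrip("\n") + "\n"
if hashlib.sha256(body.encode("utf-8")).hexdigest() != match.group(1):
    raise ValueError("BRAIN.md is stale or was edited after compilation")
```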

And I don't have to do anything. A file watcher detects changes and recompiles automatically in about 8 seconds.
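
A watcher like that doesn't need dependencies either. Here is a minimal polling version; EIDARA's actual watcher may use OS file events instead, and the two-second poll interval is arbitrary:

```python
# Zero-dependency polling watcher sketch. Poll interval and folder name are
# illustrative; a production watcher might use OS file events instead.
import time
from pathlib import Path
from typing import Callable

def snapshot(folder: Path) -> dict[Path, float]:
    """Map each markdown file to its last-modified time."""
    return {p: p.stat().st_mtime for p in folder.glob("*.md")}

def watch(folder: Path, recompile: Callable[[], object],
          poll_seconds: float = 2.0) -> None:
    state = snapshot(folder)
    while True:
        time.sleep(poll_seconds)
        current = snapshot(folder)
        if current != state:     # a file was added, removed, or edited
            state = current
            recompile()          # e.g. the compile_brain() sketched earlier
            print("recompiled BRAIN.md")

# Usage: watch(Path("knowledge"), compile_brain)
```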


EIDARA is the open-source project behind this compiler. GitHub · Website