Home/Blog/Why I built Metatron
Origin

Why I built Metatron

For a full day, nobody could pay me. Not because the servers were down — they were green the whole time — but because of a one-line change that no reasonable review would have flagged. That bug is why I stopped piling everything into ever-longer CLAUDE.md instructions and built a real memory for my coding agents instead.

A day of silence

One of my side projects is AI Collection, a directory where makers pay a small fee to list their tools. One of its paid funnels works like this: sign in, paste a URL, get an AI-drafted listing, review it, hit Publish. When that funnel breaks, the revenue behind it stops cold.

On a Saturday in May, the funnel went dark. Users could still sign in, paste a URL, and watch the AI draft their listing — but the final review step never became interactive. The Publish button just sat there. Completed submissions dropped from a dozen-odd a day to zero, and stayed there.

The cruel part was the silence. Nothing threw on the server. Nothing landed in the errors table. The deploy that broke it was green; the build passed type-checks, lint, and a clean server render. The only evidence that anything was wrong was the absence of success events on a chart that doesn't scream when it flatlines. I caught it the next day — a maker emailed to ask if the site was broken right as I was digging into why the numbers looked off.

It wasn't a bug. It was a forgotten decision.

The technical cause was a delight of modern bundling. The finish step of the submit flow transitively imported sharp, a native, server-only image library. sharp's module init reads process.*, which doesn't exist in a browser. For a long time this was harmless: the bundler tree-shook the sharp-using helpers out of the client chunk because they were private to the module. Submissions worked, invisibly held up by a tree-shaking pass nobody knew was load-bearing.

Then a perfectly sensible change — made in a routine Claude session (Opus 4.8) — exported two of those helpers so a new API route could reuse them. Exporting them defeated the tree-shaking. sharp got pulled into the browser bundle, its init ran process.platform, threw ReferenceError: process is not defined, and the component never hydrated. The fix, once I found it, was a single line — make the import lazy so it can never reach the client:

// before — static import; tree-shaking was the only thing keeping it client-safe
import sharp from "sharp";

// after — server-only, in its own async chunk, no matter what the module exports
const sharp = (await import("sharp")).default;

Here's the thing that stuck with me. The session that exported those helpers did nothing wrong by any normal standard. Sharing a function so another route can reuse it is good instinct. The only way to know it was dangerous was to already hold a specific, hard-won decision in your head: native server-only dependencies must be structurally isolated — don't let tree-shaking be the thing keeping them out of the client.

That decision existed. It just lived nowhere you could reach it at the moment you needed it. I wrote it down afterward, of course — it's action item #4 in the postmortem: "client components must not transitively import modules that statically import native/server-only deps." But a convention buried in a postmortem that nobody reads before touching that file is not a guardrail. It's a eulogy.

Every session starts from zero

Around the same time, I noticed I was living a smaller version of this every day with my coding agents. I'd open a fresh session and spend the first few minutes re-explaining the same things: why we wrap server-only deps this way, why we prefer this pattern over that one, which approach we already tried and abandoned. The agent is brilliant and completely amnesiac. Every conversation, I pay the same tax in tokens and in patience to rebuild context it had yesterday.

…and when I don't, it's confidently wrong

The flip side is worse. When I don't re-explain — because I'm moving fast, or I forgot this repo had that landmine — the agent does the reasonable thing. It exports the helper. It reaches for the obvious pattern. And it reintroduces exactly the class of mistake I already paid for — because nothing it learned in the last session, and nothing I learned in the last incident, is present this time.

Two failures, one root cause: the decision that would have prevented the mistake was real, but it wasn't anywhere the person — or the agent — about to act could read it first.

There was no layer for this

The knowledge isn't missing. It's in senior engineers' heads, in Slack threads, in PR review comments, in postmortems like the one above. The problem is that none of those are reachable by an agent at the moment it's writing code. We have a layer for source. A layer for dependencies. A layer for CI. We have nothing that sits between a codebase's decisions and the agents acting on them — no place that says "before you change this, here's what we already know."

The decisions were never the problem. Reaching them in time was.

The new hire with no one to ask

In the pre-agent era, this had a social fix. A new engineer who didn't know the codebase wouldn't just start committing — they'd wander over to someone senior first and ask: how do we usually do this? anything I should watch out for here? The expensive mistakes got caught in a two-minute conversation, before a line was written.

A fresh coding-agent session is exactly that new hire. Brilliant, fast, widely read — and completely new to your code, every single time. It doesn't matter whether the new hire is a junior or a principal engineer — or, in this case, a Claude session running Opus 4.8 that can hold your whole module in its head. If it doesn't know the hard-won convention, it can still ship something subtly wrong, or quietly against a best practice the team bled for in some earlier incident.

The difference is that the agent has had no one to ask. So give it someone. Except it doesn't walk to a desk or write a Slack message — it sends an MCP request to a teammate who has seen it all, and gets the answer back in the instant before it acts.

What Metatron is

Metatron is that teammate. It captures your team's real implementation decisions as structured records and serves them to coding agents over MCP, so an agent writing code can ask "what do we know about this?" and get the answer before it acts — not after a customer emails.

  • Decisions, not docs. Each record is a specific call — the pattern to prefer, the approach to avoid, the edge case that bites — with the rationale attached. The sharp rule would be one of them.
  • Human-gated canonical knowledge. Anything an agent relies on as ground truth is reviewed and promoted by a human first. The system proposes; people decide what becomes canonical.
  • It closes the loop. When an agent hits a gap the knowledge base doesn't cover, it reports back — so the next agent, and the next engineer, knows more than the last one did.

The demo walks through the whole thing end to end if you'd rather watch than read:

Why it's self-hosted and open-source

Your conventions are some of the most sensitive things you own — they describe how your system actually works and where its soft spots are. So Metatron runs on your own infrastructure, stores everything in a local SQLite database, and is MIT-licensed. No decision about your codebase leaves your machines, and you can read every line that handles it.


That one-day outage cost me real revenue and a few apologetic emails. But the more I sat with it, the less it looked like a bundling bug and the more it looked like a category error: we keep asking agents to make good decisions while giving them no way to know what we already decided. In a world where most code is about to be written by agents, the decision that prevents the next silent outage shouldn't live in someone's memory. It should be one query away.

That's the layer I'm building.