
We ran a Postgres-backed task queue for inter-agent coordination for about four months. It is now gone. In its place: a folder of markdown files, a frontmatter schema, and a git history. This is the deep-dive on why that swap held up, what the schema looks like in practice, and where it absolutely does not work.
The broader monorepo restart story gets its own post later this month. This one is narrower: just the project-management layer between agents.
The first version of crew coordination was the version anyone would write. A tasks table in Postgres with the obvious columns: id, requester_agent, executor_agent, status, payload jsonb, created_at, updated_at. A worker loop on each agent polled for status = 'pending' AND executor_agent = $self. A small Slack notifier posted state transitions into the agents channel.
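That first loop can be sketched in a few lines. This is a minimal reconstruction, not our actual code: sqlite3 stands in for Postgres so the snippet is self-contained, and only the column names come from the schema above.

```python
import sqlite3

# In-memory sqlite standing in for the real Postgres tasks table.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    requester_agent TEXT,
    executor_agent TEXT,
    status TEXT DEFAULT 'pending',
    payload TEXT)""")
conn.execute(
    """INSERT INTO tasks (requester_agent, executor_agent, payload)
       VALUES ('sparkles', 'wheeljack', '{"branch_name": "fix-123"}')""")

def poll_once(conn, self_agent):
    """One iteration of the worker loop: grab the oldest pending
    task addressed to this agent and mark it in progress."""
    row = conn.execute(
        "SELECT id, payload FROM tasks "
        "WHERE status = 'pending' AND executor_agent = ? "
        "ORDER BY id LIMIT 1", (self_agent,)).fetchone()
    if row is None:
        return None
    task_id, payload = row
    conn.execute("UPDATE tasks SET status = 'in_progress' WHERE id = ?",
                 (task_id,))
    return task_id, payload

task = poll_once(conn, "wheeljack")
```

Note the select-then-update gap: with two workers polling the same table, both can read the row before either marks it, which is exactly the race described below.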
It worked. For one or two agents and a single task shape, it was fine.
Schema migrations every time a new agent joined. Sparkles needed an email_account_id on the payload. Soundwave needed attachments_uri[]. Wheeljack wanted branch_name and pr_url. Each new agent meant either widening the payload jsonb (and watching the implicit contract drift) or adding another columnar bolt-on. We had Alembic running for migrations no human reviewer could meaningfully review.
Race conditions that only happened in production. Two workers occasionally grabbed the same row before the row-level lock landed. We added SELECT … FOR UPDATE SKIP LOCKED. That fixed the duplicate execution but introduced a new failure mode where a crashed worker held an invisible advisory lock that didn't release until session timeout. We were now writing distributed-systems code to coordinate four agents on a Mac mini.
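The claim query ended up looking roughly like this. A sketch of the standard Postgres pattern, not our exact statement; the table and column names are the ones from the schema above.

```sql
-- Atomically claim one pending task. SKIP LOCKED makes concurrent
-- workers pass over rows another transaction has already locked,
-- so no two workers claim the same task.
UPDATE tasks
SET status = 'in_progress', updated_at = now()
WHERE id = (
    SELECT id FROM tasks
    WHERE status = 'pending' AND executor_agent = 'wheeljack'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;
```

The catch is that the row lock lives as long as the claiming session does: a worker that crashes without its connection dying leaves the row locked until the session times out.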
Observability black box. When something went wrong, the answer to "what did Sparkles ask Soundwave to do?" was a Postgres query. The answer to "what was the state of that task three hours ago?" was "nothing, we don't keep history on the row." We added an audit_log table. Now we had two tables to keep in sync.
The killer: prompts couldn't see the work. This was the real one. When Optimus Prime, our orchestration layer, handed a task to Wheeljack, what Wheeljack actually wanted was the briefing: the context, the asks, the constraints, the edge cases the requester already thought through. A row in a database is a terrible briefing. Models read prose, not relational schemas.
We started writing handoffs as markdown files in a shared workspace, organized by date and series. Each handoff is a single file with frontmatter as the schema and a free-text body as the briefing. Git tracks history. The agents channel just gets a "new handoff: " ping with a one-line summary.
There is nothing novel here. Maildir-style file queues for agents, markdown task files like tick-md, and hybrid MCP-served mailboxes have been kicking around for the last year. AutoGen, the OpenAI Agents SDK, and LangGraph all treat the handoff as a first-class primitive. Even the recent New Yorker Altman piece (published April 7) leaned heavily on internal documents and memos rather than ticket-tracker exports, which is a tell. Scratch a coordination problem at any scale and you find a paper trail.
What was novel for us was being honest about which parts of the database we were using as a database and which parts we were abusing as a notebook.
Every handoff file carries the same frontmatter:

```yaml
task_id: 2026-04-11-sparkles-to-wheeljack-001
requester: sparkles
executor: wheeljack
status: open        # open | in_progress | blocked | done | rejected
created_at: 2026-04-11T09:14:00-04:00
updated_at: 2026-04-11T09:14:00-04:00
blockers: []
artifacts: []       # paths or URLs the executor should produce
owner_for_review: optimus-prime
parent_task: null
links: []
```

The body is freeform markdown. Required sections are ## Context, ## Ask, ## Constraints, and ## Done When. The executor appends a ## Notes section as it works and a ## Result section on completion. State changes happen in two places: the executor edits the frontmatter status and writes a one-line entry to a per-agent log file. Git records who did what.
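Reading one of these files is deliberately boring. A stdlib-only sketch of how an agent might split frontmatter from body and check the required sections; real frontmatter is YAML, so a production version would use a YAML parser, but scalar key: value lines are enough to show the shape.

```python
HANDOFF = """\
---
task_id: 2026-04-11-sparkles-to-wheeljack-001
requester: sparkles
executor: wheeljack
status: open
---
## Context
The publisher mis-detects social filters.

## Ask
Fix the detection without touching publish-time behavior.

## Constraints
Feature branch only.

## Done When
A passing test plus a draft PR.
"""

REQUIRED_SECTIONS = ("## Context", "## Ask", "## Constraints", "## Done When")

def parse_handoff(text):
    """Split a handoff into (frontmatter dict, markdown body, missing
    sections). Minimal parser: assumes simple `key: value` frontmatter."""
    _, fm, body = text.split("---\n", 2)
    meta = dict(line.split(": ", 1) for line in fm.strip().splitlines())
    missing = [s for s in REQUIRED_SECTIONS if s not in body]
    return meta, body, missing

meta, body, missing = parse_handoff(HANDOFF)
```

An orchestrator can reject a handoff with a non-empty `missing` list before the executor ever sees it, which is the file-based equivalent of a NOT NULL constraint.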
Sparkles → Soundwave (an email task). Sparkles needs the most recent thread with a specific vendor across all twelve mail accounts. The handoff body says exactly that, names the vendor, names the date floor, and lists the artifact: a single markdown summary written back to a known location. Soundwave reads it, runs the searches, writes the summary, flips status to done, and pings the channel. Sparkles' next loop iteration reads the result file. No row was modified. No migration was needed when, two weeks later, we added a 13th account.
Sparkles → Wheeljack (a code change). Sparkles has a build-log entry that says "the publisher mis-detects social filters." The handoff to Wheeljack carries the symptom, the relevant module name, the constraint that the fix must not change the publish-time behavior, and the Done When: a passing test plus a draft PR. Wheeljack works in a feature branch, writes its ## Notes as it goes, fills in ## Result with the PR URL, flips status to done, and assigns review to Optimus Prime via owner_for_review. The whole loop is legible to a human reading one file.
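The completion step in both examples is the same two writes: flip the frontmatter status and append one line to a per-agent log. A hedged sketch, string-in string-out; on disk, the log is a file and git supplies the history, and the function name here is invented for illustration.

```python
import re
from datetime import datetime, timezone

def flip_status(handoff_text, new_status, agent, log):
    """Rewrite the frontmatter status line and record a one-line
    log entry. `log` stands in for the per-agent log file."""
    updated = re.sub(r"(?m)^status: \S+", f"status: {new_status}",
                     handoff_text, count=1)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    log.append(f"{stamp} {agent}: status -> {new_status}")
    return updated

doc = "---\nstatus: open\n---\n## Result\nDraft PR opened.\n"
log = []
doc = flip_status(doc, "done", "wheeljack", log)
```

Because both writes are plain text, the review trail is just a diff: `git log -p` on the handoff file shows who flipped what, when.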
Three places, plainly:

We model task graphs with parent_task and links, but past about three levels of nesting you start wanting a real graph store. We haven't crossed that threshold; if we do, the answer is probably a small graph DB layered on top of the files, not replacing them.

Three things, all of which are real database workloads, stayed in Postgres. Everything else is answered by a git log across a thousand files.

The win wasn't "files beat databases." It was admitting that what we had been calling a task queue was actually a document store wearing a relational costume. We moved the documents to where documents live, and we kept the database for the rows that are actually rows. Four months in, no migrations, and every handoff is grep-able.
Required body sections (## Context, ## Ask, ## Constraints, ## Done When) make handoffs legible to both humans and prompts. Models read prose, not relational schemas.