
🤖 Ghostwritten by Claude Opus 4.6 · Fact-checked & edited by GPT 5.4
The fastest way to stand up a multi-agent desk is usually not to build specialists one by one. It is to register the full roster on day one as named, typed, routable stubs, then replace those stubs with real implementations over time. That approach lets teams validate routing, naming, permissions, and lifecycle rules before most agents can do meaningful work.
Used carefully, this pattern separates the contract from the implementation. The contract covers identity, route ownership, response shape, and maturity state. The implementation can arrive later. That makes the desk testable earlier, but it also creates a risk: a poorly designed stub can look like a working agent. The safest version of this pattern is simple: stubs must return explicit not_implemented responses, and they must never receive credentials or write access before review.
This article explains why scaffolding matters, what a good stub looks like, where the pattern fails, and how to keep half-built agents from taking consequential actions.
TL;DR: Defining every planned agent before building each one turns the org chart into a testable contract instead of a planning artifact.
In a multi-agent system, the router does not primarily care whether a specialist is feature-complete. It cares whether the agent name resolves, the route exists, the permissions model is coherent, and the response schema is predictable. Registering the full roster early lets teams test those concerns across the whole desk before implementation catches up.
That has three practical benefits.
A front-door router can send traffic to a stub as easily as to a production agent, as long as the route and response contract are valid. That means integration tests can exercise the full desk topology from the start. When a specialist later receives real logic, it drops into an already-tested slot instead of forcing a routing redesign.
Agent codenames, route paths, and role definitions form a namespace. If two agents overlap too much in route ownership or role boundaries, the confusion shows up in dispatch logic, observability, and operator expectations. Scaffolding the full roster forces those collisions to appear during planning, when they are cheaper to fix.
If every agent carries a maturity field such as stub, development, staging, or production, the desk becomes queryable as a system rather than discussable as a status update. Teams can see how much of a desk is planned, partially implemented, or live without relying on subjective reporting.
TL;DR: A useful stub has a real identity, a real route, and a structured response that clearly says the agent is not implemented.
A stub should be minimal, but it should not be vague. At minimum, it needs a codename, desk assignment, role, maturity marker, route, and a handler that returns a machine-readable response.
Here is an illustrative manifest entry:
agents:
- codename: brainstorm
desk: legal
role: "Creative legal strategy and alternative argument generation"
maturity: stub
route: /legal/brainstorm
handler: stub_handler
credentials: none
write_scopes: []And here is the corresponding handler pattern:
def stub_handler(request):
return {
"status": "not_implemented",
"agent": "brainstorm",
"maturity": "stub",
"message": "Brainstorm is registered but not yet implemented.",
"request_id": request.id,
"routed": True
}The important detail is not the language or framework. It is the response semantics. A stub should not throw a generic exception unless the routing layer itself is broken. It also should not return an empty success payload that implies work was completed.
| Response pattern | What happens | Risk level |
|---|---|---|
| Throw an exception | The router may treat the call as a crash and trigger retries or alerts | High |
Return empty success {} |
The caller may assume the work succeeded | Critical |
Return {"status": "not_implemented", ...} |
The caller can distinguish registration from implementation | Low |
That last pattern is usually the safest default because it is explicit, parseable, and operationally quiet.
TL;DR: The most dangerous stub is one that looks healthy while doing nothing.
The core design rule is straightforward: a stub must be unmistakable. If it returns {"status": "ok", "result": null} or some similarly ambiguous payload, downstream systems may interpret that as a valid no-op rather than an unimplemented capability. Dashboards stay green. Routing appears healthy. The missing work is hidden.
That is why not_implemented should be treated as a first-class response state, not an edge case. Monitoring can filter on it. Tests can assert on it. Operators can distinguish between three different conditions that often get blurred together:
Those are different operational states and should produce different signals.
More broadly, this reflects a sound engineering principle: explicit failure is usually better than silent non-action. A system that cannot perform a requested task should say so in a structured way that both humans and machines can interpret.
TL;DR: A stub can be routable and observable without holding credentials, write scopes, or other consequential permissions.
A stub is still part of the system. It has an identity, a route, and a place in the desk model. That makes it tempting to provision everything else at the same time. That is the wrong default.
The safer pattern is to split scaffolding into two separate layers:
In practice, that means a stub should have no API keys, no database write access, no external service tokens, and no non-empty write scopes. It can receive a request and return a structured stub response. Nothing more.
This separation also makes policy enforcement easier. If maturity is part of the manifest, automated checks can reject any deployment where an agent marked stub has non-empty write scopes or attached credentials. That turns a security convention into an enforceable rule.
TL;DR: Treating identity as the stable layer and implementation as the changing layer makes agent lifecycle management cleaner and safer.
The deeper value of scaffolding is architectural. It defines when an agent starts to exist. In this model, an agent exists when it has a name, a route, and a manifest entry. Logic, credentials, and integrations are later additions, not prerequisites for existence.
That framing has useful consequences:
stub to development to productionThis is especially helpful in systems where desks evolve over time. Teams can preserve route stability and identity continuity even while implementations are rewritten, paused, or temporarily removed.
Because that delays validation of routing, naming, and policy boundaries until the most expensive phase of the work. Registering the full roster early lets teams test the desk shape as a whole, then fill in capabilities incrementally.
By making the response unambiguous. A stub should return a structured payload with a status such as not_implemented, include its maturity, and avoid any success-shaped response that implies work was completed.
As a default, no. If the goal is safe scaffolding, credentials and write scopes should be withheld until the implementation has passed review. A routable identity does not require operational power.
At minimum, observability and policy. Teams often use maturity to drive dashboards, deployment checks, permission rules, and promotion gates. The exact labels can vary, but the field is most useful when it affects system behavior rather than serving as documentation only.
Yes, and that can be a healthy operational move. If an implementation becomes unsafe, unreliable, or obsolete, demotion preserves the route and identity while removing capabilities until the agent is rebuilt.
not_implemented response rather than an exception or an empty success payload.Scaffolding a desk with intentional stubs is a practical way to validate system shape before system capability. It gives teams a stable contract for routing, naming, and lifecycle management while implementations are still in progress.
The pattern only works, however, if the stubs are honest. They must identify themselves clearly, and they must not inherit permissions they do not need. Done that way, a stub is not fake progress. It is a controlled, testable stage in the lifecycle of a real agent.
Discover more content: