Stubs with Intent: Scaffolding a Desk Before You Build It

🤖 Ghostwritten by Claude Opus 4.6 · Fact-checked & edited by GPT 5.4

The fastest way to stand up a multi-agent desk is usually not to build specialists one by one. It is to register the full roster on day one as named, typed, routable stubs, then replace those stubs with real implementations over time. That approach lets teams validate routing, naming, permissions, and lifecycle rules before most agents can do meaningful work.

Used carefully, this pattern separates the contract from the implementation. The contract covers identity, route ownership, response shape, and maturity state. The implementation can arrive later. That makes the desk testable earlier, but it also creates a risk: a poorly designed stub can look like a working agent. The safest version of this pattern is simple: stubs must return explicit not_implemented responses, and they must never receive credentials or write access before review.

This article explains why scaffolding matters, what a good stub looks like, where the pattern fails, and how to keep half-built agents from taking consequential actions.

Why Scaffold the Full Roster Up Front

TL;DR: Defining every planned agent before building each one turns the org chart into a testable contract instead of a planning artifact.

In a multi-agent system, the router does not primarily care whether a specialist is feature-complete. It cares whether the agent name resolves, the route exists, the permissions model is coherent, and the response schema is predictable. Registering the full roster early lets teams test those concerns across the whole desk before implementation catches up.

That has three practical benefits.

The routing contract gets validated early

A front-door router can send traffic to a stub as easily as to a production agent, as long as the route and response contract are valid. That means integration tests can exercise the full desk topology from the start. When a specialist later receives real logic, it drops into an already-tested slot instead of forcing a routing redesign.

Naming conflicts surface before the build sprint

Agent codenames, route paths, and role definitions form a namespace. If two agents overlap too much in route ownership or role boundaries, the confusion shows up in dispatch logic, observability, and operator expectations. Scaffolding the full roster forces those collisions to appear during planning, when they are cheaper to fix.

Maturity becomes observable

If every agent carries a maturity field such as stub, development, staging, or production, the desk becomes queryable as a system rather than discussable as a status update. Teams can see how much of a desk is planned, partially implemented, or live without relying on subjective reporting.

What a Stub Should Look Like

TL;DR: A useful stub has a real identity, a real route, and a structured response that clearly says the agent is not implemented.

A stub should be minimal, but it should not be vague. At minimum, it needs a codename, desk assignment, role, maturity marker, route, and a handler that returns a machine-readable response.

Here is an illustrative manifest entry:

agents:
  - codename: brainstorm
    desk: legal
    role: "Creative legal strategy and alternative argument generation"
    maturity: stub
    route: /legal/brainstorm
    handler: stub_handler
    credentials: none
    write_scopes: []

And here is the corresponding handler pattern:

def stub_handler(request):
    return {
        "status": "not_implemented",
        "agent": "brainstorm",
        "maturity": "stub",
        "message": "Brainstorm is registered but not yet implemented.",
        "request_id": request.id,
        "routed": True
    }

The important detail is not the language or framework. It is the response semantics. A stub should not throw a generic exception unless the routing layer itself is broken. It also should not return an empty success payload that implies work was completed.

Response pattern	What happens	Risk level
Throw an exception	The router may treat the call as a crash and trigger retries or alerts	High
Return empty success `{}`	The caller may assume the work succeeded	Critical
Return `{"status": "not_implemented", ...}`	The caller can distinguish registration from implementation	Low

That last pattern is usually the safest default because it is explicit, parseable, and operationally quiet.

The Main Failure Mode: Silent Success

TL;DR: The most dangerous stub is one that looks healthy while doing nothing.

The core design rule is straightforward: a stub must be unmistakable. If it returns {"status": "ok", "result": null} or some similarly ambiguous payload, downstream systems may interpret that as a valid no-op rather than an unimplemented capability. Dashboards stay green. Routing appears healthy. The missing work is hidden.

That is why not_implemented should be treated as a first-class response state, not an edge case. Monitoring can filter on it. Tests can assert on it. Operators can distinguish between three different conditions that often get blurred together:

the route exists and the agent is intentionally a stub
the route exists and the agent failed unexpectedly
the route does not exist at all

Those are different operational states and should produce different signals.

More broadly, this reflects a sound engineering principle: explicit failure is usually better than silent non-action. A system that cannot perform a requested task should say so in a structured way that both humans and machines can interpret.

Security: Give Stubs Identity, Not Power

TL;DR: A stub can be routable and observable without holding credentials, write scopes, or other consequential permissions.

A stub is still part of the system. It has an identity, a route, and a place in the desk model. That makes it tempting to provision everything else at the same time. That is the wrong default.

The safer pattern is to split scaffolding into two separate layers:

Identity and routing: provisioned immediately so the agent can be addressed, tested, and observed
Credentials and write scopes: withheld until implementation review confirms the agent needs them and can use them safely

In practice, that means a stub should have no API keys, no database write access, no external service tokens, and no non-empty write scopes. It can receive a request and return a structured stub response. Nothing more.

This separation also makes policy enforcement easier. If maturity is part of the manifest, automated checks can reject any deployment where an agent marked stub has non-empty write scopes or attached credentials. That turns a security convention into an enforceable rule.

Why This Matters Architecturally

TL;DR: Treating identity as the stable layer and implementation as the changing layer makes agent lifecycle management cleaner and safer.

The deeper value of scaffolding is architectural. It defines when an agent starts to exist. In this model, an agent exists when it has a name, a route, and a manifest entry. Logic, credentials, and integrations are later additions, not prerequisites for existence.

That framing has useful consequences:

lifecycle transitions become clearer because the same agent moves from stub to development to production
rollback becomes simpler because an agent can be demoted to a safer maturity state without losing its identity
policy checks become more consistent because permissions can be tied to maturity rather than ad hoc implementation details

This is especially helpful in systems where desks evolve over time. Teams can preserve route stability and identity continuity even while implementations are rewritten, paused, or temporarily removed.

Frequently Asked Questions

Q: Why not register agents only when they are ready to be built?

Because that delays validation of routing, naming, and policy boundaries until the most expensive phase of the work. Registering the full roster early lets teams test the desk shape as a whole, then fill in capabilities incrementally.

Q: How do you stop a stub from being mistaken for a working agent?

By making the response unambiguous. A stub should return a structured payload with a status such as not_implemented, include its maturity, and avoid any success-shaped response that implies work was completed.

Q: Should a stub ever have credentials?

As a default, no. If the goal is safe scaffolding, credentials and write scopes should be withheld until the implementation has passed review. A routable identity does not require operational power.

Q: What should the maturity field control?

At minimum, observability and policy. Teams often use maturity to drive dashboards, deployment checks, permission rules, and promotion gates. The exact labels can vary, but the field is most useful when it affects system behavior rather than serving as documentation only.

Q: Can a production agent be demoted back to a stub?

Yes, and that can be a healthy operational move. If an implementation becomes unsafe, unreliable, or obsolete, demotion preserves the route and identity while removing capabilities until the agent is rebuilt.

Key Takeaways

Scaffold the full roster early so routing, naming, and policy boundaries can be tested before implementations are complete.
Make stubs explicit with a structured not_implemented response rather than an exception or an empty success payload.
Separate identity from power by withholding credentials and write scopes from stub agents.
Use maturity as an enforceable system property rather than a planning label.
Prefer explicit non-implementation over silent no-ops so operators and downstream systems can tell what is actually happening.

Conclusion

Scaffolding a desk with intentional stubs is a practical way to validate system shape before system capability. It gives teams a stable contract for routing, naming, and lifecycle management while implementations are still in progress.

The pattern only works, however, if the stubs are honest. They must identify themselves clearly, and they must not inherit permissions they do not need. Done that way, a stub is not fake progress. It is a controlled, testable stage in the lifecycle of a real agent.