The Model Is Not the Agent: Identity ADR

🤖 Ghostwritten by GPT 5.4 · Fact-checked & edited by Claude Opus 4.6

An architecture decision record can sound like paperwork until it removes confusion from every technical decision that follows. That is what happened with ADR 0001, written in early May: an agent's identity is the durable bundle of role, system prompt, tools, permissions, and cadence. The model underneath is not the agent. It is one swappable runtime attribute.

That distinction turned out to be foundational for fleet design. It made model rotation possible without redefining who an agent is. It made it easier to assign one model family to manager agents and another to specialists. It also exposed redundancy hiding in plain sight: if two agents share the same role, prompt, tools, permissions, and schedule, they are not two agents with different identities. They are one agent with multiple possible inference engines.

The practical result was immediate. Two redundant pairs were collapsed into one each: Wheeljack and Perceptor became Wheeljack, and Hot Rod and Springer became Hot Rod. The development fleet moved from 11 agents to 9. More importantly, the team gained a cleaner mental model: reason about agents as roles with boundaries, not as model deployments with personalities.

Why ADR 0001 Mattered

TL;DR: Defining agent identity separately from the model made the system easier to reason about, safer to change, and simpler to scale.

Before this ADR, there was a subtle but expensive ambiguity in the design language. Teams would casually say things like "the Claude agent" or "the GPT reviewer," which sounds harmless until architecture starts to calcify around provider choice. Once that happens, every model swap feels like a migration, every benchmark result feels like a re-org, and every agent starts to inherit its identity from a vendor runtime rather than from its actual job.

ADR 0001 cut through that ambiguity with a simple rule: the model is not the agent. The agent is the durable operational contract. That contract includes:

The role it plays in the fleet
The system prompt that defines its behavior
The tools it can call
The permissions that bound its authority
The cadence on which it runs or is invoked

The model sits underneath that contract as an implementation detail. Important, yes. Identity-defining, no.

That may sound like semantics, but semantics become architecture surprisingly fast. In practice, a model-agnostic definition changes how a team evaluates almost everything:

A provider swap becomes runtime tuning, not identity change
A benchmark regression becomes a quality issue, not a fleet redesign
A new specialist model becomes a better engine for an existing role, not a new agent category
A security review stays anchored to permissions as identity, rather than drifting with provider capabilities

This framing also aligns with how modern AI systems are actually operated. Different tasks benefit from different model characteristics. Some workloads reward long-context reasoning. Others need low latency, predictable formatting, or lower cost for repeated execution. Treating the fleet as model deployments makes those trade-offs harder to manage. Treating the fleet as role definitions makes them manageable.

A related industry pattern reinforces the point. The National Institute of Standards and Technology's AI Risk Management Framework organizes its guidance around system-level functions (Govern, Map, Measure, and Manage) rather than around model branding. Likewise, the OWASP Top 10 for LLM Applications focuses on tool access, data exposure, and authorization boundaries, not just base model choice. The lesson is consistent: operational identity lives in behavior and access, not only in inference.

What Counts as Agent Identity

TL;DR: Agent identity is the stable definition of role, prompt, tools, permissions, and cadence; the model field is intentionally swappable.

The easiest way to make this concrete is to show the shape of an agent definition. The exact production format is less important than the design intent: most fields define identity, while one field defines the current engine.

agent:
  name: "Wheeljack"
  role: "development workflow orchestrator"
  system_prompt: "prompts/wheeljack-system.md"
  # Tools: what the agent can call
  tools:
    - "github"
    - "issue-tracker"
    - "docs-search"
    - "slack"
  # Permissions: how far each tool call may go (keys match the tool names above)
  permissions:
    github:
      repos: ["repo-a", "repo-b"]
      access: "read"
    issue-tracker:
      projects: ["engineering"]
      access: "write"
    docs-search:
      collections: ["internal-docs"]
      access: "read"
    slack:
      channels: ["#dev-ops", "#build-status"]
      access: "post"
  cadence:
    type: "scheduled"
    schedule: "weekday-mornings"
  model:
    provider: "anthropic"
    name: "YOUR_MODEL_NAME"
    override_allowed: true

The point of this structure is not YAML syntax. The point is separation of concerns.
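The same separation can be expressed in code. The following is a minimal sketch, not the production format: the class names `AgentIdentity`, `ModelConfig`, and `Agent`, the `with_model` helper, and the flattened permission map are all assumptions made for illustration. The idea it shows is that swapping the engine produces an agent whose identity compares equal to the original.

```python
from dataclasses import dataclass, replace
from typing import Mapping, Tuple


@dataclass(frozen=True)
class ModelConfig:
    """The current engine: swappable without touching identity."""
    provider: str
    name: str
    override_allowed: bool = True


@dataclass(frozen=True)
class AgentIdentity:
    """The durable contract: the five fields named by the ADR."""
    role: str
    system_prompt: str               # path to the prompt file
    tools: Tuple[str, ...]
    permissions: Mapping[str, str]   # tool name -> access level (simplified)
    cadence: str


@dataclass(frozen=True)
class Agent:
    name: str
    identity: AgentIdentity
    model: ModelConfig

    def with_model(self, model: ModelConfig) -> "Agent":
        """Return the same agent running on a different engine."""
        if not self.model.override_allowed:
            raise ValueError(f"{self.name} does not allow model overrides")
        return replace(self, model=model)


wheeljack = Agent(
    name="Wheeljack",
    identity=AgentIdentity(
        role="development workflow orchestrator",
        system_prompt="prompts/wheeljack-system.md",
        tools=("github", "issue-tracker", "docs-search", "slack"),
        permissions={
            "github": "read",
            "issue-tracker": "write",
            "docs-search": "read",
            "slack": "post",
        },
        cadence="weekday-mornings",
    ),
    model=ModelConfig(provider="anthropic", name="YOUR_MODEL_NAME"),
)

# Hypothetical second provider and model name, used only for the demonstration.
rotated = wheeljack.with_model(
    ModelConfig(provider="other-provider", name="ANOTHER_MODEL_NAME")
)

# The engine changed; the name and the identity did not.
assert rotated.name == wheeljack.name
assert rotated.identity == wheeljack.identity
assert rotated.model != wheeljack.model
```

Keeping the model in its own object means a rotation is one constructor call, and any code that compares or audits identity never has to look at the model field.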

Role is the job description

The role defines what the agent is for. Not its tone. Not its provider. Not its benchmark score. Its job. A role should be stable enough that an engineer can answer, in one sentence, why the agent exists.

System prompt is behavioral policy

The system prompt is part of identity because changing it can materially change the agent's operating behavior. A prompt tweak that sharpens formatting is minor. A prompt rewrite that changes review criteria, escalation behavior, or decision thresholds is an identity change.

Tools and permissions are operational boundaries

Tools define what the agent can do. Permissions define how far it can go. Together, they are the difference between an observer, an editor, and an actor in the system.

Cadence matters more than it first appears

Cadence often gets ignored in early prototypes, but it belongs in identity. A continuously running monitor is not the same agent as an on-demand analyst, even if they share the same prompt and tools. Schedule changes alter system behavior, operational expectations, and failure modes.

Model is a runtime choice

The model field is present because it matters. It affects quality, latency, cost, and structured output behavior. But it is still a swappable field. If Wheeljack runs one provider this week and another next week, Wheeljack remains Wheeljack as long as role, prompt, tools, permissions, and cadence stay the same.

That is what "model agnostic" means in this context. It does not mean every model performs equally well. It means the architecture does not confuse engine selection with identity.
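One way to make "Wheeljack remains Wheeljack" checkable is an identity fingerprint that is computed from the five identity fields and deliberately ignores the model. The sketch below assumes agent definitions are available as plain dictionaries shaped like the example above; the function name `identity_fingerprint` and the alternate provider and model names are hypothetical.

```python
import hashlib
import json

IDENTITY_FIELDS = ("role", "system_prompt", "tools", "permissions", "cadence")


def identity_fingerprint(agent: dict) -> str:
    """Hash only the five identity fields; the model field is ignored."""
    identity = {field: agent[field] for field in IDENTITY_FIELDS}
    # Tool order carries no meaning, so sort it before hashing.
    identity["tools"] = sorted(identity["tools"])
    canonical = json.dumps(identity, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


this_week = {
    "name": "Wheeljack",
    "role": "development workflow orchestrator",
    "system_prompt": "prompts/wheeljack-system.md",
    "tools": ["github", "issue-tracker", "docs-search", "slack"],
    "permissions": {"github": {"access": "read"}},
    "cadence": {"type": "scheduled", "schedule": "weekday-mornings"},
    "model": {"provider": "anthropic", "name": "YOUR_MODEL_NAME"},
}

# Same agent next week on a different (hypothetical) engine.
next_week = dict(
    this_week, model={"provider": "other-provider", "name": "ANOTHER_MODEL_NAME"}
)

# Same engine, but the repository scope has been widened.
widened = dict(this_week, permissions={"github": {"access": "write"}})

assert identity_fingerprint(this_week) == identity_fingerprint(next_week)
assert identity_fingerprint(this_week) != identity_fingerprint(widened)
```

A fingerprint like this can be stored alongside logs and metrics so that a model rotation leaves the identity key unchanged while a permission change produces a new one.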

The Fleet-Design Consequence: 11 Agents Became 9

TL;DR: Once identity was defined correctly, two pairs of "different" agents were revealed to be duplicates and were retired.

ADR 0001 was not just a philosophy document. It forced a cleanup.

Once the team evaluated agent identity using the new standard, two redundant pairs stood out:

Wheeljack and Perceptor
Hot Rod and Springer

Those pairs had been treated as separate agents, but under the ADR lens they failed the identity test. They shared the same role. They used the same system prompt. They had the same tools. They operated with the same permissions. They ran on the same cadence. The only meaningful difference was model assignment.

That meant they were never actually two agents.

The result was a deliberate consolidation:

Before ADR 0001	After ADR 0001	Reason
Wheeljack + Perceptor	Wheeljack	Same role, prompt, tools, permissions, and cadence
Hot Rod + Springer	Hot Rod	Same role, prompt, tools, permissions, and cadence

The development fleet went from 11 agents to 9. This was not a simplification exercise for its own sake. It was a correction in ontology. The architecture decision record exposed that the naming scheme had drifted away from the actual system design.
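The identity test that exposed the duplicates can be sketched as a grouping step: build a key from the five identity fields, ignore the model, and look for keys shared by more than one name. Everything in the sample data below is a placeholder (roles, prompt paths, tools, permissions, cadence, and model names); only the four agent names come from the ADR. The helper names are likewise hypothetical.

```python
from collections import defaultdict


def identity_key(agent: dict) -> tuple:
    """Build a hashable key from the five identity fields (model excluded)."""
    return (
        agent["role"],
        agent["system_prompt"],
        tuple(sorted(agent["tools"])),
        tuple(sorted(agent["permissions"].items())),
        agent["cadence"],
    )


def find_duplicates(agents: list) -> list:
    """Return lists of agent names that share one identity."""
    groups = defaultdict(list)
    for agent in agents:
        groups[identity_key(agent)].append(agent["name"])
    return [names for names in groups.values() if len(names) > 1]


def make_agent(name: str, role: str, prompt: str, model: str) -> dict:
    """Placeholder definition; field values are illustrative only."""
    return {
        "name": name,
        "role": role,
        "system_prompt": prompt,
        "tools": ["github", "slack"],
        "permissions": {"github": "read", "slack": "post"},
        "cadence": "weekday-mornings",
        "model": model,
    }


fleet = [
    make_agent("Wheeljack", "role-a", "prompts/role-a.md", "model-x"),
    make_agent("Perceptor", "role-a", "prompts/role-a.md", "model-y"),
    make_agent("Hot Rod", "role-b", "prompts/role-b.md", "model-x"),
    make_agent("Springer", "role-b", "prompts/role-b.md", "model-y"),
]

duplicates = find_duplicates(fleet)
assert duplicates == [["Wheeljack", "Perceptor"], ["Hot Rod", "Springer"]]

# Each duplicate group collapses to one agent, so two pairs remove two names.
removed = sum(len(names) - 1 for names in duplicates)
assert removed == 2
```

Applied to the full fleet, the same arithmetic gives 11 - 2 = 9.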

Duplicate identities create noise everywhere:

Monitoring gets split across names that represent the same operational contract
Incident review gets muddled because behavior differences are attributed to "different agents" instead of model variation
Prompt management becomes harder because one conceptual role appears in multiple places
Security review becomes fragile because duplicated definitions can drift out of sync

There is also a subtler engineering benefit. Once duplicate agents are collapsed, evaluation gets cleaner. The team can compare model rotation within one agent definition instead of pretending to compare different agents. That produces better experiments because the independent variable is actually isolated.

Why Permissions as Identity Is the Security Boundary

TL;DR: If permissions are part of identity, then model rotation must never change what an agent is allowed to access or modify.

The security implication of ADR 0001 is more important than the naming implication.

If identity includes permissions, then the authorization boundary travels with the role definition. That means a model swap cannot silently widen access. A stronger model, a cheaper model, or a faster model is still bound by the same tool and write scopes. If the permissions change, the identity has changed.

This is where "permissions as identity" stops being a design slogan and becomes an operational rule.
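As a rough illustration of that rule, an authorization check can be written so that the model never appears in the decision. This is a sketch under assumptions: the `ToolCall` type, the `authorize` function, and the flat permission table are hypothetical, and the table mirrors the access levels from the example definition rather than any real enforcement layer.

```python
from dataclasses import dataclass

# Permission table keyed by agent identity, not by model.
# Values mirror the example definition: tool name -> allowed actions.
PERMISSIONS = {
    "Wheeljack": {
        "github": frozenset({"read"}),
        "issue-tracker": frozenset({"write"}),
        "docs-search": frozenset({"read"}),
        "slack": frozenset({"post"}),
    }
}


@dataclass(frozen=True)
class ToolCall:
    agent: str
    tool: str
    action: str
    model: str  # recorded for audit logs; never consulted for authorization


def authorize(call: ToolCall) -> bool:
    """Allow the call only if the agent's identity grants this action."""
    allowed = PERMISSIONS.get(call.agent, {}).get(call.tool, frozenset())
    return call.action in allowed


# The same request is evaluated identically regardless of the engine behind it.
for model in ("model-a", "model-b"):
    assert authorize(ToolCall("Wheeljack", "github", "read", model))
    assert not authorize(ToolCall("Wheeljack", "github", "write", model))
```

Because `authorize` reads only the agent, tool, and action, rotating the model cannot change its answer; widening access requires editing the permission table, which is an identity change.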

Model upgrades should be boring from an access-control perspective

A healthy model rotation process should look something like this:

Change	Identity changed?	Security review required?
Swap provider for same role and same permission set	No	Standard regression review
Add a new write-capable tool	Yes	Yes
Expand repository access from read to write	Yes	Yes
Change cadence from manual to autonomous scheduled execution	Yes	Yes
Rewrite system prompt to authorize new actions	Yes	Yes

This separation is useful because model performance often changes faster than governance rules do. Providers release new versions. Context windows expand. Tool-use reliability improves. Cost profiles shift. None of that should accidentally alter who the agent is allowed to be in the system.
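The table can be approximated by a small diff over agent definitions. The sketch below assumes definitions are plain dictionaries and uses a hypothetical `classify_change` function. It is deliberately conservative: any difference in an identity field, including a minor prompt tweak, is flagged for security review, which is stricter than the distinction drawn earlier between prompt tweaks and prompt rewrites.

```python
IDENTITY_FIELDS = ("role", "system_prompt", "tools", "permissions", "cadence")


def classify_change(before: dict, after: dict) -> dict:
    """Decide whether a config change touches identity or only runtime."""
    changed = [f for f in IDENTITY_FIELDS if before.get(f) != after.get(f)]
    if changed:
        return {
            "identity_changed": True,
            "review": "security review",
            "changed_fields": changed,
        }
    if before.get("model") != after.get("model"):
        return {
            "identity_changed": False,
            "review": "standard regression review",
            "changed_fields": ["model"],
        }
    return {"identity_changed": False, "review": "none", "changed_fields": []}


base = {
    "role": "development workflow orchestrator",
    "system_prompt": "prompts/wheeljack-system.md",
    "tools": ["github", "slack"],
    "permissions": {"github": "read", "slack": "post"},
    "cadence": "scheduled",
    "model": {"provider": "anthropic", "name": "YOUR_MODEL_NAME"},
}

# Row 1 of the table: provider swap, same role and permission set.
swap = dict(base, model={"provider": "other-provider", "name": "ANOTHER_MODEL_NAME"})
assert classify_change(base, swap)["review"] == "standard regression review"

# Row 3 of the table: repository access expanded from read to write.
expand = dict(base, permissions={"github": "write", "slack": "post"})
assert classify_change(base, expand)["identity_changed"] is True
```

Run as a pre-merge check, a diff like this gives reviewers the answer to "did identity change?" before anyone looks at model output.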

The OWASP guidance for LLM systems is relevant here because many real failures happen at the boundary between model output and external action. Unsafe tool invocation, data exfiltration, and over-broad authorization are system failures, not just model failures. By tying permissions to identity, the architecture creates a stable review surface.

There is also a practical debugging advantage. When an agent behaves unexpectedly after a model rotation, engineers can ask one clean question first: did identity change, or only runtime? If only runtime changed, the investigation can focus on output quality, tool-call behavior, latency, or formatting. If permissions changed, that is a different class of incident.

The architecture decision record effectively turned security into part of the agent definition rather than an afterthought attached to deployment.

What This Unlocked for Model Rotation and Manager-Specialist Design

TL;DR: Treating agents as roles rather than model deployments made mixed-model fleets easier to operate and reason about.

Once the model stopped carrying identity weight, the fleet became easier to compose.

That opened up two practical patterns.

Manager and specialist agents can use different models

Some agents exist to coordinate, triage, summarize, or route work. Others exist to perform narrower specialist tasks with high consistency. Those are different jobs, and they often benefit from different model characteristics.

A manager agent may need stronger planning or synthesis. A specialist may need lower latency or tighter formatting discipline. If the fleet is model-agnostic at the identity layer, those choices can be made per role without creating a new conceptual agent every time a provider decision changes.

Model rotation becomes a controlled experiment

With identity fixed, model rotation becomes testable. Engineers can compare outputs, tool behavior, and cost trade-offs while holding role, prompt, permissions, and cadence constant. That is much closer to a valid experiment than comparing two differently named agents with hidden configuration drift.

This also improves fleet design discussions. Instead of asking "Should there be a Claude agent and a GPT agent?" the better question becomes "Which model best serves this role under this permission set and cadence?" That is a more architectural question and a less tribal one.
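The controlled-experiment framing can be sketched as a loop in which the identity is held fixed and only the model name varies. The names `run_rotation` and `fake_run_agent`, the task string, and the model names are hypothetical; a real runner would call an inference API, which is stubbed out here so the example stays self-contained.

```python
import copy
from typing import Callable, Dict, List


def run_rotation(
    identity: dict,
    models: List[str],
    task: str,
    run_agent: Callable[[dict, str, str], str],
) -> Dict[str, str]:
    """Run one task under each model while holding the identity constant."""
    results = {}
    for model in models:
        # Each run gets its own copy so a runner cannot alter the shared identity.
        results[model] = run_agent(copy.deepcopy(identity), model, task)
    return results


def fake_run_agent(identity: dict, model: str, task: str) -> str:
    """Stand-in for a real inference call; returns a labeled string."""
    return f"[{model}] {identity['role']}: {task}"


identity = {
    "role": "development workflow orchestrator",
    "system_prompt": "prompts/wheeljack-system.md",
    "tools": ["github", "slack"],
    "permissions": {"github": "read", "slack": "post"},
    "cadence": "weekday-mornings",
}

results = run_rotation(
    identity, ["model-a", "model-b"], "summarize open issues", fake_run_agent
)

# One result per model, all produced under the same identity.
assert list(results) == ["model-a", "model-b"]
```

Because role, prompt, permissions, and cadence are passed in once and copied for each run, any difference between the outputs can be attributed to the model rather than to hidden configuration drift.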

By early May, this framing also made later infrastructure work easier because the system could port, schedule, and monitor agents as stable identities with configurable runtimes. The cleaner the identity model, the less friction in every surrounding layer.

Frequently Asked Questions

Q: What is an architecture decision record in an AI agent system?

An architecture decision record, or ADR, is a short document that captures an important technical decision, the reasoning behind it, and the consequences of choosing it. In AI agent systems, ADRs are especially useful because small naming or configuration choices can ripple into prompting, security, orchestration, and evaluation. The ADR format, popularized by Michael Nygard, provides a lightweight structure for recording context, decision, and consequences so future engineers understand not just what was decided but why.

Q: Why say the model is not the agent?

Because the model is only one part of the runtime, while the agent's durable identity is its role, prompt, tools, permissions, and cadence. If a provider swap changes the agent's name or conceptual identity, the architecture is likely coupling behavior to vendor choice too tightly. Decoupling identity from model also prevents organizational lock-in, where team discussions and documentation become provider-specific rather than role-specific.

Q: What does model agnostic mean for agent identity?

In this context, model agnostic means the architecture can rotate or override models without redefining the agent itself. It does not mean all models are interchangeable in performance; different models have meaningfully different strengths. It means the system keeps identity stable while treating model selection as an implementation choice that can be tuned, benchmarked, and swapped independently.

Q: Why are permissions considered part of agent identity?

Permissions determine what the agent is allowed to read, write, trigger, or publish. That authority is fundamental to who the agent is in the system, so changing permissions is not a minor config tweak. It is an identity change that should be reviewed with the same rigor as a role change. This also creates a natural security boundary: model rotation cannot silently widen access.

Q: How do you know when two agents are actually duplicates?

If two agents share the same role, system prompt, tools, permissions, and cadence, they are functionally the same identity even if they use different models. In that case, the cleaner design is one agent definition with a swappable model field rather than two separately named agents. The test is straightforward: if you cannot articulate a difference in the five identity dimensions, you have one agent with two engines.

Key Takeaways

ADR 0001 defined agent identity as role, system prompt, tools, permissions, and cadence
The model is a swappable runtime attribute, not the identity itself
This model-agnostic approach improved fleet design and made model rotation easier to reason about
Permissions as identity created a clearer security boundary for tool and write access
The ADR justified collapsing Wheeljack plus Perceptor into Wheeljack, and Hot Rod plus Springer into Hot Rod
That consolidation reduced the development fleet from 11 agents to 9 by removing duplicate identities
Mixed-model manager and specialist patterns become easier to operate when roles stay stable

The most useful architecture decisions are the ones that keep paying rent after the meeting ends. "The model is not the agent" became that kind of decision. It simplified naming, clarified security, improved evaluation, and made the fleet easier to evolve without losing conceptual integrity.