
🤖 Ghostwritten by GPT 5.4 · Fact-checked & edited by Claude Opus 4.6
The clearest test of an agent fleet is no longer whether it runs on the original laptop, but whether it can be brought up cleanly on a brand-new Mac. That became the focus of recent hardening work: making portability a first-class property of fleet operations rather than an afterthought. The practical outcome was a documented and tooled path from "fresh Mac" to "running fleet," including path resolution, config reconciliation, auth-profile sync, and a doctor pass that verifies the environment.
The lesson is simple: a fleet that runs on only one machine is not production-ready. It carries undocumented machine-specific state, and that state becomes an outage the moment hardware fails or a second machine is added. Portability work forces hidden assumptions into the open. It requires a clean boundary between what belongs in the repo and what must stay machine-local, and it turns setup from tribal knowledge into a repeatable system.
This build log covers what changed, why the repo-versus-host split became the core design decision, where the hidden state was found, and why the template-versus-live config boundary is as much a security control as an operational one.
TL;DR: Portability was not blocked by code; it was blocked by configuration, paths, and credentials that quietly existed only on the original machine.
The hardest part of getting a fleet onto a new Mac is not cloning the repo. It is discovering everything that has accreted around the repo over time. Agent systems tend to collect machine assumptions in small, easy-to-miss places: an absolute path in a wrapper, a local profile that was created manually, a runtime file that nobody intended to keep forever, or a credential reference that works only because it was once set up interactively.
That is the trap with multi-agent systems. The visible architecture looks portable because the code is version-controlled, but the actual runtime depends on invisible state. In practice, the fleet is only as portable as the least-documented machine-local dependency.
The work shifted from "can this run elsewhere?" to "can every required piece be named, classified, and regenerated?" That led to a more explicit doctrine:
This is not a new lesson in software operations, but agent fleets amplify it because they often combine:
As automation and AI-assisted workflows spread across the software lifecycle, environment consistency becomes more important, not less: the more work that moves through local automation, the higher the cost of undocumented setup. Security-hardened defaults on macOS have also made ad hoc local setup more brittle over time; workflows that rely on "just copy what worked before" tend to fail in subtle ways on a fresh machine.
The biggest insight was that portability work is mostly discovery work. The value comes from finding the state that silently lives only on the original machine and externalizing it into templates, scripts, and documented steps.
TL;DR: The repo/host-local split is the core design decision because it clearly defines what can travel and what must be created per machine.
Once the problem was framed correctly, the architecture became much easier to reason about. The fleet splits into two layers:
That boundary sounds obvious, but it changes how configuration is designed. Instead of treating config as one file that somehow needs to work everywhere, the system treats config as two related artifacts: a template that expresses the portable shape of the system, and a live machine-local config that expresses how that shape is realized on one specific Mac.
Here is the illustrative split:
| Layer | Lives where | Contains | Commit status | Purpose |
|---|---|---|---|---|
| Portable repo | Git repo | Agent code, wrappers, config template, validation scripts, doctor scripts, docs | Committed | Defines the fleet structure and expected inputs |
| Machine-local runtime | Host machine | Live config, resolved secret references, local clone paths, machine-specific runtime state | Not committed | Adapts the fleet to one machine safely |
That split turned "config as template" into a practical operating model rather than a documentation idea. The template describes what the fleet expects: agents, channels, wrappers, profiles, and required settings. The live config answers the machine-specific questions: where the local clones are, which local paths should be used, and how secret references resolve on that host.
The practical benefit is that a new Mac no longer needs a copy of somebody else's working state. It needs the repo, a way to resolve secrets on that machine, and a reconciliation step that produces a valid local runtime.
This also improves change management. A template can evolve in version control, be reviewed, and be validated. A live config can remain local, explicit, and disposable. If the machine is replaced, the runtime can be regenerated instead of recovered through guesswork.
TL;DR: The new flow is explicit and repeatable: resolve paths, reconcile template to live config, sync auth profiles, then run doctor.
The hardening work was less about adding one new tool and more about making the entire bring-up path coherent. The result was a RUNNING-ON-A-NEW-MACHINE guide backed by scripts that reflect the actual setup sequence.
The path now looks like this:
A fresh Mac rarely mirrors the exact directory layout of the original machine. That matters more than it seems because wrappers, launch scripts, and agent entrypoints often assume stable paths.
Instead of hardcoding one laptop's filesystem layout, the fleet now resolves local clone paths per machine. That makes path resolution an explicit setup concern rather than an accidental dependency. If a machine keeps repos in a different local directory, the runtime config records that locally.
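A minimal sketch of per-machine path resolution follows. It is an illustration rather than the fleet's actual script: the function name `resolve_clone_paths`, the `~/dev` fallback, and the per-agent `overrides` argument are assumptions, and `LOCAL_REPOS` is borrowed from the placeholder used in the config template.

```python
import os
from pathlib import Path


def resolve_clone_paths(agents, repos_root=None, overrides=None):
    """Resolve each agent's local clone path for this machine.

    Returns (resolved, missing): a dict of agent name -> Path, and a list of
    agents whose resolved path is not an existing directory.
    """
    # Assumption: LOCAL_REPOS names the default clone root; ~/dev is a made-up fallback.
    root = Path(repos_root or os.environ.get("LOCAL_REPOS", "~/dev")).expanduser()
    overrides = overrides or {}
    resolved, missing = {}, []
    for name in agents:
        # A per-agent override wins over the shared root, so one repo can live elsewhere.
        path = Path(overrides.get(name, root / name)).expanduser()
        resolved[name] = path
        if not path.is_dir():
            missing.append(name)
    return resolved, missing
```

Returning the missing agents instead of raising lets a setup script report every absent clone in one pass, which suits a bring-up flow where several repos may not be cloned yet.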
The repo carries a config template, not the active runtime config. On a new machine, a reconcile step materializes the host-local live config from that template and fills in the machine-specific values that should never be committed.
This is the core portability move. The template defines the expected structure; reconciliation creates the local truth.
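The reconcile step can be sketched in a few lines of Python. This is a hedged illustration, not the fleet's real tooling: it assumes the template marks machine-specific values with `${NAME}` placeholders, and the function name `reconcile` is hypothetical.

```python
from string import Template


def reconcile(template_text, local_values):
    """Fill ${NAME} placeholders in the template with machine-local values.

    Raises ValueError naming the first placeholder that has no local value,
    so a half-filled live config is never produced.
    """
    try:
        return Template(template_text).substitute(local_values)
    except KeyError as exc:
        raise ValueError(f"missing machine-local value: {exc.args[0]}") from None


template = 'workspace_path: "${LOCAL_REPOS}/sparkles"'
print(reconcile(template, {"LOCAL_REPOS": "/Users/your-user/dev"}))
# prints: workspace_path: "/Users/your-user/dev/sparkles"
```

Using `substitute` rather than `safe_substitute` is deliberate in this sketch: an unresolved placeholder should stop bring-up instead of leaking into the live config.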
A sanitized example of that pattern:
```yaml
# repo: config.template.yaml
agents:
  sparkles:
    enabled: true
    workspace_path: "${LOCAL_REPOS}/sparkles"
    auth_profile: "sparkles-default"
  concierge:
    enabled: true
    workspace_path: "${LOCAL_REPOS}/concierge"
    auth_profile: "concierge-default"
secrets:
  provider: "1password"
  mode: "file-backed-refs"
```

```yaml
# host-local: config.live.yaml
agents:
  sparkles:
    enabled: true
    workspace_path: "/Users/your-user/dev/sparkles"
    auth_profile: "sparkles-default"
  concierge:
    enabled: true
    workspace_path: "/Users/your-user/work/concierge"
    auth_profile: "concierge-default"
secrets:
  provider: "1password"
  mode: "file-backed-refs"
  resolved_refs_dir: "~/.fleet/secrets"
```

The fleet depends on more than one credential context. Different agents may need different profiles, scopes, or provider-specific auth state. Those profiles now sync from the secrets manager per machine rather than being copied from an older laptop.
That matters operationally because copied auth state is hard to reason about. Fresh sync makes provenance clearer and supports least privilege.
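A fresh per-machine sync might look roughly like the sketch below. Everything named here is an assumption for illustration: `sync_auth_profiles` is a hypothetical helper, and `fetch` stands in for whatever call resolves a reference against the secrets manager on this host. POSIX file permissions are assumed, which holds on macOS.

```python
import os
from pathlib import Path


def sync_auth_profiles(profiles, fetch, dest_dir):
    """Write each profile's secret into dest_dir with owner-only permissions.

    profiles: dict of profile name -> secret reference (opaque to this code).
    fetch:    callable that resolves a reference to a string on this machine.
    Returns the list of files written.
    """
    dest = Path(dest_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    os.chmod(dest, 0o700)  # directory readable by the owner only
    written = []
    for name, ref in profiles.items():
        target = dest / name
        # Request 0600 at creation so a new file is never group/world-readable.
        fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(fetch(ref))
        # The mode passed to os.open is ignored for pre-existing files, so enforce it.
        os.chmod(target, 0o600)
        written.append(target)
    return written
```

Injecting `fetch` keeps the sketch independent of any particular secrets provider and makes the sync testable without real credentials.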
The final step is a doctor pass that verifies the assembled environment. It checks whether channels, wrappers, config expectations, and related runtime assumptions are actually satisfied.
This is the difference between "setup completed" and "fleet is runnable." The doctor closes the loop.
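The shape of a doctor pass can be sketched as a small check runner. This is a hypothetical structure, not the fleet's doctor script: `CheckResult`, `run_doctor`, and `format_report` are illustrative names, and each check is assumed to be a callable returning an `(ok, detail)` pair.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str = ""


def run_doctor(checks):
    """Run every check and collect results; one failure never hides the rest."""
    results = []
    for name, check in checks.items():
        try:
            ok, detail = check()
        except Exception as exc:
            # A crashing check counts as a failure instead of aborting the pass.
            ok, detail = False, f"check raised {exc!r}"
        results.append(CheckResult(name, ok, detail))
    return results


def format_report(results):
    """Render passes as well as failures so operators can confirm coverage."""
    lines = [f"[{'PASS' if r.ok else 'FAIL'}] {r.name}: {r.detail}" for r in results]
    failed = sum(not r.ok for r in results)
    lines.append(f"{len(results) - failed} passed, {failed} failed")
    return "\n".join(lines)
```

A caller would pass a dict such as `{"live config present": check_live_config, "wrappers resolvable": check_wrappers}`, where the check functions are whatever the fleet needs to verify, and exit non-zero if any result has `ok` set to `False`.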
| Step | Input | Output | Failure mode caught |
|---|---|---|---|
| Path resolution | Local clone locations | Machine-correct paths | Broken wrappers, missing repos |
| Config reconcile | Repo template + local values | Host-local live config | Stale keys, missing machine values |
| Auth-profile sync | Secrets manager + local machine | Per-agent local auth state | Missing scopes, absent profiles |
| Doctor | Assembled runtime | Verified readiness report | Channel, wrapper, and runtime mismatches |
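The "stale keys, missing machine values" failure mode in the table lends itself to a mechanical check. The sketch below is one possible approach under stated assumptions: both configs are already parsed into nested dicts, and `flatten_keys`, `config_drift`, and the `local_only` allowlist are hypothetical names.

```python
def flatten_keys(node, prefix=""):
    """Return the set of dotted key paths for every leaf in a nested dict."""
    keys = set()
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def config_drift(template, live, local_only=frozenset()):
    """Compare key shapes: what the template expects vs. what the live config has.

    missing: keys in the template that the live config lacks.
    stale:   keys in the live config that the template no longer defines,
             excluding keys explicitly allowed to exist only on the host.
    """
    expected, actual = flatten_keys(template), flatten_keys(live)
    return {
        "missing": sorted(expected - actual),
        "stale": sorted(actual - expected - set(local_only)),
    }
```

With the sanitized example, `secrets.resolved_refs_dir` would go in `local_only`, since it appears in the live config but not in the template by design.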
TL;DR: The template-vs-live split is not just cleaner engineering; it is a security boundary that keeps secrets and runtime state out of version control.
Portability work often gets described as a convenience feature. In this case, it also tightened security.
The first rule is that host-local runtime config must not be committed. That includes machine paths, active runtime details, and any resolved secret references. Once those values enter the repo, the portability boundary collapses. The template is meant to be shared; the live config is not.
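One way to back this rule with tooling is to lint anything headed for the repo for values that look host-local. The sketch below is an assumption-laden illustration: the two patterns and the name `find_host_local_values` are made up for this example, and a real check would need patterns tuned to the project.

```python
import re

# Assumed heuristics for machine-local values; extend or replace per project.
HOST_LOCAL_PATTERNS = [
    re.compile(r"/Users/[\w.-]+"),  # absolute macOS home directory paths
    re.compile(r"~/"),              # home-relative paths
]


def find_host_local_values(text):
    """Return (line_number, line) pairs for lines that look machine-specific."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in HOST_LOCAL_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Run against the template, a check like this should return nothing; any hit suggests a machine path is about to be committed. It is a heuristic and does not replace keeping the live config out of version control.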
The second rule is that secrets resolve per machine from a secrets manager rather than being baked into files or copied from another host. A new Mac should not inherit a previous machine's secret material by file transfer. It should authenticate to the secrets system and resolve what it is permitted to use.
A generic pattern for that:
```sh
# sanitized example
op read "op://{vault}/{item}/{field}" > ~/.fleet/secrets/provider-token
```

The important point is not the exact command. The point is the contract: secret references are resolved locally on the machine that will use them.
The third rule is fresh least-privilege provisioning. A new machine should receive only the credentials and scopes it needs. That reduces blast radius and makes deprovisioning cleaner if the machine is retired.
These practices align with widely accepted security guidance. NIST's Secure Software Development Framework (SP 800-218) emphasizes protecting sensitive configuration and using controlled processes for software environments. OWASP guidance has also long treated secrets management and environment separation as core operational controls rather than optional hygiene.
In practical terms, the security model became:
That model is more work upfront, but it scales better than machine cloning and is much easier to audit.
TL;DR: The hard part of portability was not writing scripts; it was identifying every assumption that had never been written down.
Build logs are most useful when they include the uncomfortable parts. The uncomfortable part here is that hidden state accumulates naturally. It does not require bad engineering. It only requires time, a working system, and enough local fixes that nobody notices which ones became dependencies.
That is why portability hardening can feel slower than expected. Each failure on the new Mac is usually a clue that something important existed only as local memory or local state on the original machine. A wrapper expects a path nobody documented. A profile exists because it was created manually months ago. A runtime file looks generated, but actually contains hand-edited values that matter.
The portability project succeeded because those assumptions were treated as bugs in system design, not as setup quirks.
A few practical lessons stood out:
Copying a working config from one machine to another feels fast, but it preserves unknowns. Reconciliation forces the system to declare what belongs in the template and what must be supplied locally.
Verification should not be a rescue tool used only when something breaks. In a portable fleet, doctor is part of normal bring-up and normal change validation.
If a human must edit a local config file, its shape and lifecycle deserve as much care as the code. Poorly designed local config is where portability efforts stall.
A laptop failure is not the root problem. The root problem is runtime knowledge that exists nowhere except that laptop.
This is why portability belongs inside fleet operations, not outside it. A portable fleet is easier to recover, easier to extend to another machine, and easier to reason about when something changes.
A config template defines the portable structure of the system: expected agents, settings, and placeholders. Machine-local config is the realized version for one host, including local paths, resolved references, and runtime details that should not be committed. The template travels with the repo; the live config is generated per machine and treated as disposable.
A complete live config should not be committed because it usually contains machine-specific paths, runtime assumptions, and potentially sensitive references. Committing that file mixes portable intent with host-local state, which hurts both security and portability. It also creates merge conflicts every time two machines diverge, which they inevitably do.
Per-machine secret resolution creates a cleaner security boundary and supports least privilege. It also makes credential rotation, revocation, and machine retirement more manageable because each host has its own provisioned access path. Copied credentials, by contrast, create an invisible dependency chain that is difficult to audit or revoke cleanly.
At minimum, a doctor pass should verify config integrity, expected wrappers, channel readiness, required local paths, and whether auth-related prerequisites are present. The goal is to catch mismatches before an agent fails at runtime. A good doctor script also reports which checks passed, not just which failed, so operators can confirm coverage.
A fleet becomes portable when a new machine can go from fresh setup to verified runtime through documented, repeatable steps without copying hidden state from the original host. If success still depends on undocumented manual fixes, portability is not complete.
The fleet's portability story has become concrete: a fresh Mac can move toward a running system through a defined sequence instead of personal memory. That is the real milestone. Portability does not come from making one laptop more important; it comes from making any single laptop less special. The template-versus-live boundary, per-machine secret resolution, and a doctor-verified bring-up path together form a system that can survive hardware changes, scale to additional machines, and remain auditable throughout.