
🤖 Ghostwritten by Claude Opus 4.6 · Fact-checked & edited by GPT 5.4 · Curated by Tom Hundley
If you run agents in production, your naming scheme is an operational decision, not a branding exercise. We renamed our repos, worker IDs, log prefixes, and config namespaces to use plain business names because codenames slowed debugging, complicated onboarding, and made alerts harder to act on. When an alert says email-triage-worker-01 failed heartbeat, the responder knows what broke immediately. When it says Soundwave failed heartbeat, someone has to translate the codename before they can start fixing the problem.
That translation cost sounds small until it shows up in every log search, every handoff, every runbook, and every 2 AM incident. We learned that the hard way. Our old fleet used Transformer-style codenames across parts of the stack. It was fun in conversation and costly in operations.
The rule we adopted during the platform kernel rebuild is simple: business names in code, codenames for humans. It affects repo names, file paths, alerts, database tables, and API identifiers. Getting it right removes friction from every operational task that follows.
TL;DR: Use descriptive business names in code, logs, file paths, APIs, and infrastructure; reserve codenames for conversation and optional human-facing context.
The rule comes directly from our file-based operating model: operational clarity wins in code, file paths, logs, APIs, and infrastructure. Human-facing codenames are allowed, but they are secondary.
This is not about banning personality from a team. It is about reducing the distance between what an operator sees and what the system is doing. Google's SRE guidance consistently emphasizes reducing cognitive load during incident response. A codename-to-function translation step adds avoidable overhead at exactly the wrong moment.
A business name describes what the component does, not what the team happens to call it internally.
| Layer | Codename (old) | Business Name (new) | Why it works better |
|---|---|---|---|
| Repo | `ess-agents-slack-sparkles` | `ess-agent-platform` | New engineer immediately understands scope |
| Log prefix | `[Soundwave]` | `[email-triage]` | Grep-friendly and self-describing |
| Heartbeat ID | `soundwave-01` | `email-triage-worker-01` | Alert identifies the failed capability |
| Slack command | `/sparkles status` | `/fleet status` | Operator intent is clearer |
| DB table | `sparkles_sessions` | `operator_sessions` | Survives agent renames |
| Config key | `soundwave.inbox_path` | `email_triage.inbox_path` | No translation required |
The pattern is simple: if a machine reads it, use the business name. If a human says it in a hallway conversation, a codename is fine.
TL;DR: Our old naming scheme forced operators to maintain a mental lookup table, and that lookup table became a source of delay and tribal knowledge.
Before the rebuild, a typical alert looked like this:
```
Error: Connection timeout in Soundwave worker
Project: ess-agents-soundwave
Environment: production
```

To act on that alert, the responder had to remember that Soundwave meant the email triage agent, then remember where its inbox config lived, then remember how that worker was scheduled, and then reconcile those names with whatever the orchestrator called it. That is unnecessary work during an incident.
Now multiply that across a fleet of agents with inconsistent naming conventions, and the naming scheme itself becomes operational drag.
The bigger problem was that the mapping lived mostly in one person's head. As we described in our journal about why we stopped building agents and restarted the platform, the fleet had accumulated too much tribal knowledge. The codename system made that worse.
Stripe's 2023 Developer Coefficient report found that developers lose substantial time each week to inefficiencies such as technical debt, poor tooling, and interruptions. That does not prove codenames alone cost a fixed number of hours, but it does support the broader point: comprehension friction is expensive. Codenames in production add comprehension work without adding operational value.
A useful test is this: if a new engineer needs a glossary before they can read your logs, your naming scheme is doing the wrong job.
TL;DR: We renamed from the platform outward, standardized identifiers around business domains, and used a temporary deprecation map to track old-to-new references during the transition.
The rename happened as part of the platform kernel rebuild, not as a standalone cleanup project. That matters. Renaming a live system in place is risky. Renaming while you are already consolidating architecture gives you cleaner boundaries and fewer compatibility hacks.
The consolidated repo is ess-agent-platform. Not a mascot, not a codename, just a descriptive name that tells you what is inside.
```
ess-agent-platform/
├── platform/
│   ├── control_plane/
│   ├── worker_contract/
│   ├── operator_console/
│   └── channel_adapters/
├── agents/
│   ├── email_triage/    # was "Soundwave"
│   ├── slack_operator/  # was "Sparkles"
│   ├── bookkeeping/     # was "Harvest"
│   └── messaging/       # was "Concierge"
├── docs/
└── tests/
```

Every directory answers the question, "What does this do?" without requiring a lookup table.
The worker contract now requires a worker_id that follows the pattern {business-domain}-worker-{instance}.
```python
# Old pattern
class SoundwaveWorker(BaseWorker):
    worker_id = "soundwave-01"

# New pattern
class EmailTriageWorker(BaseWorker):
    worker_id = "email-triage-worker-01"
    domain = "email_triage"
```

The `domain` field ties back to the monorepo directory, the config namespace, and the alerting taxonomy. One name, used consistently.
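The naming pattern is also easy to enforce mechanically at registration time. Here is a minimal sketch of what that check could look like; the `validate_worker_id` helper and the exact regex are illustrative, not part of the actual contract code:

```python
import re

# Hypothetical validator for the {business-domain}-worker-{instance} pattern.
# Domains are lowercase words joined by hyphens; instances are two or more digits.
WORKER_ID_PATTERN = re.compile(
    r"^(?P<domain>[a-z]+(?:-[a-z]+)*)-worker-(?P<instance>\d{2,})$"
)

def validate_worker_id(worker_id: str) -> str:
    """Return the business domain if the ID follows the convention."""
    match = WORKER_ID_PATTERN.match(worker_id)
    if match is None:
        raise ValueError(
            f"worker_id {worker_id!r} does not follow "
            "{business-domain}-worker-{instance}"
        )
    # IDs use hyphens; config namespaces and directories use underscores.
    return match.group("domain").replace("-", "_")

print(validate_worker_id("email-triage-worker-01"))  # email_triage
```

Rejecting a legacy ID like `soundwave-01` at registration means a misnamed worker fails fast instead of reporting under an identifier the control plane does not recognize.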
This is where the rename pays off most clearly. Structured logs now use the business domain directly.
```json
{
  "timestamp": "2026-03-15T02:14:33Z",
  "level": "error",
  "domain": "email_triage",
  "worker_id": "email-triage-worker-01",
  "message": "IMAP connection timeout after 30s",
  "retry_count": 2
}
```

That makes routing rules straightforward. Alerts can key off `domain` and send responders to the right runbook without a translation layer.
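Once `domain` is a first-class log field, alert routing becomes a plain lookup rather than a translation step. A minimal sketch, assuming structured log lines like the one above; the routing table and runbook paths are illustrative:

```python
import json

# Hypothetical domain -> runbook routing table.
RUNBOOKS = {
    "email_triage": "runbooks/email_triage.md",
    "slack_operator": "runbooks/slack_operator.md",
}

def route_alert(raw_event: str) -> str:
    """Pick the runbook for an alert straight from its structured log line."""
    event = json.loads(raw_event)
    domain = event["domain"]  # no codename lookup table needed
    return RUNBOOKS.get(domain, "runbooks/unrouted.md")

alert = '{"domain": "email_triage", "worker_id": "email-triage-worker-01", "level": "error"}'
print(route_alert(alert))  # runbooks/email_triage.md
```

The point of the sketch is the absence of a mapping step: the field the machine emits is the field the responder acts on.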
TL;DR: The hard part was not renaming code; it was finding every hidden dependency on old names in schedulers, configs, and monitoring rules.
This was not a clean find-and-replace.
Our scheduled workers had codename references baked into labels, log paths, and working directories. Miss one reference and you end up with a worker reporting under the old name while the control plane expects the new one. That is the kind of mismatch that creates false confidence: the process is running, but the operational signals no longer line up.
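One thing that helps with hidden references is a brute-force scan for leftover codenames before cutover. A minimal sketch, where the codename list and the directories you point it at are illustrative:

```python
from pathlib import Path

# Hypothetical list of retired codenames to hunt for.
OLD_CODENAMES = ["soundwave", "sparkles", "harvest", "concierge"]

def find_stale_references(root: str) -> list[tuple[str, int, str]]:
    """Return (path, line_number, codename) for every leftover reference."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name in OLD_CODENAMES:
                if name in line.lower():
                    hits.append((str(path), lineno, name))
    return hits
```

Run it across scheduler definitions, config repos, and dashboard exports; an empty result is not proof, but a non-empty one is a concrete work list.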
Secrets and config namespaces also used codename-based prefixes. The rename required migrating those references and tracking the transition explicitly. We used a deprecation map like this:
```yaml
# deprecation_map.yaml: temporary transition file
mappings:
  - old_prefix: "soundwave"
    new_prefix: "email_triage"
    migrated: true
    verified: false
  - old_prefix: "sparkles"
    new_prefix: "slack_operator"
    migrated: true
    verified: true
```

The important distinction was between `migrated` and `verified`. Copying a config or secret reference to a new namespace does not prove that nothing still depends on the old one.
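A map like this is only useful if something reads it. As a sketch, assuming the mappings are loaded into memory (the audit helper below is illustrative, not our actual tooling), a small check can flag prefixes that were copied but never verified:

```python
# Hypothetical in-memory form of deprecation_map.yaml.
mappings = [
    {"old_prefix": "soundwave", "new_prefix": "email_triage",
     "migrated": True, "verified": False},
    {"old_prefix": "sparkles", "new_prefix": "slack_operator",
     "migrated": True, "verified": True},
]

def risky_prefixes(mappings: list[dict]) -> list[str]:
    """Old prefixes that were migrated but never verified are still dangerous."""
    return [m["old_prefix"] for m in mappings
            if m["migrated"] and not m["verified"]]

print(risky_prefixes(mappings))  # ['soundwave']
```

Wiring a check like this into CI keeps the transition file honest: the rename is not done until the risky list is empty.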
Monitoring tools often treat project names as durable identifiers. Renaming them can break continuity in dashboards, issue history, or alert trends. In practice, we found it cleaner to create new business-name projects and keep the old codename-based ones available for historical reference during the transition.
TL;DR: Codenames are useful for team culture, but production identifiers should optimize for clarity, not personality.
I do not regret using codenames in conversation. They gave the project personality and made the work more memorable. The mistake was letting those names leak into production identifiers.
The broader operational principle is straightforward: names used in incidents should describe function. That aligns with established SRE practice even if every organization implements it differently.
The 2024 Accelerate State of DevOps research, published by Google Cloud, reports large performance differences between high- and low-performing teams across delivery and operational metrics. Naming alone will not transform a team's incident response, but reducing unnecessary cognitive steps is one of the simplest reliability improvements you can make.
Our rule now looks like this:
| Context | Use business names | Use codenames |
|---|---|---|
| Code identifiers | ✅ | ❌ |
| Log output | ✅ | ❌ |
| Alert routing | ✅ | ❌ |
| Config files | ✅ | ❌ |
| Standup conversation | Either | ✅ |
| Documentation headers | ✅ | ✅ as optional subtitle |
| Slack channel names | ✅ | ❌ |
| Architecture diagrams | ✅ | ❌ |
**Where do business names belong, and where are codenames acceptable?**
Use business names in code, logs, configs, APIs, and infrastructure identifiers. Codenames are fine for conversation and team culture, but they should not appear in production identifiers. If an on-call engineer has to act on it, the name should describe the business function directly.

**How should a team approach renaming a live system?**
Treat it as a migration, not a search-and-replace. Inventory every place the old name appears: code, configs, schedulers, dashboards, alerts, secrets, and runbooks. Then track old-to-new mappings explicitly and separate "migrated" from "verified" so you know which references are still risky.

**Does naming really affect incident response?**
Yes. Naming affects how quickly a responder can identify the failing capability, find the right logs, and open the correct runbook. The effect is not always dramatic in isolation, but it compounds with every other source of operational friction.

**What naming convention ages well?**
A descriptive convention usually ages best: `{company}-agent-platform` for the repo, `{business_domain}/` for agent directories, and `{business-domain}-worker-{instance}` for worker IDs. The goal is not elegance. The goal is immediate comprehension.

**Can a team keep codenames at all?**
Keep codenames in lightweight human-facing contexts such as standups, planning docs, or optional subtitles in documentation. Keep business names everywhere a system, operator, or runbook depends on precision. That boundary preserves team culture without sacrificing operational clarity.
Renaming things to be boring was one of the most useful operational decisions we made. It reduced translation work during incidents, made the platform easier to hand off, and forced us to standardize identifiers across the stack. None of that is glamorous, but all of it matters in production.
Next, we'll cover the worker contract itself: the typed interface that ties together business domains, heartbeat identifiers, and retry behavior into one enforceable standard.
If you're designing an agent platform and trying to decide where codenames belong, start with this rule: keep them for people, not for production. And if you want help pressure-testing those conventions, talk to Elegant Software Solutions about building systems that stay understandable under load.