One Agent for Twelve Email Tenants

🤖 Ghostwritten by GPT 5.4 · Fact-checked & edited by Claude Opus 4.6

Administering twelve email tenants by hand does not scale. By June 2026, Ratchet had matured into a single multi-tenant admin agent that could manage 7 Microsoft 365 tenants and 5 Google Workspace tenants from one codebase, covering user CRUD, alias management, group membership, license assignment, mailbox settings, and billing-health checks. The engineering challenge was not simply making those actions work. It was making them safe enough that one tool with cross-tenant authority would not become the fastest way to make the same mistake twelve times.

That trade-off defined May's work. Ratchet added a brand-new Google Workspace tenant from scratch, expanded group creation for open-distribution use cases, validated domains for multi-domain tenants, and ran billing-health probes across the full fleet. It also exposed the kind of failure that only appears at admin scale: one string-handling bug around hyphens in a group email local-part was enough to create a broken address. That is the real lesson in multi-tenant admin systems. The dangerous bugs are often not dramatic auth failures. They are tiny transformations applied repeatedly, with administrative authority behind them.

Why Ratchet Became One Agent Instead of Twelve Scripts

TL;DR: A single admin surface reduces operational drag, but only if the guardrails are stronger than the convenience.

The obvious alternative to Ratchet was a pile of tenant-specific scripts. That approach looks safer at first because each script feels smaller and more isolated. In practice, it creates a different class of risk: duplicated logic, inconsistent behavior, and one-off fixes that never propagate cleanly across environments.

By May, the shape of the problem was clear. The operational surface had grown beyond occasional account creation. The agent needed to handle:

User create, read, update, and disable flows
Alias management
Group membership changes
License assignment
Mailbox settings
Billing-health probes across all tenants
New tenant bootstrap for Google Workspace

That is enough recurring work that a unified control plane starts to make sense. The same abstractions show up repeatedly even when the providers differ. A user is still a user. A group still has an address, members, and delivery posture. A mailbox still has settings that need to be checked and corrected. The provider APIs are different, but the operational intent is similar.

This counts as a build-log milestone because Ratchet crossed from "automation helper" into "admin agent." That is a different category of tool. Once one codebase can act across Microsoft 365 and Google Workspace, the engineering burden shifts. Feature work matters, but constraint design matters more.

There is also a practical scaling point here. Microsoft reported in its FY2025 Q3 earnings release that Microsoft 365 Commercial seats exceeded 430 million. Google Workspace likewise serves businesses of every size worldwide. That figure is not directly about tenant administration, but it underlines the same reality: the platforms are mature, broad, and admin-heavy. The bottleneck for smaller operator teams is rarely whether the APIs exist. It is whether the control model remains understandable as tenant count rises.

The Two-Identity Model: One Pattern, Two Provider Realities

TL;DR: Ratchet uses separate identity patterns for Microsoft 365 and Google Workspace because the providers expose administrative trust differently.

The cleanest architectural decision in Ratchet was refusing to force one authentication model onto both ecosystems.

On the Microsoft side, the agent authenticates per tenant. Each tenant has its own app registration and its own scoped administrative path. That means the codebase is shared, but the credential boundary is not. A command targeting tenant-a.example does not borrow some global super-credential that can silently drift into tenant-b.example. It resolves the tenant, loads the tenant-specific auth context, and then acts within that boundary.
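The resolve-then-act flow described above can be sketched in a few lines. This is a minimal illustration, not Ratchet's actual code: the names TenantAuthContext, TenantAuthRegistry, and the vault:// secret references are all assumptions made for the example. The point it shows is that lookup fails closed, so an unknown tenant never falls back to another tenant's credentials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantAuthContext:
    """Credentials for exactly one Microsoft 365 tenant (hypothetical shape)."""
    tenant: str       # e.g. "tenant-a.example"
    client_id: str    # app registration ID for this tenant only
    secret_ref: str   # pointer into a secret store, never the secret itself


class UnknownTenantError(LookupError):
    """Raised when no auth context is registered for the requested tenant."""


class TenantAuthRegistry:
    """Maps a tenant name to its own auth context; there is no fallback credential."""

    def __init__(self, contexts):
        self._by_tenant = {ctx.tenant: ctx for ctx in contexts}

    def resolve(self, tenant: str) -> TenantAuthContext:
        try:
            return self._by_tenant[tenant]
        except KeyError:
            # Fail closed: an unknown tenant never borrows another tenant's context.
            raise UnknownTenantError(
                f"no auth context registered for {tenant!r}"
            ) from None


# Illustrative data only; IDs and secret references are placeholders.
registry = TenantAuthRegistry([
    TenantAuthContext("tenant-a.example", "app-id-a", "vault://m365/tenant-a"),
    TenantAuthContext("tenant-b.example", "app-id-b", "vault://m365/tenant-b"),
])
```

Every command would call registry.resolve(tenant) before touching a provider API, which keeps the shared codebase from implying a shared credential.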

On the Google side, the model is different. Ratchet uses a domain-wide delegation pattern through a shared service account that has been authorized separately in each Workspace. The service identity can act on behalf of an administrator, but only in domains where that delegation has been explicitly granted and only for the scopes that were approved. Google's Admin SDK documentation describes domain-wide delegation as a way for an application to access user data without individual consent prompts when an administrator has authorized it for the domain. That makes it powerful and operationally efficient, but it also means scope discipline is not optional. The trust boundary lives in the combination of delegated authorization and selected scopes.
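One way to keep that scope discipline visible in code is a local pre-flight check that mirrors what each Workspace admin has approved. The sketch below is an assumption about how such a check might look, not Ratchet's implementation: AUTHORIZED_SCOPES, check_delegation, and the domain name are hypothetical, and the real grant always lives in the Workspace admin console. The two scope URLs are standard Admin SDK Directory API scopes.

```python
# Hypothetical local record of what each Workspace admin has authorized for
# the shared service account. The authoritative grant lives in the admin console.
AUTHORIZED_SCOPES = {
    "tenant-g.example": frozenset({
        "https://www.googleapis.com/auth/admin.directory.user",
        "https://www.googleapis.com/auth/admin.directory.group",
    }),
}


class DelegationError(PermissionError):
    """Raised when a request exceeds the recorded delegation for a domain."""


def check_delegation(domain: str, requested_scopes) -> frozenset:
    """Return the requested scopes only if the domain has granted all of them."""
    granted = AUTHORIZED_SCOPES.get(domain)
    if granted is None:
        raise DelegationError(f"no delegation recorded for {domain!r}")
    requested = frozenset(requested_scopes)
    missing = requested - granted
    if missing:
        raise DelegationError(
            f"scopes not granted for {domain!r}: {sorted(missing)}"
        )
    return requested
```

A check like this does not replace the provider's own enforcement. It only makes an over-broad request fail early and with a readable message.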

That split sounds inelegant until the alternatives are considered. A fake "universal" auth layer would mostly hide important differences and make debugging worse. Ratchet instead keeps the provider-specific trust model visible in the implementation while presenting a more consistent operator interface.

A simplified command shape for group creation looks like this:

ratchet groups create \
  --tenant tenant-a.example \
  --domain tenant-a.example \
  --name "all-staff" \
  --email "all-staff@tenant-a.example" \
  --open-distribution \
  --yes

For a multi-domain tenant, the --domain selector matters because "tenant" and "mail domain" are not always the same thing:

ratchet groups create \
  --tenant tenant-b.example \
  --domain sub.tenant-b.example \
  --name "support-rotation" \
  --email "support-rotation@sub.tenant-b.example"

The important part is not the exact CLI shape. It is that the command forces the operator to be explicit about where the object should live, and the system validates that choice against the tenant's known domains before doing anything destructive or persistent.

What Changed in May

TL;DR: May was the month Ratchet stopped being a collection of admin helpers and became a coherent cross-tenant operating layer.

Several changes mattered, but three stood out.

1. A New Google Workspace Tenant Was Stood Up from Scratch

Adding a fresh Workspace tenant is useful as a capability test because it exercises more than one happy path. It touches identity setup, domain assumptions, group patterns, licensing expectations, and the awkward edge between "new environment" and "standardized environment." A mature agent should not just maintain existing tenants. It should be able to bring a new one into the same operational model.

That work also validated the Google Workspace side of the abstraction. If an agent can only manage long-lived tenants after manual cleanup, it is not really expressing a repeatable admin model. Standing up a new tenant from scratch is a stronger proof that the control plane is becoming portable.

2. Billing-Health Probes Were Added Across All Tenants

Billing checks are not glamorous, but they are operationally important. A user lifecycle agent that ignores billing state is only seeing half the system. Account provisioning can fail for reasons that look like permissions but are actually tied to subscription or tenant health.

Billing-health probes became a lightweight way to answer a basic question before deeper admin work begins: is this tenant healthy enough for the requested action to succeed predictably? That does not eliminate all failure modes, but it reduces blind spots.
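A probe of this kind can act as a gate in front of deeper admin work. The following is a rough sketch under stated assumptions: the three health states, the ProbeResult shape, and gate_on_billing are invented for illustration, and the probe itself is passed in as a function because the source does not describe how Ratchet queries each provider.

```python
from dataclasses import dataclass
from enum import Enum


class BillingHealth(Enum):
    # Hypothetical states; real providers report billing status differently.
    HEALTHY = "healthy"
    DEGRADED = "degraded"   # e.g. a payment warning, provisioning still works
    BLOCKED = "blocked"     # e.g. a suspended subscription


@dataclass(frozen=True)
class ProbeResult:
    tenant: str
    health: BillingHealth
    detail: str = ""


def gate_on_billing(probe, tenant: str) -> ProbeResult:
    """Run the probe and refuse to continue if the tenant is blocked."""
    result = probe(tenant)
    if result.health is BillingHealth.BLOCKED:
        raise RuntimeError(
            f"{tenant}: billing blocked ({result.detail}); aborting before mutation"
        )
    return result


def fake_probe(tenant: str) -> ProbeResult:
    """Stand-in probe for the example; a real one would call the provider."""
    if tenant == "tenant-b.example":
        return ProbeResult(tenant, BillingHealth.BLOCKED, "subscription suspended")
    return ProbeResult(tenant, BillingHealth.HEALTHY)
```

Degraded tenants pass the gate here on purpose, so the caller can decide whether a warning is enough or the action should wait.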

3. Group Creation Became More Realistic

Real organizations rarely live in a single-domain, closed-membership default world forever. Open-distribution posture and multi-domain validation sound like small additions, but they reflect actual administrative complexity. A decent multi-tenant admin tool has to handle the cases operators see every week, not just the safest demo scenario.

That is where one of May's most useful bugs appeared. Group email generation included a transformation that failed to preserve hyphens in the local-part. The result was simple and damaging: a valid intended address could become an invalid or incorrect one. At small scale, this is an annoying bug. At admin scale, it is a broken mailbox factory.

The lesson is worth stating plainly: string normalization in identity and messaging systems is never "just formatting." Email addresses, aliases, and group identifiers are operational data. Any transformation rule that is not explicitly justified should be treated as suspect.
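To make the failure class concrete, here is a small sketch contrasting a lossy normalizer with a stricter builder. Neither function is Ratchet's actual code: lossy_local_part is a hypothetical reconstruction of the kind of transformation that drops hyphens, and group_address shows one possible alternative that validates input and rejects it rather than rewriting it. The allowed-character pattern is a deliberately conservative assumption, not a statement of either provider's rules, and lowercasing is the only transformation applied.

```python
import re

# Conservative pattern for group local-parts: lowercase letters and digits,
# joined by single ".", "_" or "-" separators. An assumption for this sketch.
_LOCAL_PART = re.compile(r"[a-z0-9]+(?:[._-][a-z0-9]+)*")


def lossy_local_part(name: str) -> str:
    """Illustrates the bug class: strips every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def group_address(name: str, domain: str) -> str:
    """Build a group address, rejecting bad input instead of rewriting it."""
    local = name.strip().lower()
    if not _LOCAL_PART.fullmatch(local):
        raise ValueError(f"refusing to guess a local-part for {name!r}")
    return f"{local}@{domain}"
```

With the lossy version, "support-rotation" silently becomes "supportrotation". With the strict version, the hyphen survives, and a name such as "all staff" raises an error that the operator has to resolve explicitly.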

Guardrails Are the Product

TL;DR: In high-blast-radius admin systems, safeguards are not support features. They are the core design.

Ratchet's central convenience is also its central risk. One agent can touch many tenants, which means one mistake can propagate quickly unless the system is deliberately inconvenient at the right moments.

Three safeguards became non-negotiable.

Explicit Confirmation by Default

Every operation requires confirmation unless --yes is passed. That sounds almost trivial, but it changes operator behavior in exactly the right place: the final moment before a state-changing action. Destructive and high-impact admin commands should feel intentional.

This is worth recording because automation teams often remove confirmations too early. The argument is usually speed. The reality is that speed without friction is only helpful when target selection is already guaranteed correct. In multi-tenant admin, that guarantee is never strong enough to skip all pauses.
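A confirm-by-default gate is simple to express. The sketch below uses Python's standard argparse module and assumes a parser named "ratchet-sketch"; only the --tenant and --yes flags come from the commands shown earlier, and the confirm helper is hypothetical. The default answer is "no", so an empty response or a stray keypress never proceeds.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Minimal parser for the sketch; the real CLI has more options."""
    parser = argparse.ArgumentParser(prog="ratchet-sketch")
    parser.add_argument("--tenant", required=True)
    parser.add_argument("--yes", action="store_true",
                        help="skip the interactive confirmation")
    return parser


def confirm(summary: str, assume_yes: bool, ask=input) -> bool:
    """Return True only when --yes was passed or the operator answers y/yes."""
    if assume_yes:
        return True
    answer = ask(f"{summary}\nProceed? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

The ask parameter exists so the prompt can be replaced in tests or non-interactive runs. In normal use it stays as the built-in input function.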

Per-Tenant Scoped Credentials

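Ratchet does not rely on one global credential that can administer the entire fleet. The code is centralized, but trust is partitioned. On the Microsoft side that means a separate app registration per tenant. On the Google side the service account is shared, but it has authority only in Workspaces where an administrator has explicitly granted delegation, and only for the approved scopes. That matters for both security and debugging.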

Design choice	Operational upside	Primary risk	Ratchet posture
One global admin credential	Simple setup, fewer moving parts	Extremely high blast radius	Rejected
Per-tenant scoped credentials	Stronger isolation, clearer audit boundaries	More setup and secret management	Adopted
No confirmation prompts	Faster scripted execution	Easier accidental changes	Rejected by default
Confirmation unless `--yes`	Safer interactive operations	Slightly slower workflows	Adopted
Implicit domain selection	Less typing	Wrong-domain actions in multi-domain tenants	Rejected
Explicit validated `--domain`	Clear targeting	More operator input required	Adopted

This is a classic least-privilege decision. The system is still powerful, but the power is segmented. NIST's guidance on separation of duties and least privilege (see SP 800-53 AC-5 and AC-6) has long framed this as foundational security practice, and the principle applies cleanly here even if the implementation details differ by provider.

Domain Validation Before Execution

Multi-domain tenants create a subtle failure mode: the tenant is right, but the domain is wrong. That can still create a bad object, route mail incorrectly, or fail in a confusing way. Ratchet validates that the chosen domain belongs to the selected tenant before it proceeds.

That check does more than prevent typos. It forces the code to model tenant-domain relationships explicitly rather than assuming them. In admin systems, explicit models are safer than convenience shortcuts.
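An explicit tenant-domain model can be as small as a lookup table plus one validation function. The sketch below is illustrative only: TENANT_DOMAINS, DomainMismatchError, and validate_target are hypothetical names, and the domain inventory is hard-coded where a real system would load it from the provider. It checks two things: that the chosen domain belongs to the tenant, and that the address actually uses that domain.

```python
# Hypothetical inventory of verified mail domains per tenant.
TENANT_DOMAINS = {
    "tenant-a.example": {"tenant-a.example"},
    "tenant-b.example": {"tenant-b.example", "sub.tenant-b.example"},
}


class DomainMismatchError(ValueError):
    """Raised when tenant, domain, and address do not agree."""


def validate_target(tenant: str, domain: str, email: str) -> None:
    """Raise unless the domain belongs to the tenant and the address uses it."""
    known = TENANT_DOMAINS.get(tenant)
    if known is None:
        raise DomainMismatchError(f"unknown tenant {tenant!r}")
    if domain.lower() not in known:
        raise DomainMismatchError(f"{domain!r} is not a domain of {tenant!r}")
    local, sep, email_domain = email.rpartition("@")
    if not sep or not local or email_domain.lower() != domain.lower():
        raise DomainMismatchError(f"{email!r} is not an address in {domain!r}")
```

Running this before any create or update call means a right-tenant, wrong-domain command stops with a clear error instead of producing a misplaced object.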

The Uncomfortable Truth About Centralized Admin Power

TL;DR: A single multi-tenant admin agent is a deliberate convenience-versus-risk trade-off, and pretending otherwise leads to weak design.

It would be easy to describe Ratchet as purely an efficiency win. That would be incomplete.

A single agent driving Microsoft 365 and Google Workspace across twelve tenants is powerful because it collapses repetition. It is risky for the same reason. Centralization removes friction, and friction is sometimes what prevents a bad day from becoming a fleet-wide incident.

The honest engineering position is that this kind of tool is justified only when its constraints are designed as carefully as its capabilities. That means:

Keeping provider trust models visible rather than abstracting them into mush
Validating tenant and domain targets before mutation
Preserving explicit operator intent through confirmations
Treating string handling as identity-critical logic, not cosmetic cleanup
Preferring tenant-scoped trust over fleet-wide credentials

This is also where build-log writing is more useful than polished architecture diagrams alone. The useful details are the awkward ones. A missing hyphen. A domain mismatch. A billing probe that reveals the tenant is unhealthy before a provisioning workflow starts. Those are the moments where a system stops being impressive and starts being reliable.

Frequently Asked Questions

Q: What is a multi-tenant admin agent in this context?

A multi-tenant admin agent is a single codebase that can perform administrative actions across multiple separate email or identity tenants. In Ratchet's case, that includes both Microsoft 365 and Google Workspace environments, while still keeping credentials and validation boundaries tied to each tenant.

Q: Why use domain-wide delegation for Google Workspace?

Domain-wide delegation allows an authorized application to act on behalf of users or administrators within a Workspace domain after an admin grants approval for specific scopes. It is operationally efficient for centralized administration, but it must be constrained carefully because the delegated permissions can be broad if scope selection is sloppy.

Q: Why not use one global credential for every tenant?

One global credential simplifies setup but creates an unnecessary blast radius. Per-tenant scoped credentials make onboarding and secret management more involved, yet they provide better isolation, clearer fault boundaries, and a safer failure mode when something goes wrong.

Q: Why does preserving hyphens in a group email local-part matter so much?

Because email addresses are identifiers, not presentation text. If string handling silently changes the local-part, the resulting address may not match the intended mailbox or group, which can break routing, provisioning, or downstream administration in ways that are hard to spot until users are affected.

Q: What does "inventory-first execution" mean for future operations?

It means future operations should begin by querying the current state of users, groups, domains, licenses, and mailbox settings before attempting changes. That reduces drift, avoids duplicate or conflicting actions, and turns admin workflows from assumption-driven commands into state-aware reconciliations.
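As a rough illustration of the idea, the sketch below diffs a group's current membership against the desired membership and produces a plan before anything is changed. It is an assumption about what inventory-first execution could look like, not a description of Ratchet's roadmap: Plan and plan_membership are hypothetical names, and fetching the current state from the provider is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Changes needed to move a group from its current to its desired state."""
    to_add: frozenset
    to_remove: frozenset

    @property
    def is_noop(self) -> bool:
        return not self.to_add and not self.to_remove


def plan_membership(current, desired) -> Plan:
    """Diff current members against desired ones, comparing case-insensitively."""
    have = frozenset(member.lower() for member in current)
    want = frozenset(member.lower() for member in desired)
    return Plan(to_add=want - have, to_remove=have - want)
```

A plan that is a no-op means the command can exit without calling a mutating API at all, which is the practical difference between a reconciliation and an assumption-driven command.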

Key Takeaways

Ratchet now operates as a single multi-tenant admin layer across 7 Microsoft 365 tenants and 5 Google Workspace tenants.
The important May milestone was not raw capability alone, but safer capability.
The authentication model is intentionally split: per-tenant trust on the Microsoft side, domain-wide delegation on the Google side.
Explicit confirmation, tenant-scoped credentials, and domain validation are the core safeguards.
Small string-handling bugs can become large operational failures when applied at admin scale.
Billing-health probes matter because admin workflows fail differently when tenant health is degraded.
The next maturity step is inventory-first execution: the system checks state before it changes state.

Conclusion

Ratchet's May progress made one point clear: the real work in production admin agents is not accumulating more commands, but reducing the distance between authority and verification. A single agent can manage twelve tenants effectively, but only when every action is bounded by tenant-aware trust, explicit intent, and state checks that happen before mutation. The next stage is not more guesswork with better tooling. It is less guesswork altogether, as the fleet moves toward inventory-first operations that verify reality before they try to reshape it.