Case Studies · 9 engagements

Real work, shipped to production.

EDI integrations with Fortune 100 trading partners. Autonomous bug-fix pipelines that close the loop without humans. B2B analytics platforms. DTC e-commerce constellations. Daily content agents. Enterprise data gateways.

The thing every case study below has in common

Every line of code, config, test, and doc written by AI.

Not a single hand-written line. Each engagement below was shipped by Tom (Orchestrator) commanding Claude, Codex, and Gemini directly — one human, three frontier models. That's the PowerDev model.

The human role: oversight and coordination. Decision gates, approval boundaries, sequencing, judgment calls. The models handle the keyboard. The Orchestrator handles the strategy.

Featured · No human in the loop
MDSi · Internal SRE · Autonomous engineering · PowerDev engagement · 100% AI-built · Human oversight + coordination

Production exception → autonomous research → adversarial three-model fix → deploy → notify — with no human in the loop

A live exception in production triggers a ten-stage agent pipeline that researches the bug, writes a failing test, fixes the code through three different models in adversarial review, ships the patch, updates the knowledge graph, and notifies the customer — without a human touching a keyboard.

The challenge

Bug-fix workflows are where most engineering velocity dies: a customer hits an error, support files a ticket, an engineer eventually picks it up, repros it, fixes it, ships it, writes the postmortem, updates the docs, tells the customer. Each handoff loses context. Each step waits on a human. The total elapsed time from exception to notify-customer is days, not minutes — and the institutional knowledge from the fix evaporates the moment the engineer moves on.

The approach

Wired a closed-loop autonomous pipeline that listens for production exceptions and runs the entire fix-to-notify flow without human intervention. The fix step itself is run through three different models in an adversarial feedback loop — one writes, one reviews, one audits — so no single model's blind spots make it to production. Tests are written before the fix (TDD), and the regression-prevention tests stay in the suite forever.

Stack

Rollbar · Slack webhooks · Jira API · Confluence API · Claude Opus · Codex GPT · Third-model audit · Neo4j (knowledge graph) · GitHub Actions · TDD enforcement

What we built

  • 01 · Listener: Rollbar webhook (or Slack-channel listener) picks up the runtime exception within seconds of it firing in production
  • 02 · Research: agent searches Jira for prior tickets matching the stack trace and Confluence for documentation context
  • 03 · Triage: creates a new Jira bug or updates an existing matching one; opens or updates the Confluence postmortem page
  • 04 · Adversarial three-model loop: Model A writes the failing test + the proposed fix; Model B reviews adversarially; Model C audits the merged delta. Disagreements iterate until all three agree.
  • 05 · TDD: failing regression test is committed BEFORE the fix; the test going green is the fix's acceptance criterion
  • 06 · Deploy: PR opens, CI runs, agent auto-merges on green and triggers the deploy pipeline
  • 07 · Documentation: Confluence postmortem completed with root cause + the regression test that now guards against recurrence
  • 08 · Knowledge graph: Neo4j updated with a new Learning node linked to the Component, the Failure-mode, and the Fix
  • 09 · Team notify: Slack notification with the bug summary, the fix, and the regression-test link
  • 10 · Customer notify: email to affected customers confirming the issue is resolved and what changed
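
A minimal sketch of how the ten stages could be sequenced, with stubbed handlers standing in for the real Rollbar / Jira / Confluence / model integrations. The stage names, the consensus rule, and the retry count are illustrative assumptions, not the production implementation:

```python
from dataclasses import dataclass, field

# Stage names mirror the ten-step list above; each handler here is a stub
# standing in for the real Rollbar / Jira / Confluence / model calls.
STAGES = [
    "listener", "research", "triage", "adversarial_fix", "tdd_gate",
    "deploy", "documentation", "knowledge_graph", "team_notify", "customer_notify",
]

@dataclass
class PipelineRun:
    exception_id: str
    completed: list = field(default_factory=list)

def review(model: str, proposal: str) -> bool:
    # Stub: the real step would call that model's review endpoint.
    return True

def adversarial_fix(run: PipelineRun) -> str:
    # Model A proposes; Models B and C review and audit; iterate to consensus.
    for attempt in range(1, 4):
        proposal = f"fix-{run.exception_id}-v{attempt}"
        if all(review(m, proposal) for m in ("model_a", "model_b", "model_c")):
            return proposal
    raise RuntimeError("no three-model consensus; escalate to a human")

def run_pipeline(exception_id: str) -> PipelineRun:
    run = PipelineRun(exception_id)
    for stage in STAGES:
        if stage == "adversarial_fix":
            adversarial_fix(run)  # the only stage with an inner review loop
        run.completed.append(stage)
    return run
```

The real pipeline also gates deploy on CI green and gates the fix on the failing test existing first; this stub only shows the stage ordering.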

Outcomes

  • Time from production exception to customer-notified-of-fix collapsed from days to minutes for the class of bugs the pipeline can handle autonomously
  • Three-model adversarial review means a single model's hallucination or blind spot can't ship a bad fix — at least two of three must independently approve
  • Regression tests written before the fix and kept in the suite mean the same bug genuinely cannot reappear silently
  • Knowledge graph captures every fix as a permanent learning — onboarding a new engineer (human or agent) means querying the graph instead of asking the previous person
  • Confluence postmortems write themselves — the graph and the PR are the source of truth, the doc is generated from them
MDSi · Trading-partner EDI platform · Telecom logistics · EDI / ERP integration · PowerDev engagement · 100% AI-built · Human oversight + coordination

Production X12 EDI platform between a Fortune 100 telecom buyer and MDSi’s Dynamics 365 ERP

A 61-project .NET platform processing five EDI document types across two production environments — rebuilt on Terraform + Azure Functions in a single iteration with zero data loss and partner-facing 503s eliminated.

The challenge

MDSi sells networking equipment to a Fortune 100 telecom buyer. Every purchase order, acknowledgement, advance ship notice, and invoice flows as X12 EDI between the buyer’s trading-partner system and MDSi’s Dynamics 365 ERP. The legacy hosting setup couldn’t isolate the partner-edge AS2 transport from the inbound processing pipeline; partner-facing 503s started appearing under load. The platform needed a full hosting rebuild without disrupting active EDI flows.

The approach

Designed a split-tier hosting model — Windows Elastic Premium for the partner-facing AS2 edge, Flex Consumption for internal workers — and implemented it as Terraform-managed infrastructure across DEV/UAT/PRD. Every change went through a five-agent review gate with cross-model paired senior dev review, dev-time security review, and DevOps validation before merge.

Stack

.NET 10 LTS · Azure Functions · X12 EDI / AS2 · Dynamics 365 · Terraform · Azure Key Vault · GitHub Actions

What we built

  • 01 · 5 EDI pipelines: PO 850 (inbound), ACK 855 + ASN 856 + INV 810 (outbound), ADV 824 (inbound advisory)
  • 02 · 61 .NET 10 LTS projects, ~300 tests, single-solution build
  • 03 · Windows Premium edge for AS2 transport + Flex Consumption for processors, integrators, success/failure paths
  • 04 · Azure Key Vault + Managed Identity for every secret; no environment variables
  • 05 · Terraform-managed DEV/UAT/PRD with GitHub Actions deployment
  • 06 · External shadow monitor reproducing partner-shaped traffic against UAT before cutover

Outcomes

  • Cutover-free hosting rebuild — DEV, UAT, and PRD rebuilt and redeployed via Terraform; all 35 health endpoints returned HTTP 200 after warm-up
  • Partner-facing 503s eliminated; UAT shadow monitor confirmed clean AS2 runs across multiple post-rebuild scheduled tests
  • PRD rebuilt in parked mode so cutover timing stays a business decision, not a deploy decision
  • Custom hostname, TLS binding, and ingress allowlist all moved into Terraform — the config drift that plagued the prior setup is now impossible
MDSi · Major telecom-carrier integration · Telecom logistics · Multi-flow B2B integration · PowerDev engagement · 100% AI-built · Human oversight + coordination

11-FDD integration platform between MDSi and a major US telecom carrier — purchase orders, transfer orders, ownership changes, inventory, supply forecasts

207 Linux Azure Function Apps across DEV/UAT/PRD. 11 Functional Design Documents covering every business flow between the two companies. A canonical 7-step pipeline pattern per flow. Terraform-managed. Auth0 M2M outbound. Rollbar-instrumented. PRD went live April 2026.

The challenge

MDSi sells equipment and services to a major US telecom carrier across a sprawl of business flows: inbound purchase orders, transfer-order acknowledgements, transfer-order shipment notices, ownership-change notices and acks, inventory adjustments, open-PO supply forecasts. Each flow is its own contract, its own data shape, its own SLA. The legacy approach was a soup of one-off scripts and manual handoffs. The carrier was modernizing their side; MDSi needed an integration platform that could match it — production-grade, multi-environment, observable, and built to add new flows without touching existing ones.

The approach

Architected the integration as a constellation of Functional Design Document (FDD) implementations, each owning one business flow end-to-end. Within every FDD: a canonical 7-step pipeline — Trigger → Processor → Integrator → Success Handler → Failure Handler → Email Handler → Slack Handler. Same shape, every flow. New FDDs onboard by stamping the pattern. Terraform owns every Azure resource. Auth0 owns partner-facing M2M auth. Key Vault owns every secret. Rollbar instruments every handler.
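
The "stamp the pattern" idea can be sketched as pure data: the seven canonical steps are fixed code, and onboarding a flow just instantiates them. The naming convention below is a hypothetical illustration, not the platform's actual one:

```python
# The canonical 7-step pipeline from the paragraph above, as a fixed list.
CANONICAL_STEPS = [
    "trigger", "processor", "integrator",
    "success_handler", "failure_handler", "email_handler", "slack_handler",
]

def stamp_fdd(fdd_id: str, flow_name: str, env: str) -> list[str]:
    """Return the function-app names a new FDD would declare for one environment.

    Naming scheme is illustrative; the real platform's convention may differ.
    """
    return [f"func-{fdd_id}-{flow_name}-{step}-{env}" for step in CANONICAL_STEPS]

# Onboarding a new flow is another stamp call, not an architectural debate:
apps = stamp_fdd("fdd-2119", "po-receipts", "dev")
```

Seven handlers per flow per environment is also how 11 FDDs across three environments lands in the 200+ function-app range the case study describes.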

Stack

Azure Functions (Linux Consumption) · C# / .NET · Service Bus · Auth0 (M2M) · Azure Key Vault · Terraform · Rollbar · Dynamics 365 (D365 ERP) · GitHub Actions

What we built

  • 01 · 11 FDD implementations: purchase order receipts (FDD-2119), transfer-order acks + notices + shipments (FDD-2120/2121), ownership-change acks + notices (FDD-2122), inventory adjustments (FDD-2123), purchase-order notice + acks + receipts (FDD-6016 7A/7B), open-PO supply forecasts (FDD-6170)
  • 02 · Canonical 7-step pipeline per flow: Trigger / Processor / Integrator / Success / Failure / Email / Slack — same skeleton everywhere, easy to stamp and extend
  • 03 · 207 Linux Azure Function Apps under management — 69 per environment × DEV/UAT/PRD, all Terraform-managed
  • 04 · Service Bus messaging backbone for async flows; HTTP triggers for partner inbound
  • 05 · Auth0 M2M tokens with per-environment client IDs; partner outbound credentials never in git
  • 06 · Per-environment Azure Key Vault (kv-wareagle-{env}); every secret consumed via @Microsoft.KeyVault references in app settings
  • 07 · Rollbar error tracking integrated into every handler with per-environment project mapping
  • 08 · 12+ operator skills (deploy-verify, error-investigation, e2e-testing, polling-timestamp-reset, service-bus-purge) — runbooks-as-code for the on-call surface
  • 09 · D365 integration on the ERP side; partner API integration on the carrier side; clean separation between the two
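
The @Microsoft.KeyVault app-setting syntax mentioned in the list is Azure's documented reference format for App Service / Functions settings. A small helper can build it; the vault and secret names below are placeholders:

```python
def keyvault_reference(vault: str, secret: str) -> str:
    """Build an Azure app-setting value that resolves from Key Vault at runtime.

    Uses Azure's documented @Microsoft.KeyVault(SecretUri=...) reference format;
    the app's Managed Identity must have read access to the vault.
    """
    return f"@Microsoft.KeyVault(SecretUri=https://{vault}.vault.azure.net/secrets/{secret}/)"

# Illustrative app settings: the setting holds a pointer, never the secret itself.
app_settings = {
    "Auth0ClientSecret": keyvault_reference("kv-wareagle-dev", "auth0-client-secret"),
}
```

Because the setting contains only a URI, the secret value never lands in git, Terraform state output, or the portal's plain-text settings view.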

Outcomes

  • PRD outbound went live on the carrier’s production tier in April 2026 after Terraform-driven UAT validation cycles
  • New business flows onboard by copying the 7-step pattern — no architectural debate per flow, just config and integration code
  • 24/24 connectivity diagnostics passing across DEV/UAT/PRD; RBAC + TLS + app-settings drift impossible because Terraform is the source of truth
  • Operator runbooks captured as agent-callable skills — diagnosing a stuck timer or a failed handoff is one command instead of a tribal-knowledge query
  • Sensitive partner-facing secrets (Function host keys, Auth0 client secrets) never enter git — regenerated from Azure / Auth0 / 1Password on demand
MDSi · Sales operations · B2B sales enablement · Conversational AI quote builder · PowerDev engagement · 100% AI-built · Human oversight + coordination

Conversational quote-building agent that turns messy customer inputs into structured quotes — and gets smarter every time it runs

A chat interface that lets sales reps paste a loose customer email — or just type what they want in plain English — and have a complete, structured quote in seconds. Memory-backed: every quote it builds makes the next one faster and more accurate.

The challenge

MDSi sales reps spend hours per week translating messy customer inputs into structured quotes. A customer sends a forwarded email thread, a list of part numbers in a Slack message, a "we need 50 of those things we got last year" voice note — and the rep has to extract intent, look up SKUs, validate pricing, check inventory, and assemble a quote. The data is all there (ERP, CRM, quote history) but stitching it manually is the bottleneck. Reps want to paste the input and get a draft quote ready to review — not start from a blank quote template every time.

The approach

Built a conversational quote-building agent that takes free-form input — pasted emails, messy notes, natural-language requests — and produces a structured draft quote against MDSi’s ERP and CRM. The agent uses a persistent memory layer that learns from every quote it generates: customer preferences, common bundles, price history, who-buys-what patterns. Each quote teaches it more about the customer, the product mix, and the rep’s working style. Old quote history isn’t just queryable — it’s the training signal that makes the next quote smarter.

Stack

Conversational AI (LLM) · Persistent memory layer · CRM + ERP integration (Dynamics 365) · Catalog + pricing resolution · Audit-logged generation

What we built

  • 01 · Paste-anything input — customer email threads, loose notes, "we need X" requests, even copy-pasted Slack/Teams messages
  • 02 · Natural-language → structured quote: agent extracts intent, resolves SKUs against the catalog, validates against the customer’s account, applies the right pricing tier, drafts the quote
  • 03 · Memory layer that learns customer-specific patterns (preferred bundles, common quantities, special pricing, who they normally buy from)
  • 04 · Every quote generated feeds the memory — same customer next month gets a faster, more accurate draft
  • 05 · Inline conversation: rep can refine the draft by chatting ("add 10 of the rack-mount version, drop the warranty extension")
  • 06 · Audit-logged: every generated quote and the inputs that produced it are stored for review and sales-leader oversight

Outcomes

  • Quote turnaround time collapses from hours to minutes — paste the customer’s email, review the draft, adjust, send
  • Reps get full structured quotes back from messy inputs they used to retype by hand
  • The agent gets smarter for each rep and each customer — the more they use it, the less they have to correct
  • Loose customer requests (forwarded emails, voice-note transcripts, Slack messages) become first-class quote inputs
  • Sales leaders see the audit trail — which inputs led to which quotes — and tune the agent based on what reps are actually doing
Climb Analytics · Anthem Pest Control · B2B SaaS · Pest control operations analytics · PowerDev engagement · 100% AI-built · Human oversight + coordination

Vertical analytics platform for pest control operators — call center coaching + customer ops analytics with PCI/PAN-redacted call transcription

A pest control operations analytics platform that turns Podium calls and texts into coaching signal for technicians and CSRs — with AssemblyAI redacting every credit card number and PAN before a single transcript leaves the redaction pipeline. Anthem Pest Control was the lead customer.

The challenge

Pest control operators run on tight per-route economics. Customer calls and texts are where the business actually happens — booking, rescheduling, complaint handling, technician dispatch coordination — but most operators have zero visibility into call quality. Recordings sit in Podium unindexed; CSR coaching is anecdotal; bad calls are caught when the customer churns. Anthem Pest Control needed a way to turn every conversation into structured signal that drives staff training and customer-management decisions, while staying compliant: every pest control call ends with a credit card payment, so PCI / PAN redaction had to be airtight before anything reaches an LLM or a transcript reviewer.

The approach

Built a domain-specific analytics platform for pest control operators with three connected layers: customer-facing operator dashboards, raw-data exploration for power users, and a Podium call/text analytics layer powered by AssemblyAI's transcription API for PII/PAN redaction at the source, with LeMUR applied to redacted transcripts for coaching analysis. Every call is redacted at the audio + transcript boundary before any human or LLM sees it; redacted transcripts then flow into coaching insights, customer-mood signals, and incident analysis.

Stack

Next.js 15 App Router · TypeScript · AssemblyAI (transcription API PII/PAN redaction + LeMUR coaching analysis) · Podium API · Vercel · Doppler · Jira API · Sentry · Playwright

What we built

  • 01 · Operator dashboard with pest-control-specific KPIs (technician utilization, route density, recurring-customer health, treatment-type margin)
  • 02 · Raw-data exploration layer for power users — analytics, raw entities, and admin in one app
  • 03 · Podium call + text ingestion pipeline pulling every CSR conversation in real time
  • 04 · AssemblyAI transcription API with PII redaction policies — credit card numbers, PANs, SSNs, and other PCI/PII scrubbed at the audio + transcript level before any output moves downstream; LeMUR applied to the redacted transcripts for coaching and incident analysis
  • 05 · Coaching engine: scores every call on tone, resolution, upsell opportunity, churn signal — surfaces the top three coaching moments per CSR per week
  • 06 · Incident analysis: when a customer files a complaint, the agent pulls the redacted transcript, the booking trail, and the technician dispatch log into one timeline
  • 07 · In-app Sparkles feedback flow with hardened Jira ticket creation (paging-grade Sentry alert if the parent epic ever 404s — built after a 2026-04-27 incident where epic deletion silently broke ticket creation for 3.5 hours)
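
As a sketch of what redaction-at-source configuration looks like, here is an illustrative transcription request body using AssemblyAI's documented PII-redaction fields. The audio URL is a placeholder, and the exact policy list the platform enables is an assumption:

```python
def redacted_transcription_request(audio_url: str) -> dict:
    """Illustrative AssemblyAI transcription request with PII redaction on.

    Field names follow AssemblyAI's documented REST parameters; the policy
    selection here is a plausible PCI-focused subset, not the platform's
    actual configuration.
    """
    return {
        "audio_url": audio_url,
        "redact_pii": True,        # scrub entities from the transcript text
        "redact_pii_audio": True,  # also remove them from the returned audio
        "redact_pii_sub": "hash",  # substitute hashes rather than blanks
        "redact_pii_policies": [
            "credit_card_number",
            "credit_card_cvv",
            "credit_card_expiration",
            "us_social_security_number",
        ],
    }

body = redacted_transcription_request("https://example.com/call-recording.mp3")
```

Because redaction happens inside the transcription call itself, every downstream consumer (LeMUR, dashboards, human reviewers) only ever sees the scrubbed output.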

Outcomes

  • Pest control operators get coaching signal on CSR calls without listening to a single call themselves — manager time freed for customer-saving work
  • PCI/PAN redaction is automatic and audited — operators can transcribe calls without risking a card number ending up in an LLM context window
  • Customer incidents are reconstructable in seconds — a complaint at 3pm has its full call/text/dispatch trail in front of the manager by 3:01
  • Anthem Pest Control onboarded as the lead customer; product roadmap shaped by their real operations problems
  • Same-day shipping of bug fixes via the hardened Jira-feedback flow — every silent failure pages instead of dying in support
Your Peptide Brand (YPB) · DTC e-commerce · Health & wellness · PowerDev engagement · 100% AI-built · Human oversight + coordination

Full operations platform — ordering, fulfillment, COA, customer ops, and Podium-driven training & incident analysis

A microservice operations platform running an entire DTC peptide brand — orders, payments, fulfillment, regulatory COA workflow, Shipstation sync, and a Podium call/text analytics layer (with AssemblyAI PCI/PAN redaction) that drives customer management, incident analysis, and CSR training.

The challenge

YPB needed an operations stack that grew with the brand. Off-the-shelf e-commerce platforms (Shopify, WooCommerce alone) couldn’t handle the regulated-product workflows: Certificate of Analysis (COA) verification per product batch, customer compliance signals, multi-channel order routing, the back-office accounting handoff. And once orders were flowing, the next bottleneck became customer ops: every customer call and text in Podium contained signal — incident severity, churn risk, training opportunity — but no one had time to listen back to calls. Calls also contained credit card numbers (every order ends in a payment), so any analytics had to be PCI-safe by construction.

The approach

Architected a constellation of small focused services — each owning a single bounded context (orders, payments, fulfillment, customers, catalog, inventory, communications) — with thin shared infrastructure and a single CEO/admin command center pulling the live state of all of them together. Layered a Podium call/text analytics module on top with AssemblyAI's transcription API doing PII/PAN redaction at the audio + transcript boundary so every downstream consumer (dashboards, LLMs, human reviewers) only ever sees redacted content; LeMUR runs over the redacted transcripts for the coaching and analysis layer.

Stack

TypeScript · Next.js · WordPress (storefront plugins) · Doppler · Shipstation API · Podium API · AssemblyAI (transcription API PII/PAN redaction + LeMUR coaching analysis) · Postgres · Knowledge graph (Neo4j)

What we built

  • 01 · 15+ focused operations microservices: orders, payments, fulfillment, customers, catalog, inventory, communications
  • 02 · Regulated-product COA workflow: every SKU verified against a current batch certificate before it can be sold
  • 03 · Shipstation sync keeping fulfillment state mirrored both ways
  • 04 · AI-mediated internal support — staff questions resolved via the agent without human triage
  • 05 · Podium call + text ingestion across customer service and order-support channels
  • 06 · AssemblyAI transcription API with PII redaction policies — credit card numbers, PANs, expiry dates, CVVs scrubbed at the audio + transcript level before transcripts move downstream; LeMUR runs over the redacted transcripts for coaching analysis
  • 07 · Customer-management layer: redacted transcripts feed mood, churn-risk, and incident-severity signals back into the customer record
  • 08 · Incident analysis: when a complaint arrives, the platform reconstructs the full conversation history (Podium calls + texts) plus the order trail in one timeline
  • 09 · Staff training: weekly coaching reports score every CSR on tone, resolution rate, and missed upsell opportunities

Outcomes

  • YPB operates with a small ops team because the platform handles the routine work — orders flow from storefront through fulfillment to accounting without manual touch
  • COA workflow ensures every product offered has a compliant chain-of-evidence — non-compliant SKUs blocked at storefront, not after the sale
  • Adding a new service means standing up one repo with the shared scaffolding, not retrofitting a monolith
  • PCI/PAN redaction is automatic on every Podium call — analytics without a card-number leak risk
  • CSR coaching shifted from anecdotal to systematic — managers see the top three coaching moments per CSR per week
  • Incident-to-resolution timeline reconstruction in under a minute instead of an afternoon of digging through Podium recordings
ESS · Internal (Nexus) · Engineering productivity · Custom Jira UI · PowerDev engagement · 100% AI-built · Human oversight + coordination

Custom face over Jira — a focused, agent-aware project management surface that replaces the standard Jira UI for high-velocity teams

Standard Jira buries the work in clicks. Nexus is a custom UI sitting on top of the same Jira data, designed for the way agent-augmented teams actually plan, ship, and review — keyboard-first, agent-aware, and fast enough to stay open all day.

The challenge

The Jira UI was built for a world where one human moved one ticket through one workflow at a time. In an agent-augmented engineering org, half of the activity on any sprint is generated by autonomous agents (PRs opening, postmortems writing themselves, regression tickets created from runtime exceptions). The standard Jira interface obscures that signal: agent-created tickets look identical to human-created ones, cross-project queries take a dozen clicks, and the bulk operations engineers actually need (link these five issues to this epic, move this whole cluster to next sprint) are buried in admin menus.

The approach

Built Nexus as a custom Next.js app sitting on top of the Jira REST + GraphQL APIs — the data and the source of truth stay in Jira, but the workflow surface is rebuilt around the way agent-augmented teams actually work. Keyboard-first navigation, agent-vs-human ticket attribution surfaced visually, cross-project search by default, and bulk operations as first-class actions instead of buried admin tasks.
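
The agent-aware filters can be expressed as ordinary JQL sent to Jira's search API, keyed on each agent's service-account reporter. The account IDs below are hypothetical placeholders:

```python
# Hypothetical mapping from agent name to its Jira service-account ID.
AGENT_ACCOUNTS = {
    "optimus": "5f1c-optimus-bot",
    "sparkles": "5f1c-sparkles-bot",
}

def agent_backlog_jql(agent: str, days: int = 7) -> str:
    """JQL for 'everything this agent opened in the last N days'.

    The string would be passed to Jira's search endpoint; attribution works
    because each autonomous agent files tickets under its own account.
    """
    account_id = AGENT_ACCOUNTS[agent]
    return (
        f'reporter = "{account_id}" AND created >= -{days}d '
        f"ORDER BY created DESC"
    )

jql = agent_backlog_jql("optimus")
```

The same one-account-per-agent convention is what lets the UI render agent-created tickets visually distinct from human-created ones.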

Stack

Next.js 15 · TypeScript · Jira REST + GraphQL APIs · Confluence API · Neo4j (knowledge graph)

What we built

  • 01 · Custom Next.js UI over Jira’s REST + GraphQL APIs — Jira remains the system of record
  • 02 · Agent-vs-human attribution: every ticket shows whether it was opened by a human or by an autonomous agent (and which agent)
  • 03 · Keyboard-first navigation — every common action has a shortcut, mouse optional
  • 04 · Cross-project search and bulk operations as default behavior, not admin features
  • 05 · Agent-aware filters: "show me everything Optimus opened this week," "show me regression tickets from the autonomous bug-fix pipeline"
  • 06 · Live links between Jira tickets, the Confluence pages they reference, and the Neo4j knowledge graph nodes they relate to

Outcomes

  • Engineers spend less time navigating Jira and more time shipping — keyboard-first means common actions take a fraction of the click count
  • Agent-generated tickets are visually distinct, so humans can review the agent backlog at a glance
  • Cross-project work (which is most of the work in a multi-repo team) is no longer hostile to planning — bulk move, bulk link, and bulk re-prioritize all work
  • Jira stays the system of record (no migration, no risk) — the UI is the only thing that changes
AI Coding Guild · Education / Developer media · PowerDev engagement · 100% AI-built · Human oversight + coordination

Daily multi-series content pipeline — 8 articles per day, fully agent-generated

A content agent that researches, writes, edits, illustrates, and publishes 8 articles every day across 8 distinct series — security, databases, DevOps, web dev, mobile, architecture, AI/prompts, and a bridge series for senior devs.

The challenge

AI Coding Guild publishes daily across 8 specialized tracks. Doing that by hand would require a team of writers + editors + illustrators + publishers working every day. The economics never close. But mass-produced LLM slop ranks low and gets ignored. The bar is: real research, real cross-model editorial review, real illustrations, real publish — every day, at scale, with quality high enough that senior developers actually read it.

The approach

Built the content pipeline on a five-stage pattern — research, write, edit, illustrate, publish — with cross-model editorial review (one model writes, another reviews) and a planning step grounded in Perplexity research rather than just LLM imagination.

Stack

Python · Claude (Anthropic) · GPT (OpenAI) · Perplexity research API · Supabase + pgvector · Image generation

What we built

  • 01 · Stage 1 (Research): Perplexity-grounded research + planning per article
  • 02 · Stage 2 (Write): article generation with model rotation across the writer pool
  • 03 · Stage 3 (Edit): cross-model editorial pass — one model writes, another reviews
  • 04 · Stage 4 (Image): hero image generation matched to the article topic
  • 05 · Stage 5 (Publish): publish with embeddings generated for semantic search
  • 06 · 8 series defined declaratively in config — adding a series is a config change, not a code change
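
The declarative-series idea can be sketched as a fixed stage list plus a data-only series config; the series entries and field names below are illustrative, not the Guild's actual config:

```python
# The five stages are fixed code; the series list is pure data.
STAGES = ["research", "write", "edit", "illustrate", "publish"]

# Illustrative config: adding a series is one more dict, not new code.
# Writer/reviewer pairing enforces the cross-model editorial review.
SERIES = [
    {"slug": "security", "writer_model": "claude", "reviewer_model": "gpt"},
    {"slug": "databases", "writer_model": "gpt", "reviewer_model": "claude"},
]

def daily_plan(series_list: list[dict]) -> list[tuple[str, str]]:
    """One article per series per day, each walked through all five stages."""
    return [(s["slug"], stage) for s in series_list for stage in STAGES]

plan = daily_plan(SERIES)
```

With all 8 series in the config, the same function yields the 40 stage-executions behind the 8 articles that ship each day.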

Outcomes

  • Eight articles ship every single day, without exception — the same cadence a 6-person editorial team would aim for
  • Cross-model review catches the pattern of bugs each model has a blind spot for — a Claude-written piece reviewed by a GPT model, and vice versa
  • Embeddings + semantic search means readers find the article they need by meaning, not keyword
  • The same 5-stage pattern now powers ESS’s own blog and the Guild — written once, dogfooded twice
Goodwill · Nonprofit / Retail · Enterprise data integration · PowerDev engagement · 100% AI-built · Human oversight + coordination

Azure-based data platform — lake exports, migration tooling, notification fanout, SharePoint extraction

A multi-component Azure data platform for one of the country’s largest nonprofit retailers — moving data flow off ad-hoc scripts onto an audited, monitored platform with full retry, dead-lettering, and observability.

The challenge

Goodwill operates across a sprawling Microsoft estate — SharePoint lists, Azure SQL, multiple operational systems — with data that needs to flow from operations into a data lake for analysis and from the lake out to downstream systems. The integration layer between those systems had become a bottleneck: ad-hoc exports, scripts running on someone’s laptop, no audit trail, no observability when something failed silently.

The approach

Stood up a managed integration architecture — Azure Functions for the export and migration jobs, dedicated notification services for downstream fanout, a SharePoint list extractor for data trapped in collaboration tools, and a small constellation of admin websites and APIs — all on Azure SQL with Entra MFA on the secured vNet.

Stack

Azure Functions · C# / .NET · Azure SQL · SharePoint API · Entra MFA · Secured vNet

What we built

  • 01 · Data-lake export services — Azure Functions + console-app fallback for backfills
  • 02 · Migration tooling with full audit trail — repeatable, observable, recoverable
  • 03 · Notification services — downstream fanout (email, queue, webhook) for data events
  • 04 · SharePoint list data extractor — pulls data trapped in SharePoint lists into the platform
  • 05 · Multiple admin websites and APIs for different operational personas
  • 06 · Centralized logging service + observability layer with telemetry separated from app logic

Outcomes

  • Data flow between operational systems and the data lake moved off ad-hoc scripts onto an audited, monitored platform
  • Migrations and exports now run on a schedule with retry, dead-lettering, and observability
  • Data trapped in SharePoint is now first-class — analysts query it like any other source
  • The platform served as the integration backbone for the period the engagement was active; archived once the internal team took it over
And about Agent OS — what does that look like?

Meet the Transformers.

Our internal Agent OS. 36 production agents organized as a real org chart — Sparkles (COS), Optimus Prime (CTO), Megatron (CFO), plus the rest of the team. We run our own business on it.

The PowerDev case studies above were shipped through the model we sell — one Orchestrator (Tom) commanding Claude, Codex, and Gemini directly. The Transformers are something else: an Agent OS that already runs the rest of ESS's business — invoicing, email triage across 12 inboxes, finance ops, security monitoring, customer comms, facility automation.

An Agent OS for your business is bespoke to your operations, your data, your team — not a clone of ours.

Meet the AI Staff

  • Tom Hundley · CEO
  • Sparkles · COS
  • Optimus Prime · CTO
  • Megatron · CFO
  • Wheeljack · Senior Dev
  • Hot Rod · Engineer
What it actually feels like to build this

Every agent on the chart is evolving every single day.

The Transformers aren't a finished system. They're a fleet that's growing, learning, and coming online in real time.

It's trial and error every day. Like chewing glass. But it's working — and the agents are coming online in front of our eyes. It's amazing to watch.

The journey so far

  • 01 · MCP integrations (wiring tools to LLMs) · Done
  • 02 · Skills by hand (manual invocation) · Done
  • 03 · OpenClaw framework (proper agent runtime) · Done
  • 04 · Autonomy (acting on their own: event-driven messaging and triggers, scheduled jobs) · In progress · We are here

Want one of these for your business?

Same approach, your stack, your operations. $35K/month, flat, per Orchestrator. Month-to-month.

Case Studies — Real engagements ESS has shipped | Elegant Software Solutions