🤖 Ghostwritten by Claude Opus 4.5 · Curated by Tom Hundley
Field notes from the trenches of agentic development.
Here's a scenario that happens more often than you'd expect: You're deep in a workflow with Claude Code, everything's flowing, and then you hit a wall. The task requires updating something on a website—but there's no API. No MCP server. No programmatic access at all.
The old approach? Sigh, context-switch out of your flow, manually log into the website, click around, and update things by hand. Then try to remember where you were.
The new approach? Have your coding agent write a prompt for your browser agent. Copy, paste, execute, copy results back. Stay in the flow.
This sounds almost too simple to be worth writing about. But it's been working remarkably well—well enough that it's become a core part of my daily workflow.
Claude Code with MCP servers is incredibly powerful. GitHub, Slack, databases, file systems—when there's an API or MCP available, you're in automation heaven.
But not everything has an API:
When you hit these, the typical response is: "I can't automate that. You'll need to do it manually."
That's no longer true.
Browser agents have gotten good. Not "interesting demo" good—genuinely useful for real tasks. They can:
A note on reliability: Browser agents aren't perfect. Benchmark success rates hover around 40-60% for complex tasks. Simple, well-defined tasks work much better than ambiguous ones. Always verify the results, especially for anything important.
The tools I've been using:
Each has strengths and quirks, but they all share one capability: they can execute natural language instructions in a browser. The landscape is evolving rapidly—new tools appear regularly, and existing ones keep improving.
Here's the pattern that's been working:
You're working in Claude Code and reach a point where you need to do something in a browser:
"I need to update the DNS settings in Cloudflare, but the API doesn't support this specific setting."
Or:
"I need to submit this form on the client's portal, but they don't have an API."
Instead of switching to manual mode, ask Claude Code to write the prompt:
"Write me a prompt for a browser agent that will update the DNS TXT record for _dmarc.example.com to the following value: [value]. Include the login steps and verification."
Claude Code generates something like:
Navigate to cloudflare.com and log in to my account.
Go to the DNS settings for example.com.
Find or create a TXT record with name "_dmarc".
Set the value to: "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"
Save the changes.
Verify the record appears in the list.
Report back with confirmation or any errors.
Open your browser agent of choice: ChatGPT, Atlas, Gemini, or whichever you prefer. Paste the prompt. Watch it work.
The browser agent navigates, clicks, fills, and completes the task. For well-defined tasks, it's remarkably effective. Keep an eye on it the first few times to build trust in the workflow.
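If you end up asking for the same kind of prompt again and again, the shape is regular enough to template. Here is a minimal sketch in Python, assuming the structure of the Cloudflare prompt above: a login step, the task steps, a verification step, and a report request. `BrowserTask` and `build_browser_prompt` are hypothetical names for illustration, not part of Claude Code or any browser agent.

```python
from dataclasses import dataclass


@dataclass
class BrowserTask:
    """Hypothetical container for one browser-agent handoff."""

    site: str
    steps: list[str]
    verify: str = "Verify the change appears correctly."
    report: str = "Report back with confirmation or any errors."


def build_browser_prompt(task: BrowserTask) -> str:
    """Join login, task steps, verification, and report request into one prompt."""
    lines = [f"Navigate to {task.site} and log in to my account."]
    lines.extend(task.steps)
    lines.append(task.verify)
    lines.append(task.report)
    return "\n".join(lines)


# The DMARC example from this article, expressed as data.
dmarc_task = BrowserTask(
    site="cloudflare.com",
    steps=[
        "Go to the DNS settings for example.com.",
        'Find or create a TXT record with name "_dmarc".',
        'Set the value to: "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"',
        "Save the changes.",
    ],
    verify="Verify the record appears in the list.",
)

print(build_browser_prompt(dmarc_task))
```

In practice the coding agent still writes the steps; a template like this only keeps the verification and report lines from being forgotten.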
When the browser agent finishes, it reports back:
"Successfully added TXT record _dmarc.example.com. The record is now visible in the DNS list. Propagation typically takes 5-10 minutes."
Copy this back to Claude Code:
"The browser agent reports: [paste result]. Please continue with the next step."
Claude Code picks up right where you left off, now with confirmation that the manual step is complete.
The biggest win is staying in your flow. Your coding agent maintains full context of what you're building, why you needed that DNS change, and what comes next. You never fully context-switch to "manual mode."
Claude Code writes excellent browser agent prompts. It knows:
When you write prompts for browser agents yourself, they're often vague or miss steps. When Claude Code writes them, they're specific and complete.
Browser agents report back with details. If something fails, you get:
"Could not find DNS settings. The account appears to have multiple domains. Please specify which domain: example.com, staging.example.com, or dev.example.com."
You paste this back to Claude Code, which can then refine the prompt:
"Write an updated prompt that specifies example.com as the target domain."
This loop—write prompt, paste, execute, paste back—takes less time than:
The browser agent handles steps 2-5. You just manage the handoff.
Claude Code needs to add an environment variable to Vercel, but you're on the free tier without API access:
"Write a browser agent prompt to add an environment variable called RESEND_API_KEY with value [key] to my project 'my-webapp' on Vercel. Apply it to Production, Preview, and Development environments."
Browser agent executes, reports success. Claude Code continues building the email feature.
You need to verify a new domain in Google Search Console:
"Write a browser agent prompt to add the domain elegantsoftwaresolutions.com to Google Search Console using DNS verification. Report back with the TXT record value I need to add."
Browser agent navigates, initiates verification, extracts the TXT value. You paste that back to Claude Code, which then writes the prompt for adding that TXT record to your DNS provider.
A client needs settings changed in their CRM that has no API:
"Write a browser agent prompt to navigate to HubSpot, go to Settings > Properties > Contact Properties, and add a new single-line text property called 'AI_Score' with the description 'Automated lead scoring from our AI system'."
Browser agent handles the multi-step navigation and form filling. You get confirmation and can continue with the integration.
Always ask for verification in your prompts:
"After making the change, verify it appears correctly and report the exact values you see."
This catches errors before they cascade.
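Once the agent reports the exact values, you can compare them against what you asked for instead of eyeballing them. Here is a minimal sketch, assuming the value is a semicolon-separated tag list like the DMARC record from earlier. `parse_tag_list` and `same_record` are hypothetical helper names, and the comparison only ignores spacing and surrounding quotes; it does not validate DMARC semantics.

```python
def parse_tag_list(record: str) -> dict[str, str]:
    """Split a 'tag=value; tag=value' string into a dict, ignoring spacing and outer quotes."""
    record = record.strip().strip('"')
    tags: dict[str, str] = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        tags[name.strip().lower()] = value.strip()
    return tags


def same_record(expected: str, reported: str) -> bool:
    """True when both strings carry the same tags and values."""
    return parse_tag_list(expected) == parse_tag_list(reported)


EXPECTED = "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"

# What the browser agent pasted back (an illustrative sample, not real output).
reported = '"v=DMARC1;p=reject;  rua=mailto:dmarc@example.com"'
print(same_record(EXPECTED, reported))
```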
Browser agents handle auth, but be clear about which account to use:
"Log in to Cloudflare using the credentials stored in my browser (you should see the dashboard for Elegant Software Solutions)."
When you need data back, specify the format:
"Report back with: (1) whether the change was successful, (2) the exact URL of the page, and (3) any confirmation messages shown."
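A fixed format also makes the reply easy to check mechanically before you paste it onward. The sketch below assumes you ask for three labeled lines rather than the numbered sentence above. The labels `SUCCESS`, `URL`, and `MESSAGE` and the function `parse_report` are made up for illustration, and a browser agent may not follow the format exactly, so the parser fails loudly when a field is missing.

```python
import re

# Text to append to the browser-agent prompt (hypothetical format).
REPORT_FORMAT = (
    "Report back using exactly these three lines:\n"
    "SUCCESS: yes or no\n"
    "URL: the exact URL of the page\n"
    "MESSAGE: any confirmation message shown"
)

_FIELD = re.compile(r"\s*(SUCCESS|URL|MESSAGE)\s*:\s*(.*)")


def parse_report(text: str) -> dict[str, str]:
    """Extract the three labeled fields; raise if the agent left one out."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group(1).lower()] = match.group(2).strip()
    missing = {"success", "url", "message"} - fields.keys()
    if missing:
        raise ValueError(f"report is missing fields: {sorted(missing)}")
    return fields


# Illustrative reply, not real agent output.
sample = (
    "SUCCESS: yes\n"
    "URL: https://example.com/dns\n"
    "MESSAGE: Record saved."
)
print(parse_report(sample))
```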
For multi-step processes, consider breaking them into separate prompts:
"First, write a prompt to check the current state. After I get the results, we'll write the update prompt."
This gives you a checkpoint and reduces risk.
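One way to picture the checkpoint is as two prompts, where the second refuses to act if the page no longer matches what the first one reported. This is a sketch under assumptions: `check_prompt` and `update_prompt` are hypothetical helpers, and the "stop if the value differs" guard is an extra safety step added here, not something the browser agents do on their own.

```python
def check_prompt(site: str, setting: str) -> str:
    """Read-only prompt: look at the current state without changing it."""
    return (
        f"Navigate to {site} and log in to my account.\n"
        f"Find {setting}.\n"
        "Do not change anything.\n"
        "Report back with the exact current value you see."
    )


def update_prompt(site: str, setting: str, current: str, new: str) -> str:
    """Update prompt that stops if the page differs from the checked state."""
    return (
        f"Navigate to {site} and log in to my account.\n"
        f"Find {setting}.\n"
        f'If the current value is not "{current}", stop and report what you see.\n'
        f'Otherwise, change it to "{new}" and save.\n'
        "Verify the new value appears and report the exact value you see."
    )


setting = "the DMARC policy in the _dmarc TXT record for example.com"

# Step 1: paste this, then read the agent's report.
print(check_prompt("cloudflare.com", setting))

# Step 2: only after the report confirms the current value (here assumed to be "p=none").
print(update_prompt("cloudflare.com", setting, current="p=none", new="p=reject"))
```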
Here's the beautiful thing: this workflow is itself agentic. You're orchestrating two specialized agents:
You're the human in the loop, but your job has become coordination, not execution. You copy-paste between agents, provide oversight, and make decisions. The agents do the work.
This is what agentic development feels like in practice. Not one all-powerful AI that does everything, but specialized agents that you compose together, with yourself as the orchestration layer.
This copy-paste workflow is a bridge. As MCP servers proliferate and APIs improve, more of these manual steps will become automated. Browser agents are getting better every month.
But right now, in the real world, you hit walls. And when you do, this pattern works. It's been one of the most practically useful additions to my daily workflow.
The key insight: when your coding agent hits a wall, don't drop into manual mode. Ask it to write a prompt for another agent. Stay in the flow.
Have you been using browser agents in combination with coding agents? I'd love to hear what patterns you've discovered. Reach out at elegantsoftwaresolutions.com/contact.