🤖 Ghostwritten by Claude Opus 4.5 · Curated by Tom Hundley

This article was written by Claude Opus 4.5 and curated for publication by Tom Hundley.

The Browser Agent Bridge: When APIs Fail, Prompts Prevail

Field notes from the trenches of agentic development.

Here's a scenario that happens more often than you'd expect: You're deep in a workflow with Claude Code, everything's flowing, and then you hit a wall. The task requires updating something on a website—but there's no API. No MCP server. No programmatic access at all.

The old approach? Sigh, context-switch out of your flow, manually log into the website, click around, and update things by hand. Then try to remember where you were.

The new approach? Have your coding agent write a prompt for your browser agent. Copy, paste, execute, copy results back. Stay in the flow.

This sounds almost too simple to be worth writing about. But it's been working remarkably well—well enough that it's become a core part of my daily workflow.

The Problem: The API Gap

Claude Code with MCP servers is incredibly powerful. GitHub, Slack, databases, file systems—when there's an API or MCP available, you're in automation heaven.

But not everything has an API:

Account settings pages that only exist in web UIs
Admin panels for SaaS tools with no developer access
Legacy systems where the only interface is a browser
Third-party services that haven't built APIs for your use case
Forms and submissions that require browser interaction

When you hit these, the typical response is: "I can't automate that. You'll need to do it manually."

That's no longer true.

The Solution: Browser Agents as the Bridge

Browser agents have gotten good. Not "interesting demo" good—genuinely useful for real tasks. They can:

Navigate complex web interfaces
Fill forms accurately
Click through multi-step workflows
Read and extract information from pages
Handle authentication (with your supervision)
Report back with structured results

A note on reliability: Browser agents aren't perfect. Benchmark success rates hover around 40-60% for complex tasks. Simple, well-defined tasks work much better than ambiguous ones. Always verify the results, especially for anything important.

The tools I've been using:

ChatGPT agent — OpenAI's browser automation, integrated directly into ChatGPT (previously called "Operator")
ChatGPT Atlas — OpenAI's AI-powered browser with built-in agent capabilities
Gemini with Chrome — Google's agentic features integrated with the browser
Anthropic's computer use — Claude's own computer control capabilities

Each has strengths and quirks, but they all share one capability: they can execute natural language instructions in a browser. The landscape is evolving rapidly—new tools appear regularly, and existing ones keep improving.

The Workflow: The Copy-Paste Loop

Here's the pattern that's been working:

Step 1: Hit the Wall

You're working in Claude Code and reach a point where you need to do something in a browser:

"I need to update the DNS settings in Cloudflare, but the API doesn't support this specific setting."

Or:

"I need to submit this form on the client's portal, but they don't have an API."

Step 2: Ask for a Browser Agent Prompt

Instead of switching to manual mode, ask Claude Code to write the prompt:

"Write me a prompt for a browser agent that will update the DNS TXT record for _dmarc.example.com to the following value: [value]. Include the login steps and verification."

Claude Code generates something like:

Navigate to cloudflare.com and log in to my account.
Go to the DNS settings for example.com.
Find or create a TXT record with name "_dmarc".
Set the value to: "v=DMARC1; p=reject; rua=mailto:dmarc@example.com"
Save the changes.
Verify the record appears in the list.
Report back with confirmation or any errors.

Step 3: Execute in Browser Agent

Open your browser agent of choice—ChatGPT, Atlas, Gemini, or whichever you prefer. Paste the prompt. Watch it work.

The browser agent navigates, clicks, fills, and completes the task. For well-defined tasks, it's remarkably effective. Keep an eye on it the first few times to build trust in the workflow.

Step 4: Copy Results Back

When the browser agent finishes, it reports back:

"Successfully added TXT record _dmarc.example.com. The record is now visible in the DNS list. Propagation typically takes 5-10 minutes."

Copy this back to Claude Code:

"The browser agent reports: [paste result]. Please continue with the next step."

Step 5: Continue the Flow

Claude Code picks up right where you left off, now with confirmation that the manual step is complete.

Why This Works So Well

Context Preservation

The biggest win is staying in your flow. Your coding agent maintains full context of what you're building, why you needed that DNS change, and what comes next. You never fully context-switch to "manual mode."

Prompt Quality

Claude Code writes excellent browser agent prompts. It knows:

What information the browser agent needs
How to structure multi-step instructions
What verification steps to include
How to request structured responses

When you write prompts for browser agents yourself, they're often vague or miss steps. When Claude Code writes them, they're specific and complete.

Error Handling

Browser agents report back with details. If something fails, you get:

"Could not find DNS settings. The account appears to have multiple domains. Please specify which domain: example.com, staging.example.com, or dev.example.com."

You paste this back to Claude Code, which can then refine the prompt:

"Write an updated prompt that specifies example.com as the target domain."

Speed

This loop—write prompt, paste, execute, paste back—takes less time than:

Stopping what you're doing
Opening the browser
Remembering what you needed to do
Navigating to the right page
Making the change
Remembering where you were in your code

The browser agent handles steps 2-5. You just manage the handoff.

Practical Examples

Example 1: Vercel Environment Variables

Claude Code needs to add an environment variable to Vercel, but you're at the free tier without API access:

"Write a browser agent prompt to add an environment variable called RESEND_API_KEY with value [key] to my project 'my-webapp' on Vercel. Apply it to Production, Preview, and Development environments."

Browser agent executes, reports success. Claude Code continues building the email feature.

Example 2: Google Search Console Verification

You need to verify a new domain in Google Search Console:

"Write a browser agent prompt to add the domain elegantsoftwaresolutions.com to Google Search Console using DNS verification. Report back with the TXT record value I need to add."

Browser agent navigates, initiates verification, extracts the TXT value. You paste back to Claude Code, which then writes the prompt for adding that TXT record to your DNS provider.

Example 3: SaaS Admin Panel Updates

A client needs settings changed in their CRM that has no API:

"Write a browser agent prompt to navigate to HubSpot, go to Settings > Properties > Contact Properties, and add a new single-line text property called 'AI_Score' with the description 'Automated lead scoring from our AI system'."

Browser agent handles the multi-step navigation and form filling. You get confirmation and can continue with the integration.

Tips for Better Results

Be Specific About Verification

Always ask for verification in your prompts:

"After making the change, verify it appears correctly and report the exact values you see."

This catches errors before they cascade.

Browser agents handle auth, but be clear about what account:

"Log in to Cloudflare using the credentials stored in my browser (you should see the dashboard for Elegant Software Solutions)."

Request Structured Output

When you need data back, specify the format:

"Report back with: (1) whether the change was successful, (2) the exact URL of the page, and (3) any confirmation messages shown."

Break Down Complex Tasks

For multi-step processes, consider breaking into separate prompts:

"First, write a prompt to check the current state. After I get the results, we'll write the update prompt."

This gives you a checkpoint and reduces risk.

The Meta-Workflow

Here's the beautiful thing: this workflow is itself agentic. You're orchestrating two specialized agents:

Claude Code — Expert at code, context, reasoning, and prompt generation
Browser Agent — Expert at navigating and manipulating web interfaces

You're the human in the loop, but your job has become coordination, not execution. You copy-paste between agents, provide oversight, and make decisions. The agents do the work.

This is what agentic development feels like in practice. Not one all-powerful AI that does everything, but specialized agents that you compose together, with yourself as the orchestration layer.

Looking Forward

This copy-paste workflow is a bridge. As MCP servers proliferate and APIs improve, more of these manual steps will become automated. Browser agents are getting better every month.

But right now, in the real world, you hit walls. And when you do, this pattern works. It's been one of the most practically useful additions to my daily workflow.

The key insight: when your coding agent hits a wall, don't drop into manual mode. Ask it to write a prompt for another agent. Stay in the flow.

Have you been using browser agents in combination with coding agents? I'd love to hear what patterns you've discovered. Reach out at elegantsoftwaresolutions.com/contact.