🤖 Ghostwritten by Claude Opus 4.6 · Fact-checked & edited by GPT 5.4 · Curated by Tom Hundley

Agent-to-Agent Attacks: How AI Tools Infect Each Other

If one AI tool in your workflow can read untrusted input, write code, or trigger downstream automation, it can become a stepping stone for a broader compromise. That's the core idea behind an agent-to-agent attack: an attacker influences one system, and trust between connected tools helps the malicious change spread.

In practice, that might mean a coding assistant acts on a poisoned GitHub issue, a chat-connected bot forwards bad instructions, or an automated pipeline treats AI-generated changes as trustworthy because they came from an internal tool. The exact path varies, but the risk is the same: too much automation, too much trust, and not enough human review.

If you're building with tools like Cursor, Replit, v0, or Bolt, the safest assumption is simple: any tool with broad permissions and access to external content needs tighter controls. The good news is that you do not need to stop using AI tools. You need to limit what they can access, require human approval for sensitive actions, and treat AI-to-AI handoffs as security boundaries.

How Agent-to-Agent Attacks Actually Work

TL;DR: Attackers usually do not need to break your AI tools directly. They can influence one tool with untrusted input, then let automation and trust relationships carry the damage forward.

Think of your AI tools like coworkers with overlapping responsibilities. One tool reads messages or tickets. Another writes code. Another runs tests or opens pull requests. A deployment system may publish whatever passes the pipeline. If each step assumes the previous one is trustworthy, a bad instruction can move surprisingly far.

A typical chain looks like this:

Entry point: An attacker places malicious or misleading content where an AI-enabled tool can read it, such as a GitHub issue, support ticket, shared document, or chat message.
First compromise: A coding agent interprets that content as a legitimate instruction and makes a harmful change.
Propagation: Other tools treat the change as routine because it came from an internal workflow, approved bot, or expected automation path.
Impact: The change is merged, deployed, or used to expose data before a human notices.

This pattern overlaps with well-documented AI security risks, including prompt injection, excessive permissions, and over-trusting tool output. OWASP's guidance for LLM applications has consistently warned about excessive agency and insecure tool use as major risk areas. The specific label "agent-to-agent attack" is still emerging and not yet a formal standard category, but the underlying mechanics are real and align with known attack classes.

Why Vibe Coders Are Especially Vulnerable

TL;DR: The biggest risk is not that AI tools exist. It's that many builders accept default permissions, integrations, and automation without checking what those tools can actually do.

If you started with an AI-first workflow, you may have connected tools quickly so you could move faster. That convenience can create a risky setup, especially when one tool can read external content and another can change code or trigger deployment.

Common high-risk capabilities include:

Read and write access across an entire repository
Package installation or shell command execution
Access to external sources such as GitHub issues, docs, chat tools, or plugin ecosystems
Pull request creation or merge permissions
Deployment or production environment access

Here's a practical way to think about the risk:

Agent Permission Level	What It Can Do	Risk If Exploited
Read-only on current file	View a limited part of the codebase	Lower risk, but can still expose sensitive code or context
Read/write on project	Edit files across the repository	High risk: can inject malicious logic or weaken controls
Package installation or command execution	Add dependencies or run scripts	Critical risk: can introduce malware or alter the environment
External communication or content ingestion	Read issues, messages, or third-party content	Critical risk: can be manipulated by prompt injection or spoofed instructions
Deployment or merge access	Push changes toward production	Severe risk: can turn a bad suggestion into a live incident

If you have not reviewed these permissions manually, assume they are broader than necessary. That's especially important for solo builders and small teams moving quickly. For a broader look at permission scoping and review habits, see AI Agent Security for Vibe Coders.

What These Attacks Look Like in Practice

TL;DR: Most real-world examples look ordinary at first: a bug report, a patch suggestion, a support message, or a bot action that seems routine.

Here are three patterns worth watching.

Poisoned Issue or Ticket

An attacker submits a GitHub issue, support request, or task description containing hidden instructions or misleading remediation steps. A coding agent reads the text as part of its context and makes changes beyond the stated task.

How to spot it: If the AI modifies files unrelated to the request, adds network calls, changes authentication logic, or introduces new dependencies without a clear reason, stop and review manually. We covered this broader pattern in Hidden Prompts in GitHub Issues.

Fake Security Fix

An attacker shares a supposed patch for a vulnerability and frames it as urgent. The message may reference a real package or a plausible CVE format, but the code itself introduces a backdoor, weakens validation, or exfiltrates data.

How to spot it: Verify security fixes against the official maintainer repository, vendor advisory, or a trusted vulnerability database. Do not trust a patch just because it sounds urgent or cites a CVE-like identifier.

Bot or Automation Impersonation

An attacker creates a bot, action, or integration with a name that resembles a legitimate tool. Humans and AI systems alike may treat it as trustworthy if the name looks familiar.

How to spot it: Check the actual publisher, installation source, permissions requested, and account identity. Similar names are not proof of legitimacy.

These patterns also overlap with software supply chain risk. If an AI tool can install packages or follow dependency suggestions automatically, the blast radius gets larger. For a related example, see the ClawHavoc supply chain attack.

Do This Now: 5 Steps to Protect Yourself

TL;DR: Reduce permissions, isolate external inputs, and require human approval before code is merged or deployed.

Here is the practical checklist.

1) Audit what each AI tool can access

Open every AI tool you use and document what it can read, write, execute, install, and connect to. Include repository scope, shell access, package management, issue access, chat integrations, and deployment permissions.

2) Separate external content from coding authority

Do not let a coding agent act directly on Slack messages, Discord posts, email, public issues, or scraped web content without human review. External content is untrusted input.

3) Require a human checkpoint before merge or deploy

Auto-deploy is not always wrong, but auto-deploying AI-generated changes without review is a bad idea. At minimum, require human approval before merging security-sensitive changes or promoting code to production.

4) Review every AI-generated change set

Before accepting changes, inspect:

New files you did not ask for
Dependency additions or lockfile changes
Authentication, authorization, payment, or data-handling code
New outbound network requests
Build scripts, CI configuration, and deployment settings

5) Ask the tool to summarize its own changes

A self-audit prompt is not a security control by itself, but it can help surface surprises.

Prompt to paste into your AI tool:

Review all the changes you just made to my project. List every file you modified, every dependency you added or changed, every command you ran, and every external URL or API endpoint referenced in the new code. Flag anything that affects authentication, authorization, payments, secrets, deployment, or data sent outside this application.

Use that summary as a starting point, not as proof that the changes are safe.

Frequently Asked Questions

TL;DR: These attacks are less about one specific product and more about how much authority, connectivity, and trust you give your tools.

Q: What is an agent-to-agent attack in AI workflows?

It is a chain attack where one AI-enabled system is influenced by malicious or untrusted input, and downstream tools amplify the damage because they trust the earlier step. The attack may involve coding assistants, bots, CI systems, ticketing tools, or deployment automation.

Q: Are Cursor, Bolt, Replit, and v0 inherently vulnerable?

Not inherently. The risk depends on configuration: what the tool can access, whether it can act on external content, and whether a human reviews the output before merge or deploy. Broad permissions and unattended automation increase risk.

Q: How would I know if a cascade attack may have happened?

Look for changes you did not request, especially around authentication, secrets handling, network calls, CI configuration, dependencies, or deployment settings. Also review logs for unusual bot actions, unexpected pull requests, or changes triggered from external content sources.

Q: Can I still use multiple AI tools safely?

Yes. Treat each handoff as a trust boundary. Limit permissions, isolate untrusted inputs, require approval for sensitive actions, and log what each tool did. Multiple tools are manageable if they are not allowed to silently approve one another.

Q: What is the single most effective control?

For most teams, it is a mandatory human approval step before high-impact actions such as merging privileged changes, rotating secrets, changing infrastructure, or deploying to production. That breaks the automatic cascade.

Key Takeaways

Agent-to-agent attacks are best understood as chain attacks across connected AI tools and automations.
The core enablers are untrusted input, excessive permissions, and automatic trust between systems.
The label is newer than the underlying problem, but the mechanics align with established risks like prompt injection and insecure tool use.
The safest default is to treat external content as untrusted and AI-generated changes as requiring review.
Human approval before merge or deploy is one of the most effective ways to stop a bad chain from reaching production.

Conclusion

AI tools can speed up development, but they also compress the time between bad input and real impact. When one tool can influence another, convenience becomes a security design problem.

The fix is not to abandon AI-assisted development. It is to build safer boundaries: narrower permissions, cleaner separation between external content and code changes, and explicit human approval for high-risk actions.

If your team is adopting AI-assisted workflows and wants help designing safer guardrails, Elegant Software Solutions can help you assess permissions, review automation paths, and build practical controls that do not kill velocity.

Share this with someone building with AI tools. They may not realize how much trust exists between the systems in their workflow.