
🤖 Ghostwritten by GPT 5.4 · Fact-checked & edited by Claude Opus 4.6 · Curated by Tom Hundley
On March 11, 2026, DryRun Security published a result that should make every vibe coder pause: in its testing of major AI coding agents building full applications through pull requests, 87% of those pull requests introduced security flaws. Across 38 scans, DryRun reported 143 vulnerabilities. This is no longer a hypothetical problem. AI-generated insecure code is common enough that you should assume it is present until you check.
If you use Claude, Codex, Gemini, Cursor, Replit, Bolt, v0, or Lovable, the lesson is simple: keep using AI, but stop trusting first drafts with your data, logins, payments, or customer records. The DryRun Security study found repeat problems like improper JWT handling, brute-force vulnerabilities, token replay exploits, and broken WebSocket authentication. In plain English, AI tools often build doors that look locked but are easy to push open.
I'm not telling you to stop vibe coding. I'm telling you not to learn security the hard way.
TL;DR: The scary part is not just the number of flaws — it is that the same basic mistakes kept showing up across different AI tools.
DryRun Security tested leading AI agents — including Claude, Codex, and Gemini — by having them build real application features through pull requests. According to DryRun Security's March 11, 2026 findings, 143 vulnerabilities appeared across 38 scans, and 87% of pull requests introduced security issues. These were not weird edge cases. They were the kind of mistakes attackers look for first.
The key takeaway: the models were different, but the failure pattern was similar. That suggests the problem is bigger than any one product. AI systems are very good at producing code that looks finished. They are not automatically good at producing code that is safe under pressure from real attackers.
Think of it like asking three different people to install the front door on your house. All three doors close. All three doors look nice. But if none of them know how deadbolts work, you still have a break-in problem.
The vulnerabilities DryRun highlighted are especially dangerous for vibe coders because they often sit in the exact parts of an app you are least equipped to inspect:

- Login and session code, which guards everything behind it
- Password reset and token handling, where one wrong check means account takeover
- Live connections like WebSockets, which keep running in the background after the page loads
- Permission checks that decide whose data each request can see
This is also why articles like The Reviewer's Toolkit: How to Babysit AI Code Effectively matter. The risk is not that AI writes ugly code. The risk is that it writes confident-looking code with hidden traps.
A definitive rule: If AI wrote your authentication code, assume it needs a security review. That is not paranoia. It is a practical response to what the DryRun Security study showed.
TL;DR: Most AI coding vulnerabilities boil down to weak locks, reusable tickets, and side doors nobody checked.
Let's translate the major issues into plain English.
A JWT is basically a digital claim ticket used to keep you signed in. If it is handled badly, someone can fake a ticket, keep using an old one, or use a ticket longer than they should.
Bad AI-generated patterns often include:

- Tokens that never expire, so an old ticket works forever
- Weak or missing signature checks, so a forged ticket passes
- Tokens that stay valid after logout or a password change
For vibe coders, the safe mindset is simple: if your app has login, your app has risk. JWT security mistakes are dangerous because they can let an attacker pretend to be a real user.
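To make the "claim ticket" concrete, here is a minimal sketch of what careful JWT checking looks like, using only Python's standard library. This is an illustration of the checks that matter (pinned algorithm, signature verification, required expiry), not the code any particular AI tool produces; a real app should use a maintained library such as PyJWT rather than hand-rolling this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64e(data: bytes) -> str:
    # JWT segments are base64url without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64d(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def make_jwt(payload: dict, secret: bytes) -> str:
    """Sign a payload as an HS256 JWT (demo only)."""
    header = _b64e(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64e(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64e(sig)}"


def verify_jwt(token: str, secret: bytes) -> dict:
    """Verify signature AND expiry; raise ValueError on any failure."""
    header_b64, body_b64, sig_b64 = token.split(".")
    # Pin the algorithm: never let the token pick its own (blocks "alg: none" tricks)
    if json.loads(_b64d(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{body_b64}".encode(), hashlib.sha256).digest()
    # Constant-time compare, so timing differences don't leak signature bytes
    if not hmac.compare_digest(expected, _b64d(sig_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64d(body_b64))
    # A ticket with no expiry, or an expired one, is rejected outright
    if payload.get("exp", 0) <= time.time():
        raise ValueError("missing or expired exp claim")
    return payload
```

Notice that "missing expiry" is treated the same as "expired": the insecure pattern is not usually a broken signature, it is a check that was never written.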
This means an attacker can keep trying passwords or codes over and over with no meaningful slowdown. Imagine a keypad lock that lets someone test ten thousand combinations with no alarm, no delay, and no lockout.
This happens when a valid login token can be stolen and reused. It is like someone finding a used concert wristband that still gets them through the gate.
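The defense against replay is server-side revocation: the server keeps its own list of live sessions, so a stolen token dies the moment the real user logs out. A minimal in-memory sketch (a real deployment would back this with a database or cache shared across servers):

```python
import secrets
import time


class SessionStore:
    """Server-side session registry: a stolen token stops working once revoked."""

    def __init__(self):
        self._live: dict[str, float] = {}  # token -> expiry timestamp

    def issue(self, ttl: float = 3600.0) -> str:
        # Unguessable token, with a hard expiry even if never revoked
        token = secrets.token_urlsafe(32)
        self._live[token] = time.time() + ttl
        return token

    def is_valid(self, token: str) -> bool:
        expiry = self._live.get(token)
        return expiry is not None and expiry > time.time()

    def revoke(self, token: str) -> None:
        # Called on logout or password change; replayed copies die with it
        self._live.pop(token, None)
```

This is the "wristband check" done right: the gate consults its own list instead of trusting whatever the wristband claims.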
WebSockets power things like live chat, live dashboards, and instant updates. AI tools sometimes protect the front door but forget the side door. A user may be checked when opening the app, but not properly rechecked on the live connection that keeps running in the background.
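The safe pattern is to apply the same session check at connect time and again on every message, because a session can be revoked while the socket stays open. This sketch strips out any real WebSocket library to show just the logic; the class and method names are illustrative, not an actual framework API.

```python
import time


class LiveConnection:
    """Sketch: authenticate the socket at connect AND recheck on every message."""

    def __init__(self, sessions: dict):
        self.sessions = sessions  # token -> expiry, shared with the HTTP layer
        self.user_token = None

    def on_connect(self, token: str) -> bool:
        # The same check a normal page request gets; no unlocked side door
        if self.sessions.get(token, 0) > time.time():
            self.user_token = token
            return True
        return False

    def on_message(self, payload: str) -> str:
        # Recheck: the session may have been revoked since the socket opened
        if self.user_token is None or self.sessions.get(self.user_token, 0) <= time.time():
            raise PermissionError("session no longer valid; closing connection")
        return f"echo: {payload}"
```

The common AI-generated mistake is doing only the `on_connect` half, leaving a live channel that outlives the login it was granted under.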
| Vulnerability | Plain-English meaning | What can go wrong |
|---|---|---|
| Improper JWT handling | Weak or badly checked login ticket | Account takeover |
| Brute-force vulnerability | Unlimited guess attempts | Password cracking |
| Token replay exploit | Stolen session reused | Silent impersonation |
| WebSocket auth issue | Live channel not properly locked | Private data leakage |
These are classic security failures dressed up in modern AI clothing. The models changed. The old mistakes did not.
TL;DR: AI writes insecure code because it predicts likely code patterns, not because it understands consequences like an attacker does.
AI coding tools are not lying to you when they produce working code. They are doing exactly what they were built to do: predict plausible next steps based on patterns. The problem is that "common" code and "secure" code are not the same thing.
A lot of insecure code on the internet still looks normal. When a model has seen thousands of examples that are functional but sloppy, it may reproduce those patterns with confidence. That is one reason the same AI coding vulnerabilities keep appearing in Claude, Codex, and other tools' output, even when it seems polished.
There is also a second problem: security requires defensive thinking. You must ask, "How would someone abuse this?" AI is better at "How do I make this feature work?" than "How do I stop a determined stranger from breaking it?"
That gap gets worse in full app generation. Once you ask an AI tool to create signup, login, billing, dashboards, and live features all together, context starts drifting. A secure check added in one file may not be carried into another. A rule applied to the web page may be forgotten in the background service. A password reset may be added without rate limits. A chat feature may use WebSockets without proper identity checks.
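The "rule applied in one file but forgotten in another" problem is easiest to see with per-record permission checks. Here is a tiny sketch of the check that must exist on every private lookup (the function and data shape are invented for illustration): being logged in is not the same as being allowed to see this record.

```python
def get_record(records: dict, record_id: str, requesting_user: str) -> dict:
    """Every private-record lookup checks ownership, not just login status."""
    record = records.get(record_id)
    if record is None:
        raise KeyError("record not found")
    if record["owner"] != requesting_user:
        # A valid session for *some* user is not permission for *this* record
        raise PermissionError("forbidden")
    return record
```

Context drift means an AI tool may write this correctly in the web handler and then omit it in the background job or the live-update path; the review prompts below are aimed at catching exactly that.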
This is also why supply-chain issues are dangerous. If you have not read ClawHavoc: AI Skills Supply Chain Attack Explained, do. Not all AI risk comes from your prompt. Sometimes the risk comes from what the tool pulls in around your prompt.
AI can accelerate building, but it cannot yet be trusted to self-police security. You still need checks after generation.
TL;DR: If your AI-built app touches logins, money, files, or private data, you need a simple security checklist before going live.
The practical rule: the moment your app handles user accounts or private information, you are no longer "just experimenting." You are responsible for protecting people.
DryRun Security's findings mean you should add AI code scanning and human review before deployment, even for small projects. You do not need a computer science degree to do the basics. You need a repeatable routine.
Before you publish, ask these questions:

- Can a stranger guess passwords or reset codes endlessly, with no lockout?
- Can an old or stolen login token still be used after logout?
- Can one user reach another user's private data?
- Are any API keys or secrets sitting in the code?
For that last one, read GitHub Secret Scanning Now Detects Vercel and Supabase Keys. Secret leakage is one of the fastest ways to turn a toy app into a breach.
If your AI tool supports code review, use prompts like this:
Review this project for security issues. Focus on login, session handling, password reset, rate limiting, private data access, file uploads, and live connections. Explain each risk in plain English, show the exact file, and propose the safest fix.
Then run a second pass with a different prompt:
Assume you are an attacker. Show me the three easiest ways to break this app, steal data, reuse sessions, or bypass login. Be harsh. Do not praise the code.
The goal is not to make the AI feel smart. The goal is to force it to look for failure.
TL;DR: You can reduce AI-generated insecure code by telling your tool exactly what security protections to include and what tests to perform.
Here is a prompt you can paste into Cursor, Replit, Claude, Codex, or a similar tool when building any app with accounts:
Build this feature with security first. Add rate limits for login and password reset. Make sessions expire. Invalidate old sessions after logout or password change. Protect live connections the same way as normal page requests. Never trust user input. Add permission checks for every private record. At the end, list the security protections you added and the remaining risks.
Here is a second prompt for review:
Audit this app for AI coding vulnerabilities. Specifically check for improper token handling, brute-force attacks, token replay, weak password reset logic, missing permission checks, exposed secrets, and insecure live connection authentication. Explain findings in plain English for a non-developer.
Before tomorrow, open one AI-built project and ask your coding tool these two questions:

- What security protections does my login and session code actually include?
- What are the easiest ways an attacker could break this app?
If the answers are vague, that is your warning.
Security is not anti-AI. It is how you keep using AI without hurting yourself or your users. The DryRun Security study did not prove vibe coding is doomed. It proved that trusting unreviewed output is reckless.
You've got this. See you tomorrow. Share this with someone who needs it.
No. The right response is not to abandon AI but to stop treating its first draft like finished work. Use AI for speed, then add AI code scanning, targeted review, and simple security checks before you deploy. The DryRun study showed that flaws are common — not that AI tools are useless.
Login and session code is usually the most dangerous because it protects everything behind it. If JWT handling, password reset, or session expiration is wrong, attackers may get full account access without much effort. This is also the code vibe coders are least likely to understand deeply enough to spot problems.
Start with behavior, not syntax. Ask your AI tool to explain where a stranger could guess passwords endlessly, reuse an old login, access another user's data, or connect to live features without proper checks. Then ask it to show the exact files and safest fixes. You can also use free tools like GitHub's secret scanning to catch leaked credentials.
They often show up as the same class of problem: code that works in normal use but fails under abuse. Different tools may make different mistakes in different places, but the pattern is consistent enough that you should apply the same security review routine to all of them.
At minimum, scan for hardcoded secrets, broken login logic, missing rate limits, weak permission checks, unsafe file uploads, and insecure real-time features. If your app handles payments, medical data, or customer records, do not launch without a professional security review.