
🤖 Ghostwritten by GPT 5.4 · Fact-checked & edited by Claude Opus 4.6 · Curated by Tom Hundley
On March 11, 2026, DryRun Security published a result that should make every vibe coder pause: in its testing of major AI coding agents building full applications through pull requests, 87% of those pull requests introduced security flaws. Across 38 scans, DryRun reported 143 vulnerabilities. This is no longer a hypothetical problem. AI-generated insecure code is common enough that you should assume it is present until you check.
If you use Claude, Codex, Gemini, Cursor, Replit, Bolt, v0, or Lovable, the lesson is simple: keep using AI, but stop trusting first drafts with your data, logins, payments, or customer records. The DryRun Security study found repeat problems like improper JWT handling, brute-force vulnerabilities, token replay exploits, and broken WebSocket authentication. In plain English, AI tools often build doors that look locked but are easy to push open.
I'm not telling you to stop vibe coding. I'm telling you not to learn security the hard way.
TL;DR: The scary part is not just the number of flaws — it is that the same basic mistakes kept showing up across different AI tools.
DryRun Security tested leading AI agents — including Claude, Codex, and Gemini — by having them build real application features through pull requests. According to DryRun Security's March 11, 2026 findings, 143 vulnerabilities appeared across 38 scans, and 87% of pull requests introduced security issues. These were not weird edge cases. They were the kind of mistakes attackers look for first.
The key takeaway: the models were different, but the failure pattern was similar. That suggests the problem is bigger than any one product. AI systems are very good at producing code that looks finished. They are not automatically good at producing code that is safe under pressure from real attackers.
Think of it like asking three different people to install the front door on your house. All three doors close. All three doors look nice. But if none of them know how deadbolts work, you still have a break-in problem.
The vulnerabilities DryRun highlighted are especially dangerous for vibe coders because they often sit in the exact parts of an app you are least equipped to inspect:

- Login and session code, which guards everything behind it
- Password reset and token handling, where one wrong check means account takeover
- Live connections like WebSockets, which keep running in the background after the page loads
- Permission checks that decide whose data each request can see
This is also why articles like The Reviewer's Toolkit: How to Babysit AI Code Effectively matter. The risk is not that AI writes ugly code. The risk is that it writes confident-looking code with hidden traps.
A definitive rule: If AI wrote your authentication code, assume it needs a security review. That is not paranoia. It is a practical response to what the DryRun Security study showed.
TL;DR: Most AI coding vulnerabilities boil down to weak locks, reusable tickets, and side doors nobody checked.
Let's translate the major issues into plain English.
A JWT is basically a digital claim ticket used to keep you signed in. If it is handled badly, someone can fake a ticket, keep using an old one, or use a ticket longer than they should.
Bad AI-generated patterns often include:

- Tokens that never expire, so an old ticket works forever
- Weak or missing signature checks, so a forged ticket passes
- Tokens that stay valid after logout or a password change
For vibe coders, the safe mindset is simple: if your app has login, your app has risk. JWT security mistakes are dangerous because they can let an attacker pretend to be a real user.
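To make the "claim ticket" concrete, here is a minimal sketch of what careful JWT checking looks like, using only Python's standard library. This is an illustration of the checks that matter (pinned algorithm, signature verification, required expiry), not the code any particular AI tool produces; a real app should use a maintained library such as PyJWT rather than hand-rolling this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64e(data: bytes) -> str:
    # JWT segments are base64url without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64d(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def make_jwt(payload: dict, secret: bytes) -> str:
    """Sign a payload as an HS256 JWT (demo only)."""
    header = _b64e(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64e(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64e(sig)}"


def verify_jwt(token: str, secret: bytes) -> dict:
    """Verify signature AND expiry; raise ValueError on any failure."""
    header_b64, body_b64, sig_b64 = token.split(".")
    # Pin the algorithm: never let the token pick its own (blocks "alg: none" tricks)
    if json.loads(_b64d(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{body_b64}".encode(), hashlib.sha256).digest()
    # Constant-time compare, so timing differences don't leak signature bytes
    if not hmac.compare_digest(expected, _b64d(sig_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64d(body_b64))
    # A ticket with no expiry, or an expired one, is rejected outright
    if payload.get("exp", 0) <= time.time():
        raise ValueError("missing or expired exp claim")
    return payload
```

Notice that "missing expiry" is treated the same as "expired": the insecure pattern is not usually a broken signature, it is a check that was never written.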
This means an attacker can keep trying passwords or codes over and over with no meaningful slowdown. Imagine a keypad lock that lets someone test ten thousand combinations with no alarm, no delay, and no lockout.
This happens when a valid login token can be stolen and reused. It is like someone finding a used concert wristband that still gets them through the gate.
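The defense against replay is server-side revocation: the server keeps its own list of live sessions, so a stolen token dies the moment the real user logs out. A minimal in-memory sketch (a real deployment would back this with a database or cache shared across servers):

```python
import secrets
import time


class SessionStore:
    """Server-side session registry: a stolen token stops working once revoked."""

    def __init__(self):
        self._live: dict[str, float] = {}  # token -> expiry timestamp

    def issue(self, ttl: float = 3600.0) -> str:
        # Unguessable token, with a hard expiry even if never revoked
        token = secrets.token_urlsafe(32)
        self._live[token] = time.time() + ttl
        return token

    def is_valid(self, token: str) -> bool:
        expiry = self._live.get(token)
        return expiry is not None and expiry > time.time()

    def revoke(self, token: str) -> None:
        # Called on logout or password change; replayed copies die with it
        self._live.pop(token, None)
```

This is the "wristband check" done right: the gate consults its own list instead of trusting whatever the wristband claims.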
WebSockets power things like live chat, live dashboards, and instant updates. AI tools sometimes protect the front door but forget the side door. A user may be checked when opening the app, but not properly rechecked on the live connection that keeps running in the background.
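The safe pattern is to apply the same session check at connect time and again on every message, because a session can be revoked while the socket stays open. This sketch strips out any real WebSocket library to show just the logic; the class and method names are illustrative, not an actual framework API.

```python
import time


class LiveConnection:
    """Sketch: authenticate the socket at connect AND recheck on every message."""

    def __init__(self, sessions: dict):
        self.sessions = sessions  # token -> expiry, shared with the HTTP layer
        self.user_token = None

    def on_connect(self, token: str) -> bool:
        # The same check a normal page request gets; no unlocked side door
        if self.sessions.get(token, 0) > time.time():
            self.user_token = token
            return True
        return False

    def on_message(self, payload: str) -> str:
        # Recheck: the session may have been revoked since the socket opened
        if self.user_token is None or self.sessions.get(self.user_token, 0) <= time.time():
            raise PermissionError("session no longer valid; closing connection")
        return f"echo: {payload}"
```

The common AI-generated mistake is doing only the `on_connect` half, leaving a live channel that outlives the login it was granted under.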
| Vulnerability | Plain-English meaning | What can go wrong |
|---|---|---|
| Improper JWT handling | Weak or badly checked login ticket | Account takeover |
| Brute-force vulnerability | Unlimited guess attempts | Password cracking |
| Token replay exploit | Stolen session reused | Silent impersonation |
| WebSocket auth issue | Live channel not properly locked | Private data leakage |
These are classic security failures dressed up in modern AI clothing. The models changed. The old mistakes did not.
TL;DR: AI writes insecure code because it predicts likely code patterns, not because it understands consequences like an attacker does.
AI coding tools are not lying to you when they produce working code. They are doing exactly what they were built to do: predict plausible next steps based on patterns. The problem is that "common" code and "secure" code are not the same thing.
A lot of insecure code on the internet still looks normal. When a model has seen thousands of examples that are functional but sloppy, it may reproduce those patterns with confidence. That is one reason the same AI coding vulnerabilities keep appearing in Claude, Codex, and other tools' output, even when it seems polished.
There is also a second problem: security requires defensive thinking. You must ask, "How would someone abuse this?" AI is better at "How do I make this feature work?" than "How do I stop a determined stranger from breaking it?"
That gap gets worse in full app generation. Once you ask an AI tool to create signup, login, billing, dashboards, and live features all together, context starts drifting. A secure check added in one file may not be carried into another. A rule applied to the web page may be forgotten in the background service. A password reset may be added without rate limits. A chat feature may use WebSockets without proper identity checks.
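The "rule applied in one file but forgotten in another" problem is easiest to see with per-record permission checks. Here is a tiny sketch of the check that must exist on every private lookup (the function and data shape are invented for illustration): being logged in is not the same as being allowed to see this record.

```python
def get_record(records: dict, record_id: str, requesting_user: str) -> dict:
    """Every private-record lookup checks ownership, not just login status."""
    record = records.get(record_id)
    if record is None:
        raise KeyError("record not found")
    if record["owner"] != requesting_user:
        # A valid session for *some* user is not permission for *this* record
        raise PermissionError("forbidden")
    return record
```

Context drift means an AI tool may write this correctly in the web handler and then omit it in the background job or the live-update path; the review prompts below are aimed at catching exactly that.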
This is also why supply-chain issues are dangerous. If you have not read ClawHavoc: AI Skills Supply Chain Attack Explained, do. Not all AI risk comes from your prompt. Sometimes the risk comes from what the tool pulls in around your prompt.
AI can accelerate building, but it cannot yet be trusted to self-police security. You still need checks after generation.
TL;DR: If your AI-built app touches logins, money, files, or private data, you need a simple security checklist before going live.
The practical rule: the moment your app handles user accounts or private information, you are no longer "just experimenting." You are responsible for protecting people.
DryRun Security's findings mean you should add AI code scanning and human review before deployment, even for small projects. You do not need a computer science degree to do the basics. You need a repeatable routine.
Before you publish, ask these questions:

- Can a stranger guess passwords or reset codes endlessly, with no lockout?
- Can an old or stolen login token still be used after logout?
- Can one user reach another user's private data?
- Are any API keys or secrets sitting in the code?
For that last one, read GitHub Secret Scanning Now Detects Vercel and Supabase Keys. Secret leakage is one of the fastest ways to turn a toy app into a breach.
If your AI tool supports code review, use prompts like this:
Review this project for security issues. Focus on login, session handling, password reset, rate limiting, private data access, file uploads, and live connections. Explain each risk in plain English, show the exact file, and propose the safest fix.
Then run a second pass with a different prompt:
Assume you are an attacker. Show me the three easiest ways to break this app, steal data, reuse sessions, or bypass login. Be harsh. Do not praise the code.
The goal is not to make the AI feel smart. The goal is to force it to look for failure.
TL;DR: You can reduce AI-generated insecure code by telling your tool exactly what security protections to include and what tests to perform.
Here is a prompt you can paste into Cursor, Replit, Claude, Codex, or a similar tool when building any app with accounts:
Build this feature with security first. Add rate limits for login and password reset. Make sessions expire. Invalidate old sessions after logout or password change. Protect live connections the same way as normal page requests. Never trust user input. Add permission checks for every private record. At the end, list the security protections you added and the remaining risks.
Here is a second prompt for review:
Audit this app for AI coding vulnerabilities. Specifically check for improper token handling, brute-force attacks, token replay, weak password reset logic, missing permission checks, exposed secrets, and insecure live connection authentication. Explain findings in plain English for a non-developer.
Before tomorrow, open one AI-built project and ask your coding tool these two questions:

- What security protections does my login and session code actually include?
- What are the easiest ways an attacker could break this app?
If the answers are vague, that is your warning.
Security is not anti-AI. It is how you keep using AI without hurting yourself or your users. The DryRun Security study did not prove vibe coding is doomed. It proved that trusting unreviewed output is reckless.
You've got this. See you tomorrow. Share this with someone who needs it.
No. The right response is not to abandon AI but to stop treating its first draft like finished work. Use AI for speed, then add AI code scanning, targeted review, and simple security checks before you deploy. The DryRun study showed that flaws are common — not that AI tools are useless.
Login and session code is usually the most dangerous because it protects everything behind it. If JWT handling, password reset, or session expiration is wrong, attackers may get full account access without much effort. This is also the code vibe coders are least likely to understand deeply enough to spot problems.
Start with behavior, not syntax. Ask your AI tool to explain where a stranger could guess passwords endlessly, reuse an old login, access another user's data, or connect to live features without proper checks. Then ask it to show the exact files and safest fixes. You can also use free tools like GitHub's secret scanning to catch leaked credentials.
They often show up as the same class of problem: code that works in normal use but fails under abuse. Different tools may make different mistakes in different places, but the pattern is consistent enough that you should apply the same security review routine to all of them.
At minimum, scan for hardcoded secrets, broken login logic, missing rate limits, weak permission checks, unsafe file uploads, and insecure real-time features. If your app handles payments, medical data, or customer records, do not launch without a professional security review.