🤖 Ghostwritten by Claude Opus 4.5 · Edited by GPT-5.2 Codex · Curated by Tom Hundley
This is Part 3 of the Professional's Guide to Vibe Coding series. Start with Part 1 if you haven't already.
Reviewing AI Code in Vibe Coding Is Different
Human code review and AI code review require different mindsets.
When reviewing human code, you're looking for intentions that didn't translate correctly—the developer knew what they wanted but made mistakes in expression. You're checking logic, style, and overlooked edge cases.
When reviewing AI code, you're looking for confident incorrectness—code that reads well, follows patterns, and is fundamentally wrong. AI makes different mistakes than humans, and catching them requires different vigilance.
This article is the checklist I've developed after reviewing thousands of AI-generated code blocks.
The AI Code Review Checklist
Category 1: Security
AI code is often insecure by default. Not maliciously—just naively.
Check for input validation:
```javascript
// AI often generates this — string interpolation invites SQL injection:
app.get('/user/:id', (req, res) => {
  const user = db.query(`SELECT * FROM users WHERE id = ${req.params.id}`);
  res.json(user);
});

// When it should be this — validate input and use a parameterized query:
app.get('/user/:id', (req, res) => {
  const id = parseInt(req.params.id, 10);
  if (Number.isNaN(id)) return res.status(400).json({ error: 'Invalid ID' });
  const user = db.query('SELECT * FROM users WHERE id = $1', [id]);
  res.json(user);
});
```
Questions to ask:
- Is user input being interpolated directly into queries or commands?
- Are file paths being validated before access?
- Are secrets hardcoded or exposed in logs?
- Is authentication checked before authorization?
- Are rate limits implemented for public endpoints?
Category 2: Hallucinated Dependencies
AI frequently invents packages that don't exist or uses APIs that have been deprecated.
Common patterns:
- Package names that sound right but don't exist in npm/PyPI
- Method signatures from old versions of libraries
- API endpoints from outdated documentation
- Configuration options that were removed years ago
Verification steps:
- Check if the package exists: `npm view <package>` or search PyPI
- Check if the imported method exists in current version docs
- Verify API endpoints in official documentation
- Test imports before building logic on top
Category 3: Logic Errors
AI excels at producing code that looks correct but fails on edge cases.
Watch for:
- Off-by-one errors in loops and slices
- Incorrect null/undefined handling
- Race conditions in async code
- Comparison operators that fail for edge values
Example:
```javascript
// AI generated:
function getLastItem(arr) {
  return arr[arr.length]; // Off by one—returns undefined
}

// Should be:
function getLastItem(arr) {
  return arr.length > 0 ? arr[arr.length - 1] : undefined;
}
```
Category 4: Architecture Mismatches
AI doesn't know your system's architecture. It generates locally reasonable code that may conflict globally.
Questions to ask:
- Does this follow the patterns established elsewhere in the codebase?
- Is it creating coupling that will make future changes hard?
- Does the abstraction level match similar features?
- Is it introducing inconsistent naming or structure?
Category 5: Missing Error Handling
AI tends toward "happy path" code. It often skips:
- Try/catch blocks around I/O operations
- Fallbacks for failed network requests
- Validation of external data
- Graceful degradation for missing dependencies
Standard practice: After AI generates code, explicitly ask: "What are all the ways this could fail?" Then verify those cases are handled.
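One reusable shape for the fallback and graceful-degradation items above is a small wrapper; `withFallback` is an illustrative helper name, and the load function stands in for whatever I/O call the AI generated:

```javascript
// Wrap an unreliable async operation with a degraded-but-safe fallback.
async function withFallback(loadFn, fallback) {
  try {
    return await loadFn();
  } catch (err) {
    // In real code, log err somewhere observable before degrading.
    return fallback;
  }
}

// Usage sketch: const config = await withFallback(() => fetchConfig(), DEFAULTS);
```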
The Hallucination Detection Protocol
Hallucinations are AI's most insidious failure mode. The code compiles. It runs. It just does something subtly wrong based on fabricated information.
Signs of Hallucination
Confident specificity about unknown things:
- Very specific version numbers (e.g., "as of version 3.4.7")
- Detailed API signatures you can't verify
- "Best practices" you've never heard of from authoritative sources
Made-up documentation references:
- Links to documentation pages that don't exist
- Citations of blog posts that return 404
- References to configuration files with invented schema
Plausible but fictional features:
- Methods that would make sense but don't exist
- Configuration options that seem reasonable but aren't supported
- Integrations between tools that don't actually work together
Verification Steps
- Don't trust—verify. Before building on any AI claim, check primary sources.
- Check version alignment. Is the AI using information from the correct library version?
- Test in isolation. Before integrating, verify the specific feature works.
- When in doubt, ask explicitly. "Is this a real feature or are you uncertain?"
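"Test in isolation" can be as small as running the claimed API on a trivial input before wiring it into real code. A sketch, using `String.prototype.replaceAll` as the feature under suspicion (the probe function is an illustrative pattern, not a library call):

```javascript
// Micro-test an AI-claimed API on a trivial input before building on it:
// does replaceAll exist in this runtime, and does it behave as described?
function probeReplaceAll() {
  if (typeof ''.replaceAll !== 'function') return false;
  return 'a-b-c'.replaceAll('-', '_') === 'a_b_c';
}
```

Thirty seconds in a REPL here beats an hour debugging a hallucinated method deep in a feature branch.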
The Context Drift Problem
Over long conversations or complex tasks, AI loses coherence. This manifests as:
Symptoms of drift:
- Contradictory changes to the same file
- Forgetting project structure established earlier
- Repeating the same mistake after you corrected it
- Generating code that conflicts with earlier generations
Mitigation strategies:
- Use shorter sessions. Start fresh for each major task.
- Provide explicit context. Don't rely on conversation history; restate important constraints.
- Checkpoint frequently. Test and commit working code before continuing.
- Recognize the signs. When coherence breaks down, it's time to restart.
The 60-Second Triage
Not every AI generation needs deep review. Here's my fast-pass checklist:
30 seconds—structural scan:
- Are imports real and necessary?
- Does the overall structure match expectations?
- Are there obvious security patterns missing?
30 seconds—logic scan:
- Do loops have correct bounds?
- Is error handling present?
- Are edge cases addressed?
If the triage passes, proceed to deeper review. If it fails on any point, it's often faster to regenerate with a better prompt than to fix the output.
When to Read Diffs vs. Test Behavior
Read the diff when:
- The change is small and targeted
- You're reviewing security-critical code
- You need to understand the implementation for future maintenance
- The AI is working in an area you don't know well
Test behavior when:
- The change is large and complex
- You're prototyping and correctness matters more than understanding
- Time pressure is high and the risk of failure is low
- The code is throwaway (tests, demos, experiments)
Most production code requires both: test that it works, then review to understand why.
The Security Review Addendum
Because AI doesn't think adversarially, security review requires explicit attention:
Always check for:
- SQL injection via string interpolation
- Command injection in shell executions
- Path traversal in file operations
- Cross-site scripting in rendered output
- Insecure deserialization
- Missing authentication on endpoints
- Hardcoded secrets or credentials
- Overly permissive CORS or permissions
Assume AI code is insecure until you've explicitly verified each attack surface.
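For the cross-site scripting item, the minimum bar is escaping untrusted text before it reaches markup. A sketch (the `escapeHtml` name is mine; in a real app, prefer your template engine's auto-escaping over a hand-rolled helper):

```javascript
// Minimal HTML escaper for untrusted text interpolated into markup.
// Ampersand must be replaced first so later entities aren't re-escaped.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
```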
Building Your Review Workflow
Here's the workflow I use:
- Generate with a clear, scoped prompt
- Triage in 60 seconds—reject and regenerate if structurally wrong
- Deep review using the checklist categories above
- Test behavior before committing
- Commit with a clear message that notes AI assistance
- Document any non-obvious decisions for future maintainers
The workflow becomes automatic with practice. The first few months feel slow; eventually it's faster than writing code directly.
The Bottom Line
Reviewing AI code is a skill distinct from writing code or reviewing human code. It requires:
- Systematic checking for AI-specific failure modes
- Healthy skepticism about confident-sounding claims
- Explicit security review that AI won't do for you
- Recognition of context drift and when to restart
The checklist in this article isn't exhaustive—you'll develop your own patterns. But it's a foundation that catches most issues before they reach production.
Next in the series: Building Intuition: What AI Gets Wrong (How to Predict It)
Ready to level up your team's AI development practices?
Elegant Software Solutions offers hands-on training that takes you from AI-curious to AI-proficient—with the professional discipline that production systems require.
👉 Book a consultation