# Code Review: Multiple AI Agents, Cross-Verified, Sub-1% False Positives
Multiple AI agents review your pull requests in parallel, cross-verify findings, and rank issues by severity. False positive rate below 1%.
Code Review is Claude Cowork's built-in feature for automated, multi-agent code review. Instead of a single AI scanning your code, multiple specialized agents each focus on different aspects — security, performance, correctness, style — and then cross-verify each other's findings to eliminate false positives.
```text
Review the pull request in ~/Projects/myapp/ (branch: feature/new-auth).
Deploy three review agents in parallel:

Agent 1 - Security Reviewer:
- Check for injection vulnerabilities (SQL, XSS, command)
- Look for hardcoded secrets and credentials
- Verify input validation and auth checks
- Flag any insecure deserialization

Agent 2 - Correctness Reviewer:
- Trace logic paths for edge cases
- Check null/undefined handling
- Verify error handling completeness
- Look for race conditions in async code

Agent 3 - Performance Reviewer:
- Identify O(n^2) or worse algorithms
- Check for unnecessary re-renders
- Look for missing DB indexes
- Flag memory leaks

After all agents complete:
1. Cross-verify: each agent reviews the others' findings
2. Remove false positives (any issue flagged by only one agent and disputed by another)
3. Rank remaining issues by severity: BLOCKER > MAJOR > MINOR > SUGGESTION
4. Save the final report as ~/Reviews/pr-[BRANCH]-[DATE].md
```
## What Code Review Does
Code Review deploys multiple AI agents that each specialize in a different aspect of code quality. They work in parallel, then cross-verify each other's findings to produce a single, high-confidence review report.
### The Multi-Agent Advantage
A single AI reviewer has blind spots. It might flag a "security issue" that's actually a false positive, or miss a performance problem because it was focused on correctness. Multiple agents solve this:
- **Specialization**: Each agent focuses on one domain (security, correctness, performance, style) and covers it thoroughly.
- **Cross-verification**: After the initial review, agents check each other's findings. If Agent A flags a security issue but Agent B, which reviewed the same code for correctness, disputes it, the finding is kept only if the agents reach consensus.
- **Consensus ranking**: Only issues that survive cross-verification make it to the final report, which is how the false positive rate stays below 1%.
## The Review Pipeline
### Phase 1: Parallel Analysis
Each agent independently reviews the code diff:
| Agent | Focus Area | What It Checks |
|---|---|---|
| Security | Vulnerabilities | Injection, auth, secrets, deserialization |
| Correctness | Logic | Edge cases, null handling, error paths, races |
| Performance | Speed | Algorithm complexity, re-renders, DB indexes, leaks |
| Style | Consistency | Naming, formatting, function length, comments |
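The parallel phase can be sketched as a fan-out over independent reviewers. This is a minimal illustration, not Code Review's actual implementation: the `Finding` and `Reviewer` types and the `runParallelAnalysis` function are hypothetical names introduced here.

```ts
// Hypothetical data model: Code Review's internal types are not documented here.
interface Finding {
  agent: string; // which reviewer raised the finding
  file: string;
  line: number;
  severity: "BLOCKER" | "MAJOR" | "MINOR" | "SUGGESTION";
  message: string;
}

// A reviewer receives the diff text and resolves to its own findings.
type Reviewer = (diff: string) => Promise<Finding[]>;

// Start every reviewer on the same diff at once, wait for all of them,
// then merge the per-agent results into one list for cross-verification.
async function runParallelAnalysis(
  diff: string,
  reviewers: Reviewer[],
): Promise<Finding[]> {
  const perAgent = await Promise.all(reviewers.map((review) => review(diff)));
  return ([] as Finding[]).concat(...perAgent);
}
```

Because `Promise.all` rejects as soon as one reviewer fails, a production version would probably want per-agent error handling so that one failing agent does not discard the others' results.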
### Phase 2: Cross-Verification
Each agent reviews the others' findings:
- Does this issue actually exist?
- Is the severity rating correct?
- Is the suggested fix appropriate?
Issues that are disputed and lack consensus are dropped.
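One way to picture the drop rule is a filter over second opinions. The sketch below assumes the strictest reading of "lack consensus", where a single dispute is enough to drop an issue. The `Verdict` and `FlaggedIssue` shapes and the `dropDisputed` function are hypothetical.

```ts
// Hypothetical shapes for illustration only.
interface Verdict {
  reviewer: string;     // agent giving the second opinion
  issueExists: boolean; // answer to "Does this issue actually exist?"
}

interface FlaggedIssue {
  id: string;
  flaggedBy: string;
  verdicts: Verdict[]; // one verdict per cross-verifying agent
}

// Assumption: an issue survives only if every cross-verifying agent confirms it.
function dropDisputed(issues: FlaggedIssue[]): FlaggedIssue[] {
  return issues.filter((issue) =>
    issue.verdicts.every((verdict) => verdict.issueExists),
  );
}
```

A looser policy, such as majority vote, would change only the predicate inside `filter`.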
### Phase 3: Report Generation
Surviving issues are compiled into a single report, ranked by severity:
- BLOCKER: Must fix before merge (security vulnerabilities, data loss risks)
- MAJOR: Should fix before merge (logic errors, significant performance issues)
- MINOR: Can fix in a follow-up PR (style issues, minor optimizations)
- SUGGESTION: Optional improvements (better naming, refactoring opportunities)
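The ranking step amounts to sorting by a fixed severity order. A minimal sketch, assuming each issue carries one of the four labels above; the `Issue` shape and `rankBySeverity` are illustrative names, not part of Code Review.

```ts
// Severity labels from the list above, most urgent first.
const SEVERITY_ORDER = ["BLOCKER", "MAJOR", "MINOR", "SUGGESTION"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Hypothetical issue shape for illustration.
interface Issue {
  severity: Severity;
  title: string;
}

// Return a new array sorted by severity; the input array is left untouched.
function rankBySeverity(issues: Issue[]): Issue[] {
  return [...issues].sort(
    (a, b) =>
      SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity),
  );
}
```

Modern JavaScript engines sort stably, so issues with the same severity keep the order in which they were found.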
## How to Use Code Review
### Option 1: Manual Trigger
Open a Cowork session in your project folder and ask Claude to review a specific branch or PR. Claude deploys the agents and generates the report.
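### Option 2: Git Hook Integration
Run Code Review from a `pre-push` git hook. The shebang must be the first line of the file, and the hook must be executable (`chmod +x .git/hooks/pre-push`):
```bash
#!/bin/bash
# .git/hooks/pre-push
echo "Running Claude Code Review..."
# Quote the substitution so branch names with unusual characters stay one argument.
claude cowork --review --branch "$(git rev-parse --abbrev-ref HEAD)"
```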
### Option 3: CI/CD Integration
Add Code Review as a step in your CI pipeline:
```yaml
# .github/workflows/code-review.yml
- name: Claude Code Review
  run: |
    npx claude cowork --review --pr ${{ github.event.pull_request.number }} \
      --output-format markdown \
      --output-file review.md
```
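Once a step like this has written `review.md`, a later step could read the report and fail the job when a BLOCKER is present. The sketch below assumes the report contains a summary line in the form shown under Review Report Format ("3 issues found: 1 BLOCKER, 1 MAJOR, 1 MINOR"). `parseSummaryCounts` and `shouldBlockMerge` are hypothetical helpers, not part of any CLI.

```ts
// Extract per-severity counts from the report's summary line.
// Assumption: the line looks like "3 issues found: 1 BLOCKER, 1 MAJOR, 1 MINOR".
function parseSummaryCounts(report: string): Record<string, number> {
  const summary =
    report.split("\n").find((line) => /issues? found:/.test(line)) || "";
  const counts: Record<string, number> = {};
  const pattern = /(\d+)\s+(BLOCKER|MAJOR|MINOR|SUGGESTION)\b/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(summary)) !== null) {
    counts[match[2]] = Number(match[1]);
  }
  return counts;
}

// BLOCKER issues must be fixed before merge, so any BLOCKER blocks the merge.
function shouldBlockMerge(report: string): boolean {
  return (parseSummaryCounts(report)["BLOCKER"] || 0) > 0;
}
```

A CI script would read the file contents, call `shouldBlockMerge`, and exit with a non-zero status when it returns `true`.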
## Review Report Format
Each report includes:
````markdown
# Code Review: [PR Title]
**Branch**: feature/new-auth
**Date**: 2026-06-27
**Files changed**: 12
**Lines added**: 245
**Lines removed**: 89
## Summary
3 issues found: 1 BLOCKER, 1 MAJOR, 1 MINOR
## BLOCKER
### [Security] SQL injection in user search (src/api/search.ts:42)
The `query` parameter is concatenated directly into the SQL string...
**Fix**: Use parameterized queries:
```ts
db.query('SELECT * FROM users WHERE name LIKE ?', [`%${query}%`])
```
## MAJOR
### [Correctness] Unhandled promise rejection (src/auth/login.ts:78)
...
## MINOR
### [Style] Function exceeds 50 lines (src/utils/parse.ts:15-72)
...
````
## False Positive Reduction
The cross-verification phase is what sets this feature apart from single-agent reviewers. Here's how it works:
1. **Agent A flags** "Possible SQL injection in search.ts:42"
2. **Agent B reviews** the same code and confirms: "Yes, the query parameter is not parameterized"
3. **Agent C reviews** and adds context: "The query comes from user input via req.query, so this is exploitable"
4. **Consensus**: All three agents agree → issue is confirmed, rated BLOCKER
vs.
1. **Agent A flags** "Possible SQL injection in config.ts:15"
2. **Agent B reviews** and disagrees: "This is a static config string, not user input. Not exploitable."
3. **Consensus**: Disputed → issue is dropped as a false positive
This process typically eliminates 60-80% of initial flags, leaving only high-confidence findings.
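As a worked example of that share, the sketch below computes the percentage of initial flags removed by cross-verification. The function name and the sample counts are illustrative, not measurements from Code Review.

```ts
// Percentage of initial flags that cross-verification removed.
// Returns 0 when there were no initial flags, to avoid dividing by zero.
function eliminatedShare(initialFlags: number, confirmedFlags: number): number {
  if (initialFlags <= 0) return 0;
  return ((initialFlags - confirmedFlags) * 100) / initialFlags;
}

// Illustrative numbers: 10 initial flags, 3 confirmed after cross-verification.
const share = eliminatedShare(10, 3); // 70, inside the 60-80% range
```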
## Best Practices
### Review Focused Diffs
Code Review works best on PRs under 500 lines changed. For larger PRs, break them into smaller PRs or ask Claude to review specific files.
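To check a PR against that guideline before requesting a review, the output of `git diff --numstat` can be summed. This is a minimal sketch that assumes "lines changed" means added plus removed lines. `countChangedLines` and `isFocusedDiff` are hypothetical helper names.

```ts
// Threshold taken from the guidance above.
const MAX_FOCUSED_DIFF = 500;

// Sum added and removed lines from `git diff --numstat` output.
// Each line is "<added>\t<removed>\t<path>"; binary files report "-" for
// both counts, which parses to NaN and is counted as 0.
function countChangedLines(numstat: string): number {
  return numstat
    .split("\n")
    .filter((line) => line.trim() !== "")
    .reduce((total, line) => {
      const [added, removed] = line.split("\t");
      const a = Number.parseInt(added, 10);
      const r = Number.parseInt(removed, 10);
      return total + (Number.isNaN(a) ? 0 : a) + (Number.isNaN(r) ? 0 : r);
    }, 0);
}

// True when the diff is under the 500-line guideline.
function isFocusedDiff(numstat: string): boolean {
  return countChangedLines(numstat) < MAX_FOCUSED_DIFF;
}
```

For example, the output of `git diff --numstat main...feature/new-auth` could be passed to `isFocusedDiff` to decide whether to split the PR first.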
### Provide Context
Tell the agents what the PR is trying to do: "This PR adds OAuth authentication to the login flow." This helps them distinguish intentional design choices from bugs.
### Don't Skip Cross-Verification
The cross-verification phase takes extra time but is what makes the review trustworthy. Don't disable it to save time — you'll get more false positives.
### Use as a Pre-Filter, Not a Replacement
Code Review catches common issues before human reviewers see the PR. Human reviewers should focus on architecture, business logic, and edge cases that AI may miss.
## Limitations
- **No runtime testing**: Code Review analyzes the diff statically. It doesn't run the code or execute tests. Pair it with your CI test suite.
- **Language support**: Works best with TypeScript, JavaScript, Python, Go, and Rust. Other languages may have less thorough analysis.
- **Large PRs**: PRs over 1000 lines may hit context limits, and the agents may miss issues in files they could not fully analyze.
- **Not a security audit**: Code Review catches common vulnerabilities but is not a substitute for a professional security audit.
## Frequently Asked Questions
### How long does a Code Review take?
A typical PR (100-300 lines changed) takes 2-5 minutes. Larger PRs can take 10-15 minutes. The cross-verification phase adds about 30% to the total time.
### Can I customize which agents run?
Yes. You can enable/disable specific agents (Security, Correctness, Performance, Style) or add custom agents with your own review criteria.
### Does Code Review work with all languages?
It works with all languages, but analysis depth varies. TypeScript, JavaScript, Python, Go, and Rust get the most thorough reviews. Other languages get a more general review.
### Can Code Review auto-fix issues?
Not by default. Code Review produces a report with suggested fixes. You can ask Claude to apply the fixes in a follow-up task, but always review the changes before committing.