Code Review: Multiple AI Agents, Cross-Verified, Sub-1% False Positives

Multiple AI agents review your pull requests in parallel, cross-verify findings, and rank issues by severity. False positive rate below 1%.

Code Review is Claude Cowork's built-in feature for automated, multi-agent code review. Instead of a single AI scanning your code, multiple specialized agents each focus on different aspects — security, performance, correctness, style — and then cross-verify each other's findings to eliminate false positives.

Review the pull request in ~/Projects/myapp/ (branch: feature/new-auth).

Deploy three review agents in parallel:

Agent 1 — Security Reviewer:
- Check for injection vulnerabilities (SQL, XSS, command)
- Look for hardcoded secrets and credentials
- Verify input validation and auth checks
- Flag any insecure deserialization

Agent 2 — Correctness Reviewer:
- Trace logic paths for edge cases
- Check null/undefined handling
- Verify error handling completeness
- Look for race conditions in async code

Agent 3 — Performance Reviewer:
- Identify O(n^2) or worse algorithms
- Check for unnecessary re-renders
- Look for missing DB indexes
- Flag memory leaks

After all agents complete:
1. Cross-verify: each agent reviews the others' findings
2. Remove false positives (any issue flagged by only one agent and disputed by another)
3. Rank remaining issues by severity: BLOCKER > MAJOR > MINOR > SUGGESTION
4. Save the final report as ~/Reviews/pr-[BRANCH]-[DATE].md
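The path template in step 4 leaves open how a branch name such as feature/new-auth becomes a file name, since the slash would otherwise be read as a subdirectory. A minimal sketch, assuming slashes are replaced with hyphens and the date is written in ISO format; `report_path` is a hypothetical helper for illustration, not part of Code Review:

```python
from datetime import date
from pathlib import Path


def report_path(branch: str, on: date, root: Path = Path("~/Reviews")) -> Path:
    """Build pr-[BRANCH]-[DATE].md under root (hypothetical helper)."""
    # Assumption: "/" in a branch name is replaced so the report stays a single file.
    safe_branch = branch.replace("/", "-")
    # The "~" is left unexpanded here; call .expanduser() before writing to disk.
    return root / f"pr-{safe_branch}-{on.isoformat()}.md"


path = report_path("feature/new-auth", date(2026, 6, 27))
print(path.name)
```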

What Code Review Does

Code Review deploys multiple AI agents that each specialize in a different aspect of code quality. They work in parallel, then cross-verify each other's findings to produce a single, high-confidence review report.

The Multi-Agent Advantage

A single AI reviewer has blind spots. It might flag a "security issue" that's actually a false positive, or miss a performance problem because it was focused on correctness. Multiple agents solve this:

Specialization: Each agent focuses on one domain (security, correctness, performance, style) and does it thoroughly.
Cross-verification: After the initial review, agents check each other's findings. If Agent A flags a security issue but Agent B, which reviewed the same code for correctness, disagrees, the finding is marked as disputed and is dropped unless the agents reach consensus.
Consensus ranking: Only issues that survive cross-verification make it to the final report. This drops the false positive rate below 1%.

The Review Pipeline

Phase 1: Parallel Analysis

Each agent independently reviews the code diff:

| Agent | Focus Area | What It Checks |
|---|---|---|
| Security | Vulnerabilities | Injection, auth, secrets, deserialization |
| Correctness | Logic | Edge cases, null handling, error paths, races |
| Performance | Speed | Algorithm complexity, re-renders, DB indexes, leaks |
| Style | Consistency | Naming, formatting, function length, comments |
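Phase 1 amounts to a fan-out of the same diff to several independent reviewers. A rough sketch using a thread pool, where each agent is stood in for by a plain function; the agent functions and their string checks are placeholders, not Code Review's real interface:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


# Hypothetical stand-ins: a real agent would call a model, not scan for substrings.
def security_agent(diff: str) -> list[str]:
    return ["possible hardcoded secret"] if "password =" in diff else []


def correctness_agent(diff: str) -> list[str]:
    unhandled = ".then(" in diff and ".catch(" not in diff
    return ["unhandled promise rejection"] if unhandled else []


AGENTS: dict[str, Callable[[str], list[str]]] = {
    "Security": security_agent,
    "Correctness": correctness_agent,
}


def run_parallel(diff: str) -> dict[str, list[str]]:
    """Submit the same diff to every agent at once and collect findings per agent."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(agent, diff) for name, agent in AGENTS.items()}
        return {name: future.result() for name, future in futures.items()}


findings = run_parallel('const password = "hunter2";\nfetch(url).then(handle);')
print(findings)
```

Each agent sees only the diff and not the other agents' output, which keeps the first pass independent before cross-verification starts.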

Phase 2: Cross-Verification

Each agent reviews the others' findings:

Does this issue actually exist?
Is the severity rating correct?
Is the suggested fix appropriate?

Issues that are disputed and lack consensus are dropped.

Phase 3: Report Generation

Surviving issues are compiled into a single report, ranked by severity:

BLOCKER: Must fix before merge (security vulnerabilities, data loss risks)
MAJOR: Should fix before merge (logic errors, significant performance issues)
MINOR: Can fix in a follow-up PR (style issues, minor optimizations)
SUGGESTION: Optional improvements (better naming, refactoring opportunities)
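The ranking step can be pictured as a sort over an ordered severity type. A small sketch, assuming findings are simple (severity, title) pairs; the `Severity` enum and the tuple layout are illustrative choices, and the issue titles are borrowed from the sample report in this article:

```python
from collections import Counter
from enum import IntEnum


class Severity(IntEnum):
    # Lower value sorts first: BLOCKER > MAJOR > MINOR > SUGGESTION.
    BLOCKER = 0
    MAJOR = 1
    MINOR = 2
    SUGGESTION = 3


issues = [
    (Severity.MINOR, "[Style] Function exceeds 50 lines"),
    (Severity.BLOCKER, "[Security] SQL injection in user search"),
    (Severity.MAJOR, "[Correctness] Unhandled promise rejection"),
]

# sorted() is stable, so issues of equal severity keep their original order.
ranked = sorted(issues, key=lambda issue: issue[0])

# Build a one-line summary that lists only the severities that occur.
counts = Counter(severity for severity, _ in ranked)
summary = f"{len(ranked)} issues found: " + ", ".join(
    f"{counts[level]} {level.name}" for level in Severity if counts[level]
)
print(summary)
```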

How to Use Code Review

Option 1: Manual Trigger

Open a Cowork session in your project folder and ask Claude to review a specific branch or PR. Claude deploys the agents and generates the report.

Option 2: Git Hook Integration

Add Code Review to your pre-push git hook (git has no built-in pre-PR hook, so pre-push is the closest trigger):

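#!/bin/bash
# .git/hooks/pre-push (make it executable: chmod +x .git/hooks/pre-push)
echo "Running Claude Code Review..."
# Quote the substitution so unusual branch names are passed as one argument.
claude cowork --review --branch "$(git rev-parse --abbrev-ref HEAD)"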

Option 3: CI/CD Integration

Add Code Review as a step in your CI pipeline:

# .github/workflows/code-review.yml
- name: Claude Code Review
  run: |
    npx claude cowork --review --pr ${{ github.event.pull_request.number }} \
      --output-format markdown \
      --output-file review.md
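A CI step usually needs a pass/fail signal in addition to the report file. A sketch of a gate script, assuming review.md contains a summary line in the shape shown under Review Report Format ("3 issues found: 1 BLOCKER, ..."); the function names are hypothetical and nothing here ships with Code Review:

```python
import re


def blocker_count(report: str) -> int:
    """Return the BLOCKER count from the summary line, or 0 if none is listed."""
    # "." does not cross newlines, so the match stays on the "issues found:" line.
    match = re.search(r"found:.*?(\d+) BLOCKER", report)
    return int(match.group(1)) if match else 0


def exit_code(report: str) -> int:
    """Map a report to a process exit code: 1 if any BLOCKER is present, else 0."""
    return 1 if blocker_count(report) > 0 else 0


sample = "## Summary\n3 issues found: 1 BLOCKER, 1 MAJOR, 1 MINOR\n\n## BLOCKER\n"
print(blocker_count(sample), exit_code(sample))
```

In a pipeline, a follow-up step could read review.md, pass its text to `exit_code`, and hand the result to `sys.exit` so that the job fails only when a BLOCKER is reported.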

Review Report Format

Each report includes:

# Code Review: [PR Title]
**Branch**: feature/new-auth
**Date**: 2026-06-27
**Files changed**: 12
**Lines added**: 245
**Lines removed**: 89

## Summary
3 issues found: 1 BLOCKER, 1 MAJOR, 1 MINOR

## BLOCKER
### [Security] SQL injection in user search (src/api/search.ts:42)
The `query` parameter is concatenated directly into the SQL string...
**Fix**: Use parameterized queries:
```ts
db.query('SELECT * FROM users WHERE name LIKE ?', [`%${query}%`])
```

## MAJOR
### [Correctness] Unhandled promise rejection (src/auth/login.ts:78)
...

## MINOR
### [Style] Function exceeds 50 lines (src/utils/parse.ts:15-72)
...

## False Positive Reduction

The cross-verification phase is what sets this feature apart from single-agent reviewers. Here's how it works:

1. **Agent A flags** "Possible SQL injection in search.ts:42"
2. **Agent B reviews** the same code and confirms: "Yes, the query parameter is not parameterized"
3. **Agent C reviews** and adds context: "The query comes from user input via req.query, so this is exploitable"
4. **Consensus**: All three agents agree → issue is confirmed, rated BLOCKER

vs.

1. **Agent A flags** "Possible SQL injection in config.ts:15"
2. **Agent B reviews** and disagrees: "This is a static config string, not user input. Not exploitable."
3. **Consensus**: Disputed → issue is dropped as a false positive

This process typically eliminates 60-80% of initial flags, leaving only high-confidence findings.
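The two scenarios above can be expressed as a small filter. This is a sketch under an assumed rule (a finding is dropped as soon as any other agent disputes it); the `Finding` type and `survives` function are hypothetical, and the actual consensus logic is not documented here:

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    title: str
    flagged_by: str
    # Verdicts from the other agents: True confirms, False disputes.
    verdicts: dict[str, bool] = field(default_factory=dict)


def survives(finding: Finding) -> bool:
    """Assumed rule: a single dispute from another agent drops the finding."""
    # Note: a finding nobody has reviewed yet has no verdicts and passes through.
    return all(finding.verdicts.values())


findings = [
    Finding("Possible SQL injection in search.ts:42", "A", {"B": True, "C": True}),
    Finding("Possible SQL injection in config.ts:15", "A", {"B": False}),
]

confirmed = [finding for finding in findings if survives(finding)]
dropped = len(findings) - len(confirmed)
print([finding.title for finding in confirmed])
print(f"dropped {dropped} of {len(findings)} initial flags")
```

A stricter variant could require at least one explicit confirmation before a finding is kept, which would trade a few missed issues for fewer unreviewed flags in the report.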

## Best Practices

### Review Focused Diffs
Code Review works best on PRs under 500 lines changed. For larger PRs, break them into smaller PRs or ask Claude to review specific files.
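To check a PR against the 500-line guideline before requesting a review, the output of `git diff --numstat` can be summed. A minimal sketch; the `changed_lines` helper is hypothetical, and the sample input is made up to match the totals in the sample report (245 added, 89 removed):

```python
def changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # not an "added<TAB>deleted<TAB>path" row
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-" for both counts; skip them.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


MAX_LINES = 500  # threshold taken from the guidance above

sample = "200\t80\tsrc/auth/login.ts\n45\t9\tsrc/api/search.ts\n-\t-\tassets/logo.png"
total = changed_lines(sample)
print(total, "split the PR" if total > MAX_LINES else "ok to review")
```

The input would typically come from something like `git diff --numstat main...HEAD`, where `main` as the base branch is an assumption.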

### Provide Context
Tell the agents what the PR is trying to do: "This PR adds OAuth authentication to the login flow." This helps them distinguish intentional design choices from bugs.

### Don't Skip Cross-Verification
The cross-verification phase takes extra time but is what makes the review trustworthy. Don't disable it to save time — you'll get more false positives.

### Use as a Pre-Filter, Not a Replacement
Code Review catches common issues before human reviewers see the PR. Human reviewers should focus on architecture, business logic, and edge cases that AI may miss.

## Limitations

- **No runtime testing**: Code Review analyzes the diff statically. It doesn't run the code or execute tests. Pair it with your CI test suite.
- **Language support**: Works best with TypeScript, JavaScript, Python, Go, and Rust. Other languages may have less thorough analysis.
- **Large PRs**: PRs over 1000 lines may hit context limits. Reviewers may miss issues in files they couldn't fully analyze.
- **Not a security audit**: Code Review catches common vulnerabilities but is not a substitute for a professional security audit.

## Frequently Asked Questions

### How long does a Code Review take?
A typical PR (100-300 lines changed) takes 2-5 minutes. Larger PRs can take 10-15 minutes. The cross-verification phase adds about 30% to the total time.

### Can I customize which agents run?
Yes. You can enable/disable specific agents (Security, Correctness, Performance, Style) or add custom agents with your own review criteria.

### Does Code Review work with all languages?
It works with all languages, but analysis depth varies. TypeScript, JavaScript, Python, Go, and Rust get the most thorough reviews. Other languages get a more general review.

### Can Code Review auto-fix issues?
Not by default. Code Review produces a report with suggested fixes. You can ask Claude to apply the fixes in a follow-up task, but always review the changes before committing.
