Project 25: Code Review Workflow (Multi-Agent Review)
Build a multi-agent code review system where specialized agents (security, performance, style) review code in parallel and synthesize their findings into actionable feedback.
Learning Objectives
By completing this project, you will:
- Master multi-agent orchestration patterns for parallel task execution
- Design specialized AI agents with focused expertise and custom configurations
- Implement result synthesis combining findings from multiple sources
- Apply severity ranking algorithms to prioritize code review feedback
- Understand agent delegation patterns for complex workflows
Deep Theoretical Foundation
The Code Review Challenge
Traditional code review has fundamental limitations:
Traditional Code Review:

                            Human Reviewer

    [Security Focus]  [Performance Focus]  [Style Focus]  [Logic Focus]
            │                  │                 │              │
            └──────────────────┴────────┬────────┴──────────────┘
                                        ▼
                     Single Brain Tries to Cover Everything

    Problems:
    • Cognitive overload
    • Expertise gaps
    • Inconsistent focus
    • Time constraints
    • Fatigue

Multi-Agent Review:

                           Coordinator Agent
                                   │
              ┌────────────────────┼────────────────────┐
              ▼                    ▼                    ▼
       Security Agent      Performance Agent       Style Agent
       • OWASP             • O(n) vs O(n^2)        • ESLint
       • Injection         • Memory                • Prettier
       • Auth              • Caching               • Naming
       • Crypto                                    • DRY
              │                    │                    │
              └────────────────────┼────────────────────┘
                                   ▼
                        Synthesize & Prioritize

    Benefits:
    • Deep expertise per domain
    • Parallel execution
    • Consistent focus
    • No fatigue
    • Comprehensive coverage
Multi-Agent Architectures
There are several patterns for organizing multiple agents:
Pattern 1: PARALLEL (Your Project)

                     Coordinator
                          │
           ┌──────────────┼──────────────┐
           ▼              ▼              ▼
        Agent A        Agent B        Agent C       (run in parallel)
           │              │              │
           └──────────────┼──────────────┘
                          ▼
                     Synthesize

    Use case: Independent tasks, time-sensitive

Pattern 2: SEQUENTIAL (Pipeline)

    Agent A (Parse) ──► Agent B (Analyze) ──► Agent C (Suggest) ──► Agent D (Format)

    Use case: Each step depends on previous output

Pattern 3: HIERARCHICAL (Tree)

                       Manager
                          │
           ┌──────────────┼──────────────┐
           ▼              ▼              ▼
        Lead A         Lead B         Lead C
           │                             │
       ┌───┴───┐                     ┌───┴───┐
       ▼       ▼                     ▼       ▼
    Worker 1  Worker 2            Worker 3  Worker 4

    Use case: Large teams, complex delegation

Pattern 4: DEBATE (Adversarial)

    Agent A (Advocate) ◄──── Debate/Challenge ────► Agent B (Critic)
            │                                             │
            └──────────────────────┬──────────────────────┘
                                   ▼
                            Judge (Arbiter)

    Use case: Exploring trade-offs, finding edge cases
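The practical difference between Patterns 1 and 2 shows up in how the coordinator awaits its agents. The following is a minimal sketch, assuming a hypothetical `Agent` function type and stub agents that only simulate latency; none of these names come from Kiro.

```typescript
// Hypothetical agent signature: takes input text, resolves to output text.
type Agent = (input: string) => Promise<string>;

// Stub agent that waits `ms` milliseconds, then wraps the input in its name.
function makeStubAgent(name: string, ms: number): Agent {
  return (input) =>
    new Promise((resolve) => setTimeout(() => resolve(`${name}(${input})`), ms));
}

// Pattern 1 (parallel): every agent receives the same input.
// Wall-clock time is roughly that of the slowest agent.
async function runParallel(agents: Agent[], input: string): Promise<string[]> {
  return Promise.all(agents.map((agent) => agent(input)));
}

// Pattern 2 (sequential): each agent consumes the previous agent's output.
// Wall-clock time is the sum of all agents.
async function runSequential(agents: Agent[], input: string): Promise<string> {
  let current = input;
  for (const agent of agents) {
    current = await agent(current);
  }
  return current;
}
```

Parallel execution fits code review because the security, performance, and style reviews do not need each other's output; a pipeline only pays off when a later step genuinely depends on an earlier one.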
Specialized Agent Design
Each agent needs a focused configuration that shapes its expertise:
Agent Specialization Architecture:

SECURITY AGENT
    System Prompt:
      "You are a security-focused code reviewer. Your expertise:
      - OWASP Top 10 vulnerabilities
      - Input validation and sanitization
      - Authentication and authorization
      - Cryptographic best practices
      - SQL injection, XSS, CSRF prevention

      For each issue, rate severity: CRITICAL, HIGH, MEDIUM, LOW
      Provide specific remediation steps."

    Focus Areas:
      [Injection Flaws]  [Broken Auth]  [Sensitive Data Exposure]  [Broken Access]

PERFORMANCE AGENT
    System Prompt:
      "You are a performance-focused code reviewer. Your expertise:
      - Algorithmic complexity (Big O notation)
      - Memory management and leaks
      - Database query optimization
      - Caching strategies
      - Async/parallel execution opportunities

      For each issue, estimate impact: 10x, 5x, 2x, marginal
      Suggest benchmarks to verify improvements."

    Focus Areas:
      [O(n^2) → O(n)]  [N+1 Queries]  [Memory Leaks]  [Blocking I/O]

STYLE AGENT
    System Prompt:
      "You are a code style and quality reviewer. Your expertise:
      - Naming conventions and clarity
      - Code organization and modularity
      - DRY principle adherence
      - Documentation completeness
      - Consistency with project patterns

      Group issues by: formatting, naming, structure, documentation
      Reference relevant style guides when applicable."

    Focus Areas:
      [Naming Conventions]  [Code Smells]  [Missing Docs]  [DRY Violations]
Result Synthesis
Combining findings from multiple agents requires careful prioritization:
Synthesis Algorithm:

INPUT: Agent Findings
    Security:    [Finding1, Finding2, Finding3]
    Performance: [Finding4, Finding5]
    Style:       [Finding6, Finding7, Finding8, Finding9]
        │
        ▼
STEP 1: Normalize Severity
    Map each agent's severity to a common scale (1-10):

    Security:         Performance:        Style:
    CRITICAL = 10     10x impact = 9      Blocking = 6
    HIGH     = 8      5x impact  = 7      Major    = 4
    MEDIUM   = 5      2x impact  = 5      Minor    = 2
    LOW      = 3      marginal   = 2      Nitpick  = 1
        │
        ▼
STEP 2: Deduplicate
    Detect overlapping findings (same line, similar issue):

    Security:    "SQL injection at line 42"
    Performance: "Unparameterized query at line 42"    ← MERGE

    Result: Combined finding with both perspectives
        │
        ▼
STEP 3: Weight by Category
    Apply category multipliers (configurable):

    Security findings:    ×1.5 (most critical)
    Performance findings: ×1.2 (important)
    Style findings:       ×1.0 (baseline)

    Final score = normalized_severity × category_weight
        │
        ▼
STEP 4: Sort and Group
    Sort by final score, highest first:

    1. SQL Injection       (Security - CRITICAL)       Score: 15.0
    2. Missing Auth Check  (Security - HIGH)           Score: 12.0
    3. N+1 Query           (Performance - 5x impact)   Score:  8.4
    4. Unused import       (Style - Minor)             Score:  2.0
    ...

    Group by file for developer convenience
        │
        ▼
OUTPUT: Prioritized Review
    MUST FIX (Score > 10):
    1. SQL Injection in getUserById()
    2. Missing auth check in deleteUser()

    SHOULD FIX (Score 5-10):
    3. N+1 query in getOrdersWithItems()

    CONSIDER (Score < 5):
    4. Unused imports
    5. Variable naming suggestions
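Steps 1 and 3 and the output tiers reduce to two lookup tables and a threshold check. The sketch below uses the scale, multipliers, and cutoffs from the diagram above; the names `SEVERITY_SCALE`, `CATEGORY_WEIGHT`, `score`, and `tier` are illustrative, not part of any existing API.

```typescript
type Category = "security" | "performance" | "style";

// Step 1: each category's own labels mapped to the common 1-10 scale.
const SEVERITY_SCALE: Record<Category, Record<string, number>> = {
  security: { CRITICAL: 10, HIGH: 8, MEDIUM: 5, LOW: 3 },
  performance: { "10x": 9, "5x": 7, "2x": 5, marginal: 2 },
  style: { Blocking: 6, Major: 4, Minor: 2, Nitpick: 1 },
};

// Step 3: category multipliers (described above as configurable).
const CATEGORY_WEIGHT: Record<Category, number> = {
  security: 1.5,
  performance: 1.2,
  style: 1.0,
};

// Final score = normalized severity x category weight.
function score(category: Category, label: string): number {
  const base = SEVERITY_SCALE[category][label];
  if (base === undefined) {
    throw new Error(`Unknown ${category} severity label: ${label}`);
  }
  return base * CATEGORY_WEIGHT[category];
}

type Tier = "MUST FIX" | "SHOULD FIX" | "CONSIDER";

// Output tiers: > 10 must fix, 5-10 should fix, < 5 consider.
function tier(finalScore: number): Tier {
  if (finalScore > 10) return "MUST FIX";
  if (finalScore >= 5) return "SHOULD FIX";
  return "CONSIDER";
}
```

For example, a CRITICAL security finding scores 10 × 1.5 = 15 and lands in MUST FIX, while a `5x` performance finding scores 7 × 1.2 = 8.4 and lands in SHOULD FIX. Throwing on an unknown label is one design choice; mapping unknown labels to a default score would be more forgiving of agent output drift.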
Kiro Subagent Spawning
Kiro CLI supports spawning subagents for parallel execution:
Subagent Spawning Patterns:

Method 1: CLI Subprocess

    const reviews = await Promise.all([
      $`kiro-cli --agent security --print "${prompt}"`,
      $`kiro-cli --agent performance --print "${prompt}"`,
      $`kiro-cli --agent style --print "${prompt}"`,
    ]);

    Pros: Simple, isolated
    Cons: Startup overhead per agent

Method 2: Agent Configuration

    # .kiro/agents/review-coordinator.yaml
    name: review-coordinator
    system_prompt: |
      You coordinate code reviews by delegating to specialized agents.

    allowed_tools:
      - spawn_subagent

    subagents:
      - security-reviewer
      - performance-reviewer
      - style-reviewer

Method 3: Direct Invocation

    > "Review this PR with all specialized agents"

    [Coordinator] Spawning security-reviewer...
    [Coordinator] Spawning performance-reviewer...
    [Coordinator] Spawning style-reviewer...

    [Waiting for subagents...]

    [security-reviewer] Found 3 issues
    [performance-reviewer] Found 2 issues
    [style-reviewer] Found 5 issues

    [Coordinator] Synthesizing findings...
Real-World Analogy: The Architecture Review Board
Think of this system like a corporate Architecture Review Board:
- The Coordinator is the meeting chair who assigns the agenda
- Security Agent is the security architect who focuses only on threats
- Performance Agent is the performance engineer who watches for bottlenecks
- Style Agent is the tech lead who maintains coding standards
- The Synthesis is the meeting minutes that prioritize action items
Each expert reviews independently, then they meet to consolidate feedback.
Historical Context
Code review automation has evolved:
Code Review Evolution:

1970s: Fagan Inspections
    └─► Formal, meeting-based reviews
1990s: Lightweight Reviews
    └─► Email-based, async reviews
2000s: Tool-Assisted (Crucible, Review Board)
    └─► Web interfaces, inline comments
2010s: Pull Request Workflow
    └─► GitHub/GitLab integrated reviews
2020s: AI Linters (Codacy, DeepSource)
    └─► Automated issue detection
2024+: Multi-Agent AI Review  ◄── YOU ARE HERE
    └─► Specialized AI agents with synthesis
Book References
For deeper understanding:
- "Working Effectively with Legacy Code" by Michael Feathers - Code analysis techniques
- "Clean Code" by Robert C. Martin - Style and quality principles
- "Secure Coding in C and C++" by Robert C. Seacord - Security review patterns
- "High Performance Browser Networking" by Ilya Grigorik - Performance analysis
Complete Project Specification
What You Are Building
A multi-agent code review system that:
- Accepts code for review (file, diff, or PR)
- Spawns specialized agents in parallel
- Collects and normalizes findings from each agent
- Synthesizes a prioritized report with actionable feedback
- Optionally applies fixes for certain issue types
Functional Requirements
| Feature | Behavior |
|---|---|
| Input | Accept file path, git diff, or GitHub PR URL |
| Parallel Review | Run security, performance, style agents simultaneously |
| Findings Format | Standardized structure with line numbers, severity |
| Prioritization | Rank issues by weighted severity |
| Output | Clear, actionable review comments |
Non-Functional Requirements
- Latency: Complete the review within 60 seconds for a typical PR
- Accuracy: Minimize false positives while catching real issues
- Extensibility: Easy to add new specialized agents
- Integration: Work with GitHub PR workflow
Solution Architecture
High-Level Component Diagram
User Request: "Review PR #42 with all agents"
        │
        ▼
Coordinator Agent
    1. Parse request (PR #42)
    2. Fetch code diff
    3. Spawn subagents
    4. Collect results
    5. Synthesize report
        │  Parallel Spawn
        ▼
    Security Agent         Performance Agent      Style Agent
    Input:  Diff           Input:  Diff           Input:  Diff
    Output: [{finding}]    Output: [{finding}]    Output: [{finding}]
        │
        ▼
Synthesizer
    • Normalize severities
    • Deduplicate findings
    • Apply category weights
    • Sort by priority
    • Format output
        │
        ▼
Prioritized Report
    MUST FIX:
    1. SQL Injection (Security - CRITICAL)
    2. Missing auth check (Security - HIGH)

    SHOULD FIX:
    3. N+1 query (Performance - HIGH)
    ...
Data Flow: Complete Review Cycle
1. Input Processing

    Input: "Review PR #42"

    Coordinator:
      1. Parse: PR number = 42
      2. Fetch: gh pr diff 42 > diff.patch
      3. Extract: Changed files and line ranges

    Output:
      {
        "files": ["src/api/users.ts", "src/services/auth.ts"],
        "diff": "...unified diff content...",
        "additions": 150,
        "deletions": 23
      }
        │
        ▼
2. Parallel Agent Execution

    Promise.all([
      securityAgent.review(context),     // 15 seconds
      performanceAgent.review(context),  // 12 seconds
      styleAgent.review(context),        // 8 seconds
    ])

    Total time: ~15 seconds (parallel, bounded by the slowest agent)
    Sequential would be: ~35 seconds (15 + 12 + 8)
        │
        ▼
3. Raw Findings Collection

    Security Agent Output:
      [
        {
          "type": "SQL_INJECTION",
          "severity": "CRITICAL",
          "file": "src/api/users.ts",
          "line": 42,
          "message": "User input directly interpolated in SQL",
          "suggestion": "Use parameterized queries"
        },
        ...
      ]

    Performance Agent Output:
      [
        {
          "type": "N_PLUS_1",
          "severity": "HIGH",
          "file": "src/services/orders.ts",
          "line": 78,
          "message": "Query inside loop creates N+1 problem",
          "suggestion": "Use eager loading or batch query"
        },
        ...
      ]

    Style Agent Output:
      [
        {
          "type": "NAMING",
          "severity": "LOW",
          "file": "src/api/users.ts",
          "line": 15,
          "message": "Variable 'x' is not descriptive",
          "suggestion": "Rename to 'userId' or 'userIndex'"
        },
        ...
      ]
        │
        ▼
4. Synthesis and Output

    MULTI-AGENT CODE REVIEW - PR #42
    ═══════════════════════════════════════════════════════════

    Summary: 10 issues found (1 critical, 2 high, 7 low)

    MUST FIX (Critical):
    ───────────────────────────────────────────────────────────
    1. [Security] SQL Injection
       File: src/api/users.ts:42
       Issue: User input directly interpolated in SQL query
       Fix: Use parameterized query with $1, $2 placeholders

    SHOULD FIX (High):
    ───────────────────────────────────────────────────────────
    2. [Performance] N+1 Query
       File: src/services/orders.ts:78
       Issue: Database query inside loop
       Fix: Use .include() for eager loading

    3. [Security] Missing Rate Limiting
       File: src/api/auth.ts:15
       Issue: Login endpoint has no rate limit
       Fix: Add rate-limiter-flexible middleware

    CONSIDER (Low):
    ───────────────────────────────────────────────────────────
    4-10. [Style] Various naming and formatting issues
Key Interfaces
// Finding from any agent
interface Finding {
agent: 'security' | 'performance' | 'style';
type: string;
severity: 'critical' | 'high' | 'medium' | 'low';
file: string;
line: number;
endLine?: number;
message: string;
suggestion: string;
codeSnippet?: string;
references?: string[];
}
// Review context passed to agents
interface ReviewContext {
diff: string;
files: FileContext[];
metadata: {
prNumber?: number;
baseBranch: string;
headBranch: string;
author: string;
};
}
interface FileContext {
path: string;
content: string;
diff: string;
changedLines: number[];
}
// Synthesized report
interface SynthesizedReport {
summary: {
total: number;
bySeverity: Record<string, number>;
byAgent: Record<string, number>;
};
findings: PrioritizedFinding[];
suggestedActions: Action[];
}
interface PrioritizedFinding extends Finding {
priority: number; // Computed score
relatedFindings?: Finding[]; // Merged duplicates
}
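The `changedLines` field of `FileContext` has to be derived from the diff. One way to do that, sketched below, is to walk the hunk headers of a git-style unified diff (the format `gh pr diff` prints) and record the new-file line number of every added line. The function name `changedLinesFromDiff` is hypothetical, and the parser assumes each file section starts with a `diff --git` header.

```typescript
// Returns, per file path, the new-file line numbers of added ("+") lines.
function changedLinesFromDiff(diff: string): Map<string, number[]> {
  const result = new Map<string, number[]>();
  const hunkHeader = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/;
  let currentFile: string | null = null;
  let newLine = 0; // line number in the post-change file
  let inHunk = false;

  for (const line of diff.split("\n")) {
    if (!inHunk && line.startsWith("+++ ")) {
      // "+++ b/src/file.ts" names the post-change file; /dev/null means deleted.
      const path = line.slice(4).trim();
      currentFile = path === "/dev/null" ? null : path.replace(/^b\//, "");
      if (currentFile !== null && !result.has(currentFile)) {
        result.set(currentFile, []);
      }
      continue;
    }
    const hunk = hunkHeader.exec(line);
    if (hunk) {
      newLine = parseInt(hunk[1], 10); // first new-file line covered by this hunk
      inHunk = true;
      continue;
    }
    if (!inHunk || currentFile === null) continue;

    if (line.startsWith("+")) {
      result.get(currentFile)!.push(newLine);
      newLine++;
    } else if (line.startsWith(" ")) {
      newLine++; // context line: present in both old and new file
    } else if (line.startsWith("-") || line.startsWith("\\")) {
      // Removed line or "\ No newline at end of file": counter unchanged.
    } else {
      inHunk = false; // e.g. the "diff --git" header of the next file
    }
  }
  return result;
}
```

Restricting agents to these line numbers is one way to keep them from commenting on code the PR did not touch.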
Agent Configuration Files
# .kiro/agents/security-reviewer.yaml
name: security-reviewer
system_prompt: |
  You are a security-focused code reviewer with expertise in:
  - OWASP Top 10 vulnerabilities
  - Authentication and authorization flaws
  - Input validation and output encoding
  - Cryptographic weaknesses
  - Information disclosure

  When reviewing code:
  1. Focus ONLY on security issues
  2. Rate each finding: CRITICAL, HIGH, MEDIUM, LOW
  3. Provide specific, actionable remediation
  4. Reference CWE numbers when applicable

  Output format: JSON array of findings.
allowed_tools:
  - read_file
  - search_codebase
model: claude-sonnet-4-20250514 # Fast, capable
---
# .kiro/agents/performance-reviewer.yaml
name: performance-reviewer
system_prompt: |
  You are a performance-focused code reviewer with expertise in:
  - Algorithmic complexity (Big O)
  - Database query optimization
  - Memory management
  - Caching strategies
  - Async/parallel execution

  When reviewing code:
  1. Focus ONLY on performance issues
  2. Estimate impact: 10x, 5x, 2x, marginal
  3. Suggest benchmarks to verify
  4. Provide specific optimization techniques

  Output format: JSON array of findings.
allowed_tools:
  - read_file
  - search_codebase
model: claude-sonnet-4-20250514
---
# .kiro/agents/style-reviewer.yaml
name: style-reviewer
system_prompt: |
  You are a code style and quality reviewer with expertise in:
  - Naming conventions
  - Code organization
  - DRY principle
  - Documentation
  - Consistency

  When reviewing code:
  1. Focus ONLY on style and quality issues
  2. Reference project style guides
  3. Distinguish: blocking vs. suggestions
  4. Keep suggestions constructive

  Output format: JSON array of findings.
allowed_tools:
  - read_file
  - search_codebase
model: claude-3-5-haiku-20241022 # Fast, good for style
---
# .kiro/agents/review-coordinator.yaml
name: review-coordinator
system_prompt: |
  You are the code review coordinator. Your role:
  1. Parse user review requests
  2. Delegate to specialized agents
  3. Collect and synthesize findings
  4. Present prioritized report

  You have access to these subagents:
  - security-reviewer
  - performance-reviewer
  - style-reviewer
allowed_tools:
  - spawn_subagent
  - read_file
  - gh_cli
model: claude-sonnet-4-20250514
Phased Implementation Guide
Phase 1: Single Agent Review (Days 1-3)
Goal: Create one working review agent (start with security).
Tasks:
- Create security-reviewer agent configuration
- Implement review prompt that outputs JSON findings
- Parse agent output into structured findings
- Test with sample code containing known vulnerabilities
- Format findings for display
Hints:
- Start with a hardcoded file path for testing
- Use JSON mode for structured output
- Include example findings in the system prompt
Starter Agent Prompt:
const securityReviewPrompt = `
Review this code for security vulnerabilities:
\`\`\`typescript
${codeContent}
\`\`\`
Return a JSON array of findings:
[
{
"type": "SQL_INJECTION",
"severity": "CRITICAL",
"line": 42,
"message": "User input directly in SQL",
"suggestion": "Use parameterized queries"
}
]
If no issues found, return empty array: []
`;
Phase 2: Multiple Agents (Days 4-6)
Goal: Add performance and style agents, run in parallel.
Tasks:
- Create performance-reviewer agent configuration
- Create style-reviewer agent configuration
- Implement parallel execution with Promise.all
- Collect results from all agents
- Handle agent failures gracefully
Hints:
- Each agent should have isolated context
- Use timeouts to prevent hanging agents
- Log which agent produced which findings
Parallel Execution:
async function runAllAgents(context: ReviewContext): Promise<Finding[]> {
  const agents = ['security', 'performance', 'style'];
  // allSettled never rejects, so one failing agent cannot discard the others' results
  const results = await Promise.allSettled(
    agents.map(agent => runAgent(agent, context))
  );
  return results.flatMap((result, i) => {
    if (result.status === 'fulfilled') return result.value;
    // Log the failure and continue with whatever the other agents found
    console.error(`${agents[i]} agent failed:`, result.reason);
    return [];
  });
}
Phase 3: Coordinator Agent (Days 7-9)
Goal: Create the orchestrating coordinator agent.
Tasks:
- Create review-coordinator agent configuration
- Implement PR/diff fetching logic
- Build context object for subagents
- Implement subagent spawning
- Collect results from subagents
Hints:
- The coordinator needs access to the gh CLI
- Pass minimal context to subagents (just what they need)
- Track timing for each agent
Coordinator Flow:
class ReviewCoordinator {
async review(request: string): Promise<SynthesizedReport> {
// 1. Parse request
const { prNumber, files } = this.parseRequest(request);
// 2. Fetch context
const context = await this.fetchContext(prNumber);
// 3. Spawn subagents in parallel
const findings = await this.runAllAgents(context);
// 4. Synthesize
return this.synthesize(findings);
}
}
Phase 4: Synthesis and Output (Days 10-14)
Goal: Implement finding synthesis and prioritized output.
Tasks:
- Implement severity normalization
- Detect and merge duplicate findings
- Apply category weights
- Sort by computed priority
- Format beautiful output
Hints:
- Duplicates often have same file and similar line numbers
- Use fuzzy matching for message similarity
- Group by file for developer convenience
Synthesis Implementation:
function synthesize(findings: Finding[]): SynthesizedReport {
// Normalize severities to 1-10 scale
const normalized = findings.map(f => ({
...f,
normalizedSeverity: normalizeSeverity(f.agent, f.severity),
}));
// Deduplicate (same file + similar line + similar message)
const deduplicated = deduplicateFindings(normalized);
// Apply category weights
const weighted = deduplicated.map(f => ({
...f,
priority: f.normalizedSeverity * getCategoryWeight(f.agent),
}));
// Sort by priority
weighted.sort((a, b) => b.priority - a.priority);
return {
summary: computeSummary(weighted),
findings: weighted,
suggestedActions: generateActions(weighted),
};
}
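`synthesize` leaves its helpers undefined. As one example, `computeSummary` could be a single counting pass, sketched here with a reduced `SummarizableFinding` type (an assumption standing in for `PrioritizedFinding`) so the block is self-contained.

```typescript
// Only the two fields the summary needs; the full Finding type has more.
interface SummarizableFinding {
  agent: string;
  severity: string;
}

interface Summary {
  total: number;
  bySeverity: Record<string, number>;
  byAgent: Record<string, number>;
}

// Counts findings per severity label and per source agent in one pass.
function computeSummary(findings: SummarizableFinding[]): Summary {
  const bySeverity: Record<string, number> = {};
  const byAgent: Record<string, number> = {};
  for (const f of findings) {
    bySeverity[f.severity] = (bySeverity[f.severity] ?? 0) + 1;
    byAgent[f.agent] = (byAgent[f.agent] ?? 0) + 1;
  }
  return { total: findings.length, bySeverity, byAgent };
}
```

Because the summary is computed after deduplication, a merged finding is counted once, under the agent and severity of the finding that survived the merge.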
Testing Strategy
Unit Tests
describe('FindingSynthesizer', () => {
describe('normalizeSeverity', () => {
it('maps security CRITICAL to 10', () => {
expect(normalizeSeverity('security', 'critical')).toBe(10);
});
it('maps style low (Minor) to 2', () => {
expect(normalizeSeverity('style', 'low')).toBe(2);
});
});
describe('deduplicateFindings', () => {
it('merges findings on same line', () => {
const findings = [
{ agent: 'security', file: 'a.ts', line: 42, message: 'SQL injection' },
{ agent: 'performance', file: 'a.ts', line: 42, message: 'Slow query' },
];
const deduped = deduplicateFindings(findings);
expect(deduped).toHaveLength(1);
expect(deduped[0].relatedFindings).toHaveLength(1);
});
});
});
Integration Tests
describe('Full Review Pipeline', () => {
it('reviews a PR with all agents', async () => {
const coordinator = new ReviewCoordinator();
// Review a known test PR
const report = await coordinator.review('Review PR #1');
expect(report.findings.length).toBeGreaterThan(0);
expect(report.summary.byAgent).toHaveProperty('security');
expect(report.summary.byAgent).toHaveProperty('performance');
expect(report.summary.byAgent).toHaveProperty('style');
});
});
Manual Testing
# 1. Start coordinator agent
kiro-cli --agent review-coordinator
# 2. Review a local file
> "Review src/api/users.ts for all issues"
# 3. Review a PR
> "Review PR #42 with all agents"
# 4. Verify output format and prioritization
# Should see categorized, prioritized findings
Common Pitfalls and Debugging
Pitfall 1: Agents Return Inconsistent Formats
Symptom: JSON parsing fails on some agent outputs
Prevention:
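function parseAgentOutput(output: string, agent: Finding['agent']): Finding[] {
  try {
    // Try to extract JSON from a markdown code block; otherwise parse the raw output
    const jsonMatch = output.match(/```json\n?([\s\S]*?)\n?```/);
    const json = jsonMatch ? jsonMatch[1] : output;
    const parsed: unknown = JSON.parse(json);
    // Agents sometimes return an object or a bare string instead of an array
    if (!Array.isArray(parsed)) return [];
    // Validate structure; check line with typeof so a numeric value is required
    return parsed
      .filter(f => f && f.type && f.severity && typeof f.line === 'number' && f.message)
      .map(f => ({ ...f, agent }));
  } catch (e) {
    console.error(`Failed to parse ${agent} output:`, e);
    return [];
  }
}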
Pitfall 2: Subagent Times Out
Symptom: One slow agent blocks entire review
Solution:
async function runAgentWithTimeout(agent: string, context: ReviewContext, timeoutMs = 30000): Promise<Finding[]> {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Assumes runAgent forwards the signal to its underlying request and rejects on abort
    return await runAgent(agent, context, { signal: controller.signal });
  } catch (e) {
    // e is `unknown` under strict TypeScript, so narrow before reading .name
    if (e instanceof Error && e.name === 'AbortError') {
      console.warn(`${agent} agent timed out after ${timeoutMs}ms`);
      return [];
    }
    throw e;
  } finally {
    clearTimeout(timeout);
  }
}
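`runAgentWithTimeout` only works if `runAgent` honors the `AbortSignal`. When the underlying call cannot be cancelled, a fallback is to race it against a timer. This is a sketch with a hypothetical `withTimeout` helper; note that the losing promise keeps running in the background and its result is simply ignored.

```typescript
// Resolves with `fallback` if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Whichever promise settles first wins; always clear the timer afterwards
  // so a fast agent does not leave a pending timeout behind.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

A call such as `withTimeout(runAgent(agent, context), 30000, [])` would then yield an empty findings list for a slow agent, which matches the behavior of the abort-based version.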
Pitfall 3: Duplicate Findings Not Detected
Symptom: Same issue reported by multiple agents separately
Solution:
function isSimilarFinding(a: Finding, b: Finding): boolean {
// Same file
if (a.file !== b.file) return false;
// Similar line (within 5 lines)
if (Math.abs(a.line - b.line) > 5) return false;
// Similar message (fuzzy match)
const similarity = stringSimilarity(a.message, b.message);
return similarity > 0.6;
}
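`isSimilarFinding` depends on a `stringSimilarity` function that is not defined here. One possible definition, shown as a sketch rather than the intended implementation, is the Sørensen-Dice coefficient over character bigrams, which returns a value between 0 and 1.

```typescript
// Multiset of adjacent character pairs, after lowercasing and removing whitespace.
function bigrams(text: string): Map<string, number> {
  const s = text.toLowerCase().replace(/\s+/g, "");
  const counts = new Map<string, number>();
  for (let i = 0; i < s.length - 1; i++) {
    const pair = s.slice(i, i + 2);
    counts.set(pair, (counts.get(pair) ?? 0) + 1);
  }
  return counts;
}

// Dice coefficient: 2 * |shared bigrams| / (|bigrams of a| + |bigrams of b|).
function stringSimilarity(a: string, b: string): number {
  const left = bigrams(a);
  const right = bigrams(b);
  let leftTotal = 0;
  let rightTotal = 0;
  let shared = 0;
  left.forEach((n) => { leftTotal += n; });
  right.forEach((n) => { rightTotal += n; });
  if (leftTotal === 0 || rightTotal === 0) {
    // Too short to form a bigram: fall back to exact comparison.
    return a.trim().toLowerCase() === b.trim().toLowerCase() ? 1 : 0;
  }
  left.forEach((n, pair) => {
    shared += Math.min(n, right.get(pair) ?? 0);
  });
  return (2 * shared) / (leftTotal + rightTotal);
}
```

Character-level similarity has a limit worth knowing: "SQL injection at line 42" and "Unparameterized query at line 42" describe the same problem but score well below 0.6 under this definition. Treating same-file, same-line findings from different agents as duplicates regardless of wording, or lowering the threshold, may catch more cross-agent overlaps.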
Extensions and Challenges
Extension 1: GitHub Integration
Post review comments directly to PRs:
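async function postToGitHub(report: SynthesizedReport, prNumber: number) {
  for (const finding of report.findings) {
    // Option A: one top-level PR comment per finding
    await $`gh pr comment ${prNumber} --body ${formatComment(finding)}`;
    // Option B (use instead of A): inline review comment on the changed line.
    // The REST endpoint also requires commit_id (the PR head SHA, here a
    // hypothetical headSha variable); -F sends line as a number, not a string.
    // await $`gh api repos/{owner}/{repo}/pulls/${prNumber}/comments -f body=${finding.message} -f path=${finding.file} -F line=${finding.line} -f commit_id=${headSha}`;
  }
}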
Extension 2: Learning from Feedback
Track which findings developers actually fix:
interface FindingFeedback {
  findingId: string;
  findingType: string; // e.g. "N_PLUS_1"; needed to group feedback by issue type
  wasFixed: boolean;
  wasHelpful: boolean;
  comment?: string;
}
// Use feedback to tune severity weights
function updateWeights(feedback: FindingFeedback[]) {
  // groupBy is assumed to be a helper such as lodash's groupBy
  const fixRates = groupBy(feedback, f => f.findingType);
  // Increase weight for types that are frequently fixed
  // Decrease weight for types that are often dismissed
}
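The two comments in `updateWeights` can be made concrete in many ways. The sketch below computes a per-type fix rate and scales a weight linearly between 0.5x and 1.5x; the `TypedFeedback` shape, the function names, and the scaling formula are all assumptions chosen for illustration.

```typescript
// Reduced feedback record; findingType is an assumed field such as "N_PLUS_1".
interface TypedFeedback {
  findingType: string;
  wasFixed: boolean;
}

// Fraction of findings of each type that developers actually fixed.
function fixRateByType(feedback: TypedFeedback[]): Map<string, number> {
  const totals = new Map<string, { fixed: number; seen: number }>();
  for (const f of feedback) {
    const t = totals.get(f.findingType) ?? { fixed: 0, seen: 0 };
    t.seen += 1;
    if (f.wasFixed) t.fixed += 1;
    totals.set(f.findingType, t);
  }
  const rates = new Map<string, number>();
  totals.forEach((t, type) => rates.set(type, t.fixed / t.seen));
  return rates;
}

// Scales a weight between 0.5x (never fixed) and 1.5x (always fixed).
function adjustWeight(baseWeight: number, fixRate: number): number {
  return baseWeight * (0.5 + fixRate);
}
```

With only a handful of feedback records per type the rates will be noisy, so a real system would probably require a minimum sample size before adjusting anything.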
Extension 3: Custom Agents
Allow users to define project-specific agents:
# .kiro/agents/react-reviewer.yaml
name: react-reviewer
system_prompt: |
  You are a React specialist. Review for:
  - Hook rules violations
  - State management anti-patterns
  - Performance issues (missing memo, key props)
  - Accessibility issues
Extension 4: Auto-Fix Capability
For certain issues, apply fixes automatically:
interface AutoFix {
type: string;
pattern: RegExp;
replacement: string | ((match: string) => string);
}
const autoFixes: AutoFix[] = [
{
type: 'MISSING_AWAIT',
pattern: /(?<!await\s)(fetch\()/g,
replacement: 'await $1',
},
];
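A list of fixes like `autoFixes` still needs something to apply it. A minimal sketch, assuming a hypothetical `applyAutoFixes` function and, for simplicity, string-only replacements (the function-valued variant from the interface above is left out):

```typescript
// Simplified fix definition: string replacements only.
interface SimpleAutoFix {
  type: string;
  pattern: RegExp;
  replacement: string;
}

// Applies each fix in order and reports which fix types changed the source.
function applyAutoFixes(
  source: string,
  fixes: SimpleAutoFix[]
): { output: string; applied: string[] } {
  let output = source;
  const applied: string[] = [];
  for (const fix of fixes) {
    const next = output.replace(fix.pattern, fix.replacement);
    if (next !== output) applied.push(fix.type);
    output = next;
  }
  return { output, applied };
}
```

Regex-based rewriting is not syntax-aware: the `MISSING_AWAIT` pattern would also insert `await` inside a non-async function or in front of a `fetch(` that appears in a string or comment. Presenting the result as a suggested patch for the developer to accept is safer than writing it to disk automatically.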
Extension 5: Review History
Track review trends over time:
Review Trends (Last 30 Days):

    Total Reviews:  47
    Total Findings: 234

    Top Issue Types:
    1. N+1 Queries       45  ████████████
    2. Missing Auth      23  ██████
    3. Hardcoded Values  18  █████

    Trend (change over the period): Security issues 15%, Performance issues 8%
Real-World Connections
Industry Adoption
Several commercial tools already apply the idea of running multiple specialized analyzers over the same code:
- Amazon CodeGuru: Security and performance analysis
- DeepSource: Multiple analyzers running in parallel
- Codacy: Rule-based multi-category checks
- Snyk Code: Security-focused AI review
Production Considerations
| Concern | Solution |
|---|---|
| Cost | Use cheaper models for style, expensive for security |
| Latency | Parallel execution, aggressive timeouts |
| Accuracy | Track false positive rates, tune prompts |
| Coverage | Add new agents for project-specific patterns |
| Integration | GitHub Actions, GitLab CI/CD, Bitbucket |
Self-Assessment Checklist
Knowledge Verification
- Can you explain the parallel vs. sequential multi-agent patterns?
- How do you design agent specialization through system prompts?
- What is the finding synthesis process?
- Why is severity normalization important?
- How do you handle agent failures gracefully?
Implementation Verification
- All three agents run in parallel successfully
- Findings are properly attributed to their source agent
- Duplicate findings are detected and merged
- Output is sorted by priority
- The system handles agent timeouts gracefully
Quality Verification
- Security agent catches common vulnerabilities
- Performance agent identifies complexity issues
- Style agent flags inconsistencies
- False positive rate is acceptable
- Report is actionable and clear
Integration Verification
- Works with local files
- Works with git diffs
- Works with GitHub PRs
- Results can be posted as PR comments
Summary
Building a multi-agent code review system teaches you:
- Agent Orchestration: Coordinating multiple AI agents in parallel
- Specialization Design: Creating focused agents with deep expertise
- Result Synthesis: Combining and prioritizing findings from multiple sources
- Production Patterns: Handling timeouts, failures, and inconsistencies
The multi-agent pattern you have learned here applies far beyond code review - it works for any complex task that benefits from multiple specialized perspectives: security audits, documentation review, test planning, and more.
Next Project: P26-mdflow-workflow-engine.md - Executable markdown workflows with AI