Project 4: “The Subagent Researcher” — Agentic Architectures

  • File: KIRO_CLI_MASTERY.md
  • Main Programming Language: Natural Language
  • Coolness Level: Level 3 (Genuinely Clever)
  • Difficulty: Level 2 (Intermediate)
  • Knowledge Area: Agentic Architectures

What you’ll build: Use subagents to map a complex repository. Aggregate findings into ARCHITECTURE.md.

Why it teaches Subagents: you work around the limit of a single linear context window by running research tasks in parallel.

Core challenges you’ll face:

  • Writing clear delegation prompts.
  • Avoiding hallucinated summaries.

Success criteria:

  • ARCHITECTURE.md accurately reflects repo structure.

Real World Outcome

You’ll create an ARCHITECTURE.md document that provides a comprehensive overview of a complex codebase by delegating research to multiple parallel subagents, each analyzing different parts of the system.

Example Output:

$ cd ~/large-monorepo  # 500+ files across 20 modules

$ kiro chat

You: "Map the entire architecture of this monorepo and create ARCHITECTURE.md"

Kiro: This codebase is large (500+ files). I'll spawn subagents to analyze different modules in parallel.

  Spawning subagents:
  → Subagent 1: Analyzing frontend packages (apps/web, apps/mobile)
  → Subagent 2: Analyzing backend services (services/api, services/auth, services/billing)
  → Subagent 3: Analyzing shared libraries (packages/ui, packages/utils, packages/types)
  → Subagent 4: Analyzing infrastructure (terraform/, k8s/, .github/workflows/)

  [After 2-3 minutes, all subagents complete]

  Aggregating findings from 4 subagents...

  Created ARCHITECTURE.md

$ cat ARCHITECTURE.md

# Monorepo Architecture Overview

## High-Level Structure

This is a TypeScript monorepo using Turborepo with the following components:

### Frontend Applications (2)
- **apps/web** - Next.js 14 customer dashboard (Server Components + App Router)
- **apps/mobile** - React Native app (iOS + Android)

### Backend Services (3)
- **services/api** - Express REST API (PostgreSQL + Prisma)
- **services/auth** - Authentication service (JWT + OAuth)
- **services/billing** - Stripe integration service

### Shared Packages (6)
- **packages/ui** - React component library (Tailwind CSS)
- **packages/utils** - Shared TypeScript utilities
- **packages/types** - Shared TypeScript types
- **packages/config** - ESLint, TypeScript, Tailwind configs
- **packages/database** - Prisma schema + migrations
- **packages/api-client** - Type-safe API client

### Infrastructure
- **terraform/** - AWS infrastructure (ECS, RDS, S3, CloudFront)
- **k8s/** - Kubernetes manifests for staging/prod
- **.github/workflows/** - CI/CD pipelines (test, build, deploy)

## Key Architectural Decisions

1. **Monorepo Strategy**: Turborepo for build caching and task orchestration
2. **Data Flow**: REST API → PostgreSQL (no GraphQL, deliberate choice for simplicity)
3. **Authentication**: Centralized auth service (JWT tokens, 1-hour expiry, refresh flow)
4. **Deployment**: ECS on AWS (not Lambda, for long-running WebSocket connections)

## Module Dependencies

[Dependency graph generated from package.json analysis]

web → @repo/ui, @repo/api-client, @repo/utils
api → @repo/database, @repo/utils
billing → @repo/database, @repo/api-client

## Critical Entry Points

- **Frontend**: `apps/web/app/page.tsx` (root route)
- **Backend**: `services/api/src/server.ts` (Express app)
- **Database**: `packages/database/prisma/schema.prisma` (data model)

---

This document was generated by analyzing 523 files across 4 parallel subagents.
Last updated: 2025-01-02

Each subagent had its own isolated 200K context window, allowing parallel analysis of the entire monorepo.


The Core Question You’re Answering

“How do I analyze a codebase that’s too large for a single context window, and how do I coordinate multiple AI agents working in parallel?”

Before building this, understand: Subagents are Kiro’s answer to the context window limit. Each subagent gets its own isolated 200K context, runs autonomously with a specific task, and reports back findings. This project teaches you distributed AI workflows—how to decompose a large problem into parallel subtasks and aggregate results.
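The fan-out/fan-in shape of this workflow can be sketched in plain Python. This is a conceptual illustration only: `analyze`, the scope names, and the thread pool are stand-ins for Kiro's delegation prompts, not its actual API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for "spawn a subagent with a scoped task";
# in Kiro this is a delegation prompt, not a Python function call.
def analyze(scope: str) -> str:
    return f"## {scope}\nFindings for {scope}"

scopes = ["apps/web", "services/api", "packages/ui", "terraform"]

# Fan out: each scope is analyzed independently (isolated context).
# Fan in: results come back in task order and are stitched together.
with ThreadPoolExecutor(max_workers=4) as pool:
    sections = list(pool.map(analyze, scopes))

draft = "# Architecture Overview\n\n" + "\n\n".join(sections)
print(draft.splitlines()[0])  # → # Architecture Overview
```

The key property mirrored here is isolation: each `analyze` call sees only its own scope, and only the short summaries (not the full file contents) flow back to the aggregation step.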


Concepts You Must Understand First

Stop and research these before coding:

  1. Subagent Isolation
    • How does each subagent get its own 200K context window?
    • Can subagents communicate with each other, or only with the main agent?
    • What happens if a subagent fails or runs out of context?
    • Reference: Subagents and Plan Agent Changelog
  2. Task Decomposition
    • How do you break a vague goal (“map the architecture”) into specific subagent tasks?
    • What makes a good subagent delegation prompt?
    • How do you avoid duplicated work between subagents?
    • Book Reference: “The Pragmatic Programmer” by Hunt & Thomas - Ch. 6 (Concurrency)
  3. Result Aggregation
    • How do you merge findings from 5 different subagents into a coherent document?
    • What if subagents have conflicting information?
    • How do you verify subagent results are accurate?
    • Reference: Built-in Tools - use_subagent
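One way to think about conflicting findings is as a key-value merge that flags disagreements instead of silently overwriting. A minimal sketch, with hypothetical `findings_a`/`findings_b` reports:

```python
# Hypothetical findings keyed by topic, as two subagents might report them.
findings_a = {"auth": "JWT with 1-hour expiry", "db": "PostgreSQL via Prisma"}
findings_b = {"auth": "session cookies", "cache": "Redis"}

merged, conflicts = {}, {}
for report in (findings_a, findings_b):
    for topic, claim in report.items():
        if topic in merged and merged[topic] != claim:
            conflicts[topic] = (merged[topic], claim)  # flag, don't overwrite
        else:
            merged[topic] = claim

# Conflicting topics get escalated to the main agent for a file-level check.
print(sorted(conflicts))  # → ['auth']
```

Anything that lands in `conflicts` is exactly what the verification step (reading the actual source files) should resolve.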

Questions to Guide Your Design

Before implementing, think through these:

  1. Delegation Strategy
    • Should you divide work by directory (frontend/ vs backend/), by file type (*.tsx vs *.ts), or by concern (auth, billing, UI)?
    • How many subagents should you spawn? (Kiro supports up to 10 parallel)
    • What instructions should each subagent receive? (generic vs specific)
  2. Overlap and Gaps
    • How do you ensure no files are missed between subagent tasks?
    • What if a file belongs to multiple domains (e.g., shared types)?
    • Should subagents have overlapping scopes for validation?
  3. Aggregation Logic
    • Should you manually merge subagent outputs or ask Kiro to synthesize them?
    • What structure should the final ARCHITECTURE.md follow?
    • How do you cite which subagent discovered which fact?
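A quick coverage check helps answer the overlap-and-gaps questions above: given the file listing and the directory prefixes you assigned, any file matching no scope is a gap. A sketch with hypothetical paths and scope names:

```python
# Hypothetical file listing; in practice this comes from `tree` or `ls -R`.
files = ["apps/web/page.tsx", "apps/mobile/App.tsx",
         "services/api/server.ts", "packages/ui/button.tsx"]

# Directory-prefix scopes assigned to each subagent.
scopes = {"sub1": "apps/", "sub2": "services/"}

covered = {f for f in files for prefix in scopes.values() if f.startswith(prefix)}
gaps = sorted(set(files) - covered)

# Any gap means no subagent was assigned those files.
print(gaps)  # → ['packages/ui/button.tsx']
```

Running this before spawning anything catches the "shared types belong to nobody" problem cheaply.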

Thinking Exercise

Exercise: Design Subagent Delegation

You’re analyzing a Django monolith with this structure:

project/
├── apps/           (8 Django apps: users, products, orders, payments, etc.)
├── core/           (shared models, middleware, utilities)
├── api/            (DRF API endpoints)
├── frontend/       (React SPA)
├── tests/          (pytest test suite)
└── infrastructure/ (Docker, k8s, Terraform)

Design a subagent strategy:

Subagent 1: ______________________
  Task: ___________________________
  Expected output: _________________

Subagent 2: ______________________
  Task: ___________________________
  Expected output: _________________

Subagent 3: ______________________
  Task: ___________________________
  Expected output: _________________

Questions while designing:

  • How do you handle shared code in core/ that all apps depend on?
  • Should the API analysis be separate from app analysis, or combined?
  • How do you prevent one subagent from analyzing the entire codebase redundantly?

The Interview Questions They’ll Ask

  1. “Explain how Kiro’s subagents have isolated context windows and why this matters.”
  2. “How would you decompose the task of ‘document this codebase’ into parallel subagent tasks?”
  3. “What are the tradeoffs between spawning many small subagents vs few large subagents?”
  4. “How do you handle conflicting information from different subagents?”
  5. “When should you use subagents vs when should you use the main agent with /compact?”
  6. “How would you verify that subagent-generated documentation is accurate?”

Hints in Layers

Hint 1: Start with Manual Decomposition

Don’t just say “analyze the codebase.” First, manually explore with ls -R or tree to understand the structure. Then write specific tasks: “Subagent 1: analyze all files in apps/web/, summarize routing and components.”

Hint 2: Use Focused Delegation Prompts

Each subagent needs a clear objective and output format. Example: “Analyze the authentication flow in services/auth/. Output: 1) Entry points, 2) Key functions, 3) Database models used, 4) External dependencies.”

Hint 3: Aggregate Incrementally

Don’t wait for all subagents to finish. As each completes, copy its findings into a draft ARCHITECTURE.md. This lets you spot gaps early and spawn follow-up subagents if needed.
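The incremental aggregation in this hint can be simulated: merge each result as it arrives and report which subagents are still outstanding, so gaps surface before the run ends. A sketch with invented subagent names and reports:

```python
# Simulated completion order: results arrive one at a time, not in batch.
completed = [
    ("Subagent 2", "## services/\nExpress API, auth service"),
    ("Subagent 1", "## apps/\nNext.js web, React Native mobile"),
]
expected = {"Subagent 1", "Subagent 2", "Subagent 3"}

draft_sections = {}
for name, report in completed:
    draft_sections[name] = report  # fold each result into the draft as it lands
    missing = sorted(expected - set(draft_sections))
    print(f"{name} done; still waiting on: {missing}")

# Subagent 3 never reported, so a follow-up subagent can be
# spawned for its scope before the final document is written.
```

The point is that the draft is always in a mergeable state, so a stalled or failed subagent shows up as a visible hole rather than a silent omission.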

Hint 4: Verify with Cross-References

After aggregation, use the main agent to verify facts. Example: “Subagent 2 claims the API uses JWT authentication. Can you confirm this by reading services/api/src/middleware/auth.ts?”


Books That Will Help

  • Concurrent task execution: “The Pragmatic Programmer” by Hunt & Thomas, Ch. 6 (Concurrency)
  • Distributed systems patterns: “Designing Data-Intensive Applications” by Kleppmann, Ch. 5 (Replication)
  • Code archaeology techniques: “Working Effectively with Legacy Code” by Feathers, Ch. 16 (“I Don’t Understand This Code”)

Common Pitfalls & Debugging

Problem 1: “Subagents all analyzed the same files”

  • Why: Delegation prompts were too vague (“analyze the frontend”)
  • Fix: Be specific with directory scopes: “analyze only apps/web/”, “analyze only apps/mobile/”
  • Quick test: Check each subagent’s output—file paths should not overlap significantly

Problem 2: “Subagent hallucinated architecture details”

  • Why: Asked for high-level summary without grounding in actual files
  • Fix: Require subagents to cite specific files and line numbers for claims
  • Quick test: Manually verify 3-5 claims from each subagent’s output

Problem 3: “One subagent ran out of context”

  • Why: Assigned too many files to a single subagent (e.g., “analyze all 200 components”)
  • Fix: Split into smaller chunks or use /grep for reconnaissance before loading files
  • Quick test: If a subagent’s task scope > 50 files, consider splitting
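The splitting rule from the fix above is simple arithmetic: slice the file list into chunks of at most 50. A sketch with a hypothetical component listing:

```python
# Hypothetical list of 120 component files that overflowed one subagent.
files = [f"components/C{i}.tsx" for i in range(120)]

CHUNK = 50  # rough per-subagent budget from the pitfall above
batches = [files[i:i + CHUNK] for i in range(0, len(files), CHUNK)]

# Each batch becomes one subagent's scope.
print([len(b) for b in batches])  # → [50, 50, 20]
```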

Problem 4: “Aggregated document is incoherent”

  • Why: Each subagent used different section structures
  • Fix: Give all subagents a template: “Output format: ## Module Name\n### Purpose\n### Key Files\n### Dependencies”
  • Quick test: All subagent outputs should have consistent markdown headings
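This quick test can be automated: check each subagent's output for the headings the template requires. A sketch with invented outputs and subagent names:

```python
# Headings every subagent was told to emit (from the template in the fix).
required = ["### Purpose", "### Key Files", "### Dependencies"]

# Hypothetical raw outputs from two subagents.
outputs = {
    "sub1": "## ui\n### Purpose\n...\n### Key Files\n...\n### Dependencies\n...",
    "sub2": "## api\n### Purpose\n...\n### Key Files\n...",
}

nonconforming = sorted(
    name for name, text in outputs.items()
    if not all(h in text for h in required)
)
print(nonconforming)  # → ['sub2']
```

Any name in `nonconforming` gets a follow-up prompt asking that subagent to reformat its report before aggregation.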

Definition of Done

  • Spawned at least 3 subagents with non-overlapping scopes
  • Each subagent received a specific directory or module to analyze
  • All subagents completed successfully (no context overflow errors)
  • Aggregated findings into a single ARCHITECTURE.md document
  • Document includes: high-level structure, module purposes, key entry points, dependencies
  • Manually verified 5+ claims from subagent outputs (checked actual files)
  • Cross-referenced between subagents to resolve conflicts
  • Understand when to use subagents vs main agent with compaction