Sprint: Multi-Agent Coding Interoperability Mastery - Real World Projects
Goal: Build a deep, first-principles understanding of how multiple coding agents interoperate across CLIs, headless modes, tools, hooks, memory, and configuration. You will design practical automation systems that coordinate Claude Code, Codex, Gemini CLI, Kiro CLI, and similar agents without lock-in. You will learn how to standardize prompts, normalize outputs, route tasks, and verify safety policies across heterogeneous agent runtimes. By the end, you will be able to architect reliable, auditable, and cost-aware multi-agent automation pipelines for real coding work.
Why Multi-Agent Interoperability Matters
Tooling for AI-assisted coding has shifted from single-assistant workflows to ecosystems of specialized agents. Each CLI has different strengths: some excel at repository analysis, others at headless batch runs, and others at hooks, plugins, or MCP integrations. Interoperability lets you combine those strengths, reduce vendor lock-in, and standardize safety and quality across the team. It also enables durable automation: when one tool changes, you can adapt the adapter layer instead of rewriting workflows.
Siloed agents vs interoperable agents:
Siloed CLI Agents Interoperable Agent Mesh
[Claude Code] [Codex] [Claude] [Codex] [Gemini]
| | | | |
Scripts Scripts +--------+-------+
| | | Interop Layer |
+------X-----+ +--------+-------+
No sharing | Shared Policies|
| Shared Schemas |
+-----------------+

Prerequisites & Background Knowledge
Before starting these projects, you should have foundational understanding in these areas:
Essential Prerequisites (Must Have)
Programming Skills:
- Comfortable reading and structuring automation scripts in Python or JavaScript
- Familiarity with JSON and YAML configuration patterns
CLI and Tooling Fundamentals:
- Basic shell usage and file system navigation
- Git workflows and repository structure
- Recommended Reading: “The Linux Command Line” by William E. Shotts - Ch. 1-7
Software Design Basics:
- Interfaces, contracts, and separation of concerns
- Recommended Reading: “Clean Architecture” by Robert C. Martin - Ch. 1-4
Helpful But Not Required
DevOps and Delivery:
- CI/CD concepts and automation pipelines
- Can learn during: Projects 28-31
Systems Thinking:
- Observability and reliability basics
- Can learn during: Projects 14-18
Self-Assessment Questions
Before starting, ask yourself:
- ✅ Can you read a CLI tool configuration file and explain precedence rules?
- ✅ Can you explain the difference between interactive and headless modes?
- ✅ Can you define a contract for a tool output and validate it?
If you answered “no” to questions 1-3: Spend 1-2 weeks on the “Recommended Reading” books above before starting. If you answered “yes” to all: You’re ready to begin.
Development Environment Setup
Required Tools:
- Git
- A scripting language runtime (Python or Node.js)
- At least two AI coding CLIs installed (Claude Code, Codex, Gemini CLI, Kiro CLI, or similar)
Recommended Tools:
- JSON tooling (jq or similar)
- A task runner (Make, Just, or similar)
Testing Your Setup:
# Verify you have the basics
$ [command to show CLI versions]
[expected output showing installed tools]
Time Investment:
- Simple projects (1-10): Weekend (4-8 hours each)
- Moderate projects (11-30): 1 week (10-20 hours each)
- Complex projects (31-42): 2+ weeks (20-40 hours each)
- Total sprint: 6-12 months if doing all projects sequentially
Important Reality Check: Interoperability work is messy by nature. You will spend time normalizing outputs, handling edge cases, and reconciling conflicting expectations between tools. This is the learning.
Core Concept Analysis
1. Agent Surfaces and Interaction Modes
Agents expose different surfaces: REPL, headless batch mode, subagents, or plugin hooks. Interoperability requires a clear model of each surface and how tasks should move across them.
User Input -> Interactive REPL -> Agent Actions
|
v
Headless Batch

2. Configuration and Instruction Hierarchy
Every agent has layered configuration: system prompts, user prompts, tool configs, and runtime overrides. Interoperability depends on mapping these layers into a common contract.
3. Tool APIs, Extensions, and Plugins
Agents differ in tool invocation models and extension ecosystems. You need a schema registry that normalizes tools into a shared vocabulary.
4. Task Routing and Subagent Orchestration
Modern CLIs can spawn subagents. Your interoperability layer should define when to delegate, how to pass context, and how to merge results.
Main Agent
|-- Subagent A: search
|-- Subagent B: test
|-- Subagent C: patch
|
Merge Results

5. Context and Memory Management
Agents have limited context windows and different memory systems. Interoperability requires explicit rules for what to keep, summarize, and externalize.
6. Safety, Approval, and Sandbox Boundaries
Each CLI has its own approval policy and sandboxing model. A unified automation system must enforce the strictest policy and provide human checkpoints.
7. Observability and Evaluation
Without shared logs and metrics, a multi-agent system is opaque. Define a consistent event schema to track prompts, tool calls, and outcomes.
8. Protocols for Interop (MCP and Beyond)
Model Context Protocol and similar specs enable standardized tool and data sharing. Understanding these protocols is essential for scaling across tools.
Concept Summary Table
This section provides a map of the mental models you will build during these projects.
| Concept Cluster | What You Need to Internalize |
|---|---|
| Agent Surfaces | How interactive, headless, and hook-based modes differ and how to bridge them |
| Configuration Hierarchy | How layered prompts and configs interact across tools |
| Tool/Plugin Ecosystems | How to normalize tool schemas and capability discovery |
| Orchestration | How to delegate tasks, merge results, and handle failures |
| Memory & Context | How to budget context and synchronize memory across agents |
| Safety & Governance | How to enforce approvals, sandboxing, and audit trails |
| Observability | How to define logs and evaluation signals for automation |
| Interoperability Protocols | How MCP and similar specs enable shared context |
Deep Dive Reading by Concept
This section maps each concept to specific book chapters for deeper understanding.
Interoperability Architecture
| Concept | Book & Chapter | Why This Matters |
|---|---|---|
| Interface contracts | “Clean Architecture” by Robert C. Martin - Ch. 1-4 | Clean boundaries make adapters feasible |
| System boundaries | “Fundamentals of Software Architecture” by Mark Richards and Neal Ford - Ch. 2-4 | Helps define stable seams |
Automation and Reliability
| Concept | Book & Chapter | Why This Matters |
|---|---|---|
| Delivery pipelines | “Continuous Delivery” by David Farley and Jez Humble - Ch. 1-3 | Builds reliable automation primitives |
| Operational stability | “Release It!” by Michael T. Nygard - Ch. 1-3 | Teaches resilience patterns |
Data, Memory, and Observability
| Concept | Book & Chapter | Why This Matters |
|---|---|---|
| Data flow models | “Designing Data-Intensive Applications” by Martin Kleppmann - Ch. 1-2 | Foundations for shared logs |
| Metrics and feedback | “Accelerate” by Nicole Forsgren et al. - Ch. 2-3 | Measurement for automation success |
Quick Start: Your First 48 Hours
Feeling overwhelmed? Start here instead of reading everything:
Day 1 (4 hours):
- Read the “Agent Surfaces” and “Configuration” concepts above
- Install two CLIs and capture their version outputs
- Start Project 1 and Project 2, focusing only on mapping capabilities
- Do not optimize yet, just document differences
Day 2 (4 hours):
- Write a simple prompt contract (Project 3)
- Sketch a tool registry map (Project 4)
- Review the Core Question for Project 3
- Document one friction point per CLI
End of Weekend: You now understand how agent surfaces and configuration layers shape interoperability. That is the core mental model.
Next Steps:
- If it clicked: Continue to Project 5
- If confused: Re-read the Concept Summary Table
- If frustrated: Take a break. Interop work is hard.
Recommended Learning Path
Path 1: The Pragmatic Automator (Recommended Start)
Best for: Engineers who want working automation quickly
- Start with Project 1 - Build a capability matrix
- Then Project 3 - Define prompt contracts
- Then Project 9 - Headless batch execution
Path 2: The Platform Engineer
Best for: People building team-wide agent tooling
- Start with Project 4 - Tool schema registry
- Then Project 14 - Logging standard
- Then Project 28 - Event-driven agent bus
Path 3: The Researcher
Best for: Those focused on evaluation and benchmarks
- Start with Project 16 - Context budget planner
- Then Project 20 - Test harness
- Then Project 38 - Benchmark suite
Project List
The following projects guide you from basic interoperability to full automation platforms.
- Agent Capability Matrix
- Config Precedence Detective
- Prompt Contract Spec
- Tool Schema Registry
- Subagent Task Router
- Hook Lifecycle Harness
- Extension and Plugin Compatibility Lab
- MCP Gateway Prototype
- Headless Batch Runner
- Interactive Session Recorder
- Approval Policy Simulator
- Sandbox Matrix Auditor
- Output Style Normalizer
- Multi-Agent Logging Standard
- Error Taxonomy and Retry Controller
- Context Budget Planner
- Memory Import and Export Bridge
- Cross-Agent Workspace Sync
- Secrets Broker Shim
- Test Harness for Agents
- Prompt Injection Red Team Lab
- Multi-Agent Code Review Pipeline
- Issue Triage Mesh
- Documentation Generator Federation
- Repo Indexing Strategy
- Skill and Prompt Pack Manager
- Cross-CLI Command Adapter
- Event-Driven Agent Bus
- Distributed Job Queue
- Cost and Latency Budget Enforcer
- Human-in-the-Loop Gate
- Semantic Diff and Patch Gate
- Knowledge Base RAG Connector
- Model Failover Switch
- Compliance Audit Logger
- Offline and Edge Mode Playbook
- Multi-tenant Agent Service
- Benchmark Suite for Agents
- Incident Response Automation
- IDE Bridge Integration
- Multi-Agent Pair Programming Protocol
- Capstone: Interoperable Automation Platform
Project 1: Agent Capability Matrix
- File: P01-agent-capability-matrix.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: 2
- Business Potential: 3
- Difficulty: 1
- Knowledge Area: Tooling, Documentation
- Software or Tool: Claude Code, Codex CLI, Gemini CLI, Kiro CLI
- Main Book: “The Pragmatic Programmer”
What you’ll build: A structured capability matrix that compares agent features and limits.
Why it teaches interoperability: You cannot bridge tools until you see their mismatched surfaces and strengths.
Core challenges you’ll face:
- Capability discovery -> maps to configuration and documentation analysis
- Normalization -> maps to contract design
- Coverage gaps -> maps to fallback strategy
Real World Outcome
A single reference matrix that lists each CLI agent, its modes, hooks, tools, and configuration scope. You can answer which agent to use for a given task and why.
What you will see:
- Matrix table: Each row is an agent, each column is a capability
- Notes column: Documented limitations and missing features
- Decision notes: Simple guidance on which agent fits which task
Command Line Outcome Example:
# 1. Export CLI versions
$ [command]
[expected output]
# 2. Extract config locations
$ [command]
[expected output]
# 3. Summarize capabilities
$ [command]
[expected output]
The Core Question You’re Answering
“What is the minimum shared feature set that every agent can support?”
Before you write any code, sit with this question. It defines your interoperability baseline.
Concepts You Must Understand First
Stop and research these before coding:
- CLI surfaces
- What is a REPL versus headless execution?
- Which features exist only in interactive mode?
- Book Reference: “The Pragmatic Programmer” Ch. 3
- Capability taxonomy
- How do you categorize tools, hooks, and extensions?
- What is a stable naming scheme?
- Configuration precedence
- Where do defaults, user configs, and project configs override each other?
Questions to Guide Your Design
- Matrix structure
- What columns must be present to compare agents fairly?
- How do you annotate partial support?
- Validation rules
- How do you verify a capability is real, not a marketing claim?
Thinking Exercise
Capability Taxonomy Sketch
Before coding, sketch a list of 10 capabilities and categorize them by surface.
[Capability] -> [Surface] -> [Evidence]
Questions while sketching:
- Which capabilities overlap across all agents?
- Which capabilities are unique and may require adapters?
- Which capabilities are unstable or experimental?
The Interview Questions They’ll Ask
- “How do you compare CLI agent capabilities in a consistent way?”
- “What is a good interoperability baseline and why?”
- “How do you verify a feature actually works?”
- “What is the risk of assuming feature parity?”
- “How do you document capability gaps?”
Hints in Layers
Hint 1: Starting Point Begin with a list of the surfaces: interactive, headless, hooks, plugins, MCP.
Hint 2: Next Level Define a schema with fields for mode, configuration, and tool support.
Hint 3: Technical Details Represent each agent as a row in a structured document and validate it with checks.
Hint 4: Tools/Debugging Use a diff tool to compare versions over time and highlight changes.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Capability analysis | “The Pragmatic Programmer” by Thomas and Hunt | Ch. 3 |
| Interface boundaries | “Clean Architecture” by Robert C. Martin | Ch. 1-4 |
| Observability basics | “Release It!” by Michael T. Nygard | Ch. 1 |
Common Pitfalls & Debugging
Problem 1: “Every agent seems to support everything”
- Why: Docs are high-level and omit limitations
- Fix: Validate each capability with a concrete test
- Quick test: [command that probes capability]
Problem 2: “Capabilities change between versions”
- Why: Rapid releases alter defaults
- Fix: Capture version metadata with every matrix entry
- Quick test: [command that prints version info]
Project 2: Config Precedence Detective
- File: P02-config-precedence-detective.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: 2
- Business Potential: 3
- Difficulty: 2
- Knowledge Area: Configuration Management
- Software or Tool: Claude Code, Codex CLI, Gemini CLI, Kiro CLI
- Main Book: “Clean Architecture”
What you’ll build: A documented map of configuration precedence across agents.
Why it teaches interoperability: Config rules define how automation behaves under different environments.
Core challenges you’ll face:
- Config discovery -> maps to filesystem probing
- Override hierarchy -> maps to policy design
- Conflict resolution -> maps to standardization
Real World Outcome
A precedence chart that shows which config files and flags override others for each CLI.
What you will see:
- Precedence diagram: Global -> user -> project -> runtime
- Conflict table: mismatched key names across tools
- Standard mapping: a normalized config glossary
Command Line Outcome Example:
$ [command to list config paths]
[expected output]
The Core Question You’re Answering
“When two settings conflict, which one wins and why?”
Concepts You Must Understand First
- Layered configuration
- How do defaults interact with environment overrides?
- Book Reference: “Clean Architecture” Ch. 7
- Config schema design
- How do you handle deprecated or renamed keys?
- Environment scoping
- How should project overrides be scoped to avoid global drift?
Questions to Guide Your Design
- Precedence rules
- What is the canonical order for each CLI?
- How will you represent exceptions?
- Mapping strategy
- Which keys are equivalent across tools?
- Where must you keep agent-specific names?
Thinking Exercise
Override Graph
Sketch a graph of config layers and mark which layer should own a safety rule.
The Interview Questions They’ll Ask
- “Why does configuration precedence matter in automation?”
- “How do you avoid unintentional overrides?”
- “What is the best place to store safety defaults?”
- “How do you manage config drift across tools?”
- “How do you standardize keys across CLIs?”
Hints in Layers
Hint 1: Starting Point List all known config files per CLI.
Hint 2: Next Level Create a precedence ladder and mark where each key is set.
Hint 3: Technical Details Define a mapping table between native keys and normalized keys.
Hint 4: Tools/Debugging Simulate overrides by toggling one setting at a time and observing output.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Config discipline | “Clean Architecture” by Robert C. Martin | Ch. 7 |
| Operations control | “Release It!” by Michael T. Nygard | Ch. 2 |
| Automation hygiene | “Continuous Delivery” by Farley and Humble | Ch. 4 |
Common Pitfalls & Debugging
Problem 1: “Settings seem ignored”
- Why: A higher-precedence layer overrides them
- Fix: Annotate all layers and retest
- Quick test: [command to print effective config]
Problem 2: “Same key means different things”
- Why: Semantic drift across tools
- Fix: Add a normalization glossary
Project 3: Prompt Contract Spec
- File: P03-prompt-contract-spec.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 2
- Knowledge Area: Prompt Engineering, Contracts
- Software or Tool: All agents
- Main Book: “Clean Architecture”
What you’ll build: A portable prompt contract that standardizes instructions, constraints, and output rules.
Why it teaches interoperability: A shared prompt contract is the foundation of consistent behavior.
Core challenges you’ll face:
- Instruction hierarchy -> maps to system vs user prompt separation
- Output schema -> maps to validation rules
- Style normalization -> maps to post-processing
Real World Outcome
A prompt contract document and a validation checklist that each agent can follow.
What you will see:
- Prompt template: Required sections for any task
- Output schema: Required fields in responses
- Compatibility notes: Exceptions per CLI
Command Line Outcome Example:
$ [command that validates prompt contract]
[expected output]
The Core Question You’re Answering
“How do you make different agents follow the same instructions?”
Concepts You Must Understand First
- Instruction hierarchy
- How do system and user instructions interact?
- Book Reference: “Clean Architecture” Ch. 8
- Structured outputs
- What fields must be present in every response?
- Policy constraints
- Which rules must never be violated across tools?
Questions to Guide Your Design
- Contract scope
- What is universal vs tool-specific?
- Validation
- How do you detect missing fields or invalid formats?
Thinking Exercise
Prompt Contract Skeleton
Write a minimal contract in plain language with three required sections: goal, constraints, output.
The Interview Questions They’ll Ask
- “What is a prompt contract?”
- “How do you normalize outputs across agents?”
- “What risks come from inconsistent prompts?”
- “How do you validate agent adherence?”
- “How do you handle exceptions?”
Hints in Layers
Hint 1: Starting Point List the most common prompt instructions you give today.
Hint 2: Next Level Standardize those instructions into required sections.
Hint 3: Technical Details Create a schema-like checklist that every response must satisfy.
Hint 4: Tools/Debugging Compare outputs from two agents using the same contract and record deviations.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Contract design | “Clean Architecture” by Robert C. Martin | Ch. 8 |
| Refactoring outputs | “Refactoring” by Martin Fowler | Ch. 1 |
| Reliability | “Release It!” by Michael T. Nygard | Ch. 3 |
Common Pitfalls & Debugging
Problem 1: “Agents ignore formatting rules”
- Why: Contract not explicit enough
- Fix: Add a strict required output checklist
Problem 2: “Output is verbose or inconsistent”
- Why: No defined tone or length guidance
- Fix: Add style constraints to the contract
Project 4: Tool Schema Registry
- File: P04-tool-schema-registry.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript, Go
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Tooling, Schemas
- Software or Tool: MCP, CLI tools
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: A registry that maps tool schemas across agents.
Why it teaches interoperability: Normalized tool schemas allow a single task to call tools across agents.
Core challenges you’ll face:
- Schema extraction -> maps to tool introspection
- Normalization -> maps to registry design
- Versioning -> maps to change management
Real World Outcome
A registry file or service listing tool names, inputs, outputs, and compatibility notes.
What you will see:
- Tool catalog: each tool with a normalized name
- Schema mapping: input and output fields by agent
- Version map: supported schema versions
The Core Question You’re Answering
“How can tools be invoked consistently across different agents?”
Concepts You Must Understand First
- Schema normalization
- How do you define a canonical tool signature?
- Versioning
- How do you handle incompatible changes?
- Tool discovery
- How do you detect available tools in each CLI?
Questions to Guide Your Design
- Registry format
- Should it be file-based, service-based, or both?
- Compatibility rules
- How do you label partial support?
Thinking Exercise
Tool Mapping Table
Create a small table mapping three tools across two agents and note mismatches.
The Interview Questions They’ll Ask
- “Why do you need a tool schema registry?”
- “How do you handle schema drift?”
- “How do you normalize tool names?”
- “What is the difference between discovery and registration?”
- “How do you version tool interfaces?”
Hints in Layers
Hint 1: Starting Point Pick five tools and document their inputs and outputs.
Hint 2: Next Level Define a canonical schema that all tools map to.
Hint 3: Technical Details Add version fields and compatibility notes for each mapping.
Hint 4: Tools/Debugging Simulate a schema mismatch and describe fallback behavior.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Data modeling | “Designing Data-Intensive Applications” by Martin Kleppmann | Ch. 1 |
| Interfaces | “Clean Architecture” by Robert C. Martin | Ch. 8 |
| Change management | “Release It!” by Michael T. Nygard | Ch. 5 |
Common Pitfalls & Debugging
Problem 1: “Registry becomes out of date”
- Why: Tools update frequently
- Fix: Add version metadata and review cadence
Problem 2: “Schema mappings are ambiguous”
- Why: Vague field definitions
- Fix: Add explicit field descriptions and examples
Project 5: Subagent Task Router
- File: P05-subagent-task-router.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python, Go
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Orchestration
- Software or Tool: Subagents, task routing
- Main Book: “Fundamentals of Software Architecture”
What you’ll build: A concrete task-routing spec that decides which agent handles which task, how context is packaged, and how outputs are merged into a single answer.
Why it teaches interoperability: Real interoperability is not just calling multiple tools. It is deciding when to delegate, what to send, and how to reassemble results without losing intent.
Core challenges you’ll face:
- Task decomposition -> maps to architecture and boundaries
- Context handoff -> maps to state and minimal contracts
- Result merging -> maps to normalization and conflict resolution
Real World Outcome
A router document that defines task categories, routing rules, handoff schema, and merge rules.
What you will see:
- Routing table: task type -> primary agent and fallback
- Handoff schema: minimal required fields by task
- Merge rules: how to combine partial answers and resolve conflicts
Example routing spec (excerpt):
routes:
repo_search:
primary: codex
fallback: claude
required_context: [repo_path, query, constraints]
api_design:
primary: claude
fallback: gemini
required_context: [requirements, existing_endpoints, auth_model]
test_execution:
primary: codex
fallback: kiro
required_context: [test_cmd, env, timeout]
handoff_schema:
base:
task_id: string
goal: string
constraints: [string]
artifacts: [path]
success_criteria: [string]
merge_rules:
- If two agents disagree, prefer the one with evidence links or command output.
- If both answers are partial, compose by section and mark unresolved items.
The Core Question You’re Answering
“When should a task be delegated and to whom?”
Concepts You Must Understand First
- Task granularity
- What is a unit of work that can stand alone?
- Context minimalism
- What is the smallest context that still avoids ambiguity?
- Merge strategy
- How do you reconcile conflicting outputs without losing evidence?
Questions to Guide Your Design
- Routing rules
- What signals determine the right agent: task type, risk, or cost?
- Fallbacks
- What is the fallback path when an agent fails or times out?
- Handoff schema
- Which fields are mandatory vs optional for each task type?
- Merge policy
- How do you label conflicts and decide a final output?
Thinking Exercise
Delegation Tree
Draw a tree that splits a complex task into three subagent assignments. For each node, write the minimum context payload and the expected return artifact.
The Interview Questions They’ll Ask
- “How do you decide when to delegate tasks?”
- “What context must be passed to a subagent?”
- “How do you merge conflicting outputs?”
- “What is the cost of over-delegation?”
- “How do you handle subagent failure?”
- “How do you validate the output from an untrusted agent?”
- “When is a single agent faster than a multi-agent split?”
Hints in Layers
Hint 1: Starting Point List three task types that are naturally separable and name the best agent for each.
Hint 2: Next Level Define the minimal fields each handoff must include and write a schema.
Hint 3: Technical Details Write merge rules for conflicts and missing data.
Hint 4: Tools/Debugging Run a tiny task through two agents and compare which produced evidence.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Architecture tradeoffs | “Fundamentals of Software Architecture” | Ch. 2 |
| Resilience | “Release It!” | Ch. 4 |
| System boundaries | “Clean Architecture” | Ch. 9 |
Common Pitfalls & Debugging
Problem 1: “Too many handoffs”
- Why: Over-decomposition
- Fix: Merge tasks until each handoff has clear purpose
Problem 2: “Subagent misses context”
- Why: Handoff schema too sparse
- Fix: Add required context fields
Problem 3: “Conflicting answers with no decision”
- Why: No merge policy or evidence requirement
- Fix: Add a rule to prefer outputs with citations or command output
Problem 4: “Fallbacks never trigger”
- Why: Failure detection is vague
- Fix: Define explicit timeout and error conditions for retries
Project 6: Hook Lifecycle Harness
- File: P06-hook-lifecycle-harness.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python, Go
- Coolness Level: 3
- Business Potential: 3
- Difficulty: 2
- Knowledge Area: Hooks, Automation
- Software or Tool: CLI hooks
- Main Book: “Continuous Delivery”
What you’ll build: A harness that documents pre and post hooks across agents.
Why it teaches interoperability: Hooks are the seams where automation integrates.
Core challenges you’ll face:
- Hook discovery -> maps to CLI introspection
- Ordering -> maps to lifecycle modeling
- Side effects -> maps to reliability concerns
Real World Outcome
A hook lifecycle diagram and checklist for each CLI tool.
What you will see:
- Hook map: pre-task, post-task, error hooks
- Timing notes: when hooks fire
- Compatibility: which hooks are portable
The Core Question You’re Answering
“Where can automation safely intervene in each agent?”
Concepts You Must Understand First
- Lifecycle events
- Which events are stable vs experimental?
- Side effects
- What should never happen inside a hook?
- Ordering
- How do hooks compose across tools?
Questions to Guide Your Design
- Hook priority
- Which hooks should run first?
- Safety boundaries
- What hooks require approval?
Thinking Exercise
Hook Timeline
Draw a timeline of one CLI execution and mark hook points.
The Interview Questions They’ll Ask
- “What are hooks and why are they important?”
- “What risks do hooks introduce?”
- “How do you enforce ordering?”
- “How do you share hook behavior across tools?”
- “How do you debug hook failures?”
Hints in Layers
Hint 1: Starting Point List all hook types described in docs.
Hint 2: Next Level Map them to a shared lifecycle schema.
Hint 3: Technical Details Define a minimal payload each hook should receive.
Hint 4: Tools/Debugging Simulate a failure hook and record response.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Pipeline control | “Continuous Delivery” | Ch. 2 |
| Operational safety | “Release It!” | Ch. 5 |
| Automation style | “The Pragmatic Programmer” | Ch. 6 |
Common Pitfalls & Debugging
Problem 1: “Hooks run in the wrong order”
- Why: Unclear lifecycle rules
- Fix: Define and document a canonical timeline
Problem 2: “Hooks hide failures”
- Why: Errors swallowed inside scripts
- Fix: Make failure conditions explicit
Project 7: Extension and Plugin Compatibility Lab
- File: P07-extension-and-plugin-compatibility-lab.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Extensions, Plugins
- Software or Tool: Plugins, extensions, custom agents
- Main Book: “Design Patterns”
What you’ll build: A compatibility report for extensions across agent ecosystems.
Why it teaches interoperability: Extensions often define unique features that must be bridged.
Core challenges you’ll face:
- Capability mapping -> maps to adapter patterns
- Version compatibility -> maps to change control
- Isolation -> maps to sandboxing
Real World Outcome
A compatibility grid that shows which extensions can be adapted across tools and how.
What you will see:
- Extension list: categorized by purpose
- Portability rating: high, medium, low
- Adapter notes: what to build or omit
The Core Question You’re Answering
“Which extension features can be made portable and which cannot?”
Concepts You Must Understand First
- Adapter pattern
- How do you translate one plugin system to another?
- Isolation boundaries
- What must stay sandboxed?
- Versioning
- How do you manage extensions across releases?
Questions to Guide Your Design
- Portability criteria
- What makes an extension portable?
- Risk analysis
- Which extensions introduce security or compliance risks?
Thinking Exercise
Adapter Mapping
Pick one extension and list the hooks or tools it depends on in each CLI.
The Interview Questions They’ll Ask
- “How do plugin systems differ across tools?”
- “What is an adapter and why is it needed?”
- “How do you assess portability?”
- “What is the risk of extension lock-in?”
- “How do you manage extension versions?”
Hints in Layers
Hint 1: Starting Point Select three extensions from different ecosystems.
Hint 2: Next Level Compare their APIs and dependencies.
Hint 3: Technical Details Map them into an adapter interface with optional fields.
Hint 4: Tools/Debugging Document the minimal test to confirm compatibility.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Adapter pattern | “Design Patterns” by GoF | Ch. 4 |
| Modularity | “Clean Architecture” | Ch. 6 |
| Risk tradeoffs | “Release It!” | Ch. 6 |
Common Pitfalls & Debugging
Problem 1: “Extensions assume unavailable tools”
- Why: Ecosystem-specific dependencies
- Fix: Add capability checks before enabling
Problem 2: “Extension output is inconsistent”
- Why: Different output schemas
- Fix: Normalize outputs in adapters
Project 8: MCP Gateway Prototype
- File: P08-mcp-gateway-prototype.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, JavaScript
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 4
- Knowledge Area: Protocols, MCP
- Software or Tool: MCP
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: A gateway concept that exposes shared MCP servers to multiple CLIs.
Why it teaches interoperability: MCP is a key protocol for shared tool and context access.
Core challenges you’ll face:
- Protocol translation -> maps to interoperability
- Security boundaries -> maps to policy design
- Multi-client support -> maps to concurrency
Real World Outcome
A conceptual gateway design with message flow diagrams and connection rules.
What you will see:
- Protocol diagram: client -> gateway -> MCP server
- Auth mapping: how credentials are passed
- Failure handling: retry and fallback rules
The Core Question You’re Answering
“How do multiple agents safely share MCP resources?”
Concepts You Must Understand First
- Protocol basics
- How does MCP represent tools and resources?
- Security scopes
- How do you limit access per agent?
- Concurrency
- How do multiple clients share the gateway?
Questions to Guide Your Design
- Gateway boundaries
- Where do you terminate auth?
- Resource naming
- How do you avoid collisions?
Thinking Exercise
Gateway Request Flow
Sketch a request flow from two agents to one MCP server.
The Interview Questions They’ll Ask
- “What problem does MCP solve?”
- “Why use a gateway instead of direct access?”
- “How do you secure shared resources?”
- “How do you handle resource collisions?”
- “How do you scale MCP access?”
Hints in Layers
Hint 1: Starting Point List the MCP servers you want to share.
Hint 2: Next Level Define an authentication and routing layer.
Hint 3: Technical Details Describe request envelopes and error handling rules.
Hint 4: Tools/Debugging Create a failure scenario and define a fallback response.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Protocol design | “Designing Data-Intensive Applications” | Ch. 4 |
| Concurrency | “Operating Systems: Three Easy Pieces” | Ch. 26 |
| Security | “Security in Computing” by Pfleeger | Ch. 3 |
Common Pitfalls & Debugging
Problem 1: “Gateway becomes a bottleneck”
- Why: Single shared resource without scaling plan
- Fix: Define sharding or replication
Problem 2: “Access rules are inconsistent”
- Why: Policies applied per tool instead of per agent
- Fix: Centralize policy evaluation
Project 9: Headless Batch Runner
- File: P09-headless-batch-runner.md
- Main Programming Language: Python
- Alternative Programming Languages: Go, JavaScript
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 2
- Knowledge Area: Automation
- Software or Tool: Headless mode
- Main Book: “Continuous Delivery”
What you’ll build: A batch execution plan that runs tasks headlessly across agents.
Why it teaches interoperability: Headless mode is critical for CI and scheduled jobs.
Core challenges you’ll face:
- Input packaging -> maps to repeatable runs
- Output capture -> maps to logging
- Error detection -> maps to reliability
Real World Outcome
A runbook showing how to execute a queue of tasks through multiple agents without interactive steps.
What you will see:
- Batch manifest: list of tasks and inputs
- Output archive: standardized logs
- Failure summary: auto-retry guidance
The Core Question You’re Answering
“How do you run agents in automation without human intervention?”
Concepts You Must Understand First
- Headless mode
- What changes in behavior without a REPL?
- Idempotency
- How do you avoid repeated damage on retries?
- Output capture
- How do you store results for later analysis?
Questions to Guide Your Design
- Batch inputs
- What is the minimal input schema?
- Error handling
- When should you retry vs stop?
Thinking Exercise
Batch Run Scenario
Imagine three tasks failing in different stages. Define what your runner should do.
The Interview Questions They’ll Ask
- “What is the difference between headless and interactive execution?”
- “Why is idempotency important in batch automation?”
- “How do you capture headless outputs reliably?”
- “When do you retry and when do you fail fast?”
- “How do you audit batch runs?”
Hints in Layers
Hint 1: Starting Point Design a task manifest with input and expected output fields.
Hint 2: Next Level Define how each agent is invoked in headless mode.
Hint 3: Technical Details Specify output capture and logging rules for each run.
Hint 4: Tools/Debugging Run a dry-run mode that prints intended actions without executing.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Automation pipelines | “Continuous Delivery” | Ch. 1-2 |
| Reliability | “Release It!” | Ch. 1 |
| Shell discipline | “Effective Shell” by Dave Kerr | Ch. 3 |
Common Pitfalls & Debugging
Problem 1: “Batch runs produce inconsistent output”
- Why: Output capture not standardized
- Fix: Define a single output envelope schema
Problem 2: “Retries cause side effects”
- Why: Tasks are not idempotent
- Fix: Add preflight checks and safe guards
Project 10: Interactive Session Recorder
- File: P10-interactive-session-recorder.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 2
- Business Potential: 3
- Difficulty: 2
- Knowledge Area: Observability
- Software or Tool: CLI session logs
- Main Book: “Release It!”
What you’ll build: A system that records interactive sessions and replays them as headless scripts.
Why it teaches interoperability: It bridges human exploration with automated repeatability.
Core challenges you’ll face:
- Command capture -> maps to event logging
- State reconstruction -> maps to reproducibility
- Replay safety -> maps to approval policies
Real World Outcome
A session log format and a replay plan that turns interactive work into automation.
What you will see:
- Session log: ordered steps with timestamps
- Replay plan: steps converted to batch tasks
- Safety notes: checkpoints for risky actions
The Core Question You’re Answering
“How do you convert human-driven sessions into automated runs?”
Concepts You Must Understand First
- Event logging
- What should be captured and why?
- Determinism
- What changes between interactive and headless runs?
- Safety checkpoints
- Which steps require human approval?
Questions to Guide Your Design
- Log schema
- What fields are required for replay?
- Replay validation
- How do you detect drift?
Thinking Exercise
Replay Checklist
List five interactive actions and decide if each can be safely replayed.
The Interview Questions They’ll Ask
- “What is the value of session recording?”
- “How do you handle nondeterminism?”
- “What is the risk of replaying interactive actions?”
- “How do you store session logs safely?”
- “How do you verify a replay succeeded?”
Hints in Layers
Hint 1: Starting Point Start with a simple timestamped log of actions.
Hint 2: Next Level Define a replayable action schema.
Hint 3: Technical Details Add environment snapshot fields to reduce drift.
Hint 4: Tools/Debugging Compare outputs between interactive and replay runs.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Reliability patterns | “Release It!” | Ch. 2 |
| Automation discipline | “Continuous Delivery” | Ch. 3 |
| Logging practices | “Designing Data-Intensive Applications” | Ch. 11 |
Common Pitfalls & Debugging
Problem 1: “Replays fail because of drift”
- Why: Environment not captured
- Fix: Log environment metadata with each session
Problem 2: “Session logs are incomplete”
- Why: Missing step types or artifacts
- Fix: Add required fields to the log schema
Project 11: Approval Policy Simulator
- File: P11-approval-policy-simulator.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Safety, Governance
- Software or Tool: Approval policies
- Main Book: “Release It!”
What you’ll build: A simulator that shows when approvals are required across agents.
Why it teaches interoperability: A unified automation system must follow the strictest approval policy.
Core challenges you’ll face:
- Policy mapping -> maps to rules engines
- Action classification -> maps to risk assessment
- Human checkpoints -> maps to governance
Real World Outcome
A policy matrix showing which actions need approval per agent and a common enforcement rule.
What you will see:
- Policy table: action -> approval required
- Risk tiers: low, medium, high
- Unified rule: safest default policy
The Core Question You’re Answering
“How do you enforce safety across agents with different rules?”
Concepts You Must Understand First
- Risk classification
- How do you rate actions by potential impact?
- Policy precedence
- Which rule wins when policies conflict?
- Human approval flow
- How do you record approvals?
Questions to Guide Your Design
- Policy schema
- What fields define an approval rule?
- Enforcement strategy
- Where do you enforce the strictest policy?
Thinking Exercise
Approval Scenarios
List five actions and assign a required approval level for each.
The Interview Questions They’ll Ask
- “Why do you need approval policies for agents?”
- “How do you handle conflicting safety rules?”
- “What is the strictest-policy principle?”
- “How do you audit approvals?”
- “How do you avoid blocking safe tasks?”
Hints in Layers
Hint 1: Starting Point Collect policy descriptions from each agent.
Hint 2: Next Level Normalize them into a single schema.
Hint 3: Technical Details Define a resolution rule that always selects the strictest policy.
Hint 4: Tools/Debugging Test the rule with sample actions and verify expected approvals.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Risk awareness | “Release It!” | Ch. 3 |
| Governance | “Clean Architecture” | Ch. 12 |
| Delivery controls | “Continuous Delivery” | Ch. 9 |
Common Pitfalls & Debugging
Problem 1: “Policies are inconsistent”
- Why: Different agents use different categories
- Fix: Normalize into a unified risk taxonomy
Problem 2: “Too many approvals”
- Why: Overly strict defaults
- Fix: Add exceptions with clear justification
Project 12: Sandbox Matrix Auditor
- File: P12-sandbox-matrix-auditor.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 2
- Business Potential: 3
- Difficulty: 2
- Knowledge Area: Security, Sandboxing
- Software or Tool: Sandboxing models
- Main Book: “Security in Computing”
What you’ll build: A matrix that compares sandbox boundaries across agents.
Why it teaches interoperability: You need consistent safety boundaries when chaining tools.
Core challenges you’ll face:
- Boundary mapping -> maps to security analysis
- Capability restrictions -> maps to policy enforcement
- Escalation paths -> maps to approval flows
Real World Outcome
A sandbox comparison chart with action categories and allowed scopes.
What you will see:
- Sandbox table: file, network, process access
- Escalation rules: when approvals are required
- Unified baseline: least-privilege defaults
The Core Question You’re Answering
“What is the safest common sandbox that still enables automation?”
Concepts You Must Understand First
- Least privilege
- What is the minimal access needed?
- Escalation
- When can permissions be raised safely?
- Auditability
- How do you log privileged actions?
Questions to Guide Your Design
- Baseline policy
- Which sandbox rules must always apply?
- Exceptions
- How do you document and approve exceptions?
Thinking Exercise
Sandbox Gap Analysis
List three tasks and identify which agent supports each safely.
The Interview Questions They’ll Ask
- “What is the least-privilege principle?”
- “How do sandbox rules differ across tools?”
- “When should you allow escalation?”
- “How do you audit privileged actions?”
- “What is the risk of permissive defaults?”
Hints in Layers
Hint 1: Starting Point List the default sandbox settings of each CLI.
Hint 2: Next Level Compare them against a strict baseline.
Hint 3: Technical Details Define an escalation process with approvals.
Hint 4: Tools/Debugging Try a restricted action in each CLI and record behavior.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Security fundamentals | “Security in Computing” | Ch. 2 |
| Operational safety | “Release It!” | Ch. 6 |
| Risk controls | “Clean Architecture” | Ch. 12 |
Common Pitfalls & Debugging
Problem 1: “Sandbox is too permissive”
- Why: Defaults prioritized convenience
- Fix: Adopt a strict baseline and add exceptions sparingly
Problem 2: “Automation fails due to restrictions”
- Why: Missing approvals or unclear escalation
- Fix: Document explicit escalation flows
Project 13: Output Style Normalizer
- File: P13-output-style-normalizer.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 3
- Business Potential: 3
- Difficulty: 2
- Knowledge Area: Output Contracts
- Software or Tool: Output formatting
- Main Book: “Refactoring”
What you’ll build: A normalization guide that enforces consistent output style across agents.
Why it teaches interoperability: Consistent outputs allow automation to parse responses reliably.
Core challenges you’ll face:
- Style extraction -> maps to prompt contracts
- Normalization rules -> maps to schema design
- Edge cases -> maps to validation
Real World Outcome
A standard output template with examples and validation checks.
What you will see:
- Required sections: consistent headings and fields
- Normalization rules: how to handle deviations
- Validation checklist: confirm outputs are parseable
The Core Question You’re Answering
“How do you make responses machine-friendly across different agents?”
Concepts You Must Understand First
- Schema validation
- What fields must always appear?
- Style constraints
- How do you keep responses concise and consistent?
- Parsing reliability
- What breaks downstream automation?
Questions to Guide Your Design
- Template design
- What is the minimal parseable structure?
- Error handling
- What do you do when the template is violated?
Thinking Exercise
Output Checklist
Create a checklist of five validation rules for responses.
The Interview Questions They’ll Ask
- “Why does output normalization matter in automation?”
- “What makes a response machine-parseable?”
- “How do you handle agent deviations?”
- “What are the risks of flexible outputs?”
- “How do you detect schema violations?”
Hints in Layers
Hint 1: Starting Point Define the sections you always want in outputs.
Hint 2: Next Level Create a strict checklist with required keys.
Hint 3: Technical Details Document a fallback mapping for nonconforming outputs.
Hint 4: Tools/Debugging Test with two agents and compare output structures.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Refactoring structure | “Refactoring” by Martin Fowler | Ch. 1 |
| Contract enforcement | “Clean Architecture” | Ch. 9 |
| Data correctness | “Designing Data-Intensive Applications” | Ch. 2 |
Common Pitfalls & Debugging
Problem 1: “Outputs are verbose and inconsistent”
- Why: No defined template
- Fix: Add required sections and limit free-form text
Problem 2: “Parser breaks on edge cases”
- Why: Missing fields or unexpected ordering
- Fix: Add validation and reformat rules
Project 14: Multi-Agent Logging Standard
- File: P14-multi-agent-logging-standard.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Observability
- Software or Tool: Logging
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: A unified log event schema for all agent actions.
Why it teaches interoperability: Logs are the glue that lets you trace actions across systems.
Core challenges you’ll face:
- Event schema -> maps to data modeling
- Correlation IDs -> maps to traceability
- Privacy rules -> maps to governance
Real World Outcome
A log schema document and a sample log stream for multi-agent workflows.
What you will see:
- Event types: prompt, tool-call, output, error
- Correlation fields: run ID, task ID, agent ID
- Redaction rules: what to hide
The Core Question You’re Answering
“How do you trace a task across multiple agents?”
Concepts You Must Understand First
- Event schemas
- What minimal fields are required?
- Correlation identifiers
- How do you link events across agents?
- Redaction
- How do you protect sensitive data?
Questions to Guide Your Design
- Schema coverage
- Which actions must always be logged?
- Retention
- How long should logs be stored?
Thinking Exercise
Trace a Task
Write a list of events that should appear for a single task across two agents.
The Interview Questions They’ll Ask
- “What fields are critical in agent logs?”
- “How do you correlate events across tools?”
- “What is the risk of logging too much?”
- “How do you redact sensitive data?”
- “How do logs help debugging?”
Hints in Layers
Hint 1: Starting Point Define a minimal event schema with required fields.
Hint 2: Next Level Add correlation IDs and severity levels.
Hint 3: Technical Details Specify redaction rules for prompts and outputs.
Hint 4: Tools/Debugging Replay a log stream and verify task reconstruction.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Event modeling | “Designing Data-Intensive Applications” | Ch. 11 |
| Reliability | “Release It!” | Ch. 8 |
| Governance | “Clean Architecture” | Ch. 12 |
Common Pitfalls & Debugging
Problem 1: “Logs are not correlated”
- Why: Missing consistent IDs
- Fix: Add required correlation fields
Problem 2: “Sensitive data leaked”
- Why: No redaction policy
- Fix: Add redaction rules and auditing
Project 15: Error Taxonomy and Retry Controller
- File: P15-error-taxonomy-and-retry-controller.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Reliability
- Software or Tool: Retry policies
- Main Book: “Release It!”
What you’ll build: A taxonomy of errors and a retry policy matrix for agents.
Why it teaches interoperability: Errors vary across CLIs and must be normalized for automation.
Core challenges you’ll face:
- Error classification -> maps to reliability
- Retry strategy -> maps to resilience design
- Escalation rules -> maps to governance
Real World Outcome
A standardized error catalog and guidance on retries vs hard stops.
What you will see:
- Error categories: transient, permanent, policy
- Retry policy: allowed retries per category
- Escalation rules: when to notify humans
The Core Question You’re Answering
“Which failures should be retried and which should stop automation?”
Concepts You Must Understand First
- Failure modes
- How do you tell transient from permanent errors?
- Backoff strategies
- How do you avoid retry storms?
- Escalation
- When does a human need to intervene?
Questions to Guide Your Design
- Error taxonomy
- What are the key error categories across tools?
- Policy mapping
- How do you map tool-specific errors to categories?
Thinking Exercise
Error Mapping
Write three example errors and map them to categories.
The Interview Questions They’ll Ask
- “Why do you need an error taxonomy?”
- “How do you choose retry policies?”
- “What is the risk of retrying everything?”
- “How do you detect permanent failures?”
- “How do you log error decisions?”
Hints in Layers
Hint 1: Starting Point Collect common error messages across agents.
Hint 2: Next Level Group them into a small set of categories.
Hint 3: Technical Details Define retry limits and escalation triggers.
Hint 4: Tools/Debugging Simulate a transient failure and confirm retry logic.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Failure handling | “Release It!” | Ch. 4 |
| Resilient systems | “Designing Data-Intensive Applications” | Ch. 8 |
| Operational discipline | “Continuous Delivery” | Ch. 8 |
Common Pitfalls & Debugging
Problem 1: “Retries cause duplicate effects”
- Why: No idempotency checks
- Fix: Add idempotency tokens or preflight checks
Problem 2: “Errors are misclassified”
- Why: Overly broad categories
- Fix: Refine categories and add examples
Project 16: Context Budget Planner
- File: P16-context-budget-planner.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Context Management
- Software or Tool: Memory and context
- Main Book: “AI Engineering”
What you’ll build: A planner that budgets context usage across agents and tasks.
Why it teaches interoperability: Different agents have different context limits and memory approaches.
Core challenges you’ll face:
- Context sizing -> maps to token budgeting
- Summarization rules -> maps to compression
- Priority signals -> maps to task design
Real World Outcome
A context budget worksheet that shows how to allocate context to tasks and agents.
What you will see:
- Budget table: task -> context allocation
- Summarization rules: what to shrink
- Overflow plan: fallback when context is exceeded
The Core Question You’re Answering
“What context is essential, and what can be summarized or externalized?”
Concepts You Must Understand First
- Context windows
- How do different agents limit input size?
- Summarization tradeoffs
- What information is safe to compress?
- External memory
- When should you store context outside the prompt?
Questions to Guide Your Design
- Budget strategy
- How do you allocate context across tasks?
- Overflow plan
- What happens when context is too large?
Thinking Exercise
Context Triage
List ten pieces of context and rank them by importance.
The Interview Questions They’ll Ask
- “What is a context budget and why does it matter?”
- “How do you summarize safely?”
- “What is the risk of context overflow?”
- “How do you decide what to keep?”
- “How does external memory help?”
Hints in Layers
Hint 1: Starting Point Measure how much context each agent supports.
Hint 2: Next Level Create a priority list of context elements.
Hint 3: Technical Details Define rules for summarization and external storage.
Hint 4: Tools/Debugging Test a large task and see what context must be trimmed.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| AI systems | “AI Engineering” by Chip Huyen | Ch. 2 |
| Data summarization | “Designing Data-Intensive Applications” | Ch. 11 |
| Communication clarity | “The Pragmatic Programmer” | Ch. 8 |
Common Pitfalls & Debugging
Problem 1: “Critical context is lost”
- Why: Poor prioritization
- Fix: Rank context by impact on task correctness
Problem 2: “Summaries are too vague”
- Why: No clear summarization rules
- Fix: Define summary templates
Project 17: Memory Import and Export Bridge
- File: P17-memory-import-and-export-bridge.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Memory Management
- Software or Tool: Local memory systems
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: A portability guide for moving memory across agent ecosystems.
Why it teaches interoperability: Shared memory and persistence are crucial for long-running automation.
Core challenges you’ll face:
- Data model alignment -> maps to schema conversion
- Privacy handling -> maps to governance
- Conflict resolution -> maps to versioning
Real World Outcome
A memory export format with a mapping to each agent’s storage scheme.
What you will see:
- Memory schema: shared fields and optional fields
- Import rules: how to transform data
- Conflict policy: how to merge duplicates
The Core Question You’re Answering
“How do you move durable knowledge between agents without losing meaning?”
Concepts You Must Understand First
- Memory schemas
- What fields are common across systems?
- Privacy and retention
- What should be excluded or anonymized?
- Versioning
- How do you handle changes to memory format?
Questions to Guide Your Design
- Export format
- What is the minimal portable format?
- Merge strategy
- How do you avoid duplicating memories?
Thinking Exercise
Memory Record Example
Write a hypothetical memory record and map it to two tools.
The Interview Questions They’ll Ask
- “What is the difference between context and memory?”
- “Why is memory portability hard?”
- “How do you handle conflicting memories?”
- “How do you ensure privacy in memory export?”
- “What versioning strategy would you use?”
Hints in Layers
Hint 1: Starting Point List the memory storage locations for each CLI.
Hint 2: Next Level Create a portable schema with required fields.
Hint 3: Technical Details Define merge and conflict resolution rules.
Hint 4: Tools/Debugging Test import with a small memory sample.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Data migration | “Designing Data-Intensive Applications” | Ch. 3 |
| Privacy | “Security in Computing” | Ch. 5 |
| Data modeling | “Clean Architecture” | Ch. 10 |
Common Pitfalls & Debugging
Problem 1: “Memory records lose meaning”
- Why: Missing metadata fields
- Fix: Add context fields for origin and purpose
Problem 2: “Conflicting records overwrite each other”
- Why: No merge policy
- Fix: Add conflict resolution rules
Project 18: Cross-Agent Workspace Sync
- File: P18-cross-agent-workspace-sync.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 3
- Business Potential: 3
- Difficulty: 3
- Knowledge Area: File System, Sync
- Software or Tool: Workspace synchronization
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: A synchronization plan for sharing workspace state across agents.
Why it teaches interoperability: Agents must see the same file system state to collaborate.
Core challenges you’ll face:
- State drift -> maps to consistency models
- Conflict resolution -> maps to merge strategy
- Change detection -> maps to event logging
Real World Outcome
A documented sync strategy and a conflict resolution playbook.
What you will see:
- Sync model: push, pull, or bidirectional
- Conflict rules: which changes win
- Snapshot plan: how to capture state
The Core Question You’re Answering
“How do agents stay consistent in the same workspace?”
Concepts You Must Understand First
- Consistency models
- What is eventual vs strong consistency?
- Conflict resolution
- How do you decide which edit wins?
- Change detection
- How do you detect drift early?
Questions to Guide Your Design
- Sync scope
- Which files are shared vs ignored?
- Conflict strategy
- When do you require human review?
Thinking Exercise
Conflict Scenario
Imagine two agents edit the same file differently. Define your resolution rule.
The Interview Questions They’ll Ask
- “What is workspace drift?”
- “How do you handle concurrent edits?”
- “Why does consistency matter for agents?”
- “What is your conflict resolution strategy?”
- “How do you prevent hidden changes?”
Hints in Layers
Hint 1: Starting Point Define which directories are shared and which are agent-specific.
Hint 2: Next Level Choose a consistency model and document it.
Hint 3: Technical Details Define a merge policy for conflicts.
Hint 4: Tools/Debugging Simulate a conflict and verify resolution rules.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Consistency models | “Designing Data-Intensive Applications” | Ch. 5 |
| Version control | “Working Effectively with Legacy Code” | Ch. 2 |
| System boundaries | “Clean Architecture” | Ch. 9 |
Common Pitfalls & Debugging
Problem 1: “Agents overwrite each other”
- Why: No conflict policy
- Fix: Define merge rules and require human review for conflicts
Problem 2: “Sync misses important files”
- Why: Poor include/exclude rules
- Fix: Document explicit inclusion lists
Project 19: Secrets Broker Shim
- File: P19-secrets-broker-shim.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Security
- Software or Tool: Credential handling
- Main Book: “Security in Computing”
What you’ll build: A plan for handling secrets across multiple agent CLIs.
Why it teaches interoperability: Secrets must be shared safely and consistently.
Core challenges you’ll face:
- Secret storage -> maps to security policy
- Injection methods -> maps to tool configuration
- Audit trails -> maps to governance
Real World Outcome
A secrets broker strategy with allowed storage, access methods, and audit logging.
What you will see:
- Secret inventory: what secrets are needed
- Access rules: who can access them
- Audit plan: how access is recorded
The Core Question You’re Answering
“How do you share secrets safely across different agent tools?”
Concepts You Must Understand First
- Secret lifecycle
- How are secrets created, rotated, revoked?
- Least privilege
- What is the minimal scope for each secret?
- Audit logging
- How do you track secret access?
Questions to Guide Your Design
- Storage rules
- Where are secrets stored and who owns them?
- Injection paths
- How do you pass secrets into agent runs?
Thinking Exercise
Secret Inventory
List five secrets used in a typical automation pipeline and decide their scope.
The Interview Questions They’ll Ask
- “How do you manage secrets across tools?”
- “What is least privilege in the context of agents?”
- “How do you prevent secret leakage?”
- “What is the role of audit logging?”
- “How do you rotate secrets safely?”
Hints in Layers
Hint 1: Starting Point Identify where each CLI expects credentials.
Hint 2: Next Level Define a broker that injects secrets at runtime only.
Hint 3: Technical Details Specify rotation and revocation rules.
Hint 4: Tools/Debugging Audit a run and confirm no secrets are logged.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Security basics | “Security in Computing” | Ch. 4 |
| Operational controls | “Release It!” | Ch. 7 |
| Governance | “Clean Architecture” | Ch. 12 |
Common Pitfalls & Debugging
Problem 1: “Secrets leak into logs”
- Why: No redaction policy
- Fix: Add redaction and scanning
Problem 2: “Agents can’t access secrets”
- Why: Misconfigured injection paths
- Fix: Document and test access paths
Project 20: Test Harness for Agents
- File: P20-test-harness-for-agents.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 4
- Knowledge Area: Testing, Evaluation
- Software or Tool: Testing harness
- Main Book: “Clean Architecture”
What you’ll build: A standardized test harness to evaluate agent outputs.
Why it teaches interoperability: You need consistent benchmarks to compare agents.
Core challenges you’ll face:
- Test case design -> maps to evaluation quality
- Output verification -> maps to contract checks
- Regression tracking -> maps to version control
Real World Outcome
A suite of test cases with expected outputs and evaluation criteria.
What you will see:
- Test catalog: tasks grouped by complexity
- Evaluation rules: what counts as success
- Regression logs: differences across versions
The Core Question You’re Answering
“How do you measure agent performance consistently?”
Concepts You Must Understand First
- Test design
- How do you choose representative tasks?
- Evaluation criteria
- What metrics matter for correctness and quality?
- Regression tracking
- How do you detect changes over time?
Questions to Guide Your Design
- Test scope
- What tasks define baseline interoperability?
- Metrics
- How do you score and compare results?
Thinking Exercise
Test Matrix
List five tasks and define success criteria for each.
The Interview Questions They’ll Ask
- “What is a good agent benchmark?”
- “How do you measure correctness?”
- “What is regression testing for agents?”
- “How do you avoid biased test cases?”
- “How do you score outputs?”
Hints in Layers
Hint 1: Starting Point Define a small set of tasks that all agents can do.
Hint 2: Next Level Create a rubric for success and failure.
Hint 3: Technical Details Add a versioned results store for comparisons.
Hint 4: Tools/Debugging Run the same tests on two agents and compare results.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Testing philosophy | “Clean Architecture” | Ch. 20 |
| Metrics | “Accelerate” | Ch. 3 |
| Continuous testing | “Continuous Delivery” | Ch. 7 |
Common Pitfalls & Debugging
Problem 1: “Tests are not representative”
- Why: Narrow task set
- Fix: Expand to include different task types
Problem 2: “Scores are inconsistent”
- Why: Ambiguous criteria
- Fix: Define explicit pass/fail rules
Project 21: Prompt Injection Red Team Lab
- File: P21-prompt-injection-red-team-lab.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 4
- Business Potential: 3
- Difficulty: 4
- Knowledge Area: Security
- Software or Tool: Prompt security
- Main Book: “Security in Computing”
What you’ll build: A lab of adversarial prompt cases for multiple agents.
Why it teaches interoperability: Security is only as strong as the weakest agent in the chain.
Core challenges you’ll face:
- Threat modeling -> maps to security basics
- Test case generation -> maps to evaluation
- Mitigation rules -> maps to governance
Real World Outcome
A red team checklist and a set of adversarial prompts with mitigation notes.
What you will see:
- Attack categories: injection, data exfiltration, policy bypass
- Test cases: prompts designed to stress safety
- Mitigations: filters and policy rules
The Core Question You’re Answering
“How do you test and harden multi-agent systems against prompt attacks?”
Concepts You Must Understand First
- Threat modeling
- What assets are at risk?
- Prompt injection
- How does instruction hierarchy get subverted?
- Mitigation strategies
- What defenses are realistic and effective?
Questions to Guide Your Design
- Attack coverage
- What attacks are most relevant to coding tasks?
- Defense mapping
- Which mitigations apply to each agent?
Thinking Exercise
Attack Surface Map
List the parts of a task that could be manipulated by malicious input.
The Interview Questions They’ll Ask
- “What is prompt injection and why is it risky?”
- “How do you test for prompt injection?”
- “What is a realistic mitigation strategy?”
- “How do you enforce safe outputs?”
- “What is the weakest link in a multi-agent chain?”
Hints in Layers
Hint 1: Starting Point Collect known attack patterns from security notes.
Hint 2: Next Level Categorize them by impact and likelihood.
Hint 3: Technical Details Define a test plan with expected safe responses.
Hint 4: Tools/Debugging Run the same attack across multiple agents and compare results.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Security basics | “Security in Computing” | Ch. 1-3 |
| Risk assessment | “Release It!” | Ch. 6 |
| Governance | “Clean Architecture” | Ch. 12 |
Common Pitfalls & Debugging
Problem 1: “Test cases are too trivial”
- Why: Lack of realistic attack patterns
- Fix: Use layered attacks and indirect prompts
Problem 2: “Mitigations break normal tasks”
- Why: Overly strict rules
- Fix: Add exception handling with logging
Project 22: Multi-Agent Code Review Pipeline
- File: P22-multi-agent-code-review-pipeline.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Code Review
- Software or Tool: Review automation
- Main Book: “Clean Code”
What you’ll build: A workflow that routes code review tasks across multiple agents.
Why it teaches interoperability: Code review benefits from agent specialization and consistent standards.
Core challenges you’ll face:
- Review criteria -> maps to coding standards
- Conflict resolution -> maps to merging feedback
- Bias control -> maps to evaluation
Real World Outcome
A review pipeline where each agent checks different aspects and results are merged.
What you will see:
- Review lanes: style, correctness, security, performance
- Merge rules: combine feedback with deduping
- Final report: structured output for humans
The Core Question You’re Answering
“How do you combine multiple agent reviews into one consistent result?”
Concepts You Must Understand First
- Review criteria
- What issues should always be flagged?
- Feedback normalization
- How do you merge duplicates?
- Bias detection
- How do you reduce inconsistent feedback?
Questions to Guide Your Design
- Division of labor
- Which agent should check what?
- Merge policy
- How do you resolve conflicting feedback?
Thinking Exercise
Review Rubric
Draft a rubric with four categories and two checks each.
The Interview Questions They’ll Ask
- “Why use multiple agents for code review?”
- “How do you avoid duplicate feedback?”
- “What is a review rubric?”
- “How do you handle conflicting suggestions?”
- “How do you measure review quality?”
Hints in Layers
Hint 1: Starting Point Assign each agent a review specialty.
Hint 2: Next Level Define a merge policy with priority rules.
Hint 3: Technical Details Create a standard report format for final output.
Hint 4: Tools/Debugging Compare merged output to a human review.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Code quality | “Clean Code” by Robert C. Martin | Ch. 2 |
| Refactoring feedback | “Refactoring” | Ch. 2 |
| Reliability | “Release It!” | Ch. 4 |
Common Pitfalls & Debugging
Problem 1: “Review feedback conflicts”
- Why: Overlapping responsibilities
- Fix: Define clear role boundaries
Problem 2: “Too much noise”
- Why: No severity thresholds
- Fix: Add severity levels and filtering
Project 23: Issue Triage Mesh
- File: P23-issue-triage-mesh.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Workflow Automation
- Software or Tool: Issue trackers
- Main Book: “The Pragmatic Programmer”
What you’ll build: A triage workflow that assigns issues to agent specialties.
Why it teaches interoperability: It forces you to route real work across agent capabilities.
Core challenges you’ll face:
- Issue classification -> maps to taxonomy design
- Routing logic -> maps to orchestration
- Feedback loop -> maps to continuous improvement
Real World Outcome
A triage flow that labels, prioritizes, and assigns issues to appropriate agents.
What you will see:
- Issue categories: bug, feature, refactor, docs
- Routing rules: category -> agent
- Metrics: time to resolution
The Core Question You’re Answering
“Which agent should handle each kind of issue?”
Concepts You Must Understand First
- Issue taxonomy
- What categories exist in your project?
- Routing metrics
- What signals indicate best agent fit?
- Feedback loops
- How do you improve routing over time?
Questions to Guide Your Design
- Triage rules
- Which fields trigger routing decisions?
- Assignment policy
- How do you avoid overload on one agent?
Thinking Exercise
Issue Mapping
Take five real issues and map them to agents with reasons.
The Interview Questions They’ll Ask
- “Why automate issue triage?”
- “How do you categorize issues?”
- “What data do you need for routing?”
- “How do you measure triage quality?”
- “How do you handle ambiguous issues?”
Hints in Layers
Hint 1: Starting Point Define a small set of categories and map them to agents.
Hint 2: Next Level Add priority levels and escalation rules.
Hint 3: Technical Details Record assignment outcomes to refine routing.
Hint 4: Tools/Debugging Compare automated triage to human decisions.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Workflow design | “The Pragmatic Programmer” | Ch. 7 |
| Process improvement | “Accelerate” | Ch. 5 |
| Architecture | “Fundamentals of Software Architecture” | Ch. 3 |
Common Pitfalls & Debugging
Problem 1: “Issues routed to wrong agent”
- Why: Category mapping too coarse
- Fix: Add more granular categories
Problem 2: “Agents overloaded”
- Why: No balancing strategy
- Fix: Add capacity limits and queues
Project 24: Documentation Generator Federation
- File: P24-documentation-generator-federation.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python
- Coolness Level: 3
- Business Potential: 3
- Difficulty: 3
- Knowledge Area: Documentation Automation
- Software or Tool: Docs generation
- Main Book: “Clean Architecture”
What you’ll build: A system that uses multiple agents to generate documentation consistently.
Why it teaches interoperability: Documentation tasks require consistency across agents and outputs.
Core challenges you’ll face:
- Style consistency -> maps to output normalization
- Source alignment -> maps to context management
- Review process -> maps to governance
Real World Outcome
A documentation pipeline where each agent drafts a section and results are merged.
What you will see:
- Doc outline: sections assigned per agent
- Style guide: unified tone and formatting rules
- Merge report: consolidated output
The Core Question You’re Answering
“How do you ensure documentation is consistent across agents?”
Concepts You Must Understand First
- Style guides
- What rules define consistency?
- Source-of-truth
- How do you ensure agents reference the same facts?
- Review workflow
- Who approves the final output?
Questions to Guide Your Design
- Outline ownership
- Which agent writes which section?
- Merge rules
- How do you handle overlaps?
Thinking Exercise
Documentation Contract
Draft a short style guide with tone, structure, and required sections.
The Interview Questions They’ll Ask
- “Why use multiple agents for docs?”
- “How do you enforce a consistent style?”
- “How do you prevent factual drift?”
- “How do you merge sections safely?”
- “How do you validate documentation quality?”
Hints in Layers
Hint 1: Starting Point Create a clear outline with section owners.
Hint 2: Next Level Define a style guide with required elements.
Hint 3: Technical Details Use a merge checklist that rejects inconsistent sections.
Hint 4: Tools/Debugging Compare outputs for tone and structure alignment.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Consistency | “Clean Architecture” | Ch. 5 |
| Editing discipline | “The Pragmatic Programmer” | Ch. 8 |
| Reliability | “Release It!” | Ch. 7 |
Common Pitfalls & Debugging
Problem 1: “Docs have inconsistent tone”
- Why: No shared style guide
- Fix: Add a mandatory style contract
Problem 2: “Docs contradict each other”
- Why: Different sources used
- Fix: Define a single source-of-truth
Project 25: Repo Indexing Strategy
- File: P25-repo-indexing-strategy.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 4
- Knowledge Area: Code Intelligence
- Software or Tool: Code indexing
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: An indexing plan that helps multiple agents navigate large repositories.
Why it teaches interoperability: Shared indexing prevents redundant scanning and reduces context waste.
Core challenges you’ll face:
- Index design -> maps to search systems
- Update strategy -> maps to consistency
- Access controls -> maps to security
Real World Outcome
A repository indexing plan with update cadence and access rules.
What you will see:
- Index schema: files, symbols, dependencies
- Update policy: incremental vs full rebuild
- Access rules: who can query what
The Core Question You’re Answering
“How do agents share repository knowledge efficiently?”
Concepts You Must Understand First
- Indexing basics
- What data should be indexed?
- Incremental updates
- How do you update indexes after changes?
- Access control
- How do you limit sensitive data exposure?
Questions to Guide Your Design
- Index scope
- Which files are worth indexing?
- Query interface
- How do agents access the index?
Thinking Exercise
Index Scope
List the top five file types you want indexed and why.
The Interview Questions They’ll Ask
- “Why is indexing important for large repos?”
- “How do you keep indexes up to date?”
- “What is the tradeoff between full and incremental indexing?”
- “How do you protect sensitive files?”
- “How does indexing reduce context usage?”
Hints in Layers
Hint 1: Starting Point Start with file lists and dependency graphs.
Hint 2: Next Level Define incremental update triggers.
Hint 3: Technical Details Add access controls and query limits.
Hint 4: Tools/Debugging Test index queries against a known file change.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Index design | “Designing Data-Intensive Applications” | Ch. 3 |
| Search data | “Algorithms” by Sedgewick | Ch. 5 |
| Security | “Security in Computing” | Ch. 6 |
Common Pitfalls & Debugging
Problem 1: “Index is stale”
- Why: Updates are manual
- Fix: Add automatic update triggers
Problem 2: “Index leaks sensitive data”
- Why: No access control
- Fix: Restrict indexed fields and queries
Project 26: Skill and Prompt Pack Manager
- File: P26-skill-and-prompt-pack-manager.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Prompt Management
- Software or Tool: Skills and prompt packs
- Main Book: “The Pragmatic Programmer”
What you’ll build: A system for packaging prompts and skills for reuse across agents.
Why it teaches interoperability: Portable skill packs reduce duplication and standardize workflows.
Core challenges you’ll face:
- Packaging format -> maps to portability
- Versioning -> maps to change control
- Distribution -> maps to governance
Real World Outcome
A structured prompt pack format with versioning and compatibility notes.
What you will see:
- Pack schema: metadata, prompts, instructions
- Version rules: semantic versioning guidelines
- Compatibility map: which agents support which packs
The Core Question You’re Answering
“How do you reuse prompts and skills across different CLIs?”
Concepts You Must Understand First
- Prompt modularity
- How do you make prompts composable?
- Versioning
- How do you evolve packs safely?
- Distribution
- How do teams share and trust prompt packs?
Questions to Guide Your Design
- Pack structure
- What metadata is required?
- Compatibility rules
- How do you mark agent-specific constraints?
Thinking Exercise
Pack Outline
Outline a prompt pack for code review that includes metadata and rules.
The Interview Questions They’ll Ask
- “What is a prompt pack?”
- “How do you make prompts portable?”
- “What is semantic versioning used for?”
- “How do you distribute prompt packs safely?”
- “How do you manage compatibility?”
Hints in Layers
Hint 1: Starting Point Define a pack with name, version, and prompt list.
Hint 2: Next Level Add compatibility notes for each agent.
Hint 3: Technical Details Create a validation checklist for pack structure.
Hint 4: Tools/Debugging Try importing the pack into two different CLIs.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Reusability | “The Pragmatic Programmer” | Ch. 4 |
| Interfaces | “Clean Architecture” | Ch. 8 |
| Release discipline | “Continuous Delivery” | Ch. 5 |
Common Pitfalls & Debugging
Problem 1: “Packs work only for one agent”
- Why: Hidden assumptions in prompts
- Fix: Add explicit compatibility notes
Problem 2: “Pack versions drift”
- Why: No version policy
- Fix: Use semantic versioning and changelogs
Project 27: Cross-CLI Command Adapter
- File: P27-cross-cli-command-adapter.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 4
- Knowledge Area: CLI Interop
- Software or Tool: CLI adapters
- Main Book: “Design Patterns”
What you’ll build: An adapter spec that maps equivalent commands across CLIs.
Why it teaches interoperability: Command mapping is the core of automation portability.
Core challenges you’ll face:
- Command translation -> maps to adapter design
- Argument normalization -> maps to schema design
- Error mapping -> maps to reliability
Real World Outcome
A command adapter table with canonical commands and tool-specific mappings.
What you will see:
- Command glossary: canonical actions
- Mapping table: CLI-specific flags
- Fallback rules: what to do when no match exists
The Core Question You’re Answering
“How can one automation script run on multiple agent CLIs?”
Concepts You Must Understand First
- Adapter pattern
- How do you translate one interface into another?
- Argument normalization
- How do you standardize flag meanings?
- Error mapping
- How do you unify error responses?
Questions to Guide Your Design
- Canonical commands
- What is the minimal command set?
- Fallbacks
- What happens when a CLI lacks a command?
Thinking Exercise
Command Translation
Pick two commands from two CLIs and map them to a canonical action.
The Interview Questions They’ll Ask
- “What is an adapter and why is it used?”
- “How do you normalize CLI arguments?”
- “What is a canonical command set?”
- “How do you handle missing features?”
- “How do you verify adapter correctness?”
Hints in Layers
Hint 1: Starting Point List core commands used in daily workflows.
Hint 2: Next Level Define canonical names for each action.
Hint 3: Technical Details Map each CLI command to canonical actions with notes.
Hint 4: Tools/Debugging Test translation by comparing outputs across agents.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Adapter pattern | “Design Patterns” | Ch. 4 |
| Interface design | “Clean Architecture” | Ch. 8 |
| Reliability | “Release It!” | Ch. 5 |
Common Pitfalls & Debugging
Problem 1: “Command mappings are incomplete”
- Why: Hidden CLI features
- Fix: Expand mappings as new features emerge
Problem 2: “Adapters mask errors”
- Why: Errors normalized too aggressively
- Fix: Preserve raw error details for debugging
Project 28: Event-Driven Agent Bus
- File: P28-event-driven-agent-bus.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, JavaScript
- Coolness Level: 5
- Business Potential: 5
- Difficulty: 5
- Knowledge Area: Distributed Systems
- Software or Tool: Event bus
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: An event bus design that routes tasks between agents asynchronously.
Why it teaches interoperability: Event-driven systems decouple agents and enable scale.
Core challenges you’ll face:
- Event schema -> maps to data contracts
- Ordering guarantees -> maps to reliability
- Backpressure -> maps to system stability
Real World Outcome
A blueprint for an event bus that orchestrates multiple agents.
What you will see:
- Event types: task.request, task.result, task.error
- Routing rules: topic-based or queue-based
- Backpressure policy: how to slow down agents
The Core Question You’re Answering
“How do you decouple agents so they can scale independently?”
Concepts You Must Understand First
- Event-driven architecture
- Why use events instead of direct calls?
- Ordering guarantees
- When do you need strict ordering?
- Backpressure
- How do you prevent overload?
Questions to Guide Your Design
- Event schema
- What fields are required for routing?
- Reliability
- How do you handle lost events?
Thinking Exercise
Event Flow Map
Draw a flow of events for a single task across three agents.
The Interview Questions They’ll Ask
- “What is an event-driven architecture?”
- “Why is backpressure important?”
- “How do you handle event ordering?”
- “How do you recover from dropped events?”
- “What is the benefit of decoupling agents?”
Hints in Layers
Hint 1: Starting Point Define the core event types and payloads.
Hint 2: Next Level Decide on routing strategy and queue semantics.
Hint 3: Technical Details Specify retry and dead-letter handling.
Hint 4: Tools/Debugging Simulate an overload and observe the backpressure plan.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Event systems | “Designing Data-Intensive Applications” | Ch. 11 |
| Reliability | “Release It!” | Ch. 7 |
| Architecture | “Building Microservices” by Sam Newman | Ch. 4 |
Common Pitfalls & Debugging
Problem 1: “Events pile up”
- Why: No backpressure strategy
- Fix: Implement throttling or queue limits
Problem 2: “Events lack context”
- Why: Payload too small
- Fix: Add required correlation fields
Project 29: Distributed Job Queue
- File: P29-distributed-job-queue.md
- Main Programming Language: Go
- Alternative Programming Languages: Python
- Coolness Level: 4
- Business Potential: 5
- Difficulty: 5
- Knowledge Area: Distributed Systems
- Software or Tool: Job queues
- Main Book: “Designing Data-Intensive Applications”
What you’ll build: A queue design for distributing agent tasks at scale.
Why it teaches interoperability: Queue systems allow different agent workers to collaborate efficiently.
Core challenges you’ll face:
- Task scheduling -> maps to fairness
- Retry semantics -> maps to reliability
- Worker registration -> maps to discovery
Real World Outcome
A queue architecture diagram and scheduling policy for agent tasks.
What you will see:
- Queue types: priority, FIFO, delayed
- Worker registry: agent capabilities
- Retry rules: per-task policies
The Core Question You’re Answering
“How do you distribute tasks across many agents efficiently?”
Concepts You Must Understand First
- Scheduling policies
- How do you choose which task to run next?
- Worker discovery
- How do agents register and advertise capabilities?
- Failure recovery
- How do you handle worker failure?
Questions to Guide Your Design
- Queue model
- What queue types do you need?
- Retry strategy
- How many retries and for which errors?
Thinking Exercise
Queue Priorities
Define three priority levels and which tasks go in each.
The Interview Questions They’ll Ask
- “What is a distributed job queue?”
- “How do you schedule tasks fairly?”
- “What happens when a worker dies?”
- “How do you avoid duplicate processing?”
- “How do you scale queue consumers?”
Hints in Layers
Hint 1: Starting Point Define a task format with required fields.
Hint 2: Next Level Design a worker registration and heartbeat mechanism.
Hint 3: Technical Details Add retry policies and dead-letter handling.
Hint 4: Tools/Debugging Simulate worker failure and verify requeue behavior.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Queues | “Designing Data-Intensive Applications” | Ch. 11 |
| Reliability | “Release It!” | Ch. 5 |
| Architecture | “Building Microservices” | Ch. 5 |
Common Pitfalls & Debugging
Problem 1: “Tasks get stuck”
- Why: Missing timeout or heartbeat
- Fix: Add lease timeouts
Problem 2: “Duplicate processing”
- Why: No idempotency
- Fix: Add unique task IDs and dedupe logic
Project 30: Cost and Latency Budget Enforcer
- File: P30-cost-and-latency-budget-enforcer.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 4
- Business Potential: 5
- Difficulty: 4
- Knowledge Area: Operations
- Software or Tool: Budgeting
- Main Book: “Accelerate”
What you’ll build: A policy that caps cost and latency across agent runs.
Why it teaches interoperability: Cost control is essential when multiple agents run at scale.
Core challenges you’ll face:
- Budget modeling -> maps to metrics
- Policy enforcement -> maps to governance
- Fallback behavior -> maps to resilience
Real World Outcome
A budget policy document with thresholds and fallback actions.
What you will see:
- Cost caps: per task and per day
- Latency targets: max allowed delays
- Fallbacks: lower-cost agent choices
The Core Question You’re Answering
“How do you keep multi-agent automation within cost and time limits?”
Concepts You Must Understand First
- Cost modeling
- How do you estimate cost per task?
- Latency budgets
- What is acceptable delay for each task type?
- Fallback strategy
- What happens when budgets are exceeded?
Questions to Guide Your Design
- Budget scope
- Is it per task, per user, or per day?
- Policy enforcement
- Where do you enforce budgets in the pipeline?
Thinking Exercise
Budget Allocation
Allocate a daily budget across three task categories.
The Interview Questions They’ll Ask
- “Why do you need budgets for agents?”
- “How do you estimate costs?”
- “How do you enforce latency limits?”
- “What fallback options are reasonable?”
- “How do you monitor budget usage?”
Hints in Layers
Hint 1: Starting Point Define baseline costs for each agent and task type.
Hint 2: Next Level Set thresholds and document fallback options.
Hint 3: Technical Details Add alerts when budgets approach limits.
Hint 4: Tools/Debugging Simulate a budget breach and record the response.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Metrics | “Accelerate” | Ch. 3 |
| Governance | “Clean Architecture” | Ch. 12 |
| Reliability | “Release It!” | Ch. 8 |
Common Pitfalls & Debugging
Problem 1: “Budgets are unrealistic”
- Why: No historical data
- Fix: Start with conservative limits and adjust
Problem 2: “Fallbacks are unclear”
- Why: No documented downgrade plan
- Fix: Define explicit fallback paths
Project 31: Human-in-the-Loop Gate
- File: P31-human-in-the-loop-gate.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Governance
- Software or Tool: Approval workflows
- Main Book: “Clean Architecture”
What you’ll build: A gate that pauses automation for human approval at critical steps.
Why it teaches interoperability: Safe automation depends on predictable human checkpoints.
Core challenges you’ll face:
- Checkpoint design -> maps to risk analysis
- Approval flow -> maps to workflow design
- Audit logging -> maps to compliance
Real World Outcome
A workflow diagram showing where human approval is required and how it is recorded.
What you will see:
- Checkpoints: defined high-risk steps
- Approval form: required decision fields
- Audit trail: who approved what
The Core Question You’re Answering
“Where must humans remain in the loop for safety?”
Concepts You Must Understand First
- Risk classification
- How do you identify high-risk steps?
- Approval flow
- How do you record and enforce approvals?
- Audit logging
- How do you prove compliance?
Questions to Guide Your Design
- Checkpoint placement
- Which steps require human review?
- Approval criteria
- What must reviewers check?
Thinking Exercise
Approval Map
Pick a workflow and mark the steps that require approval.
The Interview Questions They’ll Ask
- “Why is human-in-the-loop important?”
- “How do you decide where to put gates?”
- “How do you document approvals?”
- “What is the risk of too many gates?”
- “How do you ensure approvals are not bypassed?”
Hints in Layers
Hint 1: Starting Point List all steps with potential irreversible impact.
Hint 2: Next Level Define approval forms with required fields.
Hint 3: Technical Details Document enforcement rules and audit log schema.
Hint 4: Tools/Debugging Test a workflow with a denied approval and verify halt behavior.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Governance | “Clean Architecture” | Ch. 12 |
| Operational safety | “Release It!” | Ch. 7 |
| Process discipline | “Continuous Delivery” | Ch. 9 |
Common Pitfalls & Debugging
Problem 1: “Approvals are skipped”
- Why: No enforcement layer
- Fix: Add a mandatory gate in the pipeline
Problem 2: “Gates slow everything”
- Why: Too many checkpoints
- Fix: Limit gates to high-risk actions
Project 32: Semantic Diff and Patch Gate
- File: P32-semantic-diff-and-patch-gate.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 4
- Knowledge Area: Code Changes
- Software or Tool: Diff tools
- Main Book: “Refactoring”
What you’ll build: A gate that evaluates semantic diffs before changes are applied.
Why it teaches interoperability: Multi-agent edits require a safety gate to avoid conflicting patches.
Core challenges you’ll face:
- Semantic analysis -> maps to code understanding
- Patch validation -> maps to safety
- Conflict detection -> maps to consistency
Real World Outcome
A semantic diff checklist and a gate that blocks risky changes.
What you will see:
- Diff categories: refactor, behavior change, config
- Risk rules: which changes require approval
- Patch status: accepted, rejected, needs review
The Core Question You’re Answering
“How do you prevent unsafe changes across multiple agents?”
Concepts You Must Understand First
- Semantic diff
- How is it different from line diff?
- Risk scoring
- How do you classify risky changes?
- Conflict detection
- How do you detect overlapping edits?
Questions to Guide Your Design
- Risk thresholds
- What changes require human review?
- Patch sequencing
- How do you order patches from multiple agents?
Thinking Exercise
Diff Categories
List five diff types and classify them by risk.
The Interview Questions They’ll Ask
- “What is a semantic diff?”
- “How do you score patch risk?”
- “How do you handle conflicting patches?”
- “Why is semantic diff important for agents?”
- “What is the role of human review?”
Hints in Layers
Hint 1: Starting Point Define categories for change types.
Hint 2: Next Level Map categories to approval requirements.
Hint 3: Technical Details Document a patch ordering strategy and merge policy.
Hint 4: Tools/Debugging Compare a benign refactor to a breaking change.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Refactoring | “Refactoring” | Ch. 3 |
| Architecture governance | “Clean Architecture” | Ch. 12 |
| Reliability | “Release It!” | Ch. 5 |
Common Pitfalls & Debugging
Problem 1: “Gate blocks too much”
- Why: Overly strict risk thresholds
- Fix: Refine categories and add exceptions
Problem 2: “Risky changes slip through”
- Why: Poor semantic analysis
- Fix: Improve diff classification rules
Project 33: Knowledge Base RAG Connector
- File: P33-knowledge-base-rag-connector.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 4
- Business Potential: 5
- Difficulty: 4
- Knowledge Area: RAG, Knowledge
- Software or Tool: Knowledge sources
- Main Book: “AI Engineering”
What you’ll build: A connector plan that lets multiple agents query a shared knowledge base.
Why it teaches interoperability: Shared knowledge reduces duplicated context and improves accuracy.
Core challenges you’ll face:
- Knowledge indexing -> maps to data modeling
- Access control -> maps to security
- Context injection -> maps to prompt design
Real World Outcome
A knowledge connector spec with query rules and access policies.
What you will see:
- Knowledge sources: docs, issues, runbooks
- Query schema: required fields for retrieval
- Access policy: who can read what
The Core Question You’re Answering
“How do multiple agents safely share a single knowledge base?”
Concepts You Must Understand First
- RAG basics
- How does retrieval augment prompts?
- Access control
- How do you restrict sensitive data?
- Context injection
- How do you present retrieved info to agents?
Questions to Guide Your Design
- Query rules
- What metadata is required for retrieval?
- Privacy rules
- What data is prohibited from retrieval?
Thinking Exercise
Knowledge Inventory
List five knowledge sources and categorize by sensitivity.
The Interview Questions They’ll Ask
- “What is RAG and why use it?”
- “How do you control access to shared knowledge?”
- “How do you prevent outdated information?”
- “How do you inject retrieved context safely?”
- “How do you measure retrieval quality?”
Hints in Layers
Hint 1: Starting Point List the knowledge sources and access rules.
Hint 2: Next Level Define a retrieval query schema with filters.
Hint 3: Technical Details Specify how retrieved content is summarized for agents.
Hint 4: Tools/Debugging Test retrieval for a known query and validate results.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| AI systems | “AI Engineering” | Ch. 4 |
| Data reliability | “Designing Data-Intensive Applications” | Ch. 9 |
| Security | “Security in Computing” | Ch. 6 |
Common Pitfalls & Debugging
Problem 1: “Agents get outdated info”
- Why: No update policy
- Fix: Add refresh schedules and versioning
Problem 2: “Sensitive data leaks”
- Why: Weak access controls
- Fix: Add strict filters and logging
Project 34: Model Failover Switch
- File: P34-model-failover-switch.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 4
- Business Potential: 5
- Difficulty: 4
- Knowledge Area: Reliability
- Software or Tool: Model selection
- Main Book: “Release It!”
What you’ll build: A policy for switching between models or agents when failures occur.
Why it teaches interoperability: Resilience requires fallback options across agent ecosystems.
Core challenges you’ll face:
- Failover criteria -> maps to reliability design
- Compatibility checks -> maps to output normalization
- State handoff -> maps to context management
Real World Outcome
A failover policy with triggers, fallback order, and recovery steps.
What you will see:
- Failover triggers: timeout, error, cost
- Fallback chain: primary -> secondary -> tertiary
- Recovery plan: when to return to primary
The Core Question You’re Answering
“When should automation switch to a different agent or model?”
Concepts You Must Understand First
- Failover triggers
- What signals indicate a failure?
- Compatibility
- How do you ensure output compatibility across agents?
- State transfer
- How do you pass context to the fallback agent?
Questions to Guide Your Design
- Trigger thresholds
- What counts as a failure vs slowdown?
- Fallback order
- Which agents should be used first?
Thinking Exercise
Failover Scenario
Define a scenario where the primary agent fails and how the system responds.
The Interview Questions They’ll Ask
- “What is failover and why does it matter?”
- “How do you choose fallback agents?”
- “How do you avoid inconsistent outputs?”
- “What is the cost of failover?”
- “How do you restore to primary?”
Hints in Layers
Hint 1: Starting Point Define the primary and secondary agents for each task.
Hint 2: Next Level Add clear failure triggers and cooldown periods.
Hint 3: Technical Details Document context handoff requirements.
Hint 4: Tools/Debugging Simulate a timeout and verify failover behavior.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Resilience | “Release It!” | Ch. 4 |
| Reliability | “Designing Data-Intensive Applications” | Ch. 8 |
| Architecture | “Fundamentals of Software Architecture” | Ch. 7 |
Common Pitfalls & Debugging
Problem 1: “Failover causes inconsistent outputs”
- Why: No normalized output contract
- Fix: Enforce a shared output schema
Problem 2: “Failover loops”
- Why: No cooldown strategy
- Fix: Add a cooldown window before retrying primary
Project 35: Compliance Audit Logger
- File: P35-compliance-audit-logger.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 3
- Business Potential: 5
- Difficulty: 4
- Knowledge Area: Compliance
- Software or Tool: Audit logging
- Main Book: “Clean Architecture”
What you’ll build: A compliance logging spec that records agent decisions and actions.
Why it teaches interoperability: Interop systems must be auditable for trust and governance.
Core challenges you’ll face:
- Audit schema -> maps to compliance
- Retention policy -> maps to governance
- Access control -> maps to security
Real World Outcome
A compliance log schema with retention and access rules.
What you will see:
- Audit events: approvals, tool calls, changes
- Retention plan: how long logs are stored
- Access controls: who can read logs
The Core Question You’re Answering
“How do you prove automation actions were safe and compliant?”
Concepts You Must Understand First
- Auditability
- What events must always be recorded?
- Retention
- How long should logs exist?
- Access control
- Who should be allowed to view logs?
Questions to Guide Your Design
- Audit scope
- What is the minimal event set?
- Retention rules
- What regulatory or policy requirements apply?
Thinking Exercise
Audit Checklist
List the top ten events you would want in a compliance review.
The Interview Questions They’ll Ask
- “Why do you need compliance logging for agents?”
- “What events must be audited?”
- “How do you ensure logs are tamper resistant?”
- “How do you handle retention requirements?”
- “Who should access audit logs?”
Hints in Layers
Hint 1: Starting Point Define a minimal audit event schema.
Hint 2: Next Level Add retention and access policy notes.
Hint 3: Technical Details Define integrity checks for logs.
Hint 4: Tools/Debugging Perform an audit review on a sample run.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Governance | “Clean Architecture” | Ch. 12 |
| Security | “Security in Computing” | Ch. 7 |
| Reliability | “Release It!” | Ch. 7 |
Common Pitfalls & Debugging
Problem 1: “Logs are incomplete”
- Why: Missing event coverage
- Fix: Expand event schema and enforce logging
Problem 2: “Logs are hard to access”
- Why: No indexing or search
- Fix: Add indexing and query support
Project 36: Offline and Edge Mode Playbook
- File: P36-offline-and-edge-mode-playbook.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 3
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Offline workflows
- Software or Tool: Local execution
- Main Book: “The Pragmatic Programmer”
What you’ll build: A playbook for operating agents with limited or no network access.
Why it teaches interoperability: Offline constraints force you to design portable, resilient workflows.
Core challenges you’ll face:
- Dependency caching -> maps to reliability
- Local context storage -> maps to memory management
- Sync strategy -> maps to consistency
Real World Outcome
An offline workflow plan with cached resources and sync rules.
What you will see:
- Dependency list: what must be cached
- Offline tasks: what can be done locally
- Sync strategy: how to reconcile changes later
The Core Question You’re Answering
“How do you keep agent workflows productive without network access?”
Concepts You Must Understand First
- Offline constraints
- What breaks when network is unavailable?
- Caching strategy
- What must be cached ahead of time?
- Sync reconciliation
- How do you merge changes after reconnecting?
Questions to Guide Your Design
- Offline scope
- Which tasks can be done offline?
- Reconciliation
- How do you handle conflicts after sync?
Thinking Exercise
Offline Readiness
List five resources you would need cached for a full day of work.
The Interview Questions They’ll Ask
- “What is the impact of offline constraints?”
- “How do you prepare for offline work?”
- “How do you reconcile changes after reconnecting?”
- “What tasks are risky offline?”
- “How do you ensure data integrity?”
Hints in Layers
Hint 1: Starting Point Identify the most critical dependencies and cache them.
Hint 2: Next Level Define which workflows can run offline.
Hint 3: Technical Details Create a sync protocol for reconnect events.
Hint 4: Tools/Debugging Simulate offline mode and record what fails.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Pragmatic workflows | “The Pragmatic Programmer” | Ch. 5 |
| Reliability | “Release It!” | Ch. 4 |
| Consistency | “Designing Data-Intensive Applications” | Ch. 5 |
Common Pitfalls & Debugging
Problem 1: “Missing dependencies offline”
- Why: No cache plan
- Fix: Create a dependency inventory and cache list
Problem 2: “Conflicts after sync”
- Why: No reconciliation rules
- Fix: Define merge and conflict resolution steps
Project 37: Multi-tenant Agent Service
- File: P37-multi-tenant-agent-service.md
- Main Programming Language: Go
- Alternative Programming Languages: Python
- Coolness Level: 5
- Business Potential: 5
- Difficulty: 5
- Knowledge Area: Platform Engineering
- Software or Tool: Multi-tenant services
- Main Book: “Software Architecture in Practice”
What you’ll build: A multi-tenant service design that lets multiple teams share agent automation safely.
Why it teaches interoperability: It forces you to build strong boundaries and governance.
Core challenges you’ll face:
- Tenant isolation -> maps to security
- Quota management -> maps to governance
- Routing policies -> maps to orchestration
Real World Outcome
A multi-tenant architecture diagram and tenant policy definitions.
What you will see:
- Tenant boundaries: separate configs and data
- Quota rules: cost and usage limits
- Routing rules: per-tenant agent choices
The Core Question You’re Answering
“How do you serve multiple teams safely on the same agent platform?”
Concepts You Must Understand First
- Isolation
- How do you prevent cross-tenant data leaks?
- Quotas
- How do you enforce usage limits?
- Routing
- How do you customize agent choice per tenant?
Questions to Guide Your Design
- Tenant model
- How is tenant data stored and scoped?
- Policy enforcement
- Where are quotas checked?
Thinking Exercise
Tenant Policy Draft
Define a policy for two teams with different budgets and access.
The Interview Questions They’ll Ask
- “What is multi-tenancy and why is it hard?”
- “How do you isolate tenants?”
- “How do you enforce quotas?”
- “What is the risk of shared infrastructure?”
- “How do you audit tenant actions?”
Hints in Layers
Hint 1: Starting Point Define the tenant boundary and separate configs.
Hint 2: Next Level Add quota and billing rules.
Hint 3: Technical Details Define routing rules per tenant.
Hint 4: Tools/Debugging Simulate two tenants and verify isolation.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Architecture | “Software Architecture in Practice” | Ch. 5 |
| Security | “Security in Computing” | Ch. 6 |
| Reliability | “Release It!” | Ch. 7 |
Common Pitfalls & Debugging
Problem 1: “Tenant data leaks”
- Why: Weak isolation boundaries
- Fix: Enforce strict separation and access controls
Problem 2: “Quota enforcement fails”
- Why: Quotas checked too late
- Fix: Enforce quotas at request entry
Project 38: Benchmark Suite for Agents
- File: P38-benchmark-suite-for-agents.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 4
- Knowledge Area: Evaluation
- Software or Tool: Benchmarks
- Main Book: “Accelerate”
What you’ll build: A benchmark suite that measures quality, cost, and latency across agents.
Why it teaches interoperability: Benchmarks drive data-backed tool selection and trust.
Core challenges you’ll face:
- Metric selection -> maps to evaluation design
- Workload design -> maps to representativeness
- Reporting -> maps to observability
Real World Outcome
A benchmark report with standardized metrics for each agent and workload.
What you will see:
- Workload sets: simple, moderate, complex tasks
- Metrics: latency, cost, accuracy
- Comparison table: agent performance scores
The Core Question You’re Answering
“How do you compare agents objectively for your workflows?”
Concepts You Must Understand First
- Metric definition
- What metrics reflect real user value?
- Workload sampling
- How do you avoid biased tests?
- Reporting
- How do you present results clearly?
Questions to Guide Your Design
- Benchmark scope
- Which tasks should be included?
- Scoring
- How do you aggregate metrics?
Thinking Exercise
Metric Priorities
Rank three metrics and explain why they matter most.
The Interview Questions They’ll Ask
- “What makes a benchmark fair?”
- “How do you avoid biased workloads?”
- “What metrics matter for coding agents?”
- “How do you present results to stakeholders?”
- “How do you track benchmark drift?”
Hints in Layers
Hint 1: Starting Point Start with a small, representative workload set.
Hint 2: Next Level Define metrics and scoring rules.
Hint 3: Technical Details Create a report template with comparison tables.
Hint 4: Tools/Debugging Run the suite on two agents and compare results.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Metrics | “Accelerate” | Ch. 3 |
| Evaluation | “AI Engineering” | Ch. 6 |
| Reporting | “The Pragmatic Programmer” | Ch. 7 |
Common Pitfalls & Debugging
Problem 1: “Benchmarks are not representative”
- Why: Too narrow task set
- Fix: Add diverse workloads
Problem 2: “Metrics are hard to interpret”
- Why: No normalization
- Fix: Provide normalized scores and context
Project 39: Incident Response Automation
- File: P39-incident-response-automation.md
- Main Programming Language: Python
- Alternative Programming Languages: Go
- Coolness Level: 4
- Business Potential: 5
- Difficulty: 4
- Knowledge Area: Reliability
- Software or Tool: Incident workflows
- Main Book: “Release It!”
What you’ll build: An automation playbook that uses multiple agents during incidents.
Why it teaches interoperability: Incidents require rapid coordination and reliable outputs.
Core challenges you’ll face:
- Runbook design -> maps to operations
- Task routing -> maps to orchestration
- Safety checks -> maps to governance
Real World Outcome
An incident response workflow where agents handle investigation, mitigation, and reporting.
What you will see:
- Runbook steps: detection, triage, mitigation
- Agent roles: which agent does what
- Postmortem report: standardized output
The Core Question You’re Answering
“How can agents accelerate incident response without increasing risk?”
Concepts You Must Understand First
- Incident stages
- What steps occur in a typical incident?
- Safety checks
- Which actions require approval?
- Postmortem structure
- What must be documented after the incident?
Questions to Guide Your Design
- Role assignment
- Which agent handles which stage?
- Approval gates
- Where must humans sign off?
Thinking Exercise
Incident Scenario
Describe an outage and map which agent helps at each step.
The Interview Questions They’ll Ask
- “Why use agents in incident response?”
- “How do you prevent unsafe automated actions?”
- “What is a postmortem and why is it needed?”
- “How do you coordinate tasks during incidents?”
- “How do you measure incident improvement?”
Hints in Layers
Hint 1: Starting Point Define a simple runbook with three stages.
Hint 2: Next Level Assign agents to each stage with responsibilities.
Hint 3: Technical Details Add approval gates and logging requirements.
Hint 4: Tools/Debugging Run a tabletop exercise and record outcomes.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Reliability | “Release It!” | Ch. 9 |
| Operations | “The Phoenix Project” by Gene Kim | Ch. 24 |
| Metrics | “Accelerate” | Ch. 4 |
Common Pitfalls & Debugging
Problem 1: “Automation makes changes too quickly”
- Why: Missing approval gates
- Fix: Add human checks for risky actions
Problem 2: “Postmortems are incomplete”
- Why: No standardized report
- Fix: Require structured postmortem templates
Project 40: IDE Bridge Integration
- File: P40-ide-bridge-integration.md
- Main Programming Language: JavaScript
- Alternative Programming Languages: Python
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 3
- Knowledge Area: Developer Experience
- Software or Tool: IDE integration
- Main Book: “The Pragmatic Programmer”
What you’ll build: A plan for bridging CLI agents with an IDE workflow.
Why it teaches interoperability: Developers need seamless transitions between CLI automation and editor actions.
Core challenges you’ll face:
- Context sync -> maps to workspace alignment
- Command routing -> maps to adapter design
- User experience -> maps to workflow design
Real World Outcome
A workflow diagram showing how IDE actions trigger multiple agent CLIs.
What you will see:
- Trigger points: file save, test run, diff view
- Agent routing: which CLI handles which trigger
- Result display: how outputs are surfaced in the IDE
The Core Question You’re Answering
“How do you connect CLI agents to the developer’s editor workflow?”
Concepts You Must Understand First
- Context sync
- How does the IDE share file state with agents?
- Command routing
- How do you map IDE actions to agents?
- Feedback presentation
- How do you surface agent outputs effectively?
Questions to Guide Your Design
- Trigger strategy
- Which IDE events are safe to automate?
- Feedback channels
- Where should results appear for developers?
Thinking Exercise
IDE Workflow Map
Map a typical developer action to an agent response.
The Interview Questions They’ll Ask
- “Why integrate agents with an IDE?”
- “How do you keep IDE context in sync?”
- “What actions should be automated?”
- “How do you avoid interrupting the developer?”
- “How do you display results effectively?”
Hints in Layers
Hint 1: Starting Point List the IDE actions you want to automate.
Hint 2: Next Level Map those actions to agent tasks.
Hint 3: Technical Details Define a minimal result display format.
Hint 4: Tools/Debugging Test with a single action and refine feedback.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Workflow design | “The Pragmatic Programmer” | Ch. 7 |
| Interface design | “Clean Architecture” | Ch. 10 |
| Reliability | “Release It!” | Ch. 6 |
Common Pitfalls & Debugging
Problem 1: “Context mismatch”
- Why: IDE state not synced
- Fix: Add explicit sync steps before agent runs
Problem 2: “Outputs clutter the editor”
- Why: No output formatting rules
- Fix: Define concise summaries and links
Project 41: Multi-Agent Pair Programming Protocol
- File: P41-multi-agent-pair-programming-protocol.md
- Main Programming Language: Python
- Alternative Programming Languages: JavaScript
- Coolness Level: 4
- Business Potential: 4
- Difficulty: 4
- Knowledge Area: Collaboration
- Software or Tool: Agent coordination
- Main Book: “Fundamentals of Software Architecture”
What you’ll build: A protocol for two or more agents to collaborate on a single coding task.
Why it teaches interoperability: Pairing highlights coordination, conflict resolution, and shared context.
Core challenges you’ll face:
- Role assignment -> maps to division of labor
- Turn-taking -> maps to coordination
- Conflict handling -> maps to consistency
Real World Outcome
A collaboration protocol with roles, turn order, and merge rules.
What you will see:
- Role definitions: driver, navigator, reviewer
- Turn rules: when to hand off control
- Conflict resolution: how to merge proposals
The Core Question You’re Answering
“How do multiple agents collaborate without stepping on each other?”
Concepts You Must Understand First
- Role separation
- What does each agent specialize in?
- Turn-taking
- How do agents avoid simultaneous edits?
- Merge policy
- How do you resolve conflicting suggestions?
Questions to Guide Your Design
- Role assignment
- Which agent is best as driver vs reviewer?
- Conflict resolution
- When does a human decide?
Thinking Exercise
Pair Protocol Draft
Write a three-step cycle for two agents to collaborate on a feature.
The Interview Questions They’ll Ask
- “Why use multi-agent pair programming?”
- “How do you coordinate agent roles?”
- “How do you prevent conflicting edits?”
- “What is the role of a human supervisor?”
- “How do you measure collaboration success?”
Hints in Layers
Hint 1: Starting Point Define roles and responsibilities for each agent.
Hint 2: Next Level Create turn-taking rules and handoff format.
Hint 3: Technical Details Define a merge policy with conflict rules.
Hint 4: Tools/Debugging Simulate a simple task and observe coordination breakdowns.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Collaboration | “Fundamentals of Software Architecture” | Ch. 3 |
| Governance | “Clean Architecture” | Ch. 12 |
| Process | “The Pragmatic Programmer” | Ch. 6 |
Common Pitfalls & Debugging
Problem 1: “Agents repeat each other”
- Why: Overlapping roles
- Fix: Clarify responsibilities and review stages
Problem 2: “Conflicts are unresolved”
- Why: No merge policy
- Fix: Add a human arbitration step
Project 42: Capstone - Interoperable Automation Platform
- File: P42-capstone-interoperable-automation-platform.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, JavaScript
- Coolness Level: 5
- Business Potential: 5
- Difficulty: 5
- Knowledge Area: Platform Engineering
- Software or Tool: Multi-agent platform
- Main Book: “Software Architecture in Practice”
What you’ll build: A full platform design that unifies agents, configs, tools, memory, and safety.
Why it teaches interoperability: It integrates every concept into one coherent system.
Core challenges you’ll face:
- System architecture -> maps to platform design
- Governance -> maps to policy enforcement
- Scalability -> maps to reliability
Real World Outcome
A full platform blueprint with modules, data flows, and governance controls.
What you will see:
- Architecture diagram: adapters, routers, storage, UI
- Policy layer: approvals, sandbox rules, audits
- Operational plan: monitoring, upgrades, failover
The Core Question You’re Answering
“How do you build a production-grade multi-agent automation platform?”
Concepts You Must Understand First
- Platform architecture
- What are the core modules and their boundaries?
- Governance
- How do you enforce policy across the platform?
- Scalability
- How do you support growth without instability?
Questions to Guide Your Design
- Core modules
- Which modules are required for interoperability?
- Operational controls
- How will you monitor and update the platform?
Thinking Exercise
Platform Map
Draw a top-level map of modules and the data flow between them.
The Interview Questions They’ll Ask
- “What are the core modules of a multi-agent platform?”
- “How do you enforce governance at scale?”
- “How do you handle failures in a platform?”
- “What is the role of adapters?”
- “How do you prevent lock-in?”
Hints in Layers
Hint 1: Starting Point List the modules from Projects 1-41 and group them.
Hint 2: Next Level Define interfaces between modules.
Hint 3: Technical Details Design a deployment and monitoring plan.
Hint 4: Tools/Debugging Run a tabletop simulation of a platform outage.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Architecture | “Software Architecture in Practice” | Ch. 6 |
| Reliability | “Release It!” | Ch. 9 |
| Governance | “Clean Architecture” | Ch. 12 |
Common Pitfalls & Debugging
Problem 1: “Platform becomes too complex”
- Why: Over-engineering
- Fix: Keep modules minimal and modular
Problem 2: “Policies are inconsistent”
- Why: Governance rules applied unevenly
- Fix: Centralize policy enforcement
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Agent Capability Matrix | Level 1 | Weekend | Medium | ★★☆☆☆ |
| 2. Config Precedence Detective | Level 2 | Weekend | Medium | ★★☆☆☆ |
| 3. Prompt Contract Spec | Level 2 | Weekend | Medium | ★★★☆☆ |
| 4. Tool Schema Registry | Level 3 | 1 Week | High | ★★★☆☆ |
| 5. Subagent Task Router | Level 3 | 1 Week | High | ★★★☆☆ |
| 6. Hook Lifecycle Harness | Level 2 | Weekend | Medium | ★★☆☆☆ |
| 7. Extension and Plugin Compatibility Lab | Level 3 | 1 Week | High | ★★★☆☆ |
| 8. MCP Gateway Prototype | Level 4 | 2 Weeks | High | ★★★★☆ |
| 9. Headless Batch Runner | Level 2 | Weekend | Medium | ★★★☆☆ |
| 10. Interactive Session Recorder | Level 2 | Weekend | Medium | ★★★☆☆ |
| 11. Approval Policy Simulator | Level 3 | 1 Week | High | ★★★☆☆ |
| 12. Sandbox Matrix Auditor | Level 2 | Weekend | Medium | ★★☆☆☆ |
| 13. Output Style Normalizer | Level 2 | Weekend | Medium | ★★★☆☆ |
| 14. Multi-Agent Logging Standard | Level 3 | 1 Week | High | ★★★☆☆ |
| 15. Error Taxonomy and Retry Controller | Level 3 | 1 Week | High | ★★★☆☆ |
| 16. Context Budget Planner | Level 3 | 1 Week | High | ★★★☆☆ |
| 17. Memory Import and Export Bridge | Level 3 | 1 Week | High | ★★★☆☆ |
| 18. Cross-Agent Workspace Sync | Level 3 | 1 Week | High | ★★★☆☆ |
| 19. Secrets Broker Shim | Level 3 | 1 Week | High | ★★★☆☆ |
| 20. Test Harness for Agents | Level 4 | 2 Weeks | High | ★★★★☆ |
| 21. Prompt Injection Red Team Lab | Level 4 | 2 Weeks | High | ★★★★☆ |
| 22. Multi-Agent Code Review Pipeline | Level 3 | 1 Week | High | ★★★☆☆ |
| 23. Issue Triage Mesh | Level 3 | 1 Week | High | ★★★☆☆ |
| 24. Documentation Generator Federation | Level 3 | 1 Week | High | ★★★☆☆ |
| 25. Repo Indexing Strategy | Level 4 | 2 Weeks | High | ★★★★☆ |
| 26. Skill and Prompt Pack Manager | Level 3 | 1 Week | High | ★★★☆☆ |
| 27. Cross-CLI Command Adapter | Level 4 | 2 Weeks | High | ★★★★☆ |
| 28. Event-Driven Agent Bus | Level 5 | 3 Weeks | Very High | ★★★★★ |
| 29. Distributed Job Queue | Level 5 | 3 Weeks | Very High | ★★★★★ |
| 30. Cost and Latency Budget Enforcer | Level 4 | 2 Weeks | High | ★★★★☆ |
| 31. Human-in-the-Loop Gate | Level 3 | 1 Week | High | ★★★☆☆ |
| 32. Semantic Diff and Patch Gate | Level 4 | 2 Weeks | High | ★★★★☆ |
| 33. Knowledge Base RAG Connector | Level 4 | 2 Weeks | High | ★★★★☆ |
| 34. Model Failover Switch | Level 4 | 2 Weeks | High | ★★★★☆ |
| 35. Compliance Audit Logger | Level 4 | 2 Weeks | High | ★★★★☆ |
| 36. Offline and Edge Mode Playbook | Level 3 | 1 Week | High | ★★★☆☆ |
| 37. Multi-tenant Agent Service | Level 5 | 3 Weeks | Very High | ★★★★★ |
| 38. Benchmark Suite for Agents | Level 4 | 2 Weeks | High | ★★★★☆ |
| 39. Incident Response Automation | Level 4 | 2 Weeks | High | ★★★★☆ |
| 40. IDE Bridge Integration | Level 3 | 1 Week | High | ★★★☆☆ |
| 41. Multi-Agent Pair Programming Protocol | Level 4 | 2 Weeks | High | ★★★★☆ |
| 42. Interoperable Automation Platform | Level 5 | 4 Weeks | Very High | ★★★★★ |
Recommendation
If you are new to multi-agent interoperability: Start with Project 1 to build a capability baseline. If you are a platform engineer: Start with Project 4 and Project 14 to define tool schemas and logging. If you want production-grade automation: Focus on Projects 28-42.
Final Overall Project: Interoperable Automation Platform
The Goal: Combine Projects 1-41 into a single platform that orchestrates multiple agents safely and reliably.
- Build adapters for each CLI
- Implement prompt contracts and output normalization
- Add tool registry and MCP gateway support
- Enforce safety with approvals and sandbox policies
- Add observability, logging, and audit trails
- Run benchmark suites and cost controls
- Deploy as a multi-tenant platform
Success Criteria: A single workflow can route tasks across agents, produce consistent outputs, and pass audit checks.
From Learning to Production: What’s Next?
After completing these projects, you’ve built educational implementations. Here’s how to transition to production-grade systems:
What You Built vs. What Production Needs
| Your Project | Production Equivalent | Gap to Fill |
|---|---|---|
| Tool Schema Registry | Enterprise tool catalog | Governance and approval process |
| Event-Driven Agent Bus | Message queue + workflow engine | Monitoring and scaling |
| Compliance Audit Logger | SIEM integration | Retention and legal controls |
Skills You Now Have
You can confidently discuss:
- Multi-agent orchestration and task routing
- Prompt contracts and output normalization
- Governance, approvals, and auditability
You can read source code of:
- Agent CLIs and plugin systems
- MCP servers and protocol adapters
You can architect:
- Multi-agent automation pipelines
- Safety and compliance layers
Recommended Next Steps
1. Contribute to Open Source:
- MCP and agent tooling repos: add adapters or documentation
2. Build a SaaS Around One Project:
- Idea: Multi-agent automation manager with policy controls
- Monetization: Tiered usage and enterprise compliance features
3. Get Certified:
- Security certification - to strengthen governance and audit practices
Career Paths Unlocked
With this knowledge, you can pursue:
- AI tooling platform engineer
- Developer productivity engineer
- Reliability automation engineer
Summary
This learning path covers multi-agent interoperability through 42 hands-on projects.
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Agent Capability Matrix | Python | Level 1 | Weekend |
| 2 | Config Precedence Detective | Python | Level 2 | Weekend |
| 3 | Prompt Contract Spec | Python | Level 2 | Weekend |
| 4 | Tool Schema Registry | Python | Level 3 | 1 Week |
| 5 | Subagent Task Router | JavaScript | Level 3 | 1 Week |
| 6 | Hook Lifecycle Harness | JavaScript | Level 2 | Weekend |
| 7 | Extension and Plugin Compatibility Lab | JavaScript | Level 3 | 1 Week |
| 8 | MCP Gateway Prototype | Go | Level 4 | 2 Weeks |
| 9 | Headless Batch Runner | Python | Level 2 | Weekend |
| 10 | Interactive Session Recorder | Python | Level 2 | Weekend |
| 11 | Approval Policy Simulator | Python | Level 3 | 1 Week |
| 12 | Sandbox Matrix Auditor | Python | Level 2 | Weekend |
| 13 | Output Style Normalizer | Python | Level 2 | Weekend |
| 14 | Multi-Agent Logging Standard | Python | Level 3 | 1 Week |
| 15 | Error Taxonomy and Retry Controller | Python | Level 3 | 1 Week |
| 16 | Context Budget Planner | Python | Level 3 | 1 Week |
| 17 | Memory Import and Export Bridge | Python | Level 3 | 1 Week |
| 18 | Cross-Agent Workspace Sync | Python | Level 3 | 1 Week |
| 19 | Secrets Broker Shim | Python | Level 3 | 1 Week |
| 20 | Test Harness for Agents | Python | Level 4 | 2 Weeks |
| 21 | Prompt Injection Red Team Lab | Python | Level 4 | 2 Weeks |
| 22 | Multi-Agent Code Review Pipeline | JavaScript | Level 3 | 1 Week |
| 23 | Issue Triage Mesh | Python | Level 3 | 1 Week |
| 24 | Documentation Generator Federation | JavaScript | Level 3 | 1 Week |
| 25 | Repo Indexing Strategy | Python | Level 4 | 2 Weeks |
| 26 | Skill and Prompt Pack Manager | JavaScript | Level 3 | 1 Week |
| 27 | Cross-CLI Command Adapter | Python | Level 4 | 2 Weeks |
| 28 | Event-Driven Agent Bus | Go | Level 5 | 3 Weeks |
| 29 | Distributed Job Queue | Go | Level 5 | 3 Weeks |
| 30 | Cost and Latency Budget Enforcer | Python | Level 4 | 2 Weeks |
| 31 | Human-in-the-Loop Gate | JavaScript | Level 3 | 1 Week |
| 32 | Semantic Diff and Patch Gate | Python | Level 4 | 2 Weeks |
| 33 | Knowledge Base RAG Connector | Python | Level 4 | 2 Weeks |
| 34 | Model Failover Switch | Python | Level 4 | 2 Weeks |
| 35 | Compliance Audit Logger | Python | Level 4 | 2 Weeks |
| 36 | Offline and Edge Mode Playbook | Python | Level 3 | 1 Week |
| 37 | Multi-tenant Agent Service | Go | Level 5 | 3 Weeks |
| 38 | Benchmark Suite for Agents | Python | Level 4 | 2 Weeks |
| 39 | Incident Response Automation | Python | Level 4 | 2 Weeks |
| 40 | IDE Bridge Integration | JavaScript | Level 3 | 1 Week |
| 41 | Multi-Agent Pair Programming Protocol | Python | Level 4 | 2 Weeks |
| 42 | Interoperable Automation Platform | Go | Level 5 | 4 Weeks |
Expected Outcomes
After completing these projects, you will:
- Design interop layers across multiple coding agents
- Build prompt contracts and tool schema registries
- Orchestrate workflows with safety, approvals, and audits
- Evaluate and benchmark agents with objective metrics
- Architect a production-grade multi-agent automation platform
You’ll have built a complete, working multi-agent interoperability ecosystem from first principles.
Additional Resources & References
Core CLI and Agent Documentation
- https://developers.openai.com/codex/cli
- https://developers.openai.com/codex/noninteractive
- https://developers.openai.com/codex/config-basic
- https://developers.openai.com/codex/config-advanced
- https://developers.openai.com/codex/config-reference
- https://deepwiki.com/openai/codex
- https://deepwiki.com/openai/skills
- https://code.claude.com/docs/en/sub-agents
- https://code.claude.com/docs/en/cli-reference
- https://code.claude.com/docs/en/hooks
- https://code.claude.com/docs/en/plugins-reference
- https://code.claude.com/docs/en/terminal-config
- https://code.claude.com/docs/en/model-config
- https://code.claude.com/docs/en/memory
- https://code.claude.com/docs/en/plugins
- https://code.claude.com/docs/en/skills
- https://code.claude.com/docs/en/output-styles
- https://code.claude.com/docs/en/hooks-guide
- https://code.claude.com/docs/en/headless
- https://code.claude.com/docs/en/mcp
- https://kiro.dev/docs/cli/chat/subagents/
- https://kiro.dev/docs/cli/chat/manage-prompts/
- https://kiro.dev/docs/cli/chat/context/
- https://kiro.dev/docs/cli/chat/configuration/
- https://kiro.dev/docs/cli/custom-agents/
- https://kiro.dev/docs/cli/custom-agents/configuration-reference/
- https://kiro.dev/docs/cli/code-intelligence/
- https://kiro.dev/docs/cli/hooks/
- https://kiro.dev/docs/cli/steering/
- https://kiro.dev/docs/cli/experimental/
- https://kiro.dev/docs/cli/experimental/knowledge-management/
- https://github.com/google-gemini/gemini-cli
- https://deepwiki.com/google-gemini/gemini-cli
- https://geminicli.com/docs/cli/commands/
- https://geminicli.com/docs/cli/custom-commands/
- https://geminicli.com/docs/cli/headless/
- https://geminicli.com/docs/cli/system-prompt/
- https://geminicli.com/docs/core/
- https://geminicli.com/docs/core/tools-api/
- https://geminicli.com/docs/core/memport/
- https://geminicli.com/docs/tools/
- https://geminicli.com/docs/tools/shell/
- https://geminicli.com/docs/hooks/
- https://geminicli.com/docs/#extensions
- https://geminicli.com/docs/extensions/
MCP and Extensions
- https://github.com/ChromeDevTools/chrome-devtools-mcp
- https://github.com/Dicklesworthstone/cass_memory_system
- https://github.com/gemini-cli-extensions/code-review
- https://github.com/gemini-cli-extensions/nanobanana
- https://github.com/gemini-cli-extensions/conductor
- https://github.com/johnlindquist/mdflow
- https://mdflow.dev/
Internal Guides Used
- AI_CODING_AGENTS/CLAUDE_CODE_ADVANCED_PROJECTS.md
- AI_CODING_AGENTS/CLAUDE_CODE_MASTERY_40_PROJECTS.md
- AI_CODING_AGENTS/KIRO_DOCUMENTATION_RESEARCH.md
- AI_CODING_AGENTS/LEARN_KIRO_CLI_MASTERY.md
- AI_AGENTS_LLM_RAG/LEARN_LLM_MEMORY.md
- AI_AGENTS_LLM_RAG/PROMPT_ENGINEERING_PROJECTS.md
- AI_AGENTS_LLM_RAG/AI_AGENTS_PROJECTS.md
Books
Foundations (from your library):
- “Clean Architecture” by Robert C. Martin
- “Designing Data-Intensive Applications” by Martin Kleppmann
- “Release It!” by Michael T. Nygard
- “Continuous Delivery” by David Farley and Jez Humble
- “The Pragmatic Programmer” by Andrew Hunt and David Thomas
- “Fundamentals of Software Architecture” by Mark Richards and Neal Ford
- “AI Engineering” by Chip Huyen
- “Security in Computing” by Charles Pfleeger