Multi-Agent Coding Interoperability Mastery - Real World Projects

Goal: Build a deep, first-principles understanding of how multiple coding agents interoperate across CLIs, headless modes, tools, hooks, memory, and configuration. You will design practical automation systems that coordinate Claude Code, Codex, Gemini CLI, Kiro CLI, and similar agents without lock-in. You will learn how to standardize prompts, normalize outputs, route tasks, and verify safety policies across heterogeneous agent runtimes. By the end, you will be able to architect reliable, auditable, and cost-aware multi-agent automation pipelines for real coding work.

Why Multi-Agent Interoperability Matters

Tooling for AI-assisted coding has shifted from single-assistant workflows to ecosystems of specialized agents. Each CLI has different strengths: some excel at repository analysis, others at headless batch runs, and others at hooks, plugins, or MCP integrations. Interoperability lets you combine those strengths, reduce vendor lock-in, and standardize safety and quality across the team. It also enables durable automation: when one tool changes, you can adapt the adapter layer instead of rewriting workflows.

Siloed agents vs interoperable agents:

Siloed CLI Agents                        Interoperable Agent Mesh

[Claude Code]    [Codex]                  [Claude] [Codex] [Gemini]
      |            |                         |       |       |
   Scripts      Scripts                     +--------+-------+
      |            |                         |  Interop Layer |
      +------X-----+                         +--------+-------+
           No sharing                        |  Shared Policies|
                                             |  Shared Schemas |
                                             +-----------------+

Siloed vs Interoperable Agents

Prerequisites & Background Knowledge

Before starting these projects, you should have foundational understanding in these areas:

Essential Prerequisites (Must Have)

Programming Skills:

Comfortable reading and structuring automation scripts in Python or JavaScript
Familiarity with JSON and YAML configuration patterns

CLI and Tooling Fundamentals:

Basic shell usage and file system navigation
Git workflows and repository structure
Recommended Reading: “The Linux Command Line” by William E. Shotts - Ch. 1-7

Software Design Basics:

Interfaces, contracts, and separation of concerns
Recommended Reading: “Clean Architecture” by Robert C. Martin - Ch. 1-4

Helpful But Not Required

DevOps and Delivery:

CI/CD concepts and automation pipelines
Can learn during: Projects 28-31

Systems Thinking:

Observability and reliability basics
Can learn during: Projects 14-18

Self-Assessment Questions

Before starting, ask yourself:

✅ Can you read a CLI tool configuration file and explain precedence rules?
✅ Can you explain the difference between interactive and headless modes?
✅ Can you define a contract for a tool output and validate it?

If you answered “no” to questions 1-3: Spend 1-2 weeks on the “Recommended Reading” books above before starting. If you answered “yes” to all: You’re ready to begin.

Development Environment Setup

Required Tools:

Git
A scripting language runtime (Python or Node.js)
At least two AI coding CLIs installed (Claude Code, Codex, Gemini CLI, Kiro CLI, or similar)

Recommended Tools:

JSON tooling (jq or similar)
A task runner (Make, Just, or similar)

Testing Your Setup:

# Verify you have the basics
$ [command to show CLI versions]
[expected output showing installed tools]

Time Investment:

Simple projects (1-10): Weekend (4-8 hours each)
Moderate projects (11-30): 1 week (10-20 hours each)
Complex projects (31-42): 2+ weeks (20-40 hours each)
Total sprint: 6-12 months if doing all projects sequentially

Important Reality Check: Interoperability work is messy by nature. You will spend time normalizing outputs, handling edge cases, and reconciling conflicting expectations between tools. This is the learning.

Core Concept Analysis

1. Agent Surfaces and Interaction Modes

Agents expose different surfaces: REPL, headless batch mode, subagents, or plugin hooks. Interoperability requires a clear model of each surface and how tasks should move across them.

User Input -> Interactive REPL -> Agent Actions
                    |
                    v
              Headless Batch

Agent Interaction Modes

2. Configuration and Instruction Hierarchy

Every agent has layered configuration: system prompts, user prompts, tool configs, and runtime overrides. Interoperability depends on mapping these layers into a common contract.

3. Tool APIs, Extensions, and Plugins

Agents differ in tool invocation models and extension ecosystems. You need a schema registry that normalizes tools into a shared vocabulary.

4. Task Routing and Subagent Orchestration

Modern CLIs can spawn subagents. Your interoperability layer should define when to delegate, how to pass context, and how to merge results.

Main Agent
   |-- Subagent A: search
   |-- Subagent B: test
   |-- Subagent C: patch
        |
   Merge Results

Subagent Orchestration

5. Context and Memory Management

Agents have limited context windows and different memory systems. Interoperability requires explicit rules for what to keep, summarize, and externalize.

6. Safety, Approval, and Sandbox Boundaries

Each CLI has its own approval policy and sandboxing model. A unified automation system must enforce the strictest policy and provide human checkpoints.

7. Observability and Evaluation

Without shared logs and metrics, a multi-agent system is opaque. Define a consistent event schema to track prompts, tool calls, and outcomes.

8. Protocols for Interop (MCP and Beyond)

Model Context Protocol and similar specs enable standardized tool and data sharing. Understanding these protocols is essential for scaling across tools.

Concept Summary Table

This section provides a map of the mental models you will build during these projects.

Concept Cluster	What You Need to Internalize
Agent Surfaces	How interactive, headless, and hook-based modes differ and how to bridge them
Configuration Hierarchy	How layered prompts and configs interact across tools
Tool/Plugin Ecosystems	How to normalize tool schemas and capability discovery
Orchestration	How to delegate tasks, merge results, and handle failures
Memory & Context	How to budget context and synchronize memory across agents
Safety & Governance	How to enforce approvals, sandboxing, and audit trails
Observability	How to define logs and evaluation signals for automation
Interoperability Protocols	How MCP and similar specs enable shared context

Deep Dive Reading by Concept

This section maps each concept to specific book chapters for deeper understanding.

Interoperability Architecture

Concept	Book & Chapter	Why This Matters
Interface contracts	“Clean Architecture” by Robert C. Martin - Ch. 1-4	Clean boundaries make adapters feasible
System boundaries	“Fundamentals of Software Architecture” by Mark Richards and Neal Ford - Ch. 2-4	Helps define stable seams

Automation and Reliability

Concept	Book & Chapter	Why This Matters
Delivery pipelines	“Continuous Delivery” by David Farley and Jez Humble - Ch. 1-3	Builds reliable automation primitives
Operational stability	“Release It!” by Michael T. Nygard - Ch. 1-3	Teaches resilience patterns

Data, Memory, and Observability

Concept	Book & Chapter	Why This Matters
Data flow models	“Designing Data-Intensive Applications” by Martin Kleppmann - Ch. 1-2	Foundations for shared logs
Metrics and feedback	“Accelerate” by Nicole Forsgren et al. - Ch. 2-3	Measurement for automation success

Quick Start: Your First 48 Hours

Feeling overwhelmed? Start here instead of reading everything:

Day 1 (4 hours):

Read the “Agent Surfaces” and “Configuration” concepts above
Install two CLIs and capture their version outputs
Start Project 1 and Project 2, focusing only on mapping capabilities
Do not optimize yet, just document differences

Day 2 (4 hours):

Write a simple prompt contract (Project 3)
Sketch a tool registry map (Project 4)
Review the Core Question for Project 3
Document one friction point per CLI

End of Weekend: You now understand how agent surfaces and configuration layers shape interoperability. That is the core mental model.

Next Steps:

If it clicked: Continue to Project 5
If confused: Re-read the Concept Summary Table
If frustrated: Take a break. Interop work is hard.

Recommended Learning Path

Path 1: The Pragmatic Automator (Recommended Start)

Best for: Engineers who want working automation quickly

Start with Project 1 - Build a capability matrix
Then Project 3 - Define prompt contracts
Then Project 9 - Headless batch execution

Path 2: The Platform Engineer

Best for: People building team-wide agent tooling

Start with Project 4 - Tool schema registry
Then Project 14 - Logging standard
Then Project 28 - Event-driven agent bus

Path 3: The Researcher

Best for: Those focused on evaluation and benchmarks

Start with Project 16 - Context budget planner
Then Project 20 - Test harness
Then Project 38 - Benchmark suite

Project List

The following projects guide you from basic interoperability to full automation platforms.

Agent Capability Matrix
Config Precedence Detective
Prompt Contract Spec
Tool Schema Registry
Subagent Task Router
Hook Lifecycle Harness
Extension and Plugin Compatibility Lab
MCP Gateway Prototype
Headless Batch Runner
Interactive Session Recorder
Approval Policy Simulator
Sandbox Matrix Auditor
Output Style Normalizer
Multi-Agent Logging Standard
Error Taxonomy and Retry Controller
Context Budget Planner
Memory Import and Export Bridge
Cross-Agent Workspace Sync
Secrets Broker Shim
Test Harness for Agents
Prompt Injection Red Team Lab
Multi-Agent Code Review Pipeline
Issue Triage Mesh
Documentation Generator Federation
Repo Indexing Strategy
Skill and Prompt Pack Manager
Cross-CLI Command Adapter
Event-Driven Agent Bus
Distributed Job Queue
Cost and Latency Budget Enforcer
Human-in-the-Loop Gate
Semantic Diff and Patch Gate
Knowledge Base RAG Connector
Model Failover Switch
Compliance Audit Logger
Offline and Edge Mode Playbook
Multi-tenant Agent Service
Benchmark Suite for Agents
Incident Response Automation
IDE Bridge Integration
Multi-Agent Pair Programming Protocol
Capstone: Interoperable Automation Platform

Project 1: Agent Capability Matrix

File: P01-agent-capability-matrix.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript, Go
Coolness Level: 2
Business Potential: 3
Difficulty: 1
Knowledge Area: Tooling, Documentation
Software or Tool: Claude Code, Codex CLI, Gemini CLI, Kiro CLI
Main Book: “The Pragmatic Programmer”

What you’ll build: A structured capability matrix that compares agent features and limits.

Why it teaches interoperability: You cannot bridge tools until you see their mismatched surfaces and strengths.

Core challenges you’ll face:

Capability discovery -> maps to configuration and documentation analysis
Normalization -> maps to contract design
Coverage gaps -> maps to fallback strategy

Real World Outcome

A single reference matrix that lists each CLI agent, its modes, hooks, tools, and configuration scope. You can answer which agent to use for a given task and why.

What you will see:

Matrix table: Each row is an agent, each column is a capability
Notes column: Documented limitations and missing features
Decision notes: Simple guidance on which agent fits which task

Command Line Outcome Example:

# 1. Export CLI versions
$ [command]
[expected output]

# 2. Extract config locations
$ [command]
[expected output]

# 3. Summarize capabilities
$ [command]
[expected output]

The Core Question You’re Answering

“What is the minimum shared feature set that every agent can support?”

Before you write any code, sit with this question. It defines your interoperability baseline.

Concepts You Must Understand First

Stop and research these before coding:

CLI surfaces
- What is a REPL versus headless execution?
- Which features exist only in interactive mode?
- Book Reference: “The Pragmatic Programmer” Ch. 3
Capability taxonomy
- How do you categorize tools, hooks, and extensions?
- What is a stable naming scheme?
Configuration precedence
- Where do defaults, user configs, and project configs override each other?

Questions to Guide Your Design

Matrix structure
- What columns must be present to compare agents fairly?
- How do you annotate partial support?
Validation rules
- How do you verify a capability is real, not a marketing claim?

Thinking Exercise

Capability Taxonomy Sketch

Before coding, sketch a list of 10 capabilities and categorize them by surface.

[Capability] -> [Surface] -> [Evidence]

Questions while sketching:

Which capabilities overlap across all agents?
Which capabilities are unique and may require adapters?
Which capabilities are unstable or experimental?

The Interview Questions They’ll Ask

“How do you compare CLI agent capabilities in a consistent way?”
“What is a good interoperability baseline and why?”
“How do you verify a feature actually works?”
“What is the risk of assuming feature parity?”
“How do you document capability gaps?”

Hints in Layers

Hint 1: Starting Point Begin with a list of the surfaces: interactive, headless, hooks, plugins, MCP.

Hint 2: Next Level Define a schema with fields for mode, configuration, and tool support.

Hint 3: Technical Details Represent each agent as a row in a structured document and validate it with checks.

Hint 4: Tools/Debugging Use a diff tool to compare versions over time and highlight changes.

Books That Will Help

Topic	Book	Chapter
Capability analysis	“The Pragmatic Programmer” by Thomas and Hunt	Ch. 3
Interface boundaries	“Clean Architecture” by Robert C. Martin	Ch. 1-4
Observability basics	“Release It!” by Michael T. Nygard	Ch. 1

Common Pitfalls & Debugging

Problem 1: “Every agent seems to support everything”

Why: Docs are high-level and omit limitations
Fix: Validate each capability with a concrete test
Quick test: [command that probes capability]

Problem 2: “Capabilities change between versions”

Why: Rapid releases alter defaults
Fix: Capture version metadata with every matrix entry
Quick test: [command that prints version info]

Project 2: Config Precedence Detective

File: P02-config-precedence-detective.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript, Go
Coolness Level: 2
Business Potential: 3
Difficulty: 2
Knowledge Area: Configuration Management
Software or Tool: Claude Code, Codex CLI, Gemini CLI, Kiro CLI
Main Book: “Clean Architecture”

What you’ll build: A documented map of configuration precedence across agents.

Why it teaches interoperability: Config rules define how automation behaves under different environments.

Core challenges you’ll face:

Config discovery -> maps to filesystem probing
Override hierarchy -> maps to policy design
Conflict resolution -> maps to standardization

Real World Outcome

A precedence chart that shows which config files and flags override others for each CLI.

What you will see:

Precedence diagram: Global -> user -> project -> runtime
Conflict table: mismatched key names across tools
Standard mapping: a normalized config glossary

Command Line Outcome Example:

$ [command to list config paths]
[expected output]

The Core Question You’re Answering

“When two settings conflict, which one wins and why?”

Concepts You Must Understand First

Layered configuration
- How do defaults interact with environment overrides?
- Book Reference: “Clean Architecture” Ch. 7
Config schema design
- How do you handle deprecated or renamed keys?
Environment scoping
- How should project overrides be scoped to avoid global drift?

Questions to Guide Your Design

Precedence rules
- What is the canonical order for each CLI?
- How will you represent exceptions?
Mapping strategy
- Which keys are equivalent across tools?
- Where must you keep agent-specific names?

Thinking Exercise

Override Graph

Sketch a graph of config layers and mark which layer should own a safety rule.

The Interview Questions They’ll Ask

“Why does configuration precedence matter in automation?”
“How do you avoid unintentional overrides?”
“What is the best place to store safety defaults?”
“How do you manage config drift across tools?”
“How do you standardize keys across CLIs?”

Hints in Layers

Hint 1: Starting Point List all known config files per CLI.

Hint 2: Next Level Create a precedence ladder and mark where each key is set.

Hint 3: Technical Details Define a mapping table between native keys and normalized keys.

Hint 4: Tools/Debugging Simulate overrides by toggling one setting at a time and observing output.

Books That Will Help

Topic	Book	Chapter
Config discipline	“Clean Architecture” by Robert C. Martin	Ch. 7
Operations control	“Release It!” by Michael T. Nygard	Ch. 2
Automation hygiene	“Continuous Delivery” by Farley and Humble	Ch. 4

Common Pitfalls & Debugging

Problem 1: “Settings seem ignored”

Why: A higher-precedence layer overrides them
Fix: Annotate all layers and retest
Quick test: [command to print effective config]

Problem 2: “Same key means different things”

Why: Semantic drift across tools
Fix: Add a normalization glossary

Project 3: Prompt Contract Spec

File: P03-prompt-contract-spec.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript, Go
Coolness Level: 3
Business Potential: 4
Difficulty: 2
Knowledge Area: Prompt Engineering, Contracts
Software or Tool: All agents
Main Book: “Clean Architecture”

What you’ll build: A portable prompt contract that standardizes instructions, constraints, and output rules.

Why it teaches interoperability: A shared prompt contract is the foundation of consistent behavior.

Core challenges you’ll face:

Instruction hierarchy -> maps to system vs user prompt separation
Output schema -> maps to validation rules
Style normalization -> maps to post-processing

Real World Outcome

A prompt contract document and a validation checklist that each agent can follow.

What you will see:

Prompt template: Required sections for any task
Output schema: Required fields in responses
Compatibility notes: Exceptions per CLI

Command Line Outcome Example:

$ [command that validates prompt contract]
[expected output]

The Core Question You’re Answering

“How do you make different agents follow the same instructions?”

Concepts You Must Understand First

Instruction hierarchy
- How do system and user instructions interact?
- Book Reference: “Clean Architecture” Ch. 8
Structured outputs
- What fields must be present in every response?
Policy constraints
- Which rules must never be violated across tools?

Questions to Guide Your Design

Contract scope
- What is universal vs tool-specific?
Validation
- How do you detect missing fields or invalid formats?

Thinking Exercise

Prompt Contract Skeleton

Write a minimal contract in plain language with three required sections: goal, constraints, output.

The Interview Questions They’ll Ask

“What is a prompt contract?”
“How do you normalize outputs across agents?”
“What risks come from inconsistent prompts?”
“How do you validate agent adherence?”
“How do you handle exceptions?”

Hints in Layers

Hint 1: Starting Point List the most common prompt instructions you give today.

Hint 2: Next Level Standardize those instructions into required sections.

Hint 3: Technical Details Create a schema-like checklist that every response must satisfy.

Hint 4: Tools/Debugging Compare outputs from two agents using the same contract and record deviations.

Books That Will Help

Topic	Book	Chapter
Contract design	“Clean Architecture” by Robert C. Martin	Ch. 8
Refactoring outputs	“Refactoring” by Martin Fowler	Ch. 1
Reliability	“Release It!” by Michael T. Nygard	Ch. 3

Common Pitfalls & Debugging

Problem 1: “Agents ignore formatting rules”

Why: Contract not explicit enough
Fix: Add a strict required output checklist

Problem 2: “Output is verbose or inconsistent”

Why: No defined tone or length guidance
Fix: Add style constraints to the contract

Project 4: Tool Schema Registry

File: P04-tool-schema-registry.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript, Go
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Tooling, Schemas
Software or Tool: MCP, CLI tools
Main Book: “Designing Data-Intensive Applications”

What you’ll build: A registry that maps tool schemas across agents.

Why it teaches interoperability: Normalized tool schemas allow a single task to call tools across agents.

Core challenges you’ll face:

Schema extraction -> maps to tool introspection
Normalization -> maps to registry design
Versioning -> maps to change management

Real World Outcome

A registry file or service listing tool names, inputs, outputs, and compatibility notes.

What you will see:

Tool catalog: each tool with a normalized name
Schema mapping: input and output fields by agent
Version map: supported schema versions

The Core Question You’re Answering

“How can tools be invoked consistently across different agents?”

Concepts You Must Understand First

Schema normalization
- How do you define a canonical tool signature?
Versioning
- How do you handle incompatible changes?
Tool discovery
- How do you detect available tools in each CLI?

Questions to Guide Your Design

Registry format
- Should it be file-based, service-based, or both?
Compatibility rules
- How do you label partial support?

Thinking Exercise

Tool Mapping Table

Create a small table mapping three tools across two agents and note mismatches.

The Interview Questions They’ll Ask

“Why do you need a tool schema registry?”
“How do you handle schema drift?”
“How do you normalize tool names?”
“What is the difference between discovery and registration?”
“How do you version tool interfaces?”

Hints in Layers

Hint 1: Starting Point Pick five tools and document their inputs and outputs.

Hint 2: Next Level Define a canonical schema that all tools map to.

Hint 3: Technical Details Add version fields and compatibility notes for each mapping.

Hint 4: Tools/Debugging Simulate a schema mismatch and describe fallback behavior.

Books That Will Help

Topic	Book	Chapter
Data modeling	“Designing Data-Intensive Applications” by Martin Kleppmann	Ch. 1
Interfaces	“Clean Architecture” by Robert C. Martin	Ch. 8
Change management	“Release It!” by Michael T. Nygard	Ch. 5

Common Pitfalls & Debugging

Problem 1: “Registry becomes out of date”

Why: Tools update frequently
Fix: Add version metadata and review cadence

Problem 2: “Schema mappings are ambiguous”

Why: Vague field definitions
Fix: Add explicit field descriptions and examples

Project 5: Subagent Task Router

File: P05-subagent-task-router.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python, Go
Coolness Level: 4
Business Potential: 4
Difficulty: 3
Knowledge Area: Orchestration
Software or Tool: Subagents, task routing
Main Book: “Fundamentals of Software Architecture”

What you’ll build: A concrete task-routing spec that decides which agent handles which task, how context is packaged, and how outputs are merged into a single answer.

Why it teaches interoperability: Real interoperability is not just calling multiple tools. It is deciding when to delegate, what to send, and how to reassemble results without losing intent.

Core challenges you’ll face:

Task decomposition -> maps to architecture and boundaries
Context handoff -> maps to state and minimal contracts
Result merging -> maps to normalization and conflict resolution

Real World Outcome

A router document that defines task categories, routing rules, handoff schema, and merge rules.

What you will see:

Routing table: task type -> primary agent and fallback
Handoff schema: minimal required fields by task
Merge rules: how to combine partial answers and resolve conflicts

Example routing spec (excerpt):

routes:
  repo_search:
    primary: codex
    fallback: claude
    required_context: [repo_path, query, constraints]
  api_design:
    primary: claude
    fallback: gemini
    required_context: [requirements, existing_endpoints, auth_model]
  test_execution:
    primary: codex
    fallback: kiro
    required_context: [test_cmd, env, timeout]

handoff_schema:
  base:
    task_id: string
    goal: string
    constraints: [string]
    artifacts: [path]
    success_criteria: [string]

merge_rules:
  - If two agents disagree, prefer the one with evidence links or command output.
  - If both answers are partial, compose by section and mark unresolved items.

The Core Question You’re Answering

“When should a task be delegated and to whom?”

Concepts You Must Understand First

Task granularity
- What is a unit of work that can stand alone?
Context minimalism
- What is the smallest context that still avoids ambiguity?
Merge strategy
- How do you reconcile conflicting outputs without losing evidence?

Questions to Guide Your Design

Routing rules
- What signals determine the right agent: task type, risk, or cost?
Fallbacks
- What is the fallback path when an agent fails or times out?
Handoff schema
- Which fields are mandatory vs optional for each task type?
Merge policy
- How do you label conflicts and decide a final output?

Thinking Exercise

Delegation Tree

Draw a tree that splits a complex task into three subagent assignments. For each node, write the minimum context payload and the expected return artifact.

The Interview Questions They’ll Ask

“How do you decide when to delegate tasks?”
“What context must be passed to a subagent?”
“How do you merge conflicting outputs?”
“What is the cost of over-delegation?”
“How do you handle subagent failure?”
“How do you validate the output from an untrusted agent?”
“When is a single agent faster than a multi-agent split?”

Hints in Layers

Hint 1: Starting Point List three task types that are naturally separable and name the best agent for each.

Hint 2: Next Level Define the minimal fields each handoff must include and write a schema.

Hint 3: Technical Details Write merge rules for conflicts and missing data.

Hint 4: Tools/Debugging Run a tiny task through two agents and compare which produced evidence.

Books That Will Help

Topic	Book	Chapter
Architecture tradeoffs	“Fundamentals of Software Architecture”	Ch. 2
Resilience	“Release It!”	Ch. 4
System boundaries	“Clean Architecture”	Ch. 9

Common Pitfalls & Debugging

Problem 1: “Too many handoffs”

Why: Over-decomposition
Fix: Merge tasks until each handoff has clear purpose

Problem 2: “Subagent misses context”

Why: Handoff schema too sparse
Fix: Add required context fields

Problem 3: “Conflicting answers with no decision”

Why: No merge policy or evidence requirement
Fix: Add a rule to prefer outputs with citations or command output

Problem 4: “Fallbacks never trigger”

Why: Failure detection is vague
Fix: Define explicit timeout and error conditions for retries

Project 6: Hook Lifecycle Harness

File: P06-hook-lifecycle-harness.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python, Go
Coolness Level: 3
Business Potential: 3
Difficulty: 2
Knowledge Area: Hooks, Automation
Software or Tool: CLI hooks
Main Book: “Continuous Delivery”

What you’ll build: A harness that documents pre and post hooks across agents.

Why it teaches interoperability: Hooks are the seams where automation integrates.

Core challenges you’ll face:

Hook discovery -> maps to CLI introspection
Ordering -> maps to lifecycle modeling
Side effects -> maps to reliability concerns

Real World Outcome

A hook lifecycle diagram and checklist for each CLI tool.

What you will see:

Hook map: pre-task, post-task, error hooks
Timing notes: when hooks fire
Compatibility: which hooks are portable

The Core Question You’re Answering

“Where can automation safely intervene in each agent?”

Concepts You Must Understand First

Lifecycle events
- Which events are stable vs experimental?
Side effects
- What should never happen inside a hook?
Ordering
- How do hooks compose across tools?

Questions to Guide Your Design

Hook priority
- Which hooks should run first?
Safety boundaries
- What hooks require approval?

Thinking Exercise

Hook Timeline

Draw a timeline of one CLI execution and mark hook points.

The Interview Questions They’ll Ask

“What are hooks and why are they important?”
“What risks do hooks introduce?”
“How do you enforce ordering?”
“How do you share hook behavior across tools?”
“How do you debug hook failures?”

Hints in Layers

Hint 1: Starting Point List all hook types described in docs.

Hint 2: Next Level Map them to a shared lifecycle schema.

Hint 3: Technical Details Define a minimal payload each hook should receive.

Hint 4: Tools/Debugging Simulate a failure hook and record response.

Books That Will Help

Topic	Book	Chapter
Pipeline control	“Continuous Delivery”	Ch. 2
Operational safety	“Release It!”	Ch. 5
Automation style	“The Pragmatic Programmer”	Ch. 6

Common Pitfalls & Debugging

Problem 1: “Hooks run in the wrong order”

Why: Unclear lifecycle rules
Fix: Define and document a canonical timeline

Problem 2: “Hooks hide failures”

Why: Errors swallowed inside scripts
Fix: Make failure conditions explicit

Project 7: Extension and Plugin Compatibility Lab

File: P07-extension-and-plugin-compatibility-lab.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python
Coolness Level: 4
Business Potential: 4
Difficulty: 3
Knowledge Area: Extensions, Plugins
Software or Tool: Plugins, extensions, custom agents
Main Book: “Design Patterns”

What you’ll build: A compatibility report for extensions across agent ecosystems.

Why it teaches interoperability: Extensions often define unique features that must be bridged.

Core challenges you’ll face:

Capability mapping -> maps to adapter patterns
Version compatibility -> maps to change control
Isolation -> maps to sandboxing

Real World Outcome

A compatibility grid that shows which extensions can be adapted across tools and how.

What you will see:

Extension list: categorized by purpose
Portability rating: high, medium, low
Adapter notes: what to build or omit

The Core Question You’re Answering

“Which extension features can be made portable and which cannot?”

Concepts You Must Understand First

Adapter pattern
- How do you translate one plugin system to another?
Isolation boundaries
- What must stay sandboxed?
Versioning
- How do you manage extensions across releases?

Questions to Guide Your Design

Portability criteria
- What makes an extension portable?
Risk analysis
- Which extensions introduce security or compliance risks?

Thinking Exercise

Adapter Mapping

Pick one extension and list the hooks or tools it depends on in each CLI.

The Interview Questions They’ll Ask

“How do plugin systems differ across tools?”
“What is an adapter and why is it needed?”
“How do you assess portability?”
“What is the risk of extension lock-in?”
“How do you manage extension versions?”

Hints in Layers

Hint 1: Starting Point Select three extensions from different ecosystems.

Hint 2: Next Level Compare their APIs and dependencies.

Hint 3: Technical Details Map them into an adapter interface with optional fields.

Hint 4: Tools/Debugging Document the minimal test to confirm compatibility.

Books That Will Help

Topic	Book	Chapter
Adapter pattern	“Design Patterns” by GoF	Ch. 4
Modularity	“Clean Architecture”	Ch. 6
Risk tradeoffs	“Release It!”	Ch. 6

Common Pitfalls & Debugging

Problem 1: “Extensions assume unavailable tools”

Why: Ecosystem-specific dependencies
Fix: Add capability checks before enabling

Problem 2: “Extension output is inconsistent”

Why: Different output schemas
Fix: Normalize outputs in adapters

Project 8: MCP Gateway Prototype

File: P08-mcp-gateway-prototype.md
Main Programming Language: Go
Alternative Programming Languages: Python, JavaScript
Coolness Level: 4
Business Potential: 4
Difficulty: 4
Knowledge Area: Protocols, MCP
Software or Tool: MCP
Main Book: “Designing Data-Intensive Applications”

What you’ll build: A gateway concept that exposes shared MCP servers to multiple CLIs.

Why it teaches interoperability: MCP is a key protocol for shared tool and context access.

Core challenges you’ll face:

Protocol translation -> maps to interoperability
Security boundaries -> maps to policy design
Multi-client support -> maps to concurrency

Real World Outcome

A conceptual gateway design with message flow diagrams and connection rules.

What you will see:

Protocol diagram: client -> gateway -> MCP server
Auth mapping: how credentials are passed
Failure handling: retry and fallback rules

The Core Question You’re Answering

“How do multiple agents safely share MCP resources?”

Concepts You Must Understand First

Protocol basics
- How does MCP represent tools and resources?
Security scopes
- How do you limit access per agent?
Concurrency
- How do multiple clients share the gateway?

Questions to Guide Your Design

Gateway boundaries
- Where do you terminate auth?
Resource naming
- How do you avoid collisions?

Thinking Exercise

Gateway Request Flow

Sketch a request flow from two agents to one MCP server.

The Interview Questions They’ll Ask

“What problem does MCP solve?”
“Why use a gateway instead of direct access?”
“How do you secure shared resources?”
“How do you handle resource collisions?”
“How do you scale MCP access?”

Hints in Layers

Hint 1: Starting Point List the MCP servers you want to share.

Hint 2: Next Level Define an authentication and routing layer.

Hint 3: Technical Details Describe request envelopes and error handling rules.

Hint 4: Tools/Debugging Create a failure scenario and define a fallback response.

Books That Will Help

Topic	Book	Chapter
Protocol design	“Designing Data-Intensive Applications”	Ch. 4
Concurrency	“Operating Systems: Three Easy Pieces”	Ch. 26
Security	“Security in Computing” by Pfleeger	Ch. 3

Common Pitfalls & Debugging

Problem 1: “Gateway becomes a bottleneck”

Why: Single shared resource without scaling plan
Fix: Define sharding or replication

Problem 2: “Access rules are inconsistent”

Why: Policies applied per tool instead of per agent
Fix: Centralize policy evaluation

Project 9: Headless Batch Runner

File: P09-headless-batch-runner.md
Main Programming Language: Python
Alternative Programming Languages: Go, JavaScript
Coolness Level: 3
Business Potential: 4
Difficulty: 2
Knowledge Area: Automation
Software or Tool: Headless mode
Main Book: “Continuous Delivery”

What you’ll build: A batch execution plan that runs tasks headlessly across agents.

Why it teaches interoperability: Headless mode is critical for CI and scheduled jobs.

Core challenges you’ll face:

Input packaging -> maps to repeatable runs
Output capture -> maps to logging
Error detection -> maps to reliability

Real World Outcome

A runbook showing how to execute a queue of tasks through multiple agents without interactive steps.

What you will see:

Batch manifest: list of tasks and inputs
Output archive: standardized logs
Failure summary: auto-retry guidance

The Core Question You’re Answering

“How do you run agents in automation without human intervention?”

Concepts You Must Understand First

Headless mode
- What changes in behavior without a REPL?
Idempotency
- How do you avoid repeated damage on retries?
Output capture
- How do you store results for later analysis?

Questions to Guide Your Design

Batch inputs
- What is the minimal input schema?
Error handling
- When should you retry vs stop?

Thinking Exercise

Batch Run Scenario

Imagine three tasks failing in different stages. Define what your runner should do.

The Interview Questions They’ll Ask

“What is the difference between headless and interactive execution?”
“Why is idempotency important in batch automation?”
“How do you capture headless outputs reliably?”
“When do you retry and when do you fail fast?”
“How do you audit batch runs?”

Hints in Layers

Hint 1: Starting Point Design a task manifest with input and expected output fields.

Hint 2: Next Level Define how each agent is invoked in headless mode.

Hint 3: Technical Details Specify output capture and logging rules for each run.

Hint 4: Tools/Debugging Run a dry-run mode that prints intended actions without executing.

Books That Will Help

Topic	Book	Chapter
Automation pipelines	“Continuous Delivery”	Ch. 1-2
Reliability	“Release It!”	Ch. 1
Shell discipline	“Effective Shell” by Dave Kerr	Ch. 3

Common Pitfalls & Debugging

Problem 1: “Batch runs produce inconsistent output”

Why: Output capture not standardized
Fix: Define a single output envelope schema

Problem 2: “Retries cause side effects”

Why: Tasks are not idempotent
Fix: Add preflight checks and safe guards

Project 10: Interactive Session Recorder

File: P10-interactive-session-recorder.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 2
Business Potential: 3
Difficulty: 2
Knowledge Area: Observability
Software or Tool: CLI session logs
Main Book: “Release It!”

What you’ll build: A system that records interactive sessions and replays them as headless scripts.

Why it teaches interoperability: It bridges human exploration with automated repeatability.

Core challenges you’ll face:

Command capture -> maps to event logging
State reconstruction -> maps to reproducibility
Replay safety -> maps to approval policies

Real World Outcome

A session log format and a replay plan that turns interactive work into automation.

What you will see:

Session log: ordered steps with timestamps
Replay plan: steps converted to batch tasks
Safety notes: checkpoints for risky actions

The Core Question You’re Answering

“How do you convert human-driven sessions into automated runs?”

Concepts You Must Understand First

Event logging
- What should be captured and why?
Determinism
- What changes between interactive and headless runs?
Safety checkpoints
- Which steps require human approval?

Questions to Guide Your Design

Log schema
- What fields are required for replay?
Replay validation
- How do you detect drift?

Thinking Exercise

Replay Checklist

List five interactive actions and decide if each can be safely replayed.

The Interview Questions They’ll Ask

“What is the value of session recording?”
“How do you handle nondeterminism?”
“What is the risk of replaying interactive actions?”
“How do you store session logs safely?”
“How do you verify a replay succeeded?”

Hints in Layers

Hint 1: Starting Point Start with a simple timestamped log of actions.

Hint 2: Next Level Define a replayable action schema.

Hint 3: Technical Details Add environment snapshot fields to reduce drift.

Hint 4: Tools/Debugging Compare outputs between interactive and replay runs.

Books That Will Help

Topic	Book	Chapter
Reliability patterns	“Release It!”	Ch. 2
Automation discipline	“Continuous Delivery”	Ch. 3
Logging practices	“Designing Data-Intensive Applications”	Ch. 11

Common Pitfalls & Debugging

Problem 1: “Replays fail because of drift”

Why: Environment not captured
Fix: Log environment metadata with each session

Problem 2: “Session logs are incomplete”

Why: Missing step types or artifacts
Fix: Add required fields to the log schema

Project 11: Approval Policy Simulator

File: P11-approval-policy-simulator.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Safety, Governance
Software or Tool: Approval policies
Main Book: “Release It!”

What you’ll build: A simulator that shows when approvals are required across agents.

Why it teaches interoperability: A unified automation system must follow the strictest approval policy.

Core challenges you’ll face:

Policy mapping -> maps to rules engines
Action classification -> maps to risk assessment
Human checkpoints -> maps to governance

Real World Outcome

A policy matrix showing which actions need approval per agent and a common enforcement rule.

What you will see:

Policy table: action -> approval required
Risk tiers: low, medium, high
Unified rule: safest default policy

The Core Question You’re Answering

“How do you enforce safety across agents with different rules?”

Concepts You Must Understand First

Risk classification
- How do you rate actions by potential impact?
Policy precedence
- Which rule wins when policies conflict?
Human approval flow
- How do you record approvals?

Questions to Guide Your Design

Policy schema
- What fields define an approval rule?
Enforcement strategy
- Where do you enforce the strictest policy?

Thinking Exercise

Approval Scenarios

List five actions and assign a required approval level for each.

The Interview Questions They’ll Ask

“Why do you need approval policies for agents?”
“How do you handle conflicting safety rules?”
“What is the strictest-policy principle?”
“How do you audit approvals?”
“How do you avoid blocking safe tasks?”

Hints in Layers

Hint 1: Starting Point Collect policy descriptions from each agent.

Hint 2: Next Level Normalize them into a single schema.

Hint 3: Technical Details Define a resolution rule that always selects the strictest policy.

Hint 4: Tools/Debugging Test the rule with sample actions and verify expected approvals.

Books That Will Help

Topic	Book	Chapter
Risk awareness	“Release It!”	Ch. 3
Governance	“Clean Architecture”	Ch. 12
Delivery controls	“Continuous Delivery”	Ch. 9

Common Pitfalls & Debugging

Problem 1: “Policies are inconsistent”

Why: Different agents use different categories
Fix: Normalize into a unified risk taxonomy

Problem 2: “Too many approvals”

Why: Overly strict defaults
Fix: Add exceptions with clear justification

Project 12: Sandbox Matrix Auditor

File: P12-sandbox-matrix-auditor.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 2
Business Potential: 3
Difficulty: 2
Knowledge Area: Security, Sandboxing
Software or Tool: Sandboxing models
Main Book: “Security in Computing”

What you’ll build: A matrix that compares sandbox boundaries across agents.

Why it teaches interoperability: You need consistent safety boundaries when chaining tools.

Core challenges you’ll face:

Boundary mapping -> maps to security analysis
Capability restrictions -> maps to policy enforcement
Escalation paths -> maps to approval flows

Real World Outcome

A sandbox comparison chart with action categories and allowed scopes.

What you will see:

Sandbox table: file, network, process access
Escalation rules: when approvals are required
Unified baseline: least-privilege defaults

The Core Question You’re Answering

“What is the safest common sandbox that still enables automation?”

Concepts You Must Understand First

Least privilege
- What is the minimal access needed?
Escalation
- When can permissions be raised safely?
Auditability
- How do you log privileged actions?

Questions to Guide Your Design

Baseline policy
- Which sandbox rules must always apply?
Exceptions
- How do you document and approve exceptions?

Thinking Exercise

Sandbox Gap Analysis

List three tasks and identify which agent supports each safely.

The Interview Questions They’ll Ask

“What is the least-privilege principle?”
“How do sandbox rules differ across tools?”
“When should you allow escalation?”
“How do you audit privileged actions?”
“What is the risk of permissive defaults?”

Hints in Layers

Hint 1: Starting Point List the default sandbox settings of each CLI.

Hint 2: Next Level Compare them against a strict baseline.

Hint 3: Technical Details Define an escalation process with approvals.

Hint 4: Tools/Debugging Try a restricted action in each CLI and record behavior.

Books That Will Help

Topic	Book	Chapter
Security fundamentals	“Security in Computing”	Ch. 2
Operational safety	“Release It!”	Ch. 6
Risk controls	“Clean Architecture”	Ch. 12

Common Pitfalls & Debugging

Problem 1: “Sandbox is too permissive”

Why: Defaults prioritized convenience
Fix: Adopt a strict baseline and add exceptions sparingly

Problem 2: “Automation fails due to restrictions”

Why: Missing approvals or unclear escalation
Fix: Document explicit escalation flows

Project 13: Output Style Normalizer

File: P13-output-style-normalizer.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 3
Business Potential: 3
Difficulty: 2
Knowledge Area: Output Contracts
Software or Tool: Output formatting
Main Book: “Refactoring”

What you’ll build: A normalization guide that enforces consistent output style across agents.

Why it teaches interoperability: Consistent outputs allow automation to parse responses reliably.

Core challenges you’ll face:

Style extraction -> maps to prompt contracts
Normalization rules -> maps to schema design
Edge cases -> maps to validation

Real World Outcome

A standard output template with examples and validation checks.

What you will see:

Required sections: consistent headings and fields
Normalization rules: how to handle deviations
Validation checklist: confirm outputs are parseable

The Core Question You’re Answering

“How do you make responses machine-friendly across different agents?”

Concepts You Must Understand First

Schema validation
- What fields must always appear?
Style constraints
- How do you keep responses concise and consistent?
Parsing reliability
- What breaks downstream automation?

Questions to Guide Your Design

Template design
- What is the minimal parseable structure?
Error handling
- What do you do when the template is violated?

Thinking Exercise

Output Checklist

Create a checklist of five validation rules for responses.

The Interview Questions They’ll Ask

“Why does output normalization matter in automation?”
“What makes a response machine-parseable?”
“How do you handle agent deviations?”
“What are the risks of flexible outputs?”
“How do you detect schema violations?”

Hints in Layers

Hint 1: Starting Point Define the sections you always want in outputs.

Hint 2: Next Level Create a strict checklist with required keys.

Hint 3: Technical Details Document a fallback mapping for nonconforming outputs.

Hint 4: Tools/Debugging Test with two agents and compare output structures.

Books That Will Help

Topic	Book	Chapter
Refactoring structure	“Refactoring” by Martin Fowler	Ch. 1
Contract enforcement	“Clean Architecture”	Ch. 9
Data correctness	“Designing Data-Intensive Applications”	Ch. 2

Common Pitfalls & Debugging

Problem 1: “Outputs are verbose and inconsistent”

Why: No defined template
Fix: Add required sections and limit free-form text

Problem 2: “Parser breaks on edge cases”

Why: Missing fields or unexpected ordering
Fix: Add validation and reformat rules

Project 14: Multi-Agent Logging Standard

File: P14-multi-agent-logging-standard.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Observability
Software or Tool: Logging
Main Book: “Designing Data-Intensive Applications”

What you’ll build: A unified log event schema for all agent actions.

Why it teaches interoperability: Logs are the glue that lets you trace actions across systems.

Core challenges you’ll face:

Event schema -> maps to data modeling
Correlation IDs -> maps to traceability
Privacy rules -> maps to governance

Real World Outcome

A log schema document and a sample log stream for multi-agent workflows.

What you will see:

Event types: prompt, tool-call, output, error
Correlation fields: run ID, task ID, agent ID
Redaction rules: what to hide

The Core Question You’re Answering

“How do you trace a task across multiple agents?”

Concepts You Must Understand First

Event schemas
- What minimal fields are required?
Correlation identifiers
- How do you link events across agents?
Redaction
- How do you protect sensitive data?

Questions to Guide Your Design

Schema coverage
- Which actions must always be logged?
Retention
- How long should logs be stored?

Thinking Exercise

Trace a Task

Write a list of events that should appear for a single task across two agents.

The Interview Questions They’ll Ask

“What fields are critical in agent logs?”
“How do you correlate events across tools?”
“What is the risk of logging too much?”
“How do you redact sensitive data?”
“How do logs help debugging?”

Hints in Layers

Hint 1: Starting Point Define a minimal event schema with required fields.

Hint 2: Next Level Add correlation IDs and severity levels.

Hint 3: Technical Details Specify redaction rules for prompts and outputs.

Hint 4: Tools/Debugging Replay a log stream and verify task reconstruction.

Books That Will Help

Topic	Book	Chapter
Event modeling	“Designing Data-Intensive Applications”	Ch. 11
Reliability	“Release It!”	Ch. 8
Governance	“Clean Architecture”	Ch. 12

Common Pitfalls & Debugging

Problem 1: “Logs are not correlated”

Why: Missing consistent IDs
Fix: Add required correlation fields

Problem 2: “Sensitive data leaked”

Why: No redaction policy
Fix: Add redaction rules and auditing

Project 15: Error Taxonomy and Retry Controller

File: P15-error-taxonomy-and-retry-controller.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Reliability
Software or Tool: Retry policies
Main Book: “Release It!”

What you’ll build: A taxonomy of errors and a retry policy matrix for agents.

Why it teaches interoperability: Errors vary across CLIs and must be normalized for automation.

Core challenges you’ll face:

Error classification -> maps to reliability
Retry strategy -> maps to resilience design
Escalation rules -> maps to governance

Real World Outcome

A standardized error catalog and guidance on retries vs hard stops.

What you will see:

Error categories: transient, permanent, policy
Retry policy: allowed retries per category
Escalation rules: when to notify humans

The Core Question You’re Answering

“Which failures should be retried and which should stop automation?”

Concepts You Must Understand First

Failure modes
- How do you tell transient from permanent errors?
Backoff strategies
- How do you avoid retry storms?
Escalation
- When does a human need to intervene?

Questions to Guide Your Design

Error taxonomy
- What are the key error categories across tools?
Policy mapping
- How do you map tool-specific errors to categories?

Thinking Exercise

Error Mapping

Write three example errors and map them to categories.

The Interview Questions They’ll Ask

“Why do you need an error taxonomy?”
“How do you choose retry policies?”
“What is the risk of retrying everything?”
“How do you detect permanent failures?”
“How do you log error decisions?”

Hints in Layers

Hint 1: Starting Point Collect common error messages across agents.

Hint 2: Next Level Group them into a small set of categories.

Hint 3: Technical Details Define retry limits and escalation triggers.

Hint 4: Tools/Debugging Simulate a transient failure and confirm retry logic.

Books That Will Help

Topic	Book	Chapter
Failure handling	“Release It!”	Ch. 4
Resilient systems	“Designing Data-Intensive Applications”	Ch. 8
Operational discipline	“Continuous Delivery”	Ch. 8

Common Pitfalls & Debugging

Problem 1: “Retries cause duplicate effects”

Why: No idempotency checks
Fix: Add idempotency tokens or preflight checks

Problem 2: “Errors are misclassified”

Why: Overly broad categories
Fix: Refine categories and add examples

Project 16: Context Budget Planner

File: P16-context-budget-planner.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 4
Business Potential: 4
Difficulty: 3
Knowledge Area: Context Management
Software or Tool: Memory and context
Main Book: “AI Engineering”

What you’ll build: A planner that budgets context usage across agents and tasks.

Why it teaches interoperability: Different agents have different context limits and memory approaches.

Core challenges you’ll face:

Context sizing -> maps to token budgeting
Summarization rules -> maps to compression
Priority signals -> maps to task design

Real World Outcome

A context budget worksheet that shows how to allocate context to tasks and agents.

What you will see:

Budget table: task -> context allocation
Summarization rules: what to shrink
Overflow plan: fallback when context is exceeded

The Core Question You’re Answering

“What context is essential, and what can be summarized or externalized?”

Concepts You Must Understand First

Context windows
- How do different agents limit input size?
Summarization tradeoffs
- What information is safe to compress?
External memory
- When should you store context outside the prompt?

Questions to Guide Your Design

Budget strategy
- How do you allocate context across tasks?
Overflow plan
- What happens when context is too large?

Thinking Exercise

Context Triage

List ten pieces of context and rank them by importance.

The Interview Questions They’ll Ask

“What is a context budget and why does it matter?”
“How do you summarize safely?”
“What is the risk of context overflow?”
“How do you decide what to keep?”
“How does external memory help?”

Hints in Layers

Hint 1: Starting Point Measure how much context each agent supports.

Hint 2: Next Level Create a priority list of context elements.

Hint 3: Technical Details Define rules for summarization and external storage.

Hint 4: Tools/Debugging Test a large task and see what context must be trimmed.

Books That Will Help

Topic	Book	Chapter
AI systems	“AI Engineering” by Chip Huyen	Ch. 2
Data summarization	“Designing Data-Intensive Applications”	Ch. 11
Communication clarity	“The Pragmatic Programmer”	Ch. 8

Common Pitfalls & Debugging

Problem 1: “Critical context is lost”

Why: Poor prioritization
Fix: Rank context by impact on task correctness

Problem 2: “Summaries are too vague”

Why: No clear summarization rules
Fix: Define summary templates

Project 17: Memory Import and Export Bridge

File: P17-memory-import-and-export-bridge.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 4
Business Potential: 4
Difficulty: 3
Knowledge Area: Memory Management
Software or Tool: Local memory systems
Main Book: “Designing Data-Intensive Applications”

What you’ll build: A portability guide for moving memory across agent ecosystems.

Why it teaches interoperability: Shared memory and persistence are crucial for long-running automation.

Core challenges you’ll face:

Data model alignment -> maps to schema conversion
Privacy handling -> maps to governance
Conflict resolution -> maps to versioning

Real World Outcome

A memory export format with a mapping to each agent’s storage scheme.

What you will see:

Memory schema: shared fields and optional fields
Import rules: how to transform data
Conflict policy: how to merge duplicates

The Core Question You’re Answering

“How do you move durable knowledge between agents without losing meaning?”

Concepts You Must Understand First

Memory schemas
- What fields are common across systems?
Privacy and retention
- What should be excluded or anonymized?
Versioning
- How do you handle changes to memory format?

Questions to Guide Your Design

Export format
- What is the minimal portable format?
Merge strategy
- How do you avoid duplicating memories?

Thinking Exercise

Memory Record Example

Write a hypothetical memory record and map it to two tools.

The Interview Questions They’ll Ask

“What is the difference between context and memory?”
“Why is memory portability hard?”
“How do you handle conflicting memories?”
“How do you ensure privacy in memory export?”
“What versioning strategy would you use?”

Hints in Layers

Hint 1: Starting Point List the memory storage locations for each CLI.

Hint 2: Next Level Create a portable schema with required fields.

Hint 3: Technical Details Define merge and conflict resolution rules.

Hint 4: Tools/Debugging Test import with a small memory sample.

Books That Will Help

Topic	Book	Chapter
Data migration	“Designing Data-Intensive Applications”	Ch. 3
Privacy	“Security in Computing”	Ch. 5
Data modeling	“Clean Architecture”	Ch. 10

Common Pitfalls & Debugging

Problem 1: “Memory records lose meaning”

Why: Missing metadata fields
Fix: Add context fields for origin and purpose

Problem 2: “Conflicting records overwrite each other”

Why: No merge policy
Fix: Add conflict resolution rules

Project 18: Cross-Agent Workspace Sync

File: P18-cross-agent-workspace-sync.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 3
Business Potential: 3
Difficulty: 3
Knowledge Area: File System, Sync
Software or Tool: Workspace synchronization
Main Book: “Designing Data-Intensive Applications”

What you’ll build: A synchronization plan for sharing workspace state across agents.

Why it teaches interoperability: Agents must see the same file system state to collaborate.

Core challenges you’ll face:

State drift -> maps to consistency models
Conflict resolution -> maps to merge strategy
Change detection -> maps to event logging

Real World Outcome

A documented sync strategy and a conflict resolution playbook.

What you will see:

Sync model: push, pull, or bidirectional
Conflict rules: which changes win
Snapshot plan: how to capture state

The Core Question You’re Answering

“How do agents stay consistent in the same workspace?”

Concepts You Must Understand First

Consistency models
- What is eventual vs strong consistency?
Conflict resolution
- How do you decide which edit wins?
Change detection
- How do you detect drift early?

Questions to Guide Your Design

Sync scope
- Which files are shared vs ignored?
Conflict strategy
- When do you require human review?

Thinking Exercise

Conflict Scenario

Imagine two agents edit the same file differently. Define your resolution rule.

The Interview Questions They’ll Ask

“What is workspace drift?”
“How do you handle concurrent edits?”
“Why does consistency matter for agents?”
“What is your conflict resolution strategy?”
“How do you prevent hidden changes?”

Hints in Layers

Hint 1: Starting Point Define which directories are shared and which are agent-specific.

Hint 2: Next Level Choose a consistency model and document it.

Hint 3: Technical Details Define a merge policy for conflicts.

Hint 4: Tools/Debugging Simulate a conflict and verify resolution rules.

Books That Will Help

Topic	Book	Chapter
Consistency models	“Designing Data-Intensive Applications”	Ch. 5
Version control	“Working Effectively with Legacy Code”	Ch. 2
System boundaries	“Clean Architecture”	Ch. 9

Common Pitfalls & Debugging

Problem 1: “Agents overwrite each other”

Why: No conflict policy
Fix: Define merge rules and require human review for conflicts

Problem 2: “Sync misses important files”

Why: Poor include/exclude rules
Fix: Document explicit inclusion lists

Project 19: Secrets Broker Shim

File: P19-secrets-broker-shim.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Security
Software or Tool: Credential handling
Main Book: “Security in Computing”

What you’ll build: A plan for handling secrets across multiple agent CLIs.

Why it teaches interoperability: Secrets must be shared safely and consistently.

Core challenges you’ll face:

Secret storage -> maps to security policy
Injection methods -> maps to tool configuration
Audit trails -> maps to governance

Real World Outcome

A secrets broker strategy with allowed storage, access methods, and audit logging.

What you will see:

Secret inventory: what secrets are needed
Access rules: who can access them
Audit plan: how access is recorded

The Core Question You’re Answering

“How do you share secrets safely across different agent tools?”

Concepts You Must Understand First

Secret lifecycle
- How are secrets created, rotated, revoked?
Least privilege
- What is the minimal scope for each secret?
Audit logging
- How do you track secret access?

Questions to Guide Your Design

Storage rules
- Where are secrets stored and who owns them?
Injection paths
- How do you pass secrets into agent runs?

Thinking Exercise

Secret Inventory

List five secrets used in a typical automation pipeline and decide their scope.

The Interview Questions They’ll Ask

“How do you manage secrets across tools?”
“What is least privilege in the context of agents?”
“How do you prevent secret leakage?”
“What is the role of audit logging?”
“How do you rotate secrets safely?”

Hints in Layers

Hint 1: Starting Point Identify where each CLI expects credentials.

Hint 2: Next Level Define a broker that injects secrets at runtime only.

Hint 3: Technical Details Specify rotation and revocation rules.

Hint 4: Tools/Debugging Audit a run and confirm no secrets are logged.

Books That Will Help

Topic	Book	Chapter
Security basics	“Security in Computing”	Ch. 4
Operational controls	“Release It!”	Ch. 7
Governance	“Clean Architecture”	Ch. 12

Common Pitfalls & Debugging

Problem 1: “Secrets leak into logs”

Why: No redaction policy
Fix: Add redaction and scanning

Problem 2: “Agents can’t access secrets”

Why: Misconfigured injection paths
Fix: Document and test access paths

Project 20: Test Harness for Agents

File: P20-test-harness-for-agents.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 4
Business Potential: 4
Difficulty: 4
Knowledge Area: Testing, Evaluation
Software or Tool: Testing harness
Main Book: “Clean Architecture”

What you’ll build: A standardized test harness to evaluate agent outputs.

Why it teaches interoperability: You need consistent benchmarks to compare agents.

Core challenges you’ll face:

Test case design -> maps to evaluation quality
Output verification -> maps to contract checks
Regression tracking -> maps to version control

Real World Outcome

A suite of test cases with expected outputs and evaluation criteria.

What you will see:

Test catalog: tasks grouped by complexity
Evaluation rules: what counts as success
Regression logs: differences across versions

The Core Question You’re Answering

“How do you measure agent performance consistently?”

Concepts You Must Understand First

Test design
- How do you choose representative tasks?
Evaluation criteria
- What metrics matter for correctness and quality?
Regression tracking
- How do you detect changes over time?

Questions to Guide Your Design

Test scope
- What tasks define baseline interoperability?
Metrics
- How do you score and compare results?

Thinking Exercise

Test Matrix

List five tasks and define success criteria for each.

The Interview Questions They’ll Ask

“What is a good agent benchmark?”
“How do you measure correctness?”
“What is regression testing for agents?”
“How do you avoid biased test cases?”
“How do you score outputs?”

Hints in Layers

Hint 1: Starting Point Define a small set of tasks that all agents can do.

Hint 2: Next Level Create a rubric for success and failure.

Hint 3: Technical Details Add a versioned results store for comparisons.

Hint 4: Tools/Debugging Run the same tests on two agents and compare results.

Books That Will Help

Topic	Book	Chapter
Testing philosophy	“Clean Architecture”	Ch. 20
Metrics	“Accelerate”	Ch. 3
Continuous testing	“Continuous Delivery”	Ch. 7

Common Pitfalls & Debugging

Problem 1: “Tests are not representative”

Why: Narrow task set
Fix: Expand to include different task types

Problem 2: “Scores are inconsistent”

Why: Ambiguous criteria
Fix: Define explicit pass/fail rules

Project 21: Prompt Injection Red Team Lab

File: P21-prompt-injection-red-team-lab.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 4
Business Potential: 3
Difficulty: 4
Knowledge Area: Security
Software or Tool: Prompt security
Main Book: “Security in Computing”

What you’ll build: A lab of adversarial prompt cases for multiple agents.

Why it teaches interoperability: Security is only as strong as the weakest agent in the chain.

Core challenges you’ll face:

Threat modeling -> maps to security basics
Test case generation -> maps to evaluation
Mitigation rules -> maps to governance

Real World Outcome

A red team checklist and a set of adversarial prompts with mitigation notes.

What you will see:

Attack categories: injection, data exfiltration, policy bypass
Test cases: prompts designed to stress safety
Mitigations: filters and policy rules

The Core Question You’re Answering

“How do you test and harden multi-agent systems against prompt attacks?”

Concepts You Must Understand First

Threat modeling
- What assets are at risk?
Prompt injection
- How does instruction hierarchy get subverted?
Mitigation strategies
- What defenses are realistic and effective?

Questions to Guide Your Design

Attack coverage
- What attacks are most relevant to coding tasks?
Defense mapping
- Which mitigations apply to each agent?

Thinking Exercise

Attack Surface Map

List the parts of a task that could be manipulated by malicious input.

The Interview Questions They’ll Ask

“What is prompt injection and why is it risky?”
“How do you test for prompt injection?”
“What is a realistic mitigation strategy?”
“How do you enforce safe outputs?”
“What is the weakest link in a multi-agent chain?”

Hints in Layers

Hint 1: Starting Point Collect known attack patterns from security notes.

Hint 2: Next Level Categorize them by impact and likelihood.

Hint 3: Technical Details Define a test plan with expected safe responses.

Hint 4: Tools/Debugging Run the same attack across multiple agents and compare results.

Books That Will Help

Topic	Book	Chapter
Security basics	“Security in Computing”	Ch. 1-3
Risk assessment	“Release It!”	Ch. 6
Governance	“Clean Architecture”	Ch. 12

Common Pitfalls & Debugging

Problem 1: “Test cases are too trivial”

Why: Lack of realistic attack patterns
Fix: Use layered attacks and indirect prompts

Problem 2: “Mitigations break normal tasks”

Why: Overly strict rules
Fix: Add exception handling with logging

Project 22: Multi-Agent Code Review Pipeline

File: P22-multi-agent-code-review-pipeline.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python
Coolness Level: 4
Business Potential: 4
Difficulty: 3
Knowledge Area: Code Review
Software or Tool: Review automation
Main Book: “Clean Code”

What you’ll build: A workflow that routes code review tasks across multiple agents.

Why it teaches interoperability: Code review benefits from agent specialization and consistent standards.

Core challenges you’ll face:

Review criteria -> maps to coding standards
Conflict resolution -> maps to merging feedback
Bias control -> maps to evaluation

Real World Outcome

A review pipeline where each agent checks different aspects and results are merged.

What you will see:

Review lanes: style, correctness, security, performance
Merge rules: combine feedback with deduping
Final report: structured output for humans

The Core Question You’re Answering

“How do you combine multiple agent reviews into one consistent result?”

Concepts You Must Understand First

Review criteria
- What issues should always be flagged?
Feedback normalization
- How do you merge duplicates?
Bias detection
- How do you reduce inconsistent feedback?

Questions to Guide Your Design

Division of labor
- Which agent should check what?
Merge policy
- How do you resolve conflicting feedback?

Thinking Exercise

Review Rubric

Draft a rubric with four categories and two checks each.

The Interview Questions They’ll Ask

“Why use multiple agents for code review?”
“How do you avoid duplicate feedback?”
“What is a review rubric?”
“How do you handle conflicting suggestions?”
“How do you measure review quality?”

Hints in Layers

Hint 1: Starting Point Assign each agent a review specialty.

Hint 2: Next Level Define a merge policy with priority rules.

Hint 3: Technical Details Create a standard report format for final output.

Hint 4: Tools/Debugging Compare merged output to a human review.

Books That Will Help

Topic	Book	Chapter
Code quality	“Clean Code” by Robert C. Martin	Ch. 2
Refactoring feedback	“Refactoring”	Ch. 2
Reliability	“Release It!”	Ch. 4

Common Pitfalls & Debugging

Problem 1: “Review feedback conflicts”

Why: Overlapping responsibilities
Fix: Define clear role boundaries

Problem 2: “Too much noise”

Why: No severity thresholds
Fix: Add severity levels and filtering

Project 23: Issue Triage Mesh

File: P23-issue-triage-mesh.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Workflow Automation
Software or Tool: Issue trackers
Main Book: “The Pragmatic Programmer”

What you’ll build: A triage workflow that assigns issues to agent specialties.

Why it teaches interoperability: It forces you to route real work across agent capabilities.

Core challenges you’ll face:

Issue classification -> maps to taxonomy design
Routing logic -> maps to orchestration
Feedback loop -> maps to continuous improvement

Real World Outcome

A triage flow that labels, prioritizes, and assigns issues to appropriate agents.

What you will see:

Issue categories: bug, feature, refactor, docs
Routing rules: category -> agent
Metrics: time to resolution

The Core Question You’re Answering

“Which agent should handle each kind of issue?”

Concepts You Must Understand First

Issue taxonomy
- What categories exist in your project?
Routing metrics
- What signals indicate best agent fit?
Feedback loops
- How do you improve routing over time?

Questions to Guide Your Design

Triage rules
- Which fields trigger routing decisions?
Assignment policy
- How do you avoid overload on one agent?

Thinking Exercise

Issue Mapping

Take five real issues and map them to agents with reasons.

The Interview Questions They’ll Ask

“Why automate issue triage?”
“How do you categorize issues?”
“What data do you need for routing?”
“How do you measure triage quality?”
“How do you handle ambiguous issues?”

Hints in Layers

Hint 1: Starting Point Define a small set of categories and map them to agents.

Hint 2: Next Level Add priority levels and escalation rules.

Hint 3: Technical Details Record assignment outcomes to refine routing.

Hint 4: Tools/Debugging Compare automated triage to human decisions.

Books That Will Help

Topic	Book	Chapter
Workflow design	“The Pragmatic Programmer”	Ch. 7
Process improvement	“Accelerate”	Ch. 5
Architecture	“Fundamentals of Software Architecture”	Ch. 3

Common Pitfalls & Debugging

Problem 1: “Issues routed to wrong agent”

Why: Category mapping too coarse
Fix: Add more granular categories

Problem 2: “Agents overloaded”

Why: No balancing strategy
Fix: Add capacity limits and queues

Project 24: Documentation Generator Federation

File: P24-documentation-generator-federation.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python
Coolness Level: 3
Business Potential: 3
Difficulty: 3
Knowledge Area: Documentation Automation
Software or Tool: Docs generation
Main Book: “Clean Architecture”

What you’ll build: A system that uses multiple agents to generate documentation consistently.

Why it teaches interoperability: Documentation tasks require consistency across agents and outputs.

Core challenges you’ll face:

Style consistency -> maps to output normalization
Source alignment -> maps to context management
Review process -> maps to governance

Real World Outcome

A documentation pipeline where each agent drafts a section and results are merged.

What you will see:

Doc outline: sections assigned per agent
Style guide: unified tone and formatting rules
Merge report: consolidated output

The Core Question You’re Answering

“How do you ensure documentation is consistent across agents?”

Concepts You Must Understand First

Style guides
- What rules define consistency?
Source-of-truth
- How do you ensure agents reference the same facts?
Review workflow
- Who approves the final output?

Questions to Guide Your Design

Outline ownership
- Which agent writes which section?
Merge rules
- How do you handle overlaps?

Thinking Exercise

Documentation Contract

Draft a short style guide with tone, structure, and required sections.

The Interview Questions They’ll Ask

“Why use multiple agents for docs?”
“How do you enforce a consistent style?”
“How do you prevent factual drift?”
“How do you merge sections safely?”
“How do you validate documentation quality?”

Hints in Layers

Hint 1: Starting Point Create a clear outline with section owners.

Hint 2: Next Level Define a style guide with required elements.

Hint 3: Technical Details Use a merge checklist that rejects inconsistent sections.

Hint 4: Tools/Debugging Compare outputs for tone and structure alignment.

Books That Will Help

Topic	Book	Chapter
Consistency	“Clean Architecture”	Ch. 5
Editing discipline	“The Pragmatic Programmer”	Ch. 8
Reliability	“Release It!”	Ch. 7

Common Pitfalls & Debugging

Problem 1: “Docs have inconsistent tone”

Why: No shared style guide
Fix: Add a mandatory style contract

Problem 2: “Docs contradict each other”

Why: Different sources used
Fix: Define a single source-of-truth

Project 25: Repo Indexing Strategy

File: P25-repo-indexing-strategy.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 3
Business Potential: 4
Difficulty: 4
Knowledge Area: Code Intelligence
Software or Tool: Code indexing
Main Book: “Designing Data-Intensive Applications”

What you’ll build: An indexing plan that helps multiple agents navigate large repositories.

Why it teaches interoperability: Shared indexing prevents redundant scanning and reduces context waste.

Core challenges you’ll face:

Index design -> maps to search systems
Update strategy -> maps to consistency
Access controls -> maps to security

Real World Outcome

A repository indexing plan with update cadence and access rules.

What you will see:

Index schema: files, symbols, dependencies
Update policy: incremental vs full rebuild
Access rules: who can query what

The Core Question You’re Answering

“How do agents share repository knowledge efficiently?”

Concepts You Must Understand First

Indexing basics
- What data should be indexed?
Incremental updates
- How do you update indexes after changes?
Access control
- How do you limit sensitive data exposure?

Questions to Guide Your Design

Index scope
- Which files are worth indexing?
Query interface
- How do agents access the index?

Thinking Exercise

Index Scope

List the top five file types you want indexed and why.

The Interview Questions They’ll Ask

“Why is indexing important for large repos?”
“How do you keep indexes up to date?”
“What is the tradeoff between full and incremental indexing?”
“How do you protect sensitive files?”
“How does indexing reduce context usage?”

Hints in Layers

Hint 1: Starting Point Start with file lists and dependency graphs.

Hint 2: Next Level Define incremental update triggers.

Hint 3: Technical Details Add access controls and query limits.

Hint 4: Tools/Debugging Test index queries against a known file change.

Books That Will Help

Topic	Book	Chapter
Index design	“Designing Data-Intensive Applications”	Ch. 3
Search data	“Algorithms” by Sedgewick	Ch. 5
Security	“Security in Computing”	Ch. 6

Common Pitfalls & Debugging

Problem 1: “Index is stale”

Why: Updates are manual
Fix: Add automatic update triggers

Problem 2: “Index leaks sensitive data”

Why: No access control
Fix: Restrict indexed fields and queries

Project 26: Skill and Prompt Pack Manager

File: P26-skill-and-prompt-pack-manager.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python
Coolness Level: 4
Business Potential: 4
Difficulty: 3
Knowledge Area: Prompt Management
Software or Tool: Skills and prompt packs
Main Book: “The Pragmatic Programmer”

What you’ll build: A system for packaging prompts and skills for reuse across agents.

Why it teaches interoperability: Portable skill packs reduce duplication and standardize workflows.

Core challenges you’ll face:

Packaging format -> maps to portability
Versioning -> maps to change control
Distribution -> maps to governance

Real World Outcome

A structured prompt pack format with versioning and compatibility notes.

What you will see:

Pack schema: metadata, prompts, instructions
Version rules: semantic versioning guidelines
Compatibility map: which agents support which packs

The Core Question You’re Answering

“How do you reuse prompts and skills across different CLIs?”

Concepts You Must Understand First

Prompt modularity
- How do you make prompts composable?
Versioning
- How do you evolve packs safely?
Distribution
- How do teams share and trust prompt packs?

Questions to Guide Your Design

Pack structure
- What metadata is required?
Compatibility rules
- How do you mark agent-specific constraints?

Thinking Exercise

Pack Outline

Outline a prompt pack for code review that includes metadata and rules.

The Interview Questions They’ll Ask

“What is a prompt pack?”
“How do you make prompts portable?”
“What is semantic versioning used for?”
“How do you distribute prompt packs safely?”
“How do you manage compatibility?”

Hints in Layers

Hint 1: Starting Point Define a pack with name, version, and prompt list.

Hint 2: Next Level Add compatibility notes for each agent.

Hint 3: Technical Details Create a validation checklist for pack structure.

Hint 4: Tools/Debugging Try importing the pack into two different CLIs.

Books That Will Help

Topic	Book	Chapter
Reusability	“The Pragmatic Programmer”	Ch. 4
Interfaces	“Clean Architecture”	Ch. 8
Release discipline	“Continuous Delivery”	Ch. 5

Common Pitfalls & Debugging

Problem 1: “Packs work only for one agent”

Why: Hidden assumptions in prompts
Fix: Add explicit compatibility notes

Problem 2: “Pack versions drift”

Why: No version policy
Fix: Use semantic versioning and changelogs

Project 27: Cross-CLI Command Adapter

File: P27-cross-cli-command-adapter.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 4
Business Potential: 4
Difficulty: 4
Knowledge Area: CLI Interop
Software or Tool: CLI adapters
Main Book: “Design Patterns”

What you’ll build: An adapter spec that maps equivalent commands across CLIs.

Why it teaches interoperability: Command mapping is the core of automation portability.

Core challenges you’ll face:

Command translation -> maps to adapter design
Argument normalization -> maps to schema design
Error mapping -> maps to reliability

Real World Outcome

A command adapter table with canonical commands and tool-specific mappings.

What you will see:

Command glossary: canonical actions
Mapping table: CLI-specific flags
Fallback rules: what to do when no match exists

The Core Question You’re Answering

“How can one automation script run on multiple agent CLIs?”

Concepts You Must Understand First

Adapter pattern
- How do you translate one interface into another?
Argument normalization
- How do you standardize flag meanings?
Error mapping
- How do you unify error responses?

Questions to Guide Your Design

Canonical commands
- What is the minimal command set?
Fallbacks
- What happens when a CLI lacks a command?

Thinking Exercise

Command Translation

Pick two commands from two CLIs and map them to a canonical action.

The Interview Questions They’ll Ask

“What is an adapter and why is it used?”
“How do you normalize CLI arguments?”
“What is a canonical command set?”
“How do you handle missing features?”
“How do you verify adapter correctness?”

Hints in Layers

Hint 1: Starting Point List core commands used in daily workflows.

Hint 2: Next Level Define canonical names for each action.

Hint 3: Technical Details Map each CLI command to canonical actions with notes.

Hint 4: Tools/Debugging Test translation by comparing outputs across agents.

Books That Will Help

Topic	Book	Chapter
Adapter pattern	“Design Patterns”	Ch. 4
Interface design	“Clean Architecture”	Ch. 8
Reliability	“Release It!”	Ch. 5

Common Pitfalls & Debugging

Problem 1: “Command mappings are incomplete”

Why: Hidden CLI features
Fix: Expand mappings as new features emerge

Problem 2: “Adapters mask errors”

Why: Errors normalized too aggressively
Fix: Preserve raw error details for debugging

Project 28: Event-Driven Agent Bus

File: P28-event-driven-agent-bus.md
Main Programming Language: Go
Alternative Programming Languages: Python, JavaScript
Coolness Level: 5
Business Potential: 5
Difficulty: 5
Knowledge Area: Distributed Systems
Software or Tool: Event bus
Main Book: “Designing Data-Intensive Applications”

What you’ll build: An event bus design that routes tasks between agents asynchronously.

Why it teaches interoperability: Event-driven systems decouple agents and enable scale.

Core challenges you’ll face:

Event schema -> maps to data contracts
Ordering guarantees -> maps to reliability
Backpressure -> maps to system stability

Real World Outcome

A blueprint for an event bus that orchestrates multiple agents.

What you will see:

Event types: task.request, task.result, task.error
Routing rules: topic-based or queue-based
Backpressure policy: how to slow down agents

The Core Question You’re Answering

“How do you decouple agents so they can scale independently?”

Concepts You Must Understand First

Event-driven architecture
- Why use events instead of direct calls?
Ordering guarantees
- When do you need strict ordering?
Backpressure
- How do you prevent overload?

Questions to Guide Your Design

Event schema
- What fields are required for routing?
Reliability
- How do you handle lost events?

Thinking Exercise

Event Flow Map

Draw a flow of events for a single task across three agents.

The Interview Questions They’ll Ask

“What is an event-driven architecture?”
“Why is backpressure important?”
“How do you handle event ordering?”
“How do you recover from dropped events?”
“What is the benefit of decoupling agents?”

Hints in Layers

Hint 1: Starting Point Define the core event types and payloads.

Hint 2: Next Level Decide on routing strategy and queue semantics.

Hint 3: Technical Details Specify retry and dead-letter handling.

Hint 4: Tools/Debugging Simulate an overload and observe the backpressure plan.

Books That Will Help

Topic	Book	Chapter
Event systems	“Designing Data-Intensive Applications”	Ch. 11
Reliability	“Release It!”	Ch. 7
Architecture	“Building Microservices” by Sam Newman	Ch. 4

Common Pitfalls & Debugging

Problem 1: “Events pile up”

Why: No backpressure strategy
Fix: Implement throttling or queue limits

Problem 2: “Events lack context”

Why: Payload too small
Fix: Add required correlation fields

Project 29: Distributed Job Queue

File: P29-distributed-job-queue.md
Main Programming Language: Go
Alternative Programming Languages: Python
Coolness Level: 4
Business Potential: 5
Difficulty: 5
Knowledge Area: Distributed Systems
Software or Tool: Job queues
Main Book: “Designing Data-Intensive Applications”

What you’ll build: A queue design for distributing agent tasks at scale.

Why it teaches interoperability: Queue systems allow different agent workers to collaborate efficiently.

Core challenges you’ll face:

Task scheduling -> maps to fairness
Retry semantics -> maps to reliability
Worker registration -> maps to discovery

Real World Outcome

A queue architecture diagram and scheduling policy for agent tasks.

What you will see:

Queue types: priority, FIFO, delayed
Worker registry: agent capabilities
Retry rules: per-task policies

The Core Question You’re Answering

“How do you distribute tasks across many agents efficiently?”

Concepts You Must Understand First

Scheduling policies
- How do you choose which task to run next?
Worker discovery
- How do agents register and advertise capabilities?
Failure recovery
- How do you handle worker failure?

Questions to Guide Your Design

Queue model
- What queue types do you need?
Retry strategy
- How many retries and for which errors?

Thinking Exercise

Queue Priorities

Define three priority levels and which tasks go in each.

The Interview Questions They’ll Ask

“What is a distributed job queue?”
“How do you schedule tasks fairly?”
“What happens when a worker dies?”
“How do you avoid duplicate processing?”
“How do you scale queue consumers?”

Hints in Layers

Hint 1: Starting Point Define a task format with required fields.

Hint 2: Next Level Design a worker registration and heartbeat mechanism.

Hint 3: Technical Details Add retry policies and dead-letter handling.

Hint 4: Tools/Debugging Simulate worker failure and verify requeue behavior.

Books That Will Help

Topic	Book	Chapter
Queues	“Designing Data-Intensive Applications”	Ch. 11
Reliability	“Release It!”	Ch. 5
Architecture	“Building Microservices”	Ch. 5

Common Pitfalls & Debugging

Problem 1: “Tasks get stuck”

Why: Missing timeout or heartbeat
Fix: Add lease timeouts

Problem 2: “Duplicate processing”

Why: No idempotency
Fix: Add unique task IDs and dedupe logic

Project 30: Cost and Latency Budget Enforcer

File: P30-cost-and-latency-budget-enforcer.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 4
Business Potential: 5
Difficulty: 4
Knowledge Area: Operations
Software or Tool: Budgeting
Main Book: “Accelerate”

What you’ll build: A policy that caps cost and latency across agent runs.

Why it teaches interoperability: Cost control is essential when multiple agents run at scale.

Core challenges you’ll face:

Budget modeling -> maps to metrics
Policy enforcement -> maps to governance
Fallback behavior -> maps to resilience

Real World Outcome

A budget policy document with thresholds and fallback actions.

What you will see:

Cost caps: per task and per day
Latency targets: max allowed delays
Fallbacks: lower-cost agent choices

The Core Question You’re Answering

“How do you keep multi-agent automation within cost and time limits?”

Concepts You Must Understand First

Cost modeling
- How do you estimate cost per task?
Latency budgets
- What is acceptable delay for each task type?
Fallback strategy
- What happens when budgets are exceeded?

Questions to Guide Your Design

Budget scope
- Is it per task, per user, or per day?
Policy enforcement
- Where do you enforce budgets in the pipeline?

Thinking Exercise

Budget Allocation

Allocate a daily budget across three task categories.

The Interview Questions They’ll Ask

“Why do you need budgets for agents?”
“How do you estimate costs?”
“How do you enforce latency limits?”
“What fallback options are reasonable?”
“How do you monitor budget usage?”

Hints in Layers

Hint 1: Starting Point Define baseline costs for each agent and task type.

Hint 2: Next Level Set thresholds and document fallback options.

Hint 3: Technical Details Add alerts when budgets approach limits.

Hint 4: Tools/Debugging Simulate a budget breach and record the response.

Books That Will Help

Topic	Book	Chapter
Metrics	“Accelerate”	Ch. 3
Governance	“Clean Architecture”	Ch. 12
Reliability	“Release It!”	Ch. 8

Common Pitfalls & Debugging

Problem 1: “Budgets are unrealistic”

Why: No historical data
Fix: Start with conservative limits and adjust

Problem 2: “Fallbacks are unclear”

Why: No documented downgrade plan
Fix: Define explicit fallback paths

Project 31: Human-in-the-Loop Gate

File: P31-human-in-the-loop-gate.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Governance
Software or Tool: Approval workflows
Main Book: “Clean Architecture”

What you’ll build: A gate that pauses automation for human approval at critical steps.

Why it teaches interoperability: Safe automation depends on predictable human checkpoints.

Core challenges you’ll face:

Checkpoint design -> maps to risk analysis
Approval flow -> maps to workflow design
Audit logging -> maps to compliance

Real World Outcome

A workflow diagram showing where human approval is required and how it is recorded.

What you will see:

Checkpoints: defined high-risk steps
Approval form: required decision fields
Audit trail: who approved what

The Core Question You’re Answering

“Where must humans remain in the loop for safety?”

Concepts You Must Understand First

Risk classification
- How do you identify high-risk steps?
Approval flow
- How do you record and enforce approvals?
Audit logging
- How do you prove compliance?

Questions to Guide Your Design

Checkpoint placement
- Which steps require human review?
Approval criteria
- What must reviewers check?

Thinking Exercise

Approval Map

Pick a workflow and mark the steps that require approval.

The Interview Questions They’ll Ask

“Why is human-in-the-loop important?”
“How do you decide where to put gates?”
“How do you document approvals?”
“What is the risk of too many gates?”
“How do you ensure approvals are not bypassed?”

Hints in Layers

Hint 1: Starting Point List all steps with potential irreversible impact.

Hint 2: Next Level Define approval forms with required fields.

Hint 3: Technical Details Document enforcement rules and audit log schema.

Hint 4: Tools/Debugging Test a workflow with a denied approval and verify halt behavior.

Books That Will Help

Topic	Book	Chapter
Governance	“Clean Architecture”	Ch. 12
Operational safety	“Release It!”	Ch. 7
Process discipline	“Continuous Delivery”	Ch. 9

Common Pitfalls & Debugging

Problem 1: “Approvals are skipped”

Why: No enforcement layer
Fix: Add a mandatory gate in the pipeline

Problem 2: “Gates slow everything”

Why: Too many checkpoints
Fix: Limit gates to high-risk actions

Project 32: Semantic Diff and Patch Gate

File: P32-semantic-diff-and-patch-gate.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 4
Business Potential: 4
Difficulty: 4
Knowledge Area: Code Changes
Software or Tool: Diff tools
Main Book: “Refactoring”

What you’ll build: A gate that evaluates semantic diffs before changes are applied.

Why it teaches interoperability: Multi-agent edits require a safety gate to avoid conflicting patches.

Core challenges you’ll face:

Semantic analysis -> maps to code understanding
Patch validation -> maps to safety
Conflict detection -> maps to consistency

Real World Outcome

A semantic diff checklist and a gate that blocks risky changes.

What you will see:

Diff categories: refactor, behavior change, config
Risk rules: which changes require approval
Patch status: accepted, rejected, needs review

The Core Question You’re Answering

“How do you prevent unsafe changes across multiple agents?”

Concepts You Must Understand First

Semantic diff
- How is it different from line diff?
Risk scoring
- How do you classify risky changes?
Conflict detection
- How do you detect overlapping edits?

Questions to Guide Your Design

Risk thresholds
- What changes require human review?
Patch sequencing
- How do you order patches from multiple agents?

Thinking Exercise

Diff Categories

List five diff types and classify them by risk.

The Interview Questions They’ll Ask

“What is a semantic diff?”
“How do you score patch risk?”
“How do you handle conflicting patches?”
“Why is semantic diff important for agents?”
“What is the role of human review?”

Hints in Layers

Hint 1: Starting Point Define categories for change types.

Hint 2: Next Level Map categories to approval requirements.

Hint 3: Technical Details Document a patch ordering strategy and merge policy.

Hint 4: Tools/Debugging Compare a benign refactor to a breaking change.

Books That Will Help

Topic	Book	Chapter
Refactoring	“Refactoring”	Ch. 3
Architecture governance	“Clean Architecture”	Ch. 12
Reliability	“Release It!”	Ch. 5

Common Pitfalls & Debugging

Problem 1: “Gate blocks too much”

Why: Overly strict risk thresholds
Fix: Refine categories and add exceptions

Problem 2: “Risky changes slip through”

Why: Poor semantic analysis
Fix: Improve diff classification rules

Project 33: Knowledge Base RAG Connector

File: P33-knowledge-base-rag-connector.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 4
Business Potential: 5
Difficulty: 4
Knowledge Area: RAG, Knowledge
Software or Tool: Knowledge sources
Main Book: “AI Engineering”

What you’ll build: A connector plan that lets multiple agents query a shared knowledge base.

Why it teaches interoperability: Shared knowledge reduces duplicated context and improves accuracy.

Core challenges you’ll face:

Knowledge indexing -> maps to data modeling
Access control -> maps to security
Context injection -> maps to prompt design

Real World Outcome

A knowledge connector spec with query rules and access policies.

What you will see:

Knowledge sources: docs, issues, runbooks
Query schema: required fields for retrieval
Access policy: who can read what

The Core Question You’re Answering

“How do multiple agents safely share a single knowledge base?”

Concepts You Must Understand First

RAG basics
- How does retrieval augment prompts?
Access control
- How do you restrict sensitive data?
Context injection
- How do you present retrieved info to agents?

Questions to Guide Your Design

Query rules
- What metadata is required for retrieval?
Privacy rules
- What data is prohibited from retrieval?

Thinking Exercise

Knowledge Inventory

List five knowledge sources and categorize by sensitivity.

The Interview Questions They’ll Ask

“What is RAG and why use it?”
“How do you control access to shared knowledge?”
“How do you prevent outdated information?”
“How do you inject retrieved context safely?”
“How do you measure retrieval quality?”

Hints in Layers

Hint 1: Starting Point List the knowledge sources and access rules.

Hint 2: Next Level Define a retrieval query schema with filters.

Hint 3: Technical Details Specify how retrieved content is summarized for agents.

Hint 4: Tools/Debugging Test retrieval for a known query and validate results.

Books That Will Help

Topic	Book	Chapter
AI systems	“AI Engineering”	Ch. 4
Data reliability	“Designing Data-Intensive Applications”	Ch. 9
Security	“Security in Computing”	Ch. 6

Common Pitfalls & Debugging

Problem 1: “Agents get outdated info”

Why: No update policy
Fix: Add refresh schedules and versioning

Problem 2: “Sensitive data leaks”

Why: Weak access controls
Fix: Add strict filters and logging

Project 34: Model Failover Switch

File: P34-model-failover-switch.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 4
Business Potential: 5
Difficulty: 4
Knowledge Area: Reliability
Software or Tool: Model selection
Main Book: “Release It!”

What you’ll build: A policy for switching between models or agents when failures occur.

Why it teaches interoperability: Resilience requires fallback options across agent ecosystems.

Core challenges you’ll face:

Failover criteria -> maps to reliability design
Compatibility checks -> maps to output normalization
State handoff -> maps to context management

Real World Outcome

A failover policy with triggers, fallback order, and recovery steps.

What you will see:

Failover triggers: timeout, error, cost
Fallback chain: primary -> secondary -> tertiary
Recovery plan: when to return to primary

The Core Question You’re Answering

“When should automation switch to a different agent or model?”

Concepts You Must Understand First

Failover triggers
- What signals indicate a failure?
Compatibility
- How do you ensure output compatibility across agents?
State transfer
- How do you pass context to the fallback agent?

Questions to Guide Your Design

Trigger thresholds
- What counts as a failure vs slowdown?
Fallback order
- Which agents should be used first?

Thinking Exercise

Failover Scenario

Define a scenario where the primary agent fails and how the system responds.

The Interview Questions They’ll Ask

“What is failover and why does it matter?”
“How do you choose fallback agents?”
“How do you avoid inconsistent outputs?”
“What is the cost of failover?”
“How do you restore to primary?”

Hints in Layers

Hint 1: Starting Point Define the primary and secondary agents for each task.

Hint 2: Next Level Add clear failure triggers and cooldown periods.

Hint 3: Technical Details Document context handoff requirements.

Hint 4: Tools/Debugging Simulate a timeout and verify failover behavior.

Books That Will Help

Topic	Book	Chapter
Resilience	“Release It!”	Ch. 4
Reliability	“Designing Data-Intensive Applications”	Ch. 8
Architecture	“Fundamentals of Software Architecture”	Ch. 7

Common Pitfalls & Debugging

Problem 1: “Failover causes inconsistent outputs”

Why: No normalized output contract
Fix: Enforce a shared output schema

Problem 2: “Failover loops”

Why: No cooldown strategy
Fix: Add a cooldown window before retrying primary

Project 35: Compliance Audit Logger

File: P35-compliance-audit-logger.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 3
Business Potential: 5
Difficulty: 4
Knowledge Area: Compliance
Software or Tool: Audit logging
Main Book: “Clean Architecture”

What you’ll build: A compliance logging spec that records agent decisions and actions.

Why it teaches interoperability: Interop systems must be auditable for trust and governance.

Core challenges you’ll face:

Audit schema -> maps to compliance
Retention policy -> maps to governance
Access control -> maps to security

Real World Outcome

A compliance log schema with retention and access rules.

What you will see:

Audit events: approvals, tool calls, changes
Retention plan: how long logs are stored
Access controls: who can read logs

The Core Question You’re Answering

“How do you prove automation actions were safe and compliant?”

Concepts You Must Understand First

Auditability
- What events must always be recorded?
Retention
- How long should logs exist?
Access control
- Who should be allowed to view logs?

Questions to Guide Your Design

Audit scope
- What is the minimal event set?
Retention rules
- What regulatory or policy requirements apply?

Thinking Exercise

Audit Checklist

List the top ten events you would want in a compliance review.

The Interview Questions They’ll Ask

“Why do you need compliance logging for agents?”
“What events must be audited?”
“How do you ensure logs are tamper resistant?”
“How do you handle retention requirements?”
“Who should access audit logs?”

Hints in Layers

Hint 1: Starting Point Define a minimal audit event schema.

Hint 2: Next Level Add retention and access policy notes.

Hint 3: Technical Details Define integrity checks for logs.

Hint 4: Tools/Debugging Perform an audit review on a sample run.

Books That Will Help

Topic	Book	Chapter
Governance	“Clean Architecture”	Ch. 12
Security	“Security in Computing”	Ch. 7
Reliability	“Release It!”	Ch. 7

Common Pitfalls & Debugging

Problem 1: “Logs are incomplete”

Why: Missing event coverage
Fix: Expand event schema and enforce logging

Problem 2: “Logs are hard to access”

Why: No indexing or search
Fix: Add indexing and query support

Project 36: Offline and Edge Mode Playbook

File: P36-offline-and-edge-mode-playbook.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 3
Business Potential: 4
Difficulty: 3
Knowledge Area: Offline workflows
Software or Tool: Local execution
Main Book: “The Pragmatic Programmer”

What you’ll build: A playbook for operating agents with limited or no network access.

Why it teaches interoperability: Offline constraints force you to design portable, resilient workflows.

Core challenges you’ll face:

Dependency caching -> maps to reliability
Local context storage -> maps to memory management
Sync strategy -> maps to consistency

Real World Outcome

An offline workflow plan with cached resources and sync rules.

What you will see:

Dependency list: what must be cached
Offline tasks: what can be done locally
Sync strategy: how to reconcile changes later

The Core Question You’re Answering

“How do you keep agent workflows productive without network access?”

Concepts You Must Understand First

Offline constraints
- What breaks when network is unavailable?
Caching strategy
- What must be cached ahead of time?
Sync reconciliation
- How do you merge changes after reconnecting?

Questions to Guide Your Design

Offline scope
- Which tasks can be done offline?
Reconciliation
- How do you handle conflicts after sync?

Thinking Exercise

Offline Readiness

List five resources you would need cached for a full day of work.

The Interview Questions They’ll Ask

“What is the impact of offline constraints?”
“How do you prepare for offline work?”
“How do you reconcile changes after reconnecting?”
“What tasks are risky offline?”
“How do you ensure data integrity?”

Hints in Layers

Hint 1: Starting Point Identify the most critical dependencies and cache them.

Hint 2: Next Level Define which workflows can run offline.

Hint 3: Technical Details Create a sync protocol for reconnect events.

Hint 4: Tools/Debugging Simulate offline mode and record what fails.

Books That Will Help

Topic	Book	Chapter
Pragmatic workflows	“The Pragmatic Programmer”	Ch. 5
Reliability	“Release It!”	Ch. 4
Consistency	“Designing Data-Intensive Applications”	Ch. 5

Common Pitfalls & Debugging

Problem 1: “Missing dependencies offline”

Why: No cache plan
Fix: Create a dependency inventory and cache list

Problem 2: “Conflicts after sync”

Why: No reconciliation rules
Fix: Define merge and conflict resolution steps

Project 37: Multi-tenant Agent Service

File: P37-multi-tenant-agent-service.md
Main Programming Language: Go
Alternative Programming Languages: Python
Coolness Level: 5
Business Potential: 5
Difficulty: 5
Knowledge Area: Platform Engineering
Software or Tool: Multi-tenant services
Main Book: “Software Architecture in Practice”

What you’ll build: A multi-tenant service design that lets multiple teams share agent automation safely.

Why it teaches interoperability: It forces you to build strong boundaries and governance.

Core challenges you’ll face:

Tenant isolation -> maps to security
Quota management -> maps to governance
Routing policies -> maps to orchestration

Real World Outcome

A multi-tenant architecture diagram and tenant policy definitions.

What you will see:

Tenant boundaries: separate configs and data
Quota rules: cost and usage limits
Routing rules: per-tenant agent choices

The Core Question You’re Answering

“How do you serve multiple teams safely on the same agent platform?”

Concepts You Must Understand First

Isolation
- How do you prevent cross-tenant data leaks?
Quotas
- How do you enforce usage limits?
Routing
- How do you customize agent choice per tenant?

Questions to Guide Your Design

Tenant model
- How is tenant data stored and scoped?
Policy enforcement
- Where are quotas checked?

Thinking Exercise

Tenant Policy Draft

Define a policy for two teams with different budgets and access.

The Interview Questions They’ll Ask

“What is multi-tenancy and why is it hard?”
“How do you isolate tenants?”
“How do you enforce quotas?”
“What is the risk of shared infrastructure?”
“How do you audit tenant actions?”

Hints in Layers

Hint 1: Starting Point Define the tenant boundary and separate configs.

Hint 2: Next Level Add quota and billing rules.

Hint 3: Technical Details Define routing rules per tenant.

Hint 4: Tools/Debugging Simulate two tenants and verify isolation.

Books That Will Help

Topic	Book	Chapter
Architecture	“Software Architecture in Practice”	Ch. 5
Security	“Security in Computing”	Ch. 6
Reliability	“Release It!”	Ch. 7

Common Pitfalls & Debugging

Problem 1: “Tenant data leaks”

Why: Weak isolation boundaries
Fix: Enforce strict separation and access controls

Problem 2: “Quota enforcement fails”

Why: Quotas checked too late
Fix: Enforce quotas at request entry

Project 38: Benchmark Suite for Agents

File: P38-benchmark-suite-for-agents.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 4
Business Potential: 4
Difficulty: 4
Knowledge Area: Evaluation
Software or Tool: Benchmarks
Main Book: “Accelerate”

What you’ll build: A benchmark suite that measures quality, cost, and latency across agents.

Why it teaches interoperability: Benchmarks drive data-backed tool selection and trust.

Core challenges you’ll face:

Metric selection -> maps to evaluation design
Workload design -> maps to representativeness
Reporting -> maps to observability

Real World Outcome

A benchmark report with standardized metrics for each agent and workload.

What you will see:

Workload sets: simple, moderate, complex tasks
Metrics: latency, cost, accuracy
Comparison table: agent performance scores

The Core Question You’re Answering

“How do you compare agents objectively for your workflows?”

Concepts You Must Understand First

Metric definition
- What metrics reflect real user value?
Workload sampling
- How do you avoid biased tests?
Reporting
- How do you present results clearly?

Questions to Guide Your Design

Benchmark scope
- Which tasks should be included?
Scoring
- How do you aggregate metrics?

Thinking Exercise

Metric Priorities

Rank three metrics and explain why they matter most.

The Interview Questions They’ll Ask

“What makes a benchmark fair?”
“How do you avoid biased workloads?”
“What metrics matter for coding agents?”
“How do you present results to stakeholders?”
“How do you track benchmark drift?”

Hints in Layers

Hint 1: Starting Point Start with a small, representative workload set.

Hint 2: Next Level Define metrics and scoring rules.

Hint 3: Technical Details Create a report template with comparison tables.

Hint 4: Tools/Debugging Run the suite on two agents and compare results.

Books That Will Help

Topic	Book	Chapter
Metrics	“Accelerate”	Ch. 3
Evaluation	“AI Engineering”	Ch. 6
Reporting	“The Pragmatic Programmer”	Ch. 7

Common Pitfalls & Debugging

Problem 1: “Benchmarks are not representative”

Why: Too narrow task set
Fix: Add diverse workloads

Problem 2: “Metrics are hard to interpret”

Why: No normalization
Fix: Provide normalized scores and context

Project 39: Incident Response Automation

File: P39-incident-response-automation.md
Main Programming Language: Python
Alternative Programming Languages: Go
Coolness Level: 4
Business Potential: 5
Difficulty: 4
Knowledge Area: Reliability
Software or Tool: Incident workflows
Main Book: “Release It!”

What you’ll build: An automation playbook that uses multiple agents during incidents.

Why it teaches interoperability: Incidents require rapid coordination and reliable outputs.

Core challenges you’ll face:

Runbook design -> maps to operations
Task routing -> maps to orchestration
Safety checks -> maps to governance

Real World Outcome

An incident response workflow where agents handle investigation, mitigation, and reporting.

What you will see:

Runbook steps: detection, triage, mitigation
Agent roles: which agent does what
Postmortem report: standardized output

The Core Question You’re Answering

“How can agents accelerate incident response without increasing risk?”

Concepts You Must Understand First

Incident stages
- What steps occur in a typical incident?
Safety checks
- Which actions require approval?
Postmortem structure
- What must be documented after the incident?

Questions to Guide Your Design

Role assignment
- Which agent handles which stage?
Approval gates
- Where must humans sign off?

Thinking Exercise

Incident Scenario

Describe an outage and map which agent helps at each step.

The Interview Questions They’ll Ask

“Why use agents in incident response?”
“How do you prevent unsafe automated actions?”
“What is a postmortem and why is it needed?”
“How do you coordinate tasks during incidents?”
“How do you measure incident improvement?”

Hints in Layers

Hint 1: Starting Point Define a simple runbook with three stages.

Hint 2: Next Level Assign agents to each stage with responsibilities.

Hint 3: Technical Details Add approval gates and logging requirements.

Hint 4: Tools/Debugging Run a tabletop exercise and record outcomes.

Books That Will Help

Topic	Book	Chapter
Reliability	“Release It!”	Ch. 9
Operations	“The Phoenix Project” by Gene Kim	Ch. 24
Metrics	“Accelerate”	Ch. 4

Common Pitfalls & Debugging

Problem 1: “Automation makes changes too quickly”

Why: Missing approval gates
Fix: Add human checks for risky actions

Problem 2: “Postmortems are incomplete”

Why: No standardized report
Fix: Require structured postmortem templates

Project 40: IDE Bridge Integration

File: P40-ide-bridge-integration.md
Main Programming Language: JavaScript
Alternative Programming Languages: Python
Coolness Level: 4
Business Potential: 4
Difficulty: 3
Knowledge Area: Developer Experience
Software or Tool: IDE integration
Main Book: “The Pragmatic Programmer”

What you’ll build: A plan for bridging CLI agents with an IDE workflow.

Why it teaches interoperability: Developers need seamless transitions between CLI automation and editor actions.

Core challenges you’ll face:

Context sync -> maps to workspace alignment
Command routing -> maps to adapter design
User experience -> maps to workflow design

Real World Outcome

A workflow diagram showing how IDE actions trigger multiple agent CLIs.

What you will see:

Trigger points: file save, test run, diff view
Agent routing: which CLI handles which trigger
Result display: how outputs are surfaced in the IDE

The Core Question You’re Answering

“How do you connect CLI agents to the developer’s editor workflow?”

Concepts You Must Understand First

Context sync
- How does the IDE share file state with agents?
Command routing
- How do you map IDE actions to agents?
Feedback presentation
- How do you surface agent outputs effectively?

Questions to Guide Your Design

Trigger strategy
- Which IDE events are safe to automate?
Feedback channels
- Where should results appear for developers?

Thinking Exercise

IDE Workflow Map

Map a typical developer action to an agent response.

The Interview Questions They’ll Ask

“Why integrate agents with an IDE?”
“How do you keep IDE context in sync?”
“What actions should be automated?”
“How do you avoid interrupting the developer?”
“How do you display results effectively?”

Hints in Layers

Hint 1: Starting Point List the IDE actions you want to automate.

Hint 2: Next Level Map those actions to agent tasks.

Hint 3: Technical Details Define a minimal result display format.

Hint 4: Tools/Debugging Test with a single action and refine feedback.

Books That Will Help

Topic	Book	Chapter
Workflow design	“The Pragmatic Programmer”	Ch. 7
Interface design	“Clean Architecture”	Ch. 10
Reliability	“Release It!”	Ch. 6

Common Pitfalls & Debugging

Problem 1: “Context mismatch”

Why: IDE state not synced
Fix: Add explicit sync steps before agent runs

Problem 2: “Outputs clutter the editor”

Why: No output formatting rules
Fix: Define concise summaries and links

Project 41: Multi-Agent Pair Programming Protocol

File: P41-multi-agent-pair-programming-protocol.md
Main Programming Language: Python
Alternative Programming Languages: JavaScript
Coolness Level: 4
Business Potential: 4
Difficulty: 4
Knowledge Area: Collaboration
Software or Tool: Agent coordination
Main Book: “Fundamentals of Software Architecture”

What you’ll build: A protocol for two or more agents to collaborate on a single coding task.

Why it teaches interoperability: Pairing highlights coordination, conflict resolution, and shared context.

Core challenges you’ll face:

Role assignment -> maps to division of labor
Turn-taking -> maps to coordination
Conflict handling -> maps to consistency

Real World Outcome

A collaboration protocol with roles, turn order, and merge rules.

What you will see:

Role definitions: driver, navigator, reviewer
Turn rules: when to hand off control
Conflict resolution: how to merge proposals

The Core Question You’re Answering

“How do multiple agents collaborate without stepping on each other?”

Concepts You Must Understand First

Role separation
- What does each agent specialize in?
Turn-taking
- How do agents avoid simultaneous edits?
Merge policy
- How do you resolve conflicting suggestions?

Questions to Guide Your Design

Role assignment
- Which agent is best as driver vs reviewer?
Conflict resolution
- When does a human decide?

Thinking Exercise

Pair Protocol Draft

Write a three-step cycle for two agents to collaborate on a feature.

The Interview Questions They’ll Ask

“Why use multi-agent pair programming?”
“How do you coordinate agent roles?”
“How do you prevent conflicting edits?”
“What is the role of a human supervisor?”
“How do you measure collaboration success?”

Hints in Layers

Hint 1: Starting Point Define roles and responsibilities for each agent.

Hint 2: Next Level Create turn-taking rules and handoff format.

Hint 3: Technical Details Define a merge policy with conflict rules.

Hint 4: Tools/Debugging Simulate a simple task and observe coordination breakdowns.

Books That Will Help

Topic	Book	Chapter
Collaboration	“Fundamentals of Software Architecture”	Ch. 3
Governance	“Clean Architecture”	Ch. 12
Process	“The Pragmatic Programmer”	Ch. 6

Common Pitfalls & Debugging

Problem 1: “Agents repeat each other”

Why: Overlapping roles
Fix: Clarify responsibilities and review stages

Problem 2: “Conflicts are unresolved”

Why: No merge policy
Fix: Add a human arbitration step

Project 42: Capstone - Interoperable Automation Platform

File: P42-capstone-interoperable-automation-platform.md
Main Programming Language: Go
Alternative Programming Languages: Python, JavaScript
Coolness Level: 5
Business Potential: 5
Difficulty: 5
Knowledge Area: Platform Engineering
Software or Tool: Multi-agent platform
Main Book: “Software Architecture in Practice”

What you’ll build: A full platform design that unifies agents, configs, tools, memory, and safety.

Why it teaches interoperability: It integrates every concept into one coherent system.

Core challenges you’ll face:

System architecture -> maps to platform design
Governance -> maps to policy enforcement
Scalability -> maps to reliability

Real World Outcome

A full platform blueprint with modules, data flows, and governance controls.

What you will see:

Architecture diagram: adapters, routers, storage, UI
Policy layer: approvals, sandbox rules, audits
Operational plan: monitoring, upgrades, failover

The Core Question You’re Answering

“How do you build a production-grade multi-agent automation platform?”

Concepts You Must Understand First

Platform architecture
- What are the core modules and their boundaries?
Governance
- How do you enforce policy across the platform?
Scalability
- How do you support growth without instability?

Questions to Guide Your Design

Core modules
- Which modules are required for interoperability?
Operational controls
- How will you monitor and update the platform?

Thinking Exercise

Platform Map

Draw a top-level map of modules and the data flow between them.

The Interview Questions They’ll Ask

“What are the core modules of a multi-agent platform?”
“How do you enforce governance at scale?”
“How do you handle failures in a platform?”
“What is the role of adapters?”
“How do you prevent lock-in?”

Hints in Layers

Hint 1: Starting Point List the modules from Projects 1-41 and group them.

Hint 2: Next Level Define interfaces between modules.

Hint 3: Technical Details Design a deployment and monitoring plan.

Hint 4: Tools/Debugging Run a tabletop simulation of a platform outage.

Books That Will Help

Topic	Book	Chapter
Architecture	“Software Architecture in Practice”	Ch. 6
Reliability	“Release It!”	Ch. 9
Governance	“Clean Architecture”	Ch. 12

Common Pitfalls & Debugging

Problem 1: “Platform becomes too complex”

Why: Over-engineering
Fix: Keep modules minimal and modular

Problem 2: “Policies are inconsistent”

Why: Governance rules applied unevenly
Fix: Centralize policy enforcement

Project Comparison Table

Project	Difficulty	Time	Depth of Understanding	Fun Factor
1. Agent Capability Matrix	Level 1	Weekend	Medium	★★☆☆☆
2. Config Precedence Detective	Level 2	Weekend	Medium	★★☆☆☆
3. Prompt Contract Spec	Level 2	Weekend	Medium	★★★☆☆
4. Tool Schema Registry	Level 3	1 Week	High	★★★☆☆
5. Subagent Task Router	Level 3	1 Week	High	★★★☆☆
6. Hook Lifecycle Harness	Level 2	Weekend	Medium	★★☆☆☆
7. Extension and Plugin Compatibility Lab	Level 3	1 Week	High	★★★☆☆
8. MCP Gateway Prototype	Level 4	2 Weeks	High	★★★★☆
9. Headless Batch Runner	Level 2	Weekend	Medium	★★★☆☆
10. Interactive Session Recorder	Level 2	Weekend	Medium	★★★☆☆
11. Approval Policy Simulator	Level 3	1 Week	High	★★★☆☆
12. Sandbox Matrix Auditor	Level 2	Weekend	Medium	★★☆☆☆
13. Output Style Normalizer	Level 2	Weekend	Medium	★★★☆☆
14. Multi-Agent Logging Standard	Level 3	1 Week	High	★★★☆☆
15. Error Taxonomy and Retry Controller	Level 3	1 Week	High	★★★☆☆
16. Context Budget Planner	Level 3	1 Week	High	★★★☆☆
17. Memory Import and Export Bridge	Level 3	1 Week	High	★★★☆☆
18. Cross-Agent Workspace Sync	Level 3	1 Week	High	★★★☆☆
19. Secrets Broker Shim	Level 3	1 Week	High	★★★☆☆
20. Test Harness for Agents	Level 4	2 Weeks	High	★★★★☆
21. Prompt Injection Red Team Lab	Level 4	2 Weeks	High	★★★★☆
22. Multi-Agent Code Review Pipeline	Level 3	1 Week	High	★★★☆☆
23. Issue Triage Mesh	Level 3	1 Week	High	★★★☆☆
24. Documentation Generator Federation	Level 3	1 Week	High	★★★☆☆
25. Repo Indexing Strategy	Level 4	2 Weeks	High	★★★★☆
26. Skill and Prompt Pack Manager	Level 3	1 Week	High	★★★☆☆
27. Cross-CLI Command Adapter	Level 4	2 Weeks	High	★★★★☆
28. Event-Driven Agent Bus	Level 5	3 Weeks	Very High	★★★★★
29. Distributed Job Queue	Level 5	3 Weeks	Very High	★★★★★
30. Cost and Latency Budget Enforcer	Level 4	2 Weeks	High	★★★★☆
31. Human-in-the-Loop Gate	Level 3	1 Week	High	★★★☆☆
32. Semantic Diff and Patch Gate	Level 4	2 Weeks	High	★★★★☆
33. Knowledge Base RAG Connector	Level 4	2 Weeks	High	★★★★☆
34. Model Failover Switch	Level 4	2 Weeks	High	★★★★☆
35. Compliance Audit Logger	Level 4	2 Weeks	High	★★★★☆
36. Offline and Edge Mode Playbook	Level 3	1 Week	High	★★★☆☆
37. Multi-tenant Agent Service	Level 5	3 Weeks	Very High	★★★★★
38. Benchmark Suite for Agents	Level 4	2 Weeks	High	★★★★☆
39. Incident Response Automation	Level 4	2 Weeks	High	★★★★☆
40. IDE Bridge Integration	Level 3	1 Week	High	★★★☆☆
41. Multi-Agent Pair Programming Protocol	Level 4	2 Weeks	High	★★★★☆
42. Interoperable Automation Platform	Level 5	4 Weeks	Very High	★★★★★

Recommendation

If you are new to multi-agent interoperability: Start with Project 1 to build a capability baseline. If you are a platform engineer: Start with Project 4 and Project 14 to define tool schemas and logging. If you want production-grade automation: Focus on Projects 28-42.

Final Overall Project: Interoperable Automation Platform

The Goal: Combine Projects 1-41 into a single platform that orchestrates multiple agents safely and reliably.

Build adapters for each CLI
Implement prompt contracts and output normalization
Add tool registry and MCP gateway support
Enforce safety with approvals and sandbox policies
Add observability, logging, and audit trails
Run benchmark suites and cost controls
Deploy as a multi-tenant platform

Success Criteria: A single workflow can route tasks across agents, produce consistent outputs, and pass audit checks.

From Learning to Production: What’s Next?

After completing these projects, you’ve built educational implementations. Here’s how to transition to production-grade systems:

What You Built vs. What Production Needs

Your Project	Production Equivalent	Gap to Fill
Tool Schema Registry	Enterprise tool catalog	Governance and approval process
Event-Driven Agent Bus	Message queue + workflow engine	Monitoring and scaling
Compliance Audit Logger	SIEM integration	Retention and legal controls

Skills You Now Have

You can confidently discuss:

Multi-agent orchestration and task routing
Prompt contracts and output normalization
Governance, approvals, and auditability

You can read source code of:

Agent CLIs and plugin systems
MCP servers and protocol adapters

You can architect:

Multi-agent automation pipelines
Safety and compliance layers

Recommended Next Steps

1. Contribute to Open Source:

MCP and agent tooling repos: add adapters or documentation

2. Build a SaaS Around One Project:

Idea: Multi-agent automation manager with policy controls
Monetization: Tiered usage and enterprise compliance features

3. Get Certified:

Security certification - to strengthen governance and audit practices

Career Paths Unlocked

With this knowledge, you can pursue:

AI tooling platform engineer
Developer productivity engineer
Reliability automation engineer

Summary

This learning path covers multi-agent interoperability through 42 hands-on projects.

#	Project Name	Main Language	Difficulty	Time Estimate
1	Agent Capability Matrix	Python	Level 1	Weekend
2	Config Precedence Detective	Python	Level 2	Weekend
3	Prompt Contract Spec	Python	Level 2	Weekend
4	Tool Schema Registry	Python	Level 3	1 Week
5	Subagent Task Router	JavaScript	Level 3	1 Week
6	Hook Lifecycle Harness	JavaScript	Level 2	Weekend
7	Extension and Plugin Compatibility Lab	JavaScript	Level 3	1 Week
8	MCP Gateway Prototype	Go	Level 4	2 Weeks
9	Headless Batch Runner	Python	Level 2	Weekend
10	Interactive Session Recorder	Python	Level 2	Weekend
11	Approval Policy Simulator	Python	Level 3	1 Week
12	Sandbox Matrix Auditor	Python	Level 2	Weekend
13	Output Style Normalizer	Python	Level 2	Weekend
14	Multi-Agent Logging Standard	Python	Level 3	1 Week
15	Error Taxonomy and Retry Controller	Python	Level 3	1 Week
16	Context Budget Planner	Python	Level 3	1 Week
17	Memory Import and Export Bridge	Python	Level 3	1 Week
18	Cross-Agent Workspace Sync	Python	Level 3	1 Week
19	Secrets Broker Shim	Python	Level 3	1 Week
20	Test Harness for Agents	Python	Level 4	2 Weeks
21	Prompt Injection Red Team Lab	Python	Level 4	2 Weeks
22	Multi-Agent Code Review Pipeline	JavaScript	Level 3	1 Week
23	Issue Triage Mesh	Python	Level 3	1 Week
24	Documentation Generator Federation	JavaScript	Level 3	1 Week
25	Repo Indexing Strategy	Python	Level 4	2 Weeks
26	Skill and Prompt Pack Manager	JavaScript	Level 3	1 Week
27	Cross-CLI Command Adapter	Python	Level 4	2 Weeks
28	Event-Driven Agent Bus	Go	Level 5	3 Weeks
29	Distributed Job Queue	Go	Level 5	3 Weeks
30	Cost and Latency Budget Enforcer	Python	Level 4	2 Weeks
31	Human-in-the-Loop Gate	JavaScript	Level 3	1 Week
32	Semantic Diff and Patch Gate	Python	Level 4	2 Weeks
33	Knowledge Base RAG Connector	Python	Level 4	2 Weeks
34	Model Failover Switch	Python	Level 4	2 Weeks
35	Compliance Audit Logger	Python	Level 4	2 Weeks
36	Offline and Edge Mode Playbook	Python	Level 3	1 Week
37	Multi-tenant Agent Service	Go	Level 5	3 Weeks
38	Benchmark Suite for Agents	Python	Level 4	2 Weeks
39	Incident Response Automation	Python	Level 4	2 Weeks
40	IDE Bridge Integration	JavaScript	Level 3	1 Week
41	Multi-Agent Pair Programming Protocol	Python	Level 4	2 Weeks
42	Interoperable Automation Platform	Go	Level 5	4 Weeks

Expected Outcomes

After completing these projects, you will:

Design interop layers across multiple coding agents
Build prompt contracts and tool schema registries
Orchestrate workflows with safety, approvals, and audits
Evaluate and benchmark agents with objective metrics
Architect a production-grade multi-agent automation platform

You’ll have built a complete, working multi-agent interoperability ecosystem from first principles.

Additional Resources & References

Core CLI and Agent Documentation

https://developers.openai.com/codex/cli
https://developers.openai.com/codex/noninteractive
https://developers.openai.com/codex/config-basic
https://developers.openai.com/codex/config-advanced
https://developers.openai.com/codex/config-reference
https://deepwiki.com/openai/codex
https://deepwiki.com/openai/skills
https://code.claude.com/docs/en/sub-agents
https://code.claude.com/docs/en/cli-reference
https://code.claude.com/docs/en/hooks
https://code.claude.com/docs/en/plugins-reference
https://code.claude.com/docs/en/terminal-config
https://code.claude.com/docs/en/model-config
https://code.claude.com/docs/en/memory
https://code.claude.com/docs/en/plugins
https://code.claude.com/docs/en/skills
https://code.claude.com/docs/en/output-styles
https://code.claude.com/docs/en/hooks-guide
https://code.claude.com/docs/en/headless
https://code.claude.com/docs/en/mcp
https://kiro.dev/docs/cli/chat/subagents/
https://kiro.dev/docs/cli/chat/manage-prompts/
https://kiro.dev/docs/cli/chat/context/
https://kiro.dev/docs/cli/chat/configuration/
https://kiro.dev/docs/cli/custom-agents/
https://kiro.dev/docs/cli/custom-agents/configuration-reference/
https://kiro.dev/docs/cli/code-intelligence/
https://kiro.dev/docs/cli/hooks/
https://kiro.dev/docs/cli/steering/
https://kiro.dev/docs/cli/experimental/
https://kiro.dev/docs/cli/experimental/knowledge-management/
https://github.com/google-gemini/gemini-cli
https://deepwiki.com/google-gemini/gemini-cli
https://geminicli.com/docs/cli/commands/
https://geminicli.com/docs/cli/custom-commands/
https://geminicli.com/docs/cli/headless/
https://geminicli.com/docs/cli/system-prompt/
https://geminicli.com/docs/core/
https://geminicli.com/docs/core/tools-api/
https://geminicli.com/docs/core/memport/
https://geminicli.com/docs/tools/
https://geminicli.com/docs/tools/shell/
https://geminicli.com/docs/hooks/
https://geminicli.com/docs/#extensions
https://geminicli.com/docs/extensions/

MCP and Extensions

https://github.com/ChromeDevTools/chrome-devtools-mcp
https://github.com/Dicklesworthstone/cass_memory_system
https://github.com/gemini-cli-extensions/code-review
https://github.com/gemini-cli-extensions/nanobanana
https://github.com/gemini-cli-extensions/conductor
https://github.com/johnlindquist/mdflow
https://mdflow.dev/

Internal Guides Used

AI_CODING_AGENTS/CLAUDE_CODE_ADVANCED_PROJECTS.md
AI_CODING_AGENTS/CLAUDE_CODE_MASTERY_40_PROJECTS.md
AI_CODING_AGENTS/KIRO_DOCUMENTATION_RESEARCH.md
AI_CODING_AGENTS/LEARN_KIRO_CLI_MASTERY.md
AI_AGENTS_LLM_RAG/LEARN_LLM_MEMORY.md
AI_AGENTS_LLM_RAG/PROMPT_ENGINEERING_PROJECTS.md
AI_AGENTS_LLM_RAG/AI_AGENTS_PROJECTS.md

Books

Foundations (from your library):

“Clean Architecture” by Robert C. Martin
“Designing Data-Intensive Applications” by Martin Kleppmann
“Release It!” by Michael T. Nygard
“Continuous Delivery” by David Farley and Jez Humble
“The Pragmatic Programmer” by Andrew Hunt and David Thomas
“Fundamentals of Software Architecture” by Mark Richards and Neal Ford
“AI Engineering” by Chip Huyen
“Security in Computing” by Charles Pfleeger