Project 1: Role-Defined Orchestrator
Build a role-based orchestrator that routes tasks to specialized agents with explicit contracts and validation.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3 |
| Time Estimate | 8-12 hours |
| Language | Python (Alternatives: TypeScript, Go) |
| Prerequisites | Basic LLM prompting, structured outputs, logging |
| Key Topics | Roles, autonomy, escalation, validation |
1. Learning Objectives
By completing this project, you will:
- Define explicit role contracts for planner, researcher, and critic agents.
- Implement escalation paths when confidence is low.
- Build validation checks that enforce output structure and evidence.
- Produce a trace log that makes every decision auditable.
2. Theoretical Foundation
2.1 Core Concepts
- Role Contracts: A role is a contract specifying scope, inputs, outputs, and constraints.
- Autonomy vs. Control: Autonomy improves flexibility but increases risk without guardrails.
- Escalation Paths: When confidence is low, agents should request help rather than guess.
2.2 Why This Matters
Without explicit roles, agents overlap, duplicate work, or miss critical tasks. Role contracts make responsibilities testable and allow the system to detect failures early.
2.3 Historical Context / Background
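Early distributed AI systems already assigned work to specialized components: in blackboard architectures, for example, independent knowledge sources each contributed to a shared problem state. Modern LLMs make roles easier to specify, but they do not remove the need for clear contracts.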
2.4 Common Misconceptions
- “Roles are just prompts.” A prompt only describes a role; the role itself is a system-level interface whose validation rules are enforced by the orchestrator.
- “More autonomy always helps.” It can reduce reliability without guardrails.
3. Project Specification
3.1 What You Will Build
A small orchestrator that accepts a task, delegates to three roles, validates outputs, and returns a final result with trace logs.
3.2 Functional Requirements
- Role Registry: Store role contracts and descriptions.
- Task Router: Assign task segments to roles.
- Validation Layer: Ensure outputs match role contracts.
- Escalation Logic: Trigger a retry or fallback when an output fails validation or reports low confidence.
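As a starting point for the first requirement, a role registry can be little more than a dictionary with guarded access. The sketch below is one possible shape, not a prescribed design: `RoleEntry`, `RoleRegistry`, and the example output field names are hypothetical, and it assumes Python 3.9 or newer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleEntry:
    """Hypothetical registry record: a role name, description, and outputs."""
    name: str
    description: str
    required_outputs: tuple[str, ...]


class RoleRegistry:
    """Stores one contract entry per role name."""

    def __init__(self) -> None:
        self._entries: dict[str, RoleEntry] = {}

    def register(self, entry: RoleEntry) -> None:
        # Refuse duplicates so two contracts can never claim the same role.
        if entry.name in self._entries:
            raise ValueError(f"role already registered: {entry.name}")
        self._entries[entry.name] = entry

    def get(self, name: str) -> RoleEntry:
        try:
            return self._entries[name]
        except KeyError:
            raise KeyError(f"unknown role: {name}") from None

    def names(self) -> list[str]:
        return sorted(self._entries)


registry = RoleRegistry()
registry.register(RoleEntry("planner", "Breaks a task into steps", ("steps",)))
registry.register(RoleEntry("researcher", "Collects sources", ("sources",)))
registry.register(RoleEntry("critic", "Checks claims against sources", ("verdicts",)))
print(registry.names())
```

Rejecting duplicate names at registration time is a cheap way to surface role overlap before any task is routed.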
3.3 Non-Functional Requirements
- Reliability: Outputs must be traceable to a role.
- Usability: Logs must be readable and structured.
- Safety: No agent can bypass validation.
3.4 Example Usage / Output
$ run-orchestrator --task "Summarize a topic with sources and risks"
[Planner] plan created (3 steps)
[Researcher] 4 sources collected
[Critic] 2 claims validated, 1 claim flagged
[Orchestrator] final summary ready (trace id: T-001)
3.5 Real World Outcome
A user submits a task. The system returns a structured answer plus a trace log showing which agent produced each piece of information and which sources were validated.
4. Solution Architecture
4.1 High-Level Design
User Task -> Orchestrator -> [Planner, Researcher, Critic] -> Validation -> Final Output
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Role Registry | Define contracts | Human-defined schema |
| Task Router | Split tasks by role | Simple rule-based routing |
| Validator | Check outputs | Schema + evidence checks |
| Escalation Manager | Handle failures | Retry and fallback policies |
4.3 Data Structures
Pseudo-structures:
STRUCT RoleContract:
    name
    required_inputs
    required_outputs
    constraints
    done_condition
STRUCT TraceEvent:
    task_id
    role
    timestamp
    status
    notes
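In Python, the two pseudo-structures above might be written as dataclasses. This is a minimal sketch under stated assumptions: the field names come from the pseudo-structures, but the field types, the `status` values, the `to_json` helper, and the example researcher contract are illustrative choices. It assumes Python 3.9 or newer.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class RoleContract:
    """Contract for one role; field names follow the pseudo-structure."""
    name: str
    required_inputs: tuple[str, ...]
    required_outputs: tuple[str, ...]
    constraints: tuple[str, ...] = ()
    done_condition: str = ""


@dataclass
class TraceEvent:
    """One auditable step in the trace log."""
    task_id: str
    role: str
    status: str  # assumed values: "ok", "rejected", "escalated"
    notes: str = ""
    # Defaulted fields must come last, so timestamp is listed after status.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize the event as one JSON object (hypothetical helper)."""
        return json.dumps(asdict(self), sort_keys=True)


# Example contract; the specific inputs, outputs, and constraints are made up.
researcher = RoleContract(
    name="researcher",
    required_inputs=("plan",),
    required_outputs=("sources", "confidence"),
    constraints=("every claim cites a source",),
    done_condition="at least one source per plan step",
)
event = TraceEvent(task_id="T-001", role=researcher.name, status="ok")
print(event.to_json())
```

Making `RoleContract` frozen keeps a contract from being altered while a task is in flight, which helps keep the trace log consistent with the contract that was actually enforced.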
4.4 Algorithm Overview
Role Routing Algorithm
- Parse task into subtasks.
- Assign subtasks to roles based on contract.
- Validate each output.
- Escalate or accept.
Complexity Analysis:
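- Time: O(R * T), where R = roles and T = subtasks, because rule-based routing may check each subtask against every role contract. This counts routing work only; with a fixed retry limit, retries add a constant factor.
- Space: O(T) for trace logs, assuming a bounded number of attempts per subtask.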
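The four routing steps above can be sketched as a single loop. This is an illustration rather than a reference implementation: the `ROUTES` keyword table, the `;` task separator, the stub `run_role` function, and the `validate` rule are all hypothetical stand-ins for real agents and contracts.

```python
# Hypothetical keyword-to-role table used for rule-based routing.
ROUTES = {"plan": "planner", "find": "researcher", "check": "critic"}


def split_task(task: str) -> list[str]:
    """Step 1: naive split of a task into subtasks on ';'."""
    return [part.strip() for part in task.split(";") if part.strip()]


def assign_role(subtask: str) -> str:
    """Step 2: pick the first role whose keyword appears in the subtask."""
    for keyword, role in ROUTES.items():
        if keyword in subtask.lower():
            return role
    return "planner"  # assumed default when no rule matches


def run_role(role: str, subtask: str) -> dict:
    """Stub agent: a real system would call an LLM here."""
    output = {"role": role, "result": f"done: {subtask}"}
    if role == "researcher":
        output["sources"] = []  # deliberately empty to show escalation
    return output


def validate(output: dict) -> bool:
    """Step 3: researcher outputs must include at least one source."""
    if output["role"] == "researcher":
        return bool(output.get("sources"))
    return "result" in output


def route(task: str) -> list[tuple[str, str, str]]:
    """Step 4: accept valid outputs and mark the rest for escalation."""
    decisions = []
    for subtask in split_task(task):
        role = assign_role(subtask)
        output = run_role(role, subtask)
        status = "accepted" if validate(output) else "escalated"
        decisions.append((subtask, role, status))
    return decisions


for row in route("Plan the summary; Find sources; Check claims"):
    print(row)
```

Each subtask is compared against every routing rule in the worst case, which is where the O(R * T) routing cost comes from.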
5. Implementation Guide
5.1 Development Environment Setup
Install a runtime (Python or Node), set up a basic logging folder, and verify you can run a sample task.
5.2 Project Structure
project-root/
├── roles/
├── orchestrator/
├── validation/
├── logs/
└── README.md
5.3 The Core Question You’re Answering
“How do I assign clear responsibilities to agents so their outputs are reliable and auditable?”
5.4 Concepts You Must Understand First
- Role Contracts
  - What is a role responsible for?
  - Book Reference: “Clean Architecture” - Ch. 11
- Validation
  - How do you check structure and evidence?
  - Book Reference: “Release It!” - Ch. 4
5.5 Questions to Guide Your Design
- Role Boundaries
  - Where does one role’s responsibility end?
- Escalation
  - What triggers a retry or fallback?
5.6 Thinking Exercise
Sketch a flow of a task from Planner to Critic. Identify where errors could happen and how you’d detect them.
5.7 The Interview Questions They’ll Ask
- “How do you design role contracts for LLM agents?”
- “What is the difference between validation and evaluation?”
- “How do you prevent role overlap?”
- “What is an escalation policy?”
- “How do you audit agent output?”
5.8 Hints in Layers
Hint 1: Start with role definitions. Write a minimal contract for each role.
Hint 2: Structure outputs. Enforce a consistent output format.
Hint 3: Add validation. Reject outputs that are missing evidence.
Hint 4: Add escalation. Retry once, then fall back to the critic.
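Hints 2 through 4 can be combined into one small control loop. The sketch below is one reading of "retry once, then fall back": it assumes the fallback means asking the critic role to produce a replacement output, and the `REQUIRED_FIELDS` schema and both stub agents are hypothetical.

```python
from typing import Callable

Output = dict
Agent = Callable[[str], Output]

REQUIRED_FIELDS = ("summary", "evidence")  # assumed output schema


def is_valid(output: Output) -> bool:
    """Hints 2 and 3: required fields present and evidence non-empty."""
    has_fields = all(name in output for name in REQUIRED_FIELDS)
    return has_fields and len(output["evidence"]) > 0


def run_with_escalation(
    task: str, primary: Agent, critic: Agent
) -> tuple[Output, list[str]]:
    """Hint 4: try the primary role twice (one retry), then the critic."""
    trail = []
    for attempt in ("first try", "retry"):
        output = primary(task)
        if is_valid(output):
            trail.append(f"primary {attempt}: accepted")
            return output, trail
        trail.append(f"primary {attempt}: rejected")
    output = critic(task)
    if not is_valid(output):
        trail.append("critic fallback: rejected")
        raise ValueError(f"no valid output; trail={trail}")
    trail.append("critic fallback: accepted")
    return output, trail


# Stub agents standing in for LLM calls.
def weak_researcher(task: str) -> Output:
    return {"summary": f"notes on {task}", "evidence": []}


def careful_critic(task: str) -> Output:
    return {"summary": f"reviewed {task}", "evidence": ["source-1"]}


result, trail = run_with_escalation("topic X", weak_researcher, careful_critic)
print(trail)
```

The fallback output goes through the same `is_valid` check as the primary output, so the critic cannot bypass validation either.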
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Role boundaries | “Clean Architecture” | Ch. 11 |
| Reliability | “Release It!” | Ch. 4 |
5.10 Implementation Phases
Phase 1: Foundation (2-3 hours)
Goals:
- Define role contracts
- Create a basic router
Tasks:
- List role inputs/outputs
- Define a routing policy
Checkpoint: Able to route a task to roles.
Phase 2: Core Functionality (3-4 hours)
Goals:
- Implement validation checks
- Add trace logging
Tasks:
- Validate output structure
- Record logs for each step
Checkpoint: Logs show all role outputs.
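One simple way to reach this checkpoint is to append each step as a JSON line, which keeps the log both human-readable and machine-parseable. The sketch below is an assumption-laden example: the `log_event` helper, the `trace.jsonl` file name, and the status strings are hypothetical, and a temporary directory stands in for the project's logs/ folder.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_event(
    log_path: Path, task_id: str, role: str, status: str, notes: str = ""
) -> dict:
    """Append one trace event as a JSON line and return it."""
    event = {
        "task_id": task_id,
        "role": role,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": status,
        "notes": notes,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")
    return event


# A temporary directory stands in for the project's logs/ folder.
log_dir = Path(tempfile.mkdtemp())
log_file = log_dir / "trace.jsonl"

log_event(log_file, "T-001", "planner", "ok", "plan created (3 steps)")
log_event(log_file, "T-001", "researcher", "ok", "4 sources collected")
log_event(log_file, "T-001", "critic", "flagged", "1 claim flagged")

# Read the log back to confirm every role left a record.
lines = log_file.read_text(encoding="utf-8").splitlines()
roles_logged = [json.loads(line)["role"] for line in lines]
print(roles_logged)
```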
Phase 3: Polish & Edge Cases (2-3 hours)
Goals:
- Add escalation
- Handle low-confidence outputs
Tasks:
- Add retry policy
- Add fallback to critic
Checkpoint: System rejects invalid outputs and escalates.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Output format | Free text vs schema | Schema | Enables validation |
| Escalation | Retry vs human | Retry then critic | Keeps workflow automated |
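To see why a schema enables validation where free text does not, compare how each kind of reply fares against a mechanical check. This is a minimal sketch: it assumes agents reply in JSON, and the `SCHEMA` mapping and `schema_errors` helper are hypothetical.

```python
import json

# Assumed schema: field name -> required Python type.
SCHEMA = {"summary": str, "evidence": list, "risks": list}


def schema_errors(raw_reply: str) -> list[str]:
    """Return a list of problems; an empty list means the reply conforms."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return ["reply is not valid JSON"]
    if not isinstance(data, dict):
        return ["reply is not a JSON object"]
    errors = []
    for name, expected_type in SCHEMA.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif not isinstance(data[name], expected_type):
            errors.append(f"wrong type for {name}")
    return errors


free_text = "X is important. Sources: somewhere."
structured = '{"summary": "X in brief", "evidence": ["source-1"], "risks": []}'
print(schema_errors(free_text))
print(schema_errors(structured))
```

A free-text reply gives the validator nothing to check beyond "this is not structured", while a structured reply can be rejected field by field with a specific reason.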
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Validate roles and schema | Role contract validation |
| Integration Tests | Full pipeline | Task routed through all agents |
| Edge Case Tests | Reject malformed or unsupported outputs | Output with missing evidence is rejected |
6.2 Critical Test Cases
- Missing Evidence: Output without sources should fail.
- Invalid Schema: Output without required fields should fail.
- Low Confidence: Trigger escalation.
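The three critical cases above map directly onto unit tests. The sketch below uses the standard `unittest` module; the `check_output` function, the `confidence` field, and the 0.6 threshold are assumptions made for illustration, not values prescribed by this project.

```python
import unittest

CONFIDENCE_THRESHOLD = 0.6  # assumed cutoff; tune for your own agents


def check_output(output: dict) -> str:
    """Return 'reject', 'escalate', or 'accept' for a role output."""
    required = ("summary", "evidence", "risks", "confidence")
    if any(name not in output for name in required):
        return "reject"  # invalid schema
    if not output["evidence"]:
        return "reject"  # missing evidence
    if output["confidence"] < CONFIDENCE_THRESHOLD:
        return "escalate"  # low confidence
    return "accept"


class CriticalCases(unittest.TestCase):
    def setUp(self):
        self.good = {
            "summary": "X in brief",
            "evidence": ["source-1"],
            "risks": ["may be outdated"],
            "confidence": 0.9,
        }

    def test_missing_evidence_fails(self):
        self.assertEqual(check_output({**self.good, "evidence": []}), "reject")

    def test_invalid_schema_fails(self):
        broken = {k: v for k, v in self.good.items() if k != "risks"}
        self.assertEqual(check_output(broken), "reject")

    def test_low_confidence_escalates(self):
        low = {**self.good, "confidence": 0.2}
        self.assertEqual(check_output(low), "escalate")

    def test_valid_output_is_accepted(self):
        self.assertEqual(check_output(self.good), "accept")


# Run the suite in-process so the script keeps going after the tests.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CriticalCases)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Including one passing case alongside the three failure cases guards against a validator that simply rejects everything.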
6.3 Test Data
Task: "Summarize X"
Expected: summary + evidence + risks
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Role overlap | Duplicate outputs | Narrow role contracts |
| Missing validation | Hallucinations | Enforce schema and evidence checks |
| No escalation | Stuck loops | Add a retry limit, timeouts, and a fallback path |
7.2 Debugging Strategies
- Trace logs by task ID.
- Compare outputs to role contract expectations.
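Both strategies can be automated once the trace is machine-readable. The sketch below assumes a JSON-lines trace in which each event also records an `output_fields` list; that key, the `EXPECTED_FIELDS` table, and the sample trace are hypothetical, and the code assumes Python 3.9 or newer.

```python
import json

# Sample JSON-lines trace; in practice this would be read from the logs folder.
RAW_TRACE = """\
{"task_id": "T-001", "role": "planner", "status": "ok", "output_fields": ["steps"]}
{"task_id": "T-002", "role": "planner", "status": "ok", "output_fields": ["steps"]}
{"task_id": "T-001", "role": "researcher", "status": "ok", "output_fields": ["sources"]}
{"task_id": "T-001", "role": "critic", "status": "flagged", "output_fields": ["verdicts"]}
"""

# Hypothetical contract expectations: fields each role must emit.
EXPECTED_FIELDS = {
    "planner": {"steps"},
    "researcher": {"sources", "confidence"},
    "critic": {"verdicts"},
}


def events_for(task_id: str, raw: str) -> list[dict]:
    """Keep only the events that belong to one task, in log order."""
    events = [json.loads(line) for line in raw.splitlines() if line.strip()]
    return [event for event in events if event["task_id"] == task_id]


def contract_gaps(events: list[dict]) -> dict[str, set[str]]:
    """Map each role to the contract fields its logged output lacks."""
    gaps = {}
    for event in events:
        missing = EXPECTED_FIELDS[event["role"]] - set(event["output_fields"])
        if missing:
            gaps[event["role"]] = missing
    return gaps


trace = events_for("T-001", RAW_TRACE)
print([event["role"] for event in trace])
print(contract_gaps(trace))
```

In this sample the researcher event was logged as "ok" even though its output lacks a `confidence` field, which is the kind of mismatch a contract comparison is meant to expose.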
7.3 Performance Traps
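- Every retry is another model call, so an uncapped retry policy can multiply cost.
- Overly strict validation combined with unbounded retries can cause infinite loops; cap the number of attempts.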
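A task-wide call budget is one way to guard against both traps at once. The sketch below is illustrative only: `RetryBudget`, the limit of three calls, and the always-failing validator are hypothetical and chosen to show that the loop ends even when no output ever passes.

```python
class RetryBudget:
    """Caps the total number of agent calls for one task."""

    def __init__(self, max_calls: int) -> None:
        self.max_calls = max_calls
        self.calls = 0

    def spend(self) -> bool:
        """Record one call; return False once the budget is exhausted."""
        if self.calls >= self.max_calls:
            return False
        self.calls += 1
        return True


def strict_validator(output: str) -> bool:
    """Stand-in for an overly strict validator: nothing ever passes."""
    return False


def run_until_valid(budget: RetryBudget) -> str:
    """Retry while the budget allows, then hand off instead of looping."""
    while budget.spend():
        output = f"draft {budget.calls}"  # placeholder for an agent call
        if strict_validator(output):
            return output
    return "escalated: retry budget exhausted"


budget = RetryBudget(max_calls=3)
outcome = run_until_valid(budget)
print(outcome, budget.calls)
```

Because the budget is shared across the whole task rather than reset per role, the worst-case number of model calls is known before the task starts.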
8. Extensions & Challenges
8.1 Beginner Extensions
- Add a new “Editor” role.
- Add confidence scoring.
8.2 Intermediate Extensions
- Add a role registry UI.
- Add a human escalation path.
8.3 Advanced Extensions
- Auto-generate role contracts from templates.
- Add multi-task batch routing.
9. Real-World Connections
9.1 Industry Applications
- Agentic copilots for software delivery.
- Multi-stage compliance reviews.
9.2 Related Open Source Projects
- LangGraph (workflow-oriented agent orchestration)
- AutoGen (multi-agent collaboration frameworks)
9.3 Interview Relevance
- Explaining role boundaries and validation policies is a common interview topic.
10. Resources
10.1 Essential Reading
- “Clean Architecture” by Robert C. Martin - Role boundaries and interfaces
- “Release It!” by Michael Nygard - Reliability and failure handling
10.2 Video Resources
- Architecture trade-offs talks (search for “software architecture trade-offs”)
10.3 Tools & Documentation
- FIPA ACL Specification: http://www.fipa.org/specs/fipa00061/
10.4 Related Projects in This Series
- Next Project: Planning Board with Delegation (P02)
11. Self-Assessment Checklist
11.1 Understanding
- I can define role contracts with explicit constraints
- I can explain escalation policies
11.2 Implementation
- Role outputs are validated
- Trace logs are produced
11.3 Growth
- I can describe trade-offs in autonomy vs control
12. Submission / Completion Criteria
Minimum Viable Completion:
- Role contracts are documented
- Validation checks exist
- Logs are produced
Full Completion:
- Escalation works
- Failure cases are tested
Excellence (Going Above & Beyond):
- Human escalation and audit UI added
This guide was generated from LEARN_COMPLEX_MULTI_AGENT_SYSTEMS_DEEP_DIVE.md. For the complete learning path, see the README.