Project 1: Role-Defined Orchestrator

Build a role-based orchestrator that routes tasks to specialized agents with explicit contracts and validation.

Quick Reference

| Attribute | Value |
| --- | --- |
| Difficulty | Level 3 |
| Time Estimate | 8-12 hours |
| Language | Python (Alternatives: TypeScript, Go) |
| Prerequisites | Basic LLM prompting, structured outputs, logging |
| Key Topics | Roles, autonomy, escalation, validation |

1. Learning Objectives

By completing this project, you will:

  1. Define explicit role contracts for planner, researcher, and critic agents.
  2. Implement escalation paths when confidence is low.
  3. Build validation checks that enforce output structure and evidence.
  4. Produce a trace log that makes every decision auditable.

2. Theoretical Foundation

2.1 Core Concepts

  • Role Contracts: A role is a contract specifying scope, inputs, outputs, and constraints.
  • Autonomy vs. Control: Autonomy improves flexibility but increases risk without guardrails.
  • Escalation Paths: When confidence is low, agents should request help rather than guess.

2.2 Why This Matters

Without explicit roles, agents overlap, duplicate work, or miss critical tasks. Role contracts make responsibilities testable and allow the system to detect failures early.

2.3 Historical Context / Background

Early distributed AI used role-specific agents (e.g., blackboard systems). Modern LLMs make role specification easier but do not remove the need for clear contracts.

2.4 Common Misconceptions

  • “Roles are just prompts.” Roles are system-level interfaces with validation rules.
  • “More autonomy always helps.” It can reduce reliability without guardrails.

3. Project Specification

3.1 What You Will Build

A small orchestrator that accepts a task, delegates to three roles, validates outputs, and returns a final result with trace logs.

3.2 Functional Requirements

  1. Role Registry: Store role contracts and descriptions.
  2. Task Router: Assign task segments to roles.
  3. Validation Layer: Ensure outputs match role contracts.
  4. Escalation Logic: Trigger revisions when outputs fail validation.

3.3 Non-Functional Requirements

  • Reliability: Outputs must be traceable to a role.
  • Usability: Logs must be readable and structured.
  • Safety: No agent can bypass validation.

3.4 Example Usage / Output

$ run-orchestrator --task "Summarize a topic with sources and risks"

[Planner] plan created (3 steps)
[Researcher] 4 sources collected
[Critic] 2 claims validated, 1 claim flagged
[Orchestrator] final summary ready (trace id: T-001)

3.5 Real World Outcome

A user submits a task. The system returns a structured answer plus a trace log showing which agent produced each piece of information and which sources were validated.


4. Solution Architecture

4.1 High-Level Design

User Task -> Orchestrator -> [Planner, Researcher, Critic] -> Validation -> Final Output
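The flow above can be sketched as a minimal pipeline. This sketch runs the roles sequentially for simplicity (a real system would dispatch to LLM-backed agents and may run roles concurrently); the function names and trace shape are assumptions, not part of the specification:

```python
def orchestrate(task, roles, validate):
    """Route a task through each role in order, validating every output.

    roles: list of (name, agent) pairs, e.g. [("planner", plan_fn), ...]
    validate: callable (role_name, output) -> bool
    """
    trace = []
    result = task
    for name, agent in roles:
        output = agent(result)
        ok = validate(name, output)
        trace.append({"role": name, "status": "ok" if ok else "rejected"})
        if not ok:
            raise ValueError(f"{name} output failed validation")
        result = output
    return result, trace
```

The trace list is what later becomes the auditable log: one event per role, appended before the accept/reject decision is acted on, so failed steps are still recorded.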

4.2 Key Components

| Component | Responsibility | Key Decisions |
| --- | --- | --- |
| Role Registry | Define contracts | Human-defined schema |
| Task Router | Split tasks by role | Simple rule-based routing |
| Validator | Check outputs | Schema + evidence checks |
| Escalation Manager | Handle failures | Retry and fallback policies |

4.3 Data Structures

Pseudo-structures:

STRUCT RoleContract:
  name
  required_inputs
  required_outputs
  constraints
  done_condition

STRUCT TraceEvent:
  task_id
  role
  timestamp
  status
  notes
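In Python, the pseudo-structures above map naturally onto dataclasses. The field names follow the structures as given; the types are assumptions you may tighten for your own contracts:

```python
from dataclasses import dataclass


@dataclass
class RoleContract:
    """Contract specifying a role's scope, I/O, and completion criteria."""
    name: str
    required_inputs: list[str]
    required_outputs: list[str]
    constraints: list[str]
    done_condition: str


@dataclass
class TraceEvent:
    """One auditable entry in the trace log."""
    task_id: str
    role: str
    timestamp: float
    status: str
    notes: str = ""
```

Using dataclasses keeps the contracts declarative: a validator can iterate over `required_outputs` directly instead of hard-coding field names.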

4.4 Algorithm Overview

Role Routing Algorithm

  1. Parse task into subtasks.
  2. Assign subtasks to roles based on contract.
  3. Validate each output.
  4. Escalate or accept.

Complexity Analysis:

  • Time: O(R * T) where R = roles, T = subtasks
  • Space: O(T) for logs
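Step 2 of the algorithm (assigning subtasks to roles) can be sketched as a rule-based lookup. The capability-matching scheme here is one simple assumption about how contracts drive routing; richer routers might score multiple candidate roles:

```python
def route(subtasks, contracts):
    """Assign each subtask to the first role whose contract covers it.

    subtasks: dict of subtask id -> required capability (assumed shape)
    contracts: dict of role name -> set of capabilities (assumed shape)
    """
    assignments = {}
    for sub_id, needed in subtasks.items():
        for role, capabilities in contracts.items():
            if needed in capabilities:
                assignments[sub_id] = role
                break
        else:
            # No role matched: surface this for escalation rather than guessing.
            assignments[sub_id] = "escalate"
    return assignments
```

Note the `for ... else` clause: unroutable subtasks are flagged for escalation instead of being dropped, which matches the "request help rather than guess" principle from section 2.1.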

5. Implementation Guide

5.1 Development Environment Setup

Install a runtime (Python or Node), set up a basic logging folder, and verify you can run a sample task.

5.2 Project Structure

project-root/
├── roles/
├── orchestrator/
├── validation/
├── logs/
└── README.md

5.3 The Core Question You’re Answering

“How do I assign clear responsibilities to agents so their outputs are reliable and auditable?”

5.4 Concepts You Must Understand First

  1. Role Contracts
    • What is a role responsible for?
    • Book Reference: “Clean Architecture” - Ch. 11
  2. Validation
    • How do you check structure and evidence?
    • Book Reference: “Release It!” - Ch. 4

5.5 Questions to Guide Your Design

  1. Role Boundaries
    • Where does one role’s responsibility end?
  2. Escalation
    • What triggers a retry or fallback?

5.6 Thinking Exercise

Sketch a flow of a task from Planner to Critic. Identify where errors could happen and how you’d detect them.

5.7 The Interview Questions They’ll Ask

  1. “How do you design role contracts for LLM agents?”
  2. “What is the difference between validation and evaluation?”
  3. “How do you prevent role overlap?”
  4. “What is an escalation policy?”
  5. “How do you audit agent output?”

5.8 Hints in Layers

Hint 1: Start with role definitions. Write a minimal contract for each role.

Hint 2: Structure outputs. Enforce a consistent output format.

Hint 3: Add validation. Reject outputs that lack evidence.

Hint 4: Add escalation. Retry once, then fall back to the critic.
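Hints 2 and 3 together amount to a small validator. This is one possible sketch; the required fields (`summary`, `evidence`, `risks`) are illustrative and should come from your role contracts:

```python
def validate_output(output: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the output passes."""
    errors = []
    # Hint 2: enforce a consistent output structure.
    for field_name in ("summary", "evidence", "risks"):
        if field_name not in output:
            errors.append(f"missing field: {field_name}")
    # Hint 3: reject outputs that lack evidence, even if the field exists.
    if not output.get("evidence"):
        errors.append("no evidence provided")
    return errors
```

Returning a list of errors (rather than a bare boolean) makes the trace log more useful: the escalation manager can record exactly why an output was rejected.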


5.9 Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Role boundaries | “Clean Architecture” | Ch. 11 |
| Reliability | “Release It!” | Ch. 4 |

5.10 Implementation Phases

Phase 1: Foundation (2-3 hours)

Goals:

  • Define role contracts
  • Create a basic router

Tasks:

  1. List role inputs/outputs
  2. Define a routing policy

Checkpoint: Able to route a task to roles.

Phase 2: Core Functionality (3-4 hours)

Goals:

  • Implement validation checks
  • Add trace logging

Tasks:

  1. Validate output structure
  2. Record logs for each step

Checkpoint: Logs show all role outputs.

Phase 3: Polish & Edge Cases (2-3 hours)

Goals:

  • Add escalation
  • Handle low-confidence outputs

Tasks:

  1. Add retry policy
  2. Add fallback to critic

Checkpoint: System rejects invalid outputs and escalates.
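The Phase 3 retry-then-fallback policy can be sketched as follows. The agent and critic here are plain callables standing in for LLM-backed roles, and the status strings are assumptions:

```python
def run_with_escalation(agent, critic, task, validate, max_retries=1):
    """Run an agent; retry on validation failure, then fall back to the critic."""
    for _attempt in range(max_retries + 1):
        output = agent(task)
        if validate(output):
            return output, "accepted"
    # All retries failed validation: escalate to the critic (Phase 3 fallback).
    fallback = critic(task)
    if validate(fallback):
        return fallback, "accepted-by-critic"
    # Nothing passed: reject rather than return unvalidated output.
    return None, "rejected"
```

Capping retries with `max_retries` addresses the performance trap noted in section 7.3: without a bound, strict validation plus automatic retries can loop indefinitely.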

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
| --- | --- | --- | --- |
| Output format | Free text vs. schema | Schema | Enables validation |
| Escalation | Retry vs. human | Retry, then critic | Keeps workflow automated |

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
| --- | --- | --- |
| Unit Tests | Validate roles and schema | Role contract validation |
| Integration Tests | Exercise the full pipeline | Task routed through all agents |
| Edge Case Tests | Handle missing evidence | Reject the output |

6.2 Critical Test Cases

  1. Missing Evidence: Output without sources should fail.
  2. Invalid Schema: Output without required fields should fail.
  3. Low Confidence: Trigger escalation.

6.3 Test Data

Task: "Summarize X"
Expected: summary + evidence + risks
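The critical test cases above can be written as plain assertions against a minimal schema check. The validator here is a stand-in for your own; the required fields mirror the expected output (summary + evidence + risks):

```python
def has_required_fields(output: dict) -> bool:
    """Stand-in validator: true only if all contract fields are present."""
    return all(key in output for key in ("summary", "evidence", "risks"))


# Test case 1: missing evidence -> should fail
assert not has_required_fields({"summary": "X summarized", "risks": ["r1"]})

# Test case 2: invalid schema (multiple fields missing) -> should fail
assert not has_required_fields({"summary": "X summarized"})

# A complete output passes
assert has_required_fields({"summary": "X", "evidence": ["s1"], "risks": []})
```

Test case 3 (low confidence triggering escalation) is not shown here, since it depends on how your agents report confidence; it would assert that the escalation manager is invoked rather than checking fields.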

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
| --- | --- | --- |
| Role overlap | Duplicate outputs | Narrow role contracts |
| Missing validation | Hallucinations | Enforce schema checks |
| No escalation | Stuck loops | Add timeouts |

7.2 Debugging Strategies

  • Trace logs by task ID.
  • Compare outputs to role contract expectations.

7.3 Performance Traps

  • Too many retries can explode cost.
  • Overly strict validation may cause infinite loops.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a new “Editor” role.
  • Add confidence scoring.

8.2 Intermediate Extensions

  • Add a role registry UI.
  • Add a human escalation path.

8.3 Advanced Extensions

  • Auto-generate role contracts from templates.
  • Add multi-task batch routing.

9. Real-World Connections

9.1 Industry Applications

  • Agentic copilots for software delivery.
  • Multi-stage compliance reviews.
9.2 Related Frameworks

  • LangGraph (workflow-oriented agent orchestration)
  • AutoGen (multi-agent collaboration framework)

9.3 Interview Relevance

  • Explaining role boundaries and validation policies is a common interview topic.

10. Resources

10.1 Essential Reading

  • “Clean Architecture” by Robert C. Martin - Role boundaries and interfaces
  • “Release It!” by Michael Nygard - Reliability and failure handling

10.2 Video Resources

  • Architecture trade-offs talks (search for “software architecture trade-offs”)

10.3 Tools & Documentation

  • FIPA ACL Specification: http://www.fipa.org/specs/fipa00061/
  • Next Project: Planning Board with Delegation (P02)

11. Self-Assessment Checklist

11.1 Understanding

  • I can define role contracts with explicit constraints
  • I can explain escalation policies

11.2 Implementation

  • Role outputs are validated
  • Trace logs are produced

11.3 Growth

  • I can describe trade-offs in autonomy vs control

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Role contracts are documented
  • Validation checks exist
  • Logs are produced

Full Completion:

  • Escalation works
  • Failure cases are tested

Excellence (Going Above & Beyond):

  • Human escalation and audit UI added

This guide was generated from LEARN_COMPLEX_MULTI_AGENT_SYSTEMS_DEEP_DIVE.md. For the complete learning path, see the README.