Project 1: Role-Defined Orchestrator

Build a role-based orchestrator that routes tasks to specialized agents with explicit contracts and validation.

Quick Reference

| Attribute | Value |
| --- | --- |
| Difficulty | Level 3 |
| Time Estimate | 8-12 hours |
| Language | Python (Alternatives: TypeScript, Go) |
| Prerequisites | Basic LLM prompting, structured outputs, logging |
| Key Topics | Roles, autonomy, escalation, validation |

1. Learning Objectives

By completing this project, you will:

  1. Define explicit role contracts for planner, researcher, and critic agents.
  2. Implement escalation paths when confidence is low.
  3. Build validation checks that enforce output structure and evidence.
  4. Produce a trace log that makes every decision auditable.

2. Theoretical Foundation

2.1 Core Concepts

  • Role Contracts: A role is a contract specifying scope, inputs, outputs, and constraints.
  • Autonomy vs. Control: Autonomy improves flexibility but increases risk without guardrails.
  • Escalation Paths: When confidence is low, agents should request help rather than guess.

2.2 Why This Matters

Without explicit roles, agents overlap, duplicate work, or miss critical tasks. Role contracts make responsibilities testable and allow the system to detect failures early.

2.3 Historical Context / Background

Early distributed AI used role-specific agents (e.g., blackboard systems). Modern LLMs make role specification easier but do not remove the need for clear contracts.

2.4 Common Misconceptions

  • “Roles are just prompts.” Roles are system-level interfaces with validation rules.
  • “More autonomy always helps.” It can reduce reliability without guardrails.

3. Project Specification

3.1 What You Will Build

A small orchestrator that accepts a task, delegates to three roles, validates outputs, and returns a final result with trace logs.

3.2 Functional Requirements

  1. Role Registry: Store role contracts and descriptions.
  2. Task Router: Assign task segments to roles.
  3. Validation Layer: Ensure outputs match role contracts.
  4. Escalation Logic: Trigger revisions when outputs fail validation.

3.3 Non-Functional Requirements

  • Reliability: Outputs must be traceable to a role.
  • Usability: Logs must be readable and structured.
  • Safety: No agent can bypass validation.

3.4 Example Usage / Output

$ run-orchestrator --task "Summarize a topic with sources and risks"

[Planner] plan created (3 steps)
[Researcher] 4 sources collected
[Critic] 2 claims validated, 1 claim flagged
[Orchestrator] final summary ready (trace id: T-001)

3.5 Real World Outcome

A user submits a task. The system returns a structured answer plus a trace log showing which agent produced each piece of information and which sources were validated.


4. Solution Architecture

4.1 High-Level Design

User Task -> Orchestrator -> [Planner, Researcher, Critic] -> Validation -> Final Output
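The flow above can be sketched as a minimal pipeline. This sketch runs the roles sequentially for simplicity (a real system would dispatch to LLM-backed agents and may run roles concurrently); the function names and trace shape are assumptions, not part of the specification:

```python
def orchestrate(task, roles, validate):
    """Route a task through each role in order, validating every output.

    roles: list of (name, agent) pairs, e.g. [("planner", plan_fn), ...]
    validate: callable (role_name, output) -> bool
    """
    trace = []
    result = task
    for name, agent in roles:
        output = agent(result)
        ok = validate(name, output)
        trace.append({"role": name, "status": "ok" if ok else "rejected"})
        if not ok:
            raise ValueError(f"{name} output failed validation")
        result = output
    return result, trace
```

The trace list is what later becomes the auditable log: one event per role, appended before the accept/reject decision is acted on, so failed steps are still recorded.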

4.2 Key Components

| Component | Responsibility | Key Decisions |
| --- | --- | --- |
| Role Registry | Define contracts | Human-defined schema |
| Task Router | Split tasks by role | Simple rule-based routing |
| Validator | Check outputs | Schema + evidence checks |
| Escalation Manager | Handle failures | Retry and fallback policies |

4.3 Data Structures

Pseudo-structures:

STRUCT RoleContract:
  name
  required_inputs
  required_outputs
  constraints
  done_condition

STRUCT TraceEvent:
  task_id
  role
  timestamp
  status
  notes
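In Python, the pseudo-structures above map naturally onto dataclasses. The field names follow the structures as given; the types are assumptions you may tighten for your own contracts:

```python
from dataclasses import dataclass


@dataclass
class RoleContract:
    """Contract specifying a role's scope, I/O, and completion criteria."""
    name: str
    required_inputs: list[str]
    required_outputs: list[str]
    constraints: list[str]
    done_condition: str


@dataclass
class TraceEvent:
    """One auditable entry in the trace log."""
    task_id: str
    role: str
    timestamp: float
    status: str
    notes: str = ""
```

Using dataclasses keeps the contracts declarative: a validator can iterate over `required_outputs` directly instead of hard-coding field names.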

4.4 Algorithm Overview

Role Routing Algorithm

  1. Parse task into subtasks.
  2. Assign subtasks to roles based on contract.
  3. Validate each output.
  4. Escalate or accept.

Complexity Analysis:

  • Time: O(R * T) where R = roles, T = subtasks
  • Space: O(T) for logs
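Step 2 of the algorithm (assigning subtasks to roles) can be sketched as a rule-based lookup. The capability-matching scheme here is one simple assumption about how contracts drive routing; richer routers might score multiple candidate roles:

```python
def route(subtasks, contracts):
    """Assign each subtask to the first role whose contract covers it.

    subtasks: dict of subtask id -> required capability (assumed shape)
    contracts: dict of role name -> set of capabilities (assumed shape)
    """
    assignments = {}
    for sub_id, needed in subtasks.items():
        for role, capabilities in contracts.items():
            if needed in capabilities:
                assignments[sub_id] = role
                break
        else:
            # No role matched: surface this for escalation rather than guessing.
            assignments[sub_id] = "escalate"
    return assignments
```

Note the `for ... else` clause: unroutable subtasks are flagged for escalation instead of being dropped, which matches the "request help rather than guess" principle from section 2.1.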

5. Implementation Guide

5.1 Development Environment Setup

Install a runtime (Python or Node), set up a basic logging folder, and verify you can run a sample task.

5.2 Project Structure

project-root/
├── roles/
├── orchestrator/
├── validation/
├── logs/
└── README.md

5.3 The Core Question You’re Answering

“How do I assign clear responsibilities to agents so their outputs are reliable and auditable?”

5.4 Concepts You Must Understand First

  1. Role Contracts
    • What is a role responsible for?
    • Book Reference: “Clean Architecture” - Ch. 11
  2. Validation
    • How do you check structure and evidence?
    • Book Reference: “Release It!” - Ch. 4

5.5 Questions to Guide Your Design

  1. Role Boundaries
    • Where does one role’s responsibility end?
  2. Escalation
    • What triggers a retry or fallback?

5.6 Thinking Exercise

Sketch a flow of a task from Planner to Critic. Identify where errors could happen and how you’d detect them.

5.7 The Interview Questions They’ll Ask

  1. “How do you design role contracts for LLM agents?”
  2. “What is the difference between validation and evaluation?”
  3. “How do you prevent role overlap?”
  4. “What is an escalation policy?”
  5. “How do you audit agent output?”

5.8 Hints in Layers

Hint 1: Start with role definitions. Write a minimal contract for each role.

Hint 2: Structure outputs. Enforce a consistent output format.

Hint 3: Add validation. Reject outputs that lack evidence.

Hint 4: Add escalation. Retry once, then fall back to the critic.
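Hints 2 and 3 together amount to a small validator. This is one possible sketch; the required fields (`summary`, `evidence`, `risks`) are illustrative and should come from your role contracts:

```python
def validate_output(output: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the output passes."""
    errors = []
    # Hint 2: enforce a consistent output structure.
    for field_name in ("summary", "evidence", "risks"):
        if field_name not in output:
            errors.append(f"missing field: {field_name}")
    # Hint 3: reject outputs that lack evidence, even if the field exists.
    if not output.get("evidence"):
        errors.append("no evidence provided")
    return errors
```

Returning a list of errors (rather than a bare boolean) makes the trace log more useful: the escalation manager can record exactly why an output was rejected.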


5.9 Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Role boundaries | “Clean Architecture” | Ch. 11 |
| Reliability | “Release It!” | Ch. 4 |

5.10 Implementation Phases

Phase 1: Foundation (2-3 hours)

Goals:

  • Define role contracts
  • Create a basic router

Tasks:

  1. List role inputs/outputs
  2. Define a routing policy

Checkpoint: Able to route a task to roles.

Phase 2: Core Functionality (3-4 hours)

Goals:

  • Implement validation checks
  • Add trace logging

Tasks:

  1. Validate output structure
  2. Record logs for each step

Checkpoint: Logs show all role outputs.

Phase 3: Polish & Edge Cases (2-3 hours)

Goals:

  • Add escalation
  • Handle low-confidence outputs

Tasks:

  1. Add retry policy
  2. Add fallback to critic

Checkpoint: System rejects invalid outputs and escalates.
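The Phase 3 retry-then-fallback policy can be sketched as follows. The agent and critic here are plain callables standing in for LLM-backed roles, and the status strings are assumptions:

```python
def run_with_escalation(agent, critic, task, validate, max_retries=1):
    """Run an agent; retry on validation failure, then fall back to the critic."""
    for _attempt in range(max_retries + 1):
        output = agent(task)
        if validate(output):
            return output, "accepted"
    # All retries failed validation: escalate to the critic (Phase 3 fallback).
    fallback = critic(task)
    if validate(fallback):
        return fallback, "accepted-by-critic"
    # Nothing passed: reject rather than return unvalidated output.
    return None, "rejected"
```

Capping retries with `max_retries` addresses the performance trap noted in section 7.3: without a bound, strict validation plus automatic retries can loop indefinitely.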

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
| --- | --- | --- | --- |
| Output format | Free text vs. schema | Schema | Enables validation |
| Escalation | Retry vs. human | Retry, then critic | Keeps workflow automated |

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
| --- | --- | --- |
| Unit Tests | Validate roles and schema | Role contract validation |
| Integration Tests | Exercise the full pipeline | Task routed through all agents |
| Edge Case Tests | Handle missing evidence | Reject the output |

6.2 Critical Test Cases

  1. Missing Evidence: Output without sources should fail.
  2. Invalid Schema: Output without required fields should fail.
  3. Low Confidence: Trigger escalation.

6.3 Test Data

Task: "Summarize X"
Expected: summary + evidence + risks
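The critical test cases above can be written as plain assertions against a minimal schema check. The validator here is a stand-in for your own; the required fields mirror the expected output (summary + evidence + risks):

```python
def has_required_fields(output: dict) -> bool:
    """Stand-in validator: true only if all contract fields are present."""
    return all(key in output for key in ("summary", "evidence", "risks"))


# Test case 1: missing evidence -> should fail
assert not has_required_fields({"summary": "X summarized", "risks": ["r1"]})

# Test case 2: invalid schema (multiple fields missing) -> should fail
assert not has_required_fields({"summary": "X summarized"})

# A complete output passes
assert has_required_fields({"summary": "X", "evidence": ["s1"], "risks": []})
```

Test case 3 (low confidence triggering escalation) is not shown here, since it depends on how your agents report confidence; it would assert that the escalation manager is invoked rather than checking fields.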

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
| --- | --- | --- |
| Role overlap | Duplicate outputs | Narrow role contracts |
| Missing validation | Hallucinations | Enforce schema checks |
| No escalation | Stuck loops | Add timeouts |

7.2 Debugging Strategies

  • Trace logs by task ID.
  • Compare outputs to role contract expectations.

7.3 Performance Traps

  • Too many retries can explode cost.
  • Overly strict validation may cause infinite loops.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a new “Editor” role.
  • Add confidence scoring.

8.2 Intermediate Extensions

  • Add a role registry UI.
  • Add a human escalation path.

8.3 Advanced Extensions

  • Auto-generate role contracts from templates.
  • Add multi-task batch routing.

9. Real-World Connections

9.1 Industry Applications

  • Agentic copilots for software delivery.
  • Multi-stage compliance reviews.
9.2 Related Frameworks

  • LangGraph (workflow-oriented agent orchestration)
  • AutoGen (multi-agent collaboration framework)

9.3 Interview Relevance

  • Explaining role boundaries and validation policies is a common interview topic.

10. Resources

10.1 Essential Reading

  • “Clean Architecture” by Robert C. Martin - Role boundaries and interfaces
  • “Release It!” by Michael Nygard - Reliability and failure handling

10.2 Video Resources

  • Architecture trade-offs talks (search for “software architecture trade-offs”)

10.3 Tools & Documentation

  • FIPA ACL Specification: http://www.fipa.org/specs/fipa00061/
  • Next Project: Planning Board with Delegation (P02)

11. Self-Assessment Checklist

11.1 Understanding

  • I can define role contracts with explicit constraints
  • I can explain escalation policies

11.2 Implementation

  • Role outputs are validated
  • Trace logs are produced

11.3 Growth

  • I can describe trade-offs in autonomy vs control

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Role contracts are documented
  • Validation checks exist
  • Logs are produced

Full Completion:

  • Escalation works
  • Failure cases are tested

Excellence (Going Above & Beyond):

  • Human escalation and audit UI added

This guide was generated from LEARN_COMPLEX_MULTI_AGENT_SYSTEMS_DEEP_DIVE.md. For the complete learning path, see the README.