Project 24: Architecture Pattern Decision Lab

Benchmark reactive, planner, critic-loop, and multi-agent architectures on one workload.


Quick Reference

| Attribute | Value |
| --- | --- |
| Difficulty | Level 4: Expert |
| Time Estimate | 12-22 hours |
| Language | TypeScript (alt: Python) |
| Prerequisites | Projects 5, 8, 13 |
| Key Topics | pattern tradeoffs, memory design, workflow durability, tool contracts |

Learning Objectives

  1. Compare architecture patterns with the same task and metrics.
  2. Design memory and state boundaries for long-running workflows.
  3. Enforce deterministic tool contracts across patterns.
  4. Produce an evidence-backed architecture decision record.

The Core Question You’re Answering

“Which architecture pattern is optimal for this workload and why?”


Concepts You Must Understand First

| Concept | Why It Matters | Where to Learn |
| --- | --- | --- |
| Reactive vs. deliberative control | Shapes latency and quality | ReAct resources |
| Critic loops | Improve correctness at added cost | Reflexion references |
| Durable workflow state | Enables long-running jobs | Workflow-engine docs |
| Memory taxonomy | Prevents context bloat | Memory-architecture papers |

Theoretical Foundation

Workload -> Pattern Runner -> Metrics Collector -> Tradeoff Matrix -> ADR

Pattern choice should be data-driven, not framework-driven.
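The pipeline above can be read as a set of data shapes flowing left to right. A minimal TypeScript sketch follows; all type names, the weights, and the normalization scales are illustrative assumptions, not part of the spec:

```typescript
// Illustrative types for the benchmark pipeline (names are hypothetical).
interface Workload {
  id: string;                                    // e.g. "vendor_onboarding"
  cases: { input: string; expected: string }[];
}

interface RunMetrics {
  pattern: string;    // "reactive" | "planner" | "critic" | "hierarchical"
  success: number;    // fraction of cases passing the evaluator, 0..1
  latencyMs: number;  // mean end-to-end latency
  costUsd: number;    // mean per-case spend
}

// The tradeoff matrix is the collected metrics plus a weighted score.
// Weights and normalization constants are arbitrary placeholders.
function score(
  m: RunMetrics,
  w = { success: 0.6, latency: 0.2, cost: 0.2 }
): number {
  return w.success * m.success
       - w.latency * (m.latencyMs / 10_000)   // lower latency is better
       - w.cost * (m.costUsd / 0.2);          // lower cost is better
}
```

The weighting itself is a decision criterion, so it belongs in the explicit decision criteria rather than buried in code.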


Project Specification

What You’ll Build

A benchmark harness that runs one business workflow through four patterns (reactive, planner, critic-loop, and hierarchical multi-agent) and records quality, latency, cost, and operational complexity for each.

Functional Requirements

  1. Four pattern implementations with shared tool contracts
  2. Common workload and evaluator
  3. Metrics and comparison report
  4. Architecture decision recommendation

Non-Functional Requirements

  • Reproducible benchmark runs
  • Comparable prompt and tool budgets
  • Explicit decision criteria
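Reproducibility and comparable budgets can both live in one locked run config. A sketch, assuming hypothetical field names; mulberry32 is a small public-domain PRNG that makes case sampling repeatable across patterns and reruns:

```typescript
// One config shared by every pattern variant, so budgets stay comparable.
interface BenchConfig {
  seed: number;             // fixes sampling order across reruns
  maxPromptTokens: number;  // same prompt budget for every pattern
  maxToolCalls: number;     // same tool budget for every pattern
}

// mulberry32: a tiny deterministic PRNG (public-domain algorithm).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
```

Avoid `Math.random()` anywhere in the harness: a single unseeded call is enough to make "repeatability tests" fail intermittently.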

Real World Outcome

```
$ npm run p24:compare -- --workload "vendor_onboarding"
[reactive] success=0.71 latency=2.2s cost=$0.03
[planner] success=0.86 latency=4.9s cost=$0.07
[critic] success=0.90 latency=7.4s cost=$0.11
[hierarchical] success=0.92 latency=8.1s cost=$0.14
[artifact] architecture_decision_matrix.md
```
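The per-pattern report lines above reduce to a one-line formatter. A sketch, with a hypothetical function name:

```typescript
// Formats one benchmark result in the "[pattern] success=… latency=…s cost=$…"
// shape shown in the sample output above.
function formatResult(
  pattern: string,
  success: number,
  latencyS: number,
  costUsd: number
): string {
  return `[${pattern}] success=${success.toFixed(2)}` +
         ` latency=${latencyS.toFixed(1)}s cost=$${costUsd.toFixed(2)}`;
}
```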

Architecture Overview

Pattern Adapters -> Shared Tool Layer -> Common Evaluator -> Decision Reporter
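Keeping the evaluator and reporter pattern-agnostic comes down to every adapter exposing the same surface over the shared tool layer. A minimal sketch, with illustrative interface names and a stub adapter standing in for a real reactive loop:

```typescript
interface ToolCall { name: string; args: Record<string, unknown> }
interface ToolLayer { invoke(call: ToolCall): Promise<unknown> }

// Every pattern variant implements this one contract, so the evaluator
// and decision reporter never special-case a pattern.
interface PatternAdapter {
  name: string;
  run(
    input: string,
    tools: ToolLayer
  ): Promise<{ output: string; toolCalls: ToolCall[] }>;
}

// Stub "reactive" adapter: one tool call, no planning loop.
const reactiveStub: PatternAdapter = {
  name: "reactive",
  async run(input, tools) {
    const call: ToolCall = { name: "lookup", args: { q: input } };
    const result = await tools.invoke(call);
    return { output: String(result), toolCalls: [call] };
  },
};
```

Returning the tool-call trace alongside the output is what lets the metrics collector charge each pattern for its tool budget.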

Implementation Guide

Phase 1: Shared Interfaces

  • Normalize state and tool contract schemas.
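A shared tool contract can be as simple as a field-type map checked before every invocation. A dependency-free sketch (a real project might reach for zod or JSON Schema instead; names here are illustrative):

```typescript
type FieldType = "string" | "number" | "boolean";
interface ToolContract { name: string; fields: Record<string, FieldType> }

// Returns a list of violations; an empty list means the call conforms.
function validateArgs(
  contract: ToolContract,
  args: Record<string, unknown>
): string[] {
  const errors: string[] = [];
  for (const [field, type] of Object.entries(contract.fields)) {
    if (!(field in args)) errors.push(`missing ${field}`);
    else if (typeof args[field] !== type) errors.push(`${field}: expected ${type}`);
  }
  for (const key of Object.keys(args)) {
    if (!(key in contract.fields)) errors.push(`unexpected ${key}`);
  }
  return errors;
}
```

Rejecting unexpected fields is the deterministic part: it stops one pattern from quietly smuggling extra context through a tool call that the others cannot use.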

Phase 2: Pattern Implementations

  • Build and run all pattern variants.
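The phase-2 driver is one loop: every variant sees the same cases and the same evaluator. A sketch with stand-in pure functions in place of real adapters (all names hypothetical):

```typescript
type Adapter = (input: string) => string;

// Stand-ins for real pattern implementations.
const variants: Record<string, Adapter> = {
  reactive: (s) => s.trim(),
  planner: (s) => s.trim().toLowerCase(),
};

const cases = [{ input: " Hello ", expected: "hello" }];

// One evaluator for every variant (Hint 2).
const evaluate = (output: string, expected: string) => output === expected;

function benchmark(): Record<string, number> {
  const success: Record<string, number> = {};
  for (const [name, run] of Object.entries(variants)) {
    const passed = cases.filter((c) => evaluate(run(c.input), c.expected)).length;
    success[name] = passed / cases.length;
  }
  return success;
}
```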

Phase 3: Decision Artifact

  • Publish tradeoff matrix + recommendation.
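The matrix artifact can be generated straight from measured rows, so the ADR cites numbers rather than opinions. A hypothetical generator:

```typescript
interface Row { pattern: string; success: number; latencyS: number; costUsd: number }

// Emits the tradeoff matrix as a markdown table, one row per pattern.
function tradeoffMatrix(rows: Row[]): string {
  const header = "| Pattern | Success | Latency | Cost |\n|---|---|---|---|";
  const body = rows
    .map((r) =>
      `| ${r.pattern} | ${r.success.toFixed(2)} |` +
      ` ${r.latencyS.toFixed(1)}s | $${r.costUsd.toFixed(2)} |`)
    .join("\n");
  return `${header}\n${body}`;
}
```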

Testing Strategy

  • Interface conformance tests
  • Repeatability tests
  • Tool contract validation tests
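A repeatability test reduces to: run twice with the same seed, expect identical results. A sketch where `fakeRun` stands in for a real pattern execution and uses a linear congruential generator so output depends only on the seed:

```typescript
// Deterministic pseudo-results derived only from the seed.
function fakeRun(seed: number): number[] {
  let a = seed >>> 0;
  const out: number[] = [];
  for (let i = 0; i < 3; i++) {
    a = (a * 1664525 + 1013904223) >>> 0; // LCG step (Numerical Recipes constants)
    out.push(a / 2 ** 32);
  }
  return out;
}

// Two runs with the same seed must agree element-by-element.
function isRepeatable(seed: number): boolean {
  const a = fakeRun(seed);
  const b = fakeRun(seed);
  return a.every((v, i) => v === b[i]);
}
```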

Common Pitfalls & Debugging

| Pitfall | Symptom | Fix |
| --- | --- | --- |
| Unequal budgets | Invalid comparisons | Lock prompt and tool budgets |
| Hidden state | Non-reproducible results | Take explicit state snapshots |
| Coordination overhead | Multi-agent underperforms | Add only necessary specialists |

Interview Questions They’ll Ask

  1. When is reactive architecture enough?
  2. What does critic loop improve and cost?
  3. How do you compare pattern fairness?
  4. How do you decide stateful vs stateless runtime?

Hints in Layers

  • Hint 1: Fix workload before coding patterns.
  • Hint 2: Use one evaluator for all variants.
  • Hint 3: Track operational complexity explicitly.
  • Hint 4: Write ADR from measured evidence only.

Submission / Completion Criteria

Minimum Completion

  • Four patterns benchmarked on common workload

Full Completion

  • Tradeoff matrix + documented recommendation

Excellence

  • Decision reproducible under reruns and stress inputs