Project 19: Legacy Agent Migration to Graph Runtime
Migrate a chain-style legacy agent into a graph/state-machine runtime while preserving behavior and reducing failure rates.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 12-24 hours |
| Language | Python (alt: TypeScript) |
| Prerequisites | Projects 5, 9, 13 |
| Key Topics | characterization tests, migration strategy, shadow rollout |
Learning Objectives
- Freeze legacy behavior with characterization tests.
- Refactor implicit chain state into explicit graph state.
- Run shadow comparisons and quantify parity gaps.
- Roll out with canary and automatic rollback rules.
The Core Question You’re Answering
“How do you modernize agent architecture without breaking production behavior?”
Concepts You Must Understand First
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| Characterization testing | Protects behavior during refactors | Working Effectively with Legacy Code (Michael Feathers) |
| Graph execution | Explicit transitions and guard conditions | LangGraph docs |
| Canary/shadow rollout | Safe deployment for risky migrations | SRE release practices |
Theoretical Foundation
Legacy Chain -> Characterization Suite -> Graph Refactor -> Shadow Compare -> Canary Rollout
Migration quality is measured by behavioral parity with the legacy agent and by reliability under real traffic, not by how much cleaner the code looks.
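To make "parity and reliability" measurable, the comparison can be reduced to a few counts per test slice. The following is a minimal sketch, assuming each runtime has already been scored pass/fail per case id; `ParitySummary` and `summarize_parity` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParitySummary:
    total: int
    legacy_pass_rate: float
    graph_pass_rate: float
    improved: int   # failed on legacy, passes on graph
    regressed: int  # passed on legacy, fails on graph


def summarize_parity(legacy: dict[str, bool], graph: dict[str, bool]) -> ParitySummary:
    """Compare per-case pass/fail results keyed by case id (hypothetical helper)."""
    if legacy.keys() != graph.keys():
        raise ValueError("both runtimes must be scored on the same case ids")
    if not legacy:
        raise ValueError("no cases to compare")
    total = len(legacy)
    improved = sum(1 for cid in legacy if not legacy[cid] and graph[cid])
    regressed = sum(1 for cid in legacy if legacy[cid] and not graph[cid])
    return ParitySummary(
        total=total,
        legacy_pass_rate=sum(legacy.values()) / total,
        graph_pass_rate=sum(graph.values()) / total,
        improved=improved,
        regressed=regressed,
    )
```

Reporting `improved` and `regressed` separately matters because a higher overall pass rate can still hide cases that used to work and no longer do.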
Project Specification
What You’ll Build
A migration toolkit that:
- Captures legacy outputs and traces
- Runs the equivalent path through the graph runtime
- Reports pass/regression deltas
- Supports canary promotion/rollback
Functional Requirements
- Legacy baseline capture
- Graph node/edge mapping
- Shadow comparison report
- Rollout guardrails and rollback trigger
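The node/edge mapping requirement can be sketched without any framework. The example below is a plain-Python illustration, not the API of LangGraph or any other library: state is an explicit dataclass, nodes are functions over that state, and edges are routing functions that return the next node name. All names (`AgentState`, `run_graph`, the three nodes) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AgentState:
    """Explicit state that a legacy chain would pass around implicitly."""
    question: str
    documents: list[str] = field(default_factory=list)
    answer: Optional[str] = None
    trace: list[str] = field(default_factory=list)  # visited node names


def retrieve(state: AgentState) -> AgentState:
    # Stand-in for a retrieval step; a real node would call a search backend.
    if "refund" in state.question.lower():
        state.documents.append("refund-policy")
    return state


def answer(state: AgentState) -> AgentState:
    state.answer = f"Answer based on {len(state.documents)} document(s)"
    return state


def fallback(state: AgentState) -> AgentState:
    state.answer = "I could not find relevant information."
    return state


NODES: dict[str, Callable[[AgentState], AgentState]] = {
    "retrieve": retrieve,
    "answer": answer,
    "fallback": fallback,
}

# Each edge inspects the state and returns the next node name, or None to stop.
EDGES: dict[str, Callable[[AgentState], Optional[str]]] = {
    "retrieve": lambda s: "answer" if s.documents else "fallback",
    "answer": lambda s: None,
    "fallback": lambda s: None,
}


def run_graph(state: AgentState, start: str = "retrieve", max_steps: int = 20) -> AgentState:
    current: Optional[str] = start
    steps = 0
    while current is not None:
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted; possible cycle in graph")
        state.trace.append(current)
        state = NODES[current](state)
        current = EDGES[current](state)
        steps += 1
    return state
```

Recording `trace` inside the state gives the comparator something to diff beyond the final answer, and the step budget turns an accidental cycle into a clear error instead of a hang.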
Non-Functional Requirements
- Reproducible test slices
- Fast rollback path
- Clear ownership of parity exceptions
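One way to get reproducible test slices is to assign cases to a slice by hashing their ids rather than by random sampling. This is a sketch under the assumption that every case has a stable string id; `in_slice` and `select_slice` are hypothetical helpers.

```python
import hashlib


def in_slice(case_id: str, slice_name: str, fraction: float) -> bool:
    """Deterministically decide whether a case belongs to a named slice."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    digest = hashlib.sha256(f"{slice_name}:{case_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return bucket < fraction


def select_slice(case_ids: list[str], slice_name: str, fraction: float) -> list[str]:
    """Return the members of the slice in sorted order, independent of input order."""
    return sorted(cid for cid in case_ids if in_slice(cid, slice_name, fraction))
```

Because membership depends only on the slice name and the case id, the same slice comes back on every machine and every run, and raising `fraction` only adds cases to a slice without removing any.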
Real World Outcome
$ python p19_migrate.py --mode shadow
[legacy] pass_rate=71%
[graph] pass_rate=79%
[delta] +12 improved, -3 regressed
[rollout] canary enabled at 10%
[artifact] migration_report.md + parity_failures.csv
Architecture Overview
Traffic Mirror -> Legacy Runtime + Graph Runtime -> Comparator -> Rollout Controller
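The mirror-and-compare flow above can be sketched as a single request handler. This is an illustrative, synchronous version with hypothetical names (`ShadowRecord`, `handle_with_shadow`); a production mirror would more likely run the graph path asynchronously so it cannot add latency to the user-facing response.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ShadowRecord:
    request: str
    legacy_output: str
    graph_output: Optional[str]
    graph_error: Optional[str]
    match: bool


def handle_with_shadow(
    request: str,
    legacy: Callable[[str], str],
    graph: Callable[[str], str],
    log: list[ShadowRecord],
) -> str:
    """Serve the legacy answer; run the graph runtime only for comparison."""
    # Legacy errors propagate unchanged: legacy is still the source of truth.
    legacy_output = legacy(request)

    graph_output: Optional[str] = None
    graph_error: Optional[str] = None
    try:
        graph_output = graph(request)
    except Exception as exc:  # a shadow failure must never reach the user
        graph_error = f"{type(exc).__name__}: {exc}"

    log.append(
        ShadowRecord(
            request=request,
            legacy_output=legacy_output,
            graph_output=graph_output,
            graph_error=graph_error,
            match=graph_error is None and graph_output == legacy_output,
        )
    )
    return legacy_output
```

The key property is that the caller always receives the legacy output, so shadow mode can run against real traffic before the graph runtime is trusted.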
Implementation Guide
Phase 1: Baseline Lock
- Build characterization suite from real workloads.
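A characterization suite records what the legacy agent actually does, quirks included, and then holds any candidate to that record. The sketch below assumes the agent can be called as a function from string to string and that its outputs are deterministic for the captured inputs (for an LLM-backed agent this usually means pinning model settings or normalizing outputs first). `capture_baseline` and `check_against_baseline` are hypothetical helpers.

```python
import json
from pathlib import Path
from typing import Callable


def capture_baseline(inputs: list[str], legacy: Callable[[str], str], path: Path) -> None:
    """Run the legacy agent on each input and freeze its outputs to disk."""
    snapshot = {text: legacy(text) for text in inputs}
    path.write_text(json.dumps(snapshot, indent=2, sort_keys=True), encoding="utf-8")


def check_against_baseline(candidate: Callable[[str], str], path: Path) -> list[str]:
    """Return the inputs on which the candidate differs from the frozen baseline."""
    snapshot = json.loads(path.read_text(encoding="utf-8"))
    return [
        text
        for text, expected in sorted(snapshot.items())
        if candidate(text) != expected
    ]
```

Committing the snapshot file alongside the code keeps the parity target fixed while the refactor proceeds.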
Phase 2: Graph Refactor
- Node-by-node migration with feature flags.
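Node-by-node migration behind feature flags can be sketched as a wrapper that picks an implementation per step at call time. The flag store here is a plain dict and `make_step` is a hypothetical helper; a real system would likely read flags from a configuration service.

```python
from typing import Callable

Step = Callable[[dict], dict]


def make_step(name: str, legacy_impl: Step, graph_impl: Step, flags: dict[str, bool]) -> Step:
    """Build a step that routes to the graph implementation only when its flag is on."""
    def step(state: dict) -> dict:
        # The flag is read on every call, so flipping it takes effect immediately.
        impl = graph_impl if flags.get(name, False) else legacy_impl
        return impl(state)
    return step


def legacy_parse(state: dict) -> dict:
    return {**state, "intent": state["text"].strip().lower()}


def graph_parse(state: dict) -> dict:
    return {**state, "intent": state["text"].strip().lower(), "parsed_by": "graph"}


flags = {"parse": False}  # default off: unmigrated behavior stays in place
parse = make_step("parse", legacy_parse, graph_parse, flags)
```

Defaulting a missing flag to the legacy implementation means a misconfigured flag store degrades to known behavior rather than to the unproven path.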
Phase 3: Rollout and Governance
- Canary policies, rollback triggers, and a parity debt log (known, accepted differences from legacy behavior, each with an owner).
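A rollback trigger is easiest to audit when it is a pure function of observed canary statistics. The following sketch uses hypothetical names and placeholder thresholds; the actual limits should come from the service's own error budget, not from these defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryPolicy:
    # All thresholds are illustrative assumptions.
    rollback_min_requests: int = 20   # ignore noise from very small samples
    min_requests: int = 200           # evidence required before promotion
    max_error_rate: float = 0.02
    max_regression_rate: float = 0.05


@dataclass
class CanaryStats:
    requests: int = 0
    errors: int = 0
    regressions: int = 0  # responses judged worse than the legacy baseline


def decide(stats: CanaryStats, policy: CanaryPolicy) -> str:
    """Return 'rollback', 'hold', or 'promote' for the current canary window."""
    if stats.requests < max(policy.rollback_min_requests, 1):
        return "hold"
    error_rate = stats.errors / stats.requests
    regression_rate = stats.regressions / stats.requests
    if error_rate > policy.max_error_rate or regression_rate > policy.max_regression_rate:
        return "rollback"
    if stats.requests < policy.min_requests:
        return "hold"
    return "promote"
```

Rollback needs less evidence than promotion here on purpose: a false rollback costs some time, while a false promotion exposes more users to a regression.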
Testing Strategy
- Historical replay test set
- Diff-based output comparison tests
- Canary health checks and rollback drills
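A diff-based comparison test usually needs a normalization step so that formatting noise is not reported as a regression. This sketch uses the standard library's `difflib.unified_diff`; the normalization rules (collapse spaces and tabs, drop blank lines) are an assumption and should be tuned to what counts as a meaningful difference for the agent in question.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Collapse runs of spaces/tabs and drop blank lines before comparing."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return [line for line in lines if line]


def output_diff(legacy_output: str, graph_output: str) -> list[str]:
    """Return unified-diff lines; an empty list means the outputs are equivalent."""
    return list(
        difflib.unified_diff(
            normalize(legacy_output),
            normalize(graph_output),
            fromfile="legacy",
            tofile="graph",
            lineterm="",
        )
    )
```

In a test, `assert output_diff(expected, actual) == []` fails with the diff lines in the assertion message, which makes parity failures quick to triage.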
Common Pitfalls & Debugging
| Pitfall | Symptom | Fix |
|---|---|---|
| Unfrozen baseline | shifting parity target | snapshot baseline dataset |
| Hidden implicit state | unexplained regressions | define explicit state schema |
| Slow rollback | prolonged incident risk | prewired kill switch and fallback path |
Interview Questions They’ll Ask
- Why migrate from chain to graph architecture?
- How do you define acceptable parity drift?
- How do you operationalize shadow mode?
- What triggers automatic rollback?
Hints in Layers
- Hint 1: Freeze baseline before touching architecture.
- Hint 2: Migrate one branch at a time.
- Hint 3: Compare traces, not only outputs.
- Hint 4: Keep legacy fallback live until stable.
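Hint 3 can be made concrete by locating the first step at which two traces part ways. The sketch assumes each runtime emits its trace as an ordered list of step names; `first_divergence` is a hypothetical helper.

```python
from itertools import zip_longest
from typing import Optional


def first_divergence(
    legacy_trace: list[str], graph_trace: list[str]
) -> Optional[tuple[int, Optional[str], Optional[str]]]:
    """Return (index, legacy_step, graph_step) at the first mismatch, or None.

    A step of None means that trace ended before the other one did.
    """
    pairs = zip_longest(legacy_trace, graph_trace)
    for index, (legacy_step, graph_step) in enumerate(pairs):
        if legacy_step != graph_step:
            return index, legacy_step, graph_step
    return None
```

Two runs can produce the same final output through different paths, so a trace mismatch with matching outputs is worth logging as a parity exception even when no test fails.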
Submission / Completion Criteria
Minimum Completion
- Shadow mode comparison between legacy and graph runtime
Full Completion
- Canary rollout with rollback automation
Excellence
- Quantified reliability gains with minimal parity debt