Project 26: Deployment and Infrastructure Blueprint for Agent Scale
Compare hosting topologies, enforce scaling controls, and implement model/provider failover.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 4: Expert |
| Time Estimate | 12-22 hours |
| Language | TypeScript + IaC |
| Prerequisites | Projects 13, 17, 18 |
| Key Topics | serverless vs workers, queues, tenancy, rate limiting, failover |
Learning Objectives
- Choose an execution topology that matches each task profile.
- Implement queue and backpressure controls.
- Add model/provider failover and route policies.
- Validate multi-tenant behavior under load.
The Core Question You’re Answering
“Which infrastructure design sustains reliability and margin at production traffic levels?”
Concepts You Must Understand First
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| Queue-driven architecture | Controls burst traffic | Distributed systems references |
| Backpressure and rate limits | Prevents cascading failures | SRE patterns |
| Multi-model routing | Balances cost/latency/quality | Model routing literature |
| Tenant isolation | Protects enterprise workloads | Platform architecture guidance |
Theoretical Foundation
Ingress -> Queue -> Worker Pool -> Tool/Model Router -> Output + Metrics
Infrastructure is a set of explicit tradeoffs, not a default cloud template.
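The pipeline above can be sketched as a minimal in-memory model. This is an illustrative sketch only: the `Task`, `ingress`, `routeModel`, and `drain` names are hypothetical, the queue is a plain array, and the "worker pool" is simulated by draining in bounded batches.

```typescript
// Illustrative pipeline sketch: Ingress -> Queue -> Worker Pool -> Router -> Output.
// All names here are hypothetical, not a real API.
type Task = { id: string; payload: string };
type Result = { id: string; output: string; costUsd: number };

const queue: Task[] = [];

function ingress(task: Task): void {
  queue.push(task); // ingress only enqueues; workers pull at their own pace
}

function routeModel(task: Task): Result {
  // placeholder router: a real one would pick a provider by cost/latency policy
  return { id: task.id, output: `echo:${task.payload}`, costUsd: 0.001 };
}

function drain(workers: number): Result[] {
  const results: Result[] = [];
  while (queue.length > 0) {
    // take up to `workers` tasks per round to mimic a bounded worker pool
    const batch = queue.splice(0, workers);
    for (const task of batch) results.push(routeModel(task));
  }
  return results;
}

ingress({ id: "t1", payload: "hello" });
ingress({ id: "t2", payload: "world" });
const results = drain(2);
```

The point of the sketch is the separation of stages: ingress never does model work, and the router never sees the queue, which is what makes each stage independently replaceable when you benchmark topologies.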
Project Specification
What You’ll Build
A deployment blueprint with comparative benchmark results for:
- Serverless burst path
- Long-running worker path
- Queue-based async path
- Provider failover path
Functional Requirements
- Topology benchmark runner
- Rate-limit aware queue controls
- Model/provider failover policy
- Tenant-aware resource controls
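For the failover requirement, one minimal policy is ordered fallback: try providers in priority order and fall back on error. A hedged sketch, where the provider names and the `call` signature are made up for illustration:

```typescript
// Sketch of an ordered-fallback failover policy; Provider and its `call`
// signature are hypothetical, not a real SDK interface.
type Provider = { name: string; call: (prompt: string) => string };

function callWithFailover(
  providers: Provider[],
  prompt: string
): { provider: string; output: string } {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, output: p.call(prompt) };
    } catch (e) {
      // record the failure and try the next provider in priority order
      errors.push(`${p.name}: ${(e as Error).message}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}

// simulate provider_a being down and provider_b taking over
const flaky: Provider = { name: "provider_a", call: () => { throw new Error("503"); } };
const backup: Provider = { name: "provider_b", call: (p) => `ok:${p}` };
const res = callWithFailover([flaky, backup], "ping");
```

A production policy would add per-provider quality floors and activation counters (like the `provider_b_activations` metric in the sample output), but the control flow stays the same.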
Non-Functional Requirements
- Reproducible load tests
- Cost per task visibility
- Controlled degradation behavior
Real World Outcome
```
$ make p26-load-test
```
[topology] workers=24 serverless_burst=enabled
[throughput] rpm=1800 success=96.8% p95=3.4s
[failover] provider_b_activations=39
[cost] avg=$0.019/task
[artifact] scaling_runbook.md
```
Architecture Overview
API Gateway -> Queue -> Workers -> Model Router -> Result Bus -> Monitoring
Implementation Guide
Phase 1: Baseline Topology
- Implement and benchmark one topology first.
Phase 2: Scale Controls
- Add queue, rate-limit, and tenant controls.
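A common admission control for this phase is a token bucket: requests are admitted while tokens remain, and shed (or queued) once the bucket is empty, which is the backpressure signal. A minimal sketch, with an illustrative capacity and refill rate:

```typescript
// Token-bucket rate limiter sketch; capacity and refill rate are illustrative.
class TokenBucket {
  private tokens: number;

  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity; // start full
  }

  // add tokens for elapsed time, capped at capacity
  refill(elapsedSec: number): void {
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
  }

  // returns false when the caller should shed or requeue (backpressure)
  tryAcquire(): boolean {
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(2, 1); // 2-token burst, 1 token/sec sustained
const admitted = [bucket.tryAcquire(), bucket.tryAcquire(), bucket.tryAcquire()];
bucket.refill(1); // one second elapses -> one token returns
const afterRefill = bucket.tryAcquire();
```

The same mechanism applies per tenant: give each tenant its own bucket and the rate limit doubles as an isolation control.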
Phase 3: Failover + Economics
- Add provider failover and cost reporting.
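Cost reporting can start as simple token-weighted aggregation per task. The price table below is entirely made up for illustration; real per-1K-token prices come from each provider's pricing page.

```typescript
// Illustrative cost-per-task aggregation; the prices are invented placeholders.
type TaskCost = { provider: string; inputTokens: number; outputTokens: number };

// hypothetical price table: USD per 1K tokens
const pricePer1K: Record<string, { in: number; out: number }> = {
  provider_a: { in: 0.003, out: 0.015 },
  provider_b: { in: 0.001, out: 0.002 },
};

function avgCostPerTask(tasks: TaskCost[]): number {
  const total = tasks.reduce((sum, t) => {
    const p = pricePer1K[t.provider];
    return sum + (t.inputTokens / 1000) * p.in + (t.outputTokens / 1000) * p.out;
  }, 0);
  return total / tasks.length;
}

const avg = avgCostPerTask([
  { provider: "provider_a", inputTokens: 1000, outputTokens: 1000 },
  { provider: "provider_b", inputTokens: 1000, outputTokens: 1000 },
]);
```

Tracking this per route makes failover economics visible: if provider_b is cheaper but lower quality, the average cost drop after failover should be weighed against the quality floor from Phase 2.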
Testing Strategy
- Burst traffic tests
- Provider outage simulations
- Tenant-noisy-neighbor tests
Common Pitfalls & Debugging
| Pitfall | Symptom | Fix |
|---|---|---|
| Wrong topology for long tasks | timeout/cost spikes | move to durable workers |
| Failover quality drop | hidden regressions | quality floors per route |
| Shared queue starvation | tenant complaints | priority queues + quotas |
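The "priority queues + quotas" fix for starvation can be sketched as a per-tenant quota applied at dequeue time: each scheduling round admits at most N jobs per tenant, so one noisy tenant cannot monopolize the workers. The quota of 2 and the `fairDequeue` name are illustrative.

```typescript
// Sketch of per-tenant quota scheduling to prevent noisy-neighbor starvation.
type Job = { tenant: string; id: number };

function fairDequeue(jobs: Job[], perTenantQuota: number): Job[] {
  const taken = new Map<string, number>();
  const picked: Job[] = [];
  for (const job of jobs) {
    const used = taken.get(job.tenant) ?? 0;
    if (used < perTenantQuota) {
      taken.set(job.tenant, used + 1);
      picked.push(job); // within quota: schedule this round
    }
    // over quota: job stays in the backlog for the next round
  }
  return picked;
}

// tenant "a" floods the queue; tenant "b" still gets scheduled
const backlog: Job[] = [
  { tenant: "a", id: 1 },
  { tenant: "a", id: 2 },
  { tenant: "a", id: 3 },
  { tenant: "b", id: 4 },
];
const scheduled = fairDequeue(backlog, 2);
```

A fuller implementation would rotate start position between rounds (weighted round-robin) so quota-limited tenants do not always go first, but the quota check is the core of the fairness guarantee.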
Interview Questions They’ll Ask
- When is serverless a bad fit?
- How do you implement safe failover?
- How do you enforce tenant fairness under load?
- How do you evaluate topology decisions objectively?
Hints in Layers
- Hint 1: Benchmark before optimizing.
- Hint 2: Add dead-letter queues early.
- Hint 3: Keep routing and workflow logic decoupled.
- Hint 4: Stress test with outage drills.
Submission / Completion Criteria
Minimum Completion
- One topology benchmark + failover demo
Full Completion
- Multi-topology comparison and tenant controls
Excellence
- Production-grade runbook with tested incident paths