LEARN QUEUES MESSAGE BROKERS PROJECTS
Learn Message Queues (RabbitMQ, Kafka, etc.) by Building
File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
0. Your Position (What this plan assumes)
You want to understand “queues” the way a broker implements them, not just how to call a client SDK. That means you will build small, working slices that force you to confront:
- Delivery guarantees vs. throughput
- Ordering vs. parallelism
- Persistence vs. latency
- Backpressure, retries, poison messages, and dead-lettering
- Replication, leader election, and failure recovery (Kafka-style)
1. Core Concept Analysis (what you must internalize)
A queue/broker is not “a list of messages.” It is a set of coordinated mechanisms:
1) The contract (what does “delivered” mean?)
- At-most-once, at-least-once, effectively-once; ack/nack; redelivery rules; deduplication.
2) Flow control
- Prefetch/windowing, batching, push vs pull, consumer lag, backpressure.
3) Routing / topology
- Simple queue vs exchange-based routing (RabbitMQ-style), partitions (Kafka-style).
4) State & durability
- In-memory vs durable; write-ahead log; fsync policy; retention; compaction.
5) Coordination
- Consumer groups, rebalancing, offset tracking; membership and heartbeats.
6) Replication & failure
- Leader/follower, ISR/quorums, elections, fencing, replay and recovery.
7) Operational reality
- Metrics (lag, throughput, redelivery), tuning, overload behavior, upgrade safety.
Key mental model:
- RabbitMQ (classic AMQP) is fundamentally a routing + queueing system with explicit topology (exchanges/bindings) and consumer acknowledgements/prefetch.
- Kafka is fundamentally a partitioned replicated log where “consuming” is reading from a log position (offset), coordinated by consumer groups.
References: RabbitMQ docs on exchanges and on acknowledgements/prefetch; Apache Kafka docs on offsets, the wire protocol, and replication/idempotence.
2. Project Ideas (14 projects)
Project 1: In-Memory Work Queue + Backpressure Simulator
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Python, Elixir
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Concurrency / Backpressure
- Software or Tool: Queue runtime (build)
- Main Book: Computer Systems: A Programmer’s Perspective (Bryant & O’Hallaron)
What you’ll build: A local queue service that accepts “jobs,” dispatches to workers, and visibly throttles producers when workers fall behind.
Why it teaches queues: You will confront the first non-negotiable reality: producers and consumers rarely match throughput, so you must design for backpressure rather than hope it never matters.
Core challenges you’ll face:
- Designing a bounded buffer and deciding what happens on overflow → backpressure semantics
- Worker concurrency vs ordering (FIFO vs parallel) → ordering contracts
- Visibility: show queue depth, worker utilization, and drop/throttle events → operational thinking
Key Concepts
- Backpressure & load shedding: Designing Data-Intensive Applications (Kleppmann) — the sections on throughput, buffering, and backpressure
- Concurrency primitives: CS:APP — synchronization concepts and reasoning about concurrency
Difficulty: Intermediate
Time estimate: 1–2 weeks
Prerequisites: Basic concurrency in your chosen language; comfort with networking basics (local TCP/HTTP is enough)
Real world outcome
- A terminal dashboard (or simple web page) showing queue depth, rate in/out, and when throttling activates.
Implementation Hints
- Force overload (producer faster than consumer). Your design is “correct” only if it degrades predictably and doesn’t crash.
- Decide explicitly: block producers, drop, or spill to disk (spill comes in Project 2).
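A minimal sketch of the "block vs drop" decision from the hints above, using a bounded Go channel. The `Job` type and function names are illustrative, not a prescribed API.

```go
package queue

import "errors"

// Job is a placeholder payload type for this sketch.
type Job struct{ ID string }

var ErrQueueFull = errors.New("queue full: job rejected (load shedding)")

// BoundedQueue wraps a buffered channel so the overflow policy is explicit.
type BoundedQueue struct {
	ch chan Job
}

func New(capacity int) *BoundedQueue {
	return &BoundedQueue{ch: make(chan Job, capacity)}
}

// EnqueueBlocking applies backpressure: the producer waits until space frees up.
func (q *BoundedQueue) EnqueueBlocking(j Job) {
	q.ch <- j
}

// EnqueueOrDrop sheds load: if the buffer is full, the job is rejected immediately.
func (q *BoundedQueue) EnqueueOrDrop(j Job) error {
	select {
	case q.ch <- j:
		return nil
	default:
		return ErrQueueFull
	}
}

// Dequeue blocks until a job is available (consumer side).
func (q *BoundedQueue) Dequeue() Job {
	return <-q.ch
}
```

Either policy can be correct; the point is that the choice is written down instead of hidden behind an unbounded buffer.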
Learning milestones
- Queue works under normal load → you understand producer/consumer coordination
- Under overload it stays stable (no runaway memory) → you understand backpressure
- You can explain the tradeoff of “block vs drop vs buffer” → you understand queue contracts
Project 2: Durable File-Backed Queue (Write-Ahead Log)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, C, Java
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Persistence / WAL
- Software or Tool: Write-Ahead Log (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A queue that survives restarts using an append-only log + an index of “in-flight” vs “acked.”
Why it teaches queues: Durability is not “save it somewhere.” It is defining when a message is considered safely stored and how you recover after crashes.
Core challenges you’ll face:
- WAL format: framing, checksums, partial writes → crash safety
- Ack tracking and replay on restart → at-least-once delivery mechanics
- Compaction: reclaim space without losing state → log maintenance
Key Concepts
- Append-only logs & recovery: DDIA — log-structured storage concepts
- Crash consistency: CS:APP — storage and system-level I/O fundamentals
Difficulty: Advanced
Time estimate: 2–4 weeks
Prerequisites: Project 1; familiarity with files and durability tradeoffs (what fsync implies conceptually)
Real world outcome
- Kill the process mid-stream; restart; prove which messages re-deliver and why (with a replay report).
Implementation Hints
- Treat “power loss” as a first-class scenario: truncated last record, corrupt tail, etc.
- Make recovery a visible step: print recovery actions and the reconstructed queue state.
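One possible record framing for the WAL, assuming a [length | CRC32 | payload] layout; recovery simply replays records until it hits a truncated or corrupt tail. This is a sketch, not the only valid format.

```go
package wal

import (
	"encoding/binary"
	"hash/crc32"
	"io"
)

// AppendRecord writes one record as: 4-byte length, 4-byte CRC32 of payload, payload.
func AppendRecord(w io.Writer, payload []byte) error {
	var header [8]byte
	binary.BigEndian.PutUint32(header[0:4], uint32(len(payload)))
	binary.BigEndian.PutUint32(header[4:8], crc32.ChecksumIEEE(payload))
	if _, err := w.Write(header[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// ReadRecords replays the log, stopping cleanly at a truncated or corrupt tail
// (the expected state after a crash mid-write).
func ReadRecords(r io.Reader) [][]byte {
	var records [][]byte
	for {
		var header [8]byte
		if _, err := io.ReadFull(r, header[:]); err != nil {
			return records // EOF or truncated header: end of recoverable log
		}
		length := binary.BigEndian.Uint32(header[0:4])
		sum := binary.BigEndian.Uint32(header[4:8])
		payload := make([]byte, length)
		if _, err := io.ReadFull(r, payload); err != nil {
			return records // truncated payload: discard the tail
		}
		if crc32.ChecksumIEEE(payload) != sum {
			return records // corrupt record: stop replay here
		}
		records = append(records, payload)
	}
}
```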
Learning milestones
- Survives clean restart → persistence basics
- Survives crash mid-write → you understand WAL and partial writes
- Space doesn’t grow forever → you understand compaction
Project 3: Mini AMQP Router (Exchanges, Bindings, Routing Keys)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Elixir, Rust, Java
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Message Routing / Topology
- Software or Tool: Exchange/Binding router (build)
- Main Book: RabbitMQ in Depth (Alvaro Videla & Jason J. W. Williams) (recommended)
What you’ll build: A simplified “broker core” that routes published messages to queues using direct and topic-style rules.
Why it teaches queues: RabbitMQ’s power is topology. You will understand why messages are not “sent to a queue” but routed via exchange rules.
Core challenges you’ll face:
- Implementing direct routing and topic wildcards (`*` and `#`) → routing semantics
- Managing bindings efficiently as they grow → data structure design
- Observability: show why a message went to Q1 vs Q2 → debuggability
Key Concepts
- Exchange types and routing behavior: RabbitMQ docs — “Exchanges” (direct/topic/fanout)
- Topic wildcard matching rules: AMQP 0-9-1 specification — topic exchange definition
Difficulty: Advanced
Time estimate: 2–3 weeks
Prerequisites: Project 1; comfort implementing small parsers and maps/tries
Real world outcome
- A demo where publishing with different routing keys visibly lands in different queues; unmatched routes are reported.
Implementation Hints
- Start with only: publish, declare queue, bind, consume.
- For topic routing, choose one strategy: naive scan first, then optimize with a trie-like matcher.
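A naive matcher for AMQP-style topic patterns (`*` matches exactly one word, `#` matches zero or more words) is enough to start with before optimizing with a trie. Function names here are illustrative.

```go
package router

import "strings"

// Matches reports whether a routing key like "order.eu.created" matches a
// binding pattern like "order.*.created" or "order.#".
// "*" matches exactly one word; "#" matches zero or more words.
func Matches(pattern, key string) bool {
	return match(strings.Split(pattern, "."), strings.Split(key, "."))
}

func match(pat, words []string) bool {
	if len(pat) == 0 {
		return len(words) == 0
	}
	switch pat[0] {
	case "#":
		// "#" can consume zero words ...
		if match(pat[1:], words) {
			return true
		}
		// ... or consume one word and stay in place to consume more.
		return len(words) > 0 && match(pat, words[1:])
	case "*":
		return len(words) > 0 && match(pat[1:], words[1:])
	default:
		return len(words) > 0 && pat[0] == words[0] && match(pat[1:], words[1:])
	}
}
```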
Learning milestones
- Direct routing works → you understand binding keys
- Topic wildcards work → you understand pattern routing
- You can explain topology design choices → you understand why RabbitMQ has exchanges
Project 4: Ack/Nack, Redelivery, and Poison-Message Lab
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, Elixir, Java
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Delivery Guarantees
- Software or Tool: Redelivery / DLQ (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A consumer runtime that supports manual ack, negative ack, requeue, and a dead-letter queue after N failures.
Why it teaches queues: “At-least-once” only becomes real when you see duplicates, redeliveries, and the need to isolate poison messages.
Core challenges you’ll face:
- Defining when a message is “done” (ack) vs “failed” (nack/requeue) → delivery semantics
- Implementing retry policy and DLQ thresholds → failure isolation
- Ensuring requeued messages don’t starve others → fairness and scheduling
Key Concepts
- Consumer acknowledgements vs auto-ack: RabbitMQ docs — “Consumer Acknowledgements”
- Negative acknowledgements: RabbitMQ docs — `basic.nack` / `basic.reject` semantics
Difficulty: Intermediate
Time estimate: 1–2 weeks
Prerequisites: Project 1
Real world outcome
- A run report showing: processed count, redelivery count, DLQ count, and sample poison message trace.
Implementation Hints
- Force failures deterministically (e.g., “fail every 5th message”) so you can prove behavior.
- Track message attempts and first-seen time; this becomes your “message lifecycle.”
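A minimal sketch of the message lifecycle decision described in the hints: count attempts, requeue on failure, and dead-letter once a threshold is reached. The types and the `NextStep` helper are hypothetical names for illustration.

```go
package delivery

import "time"

// Delivery tracks the lifecycle of one message across attempts.
type Delivery struct {
	MessageID string
	Attempts  int
	FirstSeen time.Time
}

// Outcome is what the consumer reported for one attempt.
type Outcome int

const (
	Ack  Outcome = iota // processing succeeded: remove from queue
	Nack                // processing failed: candidate for requeue or DLQ
)

// NextStep decides what the broker should do after an attempt.
// maxAttempts is the DLQ threshold (e.g. 5).
func NextStep(d *Delivery, out Outcome, maxAttempts int) string {
	d.Attempts++
	switch {
	case out == Ack:
		return "done"
	case d.Attempts >= maxAttempts:
		return "dead-letter" // isolate the poison message
	default:
		return "requeue" // redeliver: the consumer may see a duplicate
	}
}
```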
Learning milestones
- Manual ack works → you understand “done means acked”
- Requeue causes duplicates → you understand at-least-once
- DLQ prevents system poisoning → you understand production queue hygiene
Project 5: Prefetch Tuning Playground (Flow Control You Can See)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, Java, Elixir
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Flow Control / QoS
- Software or Tool: Prefetch / windowing (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A consumer that enforces a configurable “unacked window” and produces graphs of throughput vs latency vs redeliveries as you change the window.
Why it teaches queues: Prefetch is the broker’s lever for balancing throughput and fairness; you’ll learn why “bigger buffer” can worsen tail latency and starvation.
Core challenges you’ll face:
- Enforcing “max unacked per consumer” → QoS enforcement
- Measuring tail latency and head-of-line blocking → queueing effects
- Designing experiments to isolate variables → systems thinking
Key Concepts
- Prefetch / QoS limits: RabbitMQ docs — “Consumer Prefetch”
- Unacked messages as the true backlog: RabbitMQ docs — acknowledgement behavior
Difficulty: Intermediate
Time estimate: Weekend–1 week
Prerequisites: Project 4
Real world outcome
- A single chart or dashboard: throughput and p95 latency as prefetch changes.
Implementation Hints
- Keep the workload constant (same message size/processing time) while changing prefetch.
- Record both “queue depth” and “in-flight (unacked) depth.”
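One way to enforce a "max unacked per consumer" window is a counting semaphore built from a buffered channel, a minimal sketch assuming the broker acquires a slot before each delivery and releases it on ack.

```go
package prefetch

// Window enforces "at most N unacked messages in flight" for one consumer,
// using a buffered channel as a counting semaphore.
type Window struct {
	slots chan struct{}
}

func NewWindow(prefetch int) *Window {
	return &Window{slots: make(chan struct{}, prefetch)}
}

// Acquire blocks further deliveries once the window is full;
// this is the broker-side flow control lever.
func (w *Window) Acquire() { w.slots <- struct{}{} }

// Release frees a slot when the consumer acks (or nacks) a message.
func (w *Window) Release() { <-w.slots }
```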
Learning milestones
- Windowing works → you understand flow control
- You can predict starvation scenarios → you understand fairness
- You can justify a prefetch choice → you understand tuning tradeoffs
Project 6: Kafka-Lite (Partitioned Append-Only Log Broker)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Java, C++
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Logs / Streaming
- Software or Tool: Partitioned log broker (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A minimal broker that stores messages in topic partitions as append-only segment files and supports a pull-based consumer reading by offset.
Why it teaches queues: This is the core “Kafka mental model”: messages are not removed when consumed; consumers advance a position in a replicated log.
Core challenges you’ll face:
- Segment file design (rollover, indexing) → log structure
- Consumer read API (fetch from offset, batching) → pull-based consumption
- Retention policy (time/size based) → data lifecycle management
Key Concepts
- Kafka protocol as request/response over TCP: Apache Kafka “Protocol” guide
- Offsets & consumer progress: Apache Kafka “Distribution / consumer offset tracking” docs
Difficulty: Expert
Time estimate: 1 month+
Prerequisites: Project 2; comfort with file formats and incremental reading
Real world outcome
- A demonstrator where two consumers read the same partition independently at different offsets, proving “log not queue.”
Implementation Hints
- Do not start with replication; start with a single broker, one partition, one segment file.
- Make offsets visible: every fetch prints the offset range returned and next offset.
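Before segment files, an in-memory partition makes the "log, not queue" semantics concrete: appends assign offsets, fetches never remove data, and each consumer tracks its own position. Names here are illustrative.

```go
package partition

// Record is one appended message; its offset is its position in the partition.
type Record struct {
	Offset int64
	Value  []byte
}

// Partition is a single append-only log (in memory here; segment files come later).
type Partition struct {
	records []Record
}

// Append assigns the next offset; nothing is ever removed on consumption.
func (p *Partition) Append(value []byte) int64 {
	off := int64(len(p.records))
	p.records = append(p.records, Record{Offset: off, Value: value})
	return off
}

// Fetch returns up to maxRecords starting at offset; two consumers can read
// the same data independently because position lives with the consumer.
func (p *Partition) Fetch(offset int64, maxRecords int) []Record {
	if offset < 0 || offset >= int64(len(p.records)) {
		return nil
	}
	end := offset + int64(maxRecords)
	if end > int64(len(p.records)) {
		end = int64(len(p.records))
	}
	return p.records[offset:end]
}
```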
Learning milestones
- Single partition works → you understand append-only logs
- Segment rollover works → you understand retention primitives
- Multiple consumers at different offsets work → you understand streaming semantics
Project 7: Consumer Group Coordinator (Rebalance + Offset Commit)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Java, Rust, Elixir
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Coordination / Membership
- Software or Tool: Group coordinator (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A coordinator service that assigns partitions to group members, detects failures via heartbeats, and stores committed offsets per group.
Why it teaches queues: “Consumer groups” are the hidden complexity of Kafka. Rebalancing is where ordering, latency spikes, and duplicates appear.
Core challenges you’ll face:
- Membership protocol (join/leave/heartbeat) → distributed coordination
- Partition assignment strategies (range/round-robin) → fairness vs locality
- Offset commit semantics (commit timing, replay after restart) → processing guarantees
Key Concepts
- Group coordinator and offset storage concept: Apache Kafka docs — consumer offset tracking via coordinator
- Wire protocol framing: Apache Kafka “Protocol” guide (size-delimited messages)
Difficulty: Expert
Time estimate: 1 month+
Prerequisites: Project 6; basic distributed systems instincts
Real world outcome
- Start 3 consumers in a group; stop one; watch partitions rebalance and offsets resume correctly.
Implementation Hints
- Make rebalances explicit events and log a “before/after” assignment map.
- Commit offsets only after “processing done” to see why commit timing matters.
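A sketch of one assignment strategy (round-robin): both inputs are sorted so every node that runs the computation arrives at the same "after" map during a rebalance. This is one of several valid strategies, not the only one.

```go
package coordinator

import "sort"

// AssignRoundRobin deterministically spreads partitions across group members.
func AssignRoundRobin(members []string, partitions []int) map[string][]int {
	if len(members) == 0 {
		return nil
	}
	sorted := append([]string(nil), members...)
	sort.Strings(sorted)
	sort.Ints(partitions)
	assignment := make(map[string][]int, len(sorted))
	for i, p := range partitions {
		owner := sorted[i%len(sorted)]
		assignment[owner] = append(assignment[owner], p)
	}
	return assignment
}
```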
Learning milestones
- Stable assignment works → you understand consumer groups
- Failure triggers rebalance → you understand coordination under churn
- Restart resumes from committed offsets → you understand stateful consumption
Project 8: Replication + Leader Election (3-Node Log with ISR)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Java, C++
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: Replication / Consensus-lite
- Software or Tool: Replicated log (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A 3-node replicated partition with leader/follower replication, ISR tracking, and controlled leader failover.
Why it teaches queues: This is the durability core behind “acks=all” and “min ISR.” You will learn what data is safe when nodes fail mid-write.
Core challenges you’ll face:
- Replication protocol: append, replicate, advance high-watermark → durability model
- ISR membership changes under lag → safety vs availability
- Leader election and fencing old leaders → split-brain prevention
Key Concepts
- Durability vs ack settings (leader waits for ISR): Kafka producer configs (`acks`, idempotence prerequisites)
- Leader election and ISR/ELR mechanics (modern Kafka): Kafka operations docs on replication and leader eligibility
Difficulty: Expert
Time estimate: 1–2 months
Prerequisites: Project 6; comfort with failure injection
Real world outcome
- A scripted failure demo: kill leader during writes; show which offsets are committed vs rolled back, with a clear report.
Implementation Hints
- Start with a deterministic “test harness” that can pause a follower, delay replication, or crash a node.
- Track and display “high watermark” and “last replicated” per follower.
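A minimal sketch of the high-watermark rule: only offsets replicated by every in-sync replica are exposed to consumers, so a failover cannot "un-deliver" data. Parameter names are illustrative.

```go
package replication

// HighWatermark returns the highest offset known to be replicated by every
// in-sync replica. Consumers may only read at or below this offset.
func HighWatermark(leaderEnd int64, isrAcked map[string]int64) int64 {
	hw := leaderEnd
	for _, acked := range isrAcked {
		if acked < hw {
			hw = acked
		}
	}
	return hw
}
```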
Learning milestones
- Replication works normally → you understand leader/follower
- Failure doesn’t corrupt log → you understand safety invariants
- You can explain min-ISR tradeoffs → you understand durability in practice
Project 9: Delivery Semantics Lab (Duplicates, Idempotence, “Effectively Once”)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Java, Python, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Semantics / Idempotency
- Software or Tool: Deduplication + idempotent producer (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A producer/consumer pair that can be switched between at-most-once / at-least-once / “idempotent” modes, with a ledger proving duplicates or their absence.
Why it teaches queues: The industry’s hardest truth: “exactly once” is rarely end-to-end unless you control the entire pipeline and state changes are transactional.
Core challenges you’ll face:
- Generating stable message IDs and deduplicating safely → idempotence patterns
- Showing duplicates under retries and failures → at-least-once reality
- Demonstrating the “side effects” problem (DB writes, external APIs) → end-to-end correctness
Key Concepts
- Idempotent producer constraints: Apache Kafka producer configs (`enable.idempotence`, `acks=all`, in-flight limits)
- Publisher confirms vs consumer acks: RabbitMQ docs — confirms and acknowledgements
Difficulty: Advanced
Time estimate: 2–3 weeks
Prerequisites: Project 4 (acks), plus basic persistence (a simple local “ledger” file is enough)
Real world outcome
- A reproducible report: “we injected 5 failures; duplicates occurred in mode X, not in mode Y, for reason Z.”
Implementation Hints
- Separate “message delivered” from “business effect applied.” Your ledger should track both.
- Make retries explicit and controlled (fixed schedule, injected timeouts).
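A sketch of consumer-side deduplication by stable message ID; the in-memory set stands in for the durable ledger, which in the real project must be persisted atomically with the side effect. All names here are hypothetical.

```go
package dedup

// Ledger separates "message delivered" from "business effect applied".
type Ledger struct {
	applied map[string]bool
}

func NewLedger() *Ledger {
	return &Ledger{applied: make(map[string]bool)}
}

// Apply runs the side effect at most once per message ID, even if the broker
// redelivers the message (at-least-once delivery becomes effectively-once).
func (l *Ledger) Apply(messageID string, effect func() error) error {
	if l.applied[messageID] {
		return nil // duplicate delivery: skip the side effect
	}
	if err := effect(); err != nil {
		return err // effect not applied; a retry remains safe
	}
	l.applied[messageID] = true
	return nil
}
```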
Learning milestones
- You can force duplicates → you understand why at-least-once duplicates happen
- Deduplication works → you understand idempotency patterns
- You can explain why “exactly once” needs transactional boundaries → you understand the real problem
Project 10: Retention + Log Compaction (Keep What Matters, Drop the Rest)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Java, C++
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 4: Expert
- Knowledge Area: Storage / Data Lifecycle
- Software or Tool: Compaction engine (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A segment maintenance job that enforces retention and optionally compacts by key (keeping only the latest value per key).
Why it teaches queues: This is the hidden feature that turns a log into a durable “source of truth” rather than an infinite tape.
Core challenges you’ll face:
- Segment scanning and rewrite without breaking offsets → log invariants
- Key index maintenance and tombstones → state representation
- Compaction scheduling to avoid impacting reads/writes → operational constraints
Key Concepts
- Log-structured storage and compaction concepts: DDIA — LSM/compaction mental model
- Kafka’s “record batches” and binary framing mindset: Kafka protocol guide (to think in batches/segments)
Difficulty: Expert
Time estimate: 1 month+
Prerequisites: Project 6
Real world outcome
- A before/after visualization: storage size shrinks while “latest state per key” remains correct.
Implementation Hints
- Start with retention only; add compaction later.
- Maintain a “mapping report” so you can explain what got removed and why.
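A sketch of key-based compaction over one segment: keep only the newest entry per key, preserve offsets, and drop keys whose newest entry is a tombstone. (A real compactor retains tombstones for a grace period so deletes propagate to consumers; that is omitted here.)

```go
package compaction

// Entry is one keyed record in a log segment.
type Entry struct {
	Offset int64
	Key    string
	Value  []byte // nil means tombstone (delete marker)
}

// Compact keeps only the newest entry per key, preserving original offsets,
// and drops keys whose newest entry is a tombstone.
func Compact(segment []Entry) []Entry {
	latest := make(map[string]Entry, len(segment))
	for _, e := range segment {
		latest[e.Key] = e // later entries overwrite earlier ones
	}
	var out []Entry
	for _, e := range segment {
		kept := latest[e.Key]
		if kept.Offset == e.Offset && kept.Value != nil {
			out = append(out, e)
		}
	}
	return out
}
```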
Learning milestones
- Retention works → you understand data lifecycle
- Compaction preserves latest state → you understand key-based semantics
- You can explain tombstones and deletes → you understand real stream storage
Project 11: Delay Queues and Scheduled Delivery (Timers Are Hard)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Elixir, Java, Python
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Scheduling / Time
- Software or Tool: Delay queue (build)
- Main Book: Operating Systems: Three Easy Pieces (Arpaci-Dusseau)
What you’ll build: A queue that supports “deliver not before time T” and a scheduler that releases messages when due, under load.
Why it teaches queues: Delays interact with ordering, persistence, and crash recovery, which makes this a great introduction to "this is harder than it looks."
Core challenges you’ll face:
- Efficient time-ordered storage (timing wheel, heap) → data structures for time
- Correctness across restarts (no lost timers) → durable scheduling
- Burst handling when many timers fire simultaneously → thundering herd control
Key Concepts
- Time and scheduling in systems: OSTEP — time and scheduling mindset (timers, waiting, wakeups)
- Queue fairness under burst: RabbitMQ prefetch concepts (think in windows of in-flight work)
Difficulty: Advanced
Time estimate: 2–4 weeks
Prerequisites: Project 2 (durability) recommended
Real world outcome
- A demo that schedules 10,000 delayed tasks and fires them with bounded jitter, even after a restart.
Implementation Hints
- Decide whether “due time” is strict or best-effort; measure jitter and report it.
- Persist the schedule state; recovery must reconcile “now” with overdue items.
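A minimal in-memory scheduler using a due-time min-heap (`container/heap`); persistence and wakeup timers are left to the project. Releasing everything due "at or before now" is also what makes restart recovery of overdue items work.

```go
package delay

import (
	"container/heap"
	"time"
)

// Item is a message scheduled for delivery not before DueAt.
type Item struct {
	ID    string
	DueAt time.Time
}

// dueHeap orders items by due time (earliest first).
type dueHeap []Item

func (h dueHeap) Len() int            { return len(h) }
func (h dueHeap) Less(i, j int) bool  { return h[i].DueAt.Before(h[j].DueAt) }
func (h dueHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *dueHeap) Push(x interface{}) { *h = append(*h, x.(Item)) }
func (h *dueHeap) Pop() interface{} {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

// Scheduler releases items once their due time has passed.
type Scheduler struct{ h dueHeap }

func (s *Scheduler) Schedule(it Item) { heap.Push(&s.h, it) }

// PopDue returns every item due at or before now, including overdue items
// found after a restart.
func (s *Scheduler) PopDue(now time.Time) []Item {
	var due []Item
	for s.h.Len() > 0 && !s.h[0].DueAt.After(now) {
		due = append(due, heap.Pop(&s.h).(Item))
	}
	return due
}
```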
Learning milestones
- Delayed delivery works → you understand time-ordered queues
- Restart correctness holds → you understand durable scheduling
- Under burst it stays stable → you understand overload control
Project 12: Observability for Queues (Lag, Throughput, Redelivery)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, Elixir, Rust
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Observability / Metrics
- Software or Tool: Metrics + dashboard (build)
- Main Book: Release It! (Michael T. Nygard) (recommended)
What you’ll build: A metrics package that reports queue depth, in-flight, consumer lag, retry rates, and “time in queue.”
Why it teaches queues: You’ll learn what production operators watch and why “queue depth” alone is often misleading.
Core challenges you’ll face:
- Defining the right metrics and invariants → operational correctness
- Aggregation under high volume → systems efficiency
- Alert conditions (lag increasing, retries spiking) → SRE thinking
Key Concepts
- In-flight/unacked vs queued: RabbitMQ docs — ack + prefetch implications
- Consumer lag as a first-class metric: Kafka offset tracking model (consumer progress)
Difficulty: Intermediate
Time estimate: 1–2 weeks
Prerequisites: Any earlier project
Real world outcome
- A dashboard showing: lag per consumer, retries per minute, p95 “time in queue,” and DLQ growth.
Implementation Hints
- Make metrics “pull” (scrape) or “push” intentionally; document tradeoffs.
- Show one anomaly (e.g., slow consumer) and prove the metrics identify it.
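Consumer lag is the one metric worth getting right first: committed position versus log end, per partition. A minimal sketch, assuming you already expose both offsets; names are illustrative.

```go
package metrics

// Lag reports how far each consumer's committed offset trails the end of the
// log, per partition. A lag that keeps growing is the classic "slow consumer" signal.
func Lag(logEndOffsets, committedOffsets map[int]int64) map[int]int64 {
	lag := make(map[int]int64, len(logEndOffsets))
	for partition, end := range logEndOffsets {
		committed := committedOffsets[partition] // zero if nothing committed yet
		lag[partition] = end - committed
	}
	return lag
}
```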
Learning milestones
- Metrics reflect reality → you understand what to measure
- Alerts catch real failure modes → you understand operational signals
- You can debug a slowdown using only metrics → you understand production queueing
Project 13: Load Generator + Benchmark Harness for Brokers
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Python, Java
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Performance / Benchmarking
- Software or Tool: Benchmark harness (build)
- Main Book: CS:APP (Bryant & O’Hallaron)
What you’ll build: A repeatable harness that produces load patterns (steady, bursty, fanout) and measures throughput, latency percentiles, and tail behavior.
Why it teaches queues: Benchmarks reveal the hidden costs of acknowledgements, persistence, batching, and flow control.
Core challenges you’ll face:
- Generating stable load without measuring yourself → benchmark validity
- Capturing latency distributions (p50/p95/p99) → tail latency thinking
- Comparing modes (acks, batching, prefetch) → tradeoff quantification
Key Concepts
- Windowing/batching tradeoffs: Kafka protocol and batching mindset
- Prefetch impacts: RabbitMQ consumer prefetch concepts
Difficulty: Advanced
Time estimate: 2–3 weeks
Prerequisites: Project 5 or 6 strongly recommended
Real world outcome
- A benchmark report that ranks configurations by throughput and p99 latency, with clear conclusions.
Implementation Hints
- Keep clock reads and measurement overhead minimal; validate the harness itself before trusting its numbers.
- Include failure mode benchmarks: consumer restarts, network delay injection.
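Percentile reporting does not need a library to start with; nearest-rank over sorted samples is enough for a first harness report. A sketch with illustrative names:

```go
package bench

import (
	"math"
	"sort"
)

// Percentile returns the latency at quantile q (e.g. 0.95) from raw samples,
// using the nearest-rank method.
func Percentile(samples []float64, q float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	idx := int(math.Ceil(q*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}
```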
Learning milestones
- Reproducible results → you understand measurement discipline
- Clear tradeoffs emerge → you understand broker tuning knobs
- You can design a fair comparison → you understand systems evaluation
Project 14: Workflow Engine on Top of Queues (Retries, DLQ, Idempotency)
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, Elixir, Java
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: Distributed Workflows
- Software or Tool: Worker framework (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A small worker framework that supports task definitions, retries with backoff, idempotency keys, DLQs, and a minimal admin UI.
Why it teaches queues: Real systems are not “consume and print.” They are workflows with retries, partial failures, and external side effects.
Core challenges you’ll face:
- Defining task lifecycle states (queued, running, retrying, dead) → state machines
- Exactly-what-happened audit trail → debuggability
- Preventing duplicate side effects → idempotency in practice
Key Concepts
- Ack/retry/DLQ patterns: RabbitMQ ack/nack and reliability docs
- Idempotence constraints: Kafka producer idempotence requirements
Difficulty: Advanced
Time estimate: 1 month+
Prerequisites: Projects 4 and 9
Real world outcome
- A usable local “job runner” where you can submit tasks and watch them retry, succeed, or land in DLQ with audit trails.
Implementation Hints
- Treat every task execution as producing an “execution record” (attempt number, result, timestamps).
- Build a minimal UI: list tasks, filter failures, requeue from DLQ.
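The task lifecycle is easiest to keep honest as an explicit state machine whose transitions are validated before anything is written to the audit trail. A minimal sketch; the state names follow the list above and the `Transition` helper is hypothetical.

```go
package workflow

// State is one step in a task's lifecycle.
type State string

const (
	Queued   State = "queued"
	Running  State = "running"
	Retrying State = "retrying"
	Done     State = "done"
	Dead     State = "dead" // landed in the DLQ
)

// allowed encodes which transitions are legal; anything else is a bug worth surfacing.
var allowed = map[State][]State{
	Queued:   {Running},
	Running:  {Done, Retrying, Dead},
	Retrying: {Running, Dead},
}

// Transition validates a state change before it is recorded in the audit trail.
func Transition(from, to State) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}
```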
Learning milestones
- Retries + DLQ behave predictably → you understand robust queue-based workflows
- You can diagnose failures from audit logs → you understand operational needs
- You can prevent duplicate side effects → you understand correctness under retries
2.1.2 Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| In-Memory Work Queue + Backpressure | Intermediate | 1–2 weeks | Medium | Medium |
| Durable File-Backed Queue (WAL) | Advanced | 2–4 weeks | High | Medium |
| Mini AMQP Router (Exchanges/Bindings) | Advanced | 2–3 weeks | High | High |
| Ack/Nack + DLQ Lab | Intermediate | 1–2 weeks | High | Medium |
| Prefetch Tuning Playground | Intermediate | Weekend–1 week | Medium | Medium |
| Kafka-Lite Partitioned Log | Expert | 1 month+ | Very High | High |
| Consumer Group Coordinator | Expert | 1 month+ | Very High | Medium |
| Replication + Leader Election | Expert | 1–2 months | Extremely High | High |
2.1.3 Recommendation (what to start with)
Start with Project 4 (Ack/Nack + DLQ Lab), then Project 3 (Mini AMQP Router), then Project 6 (Kafka-Lite).
Why this order:
1) Project 4 makes “delivery semantics” real immediately (duplicates, redelivery, poison messages).
2) Project 3 makes RabbitMQ’s topology intuition concrete (routing is the point).
3) Project 6 builds the Kafka mental model (log + offsets), which is a fundamentally different paradigm.
If you only have weekends: do 4 → 5 → 9 (you’ll still learn the core semantics deeply).
2.1.4 Final Overall Project (capstone): QueueLab — A Unified Broker Playground
- File: LEARN_QUEUES_MESSAGE_BROKERS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Java, Elixir
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Messaging Systems / Distributed Systems
- Software or Tool: Local broker platform + UI (build)
- Main Book: Designing Data-Intensive Applications (Kleppmann)
What you’ll build: A local “queue laboratory” that can run in two modes:
- AMQP-mode: exchange/binding routing, push consumers, ack/nack, prefetch, DLQs
- Log-mode: topics/partitions, pull consumers by offset, consumer groups, retention/compaction
Plus a single dashboard that shows: routing decisions, in-flight windows, consumer lag, replication status, and redelivery/DLQ rates.
Why it teaches queues: It forces you to build and compare the two dominant broker models side-by-side, with the same workload, so the differences become undeniable.
Core challenges you’ll face
- Designing a common “message lifecycle model” across both paradigms → conceptual mastery
- Failure injection suite (kill node, delay follower, crash consumer) → real-world resilience
- Operators’ dashboard with actionable metrics and timelines → production-grade thinking
Key Concepts
- RabbitMQ reliability and replicated queues concept: RabbitMQ reliability/quorum queue docs
- Kafka protocol, offsets, and idempotence constraints: Apache Kafka protocol + producer config docs
Real world outcome
- A demo environment where you can run the same “order processing” workload on both modes and generate a report:
- throughput, p95/p99 latency
- duplicate rate
- time-to-recover after node failure
- operational signals (lag, retries, DLQ)
Implementation Hints
- Treat this as “a product.” Define 3–5 standard scenarios: fanout logs, task queue, event sourcing stream, burst load, node failure.
- Make every scenario produce a saved report so the outcome is measurable and repeatable.
Learning milestones
- Both modes pass the same workload → you understand conceptual differences
- Failure injection produces explainable outcomes → you understand durability and recovery
- Dashboard tells you what’s wrong without guessing → you understand operating message systems
Summary (Projects + Main Language)
- In-Memory Work Queue + Backpressure Simulator — Go
- Durable File-Backed Queue (Write-Ahead Log) — Go
- Mini AMQP Router (Exchanges, Bindings, Routing Keys) — Go
- Ack/Nack, Redelivery, and Poison-Message Lab — Go
- Prefetch Tuning Playground — Go
- Kafka-Lite (Partitioned Append-Only Log Broker) — Go
- Consumer Group Coordinator (Rebalance + Offset Commit) — Go
- Replication + Leader Election (3-Node Log with ISR) — Go
- Delivery Semantics Lab (Duplicates, Idempotence, “Effectively Once”) — Go
- Retention + Log Compaction — Go
- Delay Queues and Scheduled Delivery — Go
- Observability for Queues (Lag, Throughput, Redelivery) — Go
- Load Generator + Benchmark Harness for Brokers — Go
- Workflow Engine on Top of Queues — Go
Capstone: QueueLab — Go