

Learn Pub/Sub: From Zero to Event-Driven Architect

Goal: Deeply understand the Publish-Subscribe pattern—from the core theory to implementing robust, scalable, and resilient distributed systems using message brokers like Redis, RabbitMQ, and Kafka.


Why Learn Pub/Sub?

The Publish-Subscribe (Pub/Sub) pattern is the backbone of modern distributed systems, microservices, and real-time applications. It allows different parts of a complex system to communicate without being directly connected, leading to greater scalability, resilience, and flexibility.

Instead of one component calling another directly (a tight coupling), publishers send messages to a channel, and subscribers listen to that channel, without any knowledge of each other. This decoupling is a superpower for building complex applications.

After completing these projects, you will:

  • Understand the fundamental difference between a message broker and a direct API call.
  • Choose the right message broker (Redis, RabbitMQ, Kafka) for the right job.
  • Design and build event-driven microservices that are scalable and fault-tolerant.
  • Implement real-time notification systems and data streaming pipelines.
  • Grasp advanced concepts like delivery guarantees, persistence, and backpressure.

Core Concept Analysis

The Pub/Sub Architecture

                                  ┌─────────────────┐
                                  │                 │
                                  │  Message Broker │
                                  │ (e.g., RabbitMQ)│
                                  │                 │
                                  └─────────────────┘
                                          ▲
                                          │ 1. Publish
                                          │
                            ┌─────────────┴─────────────┐
                            │      "new_user" TOPIC     │
                            └─────────────┬─────────────┘
                                          │ 2. Fan-out
                    ┌─────────────────────┼─────────────────────┐
                    │                     │                     │
                    ▼ 3a. Deliver         ▼ 3b. Deliver         ▼ 3c. Deliver
            ┌───────────────┐     ┌───────────────┐     ┌───────────────┐
            │  Subscriber A │     │  Subscriber B │     │  Subscriber C │
            │ (Welcome Email│     │ (Analytics    │     │ (Search Index │
            │    Service)   │     │   Service)    │     │     Service)  │
            └───────────────┘     └───────────────┘     └───────────────┘

Publishers (e.g., a User Signup API) know nothing about the subscribers.
Subscribers know nothing about the publisher. They only care about the topic.

Key Concepts Explained

1. The Core Components

  • Publisher: The application component that creates and sends a message. It doesn’t know or care who receives it.
  • Subscriber: The application component that receives messages. It expresses interest in one or more topics.
  • Message: The packet of data being sent. It typically has a payload and some metadata (headers).
  • Topic/Channel: A named channel to which messages are published. Subscribers subscribe to topics.
  • Broker: The central intermediary responsible for receiving messages from publishers and routing them to the correct subscribers. This is the heart of the system.

2. Message Brokers: A Quick Comparison

  • Redis (In-Memory Data Store)
    • Strengths: Extremely fast, simple, lightweight
    • Weaknesses: “Fire-and-forget”, no message persistence
    • Use Case: Caching, real-time notifications, leaderboards
  • RabbitMQ (Smart Broker, Dumb Consumer)
    • Strengths: Complex routing, message persistence, per-message ACKs
    • Weaknesses: Lower throughput than Kafka, complex setup
    • Use Case: Task queues, traditional messaging, microservice communication
  • Kafka (Dumb Broker, Smart Consumer)
    • Strengths: Extremely high throughput, durable log, replayable
    • Weaknesses: Higher latency than Redis, complex consumer logic (offset management)
    • Use Case: Event streaming, log aggregation, data pipelines

3. Key Messaging Concepts

  • Queue vs. Topic:
    • A Topic is for fan-out: one message goes to all subscribers. Think of a newspaper subscription.
    • A Queue is typically for load balancing: many publishers send to one queue, and many consumers compete to pull messages from it. Only one consumer gets a given message. Think of a line at a bank.
  • Persistence: The broker’s ability to save messages to disk. If a subscriber is offline, it can receive the message when it comes back online.
  • Acknowledgement (ACK): A signal from a subscriber to the broker confirming it has successfully processed a message. If the broker doesn’t receive an ACK, it will often re-deliver the message to another subscriber.
  • Delivery Guarantees:
    • At Most Once: The broker sends the message and forgets. If it gets lost, it’s gone. (e.g., Redis Pub/Sub)
    • At Least Once: The broker guarantees delivery, but a subscriber might receive the same message multiple times if an ACK is lost. The subscriber must be idempotent. (e.g., RabbitMQ, Kafka)
    • Exactly Once: The holy grail. Extremely difficult to achieve, requires coordination between the broker and the client.
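
To make the delivery-guarantee trade-off concrete, here is a minimal, broker-agnostic sketch of an idempotent handler for “at least once” delivery. The message shape and its unique id field are assumptions for illustration, not part of any broker’s API.

# Sketch: why "at least once" delivery forces subscribers to be idempotent.
# The unique "id" field and the message shape are assumed for this example.
processed_ids = set()   # in production: a database table checked and updated transactionally

def handle(message: dict) -> None:
    if message["id"] in processed_ids:
        print(f"Duplicate delivery of {message['id']}, ignoring")
        return
    print(f"Processing {message['id']}: {message['payload']}")   # the real side effect goes here
    processed_ids.add(message["id"])

# The broker may redeliver the same message if an ACK is lost:
handle({"id": "evt-1", "payload": "charge order 123"})
handle({"id": "evt-1", "payload": "charge order 123"})   # second delivery is a no-op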

Project List

The following 7 projects will guide you from basic principles to building sophisticated, real-world event-driven systems.


Project 1: In-Memory Pub/Sub Hub

  • File: LEARN_PUBSUB_DEEP_DIVE.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Node.js, C#
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Software Architecture / Design Patterns
  • Software or Tool: N/A (Pure code)
  • Main Book: “Design Patterns: Elements of Reusable Object-Oriented Software” by the “Gang of Four” (Observer Pattern)

What you’ll build: A simple pub/sub system that runs entirely within a single application. You’ll create a central “Hub” class that allows parts of your code to subscribe to topics with a callback function, and other parts to publish messages to those topics, triggering the callbacks.

Why it teaches pub/sub: This project forces you to implement the core logic of the pattern from scratch, without a broker. You’ll understand the fundamental mechanics of topics, handlers, and message dispatching. It also perfectly illustrates the limitations that real message brokers exist to solve (e.g., no persistence, no inter-process communication).

Core challenges you’ll face:

  • Storing subscriptions → maps to creating a data structure (like a dictionary) to hold topics and lists of subscriber callbacks
  • Dispatching messages → maps to looping through subscribers for a given topic and calling their functions
  • Decoupling publisher and subscriber → maps to ensuring the publisher knows nothing about the subscriber objects, only the topic name
  • Handling errors in subscribers → maps to deciding what should happen if one subscriber’s callback throws an exception

Key Concepts:

  • Observer Pattern: The formal design pattern that pub/sub is based on.
  • Callbacks/Function Pointers: The mechanism for registering subscriber interest.
  • Data Structures: Using dictionaries/hashtables to manage topics and subscribers.

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Basic programming concepts (functions, classes, dictionaries/maps).

Real world outcome: You’ll have a running console application demonstrating decoupled communication.

# Console Output:
# --- Setting up subscribers ---
# Analytics service subscribed to 'user_registered'
# Email service subscribed to 'user_registered'
# Logger subscribed to 'user_registered'
# Logger also subscribed to 'order_placed'
#
# --- Publishing event: user_registered ---
# [Email] Sending welcome email to john.doe@example.com
# [Analytics] Tracking new user: john.doe
# [Logger] Event 'user_registered': {'user': 'john.doe'}
#
# --- Publishing event: order_placed ---
# [Logger] Event 'order_placed': {'order_id': 123, 'amount': 99.99}

The user registration logic only published one event; it had no idea three different systems would react to it.

Implementation Hints:

  1. Create a PubSubHub class.
  2. Inside the class, have a dictionary, e.g., self.topics = {}. The keys will be topic strings, and the values will be lists of callback functions.
  3. Implement a subscribe(topic, handler) method. It should check if the topic exists in the dictionary. If not, create an empty list. Then, append the handler function to the list.
  4. Implement a publish(topic, message) method. It should look up the topic in the dictionary. If it exists, iterate through the list of handlers and call each one, passing the message as an argument.
  5. Wrap the handler calls in a try/except block to prevent one failing subscriber from stopping the others.

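Putting the hints together, here is a minimal sketch of the hub. The class and method names follow the hints above; the exact signatures are one reasonable choice, not the only one.

# Minimal in-process pub/sub hub (sketch).
from collections import defaultdict
from typing import Any, Callable

class PubSubHub:
    def __init__(self) -> None:
        # topic name -> list of subscriber callbacks
        self.topics: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self.topics[topic].append(handler)

    def publish(self, topic: str, message: Any) -> None:
        for handler in self.topics.get(topic, []):     # a topic with no subscribers is a no-op
            try:
                handler(message)
            except Exception as exc:                   # one failing subscriber must not stop the others
                print(f"[hub] handler {handler!r} failed: {exc}")

hub = PubSubHub()
hub.subscribe("user_registered", lambda msg: print(f"[Email] Sending welcome email to {msg['user']}"))
hub.subscribe("user_registered", lambda msg: print(f"[Analytics] Tracking new user: {msg['user']}"))
hub.publish("user_registered", {"user": "john.doe"})
hub.publish("order_placed", {"order_id": 123})         # no subscribers yet: nothing happens
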
Learning milestones:

  1. A subscriber’s function is called when an event is published → You have a working dispatch loop.
  2. Multiple subscribers to the same topic all receive the message → You’ve implemented one-to-many fan-out.
  3. Publishing to a topic with no subscribers does nothing and doesn’t crash → Your code is robust to empty topics.
  4. You can articulate why this system wouldn’t work for multi-process or multi-server applications → You understand the need for a message broker.

Project 2: Chat Application with Redis Pub/Sub

  • File: LEARN_PUBSUB_DEEP_DIVE.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, Node.js
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Real-time Communication / Message Brokers
  • Software or Tool: Redis
  • Main Book: “Redis in Action” by Josiah Carlson

What you’ll build: A simple command-line chat application. Multiple users can run the client, join a “chat room” (which is just a Redis pub/sub topic), and any message sent by one user is instantly received by all other users in the same room.

Why it teaches pub/sub: This project introduces an external message broker. You’ll move from an in-process system to a client-server architecture. You’ll learn how to connect to a broker, subscribe to a topic in a blocking loop, and publish messages from a separate thread or process. It also highlights the “fire-and-forget” nature of Redis pub/sub.

Core challenges you’ll face:

  • Connecting to Redis → maps to using a Redis client library
  • Subscribing to a topic and listening for messages → maps to creating a subscription object and running a blocking loop to listen
  • Publishing and subscribing at the same time → maps to using threading or asyncio to handle user input (publishing) while simultaneously listening for incoming messages
  • Handling multiple chat rooms → maps to subscribing to different topics

Key Concepts:

  • Client-Server Architecture: Your app is a client to the Redis server (broker).
  • Blocking I/O: The subscriber loop will “block” while it waits for messages.
  • Concurrency: Using threads or async I/O to manage multiple tasks at once.
  • “Fire-and-forget” delivery: If a chat client isn’t running, it will not receive messages sent during that time.

Difficulty: Intermediate
Time estimate: Weekend
Prerequisites: Project 1, basic understanding of concurrency, Docker for running Redis easily.

Real world outcome: You can open multiple terminal windows and have them talk to each other.

Terminal 1:

$ ./chat_client.py --username Alice --room general
Joined room: general
[You] > Hello everyone!
[Bob] > Hi Alice!

Terminal 2:

$ ./chat_client.py --username Bob --room general
Joined room: general
[Alice] > Hello everyone!
[You] > Hi Alice!

Implementation Hints:

  1. Run Redis in Docker: docker run -d -p 6379:6379 redis.
  2. Use a Redis client library for your language (e.g., redis-py for Python).
  3. The core of your subscriber logic will be a pubsub object from the library. You’ll call .subscribe('room_name') and then loop over pubsub.listen(). This loop will block until a message arrives.
  4. Because the listener loop is blocking, you need another way to get user input. A simple approach is to run the listener in a separate thread. The main thread can handle input() from the user.
  5. When the user types a message, use the same Redis client connection to publish('room_name', message_text).
  6. Structure your messages. Sending a JSON string like {"username": "Alice", "text": "Hello!"} is better than sending raw text.

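A condensed sketch of the client described in the hints, using redis-py. The channel naming ("room:<name>"), the JSON message shape, and the hard-coded username are illustrative assumptions.

# chat_client.py (sketch): listener thread + publishing from the main thread.
import json
import threading
import redis   # pip install redis

USERNAME = "Alice"
ROOM = "room:general"
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def listen() -> None:
    pubsub = r.pubsub()
    pubsub.subscribe(ROOM)
    for message in pubsub.listen():            # blocks until something arrives on the channel
        if message["type"] != "message":
            continue                           # skip subscribe confirmations
        payload = json.loads(message["data"])
        if payload["username"] != USERNAME:
            print(f"\n[{payload['username']}] > {payload['text']}")

threading.Thread(target=listen, daemon=True).start()
print(f"Joined room: {ROOM}")

while True:                                    # main thread: read user input and publish it
    text = input("[You] > ")
    r.publish(ROOM, json.dumps({"username": USERNAME, "text": text}))
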
Learning milestones:

  1. Your client can publish a message and see it in redis-cli’s MONITOR command → You can connect and publish.
  2. Your client can subscribe and receive a message published from redis-cli → You can subscribe and listen.
  3. Two instances of your client can talk to each other → You have working bi-directional communication through the broker.
  4. You can explain why a user who joins late doesn’t see old messages → You understand Redis’s “at-most-once” delivery and lack of persistence in its pub/sub.

Project 3: Persistent Task Queue with RabbitMQ

  • File: LEARN_PUBSUB_DEEP_DIVE.md
  • Main Programming Language: Python
  • Alternative Programming Languages: Go, C#, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Asynchronous Processing / Distributed Systems
  • Software or Tool: RabbitMQ
  • Main Book: “RabbitMQ in Action” by Alvaro Videla and Jason J.W. Williams

What you’ll build: A system for processing “tasks” asynchronously. You’ll have a “producer” script that sends tasks (e.g., “process video ID 123”, “send email to user 456”) to a RabbitMQ queue. You’ll also build a “worker” script that pulls tasks from the queue, pretends to process them, and sends an acknowledgement (ACK).

Why it teaches pub/sub: This project introduces a more traditional and robust message broker. You’ll learn about key concepts that Redis pub/sub lacks: persistence (messages survive a broker restart), acknowledgements (ensuring tasks are completed), and work queues (distributing tasks among multiple workers for parallel processing).

Core challenges you’ll face:

  • Understanding RabbitMQ’s AMQP model → maps to Exchanges, Queues, and Bindings
  • Declaring durable queues → maps to ensuring your queue and messages survive a broker restart
  • Implementing message acknowledgements → maps to calling channel.basic_ack() only after a task is successfully completed
  • Setting up competing consumers → maps to running multiple instances of your worker script and seeing them share the workload
  • Handling worker failure → maps to observing how un-ACK’d messages are re-queued when a worker script crashes

Key Concepts:

  • AMQP (Advanced Message Queuing Protocol): The protocol RabbitMQ uses.
  • Exchanges, Queues, Bindings: The core routing components of RabbitMQ. A producer sends to an exchange, which routes the message to one or more queues based on binding rules.
  • Message Durability: Making sure a message isn’t lost if the broker crashes.
  • Acknowledgements (ACKs) and Re-queuing: The mechanism for guaranteeing “at-least-once” delivery.

Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Project 2, Docker for running RabbitMQ.

Real world outcome: You’ll see true distributed workload processing.

Producer Terminal:

$ ./producer.py --task "Process video 1"
$ ./producer.py --task "Process video 2"
$ ./producer.py --task "Process video 3"
$ ./producer.py --task "Process video 4"
Sent task: Process video 1
Sent task: Process video 2
Sent task: Process video 3
Sent task: Process video 4

Worker 1 Terminal:

$ ./worker.py
[*] Waiting for messages.
[x] Received 'Process video 1'. Processing... Done.
[x] Received 'Process video 3'. Processing... Done.

Worker 2 Terminal:

$ ./worker.py
[*] Waiting for messages.
[x] Received 'Process video 2'. Processing... Done.
[x] Received 'Process video 4'. Processing... Done.

If you kill Worker 1 while it’s “processing” a task, you’ll see that task get re-delivered to Worker 2.

Implementation Hints:

  1. Run RabbitMQ in Docker: docker run -d -p 5672:5672 -p 15672:15672 rabbitmq:3-management. The second port is for the web management UI.
  2. Use a client library like pika for Python.
  3. In both producer and consumer, you need to connect and get a channel.
  4. In both, declare the queue: channel.queue_declare(queue='task_queue', durable=True). Doing this in both ensures the queue exists before it’s used.
  5. Producer: Use channel.basic_publish(). Set the delivery_mode=2 property to make messages persistent.
  6. Worker: Use channel.basic_consume(). The key is to set auto_ack=False. This gives you manual control. Your callback function will receive the message.
  7. In your worker’s callback, after you finish your “work” (e.g., after time.sleep()), call channel.basic_ack(delivery_tag=method.delivery_tag).

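A condensed sketch of the worker side (hints 3, 6, and 7), using pika. The queue name matches hint 4, the sleep stands in for real work, and the prefetch setting is an addition that spreads tasks fairly across workers.

# worker.py (sketch): manual ACKs give "at least once" delivery.
import time
import pika   # pip install pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="task_queue", durable=True)   # survives a broker restart
channel.basic_qos(prefetch_count=1)                       # hand each worker one unacked task at a time

def callback(ch, method, properties, body):
    print(f"[x] Received {body.decode()!r}. Processing...", end=" ", flush=True)
    time.sleep(2)                                          # pretend to do the work
    print("Done.")
    ch.basic_ack(delivery_tag=method.delivery_tag)         # ACK only after the work succeeded

channel.basic_consume(queue="task_queue", on_message_callback=callback, auto_ack=False)
print("[*] Waiting for messages.")
channel.start_consuming()

On the producer side, the matching call is channel.basic_publish(exchange='', routing_key='task_queue', body=task_text, properties=pika.BasicProperties(delivery_mode=2)); delivery_mode=2 marks the message as persistent (hint 5).
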
Learning milestones:

  1. A message sent by the producer is received by the worker → Your basic connection and queue setup is correct.
  2. If you stop the broker and restart it, the messages are still in the queue → You’ve correctly implemented durable queues and persistent messages.
  3. If you kill a worker mid-task, the task is re-delivered to another worker → You’ve mastered manual ACKs and “at-least-once” delivery.
  4. Running two workers causes them to share the tasks from the queue → You’ve implemented a competing consumer pattern.

Project 4: Real-Time Notification System

  • File: LEARN_PUBSUB_DEEP_DIVE.md
  • Main Programming Language: Python (with FastAPI/Flask) + JavaScript
  • Alternative Programming Languages: Go, Node.js
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Web Development / Real-time Systems
  • Software or Tool: RabbitMQ (or Redis), WebSockets
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build: A web application with a “live feed.” A backend web server (e.g., an API for posting comments) will publish an event to a message broker every time a new comment is posted. A separate “notification” service subscribes to these events and pushes them over a WebSocket to all connected web browsers, which then dynamically update the page without a refresh.

Why it teaches pub/sub: This project demonstrates a classic use case for pub/sub: decoupling a core business action (writing to a database) from secondary “side effects” (sending notifications). It shows how pub/sub enables real-time features and event-driven architectures in a user-facing product.

Core challenges you’ll face:

  • Integrating publishing into a web request → maps to calling basic_publish inside an API endpoint (e.g., after saving a comment to the database)
  • Managing WebSocket connections → maps to keeping track of all connected browser clients
  • Broadcasting messages to clients → maps to forwarding each message the notification service receives from the broker to every connected WebSocket client
  • Frontend JavaScript to handle WebSocket events → maps to writing client-side code to connect to the WebSocket and dynamically add elements to the DOM

Key Concepts:

  • Event-Driven Architecture: The core of your application logic is driven by events, not direct calls.
  • Decoupling: The API that creates comments has no idea that a real-time notification system even exists.
  • WebSockets: A protocol for persistent, bi-directional communication between a browser and a server.
  • Fan-out Exchange: In RabbitMQ, you’ll use a “fanout” exchange to ensure every notification service instance gets a copy of the event.

Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Project 3, basic web development (HTML/JS, a backend framework like Flask or FastAPI).

Real world outcome: A user can have a web page open. Another user (or you, via an API call) posts a comment. A second later, the comment appears on the first user’s page automatically.

System Diagram:

 Browser 1        Browser 2
    ▲ │               ▲ │
    │ │ WebSocket     │ │ WebSocket
    │ ▼               │ ▼
┌───────────────┐ ┌───────────────┐
│ Notification  │ │ Notification  │ (Can be scaled)
│ Service       │ │ Service       │
└───────┬───────┘ └───────┬───────┘
        │ Subscribes      │ Subscribes
        └─────────┐ ┌─────┘
                  ▼ ▼
            ┌──────────────────┐
            │ RabbitMQ Broker  │
            │ (comments topic) │
            └───────▲──────────┘
                    │ Publishes
            ┌───────┴──────────┐
            │   Comment API    │
            └──────────────────┘

Implementation Hints:

  1. Comment API: Create a simple HTTP endpoint (e.g., POST /comments). When it receives a request, it saves the comment to a database and then publishes a message like {"comment": "...", "user": "..."} to a RabbitMQ fanout exchange.
  2. Notification Service: This is a separate process. It connects to RabbitMQ and subscribes to the queue bound to the fanout exchange. It also runs a WebSocket server. It needs to maintain a list of all connected clients.
  3. When the notification service receives a message from RabbitMQ, it iterates through its list of connected WebSocket clients and sends the message payload to each one.
  4. Frontend: The JavaScript on your web page will establish a WebSocket connection to your notification service. It will have an onmessage event handler that parses the incoming data and uses DOM manipulation to create and append a new comment element to the page.

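A condensed sketch of the publishing half (hint 1), using Flask and pika. The exchange name "comments", the request shape, and the stubbed-out database write are assumptions; the notification service and the frontend are separate pieces.

# comment_api.py (sketch): save, then publish to a fanout exchange.
import json
import pika   # pip install pika
from flask import Flask, request, jsonify   # pip install flask

app = Flask(__name__)

def publish_comment_event(event: dict) -> None:
    # A connection per request keeps the sketch simple; a real service would reuse one.
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.exchange_declare(exchange="comments", exchange_type="fanout")
    channel.basic_publish(exchange="comments", routing_key="", body=json.dumps(event))
    connection.close()

@app.route("/comments", methods=["POST"])
def create_comment():
    data = request.get_json()
    print(f"Saving comment to database: {data}")   # stand-in for the real DB write
    publish_comment_event({"user": data["user"], "comment": data["comment"]})
    return jsonify({"status": "created"}), 201

if __name__ == "__main__":
    app.run(port=8000)

Each notification service instance would bind its own queue to the "comments" fanout exchange (hint 2), so every instance receives every event.
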
Learning milestones:

  1. Posting to the API causes a message to appear in the RabbitMQ queue → The publishing part is working.
  2. The notification service prints a message to its console when a message is received from RabbitMQ → The subscribing part is working.
  3. Your browser can connect to the WebSocket server → The real-time connection is established.
  4. A new comment posted via the API appears on the web page without a refresh → The entire end-to-end flow is complete.

Project 5: Website Clickstream Analysis with Kafka

  • File: LEARN_PUBSUB_DEEP_DIVE.md
  • Main Programming Language: Python or Java
  • Alternative Programming Languages: Go, Scala
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Big Data / Event Streaming
  • Software or Tool: Apache Kafka
  • Main Book: “Kafka: The Definitive Guide” by Gwen Shapira, Neha Narkhede, and Todd Palino

What you’ll build: A data pipeline that simulates and processes a stream of website click events. A “producer” script will generate thousands of fake click events ({ "user_id": "...", "page": "...", "timestamp": "..." }) and publish them to a Kafka topic. A “consumer” script will read from this topic in real-time, calculate some simple aggregate metrics (e.g., page views per minute), and print them to the console.

Why it teaches pub/sub: This project introduces you to the world of high-throughput event streaming with Kafka. You’ll learn that Kafka is more than a message queue; it’s a durable, ordered log. You’ll grapple with concepts like partitions (for parallelism), consumer groups (for scaling), and offsets (for tracking a consumer’s position in the log).

Core challenges you’ll face:

  • Setting up Kafka and Zookeeper → maps to understanding Kafka’s primary dependency (or using KRaft mode)
  • Configuring Topics and Partitions → maps to using Kafka’s CLI tools to create a topic with multiple partitions for parallel processing
  • Implementing a reliable producer → maps to handling acknowledgements and retries to ensure messages are sent
  • Implementing a consumer group → maps to running multiple instances of your consumer and seeing them automatically split the partitions to process in parallel
  • Managing consumer offsets → maps to understanding how Kafka tracks a consumer’s progress and the difference between auto and manual commit

Key Concepts:

  • The Log Abstraction: The core idea that a topic is an ordered, immutable sequence of records.
  • Partitions: A topic is split into partitions. Ordering is only guaranteed within a partition.
  • Offsets: Each message in a partition has a unique, sequential ID called an offset. Consumers track their progress using this offset.
  • Consumer Groups: A set of consumers that cooperate to consume from a topic. Kafka automatically assigns partitions to consumers in a group.
  • Replayability: Because the log is durable, a consumer can “rewind” and re-process messages from the beginning.

Difficulty: Expert
Time estimate: 2-3 weeks
Prerequisites: Project 3, familiarity with command-line tools, Docker.

Real world outcome: You’ll have a running data pipeline.

Producer Terminal:

$ ./producer.py
Producing 1000 events/sec to 'clicks' topic...

Consumer 1 Terminal:

$ ./consumer.py --group my-aggregator
Processing partitions: [0, 1]
--- Minute 1 ---
/home: 5021 views
/pricing: 2341 views
/docs: 1024 views

Consumer 2 Terminal:

$ ./consumer.py --group my-aggregator
Processing partitions: [2, 3]
<... contributes to the same aggregation ...>

If you kill a consumer, you will see the other one automatically get assigned the orphaned partitions and continue processing the full stream.

Implementation Hints:

  1. Running Kafka is the first hurdle. Use a docker-compose.yml file to run both Zookeeper and Kafka. Many good examples exist online.
  2. Use the Kafka command-line tools (which are inside the Docker container) to create your topic: kafka-topics.sh --create --topic clicks --partitions 4 ....
  3. Use a well-supported client library like kafka-python or confluent-kafka-python.
  4. Producer: Create a KafkaProducer instance. Loop and call producer.send('clicks', event_data).
  5. Consumer: Create a KafkaConsumer. When you instantiate it, you specify the topic and a group_id. The library handles consumer group coordination automatically. The consumer object is an iterator, so you can simply write for message in consumer: ....
  6. To see scaling, run your consumer script in two separate terminals with the same group_id. Then use the admin tools to describe the consumer group and see how the partitions are split between them.

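A condensed sketch of the consumer side (hints 3 and 5), using kafka-python. The topic, group id, and event schema match this project's description; the per-minute bucketing by timestamp prefix is an illustrative choice that assumes ISO-8601 timestamps.

# consumer.py (sketch): page views per minute, scaled out via a consumer group.
import json
from collections import Counter
from kafka import KafkaConsumer   # pip install kafka-python

consumer = KafkaConsumer(
    "clicks",
    bootstrap_servers="localhost:9092",
    group_id="my-aggregator",               # same group_id in every instance -> partitions are shared
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

views_per_minute: dict[str, Counter] = {}

for message in consumer:                    # iterates forever, one record at a time
    event = message.value                   # {"user_id": ..., "page": ..., "timestamp": ...}
    minute = event["timestamp"][:16]        # "YYYY-MM-DDTHH:MM" bucket
    views_per_minute.setdefault(minute, Counter())[event["page"]] += 1
    print(minute, dict(views_per_minute[minute]))
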
Learning milestones:

  1. A message sent by the producer is received by the consumer → Your Kafka cluster and basic client configurations are working.
  2. Messages are distributed across multiple partitions → The producer is load-balancing correctly.
  3. Starting a second consumer in the same group causes it to process a subset of the partitions → You have successfully implemented a scalable consumer group.
  4. If you stop and restart a consumer, it picks up where it left off → You understand offset management and at-least-once delivery.

Project 6: Event-Driven Microservices Communication

  • File: LEARN_PUBSUB_DEEP_DIVE.md
  • Main Programming Language: Go or Python
  • Alternative Programming Languages: C#, Java, Node.js
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Microservices / Distributed Architecture
  • Software or Tool: RabbitMQ or Kafka, Docker
  • Main Book: “Building Microservices” by Sam Newman

What you’ll build: A small ecosystem of three microservices that communicate purely through a message broker.

  1. Order Service: Has an API to create an order. When an order is created, it saves it to its own database and publishes an OrderCreated event.
  2. Payment Service: Subscribes to OrderCreated events. When it receives one, it “processes payment” and publishes an OrderPaid event.
  3. Shipping Service: Subscribes to OrderPaid events. When it receives one, it “arranges for shipping”.

Why it teaches pub/sub: This is the canonical example of using pub/sub for building resilient, decoupled systems. No service calls another’s API directly. They are completely independent and only react to events. This allows you to update or redeploy the Payment Service, or even have it go down for an hour, without the Order Service knowing or caring.

Core challenges you’ll face:

  • Designing your event schemas → maps to deciding what data an OrderCreated event needs to contain
  • Ensuring service independence → maps to making sure no service has a direct dependency (like an HTTP call) on another
  • Handling event chains → maps to tracing the flow of an order from OrderCreated to OrderPaid
  • Idempotent consumers → maps to ensuring that if the Payment Service receives the same OrderCreated event twice, it doesn’t charge the customer twice
  • Data consistency across services → maps to understanding the challenges of “eventual consistency”

Key Concepts:

  • Choreography vs. Orchestration: Your services are “choreographed” (they react to each other’s events) rather than “orchestrated” (a central service telling them what to do).
  • Idempotency: A critical concept for reliable messaging. An operation is idempotent if running it multiple times has the same effect as running it once.
  • Eventual Consistency: Because services are updated asynchronously, the overall system state is not instantly consistent, but will become so “eventually”.
  • Service Autonomy: Each service has its own database and can be developed and deployed independently.

Difficulty: Expert
Time estimate: 2-4 weeks
Prerequisites: Project 3 or 5, experience with web frameworks and databases.

Real world outcome: You can run and interact with your distributed system.

  1. Call the Order Service API: POST /orders
  2. Watch the logs for the Order Service: INFO: Order 123 created. Publishing OrderCreated event.
  3. Watch the logs for the Payment Service: INFO: Received OrderCreated event for 123. Processing payment... Payment successful. Publishing OrderPaid event.
  4. Watch the logs for the Shipping Service: INFO: Received OrderPaid event for 123. Arranging shipment.

You can kill the Payment Service, create several orders, and then restart the Payment Service. You’ll see it immediately start processing the backlog of OrderCreated events from the queue. The system is resilient.

Implementation Hints:

  1. Use Docker Compose to define and run your three services and the message broker (RabbitMQ is a good choice here).
  2. Each service is a separate application (e.g., its own Python/Flask app).
  3. Order Service: On its API endpoint, after saving to its DB, it publishes a message.
  4. Payment and Shipping Services: These don’t need APIs. They are “headless” services that start up, connect to the broker, and begin listening for messages in a loop.
  5. To achieve idempotency in the Payment Service, it should have its own database to track which order IDs it has already processed. When it receives an event, it first checks if it has seen that ID before.

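A condensed sketch of the Payment Service's consumer loop (hints 4 and 5), using pika and SQLite for the "already processed" check. The queue name, event schema, and database layout are assumptions; publishing the follow-up OrderPaid event is left as a comment.

# payment_service.py (sketch): an idempotent consumer.
import json
import sqlite3
import pika   # pip install pika

db = sqlite3.connect("payments.db")
db.execute("CREATE TABLE IF NOT EXISTS processed_orders (order_id TEXT PRIMARY KEY)")

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="order_created", durable=True)

def on_order_created(ch, method, properties, body):
    event = json.loads(body)
    order_id = event["order_id"]
    seen = db.execute("SELECT 1 FROM processed_orders WHERE order_id = ?", (order_id,)).fetchone()
    if seen:
        print(f"Duplicate OrderCreated for {order_id}, skipping")
    else:
        print(f"Processing payment for order {order_id}...")   # the real charge goes here
        db.execute("INSERT INTO processed_orders VALUES (?)", (order_id,))
        db.commit()
        # ...then publish an OrderPaid event for the Shipping Service to consume.
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="order_created", on_message_callback=on_order_created, auto_ack=False)
channel.start_consuming()
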
Learning milestones:

  1. The Payment Service reacts to an event from the Order Service → Your basic event flow is working.
  2. You can define and version your event schemas (e.g., using JSON Schema) → You are thinking about contracts between services.
  3. If the Payment Service is down, messages queue up and are processed when it restarts → You’ve built a resilient system.
  4. The Payment Service does not double-charge a customer if it receives a duplicate event → You have successfully implemented an idempotent consumer.

Project 7: Cloud-Native Pub/Sub (AWS SNS/SQS)

  • File: LEARN_PUBSUB_DEEP_DIVE.md
  • Main Programming Language: Python (using Boto3)
  • Alternative Programming Languages: Go, Node.js, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Cloud Computing / Serverless
  • Software or Tool: AWS SNS (Simple Notification Service), SQS (Simple Queue Service)
  • Main Book: “Amazon Web Services in Action” by Michael Wittig and Andreas Wittig

What you’ll build: Re-implement Project 3 (Task Queue) using managed cloud services. You’ll use AWS SNS as the topic and SQS as the queue. A publisher (e.g., a Lambda function) will publish a message to an SNS topic. That topic will be configured to forward messages to an SQS queue. A consumer (e.g., a script running on an EC2 instance or another Lambda) will poll the SQS queue for messages to process.

Why it teaches pub/sub: This project teaches you how to use pub/sub in a real-world cloud environment. You’ll move from managing broker software yourself to consuming it as a service. You’ll learn about the trade-offs and new concepts involved, like IAM permissions, push vs. pull subscriptions, and autoscaling.

Core challenges you’ll face:

  • Configuring IAM Permissions → maps to creating roles and policies that allow your code to access SNS and SQS
  • Understanding the SNS-to-SQS fan-out pattern → maps to subscribing an SQS queue to an SNS topic
  • Polling the SQS queue → maps to implementing a loop that polls for messages, as SQS is pull-based
  • Deleting messages after processing → maps to the SQS version of message acknowledgement
  • Configuring a Dead-Letter Queue (DLQ) → maps to automatically moving messages that fail processing repeatedly to a separate queue for inspection

Key Concepts:

  • Managed Services: The benefits and drawbacks of letting a cloud provider manage your infrastructure.
  • IAM (Identity and Access Management): The security model of the cloud.
  • Push vs. Pull: SNS pushes messages to subscribers. SQS requires consumers to poll (pull) for messages.
  • Dead-Letter Queue (DLQ): A critical pattern for robust messaging systems to handle poison pills (malformed or unprocessable messages).

Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: An AWS account, basic familiarity with the AWS console or CLI.

Real world outcome: A fully serverless or cloud-native message processing pipeline. You can publish a message using the AWS CLI or SDK and see it get processed by your consumer code without running a broker yourself. The system scales automatically, with no broker capacity for you to manage.

System Diagram:

┌───────────┐ 1. Publish ┌───────────┐ 2. Forward ┌───────────┐ 3. Poll & Process ┌──────────┐
│ Publisher │───────────►│ SNS Topic │───────────►│ SQS Queue │──────────────────►│ Consumer │
│ (Lambda)  │            └───────────┘            └─────┬─────┘                   └────┬─────┘
└───────────┘                                           │          4. Delete           │
                                                        └──────────────────────────────┘

Implementation Hints:

  1. Use the AWS console or CLI to create an SNS topic and an SQS queue.
  2. Subscribe the SQS queue to the SNS topic. This is a key step. You’ll also need to configure a policy to allow SNS to send messages to your queue.
  3. Set up a Dead-Letter Queue (DLQ) for your main SQS queue. This is just another SQS queue. Configure the main queue’s “redrive policy”.
  4. Publisher: Use the AWS SDK (Boto3 for Python) to get an SNS client. The call is simply sns_client.publish(TopicArn=..., Message=...).
  5. Consumer: Use the SDK to get an SQS client. Your main loop will call sqs_client.receive_message(QueueUrl=..., WaitTimeSeconds=20). With WaitTimeSeconds set, the call “long-polls,” waiting up to that long for messages to arrive.
  6. Once you’ve processed a message from SQS, you must explicitly delete it using sqs_client.delete_message(QueueUrl=..., ReceiptHandle=...). The ReceiptHandle is a temporary token for the message you just received. This is the SQS equivalent of an ACK.

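A condensed sketch of the consumer loop (hints 5 and 6), using Boto3. The queue URL is a placeholder, and credentials/region are assumed to come from your AWS configuration. Note that unless "raw message delivery" is enabled on the SNS subscription, the SQS body is a JSON envelope whose Message field holds your payload.

# consumer.py (sketch): long-poll SQS, delete only after successful processing.
import boto3   # pip install boto3

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-task-queue"   # placeholder
sqs = boto3.client("sqs")

while True:
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,        # long polling: the call waits up to 20s for messages
    )
    for message in response.get("Messages", []):
        print(f"Processing: {message['Body']}")
        # Deleting the message is the SQS equivalent of an ACK.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
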
Learning milestones:

  1. A message published to the SNS topic appears in the SQS queue → Your topic-queue subscription is configured correctly.
  2. Your consumer script can poll and receive a message from the SQS queue → Your consumer’s IAM permissions and logic are correct.
  3. After processing, the message is deleted from the queue and doesn’t reappear → You’ve mastered the SQS acknowledgement (delete) lifecycle.
  4. A message that your consumer fails to process is automatically moved to the DLQ after a few tries → You’ve built a robust, production-ready message processing system.

Summary

Each project is listed with its difficulty, time estimate, and key technology.

  • In-Memory Pub/Sub Hub: Beginner, Weekend, Python/Go (Core Language)
  • Chat Application with Redis Pub/Sub: Intermediate, Weekend, Redis
  • Persistent Task Queue with RabbitMQ: Advanced, 1-2 weeks, RabbitMQ
  • Real-Time Notification System: Advanced, 2-3 weeks, RabbitMQ + WebSockets
  • Website Clickstream Analysis with Kafka: Expert, 2-3 weeks, Apache Kafka
  • Event-Driven Microservices Communication: Expert, 2-4 weeks, RabbitMQ/Kafka + Docker
  • Cloud-Native Pub/Sub (AWS SNS/SQS): Advanced, 1-2 weeks, AWS SNS/SQS