Project 9: MQTT Sensor Node (Publish + Subscribe)

Build an MQTT-enabled sensor node that publishes data reliably and responds to commands.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1–2 weekends |
| Main Programming Language | Python |
| Alternative Programming Languages | Go, Rust, Node.js |
| Coolness Level | High |
| Business Potential | High |
| Prerequisites | Networking basics, P03 sensor loop |
| Key Topics | MQTT protocol, QoS, reconnect strategies, payload schemas |

1. Learning Objectives

By completing this project, you will:

  1. Publish sensor data to an MQTT broker with QoS control.
  2. Subscribe to command topics and change device behavior.
  3. Implement reconnect logic and offline buffering.
  4. Design a versioned payload schema.

2. All Theory Needed (Per-Concept Breakdown)

Concept 1: MQTT Protocol Semantics, QoS, and Offline Behavior

Fundamentals

MQTT is a lightweight publish/subscribe protocol designed for unreliable networks and constrained devices. Clients publish messages to topics, and subscribers receive messages for topics they are interested in. Quality of Service (QoS) levels determine delivery guarantees: QoS 0 is “at most once,” QoS 1 is “at least once,” and QoS 2 is “exactly once.” In real deployments, networks drop connections, so devices must reconnect and handle offline periods. Understanding MQTT session state, QoS, and buffering is essential for a reliable sensor node.

Deep Dive into the concept

The MQTT broker is the central message router. Clients connect over TCP, optionally authenticate with a username/password or client certificates, and then publish or subscribe to topics. The topic namespace is hierarchical (e.g., sensors/pi-zero2w/temp). When a message is published, the broker forwards it to every client subscribed to a matching topic. QoS levels control delivery semantics, and they apply per hop: once between publisher and broker, and again between broker and each subscriber. QoS 0 provides no acknowledgment; if the message is lost, it is lost. QoS 1 requires an acknowledgment (PUBACK), and the sender retransmits until it receives one. A lost PUBACK therefore produces a duplicate, so the receiver must process messages idempotently. QoS 2 uses a four-part handshake (PUBLISH, PUBREC, PUBREL, PUBCOMP) to guarantee exactly-once delivery, but it costs extra round trips and is rarely used on constrained devices.
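To make the QoS 1 duplicate case concrete, here is a toy simulation of a sender that retransmits until it sees a PUBACK. It is a sketch of the behavior only, not of any client library: the function name and the `lost_acks` parameter are invented for illustration, and the model assumes the PUBLISH itself always arrives while only the acknowledgment can be lost.

```python
def simulate_qos1(message_ids, lost_acks):
    """Toy model of QoS 1: resend each message until a PUBACK arrives.

    lost_acks is the set of message ids whose first PUBACK is dropped.
    Returns every (message_id, dup_flag) pair the receiver sees.
    """
    received = []
    for mid in message_ids:
        attempt = 0
        while True:
            dup = attempt > 0            # retransmissions carry the DUP flag
            received.append((mid, dup))  # in this model the PUBLISH always arrives
            ack_arrives = not (mid in lost_acks and attempt == 0)
            if ack_arrives:
                break                    # sender may now discard the message
            attempt += 1
    return received


print(simulate_qos1([1, 2, 3], lost_acks={2}))
# [(1, False), (2, False), (2, True), (3, False)]
```

Message 2 reaches the receiver twice even though nothing was lost in the forward direction, which is why the payload needs its own duplicate detection.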

Offline behavior is critical. When a device disconnects, it may keep collecting sensor data. If you drop data, you lose history; if you buffer too much, you run out of memory. MQTT supports persistent sessions and retained messages. A persistent session (Clean Session set to 0 in MQTT 3.1.1) lets the broker queue QoS 1/2 messages for a subscriber while that subscriber is offline. It does nothing for telemetry the device could not send in the first place. For outgoing data, you must implement local buffering (e.g., a queue on disk) and publish when connectivity returns. This queue should be bounded and have an explicit overflow policy: drop oldest, compress, or summarize.
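A minimal sketch of the drop-oldest policy, assuming an in-memory buffer is enough for the outage lengths you care about. The class name and the `dropped` counter are illustrative choices, not part of the project specification.

```python
from collections import deque


class BoundedBuffer:
    """In-memory FIFO that discards the oldest reading when full."""

    def __init__(self, max_items):
        self._items = deque(maxlen=max_items)
        self.dropped = 0  # how many readings were evicted so far

    def push(self, reading):
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # the deque evicts the oldest item on append
        self._items.append(reading)

    def pop_oldest(self):
        """Return the oldest reading, or None when the buffer is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self):
        return len(self._items)
```

Counting evictions makes the policy visible in logs: a non-zero `dropped` value after an outage tells you the bound was too small for that outage.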

Reconnection strategies must avoid “message storms.” If the device reconnects after hours offline and tries to publish thousands of messages at once, the broker may be overloaded. Instead, publish in batches with rate limiting. You should also include a sequence number or timestamp in payloads so that subscribers can detect duplicates or out-of-order events. This is especially important for QoS 1, which can deliver duplicates.
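On the subscriber side, duplicate detection can be as small as remembering the last sequence number per device. The sketch below assumes each payload carries a `seq` field that increases by at least one per reading; that field is not in the schema shown in this project and would have to be added. It also treats late, out-of-order messages as stale and drops them, which may or may not suit your use case.

```python
class Deduplicator:
    """Remembers the highest sequence number seen per device."""

    def __init__(self):
        self._last_seq = {}

    def is_new(self, device_id, seq):
        """Return True only if seq is higher than anything seen for this device."""
        last = self._last_seq.get(device_id)
        if last is not None and seq <= last:
            return False  # duplicate, or an older message arriving late
        self._last_seq[device_id] = seq
        return True
```

If the device resets its counter on reboot, this check would reject everything until the counter catches up, so a real deployment would also need a boot identifier or a timestamp.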

Payload schemas matter. A sensor reading is more than a number; it includes units, timestamps, device ID, and schema version. If you change the payload format later, subscribers need a way to interpret both versions. A simple approach is to include a schema_version field in each payload and maintain backward compatibility. Because MQTT payloads are raw bytes, JSON is a common choice for readability, but CBOR or protobuf are more efficient. For this project, JSON is sufficient.
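One way a subscriber might stay compatible with several schema versions is to normalize every payload into a single internal shape. In this sketch, version 1 matches the payload used in this project; the version 2 layout (a nested `temperature` object) is entirely hypothetical and exists only to show the branching.

```python
import json


def decode_reading(raw):
    """Normalize a payload of any known schema version into one dict shape."""
    msg = json.loads(raw)  # accepts str or UTF-8 bytes
    version = msg.get("schema_version")
    if version == 1:
        temp_c = msg["temp_c"]
    elif version == 2:
        # Hypothetical v2 layout, invented here for illustration only.
        temp_c = msg["temperature"]["value"]
    else:
        raise ValueError(f"unsupported schema_version: {version!r}")
    return {"device_id": msg["device_id"], "temp_c": temp_c, "ts": msg["ts"]}
```

Rejecting unknown versions loudly is a deliberate choice here: silently guessing at a newer layout tends to produce wrong data that is harder to notice than an error.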

Security: MQTT brokers are often reachable by every host on the local network, and sometimes from the internet. At minimum, require username/password authentication and restrict which topics each client may publish and subscribe to. TLS is recommended for production but optional for a local lab; without it, credentials and payloads cross the network in cleartext. For headless devices, storing credentials securely is a challenge; environment variables or config files readable only by the service user are acceptable for this project.
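A small sketch of both credential options mentioned above. The variable names `MQTT_USERNAME` and `MQTT_PASSWORD` are assumptions, not a standard, and the permission check assumes a POSIX file system.

```python
import os
import stat


def load_credentials(env=os.environ):
    """Read broker credentials from environment variables.

    MQTT_USERNAME and MQTT_PASSWORD are assumed names, not a standard.
    """
    try:
        return env["MQTT_USERNAME"], env["MQTT_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"missing credential variable: {missing}") from None


def is_owner_only(path):
    """Return True if group and others have no permissions on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A node could refuse to start when `is_owner_only` returns False for its config file, which catches the common mistake of leaving a password file world-readable.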

How this fits into the project

This concept is used in §3 (requirements), §4 (architecture), and §5.10 (implementation). It is foundational for the capstone.

Definitions & key terms

  • Broker: MQTT server that routes messages.
  • Topic: Hierarchical channel name.
  • QoS: Delivery guarantee level.
  • Retained message: Last message stored by broker for new subscribers.
  • Session: Connection state managed by broker.

Mental model diagram (ASCII)

Sensor -> MQTT Publish -> Broker -> Subscribers
Command <- MQTT Subscribe <- Broker <- Controller

How it works (step-by-step, with invariants and failure modes)

  1. Connect to broker with client ID.
  2. Subscribe to command topics.
  3. Publish sensor data on interval.
  4. On disconnect, buffer data locally.
  5. Reconnect and flush buffer with rate limit.

Failure modes:

  • Broker unreachable -> buffer grows.
  • QoS 1 duplicates -> need idempotency.
  • Credential failure -> connection refused.
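When the broker is unreachable, retrying in a tight loop wastes power and can hammer the broker once it returns. A common remedy is exponential backoff with jitter; the sketch below uses the "full jitter" variant. The function name and default values are illustrative assumptions.

```python
import random


def backoff_delays(base=1.0, cap=60.0, attempts=8, rng=random):
    """Yield reconnect delays in seconds: exponential growth with full jitter."""
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))  # 1, 2, 4, ... capped at cap
        yield rng.uniform(0, ceiling)              # random point below the ceiling
```

A reconnect loop would sleep for each yielded delay between connection attempts. The jitter matters for fleets: without it, devices that lost the broker at the same moment would all retry at the same moment too.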

Minimal concrete example

client.publish("sensors/pi/temp", payload=json.dumps(data), qos=1)

Common misconceptions

  • “QoS 1 means no duplicates.” QoS 1 is at-least-once: a lost PUBACK makes the sender retransmit, so the receiver can see the same message twice.
  • “Retained messages store all history.” Only the last retained message is stored.

Check-your-understanding questions

  1. When should you use QoS 1 instead of QoS 0?
  2. Why is local buffering needed for outgoing messages?
  3. What does a retained message do?

Check-your-understanding answers

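  1. When a lost reading matters and your subscribers can tolerate or filter duplicates.
  2. The broker queues messages only for offline subscribers with a persistent session. A publisher that cannot reach the broker has nowhere to send its readings, so it must hold them locally.
  3. The broker stores the last retained message per topic and delivers it immediately to each new subscriber.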

Real-world applications

  • IoT telemetry, industrial monitoring, smart home devices.

Where you’ll apply it

  • This project: §3.2, §5.10.
  • Other projects: Project 17.

References

  • MQTT 3.1.1 specification
  • “Designing Connected Products” — MQTT basics

Key insights

Reliability comes from explicit buffering and idempotent payloads, not just QoS.

Summary

MQTT is lightweight but requires careful design around QoS, buffering, and reconnects.

Homework/Exercises to practice the concept

  1. Test QoS 0 vs QoS 1 with network drops.
  2. Implement a simple disk-backed queue and replay it.
  3. Design a JSON payload with schema versioning.

Solutions to the homework/exercises

  1. QoS 1 retries after reconnect; QoS 0 does not.
  2. Use a simple append-only file with max length.
  3. Include schema_version and device_id fields.
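For the second exercise, one possible shape of a disk-backed queue is a JSON Lines file that is appended to on every reading and trimmed when it exceeds a cap. This is a sketch under simplifying assumptions: a single process, `max_items` of at least 1, and a queue small enough that re-reading the file on each call is acceptable. The class and method names are invented here.

```python
import json
import os


class DiskQueue:
    """JSON Lines file holding at most max_items readings, oldest first."""

    def __init__(self, path, max_items=1000):
        self.path = path
        self.max_items = max_items

    def _read_all(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def _write_all(self, items):
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            for item in items:
                f.write(json.dumps(item) + "\n")
        os.replace(tmp, self.path)  # swap in one step so a crash mid-write keeps the old file

    def append(self, reading):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(reading) + "\n")
        items = self._read_all()
        if len(items) > self.max_items:
            self._write_all(items[-self.max_items:])  # drop the oldest entries

    def drain(self, count):
        """Remove and return up to count of the oldest readings."""
        items = self._read_all()
        batch, rest = items[:count], items[count:]
        self._write_all(rest)
        return batch
```

Note that `drain` removes entries before they are published. A more careful version would remove them only after the broker acknowledges the publish, so that a crash between the two steps cannot lose a batch.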

3. Project Specification

3.1 What You Will Build

A sensor node that publishes readings to an MQTT broker and listens for control commands.

3.2 Functional Requirements

  1. Publish sensor data to sensors/{device_id}/temperature.
  2. Subscribe to devices/{device_id}/command.
  3. Implement reconnect with offline buffering.
  4. Provide a status command that changes device behavior.

3.3 Non-Functional Requirements

  • Performance: Publish cycle < 500 ms.
  • Reliability: No data loss for <= 10 minutes offline.
  • Usability: Clear logs and error messages.

3.4 Example Usage / Output

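$ mosquitto_sub -h 192.168.1.10 -t sensors/pi-zero2w/temperature
{"schema_version":1,"device_id":"pi-zero2w","temp_c":22.4,"ts":"2026-01-01T11:00:00Z"}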

3.5 Data Formats / Schemas / Protocols

Payload JSON:

{"schema_version":1,"device_id":"pi-zero2w","temp_c":22.4,"ts":"2026-01-01T11:00:00Z"}
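A minimal sketch of a function that produces this payload. The function name and the optional `now` parameter are assumptions; passing a fixed timestamp is one way to get deterministic output in tests.

```python
import json
from datetime import datetime, timezone


def build_payload(device_id, temp_c, now=None):
    """Serialize one reading in the schema_version 1 layout."""
    now = now or datetime.now(timezone.utc)
    reading = {
        "schema_version": 1,
        "device_id": device_id,
        "temp_c": float(temp_c),
        # Always emit UTC with a trailing Z, at one-second resolution.
        "ts": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return json.dumps(reading, separators=(",", ":"))


fixed = datetime(2026, 1, 1, 11, 0, 0, tzinfo=timezone.utc)
print(build_payload("pi-zero2w", 22.4, now=fixed))
# {"schema_version":1,"device_id":"pi-zero2w","temp_c":22.4,"ts":"2026-01-01T11:00:00Z"}
```

The compact `separators` setting removes the spaces `json.dumps` inserts by default, which keeps the payload a few bytes smaller on the wire.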

3.6 Edge Cases

  • Broker unavailable.
  • Duplicate messages with QoS 1.
  • Invalid command payload.
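The invalid-command case is the one most likely to crash a naive handler, since anyone who can publish to the command topic controls the input. Below is a sketch of a defensive parser. The command names (`status`, `set_interval`), the `cmd` and `seconds` fields, and the 1 to 3600 range are all hypothetical; the project leaves the command format to you.

```python
import json

# Hypothetical command set; the project leaves the actual commands to you.
ALLOWED_COMMANDS = {"status", "set_interval"}


def parse_command(raw):
    """Return a validated command dict, or raise ValueError with a reason."""
    try:
        msg = json.loads(raw)
    except ValueError as exc:  # covers malformed JSON and undecodable bytes
        raise ValueError(f"not valid JSON: {exc}") from None
    if not isinstance(msg, dict):
        raise ValueError("command must be a JSON object")
    cmd = msg.get("cmd")
    if not isinstance(cmd, str) or cmd not in ALLOWED_COMMANDS:
        raise ValueError(f"unknown command: {cmd!r}")
    if cmd == "set_interval":
        seconds = msg.get("seconds")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(seconds, bool) or not isinstance(seconds, int) or not 1 <= seconds <= 3600:
            raise ValueError("seconds must be an integer from 1 to 3600")
    return msg
```

In the message callback, a `ValueError` from this function would be logged and the message ignored, so one bad payload cannot stop the sensor loop.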

3.7 Real World Outcome

Sensor data reaches subscribers consistently, and commands change device behavior.

3.7.1 How to Run (Copy/Paste)

python3 mqtt_node.py --broker 192.168.1.10 --device pi-zero2w

3.7.2 Golden Path Demo (Deterministic)

export FIXED_TIME="2026-01-01T11:00:00Z"
python3 mqtt_node.py --simulate --publish "temp=22.4"

Expected output:

[2026-01-01T11:00:00Z] Published QoS1 to sensors/pi-zero2w/temperature

3.7.3 Failure Demo (Deterministic)

python3 mqtt_node.py --broker 192.0.2.1

Expected output:

[ERROR] Broker unreachable

Exit code: 91

3.7.4 CLI Exit Codes

  • 0: Success
  • 90: Auth failure
  • 91: Broker unreachable
  • 92: Publish buffer full
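Keeping the exit codes in one place avoids magic numbers scattered through the script. The enum values below come from the list above; the exception classes and the mapping from exceptions to codes are assumptions made for this sketch.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    AUTH_FAILURE = 90
    BROKER_UNREACHABLE = 91
    PUBLISH_BUFFER_FULL = 92


class AuthError(Exception):
    """Hypothetical error raised when the broker rejects credentials."""


class BufferFullError(Exception):
    """Hypothetical error raised when the offline buffer cannot accept more data."""


def exit_code_for(error):
    """Map an exception (or None) to a documented exit code."""
    if error is None:
        return ExitCode.SUCCESS
    if isinstance(error, AuthError):
        return ExitCode.AUTH_FAILURE
    if isinstance(error, BufferFullError):
        return ExitCode.PUBLISH_BUFFER_FULL
    if isinstance(error, OSError):  # includes ConnectionRefusedError and TimeoutError
        return ExitCode.BROKER_UNREACHABLE
    raise error  # unknown errors should surface, not hide behind a code
```

A `main()` function could then end with `sys.exit(exit_code_for(err))`, since an `IntEnum` member is accepted wherever an integer is.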

4. Solution Architecture

4.1 High-Level Design

Sensor Loop -> MQTT Publisher -> Broker -> Subscribers
               ^
               |
        MQTT Subscriber <- Commands

4.2 Key Components

| Component | Responsibility | Key Decisions |
|---|---|---|
| Publisher | Send telemetry | QoS level |
| Subscriber | Receive commands | Topic naming |
| Buffer | Store offline data | Memory vs disk |
| Schema | Encode payload | JSON vs binary |

4.3 Data Structures (No Full Code)

from collections import deque
queue = deque(maxlen=1000)  # bounded: appending to a full deque drops the oldest reading

4.4 Algorithm Overview

Key Algorithm: Buffered Publish

  1. Add reading to queue.
  2. If connected, publish with QoS.
  3. On reconnect, flush with rate limit.

Complexity Analysis:

  • Time: O(n) for flush, O(1) for single publish
  • Space: O(n) queue size
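The three steps can be sketched without tying the code to a particular MQTT library. Here `send` is any callable that returns True when a reading was accepted for delivery; with paho-mqtt it could wrap `client.publish(...)` and check the result, but that wiring is left out. The class name, the batch size, and the pause length are assumptions, and `batch_size` is expected to be at least 1.

```python
import time
from collections import deque


class BufferedPublisher:
    """Queue every reading, then send as many as the link allows."""

    def __init__(self, send, max_items=1000, batch_size=20, pause_s=1.0, sleep=time.sleep):
        self._send = send              # callable(reading) -> bool, True when accepted
        self._queue = deque(maxlen=max_items)
        self._batch_size = batch_size
        self._pause_s = pause_s
        self._sleep = sleep            # injectable so tests do not really wait

    def submit(self, reading, connected):
        """Steps 1 and 2: enqueue, then send one batch if online."""
        self._queue.append(reading)
        if connected:
            self._send_batch()

    def flush(self):
        """Step 3: after reconnect, drain in batches with a pause between them."""
        while self._queue:
            if not self._send_batch():
                break                  # link dropped again; keep the rest queued
            if self._queue:
                self._sleep(self._pause_s)

    def _send_batch(self):
        """Send up to batch_size readings; return False on the first failure."""
        for _ in range(min(self._batch_size, len(self._queue))):
            if not self._send(self._queue[0]):
                return False
            self._queue.popleft()      # remove only after a successful send
        return True

    def pending(self):
        return len(self._queue)
```

Because a reading leaves the queue only after `send` succeeds, a failure in the middle of a flush keeps the unsent readings in their original order for the next attempt.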

5. Implementation Guide

5.1 Development Environment Setup

pip install paho-mqtt

5.2 Project Structure

project-root/
├── mqtt_node.py
├── buffer.py
└── README.md

5.3 The Core Question You’re Answering

“How does a small device communicate reliably over unreliable networks?”

5.4 Concepts You Must Understand First

  1. MQTT QoS levels and duplicates.
  2. Topic hierarchy and naming.
  3. Offline buffering strategies.

5.5 Questions to Guide Your Design

  1. How will you avoid message storms after reconnect?
  2. How will you validate command payloads?

5.6 Thinking Exercise

Design a topic tree for a fleet of 100 devices.
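As a starting point for the exercise, the helpers below build topics in the two layouts this project already uses and check them against MQTT wildcard filters (`+` for one level, `#` for the remainder). The function names and the example device IDs are illustrative, and the matcher is a simplified sketch that ignores the special handling of topics beginning with `$`.

```python
def telemetry_topic(device_id, metric="temperature"):
    return f"sensors/{device_id}/{metric}"


def command_topic(device_id):
    return f"devices/{device_id}/command"


def topic_matches(topic_filter, topic):
    """Return True if topic matches an MQTT filter using + and # wildcards."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, f in enumerate(f_levels):
        if f == "#":
            return True        # multi-level wildcard matches everything that remains
        if i >= len(t_levels):
            return False       # filter has more levels than the topic
        if f != "+" and f != t_levels[i]:
            return False       # literal level differs
    return len(f_levels) == len(t_levels)
```

With this layout, a dashboard could subscribe to `sensors/+/temperature` to see the whole fleet, while each device subscribes only to its own `devices/{device_id}/command` topic.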

5.7 The Interview Questions They’ll Ask

  1. What is QoS 1 and why can it duplicate messages?
  2. How do you secure MQTT traffic?
  3. Why is MQTT good for constrained devices?

5.8 Hints in Layers

Hint 1: Publish one sensor value every minute.

Hint 2: Add a command subscription.

Hint 3: Implement buffering and replay on reconnect.

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| MQTT | Designing Connected Products | Ch. 4 |
| Networking | Computer Networks | Ch. 3 |

5.10 Implementation Phases

Phase 1: Basic publish (3 hours)

  • Connect and publish one value.

Phase 2: Subscribe (3 hours)

  • Add command handling.

Phase 3: Offline buffering (4 hours)

  • Implement queue and replay.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| QoS | 0 / 1 | 1 | Reliability with manageable duplicates |
| Buffer | Memory / Disk | Disk for >10 min | Avoid data loss |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Payload schema | JSON validation |
| Integration Tests | Broker publish | Mosquitto local broker |
| Edge Case Tests | Network drop | Disconnect simulation |

6.2 Critical Test Cases

  1. Publish succeeds with QoS 1.
  2. Duplicate messages are handled gracefully.
  3. Offline buffer replays after reconnect.

6.3 Test Data

{"temp_c":22.4,"ts":"2026-01-01T11:00:00Z"}

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---|---|---|
| No buffering | Data lost | Add queue |
| Too aggressive replay | Broker overload | Rate limit |
| Invalid JSON | Subscriber errors | Schema validation |

7.2 Debugging Strategies

  • Use mosquitto_sub to inspect messages.
  • Log publish retries and reconnects.

7.3 Performance Traps

  • Publishing too frequently saturates Wi-Fi.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a retained “last seen” status message.

8.2 Intermediate Extensions

  • Add TLS with self-signed certs.

8.3 Advanced Extensions

  • Implement protobuf payloads for efficiency.

9. Real-World Connections

9.1 Industry Applications

  • Industrial telemetry, smart home systems, fleet monitoring.
  • Mosquitto MQTT broker.

9.3 Interview Relevance

  • MQTT and IoT reliability questions are common.

10. Resources

10.1 Essential Reading

  • MQTT 3.1.1 specification.

10.2 Video Resources

  • MQTT broker setup tutorials.

10.3 Tools & Documentation

  • Mosquitto docs and CLI tools.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain QoS and why duplicates happen.
  • I can explain MQTT topic design.

11.2 Implementation

  • Device publishes reliably.
  • Commands change device behavior.

11.3 Growth

  • I can discuss offline buffering in interviews.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Publish a sensor value to a broker.

Full Completion:

  • Publish/subscribe with buffering and schema.

Excellence (Going Above & Beyond):

  • TLS + device provisioning automation.