M5Stack Cardputer: Project-Based Learning Sprint

Goal: Build deep mastery of the M5Stack Cardputer as a complete embedded product platform. You will learn how to capture real-time input without missed events, render fluid UI on a small SPI TFT, stream audio into DSP pipelines, store structured data safely on microSD, and ship battery-powered tools that behave predictably. By the end, you will be able to design and debug multi-task firmware, build USB and BLE HID devices responsibly, and create portable field tools that handle timing, power, and data integrity constraints in the real world.


Introduction: What This Guide Covers

M5Stack Cardputer is a pocket-sized embedded computer built around the ESP32-S3, combining a keyboard, TFT display, audio I/O, microSD storage, IR emitter, and USB OTG in a single handheld form factor. It is a perfect platform for learning how real embedded products behave: multiple peripherals active at once, strict timing, limited RAM, noisy inputs, and real users.

What you will build (by the end of this guide):

  • A WiFi sniffer that captures 802.11 frames and exports PCAP files for Wireshark.
  • A universal IR remote that learns and replays real-world remotes.
  • A real-time audio spectrum analyzer with FFT and smooth UI.
  • USB and BLE HID devices that act like a keyboard or mouse.
  • A mini-OS launcher that coordinates multiple apps safely.
  • A full security toolkit capstone that unifies all subsystems.

Scope (what is included):

  • ESP32-S3 concurrency, DMA, and FreeRTOS task design.
  • Keyboard matrix scanning, debouncing, and event pipelines.
  • SPI display rendering, buffering, and UI architecture.
  • microSD storage, file formats (PCAP/CSV), and durability.
  • WiFi promiscuous capture, BLE HID over GATT, USB HID via TinyUSB.
  • Audio capture (I2S) and DSP basics (FFT, windowing).
  • IR protocols and microsecond-level timing.

Out of scope (for this guide):

  • RF hardware design or custom antenna work.
  • Cellular radios (LTE/5G) and GNSS RF front-ends.
  • Full cryptographic protocol design (we focus on applied security boundaries).

The Big Picture (Mental Model)

       [INPUTS]                  [CAPTURE]                    [PROCESS]                  [OUTPUT]
   +---------------+         +---------------+            +---------------+         +---------------+
   | Keyboard      |  ---->  | Scan/Debounce |  ----->    | Events/State  |  ---->  | UI + Actions  |
   +---------------+         +---------------+            +---------------+         +---------------+
   | Mic (I2S)     |  ---->  | DMA Buffers   |  ----->    | FFT/Bands     |  ---->  | Spectrum UI   |
   +---------------+         +---------------+            +---------------+         +---------------+
   | WiFi/BLE/USB  |  ---->  | Driver Cbs    |  ----->    | Parse/Filter  |  ---->  | Logs/HID      |
   +---------------+         +---------------+            +---------------+         +---------------+
   | IR Receiver*  |  ---->  | Timer Capture |  ----->    | Decode/Store  |  ---->  | Replay        |
   +---------------+         +---------------+            +---------------+         +---------------+

   * External IR receiver via Grove port.

Key Terms You Will See Everywhere

  • DMA: Direct Memory Access. Hardware-assisted data movement that bypasses CPU copies.
  • Ring buffer: A circular queue used for high-rate data without allocation churn.
  • GATT: Generic Attribute Profile. The BLE layer, built on the Attribute Protocol (ATT), that defines services and characteristics.
  • HID: Human Interface Device class (keyboards, mice) over USB or BLE.
  • PCAP: Packet capture file format used by Wireshark and tcpdump.

Cardputer Hardware Snapshot (from official M5Stack docs)

Base Cardputer (K132):

  • SoC: ESP32-S3FN8 dual-core Xtensa LX7 (up to 240 MHz)
  • Wireless: 2.4 GHz Wi-Fi
  • Flash: 8 MB
  • Display: ST7789V2, 1.14 inch, 240 x 135
  • Keyboard: 56 keys (4 x 14 matrix)
  • Buttons: Reset + User button
  • USB: USB-C with USB-OTG + USB Serial/JTAG
  • Audio: MEMS mic (SPM1423) + 8 ohm 1 W I2S speaker (NS4168)
  • Storage: microSD slot
  • Expansion: HY2.0-4P (Grove-compatible) port
  • IR: Onboard IR emitter
  • Battery: 120 mAh internal + 1400 mAh base
  • IR emitter distance: 410 cm (0 deg), 170 cm (45 deg), 66 cm (90 deg)
  • Power: Sleep current 0.26 uA @ 4.2 V; working current 165.7 mA (keyboard mode), 255.6 mA (IR mode)

Cardputer-Adv (K132-Adv) highlights (if you have the advanced version):

  • Audio: ES8311 audio codec + NS4150B amplifier, 3.5 mm audio output
  • Sensors: BMI270 6-axis IMU
  • Expansion: EXT 2.54 mm 14P expansion + HY2.0-4P port
  • Battery: 1750 mAh (base)
  • Key actuation: ~160 gf

How to Use This Guide

  1. Read the Theory Primer first. It is the textbook for all projects.
  2. Pick a project aligned with your goals and build the smallest working slice.
  3. Only add complexity after you can verify the basic path works.
  4. Keep a lab notebook: log bugs, timing values, and fixes.
  5. Do not move to the next project until you meet the Definition of Done.

Prerequisites & Background Knowledge

Essential Prerequisites (Must Have)

Programming Skills:

  • Solid C/C++ fundamentals (pointers, structs, bitwise operations).
  • Comfort reading datasheets or peripheral APIs.
  • Ability to debug with serial logs and reason about state machines.

Embedded Fundamentals:

  • SPI and I2S basics (clocked serial buses).
  • GPIO configuration (input/output/pull-ups).
  • Interrupt vs polling tradeoffs.

Recommended Reading (from your library):

  • Making Embedded Systems by Elecia White - Ch. 1-3 (constraints, timing, and design tradeoffs).
  • Effective C, 2nd Edition by Robert C. Seacord - Ch. 7-9 (defensive coding, memory safety).

Helpful But Not Required

FreeRTOS task design:

  • You can learn this while building Projects 1, 3, and 6.

Networking basics:

  • 802.11 frames, MAC addresses, channel hopping.

DSP fundamentals:

  • FFT concepts and windowing (learn in Project 3).

Self-Assessment Questions

  1. Can you explain why DMA is critical for real-time audio or WiFi capture?
  2. Can you implement a ring buffer in C without bugs?
  3. Do you know how to avoid blocking inside interrupt or driver callbacks?
  4. Can you read a binary header and handle endianness correctly?
  5. Can you design a finite-state machine for input events?

If you answered “no” to questions 1-3, spend 1-2 weeks reading the recommended books above and building tiny experiments before starting.

Development Environment Setup

Required Tools:

  • ESP-IDF v5.1+ (recommended for ESP32-S3) or Arduino IDE with ESP32-S3 core.
  • Python 3.10+ (ESP-IDF tooling), CMake, and Ninja (bundled by ESP-IDF installer).
  • xtensa-esp32s3-elf toolchain (installed by ESP-IDF).
  • USB-C data cable + serial monitor (idf.py monitor or minicom).
  • microSD card (Projects 1, 4, 5, 6, 8).
  • Git (to version firmware and keep stable checkpoints).

Recommended Tools:

  • Logic analyzer (for IR and timing verification).
  • Wireshark or tshark (for PCAP validation).
  • A known-good USB host (laptop) for USB HID testing.
  • GPS module (UART) for wardriving project.

Testing Your Setup:

# Confirm ESP-IDF is installed
$ idf.py --version
ESP-IDF v5.x

# Set the target and build a sample (one-time per repo)
$ idf.py set-target esp32s3
$ idf.py build

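# Build and flash the hello-world example, then open the serial monitor
# (the port name varies; the S3's USB Serial/JTAG usually appears as /dev/ttyACM0 on Linux)
$ idf.py -p /dev/ttyUSB0 flash monitor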
I (301) hello_world: Hello world!

Time Investment

  • Intermediate projects (2, 7): 1-2 weeks each
  • Advanced projects (1, 3, 4, 5): 2-3 weeks each
  • Expert project (6): 1 month+
  • Capstone (8): 2-3 months

Important Reality Check

These projects are deliberately hard. Expect UI flicker, race conditions, data corruption, and timing bugs. The goal is not perfection on the first pass. The learning happens in layers:

  1. First pass: Get a minimal version working.
  2. Second pass: Understand why it works.
  3. Third pass: Fix edge cases and stability.
  4. Fourth pass: Turn it into a portable, reliable tool.

Big Picture / Mental Model

                +----------------------+         +----------------------+
                |   High-Rate Inputs   |         |  Low-Rate Inputs     |
                |  (WiFi, I2S, USB)    |         |  (Keys, IR, Buttons) |
                +----------+-----------+         +----------+-----------+
                           |                                |
                           v                                v
                    +-------------+                  +-------------+
                    | DMA/Ring    |                  | Debounce    |
                    | Buffers     |                  | State       |
                    +------+------+                  +------+------+
                           |                                |
                           v                                v
                    +-------------+                  +-------------+
                    | Parse/      |                  | Event Bus   |
                    | Transform   |                  | + Actions   |
                    +------+------+                  +------+------+
                           |                                |
                           +---------------+----------------+
                                           v
                                   +---------------+
                                   | UI Renderer   |
                                   | + Storage     |
                                   +---------------+

Theory Primer (Read This Before Coding)

This section is your mini-book. Each chapter explains the core concept and includes diagrams, examples, misconceptions, exercises, and solutions.


Chapter 1: ESP32-S3 Execution Model and FreeRTOS Concurrency

Fundamentals

The ESP32-S3 is a dual-core MCU designed for real-time workloads, which means it can handle multiple tasks at once without an operating system like Linux. In practice, you usually run FreeRTOS tasks across two cores: one core may be dedicated to time-critical capture (WiFi, I2S, USB) while the other handles UI rendering and user input. The key idea is to isolate high-rate data ingestion from slow or bursty work like screen drawing or file writes. A task is a schedulable unit that can block on queues or timers, while interrupts should do the minimum possible work and defer heavy processing to a task. Understanding how to avoid blocking critical callbacks is the difference between a stable tool and a device that drops events under load. The ESP32-S3 also uses DMA engines to move data without CPU involvement, which reduces jitter and keeps your capture stable even when the UI is busy.

FreeRTOS is not a luxury on this device; it is the operating environment. Tasks, queues, and timers are the tools that let you turn noisy, asynchronous hardware into deterministic software behavior. A good rule is to treat interrupts as mere signalers and tasks as the only place where heavy work happens. This allows you to reason about timing, reproduce bugs, and scale your firmware as features grow. When you later combine WiFi capture, UI rendering, and audio processing, the only way to keep everything stable is to have a clear concurrency model that you can explain and test.

Deep Dive

On the ESP32-S3, timing is everything. The WiFi driver, BLE stack, USB device stack, and I2S audio engine all rely on predictable CPU availability. FreeRTOS provides preemptive scheduling: a higher-priority task that becomes ready preempts a lower-priority one, and a task otherwise runs until it blocks, yields, or is time-sliced against a task of equal priority. When you enable WiFi promiscuous mode, your packet callback is executed in the WiFi driver task context, not in your application task. If you parse frames or write to the SD card inside that callback, you will starve the driver and drop packets. The frame buffer handed to the callback belongs to the driver and should not be assumed valid after the callback returns, so the correct pattern is to copy the frame (or just the header bytes you need) into a pre-allocated ring buffer and return immediately, then let a separate parsing task consume the data. The same logic applies to I2S audio: the DMA engine triggers a buffer-ready interrupt, the ISR queues the buffer, and a DSP task processes it later. This separation builds the habit of designing pipelines where capture, processing, and output are decoupled.

Memory and allocation strategy matter. Dynamic allocation inside hot paths creates fragmentation and latency spikes. Instead, pre-allocate buffers and reuse them. Use FreeRTOS queues or ring buffers to pass fixed-size buffer descriptors between tasks. You can pin tasks to specific cores to reduce contention: for example, pin WiFi capture to one core and UI rendering to the other. This reduces cache thrash and makes timing more predictable. Use event groups for state transitions (e.g., SD card mounted, WiFi initialized) and keep a watchdog to reset hung subsystems.
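The pre-allocate-and-reuse idea can be sketched as a small fixed buffer pool. This is a minimal illustration under stated assumptions, not ESP-IDF code: the names `BufPool`, `pool_acquire`, and `pool_release` and the sizes are hypothetical, and the sketch is not thread-safe, so on the device you would guard it with a mutex or critical section.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BUFS     4    /* hypothetical: number of buffers in the pool */
#define POOL_BUF_SIZE 256  /* hypothetical: bytes per buffer */

typedef struct {
    uint8_t storage[POOL_BUFS][POOL_BUF_SIZE]; /* allocated once, never freed */
    uint8_t in_use[POOL_BUFS];                 /* 1 while a buffer is checked out */
} BufPool;

/* Hand out a free buffer, or NULL when the pool is exhausted. */
uint8_t *pool_acquire(BufPool *p) {
    for (size_t i = 0; i < POOL_BUFS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            return p->storage[i];
        }
    }
    return NULL;
}

/* Return a buffer to the pool. Returns 0 on success, -1 if the pointer
 * is not a checked-out buffer from this pool. */
int pool_release(BufPool *p, uint8_t *buf) {
    for (size_t i = 0; i < POOL_BUFS; i++) {
        if (buf == p->storage[i] && p->in_use[i]) {
            p->in_use[i] = 0;
            return 0;
        }
    }
    return -1;
}
```

A producer would call `pool_acquire()`, fill the buffer, and pass the pointer through a queue; the consumer calls `pool_release()` when done. A NULL result from `pool_acquire()` is the signal to count a drop rather than allocate more memory.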

Next, understand the difference between ISR-safe APIs and normal APIs. FreeRTOS provides special ISR versions of queue send and semaphore give (the FromISR variants), which must be used inside interrupt handlers. Callbacks that run in a driver task's context, such as the WiFi promiscuous callback, use the normal APIs but with a zero timeout so they never block. If you call a blocking function inside a callback, you can stall the driver or deadlock your system. Build a mental model of your system as a set of pipelines with backpressure: if the consumer cannot keep up, you either drop old data, throttle capture, or reduce UI updates. This mindset is central to all projects in this guide.
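One way to make the "drop old data" policy concrete is a ring buffer that overwrites its oldest entry when full and counts every drop. The sketch below is portable, single-threaded C; `RingBuf`, `rb_push`, `rb_pop`, and the capacity are hypothetical names chosen for illustration. On the device, a producer and consumer running in different contexts would need a critical section around these calls, because the drop-oldest path modifies the read index.

```c
#include <stdint.h>

#define RB_CAPACITY 8u  /* hypothetical size; must be a power of two */

typedef struct {
    uint32_t items[RB_CAPACITY];
    uint32_t head;     /* total items ever written */
    uint32_t tail;     /* total items ever read or discarded */
    uint32_t dropped;  /* items overwritten before being read */
} RingBuf;

/* Push one item. When full, discard the oldest item and count the drop. */
void rb_push(RingBuf *rb, uint32_t item) {
    if (rb->head - rb->tail == RB_CAPACITY) {
        rb->tail++;
        rb->dropped++;
    }
    rb->items[rb->head & (RB_CAPACITY - 1u)] = item;
    rb->head++;
}

/* Pop the oldest item into *out. Returns 1 on success, 0 if empty. */
int rb_pop(RingBuf *rb, uint32_t *out) {
    if (rb->head == rb->tail) {
        return 0;
    }
    *out = rb->items[rb->tail & (RB_CAPACITY - 1u)];
    rb->tail++;
    return 1;
}
```

Reporting `dropped` on a diagnostic screen or in the serial log shows how often the consumer falls behind, which is usually the first number you want when tuning task priorities.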

Another subtlety is core affinity and CPU usage measurement. The ESP32-S3 gives you two cores, but the wireless stacks already consume a meaningful portion of CPU time. If you stack too many high-priority tasks on the same core, you will see random latency spikes and missed events. Use the system task watchdog and runtime statistics to measure actual CPU use under load. Create a diagnostic screen or serial log that reports queue depths and drop counts; this is often the only way to diagnose a bottleneck in the field.

DMA adds performance but also constraints. DMA buffers usually must be allocated in internal memory and must be aligned to specific boundaries. If you allocate a buffer in the wrong memory region, DMA transfers can fail silently or cause corrupted data. That is why ESP-IDF offers DMA-capable allocation flags, such as MALLOC_CAP_DMA passed to heap_caps_malloc(). Plan your memory budget early: audio buffers, packet buffers, and framebuffers can consume RAM faster than you expect. The safest approach is to statically allocate the maximum buffer set you will need and then turn features on and off at runtime depending on the active tool.

Finally, concurrency design should include failure handling. Every task should have a clear timeout strategy. If a queue is full for too long, log an error and drop data. If a subsystem fails to initialize, report it clearly and enter a safe mode instead of continuing in a broken state. This is how real products behave: they degrade gracefully instead of failing catastrophically.

How This Fits in Projects

  • Project 1 and 5: keep WiFi callbacks short, use queues, and avoid SD writes in driver context.
  • Project 3: split I2S capture, FFT, and UI into separate tasks with clear priorities.
  • Project 4 and 7: run HID logic in a clean event loop without blocking USB/BLE tasks.
  • Project 6 and Capstone: build a stable task topology and watchdog strategy.

Definitions & Key Terms

  • Task: A schedulable unit of work managed by FreeRTOS.
  • ISR: Interrupt Service Routine; short, time-critical code path.
  • Priority inversion: A low-priority task blocks a high-priority task due to resource contention.
  • Backpressure: When a consumer cannot keep up, producers must slow down or drop data.

Mental Model Diagram

[High-rate ISR] -> [Queue/Ring Buffer] -> [Processing Task] -> [UI/Storage]
        |                 |                      |               |
   Must be fast      Fixed-size            Can block        Slow I/O

How It Works (Step-by-Step)

  1. Peripheral hardware generates an interrupt (DMA buffer ready).
  2. ISR posts a pointer/descriptor to a queue and returns immediately.
  3. A processing task wakes, consumes the buffer, and performs heavier work.
  4. The UI/storage task consumes processed results and updates screen or files.
  5. If the queue fills, drop old data or reduce the capture rate.

Minimal Concrete Example

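// ISR-safe enqueue of a buffer descriptor.
// BufferDesc, get_dma_desc(), and audio_queue are application-defined.
static volatile uint32_t audio_drops;

void IRAM_ATTR i2s_isr_handler(void *arg) {
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;
    BufferDesc desc = get_dma_desc();
    // Never blocks; fails with errQUEUE_FULL when the consumer has fallen behind.
    if (xQueueSendFromISR(audio_queue, &desc, &xHigherPriorityTaskWoken) != pdTRUE) {
        audio_drops++;
    }
    // Switch straight to the woken task if it outranks the interrupted one.
    if (xHigherPriorityTaskWoken) {
        portYIELD_FROM_ISR();
    }
}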

Common Misconceptions

  • “I can do parsing in the callback.” -> That will drop frames under load.
  • “More tasks always improves performance.” -> Too many tasks can add overhead and jitter.
  • “Dynamic allocation is fine for small buffers.” -> Fragmentation and latency grow over time.

Check-Your-Understanding Questions

  1. Why must callbacks return quickly on ESP32-S3?
  2. What is the difference between a queue and a ring buffer?
  3. When should you pin a task to a core?

Check-Your-Understanding Answers

  1. Because callbacks run in driver context; long work blocks hardware events.
  2. A FreeRTOS queue copies fixed-size items by value; a ring buffer stores raw bytes (often variable-length records) in one pre-allocated circular region.
  3. When you need predictable timing and want to reduce cross-core contention.

Real-World Applications

  • WiFi frame capture without packet loss.
  • Audio streaming without stutter.
  • Smooth UI updates while background tasks run.

Where You Will Apply It

Used heavily in Projects 1, 3, 4, 5, and 6.

References

  • ESP-IDF WiFi sniffer callback guidance (promiscuous mode). (See official ESP-IDF docs.)
  • Making Embedded Systems by Elecia White - Ch. 5-6.

Key Insight

Key insight: You are not writing one program; you are building pipelines that must keep up with real-time signals.

Summary

Concurrency on the ESP32-S3 is about keeping hot paths short and moving work to tasks that can block safely. This discipline shows up in every project.

Homework/Exercises

  1. Write a queue-based producer/consumer demo that drops data when the queue is full.
  2. Measure how long a task can block before the UI becomes sluggish.

Solutions

  1. Use a fixed-size queue and check for pdFAIL to count drops.
  2. Add timing logs around UI draw calls and test at 15/30/60 FPS.

Chapter 2: GPIO, Matrix Scanning, and Input Debouncing

Fundamentals

The Cardputer keyboard is a 4x14 matrix. Instead of 56 dedicated GPIO pins, the keyboard is wired as rows and columns. To detect key presses, you drive one row at a time and read which columns are active. This dramatically reduces pin usage but introduces challenges: ghosting (false keys when multiple are pressed), debounce (the electrical bouncing that creates false transitions), and event timing (you need to handle long presses and key repeats). The core idea is that key scanning is not a one-shot event; it is a continuous sampling loop. Your firmware must convert noisy electrical signals into clean, semantic events such as “key down”, “key up”, “long press”, and “chord”. That conversion requires well-designed state machines and timing.

Matrix scanning is a small system by itself: it is sampling, filtering, and event generation. The keyboard provides raw electrical signals, but the application needs semantic events with timing. You are effectively building a tiny input driver: it must be predictable, low-latency, and robust even when the rest of the system is busy. This chapter is foundational because the same event pipeline will later feed USB HID, BLE HID, UI navigation, and macro execution. If the keyboard pipeline is unstable, every higher-level feature will feel broken.

Deep Dive

Matrix scanning works by iterating over each row, setting it low (or high), and reading all columns to see which keys connect that row to a column. The scan rate determines how quickly you can detect keys and how much CPU time the scan loop consumes. Typical scan rates are 100-1000 Hz. Higher rates reduce latency but require more CPU and can increase noise. The debouncer is a small state machine per key: it requires a key to be stable for a minimum number of consecutive scans before changing its state. This avoids false key events caused by switch bounce or electrical noise. A common approach is to store a counter per key that increments on stable reads and resets on changes. When the counter reaches a threshold, the state changes. This is deterministic and easy to test.
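The counter-based debouncer described above could look like the following sketch. The type and function names (`KeyDebounce`, `debounce_step`) and the threshold are assumptions made for illustration; the counter here tracks consecutive scans that disagree with the current stable state, which is one common way to express the same rule.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBOUNCE_SCANS 5  /* hypothetical threshold: 5 scans at 1 kHz = 5 ms */

typedef struct {
    bool    stable;  /* debounced state reported to the rest of the system */
    uint8_t count;   /* consecutive scans that disagreed with `stable` */
} KeyDebounce;

typedef enum { KEY_NO_CHANGE, KEY_DOWN, KEY_UP } KeyEdge;

/* Feed one raw sample per scan; returns an edge when the state flips. */
KeyEdge debounce_step(KeyDebounce *k, bool raw_pressed) {
    if (raw_pressed == k->stable) {
        k->count = 0;          /* bounce or no change: restart the count */
        return KEY_NO_CHANGE;
    }
    if (++k->count < DEBOUNCE_SCANS) {
        return KEY_NO_CHANGE;  /* not stable for long enough yet */
    }
    k->stable = raw_pressed;   /* accepted: commit the new state */
    k->count = 0;
    return raw_pressed ? KEY_DOWN : KEY_UP;
}
```

With one `KeyDebounce` per key (56 in total), the scan loop calls `debounce_step()` for every key on every pass and enqueues an event only when the return value is not `KEY_NO_CHANGE`. Because the function depends only on its inputs, it can be unit-tested on a PC with recorded bounce patterns.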

Ghosting occurs when multiple keys share rows/columns and create unintended connections. Some keyboards include diodes to prevent ghosting, but many low-cost matrices do not. You can mitigate ghosting in software by detecting certain key patterns and ignoring invalid combinations, or by limiting chords to two keys. Another important aspect is key repeat and long-press detection. You need a timer-based event system: after a key is pressed, start a timer; if the key remains pressed, emit a long-press event. For repeat, emit periodic events at a configured repeat interval. In embedded systems, these time-based events should be handled in a dedicated input task to avoid jitter.
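Long-press and repeat detection can be done without blocking by comparing timestamps on each poll. The sketch below assumes a millisecond tick supplied by the caller; the names (`KeyTimer`, `key_timer_poll`) and the three thresholds are hypothetical values picked for illustration, not Cardputer defaults.

```c
#include <stdbool.h>
#include <stdint.h>

#define LONG_PRESS_MS   600u  /* hypothetical thresholds */
#define REPEAT_DELAY_MS 400u
#define REPEAT_RATE_MS   50u

enum { EV_NONE = 0, EV_LONG_PRESS = 1, EV_REPEAT = 2 };

typedef struct {
    bool     held;
    bool     long_sent;    /* long-press fires once per press */
    uint32_t press_ms;     /* time of key-down */
    uint32_t next_repeat;  /* time the next repeat is due */
} KeyTimer;

void key_timer_down(KeyTimer *t, uint32_t now_ms) {
    t->held = true;
    t->long_sent = false;
    t->press_ms = now_ms;
    t->next_repeat = now_ms + REPEAT_DELAY_MS;
}

void key_timer_up(KeyTimer *t) {
    t->held = false;
}

/* Call once per scan; returns a bitmask of EV_* flags that are due now. */
unsigned key_timer_poll(KeyTimer *t, uint32_t now_ms) {
    unsigned ev = EV_NONE;
    if (!t->held) {
        return ev;
    }
    if (!t->long_sent && (now_ms - t->press_ms) >= LONG_PRESS_MS) {
        t->long_sent = true;
        ev |= EV_LONG_PRESS;
    }
    /* Signed difference keeps the comparison correct across tick wraparound. */
    if ((int32_t)(now_ms - t->next_repeat) >= 0) {
        t->next_repeat += REPEAT_RATE_MS;
        ev |= EV_REPEAT;
    }
    return ev;
}
```

The input task would call `key_timer_down()` and `key_timer_up()` on debounced edges and `key_timer_poll()` on every scan, so no code path ever waits for a key to be released.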

Next, integrate input events into a single queue so the UI can consume them asynchronously. The UI should not scan keys directly. Instead, the input task scans and debounces, then pushes normalized events. This is the same pipeline principle from Chapter 1, applied to low-rate inputs. A clean input pipeline makes everything else simpler, including macro engines, text editors, and HID mapping in Projects 4 and 7.

At the electrical level, matrix scanning relies on pull-ups or pull-downs to establish default states. You need to configure GPIOs correctly to avoid floating inputs. A common approach is to set columns as inputs with pull-ups and drive one row low at a time. When a key is pressed, the corresponding column reads low. Scan order and timing matter; if you scan too slowly, you will miss fast taps. If you scan too fast, you waste CPU and may introduce noise or crosstalk. A stable scan loop should run in a dedicated task or timer callback with a fixed cadence.

Ghosting and masking are real. Without diodes, certain three-key combinations can appear as a fourth phantom key. You can detect ghosting by checking if multiple keys share rows and columns in a way that creates ambiguity. In software, you can either ignore ambiguous states or restrict chord usage. For HID or text input, you also need key mapping: map matrix positions to logical keys, apply modifiers (Shift, Ctrl), and support layers (Fn). A clean mapping system keeps your firmware maintainable and allows later projects to reuse it.
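A simple software check for the ambiguous case is to look for any two rows that share two or more pressed columns, since that rectangle is exactly the pattern in which three real keys and one phantom are indistinguishable from four real keys. The sketch below assumes the scan result is stored as one bitmask per row; `matrix_has_ghosting` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

#define MATRIX_ROWS 4

/* rows[r] holds one bit per column: bit c set means the key at
 * (row r, column c) currently reads as pressed. */
bool matrix_has_ghosting(const uint16_t rows[MATRIX_ROWS]) {
    for (int a = 0; a < MATRIX_ROWS; a++) {
        for (int b = a + 1; b < MATRIX_ROWS; b++) {
            uint16_t shared = rows[a] & rows[b];
            /* shared & (shared - 1) is nonzero when two or more bits are set */
            if (shared & (shared - 1)) {
                return true;
            }
        }
    }
    return false;
}
```

When the function returns true, a conservative policy is to ignore the whole scan and keep the previous key state until the ambiguity clears.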

Debounce is not just delay; it is filtering based on stability. The simplest stable approach is an N-sample filter: only accept a state change if the same value is read N times in a row. More advanced approaches use integrator filters that accumulate toward pressed or released states. Track timestamps for each key to produce key repeat events and long-press events. Do not emit repeats inside the scanning loop; instead, enqueue events into an input queue and let a UI task handle them. This keeps your input driver deterministic and prevents UI rendering from breaking key timing.

How This Fits in Projects

  • Project 2: accurate pulse timing UI depends on clean key input for menu navigation.
  • Project 4 and 7: key events drive HID reports and macro engines.
  • Project 6: the launcher UI relies on predictable key repeat and long-press behavior.

Definitions & Key Terms

  • Matrix scanning: Cycling through rows/columns to detect key presses.
  • Debounce: Filtering rapid toggles caused by mechanical bounce.
  • Ghosting: False key detections caused by multiple keys pressed.
  • Chord: A combination of simultaneous key presses.

Mental Model Diagram

Row drive -> Column read -> Debounce -> Event queue -> UI/action
   4 rows       14 cols      per-key     FIFO queue   consumer

How It Works (Step-by-Step)

  1. Drive row 0 low, read all 14 columns.
  2. Repeat for rows 1-3.
  3. Update debounce counters for each key.
  4. Generate key down/up events when stable states change.
  5. Generate long-press and repeat events based on timers.

Minimal Concrete Example

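// One scan pass over the 4x14 matrix. drive_row(), read_col(),
// debounce_update(), event_ready(), and event are application-defined.
for (int row = 0; row < 4; row++) {
    drive_row(row);
    for (int col = 0; col < 14; col++) {
        bool pressed = read_col(col);
        debounce_update(row, col, pressed);
    }
}
if (event_ready()) {
    // Zero timeout: never stall the scan loop if the queue is full.
    xQueueSend(input_queue, &event, 0);
}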

Common Misconceptions

  • “I can scan only when a key is pressed.” -> You cannot detect transitions reliably that way.
  • “Debounce is just a delay.” -> It is a state machine with explicit stability rules.

Check-Your-Understanding Questions

  1. What causes ghosting in a keyboard matrix?
  2. Why should key scanning run at a steady rate?
  3. How do you detect a long press without blocking?

Check-Your-Understanding Answers

  1. Multiple pressed keys create unintended row/column paths.
  2. Consistent sampling gives predictable latency and debounce behavior.
  3. Use a timer or tick count and emit a long-press event when the threshold is met.

Real-World Applications

  • Text input, menu navigation, macro engines, HID mapping.

Where You Will Apply It

Projects 2, 4, 6, and 7.

References

  • M5Stack Cardputer keyboard matrix (4x14). (See official M5Stack docs.)
  • Making Embedded Systems - Ch. 4-5.

Key Insight

Key insight: Key scanning is a data pipeline; treat it like signal processing, not a single GPIO read.

Summary

Matrix scanning plus debouncing turns noisy hardware into deterministic events. This is the foundation of every UI and HID feature.

Homework/Exercises

  1. Implement a per-key debounce state machine and test it at 100 Hz.
  2. Add long-press and repeat events with configurable delays.

Solutions

  1. Use a counter that requires N stable scans before state change.
  2. Store press timestamps and emit repeat events at a fixed interval.

Chapter 3: SPI Display and UI Rendering on ST7789

Fundamentals

The Cardputer uses a 1.14-inch ST7789V2 TFT display driven over SPI. SPI displays are bandwidth-limited and cannot refresh a full frame at high rates without optimization. You must decide how to render: full-screen buffer, partial updates (dirty rectangles), or line-by-line drawing. The key is to minimize SPI transfer volume while keeping the UI smooth. Because the display is small (240x135), you can sometimes fit a full framebuffer in memory, but doing so competes with audio buffers and packet capture. UI rendering should be treated as a low-priority consumer of state changes, not a real-time data source.

The ST7789 is a command-driven display: you send commands to set the drawing window and then stream pixel data over SPI. The bus is fast but not infinite, and the MCU must share it with other devices and tasks. This means your UI architecture must be intentional. You should treat display rendering like a periodic consumer of state. If you let every small event trigger a full redraw, your system will stutter whenever WiFi or audio loads increase.

Deep Dive

The ST7789 expects pixel data in a specific format (often RGB565). Each update begins with a command sequence that sets the drawing window and then streams pixel data. If you redraw the entire screen, you transfer roughly 240 x 135 x 2 bytes = 64,800 bytes (~64 KB) per frame. At 30 FPS, that is almost 2 MB/s of SPI traffic, which is feasible but consumes CPU and DMA bandwidth. When the WiFi sniffer or audio DSP is active, full-frame redraws may cause stutter. Dirty rectangle rendering is an effective optimization: track which regions changed, then update only those areas. For example, in a packet counter UI, only the numbers change, so you redraw a small rectangle instead of the entire screen.
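The bandwidth arithmetic is easy to keep honest with a few helper functions. This is a small sketch with hypothetical names; the transfer-time estimate is a lower bound that ignores command bytes and gaps between transactions, and the 40 MHz clock in the usage note is an assumed figure, not a measured Cardputer setting.

```c
#include <stdint.h>

#define LCD_WIDTH        240u
#define LCD_HEIGHT       135u
#define BYTES_PER_PIXEL    2u  /* RGB565 */

/* Pixel payload for a w x h region, in bytes. */
uint32_t region_bytes(uint32_t w, uint32_t h) {
    return w * h * BYTES_PER_PIXEL;
}

/* Sustained SPI payload when that region is redrawn fps times a second. */
uint32_t bytes_per_second(uint32_t w, uint32_t h, uint32_t fps) {
    return region_bytes(w, h) * fps;
}

/* Lower bound on transfer time in microseconds at a given SPI clock. */
uint32_t min_transfer_us(uint32_t bytes, uint32_t spi_hz) {
    return (uint32_t)(((uint64_t)bytes * 8u * 1000000u) / spi_hz);
}
```

For the full 240 x 135 screen this gives 64,800 bytes per frame and 1,944,000 bytes per second at 30 FPS. At an assumed 40 MHz SPI clock the payload alone takes at least 12,960 microseconds, while a 60 x 16 counter region is only 1,920 bytes.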

Another key is to decouple rendering from state updates. Use a UI task that wakes at a fixed frame rate (e.g., 30 Hz) and renders the latest state snapshot. Avoid rendering on every event; that causes jitter and redundant draws. Use double buffering if possible: render into a buffer in RAM, then blast it to the display in a single DMA transfer. If full buffers are too large, use line buffers or tile-based rendering. Fonts and text rendering are often the bottleneck; pre-render glyphs or use fixed-width fonts when possible. For animated elements, precompute frames or use simple primitives.

Design a UI architecture that separates data from rendering. Use a state struct (e.g., UiState) that contains current values like packet counts, RSSI, or menu selection. The UI task reads that struct atomically and renders the view. This allows background tasks to update state without touching display code directly. For complex apps, implement a simple UI component system: each widget draws itself from state and knows its bounding rectangle for dirty-region updates. This pattern scales into the mini-OS project.

The ST7789 command set includes address window commands that let you update only a sub-rectangle of the screen. This is the foundation for dirty-rectangle rendering. It also supports different pixel formats (usually RGB565) and byte orderings that must match your drawing routines. If colors appear swapped or inverted, it is often a byte order issue. Use a test pattern early to verify color correctness.
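A short sketch of RGB565 packing and the byte swap that often fixes swapped colors. The function names are hypothetical. The assumption here is the common setup in which the display takes the high byte of each 16-bit pixel first while a little-endian MCU stores the low byte first in memory, so a buffer streamed as raw bytes needs each pixel swapped; whether your driver already does this is something to confirm with a test pattern.

```c
#include <stdint.h>

/* Pack 8-bit-per-channel color into RGB565 (5 bits red, 6 green, 5 blue). */
uint16_t rgb565(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | (b >> 3));
}

/* Swap the two bytes of a pixel for high-byte-first transmission. */
uint16_t swap16(uint16_t v) {
    return (uint16_t)((v << 8) | (v >> 8));
}
```

Pure red, green, and blue pack to 0xF800, 0x07E0, and 0x001F. Drawing those three values as solid bars is a quick test pattern: if red and blue trade places or the colors look scrambled, the byte order or channel order does not match the panel configuration.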

SPI throughput depends on both clock rate and overhead. Each draw call includes command bytes plus pixel data. If you render many small widgets separately, the command overhead can dominate. One optimization is to batch draw calls: compute a single rectangle that encloses all changes, then draw that region once. Another optimization is to pre-render static elements (borders, labels) into a background buffer and only redraw dynamic overlays.
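Batching changes into one enclosing rectangle can be sketched as a running bounding-box union. `DirtyRect`, `dirty_reset`, and `dirty_add` are hypothetical names; the right and bottom edges are exclusive so that width is simply `x1 - x0`.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int16_t x0, y0;  /* top-left corner, inclusive */
    int16_t x1, y1;  /* bottom-right corner, exclusive */
    bool    valid;   /* false until something has been marked dirty */
} DirtyRect;

/* Start a new frame with nothing marked dirty. */
void dirty_reset(DirtyRect *d) {
    d->valid = false;
}

/* Grow the dirty region so it also covers the w x h area at (x, y). */
void dirty_add(DirtyRect *d, int16_t x, int16_t y, int16_t w, int16_t h) {
    if (w <= 0 || h <= 0) {
        return;  /* empty areas do not affect the region */
    }
    int16_t x1 = (int16_t)(x + w);
    int16_t y1 = (int16_t)(y + h);
    if (!d->valid) {
        *d = (DirtyRect){ x, y, x1, y1, true };
        return;
    }
    if (x < d->x0)  d->x0 = x;
    if (y < d->y0)  d->y0 = y;
    if (x1 > d->x1) d->x1 = x1;
    if (y1 > d->y1) d->y1 = y1;
}
```

Each widget calls `dirty_add()` with its bounding box when its value changes, and the UI task flushes the single resulting window once per frame. The tradeoff is that two small widgets far apart produce one large rectangle, so a real UI may prefer a short list of rectangles over a single union.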

Text rendering is expensive. Glyph rasterization can become the bottleneck on small MCUs. Precompute glyph bitmaps or use fixed-width fonts. If you support multiple font sizes, cache the rendered glyphs in RAM and reuse them. For smooth animations, avoid complex alpha blending; instead, use simple filled rectangles or lines. If your UI needs a scrolling log, store the log in RAM and redraw only the newly added line, shifting the rest using a fast block copy.

Finally, treat rendering as a pure function of the UI state struct: background tasks update the state, and the UI task periodically renders a snapshot of it. Provided the snapshot is copied under a lock or critical section, this keeps rendering code free of data races and makes it easy to test the UI with synthetic state data.

How This Fits in Projects

  • Project 1 and 5: show live counters without stalling capture pipelines.
  • Project 3: render FFT bars at a stable frame rate.
  • Project 6 and Capstone: build a consistent UI framework shared across apps.

Definitions & Key Terms

  • RGB565: 16-bit color format (5 bits red, 6 green, 5 blue).
  • Dirty rectangle: A small region of the screen that needs redraw.
  • Framebuffer: A RAM buffer holding a complete screen image.

Mental Model Diagram

[State Update] -> [UI Task @ 30Hz] -> [Render -> SPI DMA] -> [Display]
      |                   |                 |
      |                   +-- dirty rects --+

How It Works (Step-by-Step)

  1. Background tasks update a shared UI state struct.
  2. UI task wakes at fixed FPS.
  3. UI computes dirty regions (or full frame).
  4. SPI DMA sends pixel data to the display.
  5. UI task sleeps until next frame.

Minimal Concrete Example

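void ui_task(void *arg) {
    TickType_t last_wake = xTaskGetTickCount();
    while (1) {
        UiState snapshot = ui_state_copy();
        draw_status_bar(snapshot);
        draw_main_panel(snapshot);
        st7789_flush();
        // Fixed ~30 Hz period, regardless of how long drawing took.
        vTaskDelayUntil(&last_wake, pdMS_TO_TICKS(33));
    }
}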

Common Misconceptions

  • “Full redraw is always fine.” -> It can starve other subsystems under load.
  • “UI should update on every event.” -> That causes jitter and wasted SPI bandwidth.

Check-Your-Understanding Questions

  1. Why do dirty rectangles improve performance?
  2. What is the tradeoff between full framebuffer and partial rendering?
  3. Why use a fixed UI frame rate?

Check-Your-Understanding Answers

  1. They reduce the amount of pixel data transferred.
  2. Full framebuffer gives easy drawing but uses RAM; partial rendering saves RAM but is complex.
  3. It limits SPI usage and provides stable, predictable updates.

Real-World Applications

  • Real-time counters, spectrum displays, menu UIs.

Where You Will Apply It

Projects 1, 3, 5, 6, 7, and 8.

References

  • M5Stack Cardputer display specs (ST7789V2, 240x135). (See official M5Stack docs.)

Key Insight

Key insight: A smooth UI is not about raw FPS; it is about predictable frame pacing under load.

Summary

Rendering is a bandwidth problem. Use fixed frame rates, dirty rectangles, and state-driven UI to keep the device responsive.

Homework/Exercises

  1. Implement a UI that redraws only a changing numeric counter.
  2. Measure SPI transfer time for full-frame vs partial updates.

Solutions

  1. Track the bounding box of the counter text and redraw only that area.
  2. Log timestamps before and after SPI transfer calls and compare.
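
Before measuring, it helps to know what the bus can do in the ideal case. This sketch estimates transfer time from pixel count and SPI clock alone; the 40 MHz clock used in the note below is an assumption, not a Cardputer specification, and real transfers add command and DMA setup overhead.

```c
#include <stdint.h>

/* Ideal SPI time in microseconds to push a w x h RGB565 region,
   ignoring command overhead and gaps between DMA transactions. */
static uint32_t spi_transfer_us(uint32_t w, uint32_t h, uint32_t spi_hz) {
    uint64_t bits = (uint64_t)w * h * 16u;                        /* 16 bits per pixel */
    return (uint32_t)((bits * 1000000u + spi_hz - 1u) / spi_hz);  /* round up */
}
```

At an assumed 40 MHz, a full 240x135 frame needs about 12.96 ms of bus time, while a 60x16 counter region needs under 0.4 ms. If your measured numbers are far above these, the overhead is in the driver, not the pixels.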

Chapter 4: Storage, File Formats, and Data Integrity on microSD

Fundamentals

microSD storage gives you persistence beyond onboard flash. The Cardputer uses a microSD slot that typically operates over SPI. This makes it slower than direct flash but still sufficient for logs and captures. The file system is usually FAT, which is easy to read on PCs but not designed for heavy write workloads. You must be careful with write patterns to avoid fragmentation and corruption, especially during power loss. Data formats such as CSV (for wardriving) and PCAP (for packet capture) need strict binary layout or your files will not open in common tools.

microSD gives you huge storage compared to onboard flash, but it behaves like a removable disk with all the fragility that implies. You should assume that cards can be removed, corrupted, or slow. A robust design treats the SD card as an optional subsystem: if it fails, your system should keep running and clearly show the error. This mindset changes how you handle files, buffers, and log rotation.

Deep Dive

When writing to microSD, every file operation touches the FAT structures. Frequent small writes cause fragmentation and wear. The best strategy is to buffer in RAM, write in larger chunks, and rotate files rather than appending tiny records. For example, your WiFi sniffer should buffer several packets and flush periodically instead of writing on every frame. Use preallocated files if possible to reduce fragmentation. For configuration and small key-value data, consider NVS (non-volatile storage) rather than the SD card.

PCAP is a binary format with a fixed 24-byte global header followed by per-packet record headers. Endianness matters: the magic value indicates byte order and timestamp resolution (microseconds vs nanoseconds). The standard version is 2.4, and the snaplen field is typically 65,535 bytes (the max capture length). The link-layer type identifies how Wireshark interprets the payload (e.g., raw 802.11). If any of these fields are wrong, Wireshark will show “corrupt file” errors or misinterpret your frames. Always test with a small capture and tshark -r file.pcap before collecting long runs. CSV is easier but still requires consistent columns and escaped values. Decide on a strict schema and stick to it so your logs remain parsable.
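
The per-packet record header mentioned above is 16 bytes in the classic PCAP format. The sketch below writes one record in host byte order, which is what the magic value lets readers detect. The function name pcap_write_record and the microsecond timestamp argument are illustrative choices.

```c
#include <stdint.h>
#include <stdio.h>

/* Per-packet record header in the classic PCAP format (16 bytes). */
typedef struct {
    uint32_t ts_sec;    /* timestamp, seconds */
    uint32_t ts_usec;   /* timestamp, microseconds (for magic 0xa1b2c3d4) */
    uint32_t incl_len;  /* bytes of packet data stored in the file */
    uint32_t orig_len;  /* original length of the packet */
} pcap_rec_hdr_t;

/* Write one record; frames longer than snaplen are truncated.
   Returns 0 on success, -1 on a short write. */
static int pcap_write_record(FILE *f, uint64_t t_us, const uint8_t *frame,
                             uint32_t len, uint32_t snaplen) {
    pcap_rec_hdr_t h;
    h.ts_sec   = (uint32_t)(t_us / 1000000u);
    h.ts_usec  = (uint32_t)(t_us % 1000000u);
    h.incl_len = len < snaplen ? len : snaplen;
    h.orig_len = len;
    if (fwrite(&h, sizeof h, 1, f) != 1) return -1;
    if (fwrite(frame, 1, h.incl_len, f) != h.incl_len) return -1;
    return 0;
}
```

In a real pipeline you would append records to a RAM buffer and write the buffer in chunks rather than calling fwrite per packet.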

Data integrity is also about power. If the device resets or the battery dies during a write, FAT metadata can be corrupted. Mitigate this by calling fsync or closing files when possible, and by writing logs in append-only chunks with a checksum or version field. Maintain a simple journal file that indicates which log files are complete. For large captures, implement file rotation (e.g., 10 MB per file) to limit loss if corruption occurs. These practices make your tools trustworthy.

FAT filesystems update metadata frequently. A small append can trigger multiple metadata writes, which increases wear and risk of corruption. The best approach is to buffer in RAM and write in large chunks. For long-running logs, use file rotation: create a new file when a size limit is reached and close the old one cleanly. This keeps filesystem metadata consistent and limits data loss if the device powers off.
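
A rotation policy can be reduced to a small piece of bookkeeping. This is one possible sketch: the LogRotator type, the "/sd" mount point, and the file naming scheme are all assumptions for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint32_t index;   /* sequence number of the current file */
    uint32_t bytes;   /* bytes written to the current file */
    uint32_t limit;   /* rotate before exceeding this size */
} LogRotator;

/* Returns 1 if the caller should close the current file and open the next
   one before writing 'incoming' bytes. Counters are updated either way. */
static int rotator_account(LogRotator *r, uint32_t incoming) {
    int rotate = (r->bytes > 0) && (r->bytes + incoming > r->limit);
    if (rotate) { r->index++; r->bytes = 0; }
    r->bytes += incoming;
    return rotate;
}

/* Build an 8.3-friendly name such as "/sd/cap_0003.cap", which also works
   when long filename support is disabled in the FAT driver. */
static void rotator_name(const LogRotator *r, char *out, size_t n) {
    snprintf(out, n, "/sd/cap_%04u.cap", (unsigned)r->index);
}
```

Call rotator_account() with the chunk size just before each chunk write; when it returns 1, sync and close the old file, then open the name produced by rotator_name().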

Use atomic update patterns for configuration files. Write the new config to a temp file, call fsync, then rename the file to replace the old one. This prevents partial writes from corrupting the config. Be aware that on FAT the rename step may fail if the destination already exists, so you may have to delete the old file first and have your boot code recover from a leftover temp file. If you need to store small settings, use NVS; it is safer for small updates and reduces SD wear. For large binary logs, include a short header with a version, timestamp, and checksum so you can validate integrity during analysis.
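
The temp-file pattern can be sketched with standard C plus POSIX fsync. The function name config_save_safely is hypothetical, and the remove-then-rename step reflects an assumption about FAT drivers that refuse to rename onto an existing file; on filesystems where rename replaces atomically, the remove call is unnecessary.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>   /* fsync */

/* Write 'len' bytes to "<path>.tmp", force them to storage, then move the
   temp file into place. Returns 0 on success, -1 on failure. */
static int config_save_safely(const char *path, const void *data, size_t len) {
    char tmp[128];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || n >= (int)sizeof tmp) return -1;

    FILE *f = fopen(tmp, "wb");
    if (!f) return -1;
    int ok = fwrite(data, 1, len, f) == len
          && fflush(f) == 0            /* stdio buffer -> filesystem */
          && fsync(fileno(f)) == 0;    /* filesystem -> storage medium */
    if (fclose(f) != 0) ok = 0;
    if (!ok) { remove(tmp); return -1; }

    /* Assumption: the FAT driver will not rename onto an existing file.
       A power loss between these two calls leaves only the .tmp file,
       which boot code can detect and promote. */
    remove(path);
    return rename(tmp, path) == 0 ? 0 : -1;
}
```

The matching boot-time check is simple: if the config file is missing but the temp file exists, rename the temp file into place before loading.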

Validation is part of the workflow. Always test your PCAP and CSV files on a desktop before doing long captures. Build a “self-test” mode that writes a small log, closes it, and then reopens it to verify headers and counts. In embedded systems, logging is only useful if it is trustworthy; you must design for that from the start.

SD cards also have variable latency. Some cards pause for tens or hundreds of milliseconds during internal housekeeping. If your capture pipeline blocks on a slow write, you will lose data. Buffering decouples capture from storage, but you should still expose a “dropped records” counter so users know the quality of a capture. For very long runs, maintain a manifest file that lists each log file with start/end timestamps so post-processing tools can reconstruct a full timeline.

How This Fits in Projects

  • Project 1: PCAP correctness and buffered writes prevent corrupted captures.
  • Project 5: long CSV logs require rotation and integrity checks.
  • Project 6 and Capstone: settings storage must survive power loss.

Definitions & Key Terms

  • FAT: File Allocation Table; common file system for SD cards.
  • PCAP: Packet capture format with global + per-packet headers.
  • NVS: Non-volatile storage in ESP32 for small key-value data.

Mental Model Diagram

[RAM buffer] -> [Chunk write] -> [FAT file] -> [PC tool]
    |               |                |            |
  batching       fsync           rotation     validation

How It Works (Step-by-Step)

  1. Accumulate data in a RAM buffer.
  2. Write in fixed-size chunks (e.g., 4-16 KB).
  3. Call fsync or close file periodically.
  4. Rotate files when size limit reached.
  5. Validate by opening on a PC.

Minimal Concrete Example

typedef struct {
    uint32_t magic;      // 0xa1b2c3d4 (microsecond) or 0xa1b23c4d (nanosecond)
    uint16_t v_major;    // 2
    uint16_t v_minor;    // 4
    int32_t  thiszone;   // GMT offset
    uint32_t sigfigs;    // accuracy
    uint32_t snaplen;    // max capture length
    uint32_t network;    // link-layer type
} pcap_hdr_t;

pcap_hdr_t hdr = {0xa1b2c3d4, 2, 4, 0, 0, 65535, LINKTYPE_IEEE802_11};
fwrite(&hdr, 1, sizeof(hdr), log_file);

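if (buffer_len >= CHUNK_SIZE) {
    fwrite(buffer, 1, buffer_len, log_file);
    fflush(log_file);            // stdio buffer -> filesystem driver
    fsync(fileno(log_file));     // filesystem driver -> SD card (needs <unistd.h>)
    buffer_len = 0;
}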

Common Misconceptions

  • “SD card is fast enough for any logging.” -> Small writes are a performance killer.
  • “CSV is always safe.” -> Without schema and escaping, CSV becomes ambiguous.

Check-Your-Understanding Questions

  1. Why is chunked writing more reliable than per-packet writes?
  2. What does the PCAP magic header indicate?
  3. How do you reduce log corruption after power loss?

Check-Your-Understanding Answers

  1. It reduces filesystem overhead and fragmentation.
  2. It indicates the byte order of the file and the timestamp resolution (microseconds vs nanoseconds).
  3. Use fsync, file rotation, and journaling.

Real-World Applications

  • WiFi packet captures, wardriving logs, configuration storage.

Where You Will Apply It

Projects 1, 5, 6, and 8.

References

  • Wireshark PCAP format documentation. (See Wireshark docs.)

Key Insight

Key insight: File reliability on embedded systems is about disciplined write patterns, not just calling fwrite.

Summary

microSD is powerful but fragile. Buffer, chunk, rotate, and validate to keep data trustworthy.

Homework/Exercises

  1. Write a CSV logger and verify it in a spreadsheet.
  2. Create a PCAP file with a single fake packet and open it in Wireshark.

Solutions

  1. Use a fixed header and check column counts.
  2. Use the correct PCAP global header and a dummy record.

Chapter 5: 802.11 Sniffing and Wireless Capture

Fundamentals

WiFi promiscuous mode lets the ESP32-S3 capture raw 802.11 frames without joining a network. This is powerful but sensitive to timing. The driver delivers frames via a callback executed in the WiFi task context. If you do heavy work there, you will drop frames and destabilize the driver. You must treat frame capture as a high-rate stream that feeds a pipeline: capture -> filter -> parse -> store -> display. Also, sniffing is only legal on networks you are authorized to monitor; ethical boundaries are part of the engineering discipline.

802.11 is a multi-layer protocol even at the link layer. Frames include MAC headers, addresses, sequence numbers, and flags that describe the direction of traffic. The ESP32-S3 only supports 2.4 GHz WiFi, so all scanning and hopping happens within channels 1-11, 1-13, or 1-14, depending on the regulatory domain. This limitation is important when interpreting capture results. Understanding the structure of management frames (beacons, probes) is essential because those frames carry the SSID and network capabilities that you will parse in the sniffer and wardriving projects.

Deep Dive

802.11 frames come in three major types: management (beacons, probes), control (RTS/CTS), and data. In promiscuous mode, the ESP32-S3 can deliver these frames along with metadata such as RSSI, channel, and timestamp. Channel hopping is essential if you want coverage across the 2.4 GHz band, but hopping reduces dwell time on any one channel and increases the chance of missing frames. You must choose a hop schedule based on your goals: surveying many networks quickly vs. capturing deep traffic on one channel.

The capture callback should do minimal work: copy the packet or its header into a ring buffer and return. In ESP-IDF, the promiscuous callback runs in the WiFi task context, so any blocking work here directly reduces capture quality. Use esp_wifi_set_promiscuous_filter() (and, if needed, esp_wifi_set_promiscuous_ctrl_filter()) to reduce load by filtering to management frames or specific control frames. A parser task can decode the 802.11 header, extract SSIDs from beacon frames, and classify frame types. A storage task can convert packets into PCAP records and write them in batches. A UI task can display live stats (packets per second, top SSIDs). Each stage adds latency; balance that against memory usage by sizing buffers carefully.

Another important detail is timestamping. The ESP32-S3 provides microsecond timers. Record timestamps as close to capture as possible, and store them in PCAP record headers. If you need absolute time, you can sync to NTP when connected or use GPS time in the wardriving project. Finally, be aware that enabling promiscuous mode can impact WiFi throughput if the device is also connected; sniffing is best done in WIFI_MODE_NULL or STA mode without heavy traffic.

ESP-IDF delivers promiscuous packets with metadata that includes RSSI, channel, and rate. The raw payload is an 802.11 frame without additional headers like radiotap. This means you must construct a PCAP record that fits the expected link-layer type or choose a PCAP variant that matches the data you have. Many implementations use the 802.11 header directly and let Wireshark interpret it. Validate this early so your captures open correctly.

Channel hopping strategy affects data quality. A common approach is to dwell for 200-500 ms per channel, then hop to the next channel in a round-robin pattern. If you dwell too briefly, you will miss beacons; if you dwell too long, you will miss other channels. Implement dwell as a timer in a separate task so it does not block the capture callback. Also consider filters: if your goal is network discovery, keep only beacon and probe frames. This dramatically reduces data rate and makes logging easier.
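
The hop schedule itself is only a few lines. In this sketch the function names are hypothetical; on the device, a timer or hop task would pass the result to esp_wifi_set_channel().

```c
#include <stdint.h>

/* Round-robin hop schedule over channels 1..max_channel. */
static uint8_t next_channel(uint8_t current, uint8_t max_channel) {
    return (uint8_t)((current % max_channel) + 1u);
}

/* Time for one full sweep, i.e. how long each channel goes unobserved
   between visits (plus its own dwell), in milliseconds. */
static uint32_t sweep_time_ms(uint8_t max_channel, uint32_t dwell_ms) {
    return (uint32_t)max_channel * dwell_ms;
}
```

With 13 channels and a 300 ms dwell, one sweep takes 3.9 seconds, so a short-lived transmission on one channel is easy to miss. That number is a useful thing to show in the UI next to the current channel.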

Capture pipelines should include a clear drop policy. If buffers fill, you can either drop new packets or overwrite old ones. For discovery tools, dropping old data is usually fine. For forensic tools, you may need to stop capture and warn the user. Make this behavior explicit in the UI so users understand the tradeoff. Finally, incorporate legal and ethical safeguards directly in the firmware: show a warning screen at startup and require user confirmation before enabling sniffing.
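
One way to make the drop policy explicit is to build it into the buffer. The sketch below implements the drop-new variant with a visible counter. The PacketDesc fields and all names are assumptions, and the code omits the synchronization a real producer and consumer on different tasks would need (for example, a FreeRTOS queue or ring buffer).

```c
#include <stdint.h>

#define RB_SLOTS 8u   /* must be a power of two */

/* Hypothetical packet descriptor: metadata only, no payload. */
typedef struct { uint32_t t_us; int8_t rssi; uint8_t channel; uint16_t len; } PacketDesc;

typedef struct {
    PacketDesc slots[RB_SLOTS];
    uint32_t head;      /* next write position (monotonic counter) */
    uint32_t tail;      /* next read position (monotonic counter) */
    uint32_t dropped;   /* records lost because the buffer was full */
} PacketRing;

/* Drop-new policy: when full, count the drop and keep the older data. */
static int ring_push(PacketRing *rb, const PacketDesc *d) {
    if (rb->head - rb->tail == RB_SLOTS) { rb->dropped++; return 0; }
    rb->slots[rb->head & (RB_SLOTS - 1u)] = *d;
    rb->head++;
    return 1;
}

/* Returns 1 and fills 'out' if a record was available, else 0. */
static int ring_pop(PacketRing *rb, PacketDesc *out) {
    if (rb->head == rb->tail) return 0;
    *out = rb->slots[rb->tail & (RB_SLOTS - 1u)];
    rb->tail++;
    return 1;
}
```

Switching to an overwrite-old policy only changes ring_push: advance tail by one when full instead of returning early, and still increment the counter so the UI can report the loss.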

You can extract a surprising amount of information from management frames alone. Beacon frames include a capability field whose Privacy bit separates open networks from protected ones, and information elements (such as the RSN element used by WPA2) that identify the security scheme and the supported data rates. For survey tools, deduplicate by BSSID rather than by station MAC, because many devices randomize their MAC addresses. This keeps your statistics meaningful without capturing payload data.
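
Extracting the BSSID, Privacy bit, and SSID from a beacon takes little code once the layout is known: a 24-byte management header, 12 bytes of fixed parameters, then tagged elements. The following is a sketch under those layout assumptions, with hypothetical type and function names. It does not interpret RSN or vendor elements.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MAC_HDR_LEN      24u      /* management header without HT control */
#define BEACON_FIXED_LEN 12u      /* timestamp(8) + interval(2) + capability(2) */
#define CAP_PRIVACY      0x0010u  /* Privacy bit in the capability field */

typedef struct {
    uint8_t bssid[6];
    char    ssid[33];   /* up to 32 bytes plus terminator */
    int     privacy;    /* 1 if the Privacy capability bit is set */
} BeaconInfo;

/* Parse a beacon or probe-response frame. Returns 0 on success,
   -1 if the frame is too short or an element is malformed. */
static int beacon_parse(const uint8_t *f, size_t len, BeaconInfo *out) {
    if (len < MAC_HDR_LEN + BEACON_FIXED_LEN) return -1;
    memcpy(out->bssid, f + 16, 6);                        /* address 3 */
    uint16_t cap = (uint16_t)(f[34] | (f[35] << 8));      /* little-endian */
    out->privacy = (cap & CAP_PRIVACY) != 0;
    out->ssid[0] = '\0';
    size_t i = MAC_HDR_LEN + BEACON_FIXED_LEN;
    while (i + 2 <= len) {                                /* element id, length */
        uint8_t id = f[i], el_len = f[i + 1];
        if (i + 2u + el_len > len) return -1;             /* truncated element */
        if (id == 0) {                                    /* SSID element */
            if (el_len > 32) return -1;
            memcpy(out->ssid, f + i + 2, el_len);
            out->ssid[el_len] = '\0';
            return 0;
        }
        i += 2u + el_len;
    }
    return 0;   /* no SSID element found; ssid stays empty */
}
```

Hidden networks typically send an SSID element of length zero, so an empty string here is a valid result, not an error.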

How This Fits in Projects

  • Project 1: capture, parse, and export PCAP without dropping frames.
  • Project 5: scan and merge SSIDs with GPS to build wardriving logs.

Definitions & Key Terms

  • Promiscuous mode: Capture all frames on a channel, not just those addressed to the device.
  • Beacon: Management frame advertising a network SSID.
  • RSSI: Received Signal Strength Indicator, a rough signal power measure.

Mental Model Diagram

[WiFi Driver] -> [Promiscuous Callback] -> [Ring Buffer] -> [Parser] -> [PCAP]
                                              |                 |
                                              +---> [UI Stats]  +---> [Filters]

How It Works (Step-by-Step)

  1. Enable promiscuous mode and set filters.
  2. Capture packets in callback and enqueue descriptors.
  3. Parser task classifies frames and extracts fields.
  4. Storage task writes PCAP records in chunks.
  5. UI task displays live stats and channel info.

Minimal Concrete Example

void wifi_promisc_cb(void *buf, wifi_promiscuous_pkt_type_t type) {
    PacketDesc desc = copy_header(buf, type);
    ringbuf_push(&pkt_rb, &desc);
}

Common Misconceptions

  • “Promiscuous mode is just a flag.” -> It changes driver behavior and timing constraints.
  • “Channel hopping is always better.” -> It reduces depth per channel.

Check-Your-Understanding Questions

  1. Why must the promiscuous callback be short?
  2. What tradeoff does channel hopping introduce?
  3. Why is PCAP validation important?

Check-Your-Understanding Answers

  1. It runs in the WiFi task and can drop frames if blocked.
  2. Breadth vs depth: more networks vs fewer packets per network.
  3. PCAP has strict headers; one error can corrupt the file.

Real-World Applications

  • Wireless site surveys, authorized security audits, protocol debugging.

Where You Will Apply It

Projects 1 and 5.

References

  • ESP-IDF WiFi sniffer mode documentation. (See official ESP-IDF docs.)

Key Insight

Key insight: Sniffing is a streaming systems problem, not a simple packet read.

Summary

Stable sniffing requires strict separation of capture, parsing, and storage. The faster you return from callbacks, the more data you can trust.

Homework/Exercises

  1. Implement a packet counter that survives 10 minutes without drops.
  2. Add a filter to keep only beacons and probe requests.

Solutions

  1. Use a ring buffer and avoid any heap allocations in callbacks.
  2. Use esp_wifi_set_promiscuous_filter() to pass management frames only, then check the frame-control subtype in software to keep just beacons and probe requests.
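
The subtype check for exercise 2 reads the first byte of the 802.11 Frame Control field. This sketch uses hypothetical helper names; the bit positions and subtype numbers come from the 802.11 frame format.

```c
#include <stdint.h>

/* Frame Control, first byte: bits 0-1 version, bits 2-3 type, bits 4-7 subtype. */
enum { TYPE_MGMT = 0, TYPE_CTRL = 1, TYPE_DATA = 2 };
enum { SUBTYPE_PROBE_REQ = 4, SUBTYPE_PROBE_RESP = 5, SUBTYPE_BEACON = 8 };

static inline uint8_t fc_type(uint8_t fc0)    { return (uint8_t)((fc0 >> 2) & 0x3u); }
static inline uint8_t fc_subtype(uint8_t fc0) { return (uint8_t)((fc0 >> 4) & 0xFu); }

/* Keep only beacons and probe requests. Returns 1 to keep, 0 to discard. */
static int keep_frame(const uint8_t *frame, uint16_t len) {
    if (len < 2) return 0;
    if (fc_type(frame[0]) != TYPE_MGMT) return 0;
    uint8_t st = fc_subtype(frame[0]);
    return st == SUBTYPE_BEACON || st == SUBTYPE_PROBE_REQ;
}
```

The check is cheap enough to run before enqueuing, which keeps unwanted frames out of the ring buffer entirely.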

Chapter 6: BLE GATT and HID over GATT

Fundamentals

Bluetooth Low Energy uses GATT (Generic Attribute Profile) to expose services and characteristics. HID over GATT (HOGP) defines how keyboards and mice work over BLE. Instead of raw USB descriptors, BLE HID exposes a Report Map characteristic that describes the HID reports. You must implement pairing and bonding if you want reconnects to be stable across reboots. BLE is power-efficient but latency-sensitive; you need to tune connection parameters for responsive HID behavior.

BLE is built around roles: a peripheral advertises, and a central connects. The Cardputer acts as a peripheral for HID. After connection, the host reads your GATT database, discovers services, and subscribes to notifications. The latency and power behavior of a BLE HID device depends on connection interval and slave latency, so even simple features require you to think like a protocol engineer. BLE HID is deceptively complex, but the payoff is a keyboard/mouse that works across phones, tablets, and laptops.

Deep Dive

A BLE HID device advertises services such as HID Service, Battery Service, and Device Information Service. When a host connects, it discovers characteristics and subscribes to notifications for input reports. Your device sends key or mouse reports as notifications. The Report Map defines the structure of those reports and must match the data you send. If the report map is wrong, the host may ignore input or interpret it incorrectly. You also need to manage the Control Point characteristic for suspend/resume.

Pairing and bonding are crucial. Pairing establishes keys, while bonding stores them for future reconnection. If you do not store bonds, the host may require re-pairing every time, which is unacceptable for HID devices. On ESP32, store bonding data in NVS. You should also handle reconnect logic: if a connection drops, restart advertising with the same address, and throttle reconnection attempts to save power.

Latency is shaped by connection interval and slave latency settings. Short intervals give fast input response but use more power. A typical HID interval is 7.5-30 ms. You should queue input events and send them in the next connection event. Do not flood the BLE stack with reports faster than the connection interval; instead, coalesce repeated key states into a single report.

Finally, security matters. BLE supports different security levels. For HID, at least Just Works pairing is common, but you should warn users about untrusted hosts. Build safety gates in your firmware: require explicit user confirmation before pairing with a new host, and provide a way to clear bonds.

The HID over GATT profile defines specific characteristics: a Report Map, one or more Report characteristics, a HID Information characteristic, and a Control Point for suspend/resume. Each Report characteristic can represent input, output, or feature reports. If your Report Map and report data do not match, the host will ignore your input. Debugging this can be painful, so start with a known-good Report Map and modify it incrementally. HOGP uses the HID Service UUID 0x1812. Common characteristics include HID Information (0x2A4A), Report Map (0x2A4B), Control Point (0x2A4C), Report (0x2A4D), and Protocol Mode (0x2A4E). Boot Keyboard Input/Output reports (0x2A22/0x2A32) and Boot Mouse Input (0x2A33) are optional but important for compatibility with BIOS or pre-OS environments.

Connection management is as important as report formatting. You should store bonding keys in NVS to allow reconnection without re-pairing. If a user clears pairing on the host, you must detect the failure and allow a new pairing. Connection parameter updates are also key: a fast interval (7.5-15 ms) gives responsive typing, while a slow interval (30-50 ms) saves power but increases latency. Design your firmware to negotiate these parameters based on user preference.
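
BLE stacks usually take connection intervals in units of 1.25 ms, with a legal range of 7.5 ms to 4 s, so the millisecond values above need converting. The sketch below does that conversion; the ConnPref type and the two presets are hypothetical and simply mirror the ranges in the text.

```c
#include <stdint.h>

#define CONN_ITVL_UNIT_US   1250u   /* one unit = 1.25 ms */
#define CONN_ITVL_MIN_UNITS 6u      /* 7.5 ms */
#define CONN_ITVL_MAX_UNITS 3200u   /* 4 s */

/* Convert microseconds to 1.25 ms units, clamped to the legal range. */
static uint16_t conn_interval_units(uint32_t interval_us) {
    uint32_t units = interval_us / CONN_ITVL_UNIT_US;
    if (units < CONN_ITVL_MIN_UNITS) units = CONN_ITVL_MIN_UNITS;
    if (units > CONN_ITVL_MAX_UNITS) units = CONN_ITVL_MAX_UNITS;
    return (uint16_t)units;
}

typedef struct { uint16_t min_units, max_units; } ConnPref;

/* Hypothetical user presets: responsive typing vs power saving. */
static ConnPref conn_pref(int responsive) {
    ConnPref p;
    if (responsive) {
        p.min_units = conn_interval_units(7500);    /* 7.5 ms */
        p.max_units = conn_interval_units(15000);   /* 15 ms */
    } else {
        p.min_units = conn_interval_units(30000);   /* 30 ms */
        p.max_units = conn_interval_units(50000);   /* 50 ms */
    }
    return p;
}
```

The peripheral can only request these values; the central decides what it accepts, so read back the negotiated interval rather than assuming your preference was applied.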

BLE is event-driven. GATT writes, notifications, and connection events arrive asynchronously. Your firmware should treat them as events and process them in a dedicated BLE task or event loop, not in a UI task. Use queues to pass input events into the BLE stack and always send a key release report after a key press. A common bug is to send only press events, which makes hosts think a key is stuck. Lastly, always provide a way to clear bonds and reset BLE state from the device itself.
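
The press-then-release rule is easiest to enforce if report generation always emits pairs. This sketch builds 8-byte keyboard reports for letters only, assuming a US layout on the host. The KbdReport type and function names are illustrative, and sending the reports over BLE is left to your stack.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MOD_LEFT_SHIFT 0x02u

/* Standard 8-byte keyboard input report: modifiers, reserved, 6 keycodes. */
typedef struct { uint8_t modifiers, reserved, keys[6]; } KbdReport;

/* Map a-z and A-Z to HID usage IDs 0x04..0x1D. Returns 0 on success. */
static int kbd_press(char c, KbdReport *r) {
    memset(r, 0, sizeof *r);
    if (c >= 'a' && c <= 'z') {
        r->keys[0] = (uint8_t)(0x04 + (c - 'a'));
    } else if (c >= 'A' && c <= 'Z') {
        r->keys[0] = (uint8_t)(0x04 + (c - 'A'));
        r->modifiers = MOD_LEFT_SHIFT;
    } else {
        return -1;   /* unsupported character in this sketch */
    }
    return 0;
}

/* An all-zero report means "no keys down". */
static void kbd_release(KbdReport *r) { memset(r, 0, sizeof *r); }

/* Fill 'out' with press/release pairs for 'text'; returns the report count. */
static size_t kbd_type(const char *text, KbdReport *out, size_t cap) {
    size_t n = 0;
    for (; *text && n + 2 <= cap; text++) {
        if (kbd_press(*text, &out[n]) != 0) continue;   /* skip unsupported */
        kbd_release(&out[n + 1]);
        n += 2;
    }
    return n;
}
```

The release between the two L presses in "HELLO" is what lets the host see two separate keystrokes instead of one held key.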

Advertising behavior also affects usability. A long advertising interval saves power but delays reconnection. A short interval reconnects fast but burns battery. Provide a user setting that trades reconnection speed for battery life, and persist it. For presentation remotes, faster reconnection usually matters more than battery longevity; for idle macro pads, the opposite may be true.

How This Fits in Projects

  • Project 7: implement HID service, bonding, and macros.
  • Capstone: integrate BLE HID with shared settings and UI.

Definitions & Key Terms

  • GATT: Generic Attribute Profile; defines how BLE devices expose services and characteristics on top of the Attribute Protocol (ATT).
  • Report Map: The HID report descriptor in BLE form.
  • Bonding: Persisting security keys for future connections.
  • Protocol Mode: Boot vs Report mode selection for HID over GATT.

Mental Model Diagram

[Key Events] -> [HID Report] -> [BLE Notify] -> [Host HID Driver]
       |              |              |                 |
   debounce       report map     conn interval     OS input

How It Works (Step-by-Step)

  1. Advertise the HID service; the host reads the report map after connecting.
  2. Host connects and subscribes to reports.
  3. Key events are converted to HID reports.
  4. Reports sent via notifications.
  5. Bonding keys stored for reconnection.

Minimal Concrete Example

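// Sketch: hid_send_report() stands in for your BLE stack's input-report notify call
if (ble_connected) {
    uint8_t press[8]   = {modifier, 0, keycode, 0, 0, 0, 0, 0};  // modifiers, reserved, 6 keycodes
    uint8_t release[8] = {0};                                    // all keys up
    hid_send_report(report_id, press, sizeof(press));
    hid_send_report(report_id, release, sizeof(release));        // without this the host sees a stuck key
}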

Common Misconceptions

  • “BLE HID is just like USB HID.” -> BLE uses GATT and connection intervals.
  • “Pairing once means it always reconnects.” -> Only if bonding data is stored.

Check-Your-Understanding Questions

  1. What is a HID Report Map in BLE?
  2. Why does connection interval matter for HID latency?
  3. How do you ensure stable reconnection?

Check-Your-Understanding Answers

  1. It defines the structure of input reports and is read by the host.
  2. Reports are delivered per connection event; longer intervals mean more latency.
  3. Store bonds in NVS and reuse them on reboot.

Real-World Applications

  • BLE keyboards, mice, presentation remotes.

Where You Will Apply It

Project 7 and the Capstone.

References

  • Bluetooth HID over GATT profile specification. (See Bluetooth SIG.)

Key Insight

Key insight: BLE HID is a stateful, negotiated protocol; reliability comes from correct report maps and bonding management.

Summary

BLE HID devices must manage GATT services, security, and timing. Treat pairing and reconnection as first-class features, not afterthoughts.

Homework/Exercises

  1. Create a simple BLE HID keyboard that types “HELLO”.
  2. Add bonding and verify reconnection after reboot.

Solutions

  1. Use a fixed report map and send key down/up sequences.
  2. Store bonding keys in NVS and confirm reconnect without re-pairing.

Chapter 7: USB Device and HID via TinyUSB

Fundamentals

USB HID devices present themselves to a host via descriptors. The device descriptor, configuration descriptor, interface descriptors, and HID report descriptor all describe what your device is. The ESP32-S3 uses the TinyUSB stack in ESP-IDF to implement USB device functionality. Unlike BLE, USB is host-driven: the host enumerates the device and asks for descriptors in a fixed sequence. If your descriptors are malformed, the device will fail to enumerate or behave unpredictably.

USB is a host-driven protocol. The device must respond to enumeration requests precisely, or the host will drop it. This is different from BLE: you do not “send” yourself to the host; the host interrogates you and decides which driver to load. That is why descriptor correctness is the foundation of USB HID projects. TinyUSB abstracts many low-level details, but you still own the descriptors and report formats.

Deep Dive

Enumeration starts when the host detects a device and requests the device descriptor. It then requests the configuration descriptor, which includes interface descriptors, endpoints, and HID descriptors. The HID report descriptor defines the format of input reports. For a keyboard, a standard report is 8 bytes: modifiers, reserved, and 6 keycodes. If you want media keys or composite devices, you can add multiple report IDs. TinyUSB provides helper macros for report descriptors and abstracts endpoint management.
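
To see why the report is 8 bytes, it helps to read the sizes straight out of a descriptor. The sketch below contains a boot-compatible keyboard report descriptor modeled on the example in the USB HID specification, plus a small hypothetical helper, hid_report_bits(), that sums Report Size x Report Count for a given main item type. It handles short items only and is meant as a sanity check, not a full HID parser.

```c
#include <stdint.h>
#include <stddef.h>

/* Boot-compatible keyboard report descriptor (no report ID). */
static const uint8_t kbd_report_desc[] = {
    0x05, 0x01,  /* Usage Page (Generic Desktop) */
    0x09, 0x06,  /* Usage (Keyboard) */
    0xA1, 0x01,  /* Collection (Application) */
    0x05, 0x07,  /*   Usage Page (Keyboard/Keypad) */
    0x19, 0xE0,  /*   Usage Minimum (Left Control) */
    0x29, 0xE7,  /*   Usage Maximum (Right GUI) */
    0x15, 0x00,  /*   Logical Minimum (0) */
    0x25, 0x01,  /*   Logical Maximum (1) */
    0x75, 0x01,  /*   Report Size (1) */
    0x95, 0x08,  /*   Report Count (8) */
    0x81, 0x02,  /*   Input (Data, Variable, Absolute): modifier byte */
    0x95, 0x01,  /*   Report Count (1) */
    0x75, 0x08,  /*   Report Size (8) */
    0x81, 0x01,  /*   Input (Constant): reserved byte */
    0x95, 0x05,  /*   Report Count (5) */
    0x75, 0x01,  /*   Report Size (1) */
    0x05, 0x08,  /*   Usage Page (LEDs) */
    0x19, 0x01,  /*   Usage Minimum (Num Lock) */
    0x29, 0x05,  /*   Usage Maximum (Kana) */
    0x91, 0x02,  /*   Output (Data, Variable, Absolute): LED bits */
    0x95, 0x01,  /*   Report Count (1) */
    0x75, 0x03,  /*   Report Size (3) */
    0x91, 0x01,  /*   Output (Constant): LED padding */
    0x95, 0x06,  /*   Report Count (6) */
    0x75, 0x08,  /*   Report Size (8) */
    0x15, 0x00,  /*   Logical Minimum (0) */
    0x25, 0x65,  /*   Logical Maximum (101) */
    0x05, 0x07,  /*   Usage Page (Keyboard/Keypad) */
    0x19, 0x00,  /*   Usage Minimum (0) */
    0x29, 0x65,  /*   Usage Maximum (101) */
    0x81, 0x00,  /*   Input (Data, Array): six keycodes */
    0xC0         /* End Collection */
};

/* Sum Report Size x Report Count over all main items with the given tag
   (0x80 = Input, 0x90 = Output). Short items only. */
static unsigned hid_report_bits(const uint8_t *d, size_t len, uint8_t main_tag) {
    unsigned size = 0, count = 0, bits = 0;
    size_t i = 0;
    while (i < len) {
        uint8_t prefix = d[i];
        size_t n = prefix & 0x03u;            /* data bytes: 0, 1, 2, or 4 */
        if (n == 3) n = 4;
        if (i + 1 + n > len) break;           /* truncated item */
        unsigned val = 0;
        for (size_t k = 0; k < n; k++) val |= (unsigned)d[i + 1 + k] << (8 * k);
        uint8_t tag = prefix & 0xFCu;         /* tag and type without the size bits */
        if (tag == 0x74) size = val;          /* Report Size (global) */
        else if (tag == 0x94) count = val;    /* Report Count (global) */
        else if (tag == main_tag) bits += size * count;
        i += 1 + n;
    }
    return bits;
}
```

For this descriptor the helper reports 64 input bits (the 8-byte report described above) and 8 output bits (the LED byte). Running the same check after editing a descriptor catches the most common mismatch between what you declare and what you send.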

On ESP32-S3, USB device functionality uses the USB-OTG controller. The ESP32-S3 shares a PHY between USB-OTG and USB-Serial-JTAG; only one can run at a time unless you use an external PHY. This matters when debugging: if you rely on USB-Serial-JTAG for flashing, you may need to switch modes or use UART for logs. In the USB-OTG mode, D- and D+ are on GPIO19 and GPIO20 (fixed by the USB peripheral). The ESP32-S3 USB device controller supports up to 6 endpoints, so plan composite devices carefully (e.g., HID + CDC) to avoid exhausting endpoints. Secure boot or flash encryption can disable the ROM USB-Serial-JTAG/DFU bootloader path, which affects recovery workflows if you depend on USB for re-flashing.

Timing and reliability: HID is polled by the host at a fixed interval. You should send reports only when state changes or at a low keepalive rate. If you send too quickly, you can saturate the endpoint or waste CPU. Another subtlety is keyboard layout: HID keycodes identify key positions, not characters, and the host translates them using its active layout. A mapping built for a US layout will therefore produce the wrong characters on a host configured for another layout. For predictable behavior in security testing, use layout-specific mappings or restrict payloads to keys that map identically across the layouts you target.

Security and ethics are critical. BadUSB-style tools can be abused. Always build safety features: a prominent warning screen, a physical confirmation key, a timeout before sending, and a whitelist of scripts. The device should default to safe mode and require explicit user action to arm.

USB enumeration happens in stages: the host reads the device descriptor, then the configuration descriptor, then class-specific descriptors like the HID report descriptor. If any length field or descriptor order is wrong, enumeration fails. Hosts also cache descriptors based on VID/PID, so changing descriptors without changing VID/PID can lead to confusing behavior. In practice, keep your descriptors stable once they work, and bump product IDs when you make structural changes.

Endpoints define how data flows. HID uses interrupt IN endpoints that the host polls at a specified interval. You should send reports only when state changes, not continuously. If you add a USB CDC serial interface for debugging, you are building a composite device, which requires multiple interfaces in the configuration descriptor. This is possible, but it adds complexity and increases the chance of enumeration errors. Start with a pure HID device first.

TinyUSB's device stack is driven by tud_task(): in a bare-metal loop you must call it periodically yourself. In ESP-IDF, the TinyUSB integration runs it in a dedicated task for you, but you still need to respect timing: do not block the USB task with long operations. Always check tud_hid_ready() before sending a report. For debugging, use host-side tools such as dmesg on Linux or USB Device Viewer on Windows to see enumeration errors. These tools will save you hours.

Security is part of the engineering. Build arming sequences, visible warnings, and safe defaults into your HID payload tools. Treat every payload as potentially dangerous, and design your UI so accidental execution is unlikely.

How This Fits in Projects

  • Project 4: implement USB descriptors and payload engine safely.
  • Capstone: integrate USB HID with global settings and power management.

Definitions & Key Terms

  • Descriptor: Data structure that describes a USB device to the host.
  • Endpoint: A communication channel (IN or OUT) within a USB interface.
  • HID report: The data packet representing key or mouse state.

Mental Model Diagram

[Device] -> [Descriptors] -> [Host Enumeration] -> [HID Driver] -> [Input]
      |           |                 |                 |
   TinyUSB     report desc.      polling          OS events

How It Works (Step-by-Step)

  1. Device connects; host requests descriptors.
  2. TinyUSB responds with device and configuration descriptors.
  3. Host loads HID driver and polls for reports.
  4. Device sends reports when keys change.
  5. Host delivers events to applications.

Minimal Concrete Example

// TinyUSB HID report send
if (tud_hid_ready()) {
    tud_hid_keyboard_report(0, modifier, keycodes);
}

Common Misconceptions

  • “USB HID is just sending keycodes.” -> The descriptor defines meaning and protocol.
  • “Enumeration failures are random.” -> Usually descriptor or power issues.

Check-Your-Understanding Questions

  1. Why are descriptors critical for USB HID?
  2. What is the effect of the shared USB PHY on ESP32-S3?
  3. Why do keyboard layouts matter?

Check-Your-Understanding Answers

  1. They define the device and report formats the host expects.
  2. USB-OTG and USB-Serial-JTAG share the PHY; only one can run without external PHY.
  3. Keycodes map to characters differently by layout.

Real-World Applications

  • USB keyboards, macro pads, automation tools.

Where You Will Apply It

Project 4 and the Capstone.

References

  • ESP-IDF USB device stack documentation. (See official ESP-IDF docs.)
  • TinyUSB documentation and examples. (See TinyUSB docs.)

Key Insight

Key insight: USB HID reliability is defined by descriptors and enumeration, not by the code that sends keycodes.

Summary

USB HID is strict. Get descriptors right, manage host polling, and build safety controls.

Homework/Exercises

  1. Create a USB HID keyboard that types “OK” on enumeration.
  2. Break a descriptor on purpose and observe the host error.

Solutions

  1. Use TinyUSB helper macros and send a short key sequence.
  2. Change a report length and watch enumeration fail in the host logs.

Chapter 8: Audio Capture and DSP (I2S + FFT)

Fundamentals

The Cardputer includes a MEMS microphone connected via I2S. I2S is a synchronous serial bus for audio samples. The ESP32-S3 uses DMA to stream audio into buffers without CPU copying. To visualize audio, you apply a window function, run an FFT, and map frequency bins to bars. The hardest part is keeping audio capture real-time while UI rendering happens in parallel.

Digital audio is a continuous stream, which means it behaves more like networking than a typical sensor read. If you miss samples, the spectrum analyzer becomes meaningless. The fundamentals here include the sampling theorem (Nyquist), the concept of frequency resolution, and the reality of noise. You will learn how to trade sample rate, FFT size, and UI refresh to create a display that looks stable and accurate on a small screen.

Deep Dive

An I2S stream is a continuous flow of samples at a fixed rate (e.g., 16 kHz or 44.1 kHz). The I2S driver fills DMA buffers and triggers interrupts when a buffer is ready. Your audio task must read these buffers quickly and pass them to the DSP stage. If you fall behind, buffers overflow and you lose samples. Choose your FFT size carefully: larger FFTs give better frequency resolution but require more CPU and introduce more latency. For small displays, 256 or 512-point FFTs may be sufficient. In ESP-IDF v5.x, the I2S driver uses a channel-based API: you create an RX channel, configure its mode (standard I2S, or PDM RX if your microphone outputs PDM) along with sample rate, slot bits, and channel format, and set the DMA buffer count and length. Large DMA buffers reduce CPU overhead but increase latency; small buffers reduce latency but increase interrupt pressure. The driver also holds a power-management lock to keep the APB clock stable during I2S operation, which affects overall power budget.

Windowing matters. A raw FFT assumes the signal is periodic in the window. In reality, audio segments are not aligned to the window, which creates spectral leakage. Apply a window function (Hann or Hamming) to reduce leakage, then normalize your magnitudes. Use logarithmic scaling for display; human loudness perception is roughly logarithmic, and linear scaling hides quiet components next to loud ones. For stable visuals, average multiple frames or apply exponential smoothing to each bin.

Because DSP is CPU-heavy, isolate it in its own task. Keep the capture path at the highest priority of the pipeline so a long FFT never delays draining the DMA buffers, and run the DSP task above the UI. Use a dedicated ring buffer to decouple I2S capture from FFT processing. UI rendering should be a separate task that reads the latest FFT bins. A common pattern is double-buffering FFT results: while one buffer is displayed, the DSP task writes to the other, then swaps.
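The double-buffering pattern described above might look like the following sketch. The type and function names (`spectrum_db_t`, `spectrum_publish`, `spectrum_read`) and the bin count are assumptions made for illustration; C11 atomics stand in for whatever synchronization your RTOS setup provides.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NUM_BINS 64  /* hypothetical number of bins shown on screen */

typedef struct {
    float bins[2][NUM_BINS];
    atomic_int front;          /* index of the buffer the UI may read */
} spectrum_db_t;

void spectrum_init(spectrum_db_t *s)
{
    memset(s->bins, 0, sizeof s->bins);
    atomic_init(&s->front, 0);
}

/* DSP task: fill the back buffer, then publish it with one atomic store. */
void spectrum_publish(spectrum_db_t *s, const float *fresh)
{
    int back = 1 - atomic_load(&s->front);
    memcpy(s->bins[back], fresh, sizeof s->bins[back]);
    atomic_store(&s->front, back);
}

/* UI task: copy the most recently published buffer. */
void spectrum_read(spectrum_db_t *s, float *out)
{
    int front = atomic_load(&s->front);
    memcpy(out, s->bins[front], sizeof s->bins[front]);
}
```

With only two buffers, a reader that is still copying while the DSP task publishes twice can see a mixed frame. If that matters for your display, a mutex around the copy or a third buffer is a reasonable next step.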

You should also consider the noise floor. MEMS microphones can have DC offsets and low-frequency noise. Apply a high-pass filter or subtract the mean before the FFT. Normalize your input to avoid clipping. All of these steps can be implemented with small, fixed-point math if needed.
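Subtracting the mean is simple enough to do in integer math. A minimal sketch, assuming 16-bit signed samples and a hypothetical helper name `remove_dc`:

```c
#include <stddef.h>
#include <stdint.h>

/* Subtract the block mean from every sample, in place.
 * Returns the mean that was removed so callers can log the DC offset. */
int32_t remove_dc(int16_t *samples, size_t n)
{
    if (n == 0) {
        return 0;
    }
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    int32_t mean = (int32_t)(sum / (int64_t)n);
    for (size_t i = 0; i < n; i++) {
        int32_t v = (int32_t)samples[i] - mean;
        /* Clamp so a large offset cannot wrap around. */
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        samples[i] = (int16_t)v;
    }
    return mean;
}
```

Logging the returned mean over time is a cheap way to see how much offset your particular microphone produces.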

Sample rate defines your maximum frequency (Nyquist), while FFT size defines your frequency resolution. At 16 kHz with a 512-point FFT, your bin width is about 31.25 Hz. If you increase FFT size, you get better resolution but also more latency and CPU cost. Decide what matters for your use case. For a spectrum analyzer, stable visuals often matter more than ultra-fine resolution.
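The arithmetic behind these tradeoffs fits in three one-line functions. The names below are hypothetical; the formulas are the standard ones (bin width = sample rate / FFT size, frame time = FFT size / sample rate).

```c
#include <stdint.h>

/* Width of one FFT bin in Hz. */
float fft_bin_width_hz(uint32_t sample_rate_hz, uint32_t fft_size)
{
    return (float)sample_rate_hz / (float)fft_size;
}

/* Time needed to collect one full FFT frame, in milliseconds. */
float fft_frame_ms(uint32_t sample_rate_hz, uint32_t fft_size)
{
    return 1000.0f * (float)fft_size / (float)sample_rate_hz;
}

/* Highest frequency that can be represented at this sample rate. */
float nyquist_hz(uint32_t sample_rate_hz)
{
    return (float)sample_rate_hz / 2.0f;
}
```

At 16 kHz with 512 points this gives 31.25 Hz per bin, 32 ms per frame, and an 8 kHz ceiling. Doubling the FFT size halves the bin width and doubles the frame time.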

Windowing reduces spectral leakage, but it also changes amplitude. You should normalize your FFT output to avoid a changing baseline when you switch windows. For a better user experience, convert magnitudes to dB, which is already a logarithmic scale, and consider spacing the displayed bands logarithmically in frequency as well. Human hearing is roughly logarithmic, so linear amplitude displays hide detail in low-energy bands. To reduce flicker, apply exponential smoothing or average multiple frames. A simple smoothed = alpha * new + (1 - alpha) * old works well.
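The smoothing formula applies per bin. A minimal sketch, with `smooth_bins` as a hypothetical helper name:

```c
#include <stddef.h>

/* Exponential smoothing, in place: smoothed = alpha * fresh + (1 - alpha) * smoothed.
 * alpha near 1 follows the input quickly; alpha near 0 gives a calmer display. */
void smooth_bins(float *smoothed, const float *fresh, size_t n, float alpha)
{
    for (size_t i = 0; i < n; i++) {
        smoothed[i] = alpha * fresh[i] + (1.0f - alpha) * smoothed[i];
    }
}
```

The `smoothed` array carries state between frames, so allocate it once and zero it at startup. A common refinement is to use a larger alpha when a bin rises than when it falls, so bars jump up quickly and decay slowly.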

Pay attention to the noise floor. MEMS microphones often output a DC offset and environmental noise. Remove DC offset by subtracting the mean of the buffer, and consider a simple high-pass filter. If you want a waterfall display, you will need a circular buffer of past spectra and a fast drawing routine that can scroll pixels. This is a good stress test for your UI pipeline and DMA throughput.
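The circular buffer of past spectra for a waterfall can be sketched as below. The dimensions and names (`waterfall_t`, `waterfall_push`, `waterfall_row`) are assumptions chosen to keep the example small; a real display would use many more rows.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WF_ROWS 4   /* hypothetical history depth */
#define WF_COLS 8   /* hypothetical number of bands per row */

typedef struct {
    uint8_t rows[WF_ROWS][WF_COLS];  /* one intensity byte per band */
    size_t head;                     /* slot the next row is written to */
    size_t count;                    /* rows stored so far, saturates at WF_ROWS */
} waterfall_t;

/* Store the newest spectrum, overwriting the oldest once the buffer is full. */
void waterfall_push(waterfall_t *wf, const uint8_t *row)
{
    memcpy(wf->rows[wf->head], row, WF_COLS);
    wf->head = (wf->head + 1) % WF_ROWS;
    if (wf->count < WF_ROWS) {
        wf->count++;
    }
}

/* age 0 is the newest row. Returns NULL when that much history is not stored. */
const uint8_t *waterfall_row(const waterfall_t *wf, size_t age)
{
    if (age >= wf->count) {
        return NULL;
    }
    size_t idx = (wf->head + WF_ROWS - 1 - age) % WF_ROWS;
    return wf->rows[idx];
}
```

Because only the head index moves, pushing a row never copies the history. The drawing routine walks ages 0 to count - 1 and maps each one to a screen line.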

Finally, pick an FFT implementation that fits your memory and CPU budget. Floating point is easiest but expensive; fixed-point FFTs are faster but harder to scale. On ESP32-S3, floating point is often acceptable for moderate FFT sizes. Profile both and decide. The real lesson is that DSP is a system-level feature: capture, compute, and render must be balanced.

How This Fits in Projects

  • Project 3: build a stable FFT pipeline with smooth UI.
  • Capstone: integrate audio visualization into the unified toolkit.

Definitions & Key Terms

  • I2S: Inter-IC Sound; synchronous audio serial bus.
  • FFT: Fast Fourier Transform; converts time-domain samples to frequency spectrum.
  • Window function: A weighting function applied before FFT to reduce leakage.

Mental Model Diagram

[Mic] -> [I2S DMA] -> [Audio Buffer] -> [Window + FFT] -> [Bins] -> [UI]

How It Works (Step-by-Step)

  1. Configure I2S and allocate DMA buffers.
  2. Capture audio into buffers via DMA.
  3. Apply window and FFT on each buffer.
  4. Convert FFT bins to dB or scaled values.
  5. Render bars on the display at fixed FPS.

Minimal Concrete Example

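// samples, hann, and windowed are arrays of N values; fft() is your chosen FFT routine.
for (int i = 0; i < N; i++) {
    windowed[i] = samples[i] * hann[i];  // taper the frame edges to reduce leakage
}
fft(windowed, bins);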

Common Misconceptions

  • “Higher FFT size is always better.” -> It increases CPU cost and latency.
  • “Raw FFT output is display-ready.” -> You need scaling and smoothing.

Check-Your-Understanding Questions

  1. Why is a window function necessary?
  2. What is the tradeoff between FFT size and latency?
  3. Why do you need double buffering for FFT results?

Check-Your-Understanding Answers

  1. It reduces spectral leakage from non-periodic windows.
  2. Larger FFT gives better frequency resolution but slower updates.
  3. To avoid tearing while UI reads bins.

Real-World Applications

  • Spectrum analyzers, audio visualizers, noise detection tools.

Where You Will Apply It

Project 3 and the Capstone.

References

  • ESP-IDF I2S driver documentation. (See official ESP-IDF docs.)

Key Insight

Key insight: Audio DSP is a pipeline problem. Keep capture real-time, process in chunks, and display at fixed cadence.

Summary

I2S + FFT teaches real-time streaming and DSP tradeoffs. The discipline of buffering and scheduling is the same as in WiFi capture.

Homework/Exercises

  1. Capture raw samples and print min/max values.
  2. Implement a 256-point FFT and display 8 frequency bands.

Solutions

  1. Use i2s_channel_read() and scan the buffer for min/max.
  2. Sum FFT bins into bands and map to bar heights.
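Solution 2 can be sketched as follows. The helper names are hypothetical, and the bands here are equal-width for simplicity; logarithmically spaced bands are a common alternative for audio.

```c
#include <stddef.h>

/* Sum magnitude bins into num_bands equal-width bands.
 * Band b covers bins [b*num_bins/num_bands, (b+1)*num_bins/num_bands). */
void bins_to_bands(const float *mag, size_t num_bins, float *bands, size_t num_bands)
{
    for (size_t b = 0; b < num_bands; b++) {
        size_t start = b * num_bins / num_bands;
        size_t end = (b + 1) * num_bins / num_bands;
        float sum = 0.0f;
        for (size_t i = start; i < end; i++) {
            sum += mag[i];
        }
        bands[b] = sum;
    }
}

/* Map a band value to a bar height in pixels, clamped to [0, max_px]. */
int bar_height(float value, float full_scale, int max_px)
{
    if (full_scale <= 0.0f || value <= 0.0f) {
        return 0;
    }
    if (value >= full_scale) {
        return max_px;
    }
    return (int)(value / full_scale * (float)max_px);
}
```

A 256-point real FFT yields 128 usable bins, so 8 bands receive 16 bins each. You may want to zero bin 0 before summing so any residual DC does not inflate the first bar.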

Chapter 9: IR Protocols and Precise Timing

Fundamentals

Infrared remotes transmit data using a carrier (typically 38 kHz) and encode bits as timed pulses and spaces. The NEC protocol is one of the most common: it starts with a long leader pulse and then transmits address and command bits using pulse-distance modulation. To build a learning remote, you must measure pulse widths precisely, decode them into bits, and then replay them with correct timing. This is a classic embedded timing problem that trains you to measure microsecond-level events reliably.

IR remotes are a classic example of encoding information in time. The hardware is simple, but the timing rules are strict. A typical IR receiver module demodulates the 38 kHz carrier and outputs a digital pulse train. That means your firmware sees a stream of pulses and gaps that represent bits. Your job is to interpret those timings and reproduce them accurately when you transmit.

Deep Dive

The NEC protocol uses a 9 ms leader burst followed by a 4.5 ms space. Each bit is a 562.5 us pulse followed by a space: 562.5 us for a logical 0 and 1.6875 ms for a logical 1. A full frame carries 32 bits, least significant bit first: in the classic format, an 8-bit address, its bitwise inverse, an 8-bit command, and its bitwise inverse; the extended format replaces the address pair with a 16-bit address. A final 562.5 us pulse closes the frame. When a key is held down, a short repeat code is sent at a fixed interval instead of the full frame. These timings are nominal; real remotes vary, so your decoder should allow tolerance.

On ESP32, you can capture IR pulses using interrupts on a GPIO connected to an IR receiver module. Each edge triggers an interrupt; you record the time elapsed since the previous edge to measure pulse and space widths. You then classify each interval as a leader, bit pulse, or bit space. Alternatively, use the RMT peripheral for more accurate capture and playback. For replay, generate a 38 kHz PWM carrier and gate it on/off according to the pulse sequence. Correct carrier frequency and duty cycle matter; if they are off, devices may not respond.
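The edge-interval bookkeeping is independent of the hardware and can be sketched in portable C. The structure and function names are hypothetical, and the timestamp is passed in so the logic can be tested off-device; on ESP-IDF it could come from `esp_timer_get_time()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IR_MAX_INTERVALS 128  /* hypothetical capacity; an NEC frame needs 67 */

typedef struct {
    uint32_t interval_us[IR_MAX_INTERVALS];
    size_t count;
    int64_t last_edge_us;
    bool armed;               /* false until the first edge has been seen */
} ir_capture_t;

/* Call from the GPIO edge handler with the current time in microseconds. */
void ir_capture_on_edge(ir_capture_t *cap, int64_t now_us)
{
    if (!cap->armed) {
        /* First edge only starts the clock; there is no interval yet. */
        cap->armed = true;
        cap->last_edge_us = now_us;
        return;
    }
    if (cap->count < IR_MAX_INTERVALS) {
        cap->interval_us[cap->count++] = (uint32_t)(now_us - cap->last_edge_us);
    }
    cap->last_edge_us = now_us;
}
```

Intervals alternate between mark and space, starting with the leader mark. In a real interrupt handler the shared fields would also need to be read safely from the decoding task, for example by copying them out once the frame has ended.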

You should decide whether to store decoded frames or raw pulse trains. Decoded storage is compact and protocol-aware, but it may fail on unusual remotes. Raw storage is universal but uses more memory. A robust remote supports both: try to decode to known protocols, but fall back to raw replay when decoding fails. For usability, build a simple menu of devices and commands, and allow learning new codes with a prompt. Always validate learned codes by replaying and observing the target device.

The NEC protocol provides a concrete example of timing-driven encoding. It starts with a leader burst (about 9 ms) and a 4.5 ms space. Each bit is a 562.5 us pulse followed by a space: short space for 0, long space for 1. A full frame includes address and command bytes, often followed by their bitwise inverses for validation. Repeat codes are sent roughly every 110 ms when a button is held. These values vary slightly across devices, so your decoder must accept a tolerance window (for example, +/- 20 percent).
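A tolerance-based decoder, paired with an encoder for testing, might look like this. It is a sketch under stated assumptions: the classic NEC layout (8-bit address, inverse, 8-bit command, inverse, least significant bit first), timings rounded to whole microseconds, a +/- 20 percent window, and hypothetical function names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NEC_LEADER_MARK  9000u
#define NEC_LEADER_SPACE 4500u
#define NEC_BIT_MARK      562u
#define NEC_ZERO_SPACE    562u
#define NEC_ONE_SPACE    1687u
#define NEC_FRAME_LEN      67u  /* leader (2) + 32 bits (64) + stop mark (1) */

/* True when measured is within +/- 20 percent of nominal. */
static bool nec_near(uint32_t measured, uint32_t nominal)
{
    uint32_t tol = nominal / 5u;
    return measured >= nominal - tol && measured <= nominal + tol;
}

/* Build the alternating mark/space sequence for one frame. Returns its length. */
size_t nec_encode(uint8_t addr, uint8_t cmd, uint32_t out[NEC_FRAME_LEN])
{
    uint32_t word = (uint32_t)addr | ((uint32_t)(uint8_t)~addr << 8) |
                    ((uint32_t)cmd << 16) | ((uint32_t)(uint8_t)~cmd << 24);
    size_t k = 0;
    out[k++] = NEC_LEADER_MARK;
    out[k++] = NEC_LEADER_SPACE;
    for (int bit = 0; bit < 32; bit++) {
        out[k++] = NEC_BIT_MARK;
        out[k++] = ((word >> bit) & 1u) ? NEC_ONE_SPACE : NEC_ZERO_SPACE;
    }
    out[k++] = NEC_BIT_MARK;  /* stop mark terminates the last space */
    return k;
}

/* Decode measured intervals. Returns false on bad timing or failed inverse check. */
bool nec_decode(const uint32_t *t, size_t n, uint8_t *addr, uint8_t *cmd)
{
    if (n < NEC_FRAME_LEN) return false;
    if (!nec_near(t[0], NEC_LEADER_MARK) || !nec_near(t[1], NEC_LEADER_SPACE)) return false;
    uint32_t word = 0;
    for (int bit = 0; bit < 32; bit++) {
        uint32_t mark = t[2 + 2 * bit];
        uint32_t space = t[3 + 2 * bit];
        if (!nec_near(mark, NEC_BIT_MARK)) return false;
        if (nec_near(space, NEC_ONE_SPACE)) word |= 1u << bit;
        else if (!nec_near(space, NEC_ZERO_SPACE)) return false;
    }
    uint8_t a = (uint8_t)(word & 0xFFu), na = (uint8_t)((word >> 8) & 0xFFu);
    uint8_t c = (uint8_t)((word >> 16) & 0xFFu), nc = (uint8_t)((word >> 24) & 0xFFu);
    if ((uint8_t)(a ^ na) != 0xFFu || (uint8_t)(c ^ nc) != 0xFFu) return false;
    *addr = a;
    *cmd = c;
    return true;
}
```

The two tolerance windows (roughly 450 to 674 us and 1350 to 2024 us) do not overlap, so a space can never match both bit values. Frames that fail here are candidates for raw storage rather than decoded storage.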

Capture can be done with GPIO interrupts, but the ESP32 RMT peripheral is better because it timestamps pulses with less CPU overhead. For learning mode, store the raw pulse widths; for known protocols, decode into address/command. Provide a hybrid approach: store both the raw sequence and the decoded fields so you can replay raw signals for unknown remotes but still display decoded information for common ones. On replay, generate a stable PWM carrier and gate it on and off according to the stored sequence. Be careful about duty cycle; some receivers expect around 1/3 duty cycle rather than 50 percent.
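The carrier numbers are easy to get wrong by a rounding step, so it can help to compute them explicitly. This sketch assumes a generic PWM timer whose duty is set as a count out of 2^resolution_bits; the function names are hypothetical and not tied to a specific driver.

```c
#include <stdint.h>

/* Carrier period in nanoseconds, rounded to the nearest integer. */
uint32_t carrier_period_ns(uint32_t carrier_hz)
{
    return (1000000000u + carrier_hz / 2u) / carrier_hz;
}

/* Duty value for a PWM timer with the given resolution, for a duty in percent. */
uint32_t pwm_duty_counts(uint32_t resolution_bits, uint32_t duty_percent)
{
    uint64_t full_scale = (uint64_t)1 << resolution_bits;
    return (uint32_t)(full_scale * duty_percent / 100u);
}
```

For a 38 kHz carrier the period is about 26.3 us. With a 10-bit timer, a 33 percent duty is 337 counts and a 50 percent duty is 512 counts.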

Edge cases matter: fluorescent lights and sunlight can introduce noise that looks like false pulses. Use a minimum pulse length threshold to filter noise. Also provide a timeout so that partial frames are discarded. For a good user experience, show a “learning” indicator and immediately test a captured code. If the code fails, prompt the user to try again and increase tolerance or sample length.
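One way to apply a minimum pulse length is to fold short glitches back into the interval they interrupted. The sketch below assumes intervals alternate mark/space as captured; `ir_merge_glitches` is a hypothetical name, and the threshold is something to tune against your receiver.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy intervals from in[] to out[], merging any interval shorter than min_us
 * (plus the interval after it) into the previous output interval.
 * out[] must hold at least n entries. Returns the number of intervals written. */
size_t ir_merge_glitches(const uint32_t *in, size_t n, uint32_t min_us, uint32_t *out)
{
    size_t k = 0;
    size_t i = 0;
    while (i < n) {
        if (in[i] < min_us && k > 0) {
            /* A glitch flips the level briefly, so the interval that follows
             * it is really a continuation of the one before it. */
            out[k - 1] += in[i];
            if (i + 1 < n) {
                out[k - 1] += in[i + 1];
            }
            i += 2;
        } else {
            /* Normal interval (a glitch in the very first slot is kept as is). */
            out[k++] = in[i++];
        }
    }
    return k;
}
```

A frame timeout complements this filter: if no edge arrives for longer than the longest legal space, treat the frame as finished and discard it when it is shorter than expected.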

How This Fits in Projects

  • Project 2: capture, decode, and replay IR with stable timing.
  • Capstone: integrate learned IR codes into the unified tool suite.

Definitions & Key Terms

  • Carrier: The high-frequency (e.g., 38 kHz) modulation used by IR.
  • Pulse-distance modulation: Encoding bits using different space lengths.
  • Leader: The long initial burst indicating the start of a frame.

Mental Model Diagram

[IR Receiver] -> [Edge Interrupts] -> [Pulse Widths] -> [Decode] -> [Store]
                                                                    |
[IR Emitter]  <- [PWM Carrier] <- [Replay Sequence] <---------------+

How It Works (Step-by-Step)

  1. Capture edge timestamps from IR receiver.
  2. Compute pulse widths and classify leader/bit patterns.
  3. Decode to address/command if protocol matches.
  4. Store either decoded or raw sequences.
  5. Replay with PWM carrier and exact timing.

Minimal Concrete Example

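// NEC encodes each bit in the space that follows a ~562 us pulse.
if (space_us > 4000) state = LEADER;   // ~4.5 ms leader space
else if (space_us > 1200) bit = 1;     // ~1.69 ms space
else bit = 0;                          // ~0.56 ms space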

Common Misconceptions

  • “IR decoding is just reading bytes.” -> It is timing-sensitive signal processing.
  • “Any carrier frequency works.” -> Many devices expect ~38 kHz.

Check-Your-Understanding Questions

  1. Why do you need tolerance when decoding IR timings?
  2. What is the difference between raw and decoded storage?
  3. Why must PWM frequency be stable during replay?

Check-Your-Understanding Answers

  1. Real remotes have jitter and components vary.
  2. Raw works for any protocol but uses more memory; decoded is compact but limited.
  3. Devices expect a carrier frequency; incorrect frequency reduces sensitivity.

Real-World Applications

  • Universal remotes, home automation, IR learning tools.

Where You Will Apply It

Project 2 and the Capstone.

References

  • NEC IR timing overview (see Electronic Design or similar references).

Key Insight

Key insight: IR control is a timing protocol. Precision and tolerance are both required.

Summary

Learning IR is about measuring and reproducing time. Your firmware becomes a logic analyzer and signal generator.

Homework/Exercises

  1. Capture a single NEC frame and print all pulse widths.
  2. Replay the raw timing and verify the device responds.

Solutions

  1. Use edge interrupts and store intervals in a buffer.
  2. Generate a PWM carrier and gate it with the stored intervals.

Chapter 10: Power, Battery, and Reliability Engineering

Fundamentals

Portable devices live and die by power. The Cardputer includes a small internal battery plus a larger base battery, and different subsystems draw dramatically different current. WiFi, display backlight, and speaker output are large consumers. If you ignore power, your device will brown out or die quickly. Reliability engineering is about anticipating failure: power loss during SD writes, tasks that hang, or memory leaks that slowly kill the system.

Power is the hidden constraint behind every embedded feature. The same firmware can feel smooth on USB power and unusable on battery if you ignore current spikes and brownouts. You must think in terms of energy budgets: how many minutes of WiFi sniffing, how long the display can stay at full brightness, and what happens when the battery is low. Reliability includes power, but also recovery: when things fail, can your device still function and preserve data?

Deep Dive

Start with a power budget. Measure current draw in different modes: idle, WiFi sniffing, audio processing, display at full brightness. Then decide your operating modes (e.g., low power mode for logging, high power mode for active UI). The ESP32-S3 supports dynamic frequency scaling and light sleep; you can reduce CPU frequency when idle and disable peripherals when not needed. The display backlight is often the biggest power consumer; dim it after a timeout.
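A power budget reduces to a weighted average. The sketch below uses hypothetical names, and every number you feed it should be one you measured yourself; nothing here is a Cardputer specification.

```c
#include <stddef.h>

typedef struct {
    const char *name;
    float current_ma;   /* measured average current in this mode */
    float time_share;   /* fraction of runtime spent in this mode, 0..1 */
} power_mode_t;

/* Weighted average current across all modes. */
float average_current_ma(const power_mode_t *modes, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += modes[i].current_ma * modes[i].time_share;
    }
    return total;
}

/* Estimated runtime. usable_fraction derates capacity for cutoff voltage and aging. */
float runtime_hours(float capacity_mah, float avg_ma, float usable_fraction)
{
    if (avg_ma <= 0.0f) {
        return 0.0f;
    }
    return capacity_mah * usable_fraction / avg_ma;
}
```

As an illustration with made-up values, spending half the time at 100 mA and half at 20 mA averages 60 mA, which a fully usable 1200 mAh pack would sustain for about 20 hours.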

Brownouts happen when current spikes exceed what the battery or regulator can provide. Audio playback and WiFi transmit can spike current. Mitigation includes adding capacitors, reducing transmit power, or spreading heavy tasks over time. For firmware, implement watchdog timers and heartbeat tasks. If a task stops updating, reset only that subsystem or restart the device gracefully. Log reset reasons and preserve diagnostic logs on SD.
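The heartbeat idea can be sketched as a small table that a supervisor task scans. Task count, indices, and names are assumptions for illustration; this is separate from, and would normally sit alongside, the hardware or RTOS task watchdog.

```c
#include <stddef.h>
#include <stdint.h>

#define HB_TASKS 3  /* hypothetical: 0 = capture, 1 = dsp, 2 = ui */

typedef struct {
    uint32_t last_beat_ms[HB_TASKS];
    uint32_t timeout_ms[HB_TASKS];
} heartbeat_t;

/* Each worker task calls this from its main loop. */
void heartbeat_kick(heartbeat_t *hb, size_t task, uint32_t now_ms)
{
    if (task < HB_TASKS) {
        hb->last_beat_ms[task] = now_ms;
    }
}

/* Supervisor: index of the first stalled task, or -1 if all are alive.
 * Unsigned subtraction keeps the check correct across tick wraparound. */
int heartbeat_find_stalled(const heartbeat_t *hb, uint32_t now_ms)
{
    for (size_t i = 0; i < HB_TASKS; i++) {
        if ((uint32_t)(now_ms - hb->last_beat_ms[i]) > hb->timeout_ms[i]) {
            return (int)i;
        }
    }
    return -1;
}
```

Knowing which task stalled lets the supervisor try a subsystem restart first and log the index before falling back to a full reset.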

Reliability also means data safety. Use atomic writes for configuration, write versioned files, and validate at boot. If the SD card fails to mount, fallback to a safe mode and show a clear UI error. When designing the capstone toolkit, build a global error handler that can display error codes and allow recovery. This turns your prototypes into robust field tools.
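Validation at boot needs a record that can prove it is intact. The sketch below shows one possible layout with a magic number, a version, and a CRC-32; the field names, magic value, and settings are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_MAGIC   0x43464731u  /* hypothetical tag */
#define CFG_VERSION 1u

typedef struct {
    uint32_t magic;
    uint32_t version;
    uint8_t  backlight;      /* hypothetical settings */
    uint8_t  wifi_channel;
    uint8_t  reserved[2];    /* keep zeroed so the CRC is deterministic */
    uint32_t crc;            /* CRC-32 of all preceding bytes */
} config_t;

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320). Small and table-free. */
static uint32_t crc32_bytes(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Stamp magic, version, and CRC just before writing the record out. */
void config_seal(config_t *c)
{
    c->magic = CFG_MAGIC;
    c->version = CFG_VERSION;
    c->crc = crc32_bytes((const uint8_t *)c, offsetof(config_t, crc));
}

/* Check a record read back at boot; fall back to defaults when this fails. */
bool config_valid(const config_t *c)
{
    return c->magic == CFG_MAGIC &&
           c->version == CFG_VERSION &&
           c->crc == crc32_bytes((const uint8_t *)c, offsetof(config_t, crc));
}
```

For the write itself, a common approach is to write the sealed record to a temporary file, flush it, and only then rename it over the old file, so a power loss leaves either the old or the new version on the card.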

Peripheral-level power gating adds another lever. Turn off the audio amplifier when not in use, disable the IR LED driver, and unmount the SD card when idle. Use display backlight PWM to step brightness down over time instead of a sudden jump; this saves power and improves UX. For long-running logging, consider a headless mode with the display off that wakes on a key press or timer.

Battery behavior is nonlinear. A LiPo discharge curve is steep near the end, so voltage-based percentage is only an estimate. Calibrate your ADC against known voltages and consider temperature effects. If you need more accuracy, store a simple lookup table that maps voltage to percent. Treat low-voltage thresholds conservatively to avoid sudden brownouts.
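A lookup table with linear interpolation between points is usually enough. The curve below is a placeholder with invented values, not measured Cardputer data; replace it with points from your own calibration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t mv;
    uint8_t percent;
} batt_point_t;

/* Hypothetical single-cell LiPo curve, sorted by rising voltage. */
static const batt_point_t BATT_CURVE[] = {
    {3300, 0}, {3600, 10}, {3700, 30}, {3800, 55}, {3950, 80}, {4200, 100},
};
#define BATT_POINTS (sizeof BATT_CURVE / sizeof BATT_CURVE[0])

/* Map a calibrated battery voltage in millivolts to an estimated percentage. */
uint8_t battery_percent(uint16_t mv)
{
    if (mv <= BATT_CURVE[0].mv) {
        return BATT_CURVE[0].percent;
    }
    for (size_t i = 1; i < BATT_POINTS; i++) {
        if (mv <= BATT_CURVE[i].mv) {
            const batt_point_t lo = BATT_CURVE[i - 1];
            const batt_point_t hi = BATT_CURVE[i];
            uint32_t span_mv = (uint32_t)(hi.mv - lo.mv);
            uint32_t span_pct = (uint32_t)(hi.percent - lo.percent);
            /* Integer linear interpolation inside this segment. */
            return (uint8_t)(lo.percent + (uint32_t)(mv - lo.mv) * span_pct / span_mv);
        }
    }
    return BATT_CURVE[BATT_POINTS - 1].percent;
}
```

Averaging several ADC readings before the lookup keeps the indicator from jumping when a WiFi burst briefly pulls the voltage down.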

Reliability is about diagnosing failure as well. Keep a circular log of key events (boot reason, mount errors, watchdog resets). If the device crashes, you can read that log on the next boot and show a summary to the user. For the capstone, design a global error handler that can recover subsystems one by one instead of rebooting the whole device. These are the practices that separate hobby firmware from production-grade tools.

Battery estimation requires measurement. Use an ADC channel to read battery voltage and calibrate it against known values. Display a simple battery indicator in the UI so users understand current state. When voltage drops below a threshold, automatically reduce display brightness and disable high-power radios. If voltage drops further, save logs, unmount the SD card, and enter a safe shutdown. These behaviors prevent corruption and teach you to design for graceful failure.
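The threshold behavior maps onto a small state machine. The thresholds and names below are hypothetical, and the gap between the enter and exit voltages is an added design choice of this sketch so the device does not flip between modes when the voltage hovers near a threshold.

```c
#include <stdint.h>

typedef enum { POWER_NORMAL, POWER_SAVER, POWER_SHUTDOWN } power_state_t;

/* Hypothetical thresholds in millivolts; tune them from your own measurements. */
#define SAVER_ENTER_MV 3550u
#define SAVER_EXIT_MV  3650u
#define SHUTDOWN_MV    3350u

/* Advance the policy by one battery reading and return the new state. */
power_state_t power_policy_step(power_state_t state, uint16_t mv)
{
    /* Shutdown latches: once entered, only a reboot leaves it. */
    if (state == POWER_SHUTDOWN || mv < SHUTDOWN_MV) {
        return POWER_SHUTDOWN;
    }
    if (state == POWER_NORMAL) {
        return (mv < SAVER_ENTER_MV) ? POWER_SAVER : POWER_NORMAL;
    }
    /* In saver mode, require a clearly higher voltage before leaving. */
    return (mv > SAVER_EXIT_MV) ? POWER_NORMAL : POWER_SAVER;
}
```

The caller acts on transitions: entering POWER_SAVER dims the display and stops high-power radios, and entering POWER_SHUTDOWN saves logs and unmounts the SD card before powering down.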

Stress testing matters. Run worst-case scenarios (full brightness, WiFi sniffing, audio processing) and confirm the device remains stable for at least 10 minutes. This gives you a safety margin for real-world use.

How This Fits in Projects

  • Projects 1 and 5: long captures depend on predictable power usage.
  • Project 6: the mini-OS should recover gracefully from failures.
  • Capstone: coordinated power policies across all tools.

Definitions & Key Terms

  • Brownout: Voltage drop that causes resets or undefined behavior.
  • Light sleep: Low-power mode where CPU pauses but RAM is retained.
  • Watchdog: Timer that resets the system if software hangs.

Mental Model Diagram

[Power Source] -> [Regulator] -> [Subsystems] -> [Firmware Controls]
       |              |             |                |
  battery health   brownout      WiFi, UI, SD      sleep, dim, retry

How It Works (Step-by-Step)

  1. Measure current draw for each subsystem.
  2. Define power modes (active, idle, sleep).
  3. Implement timeouts to dim display and stop radios.
  4. Add watchdogs and reset handling.
  5. Store logs for post-mortem analysis.

Minimal Concrete Example

if (idle_time_ms > 60000) {
    set_backlight(20);  // dim
    disable_wifi();
}

Common Misconceptions

  • “Power problems are hardware-only.” -> Firmware scheduling causes spikes too.
  • “Watchdogs are last resort.” -> They are essential for robust tools.

Check-Your-Understanding Questions

  1. Why does WiFi sniffing increase power draw?
  2. How can you avoid SD corruption on sudden power loss?
  3. What is the purpose of a watchdog?

Check-Your-Understanding Answers

  1. The radio and DMA are active continuously.
  2. Write in chunks, flush with fsync at known points, and use write-then-rename or a journal so a partial write is detectable.
  3. It recovers the system when software hangs.

Real-World Applications

  • Battery-powered field tools, diagnostic devices.

Where You Will Apply It

Projects 1, 5, 6, and the Capstone (Project 8).

References

  • M5Stack Cardputer power and battery specs (see official M5Stack docs).

Key Insight

Key insight: Reliability is a feature. Power-aware firmware is what makes tools usable outside the lab.

Summary

Power and reliability are the difference between a demo and a real product. Design for failure and recovery.

Homework/Exercises

  1. Log current draw in different modes and estimate runtime.
  2. Simulate a power loss during SD write and verify recovery logic.

Solutions

  1. Use a USB power monitor or shunt measurement.
  2. Force a reset during a write and check for clean recovery.

Glossary (High-Signal)

  • DMA: Hardware-assisted transfer that reduces CPU load for streaming data.
  • Ring buffer: Fixed-size circular buffer used for high-rate pipelines.
  • GATT: BLE protocol that organizes data into services/characteristics.
  • HOGP: HID over GATT Profile (BLE keyboards/mice).
  • HID: Human Interface Device class (keyboard/mouse) over USB or BLE.
  • PCAP: Binary packet capture file format read by Wireshark.
  • Snaplen: Maximum captured packet size recorded in a PCAP header.
  • Debounce: Filtering a noisy input signal into a stable state.
  • I2S: Synchronous serial bus for audio samples.
  • FFT: Algorithm that transforms time-domain samples to frequency bins.
  • RMT: ESP32 peripheral for precise timing and waveform generation.
  • NVS: ESP32 key-value storage in flash.

Why M5Stack Cardputer Matters

The Modern Problem It Solves

Embedded products are everywhere, but most learners only build bench prototypes. The Cardputer gives you a full embedded product form factor: keyboard, screen, storage, audio, and wireless all in one. This compresses years of system experience into one device: concurrency, UI, I/O, and power management, all in a small MCU system. It is ideal for building portable tools that act like real products.

Real-world impact and scale (recent stats):

  • 18.8 billion connected IoT devices by end of 2024 (IoT Analytics, 2024).
  • 21.1 billion connected IoT devices forecast for end of 2025 (IoT Analytics, 2025).
  • Total IoT connections forecast to reach ~22.3 billion in 2025 and 47.1 billion in 2031 (Ericsson Mobility Report, 2025).

These numbers show that embedded devices are not niche. The skills you learn here scale to billions of devices.

Old vs New Approach

OLD APPROACH (bench dev board)          NEW APPROACH (Cardputer)
+---------------------------+          +---------------------------+
| Single peripheral demos   |          | Multi-peripheral product  |
| Serial-only UI            |          | Real UI + keyboard        |
| No storage or logging     |          | microSD logs + PCAP/CSV   |
| Power not considered      |          | Battery + power budgets   |
+---------------------------+          +---------------------------+

Context & Evolution (Brief)

Traditional embedded learning focused on single sensors or LED blink demos. Modern products require integrated subsystems that must coexist. The Cardputer represents that evolution: a compact, integrated platform that behaves like a real product, not a lab board.


Concept Summary Table

Concept Cluster            | What You Need to Internalize
---------------------------+-----------------------------------------------------------------------------
Concurrency & DMA          | Short callbacks, task separation, and pipeline design to avoid dropped data.
Keyboard Matrix + Debounce | Scanning logic, ghosting, and event generation.
SPI Display + UI           | Partial redraws, frame pacing, and state-driven rendering.
Storage + File Formats     | Chunked writes, PCAP/CSV correctness, and data integrity.
WiFi Sniffing              | Promiscuous capture, channel hopping, and safe callbacks.
BLE HID                    | GATT services, report maps, bonding, and reconnection.
USB HID                    | Enumeration, descriptors, and safe payload design.
Audio + DSP                | I2S DMA, FFT pipelines, and smoothing for display.
IR Timing                  | Carrier generation, timing tolerance, and replay accuracy.
Power & Reliability        | Power budgets, brownouts, and watchdog recovery.

Project-to-Concept Map

Project                     | What It Builds                 | Primer Chapters It Uses
----------------------------+--------------------------------+------------------------
Project 1: WiFi Sniffer     | PCAP capture and analysis tool | 1, 3, 4, 5, 10
Project 2: IR Remote        | IR learning + replay tool      | 2, 9, 10
Project 3: Audio Spectrum   | I2S + FFT + UI                 | 1, 3, 8, 10
Project 4: BadUSB HID       | USB HID device + editor        | 1, 2, 7, 10
Project 5: Wardriving       | WiFi + GPS data logger         | 1, 4, 5, 10
Project 6: Mini-OS          | Multi-app launcher             | 1, 2, 3, 4, 10
Project 7: BLE HID          | BLE keyboard/mouse             | 1, 2, 6, 10
Project 8: Capstone Toolkit | Unified tool suite             | 1-10

Deep Dive Reading by Concept

This section maps each concept to specific book chapters or technical standards for deeper understanding.

Concurrency, Timing, and Memory

Concept            | Book & Chapter                                                  | Why This Matters
-------------------+-----------------------------------------------------------------+---------------------------------------------
Task scheduling    | Operating Systems: Three Easy Pieces - Scheduling + Concurrency | Mental model for timing and preemption.
Embedded tradeoffs | Making Embedded Systems - Ch. 1-4                               | Teaches constraints and real-time thinking.
Defensive C        | Effective C, 2nd Edition - Ch. 7-9                              | Prevents memory and parsing bugs.

Input, UI, and Event Systems

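Concept                 | Book & Chapter                    | Why This Matters
------------------------+-----------------------------------+------------------------------------------
Debounce/state machines | Making Embedded Systems - Ch. 4-5 | Input pipelines and timing discipline.
UI architecture         | Design Patterns - Observer + MVC  | Clean separation of state and rendering.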

Storage and Data Formats

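Concept              | Book & Chapter                      | Why This Matters
---------------------+-------------------------------------+---------------------------------------
File I/O and parsing | Effective C, 2nd Edition - Ch. 9-12 | Safe handling of binary and CSV data.
Logging systems      | Making Embedded Systems - Ch. 8     | Robust data capture practices.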

Wireless and Networking

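Concept             | Book & Chapter                                  | Why This Matters
--------------------+-------------------------------------------------+---------------------------------------
802.11 fundamentals | Computer Networks - Ch. 2-3                     | Frame structure and channel behavior.
Packet structure    | TCP/IP Illustrated Vol. 1 - Link layer chapters | Practical packet parsing intuition.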

USB and HID

Concept         | Book & Chapter                        | Why This Matters
----------------+---------------------------------------+---------------------------------------
USB enumeration | USB Complete by Jan Axelson - Ch. 3-5 | Descriptors and enumeration flow.
HID reports     | USB Complete - Ch. 11                 | HID report formats and polling.
TinyUSB stack   | TinyUSB docs + examples               | Practical implementation on ESP32-S3.

BLE and HID over GATT

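Concept         | Book & Chapter                                       | Why This Matters
----------------+------------------------------------------------------+---------------------------------------------
BLE GATT basics | Getting Started with BLE by Kevin Townsend - Ch. 4-6 | Services, characteristics, and notifications.
HID over GATT   | Bluetooth SIG HOGP spec                              | Official HID over GATT behavior.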

Audio DSP

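Concept           | Book & Chapter                                           | Why This Matters
------------------+----------------------------------------------------------+--------------------------------
FFT and windowing | The Scientist and Engineer’s Guide to DSP - FFT chapters | Core DSP theory for Project 3.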

IR Protocols and Timing

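Concept           | Book & Chapter                          | Why This Matters
------------------+-----------------------------------------+---------------------------------------
Timers/interrupts | Bare Metal C - Timer/interrupt chapters | Precise timing for IR capture/replay.
IR protocols      | NEC protocol references                 | Practical decode rules and timing.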

Power and Reliability

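Concept             | Book & Chapter                   | Why This Matters
--------------------+----------------------------------+-----------------------------------------
Power management    | Making Embedded Systems - Ch. 10 | Power budgets and brownout prevention.
Reliability mindset | Making Embedded Systems - Ch. 9  | Fault handling and recovery strategies.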

Quick Start: Your First 48 Hours

Day 1 (4 hours):

  1. Read Chapters 1, 2, and 3 of the Theory Primer.
  2. Flash a keyboard echo demo and print key codes.
  3. Draw text to the TFT display and update a counter.
  4. Do not optimize yet; just make it work.

Day 2 (4 hours):

  1. Log key events to microSD as CSV.
  2. Implement a simple menu UI.
  3. Simulate high load by adding a background task.
  4. Observe any UI stutter and adjust task priorities.

End of Weekend: You will understand the input pipeline, basic UI rendering, and storage. That is the foundation for every project.


Path A

Best for: General embedded learners.

  1. Project 2 (IR Remote) - timing and interrupts.
  2. Project 3 (Audio Spectrum) - DMA and DSP.
  3. Project 6 (Mini-OS) - system architecture.

Path B: Wireless + Security

Best for: Security-minded learners.

  1. Project 1 (WiFi Sniffer).
  2. Project 5 (Wardriving).
  3. Project 4 (BadUSB HID).

Path C: Human Interface + Automation

Best for: UX and automation focus.

  1. Project 7 (BLE HID).
  2. Project 4 (BadUSB HID).
  3. Project 6 (Mini-OS).

Path D: Completionist

Best for: Building the full toolkit.

  • Weeks 1-4: Projects 2, 3, 1
  • Weeks 5-8: Projects 5, 4
  • Weeks 9-12: Projects 6, 7
  • Weeks 13+: Capstone

Success Metrics

  • Keyboard scanning has zero missed keys during fast typing tests.
  • UI updates stay smooth at 25+ FPS under load.
  • PCAP and CSV logs open correctly on a desktop PC.
  • BLE and USB HID reconnect reliably across 10 reboots.
  • Battery runtime exceeds 2 hours in mixed use.
  • Capstone tool can switch modes without rebooting.

Optional Appendices

Appendix A: Debugging Toolkit

  • Serial logging: Always keep a ring buffer of recent logs.
  • Logic analyzer: Verify IR timing and SPI transfers.
  • Wireshark: Validate PCAP files from Project 1.
  • Power monitor: Measure current draw under different modes.

Appendix B: Ethics and Safe Use

Projects that touch WiFi sniffing, wardriving, or USB HID must be used only on networks and systems you own or are explicitly authorized to test. Build explicit confirmation prompts and visible warning banners in your firmware.

Appendix C: Wireless + USB Debugging Cheatsheet

  • WiFi sniffer troubleshooting:
    • Verify promiscuous mode is enabled and channel is set.
    • Reduce filters first (capture all frames), then add filters.
    • Log packet metadata (RSSI/channel) to confirm capture is alive.
  • BLE HID troubleshooting:
    • Clear bonds on both device and host when pairing fails.
    • Confirm Report Map and Report length match exactly.
    • Send key-up reports after every key press.
  • USB HID troubleshooting:
    • Check host logs (dmesg on Linux) for descriptor errors.
    • Use a known-good report descriptor before adding custom keys.
    • Change PID when you change descriptors to avoid cached host state.

Project Overview Table

Project          | Difficulty   | Time       | Depth     | Fun Factor | Coolness
-----------------+--------------+------------+-----------+------------+---------
WiFi Sniffer     | Advanced     | 2-3 weeks  | Very Deep | High       | Level 4
IR Remote        | Intermediate | 1-2 weeks  | Medium    | High       | Level 3
Audio Spectrum   | Advanced     | 2-3 weeks  | Deep      | Very High  | Level 4
BadUSB HID       | Advanced     | 2 weeks    | Deep      | Very High  | Level 5
Wardriving       | Advanced     | 2-3 weeks  | Deep      | High       | Level 4
Mini-OS          | Expert       | 1 month+   | Very Deep | Medium     | Level 5
BLE HID          | Intermediate | 1-2 weeks  | Medium    | High       | Level 4
Capstone Toolkit | Master       | 2-3 months | Extreme   | Very High  | Level 5

Project List

Project 1: WiFi Packet Sniffer and Network Analyzer

  • Main Programming Language: C/C++ (ESP-IDF recommended)
  • Alternative Programming Languages: Arduino, Rust (esp-rs)
  • Difficulty: Advanced
  • Time estimate: 2-3 weeks
  • Knowledge Area: Wireless Security / 802.11
  • Main Book: “TCP/IP Illustrated, Volume 1”

What you will build: A portable WiFi packet capture tool that runs in promiscuous mode, shows live traffic stats on the screen, and saves packets to microSD in PCAP format for Wireshark.

Why it teaches Cardputer mastery: It forces you to build a high-rate pipeline with strict callback timing, parse 802.11 headers, and manage storage under load.

Core challenges you will face:

  • Safe promiscuous callbacks without blocking the WiFi driver.
  • Parsing 802.11 frame headers quickly and correctly.
  • Writing PCAP headers and records without corruption.
  • Keeping UI responsive while capture is active.

Architecture sketch:

WiFi driver -> promisc callback -> ring buffer -> parser -> stats/UI
                                       |
                                       +-> pcap writer -> microSD

Real World Outcome

What you will see:

  1. Live packet rates per channel.
  2. Top SSIDs with packet counts.
  3. A PCAP file that opens in Wireshark.

Serial output example:

$ idf.py monitor
I (1200) sniffer: ch=6 pps=214 total=14532 drops=2
I (1200) sniffer: top=HomeNetwork(423), CoffeeShop_5G(312), [hidden](156)
I (1201) sniffer: pcap flush 4096 bytes

On-screen example:

CH:6  PPS:214  DROP:2
Top SSIDs:
1) HomeNetwork   423
2) CoffeeShop_5G 312
3) [Hidden]      156

Desktop verification:

$ tshark -r capture.pcap -c 3
1 0.000000 Beacon  HomeNetwork
2 0.012345 Probe Request  [randomized]
3 0.045678 Beacon  CoffeeShop_5G

The Core Question You Are Answering

“How do you capture high-rate wireless frames without starving the driver or corrupting output files?”

This question forces you to design a real-time pipeline and validate data integrity under load.

Concepts You Must Understand First

  1. 802.11 frame types
    • What distinguishes management, control, and data frames?
    • Why do beacons matter for network discovery?
    • Book Reference: TCP/IP Illustrated Vol 1 - Link layer chapters.
  2. Ring buffers and backpressure
    • How do you handle overflow when producer is faster than consumer?
    • Book Reference: Making Embedded Systems - Ch. 6-7.
  3. PCAP file format
    • What fields are required in global and packet headers?
    • Book Reference: Effective C - Ch. 9-10.
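For the backpressure question above, one common answer is a fixed-size ring that drops new entries when full and counts every drop. The sketch below is single-producer, single-consumer and uses hypothetical names and a hypothetical descriptor layout; it leaves out the memory ordering you would need if producer and consumer run on different cores.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RB_SLOTS 8u  /* hypothetical size; must be a power of two */

typedef struct {
    uint16_t len;
    int8_t rssi;
    uint8_t channel;
} pkt_desc_t;

typedef struct {
    pkt_desc_t slot[RB_SLOTS];
    uint32_t head;   /* advanced only by the producer */
    uint32_t tail;   /* advanced only by the consumer */
    uint32_t drops;  /* packets rejected because the ring was full */
} pkt_ring_t;

/* Producer side: never blocks. A full ring costs one dropped packet. */
bool ring_push(pkt_ring_t *rb, pkt_desc_t d)
{
    if (rb->head - rb->tail == RB_SLOTS) {
        rb->drops++;
        return false;
    }
    rb->slot[rb->head & (RB_SLOTS - 1u)] = d;
    rb->head++;
    return true;
}

/* Consumer side: returns false when there is nothing to process. */
bool ring_pop(pkt_ring_t *rb, pkt_desc_t *out)
{
    if (rb->head == rb->tail) {
        return false;
    }
    *out = rb->slot[rb->tail & (RB_SLOTS - 1u)];
    rb->tail++;
    return true;
}
```

Dropping at the producer keeps the callback short and makes the loss visible: the drop counter is the number you show on screen and log alongside the capture.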

Questions to Guide Your Design

  1. Capture pipeline
    • How many packets per second can your pipeline sustain?
    • Where do you drop data if buffers fill?
  2. Storage
    • How often do you flush to SD?
    • How do you handle unmounted SD cards?
  3. UI
    • What is your target FPS, and what is your SPI bandwidth?
    • Which metrics are worth showing in real-time?

Thinking Exercise

The PCAP Header Trace

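#include <stdint.h>

struct pcap_hdr {
    uint32_t magic;     // 0xa1b2c3d4 for microsecond timestamps
    uint16_t v_major;   // 2
    uint16_t v_minor;   // 4
    int32_t  thiszone;  // GMT offset, normally 0
    uint32_t sigfigs;   // timestamp accuracy, normally 0
    uint32_t snaplen;   // max bytes stored per packet
    uint32_t network;   // link type, e.g. 105 for raw IEEE 802.11
};

  • Which fields are endian-sensitive?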
  • Why does snaplen matter for memory usage?
  • How will you validate this header on a PC?
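One way to take byte order out of the picture is to serialize each field explicitly instead of writing the struct from memory. The sketch below writes the 24-byte global header and the 16-byte per-packet record header in little-endian order; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PCAP_MAGIC_USEC     0xA1B2C3D4u
#define PCAP_LINKTYPE_80211 105u  /* raw IEEE 802.11, no radiotap header */

static size_t put_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
    return 2;
}

static size_t put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)(v >> 24);
    return 4;
}

/* Serialize the global header into buf (at least 24 bytes). Returns bytes written. */
size_t pcap_write_global(uint8_t *buf, uint32_t snaplen, uint32_t linktype)
{
    size_t n = 0;
    n += put_u32le(buf + n, PCAP_MAGIC_USEC);
    n += put_u16le(buf + n, 2);         /* version major */
    n += put_u16le(buf + n, 4);         /* version minor */
    n += put_u32le(buf + n, 0);         /* thiszone */
    n += put_u32le(buf + n, 0);         /* sigfigs */
    n += put_u32le(buf + n, snaplen);
    n += put_u32le(buf + n, linktype);
    return n;
}

/* Serialize one record header into buf (at least 16 bytes). Returns bytes written. */
size_t pcap_write_record_hdr(uint8_t *buf, uint32_t ts_sec, uint32_t ts_usec,
                             uint32_t incl_len, uint32_t orig_len)
{
    size_t n = 0;
    n += put_u32le(buf + n, ts_sec);
    n += put_u32le(buf + n, ts_usec);
    n += put_u32le(buf + n, incl_len);  /* bytes actually stored, at most snaplen */
    n += put_u32le(buf + n, orig_len);  /* bytes on the air */
    return n;
}
```

On a PC, the first four bytes of a file written this way should read d4 c3 b2 a1 in a hex dump. A quick check before reaching for Wireshark is to confirm that, then confirm the file length equals 24 plus the sum of (16 + incl_len) over all records.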

The Interview Questions They Will Ask

  1. Why must the promiscuous callback be short?
  2. How do you parse 802.11 headers efficiently?
  3. How do you validate PCAP files?
  4. What is the tradeoff between channel hopping and single-channel depth?
  5. How do you prevent UI from causing packet drops?

Hints in Layers

Hint 1: Start simple

esp_wifi_set_promiscuous(true);

Start by printing a packet counter only.

Hint 2: Add a ring buffer

Use a fixed-size ring buffer and enqueue a lightweight descriptor.

wifi_promiscuous_filter_t filt = { .filter_mask = WIFI_PROMIS_FILTER_MASK_MGMT };
esp_wifi_set_promiscuous_filter(&filt);

Hint 3: PCAP writing

Write the global header once, then append packet records in chunks.

Hint 4: Measure drops

Log drop counts and reduce UI FPS if drops rise.

Books That Will Help

Topic          | Book                      | Chapter
---------------+---------------------------+--------------------
WiFi framing   | TCP/IP Illustrated Vol. 1 | Link layer chapters
Buffering      | Making Embedded Systems   | Ch. 6-7
Binary parsing | Effective C               | Ch. 9-10

Common Pitfalls & Debugging

Problem: “Packets drop when UI updates”

  • Why: UI rendering blocks the capture task.
  • Fix: Move rendering to a lower-priority task.
  • Quick test: Disable display updates and compare drop rates.

Problem: “Wireshark says file is corrupt”

  • Why: Incorrect PCAP headers or record lengths.
  • Fix: Validate headers with a known sample file.
  • Quick test: tshark -r capture.pcap on a PC.

Problem: “No packets in promiscuous mode”

  • Why: Wrong WiFi mode or channel.
  • Fix: Ensure the WiFi driver is started in WIFI_MODE_NULL or WIFI_MODE_STA, then set the channel explicitly.
  • Quick test: Print channel and RSSI metadata.

Definition of Done

  • Capture runs for 10 minutes with <5% packet drops.
  • PCAP opens correctly in Wireshark.
  • Channel hopping works and UI remains responsive.
  • Clear ethical warning in firmware about authorization.

Project 2: Universal IR Remote with Learning Capability

  • Main Programming Language: C/C++ (Arduino or ESP-IDF)
  • Difficulty: Intermediate
  • Time estimate: 1-2 weeks
  • Knowledge Area: Hardware Protocols / Signal Timing
  • Main Book: “Making Embedded Systems”

What you will build: A programmable IR remote that learns codes from existing remotes, stores them by device, and retransmits via the built-in IR emitter. For learning, attach an external IR receiver via the Grove port.

Why it teaches Cardputer mastery: It forces you to do microsecond timing, build a signal decode pipeline, and present a clean UI for learned commands.

Core challenges you will face:

  • Capturing pulse widths reliably with interrupts or RMT.
  • Decoding NEC-style protocols and handling repeats.
  • Storing codes in NVS or microSD.
  • Generating a stable 38 kHz carrier for replay.

Architecture sketch:

IR receiver -> edge timing -> protocol decode -> code store
                                        |
IR emitter  <- PWM carrier    <- replay engine

Real World Outcome

What you will see:

  1. A menu of learned devices and commands.
  2. Successful replay of TV/AC/Audio remote commands.
  3. Stable timing captured and displayed during learning.

Serial output example:

I (3010) ir: leader=9000us space=4500us
I (3010) ir: addr=0x20DF cmd=0x10EF (NEC)
I (3011) ir: stored device "LivingRoomTV" cmd "POWER"

On-screen example:

=== Universal Remote ===
[1] Living Room TV
[2] Bedroom AC
[3] Sound Bar
[4] + Learn New Device

Replay verification:

I (3050) ir: replay "POWER" carrier=38kHz duty=33% pulses=67
I (3051) ir: replay done (IR is one-way; confirm on the device)

The Core Question You Are Answering

“How do you convert raw pulse timings into reliable, replayable control commands?”

Concepts You Must Understand First

  1. Interrupt timing and edge capture
    • How do you timestamp edges without losing interrupts?
    • Book Reference: Bare Metal C - Timer/interrupt chapters.
  2. Pulse-distance modulation
    • How does NEC encode bits using timing?
    • Book Reference: Making Embedded Systems - Ch. 5.
  3. PWM carrier generation
    • Why does 38 kHz matter for IR?
    • Book Reference: Making Embedded Systems - Ch. 5.

Questions to Guide Your Design

  1. What timing tolerance will you accept for decoding?
  2. Will you store raw or decoded codes (or both)?
  3. How will you handle repeat codes on long presses?

Thinking Exercise

Record a raw timing sequence and annotate it:

9000,4500,560,560,560,1690,560,560, ...

  • Which part is the leader?
  • Which bits are 0 vs 1?
  • How will you handle jitter?
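
The annotation exercise maps directly onto a threshold decoder. The sketch below assumes a full 67-entry capture (leader mark and space, 32 mark/space pairs, stop mark), nominal NEC timings, and a 25% tolerance window. The tolerance value and the function names are assumptions to tune against real captures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nominal NEC timings in microseconds. */
#define NEC_LEADER_MARK  9000u
#define NEC_LEADER_SPACE 4500u
#define NEC_BIT_MARK      560u
#define NEC_ZERO_SPACE    560u
#define NEC_ONE_SPACE    1690u
#define NEC_TOL_PCT        25u   /* assumed tolerance window */

/* True when measured is within +/- NEC_TOL_PCT of nominal. */
static int near_us(uint32_t measured, uint32_t nominal) {
    uint32_t tol = nominal * NEC_TOL_PCT / 100u;
    return measured + tol >= nominal && measured <= nominal + tol;
}

/* Decode alternating mark/space durations into 32 bits, LSB first.
   Returns 1 on success, 0 on any timing that falls outside the windows. */
static int nec_decode(const uint32_t *t, size_t n, uint32_t *out) {
    uint32_t bits = 0;
    assert(t != NULL && out != NULL);
    if (n < 67) return 0;
    if (!near_us(t[0], NEC_LEADER_MARK) || !near_us(t[1], NEC_LEADER_SPACE)) return 0;
    for (size_t i = 0; i < 32; i++) {
        uint32_t mark = t[2 + 2 * i];
        uint32_t space = t[3 + 2 * i];
        if (!near_us(mark, NEC_BIT_MARK)) return 0;
        if (near_us(space, NEC_ONE_SPACE)) bits |= (uint32_t)1 << i;
        else if (!near_us(space, NEC_ZERO_SPACE)) return 0;
    }
    if (!near_us(t[66], NEC_BIT_MARK)) return 0;   /* stop mark */
    *out = bits;
    return 1;
}

/* Standard NEC sends the command byte followed by its bitwise inverse. */
static int nec_command_ok(uint32_t bits) {
    uint8_t cmd = (uint8_t)(bits >> 16);
    uint8_t inv = (uint8_t)(bits >> 24);
    return (uint8_t)(cmd ^ inv) == 0xFFu;
}
```

A 4500 us leader space marks a full frame. A 2250 us space is the NEC repeat code, which this sketch rejects and a complete decoder would handle separately.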

The Interview Questions They Will Ask

  1. Why is 38 kHz common for IR remotes?
  2. How do you handle noise and jitter in timing capture?
  3. What are the pros/cons of raw vs decoded storage?
  4. How do you generate a stable carrier on MCU hardware?

Hints in Layers

Hint 1: Raw capture. Capture and print raw pulse widths before decoding.

Hint 2: NEC decoder. Start with a simple NEC decoder using timing thresholds.

Hint 3: Replay. Use a PWM carrier and gate it according to the stored sequence.

Hint 4: UI. Store names for devices and commands in a simple menu file.
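
For the replay step, the stored command has to be expanded back into a gating sequence for the carrier. A minimal sketch, assuming standard NEC with an 8-bit address and command (each followed by its inverse) and nominal timings, is shown below. The function name is hypothetical, and driving the actual 38 kHz output is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NEC_FRAME_EDGES 67u  /* leader mark+space, 32 x (mark+space), stop mark */

/* Fill out[] with alternating durations in microseconds for one NEC frame.
   Even indices are marks (carrier on), odd indices are spaces (carrier off).
   out must have room for NEC_FRAME_EDGES entries. Returns the entry count. */
static size_t nec_encode(uint8_t addr, uint8_t cmd, uint32_t *out) {
    uint32_t bits;
    size_t n = 0;
    assert(out != NULL);
    bits = (uint32_t)addr
         | ((uint32_t)(uint8_t)~addr << 8)
         | ((uint32_t)cmd << 16)
         | ((uint32_t)(uint8_t)~cmd << 24);
    out[n++] = 9000;                 /* leader mark */
    out[n++] = 4500;                 /* leader space */
    for (unsigned i = 0; i < 32; i++) {
        out[n++] = 560;              /* bit mark */
        out[n++] = ((bits >> i) & 1u) ? 1690u : 560u;
    }
    out[n++] = 560;                  /* stop mark */
    return n;
}
```

Because each byte is sent with its inverse, every standard frame holds 16 ones and 16 zeros. That fixes the frame duration at 67980 us with these timings and gives a quick sanity check on a logic analyzer.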

Books That Will Help

Topic        | Book                    | Chapter
Timers/PWM   | Making Embedded Systems | Ch. 5
Interrupts   | Bare Metal C            | Timer/interrupt chapters
Data storage | Effective C             | Ch. 8-9

Common Pitfalls & Debugging

Problem: “Replayed code does nothing”

  • Why: Wrong carrier frequency or timing scaling.
  • Fix: Verify PWM output with a logic analyzer.
  • Quick test: Replay a known-good NEC code.

Problem: “Codes decode inconsistently”

  • Why: Timing jitter or wrong thresholds.
  • Fix: Use tolerance windows and average timing.
  • Quick test: Print timing histogram.

Definition of Done

  • Learns at least 2 different remotes.
  • Replays codes reliably 10 times in a row.
  • Codes persist across reboots.
  • UI clearly indicates selected device and command.

Project 3: Real-Time Audio Spectrum Analyzer

  • Main Programming Language: C/C++ (ESP-IDF or Arduino)
  • Difficulty: Advanced
  • Time estimate: 2-3 weeks
  • Knowledge Area: DSP / Real-Time Audio
  • Main Book: “The Scientist and Engineer’s Guide to DSP”

What you will build: A portable FFT-based spectrum analyzer that captures audio via the built-in MEMS mic, computes frequency bands in real time, and visualizes them on the TFT screen.

Why it teaches Cardputer mastery: It combines DMA streaming, CPU-heavy DSP, and UI updates under tight timing constraints.

Core challenges you will face:

  • Configuring I2S + DMA on ESP32-S3.
  • Windowing and FFT stability.
  • Mapping FFT bins to visual bands.
  • Rendering at 25-30 FPS without audio dropouts.

Architecture sketch:

Mic -> I2S DMA -> ring buffer -> window + FFT -> bands -> UI

Real World Outcome

What you will see:

  1. Real-time spectrum bars that respond to sound.
  2. A visible peak at 440 Hz for a tuning fork or sine wave.
  3. Smooth animation without audio stutter.

Serial output example:

I (2200) audio: sr=16000 fft=512 bins=256
I (2201) audio: peak=437Hz amp=0.82

On-screen example:

|#       #|
|# #   # #|
|# # # # #|
+------------------+
 63 125 250 500 1k

Test-tone verification:

I (2300) audio: test=440Hz peak_bin=14 freq=437Hz amp=0.82
I (2301) audio: noise_floor=-52dB smoothing=0.85

The Core Question You Are Answering

“How do you process continuous audio streams without missing samples or freezing the UI?”

Concepts You Must Understand First

  1. I2S DMA buffering
    • How does DMA deliver audio blocks?
    • Book Reference: Making Embedded Systems - Ch. 6.
  2. FFT and windowing
    • Why does windowing reduce spectral leakage?
    • Book Reference: The Scientist and Engineer’s Guide to DSP - FFT chapters.
  3. UI pacing
    • How do you decouple UI FPS from audio capture rate?
    • Book Reference: Design Patterns - MVC concepts.

Questions to Guide Your Design

  1. What sample rate and FFT size give acceptable resolution?
  2. How will you prevent buffer overruns?
  3. What smoothing will you apply to bins?
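
The resolution question comes down to simple arithmetic: bin width is the sample rate divided by the FFT size, and one frame takes FFT size divided by sample rate to collect. The helper names below are hypothetical. The numbers match the serial log earlier in this project (sr=16000, fft=512).

```c
#include <assert.h>
#include <stdint.h>

/* Width of one FFT bin in Hz. */
static float fft_bin_width_hz(uint32_t sample_rate, uint32_t fft_size) {
    assert(fft_size > 0);
    return (float)sample_rate / (float)fft_size;
}

/* Index of the bin nearest to freq_hz. */
static uint32_t fft_bin_for_freq(float freq_hz, uint32_t sample_rate, uint32_t fft_size) {
    return (uint32_t)(freq_hz / fft_bin_width_hz(sample_rate, fft_size) + 0.5f);
}

/* Center frequency of a bin in Hz. */
static float fft_freq_for_bin(uint32_t bin, uint32_t sample_rate, uint32_t fft_size) {
    return (float)bin * fft_bin_width_hz(sample_rate, fft_size);
}

/* Time needed to collect one non-overlapping FFT frame, in milliseconds. */
static float fft_frame_ms(uint32_t sample_rate, uint32_t fft_size) {
    assert(sample_rate > 0);
    return 1000.0f * (float)fft_size / (float)sample_rate;
}
```

At 16 kHz and 512 points the bins are 31.25 Hz wide, so a 440 Hz tone lands in bin 14, centered at 437.5 Hz. This is why the log reports 437 Hz rather than 440 Hz. Each frame covers 32 ms, which caps the update rate at about 31 frames per second unless frames overlap.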

Thinking Exercise

Sketch a timing diagram showing DMA buffer fill, FFT processing, and UI draw. Identify where overlap occurs.

The Interview Questions They Will Ask

  1. Why is windowing needed before FFT?
  2. How does FFT size affect resolution and latency?
  3. Why is log scaling more natural for audio visualization?
  4. How do you keep capture real-time while rendering?

Hints in Layers

Hint 1: Raw waveform. Start by plotting raw samples or min/max values.

Hint 2: Small FFT. Use a 256-point FFT before moving to 512 or 1024.

Hint 3: Precompute window. Compute the Hann window once at startup and reuse it, or precompute it offline and store it as a const table in flash.

Hint 4: Separate tasks. Pin FFT to a separate task/core from UI rendering.

Books That Will Help

Topic                 | Book                                  | Chapter
DSP + FFT             | Scientist and Engineer’s Guide to DSP | FFT chapters
Real-time constraints | Making Embedded Systems               | Ch. 6
Rendering             | Computer Graphics from Scratch        | Ch. 1-2

Common Pitfalls & Debugging

Problem: “Display lags and audio drops”

  • Why: FFT + rendering on same core.
  • Fix: Separate tasks and use queues.
  • Quick test: Disable UI and check capture stability.

Problem: “Noisy spectrum”

  • Why: No windowing or smoothing.
  • Fix: Apply Hann window and exponential smoothing.
  • Quick test: Test with a pure sine wave.
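
The smoothing half of that fix is a one-pole filter applied per band. The sketch below is a minimal version with a hypothetical function name. `alpha` plays the role of the `smoothing=0.85` value in the earlier log, and the Hann window step is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Exponential smoothing per band: state = alpha*state + (1-alpha)*input.
   Higher alpha gives steadier bars but slower response. */
static void smooth_bands(float *state, const float *input, size_t n, float alpha) {
    assert(state != NULL && input != NULL);
    assert(alpha >= 0.0f && alpha < 1.0f);
    for (size_t i = 0; i < n; i++) {
        state[i] = alpha * state[i] + (1.0f - alpha) * input[i];
    }
}
```

Keeping `state` in the UI task means the FFT task can hand over raw band magnitudes and never has to wait on rendering.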

Definition of Done

  • Smooth spectrum at 25+ FPS.
  • Stable peaks for sine tones.
  • No audio buffer overflows for 5 minutes.
  • Visual modes (bars + waterfall) toggle correctly.

Project 4: BadUSB / USB HID Attack Demonstration Tool

  • Main Programming Language: C/C++ (ESP-IDF or Arduino with TinyUSB)
  • Difficulty: Advanced
  • Time estimate: 2 weeks
  • Knowledge Area: USB Protocols / Security Research
  • Main Book: “Practical Malware Analysis” (ethics)

What you will build: A USB HID device that enumerates as a keyboard, runs scripted keystroke payloads, and provides a local UI to edit and manage payloads. For authorized testing only.

Why it teaches Cardputer mastery: It forces you to implement USB descriptors, build a local editor UI, and add safety controls for ethical use.

Core challenges you will face:

  • HID report descriptors using TinyUSB.
  • Timing keystrokes to match host state.
  • Payload parser and editor UI.
  • Safety gating and ethical controls.

Architecture sketch:

Payload storage -> parser -> HID reports -> USB device stack -> host PC

Real World Outcome

What you will see:

  1. Device enumerates as a keyboard.
  2. Safe payload list with confirmation prompts.
  3. Typed commands appear on the host.

Serial output example:

I (3100) usb: enumerated HID keyboard
I (3102) payload: "system_info" length=42
I (3103) safety: armed=YES confirmed=YES

On-screen example:

=== Payloads ===
[1] System Info (authorized)
[2] Open Terminal
[3] + New Payload

Hold [OK] to arm

Host verification:

$ whoami
douglas
$ uname -a
Linux laptop 6.x.x #1 SMP ...

The Core Question You Are Answering

“How do you build a USB HID device that behaves like a human keyboard without losing synchronization or violating ethics?”

Concepts You Must Understand First

  1. USB enumeration and descriptors
    • What descriptors are required for HID?
    • Book Reference: Making Embedded Systems - Ch. 7.
  2. HID report format
    • How does a keyboard report represent keys?
    • Book Reference: Effective C - Ch. 11 (bitfields and parsing).
  3. Safety controls
    • How do you prevent accidental execution?
    • Book Reference: Penetration Testing - Ch. 1 (ethics).

Questions to Guide Your Design

  1. How do you detect that the host is ready?
  2. What is your payload format (DSL, JSON, simple text)?
  3. How will you lock/unlock the device?

Thinking Exercise

Design a payload DSL with explicit delays and confirmations. Example:

WAIT 1000
TYPE "whoami"
ENTER

  • How will you parse this safely?
  • How do you enforce a safe mode?
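
One conservative answer to the parsing question is to accept only an explicit whitelist of commands and reject everything else. The sketch below covers the three commands in the example. The type names, the 63-character text limit, and the 60-second wait cap are assumptions. It expects each line without a trailing newline.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { OP_INVALID = 0, OP_WAIT, OP_TYPE, OP_ENTER } payload_op_t;

typedef struct {
    payload_op_t op;
    unsigned long wait_ms;   /* valid for OP_WAIT */
    char text[64];           /* valid for OP_TYPE, NUL-terminated */
} payload_cmd_t;

#define PAYLOAD_MAX_WAIT_MS 60000ul   /* assumed upper bound */

/* Parse one DSL line. Unknown or malformed lines yield OP_INVALID so the
   caller can refuse to run the whole payload. */
static payload_cmd_t payload_parse_line(const char *line) {
    payload_cmd_t c;
    assert(line != NULL);
    memset(&c, 0, sizeof c);
    if (strcmp(line, "ENTER") == 0) {
        c.op = OP_ENTER;
    } else if (strncmp(line, "WAIT ", 5) == 0) {
        char *end = NULL;
        unsigned long ms = strtoul(line + 5, &end, 10);
        if (end != line + 5 && *end == '\0' && ms <= PAYLOAD_MAX_WAIT_MS) {
            c.op = OP_WAIT;
            c.wait_ms = ms;
        }
    } else if (strncmp(line, "TYPE \"", 6) == 0) {
        const char *start = line + 6;
        const char *close = strrchr(start, '"');
        /* The closing quote must be the last character on the line. */
        if (close != NULL && close[1] == '\0' &&
            (size_t)(close - start) < sizeof c.text) {
            memcpy(c.text, start, (size_t)(close - start));
            c.op = OP_TYPE;
        }
    }
    return c;
}
```

Parsing the whole file before sending any keystroke, and aborting on the first OP_INVALID, keeps a half-valid payload from running partway.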

The Interview Questions They Will Ask

  1. What is the difference between a USB descriptor and a HID report?
  2. How does enumeration work on USB?
  3. Why is USB HID a security risk?
  4. How would you design ethical safeguards?

Hints in Layers

Hint 1: Single key demo. Start by sending a single ‘A’ key press, followed by its release.

Hint 2: Add string typing. Send a string with delays between keypresses.

Hint 3: Payload format. Create a simple line-based format and parser.

Hint 4: Safety gate. Require a long-press on a physical key to arm.
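
For the single-key and string-typing steps, the core data structure is the 8-byte keyboard input report: one modifier byte, one reserved byte, and up to six key usage IDs. The sketch below maps a small ASCII subset to usage IDs for a US layout. The names are hypothetical, and sending the report through the USB stack is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HID_MOD_LSHIFT 0x02u   /* left shift bit in the modifier byte */

/* 8-byte boot-protocol keyboard input report. */
typedef struct {
    uint8_t modifiers;
    uint8_t reserved;
    uint8_t keys[6];
} hid_kbd_report_t;

/* Map a subset of ASCII to HID usage IDs (US layout). Returns 0 if unmapped. */
static int ascii_to_hid(char ch, uint8_t *usage, uint8_t *mods) {
    assert(usage != NULL && mods != NULL);
    *usage = 0;
    *mods = 0;
    if (ch >= 'a' && ch <= 'z') { *usage = (uint8_t)(0x04 + (ch - 'a')); return 1; }
    if (ch >= 'A' && ch <= 'Z') {
        *usage = (uint8_t)(0x04 + (ch - 'A'));
        *mods = HID_MOD_LSHIFT;
        return 1;
    }
    if (ch >= '1' && ch <= '9') { *usage = (uint8_t)(0x1E + (ch - '1')); return 1; }
    if (ch == '0')  { *usage = 0x27; return 1; }
    if (ch == '\n') { *usage = 0x28; return 1; }   /* Enter */
    if (ch == ' ')  { *usage = 0x2C; return 1; }   /* Space */
    return 0;
}

/* Build the "key down" report for one character. The matching "key up"
   report is simply all zeros and must always follow. */
static int hid_press_report(char ch, hid_kbd_report_t *r) {
    uint8_t usage, mods;
    assert(r != NULL);
    memset(r, 0, sizeof *r);
    if (!ascii_to_hid(ch, &usage, &mods)) return 0;
    r->modifiers = mods;
    r->keys[0] = usage;
    return 1;
}
```

The host applies its own keyboard layout to these usage IDs. A US mapping like this one produces different characters on a host configured for another layout.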

Books That Will Help

Topic           | Book                    | Chapter
Security ethics | Penetration Testing     | Ch. 1
Robust I/O      | Making Embedded Systems | Ch. 7
Input parsing   | Effective C             | Ch. 11-12

Common Pitfalls & Debugging

Problem: “Host sees gibberish”

  • Why: Wrong HID keycode mapping or keyboard layout mismatch.
  • Fix: Use a fixed US layout and explicit mapping.
  • Quick test: Send a single ‘A’ and verify in a text editor.

Problem: “Enumeration fails”

  • Why: Descriptor errors.
  • Fix: Compare against a known-good TinyUSB example.
  • Quick test: Use USB analyzer or host logs.

Definition of Done

  • Enumerates reliably on at least 2 host OSes.
  • Payloads run with explicit confirmation.
  • Local editor can create and save payloads.
  • Warning banner always visible.

Project 5: Wardriving WiFi Mapper

  • Main Programming Language: C/C++ (ESP-IDF or Arduino)
  • Difficulty: Advanced
  • Time estimate: 2-3 weeks
  • Knowledge Area: Wireless + Geolocation
  • Main Book: “Computer Networks”

What you will build: A WiFi mapping device that logs SSIDs, signal strength, and GPS coordinates to a clean CSV schema that can be imported directly or converted to WiGLE format. Requires an external GPS module on the Grove UART.

Why it teaches Cardputer mastery: It combines asynchronous inputs (WiFi + GPS), data fusion, and long-term logging under power constraints.

Core challenges you will face:

  • Continuous WiFi scanning with channel hopping.
  • Parsing NMEA GPS sentences.
  • Synchronizing WiFi sightings to GPS time.
  • Writing large CSV logs without fragmentation.

Architecture sketch:

WiFi scan -> network list -> join with GPS -> CSV writer -> microSD

Real World Outcome

What you will see:

  1. Live GPS fix status and satellite count.
  2. A growing list of networks with RSSI.
  3. A CSV file that imports into mapping tools, or into WiGLE after conversion.

Serial output example:

I (4000) gps: fix=3D sats=8 lat=37.7749 lon=-122.4194
I (4001) wifi: ssids=247 last=CoffeeShop_Free rssi=-42
I (4002) log: wrote 25 rows (csv)

On-screen example:

GPS: 3D Fix (8 sats)
Lat: 37.7749  Lon: -122.4194
Networks: 247
Last: CoffeeShop_Free (-42dBm)

CSV verification (example schema):

timestamp,ssid,bssid,channel,rssi,auth,lat,lon,alt_m,accuracy_m
2025-01-01T12:00:02Z,CoffeeShop_Free,AA:BB:CC:DD:EE:FF,6,-42,WPA2-PSK,37.7749,-122.4194,10,5

The Core Question You Are Answering

“How do you merge asynchronous data streams into a clean dataset that can be used for mapping?”

Concepts You Must Understand First

  1. NMEA sentence parsing
    • What do $GPRMC and $GPGGA contain?
    • Book Reference: Effective C - Ch. 10 (parsing).
  2. RSSI interpretation
    • Why is RSSI noisy and how do you smooth it?
    • Book Reference: Computer Networks - Ch. 2.
  3. CSV schema discipline
    • How do you keep columns consistent?
    • Book Reference: Effective C - Ch. 10.
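
Two small pieces of NMEA handling are worth getting right before parsing any fields: the checksum (XOR of every character between `$` and `*`) and the conversion from `ddmm.mmmm` to decimal degrees. The sketch below uses hypothetical function names and leaves field splitting out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* XOR of all characters between '$' and '*'.
   Returns -1 if either delimiter is missing. */
static int nmea_checksum(const char *s) {
    uint8_t sum = 0;
    assert(s != NULL);
    if (*s != '$') return -1;
    for (s++; *s != '\0' && *s != '*'; s++) sum ^= (uint8_t)*s;
    return *s == '*' ? (int)sum : -1;
}

/* Compare the computed checksum with the two hex digits after '*'. */
static int nmea_valid(const char *s) {
    const char *star = s;
    char *end = NULL;
    long got;
    int want = nmea_checksum(s);
    if (want < 0) return 0;
    while (*star != '*') star++;          /* safe: checksum found a '*' */
    got = strtol(star + 1, &end, 16);
    return end == star + 3 && got == want;
}

/* Convert ddmm.mmmm (or dddmm.mmmm) plus hemisphere to signed degrees. */
static double nmea_to_degrees(const char *field, char hemi) {
    double v = strtod(field, NULL);
    int deg = (int)(v / 100.0);
    double d = deg + (v - deg * 100.0) / 60.0;
    return (hemi == 'S' || hemi == 'W') ? -d : d;
}
```

Rejecting any sentence that fails `nmea_valid` before parsing keeps UART noise from turning into bogus coordinates in the CSV.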

Questions to Guide Your Design

  1. Do you log every scan or only deltas?
  2. What timestamp source is authoritative?
  3. How do you handle GPS loss mid-scan?

Thinking Exercise

Design a CSV schema and list the fields required by WiGLE. Identify which come from WiFi, which from GPS, and which are computed.

The Interview Questions They Will Ask

  1. Why is RSSI unstable and how do you smooth it?
  2. What are the tradeoffs of active vs passive scanning?
  3. How do you synchronize data from different update rates?
  4. How do you avoid wearing out the SD card?

Hints in Layers

Hint 1: Show WiFi list. Start by printing SSIDs and RSSI to the screen.

Hint 2: Add GPS parsing. Parse $GPRMC (or $GNRMC on multi-constellation modules) and display lat/lon.

Hint 3: Write CSV. Define a strict header and append rows in chunks.

Hint 4: Deduplicate. Log only when RSSI changes beyond a threshold.

Books That Will Help

Topic       | Book                    | Chapter
WiFi basics | Computer Networks       | Ch. 2
Logging     | Making Embedded Systems | Ch. 8
Parsing     | Effective C             | Ch. 10

Common Pitfalls & Debugging

Problem: “GPS coordinates always zero”

  • Why: Wrong baud rate or no fix.
  • Fix: Verify UART settings and wait for fix.
  • Quick test: Log raw NMEA lines to serial.

Problem: “CSV import fails”

  • Why: Inconsistent columns or missing header.
  • Fix: Enforce strict schema and validate on PC.
  • Quick test: Open in a spreadsheet and check column count.
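
A frequent source of inconsistent columns is an SSID that contains a comma or a quote. A minimal sketch of RFC 4180 style field quoting is shown below. The function name and the return convention are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write src into dst as one CSV field. Fields containing a comma, quote,
   CR, or LF are wrapped in quotes, and embedded quotes are doubled.
   Returns the length written (excluding the NUL), or -1 if dst is too small. */
static int csv_escape(const char *src, char *dst, size_t cap) {
    size_t n = 0;
    int quote;
    assert(src != NULL && dst != NULL);
    quote = strpbrk(src, ",\"\r\n") != NULL;
    if (quote) {
        if (n + 1 >= cap) return -1;
        dst[n++] = '"';
    }
    for (; *src != '\0'; src++) {
        size_t need = (*src == '"') ? 2u : 1u;
        if (n + need >= cap) return -1;      /* keep room for the NUL */
        if (*src == '"') dst[n++] = '"';
        dst[n++] = *src;
    }
    if (quote) {
        if (n + 1 >= cap) return -1;
        dst[n++] = '"';
    }
    if (n >= cap) return -1;
    dst[n] = '\0';
    return (int)n;
}
```

SSIDs are arbitrary byte strings of up to 32 bytes, so the SSID column is where escaping matters most. The numeric columns never need quoting.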

Definition of Done

  • GPS fix displayed and updated every second.
  • CSV file opens cleanly and matches, or converts to, WiGLE format.
  • Scan rate remains stable for 30 minutes.
  • UI shows live counters and last SSID.

Project 6: Custom Application Launcher and Mini-OS

  • Main Programming Language: C/C++ (ESP-IDF recommended)
  • Difficulty: Expert
  • Time estimate: 1 month+
  • Knowledge Area: OS Concepts / UI Frameworks
  • Main Book: “Operating Systems: Three Easy Pieces”

What you will build: A custom launcher that boots into a menu, loads apps from microSD, manages shared services (WiFi, display, input), and provides a consistent UI framework.

Why it teaches Cardputer mastery: It forces you to design a micro-platform with stable APIs, memory discipline, and app lifecycle management.

Core challenges you will face:

  • Boot flow and app registration.
  • Resource arbitration (display, WiFi, storage).
  • Crash recovery without full reboot.
  • Stable UI framework and app APIs.

Architecture sketch:

Boot -> Launcher -> App Manager -> Shared Services
                         |             |
                         +-> Apps <----+

Real World Outcome

What you will see:

  1. A boot menu that launches apps without reboot.
  2. Shared services that persist across apps.
  3. Settings stored across power cycles.

On-screen example:

CARDPUTER OS
[1] WiFi Tools
[2] IR Remote
[3] Spectrum
[4] Settings

Lifecycle log example:

I (5100) os: app=WiFiTools init
I (5120) os: app=WiFiTools exit reason=user_back
I (5121) os: app=IRRemote init

The Core Question You Are Answering

“How do you build a stable platform on bare metal where multiple apps can coexist safely?”

Concepts You Must Understand First

  1. Memory layout and allocation
    • How do you prevent fragmentation?
    • Book Reference: Effective C - Ch. 7-9.
  2. Event loops and message buses
    • How do apps communicate without tight coupling?
    • Book Reference: Design Patterns - Observer/Mediator.
  3. Crash recovery
    • How do you recover without a full reboot?
    • Book Reference: Making Embedded Systems - Ch. 9.

Questions to Guide Your Design

  1. What is your app ABI (init/loop/exit)?
  2. How do apps request shared services?
  3. How do you protect against runaway apps?

Thinking Exercise

Design the minimal API a third-party app needs to draw, read keys, and save settings.
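
One possible starting point for this exercise is a table entry with three callbacks and a runner that bounds how long an app may hold control. Everything in this sketch is an assumption for illustration: the type names, the opaque `services` pointer, and the step budget that stands in for a real watchdog. The demo app exists only so the runner can be exercised on a host.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { APP_CONTINUE = 0, APP_EXIT = 1 } app_status_t;

/* Hypothetical app ABI: init once, loop in bounded steps, exit once. */
typedef struct {
    const char  *name;
    int          (*init)(void *services);   /* returns 0 on success */
    app_status_t (*loop)(void *services);   /* one short, non-blocking step */
    void         (*exit)(void *services);   /* release everything acquired */
} app_entry_t;

/* Run one app until it asks to exit or uses up max_steps.
   Returns the number of steps executed, or -1 if init failed.
   exit is always called after a successful init. */
static int app_run(const app_entry_t *app, void *services, int max_steps) {
    int steps = 0;
    assert(app != NULL && app->init != NULL && app->loop != NULL && app->exit != NULL);
    if (app->init(services) != 0) return -1;
    while (steps < max_steps) {
        steps++;
        if (app->loop(services) == APP_EXIT) break;
    }
    app->exit(services);
    return steps;
}

/* Demo app: counts loop steps and exits after three. */
static int demo_steps;
static int demo_init(void *s) { (void)s; demo_steps = 0; return 0; }
static app_status_t demo_loop(void *s) {
    (void)s;
    return ++demo_steps >= 3 ? APP_EXIT : APP_CONTINUE;
}
static void demo_exit(void *s) { (void)s; demo_steps = -1; }
static const app_entry_t demo_app = { "demo", demo_init, demo_loop, demo_exit };
```

Calling `app_run(&demo_app, NULL, 100)` returns after three steps. With a budget of 2 the runner stops the app early and still calls its exit hook, which is the behavior the launcher needs for runaway apps.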

The Interview Questions They Will Ask

  1. How do you handle memory fragmentation on embedded systems?
  2. What is a service locator vs dependency injection?
  3. How do you recover from a crashed app?
  4. How would you sandbox apps without an MMU?

Hints in Layers

Hint 1: Simple menu. Start with hardcoded apps and a menu list.

Hint 2: Shared services struct. Provide a struct with display/input/storage handles.

Hint 3: App metadata. Load app metadata from SD and build a registry.

Hint 4: Watchdog. Add a watchdog that returns to launcher on crash.

Books That Will Help

Topic           | Book                    | Chapter
Scheduling      | OSTEP                   | Scheduling + Concurrency
UI architecture | Design Patterns         | MVC + Observer
Resource mgmt   | Making Embedded Systems | Ch. 9

Common Pitfalls & Debugging

Problem: “Apps overwrite each other”

  • Why: Shared globals and no ownership.
  • Fix: Centralize services and force apps through APIs.
  • Quick test: Add heap guards and check for corruption.

Problem: “Launcher locks up”

  • Why: An app never returns control.
  • Fix: Run apps in cooperative loops with timeouts.
  • Quick test: Add a watchdog and log stalls.

Definition of Done

  • Launcher boots in <2 seconds.
  • Apps start/exit without memory leaks.
  • Shared services prevent conflicts.
  • Settings persist across power cycles.

Project 7: Bluetooth HID Keyboard/Mouse Injector

  • Main Programming Language: C/C++ (ESP-IDF or Arduino)
  • Difficulty: Intermediate
  • Time estimate: 1-2 weeks
  • Knowledge Area: BLE / HID
  • Main Book: “Getting Started with Bluetooth Low Energy”

What you will build: A BLE keyboard/mouse emulator that pairs with laptops, phones, and tablets. It can type, control presentations, and execute macros from the Cardputer keyboard.

Why it teaches Cardputer mastery: It combines keyboard scanning, BLE GATT design, bonding, and user-facing UI workflows.

Core challenges you will face:

  • Implementing HID over GATT.
  • Pairing and reconnection logic.
  • Macro and key mapping design.

Architecture sketch:

Keyboard -> macro engine -> HID reports -> BLE GATT -> host

Real World Outcome

What you will see:

  1. One-time pairing and automatic reconnection.
  2. Macros triggered from Cardputer keyboard.
  3. Low-latency typing on a host device.

On-screen example:

BLE HID
Status: Connected
Device: Laptop
[F1] Open Terminal
[F2] Lock Screen

Pairing log example:

I (6200) ble: connected addr=AA:BB:CC:DD:EE:FF
I (6201) ble: bonded=YES keys=stored
I (6202) hid: report sent (keys=2)

The Core Question You Are Answering

“How do you build reliable HID over BLE with stable pairing and low latency?”

Concepts You Must Understand First

  1. GATT services and characteristics
    • How do you expose the HID service?
    • Book Reference: Getting Started with BLE - Ch. 4-6.
  2. HID report format
    • How do you encode key presses and mouse movement?
    • Book Reference: Effective C - Ch. 11.
  3. Bonding storage
    • How do you persist pairing keys?
    • Book Reference: Making Embedded Systems - Ch. 8.

Questions to Guide Your Design

  1. How do you handle disconnections gracefully?
  2. What connection interval gives acceptable latency?
  3. How do you avoid ghost key repeats?

Thinking Exercise

Draw the GATT service tree for a BLE keyboard, including Report Map and Report characteristics.

The Interview Questions They Will Ask

  1. What is the difference between BLE and Classic Bluetooth?
  2. How does HID over GATT differ from USB HID?
  3. Why is bonding important for reconnection?
  4. How do you prevent key repeat bugs?

Hints in Layers

Hint 1: Basic BLE keyboard. Start with a known BLE HID example and send a single key.

Hint 2: Key mapping. Map Cardputer keycodes to HID keycodes.

Hint 3: Bonding. Store bonding data in NVS and test reconnection.

Hint 4: Macro engine. Add delays and sequences for complex macros.

Books That Will Help

Topic               | Book                     | Chapter
BLE GATT            | Getting Started with BLE | Ch. 4-6
Event-driven design | Making Embedded Systems  | Ch. 4
State handling      | Effective C              | Ch. 12

Common Pitfalls & Debugging

Problem: “Pairs but drops after 30 seconds”

  • Why: Wrong connection parameters or supervision timeout.
  • Fix: Adjust connection interval, latency, and supervision timeout.
  • Quick test: Log BLE disconnect reasons.
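
A quick sanity check on requested connection parameters can catch this class of drop before it reaches a host. The sketch below applies the BLE rule that the supervision timeout must exceed (1 + latency) × interval × 2, working in spec units (1.25 ms for the interval, 10 ms for the timeout). The struct and function names are hypothetical and not tied to any BLE stack.

```c
#include <assert.h>
#include <stdint.h>

/* BLE connection parameters in specification units. */
typedef struct {
    uint16_t interval_max;  /* units of 1.25 ms, valid range 6..3200 */
    uint16_t latency;       /* peripheral latency in events, 0..499 */
    uint16_t timeout;       /* supervision timeout, units of 10 ms, 10..3200 */
} ble_conn_params_t;

/* Returns 1 when the parameters are in range and mutually consistent. */
static int ble_conn_params_ok(const ble_conn_params_t *p) {
    assert(p != 0);
    if (p->interval_max < 6 || p->interval_max > 3200) return 0;
    if (p->latency > 499) return 0;
    if (p->timeout < 10 || p->timeout > 3200) return 0;
    /* timeout*10 ms > (1+latency) * interval*1.25 ms * 2
       simplifies to: timeout*4 > (1+latency) * interval */
    return (uint32_t)p->timeout * 4u >
           (uint32_t)(1u + p->latency) * (uint32_t)p->interval_max;
}
```

For example, a 30 ms interval (24 units) with zero latency and a 4 s timeout (400 units) passes. A 1 s interval with latency 4 and a 5 s timeout fails, because the timeout would need to exceed 10 s.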

Problem: “Keys repeat unexpectedly”

  • Why: Missing key release events.
  • Fix: Always send key-up reports.
  • Quick test: Log reports to serial.

Definition of Done

  • Stable reconnection across 10 power cycles.
  • Macros execute without skipped keys.
  • UI shows connected device name.
  • All keys mapped and debounced.

Project 8: Complete Cardputer Security Toolkit (Capstone)

  • Main Programming Language: C/C++ (ESP-IDF)
  • Difficulty: Master
  • Time estimate: 2-3 months
  • Knowledge Area: Systems Integration / Embedded Product
  • Main Book: “Making Embedded Systems”

What you will build: A unified firmware that combines WiFi tools, BLE HID, IR control, USB HID, GPS mapping, and audio spectrum into a single system with a consistent UI and settings.

Deep-dive guide: project_based_ideas/HARDWARE_EMBEDDED/M5STACK_CARDPUTER_LEARNING_PROJECTS/P08-complete-cardputer-security-toolkit.md

Core challenges you will face:

  • Memory pressure across multiple features.
  • Unified UI/UX across tools.
  • Power management with multiple radios.
  • OTA update and safe boot strategy.

Real World Outcome

What you will see:

  1. A home screen that switches between tools instantly.
  2. Persistent settings across all modes.
  3. A single SD card with organized logs per tool.

On-screen example:

CARDPUTER TOOLKIT
[1] WiFi Sniffer
[2] Wardrive
[3] BLE HID
[4] USB HID
[5] IR Remote
[6] Spectrum

System status log:

I (7100) toolkit: mode=WiFiSniffer heap=152KB sd=OK batt=74%
I (7101) toolkit: switch to BLEHID ok (wifi paused)

The Core Question You Are Answering

“How do you integrate multiple complex subsystems into one stable, portable product?”

Concepts You Must Understand First

  1. Resource arbitration
    • How do you prevent two tools from using WiFi at once?
    • Book Reference: Making Embedded Systems - Ch. 9.
  2. Persistent settings
    • How do you store and migrate configuration?
    • Book Reference: Effective C - Ch. 8-9.
  3. Power management
    • How do you keep battery life reasonable?
    • Book Reference: Making Embedded Systems - Ch. 10.

Questions to Guide Your Design

  1. What is your global state model?
  2. How do you switch modes without rebooting?
  3. What is your crash recovery strategy?

Thinking Exercise

Design a global service registry that each tool uses for WiFi, storage, and UI.

The Interview Questions They Will Ask

  1. How do you manage memory across multiple subsystems?
  2. What is your strategy for OTA updates?
  3. How do you enforce consistent UI patterns?
  4. How do you implement safe mode on boot failure?

Hints in Layers

Hint 1: Shared services. Create a global struct of services (display, input, storage).

Hint 2: App registry. Use a table of apps with init/loop/exit callbacks.

Hint 3: Central settings. Store settings in NVS and expose a settings UI.

Hint 4: Safe boot. Add a boot counter and fallback to safe mode on crashes.
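
The boot counter in Hint 4 reduces to two small operations on a persisted integer. In the sketch below the counter is passed in by pointer so the logic can run on a host. On the device it would be loaded from and saved to persistent storage such as NVS. The threshold of 3 and all names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define SAFE_BOOT_THRESHOLD 3u   /* assumed: third unhealthy boot enters safe mode */

typedef enum { BOOT_NORMAL = 0, BOOT_SAFE_MODE = 1 } boot_mode_t;

/* Call first thing at boot with the persisted counter. The counter tracks
   consecutive boots that never reached the healthy mark. */
static boot_mode_t boot_begin(uint32_t *counter) {
    assert(counter != 0);
    if (*counter < UINT32_MAX) (*counter)++;
    return *counter >= SAFE_BOOT_THRESHOLD ? BOOT_SAFE_MODE : BOOT_NORMAL;
}

/* Call once the system has run long enough to be considered healthy. */
static void boot_mark_healthy(uint32_t *counter) {
    assert(counter != 0);
    *counter = 0;
}
```

The incremented counter has to be written back before any tool starts. Otherwise a crash early in boot never gets counted.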

Books That Will Help

Topic               | Book                    | Chapter
Resource management | Making Embedded Systems | Ch. 9
Defensive coding    | Effective C             | Ch. 7-9
Architecture        | Clean Architecture      | Ch. 1-5

Common Pitfalls & Debugging

Problem: “Switching modes corrupts state”

  • Why: Shared global state without reset logic.
  • Fix: Add explicit teardown and reinit per tool.
  • Quick test: Cycle modes 50 times and watch for leaks.

Problem: “Battery drains too fast”

  • Why: Radios or display stay active.
  • Fix: Add power modes and auto-dimming.
  • Quick test: Measure current draw in each mode.

Definition of Done

  • All tool modes switch without reboot.
  • Shared configuration works across tools.
  • Battery runtime exceeds 2 hours in mixed use.
  • Stable OTA update workflow.

References (official docs and key sources)

M5Stack Cardputer docs: https://docs.m5stack.com/en/products/sku/K132
M5Stack Cardputer-Adv (K132-Adv) specs: https://shop.m5stack.com/products/m5stack-cardputer-adv-version-esp32-s3
ESP-IDF WiFi programming guide (promiscuous mode): https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/api-guides/wifi.html
ESP-IDF WiFi API reference (promiscuous callbacks/filters): https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/api-reference/network/esp_wifi.html
ESP-IDF USB device stack: https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/api-reference/peripherals/usb_device.html
ESP-IDF USB console/DFU notes (GPIO19/20, endpoint limit, secure boot caveats): https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/api-reference/system/usb-serial-jtag-console.html
ESP-IDF I2S driver: https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/api-reference/peripherals/i2s.html
TinyUSB documentation: https://docs.tinyusb.org/
Bluetooth HID over GATT Profile (HOGP): https://www.bluetooth.com/specifications/specs/hid-over-gatt-profile/
Wireshark PCAP file format: https://wiki.wireshark.org/Development/LibpcapFileFormat
NMEA 0183 sentence reference: https://resources.arcgis.com/en/help/arcpad/10.2/app/00s1/00s10000015p000000.htm
NEC IR timing reference: https://www.electronicdesign.com/technologies/communications/iot/article/55331179/renesas-electronics-using-the-slg47011-to-implement-the-nec-infrared-protocol
IoT Analytics device count 2024: https://iot-analytics.com/number-of-connected-iot-devices-2024/
IoT Analytics device count 2025: https://iot-analytics.com/number-connected-iot-devices/
Ericsson Mobility Report IoT forecast: https://www.ericsson.com/en/reports-and-papers/mobility-report/dataforecasts/iot-connections-outlook