Project 4: Logic Analyzer (PIO + DMA)

Build a multi-channel logic analyzer that samples GPIO with PIO, streams to a host, and decodes UART/I2C/SPI frames.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Level 3: Advanced |
| Time Estimate | 2-3 weeks |
| Main Programming Language | C (Pico SDK + PIO asm) |
| Alternative Programming Languages | Python (host decoder) |
| Coolness Level | Level 4: Professional Tool |
| Business Potential | 4. The “Instrumentation” Tier |
| Prerequisites | GPIO, DMA basics, UART/I2C/SPI fundamentals |
| Key Topics | PIO sampling, DMA streaming, trigger logic, protocol decoding |

1. Learning Objectives

By completing this project, you will:

  1. Program PIO to sample multiple pins at a deterministic rate.
  2. Use DMA to stream samples into a ring buffer without CPU gaps.
  3. Implement trigger logic and aligned captures.
  4. Decode UART/I2C/SPI waveforms on a host.
  5. Diagnose dropped samples and timing errors.

2. All Theory Needed (Per-Concept Breakdown)

2.1 PIO Timing and Sampling

Fundamentals

PIO is a programmable I/O engine that executes small instruction sequences deterministically. Each state machine can read pins at a fixed rate independent of CPU load. Sampling works by shifting pin states into the input shift register (ISR) and pushing its contents to the RX FIFO, which DMA then drains. Because each state machine runs from a configurable clock divider, you can control the sample rate precisely.

Deep Dive into the concept

PIO instructions execute in a fixed number of cycles, and the instruction clock is the system clock divided by a configurable divider. This makes sampling deterministic: with a 10 MHz PIO clock and one in pins, N executed per cycle, you get 10 million samples per second. An explicit push costs an extra cycle per sample, so high-rate capture normally enables autopush, which moves the input shift register (ISR) into the RX FIFO once a bit threshold is reached without spending an instruction. The FIFO is only four words deep (eight if the unused TX FIFO is joined to it), so DMA must keep up. The data width depends on how many pins you sample; a common strategy is to sample 8 pins per in instruction and let autopush pack four such samples into each 32-bit word. Pin placement matters: in pins, N reads N consecutive GPIOs starting at the configured input base pin, so contiguous pin groups are easiest. The RP2040 PIO also supports side-set and pin direction changes, but for sampling you primarily use in and push. Timing accuracy depends on a stable system clock; if you change clocks at runtime, the sample rate changes with them. PIO thus provides the cycle-accurate foundation for a logic analyzer, but you must design the data path so that the FIFO never overflows.
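The divider arithmetic above can be checked on the host before touching firmware. The following is a minimal Python sketch, assuming the RP2040's 16.8 fixed-point divider format (16 integer bits, 8 fractional bits); the function name `pio_clkdiv` and the `cycles_per_sample` parameter are illustrative, not part of the Pico SDK.

```python
def pio_clkdiv(sys_clk_hz, sample_rate_hz, cycles_per_sample=1):
    """Return (int_part, frac_part, actual_rate_hz) for a 16.8 fixed-point divider.

    cycles_per_sample is the number of PIO cycles the capture loop spends
    per sample (1 for a single wrapped `in pins, N` with autopush).
    """
    ideal = sys_clk_hz / (sample_rate_hz * cycles_per_sample)
    fixed = round(ideal * 256)  # quantize to steps of 1/256
    if not 256 <= fixed <= 65535 * 256 + 255:
        raise ValueError("divider out of range for a 16.8 fixed-point register")
    int_part, frac_part = divmod(fixed, 256)
    actual = sys_clk_hz / ((fixed / 256) * cycles_per_sample)
    return int_part, frac_part, actual


if __name__ == "__main__":
    # 125 MHz system clock, 10 MHz target, one cycle per sample
    print(pio_clkdiv(125e6, 10e6))
```

A non-zero fractional part means the PIO clock period alternates between two integer lengths, so integer dividers are preferable when sample-to-sample jitter matters.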

How this fits on projects

PIO sampling is the core of §3.2 and §5.10 Phase 1.

Definitions & key terms

  • PIO -> Programmable I/O engine
  • State machine -> independent PIO execution unit
  • FIFO -> buffer between PIO and DMA
  • clkdiv -> PIO clock divider

Mental model diagram (ASCII)

GPIO pins -> PIO IN -> RX FIFO -> DMA -> SRAM buffer

How it works (step-by-step)

  1. Configure PIO program to read pins.
  2. Set clock divider for sample rate.
  3. Push samples into FIFO.
  4. DMA drains FIFO into buffer.

Minimal concrete example

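; Assumes the C setup enables autopush with a 32-bit threshold,
; so four 8-bit samples are packed per FIFO word with no explicit push.
.program capture
.wrap_target
  in pins, 8      ; read 8 consecutive pins, one sample per PIO cycle
.wrap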
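When four 8-bit samples are packed into each 32-bit FIFO word, as the Deep Dive describes, the consumer has to unpack them in capture order. The sketch below assumes autopush at a 32-bit threshold; with the ISR shifting right the oldest sample ends up in the least significant bits, and with the ISR shifting left it ends up in the most significant bits. The function name `unpack_samples` is hypothetical.

```python
def unpack_samples(words, bits_per_sample=8, shift_right=True):
    """Split 32-bit FIFO words into samples, oldest sample first.

    Assumes bits_per_sample divides 32 and that autopush fires at 32 bits.
    """
    if 32 % bits_per_sample:
        raise ValueError("bits_per_sample must divide 32")
    per_word = 32 // bits_per_sample
    mask = (1 << bits_per_sample) - 1
    samples = []
    for word in words:
        for i in range(per_word):
            # shift right: oldest sample sits in the lowest bits
            # shift left:  oldest sample sits in the highest bits
            pos = i if shift_right else per_word - 1 - i
            samples.append((word >> (pos * bits_per_sample)) & mask)
    return samples


if __name__ == "__main__":
    print([hex(s) for s in unpack_samples([0x44332211])])
```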

Common misconceptions

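  • “PIO can sample any pins” -> in pins, N reads N consecutive GPIOs from the configured base pin, so contiguous pin groups are easiest; scattered pins need extra bit shuffling afterwards.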
  • “PIO is a CPU” -> it is deterministic but limited to simple instructions.

Check-your-understanding questions

  1. What determines the sample rate in PIO?
  2. Why can FIFO overflow even if the CPU is idle?
  3. Why is contiguous pin mapping easier?

Check-your-understanding answers

  1. The system clock divided by the PIO clkdiv, then divided by the number of PIO cycles each sample takes.
  2. Because PIO can produce data faster than DMA drains it.
  3. PIO reads a block of pins with a single instruction.

Real-world applications

  • Logic analyzers, waveform generators, protocol sniffers.

Where you’ll apply it

  • §3.2 (PIO capture requirement) and §5.10 Phase 1.

References

  • RP2040 Datasheet: PIO chapter

Key insights

PIO gives you deterministic timing; everything else is about keeping up with its data.

Summary

Configure PIO for stable sampling and design DMA to prevent FIFO overflow.

Homework/Exercises to practice the concept

  1. Calculate required PIO clock divider for 5 MHz sampling.
  2. Determine how many samples fit in the FIFO before overflow.

Solutions to the homework/exercises

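  1. divider = sysclk / (5e6 × cycles per sample). With a 125 MHz system clock and a one-cycle sample loop, 125e6 / 5e6 = 25.
  2. The RX FIFO is 4 words deep (8 if the TX FIFO is joined to it); with four 8-bit samples packed per 32-bit word, it holds 16 samples (32 when joined).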
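The FIFO capacity translates directly into how long DMA may stall before samples are lost. A rough Python sketch, assuming autopush packing and ignoring the partially filled ISR; `fifo_headroom_s` is an illustrative name:

```python
def fifo_headroom_s(sample_rate_hz, bits_per_sample=8, joined=False):
    """Seconds of capture the RX FIFO can absorb while DMA is stalled."""
    depth_words = 8 if joined else 4          # RP2040 RX FIFO depth in words
    samples_per_word = 32 // bits_per_sample  # samples packed per 32-bit word
    return depth_words * samples_per_word / sample_rate_hz


if __name__ == "__main__":
    print(f"{fifo_headroom_s(10e6) * 1e6:.1f} us at 10 MHz")
    print(f"{fifo_headroom_s(10e6, joined=True) * 1e6:.1f} us with joined FIFOs")
```

At 10 MHz the margin is on the order of microseconds, which is why the FIFO can overflow even when the CPU is idle: only the DMA service rate matters.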

2.2 DMA Streaming and Ring Buffers

Fundamentals

DMA moves captured samples into RAM. A ring buffer provides continuous capture by wrapping around memory. The CPU consumes blocks and streams them to the host while DMA keeps filling new blocks.

Deep Dive into the concept

High-speed logic capture is a producer/consumer system. PIO is the producer, DMA is the transport, and the CPU plus the USB host is the consumer. A ring buffer lets the producer run continuously without waiting. You typically divide RAM into fixed-size blocks (e.g., 4 KB each). DMA fills block 0, raises an interrupt, then fills block 1, while the CPU streams completed blocks to the host. If the CPU or USB link is too slow, DMA wraps around and overwrites data that has not been read yet. You detect this by tracking the write and read indices. On overrun you can either drop old data (best for live view) or stop the capture (best for debugging). The ring size must balance memory use and latency: larger buffers reduce overrun risk but increase capture-to-display delay. The RP2040 has only 264 KB of SRAM, so buffer design is a core architecture decision.

How this fits on projects

DMA streaming is required for §3.2 and §5.10 Phase 2; overrun handling is part of §6.

Definitions & key terms

  • Producer/consumer -> data pipeline model
  • Overrun -> producer overwrites unread data
  • Block size -> chunk size for DMA transfers

Mental model diagram (ASCII)

[Block0][Block1][Block2][Block3]
   ^read            ^write

How it works (step-by-step)

  1. DMA writes samples into current block.
  2. On completion, IRQ fires.
  3. CPU queues block for USB streaming.
  4. DMA advances to next block.

Minimal concrete example

if (next_write == read_idx) overrun++;
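The index check above can be exercised as a small host-side model before it is written in C. This Python sketch assumes a drop-oldest policy on overrun; the class name `BlockRing` and its methods are hypothetical and only mirror the write/read index bookkeeping described in this section.

```python
class BlockRing:
    """Model of a block ring: DMA fills write_idx, the CPU drains read_idx."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.write_idx = 0  # block DMA is currently filling
        self.read_idx = 0   # oldest completed block not yet consumed
        self.overruns = 0

    def on_dma_complete(self):
        """Producer side: call from the DMA completion IRQ."""
        next_write = (self.write_idx + 1) % self.num_blocks
        if next_write == self.read_idx:
            # DMA is about to overwrite the oldest unread block.
            self.overruns += 1
            self.read_idx = (self.read_idx + 1) % self.num_blocks  # drop oldest
        self.write_idx = next_write

    def pop_block(self):
        """Consumer side: index of the next completed block, or None if empty."""
        if self.read_idx == self.write_idx:
            return None
        idx = self.read_idx
        self.read_idx = (self.read_idx + 1) % self.num_blocks
        return idx


if __name__ == "__main__":
    ring = BlockRing(4)
    for _ in range(4):          # consumer never runs: one overrun expected
        ring.on_dma_complete()
    print(ring.overruns, ring.pop_block())
```

For the stop-on-overrun policy, the same check would halt the DMA channel instead of advancing `read_idx`.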

Common misconceptions

  • “Ring buffer means no loss” -> you can still overrun if consumer is slow.
  • “Bigger buffer always better” -> it increases latency.

Check-your-understanding questions

  1. What is the trade-off of increasing block size?
  2. Why do we need a read index and write index?
  3. What happens if USB disconnects mid-capture?

Check-your-understanding answers

  1. Fewer interrupts but higher latency.
  2. To detect overwrites and manage data flow.
  3. The consumer stops; DMA may overrun unless capture stops.

Real-world applications

  • Audio streaming, data loggers, oscilloscope capture.

Where you’ll apply it

  • §3.2 (DMA streaming requirement), §5.10 Phase 2, and overrun handling in §6.

References

  • RP2040 Datasheet: DMA chapter

Key insights

Continuous capture is a throughput problem, not a CPU problem.

Summary

Ring buffers enable sustained capture, but you must detect and handle overruns.

Homework/Exercises to practice the concept

  1. Calculate buffer size needed for 1 second at 10 MHz with 8 channels.
  2. Estimate USB bandwidth required for 10 MHz x 8 bits.

Solutions to the homework/exercises

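  1. 10e6 samples * 1 byte = 10 MB. This far exceeds the RP2040's 264 KB of SRAM, so a full second cannot be held on-chip.
  2. 10 MB/s (80 Mbit/s), which is beyond typical USB serial and beyond the 12 Mbit/s of the RP2040's full-speed USB port.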
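The same arithmetic generalizes to any rate and channel count. A minimal Python sketch, assuming samples are padded to whole bytes and using the raw 12 Mbit/s USB full-speed signalling rate as an optimistic upper bound (real throughput is lower); the function name and the 200 KB buffer in the example are illustrative.

```python
USB_FS_BITS_PER_S = 12_000_000  # USB full-speed signalling rate (upper bound)


def capture_budget(sample_rate_hz, channels, buffer_bytes):
    """Data rate, seconds of capture that fit in RAM, and a USB sanity check."""
    bytes_per_sample = (channels + 7) // 8        # pad to whole bytes
    bytes_per_s = sample_rate_hz * bytes_per_sample
    return {
        "bytes_per_s": bytes_per_s,
        "buffer_seconds": buffer_bytes / bytes_per_s,
        "fits_usb_fs": bytes_per_s * 8 <= USB_FS_BITS_PER_S,
    }


if __name__ == "__main__":
    # Hypothetical 200 KB capture buffer, 8 channels at 10 MHz
    print(capture_budget(10_000_000, 8, 200 * 1024))
```

With these assumptions a 200 KB buffer holds roughly 20 ms at 10 MHz, which is why high-rate captures are usually triggered and bounded rather than streamed live.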

2.3 Protocol Decoding (UART/I2C/SPI)

Fundamentals

Protocol decoding translates raw bit transitions into structured frames. UART is asynchronous with start/stop bits, I2C uses open-drain with address bytes and ACK, and SPI uses a clock with MOSI/MISO lines. Correct decoding requires accurate sampling and timing thresholds.

Deep Dive into the concept

Decoding begins with edge detection. For UART, you detect a falling edge (start bit), then sample the line at the middle of each bit period. You must estimate baud rate or allow the user to configure it. I2C decoding requires detecting START and STOP conditions (SDA changes while SCL is high), then sampling SDA on each SCL rising edge. SPI decoding depends on CPOL/CPHA; you must sample on the correct edge and interpret chip select. Because your logic analyzer samples at a fixed rate, you reconstruct protocol timing by finding transitions and calculating bit periods. For accurate decoding, your sample rate should be at least 4-10x the signal frequency. If it’s too low, edges may be missed and decoding fails. Therefore, decoding quality depends on sample rate selection and trigger alignment.

How this fits on projects

Protocol decoding is required for §3.2 and validated in §3.7.2 demos.

Definitions & key terms

  • Start/Stop -> UART framing bits
  • ACK/NACK -> I2C acknowledgment
  • CPOL/CPHA -> SPI clock polarity/phase

Mental model diagram (ASCII)

UART: idle=1, start=0, data bits, stop=1
I2C: SDA change while SCL high = START/STOP
SPI: sample on defined clock edge while CS low

How it works (step-by-step)

  1. Extract edges from sampled buffer.
  2. Identify protocol-specific framing.
  3. Sample data bits at correct phase.
  4. Emit decoded frames.

Minimal concrete example

# UART decode: t0 = falling edge of the start bit; sample data bit k at t0 + (1.5 + k) bit periods
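The mid-bit sampling rule can be turned into a small decoder over a list of sampled line levels. This is a sketch under simple assumptions: 8N1 framing by default, LSB-first data, a user-supplied baud rate, and a single-channel list of 0/1 values. The function name `decode_uart` is hypothetical and not necessarily how `host/decode.py` is organized.

```python
def decode_uart(samples, sample_rate_hz, baud, data_bits=8):
    """Decode UART frames from 0/1 line samples.

    Returns a list of (start_index, value) for frames with a valid stop bit.
    """
    spb = sample_rate_hz / baud  # samples per bit (may be fractional)
    frames = []
    n = len(samples)
    i = 1
    while i < n:
        if samples[i - 1] == 1 and samples[i] == 0:   # start-bit falling edge
            stop_idx = i + int((data_bits + 1.5) * spb)  # centre of stop bit
            if stop_idx >= n:
                break                                  # frame is truncated
            value = 0
            for bit in range(data_bits):
                mid = i + int((1.5 + bit) * spb)       # centre of data bit
                value |= samples[mid] << bit           # LSB first
            if samples[stop_idx] == 1:                 # valid stop bit
                frames.append((i, value))
                i = stop_idx + 1
                continue
            # framing error: fall through and resynchronize on the next edge
        i += 1
    return frames


if __name__ == "__main__":
    # One 0x55 frame at exactly 10 samples per bit, with idle before and after
    bits = [1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1]
    wave = [level for level in bits for _ in range(10)]
    print([hex(v) for _, v in decode_uart(wave, 1_000_000, 100_000)])
```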

Common misconceptions

  • “UART is self-timing” -> you still must know baud rate.
  • “I2C is just two wires” -> timing and ACK bits are critical.

Check-your-understanding questions

  1. Why sample UART in the middle of a bit?
  2. How do you detect I2C START?
  3. What happens if CPHA is wrong for SPI?

Check-your-understanding answers

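  1. The middle of the bit is as far as possible from both edges, so edge jitter and a small baud-rate mismatch do not change the sampled value.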
  2. SDA falling while SCL is high.
  3. You sample on the wrong edge and decode garbage.
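The CPOL/CPHA rule reduces to one decision: sample on the rising SCLK edge when CPOL equals CPHA (modes 0 and 3), otherwise on the falling edge (modes 1 and 2). The Python sketch below decodes one data line under that rule, assuming active-low chip select and MSB-first words; `decode_spi` is an illustrative name.

```python
def decode_spi(sclk, data, cs, cpol=0, cpha=0, bits=8):
    """Decode MSB-first words from one SPI data line while CS is low."""
    sample_on_rising = (cpol == cpha)   # modes 0 and 3 sample on rising edges
    words, shift, count = [], 0, 0
    for i in range(1, len(sclk)):
        if cs[i]:                       # CS high: bus idle, drop partial word
            shift, count = 0, 0
            continue
        rising = sclk[i - 1] == 0 and sclk[i] == 1
        falling = sclk[i - 1] == 1 and sclk[i] == 0
        if (rising and sample_on_rising) or (falling and not sample_on_rising):
            shift = (shift << 1) | data[i]
            count += 1
            if count == bits:
                words.append(shift)
                shift, count = 0, 0
    return words


if __name__ == "__main__":
    # Mode 0 transfer of 0xA5: data is stable around each rising edge
    sclk, mosi, cs = [0, 0], [0, 0], [1, 1]
    for k in range(7, -1, -1):
        b = (0xA5 >> k) & 1
        sclk += [0, 0, 1, 1]
        mosi += [b] * 4
        cs += [0] * 4
    print([hex(w) for w in decode_spi(sclk, mosi, cs)])
```

Running the same waveform with the wrong `cpha` samples on the opposite edge and no longer yields the transmitted byte.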

Real-world applications

  • Bus debugging, reverse engineering, firmware validation.

Where you’ll apply it

  • §3.2 (host decoding requirement) and the demos in §3.7.2.

References

  • “The Book of I2C” Ch. 1-4
  • UART and SPI sections in common MCU datasheets

Key insights

Decoding is a timing problem first, a parsing problem second.

Summary

Accurate decoding requires correct timing, edges, and protocol framing rules.

Homework/Exercises to practice the concept

  1. Decode a UART byte from a waveform by hand.
  2. Identify an I2C START and STOP on a timing diagram.

Solutions to the homework/exercises

  1. Sample mid-bit and reconstruct data bits.
  2. START: SDA falls while SCL high; STOP: SDA rises while SCL high.
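The START/STOP rule and per-byte sampling can be combined into a compact decoder. This is a minimal Python sketch, assuming two equal-length lists of 0/1 samples for SCL and SDA and no glitch filtering; the event tuple layout and the name `decode_i2c` are assumptions made for illustration.

```python
def decode_i2c(scl, sda):
    """Return events: ("START", i), ("BYTE", value, acked), ("STOP", i)."""
    events = []
    active = False
    shift, count = 0, 0
    for i in range(1, len(scl)):
        scl_high = scl[i - 1] == 1 and scl[i] == 1
        if scl_high and sda[i - 1] == 1 and sda[i] == 0:
            events.append(("START", i))       # SDA falls while SCL is high
            active, shift, count = True, 0, 0
        elif scl_high and sda[i - 1] == 0 and sda[i] == 1:
            events.append(("STOP", i))        # SDA rises while SCL is high
            active = False
        elif active and scl[i - 1] == 0 and scl[i] == 1:
            # SCL rising edge: SDA carries a data bit or the ACK bit
            if count < 8:
                shift = (shift << 1) | sda[i]  # MSB first
                count += 1
            else:
                events.append(("BYTE", shift, sda[i] == 0))  # ACK is low
                shift, count = 0, 0
    return events


if __name__ == "__main__":
    scl, sda = [1, 1], [1, 0]                 # START
    for b in [1, 0, 1, 0, 0, 0, 0, 0, 0]:     # 0xA0 followed by ACK
        scl += [0, 1, 1]
        sda += [b, b, b]
    scl += [0, 1, 1]                          # STOP
    sda += [0, 0, 1]
    print(decode_i2c(scl, sda))
```

The first byte after START carries the 7-bit address and the R/W bit; splitting it is left to the caller.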

3. Project Specification

3.1 What You Will Build

A logic analyzer that captures at least 8 digital channels at 10 MHz, streams data to a host, and decodes UART/I2C/SPI frames.

3.2 Functional Requirements

  1. PIO capture: sample GPIO at fixed rate.
  2. DMA streaming: double-buffer capture.
  3. Triggering: edge trigger on a chosen channel.
  4. Host decoding: UART + I2C + SPI decoders.
  5. Export: save captures to a simple CSV or binary file.

3.3 Non-Functional Requirements

  • Performance: no drops for 1 second at 10 MHz.
  • Reliability: overrun counter and clear warnings.
  • Usability: host UI shows channel states and decoded frames.

3.4 Example Usage / Output

[LA] channels=8 rate=10MHz
[DMA] buffer=4096 samples, double-buffered
[CAPTURE] trigger=falling edge on CH0
UART: 0x55 0xAA

3.5 Data Formats / Schemas / Protocols

  • Capture frame: header + raw samples
  • Decoded frame: timestamp + protocol + bytes

3.6 Edge Cases

  • Missing trigger -> show free-running capture.
  • Sample rate too low -> decoding fails with warning.
  • USB disconnect -> capture stops gracefully.

3.7 Real World Outcome

You will see decoded frames from real hardware buses on your PC.

3.7.1 How to Run (Copy/Paste)

cmake .. && make -j4
picotool load -f logic_analyzer.uf2
python3 host/decode.py --protocol uart --baud 115200

3.7.2 Golden Path Demo (Deterministic)

  • Capture a known UART byte stream 0x55 0xAA at 115200 baud.
  • Host decoder prints the expected bytes.

3.7.3 Failure Demo (Bad Input)

  • Scenario: sample rate set to 200 kHz for 1 Mbps UART.
  • Expected result: decoder prints [ERROR] sample rate too low.
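A decoder can reject hopeless captures up front by comparing the sample rate with the signal rate. The Python sketch below assumes the lower bound of the 4-10x oversampling guidance from §2.3 as the default threshold; the function name and the exact wording after the `[ERROR] sample rate too low` prefix are illustrative.

```python
def check_sample_rate(sample_rate_hz, signal_rate_hz, min_ratio=4.0):
    """Return an error string if oversampling is insufficient, else None."""
    ratio = sample_rate_hz / signal_rate_hz
    if ratio < min_ratio:
        return (f"[ERROR] sample rate too low "
                f"({ratio:.1f}x, need >= {min_ratio:.0f}x)")
    return None


if __name__ == "__main__":
    print(check_sample_rate(200_000, 1_000_000))    # the failure demo above
    print(check_sample_rate(10_000_000, 115_200))   # the golden path
```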

3.7.4 If CLI: exact terminal transcript

$ python3 host/decode.py --protocol uart --baud 115200
Frame: 0x55 0xAA

4. Solution Architecture

4.1 High-Level Design

PIO -> FIFO -> DMA -> Ring Buffer -> USB -> Host Decoder -> UI

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| PIO Program | Sample pins | 8-channel pack |
| DMA Engine | Move data | Double-buffering |
| Trigger Logic | Align capture | Edge trigger |
| Host Decoder | Parse protocols | Python decoders |
| UI | Display traces | Simple timeline view |

4.3 Data Structures (No Full Code)

typedef struct {
  uint32_t magic;
  uint32_t rate_hz;
  uint16_t channels;
  uint16_t count;
} capture_header_t;
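On the host, the header above maps onto a 12-byte little-endian record. The following Python sketch parses one capture frame, assuming no struct padding, one byte per sample (8 channels or fewer), and a hypothetical magic value `0x4C413031`; none of these are fixed by the specification.

```python
import struct

HEADER = struct.Struct("<IIHH")  # magic, rate_hz, channels, count
MAGIC = 0x4C413031               # hypothetical value, spells "LA01"


def pack_frame(rate_hz, channels, samples):
    """Build header + raw samples (one byte per sample assumed)."""
    return HEADER.pack(MAGIC, rate_hz, channels, len(samples)) + bytes(samples)


def parse_frame(data):
    """Return (rate_hz, channels, samples) or raise ValueError."""
    magic, rate_hz, channels, count = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("bad magic: not a capture frame")
    payload = data[HEADER.size:HEADER.size + count]
    if len(payload) != count:
        raise ValueError("truncated capture frame")
    return rate_hz, channels, payload


if __name__ == "__main__":
    frame = pack_frame(10_000_000, 8, [0x01, 0x03, 0x02])
    print(parse_frame(frame))
```

Because `count` is a 16-bit field, a single frame carries at most 65535 samples; longer captures would need several frames or a wider field.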

4.4 Algorithm Overview

Key Algorithm: Capture + Decode

  1. Capture samples into buffer.
  2. Align on trigger index.
  3. Decode protocol based on edges.
  4. Display and export.
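Step 2 can be sketched as a linear scan for the first matching edge on one channel, followed by a slice that keeps a few pre-trigger samples. This Python sketch assumes each sample is an integer whose bit n holds channel n; when no edge is found it falls back to the free-running capture described in §3.6. The function names and the `pre` parameter are hypothetical.

```python
def find_trigger(samples, channel, edge="falling"):
    """Index of the first rising/falling edge on a channel, or None."""
    mask = 1 << channel
    for i in range(1, len(samples)):
        prev = samples[i - 1] & mask
        cur = samples[i] & mask
        if edge == "falling" and prev and not cur:
            return i
        if edge == "rising" and not prev and cur:
            return i
    return None


def align_capture(samples, trigger_idx, pre=2):
    """Slice the capture so it starts `pre` samples before the trigger."""
    if trigger_idx is None:          # missing trigger: free-running capture
        return samples
    return samples[max(0, trigger_idx - pre):]


if __name__ == "__main__":
    capture = [0x01, 0x01, 0x03, 0x02, 0x02, 0x03]
    idx = find_trigger(capture, channel=0, edge="falling")
    print(idx, align_capture(capture, idx))
```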

Complexity Analysis:

  • Time: O(N) for decode
  • Space: O(N) for capture buffer

5. Implementation Guide

5.1 Development Environment Setup

python3 -m venv .venv && source .venv/bin/activate
pip install pyserial

5.2 Project Structure

logic-analyzer/
├── firmware/
│   ├── main.c
│   ├── pio_capture.pio
│   └── dma.c
├── host/
│   ├── decode.py
│   └── ui.py
└── README.md

5.3 The Core Question You’re Answering

“How do you build a deterministic capture engine for digital signals?”

5.4 Concepts You Must Understand First

  1. PIO timing and sampling
  2. DMA ring buffers
  3. Protocol decoding rules

5.5 Questions to Guide Your Design

  1. What sample rate is required for your target protocols?
  2. How large should your capture buffer be?
  3. Will decoding happen on-device or host?

5.6 Thinking Exercise

Compute the minimum sample rate for a 1 MHz SPI clock if you want 8x oversampling.

5.7 The Interview Questions They’ll Ask

  1. Why use PIO instead of GPIO polling?
  2. How do you avoid DMA overruns?
  3. How do you decode UART from sampled edges?

5.8 Hints in Layers

  • Hint 1: Start with single-channel capture.
  • Hint 2: Add DMA streaming into a buffer.
  • Hint 3: Add trigger scan on buffer.
  • Hint 4: Add host decoder for one protocol.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Deterministic IO | “Making Embedded Systems” | Ch. 7 |
| Serial protocols | “The Book of I2C” | Ch. 1-4 |
| Debugging | “The Art of Debugging with GDB” | Ch. 2 |

5.10 Implementation Phases

Phase 1: Foundation (3-5 days)

  • PIO sampling and DMA capture. Checkpoint: raw samples visible in host log.

Phase 2: Core Functionality (5-7 days)

  • Trigger alignment and ring buffer management. Checkpoint: stable triggered capture.

Phase 3: Decoding & UI (4-6 days)

  • Add UART/I2C/SPI decoders. Checkpoint: decoded frames match known patterns.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Decode location | MCU vs host | Host | More CPU and flexibility |
| Trigger type | Edge vs pattern | Edge | Simpler and robust |
| Buffer size | Small vs large | Medium | Balance latency and safety |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit Tests | Decoder correctness | Known UART waveform |
| Integration Tests | PIO+DMA capture | 10 MHz square wave |
| Edge Case Tests | Overrun behavior | Slow host simulation |

6.2 Critical Test Cases

  1. UART decode: 0x55 0xAA at 115200 baud.
  2. I2C decode: known address and data bytes.
  3. Overrun: simulate host pause and verify warning.

6.3 Test Data

UART bit period at 115200 baud ≈ 8.68 us
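At 10 MHz sampling, that bit period is about 86.8 samples, so test waveforms must handle a fractional samples-per-bit ratio without accumulating drift. The Python sketch below synthesizes an 8N1 waveform by mapping each sample index back to a bit slot; the function name `uart_waveform` and the `idle_bits` parameter are hypothetical helpers for unit tests.

```python
def uart_waveform(data, sample_rate_hz, baud, idle_bits=2):
    """Synthesize 0/1 line samples for 8N1 UART bytes (LSB first)."""
    spb = sample_rate_hz / baud                 # samples per bit, fractional
    levels = [1] * idle_bits                    # idle line is high
    for byte in data:
        levels += [0]                           # start bit
        levels += [(byte >> k) & 1 for k in range(8)]
        levels += [1]                           # stop bit
    levels += [1] * idle_bits
    total = int(len(levels) * spb)
    # Map every sample index to its bit slot so rounding never accumulates.
    return [levels[min(int(i / spb), len(levels) - 1)] for i in range(total)]


if __name__ == "__main__":
    wave = uart_waveform([0x55, 0xAA], 10_000_000, 115_200)
    print(len(wave), wave.index(0))
```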

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Wrong pin mapping | Garbage capture | Verify GPIO base pin |
| Sample rate too low | Decode errors | Increase PIO clock |
| USB bottleneck | Drops/overruns | Reduce channels or rate |

7.2 Debugging Strategies

  • Use a known test pattern generator or UART loopback.
  • Toggle a GPIO on DMA completion to verify rate.

7.3 Performance Traps

  • Printing too much debug output reduces capture performance.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add CSV export of raw samples.
  • Add a simple trigger level adjustment.

8.2 Intermediate Extensions

  • Add protocol auto-detection heuristics.
  • Add digital filtering to reduce noise.

8.3 Advanced Extensions

  • Implement hardware triggers with PIO pattern matching.
  • Build a simple GUI with zoom and pan.

9. Real-World Connections

9.1 Industry Applications

  • Hardware debugging: analyze bus traffic in embedded systems.
  • Security research: reverse engineer device protocols.

9.2 Related Open Source Projects

  • sigrok / PulseView for host UI inspiration

9.3 Interview Relevance

  • PIO, DMA, and protocol decoding are advanced embedded topics.

10. Resources

10.1 Essential Reading

  • RP2040 Datasheet: PIO + DMA
  • “Making Embedded Systems” Ch. 7-8

10.2 Video Resources

  • PIO tutorials and logic analyzer build videos

10.3 Tools & Documentation

  • Pico SDK and PIO assembler docs
  • PySerial for host capture

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain PIO timing and clock dividers.
  • I can design a ring buffer without overruns.
  • I can decode UART/I2C/SPI frames.

11.2 Implementation

  • Captures 8 channels at target rate.
  • Trigger and alignment are stable.
  • Host decoder produces correct frames.

11.3 Growth

  • I can describe bandwidth limits and trade-offs.
  • I can extend decoders for new protocols.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Capture and display raw digital waveforms.
  • Decode at least one protocol.

Full Completion:

  • Stable capture at 10 MHz with trigger.
  • Decodes UART, I2C, and SPI.

Excellence (Going Above & Beyond):

  • Hardware trigger and GUI zoom/pan.

13. Additional Content Rules

13.1 Determinism

Fix sample rate, buffer length, and trigger settings in demos. Record firmware version in capture headers.

13.2 Outcome Completeness

  • Success demo: §3.7.2
  • Failure demo: §3.7.3
  • CLI exit codes: host decoder returns 0 success, 2 serial open failure, 5 decode error.

13.3 Cross-Linking

Concept links and related projects appear in §2.x and §10.4.

13.4 No Placeholder Text

All sections are fully specified for this project.