Learn RP2350 LCD Development: From Zero to Embedded Graphics Master

Goal: Deeply understand embedded systems programming through the RP2350 1.47-inch LCD Development Board by mastering its dual-architecture design (ARM Cortex-M33 / RISC-V Hazard3), display programming, SPI communication, PIO state machines, DMA transfers, and real-time graphics. You’ll learn not just how to program microcontrollers, but why each design decision exists and what tradeoffs shape embedded systems.


Why the RP2350 LCD Board Matters

In August 2024, Raspberry Pi released the RP2350, the first widely accessible dual-ISA (Instruction Set Architecture) microcontroller. The chip, which debuted on the $5 Raspberry Pi Pico 2, lets you choose at boot whether to run ARM Cortex-M33 or RISC-V Hazard3 cores. This isn’t just a marketing gimmick; it’s a window into how processors actually work at the architectural level.

The Waveshare RP2350-LCD-1.47-A board pairs this revolutionary chip with a 172×320 pixel, 262K-color display, creating the perfect learning platform for embedded graphics programming. Here’s why this matters:

Historical Context

1970s: Microprocessors born (Intel 4004, 8008)
       └─► "What if we put a CPU on a chip?"

1980s: ARM architecture created
       └─► "What if we made RISC simple and power-efficient?"

2010s: RISC-V goes open-source
       └─► "What if the ISA itself was free?"

2024: RP2350 ships with BOTH
       └─► "What if you could switch architectures on the same chip?"

Why This Board Specifically?

Component           What It Teaches
──────────────────  ───────────────────────────────────────────────────────────────────
RP2350 chip         Dual-core parallelism, architecture selection, security (TrustZone)
1.47” LCD (ST7789)  SPI protocol, pixel manipulation, color theory, frame buffers
RGB LED             PWM, color mixing, real-time control
16MB Flash          Memory mapping, XIP (Execute-In-Place), file systems
TF card slot        FAT file systems, storage management
USB-C               Device enumeration, power delivery, CDC/HID protocols

The Skills You’ll Build

After completing these projects, you will be able to:

  1. Read datasheets like a native language
  2. Write bare-metal C without SDK hand-holding
  3. Understand processor architecture at the register level
  4. Drive displays through raw SPI commands
  5. Use DMA to achieve zero-CPU-overhead transfers
  6. Program PIO state machines for custom protocols
  7. Compare ARM vs RISC-V from practical experience
  8. Implement security using TrustZone
  9. Optimize graphics for real-time performance
  10. Debug hardware with oscilloscopes and logic analyzers

Core Concept Analysis

The RP2350 Architecture: A Dual-Core, Dual-ISA Microcontroller

The RP2350 is unique: it contains four processor cores but can only run two at a time. You choose which pair:

┌─────────────────────────────────────────────────────────────────────┐
│                         RP2350 SILICON                              │
│                                                                     │
│  ┌───────────────────────┐    ┌───────────────────────────┐        │
│  │   ARM Cortex-M33 #0   │    │   ARM Cortex-M33 #1       │        │
│  │  ┌─────────────────┐  │    │  ┌─────────────────────┐  │        │
│  │  │ FPU (SP+DP)     │  │    │  │ FPU (SP+DP)         │  │        │
│  │  │ DSP extensions  │  │    │  │ DSP extensions      │  │        │
│  │  │ TrustZone       │  │    │  │ TrustZone           │  │        │
│  │  │ 150 MHz         │  │    │  │ 150 MHz             │  │        │
│  │  └─────────────────┘  │    │  └─────────────────────┘  │        │
│  └───────────────────────┘    └───────────────────────────┘        │
│           ▲                              ▲                          │
│           │ ←── ONLY 2 CORES ──►         │                          │
│           │     ACTIVE AT ONCE           │                          │
│           ▼                              ▼                          │
│  ┌───────────────────────┐    ┌───────────────────────────┐        │
│  │   RISC-V Hazard3 #0   │    │   RISC-V Hazard3 #1       │        │
│  │  ┌─────────────────┐  │    │  ┌─────────────────────┐  │        │
│  │  │ RV32IMAC        │  │    │  │ RV32IMAC            │  │        │
│  │  │ 3-stage pipeline│  │    │  │ 3-stage pipeline    │  │        │
│  │  │ No FPU (soft)   │  │    │  │ No FPU (soft)       │  │        │
│  │  │ 150 MHz         │  │    │  │ 150 MHz             │  │        │
│  │  └─────────────────┘  │    │  └─────────────────────┘  │        │
│  └───────────────────────┘    └───────────────────────────┘        │
│                                                                     │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    SHARED RESOURCES                          │   │
│  │  • 520KB SRAM (10 banks)   • 16 DMA channels                │   │
│  │  • 12 PIO state machines   • 2 UARTs, 2 SPIs, 2 I2Cs        │   │
│  │  • 30 GPIO pins            • USB 1.1 Host/Device            │   │
│  │  • 8KB OTP memory          • SHA-256, TRNG                  │   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

Key Insight: ARM Cortex-M33 has hardware floating-point; RISC-V Hazard3 does not. Choose ARM for math-heavy work, RISC-V for learning open architectures.

Memory Map: Where Everything Lives

Understanding the memory map is essential for bare-metal programming:

0xFFFFFFFF ┌─────────────────────────────────────┐
           │      CORTEX-M33 INTERNAL            │
           │      (PPB, SCS, etc.)               │
0xE0000000 ├─────────────────────────────────────┤
           │                                     │
           │      PERIPHERAL SPACE               │
           │      (I/O registers for:            │
           │       GPIO, SPI, I2C, PIO,          │
           │       DMA, PWM, ADC, etc.)          │
           │                                     │
0x40000000 ├─────────────────────────────────────┤
           │                                     │
           │      SRAM (520KB)                   │
           │      0x20000000 - 0x20082000        │
           │      ┌──────────────────────────┐   │
           │      │ Banks 0-7: 64KB each     │   │
           │      │   (striped, 512KB total) │   │
           │      │ Banks 8-9: 4KB each      │   │
           │      │   (non-striped, 8KB)     │   │
           │      └──────────────────────────┘   │
0x20000000 ├─────────────────────────────────────┤
           │                                     │
           │      XIP (Execute-In-Place)         │
           │      Flash memory mapped here       │
           │      (Up to 16MB on this board)     │
           │                                     │
0x10000000 ├─────────────────────────────────────┤
           │                                     │
           │      ROM (32KB bootloader)          │
           │      Contains boot code, USB boot,  │
           │      and floating-point routines    │
           │                                     │
0x00000000 └─────────────────────────────────────┘

Key Insight: The XIP region lets the cores execute code directly from the external flash, through a cache, without first copying it to RAM. This is how firmware stored in 16MB of flash can run on a chip with only 520KB of SRAM.
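
The map above can be turned into a small lookup. The following is a host-side sketch, not SDK code: the enum and function names are invented for illustration, and the boundaries are simply the base addresses shown in the diagram, so unmapped holes inside a region are not detected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; boundaries follow the memory-map diagram above. */
typedef enum {
    REGION_ROM,        /* 0x00000000: boot ROM */
    REGION_XIP,        /* 0x10000000: memory-mapped external flash */
    REGION_SRAM,       /* 0x20000000: on-chip SRAM */
    REGION_PERIPHERAL, /* 0x40000000: peripheral registers */
    REGION_CORE        /* 0xE0000000: Cortex-M33 internal (PPB, SCS) */
} mem_region_t;

/* Classify an address by comparing against each region's base, highest first. */
static mem_region_t region_for_address(uint32_t addr)
{
    if (addr >= 0xE0000000u) return REGION_CORE;
    if (addr >= 0x40000000u) return REGION_PERIPHERAL;
    if (addr >= 0x20000000u) return REGION_SRAM;
    if (addr >= 0x10000000u) return REGION_XIP;
    return REGION_ROM;
}

/* Human-readable label for a region, handy when printing a pointer's home. */
static const char *region_name(mem_region_t r)
{
    static const char *const names[] = {
        "ROM", "XIP flash", "SRAM", "peripherals", "core-internal"
    };
    assert((unsigned)r <= (unsigned)REGION_CORE);
    return names[r];
}
```

Calling region_for_address() on the address of a const table and on the address of a local variable is a quick way to see which of your data lives in flash and which in SRAM.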

The LCD Display: Understanding ST7789

The 1.47-inch display uses the ST7789 controller, which speaks SPI:

┌─────────────────────────────────────────────────────────────────┐
│                    RP2350 ←──SPI──► ST7789                      │
│                                                                 │
│   RP2350 PINS          4-WIRE SPI            ST7789 DISPLAY     │
│  ┌─────────┐         ┌──────────┐         ┌─────────────────┐   │
│  │ GPIO 10 │────────►│   CLK    │────────►│ Serial Clock    │   │
│  │ GPIO 11 │────────►│   MOSI   │────────►│ Serial Data In  │   │
│  │ GPIO 9  │────────►│   CS     │────────►│ Chip Select     │   │
│  │ GPIO 8  │────────►│   DC     │────────►│ Data/Command    │   │
│  │ GPIO 12 │────────►│   RST    │────────►│ Reset           │   │
│  │ GPIO 25 │────────►│   BL     │────────►│ Backlight       │   │
│  └─────────┘         └──────────┘         └─────────────────┘   │
│                                                                 │
│                      DATA FLOW                                  │
│   ┌──────────────────────────────────────────────────────┐     │
│   │ 1. Pull CS LOW (select display)                       │     │
│   │ 2. Set DC LOW for command, HIGH for data             │     │
│   │ 3. Clock out bytes on MOSI, synced to CLK            │     │
│   │ 4. Pull CS HIGH (deselect)                           │     │
│   └──────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────────┘

DISPLAY MEMORY LAYOUT:
┌────────────────────────────────────┐
│  172 pixels wide × 320 pixels tall │
│                                    │
│  Each pixel = 16 bits (RGB565)     │
│  OR 18 bits (RGB666 for 262K)      │
│                                    │
│  Total = 172 × 320 × 2 bytes       │
│        = 110,080 bytes per frame   │
│        = ~107 KB                   │
│                                    │
│  At 60 FPS = 6.6 MB/s throughput   │
│  SPI must run at ~53+ MHz!         │
└────────────────────────────────────┘
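
The arithmetic in the box can be checked with a few integer helpers. This is a minimal host-side sketch; the function names are made up for illustration, and it assumes one bit per SPI clock with no command or chip-select overhead, so real throughput will be somewhat lower.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one full frame for a given panel size and pixel depth. */
static uint32_t frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return width * height * bytes_per_pixel;
}

/* Minimum SPI clock in Hz needed to push `fps` full frames per second. */
static uint64_t required_spi_hz(uint32_t bytes_per_frame, uint32_t fps)
{
    return (uint64_t)bytes_per_frame * 8u * fps;
}

/* Upper bound on frame rate in thousandths of a frame per second
   (integer math, so no floating point is needed). */
static uint32_t max_millifps(uint64_t spi_hz, uint32_t bytes_per_frame)
{
    assert(bytes_per_frame > 0);
    return (uint32_t)(spi_hz * 1000u / ((uint64_t)bytes_per_frame * 8u));
}
```

For a 172 × 320 RGB565 frame, required_spi_hz(110080, 60) gives 52,838,400 Hz, and max_millifps(62500000, 110080) gives 70,971, i.e. roughly 71 FPS at a 62.5 MHz clock.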

The PIO Subsystem: Custom Hardware in Software

The RP2350 has 12 PIO state machines (upgraded from RP2040’s 8). Think of them as tiny programmable CPUs that run independently:

┌─────────────────────────────────────────────────────────────────┐
│                        PIO BLOCK (x3)                           │
│                                                                 │
│   ┌───────────────────────────────────────────────────────┐    │
│   │                 INSTRUCTION MEMORY                     │    │
│   │                 (32 instructions each)                 │    │
│   │                                                        │    │
│   │  JMP  │ WAIT │  IN  │ OUT  │ PUSH │ PULL │ MOV │ IRQ  │    │
│   │  SET  │  ... │ ...  │ ...  │ ...  │ ...  │ ... │ ...  │    │
│   └───────────────────────────────────────────────────────┘    │
│          │           │           │           │                  │
│   ┌──────▼──┐ ┌──────▼──┐ ┌──────▼──┐ ┌──────▼──┐              │
│   │  SM 0   │ │  SM 1   │ │  SM 2   │ │  SM 3   │              │
│   │ ┌─────┐ │ │ ┌─────┐ │ │ ┌─────┐ │ │ ┌─────┐ │              │
│   │ │OSR  │ │ │ │OSR  │ │ │ │OSR  │ │ │ │OSR  │ │              │
│   │ │ISR  │ │ │ │ISR  │ │ │ │ISR  │ │ │ │ISR  │ │              │
│   │ │FIFO │ │ │ │FIFO │ │ │ │FIFO │ │ │ │FIFO │ │              │
│   │ └─────┘ │ │ └─────┘ │ │ └─────┘ │ │ └─────┘ │              │
│   └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘              │
│        │           │           │           │                    │
│   ┌────▼───────────▼───────────▼───────────▼────┐              │
│   │              GPIO PINS                       │              │
│   │    (Each SM can control up to 32 pins)       │              │
│   └─────────────────────────────────────────────┘              │
└─────────────────────────────────────────────────────────────────┘

PIO INSTRUCTION SET (only 9 instructions!):
┌──────┬─────────────────────────────────────────────────┐
│ JMP  │ Jump to address (conditional or unconditional) │
│ WAIT │ Stall until condition (GPIO, IRQ, etc.)        │
│ IN   │ Shift bits into ISR from pins/scratch          │
│ OUT  │ Shift bits out of OSR to pins/scratch          │
│ PUSH │ Write ISR contents to RX FIFO                  │
│ PULL │ Read TX FIFO into OSR                          │
│ MOV  │ Copy between registers                         │
│ IRQ  │ Set/clear interrupt flags                      │
│ SET  │ Write immediate value to pins/scratch          │
└──────┴─────────────────────────────────────────────────┘

Key Insight: PIO lets you implement custom protocols (like WS2812B LEDs, VGA timing, or proprietary buses) without CPU involvement. Combined with DMA, you can achieve true zero-copy, zero-CPU data streaming.
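
To make the instruction table concrete, here is a sketch of how two of those instructions pack into 16-bit words. The helper and enum names are hypothetical (they are not the SDK's encoders), and the bit layout (opcode in bits 15:13, destination or condition in bits 7:5, value or address in bits 4:0, delay/side-set left at zero) is assumed from the PIO chapter of the datasheet, so verify it there before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* 3-bit opcodes in bits 15:13. PUSH and PULL share one opcode and are told
   apart by a lower bit, which is how nine mnemonics fit in eight opcodes. */
enum {
    PIO_OP_JMP = 0, PIO_OP_WAIT = 1, PIO_OP_IN = 2, PIO_OP_OUT = 3,
    PIO_OP_PUSH_PULL = 4, PIO_OP_MOV = 5, PIO_OP_IRQ = 6, PIO_OP_SET = 7
};

/* SET destinations (bits 7:5). */
enum { SET_DEST_PINS = 0, SET_DEST_X = 1, SET_DEST_Y = 2, SET_DEST_PINDIRS = 4 };

/* Encode "set <dest>, <value>" with no delay or side-set. */
static uint16_t sketch_encode_set(unsigned dest, unsigned value)
{
    assert(dest <= 7u && value <= 31u);
    return (uint16_t)(((unsigned)PIO_OP_SET << 13) | (dest << 5) | value);
}

/* Encode an unconditional "jmp <addr>" with no delay or side-set. */
static uint16_t sketch_encode_jmp(unsigned addr)
{
    assert(addr <= 31u);
    return (uint16_t)(((unsigned)PIO_OP_JMP << 13) | addr);
}
```

A two-instruction blink loop such as "set pins, 1" followed by "jmp 0" therefore fits in two 16-bit words, which is why 32 instruction slots go further than the number suggests.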

DMA: Moving Data Without the CPU

DMA (Direct Memory Access) transfers data between memory regions or between peripherals and memory without CPU intervention:

┌─────────────────────────────────────────────────────────────────┐
│                     WITHOUT DMA                                 │
│                                                                 │
│   CPU       ┌───────────────────────────────────┐              │
│    │        │ 1. Read byte from memory          │              │
│    │        │ 2. Write byte to SPI TX register  │              │
│    │        │ 3. Wait for SPI ready              │              │
│    │        │ 4. Repeat 110,080 times per frame │              │
│    ▼        └───────────────────────────────────┘              │
│   BUSY                                                          │
│   100%!     Frame update: 100% CPU, ~20ms latency              │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│                      WITH DMA                                   │
│                                                                 │
│   CPU       ┌───────────────────────────────────┐              │
│    │        │ 1. Configure DMA channel          │              │
│    │        │ 2. Point: source → destination    │              │
│    │        │ 3. Set transfer count             │              │
│    │        │ 4. Start transfer                 │              │
│    │        │ 5. (optional) Get IRQ when done   │              │
│    ▼        └───────────────────────────────────┘              │
│   FREE!                                                         │
│   0% CPU    Frame update: 0% CPU, DMA handles it               │
│             CPU can compute next frame while current sends!     │
└─────────────────────────────────────────────────────────────────┘

DMA TRANSFER CONFIGURATION:
┌─────────────────────────────────────────────────────────────────┐
│  dma_channel_config config = dma_channel_get_default_config(n);│
│                                                                 │
│  channel_config_set_transfer_data_size(&config, DMA_SIZE_8);   │
│  //                                                ▲            │
│  //                                    8, 16, or 32 bits        │
│                                                                 │
│  channel_config_set_read_increment(&config, true);             │
│  //                                           ▲                 │
│  //          Source address increments after each transfer      │
│                                                                 │
│  channel_config_set_write_increment(&config, false);           │
│  //                                            ▲                │
│  //          Destination stays same (SPI FIFO register)        │
│                                                                 │
│  channel_config_set_dreq(&config, DREQ_SPI1_TX);               │
│  //                                    ▲                        │
│  //          Pace transfers to SPI's ready signal               │
└─────────────────────────────────────────────────────────────────┘
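
What the two increment flags mean is easiest to see in a toy model. The sketch below runs on a host PC and only imitates the address stepping of one channel doing 8-bit transfers; dma_model_t and dma_model_run are invented names, and the DREQ pacing that real hardware applies between transfers is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one DMA channel (8-bit transfers only). */
typedef struct {
    const uint8_t *read_addr;   /* source: e.g. the frame buffer */
    uint8_t *write_addr;        /* destination: e.g. the SPI data register */
    bool read_increment;        /* step the source after each transfer? */
    bool write_increment;       /* step the destination after each transfer? */
    size_t transfer_count;      /* transfers still to perform */
} dma_model_t;

/* Perform every remaining transfer; returns the number of bytes moved. */
static size_t dma_model_run(dma_model_t *ch)
{
    assert(ch != NULL && ch->read_addr != NULL && ch->write_addr != NULL);
    size_t moved = 0;
    while (ch->transfer_count > 0) {
        *ch->write_addr = *ch->read_addr;        /* one 8-bit transfer */
        if (ch->read_increment)  ch->read_addr++;
        if (ch->write_increment) ch->write_addr++;
        ch->transfer_count--;
        moved++;
    }
    return moved;
}
```

With read_increment set and write_increment clear, the source pointer walks through the buffer while every byte lands on the same destination, which is exactly the memory-to-peripheral pattern the configuration above sets up.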

ARM vs RISC-V: What’s the Difference?

┌─────────────────────────────────────────────────────────────────┐
│              ARM CORTEX-M33              RISC-V HAZARD3         │
│                                                                 │
│  ┌───────────────────────────┐   ┌───────────────────────────┐ │
│  │  RISC (larger ISA)        │   │  RISC (simple ISA)        │ │
│  │  Thumb-2 instruction set  │   │  RV32IMAC instruction set │ │
│  │  Variable-length: 16/32b  │   │  Fixed 32b (C ext: 16b)   │ │
│  └───────────────────────────┘   └───────────────────────────┘ │
│                                                                 │
│  HARDWARE FPU:                    NO FPU:                       │
│  ┌───────────────────────────┐   ┌───────────────────────────┐ │
│  │  Single-precision: 1 cyc  │   │  Soft float via libm      │ │
│  │  Double-precision: ~4 cyc │   │  Single: ~50 cycles       │ │
│  │  DSP: MAC in 1 cycle      │   │  Double: ~100+ cycles     │ │
│  └───────────────────────────┘   └───────────────────────────┘ │
│                                                                 │
│  TRUSTZONE SECURITY:              NO TRUSTZONE:                 │
│  ┌───────────────────────────┐   ┌───────────────────────────┐ │
│  │  Secure/Non-Secure worlds │   │  M-mode/U-mode + PMP      │ │
│  │  Hardware isolation       │   │  Software-based security  │ │
│  │  Crypto acceleration      │   │  Uses shared crypto       │ │
│  └───────────────────────────┘   └───────────────────────────┘ │
│                                                                 │
│  COREMARK SCORE: ~4.8/MHz         COREMARK: 4.15/MHz           │
│  (slightly faster)                (slightly slower)             │
│                                                                 │
│  WHEN TO USE:                     WHEN TO USE:                  │
│  • Heavy math (graphics, DSP)     • Learning open architectures │
│  • Security-critical code         • Experimenting with ISA     │
│  • Power-sensitive (better FPU)   • Contributing to open HW    │
│  • Production firmware            • Academic/research work     │
└─────────────────────────────────────────────────────────────────┘

SPI Protocol: The Display’s Language

         ┌─────────────────────────────────────────────────────┐
         │               SPI (Serial Peripheral Interface)     │
         │                                                     │
  MASTER │                                              SLAVE  │
 (RP2350)│                                           (ST7789)  │
    ┌────┴────┐                                    ┌────┴────┐ │
    │         │───────── SCLK (Clock) ────────────►│         │ │
    │         │───────── MOSI (Data) ─────────────►│         │ │
    │         │◄──────── MISO (Data) ──────────────│         │ │
    │         │───────── CS (Select) ─────────────►│         │ │
    │         │───────── DC (Cmd/Data) ───────────►│         │ │
    └─────────┘                                    └─────────┘ │
         │                                                     │
         └─────────────────────────────────────────────────────┘

  TIMING DIAGRAM (Mode 0: CPOL=0, CPHA=0):

  CS   ▔▔▔╲_______________________________________________╱▔▔▔▔

  SCLK ____╱▔╲_╱▔╲_╱▔╲_╱▔╲_╱▔╲_╱▔╲_╱▔╲_╱▔╲_╱▔╲_╱▔╲_╱▔╲____
            0   1   2   3   4   5   6   7   8   9  10

  MOSI ────╳═══╳═══╳═══╳═══╳═══╳═══╳═══╳═══╳═══╳═══╳────
           D7  D6  D5  D4  D3  D2  D1  D0  (next byte...)

  DC   ▔▔▔╲____(LOW = Command)___╱▔▔▔▔(HIGH = Data)▔▔▔▔▔

  • CS pulled LOW to select device
  • Data sampled on RISING edge of SCLK
  • DC tells display: command byte or pixel data?
  • MSB first (bit 7 sent first)

The Waveshare RP2350-LCD-1.47-A Board

Here’s the exact hardware you’re working with:

┌─────────────────────────────────────────────────────────────────────┐
│                    WAVESHARE RP2350-LCD-1.47-A                      │
│                                                                     │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                      1.47" LCD DISPLAY                       │   │
│  │                      172 × 320 pixels                        │   │
│  │                      262K colors (18-bit)                    │   │
│  │                      ST7789 controller                       │   │
│  │                                                              │   │
│  │    ┌──────────────────────────────────────────────────┐     │   │
│  │    │                                                   │     │   │
│  │    │                                                   │     │   │
│  │    │              ACTIVE DISPLAY AREA                  │     │   │
│  │    │              320px                                │     │   │
│  │    │               ▲                                   │     │   │
│  │    │               │                                   │     │   │
│  │    │      172px ◄──┼──►                                │     │   │
│  │    │               │                                   │     │   │
│  │    │               ▼                                   │     │   │
│  │    │                                                   │     │   │
│  │    └──────────────────────────────────────────────────┘     │   │
│  │                                                              │   │
│  │    [RGB LED underneath - visible through display edge]       │   │
│  │                                                              │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                     │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                         PCB                                  │   │
│  │                                                              │   │
│  │  [BOOT]                              [RESET]                 │   │
│  │    ○                                    ○                    │   │
│  │                                                              │   │
│  │           ┌─────────────────────┐                           │   │
│  │           │      RP2350A        │                           │   │
│  │           │   ┌───────────┐     │                           │   │
│  │           │   │ 2×CM33   │     │                           │   │
│  │           │   │ 2×Hazard3│     │                           │   │
│  │           │   │ 520KB RAM│     │                           │   │
│  │           │   └───────────┘     │                           │   │
│  │           │                     │                           │   │
│  │           └─────────────────────┘                           │   │
│  │                                                              │   │
│  │  [16MB FLASH]   [TF CARD SLOT]                              │   │
│  │                                                              │   │
│  │  ══════════════════════════════════════════════════════     │   │
│  │  │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │     │   │
│  │     GPIO PINS (directly accessible)                          │   │
│  │                                                              │   │
│  │                    ┌───────┐                                │   │
│  │                    │USB-C  │                                │   │
│  │                    └───────┘                                │   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

PIN ASSIGNMENTS (from Waveshare documentation):
┌────────────┬──────────────────────────────────────┐
│ GPIO PIN   │ FUNCTION                             │
├────────────┼──────────────────────────────────────┤
│ GPIO 8     │ LCD DC (Data/Command)                │
│ GPIO 9     │ LCD CS (Chip Select)                 │
│ GPIO 10    │ LCD CLK (SPI Clock)                  │
│ GPIO 11    │ LCD MOSI (SPI Data)                  │
│ GPIO 12    │ LCD RST (Reset)                      │
│ GPIO 25    │ LCD Backlight (PWM controllable)     │
│ GPIO 23    │ RGB LED Data (WS2812-style)          │
│ GPIO 18    │ TF Card MISO                         │
│ GPIO 19    │ TF Card CS                           │
│ GPIO 20    │ TF Card CLK                          │
│ GPIO 21    │ TF Card MOSI                         │
└────────────┴──────────────────────────────────────┘

Concept Summary Table

Concept Cluster      What You Need to Internalize
───────────────────  ────────────────────────────────────────
RP2350 Architecture  Four cores, two architectures, only two active at once. ARM has FPU+TrustZone; RISC-V is open and simpler.
Memory Map           Know where ROM, XIP Flash, SRAM, and peripherals live. Addresses are fixed in hardware.
SPI Protocol         4-wire synchronous link: CLK, MOSI, CS, DC. Master controls timing. Display expects specific command sequences.
ST7789 Display       172×320 pixels, RGB565/RGB666 color. Commands set windows, then stream pixel data.
PIO State Machines   9-instruction mini-CPUs for I/O. They run independently of the main cores at the system clock.
DMA Transfers        Hardware copies memory without CPU. Configure once, transfer thousands of bytes automatically.
ARM vs RISC-V        ARM: mature tooling, FPU, TrustZone. RISC-V: open ISA, learning opportunity, no FPU.
Dual-Core            Two cores share memory and peripherals. Use spinlocks and FIFOs for synchronization.
TrustZone            Hardware-enforced separation between secure and non-secure code (ARM only).
PWM & Timers         Hardware counters for precise timing, LED brightness, motor control.

Deep Dive Reading by Concept

Microcontroller Fundamentals

Concept                     Book & Chapter
──────────────────────────  ────────────────────────────────────────
Memory maps and addressing  “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron, Ch. 9: “Virtual Memory”
Embedded C basics           “Bare Metal C” by Steve Oualline, Ch. 1-4
Processor architecture      “Code: The Hidden Language” by Charles Petzold, Ch. 17-19
ARM architecture            “Computer Organization and Design ARM Edition” by Patterson & Hennessy, Ch. 2-3

RP2350 Specific

Concept                  Book & Chapter
───────────────────────  ────────────────────────────────────────
RP2350 hardware details  RP2350 Datasheet (official Raspberry Pi documentation)
Pico SDK programming     “Raspberry Pi Pico-series C/C++ SDK” (official PDF)
PIO programming          “RP2040 PIO Guide” (applies to RP2350), Hardware design sections
RISC-V fundamentals      “Computer Organization and Design RISC-V Edition” by Patterson & Hennessy, Ch. 2-3

Display & Graphics

Concept              Book & Chapter
───────────────────  ────────────────────────────────────────
SPI protocol         “Making Embedded Systems” by Elecia White, Ch. 8
Graphics algorithms  “Computer Graphics from Scratch” by Gabriel Gambetta, Ch. 1-6
ST7789 controller    ST7789 Datasheet (Sitronix), Command reference section

Essential Reading Order

For maximum comprehension, read in this order:

  1. Foundation (Week 1):
    • “Bare Metal C” Ch. 1-4 (C on microcontrollers)
    • RP2350 Datasheet Ch. 1-2 (Overview and memory map)
  2. Display Basics (Week 2):
    • “Making Embedded Systems” Ch. 8 (Serial protocols)
    • ST7789 Datasheet (Command reference)
    • Pico SDK Documentation — SPI section
  3. Advanced I/O (Week 3):
    • RP2350 Datasheet Ch. 11 (PIO)
    • Pico Examples — PIO folder
  4. Performance & Architecture (Week 4):
    • RP2350 Datasheet Section 12.6 (DMA)
    • “Computer Organization and Design” — ARM and/or RISC-V edition

Project 1: Hello Display — Raw SPI Communication

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, MicroPython
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: SPI Protocol / Display Initialization
  • Software or Tool: Pico SDK, ST7789 display
  • Main Book: “Making Embedded Systems” by Elecia White

What you’ll build: A minimal program that initializes the ST7789 display and fills the screen with a solid color—no libraries, just raw SPI commands based on the datasheet.

Why it teaches RP2350+LCD fundamentals: Before using any display library, you need to understand what’s actually happening at the wire level. This project forces you to read the ST7789 datasheet, understand the initialization sequence, and see how GPIO pins become SPI signals.

Core challenges you’ll face:

  • Reading and decoding the ST7789 datasheet → Understanding command formats and timing
  • Configuring RP2350 SPI peripheral → Hardware registers vs SDK functions
  • Implementing the initialization sequence → Power-on timing, software reset, configuration commands
  • Sending pixel data efficiently → Understanding the column/row address window
  • Debugging without visibility → When the screen stays black, what went wrong?

Key Concepts:

  • SPI Protocol Basics: “Making Embedded Systems” Ch. 8 — Elecia White
  • GPIO Configuration: RP2350 Datasheet Ch. 9 (GPIO), Raspberry Pi
  • Display Controller Commands: ST7789 Datasheet — Sitronix

  • Difficulty: Beginner
  • Time estimate: Weekend
  • Prerequisites: Basic C programming, ability to compile and flash Pico programs


Real World Outcome

You’ll have a program that transforms a blank display into a working screen. When you run it:

Example Output:

$ picotool load -x hello_display.uf2
Loading into flash: [====================] 100%
Resetting device...

$ # Watch the display:
# 1. Backlight turns on (screen visible but gray)
# 2. Brief flash as initialization completes
# 3. Entire screen fills with bright blue (RGB: 0, 0, 255)

$ # Your serial output shows:
SPI initialized at 62.5 MHz
GPIO pins configured:
  DC  = GPIO 8 (output)
  CS  = GPIO 9 (output)
  CLK = GPIO 10 (SPI1 SCK)
  TX  = GPIO 11 (SPI1 TX)
  RST = GPIO 12 (output)
  BL  = GPIO 25 (output)

Performing hardware reset...
Sending initialization sequence (8 commands)...
  SLPOUT (0x11) - Exit sleep mode
  COLMOD (0x3A) - Set 16-bit color mode
  MADCTL (0x36) - Set memory access control
  CASET  (0x2A) - Set column address 34-205 (172 columns, offset 34)
  RASET  (0x2B) - Set row address 0-319
  INVON  (0x21) - Enable display inversion
  NORON  (0x13) - Normal display mode
  DISPON (0x29) - Display on

Filling screen with blue (0x001F in RGB565)...
Transferred 110,080 bytes (55,040 pixels)
Frame complete in 14.1ms (theoretical: ~71 FPS max)

Screen is now blue!

What you’ll see: The display goes from black → gray (backlight on) → solid blue. If anything fails, the screen stays black or shows garbage—debugging is part of the learning.


The Core Question You’re Answering

“What actually happens between calling spi_write() and seeing pixels on the screen?”

Before writing code, understand this: the ST7789 is essentially a block of frame memory with a serial front end, driving 55,040 LCD pixels (172 × 320). Every pixel update involves:

  1. Telling the display WHICH pixels (set address window)
  2. Telling the display WHAT color (send pixel data)
  3. The display interpreting your bytes as 16-bit color values

There’s no “draw pixel at X,Y” command—only “here’s the window, here’s a stream of colors.”


Concepts You Must Understand First

Stop and research these before coding:

  1. SPI Protocol Fundamentals
    • What are MOSI, MISO, SCLK, and CS signals?
    • What do “clock polarity” (CPOL) and “clock phase” (CPHA) mean?
    • Why does the display not need MISO (what would it even send back)?
    • Book Reference: “Making Embedded Systems” Ch. 8 — Elecia White
  2. ST7789 Command Structure
    • How does the DC (Data/Command) pin work?
    • What’s the difference between commands with parameters vs. data-only?
    • Why must you wait after SLPOUT but not after COLMOD?
    • Book Reference: ST7789 Datasheet — Section 9 “Command Description”
  3. Color Encoding (RGB565)
    • Why 16 bits instead of 24 bits for color?
    • How do you pack Red (5 bits), Green (6 bits), Blue (5 bits)?
    • What color is 0xF800? What color is 0x001F? What color is 0x07E0?
    • Book Reference: “Computer Graphics from Scratch” Ch. 1 — Gabriel Gambetta

Questions to Guide Your Design

Before implementing, think through these:

  1. Initialization Sequence
    • In what order must commands be sent?
    • How long should you wait after exiting sleep mode?
    • What happens if you skip COLMOD (color mode)?
  2. Memory Access Pattern
    • The ST7789’s frame memory is 240×320, but the panel only shows 172×320. How do you set the correct window?
    • What happens if you send more pixels than the window can hold?
    • Should you send data LSB-first or MSB-first?
  3. SPI Configuration
    • What’s the maximum SPI clock the ST7789 supports?
    • What happens if you run too fast? Too slow?
    • Do you need hardware CS control or can you toggle GPIO manually?

Thinking Exercise

Trace the Initialization Sequence

Before coding, manually trace what happens for the COLMOD command:

Command: COLMOD (0x3A) with parameter 0x55

1. Set DC pin LOW (this is a command)
2. Set CS pin LOW (select the display)
3. Clock out byte 0x3A on MOSI:

   SCLK  ╱╲_╱╲_╱╲_╱╲_╱╲_╱╲_╱╲_╱╲_
   MOSI  0 0 1 1 1 0 1 0
                   (0x3A = 58 = COLMOD)

4. Set DC pin HIGH (parameters are data)
5. Clock out byte 0x55:

   SCLK  ╱╲_╱╲_╱╲_╱╲_╱╲_╱╲_╱╲_╱╲_
   MOSI  0 1 0 1 0 1 0 1
                   (0x55 = 16-bit color mode)

6. Set CS pin HIGH (deselect display)

Questions while tracing:

  • Why must DC change between command and parameter bytes?
  • What if you forgot to pull CS low?
  • What if you sent 0x55 with DC still low (as a command)?
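The trace above can be sketched as a small helper that sends a command byte with DC low and its parameters with DC high. To keep the sketch runnable on a PC, the bus is replaced by a hypothetical recording function, `bus_write`, which just logs each byte with the DC level it was sent at; on hardware it would set the DC pin (for example with `gpio_put`) and transmit the byte (for example with `spi_write_blocking`). CS handling is left out for brevity, and all names here are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* One byte on the wire plus the DC level it was sent with. */
typedef struct { uint8_t dc; uint8_t byte; } bus_event_t;

/* Hypothetical recording "bus" so the sketch runs without hardware. */
bus_event_t bus_log[16];
size_t bus_log_len = 0;

void bus_write(uint8_t dc, uint8_t byte)
{
    if (bus_log_len < sizeof bus_log / sizeof bus_log[0]) {
        bus_log[bus_log_len].dc = dc;
        bus_log[bus_log_len].byte = byte;
        bus_log_len++;
    }
}

/* Command byte goes out with DC low, every parameter byte with DC high. */
void st7789_command(uint8_t cmd, const uint8_t *params, size_t n)
{
    bus_write(0, cmd);
    for (size_t i = 0; i < n; i++) {
        bus_write(1, params[i]);
    }
}
```

Calling `st7789_command(0x3A, (const uint8_t[]){0x55}, 1)` logs two events: 0x3A with DC low, then 0x55 with DC high, which is the COLMOD sequence traced by hand above.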

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain the difference between SPI Mode 0 and Mode 3. Which does the ST7789 use?”
  2. “You’re seeing garbage on the display. Walk me through your debugging approach.”
  3. “Why do embedded displays often use RGB565 instead of RGB888?”
  4. “What’s the purpose of the DC (Data/Command) pin? Why can’t you just use a command prefix byte?”
  5. “How would you calculate the theoretical maximum frame rate for this display at 62.5 MHz SPI?”
  6. “The display shows the wrong colors—red appears blue. What went wrong?”

Hints in Layers

Hint 1 (Starting Point): Begin with the Pico SDK’s hardware SPI library. Focus on getting ONE command to work (SLPOUT) before doing the full initialization.

Hint 2 (Command Structure): Every command follows this pattern: DC LOW → send command byte → DC HIGH → send parameter bytes (if any). Create a helper function for this.

Hint 3 (Initialization Order): A minimal working sequence: Hardware Reset → SLPOUT → Wait 120ms → COLMOD(0x55) → MADCTL(0x00) → CASET → RASET → INVON → DISPON. You can add more commands later for color calibration.

Hint 4 (Debugging): If the screen is black, check that the backlight GPIO is HIGH. If it is gray with no image, check that initialization completed. If it shows garbage, check the SPI mode (CPOL=0, CPHA=0) and clock speed. Use a logic analyzer if available.


Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| SPI fundamentals | “Making Embedded Systems” by Elecia White | Ch. 8: “Serial Communication” |
| RP2350 peripherals | “RP2350 Datasheet” | Ch. 4.4: “SPI” |
| Display commands | “ST7789 Datasheet” | Section 9: Command Reference |
| Bit manipulation | “Bare Metal C” by Steve Oualline | Ch. 5-6 |

Implementation Hints

The initialization sequence follows a strict order mandated by the datasheet. Here’s the conceptual flow:

┌───────────────────────────────────────────────────────────┐
│                 ST7789 INITIALIZATION                     │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  1. POWER ON                                              │
│     └─► Wait for power supply stabilization               │
│                                                           │
│  2. HARDWARE RESET                                        │
│     └─► Pulse RST pin LOW for 10ms                        │
│     └─► Wait 120ms for internal reset                     │
│                                                           │
│  3. EXIT SLEEP (SLPOUT - 0x11)                           │
│     └─► Display wakes from sleep mode                     │
│     └─► MANDATORY: Wait 120ms                            │
│                                                           │
│  4. CONFIGURE (order matters!)                            │
│     ├─► COLMOD (0x3A, 0x55) — 16-bit color              │
│     ├─► MADCTL (0x36, 0x00) — Memory access order        │
│     ├─► CASET (0x2A) — Column address 0-171             │
│     ├─► RASET (0x2B) — Row address 0-319                │
│     └─► INVON (0x21) — Enable color inversion            │
│                                                           │
│  5. TURN ON (DISPON - 0x29)                              │
│     └─► Display becomes active                            │
│                                                           │
│  6. WRITE PIXELS (RAMWR - 0x2C)                          │
│     └─► Stream pixel data as RGB565 values               │
│                                                           │
└───────────────────────────────────────────────────────────┘

For the pixel data transfer, think of it as:

  1. Send RAMWR command (0x2C) with DC=LOW
  2. Switch DC=HIGH
  3. Stream all 110,080 bytes continuously (172 × 320 × 2 bytes per pixel)
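The byte count above sets a hard floor on frame time: every bit has to cross the SPI wire. A minimal sketch of that arithmetic follows; the function names are made up for illustration, and the result ignores command overhead and any idle gaps between bytes, so real transfers will be slightly slower.

```c
#include <stdint.h>

/* Microseconds needed to clock out one full frame. Integer math,
 * rounded down; ignores command bytes and inter-byte gaps. */
uint32_t frame_time_us(uint32_t width, uint32_t height,
                       uint32_t bits_per_pixel, uint32_t spi_hz)
{
    uint64_t bits = (uint64_t)width * height * bits_per_pixel;
    return (uint32_t)(bits * 1000000u / spi_hz);
}

/* Upper bound on whole frames per second for the same transfer. */
uint32_t max_fps(uint32_t width, uint32_t height,
                 uint32_t bits_per_pixel, uint32_t spi_hz)
{
    uint64_t bits = (uint64_t)width * height * bits_per_pixel;
    return (uint32_t)(spi_hz / bits);
}
```

For 172 × 320 pixels at 16 bits each and a 62.5 MHz clock, that is 880,640 bits, about 14.09 ms per frame, or at most 70 complete frames per second.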

RGB565 Color Packing:

16-bit pixel:  RRRRR GGGGGG BBBBB
               └──┬──┘ └──┬──┘ └─┬─┘
               bits   bits   bits
               15-11  10-5   4-0

Pure Red:   0xF800 (11111 000000 00000)
Pure Green: 0x07E0 (00000 111111 00000)
Pure Blue:  0x001F (00000 000000 11111)
White:      0xFFFF
Black:      0x0000
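The packing shown above can be sketched in C as follows. The helper names `rgb565` and `swap16` are illustrative, not part of any SDK. The swap matters because the display expects the high byte of each pixel first, while a little-endian CPU such as the RP2350 stores the low byte of a `uint16_t` first; if you send a `uint16_t` buffer byte by byte, one common approach is to store the pixels pre-swapped.

```c
#include <stdint.h>

/* Pack 8-bit-per-channel color into RGB565 by keeping the top
 * 5 bits of red, 6 bits of green, and 5 bits of blue. */
uint16_t rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | (b >> 3));
}

/* Exchange the two bytes of a pixel so that a byte-wise transfer
 * from little-endian memory puts the high byte on the wire first. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

Here `rgb565(255, 0, 0)` gives 0xF800, and `swap16(0xF800)` gives 0x00F8, whose in-memory byte sequence on a little-endian core is F8 00. Forgetting the swap is a common cause of "almost right" colors.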

Learning milestones:

  1. Screen turns on (any color) → You’ve mastered the initialization sequence
  2. Correct color appears → You understand RGB565 encoding and byte order
  3. Full frame in about 14ms at 62.5 MHz SPI (110,080 bytes × 8 bits) → You’re using SPI efficiently

Project 2: Pixel Artist — Drawing Primitives Without Libraries


  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, C++
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Graphics Algorithms / Frame Buffers
  • Software or Tool: Pico SDK, custom graphics library
  • Main Book: “Computer Graphics from Scratch” by Gabriel Gambetta

What you’ll build: A software frame buffer with functions to draw pixels, lines (Bresenham’s algorithm), rectangles, and circles—all rendered to the display via your SPI driver from Project 1.

Why it teaches embedded graphics: Real graphics isn’t about calling drawLine()—it’s understanding how every primitive becomes a set of pixel coordinates. Building these algorithms from scratch reveals why displays have frame buffers, why drawing order matters, and why embedded systems often use fixed-point math.

Core challenges you’ll face:

  • Implementing a frame buffer in limited RAM → 110KB for full frame vs 520KB total SRAM
  • Bresenham’s line algorithm → Drawing diagonal lines without floating-point math
  • Midpoint circle algorithm → Efficient circle drawing using only integers
  • Coordinate systems → Understanding display orientation and addressing
  • Optimization trade-offs → When to draw directly vs. buffer first

Key Concepts:

  • Bresenham’s Line Algorithm: “Computer Graphics from Scratch” Ch. 6 — Gabriel Gambetta
  • Frame Buffer Management: “Making Embedded Systems” Ch. 10 — Elecia White
  • Fixed-Point Arithmetic: “Bare Metal C” Ch. 7 — Steve Oualline

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 1 completed, understanding of basic algorithms


Real World Outcome

You’ll have a graphics library that can draw shapes on the display:

Example Output:

$ # Your program draws a test pattern:

Graphics Library initialized
Frame buffer: 110,080 bytes at 0x20010000

Drawing test pattern...
  draw_pixel(86, 160, RED)           - 1 pixel
  draw_line(0, 0, 171, 319, GREEN)   - 320 pixels (diagonal)
  draw_rect(20, 40, 100, 80, BLUE)   - 356 pixels (outline)
  draw_filled_rect(50, 100, 40, 40, YELLOW) - 1,600 pixels
  draw_circle(86, 200, 50, WHITE)    - ~283 pixels (outline)
  draw_filled_circle(86, 280, 30, CYAN) - ~2,826 pixels

Total pixels modified: ~5,386
Flushing frame buffer to display...
Transfer complete in 14.1ms

# The display shows:
# - Red dot at the center of the screen
# - Green diagonal line corner to corner
# - Blue rectangle outline
# - Solid yellow square
# - White circle outline
# - Filled cyan circle

What you’ll see: A colorful test pattern proving each drawing primitive works correctly.


The Core Question You’re Answering

“How do you draw a line between any two points using only integer math?”

This is Bresenham’s insight from 1962: you can draw perfect lines by tracking an “error” value and deciding whether to step in X only, or X and Y together. No floating-point needed—critical for 1960s plotters and still relevant for embedded systems.


Concepts You Must Understand First

Stop and research these before coding:

  1. Frame Buffer Architecture
    • Why buffer pixels in RAM before sending to display?
    • What’s “double buffering” and why does it prevent tearing?
    • Can you fit a full frame in RP2350’s 520KB SRAM?
    • Book Reference: “Computer Graphics from Scratch” Ch. 1 — Gabriel Gambetta
  2. Bresenham’s Line Algorithm
    • What’s the “error accumulator” approach?
    • Why does the algorithm only use integer addition?
    • How do you handle lines in all 8 octants?
    • Book Reference: “Computer Graphics from Scratch” Ch. 6 — Gabriel Gambetta
  3. Circle Drawing (Midpoint Algorithm)
    • Why can you draw 1/8 of a circle and reflect it?
    • What’s the “midpoint” in the midpoint circle algorithm?
    • How do you fill a circle efficiently?
    • Book Reference: “Computer Graphics from Scratch” Ch. 7 — Gabriel Gambetta

Questions to Guide Your Design

Before implementing, think through these:

  1. Memory Layout
    • How will you store the frame buffer? Contiguous array? 2D array?
    • RGB565 means 2 bytes per pixel—how do you access pixel (x, y)?
    • What happens if you try to draw outside the screen bounds?
  2. Line Drawing
    • For line from (0,0) to (10,5), which pixels should be colored?
    • What if the line is steeper than 45°?
    • How do you handle the line going right-to-left or bottom-to-top?
  3. Performance
    • Should you send each pixel individually or batch them?
    • When is it faster to send the whole frame vs. just changed regions?
    • How much faster is direct-to-display vs. frame-buffer?

Thinking Exercise

Trace Bresenham’s Algorithm

Draw the line from (0, 0) to (5, 3) by hand:

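Grid:
  0 1 2 3 4 5    (X axis)
0 ●─┬─┬─┬─┬─┐
1 ├─●─●─┼─┼─┤
2 ├─┼─┼─●─●─┤
3 └─┴─┴─┴─┴─●
(Y axis)

Step through the algorithm (shallow line, so X is the major axis).
Rule: each step adds dy to the error; when 2 × error ≥ dx, step y and subtract dx.

dx = 5, dy = 3
error = 0, y = 0

x=0: draw (0,0), error = 0+3 = 3,  2×3 ≥ 5 → y = 1, error = 3-5 = -2
x=1: draw (1,1), error = -2+3 = 1, 2×1 < 5 → y stays 1
x=2: draw (2,1), error = 1+3 = 4,  2×4 ≥ 5 → y = 2, error = 4-5 = -1
x=3: draw (3,2), error = -1+3 = 2, 2×2 < 5 → y stays 2
x=4: draw (4,2), error = 2+3 = 5,  2×5 ≥ 5 → y = 3, error = 5-5 = 0
x=5: draw (5,3), done

Result: (0,0) (1,1) (2,1) (3,2) (4,2) (5,3), which matches rounding the ideal y = 0.6x at each column.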

The exercise forces you to work through the algorithm manually before coding it.


The Interview Questions They’ll Ask

  1. “Why is Bresenham’s line algorithm considered efficient? What operations does it avoid?”
  2. “Explain how you’d draw a line from (100, 50) to (20, 200)—the line goes ‘backwards’ and down.”
  3. “What’s the time complexity of drawing a circle with radius R using the midpoint algorithm?”
  4. “You have a 172×320 display and 520KB RAM. Can you double-buffer? What are your options?”
  5. “How would you implement anti-aliased lines on this display?”

Hints in Layers

Hint 1 (Starting Point): Start with draw_pixel(x, y, color). If this works, everything else builds on it. Just write directly to the frame buffer: framebuffer[y * WIDTH + x] = color.

Hint 2 (Bresenham’s Algorithm Core): The key insight: track an error term. When the error exceeds the threshold, step in the minor axis. For shallow lines (dx > dy), step X every iteration and step Y when the error overflows.

Hint 3 (Octant Handling): The basic Bresenham works for one octant. For all directions, either: (1) swap X/Y roles for steep lines, or (2) use direction signs (+1/-1) for all 8 cases.

Hint 4 (Circle Optimization): Draw 1/8 of the circle (one octant), then reflect to get all 8 points per iteration. As offsets from the center, these are: (x,y), (y,x), (-x,y), (-y,x), (x,-y), (y,-x), (-x,-y), (-y,-x).
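The eight-way reflection from Hint 4 can be sketched with one common integer formulation of the midpoint circle algorithm. To keep the example self-contained it draws into a tiny 21×21 grid instead of the real frame buffer; `grid`, `plot`, and `count_set` are stand-ins invented for this sketch.

```c
#include <stdint.h>

#define GRID 21
uint8_t grid[GRID][GRID];      /* tiny stand-in for the frame buffer */

void plot(int x, int y)
{
    if (x >= 0 && x < GRID && y >= 0 && y < GRID) {
        grid[y][x] = 1;
    }
}

/* Walk one octant from (r, 0) up to the 45-degree diagonal and mirror
 * each point into the other seven octants. Integer math only. */
void draw_circle(int cx, int cy, int r)
{
    int x = r;
    int y = 0;
    int err = 1 - r;           /* decision variable for the midpoint test */

    while (x >= y) {
        plot(cx + x, cy + y);  plot(cx + y, cy + x);
        plot(cx - x, cy + y);  plot(cx - y, cy + x);
        plot(cx + x, cy - y);  plot(cx + y, cy - x);
        plot(cx - x, cy - y);  plot(cx - y, cy - x);
        y++;
        if (err < 0) {
            err += 2 * y + 1;          /* midpoint inside: keep x */
        } else {
            x--;                       /* midpoint outside: step inward */
            err += 2 * (y - x) + 1;
        }
    }
}

/* Count distinct pixels that were set. */
int count_set(void)
{
    int n = 0;
    for (int y = 0; y < GRID; y++)
        for (int x = 0; x < GRID; x++)
            n += grid[y][x];
    return n;
}
```

For a radius of 5 the octant walk visits (5,0), (5,1), (5,2), and (4,3), which mirror to 28 distinct pixels. Points on the axes are plotted twice by the reflection, which is harmless for an opaque color.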


Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Line drawing algorithms | “Computer Graphics from Scratch” by Gambetta | Ch. 6 |
| Circle algorithms | “Computer Graphics from Scratch” by Gambetta | Ch. 7 |
| Frame buffer concepts | “Making Embedded Systems” by Elecia White | Ch. 10 |
| Embedded optimization | “Bare Metal C” by Steve Oualline | Ch. 12 |

Implementation Hints

Frame Buffer Structure:

┌─────────────────────────────────────────────────────────────┐
│                    FRAME BUFFER LAYOUT                      │
│                                                             │
│  Memory: Linear array of 16-bit values                     │
│  Size: 172 × 320 × 2 = 110,080 bytes                       │
│                                                             │
│  Pixel (x, y) → framebuffer[y * 172 + x]                   │
│                                                             │
│  Example:                                                   │
│  ┌────┬────┬────┬────┬────┬─────────────┐                  │
│  │ 0  │ 1  │ 2  │ 3  │ 4  │ ... │ 171   │ Row 0           │
│  ├────┼────┼────┼────┼────┼─────────────┤                  │
│  │172 │173 │174 │175 │176 │ ... │ 343   │ Row 1           │
│  ├────┼────┼────┼────┼────┼─────────────┤                  │
│  │... │... │... │... │... │ ... │ ...   │                  │
│  └────┴────┴────┴────┴────┴─────────────┘                  │
│                                                             │
│  Pixel (5, 2) = framebuffer[2*172 + 5] = framebuffer[349] │
└─────────────────────────────────────────────────────────────┘
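The layout above maps directly onto a flat array. Here is a minimal sketch with a bounds check, so that a stray coordinate cannot write outside the buffer; the names `FB_WIDTH`, `FB_HEIGHT`, and the boolean return value are choices made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define FB_WIDTH  172
#define FB_HEIGHT 320

/* One RGB565 value per pixel, rows stored one after another. */
uint16_t framebuffer[FB_WIDTH * FB_HEIGHT];

/* Write one pixel. Returns false (and writes nothing) when the
 * coordinate is off-screen. */
bool draw_pixel(int x, int y, uint16_t color)
{
    if (x < 0 || x >= FB_WIDTH || y < 0 || y >= FB_HEIGHT) {
        return false;
    }
    framebuffer[y * FB_WIDTH + x] = color;
    return true;
}
```

Clipping in `draw_pixel` is the simplest policy. A faster design clips whole primitives once and then writes without per-pixel checks, at the cost of more code.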

Bresenham’s Line (Conceptual):

For a line from (x0, y0) to (x1, y1) where x1 > x0 and 0 ≤ dy ≤ dx (shallow line, left to right, y increasing):

Initialize:
  dx = x1 - x0
  dy = y1 - y0
  y = y0
  error = 0

For each x from x0 to x1:
  draw_pixel(x, y, color)
  error += dy
  if error * 2 >= dx:
    y += 1
    error -= dx
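The single-octant version generalizes to every direction by tracking a step sign for each axis. The sketch below uses a widely used all-octant integer formulation, which is a different arrangement of the same error-tracking idea. To stay testable off-target, it writes points into a caller-supplied array instead of calling draw_pixel; `point_t` and `bresenham_line` are names chosen for this example.

```c
#include <stdlib.h>

typedef struct { int x; int y; } point_t;

/* Generate the pixels of a line between two points, in any direction.
 * Stores up to max points in out and returns the total point count. */
int bresenham_line(int x0, int y0, int x1, int y1, point_t *out, int max)
{
    int dx = abs(x1 - x0);
    int dy = -abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;   /* step direction in X */
    int sy = (y0 < y1) ? 1 : -1;   /* step direction in Y */
    int err = dx + dy;             /* combined error term */
    int count = 0;

    for (;;) {
        if (count < max) {
            out[count].x = x0;
            out[count].y = y0;
        }
        count++;
        if (x0 == x1 && y0 == y1) {
            break;
        }
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }   /* step in X */
        if (e2 <= dx) { err += dx; y0 += sy; }   /* step in Y */
    }
    return count;
}
```

For (0,0) to (5,3) this yields (0,0), (1,1), (2,1), (3,2), (4,2), (5,3). The point count is always max(|dx|, |dy|) + 1, so the corner-to-corner diagonal from (0,0) to (171,319) touches 320 pixels.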

Learning milestones:

  1. Lines draw correctly in all directions → You understand Bresenham’s algorithm
  2. Circles are smooth → You’ve mastered the midpoint algorithm
  3. Frame rate > 30 FPS with primitives → You understand the performance implications

Project 3: DMA Display Driver — Zero-CPU Frame Transfers


  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: DMA / Peripheral Programming
  • Software or Tool: Pico SDK, hardware DMA
  • Main Book: “RP2350 Datasheet” (Raspberry Pi)

What you’ll build: A display driver that uses DMA to transfer the entire frame buffer to the display while the CPU renders the next frame—achieving true double-buffering with zero CPU involvement during transfer.

Why it teaches DMA and concurrency: DMA is how real embedded systems achieve performance. This project shows you how the CPU can “fire and forget” a 110KB transfer, letting hardware handle the byte-by-byte movement while you compute the next frame. You’ll learn about DREQ pacing, DMA chaining, and synchronization.

Core challenges you’ll face:

  • Configuring DMA channels for SPI → Understanding DREQ signals
  • Pacing transfers to SPI speed → Why you can’t just dump data instantly
  • Handling the DC pin between command and data → DMA doesn’t toggle GPIO!
  • Synchronizing CPU and DMA → Knowing when transfer completes
  • Achieving true double-buffering → Two frame buffers, alternating

Key Concepts:

  • DMA Configuration: RP2350 Datasheet Ch. 2.5 — Raspberry Pi
  • DREQ Signals: Pico SDK Documentation — hardware_dma section
  • Double Buffering: “Making Embedded Systems” Ch. 10 — Elecia White

Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Projects 1-2 completed, understanding of interrupts


Real World Outcome

You’ll have a display system that renders smoothly without blocking:

Example Output:

$ # Your animation demo runs at 60 FPS

DMA Display Driver initialized
  DMA Channel 0: Frame buffer → SPI TX
  DMA Channel 1: Chained for continuous transfer
  Frame buffer A: 0x20010000 (110,080 bytes)
  Frame buffer B: 0x2002B000 (110,080 bytes)

Starting animation loop...

Frame 1:
  [CPU] Rendering to buffer B... (1.2ms)
  [DMA] Transferring buffer A to display... (14.1ms)
  CPU idle during transfer: 12.9ms FREE

Frame 2:
  [CPU] Rendering to buffer A... (1.3ms)
  [DMA] Transferring buffer B to display... (14.1ms)
  CPU idle during transfer: 12.8ms FREE

Performance metrics (after 1000 frames):
  Average frame time: 14.1ms (SPI-limited at 62.5 MHz, ~71 FPS theoretical)
  Actual display rate: 60 FPS (paced to a 16.7ms frame period)
  CPU utilization during transfer: 0%
  Frames dropped: 0

# The display shows smooth animation with no tearing!

What you’ll see: Smooth animation where the CPU is free to compute while the display updates.


The Core Question You’re Answering

“How can I send 110KB to the display without the CPU moving a single byte?”

DMA is the answer. You configure the DMA controller with: (1) source address (your frame buffer), (2) destination address (SPI TX FIFO), (3) transfer count (110,080 bytes), and (4) pacing (DREQ_SPI0_TX). Then you trigger it and walk away.


Concepts You Must Understand First

Stop and research these before coding:

  1. DMA Controller Architecture
    • What is a DMA channel?
    • What are “source” and “destination” addresses?
    • Why does read address increment but write address stays fixed?
    • Book Reference: “RP2350 Datasheet” Ch. 2.5 — Raspberry Pi
  2. DREQ (Data Request) Signals
    • Why can’t DMA just transfer instantly?
    • What happens if DMA is faster than SPI can transmit?
    • What is DREQ_SPI0_TX and when does it signal?
    • Book Reference: “RP2350 Datasheet” Ch. 2.5.3.1 — DREQ table
  3. Double Buffering
    • Why two frame buffers instead of one?
    • What’s “tearing” and why does double-buffering prevent it?
    • How do you know which buffer the DMA is currently reading?
    • Book Reference: “Computer Graphics from Scratch” Ch. 1 — Gabriel Gambetta

Questions to Guide Your Design

Before implementing, think through these:

  1. DMA Configuration
    • Should transfer size be 8-bit, 16-bit, or 32-bit?
    • The SPI FIFO is 8 entries deep—is that enough?
    • How do you handle the DC pin (DMA can’t toggle GPIO directly)?
  2. Command vs Data Problem
    • RAMWR command needs DC=LOW, but pixel data needs DC=HIGH
    • Option 1: Send command via CPU, then DMA the data
    • Option 2: Use a separate DMA transfer for the command byte
    • Which approach is simpler? Which is more flexible?
  3. Synchronization
    • How do you know when DMA transfer is complete?
    • What happens if you start rendering to a buffer DMA is still reading?
    • Should you use polling or interrupts?

Thinking Exercise

Trace the DMA Data Flow

Draw the path data takes from your frame buffer to the display:

┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│ Frame Buffer │ DMA  │   SPI TX     │ Wire │   ST7789     │
│   (SRAM)     │─────►│    FIFO      │─────►│  Display     │
│ 0x20010000   │      │  (8 bytes)   │      │   RAM        │
└──────────────┘      └──────────────┘      └──────────────┘
     │                      │
     │                      ▼
     │               ┌──────────────┐
     │               │ DREQ signal  │
     │               │ (FIFO not    │
     │               │  full)       │
     │               └──────────────┘
     │                      │
     ▼                      ▼
 DMA reads            DMA waits for
 next byte            DREQ before
                      next transfer

Questions while tracing:

  • What happens if the SPI is configured slower than DMA can supply data?
  • Why does the DMA need the DREQ signal instead of just blasting data?
  • If FIFO is 8 deep and SPI runs at 62.5 MHz, how long until FIFO drains?

The Interview Questions They’ll Ask

  1. “Explain the difference between DMA and CPU-driven I/O. When would you use each?”
  2. “What is a DREQ signal and why is it necessary for peripheral DMA?”
  3. “You’re seeing corrupted frames. The DMA transfer count is correct. What could cause this?”
  4. “How would you implement triple buffering? What problem does it solve that double-buffering doesn’t?”
  5. “The RP2350 has 16 DMA channels. How would you handle needing 20 simultaneous transfers?”

Hints in Layers

Hint 1 (Starting Point): Use the Pico SDK’s hardware_dma library. Start with a simple memory-to-memory transfer to verify DMA works, then adapt for SPI.

Hint 2 (SPI Configuration): The SPI peripheral has a TX FIFO. DMA writes to spi_get_hw(spi0)->dr (the data register). Set DREQ to DREQ_SPI0_TX so DMA waits when the FIFO is full.

Hint 3 (DC Pin Problem): Before starting DMA, send the RAMWR command (0x2C) via CPU with DC=LOW, then set DC=HIGH and start DMA. The display expects continuous pixel data after RAMWR.

Hint 4 (Completion Detection): Use dma_channel_wait_for_finish_blocking() or set up an IRQ with dma_channel_set_irq0_enabled(). The IRQ approach lets the CPU work while waiting. Either way, DMA completion only means the last byte has been written to the TX FIFO, so wait until the SPI is no longer busy (spi_is_busy()) before raising CS.
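The hand-off between the CPU and DMA can be sketched as a small piece of state that is independent of any hardware. The struct and function names below are hypothetical; on the board, `begin_flush` would be followed by actually starting the DMA transfer, and `on_dma_complete` would be called from the DMA interrupt handler.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    volatile bool dma_busy;   /* set when a transfer starts, cleared on completion */
    uint8_t draw_index;       /* buffer the CPU may render into: 0 or 1 */
} swap_state_t;

/* Hand the just-rendered buffer to DMA. Returns the index of the buffer
 * to transfer, or -1 if the previous transfer is still running, in which
 * case the CPU must wait before touching the other buffer. */
int begin_flush(swap_state_t *s)
{
    if (s->dma_busy) {
        return -1;
    }
    int send_index = s->draw_index;
    s->draw_index ^= 1u;      /* CPU switches to the other buffer */
    s->dma_busy = true;
    return send_index;
}

/* Call when the DMA transfer has finished. */
void on_dma_complete(swap_state_t *s)
{
    s->dma_busy = false;
}
```

The key property is that the CPU never renders into the buffer DMA is reading: a second `begin_flush` is refused until the completion callback has run.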


Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| DMA architecture | “RP2350 Datasheet” | Ch. 2.5: DMA |
| SPI peripheral details | “RP2350 Datasheet” | Ch. 4.4: SPI |
| Double buffering theory | “Computer Graphics from Scratch” by Gambetta | Ch. 1 |
| Interrupt handling | “Making Embedded Systems” by Elecia White | Ch. 7 |

Implementation Hints

DMA + SPI Data Flow:

┌─────────────────────────────────────────────────────────────┐
│                    DMA → SPI → DISPLAY                      │
│                                                             │
│  ┌─────────────────┐                                        │
│  │  Frame Buffer   │                                        │
│  │  [0x20010000]   │                                        │
│  │  110,080 bytes  │                                        │
│  └────────┬────────┘                                        │
│           │                                                 │
│           │  DMA reads 8 bits at a time                     │
│           │  Read address increments automatically          │
│           ▼                                                 │
│  ┌─────────────────┐                                        │
│  │  DMA Channel    │                                        │
│  │  ┌───────────┐  │                                        │
│  │  │ READ_ADDR │──┼──► Increments each transfer            │
│  │  │ WRITE_ADDR│──┼──► Fixed: SPI TX FIFO                  │
│  │  │ COUNT     │──┼──► 110,080 transfers                   │
│  │  │ CTRL      │──┼──► DREQ = SPI0_TX                      │
│  │  └───────────┘  │                                        │
│  └────────┬────────┘                                        │
│           │                                                 │
│           │  Waits for DREQ (FIFO not full)                │
│           ▼                                                 │
│  ┌─────────────────┐                                        │
│  │  SPI TX FIFO    │                                        │
│  │  (8 bytes deep) │──► Transmits at SCLK rate             │
│  └────────┬────────┘                                        │
│           │                                                 │
│           ▼  MOSI pin                                       │
│  ┌─────────────────┐                                        │
│  │  ST7789 Display │                                        │
│  │  (receives data)│                                        │
│  └─────────────────┘                                        │
└─────────────────────────────────────────────────────────────┘

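Sketch of DMA setup (Pico SDK calls; send_frame_dma and frame_buffer are placeholder names, and the RAMWR step comes from your own driver):

#include "hardware/dma.h"
#include "hardware/spi.h"
#include "pico/stdlib.h"

void send_frame_dma(const uint8_t *frame_buffer) {
    // 1. Claim a DMA channel (panics if none is free)
    int channel = dma_claim_unused_channel(true);

    // 2. Configure the channel
    dma_channel_config cfg = dma_channel_get_default_config(channel);
    channel_config_set_transfer_data_size(&cfg, DMA_SIZE_8); // 8-bit transfers
    channel_config_set_read_increment(&cfg, true);           // walk through the frame buffer
    channel_config_set_write_increment(&cfg, false);         // always write the SPI data register
    channel_config_set_dreq(&cfg, DREQ_SPI0_TX);             // pace on TX FIFO space

    // 3. Before the transfer: send RAMWR (0x2C) with DC=LOW, then set DC=HIGH

    // 4. Set addresses and count, and start the transfer
    dma_channel_configure(channel, &cfg,
                          &spi_get_hw(spi0)->dr,  // write address (fixed)
                          frame_buffer,           // read address (increments)
                          172 * 320 * 2,          // transfer count in bytes
                          true);                  // start immediately

    // 5. Wait (or handle the IRQ instead), then let the FIFO drain
    dma_channel_wait_for_finish_blocking(channel);
    while (spi_is_busy(spi0)) tight_loop_contents();
    dma_channel_unclaim(channel);
}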

Learning milestones:

  1. DMA transfer completes without errors → You understand DMA configuration
  2. Frame appears correctly on display → Data flow is correct
  3. Animation runs while CPU computes → True double-buffering achieved

Project 4: RGB LED Controller with PIO — Custom Protocol Implementation


  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C (with PIO assembly)
  • Alternative Programming Languages: MicroPython, Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: PIO State Machines / Timing Protocols
  • Software or Tool: Pico SDK, pioasm assembler
  • Main Book: “RP2350 Datasheet” (Raspberry Pi) — Chapter 3: PIO

What you’ll build: A WS2812-compatible LED driver using PIO state machines—writing the timing-critical protocol in PIO assembly to generate precise nanosecond-level waveforms for the onboard RGB LED (and compatible LED strips).

Why it teaches PIO programming: The WS2812 protocol requires 800kHz signaling with precise high/low times (400ns/850ns for ‘0’, 850ns/400ns for ‘1’). That is hard to bit-bang reliably from the CPU, because any interrupt or bus stall stretches a pulse, but straightforward with PIO. You’ll learn the 9-instruction PIO instruction set, side-set pins, and autopull.

Core challenges you’ll face:

  • Understanding PIO assembly → Only 9 instructions, but fundamentally different from CPU assembly
  • Calculating timing at clock cycles → Each instruction is 1 cycle at the state machine clock
  • Using side-set for pin control → Changing pins during instruction execution
  • Autopull and bit shifting → Getting data from FIFO into OSR
  • Chaining with DMA → Driving long LED strips without CPU

Key Concepts:

  • PIO Architecture: RP2350 Datasheet Ch. 3 — Raspberry Pi
  • WS2812 Protocol: WS2812B Datasheet — Worldsemi
  • State Machine Programming: Pico SDK Examples — ws2812 folder

Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: Projects 1-3 completed, basic assembly concepts


Real World Outcome

You’ll have a PIO-based LED driver that produces precise WS2812 waveforms:

Example Output:

$ # Your program cycles through colors on the RGB LED

PIO WS2812 Driver initialized
  State machine: PIO0_SM0
  Output pin: GPIO 23
  Frequency: 800 kHz (1.25µs per bit)
  Bits per LED: 24 (GRB format)

Color cycle demo starting...

Frame 0: Red (0xFF0000) → GRB: 0x00FF00
  Sending 24 bits via PIO...
  Waveform: ▔▔▔▁▔▔▔▁▔▔▔▁▔▔▔▁ ... (800 kHz)
  Transfer complete in 30µs

Frame 30: Green (0x00FF00) → GRB: 0xFF0000
  Sending 24 bits via PIO...
  Transfer complete in 30µs

Frame 60: Blue (0x0000FF) → GRB: 0x0000FF
  Sending 24 bits via PIO...
  Transfer complete in 30µs

# The onboard RGB LED smoothly transitions through colors!

$ # Logic analyzer view (if connected):
Bit 0: ▔▔▔▔▔▔▔▔▔▁▁▁▁▁▁ = '1' (850ns high, 400ns low)
Bit 1: ▔▔▔▔▁▁▁▁▁▁▁▁▁▁ = '0' (400ns high, 850ns low)
...

What you’ll see: The RGB LED under the display changing colors smoothly.


The Core Question You’re Answering

“How do you generate precise sub-microsecond timing waveforms without busy-looping the CPU?”

PIO state machines run at up to 150 MHz independently of the CPU. Each PIO instruction takes exactly 1 clock cycle (plus optional delay). By setting the state machine clock divider, you can create predictable timing down to ~6.67ns resolution.
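The divider arithmetic can be sketched in a few lines of C. The function names are invented for this example, and the 10 cycles per bit is an assumption: it is one convenient way to split a 1.25µs bit into a short and a long phase.

```c
#include <stdint.h>

/* Clock divider that gives the state machine cycles_per_bit cycles
 * for every bit at the requested bit rate. */
float pio_clkdiv(uint32_t sys_hz, uint32_t bit_hz, uint32_t cycles_per_bit)
{
    return (float)sys_hz / ((float)bit_hz * (float)cycles_per_bit);
}

/* Duration in nanoseconds of n cycles at state machine clock sm_hz. */
uint32_t cycles_to_ns(uint32_t n, uint32_t sm_hz)
{
    return (uint32_t)((uint64_t)n * 1000000000u / sm_hz);
}
```

With a 150 MHz system clock, 800 kHz bits, and 10 cycles per bit, the divider is 18.75 and the state machine runs at 8 MHz (125ns per cycle). Three cycles then last 375ns and seven cycles last 875ns, both close to the nominal 400ns and 850ns.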


Concepts You Must Understand First

Stop and research these before coding:

  1. WS2812 Protocol Timing
    • What’s the bit period (total time for one bit)?
    • How do you distinguish ‘0’ from ‘1’?
    • What’s the reset signal and how long must it be?
    • Book Reference: WS2812B Datasheet — Timing specifications
  2. PIO State Machine Structure
    • What are OSR (Output Shift Register) and ISR (Input Shift Register)?
    • What is autopull and when does it trigger?
    • How does side-set work during instruction execution?
    • Book Reference: “RP2350 Datasheet” Ch. 3.2 — Raspberry Pi
  3. Clock Divider Calculations
    • If system clock is 150 MHz and you need 800 kHz bit rate, what divider?
    • How many instructions can you execute per bit period?
    • Book Reference: “RP2350 Datasheet” Ch. 3.5.3 — CLKDIV

Questions to Guide Your Design

Before implementing, think through these:

  1. Timing Calculation
    • At 150 MHz, how many clock cycles per 1.25µs bit period?
    • You need to create a high pulse of either 400ns or 850ns—how many cycles each?
    • How do you add delay to PIO instructions?
  2. Program Structure
    • How many PIO instructions do you need per bit?
    • When do you shift the next bit from OSR?
    • How do you know when all 24 bits are sent?
  3. Integration
    • How do you convert RGB to GRB format (WS2812 quirk)?
    • How would you extend this to support LED strips (multiple LEDs)?
    • Could you use DMA to stream LED data?

Thinking Exercise

Trace One Bit Transmission

For WS2812 ‘1’ bit (850ns high, 400ns low), using side-set so the pin changes in the same cycle as the instruction:

At 8 MHz PIO clock (125ns per cycle, 10 cycles per bit):

Time    | Pin State | Instruction
--------|-----------|-----------------------------------------------------
0ns     | HIGH      | jmp !x do_zero  side 1 [2] ; 3 cycles; x=1, so no jump
375ns   | HIGH      | jmp bitloop     side 1 [3] ; 4 more cycles HIGH
875ns   | LOW       | out x, 1        side 0 [2] ; 3 cycles LOW, shifts the next bit into X
1250ns  | HIGH      | (rising edge of the next bit)

Result: 875ns high, 375ns low. Both are within ±150ns of the nominal 850ns/400ns. Now repeat the trace for a ‘0’ bit.

This exercise forces you to understand PIO timing intimately.


The Interview Questions They’ll Ask

  1. “What is PIO and why is it useful for timing-critical protocols?”
  2. “Explain how side-set works and why it’s important for generating waveforms.”
  3. “The WS2812 datasheet allows roughly ±150ns on each high and low time. How do you ensure your implementation meets this?”
  4. “You’re seeing wrong colors on the LED. What could cause GRB vs RGB confusion?”
  5. “How would you modify your driver to support 1000 LEDs without dropping frames?”

Hints in Layers

Hint 1 (Starting Point): Use the official Pico SDK ws2812 example as reference. Study how it uses side-set to control the output pin while other operations happen.

Hint 2 (Timing Strategy): The clever approach: set pin HIGH, shift bit, conditionally delay based on bit value, set pin LOW. Side-set can change the pin at the same time as another operation.

Hint 3 (Clock Divider): For 800kHz signaling with 10 cycles per bit (instructions plus their delay cycles), you need an 8 MHz PIO clock. At a 150 MHz system clock, use sm_config_set_clkdiv() with a divider of 18.75.

Hint 4 (GRB Ordering): WS2812 expects GRB format, not RGB. Rearrange your color bytes: (g << 16) | (r << 8) | b.
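The reordering from Hint 4 can be sketched as below. `rgb_to_grb` follows the expression in the hint directly. `ws2812_fifo_word` is a hypothetical helper that assumes the state machine shifts MSB-first with a 24-bit pull threshold, in which case the 24 color bits have to sit in the top of the 32-bit FIFO word; adjust it if your shift configuration differs.

```c
#include <stdint.h>

/* Reorder an RGB color into the 24-bit GRB value the LED expects. */
uint32_t rgb_to_grb(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)g << 16) | ((uint32_t)r << 8) | (uint32_t)b;
}

/* Assumption: MSB-first shifting, 24 bits consumed per word, so the
 * color is left-justified in the 32-bit word written to the TX FIFO. */
uint32_t ws2812_fifo_word(uint8_t r, uint8_t g, uint8_t b)
{
    return rgb_to_grb(r, g, b) << 8;
}
```

Pure red (255, 0, 0) becomes 0x00FF00 in GRB order, and pure green (0, 255, 0) becomes 0xFF0000, matching the example output earlier in this project.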


Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| PIO architecture | “RP2350 Datasheet” | Ch. 3: PIO |
| PIO instruction set | “RP2350 Datasheet” | Ch. 3.4: PIO Instructions |
| WS2812 timing | WS2812B Datasheet | Timing Specifications |
| Pico SDK PIO | “Pico C/C++ SDK” | Chapter 3: PIO |

Implementation Hints

WS2812 Bit Timing:

┌─────────────────────────────────────────────────────────────┐
│                    WS2812 BIT TIMING                        │
│                                                             │
│  BIT PERIOD: 1.25µs (800 kHz)                              │
│                                                             │
│  '0' BIT:                                                   │
│  ▔▔▔▔▔╲_________________                                   │
│  ← 400ns →← 850ns low   →                                  │
│  T0H       T0L                                              │
│                                                             │
│  '1' BIT:                                                   │
│  ▔▔▔▔▔▔▔▔▔▔▔▔▔╲________                                   │
│  ←   850ns   →← 400ns →                                    │
│  T1H          T1L                                           │
│                                                             │
│  RESET: LOW for > 50µs (resets LED to receive new data)    │
│                                                             │
│  DATA FORMAT: 24 bits per LED                              │
│  ┌────────┬────────┬────────┐                              │
│  │ G7..G0 │ R7..R0 │ B7..B0 │                              │
│  └────────┴────────┴────────┘                              │
│     MSB first for each byte                                 │
└─────────────────────────────────────────────────────────────┘

PIO Program Concept:

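; WS2812 PIO program (conceptual): 10 cycles per bit at an 8 MHz PIO clock
; (125 ns per cycle). Assumes autopull is enabled so the OSR refills itself.
.program ws2812
.side_set 1

bitloop:
    out x, 1        side 1 [1]  ; Shift out one bit, pin HIGH (2 cycles)
    jmp !x do_zero  side 1      ; Still HIGH (3 cycles = 375 ns so far)

do_one:
    nop             side 1 [3]  ; '1': stay HIGH 4 more cycles (875 ns total)
    jmp bitloop     side 0 [2]  ; LOW for 3 cycles (375 ns), next bit

do_zero:
    nop             side 0 [5]  ; '0': go LOW now, hold for 6 cycles
    jmp bitloop     side 0      ; 1 more LOW cycle (875 ns total), next bit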

Learning milestones:

  1. LED lights up in any color → You’ve got the timing right
  2. Colors match expected values → GRB ordering is correct
  3. Smooth color transitions → DMA integration works

Project 5: Dual-Core Rendering Engine — One Core Draws, One Core Sends

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Multi-Core Programming / Concurrency
  • Software or Tool: Pico SDK multicore library
  • Main Book: “RP2350 Datasheet” (Raspberry Pi) — Chapter 2.3: Multicore

What you’ll build: A parallel rendering system where Core 0 renders the next frame while Core 1 sends the current frame to the display via DMA—achieving true parallel processing with inter-core communication.

Why it teaches dual-core programming: The RP2350’s two cores share memory but run independently. This project teaches you about multicore synchronization primitives (spinlocks, FIFO), memory visibility between cores, and the producer-consumer pattern in embedded systems.

Core challenges you’ll face:

  • Starting and managing Core 1 → multicore_launch_core1()
  • Inter-core communication → Hardware FIFO between cores
  • Synchronization without deadlock → When to block, when to continue
  • Memory visibility → Ensuring both cores see the same memory (the cores have no private data caches, but compiler and write ordering still matter)
  • Debugging multicore code → printf goes to one UART!

Key Concepts:

  • Multicore Primitives: RP2350 Datasheet Ch. 2.3 — Raspberry Pi
  • Producer-Consumer Pattern: “Operating Systems: Three Easy Pieces” Ch. 31
  • Lock-Free Programming: “Computer Systems: A Programmer’s Perspective” Ch. 12

Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Projects 1-3 completed, understanding of threads/concurrency


Real World Outcome

You’ll have a system where both cores work in parallel:

Example Output:

$ # Your dual-core demo shows performance metrics

RP2350 Dual-Core Renderer initialized
  Core 0: Frame rendering (graphics primitives)
  Core 1: Display output (DMA to SPI)

Starting parallel rendering...

Core 0 → Core 1 FIFO: Ready
Buffer swap protocol: Double-buffered

Performance (1000 frames):

  CORE 0 (Renderer):
    Average render time: 4.2ms
    Drawing 500 primitives/frame
    CPU utilization: 100% (rendering)

  CORE 1 (Display):
    Average transfer time: 1.8ms
    DMA throughput: 61 MB/s
    CPU utilization: 15% (DMA setup only)

  COMBINED PERFORMANCE:
    Effective frame rate: 238 FPS (limited by render)
    Without dual-core: 166 FPS (43% improvement!)
    Inter-core messages: 2 per frame (buffer swap)

# The display shows smooth complex graphics!

What you’ll see: Complex graphics rendering faster than single-core would allow.
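
The frame-rate figures in the sample output follow from simple pipeline arithmetic: run back to back, a frame costs render time plus transfer time; overlapped on two cores, it costs whichever stage is slower. A small sketch of that calculation (the function names are illustrative and not part of any SDK):

```c
#include <assert.h>

/* Frames per second when one core renders and then transfers each frame. */
static double serial_fps(double render_ms, double transfer_ms) {
    assert(render_ms > 0.0 && transfer_ms > 0.0);
    return 1000.0 / (render_ms + transfer_ms);
}

/* Frames per second when rendering and transfer overlap on two cores:
 * throughput is set by the slower of the two stages. */
static double pipelined_fps(double render_ms, double transfer_ms) {
    assert(render_ms > 0.0 && transfer_ms > 0.0);
    double slower = render_ms > transfer_ms ? render_ms : transfer_ms;
    return 1000.0 / slower;
}
```

With the 4.2 ms render and 1.8 ms transfer times from the sample output, these give about 167 FPS serial and 238 FPS pipelined, a speedup of roughly 1.43x.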


The Core Question You’re Answering

“How do I make two processor cores work together without stepping on each other’s toes?”

The challenge isn’t just running code on two cores—it’s coordinating access to shared resources (frame buffers, peripherals) without races or deadlocks.


Concepts You Must Understand First

Stop and research these before coding:

  1. RP2350 Multicore Architecture
    • Both cores share the same memory—what are the implications?
    • What is the inter-core FIFO and how deep is it?
    • Can both cores access the same peripheral simultaneously?
    • Book Reference: “RP2350 Datasheet” Ch. 2.3 — Raspberry Pi
  2. Synchronization Primitives
    • What’s a spinlock and when should you use it?
    • What’s the difference between mutex and spinlock?
    • Are the SDK’s mutexes and semaphores safe to use from both cores, and what do they cost compared with the FIFO?
    • Book Reference: “Operating Systems: Three Easy Pieces” Ch. 28-30
  3. Producer-Consumer Pattern
    • Core 0 produces frames, Core 1 consumes them
    • How do you signal “frame ready” without busy-waiting?
    • What if producer is faster than consumer?
    • Book Reference: “Operating Systems: Three Easy Pieces” Ch. 31

Questions to Guide Your Design

Before implementing, think through these:

  1. Core Roles
    • Which core should own the display (SPI peripheral)?
    • Which core should own the frame buffers?
    • Should both cores be able to render, or just one?
  2. Buffer Management
    • With double buffering, which core decides when to swap?
    • What happens if Core 0 finishes rendering before Core 1 finishes sending?
    • What if Core 1 finishes before Core 0?
  3. Communication Protocol
    • What messages do you need to send between cores?
    • “I’m done rendering buffer A” → Core 1 can send it
    • “I’m done sending buffer A” → Core 0 can render to it
    • How do you avoid race conditions in this handshake?

Thinking Exercise

Design the Synchronization Protocol

Draw a timeline of both cores:

Core 0 (Renderer):     Core 1 (Display):
─────────────────      ─────────────────
[Render to A    ]      [Wait for msg...]
[Send: "A ready"]───►  [Receive: A ready]
[Render to B    ]      [DMA send A      ]
[Send: "B ready"]───►  [DMA send A ...  ]
[Wait for A free]◄──── [Send: "A free"  ]
[Render to A    ]      [DMA send B      ]
...                    ...

Questions:

  • What if Core 0 finishes rendering B before Core 1 finishes sending A?
  • Should Core 0 wait, or start on a third buffer?
  • Is the FIFO deep enough for all pending messages?

The Interview Questions They’ll Ask

  1. “Explain the difference between spinlock and mutex. When would you use each?”
  2. “You’re seeing occasional visual glitches. How would you debug a potential race condition?”
  3. “What is cache coherency and why does it matter on multi-core systems?”
  4. “Design a lock-free queue for the producer-consumer pattern.”
  5. “The RP2350’s inter-core FIFO is 4 deep. What happens if you push 5 messages?”

Hints in Layers

Hint 1: Starting Point Use multicore_launch_core1() to start Core 1 running a simple function. Verify it works with separate printf messages (both appear on UART).

Hint 2: Inter-Core Communication Use multicore_fifo_push_blocking() and multicore_fifo_pop_blocking(). Design a simple protocol: even numbers mean “buffer A ready”, odd means “buffer B ready”.

Hint 3: Synchronization Don’t use spinlocks for buffer management—use the FIFO as a synchronization mechanism. Core 0 waits for “buffer free” message before rendering to it.

Hint 4: Peripheral Ownership Assign SPI and DMA to Core 1 only. Core 0 only touches the frame buffers. This avoids needing locks on peripheral access.
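
Hints 2 and 3 can be modeled on a host PC before touching hardware. The sketch below is a single-threaded model of the handshake, not SDK code: the message encoding (bit 0 selects the buffer, the upper bits carry a frame sequence number) follows the even/odd idea in Hint 2, and every name in it (`msg_encode`, `swap_state_t`, and so on) is hypothetical. On the board, the returned words would be what each core passes to multicore_fifo_push_blocking().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { BUF_A = 0, BUF_B = 1 };
typedef enum { OWNER_RENDERER, OWNER_DISPLAY } owner_t;

/* Which side may currently touch each frame buffer. */
typedef struct { owner_t owner[2]; } swap_state_t;

/* Even words refer to buffer A, odd words to buffer B. */
static uint32_t msg_encode(uint32_t seq, int buf) { return (seq << 1) | (uint32_t)(buf & 1); }
static int      msg_buffer(uint32_t msg)          { return (int)(msg & 1u); }
static uint32_t msg_seq(uint32_t msg)             { return msg >> 1; }

static void swap_init(swap_state_t *s) {
    s->owner[BUF_A] = OWNER_RENDERER;
    s->owner[BUF_B] = OWNER_RENDERER;
}

/* The renderer may only draw into a buffer it owns. */
static bool swap_can_render(const swap_state_t *s, int buf) {
    return s->owner[buf] == OWNER_RENDERER;
}

/* Renderer finished drawing: hand the buffer over ("buffer ready"). */
static uint32_t swap_publish(swap_state_t *s, uint32_t seq, int buf) {
    assert(s->owner[buf] == OWNER_RENDERER);
    s->owner[buf] = OWNER_DISPLAY;
    return msg_encode(seq, buf);
}

/* Display finished sending: give the buffer back ("buffer free"). */
static uint32_t swap_release(swap_state_t *s, uint32_t ready_msg) {
    int buf = msg_buffer(ready_msg);
    assert(s->owner[buf] == OWNER_DISPLAY);
    s->owner[buf] = OWNER_RENDERER;
    return ready_msg; /* echo the same word back as the "free" notice */
}
```

Because each FIFO direction carries only one kind of message, the same encoding can serve for both the ready and the free notices. The asserts make an ownership violation fail loudly in the model instead of showing up later as a visual glitch.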


Books That Will Help

Topic Book Chapter
RP2350 multicore “RP2350 Datasheet” Ch. 2.3: Processors
Concurrency fundamentals “Operating Systems: Three Easy Pieces” Ch. 28-31
Lock-free programming “Computer Systems: A Programmer’s Perspective” Ch. 12
Producer-consumer “Operating Systems: Three Easy Pieces” Ch. 31

Implementation Hints

Dual-Core Buffer Swap Protocol:

┌─────────────────────────────────────────────────────────────┐
│                DUAL-CORE BUFFER MANAGEMENT                  │
│                                                             │
│  CORE 0                          CORE 1                     │
│  (Renderer)                      (Display Driver)           │
│                                                             │
│  ┌─────────────┐                ┌─────────────┐            │
│  │ Render to A │                │ Wait FIFO   │            │
│  └──────┬──────┘                └──────┬──────┘            │
│         │                               │                   │
│         │  FIFO: "A ready"              │                   │
│         │──────────────────────────────►│                   │
│         │                               │                   │
│  ┌──────▼──────┐                ┌──────▼──────┐            │
│  │ Render to B │                │ DMA send A  │            │
│  └──────┬──────┘                └──────┬──────┘            │
│         │                               │                   │
│         │  FIFO: "B ready"              │                   │
│         │──────────────────────────────►│                   │
│         │                               │                   │
│         │  FIFO: "A free"               │                   │
│         │◄──────────────────────────────│                   │
│         │                               │                   │
│  ┌──────▼──────┐                ┌──────▼──────┐            │
│  │ Render to A │                │ DMA send B  │            │
│  └─────────────┘                └─────────────┘            │
│                                                             │
│  FRAME BUFFERS:                                             │
│  ┌────────────────┐  ┌────────────────┐                    │
│  │   Buffer A     │  │   Buffer B     │                    │
│  │  [0x20010000]  │  │  [0x2002B000]  │                    │
│  │  110,080 bytes │  │  110,080 bytes │                    │
│  └────────────────┘  └────────────────┘                    │
│        ↑                     ↑                              │
│  Only ONE core writes at a time (no locks needed)          │
└─────────────────────────────────────────────────────────────┘

Learning milestones:

  1. Core 1 starts and runs independently → Basic multicore works
  2. Cores exchange messages via FIFO → Communication works
  3. Frame rate increases over single-core → Parallelism achieved

Project 6: Font Rendering Engine — Bitmap and Anti-Aliased Text

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, C++
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Font Rendering / Typography Basics
  • Software or Tool: Pico SDK, custom font data
  • Main Book: “Computer Graphics from Scratch” by Gabriel Gambetta

What you’ll build: A text rendering system that displays bitmap fonts (fixed-size) and anti-aliased fonts (smooth edges) on the LCD, including support for multiple font sizes and colors.

Why it teaches font rendering: Text is the most common thing displayed on screens. This project reveals how fonts are stored as bitmaps, how anti-aliasing creates the illusion of smooth edges at low resolution, and how to efficiently render text strings.

Core challenges you’ll face:

  • Font data structures → How to store glyph bitmaps efficiently
  • Glyph lookup and spacing → Variable-width fonts vs monospace
  • Anti-aliasing with alpha blending → Mixing foreground and background
  • Text layout → Word wrapping, line height, kerning basics
  • Memory optimization → Fonts can be large; how to minimize?

Key Concepts:

  • Bitmap Font Storage: Custom implementation
  • Anti-Aliasing: “Computer Graphics from Scratch” Ch. 2 — Gabriel Gambetta
  • Alpha Blending: Standard formula: out = alpha * fg + (1-alpha) * bg

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Project 2 completed, understanding of bit manipulation


Real World Outcome

You’ll have a text rendering library:

Example Output:

$ # Your demo displays formatted text

Font Renderer initialized
  Bitmap font: 8x16 monospace (ASCII 32-126)
  Anti-aliased font: Variable width, 16px height

Rendering test text...

Line 1: "Hello, RP2350!" (bitmap 8x16)
  Characters: 14
  Render time: 0.3ms
  Pixels modified: 1,792

Line 2: "Smooth Text" (anti-aliased 16px)
  Characters: 11
  Render time: 0.8ms (alpha blending)
  Pixels modified: 3,456

Line 3: "Temperature: 23.5°C" (mixed fonts)
  Monospace numbers, symbol support
  Render time: 0.5ms

# The display shows crisp text with smooth anti-aliased edges!

┌────────────────────────────────┐
│  Hello, RP2350!                │ ← Bitmap (sharp but blocky)
│                                │
│  Smooth Text                   │ ← Anti-aliased (smooth edges)
│                                │
│  Temperature: 23.5°C           │ ← Mixed styles
└────────────────────────────────┘

What you’ll see: Readable text on the display with different font styles.


The Core Question You’re Answering

“How does a single character become hundreds of colored pixels on the screen?”

A font is a lookup table: given character code ‘A’ (65), return a bitmap of which pixels should be lit. For anti-aliasing, each pixel also has an alpha value (0-255) indicating how much foreground color to blend.


Concepts You Must Understand First

Stop and research these before coding:

  1. Bitmap Font Structure
    • How are glyph bitmaps typically stored (1-bit per pixel)?
    • What’s the relationship between character code and glyph index?
    • How do you handle characters not in the font?
    • Book Reference: Create your own understanding from font data formats
  2. Anti-Aliasing Concept
    • Why do diagonal lines look jagged at low resolution?
    • How does anti-aliasing (partial pixel coverage) help, and how does it differ from sub-pixel rendering?
    • What’s the alpha blending formula?
    • Book Reference: “Computer Graphics from Scratch” Ch. 2
  3. Text Layout
    • What’s “advance width” in typography?
    • What’s kerning and why does “AV” look bad without it?
    • How do you calculate line height for wrapped text?

Questions to Guide Your Design

Before implementing, think through these:

  1. Font Storage
    • Should fonts be stored in RAM or flash?
    • For an 8x16 bitmap font with 95 ASCII characters, how many bytes?
    • How do you represent anti-aliased glyphs (multiple bits per pixel)?
  2. Rendering Pipeline
    • Given string “Hi”, how do you calculate where each character goes?
    • Should you render character-by-character or build a text bitmap first?
    • How do you handle text that doesn’t fit on one line?
  3. Performance
    • Anti-aliasing requires reading background pixels—how does this affect speed?
    • Should you cache rendered strings?
    • Can you use DMA for text rendering?

Thinking Exercise

Decode a Character Bitmap

Here’s how ‘A’ might be stored as 8x16 bitmap:

Character 'A' (8 pixels wide, 16 pixels tall)
Stored as 16 bytes (one byte per row):

Byte 0:  0b00011000  →  ...##...
Byte 1:  0b00111100  →  ..####..
Byte 2:  0b01100110  →  .##..##.
Byte 3:  0b01100110  →  .##..##.
Byte 4:  0b11111110  →  #######.
Byte 5:  0b11000110  →  ##...##.
Byte 6:  0b11000110  →  ##...##.
Byte 7:  0b11000110  →  ##...##.
... (more rows for full 16-pixel height)

Questions:

  • To render ‘A’ at position (x=10, y=20), which pixels do you set?
  • If bit 7 is the leftmost pixel, how do you extract it?
  • How would you store ‘A’ with anti-aliasing (4 bits per pixel)?
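
For the second and third questions, one possible answer is sketched below. It assumes bit 7 is the leftmost pixel, as the exercise states; for the 4-bit case it assumes two pixels per byte with the left pixel in the high nibble. That packing and the helper names are illustrative choices, not a fixed format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 1 bpp: is the pixel at column col (0 = leftmost) set in this row byte? */
static bool glyph_bit(uint8_t row_byte, unsigned col) {
    assert(col < 8);
    return (row_byte & (0x80u >> col)) != 0;
}

/* 4 bpp: coverage (0..15) of pixel col in a row packed two pixels per byte,
 * left pixel in the high nibble. */
static uint8_t glyph_alpha4(const uint8_t *row, unsigned col) {
    assert(row != NULL);
    uint8_t byte = row[col / 2];
    return (col % 2 == 0) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0Fu);
}
```

With this packing an 8-pixel row takes 4 bytes instead of 1, so an 8x16 glyph grows from 16 to 64 bytes.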

The Interview Questions They’ll Ask

  1. “Explain how anti-aliasing works for font rendering.”
  2. “What’s the alpha blending formula? Derive it from first principles.”
  3. “You’re seeing fonts render with wrong spacing. What could cause this?”
  4. “How would you implement text word-wrapping for a fixed-width display?”
  5. “Compare bitmap fonts vs vector fonts. When would you use each?”

Hints in Layers

Hint 1: Starting Point Start with a monospace 8x16 bitmap font (only ASCII printable characters 32-126). Store as a simple array of 16 bytes per character.

Hint 2: Rendering Loop For each character: (1) lookup glyph data, (2) for each row of the glyph, (3) for each bit in the row, if set, draw pixel at (x + bit_index, y + row).

Hint 3: Anti-Aliasing Store glyphs with 4 bits per pixel (16 levels). Blend formula: color = (alpha * fg_color + (15 - alpha) * bg_color) / 15.

Hint 4: Variable Width Include a width table (one byte per character) in addition to the bitmap data. Advance cursor by glyph width, not fixed width.
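
Hint 4’s width table can be sketched as follows. The `font_t` layout and the function names are assumptions for illustration; characters outside the font fall back to a fixed default advance here, which is only one of several reasonable policies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t first_char;      /* code of the first glyph, e.g. 32 (space) */
    uint8_t glyph_count;     /* number of glyphs, e.g. 95 */
    uint8_t default_advance; /* used for characters not in the font */
    const uint8_t *advance;  /* one advance width per glyph, in pixels */
} font_t;

/* Horizontal advance for one character. */
static uint8_t glyph_advance(const font_t *font, unsigned char c) {
    if (c < font->first_char || c >= font->first_char + font->glyph_count)
        return font->default_advance;
    return font->advance[c - font->first_char];
}

/* Total pixel width of a NUL-terminated string on a single line. */
static unsigned text_width(const font_t *font, const char *text) {
    assert(font != NULL && text != NULL);
    unsigned width = 0;
    for (const char *p = text; *p != '\0'; ++p)
        width += glyph_advance(font, (unsigned char)*p);
    return width;
}
```

The same loop drives cursor advancement while drawing and the fits-on-this-line check for word wrapping.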


Books That Will Help

Topic Book Chapter
Anti-aliasing basics “Computer Graphics from Scratch” by Gambetta Ch. 2
Alpha blending “Computer Graphics from Scratch” by Gambetta Ch. 14
Text rendering SDL2 TTF documentation N/A
Typography basics Online resources on typography N/A

Implementation Hints

Font Data Structure:

┌─────────────────────────────────────────────────────────────┐
│                    BITMAP FONT STORAGE                      │
│                                                             │
│  HEADER:                                                    │
│  ┌────────────┬────────────┬────────────┬────────────┐     │
│  │ Width: 8   │ Height: 16 │ First: 32  │ Count: 95  │     │
│  └────────────┴────────────┴────────────┴────────────┘     │
│                                                             │
│  GLYPH DATA (16 bytes per glyph × 95 glyphs = 1520 bytes): │
│                                                             │
│  Glyph ' ' (32):  00 00 00 00 00 00 00 00 ...              │
│  Glyph '!' (33):  18 18 18 18 18 00 18 00 ...              │
│  Glyph 'A' (65):  18 3C 66 66 FE C6 C6 C6 ...              │
│  ...                                                        │
│                                                             │
│  TO RENDER CHARACTER c AT (x, y):                          │
│                                                             │
│  glyph_index = c - 32  // (if c >= 32 && c < 127)          │
│  glyph_data = &font_data[glyph_index * 16]                 │
│                                                             │
│  for row in 0..16:                                          │
│      byte = glyph_data[row]                                 │
│      for bit in 0..8:                                       │
│          if byte & (0x80 >> bit):                           │
│              draw_pixel(x + bit, y + row, color)           │
└─────────────────────────────────────────────────────────────┘
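
The pseudocode in the diagram translates to C fairly directly. This sketch draws into an in-memory RGB565 frame buffer rather than calling a draw_pixel routine, clips at the buffer edges, and substitutes '?' for characters outside the font. The `framebuf_t` type and those policy choices are assumptions, not part of the original design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GLYPH_W    8
#define GLYPH_H    16
#define FONT_FIRST 32
#define FONT_COUNT 95

typedef struct {
    uint16_t *pixels;  /* RGB565, row-major */
    int width, height;
} framebuf_t;

/* Draw character c with its top-left corner at (x, y).
 * font_data holds FONT_COUNT glyphs of GLYPH_H bytes each, bit 7 = leftmost. */
static void draw_char(framebuf_t *fb, const uint8_t *font_data,
                      int x, int y, char c, uint16_t color) {
    assert(fb != NULL && fb->pixels != NULL && font_data != NULL);
    unsigned char uc = (unsigned char)c;
    if (uc < FONT_FIRST || uc >= FONT_FIRST + FONT_COUNT)
        uc = '?';  /* fallback for characters not in the font */
    const uint8_t *glyph = &font_data[(uc - FONT_FIRST) * GLYPH_H];

    for (int row = 0; row < GLYPH_H; ++row) {
        int py = y + row;
        if (py < 0 || py >= fb->height) continue;     /* clip vertically */
        uint8_t bits = glyph[row];
        for (int bit = 0; bit < GLYPH_W; ++bit) {
            int px = x + bit;
            if (px < 0 || px >= fb->width) continue;  /* clip horizontally */
            if (bits & (0x80u >> bit))
                fb->pixels[py * fb->width + px] = color;
        }
    }
}
```

Clipping per pixel is simple but slow; once this works, clipping the row and column ranges once per glyph is a natural optimization.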

Alpha Blending Formula:

┌─────────────────────────────────────────────────────────────┐
│                    ALPHA BLENDING                           │
│                                                             │
│  Given:                                                     │
│    fg = foreground color (text color)                       │
│    bg = background color (what's behind)                    │
│    alpha = coverage (0 = transparent, 255 = opaque)        │
│                                                             │
│  Formula (for each color channel):                          │
│                                                             │
│    out = (alpha * fg + (255 - alpha) * bg) / 255           │
│                                                             │
│  Example (gray text on white background):                   │
│    fg = 0 (black), bg = 255 (white), alpha = 128 (50%)     │
│    out = (128 * 0 + 127 * 255) / 255 = 127 (gray)          │
│                                                             │
│  For RGB565, apply to each channel separately:              │
│    out_r5 = blend(fg_r5, bg_r5, alpha)                     │
│    out_g6 = blend(fg_g6, bg_g6, alpha)                     │
│    out_b5 = blend(fg_b5, bg_b5, alpha)                     │
└─────────────────────────────────────────────────────────────┘
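
A sketch of the per-channel blend for RGB565, using 8-bit alpha as in the diagram and the same truncating integer division. The helper names are illustrative, and `alpha4_to_alpha8` shows one way to reuse the blend with 4-bit glyph coverage.

```c
#include <assert.h>
#include <stdint.h>

/* Blend one channel: out = (alpha*fg + (255-alpha)*bg) / 255 */
static uint8_t blend_channel(uint8_t fg, uint8_t bg, uint8_t alpha) {
    return (uint8_t)((alpha * fg + (255 - alpha) * bg) / 255);
}

/* Expand 4-bit coverage (0..15) to 8-bit alpha (0..255). */
static uint8_t alpha4_to_alpha8(uint8_t a4) {
    assert(a4 <= 15);
    return (uint8_t)(a4 * 17);
}

/* Blend two RGB565 pixels channel by channel. */
static uint16_t blend_rgb565(uint16_t fg, uint16_t bg, uint8_t alpha) {
    uint8_t r = blend_channel((fg >> 11) & 0x1F, (bg >> 11) & 0x1F, alpha);
    uint8_t g = blend_channel((fg >> 5) & 0x3F, (bg >> 5) & 0x3F, alpha);
    uint8_t b = blend_channel(fg & 0x1F, bg & 0x1F, alpha);
    return (uint16_t)((r << 11) | (g << 5) | b);
}
```

Alpha 0 returns the background unchanged and alpha 255 returns the foreground unchanged, so fully transparent and fully opaque pixels can skip the blend entirely.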

Learning milestones:

  1. Single characters render correctly → Font data structure works
  2. Strings render with correct spacing → Layout logic works
  3. Anti-aliased text looks smooth → Alpha blending implemented

Project 7: RISC-V vs ARM Benchmark — Compare the Two Architectures

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Assembly
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Computer Architecture / ISA Comparison
  • Software or Tool: Pico SDK, dual-build toolchains
  • Main Book: “Computer Organization and Design RISC-V Edition” by Patterson & Hennessy

What you’ll build: A benchmarking suite that runs identical graphics/computation workloads on both ARM Cortex-M33 and RISC-V Hazard3 cores, measuring and visualizing performance differences on the LCD.

Why it teaches computer architecture: The RP2350 is unique—same chip, two ISAs. By running identical code and comparing results, you’ll see how instruction set design affects real performance. FPU vs soft-float, SIMD, branch prediction—all become visible.

Core challenges you’ll face:

  • Building for both architectures → -DPICO_PLATFORM=rp2350 vs -DPICO_PLATFORM=rp2350-riscv
  • Measuring cycles accurately → Using cycle counters (DWT on ARM, mcycle on RISC-V)
  • Fair comparison → Same algorithm, same compiler optimizations
  • Floating-point performance → ARM has FPU, RISC-V doesn’t
  • Interpreting results → Understanding why differences occur

Key Concepts:

  • ISA Design: “Computer Organization and Design RISC-V Edition” Ch. 2 — Patterson & Hennessy
  • ARM Cortex-M33: ARM Technical Reference Manual
  • RISC-V Hazard3: Hazard3 Documentation (Wren6991/Hazard3)

Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Projects 1-6 completed, understanding of assembly basics


Real World Outcome

You’ll have a comprehensive benchmark comparing both architectures:

Example Output (on display):

┌──────────────────────────────────────────┐
│     RP2350 ARCHITECTURE BENCHMARK        │
│                                          │
│  TEST           ARM     RISC-V   RATIO   │
│  ─────────────────────────────────────   │
│  Integer add    1.00    1.02     0.98x   │
│  Integer mul    1.00    1.05     0.95x   │
│  Branching      1.00    0.95     1.05x   │
│  Memory copy    1.00    1.03     0.97x   │
│  FP add (f32)   1.00    48.2     0.02x   │
│  FP mul (f32)   1.00    52.7     0.02x   │
│  Mandelbrot     1.00    45.3     0.02x   │
│  Line draw      1.00    1.12     0.89x   │
│  Circle draw    1.00    1.08     0.93x   │
│  Text render    1.00    1.15     0.87x   │
│                                          │
│  VERDICT: ARM dominates FP-heavy work    │
│           RISC-V competitive elsewhere   │
└──────────────────────────────────────────┘

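What you’ll see: Clear visualization of where each architecture excels. The ARM and RISC-V columns show run time normalized to ARM (lower is faster), and RATIO is ARM time divided by RISC-V time, so values below 1.00x mean RISC-V was slower on that test.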


The Core Question You’re Answering

“Why does the same code run at different speeds on different processor architectures?”

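The answer lies in how each architecture implements instructions. The ARM Cortex-M33 has a hardware single-precision FPU; the RISC-V Hazard3 does not, so floating-point math falls back to software routines that can be tens of times slower. The Cortex-M33 also has DSP extensions, while Hazard3 implements RV32IMAC plus bit-manipulation extensions but no DSP or floating-point instructions. These design choices have real-world consequences.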


Questions to Guide Your Design

Before implementing, think through these:

  1. Benchmark Selection
    • What workloads stress integer vs floating-point?
    • How do you ensure both builds are optimized equally?
    • What’s the minimum iterations needed for stable timing?
  2. Cycle Counting
    • How do you read cycle counters on ARM (DWT_CYCCNT)?
    • How do you read cycle counters on RISC-V (CSR mcycle)?
    • How do you handle counter overflow?
  3. Build Process
    • How do you build the same source for ARM and RISC-V?
    • What compiler flags ensure fair comparison?
    • How do you switch architectures at runtime (or do you)?
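
On the overflow question: a 32-bit cycle counter at 150 MHz wraps roughly every 28.6 seconds, but unsigned subtraction still yields the right interval as long as the measured region is shorter than one wrap. A minimal sketch (the helper names are hypothetical, and the 150 MHz figure assumes the default system clock):

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed cycles between two 32-bit counter reads; correct across one wrap
 * because unsigned subtraction is modulo 2^32. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end) {
    return end - start;
}

/* Relative run time of a test versus a baseline (1.00 = same speed). */
static double relative_time(uint32_t test_cycles, uint32_t baseline_cycles) {
    assert(baseline_cycles > 0);
    return (double)test_cycles / (double)baseline_cycles;
}
```

Benchmarks that may run longer than one wrap need a 64-bit count, for example by also reading the high half of the counter where the architecture provides one.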

Implementation Hints

Dual-Build CMake Setup:

# Build for ARM (default):
cmake -S . -B build-arm -DPICO_BOARD=pico2

# Build for RISC-V (needs a RISC-V GCC toolchain that the SDK can find):
cmake -S . -B build-riscv -DPICO_PLATFORM=rp2350-riscv -DPICO_BOARD=pico2

Cycle Counter Access:

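ARM (Cortex-M33):
  // Needs the CMSIS core headers; enable trace, then zero and start the counter
  CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
  DWT->CYCCNT = 0;
  DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;
  // Read: uint32_t cycles = DWT->CYCCNT;

RISC-V (Hazard3):
  // Low 32 bits of the cycle counter. If the value never changes, the counter
  // may be inhibited: clear bit 0 (CY) of the mcountinhibit CSR first.
  uint32_t mcycle;
  asm volatile ("csrr %0, mcycle" : "=r"(mcycle));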

Learning milestones:

  1. Both builds compile and run → Toolchain mastery
  2. Cycle counts are accurate → Measurement infrastructure works
  3. Results match expectations → You understand architectural tradeoffs

Project 8: TF Card Image Viewer — File System Integration

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: File Systems / Image Decoding
  • Software or Tool: Pico SDK, FatFS library
  • Main Book: “Making Embedded Systems” by Elecia White

What you’ll build: An image viewer that reads BMP and raw image files from the TF (microSD) card slot and displays them on the LCD, with navigation via serial commands.

Why it teaches file systems: The TF card is accessed over SPI as a raw block device, and a FAT32 file system sits on top of those blocks. This project connects storage hardware, file system drivers, and image decoding into a complete data pipeline from card to screen.

Core challenges you’ll face:

  • SPI configuration for SD card → Different mode than display (slower, different pins)
  • FAT32 file system integration → Using FatFS or similar library
  • Image format parsing → BMP header structure, pixel data extraction
  • Memory constraints → Can’t load full image to RAM; stream it
  • Error handling → Card removal, corrupted files, unsupported formats

Key Concepts:

  • FAT File System: “Making Embedded Systems” Ch. 9 — Elecia White
  • BMP File Format: Microsoft BMP documentation
  • SD Card SPI Mode: SD Card Simplified Specification

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Project 1 completed, understanding of file systems


Real World Outcome

You’ll have an image viewer:

Example Output:

$ # Via serial console:
TF Card Image Viewer
Card detected: SDHC 16GB
File system: FAT32
Free space: 14.2 GB

> ls
photos/
  sunset.bmp     (172x320, 161KB)
  portrait.bmp   (172x320, 161KB)
icons/
  wifi.bmp       (32x32, 2KB)

> show photos/sunset.bmp
Loading: photos/sunset.bmp
Format: BMP 24-bit
Dimensions: 172x320
Converting to RGB565...
Streaming to display (110,080 bytes)...
Done in 450ms

# The display shows the sunset image!

What you’ll see: Your own images displayed on the LCD from the SD card.


Implementation Hints

BMP File Structure:

┌─────────────────────────────────────────────────────────────┐
│                     BMP FILE FORMAT                         │
│                                                             │
│  HEADER (14 bytes):                                         │
│  ┌────────────────────────────────────────────────────────┐│
│  │ Offset 0:  'B' 'M'        (magic number)               ││
│  │ Offset 2:  File size      (4 bytes, little-endian)     ││
│  │ Offset 10: Pixel offset   (where image data starts)    ││
│  └────────────────────────────────────────────────────────┘│
│                                                             │
│  DIB HEADER (40+ bytes):                                    │
│  ┌────────────────────────────────────────────────────────┐│
│  │ Offset 14: Header size    (40 for BITMAPINFOHEADER)    ││
│  │ Offset 18: Width          (4 bytes, signed)            ││
│  │ Offset 22: Height         (4 bytes; negative=top-down) ││
│  │ Offset 28: Bits per pixel (24 = RGB888)                ││
│  └────────────────────────────────────────────────────────┘│
│                                                             │
│  PIXEL DATA (at pixel offset):                              │
│  • Stored bottom-to-top (unless height is negative)        │
│  • Each row padded to 4-byte boundary                      │
│  • 24-bit: BGR order (not RGB!)                            │
│                                                             │
│  CONVERSION: BMP (BGR888) → Display (RGB565)               │
│  r5 = (r >> 3) & 0x1F                                      │
│  g6 = (g >> 2) & 0x3F                                      │
│  b5 = (b >> 3) & 0x1F                                      │
│  pixel = (r5 << 11) | (g6 << 5) | b5                       │
└─────────────────────────────────────────────────────────────┘
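
The header fields and conversion above can be sketched in portable C. This version reads fields byte by byte (so it does not depend on struct packing or host endianness), accepts only uncompressed 24-bit files, and uses hypothetical names (`bmp_info_t`, `bmp_parse`, and so on) that do not come from FatFS or the Pico SDK. The compression field at offset 30 is not shown in the diagram; it is part of the standard BITMAPINFOHEADER.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t pixel_offset; /* file offset of the first stored pixel row */
    int32_t  width;
    int32_t  height;       /* negative means rows are stored top-down */
    uint16_t bpp;
} bmp_info_t;

static uint16_t rd_le16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the first 54 bytes of a BMP file. Returns false if unsupported. */
static bool bmp_parse(const uint8_t *hdr, size_t len, bmp_info_t *out) {
    assert(hdr != NULL && out != NULL);
    if (len < 54 || hdr[0] != 'B' || hdr[1] != 'M') return false;
    if (rd_le32(hdr + 14) < 40) return false;   /* need BITMAPINFOHEADER or later */
    out->pixel_offset = rd_le32(hdr + 10);
    out->width  = (int32_t)rd_le32(hdr + 18);
    out->height = (int32_t)rd_le32(hdr + 22);
    out->bpp    = rd_le16(hdr + 28);
    uint32_t compression = rd_le32(hdr + 30);   /* 0 = BI_RGB (uncompressed) */
    return out->width > 0 && out->height != 0 && out->bpp == 24 && compression == 0;
}

/* Bytes per stored row, including padding to a 4-byte boundary. */
static uint32_t bmp_row_stride(int32_t width, uint16_t bpp) {
    return (((uint32_t)width * bpp + 31u) / 32u) * 4u;
}

/* One BMP pixel (stored B, G, R) to RGB565. */
static uint16_t bgr888_to_rgb565(uint8_t b, uint8_t g, uint8_t r) {
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

For a 172x320 24-bit image the stride is 516 bytes, so a single row buffer of that size is enough to stream the file row by row. A positive height means the first row in the file is the bottom of the image.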

Learning milestones:

  1. Card mounts and lists files → SPI and FAT32 work
  2. BMP header parses correctly → File format understood
  3. Image displays correctly → Full pipeline works

Project 9: Real-Time System Monitor — CPU/Memory/Temp Dashboard

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: MicroPython
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: System Monitoring / Data Visualization
  • Software or Tool: Pico SDK, on-chip sensors
  • Main Book: “RP2350 Datasheet” (Raspberry Pi)

What you’ll build: A real-time dashboard on the LCD showing CPU utilization (both cores), memory usage, on-chip temperature, and optional external sensor data—updated 10+ times per second.

Why it teaches system internals: Measuring your own system requires understanding what “CPU utilization” even means on a microcontroller, how the on-chip temperature sensor works, and how to visualize data in real-time without slowing down the thing you’re measuring.

Core challenges you’ll face:

  • Measuring CPU utilization → No OS scheduler; measure idle time yourself
  • Reading temperature sensor → ADC configuration for internal sensor
  • Tracking memory → Stack high-water mark, heap usage
  • Real-time updating → 10 FPS without blocking measurements
  • Visualization → Bar graphs, line charts, numerical displays
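
For the stack high-water mark, a common technique is stack painting: fill the stack region with a known pattern at startup, then later count how many words still hold the pattern. The sketch below shows the idea on an ordinary array; the fill value, the function names, and the assumption that the stack grows downward from the end of the region are all illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu

/* Paint a stack region with the fill pattern (before the stack is in use). */
static void stack_paint(uint32_t *lowest, size_t words) {
    assert(lowest != NULL);
    for (size_t i = 0; i < words; ++i) lowest[i] = STACK_FILL;
}

/* Count untouched words starting at the lowest address. With a
 * downward-growing stack, these are the words that were never used. */
static size_t stack_unused_words(const uint32_t *lowest, size_t words) {
    assert(lowest != NULL);
    size_t n = 0;
    while (n < words && lowest[n] == STACK_FILL) ++n;
    return n;
}

/* Peak stack usage in bytes. */
static size_t stack_high_water_bytes(const uint32_t *lowest, size_t words) {
    return (words - stack_unused_words(lowest, words)) * sizeof(uint32_t);
}
```

On the board, the region bounds would come from the linker script's stack symbols for each core, which this sketch leaves out.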

Key Concepts:

  • RP2350 ADC & Temperature: RP2350 Datasheet Ch. 4.9 — Raspberry Pi
  • Real-Time Rendering: Your previous graphics projects
  • Profiling Techniques: “Making Embedded Systems” Ch. 12 — Elecia White

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Projects 1-3 completed


Real World Outcome

You’ll have a live system monitor:

Display shows:

┌────────────────────────────────────────┐
│         RP2350 SYSTEM MONITOR          │
├────────────────────────────────────────┤
│  CPU USAGE                             │
│  Core 0: ████████████░░░░ 75%         │
│  Core 1: ████░░░░░░░░░░░░ 25%         │
│                                        │
│  MEMORY                                │
│  SRAM:   ██████████░░░░░░ 220/520 KB  │
│  Stack0: ████░░░░░░░░░░░░ 2.1 KB      │
│  Stack1: ██░░░░░░░░░░░░░░ 1.1 KB      │
│                                        │
│  TEMPERATURE                           │
│  ┌──────────────────────────────────┐ │
│  │ 45°C  ─────────────────────────  │ │
│  │       ────────╱────────────────  │ │
│  │ 25°C  ────────────────────────── │ │
│  │       └─────────────────────────►│ │
│  │       -60s                    now│ │
│  └──────────────────────────────────┘ │
│                                        │
│  Uptime: 02:34:17    Refresh: 12 Hz   │
└────────────────────────────────────────┘

What you’ll see: Live updating graphs and metrics.


Implementation Hints

CPU Utilization Measurement:

┌─────────────────────────────────────────────────────────────┐
│              MEASURING CPU UTILIZATION                      │
│                                                             │
│  APPROACH: Track how long the CPU spends in idle loop      │
│                                                             │
│  1. Set a flag in your main loop when doing "real work"    │
│  2. In the idle loop (or tight_loop_contents), count cycles│
│  3. CPU_utilization = 1 - (idle_cycles / total_cycles)     │
│                                                             │
│  TIMER APPROACH:                                            │
│  • Start a repeating timer (e.g., 100ms interval)          │
│  • In timer callback, read cycle counter                   │
│  • Compute: total_cycles - idle_cycles = work_cycles       │
│  • Reset counters for next interval                        │
│                                                             │
│  FOR CORE 1:                                                │
│  • Core 1 runs its own loop; use multicore FIFO to report  │
│  • Or use shared volatile variable (with care)             │
└─────────────────────────────────────────────────────────────┘
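
The counting scheme in the box above can be sketched in plain C. This is a minimal sketch under stated assumptions, not an SDK mechanism: `cpu_window_t`, `cpu_busy_percent`, and `cpu_window_close` are hypothetical names, and the cycle values are assumed to come from whatever free-running counter you sample in the timer callback.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-core accounting for one sampling window. */
typedef struct {
    uint32_t window_start;  /* counter value when the window opened    */
    uint32_t idle_cycles;   /* cycles accumulated inside the idle loop */
} cpu_window_t;

/* Busy share of a window in percent: 100 * (1 - idle / total). */
uint32_t cpu_busy_percent(uint32_t idle_cycles, uint32_t total_cycles)
{
    if (total_cycles == 0) {
        return 0;                        /* empty window: report 0 % */
    }
    assert(idle_cycles <= total_cycles); /* idle time cannot exceed the window */
    uint64_t busy = (uint64_t)(total_cycles - idle_cycles) * 100u;
    return (uint32_t)(busy / total_cycles);
}

/* Close the window at counter value `now`, return the busy percentage,
 * and reset the counters for the next interval. */
uint32_t cpu_window_close(cpu_window_t *w, uint32_t now)
{
    uint32_t total = now - w->window_start;  /* unsigned math survives wraparound */
    uint32_t pct = cpu_busy_percent(w->idle_cycles, total);
    w->window_start = now;
    w->idle_cycles = 0;
    return pct;
}
```

Keeping one `cpu_window_t` per core means Core 1 only has to publish a single percentage per interval, which fits in one FIFO word.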

Temperature Sensor Reading:

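// Assumes the Pico SDK; these calls come from hardware/adc.h
#include "pico/stdlib.h"
#include "hardware/adc.h"

// Enable the on-chip temperature sensor
adc_init();
adc_set_temp_sensor_enabled(true);
adc_select_input(4);  // Channel 4 = temperature sensor on the RP2350A (QFN-60); the RP2350B uses channel 8

// Read the 12-bit result and convert it to degrees Celsius
uint16_t raw = adc_read();
float voltage = raw * 3.3f / 4096.0f;                   // assumes a 3.3 V ADC reference
float temp_c = 27.0f - (voltage - 0.706f) / 0.001721f;  // 0.706 V at 27 °C, slope of -1.721 mV per °C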
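
A single ADC sample is noisy, so the temperature graph usually looks better if several raw readings are averaged before conversion. The sketch below separates the arithmetic from the hardware so it can be checked on a PC; `adc_average` and `adc_raw_to_celsius` are hypothetical helper names, and the 3.3 V reference is an assumption carried over from the snippet above.

```c
#include <stddef.h>
#include <stdint.h>

/* Average n raw 12-bit samples; returns 0 for an empty buffer. */
uint16_t adc_average(const uint16_t *samples, size_t n)
{
    if (n == 0) {
        return 0;
    }
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return (uint16_t)(sum / n);
}

/* Convert a raw 12-bit reading to degrees Celsius using the same
 * formula as above (assumes a 3.3 V ADC reference). */
float adc_raw_to_celsius(uint16_t raw)
{
    float voltage = (float)raw * 3.3f / 4096.0f;
    return 27.0f - (voltage - 0.706f) / 0.001721f;
}
```

Because the sensor voltage falls as the die warms up, a lower raw value means a higher temperature.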

Learning milestones:

  1. Temperature reads correctly → ADC works
  2. CPU graph updates smoothly → Measurement doesn’t affect result
  3. All metrics display in real-time → Full integration

Project 10: Bare-Metal Display Driver — No SDK, Just Registers

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C (or Assembly)
  • Alternative Programming Languages: Rust (no_std)
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 5: Master
  • Knowledge Area: Bare-Metal Programming / Hardware Registers
  • Software or Tool: arm-none-eabi-gcc, linker scripts
  • Main Book: “RP2350 Datasheet” (Raspberry Pi) — Full document

What you’ll build: A complete display driver that works without the Pico SDK—direct register manipulation for clocks, GPIO, SPI, and display initialization. True bare-metal.

Why it teaches the deepest level: The SDK abstracts away everything. This project removes that abstraction, forcing you to understand every register, every clock tree configuration, every hardware quirk. After this, you can port any RTOS or bare-metal framework to the RP2350.

Core challenges you’ll face:

  • Boot process without SDK → Understanding the RP2350 boot ROM
  • Clock configuration → PLL, system clock, peripheral clocks
  • GPIO pad and IO configuration → Function selection, pull-ups, drive strength
  • SPI register-level programming → SSPCR0, SSPCR1, SSPDR, etc.
  • Linker scripts and startup code → Memory layout, vector table, C runtime
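
For the SPI challenge above, the bit rate of the PL022 block is set by two fields: SSPCLKOUT = F_SSPCLK / (CPSDVSR × (1 + SCR)), where CPSDVSR is an even value from 2 to 254 and SCR ranges from 0 to 255. The sketch below searches for the fastest rate that does not exceed a target. `spi_div_t` and `spi_find_divisors` are hypothetical names, and the input clock is whatever your own clock setup feeds the peripheral.

```c
#include <stdint.h>

/* Hypothetical result type: the two divider fields plus the rate they give. */
typedef struct {
    uint32_t cpsdvsr;    /* prescaler, even, 2..254 (SSPCPSR register) */
    uint32_t scr;        /* serial clock rate, 0..255 (SSPCR0 bits 15:8) */
    uint32_t actual_hz;  /* resulting SPI clock */
} spi_div_t;

/* Exhaustive search for the fastest rate <= target_hz.
 * If even the slowest setting is above target_hz, the slowest is returned. */
spi_div_t spi_find_divisors(uint32_t clk_peri_hz, uint32_t target_hz)
{
    spi_div_t best = { 254u, 255u, clk_peri_hz / (254u * 256u) };

    for (uint32_t cps = 2; cps <= 254; cps += 2) {
        for (uint32_t scr = 0; scr <= 255; scr++) {
            uint32_t hz = clk_peri_hz / (cps * (scr + 1u));
            if (hz <= target_hz && hz > best.actual_hz) {
                best.cpsdvsr = cps;
                best.scr = scr;
                best.actual_hz = hz;
            }
        }
    }
    return best;
}
```

With a 150 MHz peripheral clock (an assumed value), a 62.5 MHz request lands on 37.5 MHz, because the only faster setting is 75 MHz. That granularity is worth knowing before you promise a frame rate.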

Key Concepts:

  • RP2350 Register Map: RP2350 Datasheet Ch. 2-4 — Raspberry Pi
  • Boot Sequence: RP2350 Datasheet Ch. 5 — Boot ROM
  • Bare Metal Programming: “Bare Metal C” by Steve Oualline

Difficulty: Master. Time estimate: 1 month. Prerequisites: All previous projects completed, assembly experience.


Real World Outcome

You’ll have a minimal bare-metal program:

Project structure:

bare_metal_display/
├── link.ld           # Linker script (memory layout)
├── startup.s         # Assembly startup code
├── main.c            # Your bare-metal code
├── registers.h       # Hardware register definitions
├── clocks.c          # Clock tree configuration
├── gpio.c            # GPIO functions
├── spi.c             # SPI driver
└── display.c         # ST7789 driver

Binary size: 4KB (vs ~100KB with SDK)
Boot time: <10ms to first pixel

What you’ll see: Same display output as Project 1, but with complete understanding of how it works.


Implementation Hints

Register Access Pattern:

// Direct register access (no SDK)
#include <stdint.h>

#define SPI0_BASE       0x40080000u  // RP2350 address map (0x4003C000 is the RP2040 address)
#define SSPCR0_OFFSET   0x00
#define SSPCR1_OFFSET   0x04
#define SSPDR_OFFSET    0x08
#define SSPSR_OFFSET    0x0C

#define SPI0_CR0  (*(volatile uint32_t*)(SPI0_BASE + SSPCR0_OFFSET))
#define SPI0_CR1  (*(volatile uint32_t*)(SPI0_BASE + SSPCR1_OFFSET))
#define SPI0_DR   (*(volatile uint32_t*)(SPI0_BASE + SSPDR_OFFSET))
#define SPI0_SR   (*(volatile uint32_t*)(SPI0_BASE + SSPSR_OFFSET))

// Wait for TX FIFO not full, then send one byte
static inline void spi0_write_byte(uint8_t byte) {
    while (!(SPI0_SR & (1u << 1))) {}  // TNF bit: transmit FIFO not full
    SPI0_DR = byte;
}
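
One way to keep a register-level driver testable is to describe the PL022 register block as a struct and pass a pointer to it, so the same function runs against real hardware or a fake block on your PC. This is a sketch under that assumption; `pl022_regs_t` and `spi_write_blocking` are hypothetical names. It also waits for the BSY flag to clear, which matters for the ST7789 because the DC and CS pins must not change while the last byte is still shifting out.

```c
#include <stdint.h>

/* PL022 register block in offset order. */
typedef struct {
    volatile uint32_t cr0;   /* 0x00 SSPCR0  */
    volatile uint32_t cr1;   /* 0x04 SSPCR1  */
    volatile uint32_t dr;    /* 0x08 SSPDR   */
    volatile uint32_t sr;    /* 0x0C SSPSR   */
    volatile uint32_t cpsr;  /* 0x10 SSPCPSR */
} pl022_regs_t;

#define SSPSR_TNF (1u << 1)  /* transmit FIFO not full */
#define SSPSR_BSY (1u << 4)  /* transfer in progress   */

/* Push len bytes into the TX FIFO, then wait until the shifter is idle. */
void spi_write_blocking(pl022_regs_t *spi, const uint8_t *buf, uint32_t len)
{
    for (uint32_t i = 0; i < len; i++) {
        while (!(spi->sr & SSPSR_TNF)) {
            /* FIFO full: spin until a slot frees up */
        }
        spi->dr = buf[i];
    }
    while (spi->sr & SSPSR_BSY) {
        /* last byte still on the wire */
    }
}
```

On the chip you would pass the peripheral base address cast to `pl022_regs_t *`; in a host test you pass the address of an ordinary struct with the TNF bit set.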

Minimal Linker Script Concept:

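/* Concept only: a bootable RP2350 image also needs a valid IMAGE_DEF block
   within the first 4 KB of flash, or the boot ROM will not run it. */
MEMORY {
    FLASH (rx) : ORIGIN = 0x10000000, LENGTH = 16M
    RAM (rwx)  : ORIGIN = 0x20000000, LENGTH = 520K
}

SECTIONS {
    /* KEEP stops the linker from discarding the otherwise unreferenced vector table */
    .text : { KEEP(*(.vector_table)) *(.text*) *(.rodata*) } > FLASH
    .data : { *(.data*) } > RAM AT > FLASH   /* runs from RAM, load image stored in flash */
    .bss  : { *(.bss*) *(COMMON) } > RAM     /* zeroed by startup code */
}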
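
The `.data` and `.bss` lines above imply two jobs for the startup code: copy initialized variables from their flash load address into RAM, and zero the uninitialized ones. A minimal sketch of those two loops follows. The function names are hypothetical, and on the chip the pointers would come from symbols that your linker script exports (the script above does not define any yet).

```c
#include <stdint.h>

/* Copy the .data load image (in flash) to its run address (in RAM). */
void crt0_copy_data(const uint32_t *src, uint32_t *dst, uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Clear .bss so that uninitialized globals start at zero, as C requires. */
void crt0_zero_bss(uint32_t *start, uint32_t *end)
{
    while (start < end) {
        *start++ = 0;
    }
}
```

A reset handler would call both before `main()`. Word-sized copies assume the sections are 4-byte aligned, which is something to enforce in the linker script.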

Learning milestones:

  1. Code compiles without SDK → Toolchain setup correct
  2. Chip boots and runs your code → Boot sequence understood
  3. Display works identically to SDK version → Complete bare-metal mastery

Project 11: Simple Game — Pong or Snake with Button Input

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, MicroPython
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Game Development / Real-Time Systems
  • Software or Tool: Pico SDK, GPIO buttons or serial input
  • Main Book: “Game Programming Patterns” by Robert Nystrom

What you’ll build: A playable game (Pong or Snake) targeting 60 FPS on the LCD, with input via GPIO buttons or serial commands, a score display, and sound effects from a PWM-driven buzzer (an external part if your board does not include one).

Why it teaches real-time game development: Games require consistent frame timing, input handling, game state management, and audio—all while maintaining smooth graphics. This combines everything you’ve learned into a cohesive interactive system.

Core challenges you’ll face:

  • Game loop timing → Fixed 16.67ms frame time for 60 FPS
  • Input debouncing → Buttons bounce; serial has latency
  • Collision detection → Ball vs paddle, snake vs walls
  • State machines → Menu, playing, paused, game over
  • Audio timing → PWM for simple sound effects
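
The debouncing challenge above has a simple software answer: only accept a new button level after it has been seen on several consecutive samples. The sketch below assumes you call it once per fixed tick (for example every millisecond). `debounce_t`, `debounce_update`, and the five-tick threshold are illustrative choices, not values from the board documentation.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBOUNCE_TICKS 5u  /* consecutive differing samples needed to flip */

typedef struct {
    uint8_t count;   /* how many samples in a row disagreed with `stable` */
    bool    stable;  /* the debounced level reported to the game          */
} debounce_t;

/* Feed one raw sample; returns the debounced level. */
bool debounce_update(debounce_t *d, bool raw)
{
    if (raw == d->stable) {
        d->count = 0;                 /* agreement: forget any glitch */
    } else if (++d->count >= DEBOUNCE_TICKS) {
        d->stable = raw;              /* level held long enough: accept it */
        d->count = 0;
    }
    return d->stable;
}
```

A single bounce resets nothing visible to the game, so the paddle or snake only reacts to presses that last at least five ticks.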

Key Concepts:

  • Game Loop Architecture: “Game Programming Patterns” Ch. 3 — Robert Nystrom
  • PWM Audio: RP2350 Datasheet Section 12.5 — PWM
  • Input Handling: “Making Embedded Systems” Ch. 6 — Elecia White

Difficulty: Advanced. Time estimate: 2-3 weeks. Prerequisites: Projects 1-3 completed.


Real World Outcome

You’ll have a playable game:

Display shows Pong:

┌────────────────────────────────────────┐
│  P1: 3                          P2: 2  │
│                                        │
│  ██                              ██    │
│  ██                              ██    │
│  ██         ●                    ██    │
│  ██          \                   ██    │
│  ██           \                  ██    │
│                                        │
│                                        │
│  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ │
│                                        │
│  [Press BOOT to pause]                 │
└────────────────────────────────────────┘

What you’ll see: A playable game running smoothly at 60 FPS.
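
Whether 60 FPS is reachable depends on how long one frame takes to cross the SPI bus. A full 172×320 frame at 16 bits per pixel is 880,640 bits, so the budget is easy to check before writing any game code. The sketch below pairs that arithmetic with a fixed-timestep accumulator. The function names and the catch-up limit are hypothetical, and the SPI clocks in the usage note are example values rather than measurements from this board.

```c
#include <stdint.h>

#define FRAME_US          16667u  /* one 60 FPS frame, rounded up   */
#define MAX_CATCHUP_STEPS 5u      /* cap on updates after a stall   */

/* Microseconds needed to push one full frame over SPI (pixel data only). */
uint32_t frame_transfer_us(uint32_t w, uint32_t h, uint32_t bpp, uint32_t spi_hz)
{
    uint64_t bits = (uint64_t)w * h * bpp;
    return (uint32_t)(bits * 1000000u / spi_hz);
}

/* Fixed timestep: add the elapsed time, return how many game updates are due.
 * If the backlog exceeds the cap, the remainder is dropped so the game
 * slows down instead of spiralling. */
uint32_t fixed_steps_due(uint32_t *accum_us, uint32_t elapsed_us)
{
    uint32_t steps = 0;
    *accum_us += elapsed_us;
    while (*accum_us >= FRAME_US && steps < MAX_CATCHUP_STEPS) {
        *accum_us -= FRAME_US;
        steps++;
    }
    if (*accum_us >= FRAME_US) {
        *accum_us = 0;
    }
    return steps;
}
```

At an assumed 37.5 MHz SPI clock a full frame takes about 23.5 ms, which is longer than the 16.67 ms budget; at 62.5 MHz it takes about 14.1 ms. If your clock is on the slow side, redraw only the regions that changed (ball, paddles, score) rather than the whole screen.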


Project 12: USB HID Device — Turn the Board into a Custom Controller

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 4: Expert
  • Knowledge Area: USB Protocol / HID Class
  • Software or Tool: Pico SDK TinyUSB, USB HID descriptors
  • Main Book: “USB Complete” by Jan Axelson

What you’ll build: The board becomes a USB HID device that your computer recognizes as a custom game controller or macro keyboard, with the LCD showing current state and the RGB LED indicating connection status.

Why it teaches USB protocol: USB is everywhere but rarely understood. This project takes you through USB enumeration, descriptors, HID reports, and endpoint management—all the layers that make “plug and play” actually work.

Core challenges you’ll face:

  • USB descriptor configuration → Device, configuration, interface, endpoint, HID descriptors
  • HID report format → Defining your button/axis layout
  • TinyUSB integration → The Pico SDK’s USB stack
  • Cross-platform compatibility → Works on Windows, Mac, Linux
  • Latency optimization → USB polling intervals

Key Concepts:

  • USB Fundamentals: “USB Complete” by Jan Axelson — Ch. 1-4
  • HID Class: “USB Complete” by Jan Axelson — Ch. 8
  • TinyUSB: Pico SDK documentation

Difficulty: Expert. Time estimate: 2-3 weeks. Prerequisites: Projects 1-3 completed, USB basics.


Real World Outcome

Your computer sees:

$ lsusb
Bus 001 Device 015: ID 1234:5678 RP2350 Custom Controller

$ # On Windows: Game Controllers shows your device
$ # Your buttons appear as gamepad inputs!

Display shows:

┌────────────────────────────────────────┐
│     USB HID CONTROLLER                 │
│                                        │
│  Status: CONNECTED (12ms polling)      │
│                                        │
│  BUTTON STATE:                         │
│  ┌───┐ ┌───┐ ┌───┐ ┌───┐             │
│  │ A │ │ B │ │ X │ │ Y │             │
│  │ ○ │ │ ● │ │ ○ │ │ ○ │             │
│  └───┘ └───┘ └───┘ └───┘             │
│                                        │
│  AXIS:                                 │
│  X: ████████████░░░░ 75%              │
│  Y: ████████░░░░░░░░ 50%              │
│                                        │
│  Reports sent: 15,432                  │
└────────────────────────────────────────┘

What you’ll see: The LCD shows current controller state; your computer uses it as an input device.
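
To make the report format concrete, here is a sketch of a HID report descriptor for the four buttons and two axes shown on the mock-up, together with the matching 3-byte report. The layout (4 button bits, 4 padding bits, signed 8-bit X and Y) is one possible design, and `gamepad_report_t`, `axis_from_percent`, and `make_report` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Report descriptor: 4 buttons + 4 padding bits + X/Y as signed bytes. */
const uint8_t gamepad_report_desc[] = {
    0x05, 0x01,  /* Usage Page (Generic Desktop) */
    0x09, 0x05,  /* Usage (Game Pad)             */
    0xA1, 0x01,  /* Collection (Application)     */
    0x05, 0x09,  /*   Usage Page (Button)        */
    0x19, 0x01,  /*   Usage Minimum (Button 1)   */
    0x29, 0x04,  /*   Usage Maximum (Button 4)   */
    0x15, 0x00,  /*   Logical Minimum (0)        */
    0x25, 0x01,  /*   Logical Maximum (1)        */
    0x75, 0x01,  /*   Report Size (1 bit)        */
    0x95, 0x04,  /*   Report Count (4)           */
    0x81, 0x02,  /*   Input (Data, Var, Abs)     */
    0x75, 0x04,  /*   Report Size (4 bits)       */
    0x95, 0x01,  /*   Report Count (1)           */
    0x81, 0x03,  /*   Input (Const): padding     */
    0x05, 0x01,  /*   Usage Page (Generic Desktop) */
    0x09, 0x30,  /*   Usage (X)                  */
    0x09, 0x31,  /*   Usage (Y)                  */
    0x15, 0x81,  /*   Logical Minimum (-127)     */
    0x25, 0x7F,  /*   Logical Maximum (127)      */
    0x75, 0x08,  /*   Report Size (8 bits)       */
    0x95, 0x02,  /*   Report Count (2)           */
    0x81, 0x02,  /*   Input (Data, Var, Abs)     */
    0xC0         /* End Collection               */
};

typedef struct {
    uint8_t buttons;  /* bit 0 = A, bit 1 = B, bit 2 = X, bit 3 = Y */
    int8_t  x;
    int8_t  y;
} gamepad_report_t;

_Static_assert(sizeof(gamepad_report_t) == 3, "report must match descriptor");

/* Map 0..100 % (as drawn on the LCD) to the -127..127 logical range. */
int8_t axis_from_percent(uint32_t percent)
{
    if (percent > 100u) {
        percent = 100u;
    }
    return (int8_t)((int32_t)(percent * 254u / 100u) - 127);
}

gamepad_report_t make_report(bool a, bool b, bool x, bool y,
                             uint32_t x_pct, uint32_t y_pct)
{
    gamepad_report_t r;
    r.buttons = (uint8_t)((a ? 1u : 0u) | (b ? 2u : 0u) | (x ? 4u : 0u) | (y ? 8u : 0u));
    r.x = axis_from_percent(x_pct);
    r.y = axis_from_percent(y_pct);
    return r;
}
```

With TinyUSB, the descriptor would typically be returned from `tud_hid_descriptor_report_cb()` and each report sent with `tud_hid_report()`. Check the TinyUSB version bundled with your SDK for the exact callback set it expects.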


Final Project: Mini Operating System — Cooperative Multitasking on RP2350

  • File: LEARN_RP2350_LCD_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 5: Master
  • Knowledge Area: Operating Systems / Task Scheduling
  • Software or Tool: No SDK, custom implementation
  • Main Book: “Operating Systems: Three Easy Pieces” by Arpaci-Dusseau

What you’ll build: A minimal cooperative multitasking OS that runs multiple “tasks” on the RP2350—one task runs the display, one handles input, one runs animations—with a scheduler that switches between them. The LCD shows a task manager view.

Why this is the ultimate project: An OS combines everything: memory management, task switching, inter-task communication, hardware abstraction, and system calls. Building even a minimal one proves you understand computing from the ground up.

Core challenges you’ll face:

  • Context switching → Saving/restoring registers, stack pointers
  • Scheduler design → Round-robin, priority, time-slicing (time slices need a timer interrupt, which goes beyond purely cooperative scheduling)
  • Task stacks → Allocating per-task stack space
  • Synchronization → Semaphores, mutexes without data races
  • System calls → Interface between tasks and kernel

Key Concepts:

  • Process Abstraction: “Operating Systems: Three Easy Pieces” Ch. 4-6
  • Scheduling: “Operating Systems: Three Easy Pieces” Ch. 7-9
  • Context Switching: “Computer Systems: A Programmer’s Perspective” Ch. 8

Difficulty: Master. Time estimate: 1-2 months. Prerequisites: All previous projects, especially Project 10.


Real World Outcome

Display shows task manager:

┌────────────────────────────────────────┐
│        PICO-OS TASK MANAGER            │
├────────────────────────────────────────┤
│  TASKS (4 running)                     │
│                                        │
│  PID  NAME        STATE    CPU  STACK  │
│  ─────────────────────────────────────  │
│  0    idle        READY     5%   256B  │
│  1    display     RUNNING  45%  1024B  │
│  2    input       BLOCKED  10%   512B  │
│  3    animation   READY    40%  1024B  │
│                                        │
│  Scheduler: Round-robin (10ms slice)   │
│  Context switches: 45,231              │
│  Uptime: 00:07:32                      │
│                                        │
│  Memory: 128/520 KB used               │
│  Free task slots: 12/16                │
└────────────────────────────────────────┘

What you’ll see: Multiple tasks taking turns on the CPU, interleaved by your scheduler.
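
Before tackling real context switches, it helps to get the scheduling policy right in its simplest form. The sketch below is a run-to-completion round-robin scheduler: each task is a function that does one slice of work and returns, which is how it yields. It deliberately skips stack switching and register saving, so treat it as a stepping stone rather than the finished kernel. All type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { TASK_READY, TASK_BLOCKED } task_state_t;
typedef void (*task_fn_t)(void *arg);

typedef struct {
    const char  *name;
    task_fn_t    step;   /* does one slice of work, then returns (yields) */
    void        *arg;
    task_state_t state;
    uint32_t     runs;   /* how many slices this task has received */
} task_t;

typedef struct {
    task_t *tasks;
    size_t  count;
    size_t  next;        /* where the next search starts */
} scheduler_t;

/* Run the next READY task once. Returns its index, or -1 if none is ready. */
int sched_run_once(scheduler_t *s)
{
    for (size_t i = 0; i < s->count; i++) {
        size_t idx = (s->next + i) % s->count;
        task_t *t = &s->tasks[idx];
        if (t->state == TASK_READY) {
            t->step(t->arg);
            t->runs++;
            s->next = (idx + 1) % s->count;  /* resume after this task */
            return (int)idx;
        }
    }
    return -1;
}

/* Example task body: bump a counter passed in through arg. */
void counter_task(void *arg)
{
    (*(uint32_t *)arg)++;
}
```

Blocked tasks are skipped without losing their place in the rotation. Replacing the function call with a stack switch later does not change this selection logic.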


Project Comparison Table

| # | Project | Difficulty | Time | Depth | Fun |
|---|---------|------------|------|-------|-----|
| 1 | Hello Display | ⭐ | Weekend | ⭐⭐ | ⭐⭐⭐ |
| 2 | Pixel Artist | ⭐⭐ | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| 3 | DMA Display Driver | ⭐⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 4 | PIO LED Controller | ⭐⭐⭐ | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 5 | Dual-Core Renderer | ⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 6 | Font Rendering | ⭐⭐ | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐ |
| 7 | ARM vs RISC-V Benchmark | ⭐⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 8 | TF Card Image Viewer | ⭐⭐ | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| 9 | System Monitor | ⭐⭐ | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| 10 | Bare-Metal Driver | ⭐⭐⭐⭐⭐ | 1 month | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| 11 | Simple Game | ⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 12 | USB HID Device | ⭐⭐⭐⭐ | 2-3 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 13 | Mini Operating System | ⭐⭐⭐⭐⭐ | 1-2 months | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |

Recommendation

If You’re New to Embedded Systems

Start with Projects 1-3 in order. They build on each other:

  1. Project 1 teaches raw hardware communication
  2. Project 2 adds software graphics on top
  3. Project 3 introduces DMA for performance

If You Already Know Microcontrollers

Jump to Projects 4-5 (PIO and Dual-Core) for RP2350-specific features that make it unique.

If You Want Maximum Learning

Do Projects 1, 3, 5, 7, 10 for a journey from SDK to bare-metal, single-core to multi-core, ARM to RISC-V.

If You Want Maximum Fun

Do Projects 4, 8, 11 for LED effects, image viewing, and game development.


Summary

This learning path covers the RP2350 1.47-inch LCD Development Board through 13 hands-on projects. Here’s the complete list:

| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|--------------|---------------|------------|---------------|
| 1 | Hello Display — Raw SPI Communication | C | Beginner | Weekend |
| 2 | Pixel Artist — Drawing Primitives | C | Intermediate | 1-2 weeks |
| 3 | DMA Display Driver | C | Advanced | 1-2 weeks |
| 4 | RGB LED Controller with PIO | C/PIO | Advanced | 1-2 weeks |
| 5 | Dual-Core Rendering Engine | C | Advanced | 2-3 weeks |
| 6 | Font Rendering Engine | C | Intermediate | 1-2 weeks |
| 7 | RISC-V vs ARM Benchmark | C | Expert | 2-3 weeks |
| 8 | TF Card Image Viewer | C | Intermediate | 1-2 weeks |
| 9 | Real-Time System Monitor | C | Intermediate | 1-2 weeks |
| 10 | Bare-Metal Display Driver | C/Assembly | Master | 1 month |
| 11 | Simple Game (Pong/Snake) | C | Advanced | 2-3 weeks |
| 12 | USB HID Device | C | Expert | 2-3 weeks |
| 13 | Mini Operating System | C | Master | 1-2 months |

  • For beginners: Start with projects #1, #2, #3, #6, #8, #9
  • For intermediate: Jump to projects #3, #4, #5, #11
  • For advanced: Focus on projects #7, #10, #12, #13

Expected Outcomes

After completing these projects, you will:

  • Understand the RP2350’s unique dual-architecture (ARM/RISC-V) from practical experience
  • Master SPI, DMA, PIO, and multi-core programming
  • Build real-time graphics systems from raw register access to high-level abstractions
  • Create interactive applications with displays, LEDs, file systems, and USB
  • Have the skills to port any RTOS or bare-metal framework to the RP2350
  • Understand embedded systems deeply enough to debug hardware/software issues systematically

You’ll have built 13 working projects demonstrating deep understanding of embedded systems programming from first principles.


Sources and References

Key resources used in this learning path: