Learn RP2040/RP2350: From Zero to Embedded Systems Master

Goal: Build a deep, working mental model of the RP2040 and RP2350 microcontrollers from power-on to production firmware. You will understand how these chips boot, how their clocks and buses behave, how memory and peripherals are exposed, and how PIO, DMA, and multicore synchronization actually work under the hood. By the end, you will be able to design and debug bare-metal firmware, implement USB devices and hosts, build protocol analyzers, and ship secure boot chains. You will also be able to compare the Arm Cortex-M33 and RISC-V Hazard3 cores, make architectural tradeoffs, and design a custom RP2040/RP2350 board.

Introduction

RP2040 and RP2350 are microcontrollers designed by Raspberry Pi for people who want to understand and control the whole stack. These chips pair low cost with an unusually transparent and well-documented architecture, plus a Programmable I/O (PIO) subsystem that lets you implement custom digital protocols in hardware. RP2040 (2021) brings dual Cortex-M0+ cores, deterministic timing, and 264 KB of SRAM, which is large for its price class. RP2350 (2024) upgrades the platform with dual cores that can run as either Arm Cortex-M33 or RISC-V Hazard3, 520 KB of SRAM, stronger security features, and a third PIO block, while keeping the RP2040 programming model largely familiar. The result is a family that is ideal for learning serious embedded systems engineering.

This guide is a mini-book plus a project sprint. First you will build the mental model of how these chips work: memory map, clocks, PIO, DMA, USB, multicore, and secure boot. Then you will build real tools and devices: a bare-metal boot flow, PIO-driven LED control, a USB HID keyboard, logic analyzers, audio synthesizers, VGA output, secure boot chains, and a custom dev board.

Big picture: you are learning to move data from the real world into deterministic, timed logic and then out again, with tight control over clocks, interrupts, DMA, and peripherals.

                 RP2040 / RP2350 System View

   Power On
     |
     v
+------------+     +---------------------+     +--------------------+
| Boot ROM   | --> | XIP Flash + Cache   | --> | C Runtime / main() |
+------------+     +---------------------+     +--------------------+
     |                        |                         |
     |                        v                         v
     |                +---------------+        +--------------------+
     |                | SRAM (banks)  |        | Firmware Services  |
     |                +---------------+        | (drivers, queues)  |
     |                        |                +--------------------+
     |                        v                         |
     |                +---------------+                 v
     +--------------> | Bus Fabric    | -------> +------------------+
                      +---------------+          | Peripherals      |
                        |    |   |               | UART/I2C/SPI/PWM |
                        |    |   |               +------------------+
                        |    |   |
                        |    |   +--> USB Device/Host
                        |    +------> DMA Engine
                        +-----------> PIO State Machines

Scope: this guide focuses on RP2040 and RP2350 silicon (and Pico/Pico 2 boards as reference platforms). It does not attempt to teach full RTOS design or high-level IoT cloud stacks, but the projects provide enough depth to build real devices and reason about performance, power, and security.

How to Use This Guide

Read the Theory Primer like a short textbook. Each concept chapter ends with exercises and solutions.
Pick a learning path in the Recommended Learning Paths section, or follow the Quick Start if you are overwhelmed.
Each project is designed to apply multiple concept chapters. Use the Project-to-Concept Map to connect theory to implementation.
Treat each project as a deliverable you could demo in an interview or ship in a product.
Keep a lab notebook. The fastest learning comes from writing down measured timings, scope traces, and bugs you fixed.

Prerequisites & Background

Essential Prerequisites (Must Have)

C programming basics: pointers, structs, bitwise ops, and memory layout.
Comfort with reading datasheets and hardware reference manuals.
Familiarity with a command line and a build system (Make or CMake).

Helpful But Not Required

Assembly basics (Arm or RISC-V) for startup code and debugging.
Electronics basics: pull-ups, voltage levels, decoupling, and signal integrity.
Oscilloscope or logic analyzer usage (helpful for timing validation).

Self-Assessment Questions

Can you explain what a memory-mapped register is and why volatile matters?
Do you know the difference between flash and SRAM, and what “execute in place” means?
Can you read a datasheet pinout table and identify peripheral functions?
Can you describe the difference between an interrupt and polling?

Development Environment Setup

Install the Pico SDK and toolchain (GCC Arm Embedded or LLVM with Arm target; add a RISC-V GCC toolchain if you plan to run the Hazard3 cores).
Install cmake, ninja (optional), and picotool.
Have a debug probe available (Picoprobe, CMSIS-DAP, or J-Link) for hard problems.
For USB projects, a powered USB hub is recommended for stable enumeration.
For PIO timing projects, a logic analyzer or scope is strongly recommended.

Time Investment

Beginner path: 8 to 12 weeks at 6 to 8 hours per week.
Intermediate path: 6 to 8 weeks at 8 to 12 hours per week.
Advanced path: 4 to 6 weeks at 10 to 15 hours per week.

Important Reality Check

You will spend time reading datasheets and debugging hardware. That is the point. The fastest learners are those who keep notes, measure with tools, and validate assumptions with real signals.

Big Picture / Mental Model

Think of RP2040/RP2350 as a deterministic data pump with programmable timing and flexible I/O. Your job is to move data between external pins, internal memory, and protocol engines while meeting timing and power constraints.

   Data Flow Mental Model

   Sensors/Signals --> GPIO/PIO --> FIFO/Buffer --> DMA --> SRAM --> CPU
                                      |                      |
                                      |                      v
                                      +--> USB / UART / SPI / I2C

At a high level:

PIO is a set of small deterministic state machines (four per PIO block, each mappable to a range of pins) that can speak protocols at bit-level precision.
DMA moves data between peripherals and memory with minimal CPU involvement.
Dual cores let you split time-critical I/O from high-level logic.
USB enables host and device use cases with a small amount of firmware.
Security features on RP2350 allow secure boot and protected assets.

Theory Primer

Concept 1: Boot ROM, XIP Flash, and the Memory Map

Fundamentals

The RP2040/RP2350 boot flow is intentionally simple: a small ROM bootloader validates and starts code from external flash, then hands control to your firmware. Neither chip has on-die program flash, so firmware lives in QSPI flash (an external part on most boards). Your code typically runs in execute-in-place (XIP) mode, meaning instructions are fetched directly from flash through a cache, while data lives in SRAM. The memory map is fixed and well-documented, which is why bare-metal development is viable: you can understand exactly where ROM, flash, SRAM, and peripherals live in the address space. Knowing the address map is the difference between deterministic firmware and blind trial-and-error.

Deep Dive into the Concept

At reset, the RP2040 and RP2350 start executing from ROM. The ROM contains USB boot support (UF2 drag-and-drop mass storage), flash programming helpers, and the initial boot sequence. The external flash is connected over QSPI and is mapped into the address space so the CPU can fetch instructions as though it were memory. This design is called XIP: execute in place. You do not need to copy the entire program into SRAM; only mutable data (buffers, stacks) and timing-critical routines need to be in RAM. In practice, this means your linker script must define sections for .text (XIP flash), .data (initial values stored in flash, copied to SRAM at startup), and .bss (zeroed in SRAM). If you ignore this, your firmware will crash long before main().

The memory map is more than addresses. It implies bus structure and timing. Flash accesses are slower than SRAM, which is why the chip provides a cache. Instruction fetches can be stalled by cache misses or by flash erase/program operations. This becomes important when you try to run precise timing code from flash: if you need deterministic timing, copy the routine into SRAM or use PIO to offload it. The peripheral address space is memory-mapped, with separate buses for high-speed peripherals. The SIO (single-cycle I/O) block provides low-latency GPIO access and includes spinlocks and inter-core FIFOs. A correct mental model is: cores and DMA are masters on a shared fabric, and all of them contend for SRAM banks, flash, and peripherals. If you are not aware of this, you will blame “random” timing issues on your code when the bus fabric is actually the cause.

Boot flow on these chips is also unique because of the two-stage boot structure in RP2040. The ROM reads the first 256 bytes of external flash, verifies a CRC, and runs it from SRAM. That stage configures the XIP interface and jumps to your main firmware. This allows the ROM to remain small while still supporting flexible flash parts. For RP2350, the security model extends this concept by supporting signed boot and secure boot chains. That means your bootloader might be cryptographically verified before any untrusted code is allowed to execute, depending on the security configuration and OTP settings. Understanding this is essential for the Secure Boot and Custom Bootloader projects.
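The stage-2 check described above can be sketched on a host machine. This is a minimal sketch, assuming the checksum is a CRC-32 with polynomial 0x04C11DB7, initial value 0xFFFFFFFF, no bit reflection, and no final XOR, computed over the first 252 bytes and stored little-endian in the last 4; confirm these parameters against the RP2040 datasheet before relying on them. The names crc32_mpeg2, boot2_seal, and boot2_is_valid are illustrative, not ROM or SDK functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BOOT2_SIZE 256u  /* size of the stage-2 block the ROM reads */

/* Bitwise CRC-32 (assumed parameters): poly 0x04C11DB7, init 0xFFFFFFFF,
 * MSB first, no final XOR. */
uint32_t crc32_mpeg2(const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 24;
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : (crc << 1);
        }
    }
    return crc;
}

/* Store the checksum of the first 252 bytes in the last 4 (little-endian). */
void boot2_seal(uint8_t block[BOOT2_SIZE]) {
    uint32_t crc = crc32_mpeg2(block, BOOT2_SIZE - 4u);
    for (unsigned i = 0; i < 4u; i++) {
        block[BOOT2_SIZE - 4u + i] = (uint8_t)(crc >> (8u * i));
    }
}

/* Return 1 if the stored checksum matches the first 252 bytes, else 0. */
int boot2_is_valid(const uint8_t block[BOOT2_SIZE]) {
    uint32_t stored = 0;
    for (unsigned i = 0; i < 4u; i++) {
        stored |= (uint32_t)block[BOOT2_SIZE - 4u + i] << (8u * i);
    }
    return stored == crc32_mpeg2(block, BOOT2_SIZE - 4u);
}
```

A build script would call boot2_seal on the padded stage-2 binary; flipping any single bit afterwards makes boot2_is_valid return 0, which is the condition under which the ROM refuses to run the block.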

When things go wrong, the memory map is your debugging compass. For example, writing to the wrong GPIO control register might accidentally hit an unimplemented address range, producing a HardFault. If your vector table is wrong, the core loads a bogus stack pointer or reset address and faults almost immediately. When the chip does not boot at all, you can attach a debugger and inspect the PC (program counter) and stack pointer to see whether the ROM handed off correctly. That workflow is critical in bare-metal development.
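When you read a faulting PC or a bad pointer in the debugger, the first question is which region it falls in. The sketch below is a coarse classifier built only from the base addresses used in this guide (ROM at 0x00000000, XIP at 0x10000000, SRAM at 0x20000000, peripherals from 0x40000000, SIO at 0xD0000000) plus the Arm core-private region at 0xE0000000. It decodes just the top address nibble, so it deliberately ignores the real region sizes; classify_address and region_name are hypothetical helper names.

```c
#include <stdint.h>

typedef enum {
    REGION_ROM, REGION_XIP, REGION_SRAM, REGION_PERIPH,
    REGION_SIO, REGION_CORE, REGION_UNMAPPED
} region_t;

/* Coarse decode on the top nibble; real regions are far smaller than 256 MB. */
region_t classify_address(uint32_t addr) {
    switch (addr >> 28) {
        case 0x0: return REGION_ROM;     /* boot ROM */
        case 0x1: return REGION_XIP;     /* flash window (execute in place) */
        case 0x2: return REGION_SRAM;    /* on-chip SRAM */
        case 0x4:
        case 0x5: return REGION_PERIPH;  /* memory-mapped peripherals */
        case 0xD: return REGION_SIO;     /* single-cycle I/O */
        case 0xE: return REGION_CORE;    /* Arm core-private registers */
        default:  return REGION_UNMAPPED;
    }
}

/* Human-readable label for log output. */
const char *region_name(region_t r) {
    static const char *const names[] = {
        "ROM", "XIP flash", "SRAM", "peripheral",
        "SIO", "core-private", "unmapped"
    };
    return names[r];
}
```

For example, a PC of 0x10000100 classifies as XIP flash (firmware is running), while a PC in an unmapped range points to a corrupted vector or a wild function pointer.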

Finally, XIP is not just about speed; it is about system design. External flash density, QSPI clock speed, and cache behavior all influence code execution and data access patterns. If your project streams audio or video data, you may need to use DMA from flash to SRAM, or allocate buffers in SRAM to avoid XIP stalls. If your system uses USB and also erases or programs flash at runtime, keep the USB ISRs (and everything they call) in RAM, because XIP is unavailable while the flash is busy and missed frames are the result. This is why understanding the boot and memory map concept first makes the rest of the system predictable.

How This Fits in Projects

Project 1 (Bare-Metal Blinky) uses the boot sequence, flash layout, and memory map directly.
Project 11 (Custom Bootloader) builds a multi-slot flash layout and uses the ROM boot rules.
Project 8 (Secure Boot) depends on secure boot and OTP configuration in RP2350.

Definitions & Key Terms

Boot ROM: On-chip ROM containing the initial boot code and helpers.
XIP (Execute In Place): Running code directly from external flash via a cache.
Vector Table: A table holding the initial stack pointer followed by the reset and exception handler addresses; on the Arm cores its location is set by the vector table offset register (VTOR).
SRAM Bank: A portion of on-chip SRAM, often independently accessible.
Memory-Mapped I/O: Accessing peripherals as memory addresses.

Mental Model Diagram

  Reset -> Boot ROM -> Stage 2 -> XIP -> main()
      |        |          |          |
      |        |          |          +--> C runtime init (.data/.bss)
      |        |          +--> Configure QSPI/XIP
      |        +--> Validate flash header / checksum
      +--> Reset vector = ROM entry

How It Works (Step-by-Step)

Reset is released; the CPU starts at the ROM reset vector.
ROM reads the flash header and optional CRC/signature.
ROM copies the stage-2 boot block into SRAM and executes it.
Stage-2 configures the QSPI interface and enables XIP.
CPU jumps to firmware entry point in XIP flash.
C runtime copies .data to SRAM and zeros .bss.
Your main() runs.

Minimal Concrete Example

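// Minimal vector table in flash for RP2040-like boot.
// Assumes a linker script that defines _sidata, _sdata, _edata, _sbss, _ebss
// and keeps the .vectors section at the start of the image.
#include <stdint.h>

extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
int main(void);
void Reset_Handler(void);

__attribute__((section(".vectors"), used))
void (*const vectors[])(void) = {
    (void (*)(void))0x20042000u,  // initial stack pointer (top of RP2040's 264 KB SRAM)
    Reset_Handler,                // reset vector
};

void Reset_Handler(void) {
    // Copy initialized data from its load address in flash to SRAM
    // (normally done by the C runtime)
    uint32_t *src = &_sidata;
    for (uint32_t *dst = &_sdata; dst < &_edata; ) *dst++ = *src++;
    // Zero the .bss section
    for (uint32_t *dst = &_sbss; dst < &_ebss; ) *dst++ = 0;

    main();
    while (1) {}  // park the core if main() ever returns
}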

Common Misconceptions

“The CPU always starts at my main().” It does not; it starts in ROM.
“Code in flash is as fast as SRAM.” It is not; XIP depends on cache.
“If I can blink an LED, my memory map is correct.” Not always; small errors can hide.

Check-Your-Understanding Questions

Why does RP2040 require a stage-2 boot block in flash?
What happens if your vector table points to an invalid address?
When should you copy code into SRAM instead of running from XIP?

Check-Your-Understanding Answers

The stage-2 boot block configures the QSPI interface and enables XIP, which the ROM does not fully do by itself.
The CPU will attempt to execute invalid memory and will fault or hang, often before any visible output.
When you need deterministic timing or low-latency ISR response, or when flash access may stall.

Real-World Applications

Safe bootloaders for IoT devices.
Field-updatable firmware with rollback.
Timing-sensitive audio or video playback.

Where You’ll Apply It

Projects 1, 8, 11, and the Capstone board.

References

RP2040 specifications and documents: https://www.raspberrypi.com/products/rp2040/specifications/
RP2040 documentation overview: https://www.raspberrypi.com/documentation/microcontrollers/rp2040.html
RP2040 product portal (datasheet and hardware design): https://pip.raspberrypi.com/categories/814-rp2040
UF2 format (bootloader file format): https://github.com/microsoft/uf2

Key Insight

A microcontroller boot flow is a data validation and memory-mapping problem before it is a programming problem.

Summary

The boot ROM and memory map define how your firmware exists in the chip. If you understand the boot flow, the vector table, and XIP behavior, you can predict and debug almost every early-boot failure and design reliable update mechanisms.

Homework/Exercises to Practice the Concept

Draw the RP2040 memory map from memory and label ROM, XIP, SRAM, and peripherals.
Write a minimal linker script that places .text in flash and .data in SRAM.
Write a tiny stage-2 boot stub that toggles a pin before jumping to main.

Solutions to the Homework/Exercises

A correct map shows ROM at 0x00000000, XIP flash at 0x10000000, SRAM at 0x20000000, peripherals at 0x40000000 and above, and SIO at 0xD0000000.
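The linker script should define FLASH and RAM regions, place .text in FLASH, and give .data a run address in RAM with a load address in FLASH (in GNU ld syntax: > RAM AT> FLASH).
The stub should select the SIO function for the pin in IO_BANK0, enable the output through SIO, and toggle it before branching to the firmware entry point.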

Concept 2: Clocking, Reset, and Power Management

Fundamentals

Clocking is the heartbeat of every RP2040/RP2350 system. These chips use a crystal oscillator, a ring oscillator, and PLLs to generate multiple clock domains for CPU, USB, peripherals, and timekeeping (the RTC on RP2040). Reset and power-control logic ensure safe startup and allow low-power states. If you do not control clocks correctly, peripherals will misbehave, USB will fail to enumerate, and timing-sensitive protocols will drift. This concept explains how clock sources are selected, how PLLs lock, and how peripherals are reset and gated.

Clock configuration is also the first place you learn to read a hardware manual: you must identify register fields, default values, and required delays. The practical skill is sequencing and verification. You will learn to measure clocks on GPIO outputs, confirm PLL lock, and reason about which peripherals need which clock domains. This makes every later project predictable instead of trial-and-error.

Deep Dive into the Concept

The RP2040/RP2350 clock architecture starts with two main oscillators: a ring oscillator (ROSC) that starts immediately but whose frequency varies with process, voltage, and temperature, and an external crystal oscillator (XOSC) that is accurate but requires startup time. The ROSC is useful for early boot and low-power states, while XOSC is the reference for stable system clocks. From these sources, PLLs multiply frequencies to create the system clock (CLK_SYS) and specialized clocks such as CLK_USB and CLK_ADC. Each peripheral uses a clock derived from these domains, and many peripherals will not work unless their clocks are explicitly enabled and correctly configured.

Reset logic in these chips is granular. Peripherals can be held in reset until you explicitly release them. This prevents undefined behavior during power-up but means firmware must be explicit about resets. There is also a power-on state machine that sequences internal regulators and domains. For low power, the chips support sleep and dormant modes, where clocks are gated and certain blocks are powered down. On RP2350, the power subsystem adds an on-chip switching regulator for the core supply and new low-power states, which can extend battery life in duty-cycled applications.

Clock configuration is a dependency graph. For example, to run USB, you need a 48 MHz clock derived from a PLL that is itself locked to the crystal oscillator. If you start the PLL before the XOSC is stable, you will get unpredictable results. The typical sequence is: enable XOSC, wait for stable, configure PLL, wait for lock, then switch CLK_SYS to PLL output. For USB, you configure a dedicated PLL or divider to reach exactly 48 MHz. For PIO and PWM timing, you need to know the exact system clock and the divider values. If the clock configuration is wrong, your WS2812 LED timing will be off, your VGA signal will be unstable, and your UART baud rate will drift.
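To see why a wrong clock assumption makes UART baud rates drift, it helps to run the divisor arithmetic by hand. The sketch below assumes a PL011-style UART, where the baud rate is the peripheral clock divided by 16 times a divisor that has a 6-bit fractional part. The helper names uart_divisors and uart_actual_baud are illustrative and are not Pico SDK calls.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t ibrd;  /* integer part of the divisor */
    uint32_t fbrd;  /* fractional part, in 1/64 steps */
} uart_div_t;

/* Divisor = clk / (16 * baud), held in 1/64 units and rounded to nearest. */
uart_div_t uart_divisors(uint32_t assumed_clk_hz, uint32_t baud) {
    assert(baud > 0);
    uint64_t div64 = ((uint64_t)assumed_clk_hz * 4u + baud / 2u) / baud;
    uart_div_t d = { (uint32_t)(div64 >> 6), (uint32_t)(div64 & 0x3Fu) };
    return d;
}

/* Baud rate the hardware really produces when it is fed actual_clk_hz. */
uint32_t uart_actual_baud(uint32_t actual_clk_hz, uart_div_t d) {
    uint64_t div64 = (uint64_t)d.ibrd * 64u + d.fbrd;
    assert(div64 > 0);
    return (uint32_t)(((uint64_t)actual_clk_hz * 4u + div64 / 2u) / div64);
}
```

If firmware computes divisors for 115200 baud assuming a 125 MHz peripheral clock but the clock is really 150 MHz, uart_actual_baud reports roughly 138 kbaud, about 20 percent fast, which is far outside what a receiver will tolerate.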

Power management ties directly to clock gating. When you shut down a peripheral, you should also disable its clock to reduce power. In dormant mode, only the ROSC or RTC may remain active, and you wake via GPIO or RTC. This is critical in battery-powered devices. The practical lesson is that power is not just about hardware design, but about firmware control of the clock tree. If your firmware does not control clocks, you will not get low power.

Finally, deterministic timing requires more than just a stable PLL. The bus fabric can still introduce timing jitter if multiple masters contend for SRAM or flash. That is why time-critical work is often pushed into PIO or DMA with precise clock divisors. You do not always need a faster clock; you need a known, stable clock. The RP2040/RP2350 design gives you that, but only if you understand the clock tree.

Another key detail is glitchless clock muxing. The CLK_REF and CLK_SYS generators have glitchless muxes, so they can switch between their main sources while running without producing short pulses or runt clocks that can corrupt logic. The auxiliary muxes, and the muxes on the other clock generators, are not glitchless, so disable the generator or switch away from the auxiliary source before changing them. The correct sequence is to select a safe source, wait for stability, then switch. Many peripherals also have independent clock dividers; it is often safer to run a peripheral at a known, lower frequency than to assume the system clock is always suitable. Debugging tools exist as well: you can output clock signals to GPIO via clock generators to verify actual frequencies with a scope. This practice turns register configuration into measurable reality.

The power subsystem is also coupled with resets. For example, when you enter dormant mode, most clocks are stopped and only the RTC or ROSC may remain. Waking requires specific signals, and peripheral state can be lost. That means firmware must reinitialize clocks and peripherals after wake. If you design for low power from the start, you will structure your code to reconfigure clocks cleanly and avoid implicit dependencies on default reset states. This mindset is essential for production battery devices.

How This Fits in Projects

Project 1 configures clocks manually to blink at an accurate rate.
Projects 2 and 7 require precise PIO timing.
Project 5 requires stable audio sample clocks.
Project 10 requires USB clock accuracy.

Definitions & Key Terms

ROSC: Ring oscillator; starts instantly but its frequency is imprecise.
XOSC: External crystal oscillator, accurate reference.
PLL: Phase-Locked Loop, multiplies a reference clock.
Clock domain: A group of logic driven by a common clock.
Reset gating: Keeping peripherals in reset until configured.

Mental Model Diagram

  ROSC --+
         +--> CLK_REF --> RTC
  XOSC --+
    |
    +--> PLL_SYS -> CLK_SYS -> CPU / BUS
    +--> PLL_USB -> CLK_USB -> USB

How It Works (Step-by-Step)

On reset, ROSC provides a temporary clock.
Firmware enables XOSC and waits for stability.
Firmware configures PLL_SYS and waits for lock.
CLK_SYS is switched to PLL output.
Peripheral clocks are derived and enabled.
Peripherals are released from reset.

Minimal Concrete Example

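// Pseudocode for clock init with a 12 MHz crystal.
// The helper names are illustrative, not Pico SDK functions.
enable_xosc();
wait_xosc_stable();                        // XOSC needs startup time before use
configure_pll_sys(12000000u, 150000000u);  // reference Hz, target Hz (125 MHz is the classic RP2040 setup)
wait_pll_lock();
set_clk_sys(150000000u);                   // switch CLK_SYS to the PLL output only after lock
enable_clk_peripheral(UART0);
reset_deassert(UART0);                     // release the peripheral from reset last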

Common Misconceptions

“If the chip boots, the clocks are fine.” USB and precise IO still need exact clocks.
“ROSC is good enough for anything.” It is not stable enough for protocols like USB.
“PLL configuration only affects CPU speed.” It affects all timing across the chip.

Check-Your-Understanding Questions

Why does USB require an accurate 48 MHz clock?
What is the consequence of forgetting to deassert a peripheral reset?
When should you keep ROSC running during low-power modes?

Check-Your-Understanding Answers

USB signaling requires strict timing tolerance; drift breaks enumeration and transfers.
The peripheral will appear dead even if configured, because it is still held in reset.
When you need a low-power wake source or simple timing without the crystal.

Real-World Applications

Battery-powered sensors with long sleep intervals.
Accurate audio playback and synthesis.
USB devices that must enumerate reliably across hosts.

Where You’ll Apply It

Projects 1, 2, 5, 7, 10.

References

RP2040 specifications: https://www.raspberrypi.com/products/rp2040/specifications/
RP2350 Pico 2 announcement (features, power, clocks): https://www.raspberrypi.com/news/raspberry-pi-pico-2-our-new-5-microcontroller-board-on-sale-now/

Key Insight

Clock configuration is the hidden root cause of most “mysterious” peripheral failures.

Summary

By mastering clocks and resets, you make every other subsystem predictable. The RP2040/RP2350 clock tree is powerful but unforgiving: correct configuration enables precise timing and low-power operation.

Homework/Exercises to Practice the Concept

Compute the PLL divider values needed to produce 125 MHz and 150 MHz from a 12 MHz crystal.
Write code that switches CLK_SYS between ROSC and PLL at runtime.
Measure GPIO toggle frequency with and without PLL configuration.

Solutions to the Homework/Exercises

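Example: 12 MHz * 125 = 1500 MHz VCO, and 1500 / 6 / 2 = 125 MHz (classic RP2040 setup). For 150 MHz on RP2350, keep the same 1500 MHz VCO and use post-dividers 5 and 2: 1500 / 5 / 2 = 150 MHz. A feedback divider of 150 would put the VCO at 1800 MHz, above the 1600 MHz upper limit given in the datasheet.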
Use the clock mux control registers to switch sources after the PLL locks.
Observe that ROSC yields inconsistent frequency while PLL yields stable frequency.
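The first exercise can also be solved by brute force. The sketch below searches for an exact PLL setting, assuming a reference divider of 1, a feedback divider of 16 to 320, post-dividers of 1 to 7, and a VCO range of 750 to 1600 MHz; check these limits against the datasheet for your chip. It prefers the highest VCO frequency and keeps the first post-divider at least as large as the second. The names pll_solve and pll_cfg_t are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t fbdiv;     /* feedback divider */
    uint32_t postdiv1;  /* first post-divider */
    uint32_t postdiv2;  /* second post-divider */
    uint32_t vco_hz;    /* resulting VCO frequency */
} pll_cfg_t;

#define VCO_MIN_HZ 750000000u   /* assumed lower VCO limit */
#define VCO_MAX_HZ 1600000000u  /* assumed upper VCO limit */

/* Returns 1 and fills *out on an exact match, 0 if no setting hits target_hz. */
int pll_solve(uint32_t ref_hz, uint32_t target_hz, pll_cfg_t *out) {
    assert(out != NULL);
    for (uint32_t fbdiv = 320; fbdiv >= 16; fbdiv--) {   /* highest VCO first */
        uint64_t vco = (uint64_t)ref_hz * fbdiv;
        if (vco < VCO_MIN_HZ || vco > VCO_MAX_HZ) continue;
        for (uint32_t pd1 = 7; pd1 >= 1; pd1--) {
            for (uint32_t pd2 = pd1; pd2 >= 1; pd2--) {  /* postdiv1 >= postdiv2 */
                if (vco == (uint64_t)target_hz * pd1 * pd2) {
                    out->fbdiv = fbdiv;
                    out->postdiv1 = pd1;
                    out->postdiv2 = pd2;
                    out->vco_hz = (uint32_t)vco;
                    return 1;
                }
            }
        }
    }
    return 0;
}
```

With a 12 MHz reference, a 125 MHz target yields fbdiv 125 with post-dividers 6 and 2, and a 150 MHz target yields fbdiv 125 with post-dividers 5 and 2; both use a 1500 MHz VCO. A 1 MHz target returns 0 because no combination within these limits reaches it.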

Concept 3: GPIO, Pad Control, SIO, and Interrupts

Fundamentals

GPIO is the simplest I/O, but on RP2040/RP2350 it is split across IO_BANK (function select and interrupt), PADS (electrical characteristics), and SIO (fast, single-cycle access). Understanding this split is essential for reliable timing and for mixing peripherals on the same pins. Interrupts connect pins to firmware logic, but you must configure edge detection, masking, and clearing correctly to avoid spurious triggers or missed events.

GPIO on these chips is not just about logic highs and lows; it is about electrical behavior, pin multiplexing, and timing. You will learn why a pull-up matters for I2C, why drive strength matters for LED strips, and why the same pin can behave differently depending on pad settings. This concept turns “it works sometimes” into reliable hardware behavior.

Deep Dive into the Concept

Each GPIO pin is backed by multiple subsystems: the IO bank selects which peripheral function drives the pin (GPIO, SPI, UART, PIO, etc.), the pads control electrical characteristics (pull-up/down, drive strength, slew rate, Schmitt trigger), and the SIO provides a fast path for reading/writing pins with minimal bus latency. This separation is why you can have a pin that is configured as a PWM output with strong drive, or as a high-impedance input with pull-up and hysteresis. For high-speed protocols, pad configuration matters as much as logic.

SIO is critical for deterministic timing. Unlike the regular APB peripheral bus, SIO provides single-cycle access to GPIO and includes hardware spinlocks and a FIFO for multicore communication. When you bit-bang protocols or need to respond to an interrupt within a tight window, SIO is how you make that deterministic. The RP2040 uses SIO for fast GPIO toggling and core-to-core coordination.

Interrupts are a second-order system: not only must you enable them, you must also understand their priority and masking. On Cortex-M0+ (RP2040), the NVIC provides relatively simple priority and masking. On RP2350’s Cortex-M33, TrustZone introduces secure vs non-secure interrupt routing, which complicates the model. Additionally, GPIO interrupts can be level or edge-triggered, and if you configure them incorrectly, you might either miss events or re-trigger continuously. For example, if you configure a level interrupt and do not clear the source, it will retrigger immediately.
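A small host-side model can make the edge and level distinction concrete. This is a simplified sketch, not the real register interface: it assumes, following the RP2040 datasheet's description of GPIO interrupts, that level events simply follow the pin while edge events are latched until software clears them. All names here are illustrative.

```c
#include <stdbool.h>

typedef struct {
    bool pin_high;        /* current pin level */
    bool edge_rise_flag;  /* latched rising-edge event, cleared only by software */
    bool en_level_high;   /* level-high interrupt enabled */
    bool en_edge_rise;    /* rising-edge interrupt enabled */
} gpio_irq_model_t;

/* Drive the pin; a low-to-high transition latches the edge flag. */
void model_set_pin(gpio_irq_model_t *g, bool level) {
    if (!g->pin_high && level) g->edge_rise_flag = true;
    g->pin_high = level;
}

/* True while the interrupt line to the core would be asserted. */
bool model_irq_pending(const gpio_irq_model_t *g) {
    return (g->en_level_high && g->pin_high) ||
           (g->en_edge_rise && g->edge_rise_flag);
}

/* What a handler does to acknowledge an edge event. */
void model_ack_edge(gpio_irq_model_t *g) {
    g->edge_rise_flag = false;
}
```

In this model an edge interrupt stays pending after the pin returns low until model_ack_edge runs, while a level interrupt ignores the acknowledge and only goes away when the pin itself changes. That is the difference between "clear the flag" and "remove the condition" in a real handler.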

Another subtlety: when a pin is used by PIO, you still configure the pad and the IO function. If you forget to set the pin function to PIO, your state machine will not see the signal. Similarly, if you set pad drive too weak for a high-speed LED strip, signals may degrade. The theory here is that digital I/O is always electrical + logical; you cannot ignore one or the other.

Finally, the RP2040/RP2350 allow you to route and synchronize interrupts across cores. You can pin an IRQ to a specific core or use software interrupts to coordinate tasks. This is crucial when you split time-critical ISR handling on core 1 and background tasks on core 0. Poor interrupt design will show up as jitter, dropped frames, or missed timing windows in PIO-driven protocols.

The RP2040/RP2350 also provide atomic GPIO set, clear, and XOR registers. This matters in multi-threaded or interrupt-heavy code because you can toggle pins without read-modify-write races. This is more than convenience: it is how you avoid heisenbugs when two cores touch the same GPIO bank. For interrupts, you must learn the difference between edge-triggered and level-triggered behavior and how event flags are latched and cleared. A frequent failure mode is forgetting to clear the interrupt status, which leads to immediate retrigger and apparent lockups.
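The read-modify-write race is easy to reproduce on paper. The sketch below is a host-side model, not hardware access: gpio_set, gpio_clr, and gpio_xor stand in for single writes to the set, clear, and XOR alias registers, and racy_rmw_result replays one unlucky interleaving of two cores that each read the output register, OR in a bit, and write it back. The names are hypothetical.

```c
#include <stdint.h>

typedef struct { uint32_t out; } gpio_model_t;

/* Alias-register semantics: each call models one bus write, so it cannot be torn. */
void gpio_set(gpio_model_t *g, uint32_t mask) { g->out |= mask; }
void gpio_clr(gpio_model_t *g, uint32_t mask) { g->out &= ~mask; }
void gpio_xor(gpio_model_t *g, uint32_t mask) { g->out ^= mask; }

/* Replays the bad interleaving of two read-modify-write sequences. */
uint32_t racy_rmw_result(uint32_t initial, uint32_t mask_a, uint32_t mask_b) {
    uint32_t read_a = initial;       /* core 0 reads the register */
    uint32_t read_b = initial;       /* core 1 reads before core 0 writes back */
    uint32_t out = read_a | mask_a;  /* core 0 writes its result */
    out = read_b | mask_b;           /* core 1 overwrites it, losing core 0's bit */
    return out;
}
```

Starting from 0 with masks 0x1 and 0x2, the racy sequence ends at 0x2, so one update is lost. Two gpio_set calls in either order end at 0x3, which is why the alias registers are the safe choice when both cores share a GPIO bank.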

Electrical behavior cannot be ignored. Input hysteresis and pull-ups affect noise sensitivity, which is critical when capturing fast edges or using long wires. Pad drive strength and slew rate affect signal integrity and EMI; too fast can cause ringing, too slow can fail at high speed. In practice, you treat pad configuration as part of your protocol design. For example, I2C uses open-drain outputs and requires external pull-ups, while SPI often benefits from stronger drive and faster slew.
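Treating pad settings as part of the protocol is easier when you can see them as one value. The sketch below packs the pad options into a register word, assuming the RP2040 PADS_BANK0 layout (bit 0 slew fast, bit 1 Schmitt, bit 2 pull-down, bit 3 pull-up, bits 5:4 drive strength, bit 6 input enable, bit 7 output disable). RP2350 adds an isolation bit that is not modeled here. The type and function names are illustrative, not SDK definitions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { DRIVE_2MA = 0, DRIVE_4MA = 1, DRIVE_8MA = 2, DRIVE_12MA = 3 } pad_drive_t;

typedef struct {
    bool        slew_fast;       /* faster edges, more ringing and EMI */
    bool        schmitt;         /* input hysteresis */
    bool        pull_down;
    bool        pull_up;
    pad_drive_t drive;           /* output drive strength */
    bool        input_enable;
    bool        output_disable;
} pad_cfg_t;

/* Pack the settings into a pad control word (assumed RP2040 bit layout). */
uint32_t pad_encode(pad_cfg_t c) {
    return ((uint32_t)c.slew_fast      << 0) |
           ((uint32_t)c.schmitt        << 1) |
           ((uint32_t)c.pull_down      << 2) |
           ((uint32_t)c.pull_up        << 3) |
           ((uint32_t)c.drive          << 4) |
           ((uint32_t)c.input_enable   << 6) |
           ((uint32_t)c.output_disable << 7);
}
```

An input with Schmitt trigger, pull-down, 4 mA drive, and input enabled encodes to 0x56, which should match the documented reset value of the pad registers. A fast 12 mA output with the input buffer off encodes to 0x31.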

How This Fits in Projects

Project 1: basic GPIO control and timing.
Project 2: GPIO + PIO pin mapping for WS2812.
Project 4/12: GPIO interrupts and timing capture for analyzers.
Project 7: VGA output timing and pad strength.

Definitions & Key Terms

IO_BANK: Registers for function select and interrupt control.
PADS: Electrical pad settings (pull-ups, drive strength).
SIO: Single-cycle I/O, fast GPIO and multicore primitives.
Edge trigger: Interrupt on transitions (rising/falling).
Level trigger: Interrupt held active while pin is high/low.

Mental Model Diagram

  Peripheral Function -> IO_BANK -> PAD -> PIN
                                \-> SIO (fast read/write)

How It Works (Step-by-Step)

Select a function for the pin (GPIO, PIO, SPI, etc.).
Configure pad settings (pull-up/down, drive strength).
If using GPIO, read/write via SIO for speed.
Configure interrupt edge/level and enable it.
Clear interrupt flags in the handler to avoid retrigger.

Minimal Concrete Example

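// Configure GPIO 25 as a SIO-driven output with strong drive (RP2040 register layout).
// hardware/gpio.h from the Pico SDK pulls in the pads_bank0, io_bank0, and sio register structs.
#include "hardware/gpio.h"

pads_bank0_hw->io[25] = (PADS_BANK0_GPIO0_DRIVE_VALUE_12MA << PADS_BANK0_GPIO0_DRIVE_LSB)
                      | PADS_BANK0_GPIO0_SLEWFAST_BITS;
io_bank0_hw->io[25].ctrl = IO_BANK0_GPIO0_CTRL_FUNCSEL_VALUE_SIO_0 << IO_BANK0_GPIO0_CTRL_FUNCSEL_LSB;
sio_hw->gpio_oe_set = 1u << 25;   // atomic output-enable set

// Toggle atomically through the XOR alias register
sio_hw->gpio_togl = 1u << 25;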

Common Misconceptions

“Setting IO_BANK is enough.” You also need to configure PADS.
“Interrupts always fire once per edge.” Only if the handler clears the latched edge status; otherwise the IRQ fires again immediately.
“GPIO speed is only about CPU speed.” Drive strength and slew matter.

Check-Your-Understanding Questions

Why does a GPIO pin have both IO_BANK and PADS configuration?
What happens if you enable a level interrupt and the handler returns while the pin is still at the active level?
Why might you prefer SIO over APB for GPIO toggling?

Check-Your-Understanding Answers

IO_BANK selects logic function and interrupts; PADS controls electrical behavior.
The interrupt will retrigger immediately, creating an interrupt storm.
SIO provides deterministic single-cycle access and lower latency.

Real-World Applications

High-speed LED driving.
Signal capture for logic analysis.
Low-power wake from GPIO events.

Where You’ll Apply It

Projects 1, 2, 4, 7, 12.

References

RP2040 specifications and documents: https://www.raspberrypi.com/products/rp2040/specifications/
RP2040/RP2350 documentation portal: https://www.raspberrypi.com/documentation/microcontrollers/rp2040.html

Key Insight

GPIO is not just logic; it is a combined electrical and timing system.

Summary

Understanding GPIO at the pad, function, and interrupt level lets you build reliable signal interfaces and precise timing control.

Homework/Exercises to Practice the Concept

Configure one pin as input with pull-up and another as output, then measure rise/fall times on a scope.
Implement a GPIO interrupt that timestamps rising edges using the timer.
Compare GPIO toggle speed via SIO vs a generic peripheral register.

Solutions to the Homework/Exercises

You should observe different rise times when changing drive strength and pull-ups.
Use an interrupt handler to read a microsecond timer and store timestamps in a ring buffer.
SIO toggles are measurably faster and more consistent.
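The ring buffer from the second solution can be sketched as a single-producer, single-consumer queue. This is a minimal sketch assuming the interrupt handler and the main loop run on the same core; sharing it between two cores would additionally need memory barriers. The names ts_ring_t, ts_ring_push, and ts_ring_pop are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define TS_RING_SIZE 64u  /* must be a power of two */

typedef struct {
    volatile uint32_t head;  /* advanced only by the ISR */
    volatile uint32_t tail;  /* advanced only by the main loop */
    uint32_t          data[TS_RING_SIZE];
} ts_ring_t;

/* Called from the interrupt handler. Returns false (sample dropped) when full. */
bool ts_ring_push(ts_ring_t *r, uint32_t timestamp_us) {
    if (r->head - r->tail == TS_RING_SIZE) return false;
    r->data[r->head & (TS_RING_SIZE - 1u)] = timestamp_us;
    r->head = r->head + 1u;
    return true;
}

/* Called from the main loop. Returns false when the buffer is empty. */
bool ts_ring_pop(ts_ring_t *r, uint32_t *out) {
    if (r->head == r->tail) return false;
    *out = r->data[r->tail & (TS_RING_SIZE - 1u)];
    r->tail = r->tail + 1u;
    return true;
}
```

The handler would read the microsecond timer, call ts_ring_push, and clear the edge status; the main loop drains with ts_ring_pop and computes deltas. Counting dropped samples when push returns false tells you whether the consumer is keeping up.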

Concept 4: PIO (Programmable I/O) Microarchitecture

Fundamentals

PIO is the feature that makes RP2040/RP2350 unique. Each PIO block contains multiple state machines that run tiny programs at deterministic timing, allowing you to implement custom digital protocols without CPU involvement. PIO can shift data in and out, control pin directions, and synchronize with external signals. You can think of PIO as a hardware-backed bit-level co-processor that runs at predictable timing.

PIO is best understood as a programmable timing fabric. It is not a peripheral with fixed behavior; it is a microengine that you program to match a protocol. That means you must think about cycle counts, pin direction, and FIFO pacing at the same time. Once you do, you gain a tool that replaces many external chips and complex bit-banging loops.

Deep Dive into the Concept

PIO sits alongside the CPU and DMA as a peer on the bus fabric. Each PIO block includes a small instruction memory, shift registers, a program counter, and configurable pin mapping. A state machine executes 16-bit instructions such as in, out, jmp, wait, and set. Instructions can stall on pin conditions or FIFO availability, which provides a convenient way to sync to external signals. The state machine can shift bits into its input shift register (ISR) and out of its output shift register (OSR) with configurable shift directions and bit counts. When the ISR or OSR reaches its threshold, it can push or pull data from the FIFO. This is how you stream data between PIO and the CPU or DMA.
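Shift direction is the detail that most often surprises people, so a tiny model helps. The sketch below imitates what an out instruction does to the OSR: with right shift the least significant bits leave first, and with left shift the most significant bits leave first. It is a simplified host-side model that does not cover autopull or FIFO stalls, and osr_model_t and osr_out are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t osr;          /* output shift register contents */
    bool     shift_right;  /* true: LSB leaves first; false: MSB leaves first */
} osr_model_t;

/* Shift `count` bits (1..32) out of the OSR and return them right-aligned. */
uint32_t osr_out(osr_model_t *m, unsigned count) {
    assert(count >= 1u && count <= 32u);
    if (count == 32u) {  /* a 32-bit shift is undefined in C, so handle it apart */
        uint32_t all = m->osr;
        m->osr = 0;
        return all;
    }
    uint32_t bits;
    if (m->shift_right) {
        bits = m->osr & ((1u << count) - 1u);
        m->osr >>= count;
    } else {
        bits = m->osr >> (32u - count);
        m->osr <<= count;
    }
    return bits;
}
```

Loading 0xA5000000 with left shift and taking 8 bits returns 0xA5, which is why MSB-first protocols usually place the payload in the top bits of the word. Loading 0x000000A5 with right shift and taking 4 bits twice returns 0x5 and then 0xA.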

PIO timing is deterministic because each instruction takes exactly one state-machine clock cycle, plus an optional delay of up to 31 cycles encoded in the instruction, unless it stalls on a pin, an IRQ flag, or a FIFO. Each state machine has its own fractional clock divider (16-bit integer part, 8-bit fractional part), so you can set precise bit timings independent of CPU load. This is why PIO can generate WS2812 LED timings or VGA sync pulses even when the CPU is busy. The state machines can also coordinate via IRQ flags, enabling multi-SM synchronization. For example, one state machine can generate pixel data while another generates sync pulses, or one can capture while another triggers.
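The divider arithmetic is worth working through once by hand. The sketch below is host-side C with hypothetical helper names (pio_clkdiv_for, pio_actual_bit_hz). It computes a divider for a target bit rate given how many state-machine cycles your program spends per bit, then reports the bit rate that divider actually yields. It assumes the fractional part is truncated to 1/256 steps.

```c
#include <assert.h>
#include <stdint.h>

/* Divider in 16.8 fixed-point form: integer part plus 1/256 steps. */
typedef struct {
    uint16_t int_part;
    uint8_t  frac_part;
} pio_clkdiv_t;

/* Divider so that one protocol bit spans cycles_per_bit SM cycles.
 * The fractional part is truncated to 1/256 resolution. */
pio_clkdiv_t pio_clkdiv_for(uint32_t sys_hz, uint32_t bit_hz,
                            uint32_t cycles_per_bit)
{
    uint64_t sm_hz = (uint64_t)bit_hz * cycles_per_bit;
    assert(sm_hz > 0 && sm_hz <= sys_hz);          /* cannot run faster than sys clock */
    uint64_t div_x256 = ((uint64_t)sys_hz << 8) / sm_hz;
    assert(div_x256 < (1u << 24));                 /* must fit in 16.8 bits */
    pio_clkdiv_t d = { (uint16_t)(div_x256 >> 8), (uint8_t)(div_x256 & 0xFF) };
    return d;
}

/* Bit rate in Hz (rounded down) that a given divider produces. */
uint32_t pio_actual_bit_hz(uint32_t sys_hz, pio_clkdiv_t d,
                           uint32_t cycles_per_bit)
{
    uint64_t div_x256 = ((uint64_t)d.int_part << 8) | d.frac_part;
    assert(div_x256 > 0 && cycles_per_bit > 0);
    return (uint32_t)((((uint64_t)sys_hz << 8) / div_x256) / cycles_per_bit);
}
```

With a 125 MHz system clock, an 800 kHz WS2812 stream at 10 cycles per bit needs a divider of 15.625 (integer 15, fraction 160/256). A non-zero fractional part is produced by alternating between two integer dividers, so the average rate is right but individual cycles jitter slightly; where possible, choose a cycle count that gives an integer divider.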

The PIO instruction memory is shared among the state machines in a block. This encourages compact programs and reuse. On RP2040, each PIO block has four state machines and a 32-instruction program memory, with two blocks total. RP2350 adds a third PIO block, for twelve state machines in total, enabling more concurrent protocols. Each state machine has configurable pin bases and side-set pins, letting you manipulate pins in parallel with data shifting. Side-set is a key PIO concept: a few bits of each instruction's delay field drive up to five pins as a side effect of that instruction, so pin changes cost no extra instructions or cycles.

A common design pattern is to use PIO for precise I/O and DMA for throughput. For example, in a logic analyzer, the PIO captures pin states into its FIFO, DMA transfers the data into SRAM, and the CPU performs decoding. In a LED controller, the CPU or DMA loads the PIO FIFO with encoded data; the PIO handles timing and pin toggling. PIO is not a replacement for all peripherals; it is a tool for deterministic timing and protocol flexibility.

When building PIO programs, think like a hardware designer. You should reason about cycle-by-cycle behavior, pipeline delays, and exact sampling points. Use the PIO assembler and the visualizer tools to check pin timing. Avoid relying on implicit behavior: always define shift directions, FIFO thresholds, and wrap points. If you want to understand PIO deeply, write a scope-driven test harness that toggles pins and captures waveforms while varying delays.

PIO also supports automatic push and pull of shift registers when certain bit counts are reached. This is how you create continuous data streams without explicit push/pull instructions in the loop. Combined with FIFO join, which merges a state machine's 4-entry TX and RX FIFOs into a single 8-entry FIFO in one direction, you can increase buffering depth and reduce CPU involvement. Another critical feature is pin mapping: each state machine can have independent base pins and side-set pins, which allows you to use the same program for multiple pin groups by changing configuration instead of code.

Multi-SM coordination is a powerful but underused feature. You can have one SM generate a clock, another drive data, and a third sample an input. IRQ flags can synchronize state machines at frame boundaries. For video output, this allows clean sync generation. For protocol analyzers, it enables precise alignment between sampling and timestamping. These capabilities turn PIO into a small, programmable peripheral subsystem rather than a single-purpose IO engine.

How This Fits in Projects

Project 2 (WS2812) uses PIO to generate strict timing patterns.
Projects 4 and 12 (Logic Analyzer) use PIO to capture external signals.
Project 7 (VGA) uses multi-SM PIO for sync and pixel data.
Project 10 (USB Host) can use PIO for bit-level debugging signals.

Definitions & Key Terms

State machine (SM): The PIO execution unit.
ISR/OSR: Input/Output shift registers in the SM.
Side-set: Additional pin control encoded in instruction bits.
FIFO: Buffer between PIO and CPU/DMA.
Wrap: PIO instruction loop boundaries.

Mental Model Diagram

     +------------------+     FIFO     +------------+
Pins | PIO State Machine| <----------> | CPU / DMA  |
     +------------------+              +------------+
            |  ^
            v  |
         Pin control

How It Works (Step-by-Step)

Load a PIO program into instruction memory.
Configure pin bases, directions, and side-set pins.
Set clock divider and shift control (thresholds, direction).
Enable the state machine.
PIO executes instructions deterministically, pushing/pulling FIFO.
CPU or DMA consumes/produces FIFO data.

Minimal Concrete Example

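.program ws2812
.side_set 1
; One bit takes T1 + T2 + T3 = 10 state-machine cycles.
; Autopull (threshold 24, shift left) refills the OSR, so no explicit pull is needed.
.define T1 2
.define T2 5
.define T3 3
.wrap_target
bitloop:
    out x, 1       side 0 [T3 - 1] ; shift the next data bit into X, line low
    jmp !x do_zero side 1 [T1 - 1] ; start of bit: line high, branch on the bit value
do_one:
    jmp bitloop    side 1 [T2 - 1] ; 1 bit: stay high for the long pulse
do_zero:
    nop            side 0 [T2 - 1] ; 0 bit: go low early for the short pulse
.wrap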
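A WS2812 program like this one consumes one 24-bit colour word per LED from the TX FIFO. The sketch below shows one way to prepare that word on the CPU side, plus a small software model of a left-shifting OSR so you can check the bit order on your desk before touching a scope. The function names are hypothetical, and the sketch assumes the state machine is configured to shift left (most significant bit first).

```c
#include <assert.h>
#include <stdint.h>

/* WS2812 expects green, then red, then blue, most significant bit first.
 * The 24 bits are left-justified in the 32-bit FIFO word so that a
 * left-shifting OSR emits them before the 8 padding bits. */
uint32_t ws2812_pack_grb(uint8_t r, uint8_t g, uint8_t b)
{
    uint32_t grb = ((uint32_t)g << 16) | ((uint32_t)r << 8) | b;
    return grb << 8;
}

/* Software model of an n-bit `out` with shift-left: returns the top
 * n bits of the OSR and shifts the register left by n. */
uint32_t osr_out_left(uint32_t *osr, unsigned n)
{
    assert(osr != 0 && n >= 1 && n <= 32);
    uint32_t bits = (n == 32) ? *osr : (*osr >> (32 - n));
    *osr = (n == 32) ? 0 : (*osr << n);
    return bits;
}
```

Shifting the packed word out eight bits at a time should yield the green, red, and blue bytes in that order. If your LEDs show swapped colours, the packing order or the shift direction is the first thing to check.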

Common Misconceptions

“PIO is just for LEDs.” It can implement many serial protocols.
“PIO replaces interrupts.” It removes timing-critical interrupts by handling bit timing in hardware, but the CPU or DMA must still service its FIFOs.
“PIO is easy to debug.” It is deterministic but still requires waveform validation.

Check-Your-Understanding Questions

Why does side-set reduce instruction count?
How do FIFO thresholds affect throughput?
When should you use DMA with PIO?

Check-Your-Understanding Answers

Side-set allows pin changes without extra instructions.
Thresholds control when data is pushed/pulled, affecting latency and buffering.
When you need sustained throughput without CPU intervention.

Real-World Applications

Custom LED drivers, VGA output, logic analyzers.
Software-defined UARTs, SPI, and I2C controllers.
Protocol analyzers and test equipment.

Where You’ll Apply It

Projects 2, 4, 7, 12.

References

Pico SDK PIO documentation: https://www.raspberrypi.com/documentation/pico-sdk/hardware.html#hardware_pio
Getting started with PIO (Raspberry Pi docs): https://www.raspberrypi.com/documentation/microcontrollers/raspberry-pi-pico.html#raspberry-pi-pico-pio-using-the-pio
RP2350 datasheet (PIO count, features): https://datasheets.raspberrypi.com/rp2350/rp2350-datasheet.pdf

Key Insight

PIO lets you treat timing as hardware, not as code.

Summary

PIO is a deterministic, programmable, per-pin engine. When used with DMA and careful timing, it can implement protocols that would otherwise require specialized hardware.

Homework/Exercises to Practice the Concept

Write a PIO program that captures a UART RX line at 115200 and pushes bytes to FIFO.
Implement a PIO program that generates a PWM-like output with adjustable duty.
Measure the timing of PIO instructions on a scope.

Solutions to the Homework/Exercises

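Use wait to sync on the falling edge of the start bit, delay to the middle of the first data bit, then loop in pins, 1 eight times at one sample per bit period.
Keep the high and low counts in X and Y (or pull them from the FIFO), then drive the pin with set pins, 1 and set pins, 0, each followed by a counted jmp loop so the duty cycle can change at run time.
Count the cycles each instruction takes, including delays, multiply by the divided clock period to get the expected timing, and compare that with the scope.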

Concept 5: DMA and Streaming Data Pipelines

Fundamentals

DMA (Direct Memory Access) moves data between peripherals and memory without CPU involvement. The DMA engine (12 channels on RP2040, 16 on RP2350) supports ring buffers, pacing with DREQ signals, and chaining for complex transfer sequences. DMA is critical for high-throughput or low-jitter tasks such as audio streaming, logic analysis, or VGA output.

DMA is your escape hatch from CPU-bound data movement. If you understand DMA, you can build systems that stream data continuously while the CPU focuses on control logic. This is the line between a “demo” project and a production-ready pipeline: DMA makes throughput predictable and timing stable.

Deep Dive into the Concept

The DMA engine acts as a bus master that can read and write memory and peripheral registers. Each channel is configured with source, destination, transfer count, data size, and pacing. The pacing signal, typically called DREQ, allows the DMA to match the peripheral rate. For example, a PIO or UART can assert a DREQ when its FIFO is ready, ensuring the DMA transfers only when it is safe. This prevents overruns and underruns without CPU intervention.

DMA channels can be chained: when one channel completes, it can trigger another channel to start. This is a powerful pattern for double-buffered streaming. For example, in an audio synthesizer you can fill buffer A while buffer B is being transmitted by DMA. When B completes, DMA triggers the transfer of A, and the CPU refills B. This forms a stable, glitch-free pipeline.

The RP2040 DMA provides additional features such as per-channel address increment control and ring buffer mode. In ring mode a channel wraps either its read or its write address (not both) on a power-of-two boundary, from 2 bytes up to 32 KB, which turns a naturally aligned region of SRAM into a circular buffer. Ring buffers are useful for continuous capture. For analyzers, this is ideal because you can capture continuously and stop on a trigger, then analyze the buffer.
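Ring wrapping is easiest to understand as address arithmetic: only the low address bits advance, and the upper bits stay fixed. The following host-side sketch models that behavior with two hypothetical helpers. It is a model for reasoning about buffer placement, not SDK code, and it assumes the ring size is given as a bit count from 1 to 15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if base can start a (1 << ring_bits)-byte ring: the low
 * ring_bits address bits must all be zero. */
bool ring_base_is_aligned(uintptr_t base, unsigned ring_bits)
{
    assert(ring_bits >= 1 && ring_bits <= 15);
    return (base & (((uintptr_t)1 << ring_bits) - 1)) == 0;
}

/* Address after one transfer of step bytes. Only the low ring_bits
 * bits advance; the upper bits are held, so the address wraps back
 * to the start of the ring instead of running past its end. */
uintptr_t ring_next_addr(uintptr_t addr, unsigned ring_bits, unsigned step)
{
    assert(ring_bits >= 1 && ring_bits <= 15);
    uintptr_t mask = ((uintptr_t)1 << ring_bits) - 1;
    return (addr & ~mask) | ((addr + step) & mask);
}
```

The model also shows why alignment matters: if the buffer does not start on a multiple of its size, the wrap lands on the aligned address below the buffer and the DMA writes over whatever lives there.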

DMA is not “free.” It uses the bus, and if it is not configured carefully it can starve the CPU or other peripherals. For example, if you give the DMA high bus priority and run long unpaced transfers out of the SRAM bank that also holds your code or stack, the CPU stalls on instruction and data fetches. The fix is to pace transfers with DREQ, set bus priorities deliberately, and use SRAM banking wisely so that buffers and hot code live in different banks. Another pitfall is alignment: source and destination addresses must be naturally aligned to the transfer size, and a misaligned address can silently produce corrupt data.

The mental model: DMA is a programmable traffic controller on the bus fabric. You decide what flows where, when, and how fast. When paired with PIO, you can build hardware-like pipelines. When paired with USB, you can move buffers with minimal latency. Mastering DMA makes your firmware feel like hardware.

Advanced DMA usage includes building linked lists of control blocks in memory. While RP2040 does not expose a full scatter-gather engine like large SoCs, you can emulate it by chaining channels that load new configurations from memory. This lets you create complex transfer sequences such as pattern generation, interleaved channels, or streaming with headers. Another technique is to use DMA completion IRQs to signal buffer availability, reducing polling overhead.

DMA also interacts with cache and XIP behavior. If your source data is in flash, you may incur cache misses that stall the bus. A common optimization is to prefetch or copy critical data into SRAM before DMA transfers. Similarly, if DMA writes into SRAM that the CPU immediately reads, you must ensure proper memory barriers. Even without caches, compiler reordering can cause subtle bugs; use volatile or explicit barriers for shared flags. The takeaway is that DMA is powerful, but it must be integrated into a coherent memory and scheduling model.

Finally, measure DMA behavior. Use timestamps around buffer boundaries to verify that transfers complete within deadlines. This measurement-driven approach will catch underruns early and help you size buffers based on evidence rather than guesswork.
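Buffer sizing follows directly from those measurements. As a rough sketch with hypothetical helper names, the first function below gives the time one buffer lasts at a fixed sample rate, which is the deadline for refilling the other buffer. The second gives the smallest buffer that covers a measured worst-case refill time plus a safety margin.

```c
#include <assert.h>
#include <stdint.h>

/* Time one buffer lasts at a fixed sample rate, in microseconds
 * (rounded down). In a double-buffered design this is the deadline
 * for refilling the other buffer. */
uint32_t buffer_period_us(uint32_t samples, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return (uint32_t)(((uint64_t)samples * 1000000u) / sample_rate_hz);
}

/* Smallest buffer, in samples, whose period covers the worst measured
 * refill time plus a margin given in percent (result rounded up). */
uint32_t min_buffer_samples(uint32_t worst_refill_us, uint32_t margin_pct,
                            uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    uint64_t budget_us = (uint64_t)worst_refill_us * (100u + margin_pct) / 100u;
    return (uint32_t)((budget_us * sample_rate_hz + 999999u) / 1000000u);
}
```

For example, 256 samples at 48 kHz last about 5.3 ms. If your worst measured refill takes 2 ms and you want a 50 percent margin, you need at least 144 samples per buffer.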

How This Fits in Projects

Projects 4 and 12 use DMA for high-speed capture.
Project 5 uses DMA for audio streaming.
Project 7 uses DMA for pixel data throughput.
Project 11 uses DMA to update flash or verify images efficiently.

Definitions & Key Terms

DREQ: DMA request signal from a peripheral.
Ring buffer: Circular buffer with wrap-around addressing.
Channel chaining: Triggering a channel when another completes.
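Burst: A run of back-to-back transfers made before the DMA yields the bus. RP2040 channels issue one transfer per request, so contention is tuned with transfer size, pacing, and bus priority rather than a burst length.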

Mental Model Diagram

  Peripheral -> DREQ -> DMA Channel -> SRAM Buffer -> CPU

How It Works (Step-by-Step)

Configure DMA channel with source/destination and size.
Configure DREQ pacing and data size.
Optionally configure ring buffer and chaining.
Enable channel; DMA moves data autonomously.
CPU handles buffers or interrupts on completion.

Minimal Concrete Example

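// Stream words from the PIO0 SM0 RX FIFO into a 1 KB ring buffer in SRAM.
// Needs hardware/dma.h and hardware/pio.h; the state machine must already be running.
static uint32_t buffer[256] __attribute__((aligned(1024))); // ring base must be size-aligned

int chan = dma_claim_unused_channel(true);
dma_channel_config cfg = dma_channel_get_default_config(chan);
channel_config_set_dreq(&cfg, DREQ_PIO0_RX0);            // pace on RX FIFO data available
channel_config_set_transfer_data_size(&cfg, DMA_SIZE_32);
channel_config_set_read_increment(&cfg, false);          // always read the same FIFO register
channel_config_set_write_increment(&cfg, true);          // walk through the buffer
channel_config_set_ring(&cfg, true, 10);                 // wrap the write address every 1 << 10 bytes

dma_channel_configure(chan, &cfg,
    buffer,               // dst
    &pio0->rxf[0],        // src
    256,                  // transfers (one pass around the ring)
    true                  // start
);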

Common Misconceptions

“DMA is always faster.” It depends on bus contention, transfer size, and pacing.
“DMA eliminates the need for interrupts.” You still need completion or error handling.
“Ring buffer means no data loss.” You still must size and read it correctly.

Check-Your-Understanding Questions

What is a DREQ and why does it matter?
How does channel chaining help with double buffering?
Why can DMA cause CPU stalls?

Check-Your-Understanding Answers

DREQ paces DMA to a peripheral’s readiness, preventing overruns.
It allows seamless switching between buffers without CPU scheduling gaps.
DMA uses the same bus as the CPU, so heavy transfers can delay instruction/data fetches.

Real-World Applications

Audio streaming with zero-dropout.
Continuous logic capture.
High-speed display output.

Where You’ll Apply It

Projects 4, 5, 7, 12.

References

Pico SDK DMA documentation: https://www.raspberrypi.com/documentation/pico-sdk/hardware.html#hardware_dma
RP2040 specifications: https://www.raspberrypi.com/products/rp2040/specifications/

Key Insight

DMA turns your microcontroller into a streaming data engine.

Summary

DMA is the backbone of throughput and low-jitter systems. With proper pacing and buffering, you can move large amounts of data without sacrificing CPU time.

Homework/Exercises to Practice the Concept

Configure a DMA channel to copy a buffer in SRAM and measure CPU time saved.
Implement a double-buffered DMA transfer and measure glitch-free audio output.
Use DMA in ring buffer mode to capture a GPIO waveform via PIO.

Solutions to the Homework/Exercises

Measure cycles with a timer; CPU usage drops significantly when DMA handles copies.
Use two buffers and a completion IRQ to refill while DMA transmits.
Configure DMA with ring size and verify wrap in your capture buffer.

Concept 6: Dual-Core, Synchronization, and Real-Time Scheduling

Fundamentals

RP2040 and RP2350 have two cores that can run independent code. This enables parallelism, but only if you coordinate shared resources. The chips provide hardware spinlocks, FIFOs, and IRQ routing to make synchronization reliable. Real-time scheduling is the discipline of ensuring that time-critical tasks meet deadlines, often by isolating them on one core or using deterministic pipelines with PIO and DMA.

Dual-core does not automatically solve performance problems; it shifts the challenge to synchronization and scheduling. The key mental model is to treat each core as a resource with a time budget and to make communication explicit. If you can design clear ownership of tasks and shared data, the system becomes reliable and scalable.

Deep Dive into the Concept

The dual-core model is simple but powerful. Each core can execute its own program, or both can run the same firmware with different roles. One common pattern is to dedicate core 1 to time-critical I/O while core 0 handles high-level logic and communication. The cores share SRAM, so you must guard shared data with locks or message passing. RP2040 provides 32 hardware spinlocks in the SIO block, which are single-cycle primitives for mutual exclusion. Using them incorrectly can lead to deadlocks or priority inversions, so you need to design critical sections carefully.

Inter-core FIFOs allow fast message passing without shared memory. You can send small commands or tokens between cores with low overhead. This is useful for scheduling: for example, core 0 can send a “render next frame” message to core 1, which drives a VGA PIO pipeline. The key is to keep data structures consistent and minimize shared-state complexity.

Real-time scheduling is more than splitting work. You must identify deadlines and ensure that operations are deterministic. For example, a USB HID device must respond within a specific time window, while an audio DAC must be fed at an exact rate. If you allow a high-priority interrupt to run too long, you might miss a DMA completion or PIO FIFO refill. The proper strategy is to use timers, prioritize interrupts, and offload data movement to DMA. Some projects require a basic scheduler: a timer tick that switches tasks or a cooperative loop that yields at defined points. Project 6 will implement a minimal real-time scheduler to make these concepts concrete.
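A cooperative loop with fixed-rate tasks can be surprisingly small. The sketch below is one minimal way to write it, not the scheduler Project 6 builds. The names task_t and sched_tick are made up for this example, and the two counting tasks exist only so the behavior can be checked on a host. On the target, now_ms would typically come from a timer, for example the SDK's to_ms_since_boot(get_absolute_time()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

typedef struct {
    task_fn  run;        /* must return quickly; it is never preempted */
    uint32_t period_ms;  /* fixed release interval */
    uint32_t next_ms;    /* next release time */
} task_t;

/* Run every task whose release time has arrived. Call once per loop
 * iteration with the current time. Returns how many tasks ran. */
size_t sched_tick(task_t *tasks, size_t count, uint32_t now_ms)
{
    size_t ran = 0;
    for (size_t i = 0; i < count; i++) {
        assert(tasks[i].run != NULL && tasks[i].period_ms > 0);
        /* The signed difference stays correct when the counter wraps. */
        if ((int32_t)(now_ms - tasks[i].next_ms) >= 0) {
            tasks[i].run();
            /* Step from the schedule, not from now, so lateness does
             * not accumulate as drift. */
            tasks[i].next_ms += tasks[i].period_ms;
            ran++;
        }
    }
    return ran;
}

/* Two example tasks that only count how often they run. */
unsigned fast_runs, slow_runs;
void fast_task(void) { fast_runs++; }
void slow_task(void) { slow_runs++; }
```

Because nothing is preempted, the worst-case jitter of any task equals the longest run time of any other task. Measure that first when a deadline is missed.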

On RP2350, the TrustZone security model can also affect interrupts: secure and non-secure interrupts are partitioned, and you must ensure the right core and security domain handle the right events. This adds a layer of complexity for secure boot and protected data, but the principles are the same: minimize shared state, use hardware primitives, and reason about timing.

Dual-core systems also expose memory ordering issues. Even without caches, you can see surprising behavior if you do not use barriers when sharing flags. The Pico SDK provides memory barrier helpers and synchronization APIs; if you roll your own, you must understand how the compiler may reorder memory operations. Many embedded bugs are actually memory ordering bugs.

A subtle but important detail is interrupt affinity. Each core has its own interrupt controller, and a peripheral interrupt is serviced by whichever core enables it, so you choose the handling core by where you enable the IRQ. On RP2350 the security model can additionally assign interrupts to the secure or non-secure domain. If you misroute an interrupt, you might see latency spikes or unexpected handler execution. Real-time scheduling on a dual-core system often uses a “time-critical core” and a “services core”. The time-critical core should do minimal work beyond feeding PIO/DMA, and should avoid locks whenever possible.

Memory ordering deserves a concrete case. When a core sets a flag to indicate new data in a buffer, the other core must see the buffer contents before it sees the flag. Compilers can reorder reads and writes around that handoff, even on small MCUs, so it requires a memory barrier or a strict ordering discipline. The Pico SDK provides helpers for this, but if you roll your own, you must insert barriers manually. Ignoring this can lead to rare, hard-to-debug race conditions.
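The flag-and-buffer handoff can be sketched with standard C11 atomics, assuming your toolchain supports them on the target. The mailbox type and function names are invented for this example, and this is not the SDK's own synchronization API. The release store and acquire load are what guarantee the payload is visible before the flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t    payload[4];  /* plain data, written before the flag */
    atomic_bool ready;       /* set last by the producer */
} mailbox_t;

void mailbox_init(mailbox_t *m)
{
    assert(m != NULL);
    atomic_init(&m->ready, false);
}

/* Producer: fill the buffer, then publish with release ordering so the
 * payload writes cannot be reordered after the flag store. */
void mailbox_publish(mailbox_t *m, const uint32_t data[4])
{
    assert(m != NULL && data != NULL);
    for (int i = 0; i < 4; i++) {
        m->payload[i] = data[i];
    }
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Consumer: acquire-load the flag first, and read the payload only if
 * it is set. Clearing the flag hands the buffer back to the producer. */
bool mailbox_try_take(mailbox_t *m, uint32_t out[4])
{
    assert(m != NULL && out != NULL);
    if (!atomic_load_explicit(&m->ready, memory_order_acquire)) {
        return false;
    }
    for (int i = 0; i < 4; i++) {
        out[i] = m->payload[i];
    }
    atomic_store_explicit(&m->ready, false, memory_order_release);
    return true;
}
```

This sketch supports exactly one producer and one consumer, and the producer must not publish again until the flag has been cleared. For anything more elaborate, use a spinlock or a proper queue.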

How This Fits in Projects

Projects 4 and 5 benefit from splitting capture vs processing.
Project 6 is explicitly about real-time scheduling.
Project 7 uses core separation for VGA timing.
Project 10 can dedicate a core to USB host polling.

Definitions & Key Terms

Spinlock: Hardware lock primitive for mutual exclusion.
FIFO: Inter-core message queue.
Priority inversion: A low-priority task blocking a high-priority task.
Memory barrier: Prevents compiler/CPU reordering of memory operations.

Mental Model Diagram

Core 0 (control/UI)  <---- FIFO ---->  Core 1 (real-time I/O)
      |                                        |
      +---- shared buffers (guarded by locks) -+

How It Works (Step-by-Step)

Boot core 0, then launch core 1.
Assign responsibilities (e.g., core 1 handles PIO/DMA).
Use spinlocks or FIFO to protect shared data.
Set interrupt priorities and use timers for scheduling.
Validate deadlines with scope or timestamps.

Minimal Concrete Example

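// Launch core 1, send it a word, and wait for its reply (pico/multicore.h)
void core1_entry(void) {
    uint32_t cmd = multicore_fifo_pop_blocking();   // wait for a word from core 0
    multicore_fifo_push_blocking(~cmd);             // reply so core 0 can continue
}
multicore_launch_core1(core1_entry);
multicore_fifo_push_blocking(0x12345678);
uint32_t response = multicore_fifo_pop_blocking();  // response == ~0x12345678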

Common Misconceptions

“Dual-core means twice the speed.” Only if tasks are independent and coordinated.
“Spinlocks are always safe.” They can deadlock or stall real-time code.
“RTOS is required for real-time.” You can build deterministic loops without an RTOS.

Check-Your-Understanding Questions

When should you use FIFO instead of shared memory?
Why is memory ordering important in dual-core systems?
How can you prevent priority inversion in a simple scheduler?

Check-Your-Understanding Answers

When messages are small and you want deterministic, lock-free coordination.
Without barriers, the compiler or CPU may reorder writes, leading to stale data reads.
Keep critical sections short and avoid locking from low-priority tasks.

Real-World Applications

Audio streaming and DSP pipelines.
Video generation and capture.
Multi-protocol bridges (USB-to-serial, USB-to-I2C).

Where You’ll Apply It

Projects 4, 5, 6, 7, 10.

References

Pico SDK synchronization docs: https://www.raspberrypi.com/documentation/pico-sdk/hardware.html#hardware_sync
RP2040 specifications: https://www.raspberrypi.com/products/rp2040/specifications/

Key Insight

Dual-core gives you deterministic time budgets only if you partition and synchronize carefully.

Summary

Multicore programming on RP2040/RP2350 is straightforward but demands discipline. Use hardware spinlocks and FIFO, reason about timing, and keep shared state minimal.

Homework/Exercises to Practice the Concept

Build a dual-core program where core 1 toggles a pin at 1 kHz while core 0 logs timestamps.
Implement a producer-consumer queue with spinlocks and measure latency.
Write a cooperative scheduler with fixed-rate tasks and measure jitter.

Solutions to the Homework/Exercises

Use a timer interrupt on core 1 and log results over UART from core 0.
Use a ring buffer guarded by a spinlock and track enqueue/dequeue times.
Use SysTick or a hardware timer and record maximum deviation.

Concept 7: USB Device and Host Fundamentals on RP2040/RP2350

Fundamentals

USB is both a protocol and a timing discipline. RP2040 and RP2350 each include a USB 1.1 controller with an integrated PHY that can act as a Full Speed device or as a host for Full Speed and Low Speed devices. You must understand descriptors, endpoints, enumeration, and the timing of control transfers to build reliable USB devices. USB host mode requires additional scheduling and power considerations.

USB is the first protocol in this guide that forces you to think like a system integrator rather than a firmware coder. You must align clocking, descriptors, endpoints, and host behavior. Once you understand USB’s structure, you can reliably build devices that work on any PC without “mystery” enumeration failures.

Deep Dive into the Concept

USB Full Speed is 12 Mbps with strict timing. When you connect a USB device, the host performs enumeration by issuing control transfers to endpoint 0. Your firmware must respond with device, configuration, interface, and endpoint descriptors, which describe how the device behaves. If you get these wrong, the device will not enumerate. A USB HID keyboard is a classic example: it uses standard HID descriptors, reports keycodes on interrupt endpoints, and requires a periodic report interval. The TinyUSB stack abstracts many details, but you still need to understand descriptors to debug enumeration issues.

USB host mode adds another layer: the host must schedule transfers and poll devices. On a microcontroller, this can be challenging because you must service USB frames at 1 ms intervals and maintain device state. If the CPU is busy or if interrupts are masked too long, you will miss frames and lose devices. This is why splitting tasks across cores or using DMA for heavy I/O becomes important. USB host also requires VBUS power and proper USB power negotiation, especially for devices that draw more than default current.

USB communication uses a token-data-handshake sequence. Each transaction includes a token (IN/OUT/SETUP), optional data, and a handshake (ACK/NAK/STALL). The device must respond within a specified time. If your firmware stalls the CPU while servicing a PIO or DMA transfer, you may miss USB deadlines. Therefore, the USB stack should be prioritized appropriately. On RP2040, the USB controller is integrated and runs from a dedicated 48 MHz clock (clk_usb), normally generated by the USB PLL rather than derived from the system clock; this clock must be stable and accurate for reliable operation.

USB debugging is a skill in itself. When enumeration fails, you check the host logs, review descriptors, and ensure the device responds with correct lengths. Tools like usbmon (Linux) or a hardware USB analyzer can reveal transactions. The project-based approach in this guide gives you the mental model to debug these issues without guesswork.

USB descriptors form a hierarchy: device -> configuration -> interface -> endpoint -> class-specific descriptors (like HID report descriptors). Each level has length and type fields that must be correct. A single byte error can prevent enumeration. Control transfers have three stages (setup, data, status) and must be handled within strict timing windows. Devices can also respond with NAK to indicate temporary unavailability. This is normal flow control on interrupt and bulk endpoints, but a device that keeps NAKing control requests during enumeration will eventually hit the host's timeout and be treated as failed.
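The length and type fields are what a host uses to walk a configuration, so it helps to see one laid out byte by byte. The sketch below builds an illustrative configuration descriptor for a boot keyboard and walks it the way a host parser would. The byte values follow the standard descriptor layouts, but the array and helper names are hypothetical, and the HID report descriptor length (63) is a placeholder for whatever your report descriptor actually measures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor type codes. */
enum {
    USB_DT_CONFIG    = 0x02,
    USB_DT_INTERFACE = 0x04,
    USB_DT_ENDPOINT  = 0x05,
    USB_DT_HID       = 0x21
};

/* Configuration for a boot keyboard: 9 + 9 + 9 + 7 = 34 bytes. */
const uint8_t kbd_config[] = {
    /* Configuration: wTotalLength 34, 1 interface, bus powered with
     * remote wakeup, bMaxPower 50 (units of 2 mA, so 100 mA) */
    9, USB_DT_CONFIG, 34, 0, 1, 1, 0, 0xA0, 50,
    /* Interface 0: 1 endpoint, class HID (3), boot subclass, keyboard */
    9, USB_DT_INTERFACE, 0, 0, 1, 0x03, 0x01, 0x01, 0,
    /* HID: bcdHID 1.11, 1 report descriptor (type 0x22), 63 bytes */
    9, USB_DT_HID, 0x11, 0x01, 0, 1, 0x22, 63, 0,
    /* Endpoint 0x81: interrupt IN, 8-byte packets, 10 ms interval */
    7, USB_DT_ENDPOINT, 0x81, 0x03, 8, 0, 10
};

/* wTotalLength is a little-endian 16-bit field at offset 2. */
uint16_t config_total_length(const uint8_t *cfg)
{
    assert(cfg != NULL && cfg[1] == USB_DT_CONFIG);
    return (uint16_t)(cfg[2] | (cfg[3] << 8));
}

/* Walk the blob using each descriptor's bLength. Returns the number of
 * descriptors, or -1 if any length is impossible or overruns the end. */
int walk_descriptors(const uint8_t *buf, uint16_t total)
{
    assert(buf != NULL);
    uint16_t off = 0;
    int n = 0;
    while (off < total) {
        uint8_t len = buf[off];
        if (len < 2 || off + len > total) {
            return -1;
        }
        off = (uint16_t)(off + len);
        n++;
    }
    return n;
}
```

Running the walk over your own descriptors before flashing is a cheap check: a wrong bLength or a wTotalLength that disagrees with the array size shows up immediately instead of as a failed enumeration.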

Host mode adds scheduling complexity. Full-speed USB uses 1 ms frames, and the host is responsible for polling interrupt endpoints at declared intervals. If your firmware cannot keep up, input devices will lag or disconnect. Power budgeting also matters: a device may draw at most 100 mA until the host has configured it, and only then up to the current declared in its configuration descriptor (500 mA maximum on USB 2.0). On a microcontroller, you must ensure the VBUS supply can handle the load. This is why host implementations must combine protocol knowledge with hardware design discipline.

USB device classes also impose behavioral rules beyond descriptors. For example, HID devices must report at a declared interval and respect idle rates. Host stacks are strict about these rules. If you observe unexpected host behavior, it is often because a class-level rule was violated, not a low-level USB signal error.

How This Fits in Projects

Project 3 builds a USB HID keyboard.
Project 10 builds a USB host that reads keyboards/mice.
Project 11 bootloader can use USB for firmware updates.

Definitions & Key Terms

Descriptor: Structured data that describes a USB device.
Endpoint: A logical communication channel on a USB device.
Enumeration: Host process that configures and identifies a device.
HID: Human Interface Device class (keyboards, mice).
VBUS: USB power line provided by the host.

Mental Model Diagram

Host  <--- control (EP0) ---> Device
  |                              |
  |<-- interrupt IN (HID) -------|
  |<-- bulk/iso (optional) ----->|

How It Works (Step-by-Step)

Device pulls D+ high (Full Speed) or D- high (Low Speed) through a 1.5 kΩ pull-up to signal its presence and speed.
Host enumerates via SETUP packets on endpoint 0.
Device returns descriptors and configuration.
Host sets configuration and begins periodic transfers.
Device sends HID reports or receives data.

Minimal Concrete Example

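// TinyUSB HID report example: press the 'a' key (add a shift modifier for 'A')
uint8_t keycode[6] = { HID_KEY_A, 0, 0, 0, 0, 0 };
if (tud_hid_ready()) {
    tud_hid_keyboard_report(0, 0, keycode);  // report_id 0, no modifiers
}
// Send a later report with keycode = NULL to release the key.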

Common Misconceptions

“USB is just UART with a different cable.” It is not; timing and descriptors matter.
“Enumeration failures are random.” They almost always trace to descriptor or timing errors.
“Host mode is the same as device mode.” Host requires scheduling and power management.

Check-Your-Understanding Questions

Why is endpoint 0 special in USB?
What role do descriptors play in enumeration?
Why does USB need an accurate 48 MHz clock?

Check-Your-Understanding Answers

Endpoint 0 is the default control endpoint used for enumeration.
Descriptors describe the device’s capabilities and configuration to the host.
USB timing is strict: Full Speed signaling tolerates only about ±0.25% data-rate error, so clock drift beyond that breaks signaling and packets.

Real-World Applications

Custom input devices, macro pads, and instrumentation.
USB-to-serial or USB-to-MIDI devices.
USB host dongles for HID devices.

Where You’ll Apply It

Projects 3, 10, 11.

References

Pico SDK USB and TinyUSB integration: https://www.raspberrypi.com/documentation/pico-sdk/index_doxygen.html
UF2 and USB bootloader: https://github.com/microsoft/uf2

Key Insight

USB success is 90 percent correct descriptors and timing discipline.

Summary

USB is powerful but unforgiving. Understanding enumeration, endpoints, and timing makes USB projects tractable on microcontrollers.

Homework/Exercises to Practice the Concept

Write a minimal HID descriptor for a keyboard and validate it with a USB descriptor tool.
Capture USB enumeration with usbmon and annotate each transaction.
Implement a USB CDC (serial) device and measure throughput.

Solutions to the Homework/Exercises

Use TinyUSB examples and verify descriptor lengths and report IDs.
Look for SETUP, DATA, STATUS stages and ensure descriptors match lengths.
Measure data rate over serial and compare to expected full-speed limits.

Concept 8: Security, Secure Boot, and TrustZone on RP2350

Fundamentals

RP2350 introduces hardware security features that enable secure boot, trusted execution, and cryptographic acceleration. The chip supports Arm TrustZone-M (on Cortex-M33), OTP storage for keys, hardware SHA-256 acceleration, and random number generation. These features let you build devices that verify firmware authenticity and protect secrets.

Security is not a feature you add at the end; it is a system architecture decision. On RP2350, the presence of OTP and TrustZone means you can actually enforce a trust boundary. The fundamentals here are about creating a chain of trust, protecting keys, and handling updates safely.

You will also learn to separate security design from security implementation. The hardware gives you primitives; you must decide policies, lifecycle stages, and recovery strategies. Without that, security features can be misused or left disabled.

Deep Dive into the Concept

Security in microcontrollers starts at boot. If the boot chain is not verified, attackers can replace your firmware. RP2350 provides mechanisms to authenticate code before it runs. The typical secure boot flow uses a root of trust stored in one-time programmable (OTP) memory. The boot ROM verifies a cryptographic signature on the firmware image against a public key whose SHA-256 fingerprint is stored in OTP. If verification fails, the boot ROM refuses to run the image and can try another image slot or fall back to its recovery boot path. This ensures that only authorized firmware runs.

TrustZone-M divides the system into secure and non-secure worlds. Secure code can access secure memory and peripherals, while non-secure code cannot. This is useful for protecting cryptographic keys or secure services. On Cortex-M33, TrustZone introduces two sets of exception vectors, secure and non-secure, and requires careful configuration of the Security Attribution Unit (SAU) and Implementation Defined Attribution Unit (IDAU). The RP2350 can leverage this to isolate a secure bootloader from user firmware. If you misconfigure TrustZone, non-secure code may gain access, or secure code may accidentally call into non-secure space.

Cryptographic acceleration matters because cryptography is expensive on microcontrollers. The RP2350 hardware SHA-256 block can offload hashing, which is central to signature verification. Combined with a hardware TRNG (true random number generator) and OTP storage, you can implement secure key provisioning and secure updates. But security features are only as good as the system design: if you allow firmware to bypass checks, or if you expose debug access, you can still be compromised.

A realistic security model includes secure boot, signed firmware images, a rollback mechanism, and a protected update path. This is exactly what Project 8 and Project 11 aim to teach. You will build a chain of trust that verifies images, switches between slots, and logs failures. The key lesson is that embedded security is a system problem: bootloader, memory map, update process, and hardware configuration all matter.

Secure boot also interacts with device lifecycle. You may want a development mode with debug access, and a production mode where debug is locked. If you enable TrustZone, you must define a secure gateway for non-secure code to call secure services. This interface must be minimal and carefully validated. Another practical consideration is key provisioning: OTP is irreversible. You should design a provisioning script that verifies the key hash before burning fuses, and you should keep a recovery path for development boards.

Secure updates are not only about signature verification. You must also prevent rollback to a vulnerable version. A common pattern is to store a monotonic version counter in OTP or secure storage and reject older images. Additionally, integrity checks should happen before any untrusted code runs. If you use a staged bootloader, keep the trusted code small to minimize the attack surface. These are system-level design decisions that the project will force you to implement.
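One way to reason about a monotonic counter in OTP is a thermometer code: because OTP bits can be set but never cleared, the count of set bits can only grow. The sketch below illustrates that idea with hypothetical helper names. It is an assumption made for teaching purposes, not a description of how RP2350's boot ROM stores version information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimum allowed version, stored as the number of set bits in an OTP
 * word. Bits only ever go from 0 to 1, so this value never decreases. */
unsigned otp_counter_value(uint32_t otp_word)
{
    unsigned n = 0;
    while (otp_word) {
        n += otp_word & 1u;
        otp_word >>= 1;
    }
    return n;
}

/* Accept an image only if its version is not older than the counter. */
bool rollback_check_ok(uint32_t image_version, uint32_t otp_word)
{
    return image_version >= otp_counter_value(otp_word);
}

/* OTP word after committing image_version: sets the low
 * image_version bits and never clears a bit that is already set. */
uint32_t otp_word_after_commit(uint32_t otp_word, uint32_t image_version)
{
    assert(image_version <= 32);
    uint32_t want = (image_version == 32) ? 0xFFFFFFFFu
                                          : ((1u << image_version) - 1u);
    return otp_word | want;
}
```

Commit the new counter only after the new image has booted and passed its self-test. Otherwise a bad update can lock out the last known-good version as well.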

It is also important to plan for recovery. A secure boot flow should include a minimal recovery image that can accept a new signed firmware even if the main image is corrupted. Without a recovery path, a single failure in the update process can permanently disable a device.

How This Fits in Projects

Project 8 implements secure boot and TrustZone partitioning.
Project 11 implements OTA updates with integrity and rollback.
Capstone board uses secure boot for production firmware.

Definitions & Key Terms

TrustZone-M: Arm security extension for Armv8-M cores such as the Cortex-M33.
OTP: One-time programmable memory for keys and fuses.
Root of Trust: The first trusted key or code in the boot chain.
Secure boot: Verification of firmware authenticity before execution.
TRNG: True random number generator.

Mental Model Diagram

Power On -> Boot ROM -> Verify Signature -> Secure Bootloader
                                |
                                v
                       Non-secure application

How It Works (Step-by-Step)

Boot ROM reads firmware header and signature.
Boot ROM checks the image's public key against the key hash stored in OTP, then uses that key to verify the signature.
If valid, boot ROM transfers control to secure bootloader.
Secure bootloader configures TrustZone and launches non-secure app.
Non-secure app runs with restricted access.

Minimal Concrete Example

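// Pseudocode: verify firmware hash in bootloader.
// sha256_hw(), ecdsa_verify(), and fail_safe() are placeholder helpers.
uint8_t hash[32];
sha256_hw(firmware_image, image_len, hash);  // hash the whole image first
if (!ecdsa_verify(hash, signature, public_key_from_otp)) {
    fail_safe();  // never jump to an image that failed verification
}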

Common Misconceptions

“Security is just encryption.” Integrity and authenticity matter more than secrecy.
“TrustZone is automatic.” It requires careful memory and peripheral partitioning.
“OTP is a convenience.” It is permanent and must be programmed carefully.

Check-Your-Understanding Questions

Why does secure boot need a root of trust?
What happens if TrustZone is misconfigured?
Why is hardware SHA useful during boot?

Check-Your-Understanding Answers

The root of trust anchors all verification; without it, signatures are meaningless.
Non-secure code may access secure data, or the system may crash on secure calls.
It speeds up hashing and reduces CPU time in the boot path.

Real-World Applications

Secure IoT devices and firmware update pipelines.
Industrial controllers with tamper protection.
Devices that must comply with security certifications.

Where You’ll Apply It

Projects 8 and 11, plus the Capstone.

References

Arm TrustZone for Cortex-M: https://developer.arm.com/documentation/102418/latest/
Pico SDK SHA-256 hardware API: https://www.raspberrypi.com/documentation/pico-sdk/hardware.html#hardware_sha256
RP2350 product information portal: https://pip.raspberrypi.com/categories/1120-rp2350

Key Insight

Security is a chain, and the boot ROM is its first link.

Summary

RP2350 security features make robust secure boot possible, but only if you design the system carefully: keys, signatures, TrustZone configuration, and update logic must align.

Homework/Exercises to Practice the Concept

Design a firmware image format that includes size, version, hash, and signature.
Simulate signature verification in software using test keys.
Create a TrustZone memory map and label which regions are secure vs non-secure.

Solutions to the Homework/Exercises

Include a header with magic, version, image size, hash, and signature fields.
Use a known ECDSA library to verify a test image and log pass/fail.
Mark bootloader and key storage as secure; application and peripherals as non-secure.
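
One possible layout for the header in the first solution is sketched below. The magic value, field order, and the 64-byte raw signature size are assumptions chosen for illustration, and fw_header_sane is a hypothetical helper. It is not the image format the RP2350 boot ROM uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FW_MAGIC 0x32335046u  /* arbitrary example value, not a real format */

typedef struct {
    uint32_t magic;          /* identifies a valid header */
    uint32_t version;        /* compared against the anti-rollback counter */
    uint32_t image_size;     /* payload bytes that follow the header */
    uint8_t  hash[32];       /* SHA-256 of the payload */
    uint8_t  signature[64];  /* assumed raw ECDSA (r, s) pair over the hash */
} fw_header_t;

/* Cheap structural checks to run before any hashing or signature work. */
bool fw_header_sane(const fw_header_t *h, uint32_t slot_size) {
    if (h->magic != FW_MAGIC) return false;
    if (h->image_size == 0) return false;
    if (slot_size < sizeof(fw_header_t)) return false;
    return h->image_size <= slot_size - sizeof(fw_header_t);
}
```

The size check is written as a subtraction on the slot side so that a hostile image_size cannot overflow the comparison.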

Concept 9: RP2350 RISC-V Hazard3 and Dual-ISA Development

Fundamentals

RP2350 is unique because it can run either Arm Cortex-M33 cores or RISC-V Hazard3 cores. This dual-ISA capability lets you explore two architectures on one chip. Hazard3 is an open-source RISC-V core designed for microcontrollers. Understanding how to target both architectures teaches you about toolchains, ABI differences, and low-level startup code.

The dual-ISA nature of RP2350 makes it a rare teaching platform. It forces you to isolate CPU-specific code, to understand how interrupts are wired, and to think in terms of ABI compatibility. This is a transferable skill for any embedded platform, not just RP2350.

Deep Dive into the Concept

RISC-V is an open ISA with modular extensions. Hazard3 implements the RV32IMAC base with optional extensions, providing a compact, efficient core that is well-suited to microcontrollers. The toolchain for RISC-V uses GCC or LLVM with a different target triple, different startup code, and different ABI. When you switch between Cortex-M33 and Hazard3, you must adjust linker scripts, startup vectors, and interrupt handling. The fundamental memory map of the RP2350 remains the same, but the CPU-specific peripherals (like the NVIC for Arm) are different.

One key difference is how exceptions and interrupts are handled. Cortex-M uses a vector table in memory with fixed entries, and the hardware stacks registers automatically on exception entry. RISC-V uses a trap vector base (mtvec) and software-managed trap handlers that save context themselves. This changes how you write startup code and how you handle faults. Another difference is instruction encoding: Cortex-M33 executes the Thumb-2 instruction set, which mixes 16-bit and 32-bit encodings, while RISC-V uses 32-bit base instructions plus optional 16-bit encodings from the compressed (C) extension. This affects code size and performance profiles.

Dual-ISA development is not just about learning two assembly syntaxes. It is about understanding what parts of the system are CPU-specific and what parts are peripheral-specific. The GPIO, PIO, DMA, and USB blocks are the same regardless of core. Therefore, driver code can be portable if it does not depend on CPU-specific features. This is a powerful lesson in hardware abstraction. Your HAL (hardware abstraction layer) should separate core-specific startup and exception handling from peripheral drivers.
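
The split between core-specific and portable code can be sketched as follows. The cpu_* names and the ring_t type are hypothetical, and the host fallback exists only so the portable part can be unit-tested on a PC. Treat it as one way to draw the boundary, not as the Pico SDK's design.

```c
#include <stdint.h>

/* Core-specific layer: one implementation per ISA, chosen at compile time. */
#if defined(__riscv)
void cpu_irq_disable(void) { __asm__ volatile ("csrci mstatus, 8" ::: "memory"); }
void cpu_irq_enable(void)  { __asm__ volatile ("csrsi mstatus, 8" ::: "memory"); }
#elif defined(__arm__)
void cpu_irq_disable(void) { __asm__ volatile ("cpsid i" ::: "memory"); }
void cpu_irq_enable(void)  { __asm__ volatile ("cpsie i" ::: "memory"); }
#else  /* host build, used only for unit tests */
void cpu_irq_disable(void) {}
void cpu_irq_enable(void)  {}
#endif

/* Portable layer: no ISA-specific code below this line. */
typedef struct {
    volatile uint32_t head, tail;
    uint8_t data[16];
} ring_t;

/* Push one byte; returns 1 on success, 0 if the ring is full.
 * Note: this simple version re-enables interrupts unconditionally,
 * so it must not be called with interrupts already disabled. */
int ring_push(ring_t *r, uint8_t byte) {
    int ok = 0;
    cpu_irq_disable();
    uint32_t next = (r->head + 1u) % sizeof r->data;
    if (next != r->tail) {
        r->data[r->head] = byte;
        r->head = next;
        ok = 1;
    }
    cpu_irq_enable();
    return ok;
}
```

Only the first block changes between Cortex-M33 and Hazard3 builds. The ring buffer compiles unchanged for both cores.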

From a practical standpoint, the RP2350 gives you a real-world environment to compare toolchains, debuggers, and performance. You can run a small benchmark or a PIO-driven project on both cores and measure the differences. This is valuable if you plan to work on RISC-V in industry, because you gain intuition on interrupts, ABI, and toolchain behavior.

The practical differences between Arm and RISC-V show up in toolchains, calling conventions, and debugging workflows. The register sets are different, the ABI defines different argument passing rules, and the exception models are not compatible. This means that inline assembly, startup code, and linker scripts must be tailored per architecture. However, once you isolate those pieces, the rest of the firmware can be shared. This is why HAL design is crucial.

Hazard3 also supports standard RISC-V CSRs for machine-level interrupt control. You must configure mstatus, mie, and mtvec correctly to receive interrupts. Many early bring-up bugs come from forgetting to enable machine interrupts or misconfiguring the trap handler. GDB debugging also differs slightly, as the register names and exception state differ from Cortex-M. The exercise of bringing up both cores teaches you to think in terms of architecture, not vendor.
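
The CSR setup above can be sketched as below, using the bit positions defined by the RISC-V privileged specification. The function names are hypothetical, and the sketch uses direct mode only. On Hazard3, individual external interrupt lines may need additional core-specific enables that are not shown here.

```c
#include <stdint.h>

/* Standard machine-mode bit positions from the RISC-V privileged spec. */
#define MSTATUS_MIE (1u << 3)   /* global machine interrupt enable */
#define MIE_MSIE    (1u << 3)   /* machine software interrupt enable */
#define MIE_MTIE    (1u << 7)   /* machine timer interrupt enable */
#define MIE_MEIE    (1u << 11)  /* machine external interrupt enable */

/* mtvec holds a 4-byte-aligned base in bits [31:2] and MODE in bits [1:0]:
 * 0 = direct (all traps enter at base), 1 = vectored. */
uint32_t mtvec_value(uint32_t handler_base, int vectored) {
    return (handler_base & ~3u) | (vectored ? 1u : 0u);
}

#if defined(__riscv)
/* Install a direct-mode handler, then enable timer and external interrupts.
 * Order matters: set mtvec before turning on mstatus.MIE. */
void enable_machine_interrupts(void (*handler)(void)) {
    uint32_t tvec = mtvec_value((uint32_t)(uintptr_t)handler, 0);
    __asm__ volatile ("csrw mtvec, %0" :: "r"(tvec));
    __asm__ volatile ("csrs mie, %0" :: "r"(MIE_MEIE | MIE_MTIE));
    __asm__ volatile ("csrs mstatus, %0" :: "r"(MSTATUS_MIE));
}
#endif
```

If interrupts never arrive during bring-up, read back all three CSRs in the debugger and compare them against these masks before looking at the peripheral.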

A practical exercise is to build the same peripheral driver for both cores and compare the generated assembly. This helps you see how compilers map C code to different ISAs and teaches you which patterns are portable and which rely on architecture-specific assumptions.

Another useful comparison is interrupt latency: measure how many cycles elapse between a timer event and a GPIO toggle on each core. This teaches you how pipeline depth, toolchain options, and interrupt entry sequences affect real-time behavior.

How This Fits in Projects

Project 9 focuses on RISC-V bare metal programming.
Project 11 and Capstone can use either ISA to compare toolchain output.

Definitions & Key Terms

ISA: Instruction Set Architecture.
RV32IMAC: RISC-V 32-bit base integer ISA (I) plus integer multiply/divide (M), atomics (A), and compressed instructions (C).
ABI: Application Binary Interface (calling conventions, registers).
Trap vector: RISC-V interrupt/exception entry point.

Mental Model Diagram

   Same peripherals
        |
        v
   +-----------------+
   |   RP2350 SoC    |
   +-----------------+
     |           |
     v           v
 Cortex-M33   RISC-V Hazard3

How It Works (Step-by-Step)

Choose core in boot configuration (Arm or RISC-V).
Build firmware with the correct toolchain and startup code.
Initialize vector table or trap handler.
Configure clocks and peripherals (shared across cores).
Run application code.

Minimal Concrete Example

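// RISC-V minimal trap handler (pseudocode)
// read_csr() stands in for a csrr wrapper; a real handler also needs an
// entry stub or interrupt attribute that saves and restores registers.
void trap_handler(void) {
    uint32_t cause = read_csr(mcause);
    // handle interrupt or fault
}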
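
To act on the value read in the handler, you first split mcause into its two fields. The sketch below assumes RV32, where bit 31 is the interrupt flag. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* On RV32, mcause bit 31 is set for interrupts and clear for exceptions.
 * The remaining bits hold the cause code. */
bool mcause_is_interrupt(uint32_t mcause) { return (mcause >> 31) != 0; }
uint32_t mcause_code(uint32_t mcause)     { return mcause & 0x7FFFFFFFu; }

#if defined(__riscv)
/* Read mcause on a real RISC-V target. */
uint32_t read_mcause(void) {
    uint32_t v;
    __asm__ volatile ("csrr %0, mcause" : "=r"(v));
    return v;
}
#endif
```

For example, 0x8000000B decodes as an interrupt with code 11 (machine external interrupt), while 0x00000002 decodes as an exception with code 2 (illegal instruction).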

Common Misconceptions

“RISC-V is automatically faster.” Performance depends on pipeline and memory system.
“Peripherals differ between cores.” The peripherals are the same; only CPU-specific logic differs.
“Toolchains are interchangeable.” Startup and linker scripts are architecture-specific.

Check-Your-Understanding Questions

What changes when you switch from Cortex-M33 to Hazard3?
Why is a HAL important when supporting two ISAs?
What is the role of the trap vector in RISC-V?

Check-Your-Understanding Answers

Toolchain target, startup code, exception/interrupt handling.
It isolates CPU-specific code and keeps peripheral drivers portable.
It is the entry point for interrupts and exceptions.

Real-World Applications

Porting firmware between Arm and RISC-V.
Evaluating architecture tradeoffs for product design.
Building portable embedded libraries.

Where You’ll Apply It

Project 9 and the Capstone.

References

Hazard3 core repository: https://github.com/Wren6991/Hazard3
RP2350 product information portal: https://pip.raspberrypi.com/categories/1120-rp2350

Key Insight

Dual-ISA development teaches you what is truly hardware-specific and what is portable.

Summary

RP2350 offers a rare opportunity to learn both Arm and RISC-V on one board. This makes you a better embedded engineer because you can reason about architectures rather than memorizing one.

Homework/Exercises to Practice the Concept

Build a “Hello GPIO” program for both Arm and RISC-V and compare binary size.
Implement a timer interrupt on both cores and compare latency.
Port a simple PIO example between the two cores using a HAL.

Solutions to the Homework/Exercises

Use different toolchain targets and compare section sizes with each toolchain's size utility (for example arm-none-eabi-size).
Use a scope to measure interrupt response time for each core.
Keep peripheral drivers unchanged and swap only startup code.

Glossary

APB/AHB: Peripheral and system buses used to connect cores and peripherals.
Boot ROM: On-chip ROM that performs initial boot and flash setup.
DREQ: DMA request signal from a peripheral.
FIFO: First-in-first-out buffer for PIO or inter-core communication.
GPIO: General-purpose input/output pin.
IO_BANK: Logic block for GPIO function selection and interrupts.
PIO: Programmable I/O, a deterministic state machine for bit-level protocols.
PADS: Electrical configuration for pins (pull-ups, drive strength).
SIO: Single-cycle I/O, fast GPIO and synchronization block.
TrustZone: Arm security extension separating secure and non-secure execution.
XIP: Execute-in-place from external flash.

Why RP2040/RP2350 Matters

Modern embedded systems need more than blinking LEDs: they need deterministic timing, flexible I/O, and secure update paths. RP2040 and RP2350 deliver that at a price point usually reserved for far simpler chips. RP2040 offers dual cores, a large SRAM, and PIO for custom protocols. RP2350 adds more SRAM, security features, and a dual-ISA architecture. This combination makes these chips excellent for learning and for production-grade products that need custom I/O or secure boot.

Real-world stats and impact:

Raspberry Pi states RP2040 will remain in production until at least January 2041, making it viable for long-lived products.
RP2350A is announced at an aggressive price point (around $0.80 in volume), which enables cost-sensitive designs with advanced security.
In FY2024, Raspberry Pi Holdings reported 5.7 million microcontroller units shipped, indicating strong adoption in commercial products.

Context & Evolution

Before RP2040, low-cost microcontrollers often came with limited documentation and vendor-specific quirks. Raspberry Pi took a different path: a fully documented design, a large open ecosystem, and a focus on education. RP2350 extends this approach by adding security and RISC-V support, acknowledging industry demand for secure and open architectures.

Old Model (Closed MCU)           New Model (RP2040/RP2350)
+---------------------+          +-------------------------+
| Vendor SDK only     |          | Full docs + open tools  |
| Limited IO          |          | PIO for custom protocols|
| Proprietary security|          | TrustZone + OTP + SHA   |
+---------------------+          +-------------------------+

Concept Summary Table

Concept	What You Must Internalize	Key Projects
Boot ROM, XIP, Memory Map	Boot flow, vector table, flash layout, XIP timing	1, 8, 11, Capstone
Clocking, Reset, Power	Clock tree, PLLs, resets, low-power modes	1, 2, 5, 7, 10
GPIO, Pads, SIO, IRQs	Pin muxing, pad settings, fast I/O, interrupts	1, 2, 4, 7, 12
PIO Microarchitecture	State machines, instruction timing, FIFO, side-set	2, 4, 7, 12
DMA Pipelines	DREQ pacing, chaining, ring buffers	4, 5, 7, 12
Dual-Core + RT	Spinlocks, FIFO, scheduling, memory ordering	4, 5, 6, 7, 10
USB Device/Host	Descriptors, endpoints, enumeration	3, 10, 11
Security + TrustZone	Secure boot, OTP, SHA, TrustZone	8, 11, Capstone
RISC-V Hazard3	Toolchains, traps, dual-ISA portability	9, Capstone

Project-to-Concept Map

Project	Concepts Applied
1. Bare-Metal Blinky	Boot ROM/XIP, Clocks, GPIO/SIO
2. PIO LED Controller	PIO, GPIO/PADS, Clocks
3. USB HID Keyboard	USB, Clocks, DMA (optional)
4. Logic Analyzer	PIO, DMA, Dual-core, GPIO
5. Audio Synthesizer	DMA, Dual-core, Clocks
6. Real-Time Scheduler	Dual-core/RT, Interrupts
7. VGA Display	PIO, DMA, Clocks, Dual-core
8. Secure Boot	Security/TrustZone, Boot ROM/XIP
9. RISC-V Bare Metal	RISC-V Hazard3, Boot ROM/XIP
10. USB Host	USB, Dual-core, Clocks
11. Custom Bootloader	Boot ROM/XIP, Security, DMA
12. I2C/SPI Analyzer	PIO, DMA, GPIO
Capstone Board	All concepts

Deep Dive Reading by Concept

Concept	Primary References	Book Support
Boot ROM/XIP	RP2040 docs, UF2 spec	“Computer Systems: A Programmer’s Perspective” Ch. 1-3; “Making Embedded Systems” Ch. 2-4
Clocking/Reset	RP2040/RP2350 docs	“Making Embedded Systems” Ch. 5 (Timing)
GPIO/SIO	RP2040 docs	“The Art of Electronics” (pin drive and signal integrity)
PIO	Pico SDK PIO docs	“RP2040 Assembly Language Programming” Ch. 10-12
DMA	Pico SDK DMA docs	“Making Embedded Systems” Ch. 9 (DMA section)
Dual-Core/RT	Pico SDK sync docs	“Operating Systems: Three Easy Pieces” Ch. 28-30
USB	TinyUSB docs	“USB Complete” Ch. 1-6, 11
Security/TrustZone	Arm TrustZone docs, RP2350 portal	“Serious Cryptography” Ch. 7
RISC-V	Hazard3 docs	“Computer Organization and Design RISC-V Edition” Ch. 2-3

Quick Start

If you are overwhelmed, do this in the first 48 hours:

Day 1:

Skim Concepts 1-3 (boot, clocks, GPIO) and build Project 1 (Blinky).
Use a scope or logic analyzer to confirm your GPIO timing.

Day 2:

Read Concept 4 (PIO).
Build Project 2 (WS2812) and verify timing with a scope.

After that, follow a learning path below.

Recommended Learning Paths

1) Beginner Embedded Path

Projects: 1 -> 2 -> 3 -> 4
Focus: boot flow, PIO timing, USB enumeration, capture pipelines

2) Systems and Performance Path

Projects: 4 -> 5 -> 6 -> 7
Focus: DMA pipelines, multicore scheduling, timing analysis

3) Security and Architecture Path

Projects: 8 -> 9 -> 11 -> Capstone
Focus: secure boot, dual-ISA development, update strategies

Success Metrics

You can boot a bare-metal program without the SDK and explain every step.
You can design a PIO program and validate its waveform timing on a scope.
You can stream data using DMA without dropouts or CPU stalls.
You can build a USB HID device and debug enumeration errors.
You can implement a signed firmware update flow and explain the trust model.
You can build and debug a custom RP2040/RP2350 dev board.

Appendix: Tooling and Debugging Cheatsheet

picotool: flash, erase, and inspect device state.
OpenOCD + GDB: step through boot code and inspect registers.
Logic analyzer: validate PIO timing and capture protocols.
USB sniffing: use usbmon on Linux or a USB analyzer.
Scope: verify clock, PWM, and VGA timing.

Project Overview Table

#	Project	Difficulty	Time	Coolness	Key Learning
1	Bare-Metal Blinky	Advanced	1-2 weeks	Level 4	Boot process, clocks, memory map
2	PIO LED Controller	Intermediate	Weekend	Level 5	PIO basics, timing, DMA
3	USB HID Keyboard	Advanced	1-2 weeks	Level 4	USB protocol, TinyUSB
4	Logic Analyzer	Advanced	2-4 weeks	Level 5	PIO capture, DMA, multicore
5	Audio Synthesizer	Advanced	2-4 weeks	Level 5	I2S, DMA double-buffer, DSP
6	Real-Time Scheduler	Expert	3-6 weeks	Level 4	Context switching, RT scheduling
7	VGA Display	Expert	3-6 weeks	Level 5	Video timing, multi-SM sync
8	Secure Boot (RP2350)	Master	2-4 weeks	Level 4	Cryptography, OTP, TrustZone
9	RISC-V Programming	Expert	2-3 weeks	Level 5	Dual ISA, toolchains
10	USB Host	Expert	2-3 weeks	Level 4	USB host mode, enumeration
11	Custom Bootloader	Expert	3-4 weeks	Level 4	Flash, A/B updates, rollback
12	I2C/SPI Analyzer	Advanced	2-3 weeks	Level 4	Protocol decoding, debugging
13	Capstone Board	Master	2-3 months	Level 5	Hardware design, full-stack embedded

Project List

Project 1: Bare-Metal Blinky (No SDK)

Main Programming Language: C
Alternative Programming Languages: Assembly, Rust
Difficulty: Advanced
Knowledge Area: Boot flow, clocks, GPIO, memory map

Real World Outcome

You can flash a raw binary and watch the Pico LED blink at exactly 1 Hz, with a timing derived from your own PLL configuration. You can explain every register write from power-on to main(). The LED blink rate does not drift because your clock setup is correct.

Example output and evidence:

$ arm-none-eabi-objcopy -O binary blinky.elf blinky.bin
$ ./crc32_patch blinky.bin
Patching blinky.bin: CRC32 = 0x5A34B72D

$ picotool load blinky.bin
Loading into flash: [==============================] 100%
The device was rebooted to start the application.

# On a scope you see a clean 1 Hz square wave on GPIO25.

The Core Question You’re Answering

“What exactly happens between power-on and the first instruction in my firmware?”

Concepts You Must Understand First

Boot ROM and stage-2 boot process (Concept 1)
Clock tree and PLL setup (Concept 2)
GPIO, pads, and SIO (Concept 3)

Questions to Guide Your Design

Where is the vector table located and how is it constructed?
How do you configure XOSC and PLL_SYS without SDK helpers?
How do you ensure GPIO25 is correctly configured as SIO output?
What is the minimal linker script to place .text and .data correctly?

Thinking Exercise

Draw the full boot pipeline from ROM to your C main(). Label every memory region touched.

The Interview Questions They’ll Ask

Explain the RP2040 two-stage boot process.
Why does XIP require a stage-2 boot block?
How do you configure clocks on a bare-metal MCU?
What is the difference between IO_BANK and SIO?
Why is volatile required for memory-mapped I/O?

Hints in Layers

Hint 1: Start by reading the RP2040 memory map and locate the boot ROM.
Hint 2: Configure XOSC first, then PLL_SYS, then switch CLK_SYS.
Hint 3: Use SIO registers for fast GPIO toggling.
Hint 4: Keep the linker script minimal: .text in flash, .data in SRAM.
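
Hint 3 can be sketched as below for RP2040. The base addresses and offsets are the RP2040 values (RP2350 uses a different IO_BANK0 base), and the helper names are hypothetical. The sketch assumes IO_BANK0 and PADS_BANK0 have already been taken out of reset, which is not shown.

```c
#include <stdint.h>

/* RP2040 register locations. */
#define IO_BANK0_BASE     0x40014000u
#define SIO_BASE          0xd0000000u
#define SIO_GPIO_OUT_XOR  0x01cu  /* write 1 bits to toggle outputs */
#define SIO_GPIO_OE_SET   0x024u  /* write 1 bits to enable outputs */
#define FUNCSEL_SIO       5u      /* IO_BANK0 function select for SIO */

/* volatile forces every access to reach the hardware register. */
#define REG32(addr) (*(volatile uint32_t *)(uintptr_t)(addr))

/* Each pin has an 8-byte STATUS/CTRL pair; CTRL sits at offset 4. */
uint32_t gpio_ctrl_addr(unsigned pin) {
    return IO_BANK0_BASE + 8u * pin + 4u;
}

/* Route the pin to SIO and make it an output (target hardware only). */
void led_init(unsigned pin) {
    REG32(gpio_ctrl_addr(pin)) = FUNCSEL_SIO;
    REG32(SIO_BASE + SIO_GPIO_OE_SET) = 1u << pin;
}

/* Toggle with a single write, no read-modify-write needed. */
void led_toggle(unsigned pin) {
    REG32(SIO_BASE + SIO_GPIO_OUT_XOR) = 1u << pin;
}
```

For the Pico's LED on GPIO25, gpio_ctrl_addr(25) evaluates to 0x400140cc. Checking that address against the datasheet is a quick way to validate your own register definitions.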

Books That Will Help

Common Pitfalls & Debugging

Problem: “LED does not blink”

Why: PLL not configured or GPIO not mapped to SIO.
Fix: Verify clock registers and IO_BANK function select.
Quick test: Toggle GPIO via SIO and check on scope.

Problem: “Boot hangs before main”

Why: Incorrect vector table or bad stage-2 header.
Fix: Validate vector table address and CRC.
Quick test: Halt CPU with debugger and inspect PC.

Definition of Done

Code boots without the Pico SDK.
LED blinks at the expected frequency (verified with a scope).
You can explain each step from ROM to main().
Linker script places .text in flash and .data in SRAM.

Project 2: PIO LED Strip Controller (WS2812B)

Main Programming Language: C + PIO assembly
Difficulty: Intermediate
Knowledge Area: PIO timing, GPIO, DMA

Real World Outcome

A 60-LED strip displays smooth animations at full brightness with no flicker. Your PIO program outputs an 800 kHz WS2812 waveform with correct timing. You can switch between CPU-driven and DMA-driven feed and see the difference in jitter on a scope.

Example output:

$ picotool load build/ws2812.uf2 -f

WS2812 Controller v1.0
- PIO SM0 @ 8 MHz
- 60 LEDs
- DMA: enabled

[OK] Animation: rainbow sweep
[OK] Frame rate: 120 fps

The Core Question You’re Answering

“How do I generate nanosecond-accurate waveforms without burning CPU cycles?”

Concepts You Must Understand First

PIO instruction timing (Concept 4)
GPIO and pad drive strength (Concept 3)
Clock configuration and dividers (Concept 2)

Questions to Guide Your Design

What clock divider yields the exact WS2812 bit timings?
How do you encode 24-bit RGB data into PIO FIFO format?
How do you handle reset timing between frames?
When does DMA improve jitter compared to CPU feeding?

Thinking Exercise

Sketch the WS2812 bit timing (T0H/T0L/T1H/T1L). Then map each timing segment to PIO instruction delays.
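
Once you know how many PIO cycles one bit takes, the clock divider follows from arithmetic. The sketch below assumes a PIO program that spends 10 cycles per bit and a divider with a 16-bit integer part and an 8-bit fractional part. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint16_t div_int;   /* integer part of the divider */
    uint8_t  div_frac;  /* fractional part, in 1/256 steps */
} pio_clkdiv_t;

/* Divider = sys_hz / (bit_hz * cycles_per_bit), rounded to 1/256. */
pio_clkdiv_t ws2812_clkdiv(uint32_t sys_hz, uint32_t bit_hz,
                           uint32_t cycles_per_bit) {
    uint64_t sm_hz = (uint64_t)bit_hz * cycles_per_bit;
    uint64_t div_x256 = ((uint64_t)sys_hz * 256u + sm_hz / 2u) / sm_hz;
    pio_clkdiv_t d = { (uint16_t)(div_x256 >> 8), (uint8_t)(div_x256 & 0xFFu) };
    return d;
}
```

At 125 MHz and 800 kHz the state machine must run at 8 MHz, so the divider is 15.625 (integer 15, fraction 160/256). A fractional divider adds a small amount of cycle-to-cycle jitter, so a system clock that divides evenly, such as 120 MHz, gives a cleaner waveform.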

The Interview Questions They’ll Ask

Why is PIO better than bit-banging for LED timing?
How do you compute PIO timing from the system clock?
What is the role of the PIO FIFO in streaming LED data?
How would you scale to 1000 LEDs?

Hints in Layers

Hint 1: Use side-set to toggle the data pin without extra instructions.
Hint 2: Precompute a packed buffer and let DMA feed the FIFO.
Hint 3: Use a reset delay by adding idle cycles after the last bit.
Hint 4: Confirm timing on a scope before trusting the output.

Books That Will Help

Common Pitfalls & Debugging

Problem: “LED colors are wrong”

Why: Data order (GRB vs RGB) or bit order is incorrect.
Fix: Adjust data packing and verify with single LED test.
Quick test: Send a single red pixel and observe actual color.
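
A packing helper makes the byte order explicit. This sketch assumes the state machine shifts data out MSB-first with a 24-bit pull threshold, so the 24 colour bits must sit in the top of the 32-bit FIFO word. The function name is hypothetical.

```c
#include <stdint.h>

/* WS2812 expects green first, then red, then blue.
 * The final shift left-aligns the 24 bits in the 32-bit FIFO word. */
uint32_t ws2812_pack_grb(uint8_t r, uint8_t g, uint8_t b) {
    uint32_t grb = ((uint32_t)g << 16) | ((uint32_t)r << 8) | (uint32_t)b;
    return grb << 8;
}
```

A pure red pixel packs to 0x00FF0000. If that word lights the LED green, the strip or the packing order is RGB rather than GRB.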

Problem: “Flicker at high brightness”

Why: Timing jitter or insufficient reset interval.
Fix: Add reset delay and validate timing on scope.
Quick test: Reduce clock speed and compare stability.

Definition of Done

PIO program generates correct WS2812 timing (validated on scope).
LED animations run without flicker for 10 minutes.
DMA feed works and frees CPU time.
You can explain the mapping from timing diagram to instructions.

Project 3: USB HID Keyboard Emulator

Main Programming Language: C (TinyUSB)
Difficulty: Advanced
Knowledge Area: USB descriptors, HID reports, timing

Real World Outcome

When you plug the Pico into a PC, it enumerates as a USB keyboard and types a scripted sequence. Your device appears in the host device manager and passes HID compliance checks.

Example output:

$ picotool load build/hid_keyboard.uf2 -f

USB HID Keyboard Ready
- VID:PID 0xCafe:0x4000
- Report interval: 10 ms
- Typing test phrase...

HELLO_FROM_RP2040

The Core Question You’re Answering

“How does a USB host decide what a device is, and how do I convince it?”

Concepts You Must Understand First

USB enumeration and descriptors (Concept 7)
Clock accuracy (Concept 2)
DMA/interrupt scheduling basics (Concept 5/6)

Questions to Guide Your Design

What descriptors are required for a HID keyboard?
How do you structure HID reports for key press and release?
How do you handle USB suspend and resume?
What timing constraints exist for USB control transfers?
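
For the report-structure question, the standard 8-byte boot keyboard layout is a useful starting point. The usage codes come from the USB HID Usage Tables. The type and helper names are hypothetical and are not TinyUSB identifiers.

```c
#include <stdint.h>
#include <string.h>

#define HID_MOD_LEFT_SHIFT 0x02u  /* modifier bit 1 */
#define HID_KEY_A          0x04u  /* usage code for 'a'; 'b'..'z' follow */

typedef struct {
    uint8_t modifiers;    /* one bit per Ctrl/Shift/Alt/GUI key */
    uint8_t reserved;     /* always zero */
    uint8_t keycodes[6];  /* up to six simultaneously pressed keys */
} hid_kbd_report_t;

/* Build a "key down" report for one ASCII letter. */
hid_kbd_report_t hid_press_letter(char c) {
    hid_kbd_report_t r;
    memset(&r, 0, sizeof r);
    if (c >= 'a' && c <= 'z') {
        r.keycodes[0] = (uint8_t)(HID_KEY_A + (uint8_t)(c - 'a'));
    } else if (c >= 'A' && c <= 'Z') {
        r.modifiers = HID_MOD_LEFT_SHIFT;
        r.keycodes[0] = (uint8_t)(HID_KEY_A + (uint8_t)(c - 'A'));
    }
    return r;
}

/* An all-zero report tells the host that every key is released. */
hid_kbd_report_t hid_release_all(void) {
    hid_kbd_report_t r;
    memset(&r, 0, sizeof r);
    return r;
}
```

Typing a character means sending the press report followed by the release report. If you skip the release, the host treats the key as held and starts auto-repeat.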

Thinking Exercise

Write a minimal HID report descriptor for one key and explain each byte.

The Interview Questions They’ll Ask

What happens during USB enumeration?
What is an endpoint, and why is endpoint 0 special?
How does HID differ from CDC or vendor-specific classes?
What causes USB enumeration failures on MCUs?

Hints in Layers

Hint 1: Start from TinyUSB HID examples and modify descriptors.
Hint 2: Validate descriptors with a USB descriptor parser.
Hint 3: Ensure the USB clock is 48 MHz, within the full-speed tolerance of ±0.25%.
Hint 4: Add logging for SETUP requests to debug enumeration.

Books That Will Help

Common Pitfalls & Debugging

Problem: “Device not recognized”

Why: Incorrect descriptor length or class codes.
Fix: Validate descriptor bytes and lengths.
Quick test: Compare with TinyUSB reference descriptor.

Problem: “Random disconnects”

Why: USB clock drift or long interrupt masking.
Fix: Verify clock config and reduce ISR latency.
Quick test: Use a scope to confirm 48 MHz stability.

Definition of Done

Device enumerates consistently on two different PCs.
HID report sends correct key presses.
USB descriptors validated with a tool.
No disconnects during 10-minute test.

Project 4: Logic Analyzer with PIO

Main Programming Language: C + PIO assembly
Difficulty: Advanced
Knowledge Area: PIO capture, DMA, multicore, protocol analysis

Real World Outcome

A desktop tool shows captured digital waveforms in real time. You connect the Pico to a target bus and capture 1-8 channels at up to 10 MHz. The device streams captures to a host PC using USB or UART, and you can export data to Saleae/Sigrok formats.

Example output:

$ picotool load build/logic_analyzer.uf2 -f

Logic Analyzer v1.0
- Channels: 8
- Sample rate: 10 MHz
- Buffer: 64 KB ring

[Capture] Trigger: rising edge on CH0
[Capture] 4096 samples stored
[Host] Exported to capture.sr

The Core Question You’re Answering

“How do I capture fast digital signals without dropping samples?”

Concepts You Must Understand First

PIO capture timing (Concept 4)
DMA ring buffers (Concept 5)
Dual-core buffer handling (Concept 6)
GPIO pad configuration (Concept 3)

Questions to Guide Your Design

How do you align sample timing with the system clock?
How large should your buffers be for a 1-second capture?
How do you implement trigger detection in PIO vs CPU?
How will you stream data to the host without blocking capture?

Thinking Exercise

Design a sampling pipeline: PIO -> FIFO -> DMA -> ring buffer -> host. Label each rate and buffer size.
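
Labeling the rates is mostly arithmetic, and a small budget calculator keeps it honest. The sketch assumes samples are packed with one bit per channel. The function names are hypothetical.

```c
#include <stdint.h>

/* Bytes per second produced when sampling `channels` pins,
 * assuming one bit per channel per sample. */
uint64_t capture_bytes_per_sec(uint32_t sample_hz, uint32_t channels) {
    return ((uint64_t)sample_hz * channels + 7u) / 8u;
}

/* Microseconds until a buffer of the given size is full. */
uint64_t buffer_fill_us(uint32_t buffer_bytes, uint64_t bytes_per_sec) {
    return (uint64_t)buffer_bytes * 1000000u / bytes_per_sec;
}
```

Eight channels at 10 MHz produce 10 MB/s, so a 64 KB buffer fills in about 6.5 ms. Full-speed USB signals at 12 Mbit/s, which is at most 1.5 MB/s before protocol overhead, so a capture at this rate has to be triggered and buffered rather than streamed continuously.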

The Interview Questions They’ll Ask

Why use PIO instead of GPIO polling?
How do you prevent DMA overruns?
What are the limits of USB or UART streaming?
How do you implement a trigger in hardware vs software?

Hints in Layers

Hint 1: Use PIO to sample pins into FIFO at fixed rate.
Hint 2: Configure DMA in ring buffer mode.
Hint 3: Use core 1 to drain buffers and format data.
Hint 4: Implement a trigger by comparing sample patterns in PIO.

Books That Will Help

Common Pitfalls & Debugging

Problem: “Captured data is corrupted”

Why: DMA alignment or buffer overruns.
Fix: Align the ring buffer to its own power-of-two size (DMA address wrapping requires this) and increase ring size.
Quick test: Capture a known square wave and verify pattern.

Problem: “Trigger never fires”

Why: PIO trigger logic incorrect or pin mapping wrong.
Fix: Test with a simple edge trigger first.
Quick test: Trigger on a fixed GPIO pulse.

Definition of Done

Can capture 8 channels at 10 MHz without dropping samples.
Trigger works on at least one channel edge.
Data can be exported to a PC and visualized.
Buffer overrun detection and reporting works.

Project 5: Dual-Core Audio Synthesizer

Main Programming Language: C
Difficulty: Advanced
Knowledge Area: DMA audio, timing, DSP, dual-core

Real World Outcome

You hear clean audio generated by the RP2040/RP2350 with no clicks. The system can play multiple oscillators, envelopes, and filters in real time. One core generates samples, the other handles UI and MIDI input. DMA streams audio to a DAC or PWM output.

Example output:

$ picotool load build/synth.uf2 -f

Synth v1.0
- Sample rate: 48 kHz
- Voices: 4
- DMA: double-buffered

[OK] Note on: C4
[OK] Filter cutoff: 800 Hz

The Core Question You’re Answering

“How do I stream continuous audio without glitching while doing other work?”

Concepts You Must Understand First

DMA double buffering (Concept 5)
Clocking and timing (Concept 2)
Dual-core partitioning (Concept 6)

Questions to Guide Your Design

What sample rate is feasible given CPU and DMA constraints?
How large should each audio buffer be to avoid underruns?
Which core should handle UI or MIDI input?
How do you implement a simple low-pass filter efficiently?
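
For the last question, a one-pole low-pass filter in fixed point is one cheap option on cores without an FPU. This is a sketch under assumptions: Q15 coefficients, 16-bit samples, and hypothetical names. It is not tuned for audio quality.

```c
#include <stdint.h>

typedef struct {
    int32_t y;          /* filter state (last output) */
    int32_t alpha_q15;  /* smoothing factor, 0..32767 represents 0..1 */
} lpf_t;

/* One-pole IIR: y += alpha * (x - y).
 * Division is used instead of a shift so negative differences
 * round toward zero in a well-defined way. */
int16_t lpf_step(lpf_t *f, int16_t x) {
    int32_t diff = (int32_t)x - f->y;
    f->y += (diff * f->alpha_q15) / 32768;
    return (int16_t)f->y;
}
```

An alpha of 3277 (about 0.1) corresponds roughly to an 800 Hz cutoff at a 48 kHz sample rate. Because of integer truncation the output settles a few counts short of a constant input, which is usually inaudible but worth knowing when you write tests.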

Thinking Exercise

Design a pipeline: oscillator -> mixer -> filter -> buffer -> DMA -> DAC. Label each stage with CPU cost.

The Interview Questions They’ll Ask

What causes audio clicks and pops in DMA streaming?
How do you implement double buffering?
Why is timing jitter audible?
How do you compute a simple IIR filter?

Hints in Layers

Hint 1: Start with a single oscillator and fixed frequency.
Hint 2: Add DMA double buffering and verify stable output.
Hint 3: Move UI/MIDI handling to the second core.
Hint 4: Add a simple ADSR envelope once timing is stable.

Books That Will Help

Common Pitfalls & Debugging

Problem: “Clicks in audio”

Why: Buffer underruns or DMA timing mismatch.
Fix: Increase buffer size and use double buffering.
Quick test: Output a constant tone and check for interruptions.

Problem: “CPU overload”

Why: Too many voices or heavy filters on one core.
Fix: Reduce voices or offload control logic to core 1.
Quick test: Measure CPU time per buffer.

Definition of Done

48 kHz audio plays continuously for 10 minutes.
DMA double buffering runs without underrun errors.
Two cores are used with clear task separation.
At least one filter or envelope is implemented.

Project 6: Multicore Real-Time Scheduler

Main Programming Language: C
Difficulty: Expert
Knowledge Area: RT scheduling, interrupts, dual-core

Real World Outcome

You build a tiny scheduler that runs periodic tasks with measured jitter. One task toggles GPIO at a precise interval, another logs performance metrics, and a third handles background I/O. You can prove timing with logs and scope measurements.

Example output:

Scheduler v1.0
Task A (1 ms): jitter max 6 us
Task B (10 ms): jitter max 12 us
Task C (100 ms): jitter max 40 us

The Core Question You’re Answering

“How do I guarantee time-critical tasks meet deadlines on a microcontroller?”

Concepts You Must Understand First

Dual-core synchronization (Concept 6)
Timer interrupts and priority
Memory ordering and critical sections

Questions to Guide Your Design

Is your scheduler preemptive or cooperative?
How do you measure jitter and deadline misses?
How do you assign priorities to tasks?
What happens if a task overruns its budget?

Thinking Exercise

Define three tasks with different deadlines and compute a schedule. Then simulate a worst-case overrun.

The Interview Questions They’ll Ask

What is jitter and how do you measure it?
What is priority inversion and how do you avoid it?
When would you use a cooperative scheduler vs preemptive?
How do you debug missed deadlines?

Hints in Layers

Hint 1: Start with a simple timer tick and a task list.
Hint 2: Record timestamps at task start and end.
Hint 3: Pin time-critical task to core 1.
Hint 4: Use a watchdog to reset on missed deadlines.
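
Hint 1 can be sketched as a cooperative, tick-driven task table. The type and function names are hypothetical, and the sketch leaves out jitter measurement and overrun handling, which the project asks you to add.

```c
#include <stdint.h>

typedef void (*task_fn)(void);

typedef struct {
    task_fn  run;          /* task body, may be NULL in tests */
    uint32_t period_ms;    /* how often the task should run */
    uint32_t next_due_ms;  /* absolute time of the next release */
    uint32_t run_count;    /* number of releases so far */
} task_t;

/* Call once per millisecond tick with the current time. */
void scheduler_tick(task_t *tasks, unsigned n, uint32_t now_ms) {
    for (unsigned i = 0; i < n; i++) {
        /* The signed difference stays correct when the counter wraps. */
        if ((int32_t)(now_ms - tasks[i].next_due_ms) >= 0) {
            if (tasks[i].run) tasks[i].run();
            tasks[i].run_count++;
            /* Advance from the previous deadline, not from now_ms,
             * so a late release does not accumulate drift. */
            tasks[i].next_due_ms += tasks[i].period_ms;
        }
    }
}
```

With periods of 1, 10, and 100 ms, the first 100 ticks release the tasks 100, 10, and 1 times. Table order acts as a fixed priority, so place the most time-critical task first.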

Books That Will Help

Common Pitfalls & Debugging

Problem: “Jitter spikes”

Why: Long ISRs or shared locks.
Fix: Minimize ISR work and use lock-free queues.
Quick test: Disable optional logging and compare jitter.

Problem: “Task starvation”

Why: Improper priority or overrun handling.
Fix: Add time budgets and yield points.
Quick test: Log task execution counts.

Definition of Done

Scheduler runs at least 3 periodic tasks.
Jitter is measured and logged.
Deadline misses are detected and handled.
Core separation is used for time-critical work.

Project 7: VGA Display via PIO

Main Programming Language: C + PIO assembly
Difficulty: Expert
Knowledge Area: Video timing, PIO, DMA, multicore

Real World Outcome

You connect a VGA monitor and see a stable 640x480 display generated by the RP2040/RP2350. Text and simple graphics render without jitter. Sync pulses are correct and stable, confirmed on a scope.

Example output:

VGA v1.0
- Resolution: 640x480@60Hz
- Pixel clock: 25.175 MHz (approx)
- DMA: enabled

[OK] Sync lock achieved
[OK] Frame rate stable

The Core Question You’re Answering

“How do I generate a video signal with strict timing on a microcontroller?”

Concepts You Must Understand First

PIO timing and multi-SM sync (Concept 4)
DMA streaming (Concept 5)
Clock configuration (Concept 2)
Dual-core partitioning (Concept 6)

Questions to Guide Your Design

How do you generate HSYNC and VSYNC pulses with PIO?
How do you align pixel data with sync timing?
What buffer size is required for a single scanline?
How do you manage memory bandwidth for video?

Thinking Exercise

Draw a VGA timing diagram and map each timing segment to PIO instructions and DMA transfers.
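
Before drawing the diagram, it helps to write the timing segments down as data. The sketch below uses the standard 640x480@60Hz figures (800 pixel clocks per line, 525 lines per frame). The type and function names are hypothetical.

```c
#include <stdint.h>

/* One axis of a video mode: pixels (horizontal) or lines (vertical). */
typedef struct {
    uint16_t active;
    uint16_t front_porch;
    uint16_t sync;
    uint16_t back_porch;
} axis_timing_t;

/* Standard 640x480@60Hz timing. */
static const axis_timing_t VGA_H = {640, 16, 96, 48};  /* pixel clocks */
static const axis_timing_t VGA_V = {480, 10, 2, 33};   /* scanlines */

/* Total length of one axis, including blanking. */
uint32_t axis_total(const axis_timing_t *a)
{
    return (uint32_t)a->active + a->front_porch + a->sync + a->back_porch;
}

/* Refresh rate in millihertz for a given pixel clock in Hz. */
uint32_t refresh_millihertz(uint32_t pixel_clock_hz)
{
    uint64_t clocks_per_frame =
        (uint64_t)axis_total(&VGA_H) * axis_total(&VGA_V);
    return (uint32_t)(((uint64_t)pixel_clock_hz * 1000u) / clocks_per_frame);
}
```

A 25.175 MHz pixel clock gives 59.940 Hz. Assuming a 125 MHz CLK_SYS and an integer divider of 5, the pixel clock becomes 25.0 MHz and the refresh rate about 59.52 Hz, a deviation many monitors tolerate, although that is worth verifying on the display you actually use.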

The Interview Questions They’ll Ask

What are HSYNC and VSYNC, and why are they required?
How do you generate a pixel clock without dedicated hardware?
How does DMA help with video throughput?
What limits the maximum resolution on RP2040?

Hints in Layers

Hint 1: Use separate state machines for sync and data.
Hint 2: Generate a fixed pixel clock from CLK_SYS with a divider.
Hint 3: Use DMA to stream scanlines to PIO FIFO.
Hint 4: Reduce color depth to fit memory bandwidth.
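
The color-depth tradeoff in Hint 4 comes down to arithmetic. The sketch below computes framebuffer size, scanline size, and the byte rate DMA must sustain during active video. The function names are hypothetical; the SRAM sizes are the 264 KB (RP2040) and 520 KB (RP2350) on-chip totals.

```c
#include <stdint.h>

#define RP2040_SRAM_BYTES (264u * 1024u)
#define RP2350_SRAM_BYTES (520u * 1024u)

/* Bytes needed to hold a full frame at the given bits per pixel. */
uint32_t framebuffer_bytes(uint32_t width, uint32_t height, uint32_t bpp)
{
    return (width * height * bpp + 7u) / 8u;
}

/* Bytes needed to hold one scanline of active pixels. */
uint32_t scanline_bytes(uint32_t width, uint32_t bpp)
{
    return (width * bpp + 7u) / 8u;
}

/* Bytes per second consumed while active pixels are being shifted out. */
uint32_t stream_bytes_per_sec(uint32_t pixel_clock_hz, uint32_t bpp)
{
    return (uint32_t)(((uint64_t)pixel_clock_hz * bpp) / 8u);
}
```

At 8 bits per pixel a full 640x480 frame needs 307,200 bytes, more than the RP2040's entire SRAM, while 4 bits per pixel needs 153,600 bytes. This is why many designs render one scanline at a time or drop color depth instead of keeping a full-depth framebuffer.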

Books That Will Help

Common Pitfalls & Debugging

Problem: “No display or unstable sync”

Why: Incorrect timing or sync pulse widths.
Fix: Validate timing with a scope and adjust PIO delays.
Quick test: Output only sync signals and measure.

Problem: “Image tearing”

Why: DMA underruns or buffer not filled in time.
Fix: Increase buffer or dedicate a core to rendering.
Quick test: Render a static image and check stability.

Definition of Done

VGA display locks and shows a stable image for 5 minutes.
HSYNC/VSYNC pulses match timing spec.
DMA streaming runs without underruns.
At least one simple graphic demo works.

Project 8: RP2350 Secure Boot Implementation

Main Programming Language: C
Difficulty: Master
Knowledge Area: Security, boot chain, TrustZone

Real World Outcome

You boot signed firmware images. A corrupted or unsigned image is rejected, and the system falls back to a recovery image. Secure and non-secure code regions are separated, and image hashing uses the hardware SHA-256 accelerator.

Example output:

Secure Boot v1.0
- Slot A: valid signature
- Slot B: invalid signature

[BOOT] Verified Slot A
[BOOT] TrustZone configured
[BOOT] Launching non-secure app

The Core Question You’re Answering

“How do I guarantee only trusted firmware runs on my device?”

Concepts You Must Understand First

Secure boot and TrustZone (Concept 8)
Boot ROM and flash layout (Concept 1)
DMA and hashing (Concept 5)

Questions to Guide Your Design

What is the root of trust and where is it stored?
What metadata do you include in the firmware header?
How will you handle rollback on failed verification?
How do you partition secure vs non-secure memory?

Thinking Exercise

Design a firmware header that includes size, version, hash, and signature. Explain how each field is used.
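
As a starting point for this exercise, here is one possible header layout with the structural checks a bootloader might run before any cryptography. Everything here is a hypothetical sketch: the magic value, field sizes, and function names are illustrative and do not describe the RP2350 boot ROM's own image format. Signature verification is deliberately left to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FW_MAGIC 0x46573031u   /* hypothetical marker value */

typedef struct {
    uint32_t magic;            /* identifies a header at all */
    uint32_t image_size;       /* bytes of image body after the header */
    uint32_t version;          /* compared against the rollback floor */
    uint8_t  hash[32];         /* SHA-256 of the image body */
    uint8_t  signature[64];    /* signature over the fields above */
} fw_header_t;

typedef enum {
    FW_OK, FW_BAD_MAGIC, FW_BAD_SIZE, FW_ROLLBACK, FW_BAD_HASH
} fw_status_t;

/* Structural checks only. computed_hash is the digest the bootloader
 * calculated over the image body; the signature must still be verified
 * separately before the image is trusted. */
fw_status_t fw_header_check(const fw_header_t *h, uint32_t slot_size,
                            uint32_t min_version,
                            const uint8_t computed_hash[32])
{
    if (h->magic != FW_MAGIC)
        return FW_BAD_MAGIC;
    if (slot_size <= sizeof(*h) || h->image_size == 0 ||
        h->image_size > slot_size - sizeof(*h))
        return FW_BAD_SIZE;
    if (h->version < min_version)
        return FW_ROLLBACK;
    if (memcmp(h->hash, computed_hash, sizeof(h->hash)) != 0)
        return FW_BAD_HASH;
    return FW_OK;
}
```

The order of checks matters: the size is validated before anything uses it to read flash, and cheap checks run before the hash comparison so malformed images are rejected early.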

The Interview Questions They’ll Ask

What is secure boot and why is it important?
How does TrustZone isolate secure code?
What happens if OTP keys are programmed incorrectly?
How do you prevent rollback attacks?

Hints in Layers

Hint 1: Use hardware SHA-256 to compute firmware hashes.
Hint 2: Store public keys in OTP and verify signatures at boot.
Hint 3: Add monotonic version counters to prevent rollback.
Hint 4: Keep a minimal secure bootloader and small trusted code base.
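
For the monotonic counter in Hint 3, one common encoding on one-time-programmable memory is a thermometer code: the counter value is the number of set bits, and advancing it only ever sets more bits. The sketch below models that on a plain 32-bit word, assuming bits can change from 0 to 1 but never back. The function names are hypothetical and no real OTP access is shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counter value = number of set bits in the word. */
uint32_t counter_value(uint32_t bits)
{
    uint32_t n = 0;
    while (bits != 0) {
        bits &= bits - 1u;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Return the word to program so the counter reaches target.
 * Never clears a bit; returns the input unchanged if target is not higher. */
uint32_t counter_advance(uint32_t bits, uint32_t target)
{
    if (target > 32u)
        target = 32u;
    while (counter_value(bits) < target)
        bits |= bits + 1u;   /* set the lowest clear bit */
    return bits;
}

/* An image may boot only if its version is at least the stored floor. */
bool version_allowed(uint32_t bits, uint32_t image_version)
{
    return image_version >= counter_value(bits);
}
```

A design choice worth noting: the floor should be advanced only after the new image has booted and confirmed itself, otherwise a bad update can lock out the only working firmware.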

Books That Will Help

Common Pitfalls & Debugging

Problem: “Secure boot fails for all images”

Why: Incorrect key in OTP or hash mismatch.
Fix: Verify hashing and signature process on a PC first.
Quick test: Use a known-good test key and image.

Problem: “Non-secure code cannot access peripherals”

Why: TrustZone attribution misconfigured.
Fix: Review the SAU regions for memory and the ACCESSCTRL permissions for peripherals, then re-test.
Quick test: Toggle a known peripheral from non-secure code.

Definition of Done

Signed images are verified and boot successfully.
Invalid images are rejected with clear logging.
Secure and non-secure memory regions are enforced.
Rollback protection is implemented and tested.

Project 9: RP2350 RISC-V Bare Metal Programming

Main Programming Language: C + RISC-V assembly
Difficulty: Expert
Knowledge Area: RISC-V startup, toolchains, traps

Real World Outcome

You build and run firmware on the RP2350 RISC-V cores, toggling GPIO and handling a timer interrupt. You can compare performance and binary size against the Arm build.

Example output:

RISC-V bringup v1.0
- Core: Hazard3
- Toolchain: riscv64-unknown-elf-gcc
- GPIO toggle: 1 kHz

[OK] Timer interrupt fired

The Core Question You’re Answering

“What changes when I switch architectures, and what stays the same?”

Concepts You Must Understand First

RISC-V Hazard3 architecture (Concept 9)
Boot and memory map (Concept 1)
GPIO/SIO registers (Concept 3)

Questions to Guide Your Design

What startup code is needed for RISC-V?
How do you set the trap vector and enable interrupts?
What differences exist in ABI and calling conventions?
How do you share drivers between Arm and RISC-V builds?

Thinking Exercise

List all firmware components that are CPU-specific vs peripheral-specific.

The Interview Questions They’ll Ask

How does RISC-V handle interrupts compared to Cortex-M?
What is the role of the trap handler?
How would you structure a HAL to support both ISAs?
What is the benefit of the C extension (compressed instructions)?

Hints in Layers

Hint 1: Start with a minimal startup that sets the stack and jumps to main.
Hint 2: Implement a trap handler that toggles a pin.
Hint 3: Reuse peripheral drivers from the Arm build.
Hint 4: Use objdump to inspect differences in code size.
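
The trap handler in Hint 2 usually starts by decoding `mcause`. The sketch below shows that decode as plain C so it can be tested on a host. It follows the standard RISC-V machine-mode encoding for a 32-bit core (top bit set for interrupts, cause code in the low bits). On the target the value would come from a CSR read; the enum and function names are hypothetical.

```c
#include <stdint.h>

#define MCAUSE_INTERRUPT_BIT 0x80000000u  /* set for interrupts on RV32 */

typedef enum {
    TRAP_SOFTWARE_IRQ,   /* machine software interrupt, code 3  */
    TRAP_TIMER_IRQ,      /* machine timer interrupt,    code 7  */
    TRAP_EXTERNAL_IRQ,   /* machine external interrupt, code 11 */
    TRAP_ILLEGAL_INSN,   /* illegal instruction,        code 2  */
    TRAP_ECALL_M,        /* ecall from machine mode,    code 11 */
    TRAP_OTHER
} trap_kind_t;

/* Classify a raw mcause value into the cases a bring-up handler needs. */
trap_kind_t classify_trap(uint32_t mcause)
{
    uint32_t code = mcause & ~MCAUSE_INTERRUPT_BIT;

    if (mcause & MCAUSE_INTERRUPT_BIT) {
        switch (code) {
        case 3:  return TRAP_SOFTWARE_IRQ;
        case 7:  return TRAP_TIMER_IRQ;
        case 11: return TRAP_EXTERNAL_IRQ;
        default: return TRAP_OTHER;
        }
    }
    switch (code) {
    case 2:  return TRAP_ILLEGAL_INSN;
    case 11: return TRAP_ECALL_M;
    default: return TRAP_OTHER;
    }
}
```

Note that code 11 means different things depending on the interrupt bit, which is a frequent source of confusion when a handler masks the top bit too early.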

Books That Will Help

Common Pitfalls & Debugging

Problem: “No interrupts firing”

Why: Trap vector not set or interrupts not enabled.
Fix: Point mtvec at the trap handler, set the source's enable bit in mie, and set the global MIE bit in mstatus.
Quick test: Trigger a timer interrupt and toggle a pin.

Problem: “Program runs but GPIO does not toggle”

Why: Startup did not initialize clocks or IO function.
Fix: Reuse the clock and GPIO init from the Arm build.
Quick test: Verify IO_BANK and SIO settings.

Definition of Done

RISC-V firmware boots and toggles GPIO.
Timer interrupt handler runs reliably.
Build system supports both Arm and RISC-V targets.
Peripheral drivers are shared across builds.

Project 10: USB Host Mode (Keyboard/Mouse Reader)

Main Programming Language: C (TinyUSB host)
Difficulty: Expert
Knowledge Area: USB host scheduling, power, descriptors

Real World Outcome

You connect a USB keyboard or mouse to your RP2040/RP2350 board (with proper VBUS power), and your firmware prints keycodes or movement data in real time. The host stack handles device enumeration and schedules interrupt-endpoint polling on the 1 ms USB frame tick.

Example output:

USB Host v1.0
- VBUS: enabled
- Device detected: HID Keyboard

Key: A
Key: B
Key: Enter

The Core Question You’re Answering

“How do I build a reliable USB host on a microcontroller with limited resources?”

Concepts You Must Understand First

USB fundamentals (Concept 7)
Clock accuracy and timing (Concept 2)
Dual-core scheduling (Concept 6)

Questions to Guide Your Design

How do you supply VBUS power safely?
How do you enumerate and identify HID devices?
How do you schedule interrupt transfers reliably?
How do you handle device disconnects and reconnects?

Thinking Exercise

Draw the USB host state machine from device attach to polling.
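
The diagram can be checked against a small transition function. The sketch below is a deliberately simplified model with hypothetical state and event names. A real stack such as TinyUSB has more states (descriptor reads, hubs, retries) and does not expose this interface.

```c
typedef enum {
    HOST_IDLE,        /* nothing attached */
    HOST_DEBOUNCE,    /* attach seen, waiting for a stable connection */
    HOST_RESET,       /* driving bus reset */
    HOST_ADDRESS,     /* assigning a device address */
    HOST_CONFIGURE,   /* reading descriptors, selecting a configuration */
    HOST_POLLING      /* periodic interrupt-IN transfers */
} host_state_t;

typedef enum {
    EV_ATTACH, EV_DEBOUNCE_DONE, EV_RESET_DONE,
    EV_ADDRESS_SET, EV_CONFIGURED, EV_DETACH, EV_ERROR
} host_event_t;

/* Pure transition function: returns the next state for (state, event).
 * Detach or error from any state drops back to idle. */
host_state_t host_step(host_state_t s, host_event_t e)
{
    if (e == EV_DETACH || e == EV_ERROR)
        return HOST_IDLE;

    switch (s) {
    case HOST_IDLE:      return (e == EV_ATTACH)        ? HOST_DEBOUNCE  : s;
    case HOST_DEBOUNCE:  return (e == EV_DEBOUNCE_DONE) ? HOST_RESET     : s;
    case HOST_RESET:     return (e == EV_RESET_DONE)    ? HOST_ADDRESS   : s;
    case HOST_ADDRESS:   return (e == EV_ADDRESS_SET)   ? HOST_CONFIGURE : s;
    case HOST_CONFIGURE: return (e == EV_CONFIGURED)    ? HOST_POLLING   : s;
    case HOST_POLLING:   return s;
    }
    return s;
}
```

Keeping the transition logic free of side effects makes it easy to log every `(state, event, next)` triple, which is often the quickest way to see where enumeration stalls.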

The Interview Questions They’ll Ask

What is the difference between USB host and device roles?
How do you implement HID polling on a microcontroller?
What are the timing constraints for USB host mode?
What are common causes of host enumeration failure?

Hints in Layers

Hint 1: Start with TinyUSB host examples.
Hint 2: Ensure stable 48 MHz clock and correct VBUS power.
Hint 3: Use a separate core or high-priority task for USB polling.
Hint 4: Log each USB state transition to debug enumeration.
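
Once enumeration works, the "Key: A" output comes from decoding HID reports. The sketch below handles the 8-byte boot-protocol keyboard report (modifier byte, reserved byte, six keycodes) and maps a few usage IDs from the HID keyboard usage page to ASCII. The struct and function names are hypothetical, the mapping assumes a US layout, and shifted digits are intentionally left unmapped.

```c
#include <stdint.h>

/* Boot-protocol keyboard input report: 8 bytes. */
typedef struct {
    uint8_t modifiers;    /* bit 1 = Left Shift, bit 5 = Right Shift */
    uint8_t reserved;
    uint8_t keycodes[6];  /* usage IDs of keys currently held, 0 = none */
} hid_boot_kbd_report_t;

#define HID_MOD_LSHIFT 0x02u
#define HID_MOD_RSHIFT 0x20u

/* Translate one usage ID to ASCII. Returns 0 for unmapped keys. */
char hid_usage_to_ascii(uint8_t usage, uint8_t modifiers)
{
    int shift = (modifiers & (HID_MOD_LSHIFT | HID_MOD_RSHIFT)) != 0;

    if (usage >= 0x04 && usage <= 0x1D)              /* a..z */
        return (char)((shift ? 'A' : 'a') + (usage - 0x04));
    if (!shift && usage >= 0x1E && usage <= 0x26)    /* 1..9 */
        return (char)('1' + (usage - 0x1E));
    if (!shift && usage == 0x27)                     /* 0 */
        return '0';
    if (usage == 0x28)                               /* Enter */
        return '\n';
    if (usage == 0x2C)                               /* Space */
        return ' ';
    return 0;
}
```

A report lists keys that are held, not key events, so firmware normally compares each report with the previous one and prints only usage IDs that newly appeared.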

Books That Will Help

Common Pitfalls & Debugging

Problem: “Device not detected”

Why: VBUS not powered, or the device pull-up (D+ for full speed, D- for low speed) not seen.
Fix: Verify VBUS power and USB wiring.
Quick test: Measure VBUS voltage and the D+/D- lines.

Problem: “Enumeration fails intermittently”

Why: Timing jitter or ISR latency.
Fix: Raise USB task priority and reduce blocking code.
Quick test: Log frame intervals and compare to 1 ms.

Definition of Done

Keyboard and mouse devices enumerate reliably.
HID reports are decoded correctly.
Device disconnect/reconnect is handled.
USB host task runs without missed frames.

Project 11: Custom Bootloader with OTA Updates

Main Programming Language: C
Difficulty: Expert
Knowledge Area: Flash layout, boot strategy, integrity

Real World Outcome

You build a dual-slot A/B firmware update system. Your bootloader verifies image integrity, selects the active slot, and rolls back if the new firmware fails. Updates can be delivered over USB or UART, and the device never bricks.

Example output:

Bootloader v1.0
- Slot A: version 1.2.0 (valid)
- Slot B: version 1.3.0 (pending)

[BOOT] Attempting Slot B
[BOOT] Slot B failed health check
[BOOT] Rolling back to Slot A

The Core Question You’re Answering

“How do I update firmware safely without bricking the device?”

Concepts You Must Understand First

Boot ROM and flash layout (Concept 1)
DMA and hashing (Concept 5)
Security and verification (Concept 8)

Questions to Guide Your Design

How do you partition flash for A/B images?
What metadata do you store to track active and pending slots?
How do you verify image integrity quickly?
How do you detect a failed boot and roll back?

Thinking Exercise

Draw a flash layout and annotate each region (bootloader, slot A, slot B, metadata).
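
The same layout can be written as constants with compile-time checks, so a mistake in the drawing shows up as a build error. The sketch below assumes a 2 MB flash part, 4 KB erase sectors, a 64 KB bootloader region, and two metadata sectors. All sizes and names are illustrative, and only the 0x10000000 XIP base address is a fixed property of the chip.

```c
#include <stdint.h>

#define FLASH_TOTAL_BYTES   (2u * 1024u * 1024u)  /* assumption: 2 MB part */
#define FLASH_SECTOR_BYTES  4096u                 /* erase granularity */
#define XIP_BASE_ADDR       0x10000000u           /* flash as seen by the CPU */

#define BOOTLOADER_OFFSET   0u
#define BOOTLOADER_BYTES    (64u * 1024u)
#define METADATA_OFFSET     (BOOTLOADER_OFFSET + BOOTLOADER_BYTES)
#define METADATA_BYTES      (2u * FLASH_SECTOR_BYTES)  /* two redundant copies */
#define SLOT_A_OFFSET       (METADATA_OFFSET + METADATA_BYTES)
/* Split the remainder in two, rounded down to a whole number of sectors. */
#define SLOT_BYTES          ((((FLASH_TOTAL_BYTES - SLOT_A_OFFSET) / 2u)) & \
                             ~(FLASH_SECTOR_BYTES - 1u))
#define SLOT_B_OFFSET       (SLOT_A_OFFSET + SLOT_BYTES)

_Static_assert(METADATA_OFFSET % FLASH_SECTOR_BYTES == 0,
               "metadata must start on a sector boundary");
_Static_assert(SLOT_A_OFFSET % FLASH_SECTOR_BYTES == 0,
               "slot A must start on a sector boundary");
_Static_assert(SLOT_B_OFFSET % FLASH_SECTOR_BYTES == 0,
               "slot B must start on a sector boundary");
_Static_assert(SLOT_B_OFFSET + SLOT_BYTES <= FLASH_TOTAL_BYTES,
               "slots must fit in flash");

/* Convert a flash offset to the address the CPU executes from. */
uint32_t flash_offset_to_xip(uint32_t offset)
{
    return XIP_BASE_ADDR + offset;
}
```

Sector alignment matters because erase works on whole sectors: a region that shares a sector with its neighbor cannot be rewritten without risking the neighbor's contents.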

The Interview Questions They’ll Ask

What is a watchdog and how does it help with rollback?
How do you detect a failed update?
What are the risks of partial flash writes?
How would you implement OTA updates over USB?

Hints in Layers

Hint 1: Start with a fixed flash layout and simple metadata struct.
Hint 2: Use CRC32 or SHA-256 to validate images.
Hint 3: Use a watchdog to detect boot failures.
Hint 4: Keep the bootloader minimal and separate from the application.
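
For the CRC32 option in Hint 2, a bitwise implementation is small enough to live in a bootloader. The sketch below is the common reflected CRC-32 (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF), the variant used by zlib and Ethernet. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32, processed one bit at a time (no lookup table). */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, else zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}
```

The standard check string "123456789" should produce 0xCBF43926, which makes a convenient self-test and a way to confirm the PC-side tool uses the same variant. A CRC catches accidental corruption only; it gives no protection against deliberate tampering, which is where SHA-256 plus a signature comes in.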

Books That Will Help

Common Pitfalls & Debugging

Problem: “Bootloop after update”

Why: Health check never marks success.
Fix: Add a “confirm” flag after successful run.
Quick test: Log boot metadata and reset counts.
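
The "confirm" flag and attempt counter can be sketched as two small functions operating on the boot metadata. This is a hypothetical in-RAM model: the struct layout, the limit of three attempts, and the function names are assumptions, and a real bootloader would also persist every change to flash safely.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BOOT_ATTEMPTS 3u
#define NO_PENDING        0xFFu

typedef struct {
    uint8_t active_slot;    /* 0 = A, 1 = B: last confirmed-good slot */
    uint8_t pending_slot;   /* slot on trial, or NO_PENDING */
    uint8_t boot_attempts;  /* trial boots used by the pending slot */
    bool    slot_valid[2];  /* result of the integrity check per slot */
} boot_meta_t;

/* Bootloader side: called on every reset. Returns the slot to boot. */
uint8_t select_slot(boot_meta_t *m)
{
    if (m->pending_slot != NO_PENDING) {
        uint8_t p = m->pending_slot;
        if (p < 2u && m->slot_valid[p] &&
            m->boot_attempts < MAX_BOOT_ATTEMPTS) {
            m->boot_attempts++;        /* charge one trial boot */
            return p;
        }
        /* Invalid image or trials exhausted: abandon it and roll back. */
        m->pending_slot = NO_PENDING;
        m->boot_attempts = 0;
    }
    return m->active_slot;
}

/* Application side: called once the health check has passed. */
void confirm_boot(boot_meta_t *m, uint8_t running_slot)
{
    if (m->pending_slot == running_slot) {
        m->active_slot = running_slot;
        m->pending_slot = NO_PENDING;
        m->boot_attempts = 0;
    }
}
```

If the new firmware hangs before calling `confirm_boot`, the watchdog resets the chip, each reset consumes one attempt, and after the limit the bootloader returns to the confirmed slot. A bootloop usually means the confirm call is never reached or the metadata write is not surviving the reset.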

Problem: “Image fails validation”

Why: CRC or hash computed over wrong region.
Fix: Verify image size and header offsets.
Quick test: Compute hash on PC and compare.

Definition of Done

A/B update works with automatic rollback.
Integrity checks pass for valid images and fail for corrupted ones.
Boot metadata is stored and updated safely.
Bootloader remains functional after repeated updates.

Project 12: PIO-Based I2C/SPI Analyzer

Main Programming Language: C + PIO assembly
Difficulty: Advanced
Knowledge Area: PIO decoding, protocol analysis

Real World Outcome

You connect the analyzer to an I2C or SPI bus and see decoded transactions in real time: addresses, read/write bits, data bytes, ACK/NAK, and timing warnings. You can identify protocol errors without modifying the device under test.

Example output:

I2C Analyzer v1.0
SDA: GPIO4, SCL: GPIO5
Mode: Standard (100 kHz)

[0.001234] START
[0.001245] Addr 0x68 (WRITE)
[0.001267] ACK
[0.001289] Data 0x3B
[0.001334] RESTART
[0.001356] Addr 0x68 (READ)
[0.001378] ACK
[0.001400] Data 0xFC
[0.001532] NAK
[0.001576] STOP

The Core Question You’re Answering

“How do I observe a live bus without disturbing it?”

Concepts You Must Understand First

PIO edge detection (Concept 4)
GPIO input and pad configuration (Concept 3)
DMA buffering (Concept 5)

Questions to Guide Your Design

How do you detect I2C START and STOP conditions reliably?
How do you decode SPI modes (CPOL/CPHA)?
How do you timestamp events with microsecond precision?
How do you avoid loading the bus electrically?

Thinking Exercise

Design an I2C state machine that detects START, address, data, ACK, and STOP.
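
One possible shape for that state machine is a decoder fed with successive (SCL, SDA) samples. The sketch below is host-testable C with hypothetical names. It assumes the samples are taken often enough that SCL and SDA never both change between two samples, and it does not handle timestamps or glitch filtering.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    I2C_EV_NONE, I2C_EV_START, I2C_EV_STOP,
    I2C_EV_BYTE, I2C_EV_ACK, I2C_EV_NAK
} i2c_event_t;

typedef struct {
    bool    prev_scl, prev_sda;
    bool    in_frame;    /* between START and STOP */
    uint8_t bit_count;   /* data bits collected in the current byte */
    uint8_t shift;       /* bits shifted in so far, MSB first */
    uint8_t last_byte;   /* valid when I2C_EV_BYTE is returned */
} i2c_decoder_t;

void i2c_decoder_init(i2c_decoder_t *d)
{
    d->prev_scl = true;  /* idle bus: both lines pulled high */
    d->prev_sda = true;
    d->in_frame = false;
    d->bit_count = 0;
    d->shift = 0;
    d->last_byte = 0;
}

/* Feed one sample; returns the event it completes, if any. */
i2c_event_t i2c_decoder_feed(i2c_decoder_t *d, bool scl, bool sda)
{
    i2c_event_t ev = I2C_EV_NONE;

    if (d->prev_scl && scl && d->prev_sda && !sda) {
        /* SDA falls while SCL is high: START or repeated START */
        d->in_frame = true;
        d->bit_count = 0;
        d->shift = 0;
        ev = I2C_EV_START;
    } else if (d->prev_scl && scl && !d->prev_sda && sda) {
        /* SDA rises while SCL is high: STOP */
        d->in_frame = false;
        ev = I2C_EV_STOP;
    } else if (d->in_frame && !d->prev_scl && scl) {
        /* SCL rising edge: SDA is valid, sample it */
        if (d->bit_count < 8) {
            d->shift = (uint8_t)((d->shift << 1) | (sda ? 1u : 0u));
            if (++d->bit_count == 8) {
                d->last_byte = d->shift;
                ev = I2C_EV_BYTE;
            }
        } else {
            ev = sda ? I2C_EV_NAK : I2C_EV_ACK;  /* ninth clock */
            d->bit_count = 0;
            d->shift = 0;
        }
    }

    d->prev_scl = scl;
    d->prev_sda = sda;
    return ev;
}
```

The first byte after a START carries the 7-bit address in its upper bits and the read/write flag in bit 0, so a byte of 0xD0 decodes as address 0x68 with WRITE. On the target, the sample stream would typically come from a PIO capture buffer filled by DMA.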

The Interview Questions They’ll Ask

What is clock stretching and how do you detect it?
How do I2C and SPI differ in timing and framing?
Why must the analyzer be high impedance?
What is the difference between passive capture and active probing?

Hints in Layers

Hint 1: Use PIO wait on SDA edges while monitoring SCL.
Hint 2: Sample SDA on SCL rising edges for I2C.
Hint 3: Use a high-rate timer or DMA timestamps for events.
Hint 4: Provide a “raw bits” capture mode for debugging.

Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| I2C protocol | The Book of I2C | Ch. 1-4 |
| Debugging | The Art of Debugging with GDB | Ch. 1 |
| PIO | RP2040 Assembly Language Programming | Ch. 10-12 |

Common Pitfalls & Debugging

Problem: “Missing START events”

Why: Incorrect edge detection or timing.
Fix: Increase sampling rate and validate pin mapping.
Quick test: Use a known I2C device and verify START.

Problem: “Decoded bytes are wrong”

Why: Sampling on wrong edge or wrong SPI mode.
Fix: Allow mode selection and validate with test pattern.
Quick test: Decode a known SPI transaction.
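
Mode selection reduces to one rule: data is sampled on the rising clock edge when CPOL equals CPHA (modes 0 and 3) and on the falling edge otherwise (modes 1 and 2). The sketch below applies that rule to a captured pair of sample arrays. The names are hypothetical, bytes are assumed MSB first, and chip-select handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool cpol;   /* clock idle level: false = low, true = high */
    bool cpha;   /* false = sample on leading edge, true = trailing edge */
} spi_mode_t;

/* SPI mode number 0..3: bit 1 is CPOL, bit 0 is CPHA. */
spi_mode_t spi_mode_from_number(unsigned mode)
{
    spi_mode_t m = { (mode & 2u) != 0, (mode & 1u) != 0 };
    return m;
}

/* True when data must be sampled on the rising edge of SCK. */
bool spi_samples_on_rising(spi_mode_t m)
{
    return m.cpol == m.cpha;
}

/* Decode up to one byte from n captured samples of SCK and a data line.
 * Returns the number of bits collected; *out holds them MSB first. */
unsigned spi_decode_byte(const uint8_t *sck, const uint8_t *data,
                         unsigned n, spi_mode_t m, uint8_t *out)
{
    bool want_rising = spi_samples_on_rising(m);
    unsigned bits = 0;
    uint8_t value = 0;

    for (unsigned i = 1; i < n && bits < 8u; i++) {
        bool rose = !sck[i - 1] && sck[i];
        bool fell = sck[i - 1] && !sck[i];
        if (want_rising ? rose : fell) {
            value = (uint8_t)((value << 1) | (data[i] ? 1u : 0u));
            bits++;
        }
    }
    *out = value;
    return bits;
}
```

Decoding the same capture in all four modes and comparing against a known test pattern is a quick way to find the mode of an undocumented device.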

Definition of Done

Correctly decodes I2C addresses and data.
Supports multiple SPI modes and decodes data.
Timestamps events and detects errors.
Does not disturb the bus electrically.

Project 13: Capstone - RP2040/RP2350 Development Board from Scratch

Main Programming Language: C, KiCad
Difficulty: Master
Knowledge Area: Hardware design, power, USB, flash, boot

Real World Outcome

You hold a custom PCB you designed. It powers up, enumerates over USB, blinks an LED, and runs your firmware. You can program it via SWD, and it supports at least one peripheral (sensor, display, or radio). This is a production-grade artifact.

The Core Question You’re Answering

“Can I design, build, and bring up a complete embedded system from silicon to firmware?”

Concepts You Must Understand First

Boot and flash layout (Concept 1)
Clock and power design (Concept 2)
GPIO/PIO configuration (Concept 3/4)
Security and boot update strategy (Concept 8)

Questions to Guide Your Design

What power rails and decoupling are required?
How do you route USB and QSPI safely?
What debug headers and test points do you need?
How will you validate the board during bring-up?

Thinking Exercise

Draw the power tree and label decoupling capacitor values and placement priorities.

The Interview Questions They’ll Ask

How do you choose flash parts and QSPI routing constraints?
What are the common bring-up failures for new boards?
How do you verify clock stability on a custom PCB?
What design-for-manufacture practices did you follow?

Hints in Layers

Hint 1: Start from the official RP2040/RP2350 reference design.
Hint 2: Place decoupling capacitors as close as possible to power pins.
Hint 3: Add test pads for critical signals and rails.
Hint 4: Validate in stages: power, clock, SWD, boot, peripherals.

Books That Will Help

Common Pitfalls & Debugging

Problem: “Board does not enumerate over USB”

Why: USB routing or power issues.
Fix: Verify D+/D- routing impedance and VBUS power.
Quick test: Check USB lines with a scope during connect.

Problem: “Board does not boot”

Why: Flash wiring or clock oscillator issues.
Fix: Check QSPI wiring and crystal oscillator startup.
Quick test: Probe XOSC pins and verify oscillation.

Definition of Done

Board powers up with correct voltages.
SWD debug works and firmware can be flashed.
USB enumeration succeeds.
At least one peripheral works reliably.

Sources and Further Reading

Official and primary sources used to build this guide:

RP2040 specifications and documents: https://www.raspberrypi.com/products/rp2040/specifications/
RP2040 documentation portal: https://www.raspberrypi.com/documentation/microcontrollers/rp2040.html
RP2350 product information portal: https://pip.raspberrypi.com/categories/1120-rp2350
RP2350 announcement and Pico 2 launch: https://www.raspberrypi.com/news/raspberry-pi-pico-2-our-new-5-microcontroller-board-on-sale-now/
RP2350 datasheet: https://datasheets.raspberrypi.com/rp2350/rp2350-datasheet.pdf
Raspberry Pi hardware design with RP2350 (PDF): https://datasheets.raspberrypi.com/rp2350/hardware-design-with-rp2350.pdf
Pico SDK hardware docs (PIO/DMA/SYNC/SHA): https://www.raspberrypi.com/documentation/pico-sdk/hardware.html
UF2 bootloader format: https://github.com/microsoft/uf2
Arm TrustZone for Cortex-M: https://developer.arm.com/documentation/102418/latest/
Hazard3 RISC-V core repository: https://github.com/Wren6991/Hazard3
Raspberry Pi Holdings FY2024 final results (microcontroller unit sales): https://www.advfn.com/stock-market/london/RPI/stock-news/95763177/raspberry-pi-holdings-plc-fy-2024-final-results

Additional context:

Raspberry Pi RP2350 product page: https://www.raspberrypi.com/products/rp2350/