Project 4: Memory Map & MMIO Field Notebook

Document MMIO register maps and compute addresses reliably.

Quick Reference

Attribute	Value
Difficulty	Level 3
Time Estimate	10-16 hours
Main Programming Language	Assembly + C
Alternative Programming Languages	Rust
Coolness Level	Level 3
Business Potential	Level 2
Prerequisites	Datasheet reading, Concept 4: Memory Maps & Ordering
Key Topics	MMIO semantics, atomic set/clear, address calculation

1. Learning Objectives

By completing this project, you will:

Translate ARM concepts into observable outputs you can verify.
Explain why each toolchain or hardware step is necessary.
Detect and fix at least one realistic failure mode.
Communicate the result clearly in a technical review or interview.

2. All Theory Needed (Per-Concept Breakdown)

Memory Maps, MMIO & Ordering

Fundamentals

ARM systems expose peripherals through memory-mapped I/O (MMIO): reading or writing specific addresses triggers hardware behavior rather than normal memory access. This is central to microcontrollers and still vital on A-profile SoCs. The memory map defines which address ranges are RAM, flash, peripherals, and internal control regions. Memory ordering adds another layer: modern CPUs can reorder memory accesses for performance, so barriers (DMB/DSB/ISB) are required to guarantee visibility and ordering to devices or other cores. Understanding MMIO and ordering is the key to controlling hardware reliably.

Deep Dive

A memory map is a contract between the CPU and the SoC. Addresses are not abstract: they correspond to real hardware blocks. In Cortex-M systems, large fixed ranges map to flash, SRAM, peripherals, and internal control registers. These ranges determine what happens when you load or store. For example, a store to a GPIO register flips a pin; a load from a UART data register consumes a byte from a FIFO. MMIO behaves differently from RAM: it is often non-cacheable, may have side effects on read, and is frequently write-only or read-only. When you treat it like ordinary memory, bugs emerge: stale values, missing updates, or unintended state changes.

Memory ordering complicates this further. ARM cores, like most modern CPUs, can reorder memory operations to improve performance. This is invisible in single-threaded logic but catastrophic for devices and multi-core coordination. If you write a command buffer to memory and then write a “doorbell” MMIO register that tells the device to consume it, the device might see the doorbell first unless you insert a barrier. ARM provides three barrier instructions (DMB, DSB, and ISB), each with a distinct strength. DMB ensures that memory accesses before the barrier are observed before those after it; DSB additionally holds back later instructions until the earlier accesses have completed; ISB flushes the instruction pipeline so that following instructions are refetched and see earlier control-register changes. These are not optional: they are the difference between “mostly works” and “always correct.”

On microcontrollers, you may not have caches or complex reorder buffers, but the bus fabric and peripheral interactions still require ordering. On A-profile systems with caches, speculation, and out-of-order execution, the need is even greater. DMA engines read and write memory independently of the CPU; if you don’t synchronize caches or enforce ordering, the DMA sees stale or partial data. This is why firmware often combines barriers with explicit cache maintenance. The principle is simple: your mental model must include the device, the bus, and the CPU pipeline, not just the instruction sequence.

MMIO access patterns also introduce concurrency hazards. Read-modify-write sequences can race with interrupts or other cores. Hardware often provides SET/CLEAR registers specifically to avoid these races by allowing atomic bit operations. If you ignore these and perform a naive read-modify-write, you can silently clear unrelated bits. The safest approach is to understand the register semantics and use the atomic registers provided. That is not assembly-specific, but assembly exposes the pattern directly and makes it obvious.

How this fits on projects

Core to P04 (Memory Map & MMIO Field Notebook) and P09 (Memory Ordering Litmus Tests).

Definitions & key terms

Memory map: The assignment of address ranges to RAM, flash, and peripherals.
MMIO: Memory addresses that control hardware rather than store data.
DMB/DSB/ISB: Memory barrier instructions for ordering and visibility.

Mental model diagram

Cortex-M Memory Map (4GB address space):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    0xFFFFFFFF ┌─────────────────────────────────────────┐
               │         Vendor-Specific                 │
    0xE0100000 ├─────────────────────────────────────────┤
               │         Private Peripheral Bus          │  ← NVIC lives here
               │         (Internal peripherals)          │    at 0xE000E100
    0xE0000000 ├─────────────────────────────────────────┤
               │                                         │
               │         External Device                 │  ← Memory-mapped
               │         (Peripherals, etc.)             │    devices
               │                                         │
    0xA0000000 ├─────────────────────────────────────────┤
               │                                         │
               │         External RAM                    │
               │                                         │
    0x60000000 ├─────────────────────────────────────────┤
               │                                         │
               │         Peripheral                      │  ← GPIO, UART, SPI,
               │         (On-chip I/O)                   │    I2C, PWM, etc.
               │                                         │
    0x40000000 ├─────────────────────────────────────────┤
               │                                         │
               │         SRAM                            │  ← Variables, stack,
               │         (On-chip RAM)                   │    heap
               │                                         │
    0x20000000 ├─────────────────────────────────────────┤
               │                                         │
               │         Code                            │  ← Flash/ROM with
               │         (Flash/ROM)                     │    your program
               │                                         │
    0x00000000 └─────────────────────────────────────────┘
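The map above can be turned into a small lookup that answers "which region does this address belong to?". The following is a minimal sketch, assuming only the region boundaries drawn in the diagram; the type and function names (`region_id`, `region_entry`, `region_of`) are hypothetical and not part of any vendor header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Region identifiers, named after the blocks in the map above. */
typedef enum {
    REGION_CODE, REGION_SRAM, REGION_PERIPHERAL, REGION_EXT_RAM,
    REGION_EXT_DEVICE, REGION_PPB, REGION_VENDOR
} region_id;

typedef struct {
    uint32_t  start;  /* first address of the region */
    region_id id;
} region_entry;

/* Ascending start addresses; each region ends where the next begins. */
static const region_entry REGIONS[] = {
    { 0x00000000u, REGION_CODE },
    { 0x20000000u, REGION_SRAM },
    { 0x40000000u, REGION_PERIPHERAL },
    { 0x60000000u, REGION_EXT_RAM },
    { 0xA0000000u, REGION_EXT_DEVICE },
    { 0xE0000000u, REGION_PPB },
    { 0xE0100000u, REGION_VENDOR },
};

/* Hypothetical helper: return the region that contains addr. */
static region_id region_of(uint32_t addr)
{
    /* Scan from the highest region down; the first start <= addr wins. */
    for (size_t i = sizeof REGIONS / sizeof REGIONS[0]; i-- > 0; ) {
        if (addr >= REGIONS[i].start)
            return REGIONS[i].id;
    }
    assert(!"unreachable: the first region starts at address 0");
    return REGION_CODE;
}
```

With this table, `region_of(0x40014000u)` yields `REGION_PERIPHERAL` and `region_of(0xE000E100u)` yields `REGION_PPB`. An address such as 0xD0000014 lands in the range this generic map labels External Device, which is a reminder that a specific chip's map refines the architectural one.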


RP2040-Specific Memory Map:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    Address         │  Size      │  Contents
    ────────────────┼────────────┼─────────────────────────────────────
    0x10000000      │  2 MB      │  External Flash (XIP)
                    │            │  ↳ Your code runs from here
    ────────────────┼────────────┼─────────────────────────────────────
    0x20000000      │  256 KB    │  Main SRAM (4 banks × 64KB)
                    │            │  ↳ Variables, stack, heap
    0x20040000      │  4 KB      │  SRAM4 (scratch bank, e.g. core stack)
    0x20041000      │  4 KB      │  SRAM5 (scratch bank, e.g. core stack)
    ────────────────┼────────────┼─────────────────────────────────────
    0x40000000      │  -         │  APB Peripherals
                    │            │  ↳ UART, SPI, I2C, PWM...
    ────────────────┼────────────┼─────────────────────────────────────
    0x50000000      │  -         │  AHB-Lite Peripherals
                    │            │  ↳ DMA, USB, PIO...
    ────────────────┼────────────┼─────────────────────────────────────
    0xD0000000      │  -         │  SIO (Single-cycle I/O)
                    │            │  ↳ GPIO (fast access!)
    ────────────────┼────────────┼─────────────────────────────────────
    0xE0000000      │  -         │  Cortex-M0+ internal
                    │            │  ↳ NVIC, SysTick, Debug


Memory-Mapped I/O Concept:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    Normal Memory:                  Peripheral Register:
    ──────────────                  ────────────────────
    LDR r0, [addr]                  LDR r0, [UART_DATA]
         │                               │
         ▼                               ▼
    Read from RAM                   Read TRIGGERS HARDWARE!
    Data was sitting there          Byte removed from RX FIFO
    Memory unchanged                Status flags updated

    STR r0, [addr]                  STR r0, [GPIO_OUT]
         │                               │
         ▼                               ▼
    Write to RAM                    Write CAUSES ACTION!
    Data now stored there           Pin voltage changes
    Can read it back                May not read same value back
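The "read triggers hardware" column can be made tangible with a host-side model. This is only a sketch under stated assumptions: `fake_uart` and its functions are hypothetical and do not mirror any real UART register layout; they just show that a data-register read returns a byte and removes it, while a status read changes nothing.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 4u

/* Hypothetical model of a UART receive FIFO; not a real device layout. */
typedef struct {
    uint8_t  data[FIFO_DEPTH];
    unsigned head;   /* index of the oldest byte */
    unsigned count;  /* bytes currently queued */
} fake_uart;

/* Models the hardware side: a byte arrives from the wire. */
static void uart_hw_receive(fake_uart *u, uint8_t byte)
{
    assert(u->count < FIFO_DEPTH);
    u->data[(u->head + u->count) % FIFO_DEPTH] = byte;
    u->count++;
}

/* Models a load from the data register: returns the oldest byte
   AND removes it from the FIFO. */
static uint8_t uart_read_data(fake_uart *u)
{
    assert(u->count > 0);
    uint8_t byte = u->data[u->head];
    u->head = (u->head + 1u) % FIFO_DEPTH;
    u->count--;
    return byte;
}

/* Models a load from a status register: no side effect. */
static int uart_rx_ready(const fake_uart *u)
{
    return u->count > 0;
}
```

Calling `uart_read_data` twice in a row returns two different bytes, which is exactly why a debugger or a careless "read it again to double-check" can corrupt a data stream on real hardware.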


Example: GPIO Control on RP2040:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    SIO Base: 0xD0000000

    Offset  │ Register       │ Purpose
    ────────┼────────────────┼──────────────────────────────────
    0x000   │ CPUID          │ Processor ID (read-only)
    0x004   │ GPIO_IN        │ Read current GPIO input state
    0x010   │ GPIO_OUT       │ Read/write GPIO output state
    0x014   │ GPIO_OUT_SET   │ Set bits in GPIO_OUT (write-only)
    0x018   │ GPIO_OUT_CLR   │ Clear bits in GPIO_OUT (write-only)
    0x01C   │ GPIO_OUT_XOR   │ Toggle bits in GPIO_OUT (write-only)
    0x020   │ GPIO_OE        │ Output enable (1=output, 0=input)
    0x024   │ GPIO_OE_SET    │ Set bits in GPIO_OE
    0x028   │ GPIO_OE_CLR    │ Clear bits in GPIO_OE


    To turn ON GPIO25 (Pico's LED):
    ─────────────────────────────────────────────────────────────────

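    // Assumes GPIO25 is already routed to SIO (FUNCSEL = 5 in IO_BANK0)
    // and that IO_BANK0 / PADS_BANK0 have been taken out of reset.
    LDR  r0, =0xD0000000     // SIO base address
    MOVS r1, #1              // Cortex-M0+ only has the flag-setting form
    LSLS r1, r1, #25         // r1 = 0x02000000 (bit 25)
    STR  r1, [r0, #0x024]    // GPIO_OE_SET: enable output
    STR  r1, [r0, #0x014]    // GPIO_OUT_SET: set high → LED ON!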
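The same two stores can be written in C. The sketch below takes the SIO base as a pointer parameter so it can be exercised on a host against a plain array; the helper names (`mmio_write32`, `led_on`) and macro names are hypothetical, and the offsets are the ones listed in the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets copied from the SIO table above. */
#define SIO_BASE_ADDR     0xD0000000u
#define GPIO_OUT_SET_OFF  0x014u
#define GPIO_OE_SET_OFF   0x024u
#define LED_PIN           25u

/* Store a 32-bit value at base + byte offset. The volatile qualifier
   stops the compiler from dropping or merging the store. */
static void mmio_write32(volatile uint32_t *base, uint32_t offset, uint32_t value)
{
    assert(offset % 4u == 0);   /* 32-bit registers are word-aligned */
    base[offset / 4u] = value;
}

/* Hypothetical helper mirroring the assembly: enable output, then drive high. */
static void led_on(volatile uint32_t *sio)
{
    const uint32_t mask = UINT32_C(1) << LED_PIN;
    mmio_write32(sio, GPIO_OE_SET_OFF, mask);
    mmio_write32(sio, GPIO_OUT_SET_OFF, mask);
}
```

On the target the call would be `led_on((volatile uint32_t *)SIO_BASE_ADDR)`; on a host, passing a zeroed array lets you inspect which words were written.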


    Why SET/CLR registers instead of just GPIO_OUT?
    ─────────────────────────────────────────────────────────────────

    Without SET/CLR (DANGEROUS):
    ┌────────────────────────────────────────────────────────────────┐
    │ LDR  r2, =(1<<25)         // Mask for bit 25                   │
    │ LDR  r1, [r0, #GPIO_OUT]  // Read current value                │
    │ ORRS r1, r1, r2           // Set bit 25 (Thumb-1: no ORR #imm) │
    │ STR  r1, [r0, #GPIO_OUT]  // Write back                        │
    │                                                                 │
    │ PROBLEM: If another core or interrupt modifies GPIO_OUT        │
    │ between the LDR and STR, those changes are LOST!               │
    │ This is a classic "read-modify-write race condition."          │
    └────────────────────────────────────────────────────────────────┘

    With SET/CLR (ATOMIC and SAFE):
    ┌────────────────────────────────────────────────────────────────┐
    │ LDR r1, =(1<<25)             // Mask for bit 25                │
    │ STR r1, [r0, #GPIO_OUT_SET]  // Hardware atomically sets bit   │
    │                                                                 │
    │ Other bits are UNAFFECTED - hardware handles it!               │
    └────────────────────────────────────────────────────────────────┘
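The lost-update scenario can be replayed deterministically on a host. This is a sketch only: `fake_gpio` is a hypothetical software model of GPIO_OUT and its SET alias, and the "interrupt" is simulated by placing its write between the main code's read and write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of GPIO_OUT and its SET alias. */
typedef struct { uint32_t out; } fake_gpio;

static uint32_t read_out(const fake_gpio *g)            { return g->out; }
static void     write_out(fake_gpio *g, uint32_t v)     { g->out = v; }
/* On real hardware this is a single bus write; the OR happens in the peripheral. */
static void     write_out_set(fake_gpio *g, uint32_t m) { g->out |= m; }

/* An "interrupt" sets bit 3 between the main code's read and write. */
static uint32_t rmw_with_interrupt(void)
{
    fake_gpio g = { 0 };
    uint32_t snapshot = read_out(&g);          /* main: LDR               */
    write_out(&g, read_out(&g) | (1u << 3));   /* interrupt: sets bit 3   */
    write_out(&g, snapshot | (1u << 25));      /* main: STR of stale copy */
    return read_out(&g);
}

/* Same interleaving through the SET alias: both bits survive. */
static uint32_t set_with_interrupt(void)
{
    fake_gpio g = { 0 };
    write_out_set(&g, 1u << 3);    /* interrupt */
    write_out_set(&g, 1u << 25);   /* main      */
    return read_out(&g);
}

/* Self-check of the two outcomes. */
static void demo(void)
{
    assert(rmw_with_interrupt() == 0x02000000u);  /* bit 3 was lost */
    assert(set_with_interrupt() == 0x02000008u);  /* bits 3 and 25  */
}
```

The read-modify-write path ends with only bit 25 set because the stale snapshot overwrites the interrupt's change, while the SET path keeps both bits.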


Why Memory Barriers Are Needed:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Modern CPUs reorder memory accesses for performance. This is usually
invisible to single-threaded code, but becomes critical when:

  1. Communicating with peripherals (they have side effects!)
  2. Multi-core systems (other cores see different ordering)
  3. DMA operations (hardware sees memory, not caches)

Example WITHOUT barrier (BROKEN):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    You write:                  CPU might execute as:
    ──────────────────────      ────────────────────────────────
    mailbox_buffer[0] = cmd     mailbox_write = buffer_addr ← FIRST!
    mailbox_buffer[1] = arg     mailbox_buffer[0] = cmd     ← TOO LATE
    mailbox_write = buffer_addr mailbox_buffer[1] = arg

    The peripheral reads garbage because the buffer wasn't filled yet!


ARM Memory Barrier Instructions:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    DMB (Data Memory Barrier)
    ├── Ensures memory accesses before the barrier are observed before
    │   memory accesses after it
    ├── Does NOT affect instruction execution order
    └── Use between: data writes and peripheral write

    DSB (Data Synchronization Barrier)
    ├── Like DMB, but also holds back later instructions until earlier
    │   memory accesses have completed (stronger than DMB)
    └── Use before: peripheral access that must be visible

    ISB (Instruction Synchronization Barrier)
    ├── Flushes the instruction pipeline
    ├── Ensures previous context changes take effect
    └── Use after: changing system registers, enabling MMU


Correct Pattern for Peripheral Access:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    // Fill mailbox buffer (AArch64 register names, as on an A-profile SoC)
    str  w1, [x0]           // Write data to buffer
    str  w2, [x0, #4]       // Write more data

    dsb  sy                 // ← BARRIER: Complete all writes

    str  w3, [x4]           // Now write to mailbox register
                            // Hardware now sees complete buffer


How it works (step-by-step, with invariants and failure modes)

Identify which addresses are MMIO and which are normal memory.
Use atomic SET/CLEAR registers when available to avoid races.
Insert barriers before device “doorbell” writes to guarantee ordering.
Failure mode: devices read partial buffers, interrupts race, or GPIO bits flip incorrectly.

Minimal concrete example (pseudo, not runnable)

WRITE buffer
BARRIER
WRITE device_register
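In C the three steps might look like the sketch below. It is a host-side illustration, not firmware: `fake_mailbox` and `ring_doorbell` are hypothetical, and the C11 `atomic_thread_fence` stands in for the barrier. On Arm targets compilers typically lower a sequentially consistent fence to a DMB, which may not be strong enough for every device; firmware commonly uses the CMSIS `__DMB()` / `__DSB()` intrinsics or an inline `dsb sy` instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical device model: a command buffer in normal memory plus a
   doorbell register. On hardware the doorbell would be an MMIO address. */
typedef struct {
    uint32_t          buffer[2];
    volatile uint32_t doorbell;
} fake_mailbox;

static void ring_doorbell(fake_mailbox *mb, uint32_t cmd, uint32_t arg)
{
    assert(mb);
    mb->buffer[0] = cmd;                        /* WRITE buffer          */
    mb->buffer[1] = arg;
    atomic_thread_fence(memory_order_seq_cst);  /* BARRIER               */
    mb->doorbell = 1u;                          /* WRITE device_register */
}
```

The fence also acts as a compiler barrier, so the buffer stores cannot be moved after the doorbell store at compile time either.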

Common misconceptions

“MMIO behaves like RAM” → Reads and writes can trigger side effects.
“Ordering is always preserved” → CPUs and buses can reorder operations.

Check-your-understanding questions

Why can reading a UART data register change system state?
When do you need a DSB instead of a DMB?
Why are SET/CLEAR registers safer than read-modify-write?

Check-your-understanding answers

MMIO reads can pop FIFO entries or clear flags, which changes hardware state.
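When later instructions must not run until prior memory accesses have actually completed, for example before WFI or after a write that changes system configuration. DMB only orders memory accesses relative to each other.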
They avoid races because the hardware performs the atomic bit update.

Real-world applications

GPIO control, DMA setup, and peripheral initialization in firmware.

Where you’ll apply it

This project: see §3.1 and §5.4 in P04-mmio-memory-map-notebook.md
P04 Memory Map & MMIO Field Notebook
P09 Memory Ordering Litmus Tests

References

Arm ACLE barrier intrinsics and semantics.

Key insights

MMIO and ordering are the difference between “works once” and “always correct.”

Summary

Memory maps define what addresses mean; barriers define when writes become real.

Homework/Exercises to practice the concept

Describe a race condition caused by a read-modify-write on GPIO.
Sketch an ordering bug where a peripheral sees stale data.

Solutions to the homework/exercises

Another core sets a different bit between your read and write; your write erases it.
You signal the device before writing the buffer; it reads garbage.

3. Project Specification

3.1 What You Will Build

A structured notebook that maps peripheral registers to addresses and behaviors.

3.2 Functional Requirements

Requirement 1: Compute addresses from base + offset
Requirement 2: Track read/write/clear-on-read semantics
Requirement 3: Provide a lookup interface for at least 3 peripherals
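Requirement 1 is a single addition, but it is worth guarding against wrap-around and misalignment. A minimal sketch, assuming 32-bit word-aligned registers; the function name `compute_address` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper for Requirement 1: address = base + offset.
   Returns false if the sum would wrap past 0xFFFFFFFF or the result
   is not word-aligned (32-bit registers are assumed here). */
static bool compute_address(uint32_t base, uint32_t offset, uint32_t *out)
{
    assert(out);
    if (offset > UINT32_MAX - base)
        return false;               /* would overflow the 4 GB address space */
    uint32_t addr = base + offset;
    if (addr % 4u != 0)
        return false;               /* misaligned for a 32-bit register */
    *out = addr;
    return true;
}
```

For example, base 0xD0000000 with offset 0x014 gives 0xD0000014, while base 0xFFFFFFF0 with offset 0x20 is rejected because the sum does not fit in 32 bits.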

3.3 Non-Functional Requirements

Data must be consistent and validated

3.4 Example Usage / Output

$ mmio-notebook lookup GPIO_OUT_SET
Base: 0xD0000000
Offset: 0x014
Address: 0xD0000014
Access: write-only

$ mmio-notebook lookup UNKNOWN
error: register not found
exit code: 3
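One way the lookup behind this transcript could work is sketched below. It assumes that "exit code: 3" is the process status rather than printed text, and it uses three rows from the SIO table in section 2; `reg_entry`, `REGISTERS`, `lookup`, and `EXIT_NOT_FOUND` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *name;
    uint32_t    base;
    uint32_t    offset;
    const char *access;
} reg_entry;

/* Sample rows taken from the SIO table in section 2. */
static const reg_entry REGISTERS[] = {
    { "GPIO_OUT",     0xD0000000u, 0x010u, "read-write" },
    { "GPIO_OUT_SET", 0xD0000000u, 0x014u, "write-only" },
    { "GPIO_OUT_CLR", 0xD0000000u, 0x018u, "write-only" },
};

#define EXIT_NOT_FOUND 3  /* assumed meaning of "exit code: 3" above */

/* Hypothetical lookup: formats the report into buf and returns the
   exit status (0 on success, EXIT_NOT_FOUND otherwise). */
static int lookup(const char *name, char *buf, size_t len)
{
    assert(name && buf);
    for (size_t i = 0; i < sizeof REGISTERS / sizeof REGISTERS[0]; i++) {
        const reg_entry *r = &REGISTERS[i];
        if (strcmp(r->name, name) == 0) {
            snprintf(buf, len,
                     "Base: 0x%08lX\nOffset: 0x%03lX\nAddress: 0x%08lX\nAccess: %s\n",
                     (unsigned long)r->base, (unsigned long)r->offset,
                     (unsigned long)(r->base + r->offset), r->access);
            return 0;
        }
    }
    snprintf(buf, len, "error: register not found\n");
    return EXIT_NOT_FOUND;
}
```

Keeping the formatting in one function makes the golden-path comparison a plain string equality test.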

3.5 Data Formats / Schemas / Protocols

3.6 Edge Cases

Duplicate names
Conflicting offsets
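Both edge cases can be caught when the table is loaded. The sketch below interprets "conflicting offsets" as two differently named rows resolving to the same base and offset; that interpretation, and the names `reg_row`, `table_status`, and `validate_table`, are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name;
    uint32_t    base;
    uint32_t    offset;
} reg_row;

typedef enum {
    TABLE_OK,
    TABLE_DUPLICATE_NAME,
    TABLE_OFFSET_CONFLICT
} table_status;

/* Hypothetical validator: pairwise scan, O(n^2), fine for small tables. */
static table_status validate_table(const reg_row *rows, size_t n)
{
    assert(rows || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (strcmp(rows[i].name, rows[j].name) == 0)
                return TABLE_DUPLICATE_NAME;
            if (rows[i].base == rows[j].base && rows[i].offset == rows[j].offset)
                return TABLE_OFFSET_CONFLICT;
        }
    }
    return TABLE_OK;
}
```

Some peripherals deliberately place a read-only and a write-only register at the same offset, so a real notebook may want to downgrade the offset conflict to a warning when the two access types differ.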

3.7 Real World Outcome

This is the golden reference for success:

You can explain why a write toggles a pin and a read clears a flag.

3.7.1 How to Run (Copy/Paste)

Build: follow the toolchain steps defined in this guide
Run: use the CLI examples in §3.4 with fixed inputs
Expected directory: project root

3.7.2 Golden Path Demo (Deterministic)

Run with a fixed input set and confirm output matches §3.4 exactly.

3.7.3 If CLI: Exact Terminal Transcript

$ mmio-notebook lookup GPIO_OUT_SET
Base: 0xD0000000
Offset: 0x014
Address: 0xD0000014
Access: write-only

$ mmio-notebook lookup UNKNOWN
error: register not found
exit code: 3

4. Solution Architecture

4.1 High-Level Design

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ Input Layer  │───▶│ Core Logic   │───▶│ Output Layer │
└──────────────┘     └──────────────┘     └──────────────┘

4.2 Key Components

Component	Responsibility	Key Decisions
Input Parser	Validate and normalize input	Strict error handling
Core Engine	Perform the main computation	Deterministic paths
Reporter	Produce user-facing output	Stable formatting

4.3 Data Structures (No Full Code)

Record Entry {
  name: string
  fields: list
  notes: text
}
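As one possible concrete form of this record, the sketch below adds the base, offset, and access semantics the CLI reports, and describes each field by bit position and width. All type and member names are hypothetical; the outline above only fixes `name`, `fields`, and `notes`.

```c
#include <assert.h>
#include <stdint.h>

/* Access semantics tracked per register (Requirement 2). */
typedef enum {
    ACCESS_READ_ONLY,
    ACCESS_WRITE_ONLY,
    ACCESS_READ_WRITE,
    ACCESS_CLEAR_ON_READ
} access_kind;

/* Hypothetical bit-field description inside a 32-bit register. */
typedef struct {
    const char *name;
    uint8_t     lsb;    /* lowest bit position, 0..31 */
    uint8_t     width;  /* number of bits, 1..32 */
} field_desc;

/* One notebook record: name, fields, and notes as in the outline above,
   plus the base/offset/access data the CLI reports. */
typedef struct {
    const char       *name;
    uint32_t          base;
    uint32_t          offset;
    access_kind       access;
    const field_desc *fields;
    unsigned          field_count;
    const char       *notes;
} record_entry;

/* Mask covering one field, e.g. lsb = 25, width = 1 gives 0x02000000. */
static uint32_t field_mask(const field_desc *f)
{
    assert(f->width >= 1 && f->width <= 32 && f->lsb + f->width <= 32);
    uint32_t ones = (f->width == 32) ? UINT32_MAX
                                     : ((UINT32_C(1) << f->width) - 1u);
    return ones << f->lsb;
}
```

Storing fields as (lsb, width) pairs rather than precomputed masks keeps the table readable next to a datasheet, and the mask can always be derived.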

4.4 Algorithm Overview

Key Algorithm: Core Flow

Parse input and validate parameters.
Execute the core transformation or analysis.
Emit deterministic output or error summary.

Complexity Analysis:

Time: O(n) in the size of input records
Space: O(n) for stored mappings and logs

5. Implementation Guide

5.1 Development Environment Setup

# Install toolchain and verify versions
toolchain --version

5.2 Project Structure

project-root/
├── src/
│   ├── core
│   └── io
├── tests/
│   └── fixtures
├── docs/
└── README.md

5.3 The Core Question You’re Answering

“Document MMIO register maps and compute addresses reliably.”

5.4 Concepts You Must Understand First

Stop and research these before coding:

Memory Maps, MMIO & Ordering
- What is the key invariant you must preserve?

5.5 Questions to Guide Your Design

Data Flow
- How does input become output?
- Which steps must be deterministic?
Validation
- What is the simplest test that proves correctness?
- How will you detect regressions?

5.6 Thinking Exercise

Trace the Critical Path

Write a step-by-step trace of the most important workflow in this project.

Questions to answer:

Where could a subtle bug hide?
What would you log to prove correctness?

5.7 The Interview Questions They’ll Ask

“What is the core invariant this project relies on?”
“How would you debug a failure in this workflow?”
“What trade-offs did you make in design?”
“How does this map to real hardware or toolchains?”
“How do you prove your output is correct?”

5.8 Hints in Layers

Hint 1: Start small. Focus on the smallest input that still demonstrates the concept.

Hint 2: Make output deterministic. Fix inputs and produce stable logs before expanding functionality.

Hint 3: Validate against a known reference. Compare with a known-good output or specification.

Hint 4: Add instrumentation. Log internal steps so you can verify each phase explicitly.

5.9 Books That Will Help

Topic	Book	Chapter
Core concept	“ARM Assembly Language” by William Hohl	Ch. 3-5
Binary formats	“Linkers and Loaders” by John R. Levine	Ch. 1-3

5.10 Implementation Phases

Phase 1: Foundation (2-4 hours)

Goals:

Establish a minimal working pipeline
Validate one end-to-end path

Tasks:

1. Build the smallest viable input and output
2. Verify outputs against a reference

Checkpoint: Output matches expected golden path

Phase 2: Core Functionality (4-8 hours)

Goals:

Implement main logic and validation
Add structured error handling

Tasks:

1. Implement the core transformation
2. Add deterministic reporting

Checkpoint: Core tests pass reliably

Phase 3: Polish & Edge Cases (2-4 hours)

Goals:

Cover edge cases
Improve output clarity

Tasks:

1. Add negative tests
2. Document limitations

Checkpoint: All edge cases handled gracefully

5.11 Key Implementation Decisions

Decision	Options	Recommendation	Rationale
Input format	Free-form vs structured	Structured	Easier validation
Output format	Human vs machine	Both	Supports verification and tooling

6. Testing Strategy

6.1 Test Categories

Category	Purpose	Examples
Unit Tests	Validate core logic	Field parsing, bounds checks
Integration Tests	Validate full flow	End-to-end CLI runs
Edge Case Tests	Validate boundaries	Empty input, invalid flags

6.2 Critical Test Cases

Golden path: Fixed input produces known output.
Invalid input: Error path triggers correct exit code.
Boundary case: Maximum supported value handled correctly.

6.3 Test Data

Input: fixed seed or fixed fixture
Expected: exact output text from §3.4

7. Common Pitfalls & Debugging

Pitfall	Symptom	Solution
Misaligned assumptions	Unexpected output	Re-check invariants
Missing validation	Silent failures	Add explicit checks
Non-determinism	Flaky output	Fix inputs and seeds

7.2 Debugging Strategies

Trace everything: Log each step with stable ordering
Compare against reference: Use known-good outputs

7.3 Performance Traps

Avoid repeated parsing of the same input; cache results when possible

8. Extensions & Challenges

8.1 Beginner Extensions

Add one extra output format
Add a help screen with examples

8.2 Intermediate Extensions

Add a verification mode that compares two outputs
Add structured JSON output

8.3 Advanced Extensions

Add a batch mode for large inputs
Add cross-target comparisons (M vs A profile)

9. Real-World Connections

9.1 Industry Applications

Firmware bring-up: use the same checks to validate early boot images
Security audits: analyze binaries for ABI or control-flow correctness

9.2 Related Open Source Projects

binutils: source of many ARM tooling workflows
QEMU: emulator used for ARM testing

9.3 Interview Relevance

Explains why ARM behavior differs across profiles
Demonstrates toolchain literacy and debugging rigor

10. Resources

10.1 Essential Reading

“ARM Assembly Language” by William Hohl - practical instruction usage
“Linkers and Loaders” by John R. Levine - binary layout

10.2 Video Resources

ARM architecture overview talks and lectures

10.3 Tools & Documentation

GNU binutils documentation
Arm developer documentation

10.4 Related Projects in This Series

This project connects with: P01-toolchain-pipeline-explorer.md, P02-register-stack-visualizer.md, P03-thumb-encoder-decoder.md

11. Self-Assessment Checklist

11.1 Understanding

I can explain the core concept without notes
I can explain why my design choices were necessary
I can describe one realistic failure mode

11.2 Implementation

All functional requirements are met
Tests pass deterministically
Edge cases are documented

11.3 Growth

I can describe what I would improve next time
I can explain this project in an interview

12. Submission / Completion Criteria

Minimum Viable Completion:

Core functionality works on reference inputs
Deterministic golden path is documented
At least one failure path is demonstrated

Full Completion:

All minimum criteria plus:
Edge cases are covered with tests
Output format is stable and documented

Excellence (Going Above & Beyond):

Add a comparison against a second target
Provide a short write-up of lessons learned
