Project 4: Userspace Memory Mapper with MMIO Traps

Build a userspace memory manager that simulates guest RAM, MMIO regions, and access traps.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Level 3: Intermediate |
| Time Estimate | 1-2 weeks |
| Main Programming Language | C (Alternatives: Rust) |
| Alternative Programming Languages | Rust |
| Coolness Level | Level 3: Memory Alchemy |
| Business Potential | Level 2: Core Systems Skill |
| Prerequisites | Virtual memory, signals |
| Key Topics | MMIO, memory protection, device traps |

1. Learning Objectives

By completing this project, you will:

  1. Carve a virtual address space into RAM and MMIO regions.
  2. Trigger and handle access faults as device traps.
  3. Track dirty pages for migration-style reporting.
  4. Produce deterministic logs for MMIO access.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Memory Protection and MMIO Traps

Fundamentals MMIO is a technique where device registers are mapped into the CPU’s address space. A guest reads or writes those addresses as if they were normal memory, but the hypervisor intercepts those accesses and emulates device behavior. This requires memory protection: the hypervisor marks MMIO ranges as inaccessible so that any access triggers a fault that the hypervisor can handle. The rest of guest RAM remains accessible and behaves like normal memory. This simple mechanism is the foundation of device emulation in many VMMs.

Separating RAM from MMIO is also a correctness requirement. If an MMIO range is mistakenly mapped as normal RAM, the guest will write to it without trapping, and the device state will never update. If normal RAM is accidentally trapped, guest performance collapses. A clean memory map and reliable fault handling are therefore essential to any VM system.

Deep Dive into the concept MMIO virtualization is effectively a memory protection problem. The hypervisor (or in this project, your userspace manager) defines a memory map that includes contiguous RAM regions and sparse MMIO windows. Each MMIO window corresponds to a device’s register range. To emulate device behavior, the hypervisor sets those ranges to trigger access faults. When the guest reads or writes, the fault handler inspects the address, determines which device it belongs to, and calls the appropriate emulation routine.

This creates a clean separation between the data plane and the control plane. Guest RAM is handled directly by hardware (or by normal memory operations in your simulator), while device operations take the slow path through the trap handler. This is why MMIO-based devices are often slower than paravirtual devices: every register access induces a trap.

MMIO also interacts with alignment and width. A guest may access a device register as a byte, a word, or a double word, and the hypervisor must preserve the device’s semantics for each access size. Some devices expect specific alignment, and unaligned accesses may produce undefined behavior or trigger additional faults. Correct emulation requires careful definition of read/write widths and side effects.
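To make the width point concrete, here is a minimal sketch of a device-facing callback interface that carries the access size explicitly; the `mmio_ops` struct and the `uart_read` stub are hypothetical names for this project, not a required API.

```c
#include <stdint.h>

/* Hypothetical per-device callback interface: every MMIO access is dispatched
 * with an explicit offset and width, so the device model can honor register
 * semantics for 1-, 2-, and 4-byte accesses separately. */
struct mmio_ops {
    uint64_t (*read)(void *dev, uint64_t offset, unsigned size);
    void     (*write)(void *dev, uint64_t offset, uint64_t value, unsigned size);
};

/* Example stub: a device that only defines byte-wide registers can reject
 * other widths instead of silently misbehaving. */
static uint64_t uart_read(void *dev, uint64_t offset, unsigned size)
{
    (void)dev;
    (void)offset;
    if (size == 1)
        return 0;      /* would return the addressed register's value */
    return ~0ULL;      /* unexpected width: "open bus" convention */
}
```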

In a real hypervisor, MMIO trapping may be implemented with EPT/NPT permissions. The hypervisor marks MMIO pages as non-present in the second-level page tables, so any access triggers an EPT violation. In your userspace project, you can simulate this by marking pages as inaccessible and catching the resulting access fault. The conceptual flow is the same: the fault handler becomes the device dispatch engine.
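A minimal sketch of that userspace simulation, assuming Linux with `mmap`, `mprotect` semantics, and `SIGSEGV` delivery via `sigaction`; the addresses, sizes, and the choice to exit after the first trap are illustrative only.

```c
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

#define RAM_SIZE  (16u << 20)   /* 16 MiB of simulated guest RAM */
#define MMIO_SIZE 4096u         /* one page standing in for UART registers */

static uint8_t *ram, *mmio;

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    uint8_t *addr = (uint8_t *)info->si_addr;
    if (addr >= mmio && addr < mmio + MMIO_SIZE) {
        /* MMIO trap: this is where device dispatch would happen. A full
         * handler must also emulate and skip the faulting instruction (or
         * siglongjmp out); simply returning would re-fault forever. For this
         * demo we log and exit. (fprintf is not async-signal-safe; tolerable
         * only because the process exits immediately.) */
        fprintf(stderr, "[MMIO] trap at offset 0x%lx\n",
                (unsigned long)(addr - mmio));
        _exit(0);
    }
    abort();                    /* fault outside known regions: a real bug */
}

int main(void)
{
    /* Error checks omitted for brevity. */
    ram  = mmap(NULL, RAM_SIZE,  PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    mmio = mmap(NULL, MMIO_SIZE, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa = {0};
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    ram[0]  = 0x41;   /* RAM write: no trap */
    mmio[0] = 0x41;   /* MMIO write: SIGSEGV -> segv_handler */
    return 0;
}
```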

Dirty page tracking is another related concept. If you mark RAM pages read-only and trap on write, you can record which pages were modified. This mirrors dirty logging used in live migration. The trap handler updates a dirty bitmap and then allows the write to proceed. This is slower than hardware dirty bits but is conceptually simple and demonstrates the underlying idea.
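A sketch of that idea, assuming a page-aligned RAM mapping like the one above; `dirty_log_start` and `handle_ram_write_fault` are illustrative helpers, and the fault handler is assumed to call the latter whenever the faulting address falls inside RAM.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define RAM_SIZE  (16u << 20)
#define PAGE_SIZE 4096u

static uint8_t *ram;                              /* guest RAM base, mapped elsewhere */
static uint8_t  dirty[RAM_SIZE / PAGE_SIZE / 8];  /* one bit per page */

/* Arm dirty logging: make all of RAM read-only so the next write to any
 * page traps. */
static void dirty_log_start(void)
{
    memset(dirty, 0, sizeof(dirty));
    mprotect(ram, RAM_SIZE, PROT_READ);
}

/* Called from the SIGSEGV handler for a write fault inside RAM. */
static void handle_ram_write_fault(uint8_t *addr)
{
    size_t page = (size_t)(addr - ram) / PAGE_SIZE;
    dirty[page / 8] |= (uint8_t)(1u << (page % 8));   /* record the page */
    /* Re-enable writes on just this page; returning from the handler
     * re-executes the faulting store, which now succeeds. */
    mprotect(ram + page * PAGE_SIZE, PAGE_SIZE, PROT_READ | PROT_WRITE);
}
```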

Finally, MMIO regions must be excluded from normal memory allocation. The guest memory allocator should never hand out addresses inside MMIO ranges. In a real VMM, this is enforced by the VM’s memory map and by firmware tables presented to the guest. In your project, you will enforce it by design: MMIO ranges are fixed and distinct from RAM.

Performance tuning often revolves around page size and locality. Huge pages reduce TLB pressure, but they can make dirty tracking and snapshots coarse-grained, increasing migration time. NUMA locality is another key factor: if vCPUs run on one NUMA node while memory resides on another, latency increases. Hypervisors may expose virtual NUMA topologies to help guests optimize placement. Finally, memory isolation is a security boundary: incorrect mappings can leak or corrupt data, so verification and careful invalidation are non-negotiable.

How this fits into the project This concept is the heart of the project: you will implement MMIO traps and memory-map enforcement.

Definitions & key terms

  • MMIO: memory-mapped I/O; device registers mapped into address space.
  • Trap: an access fault that redirects control to the VMM.
  • Dirty page: a page that has been written since last scan.
  • Memory map: division of address space into RAM and MMIO ranges.

Mental model diagram

Guest write -> MMIO range -> trap handler -> device emulation

How it works (step-by-step, with invariants and failure modes)

  1. Define RAM and MMIO ranges.
  2. Mark MMIO ranges as inaccessible.
  3. On fault, identify device by address.
  4. Emulate read/write and update state.

Invariants: MMIO ranges must trap; RAM must not trap. Failure modes include mis-mapped ranges and missing handlers.

Minimal concrete example

WRITE 0x10000000 (UART) -> trap -> UART handler emits 'A'

Common misconceptions

  • MMIO is just like normal RAM.
  • Trapping every MMIO access is always acceptable.

Check-your-understanding questions

  1. Why must MMIO ranges be protected?
  2. What happens if MMIO is mapped as RAM?
  3. How can dirty tracking be implemented with traps?

Check-your-understanding answers

  1. To intercept device register accesses for emulation.
  2. Device state never updates, causing undefined guest behavior.
  3. Mark pages read-only, trap on write, record dirty bitmap.

Real-world applications

  • Device emulation in QEMU
  • MMIO-based device models in hypervisors

Where you’ll apply it

  • Apply in §3.5 (memory map) and §4.1 (trap handler design)
  • Also used in: P05-virtio-block-device

References

  • Linux memory protection and mmap documentation
  • QEMU memory and MMIO docs

Key insights MMIO virtualization is memory protection plus a device dispatch engine.

Summary You now understand how MMIO traps enable device emulation via memory protection.

Homework/Exercises to practice the concept

  1. Sketch a memory map with two MMIO regions.
  2. Explain how a fault handler decides which device to emulate.

Solutions to the homework/exercises

  1. RAM from 0x00000000 to 0x0fffffff; MMIO: UART at 0x10000000-0x10000fff and a second device at 0x10001000-0x10001fff (MMIO ranges disjoint from RAM).
  2. Compare fault address to known MMIO ranges.

2.2 Device Emulation Fundamentals

Fundamentals Device emulation means providing a software model of a device that responds to guest reads and writes. The emulator must preserve the device’s register semantics, interrupt behavior, and reset logic. In early hypervisors, emulated devices were the default because they offered compatibility with existing OS drivers. The downside is performance: every register access triggers a trap, which is expensive. Still, emulation is essential for bootstrapping and for devices without virtio drivers.

Deep Dive into the concept Emulated devices are state machines. Each register read or write can change device state, trigger interrupts, or initiate I/O. Correct emulation requires a precise model of these state transitions. For example, a UART device has transmit and receive registers, FIFO buffers, and interrupt enable bits. A read from the receive buffer consumes a byte; a write to the transmit buffer may trigger an interrupt when the FIFO becomes empty.
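To make the state-machine view concrete, here is a minimal sketch of a UART-like device model. Register offsets loosely follow the 16550 convention (data register at offset 0, interrupt enable at 1, line status at 5), but the model is deliberately simplified and the names are illustrative.

```c
#include <stdint.h>
#include <stdio.h>

struct uart {
    uint8_t ier;          /* interrupt enable register */
    uint8_t rx_byte;      /* last received byte, if any */
    int     rx_ready;     /* LSR data-ready bit */
    int     irq_pending;  /* would be injected into the guest */
};

static uint8_t uart_mmio_read(struct uart *u, uint64_t offset)
{
    switch (offset) {
    case 0:                       /* RBR: reading consumes the byte */
        u->rx_ready = 0;
        return u->rx_byte;
    case 5:                       /* LSR: bit 0 = data ready, bit 5 = THR empty */
        return (uint8_t)((u->rx_ready ? 0x01 : 0x00) | 0x20);
    default:
        return 0;
    }
}

static void uart_mmio_write(struct uart *u, uint64_t offset, uint8_t value)
{
    switch (offset) {
    case 0:                       /* THR: "transmit" by printing to the host */
        putchar(value);
        if (u->ier & 0x02)        /* THR-empty interrupt enabled? */
            u->irq_pending = 1;   /* a real VMM would inject an interrupt here */
        break;
    case 1:
        u->ier = value;
        break;
    default:
        break;                    /* unmodeled registers are ignored */
    }
}
```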

Emulation performance depends on reducing the number of traps. Some hypervisors batch operations, coalesce interrupts, or use “fast paths” for common accesses. But the core cost remains: every MMIO or port I/O access requires a transition to the hypervisor. This is why virtio was created: it reduces trap frequency by moving data transfer to shared memory queues.

Even in a system with virtio, emulated devices remain important. Many guests use emulated devices during boot before virtio drivers load. Some guest OSes may not have virtio support at all. Therefore, correctness of emulation is non-negotiable: a bug in device emulation can crash or corrupt a guest OS.

Device emulation also has security implications. The hypervisor is parsing guest-provided inputs; a bug in the emulated device can allow a guest to escape. This is why modern hypervisors use fuzzing and strict input validation for device models. In your project, you will focus on correctness and clarity rather than security hardening, but it is important to recognize that device emulation is a security boundary.

Device performance depends on queue sizing and interrupt behavior. If queues are too small, the guest stalls waiting for buffers; if they are too large, latency can increase and memory consumption grows. Interrupt moderation and batching reduce exit overhead but can add latency. Backend choice also matters: a virtio device backed by a slow storage layer cannot be fast, even if the frontend is optimized. Migration imposes additional constraints, because device state must be serializable and consistent across hosts.

How this fits into the project This concept provides the mental model for MMIO trap handling and device responses.

Definitions & key terms

  • Device model: software state machine that mimics a device.
  • Register semantics: meaning of reads/writes to device registers.
  • Interrupt injection: delivering device events to the guest.

Mental model diagram

Guest access -> MMIO trap -> device state machine -> response

How it works (step-by-step, with invariants and failure modes)

  1. Guest writes to a device register.
  2. Trap handler identifies device.
  3. Device model updates state and produces response.
  4. Optional interrupt is injected.

Invariants: device state transitions must match specification. Failure modes include missing interrupts or incorrect register behavior.

Minimal concrete example

WRITE UART.THR = 'H' -> device queues byte -> interrupt set

Common misconceptions

  • Emulated devices are always sufficient for performance.
  • Device emulation is purely a software detail with no security impact.

Check-your-understanding questions

  1. Why do emulated devices cause many VM exits?
  2. How can a device model bug lead to a security issue?

Check-your-understanding answers

  1. Every register access triggers a trap into the hypervisor.
  2. The device model parses guest inputs; bugs can allow escapes.

Real-world applications

  • QEMU device models
  • Legacy device support in hypervisors

Where you’ll apply it

  • Apply in §3.4 (emulation requirements) and §4.2 (component design)
  • Also used in: P05-virtio-block-device, P06-virtio-net-device

References

  • QEMU device emulation documentation
  • Hardware datasheets for device registers

Key insights Device emulation is a correctness-first state machine with security implications.

Summary You now understand how emulated devices respond to MMIO accesses.

Homework/Exercises to practice the concept

  1. Sketch a UART register map and its state transitions.
  2. Explain why virtio reduces exit frequency.

Solutions to the homework/exercises

  1. Map registers to state changes (TX FIFO, RX FIFO, interrupts).
  2. Virtio uses shared queues instead of per-register traps.

3. Project Specification

3.1 What You Will Build

A userspace memory manager that simulates guest RAM, MMIO regions, and traps.

3.2 Functional Requirements

  1. Define RAM and MMIO ranges.
  2. Trap MMIO accesses and dispatch to handlers.
  3. Log MMIO reads/writes.
  4. Track dirty pages.

3.3 Non-Functional Requirements

  • Performance: handle thousands of accesses quickly.
  • Reliability: no crashes on valid accesses.
  • Usability: clear logs and deterministic behavior.

3.4 Example Usage / Output

$ ./memmap
[MEM] RAM: 512MB
[MEM] MMIO: UART 0x10000000-0x10000fff
[GUEST] write RAM OK
[GUEST] write MMIO -> TRAP

3.5 Data Formats / Schemas / Protocols

  • Memory map: list of ranges with permissions
  • Log format: access type, address, value
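One possible concrete choice, purely as an example: a single text line per access of the form `<R|W> <address> <width> <value>`, which keeps the log deterministic and easy to diff. The helper below is a hypothetical sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Emit one deterministic log line per access, e.g. "W 0x10000000 1 0x41". */
static void log_access(char kind, uint64_t addr, unsigned size, uint64_t value)
{
    printf("%c 0x%08llx %u 0x%llx\n", kind,
           (unsigned long long)addr, size, (unsigned long long)value);
}
```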

3.6 Edge Cases

  • Access to unmapped address
  • Access with unexpected width

3.7 Real World Outcome

You can see exactly which accesses are MMIO and which are RAM.

3.7.1 How to Run (Copy/Paste)

  • Run in a standard Linux environment

3.7.2 Golden Path Demo (Deterministic)

  • Write to RAM succeeds
  • Write to MMIO traps

3.7.3 If CLI: exact terminal transcript

$ ./memmap
[RAM] write ok
[MMIO] trap at 0x10000000 -> UART handler

4. Solution Architecture

4.1 High-Level Design

Memory map -> access -> trap -> device handler -> log

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Memory map | RAM/MMIO ranges | Fixed layout |
| Trap handler | Dispatch | Address-based routing |
| Device handlers | Emulation | Minimal UART stub |

4.3 Data Structures (No Full Code)

  • Range table: base, size, permissions
  • Dirty bitmap: per-page flags
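Below is one way the two structures above could look in C; the field choices are illustrative, not prescriptive.

```c
#include <stdint.h>

enum region_kind { REGION_RAM, REGION_MMIO };

struct mem_region {
    uint64_t base;            /* guest-physical start address */
    uint64_t size;            /* length in bytes */
    enum region_kind kind;    /* RAM vs MMIO */
    void *device;             /* device model for MMIO regions, NULL for RAM */
};

struct dirty_log {
    uint8_t *bitmap;          /* one bit per RAM page */
    uint64_t page_size;
};
```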

4.4 Algorithm Overview

  1. On access, check range
  2. If MMIO, trap and emulate
  3. If RAM, allow access
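The range check in step 1 can be a simple linear scan over the region table sketched above; a sorted table with binary search is an easy later optimization. `find_region` is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the region containing addr, or NULL if the address is unmapped. */
static struct mem_region *find_region(struct mem_region *table, size_t count,
                                      uint64_t addr)
{
    for (size_t i = 0; i < count; i++) {
        if (addr >= table[i].base && addr - table[i].base < table[i].size)
            return &table[i];
    }
    return NULL;   /* unmapped access: log and report an error */
}
```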

5. Implementation Guide

5.1 Development Environment Setup

# Standard C compiler toolchain (e.g., gcc or clang) plus make

5.2 Project Structure

project-root/
├── src/
│   ├── main.c
│   └── mmio.c
└── README.md

5.3 The Core Question You’re Answering

“How do hypervisors detect and emulate device accesses in guest memory space?”

5.4 Concepts You Must Understand First

  1. Memory protection and faults
  2. MMIO register semantics
  3. Trap dispatch

5.5 Questions to Guide Your Design

  1. How will you represent the memory map?
  2. How will you decode access width?

5.6 Thinking Exercise

Draw a memory map and simulate a device register write.

5.7 The Interview Questions They’ll Ask

  1. “What is MMIO and why is it used?”
  2. “How does a hypervisor intercept MMIO?”

5.8 Hints in Layers

Hint 1: Start with a single MMIO page.
Hint 2: Log every trap address.
Hint 3: Pseudocode

IF addr in MMIO -> trap -> handler
ELSE -> RAM access

Hint 4: Use a bitmap for dirty tracking.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Memory mapping | “The Linux Programming Interface” | Ch. 49 |
| I/O systems | “Operating System Concepts” | Ch. 13 |

5.10 Implementation Phases

  • Phase 1: Memory map
  • Phase 2: MMIO traps
  • Phase 3: Dirty tracking

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Trap method | SIGSEGV vs polling | SIGSEGV | Closer to real VMM |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit Tests | Range matching | MMIO detection |
| Integration Tests | Trap flow | UART handler invoked |

6.2 Critical Test Cases

  1. RAM write does not trap.
  2. MMIO write traps and logs.

6.3 Test Data

Access sequence: RAM write, MMIO write

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Wrong range | No trap | Fix MMIO bounds |
| Handler missing | Crash | Add default handler |

7.2 Debugging Strategies

  • Print memory map at startup.

7.3 Performance Traps

  • Excessive logging can slow the system.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a second MMIO device.

8.2 Intermediate Extensions

  • Implement read semantics.

8.3 Advanced Extensions

  • Add a virtio-style queue for device I/O.

9. Real-World Connections

9.1 Industry Applications

  • MMIO device models in QEMU
  • QEMU memory subsystem

9.3 Interview Relevance

  • MMIO, device emulation

10. Resources

10.1 Essential Reading

  • QEMU memory and MMIO docs

10.2 Video Resources

  • Talks on device emulation