Project 3: Shadow Page Table Simulator

Build a simulator that mirrors guest page tables into shadow tables and quantifies the exit overhead.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Level 4: Advanced |
| Time Estimate | 2-3 weeks |
| Main Programming Language | C |
| Alternative Programming Languages | Rust, Python |
| Coolness Level | Level 4: MMU Wizardry |
| Business Potential | Level 2: Systems Cred |
| Prerequisites | Paging, TLBs, VM exits |
| Key Topics | Shadow paging, dirty tracking |

1. Learning Objectives

By completing this project, you will:

  1. Simulate GVA->GPA->HPA translation with shadow tables.
  2. Quantify the overhead of trapping guest PT writes.
  3. Explain why EPT/NPT reduce exit rates.
  4. Produce deterministic traces and statistics.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Memory Virtualization and Shadow Paging

Fundamentals

Memory virtualization lets each guest OS believe it has contiguous physical memory. The guest maps guest virtual addresses (GVA) to guest physical addresses (GPA) using its own page tables. The hypervisor then maps GPA to host physical addresses (HPA). Historically, hypervisors used shadow page tables to maintain a combined mapping, trapping on guest page-table updates. Modern CPUs provide EPT/NPT (second-level translation) so the hardware performs both translations; this reduces exits but introduces new fault types (EPT violations). Memory virtualization also enables overcommit and ballooning, which can improve utilization but introduce performance cliffs.

It is useful to separate correctness from performance. Correctness means each guest sees a consistent memory model with isolation from other guests. Performance means minimizing page faults, TLB misses, and exit overhead. Hypervisors constantly trade these goals by choosing page sizes, tracking dirty pages, and deciding when to reclaim memory. These trade-offs show up immediately in migration behavior and in the latency profile of memory-heavy workloads.

Deep Dive into the concept

The fundamental challenge is that the guest OS believes it controls physical memory, but the host must multiplex and protect that memory across VMs. With shadow page tables, the VMM maintains a mapping from GVA to HPA by shadowing the guest's page tables. Every guest write to its own page tables must be trapped so the shadow can be updated. This is correct but expensive: a busy guest can trigger thousands of exits just to update page tables, and TLB flushes become frequent.
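The trap-and-sync mechanism fits in a few lines. Below is a minimal C sketch, assuming flat one-entry-per-page maps instead of real multi-level trees, and assuming all page numbers fit in the toy address space (every name here is illustrative):

#include <stdint.h>

#define NPAGES 256                      /* toy flat address space */

typedef uint64_t gpa_t, hpa_t;

static gpa_t guest_pt[NPAGES];          /* guest PT:  GVA page -> GPA page */
static hpa_t vmm_pmap[NPAGES];          /* VMM map:   GPA page -> HPA page */
static hpa_t shadow_pt[NPAGES];         /* shadow PT: GVA page -> HPA page */
static unsigned long exits;

/* Every guest PT write traps into the VMM; the shadow is resynced so
 * that shadow_pt[gva] == vmm_pmap[guest_pt[gva]] always holds. */
void on_guest_pt_write(unsigned gva_page, gpa_t gpa_page) {
    exits++;                            /* one VM exit per PT write */
    guest_pt[gva_page]  = gpa_page;     /* assume gpa_page < NPAGES */
    shadow_pt[gva_page] = vmm_pmap[gpa_page];
    /* a real VMM must also invalidate the stale TLB entry here */
}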

Hardware-assisted translation (EPT on Intel, NPT on AMD) changes the trade-off. The CPU walks the guest page tables to translate GVA to GPA, then walks the EPT/NPT tables to translate GPA to HPA. The guest can update its own page tables without exits. The hypervisor handles second-level faults when the GPA does not map to an HPA. This enables lazy allocation, demand paging, and copy-on-write. However, it also increases TLB pressure because translations are effectively two-level. Performance tuning often involves large pages, page-walk caching, and careful TLB invalidation strategies.
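For contrast, a hedged sketch of the EPT path, reusing the declarations above. The guest PT write no longer exits; the cost moves into the two-dimensional walk, where a 4-level guest table under a 4-level EPT can touch up to 24 memory locations on a TLB miss:

/* EPT/NPT sketch: the two walks compose at translation time instead
 * of being pre-combined into a shadow table. */
hpa_t translate_ept(unsigned gva_page) {
    gpa_t gpa = guest_pt[gva_page];     /* first dimension: guest-owned */
    return vmm_pmap[gpa];               /* second dimension: VMM-owned; assume gpa < NPAGES */
}

void on_guest_pt_write_ept(unsigned gva_page, gpa_t gpa_page) {
    guest_pt[gva_page] = gpa_page;      /* no trap, no exit, nothing to sync */
}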

Overcommit introduces another layer. A hypervisor might allocate more guest memory than physically available, assuming the guest will not touch it all. To make this safe, hypervisors use balloon drivers inside guests to reclaim unused pages, or use deduplication to share identical pages across VMs. These techniques improve density but can create latency spikes under memory pressure. Live migration also depends on dirty-page tracking: the hypervisor must know which pages were modified since the last copy.
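Dirty-page tracking, which pre-copy migration depends on, reduces to a bitmap that the write path sets and the copy loop drains. A sketch, continuing the toy declarations above (copy_page is hypothetical):

#include <string.h>

static uint8_t dirty[NPAGES / 8];       /* one bit per guest page */

static inline void mark_dirty(unsigned page) {
    dirty[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* One pre-copy round: copy pages dirtied since the last pass, then
 * clear the bitmap; repeat until the remaining dirty set is small
 * enough to stop-and-copy. */
unsigned migrate_round(void) {
    unsigned copied = 0;
    for (unsigned p = 0; p < NPAGES; p++)
        if (dirty[p / 8] & (1u << (p % 8)))
            copied++;                   /* copy_page(p) in a real VMM */
    memset(dirty, 0, sizeof dirty);
    return copied;
}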

NUMA adds another dimension. A VM may span multiple NUMA nodes, but if its vCPUs run on one node while its memory sits on another, memory access latency increases and bandwidth drops. Hypervisors therefore try to keep vCPU and memory locality aligned. Some platforms expose virtual NUMA topologies to the guest so it can make NUMA-aware scheduling decisions.

Memory virtualization also affects security. Side channels such as page-fault timing or cache contention can leak information between guests. Techniques like page coloring and memory bandwidth throttling are used in high-security environments to reduce leakage. These are advanced topics, but they show why memory virtualization is not just about mapping addresses.

Performance tuning often revolves around page size. Huge pages reduce TLB pressure, but they make dirty tracking and snapshots coarse-grained, which can lengthen migration. And because memory isolation is a security boundary, incorrect mappings can leak or corrupt data, so verification and careful TLB invalidation are non-negotiable.

How this fits into the project

This project is a direct application: you will simulate the shadow paging mechanism and measure its overhead.

Definitions & key terms

  • GVA/GPA/HPA: guest virtual, guest physical, host physical addresses.
  • Shadow page tables: VMM-maintained combined translation tables.
  • EPT/NPT: hardware second-level translation.
  • Dirty tracking: recording modified pages for migration.

Mental model diagram

GVA --(guest PT)--> GPA --(VMM map)--> HPA
GVA --------------(shadow PT)--------> HPA   (one combined hop)

How it works (step-by-step, with invariants and failure modes)

  1. Guest updates its page table.
  2. Hypervisor traps the update.
  3. Hypervisor updates the shadow table.
  4. CPU uses shadow table for translation.

Invariants: every shadow entry must equal the guest mapping composed with the VMM's GPA->HPA map, and a trap must fire on every guest PT write. Failure modes include stale shadow entries (missed traps) and incorrect translations (bad composition or missed TLB invalidation).
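That invariant is mechanically checkable, and a deterministic simulator can afford to assert it after every event. A sketch, reusing the maps from the first code sketch in this section:

#include <assert.h>

/* Passes iff every shadow entry equals the guest mapping composed with
 * the VMM's GPA->HPA map; any missed trap breaks this immediately. */
void check_shadow_invariant(void) {
    for (unsigned gva = 0; gva < NPAGES; gva++)
        assert(shadow_pt[gva] == vmm_pmap[guest_pt[gva]]);
}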

Minimal concrete example

GUEST: write PTE (GVA 0x1000 -> GPA 0x5000)
VMM: trap -> update shadow to HPA 0x8A000

Common misconceptions

  • Shadow paging is only slower by a small constant factor.
  • Shadow paging eliminates all faults.

Check-your-understanding questions

  1. Why must guest PT writes be trapped?
  2. What is the cost of frequent TLB flushes?
  3. Why did EPT replace shadow paging in production?

Check-your-understanding answers

  1. The hypervisor must keep shadow mappings consistent.
  2. TLB flushes reduce cache locality and add latency.
  3. EPT reduces exit frequency and simplifies PT handling.

Real-world applications

  • Legacy hypervisors without EPT/NPT
  • Educational VMMs and simulators

Where you’ll apply it

  • Apply in §3.2 (functional requirements) and §4.2 (data structures)
  • Also used in: P01-toy-kvm-hypervisor

References

  • OSTEP chapters on paging
  • Intel SDM Vol. 3C (EPT)

Key insights

Shadow paging is correct but costly; that cost explains the importance of EPT/NPT.

Summary

You now understand why shadow paging was a performance bottleneck in early virtualization.

Homework/Exercises to practice the concept

  1. Draw a two-level shadow mapping for a sample address.
  2. Estimate exit counts for a guest that updates PTs frequently.

Solutions to the homework/exercises

  1. Map GVA->GPA via guest PT, then shadow to HPA.
  2. Each PT write induces a trap and a shadow update; exits scale with PT writes (worked estimate below).
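For exercise 2, a back-of-envelope estimate makes the scaling concrete; the per-exit cycle figure below is an assumed round number, not a measurement:

exits    ≈ PT writes                 (one trap per write under shadow paging)
overhead ≈ exits × cost per exit
example:   1,247 writes × ~2,000 cycles/exit ≈ 2.5 million cycles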

3. Project Specification

3.1 What You Will Build

A simulator that models guest PT updates and shadow PT synchronization, producing a trace and statistics.

3.2 Functional Requirements

  1. Represent guest page tables.
  2. Trap guest PT writes in the simulator.
  3. Maintain a shadow PT mapping to HPA.
  4. Output stats on traps and flushes.

3.3 Non-Functional Requirements

  • Performance: Able to simulate thousands of accesses quickly.
  • Reliability: Deterministic output for the same input trace.
  • Usability: Clear logs and summary stats.

3.4 Example Usage / Output

$ ./shadow-sim
[GUEST] PTE write: GVA 0x1000 -> GPA 0x5000
[VMM] trap -> shadow update
[STATS] PT writes: 1247, shadow updates: 1247, TLB flushes: 89

3.5 Data Formats / Schemas / Protocols

  • Input trace of guest memory operations (sample format below)
  • Output log of translations and traps
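A plain-text, one-event-per-line trace keeps parsing and diffing trivial. The layout below is one possible choice, not a mandated schema:

PT_WRITE <gva> <gpa>
ACCESS   <gva>

For example, traces/sample_trace.txt might contain:

PT_WRITE 0x1000 0x5000
ACCESS 0x1000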

3.6 Edge Cases

  • Guest writes to unmapped PTs
  • Conflicting mappings for same GVA

3.7 Real World Outcome

A trace and summary that clearly show shadow paging overhead.

3.7.1 How to Run (Copy/Paste)

  • Run with a provided trace file:

$ ./shadow-sim traces/sample_trace.txt

3.7.2 Golden Path Demo (Deterministic)

  • Trace shows a PT write and a shadow update

3.7.3 Exact Terminal Transcript (CLI)

$ ./shadow-sim trace.txt
[TRACE] PT_WRITE 0x1000->0x5000
[TRACE] SHADOW_UPDATE 0x1000->0x8A000
[STATS] exits=1247 tlb_flushes=89

4. Solution Architecture

4.1 High-Level Design

Trace reader -> Guest PT model -> Shadow PT model -> Stats

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| PT model | Represent mappings | Simple map first |
| Trap logic | Detect PT writes | Explicit events |
| Stats | Count exits | Deterministic counters |

4.3 Data Structures (No Full Code)

  • Guest PT: map of GVA->GPA
  • Shadow PT: map of GVA->HPA (both sketched in C below)
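In C, those two bullets plus the counters might look like this (field names are illustrative; flat arrays or hash maps both work at this scale):

#include <stdint.h>
#include <stdbool.h>

typedef struct { uint64_t gva, gpa; bool present; } guest_pte_t;   /* GVA -> GPA */
typedef struct { uint64_t gva, hpa; bool present; } shadow_pte_t;  /* GVA -> HPA */

typedef struct {
    unsigned long pt_writes, shadow_updates, tlb_flushes, exits;
} stats_t;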

4.4 Algorithm Overview

  1. Read trace event
  2. If PT write, trap and update shadow
  3. If access, translate via the shadow and log (full loop sketched below)
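Putting the three steps together, here is a complete but deliberately naive event loop, assuming the space-separated trace format sketched in §3.5 and flat maps indexed by page number (a sketch under those assumptions, not the required design):

#include <stdio.h>
#include <string.h>

#define NPAGES 256                              /* toy address space */

static unsigned long guest_pt[NPAGES], vmm_pmap[NPAGES], shadow_pt[NPAGES];
static unsigned long exits, shadow_updates;

int main(int argc, char **argv) {
    FILE *trace = fopen(argc > 1 ? argv[1] : "traces/sample_trace.txt", "r");
    if (!trace) { perror("trace"); return 1; }

    char op[16];
    unsigned long gva, gpa;
    while (fscanf(trace, "%15s %lx", op, &gva) == 2) {
        unsigned long page = (gva / 4096) % NPAGES;      /* GVA -> page index */
        if (strcmp(op, "PT_WRITE") == 0 && fscanf(trace, "%lx", &gpa) == 1) {
            exits++;                                     /* simulated trap */
            guest_pt[page]  = gpa;
            shadow_pt[page] = vmm_pmap[(gpa / 4096) % NPAGES];
            shadow_updates++;
            printf("[TRACE] PT_WRITE %#lx->%#lx\n", gva, gpa);
        } else {
            printf("[TRACE] ACCESS %#lx -> HPA %#lx\n", gva, shadow_pt[page]);
        }
    }
    printf("[STATS] exits=%lu shadow_updates=%lu\n", exits, shadow_updates);
    fclose(trace);
    return 0;
}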

5. Implementation Guide

5.1 Development Environment Setup

# Standard C toolchain (gcc or clang, plus make)
# e.g. Debian/Ubuntu: sudo apt-get install build-essential

5.2 Project Structure

project-root/
├── src/
│   ├── main.c
│   └── shadow_pt.c
├── traces/
│   └── sample_trace.txt
└── README.md

5.3 The Core Question You’re Answering

“Why was shadow paging so expensive, and what did EPT fix?”

5.4 Concepts You Must Understand First

  1. Page tables and TLBs
  2. VM exit overhead
  3. Dirty tracking

5.5 Questions to Guide Your Design

  1. How will you model PT writes and traps?
  2. What statistics best show overhead?

5.6 Thinking Exercise

Manually trace a PT write followed by a memory access.

5.7 The Interview Questions They’ll Ask

  1. “Why do shadow page tables require trapping guest PT writes?”
  2. “How does EPT reduce VM exits?”

5.8 Hints in Layers

Hint 1: Start with a single-level PT.
Hint 2: Add a trap event for PT writes.
Hint 3: Pseudocode:

IF event == PT_WRITE -> trap: update guest PT, sync shadow, count exit
ELSE                 -> translate via shadow, log the access

Hint 4: Print each translation step to debug.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Paging | "Operating Systems: Three Easy Pieces" | Ch. 18-20 |
| Memory | "CSAPP" | Ch. 9 |

5.10 Implementation Phases

  • Phase 1: Basic translation
  • Phase 2: PT write traps
  • Phase 3: Stats and reporting

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Trace format | Text vs. binary | Text | Easier debugging |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit Tests | Mapping logic | Translation correctness |
| Integration Tests | Trace replay | Stats correctness |

6.2 Critical Test Cases

  1. PT write updates shadow.
  2. Access uses shadow mapping.

6.3 Test Data

Trace: PT_WRITE 0x1000->0x5000; ACCESS 0x1000
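For that trace, a deterministic run should produce output along these lines (the HPA follows the earlier examples; whether a fresh mapping counts as a TLB flush depends on your chosen policy):

[TRACE] PT_WRITE 0x1000->0x5000
[TRACE] SHADOW_UPDATE 0x1000->0x8A000
[TRACE] ACCESS 0x1000 -> HPA 0x8A000
[STATS] exits=1 shadow_updates=1 tlb_flushes=0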

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Missing trap | Shadow stale | Trap PT writes |
| Bad mapping | Wrong translation | Validate tables |

7.2 Debugging Strategies

  • Print each PT walk step.

7.3 Performance Traps

  • Excessive logging can distort timing.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add two-level paging.

8.2 Intermediate Extensions

  • Add dirty tracking stats.

8.3 Advanced Extensions

  • Compare to simulated EPT.

9. Real-World Connections

9.1 Industry Applications

  • Legacy hypervisors
  • KVM (EPT)

9.2 Interview Relevance

  • Shadow paging vs EPT

10. Resources

10.1 Essential Reading

  • OSTEP chapters on paging
  • Intel SDM EPT section

10.2 Video Resources

  • Virtual memory lectures