Project 5: Shadow Page Table Simulator
Simulate the pre-EPT memory virtualization technique where the VMM maintains “shadow” page tables to translate guest virtual addresses directly to host physical addresses.
Quick Reference
| Attribute | Value |
|---|---|
| Language | C (alt: Rust, C++) |
| Difficulty | Advanced |
| Time | 2-3 weeks |
| Knowledge Area | Memory Virtualization / Page Tables |
| Coolness | Hardcore Tech Flex |
| Portfolio Value | Resume Gold |
| Hardware Needed | None (pure simulation) |
Learning Objectives
By completing this project, you will:
- Master three-level address translation: Understand how a Guest Virtual Address (GVA) maps to a Guest Physical Address (GPA), and how the GPA in turn maps to a Host Physical Address (HPA)
- Implement trap-and-emulate for page tables: Build the mechanism that detects and handles guest page table modifications
- Understand why shadow page tables are expensive: Quantify the VM exit cost through statistics and analysis
- Simulate TLB behavior: Model how Translation Lookaside Buffers cache address translations
- Compare to hardware-assisted virtualization: Understand why EPT/NPT was such a breakthrough
- Debug memory virtualization issues: Develop intuition for the bugs that plague real hypervisors
The Core Question You’re Answering
“How did hypervisors provide memory isolation before hardware support existed, and why was it so expensive?”
Before Intel’s Extended Page Tables (EPT) and AMD’s Nested Page Tables (NPT), hypervisors faced a fundamental problem: the guest OS expects to control its own page tables, but the VMM must ensure guests can only access memory allocated to them. The solution—shadow page tables—required intercepting every guest page table modification and maintaining a parallel set of translations. This project forces you to implement this mechanism and understand viscerally why hardware-assisted memory virtualization was revolutionary.
Concepts You Must Understand First
Before starting this project, ensure you understand these concepts:
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| x86 paging basics (PDE, PTE, CR3) | Shadow tables mirror guest page tables | OSTEP Ch. 18-20, Intel SDM Vol. 3A Ch. 4 |
| Page table walk algorithm | You’ll implement this three times | CS:APP Ch. 9, OSTEP Ch. 18 |
| Virtual memory address translation | Foundation of everything | OSTEP Ch. 18, CS:APP Ch. 9 |
| TLB operation and invalidation | Performance depends on TLB behavior | OSTEP Ch. 19, CS:APP Ch. 9.6 |
| Write protection and page faults | How you detect guest PT modifications | OSTEP Ch. 21, Intel SDM Vol. 3A Ch. 4 |
| Trap-and-emulate virtualization | Core hypervisor technique | Popek & Goldberg 1974, OSTEP Ch. 6-7 |
Key Concepts Deep Dive
- The Address Translation Problem in Virtualization
- Without virtualization: Process VA -> Physical Address (one level)
- With virtualization: Guest VA -> Guest PA -> Host PA (two levels)
- Why can’t the guest control real page tables?
- What happens if two guests try to use the same “physical” address?
- OSTEP Ch. 18-20, Virtualization Deep Dive resources
- Shadow Page Table Mechanism
- What is the relationship between guest PT and shadow PT?
- Why does the shadow PT map GVA -> HPA directly?
- When must the VMM update the shadow tables?
- What happens on a guest page fault?
- “The Definitive Guide to Xen” Ch. 8, VMware Papers
- Write-Protecting Guest Page Tables
- How does the VMM know when the guest modifies its page tables?
- What is the trap mechanism?
- Why is every PT modification expensive?
- Intel SDM Vol. 3A Ch. 4.6, KVM Source Code
- TLB Coherency in Virtualization
- When does the guest flush its TLB?
- When must the VMM flush the real TLB?
- What are the coherency requirements?
- OSTEP Ch. 19, Intel SDM Vol. 3A Ch. 4.10
- Performance Implications
- How many VM exits does a typical workload cause?
- What is the cost of each VM exit?
- Why did this drive the development of EPT/NPT?
- Intel VT-x Performance Papers, VMware Whitepapers
Theoretical Foundation
The Memory Virtualization Challenge
In a virtualized environment, we have multiple address spaces to manage:
┌─────────────────────────────────────────────────────────────────────────────┐
│ ADDRESS SPACE HIERARCHY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Guest Process Guest OS VMM/Host │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Guest │ Guest │ Guest │ VMM │ Host │ │
│ │ Virtual │──────────>│ Physical │──────>│ Physical │ │
│ │ Address │ Page │ Address │ Map │ Address │ │
│ │ (GVA) │ Tables │ (GPA) │ │ (HPA) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ Example: │
│ GVA 0x1000 ──────────────> GPA 0x5000 ──────────> HPA 0x8A000 │
│ │
│ The CPU only understands one level of translation. │
│ Shadow page tables collapse GVA -> GPA -> HPA into GVA -> HPA. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
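In code, that collapse is just function composition. A minimal sketch (the helper names and one-argument signatures here are hypothetical; the hints later in this guide use slightly different ones):

#include <stdint.h>

#define INVALID_ADDR ((uint64_t)-1)

// Walk the guest's page tables: GVA -> GPA
uint64_t gva_to_gpa(uint64_t gva);
// Consult the VMM's memory map:  GPA -> HPA
uint64_t gpa_to_hpa(uint64_t gpa);

// The shadow PT caches exactly this composition so the real MMU
// can resolve GVA -> HPA in a single step.
uint64_t translate(uint64_t gva) {
    uint64_t gpa = gva_to_gpa(gva);
    if (gpa == INVALID_ADDR) return INVALID_ADDR; // guest page fault
    return gpa_to_hpa(gpa);                       // VMM-level mapping
}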
Without Hardware Support: The Problem
The x86 CPU’s MMU only supports ONE level of page table translation. When a guest OS sets CR3 to point to its page tables:
Naive approach (WRONG):
- Let guest control real CR3
- Guest page tables translate GVA -> GPA
- Problem: GPA is NOT a real physical address!
┌─────────────────────────────────────────────────────────────────────────────┐
│ WHY NAIVE APPROACH FAILS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Guest thinks GPA 0x1000 is physical memory │
│ But GPA 0x1000 might be: │
│ - HPA 0x50000 for Guest 1 │
│ - HPA 0xA0000 for Guest 2 │
│ - Unmapped (requires allocation) │
│ - A device MMIO region │
│ │
│ If we let the guest directly control page tables: │
│ 1. Guest could access any physical memory (security violation) │
│ 2. Multiple guests would conflict (isolation violation) │
│ 3. VMM couldn't track memory usage (management failure) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
The Shadow Page Table Solution
The VMM maintains a “shadow” of the guest’s page tables that maps GVA directly to HPA:
┌─────────────────────────────────────────────────────────────────────────────┐
│ SHADOW PAGE TABLE ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Guest Page │ │ Shadow Page │ │
│ │ Tables │ Controlled │ Tables │ Used by │
│ │ (GVA -> GPA) │ by Guest │ (GVA -> HPA) │ Real MMU │
│ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │
│ │ VMM intercepts │ VMM points │
│ │ writes here │ CR3 here │
│ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ VMM Memory Map │ │
│ │ (GPA -> HPA) │ │
│ │ GPA 0x0000 -> HPA 0x10000 GPA 0x1000 -> HPA 0x20000 │ │
│ │ GPA 0x2000 -> HPA 0x35000 GPA 0x3000 -> HPA 0x48000 │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ Translation Path: │
│ 1. Guest writes to page table at GPA 0x2004 │
│ 2. VMM traps (page was write-protected) │
│ 3. VMM reads guest's intended mapping: GVA 0x1000 -> GPA 0x5000 │
│ 4. VMM looks up GPA 0x5000 in its map: GPA 0x5000 -> HPA 0x8A000 │
│ 5. VMM updates shadow: GVA 0x1000 -> HPA 0x8A000 │
│ 6. VMM resumes guest │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
The Trap-and-Update Mechanism
The key insight is that the VMM must intercept every guest page table modification:
┌─────────────────────────────────────────────────────────────────────────────┐
│ SHADOW PT UPDATE PROTOCOL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. VMM marks all guest page table pages as READ-ONLY in shadow tables │
│ │
│ 2. Guest tries to modify its page table: │
│ mov [GPA_of_PTE], new_mapping │
│ │
│ 3. CPU triggers page fault (write to read-only page) │
│ │
│ 4. Page fault causes VM EXIT to VMM │
│ │
│ 5. VMM handles the exit: │
│ a. Decode the faulting instruction │
│ b. Emulate the write to guest's page table (in guest memory) │
│ c. Parse the new PTE to extract GPA │
│ d. Translate GPA to HPA using VMM's memory map │
│ e. Update shadow page table with GVA -> HPA mapping │
│ f. If new PT page created, mark it read-only in shadow │
│ │
│ 6. VMM resumes guest execution (VM ENTRY) │
│ │
│  ┌───────────────────────────────────────────────────────────────────────┐│
│  │ COST: Every page table modification = 1 VM EXIT + instruction decode  ││
│  │       + shadow table update + potential TLB flush + VM ENTRY          ││
│  │                                                                       ││
│  │ Typical guest OS during boot: 10,000+ page table modifications        ││
│  │ Each VM EXIT/ENTRY cycle: ~1000-3000 CPU cycles                       ││
│  │ Total overhead: Millions of cycles just for memory management!        ││
│  └───────────────────────────────────────────────────────────────────────┘│
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Multi-Level Page Table Structure (x86-64)
For realistic simulation, understand the 4-level page table hierarchy:
┌─────────────────────────────────────────────────────────────────────────────┐
│ x86-64 4-LEVEL PAGE TABLE WALK │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Virtual Address (48 bits used): 0x00007F4A12345678 │
│ │
│ ┌─────┬──────┬──────┬──────┬──────┬──────────────┐ │
│ │Sign │PML4 │ PDPT │ PD │ PT │ Offset │ │
│ │Ext. │Index │Index │Index │Index │ (12 bits) │ │
│ │16b │ 9b │ 9b │ 9b │ 9b │ │ │
│ └─────┴──┬───┴───┬──┴───┬──┴──┬───┴──────────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ CR3 ─> [PML4] ─> [PDPT] ─> [PD] ─> [PT] ─> Physical Page │
│ │ │ │ │ │
│ │ │ │ └── 512 PTEs (4KB page table) │
│ │ │ └────────── 512 PDEs (can map 2MB pages) │
│ │ └─────────────────── 512 PDPTEs (can map 1GB pages) │
│ └──────────────────────────── 512 PML4Es │
│ │
│ Each entry is 8 bytes: │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ Bits 63-52: Reserved/NX │ Bits 51-12: Physical Address │ Bits 11-0 │ │
│ │ │ (Page Frame Number) │ Flags │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ Important Flags: │
│ Bit 0 (P): Present │
│ Bit 1 (R/W): Read/Write │
│ Bit 2 (U/S): User/Supervisor │
│ Bit 5 (A): Accessed │
│ Bit 6 (D): Dirty │
│ Bit 7 (PS): Page Size (for 2MB/1GB pages) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
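The bit-field split above is a handful of shifts and masks in code; a sketch (the struct and its name are mine, not from the SDM):

#include <stdint.h>

// Walk indices for a 48-bit canonical virtual address.
typedef struct {
    unsigned pml4, pdpt, pd, pt, offset;
} va_indices_t;

va_indices_t split_va(uint64_t va) {
    va_indices_t ix;
    ix.pml4   = (va >> 39) & 0x1FF; // bits 47-39
    ix.pdpt   = (va >> 30) & 0x1FF; // bits 38-30
    ix.pd     = (va >> 21) & 0x1FF; // bits 29-21
    ix.pt     = (va >> 12) & 0x1FF; // bits 20-12
    ix.offset =  va        & 0xFFF; // bits 11-0
    return ix;
}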
Shadow Page Table Entry Mapping
For each guest PTE, the shadow PTE combines guest permissions with VMM policy:
┌─────────────────────────────────────────────────────────────────────────────┐
│ SHADOW PTE CONSTRUCTION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Guest PTE (at GPA 0x2008): │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Physical Addr: 0x5000 (GPA) │ R/W: 1 │ U/S: 1 │ P: 1 │ Flags... │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ │
│ VMM Memory Map lookup: GPA 0x5000 -> HPA 0x8A000 │
│ │
│ Shadow PTE (used by real MMU): │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Physical Addr: 0x8A000 (HPA) │ R/W: 1 │ U/S: 1 │ P: 1 │ Flags...│ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ │
│ SPECIAL CASE - Guest Page Table Page: │
│ If GPA 0x5000 contains a guest page table, shadow marks it read-only: │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Physical Addr: 0x8A000 (HPA) │ R/W: 0 │ U/S: 1 │ P: 1 │ Flags...│ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ This ensures any write triggers a trap so VMM can update shadows. │
│ │
│ PERMISSION INTERSECTION: │
│ Shadow R/W = Guest R/W AND VMM Policy (read-only if PT page) │
│ Shadow U/S = Guest U/S (VMM doesn't usually restrict) │
│ Shadow NX = Guest NX OR VMM Policy (can make more restrictive) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
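The permission intersection is a few lines of bit twiddling; a sketch (the flag macro matches the constants suggested in the hints below; the NX rule would be an OR in the same spirit):

#include <stdint.h>
#include <stdbool.h>

#define PTE_WRITABLE (1u << 1)

// Combine guest-visible permissions with VMM policy.
// is_pt_page: true if the target frame holds a guest page table.
uint32_t shadow_flags(uint32_t guest_flags, bool is_pt_page) {
    uint32_t flags = guest_flags;  // U/S and friends pass through
    if (is_pt_page)
        flags &= ~PTE_WRITABLE;    // force a trap on guest PT writes
    return flags;
}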
TLB Management in Shadow Paging
TLB coherency is critical and adds more overhead:
┌─────────────────────────────────────────────────────────────────────────────┐
│ TLB COHERENCY REQUIREMENTS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Guest modifies PTE: │
│ 1. Guest writes to its page table (trapped by VMM) │
│ 2. VMM updates shadow page table │
│ 3. VMM must flush TLB entry for affected GVA │
│ - If not flushed, CPU uses stale shadow translation! │
│ │
│ Guest executes INVLPG (invalidate page): │
│ 1. INVLPG causes VM EXIT (privileged instruction) │
│ 2. VMM must execute real INVLPG for that GVA │
│ 3. Another VM EXIT/ENTRY cycle │
│ │
│ Guest writes to CR3 (changes address space): │
│ 1. CR3 write causes VM EXIT │
│ 2. VMM must: │
│ a. Look up which shadow PT corresponds to new guest CR3 │
│ b. Create new shadow PT if needed │
│ c. Point real CR3 to shadow PT │
│ d. Flush entire TLB (or use VPID) │
│ │
│  ┌───────────────────────────────────────────────────────────────────────┐│
│  │ TLB Flush Statistics (typical workload):                              ││
│  │ - Context switches: 100s per second, each flushes TLB                 ││
│  │ - Page table updates: 1000s per second during heavy allocation        ││
│  │ - Each flush: ~100-500 cycles, but destroys cached translations       ││
│  │ - TLB miss after flush: ~50-200 cycles per translation                ││
│  └───────────────────────────────────────────────────────────────────────┘│
│ │
└─────────────────────────────────────────────────────────────────────────────┘
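Steps (a)-(d) of the CR3 path map onto a handler like this; a sketch, assuming the global stats and tlb used in the hints, a current_shadow pointer, and a hypothetical find_or_create_shadow() that caches one shadow root per guest CR3 value:

// Guest wrote CR3: switch shadow roots and flush stale translations.
void handle_cr3_write(uint64_t new_guest_cr3_gpa) {
    stats.vm_exits++;

    // (a)+(b): reuse the shadow PT for this guest root, or build one
    // (write-protecting the guest's PT pages as part of construction).
    current_shadow = find_or_create_shadow(new_guest_cr3_gpa);

    // (c): on real hardware the VMM would now load CR3 with the
    // shadow root; the simulator just switches its pointer.

    // (d): without VPID-style tagging, every cached GVA->HPA entry
    // may belong to the old address space, so flush them all.
    tlb_flush_all(tlb);
    stats.tlb_flushes++;
}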
Why EPT Changed Everything
For comparison, understand what hardware assistance provides:
┌─────────────────────────────────────────────────────────────────────────────┐
│ SHADOW PT vs EPT COMPARISON │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ SHADOW PAGE TABLES: │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Guest CR3 ─────────────┐ │ │
│ │ (not used by HW) │ │ │
│ │ ▼ │ │
│ │ Guest Page Tables │ │
│ │ (in guest memory) │ │
│ │ │ │ │
│ │ │ VMM intercepts writes │ │
│ │ ▼ │ │
│ │ Real CR3 ────────> Shadow Page Tables ────> Physical Memory │ │
│ │ (points here) (VMM maintained) │ │
│ │ │ │
│ │ Cost: VM EXIT on every PT modification, CR3 write, INVLPG │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ EPT (Extended Page Tables): │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Guest CR3 ────────> Guest Page Tables ────> Guest Physical │ │
│ │ (guest controlled) (GVA -> GPA) Address │ │
│ │ │ │ │
│ │ │ EPT Walk │ │
│ │ │ (hardware) │ │
│ │ ▼ │ │
│ │ EPT Pointer ────────> EPT Tables ─────────> Host Physical │ │
│ │ (VMM controlled) (GPA -> HPA) Address │ │
│ │ │ │
│ │ Cost: Extra memory access during page walk (cached in TLB) │ │
│ │ NO VM EXIT for normal PT modifications! │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ Performance Impact: │
│ ┌──────────────────────┬─────────────────┬───────────────────────────┐ │
│ │ Operation │ Shadow PT │ EPT │ │
│ ├──────────────────────┼─────────────────┼───────────────────────────┤ │
│ │ Guest PT modification│ VM EXIT (~2000 │ No exit (0 cycles) │ │
│ │ │ cycles) │ │ │
│ │ CR3 write │ VM EXIT + flush │ No exit (VPID helps) │ │
│ │ INVLPG │ VM EXIT │ No exit (affects guest │ │
│ │ │ │ TLB only) │ │
│ │ Page walk (TLB miss) │ 1 walk (fast) │ 2D walk (up to 24 accesses│ │
│ │ │ │ worst case, cached) │ │
│ └──────────────────────┴─────────────────┴───────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Project Specification
What You Will Build
A software simulator that models shadow page table memory virtualization, including:
- Simulated guest memory with guest page tables
- VMM memory map tracking GPA to HPA mappings
- Shadow page table management with trap-and-update
- TLB simulation with hit/miss tracking
- Statistics collection for performance analysis
Functional Requirements
- Memory Subsystem:
- Allocate host physical memory pool (e.g., 256MB)
- Allocate guest physical memory from pool (e.g., 64MB)
- Track GPA -> HPA mappings
- Guest Page Table Management:
- Support single-level (for simplicity) or multi-level paging
- Allow guest to “allocate” pages and create mappings
- Detect writes to page table pages
- Shadow Page Table Management:
- Create shadow tables that map GVA -> HPA
- Update shadows when guest modifies its tables
- Mark guest PT pages as read-only in shadows
- TLB Simulation:
- Model a simple TLB (e.g., 64 entries, fully associative)
- Track hits and misses
- Flush on CR3 changes and INVLPG
- Statistics and Analysis:
- Count guest PT modifications (VM exits)
- Count TLB flushes
- Calculate estimated performance overhead
Non-Functional Requirements
- Educational: Output explains what’s happening at each step
- Configurable: Memory sizes, TLB size as parameters
- Deterministic: Same input produces same output
- Extensible: Easy to add multi-level paging
Real World Outcome
When complete, your simulator will produce output like this:
$ ./shadow_sim --guest-mem=64M --host-mem=256M guest_program.bin
=== Shadow Page Table Simulator ===
Configuration:
Guest Physical Memory: 64MB (16,384 pages)
Host Physical Memory: 256MB (65,536 pages)
Guest PT allocated at GPA 0x1000000-0x1003FFF
TLB: 64 entries, fully associative
Loading guest program: guest_program.bin (4KB)
Program loaded at GPA 0x0000 -> HPA 0x50000
=== Simulation Start ===
[CYCLE 1] Guest loads CR3 with GPA 0x1000000
[VMM] VM EXIT: CR3 write
[VMM] Guest page directory at GPA 0x1000000 -> HPA 0x60000
[VMM] Creating shadow page directory at HPA 0x70000
[VMM] Real CR3 set to HPA 0x70000
[VMM] Write-protecting guest PD at HPA 0x60000
[VMM] TLB flushed (full)
[VMM] VM ENTRY (resumed guest)
[STATS] VM exits so far: 1
[CYCLE 2] Guest writes PTE: GVA 0x1000 -> GPA 0x5000
[VMM] VM EXIT: Write to protected page (guest page table)
[VMM] Fault address: GPA 0x1000004 (PTE for GVA 0x1000)
[VMM] Decoding guest instruction: MOV [0x1000004], 0x5003
[VMM] Guest mapping: GVA 0x1000 -> GPA 0x5000 (R/W, Present)
[VMM] Looking up GPA 0x5000 in memory map...
[VMM] GPA 0x5000 -> HPA 0x8A000
[VMM] Updating shadow PTE: GVA 0x1000 -> HPA 0x8A000
[VMM] TLB invalidate: GVA 0x1000
[VMM] VM ENTRY (resumed guest)
[STATS] VM exits so far: 2
[CYCLE 3] Guest accesses GVA 0x1000 (read)
[CPU] TLB lookup: GVA 0x1000 -> MISS
[CPU] Shadow table walk: GVA 0x1000 -> HPA 0x8A000
[CPU] TLB fill: GVA 0x1000 -> HPA 0x8A000
[CPU] Memory access completed
[STATS] TLB misses: 1, TLB hits: 0
[CYCLE 4] Guest accesses GVA 0x1000 (read)
[CPU] TLB lookup: GVA 0x1000 -> HIT (HPA 0x8A000)
[CPU] Memory access completed
[STATS] TLB misses: 1, TLB hits: 1
[CYCLE 5] Guest writes PTE: GVA 0x2000 -> GPA 0x6000
[VMM] VM EXIT: Write to protected page
[VMM] Fault address: GPA 0x1000008 (PTE for GVA 0x2000)
[VMM] Guest mapping: GVA 0x2000 -> GPA 0x6000 (R/W, Present)
[VMM] GPA 0x6000 -> HPA 0x95000
[VMM] Updating shadow PTE: GVA 0x2000 -> HPA 0x95000
[VMM] TLB invalidate: GVA 0x2000
[VMM] VM ENTRY
[STATS] VM exits so far: 3
... (more cycles) ...
[CYCLE 1000] Guest changes CR3 (context switch simulation)
[VMM] VM EXIT: CR3 write
[VMM] Full TLB flush
[VMM] Shadow table switched
[STATS] VM exits so far: 1,492
=== Simulation Complete ===
Total cycles simulated: 1000
Guest instructions: ~5000
=== Performance Statistics ===
┌────────────────────────────────────┬─────────────┐
│ Metric │ Count │
├────────────────────────────────────┼─────────────┤
│ Guest page table modifications │ 1,247 │
│ Shadow table updates │ 1,247 │
│ CR3 writes (context switches) │ 89 │
│ INVLPG instructions │ 156 │
│ Total VM exits │ 1,492 │
│ TLB flushes (full) │ 89 │
│ TLB invalidations (single page) │ 1,403 │
│ TLB hits │ 3,412 │
│ TLB misses │ 588 │
│ TLB hit rate │ 85.3% │
└────────────────────────────────────┴─────────────┘
=== Performance Analysis ===
Estimated cycles lost to VM exits: 1,492 * 2000 = 2,984,000 cycles
Estimated cycles lost to TLB misses: 588 * 100 = 58,800 cycles
Total overhead: ~3,042,800 cycles (~6% of execution time at 1GHz)
=== Why EPT is Better ===
With EPT, PT modifications would cause 0 VM exits.
Only EPT violations (unmapped GPA access) would trap.
Estimated EPT overhead: ~50,000 cycles (for initial EPT faults only)
Shadow PT overhead vs EPT: 60x more expensive!
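The "=== Performance Analysis ===" lines above fall out of a trivial cost model; a sketch using the same (assumed) per-event cycle costs, so that print_overhead(1492, 588) reproduces the numbers shown:

#include <stdio.h>

#define EXIT_CYCLES     2000 // assumed cost of one VM exit/entry round trip
#define TLB_MISS_CYCLES  100 // assumed cost of one shadow-table walk

void print_overhead(unsigned long vm_exits, unsigned long tlb_misses) {
    unsigned long exit_cost = vm_exits * EXIT_CYCLES;
    unsigned long miss_cost = tlb_misses * TLB_MISS_CYCLES;
    printf("Estimated cycles lost to VM exits: %lu * %d = %lu cycles\n",
           vm_exits, EXIT_CYCLES, exit_cost);
    printf("Estimated cycles lost to TLB misses: %lu * %d = %lu cycles\n",
           tlb_misses, TLB_MISS_CYCLES, miss_cost);
    printf("Total overhead: ~%lu cycles\n", exit_cost + miss_cost);
}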
Questions to Guide Your Design
Work through these questions BEFORE writing code:
- Memory Layout: How will you organize your simulated physical memory? How do you track which HPAs are free vs allocated? How do you handle the GPA->HPA mapping table efficiently?
- Page Table Structure: Will you start with single-level paging (simpler) or jump to multi-level? What page size will you use? How will you represent PTEs in your simulator?
- Trap Detection: How do you detect writes to guest page tables? In a real system this is done by marking pages read-only; how will you simulate that?
- Shadow Synchronization: When the guest writes a PTE, what exact steps must you perform? How do you handle the case where the guest is creating a new page table?
- TLB Simulation: What TLB replacement policy will you use (LRU, random, FIFO)? How do you handle TLB flushes efficiently?
- Guest Program: How will you simulate guest execution? Will you interpret a simple instruction set, or script the memory operations?
- Statistics Collection: What metrics best illustrate the overhead of shadow page tables? How will you make the comparison to EPT compelling?
Thinking Exercise
Before writing code, work through this scenario on paper:
Setup:
- Guest physical memory: 16 pages (GPA 0x0000-0xFFFF)
- Host physical memory: 64 pages (HPA 0x00000-0x3FFFF)
- GPA->HPA map:
- GPA 0x0000 -> HPA 0x10000 (guest code)
- GPA 0x1000 -> HPA 0x20000 (guest page table)
- GPA 0x2000 -> HPA 0x25000 (free for mapping)
- GPA 0x3000 -> HPA 0x30000 (free for mapping)
- Initial state: Shadow PT at HPA 0x40000, all entries marked not present
Trace this sequence:
- Guest writes CR3 = 0x1000
- What does VMM do?
- What is the state of shadow PT?
- What is real CR3?
- Guest writes PTE at GPA 0x1000: entry[0] = 0x2003 (maps GVA 0x0000 to GPA 0x2000, present, R/W)
- Why does this trap?
- What does VMM write to shadow PT?
- Show the exact shadow PTE value
- Guest reads from GVA 0x0100
- Show the address translation (TLB miss path)
- What HPA is accessed?
- What TLB entry is created?
- Guest reads from GVA 0x0200
- Is this a TLB hit or miss?
- What HPA is accessed?
- Guest writes PTE at GPA 0x1000: entry[0] = 0x3003 (changes GVA 0x0000 to point to GPA 0x3000)
- What happens to the TLB?
- What does VMM update in shadow PT?
- If guest reads GVA 0x0100 now, what HPA is accessed?
Write out all the data structure states after each step.
Hints in Layers
If you’re stuck, reveal hints one at a time:
Hint 1: Basic Data Structures
Start with these structures:
#define PAGE_SIZE 4096
#define PAGE_SHIFT 12
// Host physical memory (simulated)
typedef struct {
uint8_t *memory; // mmap'd region
size_t size;
uint64_t *free_bitmap; // Track free pages
} host_memory_t;
// Guest physical address to host physical address mapping
typedef struct {
uint64_t gpa;
uint64_t hpa;
uint32_t flags; // Present, is_page_table, etc.
} gpa_mapping_t;
typedef struct {
gpa_mapping_t *mappings;
size_t count;
size_t guest_mem_size;
} vmm_memory_map_t;
// Page table entry (simplified single-level)
// Note: flags occupy the low 12 bits, as in real x86 PTEs, so a raw
// value like 0x2003 parses as flags=0x003, pfn=0x2 when cast later.
typedef struct {
    uint64_t flags : 12; // P, R/W, U/S, etc.
    uint64_t pfn : 40;   // Physical frame number
    uint64_t rsvd : 12;
} pte_t;
// Shadow page table state
typedef struct {
uint64_t guest_cr3_gpa; // Guest's CR3 value (GPA)
uint64_t shadow_cr3_hpa; // Shadow PT base (HPA)
pte_t *shadow_pt; // Pointer to shadow PT
} shadow_state_t;
// TLB entry
typedef struct {
uint64_t gva;
uint64_t hpa;
int valid;
uint32_t flags;
} tlb_entry_t;
typedef struct {
tlb_entry_t *entries;
size_t size;
size_t hits;
size_t misses;
} tlb_t;
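These structures (and the hints that follow) lean on a few constants that are never spelled out; one plausible set, with names following the hints and values that are my assumptions:

#define INVALID_HPA  ((uint64_t)-1)

// gpa_mapping_t.flags
#define MAP_PRESENT  (1u << 0)
#define MAP_PT_PAGE  (1u << 1)  // this frame holds a guest page table

// pte_t.flags -- mirror the x86 low bits
#define PTE_PRESENT  (1u << 0)
#define PTE_WRITABLE (1u << 1)
#define PTE_USER     (1u << 2)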
Hint 2: Memory Map Implementation
Create and manage the GPA->HPA mapping:
vmm_memory_map_t* create_memory_map(size_t guest_mem_size, host_memory_t *host) {
vmm_memory_map_t *map = calloc(1, sizeof(vmm_memory_map_t));
map->guest_mem_size = guest_mem_size;
size_t num_pages = guest_mem_size / PAGE_SIZE;
map->mappings = calloc(num_pages, sizeof(gpa_mapping_t));
map->count = num_pages;
// Allocate HPAs for each GPA
for (size_t i = 0; i < num_pages; i++) {
uint64_t gpa = i * PAGE_SIZE;
uint64_t hpa = allocate_host_page(host);
map->mappings[i].gpa = gpa;
map->mappings[i].hpa = hpa;
map->mappings[i].flags = MAP_PRESENT;
}
return map;
}
uint64_t gpa_to_hpa(vmm_memory_map_t *map, uint64_t gpa) {
size_t page_index = gpa / PAGE_SIZE;
if (page_index >= map->count) {
return INVALID_HPA; // GPA out of range
}
uint64_t page_offset = gpa & (PAGE_SIZE - 1);
return map->mappings[page_index].hpa + page_offset;
}
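allocate_host_page() is called above but never defined; a minimal first-fit sketch over the free bitmap in host_memory_t:

// Scan the bitmap for a free page; linear, but fine for a simulator.
uint64_t allocate_host_page(host_memory_t *host) {
    size_t num_pages = host->size / PAGE_SIZE;
    for (size_t i = 0; i < num_pages; i++) {
        uint64_t *word = &host->free_bitmap[i / 64];
        uint64_t  bit  = 1ULL << (i % 64);
        if (!(*word & bit)) {     // page i is free
            *word |= bit;         // mark it allocated
            return i * PAGE_SIZE; // its HPA
        }
    }
    return INVALID_HPA;           // pool exhausted
}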
Hint 3: Trap-and-Update Logic
The core shadow table update routine:
void handle_guest_pte_write(shadow_state_t *shadow,
vmm_memory_map_t *map,
uint64_t fault_gpa,
uint64_t new_value) {
// 1. Extract which GVA this PTE maps
uint64_t guest_pt_gpa = shadow->guest_cr3_gpa;
uint64_t pte_offset = fault_gpa - guest_pt_gpa;
uint64_t pte_index = pte_offset / sizeof(pte_t);
uint64_t gva = pte_index * PAGE_SIZE;
// 2. Apply the write to guest's page table (in guest memory)
uint64_t fault_hpa = gpa_to_hpa(map, fault_gpa);
*(uint64_t*)hpa_to_ptr(fault_hpa) = new_value;
// 3. Parse the new PTE
pte_t guest_pte = *(pte_t*)&new_value;
if (!(guest_pte.flags & PTE_PRESENT)) {
// Guest unmapped this page
shadow->shadow_pt[pte_index].flags &= ~PTE_PRESENT;
return;
}
// 4. Translate GPA to HPA
uint64_t target_gpa = guest_pte.pfn << PAGE_SHIFT;
uint64_t target_hpa = gpa_to_hpa(map, target_gpa);
// 5. Build shadow PTE
pte_t shadow_pte;
shadow_pte.pfn = target_hpa >> PAGE_SHIFT;
shadow_pte.flags = guest_pte.flags; // Copy permissions
// 6. If target is a page table page, mark read-only in shadow
if (is_page_table_page(map, target_gpa)) {
shadow_pte.flags &= ~PTE_WRITABLE;
}
shadow->shadow_pt[pte_index] = shadow_pte;
printf("[VMM] Shadow PTE[%lu]: GVA 0x%lx -> HPA 0x%lx\n",
pte_index, gva, target_hpa);
}
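This routine uses two helpers it never defines; plausible sketches, assuming a global host_memory_t *host for the simulated pool and the MAP_PT_PAGE flag suggested earlier:

#include <stdbool.h>

// An HPA is just an offset into the simulated host pool.
void* hpa_to_ptr(uint64_t hpa) {
    return host->memory + hpa;
}

// A GPA holds a page table iff the VMM flagged its frame as one.
bool is_page_table_page(vmm_memory_map_t *map, uint64_t gpa) {
    size_t idx = gpa / PAGE_SIZE;
    return idx < map->count && (map->mappings[idx].flags & MAP_PT_PAGE);
}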
Hint 4: TLB Simulation
Simple fully-associative TLB:
tlb_t* create_tlb(size_t size) {
tlb_t *tlb = calloc(1, sizeof(tlb_t));
tlb->entries = calloc(size, sizeof(tlb_entry_t));
tlb->size = size;
return tlb;
}
// Returns HPA or INVALID_HPA if miss
uint64_t tlb_lookup(tlb_t *tlb, uint64_t gva) {
uint64_t page_gva = gva & ~(PAGE_SIZE - 1);
for (size_t i = 0; i < tlb->size; i++) {
if (tlb->entries[i].valid && tlb->entries[i].gva == page_gva) {
tlb->hits++;
uint64_t offset = gva & (PAGE_SIZE - 1);
return tlb->entries[i].hpa + offset;
}
}
tlb->misses++;
return INVALID_HPA;
}
void tlb_insert(tlb_t *tlb, uint64_t gva, uint64_t hpa, uint32_t flags) {
uint64_t page_gva = gva & ~(PAGE_SIZE - 1);
uint64_t page_hpa = hpa & ~(PAGE_SIZE - 1);
// Simple replacement: prefer an empty slot, else evict at random
size_t slot = rand() % tlb->size; // needs <stdlib.h>; random fallback
for (size_t i = 0; i < tlb->size; i++) {
    if (!tlb->entries[i].valid) {
        slot = i;
        break;
    }
}
tlb->entries[slot].gva = page_gva;
tlb->entries[slot].hpa = page_hpa;
tlb->entries[slot].flags = flags;
tlb->entries[slot].valid = 1;
}
void tlb_invalidate_page(tlb_t *tlb, uint64_t gva) {
uint64_t page_gva = gva & ~(PAGE_SIZE - 1);
for (size_t i = 0; i < tlb->size; i++) {
if (tlb->entries[i].gva == page_gva) {
tlb->entries[i].valid = 0;
}
}
}
void tlb_flush_all(tlb_t *tlb) {
for (size_t i = 0; i < tlb->size; i++) {
tlb->entries[i].valid = 0;
}
}
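A quick usage trace of these routines (the addresses are arbitrary; PTE_PRESENT as suggested earlier):

tlb_t *tlb = create_tlb(64);
uint64_t hpa = tlb_lookup(tlb, 0x1234);        // miss: INVALID_HPA, misses=1
tlb_insert(tlb, 0x1234, 0x8A000, PTE_PRESENT); // caches page GVA 0x1000
hpa = tlb_lookup(tlb, 0x1234);                 // hit: 0x8A000 + 0x234
tlb_invalidate_page(tlb, 0x1000);              // the next lookup misses again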
Hint 5: Simulation Loop Structure
Main simulation driver:
typedef enum {
OP_LOAD_CR3, // Guest sets CR3
OP_WRITE_PTE, // Guest modifies page table
OP_READ_MEM, // Guest reads from GVA
OP_WRITE_MEM, // Guest writes to GVA
OP_INVLPG, // Guest invalidates TLB entry
} guest_op_t;
typedef struct {
guest_op_t op;
uint64_t addr; // GVA or GPA depending on op
uint64_t value; // For writes
} guest_instruction_t;
void simulate(guest_instruction_t *program, size_t count) {
for (size_t i = 0; i < count; i++) {
guest_instruction_t *inst = &program[i];
switch (inst->op) {
case OP_LOAD_CR3:
stats.vm_exits++;
handle_cr3_write(inst->value);
tlb_flush_all(tlb);
stats.tlb_flushes++;
break;
case OP_WRITE_PTE:
stats.vm_exits++;
stats.pt_modifications++;
handle_guest_pte_write(shadow, map, inst->addr, inst->value);
tlb_invalidate_page(tlb, pte_addr_to_gva(inst->addr));
break;
case OP_READ_MEM:
case OP_WRITE_MEM: { // braces: a declaration can't directly follow a case label in C
    // Try TLB first
    uint64_t hpa = tlb_lookup(tlb, inst->addr);
    if (hpa == INVALID_HPA) {
        // TLB miss - walk shadow tables
        hpa = shadow_table_walk(shadow, inst->addr);
        tlb_insert(tlb, inst->addr, hpa, get_pte_flags(shadow, inst->addr));
    }
    // Perform the access
    if (inst->op == OP_READ_MEM) {
        do_read(hpa);
    } else {
        do_write(hpa, inst->value);
    }
    break;
}
case OP_INVLPG:
stats.vm_exits++;
tlb_invalidate_page(tlb, inst->addr);
break;
}
}
}
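To feed this loop from the text workloads used in the Testing section ("CR3 1000", "WRITE_PTE 0 2003", ...), a small parser sketch; the mapping of operands onto addr/value follows the simulate() loop above and is otherwise my assumption:

#include <stdio.h>
#include <string.h>

// Parse one workload line into a guest instruction.
// Returns 0 on success, -1 for blank/comment/unknown lines.
int parse_line(const char *line, guest_instruction_t *out) {
    char op[16];
    unsigned long long a = 0, v = 0;
    if (sscanf(line, "%15s %llx %llx", op, &a, &v) < 1 || op[0] == '#')
        return -1;
    if      (!strcmp(op, "CR3"))       { out->op = OP_LOAD_CR3;  out->value = a; }
    else if (!strcmp(op, "WRITE_PTE")) { out->op = OP_WRITE_PTE; out->addr = a; out->value = v; }
    else if (!strcmp(op, "READ"))      { out->op = OP_READ_MEM;  out->addr = a; }
    else if (!strcmp(op, "WRITE"))     { out->op = OP_WRITE_MEM; out->addr = a; out->value = v; }
    else if (!strcmp(op, "INVLPG"))    { out->op = OP_INVLPG;    out->addr = a; }
    else return -1;
    return 0;
}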
Hint 6: Multi-Level Page Tables (Advanced)
Extend to 2-level paging:
// Two-level: Page Directory -> Page Table -> Page
typedef struct {
uint64_t guest_pd_gpa; // Guest's CR3 points here
pte_t *shadow_pd; // Shadow page directory (HPA)
pte_t **shadow_pts; // Array of shadow page tables
} shadow_state_2level_t;
uint64_t shadow_walk_2level(shadow_state_2level_t *shadow, uint64_t gva) {
uint64_t pd_index = (gva >> 22) & 0x3FF; // Top 10 bits
uint64_t pt_index = (gva >> 12) & 0x3FF; // Next 10 bits
uint64_t offset = gva & 0xFFF; // Bottom 12 bits
// Check PD entry
pte_t pde = shadow->shadow_pd[pd_index];
if (!(pde.flags & PTE_PRESENT)) {
return INVALID_HPA; // Page fault
}
// Get page table
pte_t *pt = shadow->shadow_pts[pd_index];
pte_t pte = pt[pt_index];
if (!(pte.flags & PTE_PRESENT)) {
return INVALID_HPA; // Page fault
}
return (pte.pfn << PAGE_SHIFT) + offset;
}
void handle_guest_pd_write(shadow_state_2level_t *shadow,
vmm_memory_map_t *map,
uint64_t fault_gpa,
uint64_t new_value) {
// Guest is creating/modifying a page directory entry
// This points to a page table, which must also be write-protected
uint64_t pde_index = (fault_gpa - shadow->guest_pd_gpa) / sizeof(pte_t);
pte_t guest_pde = *(pte_t*)&new_value;
if (guest_pde.flags & PTE_PRESENT) {
uint64_t pt_gpa = guest_pde.pfn << PAGE_SHIFT;
uint64_t pt_hpa = gpa_to_hpa(map, pt_gpa);
// Mark this page table page as read-only so we trap PT writes
mark_page_readonly_in_shadow(shadow, pt_gpa);
// Allocate shadow page table if needed
if (shadow->shadow_pts[pde_index] == NULL) {
shadow->shadow_pts[pde_index] = allocate_shadow_pt();
}
// Populate shadow PT from guest PT
sync_shadow_pt_from_guest(shadow, pde_index, pt_gpa, pt_hpa);
}
}
Solution Architecture
High-Level Design
┌─────────────────────────────────────────────────────────────────────────────┐
│ SHADOW PT SIMULATOR │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Guest │ │ VMM │ │ Host │ │
│ │ Program │──────>│ Handler │──────>│ Memory │ │
│ │ Loader │ │ │ │ Pool │ │
│ └──────────────┘ └──────┬───────┘ └──────────────┘ │
│ │ │
│ ┌────────────┼────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Memory │ │ Shadow │ │ TLB │ │
│ │ Map │ │ PT │ │ Sim │ │
│ │ (GPA→HPA)│ │ Manager │ │ │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │ │ │ │
│ └────────────┼────────────┘ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Statistics │ │
│ │ Collector │ │
│ └──────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Report │ │
│ │ Generator │ │
│ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Host Memory Pool | Simulates physical RAM | Use mmap for large allocation; bitmap for free tracking |
| Memory Map | GPA to HPA translation | Array indexed by GPA page number for O(1) lookup |
| Shadow PT Manager | Create/update shadow tables | Start with single-level, extend to multi-level |
| TLB Simulator | Model caching behavior | Fully associative, configurable size, LRU or random |
| VMM Handler | Process guest operations | State machine for trap-and-emulate |
| Statistics | Track performance metrics | VM exits, TLB stats, shadow updates |
| Report Generator | Format output | Verbose during sim, summary at end |
Data Flow
Guest Operation
│
▼
┌─────────────────┐
│ Is it a trap? │─── No ──> TLB Lookup ──> Hit? ──> Access Memory
│ (CR3/PT write) │ │
└────────┬────────┘ │ Miss
│ Yes ▼
▼ Shadow Walk
┌─────────────────┐ │
│ VMM Handler │ ▼
│ - Update map │ TLB Insert
│ - Sync shadow │ │
│ - Update TLB │ ▼
└─────────────────┘ Access Memory
│
▼
Resume Guest
Algorithm: Shadow Page Table Update
PROCEDURE: UpdateShadowOnGuestPTEWrite(fault_gpa, new_pte_value)
INPUT: fault_gpa - GPA where guest wrote
new_pte_value - value guest wrote
OUTPUT: Updated shadow page table
1. INCREMENT stats.vm_exits
2. INCREMENT stats.pt_modifications
3. // Apply write to guest memory (emulate the instruction)
fault_hpa := gpa_to_hpa(fault_gpa)
WRITE new_pte_value TO fault_hpa
4. // Determine which GVA this PTE maps
pte_index := (fault_gpa - guest_pt_base) / sizeof(PTE)
gva := pte_index * PAGE_SIZE
5. // Parse guest PTE
guest_pte := parse_pte(new_pte_value)
6. IF NOT guest_pte.present THEN
shadow_pt[pte_index].present := false
tlb_invalidate(gva)
RETURN
7. // Translate GPA to HPA
target_gpa := guest_pte.pfn * PAGE_SIZE
target_hpa := gpa_to_hpa(target_gpa)
8. IF target_hpa == INVALID THEN
// Guest mapped non-existent memory - inject fault to guest
inject_page_fault_to_guest(gva)
RETURN
9. // Build shadow PTE
shadow_pte.pfn := target_hpa / PAGE_SIZE
shadow_pte.flags := guest_pte.flags
10. // If target is a PT page, make shadow read-only
IF is_page_table_page(target_gpa) THEN
shadow_pte.flags.writable := false
11. shadow_pt[pte_index] := shadow_pte
12. tlb_invalidate(gva)
13. LOG "Shadow: GVA %x -> HPA %x", gva, target_hpa
Implementation Guide
Development Environment Setup
# Create project structure
mkdir -p shadow-pt-sim/{src,include,tests,guest-programs}
cd shadow-pt-sim
# Required: C compiler, make
# Optional: gdb for debugging, graphviz for diagrams
# Create Makefile
cat > Makefile << 'EOF'
CC = gcc
CFLAGS = -Wall -Wextra -g -O2 -std=c11
LDFLAGS = -lm
SRC = src/main.c src/memory.c src/shadow.c src/tlb.c src/vmm.c src/stats.c
OBJ = $(SRC:.c=.o)
TARGET = shadow_sim
all: $(TARGET)
$(TARGET): $(OBJ)
$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)
%.o: %.c
$(CC) $(CFLAGS) -c -o $@ $<
clean:
rm -f $(OBJ) $(TARGET)
test: $(TARGET)
./$(TARGET) --test
.PHONY: all clean test
EOF
Project Structure
shadow-pt-sim/
├── src/
│ ├── main.c # Entry point, CLI parsing
│ ├── memory.c # Host memory pool, GPA->HPA map
│ ├── shadow.c # Shadow page table management
│ ├── tlb.c # TLB simulation
│ ├── vmm.c # VM exit handling, guest program execution
│ └── stats.c # Statistics collection and reporting
├── include/
│ ├── types.h # Common type definitions
│ ├── memory.h # Memory management declarations
│ ├── shadow.h # Shadow PT declarations
│ ├── tlb.h # TLB declarations
│ └── vmm.h # VMM declarations
├── tests/
│ ├── test_basic.c # Basic single-mapping test
│ ├── test_multimap.c # Multiple mappings
│ └── test_context.c # Context switch simulation
├── guest-programs/
│ └── workload.txt # Sample guest workload specification
├── Makefile
└── README.md
Implementation Phases
Phase 1: Memory Infrastructure (Days 1-3)
Goals:
- Implement host physical memory pool
- Implement GPA to HPA mapping
- Basic memory read/write operations
Tasks:
- Create `memory.c` with `create_host_memory()` and `allocate_host_page()`
- Create GPA->HPA mapping structure and lookup function
- Test: Allocate guest memory, verify mapping
Checkpoint: Can allocate 64MB guest memory from 256MB host pool, translate GPA to HPA.
Phase 2: Single-Level Shadow Tables (Days 4-7)
Goals:
- Implement single-level page table structure
- Create shadow page table
- Handle basic PTE writes
Tasks:
- Define PTE structure with flags
- Create shadow PT allocation and initialization
- Implement `handle_cr3_write()` - create shadow from guest PT
- Implement `handle_pte_write()` - update shadow on guest modification
- Test: Guest creates mapping, shadow reflects it
Checkpoint: Guest can set CR3, write PTEs, shadow tables are correctly updated.
Phase 3: TLB and Address Translation (Days 8-10)
Goals:
- Implement TLB simulation
- Implement shadow table walk
- Complete address translation path
Tasks:
- Create TLB structure with lookup/insert/invalidate
- Implement `shadow_table_walk()` for the TLB miss path
- Connect TLB to shadow PT
- Track TLB hits/misses
Checkpoint: Memory accesses use TLB, misses walk shadow tables, statistics accurate.
Phase 4: Guest Program Execution (Days 11-14)
Goals:
- Implement guest program parser/executor
- Complete VM exit handling
- Add INVLPG support
Tasks:
- Create simple guest operation language (CR3 load, PTE write, mem read/write, INVLPG)
- Implement simulation loop
- Add CR3 change handling (context switch)
- Add INVLPG handling
Checkpoint: Can run scripted guest workloads, all VM exits handled correctly.
Phase 5: Statistics and Polish (Days 15-17)
Goals:
- Comprehensive statistics collection
- Performance analysis output
- EPT comparison
Tasks:
- Implement detailed statistics tracking
- Create summary report format
- Calculate estimated cycle overhead
- Add EPT comparison analysis
- Clean up output, add verbose mode
Checkpoint: Full statistics, compelling comparison to EPT, clean output.
Testing Strategy
Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Test individual functions | PTE parsing, TLB operations |
| Integration Tests | Test component interaction | Shadow update triggers TLB invalidate |
| Workload Tests | Test realistic scenarios | Boot simulation, context switch heavy |
| Regression Tests | Prevent bugs from returning | Fixed mapping bug stays fixed |
Critical Test Cases
- Basic Mapping Test:
  CR3 = 0x1000
  PTE[0] = 0x2003   (GVA 0 -> GPA 0x2000, present, R/W)
  READ GVA 0x100
  Expected: Read from HPA corresponding to GPA 0x2100
- Mapping Change Test:
  CR3 = 0x1000
  PTE[0] = 0x2003
  READ GVA 0x100    # From GPA 0x2100
  PTE[0] = 0x3003   # Change mapping
  READ GVA 0x100    # Should be from GPA 0x3100
  Verify: TLB was invalidated, new HPA accessed
- Context Switch Test:
  CR3 = 0x1000      (Process 1)
  PTE[0] = 0x2003
  READ GVA 0
  CR3 = 0x4000      (Process 2 - different PT)
  PTE[0] = 0x5003
  READ GVA 0        # Should be from different HPA
  Verify: Full TLB flush on CR3 change
- Write Protection Test:
  CR3 = 0x1000      (PT at GPA 0x1000)
  READ from GPA 0x1000   # Should succeed
  WRITE to GPA 0x1000    # Should trap (PT page is protected)
  Verify: VM exit occurs on PT write
- TLB Hit Rate Test:
  PTE[0] = 0x2003
  Repeat 100x: READ GVA 0x100
  Expected: 1 TLB miss, 99 TLB hits
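Each case can be automated as an assertion; for example, the hit-rate test (a sketch: reset_simulator() and exec_op(op, addr, value) are hypothetical wrappers around the simulate() loop, and stats is the global counter struct):

#include <assert.h>

void test_tlb_hit_rate(void) {
    reset_simulator();                 // hypothetical: fresh TLB, shadow, stats
    exec_op(OP_LOAD_CR3, 0, 0x1000);
    exec_op(OP_WRITE_PTE, 0, 0x2003);  // map GVA 0 -> GPA 0x2000
    for (int i = 0; i < 100; i++)
        exec_op(OP_READ_MEM, 0x100, 0);
    assert(stats.tlb_misses == 1);
    assert(stats.tlb_hits == 99);
}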
Test Data
# Create test workloads
cat > guest-programs/basic.txt << 'EOF'
# Basic mapping test
CR3 1000
WRITE_PTE 0 2003
READ 100
READ 200
WRITE 150 DEADBEEF
READ 150
EOF
cat > guest-programs/context_switch.txt << 'EOF'
# Context switch simulation
CR3 1000
WRITE_PTE 0 2003
WRITE_PTE 1 3003
READ 100
READ 1100
CR3 4000
WRITE_PTE 0 5003
READ 100
CR3 1000
READ 100
EOF
Common Pitfalls & Debugging
Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Not write-protecting PT pages | Guest can modify PT without VMM seeing | Mark PT pages read-only in shadow when creating |
| Forgetting TLB invalidation | Stale translations used after PT update | Always invalidate affected GVA on PTE change |
| Wrong GPA->HPA for PT access | Shadow points to wrong memory | Double-check that GPA of PT page maps to correct HPA |
| Off-by-one in PTE index | Wrong GVA affected | Verify PTE index calculation: (fault_gpa - pt_base) / 8 |
| Not handling not-present PTEs | Crash on access to unmapped page | Check present bit before using PFN |
| TLB not flushed on CR3 change | Process sees other process’s memory | Flush entire TLB when CR3 changes |
Debugging Strategies
- Trace Everything Initially:
  #define TRACE(fmt, ...) fprintf(stderr, "[TRACE] " fmt "\n", ##__VA_ARGS__)

  void handle_pte_write(uint64_t gpa, uint64_t value) {
      TRACE("PTE write: GPA=0x%lx, value=0x%lx", gpa, value);
      // ... rest of function
  }
- Dump State After Each Operation:
  void dump_shadow_state(shadow_state_t *s) {
      printf("Shadow PT State:\n");
      printf("  Guest CR3 GPA: 0x%lx\n", s->guest_cr3_gpa);
      printf("  Shadow CR3 HPA: 0x%lx\n", s->shadow_cr3_hpa);
      for (int i = 0; i < 16; i++) {
          if (s->shadow_pt[i].flags & PTE_PRESENT) {
              printf("  [%d] GVA 0x%x -> HPA 0x%lx (flags=0x%x)\n",
                     i, i * PAGE_SIZE,
                     s->shadow_pt[i].pfn << PAGE_SHIFT,
                     s->shadow_pt[i].flags);
          }
      }
  }
- Verify Round-Trip:
  // After updating shadow, verify translation works
  uint64_t test_hpa = shadow_table_walk(shadow, gva);
  uint64_t expected_hpa = gpa_to_hpa(map, target_gpa);
  assert(test_hpa == expected_hpa);
- Statistics Sanity Check:
  // At end of simulation
  assert(stats.pt_modifications == stats.shadow_updates);
  assert(stats.tlb_hits + stats.tlb_misses == stats.total_mem_accesses);
Extensions & Challenges
Beginner Extensions
- Add verbose mode: Show every operation step by step
- Multiple guest processes: Simulate multiple address spaces
- Page fault injection: When guest accesses unmapped GVA
Intermediate Extensions
- 2-level paging: Implement PD + PT hierarchy
- Large page support: 2MB pages (skip PT level)
- LRU TLB: Replace random eviction with LRU
- Accessed/Dirty bit emulation: Track page access patterns
Advanced Extensions
- 4-level x86-64 paging: Full PML4 + PDPT + PD + PT
- ASID/VPID simulation: Tag TLB entries to avoid full flush
- EPT simulator: Implement for comparison
- Performance modeling: Estimate actual cycle costs
- Real guest code: Load simple ELF and simulate execution
Self-Assessment Checklist
Before considering this project complete, verify:
Understanding
- I can explain why shadow page tables require trapping PT modifications
- I can draw the three address spaces (GVA, GPA, HPA) and translation path
- I can explain why CR3 writes must cause VM exits
- I understand why guest page table pages are marked read-only in the shadow tables
- I can explain the TLB coherency requirements
- I understand why EPT is dramatically more efficient
Implementation
- GPA to HPA mapping works correctly
- Shadow table updates on guest PTE write
- TLB hits and misses tracked accurately
- CR3 changes handled with TLB flush
- Statistics match expected values for test workloads
Growth
- I can estimate the overhead of shadow PT vs EPT
- I could extend this to multi-level paging
- I understand how real hypervisors (KVM, Xen) handle this
The Interview Questions They’ll Ask
After completing this project, you’ll be ready for these questions:
- “How do hypervisors virtualize memory?”
- They want: Explain GVA->GPA->HPA, shadow PT or EPT, why isolation matters
- Bonus: Mention performance implications of each approach
- “What is a shadow page table?”
- They want: VMM-maintained table that maps GVA directly to HPA, updated on guest PT modifications
- Explain the trap mechanism
- “Why was EPT/NPT such a big improvement?”
- They want: Eliminated VM exits for PT modifications, moved GPA->HPA to hardware
- Quantify: From 1000s of exits to nearly zero for memory management
- “How does a hypervisor detect when a guest modifies its page tables?”
- They want: Write-protect PT pages in shadow, trap on write, emulate and sync
- Explain the VM exit/entry cost
- “What is TLB shootdown and why does it matter for virtualization?”
- They want: Invalidating TLB entries across CPUs, even more expensive in VM context
- Discuss VPID/ASID as mitigation
- “How would you debug a memory corruption issue in a VM?”
- They want: Understanding of memory layers, ability to trace translation, check shadow consistency
- Mention tools like VM introspection
Real-World Connections
Industry Applications
- Cloud providers (AWS, GCP, Azure): Run millions of VMs, memory virtualization efficiency is critical
- KVM/QEMU: Linux’s hypervisor, started with shadow PT, now uses EPT
- VMware: Pioneered shadow page tables in VMware Workstation (late 1990s)
- Security research: Hypervisor-based security tools need deep understanding
- Embedded virtualization: Where EPT isn’t available, shadow PT still used
Related Open Source Projects
- KVM source code: `arch/x86/kvm/mmu/` - production shadow/EPT implementation
- Xen: `xen/arch/x86/mm/shadow/` - shadow page table code
- QEMU: User-space component that works with KVM
- bhyve: FreeBSD hypervisor, good learning codebase
- Bareflank: Educational hypervisor project
Historical Context
- 1974: Popek & Goldberg formal requirements for virtualization
- 1998: VMware Workstation uses binary translation + shadow PT
- 2005: Intel VT-x released (CPU virtualization)
- 2007: AMD NPT (Nested Page Tables) released
- 2008: Intel EPT released (memory virtualization)
- Today: Shadow PT still used for nested virtualization edge cases
Resources
Essential Reading
- OSTEP Chapters 18-20: Virtual Memory, Paging, TLBs
- CS:APP Chapter 9: Virtual Memory
- Intel SDM Vol. 3A Chapter 4: Paging
- Intel SDM Vol. 3C Chapters 28-29: EPT and Memory Virtualization
Papers
- “Formal Requirements for Virtualizable Third Generation Architectures” - Popek & Goldberg (1974)
- “A Comparison of Software and Hardware Techniques for x86 Virtualization” - Adams & Agesen (2006)
- “Memory Resource Management in VMware ESX Server” - Waldspurger (2002)
Online Resources
- MMU Virtualization: Shadow Page Tables
- Understanding Memory Virtualization
- KVM MMU Documentation
- Hypervisor From Scratch - Part 4
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Virtual memory fundamentals | Operating Systems: Three Easy Pieces | Ch. 18-20 |
| Address translation hardware | Computer Systems: A Programmer’s Perspective | Ch. 9 |
| x86 paging details | Intel SDM Volume 3A | Ch. 4 |
| Hypervisor architecture | The Definitive Guide to the Xen Hypervisor | Ch. 8 |
| Memory virtualization | Intel SDM Volume 3C | Ch. 28 |
| Low-level memory management | Understanding the Linux Kernel | Ch. 2 |
Related Projects in This Series
- Previous: P04 (Simple JIT Compiler) - Dynamic code generation
- Next: P06 (User-Space Memory Mapper) - User-space memory management with mmap
- Later: P12 (EPT Implementation) - Hardware-assisted memory virtualization
Learning Milestones
Track your progress through these milestones:
Milestone 1: Single-Level Paging Simulated (Days 1-7)
- Host memory pool working
- GPA->HPA mapping working
- Single-level shadow PT created
- Basic PTE write handling
- Verification: Guest can create one mapping, access works
Milestone 2: Shadow Table Updates on PT Writes (Days 8-10)
- Trap mechanism simulated
- Shadow updates correctly on guest PTE modification
- TLB invalidation on mapping change
- Verification: Changing a mapping changes where accesses go
Milestone 3: Multi-Level Paging (Days 11-14) - Stretch
- 2-level PT hierarchy implemented
- PD and PT pages both protected
- Recursive PT walks work
- Verification: More realistic memory layout
Milestone 4: Statistics Show VM Exit Cost (Days 15-16)
- All VM exits counted
- TLB statistics accurate
- Performance overhead calculated
- Verification: Numbers match expected for test workloads
Milestone 5: Compare to EPT Simulation (Day 17) - Stretch
- EPT simulation added (or theoretical comparison)
- Side-by-side statistics
- Clear demonstration of why EPT is better
- Verification: Compelling performance comparison output
Submission / Completion Criteria
Minimum Viable Completion:
- Single-level shadow page tables work
- Guest can create mappings, access memory
- VM exits counted on PT modifications
- Basic TLB simulation with hit/miss tracking
- Summary statistics at end
Full Completion:
- Multi-level paging (at least 2-level)
- TLB coherency correct (flush on CR3, invalidate on PTE change)
- INVLPG handling
- Comprehensive statistics and analysis
- Clean, educational output
Excellence (Going Above & Beyond):
- Full 4-level x86-64 paging simulation
- EPT simulation for comparison
- VPID/ASID TLB optimization
- Realistic guest workload generation
- Performance modeling with estimated cycle costs
- Visualization of address translation
This guide was expanded from HYPERVISOR_VIRTUALIZATION_DEEP_DIVE_PROJECTS.md. For the complete learning path, see the project index.