Project 12: EPT (Extended Page Tables) Implementation

Extend your hypervisor with EPT support, allowing the guest to have its own physical address space that maps to real host physical memory, enabling hardware-accelerated memory virtualization.

Quick Reference

Attribute       Value
Difficulty      Expert (Level 4: The Systems Architect)
Time Estimate   2-3 weeks
Language        C (alternatives: Rust)
Prerequisites   Project 11 (Minimal VT-x Hypervisor), understanding of x86-64 paging
Key Topics      Extended Page Tables, GPA/HPA translation, EPT violations, MMIO emulation, memory virtualization

1. Learning Objectives

By completing this project, you will:

  • Understand the two-dimensional paging model (guest paging + EPT)
  • Learn how to build EPT page tables for GPA to HPA translation
  • Implement EPT violation handling for MMIO emulation
  • Understand why EPT is a massive performance improvement over shadow page tables
  • Master the Intel EPT hardware mechanism at a deep level
  • Build the memory virtualization component used by KVM, VMware, and all modern hypervisors

2. Theoretical Foundation

2.1 Core Concepts

The Memory Virtualization Problem

Without hardware support, virtualizing memory is complex. The guest OS thinks it controls physical memory, but that “physical” memory is actually virtual from the hypervisor’s perspective:

Without EPT (Shadow Page Tables):
┌──────────────────────────────────────────────────────────────────────┐
│                                                                       │
│   Guest Virtual Address (GVA)                                        │
│           │                                                           │
│           │ Guest Page Tables (controlled by guest OS)               │
│           ▼                                                           │
│   Guest Physical Address (GPA)                                       │
│           │                                                           │
│           │ ??? How does this become real memory?                    │
│           ▼                                                           │
│   Host Physical Address (HPA) = Real RAM                             │
│                                                                       │
│   PROBLEM: Guest page tables point to GPA, but CPU needs HPA!        │
│                                                                       │
│   SOLUTION: Shadow Page Tables                                        │
│   - VMM maintains "shadow" page tables                               │
│   - Shadow PT maps GVA → HPA directly                                │
│   - Every guest PT change → trap → VMM updates shadow               │
│   - Very slow! Many VM exits for page table modifications            │
│                                                                       │
└──────────────────────────────────────────────────────────────────────┘

With EPT (Hardware Solution):
┌──────────────────────────────────────────────────────────────────────┐
│                                                                       │
│   Guest Virtual Address (GVA)                                        │
│           │                                                           │
│           │ Guest Page Tables (guest controls, no VMM traps!)        │
│           ▼                                                           │
│   Guest Physical Address (GPA)                                       │
│           │                                                           │
│           │ EPT Tables (VMM controls, hardware walks automatically)  │
│           ▼                                                           │
│   Host Physical Address (HPA) = Real RAM                             │
│                                                                       │
│   SOLUTION: Two-dimensional paging in hardware!                      │
│   - Guest PT: GVA → GPA (guest manages, no exits)                   │
│   - EPT: GPA → HPA (VMM manages, hardware walks)                    │
│   - CPU walks BOTH levels automatically                              │
│   - Guest PT changes don't cause VM exits!                           │
│                                                                       │
└──────────────────────────────────────────────────────────────────────┘

EPT Page Table Structure

EPT uses a 4-level table structure that mirrors x86-64 paging (the same 9-bit index per level), although the entry format differs:

┌────────────────────────────────────────────────────────────────────────┐
│                    EPT Address Translation                              │
│                                                                         │
│   Guest Physical Address (48 bits used)                                │
│   ┌────────┬────────┬────────┬────────┬────────────────┐               │
│   │  PML4  │  PDPT  │   PD   │   PT   │    Offset      │               │
│   │ [47:39]│ [38:30]│ [29:21]│ [20:12]│    [11:0]      │               │
│   │ 9 bits │ 9 bits │ 9 bits │ 9 bits │   12 bits      │               │
│   └────┬───┴────┬───┴────┬───┴────┬───┴───────┬────────┘               │
│        │        │        │        │           │                         │
│        │        │        │        │           │                         │
│   ┌────▼────────▼────────▼────────▼───────────▼────────┐               │
│   │                                                     │               │
│   │   EPT PML4                                         │               │
│   │   (512 entries)                                    │               │
│   │   ┌─────────────────────────────────────────┐     │               │
│   │   │ Entry[PML4 index] ──────────────────────┼──┐  │               │
│   │   └─────────────────────────────────────────┘  │  │               │
│   │                                                 │  │               │
│   │   EPT PDPT (Page Dir Pointer Table)            │  │               │
│   │   (512 entries)                            ◄───┘  │               │
│   │   ┌─────────────────────────────────────────┐     │               │
│   │   │ Entry[PDPT index] ──────────────────────┼──┐  │               │
│   │   │   Can map 1GB huge page here!          │  │  │               │
│   │   └─────────────────────────────────────────┘  │  │               │
│   │                                                 │  │               │
│   │   EPT PD (Page Directory)                      │  │               │
│   │   (512 entries)                            ◄───┘  │               │
│   │   ┌─────────────────────────────────────────┐     │               │
│   │   │ Entry[PD index] ────────────────────────┼──┐  │               │
│   │   │   Can map 2MB huge page here!          │  │  │               │
│   │   └─────────────────────────────────────────┘  │  │               │
│   │                                                 │  │               │
│   │   EPT PT (Page Table)                          │  │               │
│   │   (512 entries)                            ◄───┘  │               │
│   │   ┌─────────────────────────────────────────┐     │               │
│   │   │ Entry[PT index] ────────────────────────┼──┐  │               │
│   │   │   Maps 4KB page                         │  │  │               │
│   │   └─────────────────────────────────────────┘  │  │               │
│   │                                                 │  │               │
│   │   Host Physical Page                           │  │               │
│   │   ┌─────────────────────────────────────────┐  │  │               │
│   │   │ HPA + Offset ◄──────────────────────────┴──┘  │               │
│   │   │ = Final physical address in RAM              │               │
│   │   └─────────────────────────────────────────┘     │               │
│   │                                                     │               │
│   └─────────────────────────────────────────────────────┘               │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
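The index split in the diagram above can be sketched in a few lines of C. This is a minimal illustration, not part of any existing API: the struct `ept_indices` and the helper `ept_split_gpa` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Table indices and page offset extracted from one guest-physical address. */
struct ept_indices {
    unsigned pml4;    /* GPA bits 47:39 */
    unsigned pdpt;    /* GPA bits 38:30 */
    unsigned pd;      /* GPA bits 29:21 */
    unsigned pt;      /* GPA bits 20:12 */
    unsigned offset;  /* GPA bits 11:0  */
};

/* Split a GPA the way a 4-level EPT walk does (hypothetical helper). */
static inline struct ept_indices ept_split_gpa(uint64_t gpa)
{
    struct ept_indices ix;

    /* A 4-level walk can only translate 48-bit guest-physical addresses. */
    assert((gpa >> 48) == 0);

    ix.pml4   = (unsigned)((gpa >> 39) & 0x1FF);
    ix.pdpt   = (unsigned)((gpa >> 30) & 0x1FF);
    ix.pd     = (unsigned)((gpa >> 21) & 0x1FF);
    ix.pt     = (unsigned)((gpa >> 12) & 0x1FF);
    ix.offset = (unsigned)(gpa & 0xFFF);
    return ix;
}
```

For the VGA text buffer at GPA 0xB8000, this yields PT index 0xB8 with every higher-level index equal to 0, so a single PT page decides whether that address is mapped.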

EPT Entry Format

┌────────────────────────────────────────────────────────────────────────┐
│                    EPT Entry Format (64 bits)                          │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  Bits   │ Field                │ Description                           │
│  ───────┼──────────────────────┼─────────────────────────────────────  │
│  [0]    │ Read                 │ Allow reads from this region          │
│  [1]    │ Write                │ Allow writes to this region           │
│  [2]    │ Execute              │ Allow instruction fetches             │
│  [5:3]  │ Memory Type          │ UC(0), WC(1), WT(4), WP(5), WB(6)     │
│  [6]    │ Ignore PAT           │ Ignore guest PAT for memory type      │
│  [7]    │ Large Page           │ 1GB (PDPTE) or 2MB (PDE) page         │
│  [8]    │ Accessed             │ Set by CPU on access (if enabled)     │
│  [9]    │ Dirty                │ Set by CPU on write (if enabled)      │
│  [10]   │ Execute User         │ Allow user-mode execute (if enabled)  │
│  [11]   │ Reserved             │ Must be 0                             │
│  [N-1:12] Physical Address     │ HPA of next level or final page       │
│  [51:N] │ Reserved             │ Must be 0 (N = MAXPHYSADDR)           │
│  [62:52]│ Reserved             │ Must be 0                             │
│  [63]   │ Suppress VE          │ Suppress #VE (virtualization exc)     │
│                                                                         │
│  Example: 4KB page entry mapping GPA to HPA                            │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │ 63    52 51        12 11  10  9   8   7   6  5:3  2   1   0     │  │
│  │ ┌──────┬─────────────┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐  │  │
│  │ │ Rsvd │ HPA[51:12]  │Rsv│ExU│ D │ A │ 0 │IPT│MT │ X │ W │ R │  │  │
│  │ └──────┴─────────────┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘  │  │
│  │                                                                   │  │
│  │ For Read/Write/Execute memory: R=1, W=1, X=1, MT=6 (WB)          │  │
│  │ Entry value: 0x0000_0001_2345_6037 (HPA 0x123456000, RWX, WB)   │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
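As a rough sketch of how the bit layout above turns into entry values, the helpers below compose a 4KB leaf entry and a non-leaf (table pointer) entry. The macro and function names are illustrative assumptions, not taken from any existing header. Non-leaf entries leave the memory type bits at 0, since those bits only apply to entries that map a page.

```c
#include <assert.h>
#include <stdint.h>

#define EPT_READ      (1ULL << 0)
#define EPT_WRITE     (1ULL << 1)
#define EPT_EXECUTE   (1ULL << 2)
#define EPT_MT_SHIFT  3                      /* memory type lives in bits 5:3 */
#define EPT_MT_UC     0ULL
#define EPT_MT_WB     6ULL
#define EPT_ADDR_MASK 0x000FFFFFFFFFF000ULL  /* bits 51:12 */

/* Leaf entry mapping one 4KB page (hypothetical helper). */
static inline uint64_t ept_make_leaf_4k(uint64_t hpa, uint64_t perms,
                                        uint64_t mem_type)
{
    assert((hpa & 0xFFF) == 0);   /* page frame must be 4KB aligned */
    assert(mem_type <= 7);
    return (hpa & EPT_ADDR_MASK) | (mem_type << EPT_MT_SHIFT) | (perms & 0x7);
}

/* Non-leaf entry pointing at the next-level table (hypothetical helper).
 * Permissions are granted here and restricted at the leaf level. */
static inline uint64_t ept_make_table(uint64_t table_hpa)
{
    assert((table_hpa & 0xFFF) == 0);
    return (table_hpa & EPT_ADDR_MASK) | EPT_READ | EPT_WRITE | EPT_EXECUTE;
}
```

With these definitions, a read/write/execute write-back mapping of HPA 0x123456000 has R, W, X set in bits 2:0 and the value 6 in bits 5:3, giving 0x123456037.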

EPT Pointer (EPTP) Configuration

┌────────────────────────────────────────────────────────────────────────┐
│                    EPT Pointer (EPTP) Format                           │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  The EPTP is stored in VMCS and tells the CPU where to find EPT       │
│                                                                         │
│  Bits   │ Field                │ Description                           │
│  ───────┼──────────────────────┼─────────────────────────────────────  │
│  [2:0]  │ Memory Type          │ Memory type for EPT structures        │
│         │                      │ (should be WB = 6 for performance)    │
│  [5:3]  │ Page Walk Length - 1 │ Number of levels minus 1 (3 = 4-level)│
│  [6]    │ Accessed/Dirty       │ Enable A/D bits in EPT entries        │
│  [11:7] │ Reserved             │ Must be 0                             │
│  [N-1:12] PML4 Address         │ Physical address of EPT PML4 table    │
│  [63:N] │ Reserved             │ Must be 0                             │
│                                                                         │
│  Example EPTP:                                                          │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │ PML4 at HPA 0x100000, WB memory type, 4-level walk, A/D enabled  │  │
│  │                                                                   │  │
│  │ EPTP = 0x0000_0000_0010_005E                                     │  │
│  │        ────────────────────                                       │  │
│  │        │              │ │ │                                       │  │
│  │        │              │ │ └── Memory Type: 6 (WB)                │  │
│  │        │              │ └──── Page Walk Length - 1: 3 (4-level)  │  │
│  │        │              └────── A/D enabled: 1                      │  │
│  │        └───────────────────── PML4 address >> 12                 │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
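The EPTP fields can be composed with a small helper. This is a minimal sketch under the assumptions used in the example above (write-back memory type, 4-level walk); `ept_make_eptp` is a hypothetical name, and real code should first confirm in IA32_VMX_EPT_VPID_CAP that the chosen settings are supported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPTP_MT_WB        6ULL         /* bits 2:0: memory type for EPT structures */
#define EPTP_WALK_4       (3ULL << 3)  /* bits 5:3: page walk length minus 1 */
#define EPTP_AD_ENABLE    (1ULL << 6)  /* bit 6: enable accessed/dirty flags */

/* Build an EPTP for a 4-level, write-back EPT hierarchy (hypothetical helper). */
static inline uint64_t ept_make_eptp(uint64_t pml4_hpa, bool enable_ad)
{
    uint64_t eptp;

    assert((pml4_hpa & 0xFFF) == 0);   /* PML4 must be 4KB aligned */

    eptp = pml4_hpa | EPTP_MT_WB | EPTP_WALK_4;
    if (enable_ad)
        eptp |= EPTP_AD_ENABLE;
    return eptp;
}
```

Calling it with a PML4 at HPA 0x100000 and A/D enabled reproduces the example value 0x10005E; the result is what would be written to the EPT pointer field of the VMCS.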

Two-Dimensional Address Translation

When the guest accesses memory, the CPU performs a complex 2D walk:

┌────────────────────────────────────────────────────────────────────────┐
│              2D Address Translation (Guest Access)                      │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Step 1: Guest tries to access GVA 0x7FFE1234                         │
│                                                                         │
│   Step 2: Walk Guest Page Tables (each step needs EPT translation!)    │
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐  │
│   │                                                                  │  │
│   │  Guest CR3 (GPA) ──── EPT translate ────► PML4 (HPA)            │  │
│   │        │                                                         │  │
│   │        ▼                                                         │  │
│   │  Guest PML4[idx] (GPA) ── EPT translate ─► Read entry (HPA)     │  │
│   │        │                                                         │  │
│   │        ▼                                                         │  │
│   │  Guest PDPT[idx] (GPA) ── EPT translate ─► Read entry (HPA)     │  │
│   │        │                                                         │  │
│   │        ▼                                                         │  │
│   │  Guest PD[idx] (GPA) ──── EPT translate ─► Read entry (HPA)     │  │
│   │        │                                                         │  │
│   │        ▼                                                         │  │
│   │  Guest PT[idx] (GPA) ──── EPT translate ─► Read entry (HPA)     │  │
│   │        │                                                         │  │
│   │        ▼                                                         │  │
│   │  Final GPA + offset ───── EPT translate ─► Final HPA            │  │
│   │                                                                  │  │
│   └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│   Total: Up to 24 memory accesses for one guest access!                │
│   (5 GPA translations × 4 EPT reads + 4 guest PT entry reads)          │
│                                                                         │
│   WHY IS THIS STILL FAST?                                               │
│   - TLB caches combined GVA→HPA translations                           │
│   - No VM exits for page table modifications!                          │
│   - Hardware is optimized for this walk                                │
│   - Much better than software shadow PT maintenance                    │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘
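The count of 24 follows from simple arithmetic: each guest paging-structure entry that is read, plus the final data address, is a GPA that needs its own EPT walk, and the guest entries themselves must also be read. A minimal sketch (the function name `nested_walk_refs` is an assumption made for illustration, and it models a walk with no TLB or paging-structure cache hits):

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case memory references needed to translate one guest-virtual address
 * under nested paging, ignoring all caching (hypothetical helper). */
static inline unsigned nested_walk_refs(unsigned guest_levels, unsigned ept_levels)
{
    unsigned gpa_translations;

    assert(ept_levels > 0);   /* with EPT enabled there is at least one level */

    /* One EPT walk per guest table entry read, plus one for the final GPA. */
    gpa_translations = guest_levels + 1;

    /* EPT entry reads, plus the guest page table entry reads themselves. */
    return gpa_translations * ept_levels + guest_levels;
}
```

For a 4-level guest on 4-level EPT this gives 5 × 4 + 4 = 24, which is the same as (4 + 1) × (4 + 1) − 1. The data access itself is one more reference on top of that.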

2.2 Why This Matters

EPT eliminated the biggest performance problem in virtualization.

Before EPT (Shadow Page Tables):

  • Every guest page table modification caused a VM exit
  • VMM had to trap and emulate every CR3 load
  • VMM had to track all guest page table writes
  • Context switches were extremely expensive

After EPT:

  • Guest page table modifications are transparent to VMM
  • Only EPT violations (unmapped GPA, MMIO) cause exits
  • 10-40% performance improvement for memory-intensive workloads
  • Made virtualization practical for more workloads

Real-world impact:

  • AWS/GCP/Azure: All use EPT for VM isolation
  • Containers: gVisor/Kata use EPT for security boundaries
  • Security: VM-based sandboxing relies on EPT isolation
  • Performance: EPT A/D bits enable efficient dirty page tracking for live migration

2.3 Historical Context

2008: Intel introduces EPT (Extended Page Tables)

  • AMD’s equivalent feature is NPT (Nested Page Tables), also marketed as RVI
  • First generation had performance issues with TLB

2010+: EPT improvements

  • Large page support (2MB, 1GB) reduces TLB pressure
  • Accessed/Dirty bits enable dirty tracking without exits
  • Execute-only pages for security (code, no read/write)

Modern CPUs: EPT is mature and optimized

  • All modern Intel CPUs since Nehalem support EPT
  • TLB optimizations make 2D walks efficient
  • INVEPT instruction for targeted TLB invalidation

2.4 Common Misconceptions

Misconception 1: “EPT is just another page table level”

  • Reality: EPT is a second translation dimension, not an extra level. Every guest-physical address, including the addresses of the guest’s own page tables, is translated through EPT.

Misconception 2: “EPT makes VM exits unnecessary”

  • Reality: EPT eliminates the exits caused by guest page table maintenance. EPT violations and other exits (I/O, CPUID, privileged instructions) still occur.

Misconception 3: “EPT violations are like page faults”

  • Reality: EPT violations are VM exits with different information. They occur when GPA translation fails, not GVA.

Misconception 4: “2D page walks are always slow”

  • Reality: TLBs cache the final GVA→HPA mapping. After warming up, performance is near-native.

3. Project Specification

3.1 What You Will Build

Extend your Project 11 hypervisor to add EPT support:

  1. Build EPT page tables mapping guest physical memory to host physical memory
  2. Configure EPTP in VMCS to enable EPT
  3. Handle EPT violations for MMIO regions
  4. Emulate device access through EPT violation handling
  5. Support both identity mapping and arbitrary GPA→HPA translation

3.2 Functional Requirements

  • EPT Structure: Build 4-level EPT page tables
  • Memory Mapping: Map guest RAM (GPA range) to allocated host memory (HPA)
  • EPTP Configuration: Set VMCS EPT pointer field correctly
  • EPT Violation Handling: Trap accesses to unmapped/MMIO regions
  • VGA Emulation: Handle VGA memory writes (0xB8000) via EPT violations
  • Serial Emulation: Continue I/O port emulation (unchanged from P11)

3.3 Non-Functional Requirements

  • Performance: Guest memory access should not cause VM exits (except MMIO)
  • Memory Efficiency: Use large pages (2MB/1GB) where possible
  • Safety: EPT misconfiguration should not crash host
  • Debuggability: Dump EPT structures for debugging

3.4 Example Usage / Output

$ sudo ./hypervisor_with_ept guest.bin

[HYPERVISOR] Building EPT structures...

[EPT] Checking EPT capabilities:
  - Execute-only pages: Supported
  - Page walk length 4: Supported
  - Memory types: UC, WB supported
  - 2MB large pages: Supported
  - 1GB large pages: Supported
  - INVEPT instruction: Supported

[EPT] Allocating EPT structures:
  - EPT PML4 at HPA 0x100000000 (1 page)
  - EPT PDPT at HPA 0x100001000 (1 page)
  - EPT PD at HPA 0x100002000 (1 page, 512 entries for 1GB coverage)
  - EPT PT: Allocated on demand

[EPT] Building memory map:
  Guest Physical Memory: 256MB (0x00000000 - 0x10000000)
  Host Physical Memory: Allocated at HPA 0x200000000

[EPT] Page table entries created:
  ┌─────────────────────────────────────────────────────────────────┐
  │ GPA Range              │ HPA Range              │ Permissions   │
  ├────────────────────────┼────────────────────────┼───────────────┤
  │ 0x00000000-0x00200000  │ 0x200000000-0x200200000│ RWX (4KB)     │
  │ 0x00200000-0x10000000  │ 0x200200000-0x210000000│ RWX (2MB huge)│
  │ 0x000A0000-0x000C0000  │ NOT MAPPED             │ MMIO (VGA)    │
  │ 0x000B8000-0x000B9000  │ NOT MAPPED             │ MMIO (Text)   │
  └─────────────────────────────────────────────────────────────────┘

[EPT] MMIO regions (will cause EPT violation):
  - 0xA0000-0xC0000: VGA frame buffer
  - 0xB8000-0xB9000: VGA text mode buffer

[VMCS] Setting EPTP:
  - EPT PML4 physical address: 0x100000000
  - Memory type: Write-back (6)
  - Page walk length: 4 (value: 3)
  - A/D bits: Enabled
  - EPTP value: 0x000000010000005E

[VMCS] Enabling EPT in secondary proc-based controls:
  - Before: 0x00000000
  - After:  0x00000002 (EPT enabled)

[HYPERVISOR] Launching guest with EPT enabled...

[HYPERVISOR] === GUEST IS NOW RUNNING ===

[HYPERVISOR] VM EXIT #1
  - Exit reason: 48 (EPT violation)
  - Exit qualification: 0x0000000000000182
    - Data read: No
    - Data write: Yes
    - Instruction fetch: No
    - GPA was readable: No
    - GPA was writeable: No
    - GPA was executable: No
  - Guest physical address: 0x00000B8000
  - Guest linear address: 0x00000B8000

[EPT] EPT Violation at GPA 0xB8000 (VGA text buffer write)
[EPT] Guest attempting to write to VGA memory
[EPT] Emulating VGA write:
  - Character: 'H' (0x48)
  - Attribute: 0x0F (white on black)
  - Row: 0, Column: 0

[HYPERVISOR] Resuming guest...

[HYPERVISOR] VM EXIT #2
  - Exit reason: 48 (EPT violation)
  - Guest physical address: 0x00000B8002
[EPT] VGA write: 'e' at row 0, col 1

[HYPERVISOR] VM EXIT #3
  - Exit reason: 48 (EPT violation)
  - Guest physical address: 0x00000B8004
[EPT] VGA write: 'l' at row 0, col 2

[... more EPT violations for remaining characters ...]

[HYPERVISOR] VM EXIT #13
  - Exit reason: 12 (HLT)
  - Guest executed HLT instruction

[VGA BUFFER OUTPUT]
Hello World!

[HYPERVISOR] === GUEST TERMINATED ===

[HYPERVISOR] Statistics:
  - Total VM exits: 13
  - EPT violations: 12 (all VGA writes, one per character)
  - HLT exits: 1
  - Regular memory access exits: 0 (EPT working!)

[HYPERVISOR] EPT Statistics:
  - EPT PML4 entries used: 1
  - EPT PDPT entries used: 1
  - EPT PD entries used: 128 (for 256MB)
  - Large pages (2MB): 127
  - 4KB pages: 480 (first 2MB, minus the 32-page VGA hole)

[EPT] Freeing EPT structures...
[HYPERVISOR] Cleanup complete

4. Solution Architecture

4.1 High-Level Design

┌────────────────────────────────────────────────────────────────────────┐
│                    EPT Implementation Architecture                      │
├────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │                     Memory Layout                                 │  │
│  │                                                                   │  │
│  │   Host Physical Memory                                           │  │
│  │   ┌─────────────────────────────────────────────────────────────┐│  │
│  │   │ 0x100000000: EPT PML4 (4KB)                                ││  │
│  │   │ 0x100001000: EPT PDPT (4KB)                                ││  │
│  │   │ 0x100002000: EPT PD pages (as needed)                      ││  │
│  │   │ 0x100XXX000: EPT PT pages (as needed)                      ││  │
│  │   │                                                             ││  │
│  │   │ 0x200000000: Guest RAM (mapped to GPA 0x0)                 ││  │
│  │   │ 0x210000000: End of guest RAM                              ││  │
│  │   └─────────────────────────────────────────────────────────────┘│  │
│  │                                                                   │  │
│  │   Guest Physical Address Space (as seen by guest)                │  │
│  │   ┌─────────────────────────────────────────────────────────────┐│  │
│  │   │ 0x00000000: RAM (mapped via EPT)                           ││  │
│  │   │ 0x000A0000: VGA buffer (NOT mapped - EPT violation)        ││  │
│  │   │ 0x000C0000: BIOS ROM (mapped read-only)                    ││  │
│  │   │ 0x10000000: End of RAM                                     ││  │
│  │   └─────────────────────────────────────────────────────────────┘│  │
│  │                                                                   │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │                     Component Interactions                        │  │
│  │                                                                   │  │
│  │   ┌─────────────┐      ┌─────────────┐      ┌─────────────┐     │  │
│  │   │   Guest     │      │    CPU      │      │  Hypervisor │     │  │
│  │   │   Code      │      │  (VT-x)     │      │   (VMM)     │     │  │
│  │   └──────┬──────┘      └──────┬──────┘      └──────┬──────┘     │  │
│  │          │                    │                    │             │  │
│  │          │ MOV [0xB8000], 'H' │                    │             │  │
│  │          ├───────────────────►│                    │             │  │
│  │          │                    │                    │             │  │
│  │          │      CPU walks guest PT (GVA→GPA)      │             │  │
│  │          │      CPU walks EPT (GPA→HPA)           │             │  │
│  │          │      EPT says: GPA 0xB8000 not mapped! │             │  │
│  │          │                    │                    │             │  │
│  │          │                    │ EPT VIOLATION      │             │  │
│  │          │                    ├───────────────────►│             │  │
│  │          │                    │                    │             │  │
│  │          │                    │      Read exit     │             │  │
│  │          │                    │      qualification │             │  │
│  │          │                    │      and GPA       │             │  │
│  │          │                    │                    │             │  │
│  │          │                    │      Emulate VGA   │             │  │
│  │          │                    │      write         │             │  │
│  │          │                    │                    │             │  │
│  │          │                    │ VMRESUME           │             │  │
│  │          │                    │◄───────────────────┤             │  │
│  │          │                    │                    │             │  │
│  │          │ Continue execution │                    │             │  │
│  │          │◄───────────────────┤                    │             │  │
│  │                                                                   │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                         │
└────────────────────────────────────────────────────────────────────────┘

4.2 Key Components

1. EPT Table Manager

Allocates and manages EPT page tables:

  • Allocates pages for PML4, PDPT, PD, PT
  • Provides functions to map/unmap GPA ranges
  • Handles large page (2MB, 1GB) optimization
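One way to approach the large-page optimization is a helper that picks the biggest page size usable at the current position in a range. The sketch below is an assumption about how such a helper might look; `ept_pick_page_size` and the `EPT_SZ_*` constants are hypothetical names. It requires both the GPA and the HPA to be aligned to the candidate size, and the capability flags are expected to come from IA32_VMX_EPT_VPID_CAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPT_SZ_4K (1ULL << 12)
#define EPT_SZ_2M (1ULL << 21)
#define EPT_SZ_1G (1ULL << 30)

/* Return the largest page size that can map the start of a RAM range
 * (hypothetical helper). 'remaining' is the number of bytes left to map. */
static inline uint64_t ept_pick_page_size(uint64_t gpa, uint64_t hpa,
                                          uint64_t remaining,
                                          bool has_1g, bool has_2m)
{
    uint64_t both = gpa | hpa;   /* aligned only if both addresses are aligned */

    assert((both & (EPT_SZ_4K - 1)) == 0);
    assert(remaining >= EPT_SZ_4K);

    if (has_1g && (both & (EPT_SZ_1G - 1)) == 0 && remaining >= EPT_SZ_1G)
        return EPT_SZ_1G;
    if (has_2m && (both & (EPT_SZ_2M - 1)) == 0 && remaining >= EPT_SZ_2M)
        return EPT_SZ_2M;
    return EPT_SZ_4K;
}
```

A mapping loop would call this repeatedly, advancing `gpa` and `hpa` by the returned size. A range that contains an MMIO hole has to be split around the hole first, because a large page cannot leave part of its span unmapped.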

2. Memory Map Manager

Tracks GPA→HPA mappings:

  • Manages host memory allocation for guest RAM
  • Tracks MMIO regions that should trigger violations
  • Supports different memory types (RAM, ROM, MMIO)

3. EPT Violation Handler

Handles EPT violation exits:

  • Reads exit qualification and faulting GPA
  • Dispatches to appropriate device emulator
  • Optionally installs mapping to avoid future violations

4. MMIO Emulators

Emulates memory-mapped devices:

  • VGA text buffer (0xB8000)
  • VGA framebuffer (0xA0000) - optional
  • Other devices as needed
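For the VGA text buffer, the emulator mostly needs to turn a faulting GPA into a screen position and store the character/attribute pair. The following is a minimal sketch that assumes an 80x25 text mode and 16-bit aligned writes; `struct vga_text`, `vga_text_locate`, and `vga_text_write16` are hypothetical names, and byte-sized or unaligned writes are simply rejected here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VGA_TEXT_BASE 0xB8000ULL
#define VGA_COLS      80
#define VGA_ROWS      25

/* Shadow copy of the text screen: low byte = character, high byte = attribute. */
struct vga_text {
    uint16_t cells[VGA_ROWS * VGA_COLS];
};

/* Convert a GPA inside the visible text buffer to (row, col). */
static inline bool vga_text_locate(uint64_t gpa, unsigned *row, unsigned *col)
{
    uint64_t cell;

    if (gpa < VGA_TEXT_BASE || (gpa & 1))
        return false;                       /* below the buffer or unaligned */
    cell = (gpa - VGA_TEXT_BASE) / 2;       /* two bytes per character cell */
    if (cell >= (uint64_t)VGA_ROWS * VGA_COLS)
        return false;                       /* past the visible 80x25 area */
    *row = (unsigned)(cell / VGA_COLS);
    *col = (unsigned)(cell % VGA_COLS);
    return true;
}

/* Emulate one 16-bit guest write; returns false if the GPA is not handled. */
static inline bool vga_text_write16(struct vga_text *vga, uint64_t gpa,
                                    uint16_t value)
{
    unsigned row, col;

    assert(vga != NULL);
    if (!vga_text_locate(gpa, &row, &col))
        return false;
    vga->cells[row * VGA_COLS + col] = value;
    return true;
}
```

A write of 0x0F48 to GPA 0xB8000 stores 'H' with attribute 0x0F at row 0, column 0; the next cell, GPA 0xB8002, is row 0, column 1.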

4.3 Data Structures

/* EPT entry (one layout for all levels; mem_type and ignore_pat apply only to leaf entries) */
typedef union {
    uint64_t value;
    struct {
        uint64_t read       : 1;   /* Bit 0: Allow reads */
        uint64_t write      : 1;   /* Bit 1: Allow writes */
        uint64_t execute    : 1;   /* Bit 2: Allow instruction fetch */
        uint64_t mem_type   : 3;   /* Bits 5:3: Memory type (EPT) */
        uint64_t ignore_pat : 1;   /* Bit 6: Ignore guest PAT */
        uint64_t large_page : 1;   /* Bit 7: Large page (2MB or 1GB) */
        uint64_t accessed   : 1;   /* Bit 8: Accessed (if enabled) */
        uint64_t dirty      : 1;   /* Bit 9: Dirty (if enabled) */
        uint64_t exec_user  : 1;   /* Bit 10: User-mode execute */
        uint64_t reserved1  : 1;   /* Bit 11: Reserved (0) */
        uint64_t phys_addr  : 40;  /* Bits 51:12: Physical address >> 12 */
        uint64_t reserved2  : 12;  /* Bits 63:52: Reserved (0) */
    };
} ept_entry_t;

/* EPT state */
struct ept_state {
    ept_entry_t *pml4;           /* EPT PML4 table (virtual address) */
    uint64_t pml4_phys;          /* EPT PML4 physical address */

    /* Allocated page tracking for cleanup */
    struct list_head page_list;   /* List of allocated EPT pages */
    int total_pages;              /* Total EPT pages allocated */

    /* Statistics */
    uint64_t violations_handled;
    uint64_t pages_mapped;
};

/* Memory region descriptor */
struct mem_region {
    uint64_t gpa_start;          /* Guest physical start */
    uint64_t gpa_end;            /* Guest physical end */
    uint64_t hpa_start;          /* Host physical start (0 for MMIO) */
    int type;                     /* MEM_TYPE_RAM, MEM_TYPE_MMIO, etc. */
    int permissions;              /* EPT_READ, EPT_WRITE, EPT_EXECUTE */
    void (*mmio_handler)(uint64_t gpa, int is_write, uint64_t *value);
};

#define MEM_TYPE_RAM    0
#define MEM_TYPE_ROM    1  /* Read + Execute, no Write */
#define MEM_TYPE_MMIO   2  /* Causes EPT violation */

/* EPT violation info from VMCS */
struct ept_violation_info {
    uint64_t gpa;                /* Guest physical address */
    uint64_t gla;                /* Guest linear address */
    uint64_t qualification;      /* Exit qualification */

    /* Decoded from qualification */
    bool was_read;               /* Violation due to read */
    bool was_write;              /* Violation due to write */
    bool was_fetch;              /* Violation due to instruction fetch */
    bool gpa_readable;           /* GPA had read permission */
    bool gpa_writeable;          /* GPA had write permission */
    bool gpa_executable;         /* GPA had execute permission */
    bool caused_by_gpa;          /* Access was to the translated GPA itself (vs a guest page-table walk) */
};
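The decoded fields can be filled from the raw exit qualification with a few bit tests. This is a sketch using its own small struct so that it compiles on its own; `struct ept_qual` and `ept_decode_qualification` are hypothetical names. The bit positions follow the Intel SDM layout for EPT violations: bits 0-2 give the access type, bits 3-5 the permissions the EPT entry granted, bit 7 says whether the guest linear address is valid, and bit 8 (meaningful only when bit 7 is set) distinguishes an access to the final translated address from an access made during the guest page-table walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of an EPT-violation exit qualification (hypothetical type). */
struct ept_qual {
    bool was_read, was_write, was_fetch;                /* bits 0, 1, 2 */
    bool gpa_readable, gpa_writeable, gpa_executable;   /* bits 3, 4, 5 */
    bool gla_valid;                                     /* bit 7 */
    bool final_translation;                             /* bit 8, if bit 7 set */
};

static inline void ept_decode_qualification(uint64_t q, struct ept_qual *out)
{
    assert(out != NULL);

    out->was_read          = (q >> 0) & 1;
    out->was_write         = (q >> 1) & 1;
    out->was_fetch         = (q >> 2) & 1;
    out->gpa_readable      = (q >> 3) & 1;
    out->gpa_writeable     = (q >> 4) & 1;
    out->gpa_executable    = (q >> 5) & 1;
    out->gla_valid         = (q >> 7) & 1;
    /* Bit 8 is only defined when the linear address is valid. */
    out->final_translation = out->gla_valid && ((q >> 8) & 1);
}
```

A plain data write to an unmapped page typically decodes as a write with no permissions present, for example a qualification of 0x182.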

4.4 Algorithm Overview

Building EPT Tables

ALGORITHM: build_ept_tables(guest_memory_size)

1. Allocate PML4 table (1 page, zeroed)
2. For each 512GB region needed:
   a. Allocate PDPT
   b. Set PML4 entry pointing to PDPT

3. For each 1GB region needed:
   IF CPU supports 1GB pages AND GPA and HPA are both 1GB aligned AND entire region is RAM:
     a. Set PDPT entry as 1GB large page pointing to HPA
   ELSE:
     a. Allocate PD
     b. Set PDPT entry pointing to PD

4. For each 2MB region needed:
   IF CPU supports 2MB pages AND GPA and HPA are both 2MB aligned AND entire region is RAM:
     a. Set PD entry as 2MB large page pointing to HPA
   ELSE:
     a. Allocate PT
     b. Set PD entry pointing to PT

5. For each 4KB page needed:
   a. Set PT entry pointing to HPA with appropriate permissions

6. For MMIO regions:
   a. Do NOT create mapping (leave entry as 0)
   b. Access will cause EPT violation

Handling EPT Violation

ALGORITHM: handle_ept_violation()

1. Read VMCS fields:
   - Guest physical address (GPA)
   - Guest linear address (GLA) if available
   - Exit qualification

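2. Decode exit qualification:
   - Bits 0-2: Type of access (read/write/fetch)
   - Bits 3-5: What permissions the EPT entry granted
   - Bit 7: Guest linear address field is valid
   - Bit 8: Access was to the translated address itself (clear = guest page-table walk); valid only if bit 7 is set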

3. Look up region containing GPA:
   IF region is MMIO:
     a. Call MMIO handler for that device
     b. Emulate the access
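     c. Advance guest RIP past the faulting instruction (decode it to get the length; the VM-exit instruction length field is not architecturally guaranteed for EPT violations)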

   IF region should be mapped but isn't (lazy mapping):
     a. Install EPT mapping for the page
     b. VMRESUME will retry the access

   IF region is unmapped and shouldn't exist:
     a. Inject exception to guest (#PF or #GP)

4. Resume guest
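
Step 2 of the algorithm above can be sketched as a pure function over the raw exit qualification. The struct and function names here are hypothetical and deliberately separate from the project's ept_violation_info. The bit positions follow my reading of the Intel SDM's exit-qualification table for EPT violations, so check them against your SDM revision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of an EPT-violation exit qualification. */
struct ept_qual {
    bool was_read, was_write, was_fetch;              /* bits 0..2 */
    bool gpa_readable, gpa_writeable, gpa_executable; /* bits 3..5 */
    bool gla_valid;          /* bit 7: guest linear address field valid */
    bool final_translation;  /* bit 8: only meaningful when gla_valid */
};

static void decode_ept_qual(uint64_t q, struct ept_qual *out)
{
    assert(out != NULL);
    out->was_read       = (q >> 0) & 1;
    out->was_write      = (q >> 1) & 1;
    out->was_fetch      = (q >> 2) & 1;
    out->gpa_readable   = (q >> 3) & 1;
    out->gpa_writeable  = (q >> 4) & 1;
    out->gpa_executable = (q >> 5) & 1;
    out->gla_valid      = (q >> 7) & 1;
    /* Bit 8 distinguishes "access to the translated address" (1) from
     * "access to a guest paging-structure entry" (0). */
    out->final_translation = out->gla_valid && ((q >> 8) & 1);
}

/* With no R/W/X permission at all, the GPA was simply not mapped. */
static bool ept_entry_was_present(const struct ept_qual *v)
{
    return v->gpa_readable || v->gpa_writeable || v->gpa_executable;
}
```

For the MMIO carve-out used in this project, a VGA write would typically decode as a write to a GPA with no permissions at all, which is how the handler can tell an unmapped hole from a permission fault.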

5. Implementation Guide

5.1 Development Environment Setup

Same as Project 11, plus:

# Verify EPT support
$ grep -ow 'ept' /proc/cpuinfo | head -1
ept

# Check EPT capabilities via MSR (requires root and msr module)
$ sudo modprobe msr
$ sudo rdmsr 0x48C  # IA32_VMX_EPT_VPID_CAP
# Non-zero output indicates EPT support

5.2 Project Structure

Extend Project 11:

hypervisor/
├── ... (Project 11 files)
├── ept.c                # EPT table management
├── ept.h                # EPT data structures
├── memory_map.c         # GPA→HPA mapping management
├── vga_emu.c            # VGA MMIO emulation
└── guest/
    └── guest_vga.asm    # Guest that writes to VGA memory

5.3 The Core Question You’re Answering

“How does the hypervisor give each guest the illusion of having its own physical memory, while maintaining isolation and being able to intercept specific memory accesses for device emulation?”

The answer is EPT:

  1. EPT provides a second level of address translation (GPA→HPA)
  2. The hypervisor controls EPT, the guest controls its own page tables
  3. Guest memory access goes through both translations automatically
  4. Unmapped regions in EPT cause violations, allowing MMIO emulation
  5. Each guest can have its own EPT, providing memory isolation

5.4 Concepts You Must Understand First

Before implementing, verify you can answer these questions:

x86-64 Paging:

  • How does 4-level paging work (PML4, PDPT, PD, PT)?
  • What are large pages (2MB, 1GB) and when are they used?
  • What is the difference between physical and virtual addresses?

EPT Specifics:

  • What is the difference between GPA and HPA?
  • When does an EPT violation occur?
  • What information is available on EPT violation?
  • How is EPTP configured in VMCS?

Memory Types:

  • What are memory types (UC, WB, WT, etc.)?
  • Why does memory type matter for performance?

Book References:

  • Intel SDM Vol. 3C, Chapter 28: “VMX Support for Address Translation”
  • Intel SDM Vol. 3A, Chapter 4: “Paging” (for understanding 4-level structure)

5.5 Questions to Guide Your Design

EPT Structure:

  1. How big is each EPT page table level?
  2. How do you calculate which entry to use at each level?
  3. When can you use large pages vs. 4KB pages?
  4. How do you track allocated EPT pages for cleanup?

MMIO Handling:

  1. How do you ensure MMIO regions cause EPT violations?
  2. How do you determine if an EPT violation is for MMIO?
  3. How do you decode the instruction that caused the violation?
  4. How do you emulate a write to VGA memory?

Integration:

  1. How do you enable EPT in the VMCS?
  2. What happens if EPT is enabled but tables aren’t set up?
  3. How do you invalidate EPT translations (INVEPT)?

5.6 Thinking Exercise

Before writing code, trace through this memory access:

Setup:

  • Guest RAM: 256MB at GPA 0x0 → HPA 0x200000000
  • VGA text buffer at GPA 0xB8000 is NOT mapped (MMIO)
  • Guest code at GPA 0x7C00

Scenario: Guest executes MOV WORD [0xB8000], 0x0F48 (“H” with white-on-black attribute)

  1. Guest instruction fetch
    • Guest RIP = 0x7C00 (GVA = GPA in real mode)
    • CPU walks EPT: 0x7C00 → HPA 0x200007C00
    • Instruction fetched successfully
  2. Guest memory operand decode
    • Effective address = 0xB8000
    • In real mode, GVA = GPA = 0xB8000
  3. EPT walk for 0xB8000
    • PML4 entry for GPA 0xB8000: Index = (0xB8000 >> 39) & 0x1FF = 0
    • PML4[0] points to PDPT
    • PDPT entry: Index = (0xB8000 >> 30) & 0x1FF = 0
    • PDPT[0] points to PD
    • PD entry: Index = (0xB8000 >> 21) & 0x1FF = 0
    • PD[0] points to PT (it cannot be a 2MB page here, because the VGA hole needs 4KB granularity)
    • PT entry: Index = (0xB8000 >> 12) & 0x1FF = 0xB8
    • PT[0xB8] = 0 (not mapped!)
  4. EPT Violation
    • CPU triggers VM exit
    • Exit reason = 48 (EPT violation)
    • Exit qualification indicates write attempt
    • GPA = 0xB8000
  5. Hypervisor handling
    • Read GPA from VMCS
    • Determine this is VGA MMIO region
    • Decode instruction to get value (0x0F48)
    • Emulate: write ‘H’ with attribute 0x0F to VGA buffer
    • Advance guest RIP
    • VMRESUME

Questions:

  • What if you accidentally mapped 0xB8000? (Answer: No EPT violation, write goes to RAM, no VGA output)
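  • What if the guest uses paging? (Answer: The CPU walks the guest page tables first. Every guest page-table address is itself a GPA, so each step of that walk, and the final GPA, is translated through EPT)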
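
The index arithmetic in step 3 of the trace can be checked with a small helper. This is a sketch with hypothetical names (ept_index and the EPT_LEVEL_* constants are not from the project code). It relies only on the fact that each level consumes 9 address bits above the 12-bit page offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical level numbering: 0 = PT, 1 = PD, 2 = PDPT, 3 = PML4. */
enum { EPT_LEVEL_PT = 0, EPT_LEVEL_PD = 1, EPT_LEVEL_PDPT = 2, EPT_LEVEL_PML4 = 3 };

/* Index into the table at `level` for guest physical address `gpa`.
 * PT uses bits 20:12, PD bits 29:21, PDPT bits 38:30, PML4 bits 47:39. */
static unsigned ept_index(uint64_t gpa, int level)
{
    assert(level >= EPT_LEVEL_PT && level <= EPT_LEVEL_PML4);
    return (unsigned)((gpa >> (12 + 9 * level)) & 0x1FF);
}
```

For GPA 0xB8000 this yields 0, 0, 0 and 0xB8 for PML4, PDPT, PD and PT, matching the trace above.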

5.7 Hints in Layers

Hint 1 - Starting Point (Conceptual Direction)

Start with the simplest possible EPT: identity-map everything with 2MB pages. This means GPA = HPA for all memory. Once that works, carve out MMIO holes.

Hint 2 - Next Level (More Specific Guidance)

EPT entry calculation for 2MB pages:

/* For a 2MB page at GPA, calculate entry indices */
int pml4_idx = (gpa >> 39) & 0x1FF;  /* Bits 47:39 */
int pdpt_idx = (gpa >> 30) & 0x1FF;  /* Bits 38:30 */
int pd_idx   = (gpa >> 21) & 0x1FF;  /* Bits 29:21 */

/* PD entry for 2MB page */
uint64_t entry = (hpa & ~0x1FFFFFULL) /* HPA aligned to 2MB (64-bit mask) */
               | EPT_READ | EPT_WRITE | EPT_EXECUTE
               | EPT_LARGE_PAGE      /* Bit 7: This is a large page */
               | (EPT_MT_WB << 3);   /* Memory type WB */

Hint 3 - Technical Details (Approach/Pseudocode)

/* Enable EPT in VMCS */
void enable_ept(struct ept_state *ept) {
    uint64_t eptp;

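    /* Build EPTP value */
    eptp = ept->pml4_phys           /* Physical address of PML4 (4KB aligned) */
         | (3 << 3)                  /* Page walk length - 1 = 3 (4 levels) */
         | EPT_MT_WB;                /* Memory type for EPT structures */

    /* Set the A/D enable bit only if the CPU reports support;
     * otherwise VM entry fails because the EPTP is invalid */
    if (read_msr(IA32_VMX_EPT_VPID_CAP) & EPT_CAP_AD)
        eptp |= EPT_AD_ENABLE;

    /* Write to VMCS */
    vmcs_write64(EPT_POINTER, eptp);

    /* Enable EPT in secondary proc-based controls (the primary controls
     * must have "activate secondary controls" set for this to take effect) */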
    uint32_t secondary = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
    secondary |= SECONDARY_EXEC_ENABLE_EPT;
    vmcs_write32(SECONDARY_VM_EXEC_CONTROL, secondary);
}

/* Handle EPT violation */
int handle_ept_violation(void) {
    struct ept_violation_info info;

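    /* Read violation info from VMCS */
    info.qualification = vmcs_read64(EXIT_QUALIFICATION);
    info.gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
    info.gla = vmcs_read64(GUEST_LINEAR_ADDRESS);  /* Meaningful only if qualification bit 7 is set */

    /* Decode qualification: bits 0-2 are the access type,
     * bits 3-5 are the permissions the GPA had */
    info.was_read       = info.qualification & (1 << 0);
    info.was_write      = info.qualification & (1 << 1);
    info.was_fetch      = info.qualification & (1 << 2);
    info.gpa_readable   = info.qualification & (1 << 3);
    info.gpa_writeable  = info.qualification & (1 << 4);
    info.gpa_executable = info.qualification & (1 << 5);

    /* Check if this is VGA MMIO */
    if (info.gpa >= VGA_TEXT_BASE && info.gpa < VGA_TEXT_END) {
        return handle_vga_mmio(&info);
    }

    /* Unknown region - log it and report failure to the caller */
    pr_err("EPT violation at unmapped GPA 0x%llx\n", info.gpa);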
    return -1;
}
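
The EPTP layout used in enable_ept can also be exercised outside the kernel. The sketch below is a user-space illustration with hypothetical names (build_eptp, eptp_looks_valid and the EPTP_* macros are assumptions, not the project's definitions). The validity check covers only the fields discussed in this section, not every reserved bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPTP_MT_UC      0ULL          /* bits 2:0 = 0: uncacheable */
#define EPTP_MT_WB      6ULL          /* bits 2:0 = 6: write-back */
#define EPTP_WALK_4     (3ULL << 3)   /* bits 5:3 = page-walk length - 1 */
#define EPTP_AD_ENABLE  (1ULL << 6)   /* bit 6: accessed/dirty flags */

/* Compose an EPTP for a 4-level walk with write-back paging structures. */
static uint64_t build_eptp(uint64_t pml4_phys, bool ad_supported)
{
    assert((pml4_phys & 0xFFF) == 0);   /* PML4 must be 4KB aligned */

    uint64_t eptp = pml4_phys | EPTP_WALK_4 | EPTP_MT_WB;
    if (ad_supported)
        eptp |= EPTP_AD_ENABLE;
    return eptp;
}

/* Partial sanity check for the fields this project sets. */
static bool eptp_looks_valid(uint64_t eptp, bool ad_supported)
{
    uint64_t mem_type = eptp & 0x7;

    if (mem_type != EPTP_MT_UC && mem_type != EPTP_MT_WB)
        return false;                     /* only UC and WB are defined */
    if (((eptp >> 3) & 0x7) != 3)
        return false;                     /* walk length field must be 3 */
    if ((eptp & EPTP_AD_ENABLE) && !ad_supported)
        return false;                     /* A/D requested without support */
    return true;
}
```

Running a check like this before the first VM entry can turn an opaque VM-entry failure into a readable log message.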

Hint 4 - Tools/Debugging (Verification Methods)

Debug EPT with these techniques:

/* Dump EPT structure */
void dump_ept(struct ept_state *ept) {
    for (int i = 0; i < 512; i++) {
        if (ept->pml4[i].value == 0) continue;
        pr_info("PML4[%d] = 0x%llx\n", i, ept->pml4[i].value);
        /* Recursively dump lower levels... */
    }
}

/* Verify EPT walk matches expected HPA */
uint64_t ept_walk(struct ept_state *ept, uint64_t gpa) {
    int pml4_idx = (gpa >> 39) & 0x1FF;
    ept_entry_t *pdpt;

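    /* An EPT entry is present if any of R/W/X (bits 2:0) is set */
    if (!(ept->pml4[pml4_idx].value & 0x7)) {
        pr_info("PML4[%d] not present\n", pml4_idx);
        return 0;
    }
    /* Cast before shifting so a bit-field is not truncated */
    pdpt = phys_to_virt((uint64_t)ept->pml4[pml4_idx].phys_addr << 12);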
    /* Continue walk... */
}

5.8 The Interview Questions They’ll Ask

  1. “Explain the difference between shadow page tables and EPT.”
    • Shadow PT: VMM traps every guest PT modification, maintains parallel tables
    • EPT: Hardware does 2D translation, guest PT changes are transparent
    • EPT is faster because it avoids VM exits for PT modifications
  2. “What causes an EPT violation?”
    • Access to GPA with insufficient EPT permissions (read/write/execute)
    • Access to unmapped GPA (entry not present)
    • Used for MMIO emulation by not mapping device memory regions
  3. “How does 2D paging affect TLB performance?”
    • 2D walk can require up to 24 memory accesses: 4 guest levels × (4 EPT reads + 1 guest entry read) = 20, plus 4 EPT reads for the final GPA
    • TLB caches the final GVA→HPA translation
    • After TLB warm-up, performance is near-native
    • Large pages reduce TLB pressure
  4. “How would you implement dirty page tracking for live migration?”
    • Enable EPT Accessed/Dirty bits in EPTP
    • Periodically scan EPT for dirty pages
    • Or: Clear write bit, use write violations to track
    • Copy dirty pages to destination, iterate until convergence
  5. “What is INVEPT and when would you use it?”
    • Invalidates cached EPT translations
    • Types: Single-context (one EPTP), All-contexts (all EPTPs)
    • Use after modifying EPT entries
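    • Use when an EPTP's paging structures are freed or reused for another guest (cached translations are tagged by EPTP, so merely switching to a different EPTP does not by itself require it)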
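
The 24-access figure from question 3 follows from a simple count, sketched below. The function name is hypothetical. The formula assumes a cold walk with no TLB or paging-structure cache hits.

```c
#include <assert.h>

/*
 * Worst-case memory references for one nested (2D) page walk.
 * Each guest level needs a full EPT walk to locate the guest table
 * (ept_levels reads) plus one read of the guest entry itself, and the
 * final GPA needs one more full EPT walk.
 */
static unsigned nested_walk_refs(unsigned guest_levels, unsigned ept_levels)
{
    assert(guest_levels > 0 && ept_levels > 0);
    return guest_levels * (ept_levels + 1) + ept_levels;
}
```

With 4 guest levels and 4 EPT levels this gives 4 × 5 + 4 = 24, which is equivalent to (4 + 1)(4 + 1) - 1.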

5.9 Books That Will Help

Topic | Book | Chapter
EPT Mechanism | Intel SDM Vol. 3C | Chapter 28.2
EPT Paging Structures | Intel SDM Vol. 3C | Chapter 28.2.2
EPT Violation | Intel SDM Vol. 3C | Chapter 28.2.3
EPTP Format | Intel SDM Vol. 3C | Chapter 24.6.11
x86-64 Paging | Intel SDM Vol. 3A | Chapter 4
Memory Types | Intel SDM Vol. 3A | Chapter 11
Linux mm | Understanding the Linux Kernel | Chapter 2, 8

5.10 Implementation Phases

Phase 1: EPT Capability Check (Days 1-2)

Goal: Detect and report EPT capabilities

void check_ept_capabilities(void) {
    uint64_t ept_cap = read_msr(IA32_VMX_EPT_VPID_CAP);

    pr_info("EPT Capabilities (MSR 0x48C):\n");
    pr_info("  Execute-only: %s\n",
            (ept_cap & EPT_CAP_EXEC_ONLY) ? "Yes" : "No");
    pr_info("  Page walk 4: %s\n",
            (ept_cap & EPT_CAP_PWL4) ? "Yes" : "No");
    pr_info("  UC memory type: %s\n",
            (ept_cap & EPT_CAP_UC) ? "Yes" : "No");
    pr_info("  WB memory type: %s\n",
            (ept_cap & EPT_CAP_WB) ? "Yes" : "No");
    pr_info("  2MB pages: %s\n",
            (ept_cap & EPT_CAP_2MB) ? "Yes" : "No");
    pr_info("  1GB pages: %s\n",
            (ept_cap & EPT_CAP_1GB) ? "Yes" : "No");
    pr_info("  INVEPT: %s\n",
            (ept_cap & EPT_CAP_INVEPT) ? "Yes" : "No");
    pr_info("  A/D bits: %s\n",
            (ept_cap & EPT_CAP_AD) ? "Yes" : "No");
}

Validation:

  • Module reports EPT capabilities correctly
  • Compare with /proc/cpuinfo features
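
If your headers do not yet define the EPT_CAP_* masks used above, the sketch below shows one possible set of definitions plus a decoder that works on a raw MSR value. The bit positions reflect my reading of the IA32_VMX_EPT_VPID_CAP layout in the Intel SDM (Appendix A), so verify them before relying on them. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions in IA32_VMX_EPT_VPID_CAP (MSR 0x48C). */
#define EPT_CAP_EXEC_ONLY (1ULL << 0)
#define EPT_CAP_PWL4      (1ULL << 6)
#define EPT_CAP_UC        (1ULL << 8)
#define EPT_CAP_WB        (1ULL << 14)
#define EPT_CAP_2MB       (1ULL << 16)
#define EPT_CAP_1GB       (1ULL << 17)
#define EPT_CAP_INVEPT    (1ULL << 20)
#define EPT_CAP_AD        (1ULL << 21)

struct ept_caps {
    bool exec_only, walk4, mt_uc, mt_wb;
    bool page_2mb, page_1gb, invept, ad_bits;
};

/* Decode a raw capability value into named flags. */
static void ept_decode_caps(uint64_t cap, struct ept_caps *out)
{
    assert(out != NULL);
    out->exec_only = (cap & EPT_CAP_EXEC_ONLY) != 0;
    out->walk4     = (cap & EPT_CAP_PWL4) != 0;
    out->mt_uc     = (cap & EPT_CAP_UC) != 0;
    out->mt_wb     = (cap & EPT_CAP_WB) != 0;
    out->page_2mb  = (cap & EPT_CAP_2MB) != 0;
    out->page_1gb  = (cap & EPT_CAP_1GB) != 0;
    out->invept    = (cap & EPT_CAP_INVEPT) != 0;
    out->ad_bits   = (cap & EPT_CAP_AD) != 0;
}

/* What this project needs: 4-level walk, WB tables, 2MB pages, INVEPT. */
static bool ept_caps_sufficient(const struct ept_caps *c)
{
    return c->walk4 && c->mt_wb && c->page_2mb && c->invept;
}
```

Keeping the decode separate from the MSR read makes it possible to test the logic in user space with a value captured from rdmsr.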

Phase 2: EPT Table Construction (Days 3-7)

Goal: Build EPT tables with identity mapping

Start with identity mapping (GPA = HPA) using 2MB pages:

struct ept_state *build_ept_identity(uint64_t memory_size) {
    struct ept_state *ept = kzalloc(sizeof(*ept), GFP_KERNEL);
    if (!ept)
        return NULL;

    /* Allocate a zeroed PML4 page */
    ept->pml4 = (ept_entry_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
    if (!ept->pml4) {
        kfree(ept);
        return NULL;
    }
    ept->pml4_phys = virt_to_phys(ept->pml4);

    /* Map memory in 2MB pages (memory_size is assumed to be a multiple of 2MB) */
    for (uint64_t gpa = 0; gpa < memory_size; gpa += SIZE_2MB) {
        ept_map_2mb(ept, gpa, gpa, EPT_READ | EPT_WRITE | EPT_EXECUTE);
    }

    return ept;
}

Validation:

  • EPT walk function returns correct HPA for any GPA
  • Dump shows expected structure
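
Before touching the kernel, the mapping logic can be prototyped in user space. The sketch below is a simulation under stated assumptions: a table's virtual address stands in for its physical address, only 2MB leaves are handled, and all names (alloc_table, next_table, sim_map_2mb, sim_walk, EPT_NOT_MAPPED) are hypothetical rather than the project's ept_map_2mb or ept_walk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EPT_READ        (1ULL << 0)
#define EPT_WRITE       (1ULL << 1)
#define EPT_EXECUTE     (1ULL << 2)
#define EPT_LARGE_PAGE  (1ULL << 7)
#define EPT_MT_WB       6ULL
#define EPT_ADDR_MASK   0x000FFFFFFFFFF000ULL   /* bits 51:12 */
#define SIZE_2MB        (1ULL << 21)
#define EPT_NOT_MAPPED  UINT64_MAX

/* One zeroed, 4KB-aligned table of 512 entries. */
static uint64_t *alloc_table(void)
{
    uint64_t *t = aligned_alloc(4096, 4096);
    if (t)
        memset(t, 0, 4096);
    return t;
}

/* Follow table[idx] to the next level, optionally creating it. */
static uint64_t *next_table(uint64_t *table, unsigned idx, int create)
{
    if (!(table[idx] & 0x7)) {          /* no R/W/X bit: not present */
        if (!create)
            return NULL;
        uint64_t *child = alloc_table();
        if (!child)
            return NULL;
        table[idx] = (uint64_t)(uintptr_t)child
                   | EPT_READ | EPT_WRITE | EPT_EXECUTE;
    }
    return (uint64_t *)(uintptr_t)(table[idx] & EPT_ADDR_MASK);
}

/* Install one 2MB leaf in the PD; returns 0 on success, -1 on failure. */
static int sim_map_2mb(uint64_t *pml4, uint64_t gpa, uint64_t hpa, uint64_t perms)
{
    assert((gpa % SIZE_2MB) == 0 && (hpa % SIZE_2MB) == 0);

    uint64_t *pdpt = next_table(pml4, (gpa >> 39) & 0x1FF, 1);
    if (!pdpt)
        return -1;
    uint64_t *pd = next_table(pdpt, (gpa >> 30) & 0x1FF, 1);
    if (!pd)
        return -1;
    pd[(gpa >> 21) & 0x1FF] = hpa | perms | EPT_LARGE_PAGE | (EPT_MT_WB << 3);
    return 0;
}

/* Translate gpa through 2MB leaves only; EPT_NOT_MAPPED if absent. */
static uint64_t sim_walk(uint64_t *pml4, uint64_t gpa)
{
    uint64_t *pdpt = next_table(pml4, (gpa >> 39) & 0x1FF, 0);
    if (!pdpt)
        return EPT_NOT_MAPPED;
    uint64_t *pd = next_table(pdpt, (gpa >> 30) & 0x1FF, 0);
    if (!pd)
        return EPT_NOT_MAPPED;
    uint64_t pde = pd[(gpa >> 21) & 0x1FF];
    if (!(pde & 0x7) || !(pde & EPT_LARGE_PAGE))
        return EPT_NOT_MAPPED;
    return (pde & EPT_ADDR_MASK & ~(SIZE_2MB - 1)) | (gpa & (SIZE_2MB - 1));
}
```

Returning a sentinel such as EPT_NOT_MAPPED rather than 0 avoids confusing "unmapped" with a valid mapping to HPA 0, which matters for identity maps that start at address zero.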

Phase 3: VMCS Integration (Days 8-10)

Goal: Enable EPT in VMCS and boot guest

Add to your Project 11 VMCS setup:

  1. Set EPTP field
  2. Enable EPT in secondary proc-based controls
  3. Enable “unrestricted guest” mode (lets the guest run in real mode without paging; this control can only be set when EPT is enabled)

Validation:

  • Guest boots with EPT enabled
  • No unexpected EPT violations for RAM access

Phase 4: MMIO Carve-out (Days 11-14)

Goal: VGA memory causes EPT violations

Modify EPT building to NOT map VGA region:

struct ept_state *build_ept_with_mmio(uint64_t memory_size) {
    struct ept_state *ept = build_ept_identity(memory_size);

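    /* Unmap VGA region (0xA0000-0xBFFFF). build_ept_identity() used 2MB
     * pages, so ept_unmap_range() must first split the 2MB page covering
     * this range into 4KB entries. */
    ept_unmap_range(ept, 0xA0000, 0x20000);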

    return ept;
}

Validation:

  • Guest access to 0xB8000 causes EPT violation
  • Exit qualification shows correct access type
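
Carving a 128KB hole out of a 2MB mapping means splitting the large page first. The sketch below shows that step on plain arrays. The helper names (split_2mb_leaf, punch_hole) are hypothetical and it does not show how the project's ept_unmap_range is implemented.

```c
#include <assert.h>
#include <stdint.h>

#define EPT_LARGE_PAGE  (1ULL << 7)
#define EPT_ADDR_MASK   0x000FFFFFFFFFF000ULL   /* bits 51:12 */
#define SIZE_4KB        (1ULL << 12)
#define SIZE_2MB        (1ULL << 21)

/* Fill pt[] with 512 4KB entries equivalent to the 2MB leaf `pde`:
 * same permissions and memory type, consecutive 4KB frames. */
static void split_2mb_leaf(uint64_t pde, uint64_t pt[512])
{
    assert(pde & EPT_LARGE_PAGE);

    uint64_t base  = pde & EPT_ADDR_MASK & ~(SIZE_2MB - 1);
    uint64_t attrs = pde & ~EPT_ADDR_MASK & ~EPT_LARGE_PAGE;

    for (unsigned i = 0; i < 512; i++)
        pt[i] = (base + i * SIZE_4KB) | attrs;
}

/* Zero the 4KB entries covering [start, start + len) inside the 2MB
 * region that begins at region_base, so accesses there fault. */
static void punch_hole(uint64_t pt[512], uint64_t region_base,
                       uint64_t start, uint64_t len)
{
    assert(start >= region_base && start + len <= region_base + SIZE_2MB);
    assert((start % SIZE_4KB) == 0 && (len % SIZE_4KB) == 0);

    for (uint64_t gpa = start; gpa < start + len; gpa += SIZE_4KB)
        pt[(gpa >> 12) & 0x1FF] = 0;
}
```

In the real hypervisor the PD entry would then be rewritten to point at the new PT (with bit 7 clear), followed by an INVEPT if the guest has already run with the old mapping.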

Phase 5: MMIO Emulation (Days 15-21)

Goal: Emulate VGA writes, display output

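int handle_vga_mmio(struct ept_violation_info *info) {
    if (!info->was_write) {
        /* Read: a full implementation would return the stored cell to
         * the guest; this version only emulates writes */
        return 0;
    }

    /* Decode instruction to get value being written */
    uint64_t value = decode_instruction_operand();

    /* Calculate VGA position (each cell is 2 bytes: character + attribute) */
    int offset = (info->gpa - VGA_TEXT_BASE) / 2;
    int row = offset / 80;
    int col = offset % 80;

    /* Ignore writes past the visible screen (assumes vga_buffer is 25 rows x 80 columns) */
    if (row >= 25)
        return 0;

    char ch = value & 0xFF;
    uint8_t attr = (value >> 8) & 0xFF;

    pr_info("VGA write: '%c' (attr 0x%02x) at row %d, col %d\n",
            ch, attr, row, col);

    /* Store in our VGA buffer */
    vga_buffer[row][col].ch = ch;
    vga_buffer[row][col].attr = attr;

    /* The caller must advance guest RIP past the emulated instruction
     * before VMRESUME, or the guest will fault on it again */
    return 0;
}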

Validation:

  • Guest writes to VGA cause EPT violations
  • Character and position correctly decoded
  • Final output matches expected message

5.11 Key Implementation Decisions

Decision 1: 2MB vs 4KB Pages

2MB Pages (Recommended for RAM)

  • Fewer EPT entries to manage
  • Better TLB utilization
  • Must be 2MB aligned
  • Can’t mix permissions within 2MB

4KB Pages (Required for MMIO boundaries)

  • Fine-grained control
  • More memory for EPT structures
  • Required when MMIO is not 2MB aligned

Recommendation: Use 2MB for RAM, 4KB only for areas near MMIO boundaries.

Decision 2: Instruction Decoding for MMIO

For MMIO emulation, you need to know what value the guest is writing:

Option A: Full x86 Decoder

  • Parse instruction bytes to get operands
  • Complex but accurate
  • Can handle any instruction

Option B: Use VM-exit Instruction Info

  • VMCS provides some instruction info
  • Limited, doesn’t include all operands
  • Simpler but incomplete

Option C: Map MMIO with Write-trap, Decode on Write

  • Map region but mark write-protected
  • On write violation, have partial info from exit qualification
  • May need to inspect guest memory/registers

Recommendation for this project: Simple decode of common instructions (MOV) or use guest register state for value.
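
A "simple decode" for this project can be very small. The sketch below handles only three 16-bit real-mode store encodings, including the two used by the guest programs in this document (STOSW, and MOV word [disp16], imm16). The struct, function name and the decision to report a DI adjustment are all assumptions for illustration. Prefixes and other addressing modes are deliberately unsupported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of decoding one 16-bit store instruction. */
struct mmio_write {
    uint16_t value;     /* value the guest is storing */
    uint8_t  size;      /* bytes written */
    uint8_t  insn_len;  /* bytes to add to guest RIP */
    int      di_delta;  /* change to apply to guest DI (STOSW only) */
};

/* Returns 0 on success, -1 if the instruction is not recognized. */
static int decode_mmio_write16(const uint8_t *insn, size_t avail,
                               uint16_t guest_ax, struct mmio_write *out)
{
    assert(insn != NULL && out != NULL);

    /* AB: STOSW stores AX at ES:DI, then DI += 2 (assumes DF = 0) */
    if (avail >= 1 && insn[0] == 0xAB) {
        *out = (struct mmio_write){ .value = guest_ax, .size = 2,
                                    .insn_len = 1, .di_delta = 2 };
        return 0;
    }

    /* C7 06 disp16 imm16: MOV word [disp16], imm16 */
    if (avail >= 6 && insn[0] == 0xC7 && insn[1] == 0x06) {
        uint16_t imm = (uint16_t)(insn[4] | (insn[5] << 8));
        *out = (struct mmio_write){ .value = imm, .size = 2,
                                    .insn_len = 6, .di_delta = 0 };
        return 0;
    }

    /* A3 moffs16: MOV [moffs16], AX */
    if (avail >= 3 && insn[0] == 0xA3) {
        *out = (struct mmio_write){ .value = guest_ax, .size = 2,
                                    .insn_len = 3, .di_delta = 0 };
        return 0;
    }

    return -1;   /* unsupported: log the bytes and stop the guest */
}
```

The handler would read the instruction bytes from guest memory at CS:IP, call the decoder, apply the write to the emulated device, then update RIP (and DI for STOSW) from the result.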

Decision 3: Lazy vs. Eager Mapping

Eager Mapping (This Project)

  • Build all EPT entries at VM creation
  • Simple, predictable
  • Uses more memory upfront

Lazy Mapping

  • Only map pages on first access
  • EPT violation → install mapping → resume
  • More efficient for sparse guests
  • More complex

Recommendation: Use eager mapping; add lazy mapping as an extension.


6. Testing Strategy

6.1 Unit Tests

/* Test EPT entry creation */
void test_ept_entry(void) {
    ept_entry_t entry = make_ept_entry(0x123456000,
                                        EPT_READ | EPT_WRITE | EPT_EXECUTE,
                                        EPT_MT_WB);
    assert(entry.read == 1);
    assert(entry.write == 1);
    assert(entry.execute == 1);
    assert(entry.mem_type == 6);
    assert(((uint64_t)entry.phys_addr << 12) == 0x123456000);
    pr_info("TEST: EPT entry creation - PASS\n");
}

/* Test EPT walk */
void test_ept_walk(void) {
    struct ept_state *ept = build_ept_identity(SIZE_256MB);

    /* Walk should return identity mapping */
    assert(ept_walk(ept, 0x1000) == 0x1000);
    assert(ept_walk(ept, 0x100000) == 0x100000);
    assert(ept_walk(ept, 0xFFFF000) == 0xFFFF000);

    /* MMIO region should return 0 (not mapped) */
    ept_unmap_range(ept, 0xA0000, 0x20000);
    assert(ept_walk(ept, 0xB8000) == 0);

    free_ept(ept);
    pr_info("TEST: EPT walk - PASS\n");
}

6.2 Integration Tests

#!/bin/bash
# test_ept.sh

# Test 1: Guest RAM access (no violations)
echo "Test 1: RAM access..."
./hypervisor_ept guest_ram_test.bin 2>&1 | grep -q "EPT violations: 0" \
    && echo "PASS" || echo "FAIL"

# Test 2: VGA writes cause violations
echo "Test 2: VGA violations..."
./hypervisor_ept guest_vga.bin 2>&1 | grep -q "EPT violation.*0xB8" \
    && echo "PASS" || echo "FAIL"

# Test 3: VGA output correct
echo "Test 3: VGA output..."
./hypervisor_ept guest_vga.bin 2>&1 | grep -q "Hello World" \
    && echo "PASS" || echo "FAIL"

6.3 Guest Test Programs

guest_ram_test.asm - Tests RAM access (no EPT violations expected):

[BITS 16]
[ORG 0x7C00]

    ; Write to various RAM locations
    mov word [0x8000], 0x1234
    mov word [0x9000], 0x5678
    mov ax, [0x8000]
    cmp ax, 0x1234
    jne fail

    ; Success - halt
    hlt

fail:
    ; Infinite loop on failure
    jmp fail

guest_vga.asm - Writes to VGA (EPT violations expected):

[BITS 16]
[ORG 0x7C00]

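    cld                 ; Clear direction flag so LODSB/STOSW move forward
    xor ax, ax
    mov ds, ax          ; DS = 0 so DS:SI reaches the message (ORG 0x7C00)
    mov ax, 0xB800
    mov es, ax          ; ES = VGA segment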

    mov di, 0           ; Start at position 0
    mov si, message

print_loop:
    lodsb               ; Load character
    test al, al
    jz done
    mov ah, 0x0F        ; White on black attribute
    stosw               ; Store char + attribute
    jmp print_loop

done:
    hlt

message: db "Hello World!", 0

7. Common Pitfalls & Debugging

Problem | Root Cause | Fix | Verification
VM entry fails after enabling EPT | Invalid EPTP | Check alignment, page walk length, memory type | Dump EPTP value, verify against spec
EPT violation on RAM access | Missing EPT mapping | Ensure all RAM GPA ranges are mapped | Walk EPT manually for failing GPA
Guest hangs with EPT | EPT violation handler bug | Check you’re advancing RIP, resuming correctly | Add extensive logging
Wrong VGA character | Instruction decode error | Verify you’re reading correct bytes | Log full instruction bytes
EPT not taking effect | Forgot secondary controls | Enable EPT bit in secondary proc-based controls | Dump control field
Performance worse than before | Too many 4KB pages | Use 2MB pages for RAM | Check EPT page count

Debugging EPT

/* Add to EPT violation handler for debugging */
void debug_ept_violation(struct ept_violation_info *info) {
    pr_info("=== EPT VIOLATION DEBUG ===\n");
    pr_info("GPA: 0x%llx\n", info->gpa);
    pr_info("GLA: 0x%llx\n", info->gla);
    pr_info("Qualification: 0x%llx\n", info->qualification);
    pr_info("  Read: %d, Write: %d, Fetch: %d\n",
            info->was_read, info->was_write, info->was_fetch);
    pr_info("  GPA R:%d W:%d X:%d\n",
            info->gpa_readable, info->gpa_writeable, info->gpa_executable);

    /* Walk EPT to see current state */
    pr_info("EPT walk for GPA 0x%llx:\n", info->gpa);
    ept_walk_debug(current_ept, info->gpa);

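    /* Show guest instruction. The VMCS instruction length is not
     * architecturally defined for most EPT violations, so treat the
     * value as a debugging hint only. */
    uint64_t guest_rip = vmcs_read64(GUEST_RIP);
    uint32_t inst_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
    pr_info("Guest RIP: 0x%llx, Instruction length: %u\n",
            guest_rip, inst_len);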
}

8. Extensions & Challenges

8.1 Accessed/Dirty Bit Tracking

Enable EPT A/D bits and implement dirty page tracking:

  • Useful for live migration
  • Track which pages guest has modified
  • Implement efficient dirty page scan
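
A dirty-page scan can be sketched as a pass over leaf entries that records and clears the dirty flag. This is a user-space illustration with hypothetical names. It assumes A/D bits are enabled in the EPTP and that the accessed and dirty flags sit in bits 8 and 9 of a leaf entry, as I understand the SDM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EPT_ACCESSED (1ULL << 8)
#define EPT_DIRTY    (1ULL << 9)

/*
 * Scan n leaf entries. For every entry with the dirty flag set, set bit i
 * in `bitmap` and clear the flag so the next pass only sees new writes.
 * Returns the number of dirty entries found.
 */
static size_t harvest_dirty(uint64_t *entries, size_t n, uint8_t *bitmap)
{
    assert(entries != NULL && bitmap != NULL);

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (entries[i] & EPT_DIRTY) {
            bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            entries[i] &= ~EPT_DIRTY;
            count++;
        }
    }
    return count;
}
```

On real hardware the CPU updates these flags concurrently, so the clear should be an atomic operation and be followed by an INVEPT. Otherwise a cached translation can hide later writes from the next scan.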

8.2 Execute-Only Pages

Use EPT execute-only capability for security:

  • Map code pages with X but not R
  • Prevents reading code (e.g., encryption keys in code)
  • Requires hardware support (check capability MSR)

8.3 1GB Large Pages

Use 1GB pages for even better TLB efficiency:

  • Useful for large guest memory sizes
  • Must be 1GB aligned
  • Check capability before using

8.4 VGA Framebuffer Emulation

Extend VGA emulation beyond text mode:

  • Map framebuffer at 0xA0000
  • Emulate graphics modes
  • Display graphical output

8.5 Sub-page Write Protection

Track writes at finer granularity:

  • Intel SPP (Sub-Page write Protection)
  • 128-byte granularity within 4KB pages
  • For security monitoring

9. Real-World Connections

How KVM Uses EPT

/* Simplified, illustrative sketch of KVM EPT setup (not actual kernel
 * source; the real code lives in arch/x86/kvm/mmu/) */

/* KVM uses "memory slots" to map guest frames to host virtual addresses;
 * host physical frames are resolved from those at fault time */
struct kvm_memory_slot {
    gfn_t base_gfn;               /* First guest frame number in the slot */
    unsigned long npages;         /* Number of pages */
    unsigned long *dirty_bitmap;
    unsigned long userspace_addr; /* Host virtual address backing base_gfn */
};

/* KVM MMU builds EPT lazily on fault; with EPT the faulting address is a GPA */
static int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa,
                              uint32_t error_code) {
    gfn_t gfn = gpa >> PAGE_SHIFT;  /* Get guest frame number */

    /* Look up memory slot */
    struct kvm_memory_slot *slot = gfn_to_memslot(vcpu->kvm, gfn);

    /* Get host frame number */
    pfn_t pfn = gfn_to_pfn(vcpu->kvm, gfn);

    /* Install EPT mapping */
    __direct_map(vcpu, gpa, level, pfn, ...);

    return 0;  /* Retry the access */
}

How This Enables Cloud Computing

Every cloud VM relies on EPT:

  1. Memory Isolation: Each VM has separate EPT, can’t access other VMs
  2. Overcommit: VMM can page guest memory to disk (EPT controls mapping)
  3. Live Migration: EPT A/D bits track dirty pages for iterative copy
  4. Memory Ballooning: Reclaim guest memory by changing EPT mappings

10. Resources

Academic Papers

  • “Hardware and Software Support for Virtualization” - Bugnion, Nieh, Tsafrir (covers EPT/NPT)

11. Self-Assessment Checklist

EPT Structure

  • Can detect EPT capabilities from MSR
  • Can allocate EPT page tables (PML4, PDPT, PD, PT)
  • Can create entries for 4KB, 2MB, and 1GB pages
  • EPT walk function returns correct HPA

VMCS Integration

  • EPTP configured correctly
  • EPT enabled in secondary controls
  • Guest boots with EPT enabled
  • No spurious EPT violations for RAM

Violation Handling

  • Can handle EPT violation VM exit
  • Can decode violation type (read/write/fetch)
  • Can extract faulting GPA
  • MMIO regions correctly trigger violations

MMIO Emulation

  • VGA text buffer writes detected
  • Correct character and position decoded
  • Guest RIP advanced correctly
  • Output displayed correctly

Robustness

  • EPT structures freed on cleanup
  • No memory leaks
  • Error handling for allocation failures

12. Submission / Completion Criteria

Your EPT implementation is complete when you can demonstrate:

  1. EPT Capability Detection
    • Show output of capability check matching CPU features
  2. EPT-Enabled Guest Boot
    • Guest boots successfully with EPT
    • Statistics show 0 EPT violations for pure RAM access
  3. VGA MMIO Emulation
    • Guest writes to 0xB8000 cause EPT violations
    • Violations are handled correctly
    • Output displays: “Hello World!” (or similar)
  4. Statistics
    • Show EPT violation count matches expected (one per VGA write)
    • Show EPT page statistics (number of 2MB/4KB pages)
  5. Code Quality
    • EPT entry format matches Intel specification
    • Clear separation of EPT management and violation handling
    • Documented memory map (what’s mapped where)

Bonus Points:

  • 1GB page support
  • A/D bit tracking implemented
  • Multiple MMIO devices emulated
  • Lazy EPT mapping on demand

After completing this project, you’ll understand how modern hypervisors achieve memory virtualization with near-native performance. The EPT mechanism you’ve implemented is the same one running in every cloud data center, enabling the memory isolation that makes multi-tenant computing possible.