Project 6: Protected Mode to Long Mode (64-bit)

Master the complete x86-64 boot journey: Real Mode to Protected Mode to Long Mode. Build the same transition sequence that a BIOS-booted 64-bit operating system performs, and learn 4-level paging, PAE, Model-Specific Registers (MSRs), and the architectural requirements that make 64-bit execution possible.


Quick Reference

Attribute     | Value
--------------|---------------------------------------------------------------
Difficulty    | ★★★★★ Master
Time Estimate | 1+ month
Language      | x86 Assembly (NASM)
Prerequisites | Project 3 (Real to Protected Mode), understanding of paging concepts, Project 4 (Two-Stage Bootloader) recommended
Key Topics    | 4-level paging (PML4, PDPT, PDT, PT), PAE (Physical Address Extension), IA32_EFER MSR, Long Mode Enable (LME), identity mapping, 64-bit registers and ABI

1. Learning Objectives

By completing this project, you will:

  1. Understand why Long Mode requires paging: Unlike 32-bit Protected Mode where paging is optional, 64-bit Long Mode mandates paging. Explain the architectural reason for this requirement.

  2. Master 4-level page table architecture: Build PML4, PDPT, PDT, and optionally PT structures by hand, understanding how each level contributes 9 bits to virtual address translation.

  3. Work with Model-Specific Registers: Use RDMSR/WRMSR instructions to manipulate the IA32_EFER register, enabling Long Mode before activating paging.

  4. Navigate the complete CPU mode transition: Execute the sequence Real Mode (16-bit) -> Protected Mode (32-bit) -> Compatibility Mode (Long Mode active, still running 32-bit code) -> 64-bit Mode (full Long Mode).

  5. Implement identity mapping: Create page tables where virtual addresses equal physical addresses for your bootloader code, ensuring the CPU doesn’t triple fault when paging is enabled.

  6. Write and execute true 64-bit code: Use all 16 general-purpose 64-bit registers (RAX-R15), demonstrating full Long Mode capability.

  7. Debug complex mode transitions: Use QEMU/GDB to diagnose page faults, understand CR3/CR4/EFER register interactions, and troubleshoot triple faults at each stage.

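  8. Understand Linux’s boot sequence: Your implementation mirrors the transition performed by Linux’s early boot code on BIOS systems: the real-mode setup code under arch/x86/boot/ enters Protected Mode, and arch/x86/boot/compressed/head_64.S builds page tables and enters Long Mode.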


2. Theoretical Foundation

2.1 Core Concepts

The Journey: Real Mode -> Protected Mode -> Long Mode

When an x86-64 CPU powers on, it starts in Real Mode and behaves much like an 8086 from 1978. To reach 64-bit mode, you must pass through each historical layer in order:

+-----------------------------------------------------------------------------+
|                    THE x86-64 MODE TRANSITION JOURNEY                        |
+-----------------------------------------------------------------------------+
|                                                                              |
|     REAL MODE               PROTECTED MODE            LONG MODE             |
|     (16-bit)                (32-bit)                  (64-bit)              |
|     +----------+            +-------------+           +-------------+        |
|     | 8086     |            | 80386       |           | AMD64       |        |
|     | 1 MB     |            | 4 GB        |           | 16 EB       |        |
|     | No prot  |            | Segments    |           | Paging req  |        |
|     | No paging|            | Paging opt  |           | Flat memory |        |
|     +----------+            +-------------+           +-------------+        |
|          |                       |                         ^                 |
|          |                       |                         |                 |
|          v                       v                         |                 |
|     +----------------------------------------------------------------+      |
|     |                    TRANSITION SEQUENCE                          |      |
|     +----------------------------------------------------------------+      |
|     |                                                                 |      |
|     |  1. Real Mode Start (BIOS loads us at 0x7C00)                  |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  2. Load GDT with 32-bit descriptors                           |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  3. Enable A20 gate                                            |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  4. Set CR0.PE = 1  -----> PROTECTED MODE                      |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  5. Far jump to 32-bit code segment                            |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  6. Build 4-level page tables (PML4, PDPT, PDT, [PT])         |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  7. Set CR4.PAE = 1 (Physical Address Extension)               |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  8. Load CR3 with PML4 physical address                        |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     |  9. Set IA32_EFER.LME = 1 (Long Mode Enable via MSR)          |      |
|     |       |                                                         |      |
|     |       v                                                         |      |
|     | 10. Set CR0.PG = 1 (Enable Paging)                             |      |
|     |       |                                                         |      |
|     |       | -----> COMPATIBILITY MODE (LMA=1, still 32-bit code)  |      |
|     |       v                                                         |      |
|     | 11. Far jump to 64-bit code segment -----> LONG MODE          |      |
|     |                                                                 |      |
|     +----------------------------------------------------------------+      |
|                                                                              |
+-----------------------------------------------------------------------------+
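Steps 7 through 11 can be sketched in NASM as follows. This is a minimal sketch, not a definitive implementation. It assumes the page tables already exist with the PML4 at physical 0x1000, that interrupts are disabled, and that the GDT loaded in step 2 also carries a 64-bit code descriptor at selector 0x18. The labels `enter_long_mode`, `long_mode_entry`, `gdt` and `gdt_ptr`, and the selector layout, are illustrative choices rather than anything the architecture requires.

```nasm
bits 32
enter_long_mode:
    mov eax, cr4
    or  eax, 1 << 5             ; step 7: CR4.PAE = 1
    mov cr4, eax

    mov eax, 0x1000             ; step 8: CR3 = physical address of PML4
    mov cr3, eax                ;         (assumed location)

    mov ecx, 0xC0000080         ; step 9: select IA32_EFER
    rdmsr                       ;         EDX:EAX = current value
    or  eax, 1 << 8             ;         LME = 1
    wrmsr

    mov eax, cr0
    or  eax, 1 << 31            ; step 10: CR0.PG = 1 (PE is already set)
    mov cr0, eax                ; CPU is now in compatibility mode

    jmp 0x18:long_mode_entry    ; step 11: load CS with an L=1 descriptor

bits 64
long_mode_entry:
    mov ax, 0x10                ; flat data selector
    mov ds, ax
    mov es, ax
    mov ss, ax
.hang:
    hlt
    jmp .hang

; Assumed GDT layout: 0x08 and 0x10 serve Protected Mode, 0x18 serves Long Mode.
align 8
gdt:
    dq 0                        ; 0x00: null descriptor
    dq 0x00CF9A000000FFFF       ; 0x08: 32-bit code, base 0, limit 4GB
    dq 0x00CF92000000FFFF       ; 0x10: data, base 0, limit 4GB
    dq 0x00AF9A000000FFFF       ; 0x18: 64-bit code (L=1, D=0)
gdt_ptr:
    dw gdt_ptr - gdt - 1        ; limit = size of table - 1
    dd gdt                      ; linear base address of the table
```

The far jump is what completes the switch: setting CR0.PG only activates Long Mode, and the CPU keeps decoding 32-bit instructions until CS is reloaded with a descriptor whose L bit is set.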

Why Long Mode Requires Paging

In 32-bit Protected Mode, paging is optional - you can use segmentation alone for memory management. Long Mode absolutely requires paging because:

  1. Segmentation is essentially disabled: In Long Mode, segment base and limit are ignored (except for FS/GS). The CPU only uses paging for address translation.

  2. Legacy segment descriptors cannot span the address space: A descriptor holds only a 32-bit base and a 20-bit limit, so it cannot describe regions of a 64-bit address space (16 exabytes theoretical). Paging provides the translation and protection instead.

  3. Canonical addresses: With 4-level paging only 48 virtual address bits are translated (four 9-bit indices plus a 12-bit offset), so Long Mode enforces “canonical” addresses in which bits 63:48 are copies of bit 47. Accessing any other address faults.

  4. Architectural design: AMD designed AMD64 to simplify memory management by making paging mandatory, removing the complexity of segment:offset addressing.
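The canonical-address rule in point 3 can be checked with two shifts. The routine below is a small sketch that assumes 4-level paging (48 translated bits) and 64-bit code; the name `is_canonical` and the choice of RDI as the input register are hypothetical.

```nasm
bits 64
; is_canonical: RDI = virtual address to test.
; Returns ZF = 1 if bits 63:48 are copies of bit 47, ZF = 0 otherwise.
; Clobbers RAX.
is_canonical:
    mov rax, rdi
    shl rax, 16                 ; push bits 63:48 out of the register
    sar rax, 16                 ; arithmetic shift copies bit 47 back into 63:48
    cmp rax, rdi                ; unchanged only if the address was canonical
    ret
```

For example, 0x00007FFFFFFFFFFF and 0xFFFF800000000000 pass the check, while 0x0000800000000000 does not, because bit 47 is set but bits 63:48 are clear.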

+-----------------------------------------------------------------------------+
|                   WHY PAGING IS MANDATORY IN LONG MODE                       |
+-----------------------------------------------------------------------------+
|                                                                              |
|   PROTECTED MODE (32-bit):                                                   |
|   +------------------+          +------------------+                         |
|   |  Virtual Addr    |          | Physical Addr    |                         |
|   |  (Segment:Offset)|   OR     | (Direct if no    |                         |
|   +--------+---------+          |  paging enabled) |                         |
|            |                    +------------------+                         |
|            v                                                                 |
|   +------------------+                                                       |
|   | Segment Desc     |   Paging Optional!                                   |
|   | (Base + Limit)   |   Can use segmentation alone                         |
|   +------------------+                                                       |
|                                                                              |
|   LONG MODE (64-bit):                                                        |
|   +------------------+                                                       |
|   |  Virtual Addr    |   Segmentation essentially DISABLED                  |
|   |  (64-bit linear) |   (CS/DS/SS/ES bases forced to 0)                   |
|   +--------+---------+                                                       |
|            |                                                                 |
|            v         Paging MANDATORY!                                      |
|   +------------------+                                                       |
|   | 4-Level Page     |   No other way to translate                          |
|   | Tables           |   64-bit addresses                                   |
|   +--------+---------+                                                       |
|            |                                                                 |
|            v                                                                 |
|   +------------------+                                                       |
|   | Physical Addr    |                                                       |
|   | (up to 52 bits)  |                                                       |
|   +------------------+                                                       |
|                                                                              |
+-----------------------------------------------------------------------------+

4-Level Page Tables (PML4 Architecture)

Long Mode uses 4 levels of page tables. Each level is indexed by 9 bits of the virtual address, and the remaining low 12 bits are the offset within a 4KB page (4 * 9 + 12 = 48 translated bits).

+-----------------------------------------------------------------------------+
|                    4-LEVEL PAGING ADDRESS TRANSLATION                        |
+-----------------------------------------------------------------------------+
|                                                                              |
|   64-bit Virtual Address Layout:                                             |
|   +--------+--------+--------+--------+--------+--------+--------+--------+ |
|   |63    56|55    48|47    39|38    30|29    21|20    12|11     0|         | |
|   +--------+--------+--------+--------+--------+--------+--------+         | |
|   | Sign   | Sign   | PML4   | PDPT   |  PDT   |  PT    | Page   |         | |
|   | Extend | Extend | Index  | Index  | Index  | Index  | Offset |         | |
|   | (copy  | (copy  | 9 bits | 9 bits | 9 bits | 9 bits | 12 bits|         | |
|   | bit 47)| bit 47)|        |        |        |        |        |         | |
|   +--------+--------+--------+--------+--------+--------+--------+         | |
|                                                                              |
|   Each table has 512 entries (2^9 = 512)                                    |
|   Each entry is 8 bytes (64 bits)                                           |
|   Each table is 4KB (512 * 8 = 4096 bytes)                                  |
|                                                                              |
|   TRANSLATION WALK:                                                          |
|                                                                              |
|   CR3 Register                                                               |
|   +----------------+                                                         |
|   | PML4 Base Addr |                                                         |
|   +-------+--------+                                                         |
|           |                                                                  |
|           v                                                                  |
|   +-------+--------+  PML4 (Page Map Level 4)                               |
|   | Entry 0        |                                                         |
|   | Entry 1        |  <-- Bits 47:39 select entry                           |
|   | ...            |                                                         |
|   | Entry 511      |                                                         |
|   +-------+--------+                                                         |
|           |                                                                  |
|           | Contains PDPT physical address                                   |
|           v                                                                  |
|   +-------+--------+  PDPT (Page Directory Pointer Table)                   |
|   | Entry 0        |                                                         |
|   | Entry 1        |  <-- Bits 38:30 select entry                           |
|   | ...            |                                                         |
|   | Entry 511      |                                                         |
|   +-------+--------+                                                         |
|           |                                                                  |
|           | Contains PDT physical address (or 1GB page if PS=1)             |
|           v                                                                  |
|   +-------+--------+  PDT (Page Directory Table)                            |
|   | Entry 0        |                                                         |
|   | Entry 1        |  <-- Bits 29:21 select entry                           |
|   | ...            |                                                         |
|   | Entry 511      |                                                         |
|   +-------+--------+                                                         |
|           |                                                                  |
|           | Contains PT physical address (or 2MB page if PS=1)              |
|           v                                                                  |
|   +-------+--------+  PT (Page Table) - OPTIONAL with huge pages            |
|   | Entry 0        |                                                         |
|   | Entry 1        |  <-- Bits 20:12 select entry                           |
|   | ...            |                                                         |
|   | Entry 511      |                                                         |
|   +-------+--------+                                                         |
|           |                                                                  |
|           | Contains 4KB page physical address                               |
|           v                                                                  |
|   +----------------+                                                         |
|   | Physical Page  |  + Bits 11:0 (page offset) = Physical Address          |
|   +----------------+                                                         |
|                                                                              |
+-----------------------------------------------------------------------------+
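The bit slicing in the walk above can be expressed as assemble-time macros. This is an illustrative sketch: the macro names are hypothetical, and the `dw` lines exist only so the computed values show up in a listing file.

```nasm
; Split a virtual address into its four table indices and page offset.
%define PML4_INDEX(va)  (((va) >> 39) & 0x1FF)   ; bits 47:39
%define PDPT_INDEX(va)  (((va) >> 30) & 0x1FF)   ; bits 38:30
%define PDT_INDEX(va)   (((va) >> 21) & 0x1FF)   ; bits 29:21
%define PT_INDEX(va)    (((va) >> 12) & 0x1FF)   ; bits 20:12
%define PAGE_OFFSET(va) ((va) & 0xFFF)           ; bits 11:0

; Worked example 1: the VGA text buffer at 0xB8000.
VGA_TEXT equ 0xB8000
    dw PML4_INDEX(VGA_TEXT)     ; 0
    dw PDPT_INDEX(VGA_TEXT)     ; 0
    dw PDT_INDEX(VGA_TEXT)      ; 0
    dw PT_INDEX(VGA_TEXT)       ; 0xB8 (entry 184)
    dw PAGE_OFFSET(VGA_TEXT)    ; 0

; Worked example 2: an address chosen so three indices are non-zero.
SAMPLE equ 0x40201000
    dw PML4_INDEX(SAMPLE)       ; 0
    dw PDPT_INDEX(SAMPLE)       ; 1
    dw PDT_INDEX(SAMPLE)        ; 1
    dw PT_INDEX(SAMPLE)         ; 1
    dw PAGE_OFFSET(SAMPLE)      ; 0
```

The first example shows why a bootloader that only touches low memory needs a single entry in each of the PML4, PDPT and PDT: every address below 2MB has index 0 at all three upper levels.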

Page Table Entry Format

Each page table entry is 64 bits with specific bit assignments:

+-----------------------------------------------------------------------------+
|                        PAGE TABLE ENTRY FORMAT (64-bit)                      |
+-----------------------------------------------------------------------------+
|                                                                              |
|   Bit  | Name      | Description                                            |
|   -----|-----------|--------------------------------------------------------|
|   0    | P         | Present (1=page is in memory, 0=not present/fault)    |
|   1    | R/W       | Read/Write (1=writable, 0=read-only)                   |
|   2    | U/S       | User/Supervisor (1=user accessible, 0=supervisor only) |
|   3    | PWT       | Page-level Write-Through                               |
|   4    | PCD       | Page-level Cache Disable                               |
|   5    | A         | Accessed (set by CPU when page is accessed)            |
|   6    | D         | Dirty (set by CPU on write) - leaf entries only        |
|   7    | PS        | Page Size (1=huge page: 2MB in PDT, 1GB in PDPT)      |
|   8    | G         | Global (TLB not flushed on CR3 write if set)          |
|   9-11 | AVL       | Available for software use                             |
|   12-51| ADDR      | Physical address of next level (or page if PS=1)      |
|        |           | Bits 12-51 = physical page frame number               |
|   52-62| AVL       | Available for software use                             |
|   63   | XD/NX     | Execute Disable (1=no execute, requires EFER.NXE=1)   |
|                                                                              |
|   ENTRY LAYOUT:                                                              |
|   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|   |63|62      52|51                    12|11  9| 8| 7| 6| 5| 4| 3| 2| 1| 0| |
|   +--+----------+-----------------------+-----+--+--+--+--+--+--+--+--+--+ |
|   |NX|  AVL     |  Physical Address     | AVL |G |PS|D |A |CD|WT|US|RW|P | |
|   +--+----------+-----------------------+-----+--+--+--+--+--+--+--+--+--+ |
|                                                                              |
|   MINIMAL ENTRY for identity mapping:                                        |
|   Present=1, R/W=1, U/S=0 --> 0x03 (or 0x83 for huge page with PS=1)       |
|                                                                              |
+-----------------------------------------------------------------------------+
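Named constants make entries easier to read than raw hex. The sketch below uses illustrative names (`PTE_PRESENT` and so on are not defined by NASM or by the architecture manuals); the bit positions are taken from the table above.

```nasm
PTE_PRESENT   equ 1 << 0
PTE_WRITABLE  equ 1 << 1
PTE_USER      equ 1 << 2
PTE_PWT       equ 1 << 3
PTE_PCD       equ 1 << 4
PTE_HUGE      equ 1 << 7                ; PS: 2MB in PDT, 1GB in PDPT
PTE_GLOBAL    equ 1 << 8
PTE_NX        equ 1 << 63               ; only valid when EFER.NXE = 1
PTE_ADDR_MASK equ 0x000FFFFFFFFFF000    ; bits 51:12

; Entry pointing at a next-level table at physical 0x2000:
    dq 0x2000 | PTE_PRESENT | PTE_WRITABLE          ; = 0x2003

; Entry mapping a 2MB huge page at physical 0 (placed in the PDT):
    dq 0x0 | PTE_PRESENT | PTE_WRITABLE | PTE_HUGE  ; = 0x83
```

Because every table and page is at least 4KB-aligned, the low 12 bits of the address are always zero, which is why the flags can simply be OR-ed into the same 64-bit value.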

Physical Address Extension (PAE) and CR4

Before Long Mode can work, you must enable PAE (Physical Address Extension) in CR4. PAE was introduced in the Pentium Pro to allow 32-bit systems to access more than 4GB of physical memory.

+-----------------------------------------------------------------------------+
|                          CR4 REGISTER (Control Register 4)                   |
+-----------------------------------------------------------------------------+
|                                                                              |
|   CR4 Controls CPU features. Key bits for Long Mode:                         |
|                                                                              |
|   Bit | Name  | Description                                                 |
|   ----|-------|-------------------------------------------------------------|
|   0   | VME   | Virtual-8086 Mode Extensions                                |
|   1   | PVI   | Protected-Mode Virtual Interrupts                           |
|   2   | TSD   | Time Stamp Disable (RDTSC in ring 0 only)                  |
|   3   | DE    | Debugging Extensions                                        |
|   4   | PSE   | Page Size Extensions (4MB pages in 32-bit mode)            |
|   5   | PAE   | Physical Address Extension  <<< REQUIRED FOR LONG MODE >>> |
|   6   | MCE   | Machine-Check Enable                                        |
|   7   | PGE   | Page Global Enable                                          |
|   ...                                                                        |
|                                                                              |
|   PAE (Bit 5):                                                               |
|   - When PAE=0: 32-bit paging (2-level), 4GB physical max                   |
|   - When PAE=1: PAE paging (3-level in 32-bit, 4-level in 64-bit)          |
|   - MUST be set to 1 BEFORE enabling Long Mode                              |
|                                                                              |
|   SETTING PAE:                                                               |
|   mov eax, cr4                                                               |
|   or eax, (1 << 5)    ; Set PAE bit                                         |
|   mov cr4, eax                                                               |
|                                                                              |
+-----------------------------------------------------------------------------+

IA32_EFER MSR (Extended Feature Enable Register)

IA32_EFER is a Model-Specific Register (MSR) that controls Long Mode. MSRs are not memory-mapped: you select one by placing its address in ECX and access it with the privileged RDMSR/WRMSR instructions.

+-----------------------------------------------------------------------------+
|                    IA32_EFER MSR (Model-Specific Register)                   |
+-----------------------------------------------------------------------------+
|                                                                              |
|   MSR Address: 0xC0000080                                                    |
|                                                                              |
|   Bit | Name  | Description                                                 |
|   ----|-------|-------------------------------------------------------------|
|   0   | SCE   | System Call Extensions (SYSCALL/SYSRET enable)             |
|   1-7 | -     | Reserved                                                    |
|   8   | LME   | Long Mode Enable  <<< SET THIS TO ENABLE LONG MODE >>>     |
|   9   | -     | Reserved                                                    |
|   10  | LMA   | Long Mode Active (READ-ONLY, set by CPU when active)       |
|   11  | NXE   | No-Execute Enable (enables XD bit in page tables)          |
|   12  | SVME  | Secure Virtual Machine Enable (AMD-V)                      |
|   13  | LMSLE | Long Mode Segment Limit Enable                              |
|   14  | FFXSR | Fast FXSAVE/FXRSTOR                                        |
|   15  | TCE   | Translation Cache Extension                                 |
|   ...                                                                        |
|                                                                              |
|   LME (Bit 8) - Long Mode Enable:                                           |
|   - Set this bit to enable Long Mode                                        |
|   - Does NOT activate Long Mode immediately                                 |
|   - Long Mode activates when paging is enabled (CR0.PG=1)                  |
|   - After paging enables, LMA (bit 10) becomes 1 (CPU sets it)             |
|                                                                              |
|   LMA (Bit 10) - Long Mode Active:                                          |
|   - READ-ONLY bit                                                           |
|   - CPU sets this when Long Mode is actually active                         |
|   - Use this to verify Long Mode is working                                 |
|                                                                              |
|   READING/WRITING MSRs:                                                      |
|   mov ecx, 0xC0000080   ; MSR address                                       |
|   rdmsr                  ; Read: result in EDX:EAX                          |
|   or eax, (1 << 8)      ; Set LME bit                                       |
|   wrmsr                  ; Write: value from EDX:EAX                        |
|                                                                              |
+-----------------------------------------------------------------------------+
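Two checks naturally bracket the EFER write: confirming beforehand that the CPU implements Long Mode, and confirming afterwards that LMA was set. The sketch below assumes 32-bit code, that the CPUID instruction itself is available, and that halting is an acceptable failure response; the labels `check_long_mode` and `verify_lma` are hypothetical.

```nasm
bits 32
; Call before setting LME: halts if the CPU has no Long Mode.
check_long_mode:
    mov eax, 0x80000000         ; query highest extended CPUID leaf
    cpuid
    cmp eax, 0x80000001
    jb  .unsupported            ; feature leaf does not exist
    mov eax, 0x80000001         ; extended feature flags
    cpuid
    test edx, 1 << 29           ; EDX bit 29 = Long Mode available
    jz  .unsupported
    ret
.unsupported:
    hlt
    jmp .unsupported

; Call after CR0.PG has been set: halts if LMA is still 0.
verify_lma:
    mov ecx, 0xC0000080         ; IA32_EFER
    rdmsr                       ; EDX:EAX = EFER
    test eax, 1 << 10           ; LMA is set by the CPU, never by software
    jz  .inactive
    ret
.inactive:
    hlt
    jmp .inactive
```

`verify_lma` works from compatibility mode, so it can run between enabling paging and the far jump into 64-bit code.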

Identity Mapping for Bootloaders

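During the mode transition, your code is executing at a specific physical address. The moment you enable paging, the CPU translates every address, including the one for the next instruction fetch. If your page tables don’t map the code that is currently running, that fetch page-faults; with no valid IDT installed, the fault escalates to a double fault and then a triple fault, which resets the CPU.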

Identity mapping means virtual address = physical address for the memory region containing your bootloader.

+-----------------------------------------------------------------------------+
|                         IDENTITY MAPPING EXPLAINED                           |
+-----------------------------------------------------------------------------+
|                                                                              |
|   WITHOUT IDENTITY MAPPING:                                                  |
|   +-----------------+                         +-----------------+            |
|   | Your code at    |   Enable    TRIPLE     | Page tables map |            |
|   | physical 0x8000 |   paging    FAULT!     | 0x8000 -> ???   |            |
|   +-----------------+   ----->               +-----------------+            |
|                        CPU tries to fetch next instruction from             |
|                        virtual 0x8000, but it's not mapped!                 |
|                                                                              |
|   WITH IDENTITY MAPPING:                                                     |
|   +-----------------+                         +-----------------+            |
|   | Your code at    |   Enable               | Page tables map |            |
|   | physical 0x8000 |   paging    SUCCESS!   | 0x8000 -> 0x8000|            |
|   +-----------------+   ----->               +-----------------+            |
|                        Virtual 0x8000 = Physical 0x8000                     |
|                        Code continues executing normally                     |
|                                                                              |
|   MINIMAL IDENTITY MAP for bootloader (first 2MB):                          |
|                                                                              |
|   Virtual Address    Physical Address                                        |
|   0x0000_0000   -->  0x0000_0000  (IVT, BIOS data)                          |
|   0x0000_1000   -->  0x0000_1000  (Our page tables)                         |
|   0x0000_7C00   -->  0x0000_7C00  (Stage 1 bootloader)                      |
|   0x0000_8000   -->  0x0000_8000  (Stage 2 bootloader)                      |
|   0x000B_8000   -->  0x000B_8000  (VGA memory)                              |
|   ...           -->  ...                                                     |
|   0x001F_FFFF   -->  0x001F_FFFF  (End of first 2MB)                        |
|                                                                              |
|   With 2MB huge pages (PS=1 in PDT), you only need:                         |
|   - 1 PML4 entry pointing to PDPT                                           |
|   - 1 PDPT entry pointing to PDT                                            |
|   - 1 PDT entry with PS=1 covering 0x000000-0x1FFFFF                       |
|                                                                              |
+-----------------------------------------------------------------------------+
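If you want 4KB granularity instead of a single huge page, the same first 2MB can be identity-mapped through a page table. The sketch below assumes 32-bit Protected Mode with flat segments, a PDT at physical 0x3000, and a free 4KB-aligned page at 0x4000 for the PT; the address 0x4000 and the label names are assumptions made for illustration.

```nasm
bits 32
PDT_BASE equ 0x3000             ; assumed location of the PDT
PT_BASE  equ 0x4000             ; assumed free page for the PT

map_first_2mb_4k:
    ; PDT[0] -> PT, Present + R/W, PS clear so the CPU walks one more level.
    mov dword [PDT_BASE], PT_BASE | 0x03
    mov dword [PDT_BASE + 4], 0

    mov edi, PT_BASE            ; address of PT[0]
    mov eax, 0x00000003         ; frame 0x0000, Present + R/W
    mov ecx, 512                ; 512 entries * 4KB = 2MB
.fill:
    mov [edi], eax              ; low dword: frame address | flags
    mov dword [edi + 4], 0      ; high dword: zero for frames below 4GB
    add eax, 0x1000             ; next 4KB physical frame
    add edi, 8                  ; next 8-byte entry
    loop .fill
    ret
```

Entry N of the table ends up holding N * 0x1000 plus the flags, which is exactly the virtual = physical relationship listed in the diagram.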

Using 2MB Huge Pages (Simplified Setup)

For a bootloader, you don’t need 4KB granularity. Using 2MB huge pages simplifies the page table setup significantly:

+-----------------------------------------------------------------------------+
|                    2MB HUGE PAGE SETUP (RECOMMENDED)                         |
+-----------------------------------------------------------------------------+
|                                                                              |
|   With 2MB pages, you skip the PT level entirely:                           |
|                                                                              |
|   4KB PAGES (4 levels):       PML4 -> PDPT -> PDT -> PT -> 4KB Page          |
|   2MB HUGE PAGES (3 levels):  PML4 -> PDPT -> PDT -> 2MB Page                |
|                                                                              |
|   To use 2MB pages, set PS (Page Size) bit in PDT entry.                    |
|                                                                              |
|   MEMORY LAYOUT for page tables:                                             |
|   +------------------+                                                       |
|   | 0x1000: PML4     |  <-- CR3 points here                                 |
|   | (4096 bytes)     |  Entry 0 -> 0x2003 (PDPT at 0x2000, Present, R/W)   |
|   +------------------+                                                       |
|   | 0x2000: PDPT     |                                                       |
|   | (4096 bytes)     |  Entry 0 -> 0x3003 (PDT at 0x3000, Present, R/W)    |
|   +------------------+                                                       |
|   | 0x3000: PDT      |                                                       |
|   | (4096 bytes)     |  Entry 0 -> 0x0083 (2MB page at 0x0, PS=1, P=1, RW=1)|
|   +------------------+                                                       |
|                                                                              |
|   PDT Entry for 2MB page:                                                    |
|   +--+----------+--------------+------+---+----+--+--+--+--+--+--+--+--+--+  |
|   |63|62      52|51          21|20  13|12 |11 9| 8| 7| 6| 5| 4| 3| 2| 1| 0|  |
|   +--+----------+--------------+------+---+----+--+--+--+--+--+--+--+--+--+  |
|   |NX|  AVL     |Page Phys Addr| Resv |PAT|AVL |G |PS|D |A |CD|WT|US|RW|P |  |
|   |  |          |(2MB aligned) | (0)  |   |    |  |1!|  |  |  |  |  |  |  |  |
|   +--+----------+--------------+------+---+----+--+--+--+--+--+--+--+--+--+  |
|                                                                              |
|   PS=1 is critical! Without it, CPU expects another level of tables.        |
|                                                                              |
|   For identity mapping first 2MB:                                            |
|   PDT[0] = 0x0000000000000083                                                |
|            ^^             ^^                                                 |
|            Physical       PS=1, R/W=1, P=1                                  |
|            base 0                                                            |
|                                                                              |
+-----------------------------------------------------------------------------+
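The memory layout above translates into only a few instructions. This is a minimal sketch that assumes 32-bit Protected Mode, flat segments (ES base 0), and that physical 0x1000-0x3FFF is free for the three tables; the label `build_page_tables` is hypothetical.

```nasm
bits 32
build_page_tables:
    ; Zero PML4, PDPT and PDT: 3 tables * 4096 bytes starting at 0x1000.
    mov edi, 0x1000
    xor eax, eax
    mov ecx, (3 * 4096) / 4     ; number of dwords to clear
    cld
    rep stosd

    ; Only the low dword of each entry needs writing; the high dword
    ; was cleared above and stays zero for addresses below 4GB.
    mov dword [0x1000], 0x2003  ; PML4[0] -> PDPT at 0x2000, P + R/W
    mov dword [0x2000], 0x3003  ; PDPT[0] -> PDT at 0x3000, P + R/W
    mov dword [0x3000], 0x0083  ; PDT[0]  -> 2MB page at 0, PS + P + R/W
    ret
```

Zeroing first matters: any stale non-zero entry with bit 0 set would be treated as present and could send the page walk to an arbitrary address.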

The Complete Transition Code Flow

+-----------------------------------------------------------------------------+
|                    COMPLETE LONG MODE TRANSITION SEQUENCE                    |
+-----------------------------------------------------------------------------+
|                                                                              |
|   [16-bit Real Mode]                                                         |
|   |                                                                          |
|   v                                                                          |
|   1. CLI (disable interrupts)                                               |
|   |                                                                          |
|   v                                                                          |
|   2. Load 32-bit GDT (for Protected Mode)                                   |
|   |                                                                          |
|   v                                                                          |
|   3. Enable A20 gate                                                        |
|   |                                                                          |
|   v                                                                          |
|   4. mov eax, cr0                                                           |
|      or eax, 1        ; Set PE (Protection Enable)                          |
|      mov cr0, eax                                                           |
|   |                                                                          |
|   v                                                                          |
|   5. jmp 0x08:protected_mode   ; Far jump to 32-bit code segment            |
|                                                                              |
|   [32-bit Protected Mode]                                                    |
|   |                                                                          |
|   v                                                                          |
|   6. Reload segment registers (DS, ES, SS, FS, GS = 0x10)                   |
|   |                                                                          |
|   v                                                                          |
|   7. Set up page tables at known physical addresses (e.g., 0x1000)          |
|      - Zero out page table memory                                            |
|      - Set up PML4[0] -> PDPT                                               |
|      - Set up PDPT[0] -> PDT                                                |
|      - Set up PDT[0] -> 2MB identity-mapped page (with PS=1)                |
|   |                                                                          |
|   v                                                                          |
|   8. mov eax, cr4                                                           |
|      or eax, (1 << 5)    ; Set PAE                                          |
|      mov cr4, eax                                                           |
|   |                                                                          |
|   v                                                                          |
|   9. mov eax, 0x1000     ; PML4 physical address                            |
|      mov cr3, eax        ; Load page table base                             |
|   |                                                                          |
|   v                                                                          |
|   10. mov ecx, 0xC0000080  ; IA32_EFER MSR address                          |
|       rdmsr                 ; Read current value                             |
|       or eax, (1 << 8)     ; Set LME (Long Mode Enable)                     |
|       wrmsr                 ; Write back                                     |
|   |                                                                          |
|   v                                                                          |
|   11. mov eax, cr0                                                          |
|       or eax, (1 << 31)    ; Set PG (Paging Enable)                         |
|       mov cr0, eax                                                           |
|   |                                                                          |
|   | ---> CPU is now in COMPATIBILITY MODE (sub-mode of Long Mode)           |
|   |      (64-bit paging active, but still executing 32-bit code)            |
|   v                                                                          |
|   12. Load 64-bit GDT (with 64-bit code segment)                            |
|   |                                                                          |
|   v                                                                          |
|   13. jmp 0x08:long_mode_start  ; Far jump to 64-bit code segment           |
|                                                                              |
|   [64-bit Long Mode]                                                         |
|   |                                                                          |
|   v                                                                          |
|   14. mov ax, 0x10       ; Data selector (or 0x00, segments don't matter)   |
|       mov ds, ax                                                             |
|       mov es, ax                                                             |
|       mov ss, ax                                                             |
|   |                                                                          |
|   v                                                                          |
|   15. mov rsp, 0x90000   ; 64-bit stack pointer                             |
|   |                                                                          |
|   v                                                                          |
|   16. SUCCESS! Running in 64-bit mode                                       |
|       - All 64-bit registers available (RAX, RBX, ..., R8-R15)             |
|       - 64-bit address space                                                 |
|       - Write to VGA memory to confirm                                       |
|                                                                              |
+-----------------------------------------------------------------------------+

GDT for Long Mode

Long Mode keeps the 8-byte descriptor layout but interprets the code segment's flags differently: the L bit marks a 64-bit code segment, and base and limit are ignored:

+-----------------------------------------------------------------------------+
|                        GDT FOR LONG MODE TRANSITION                          |
+-----------------------------------------------------------------------------+
|                                                                              |
|   This guide uses TWO GDTs (one GDT holding both code segments also works):  |
|   1. 32-bit GDT for Protected Mode (same as Project 3)                      |
|   2. 64-bit GDT for Long Mode (different code segment format)               |
|                                                                              |
|   64-bit CODE SEGMENT DESCRIPTOR:                                            |
|   +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|   | Byte 7        | Byte 6        | Byte 5        | Byte 4                 | |
|   | Base 31:24    | Flags|Limit   | Access        | Base 23:16             | |
|   |               | 19:16        |               |                         | |
|   +---------------+---------------+---------------+-------------------------+ |
|   | Bytes 3-2: Base 15:0         | Bytes 1-0: Limit 15:0                   | |
|   +------------------------------+------------------------------------------+ |
|                                                                              |
|   For 64-bit code segment:                                                   |
|   - Base = 0 (ignored in Long Mode anyway)                                  |
|   - Limit = 0 (ignored in Long Mode)                                        |
|   - Access byte = 0x9A (Present, Ring 0, Code, Executable, Readable)        |
|   - Flags = 0x20 (L=1 for 64-bit, D=0 required when L=1)                   |
|                                                                              |
|   64-bit Access Byte (0x9A):                                                 |
|   +---+---+---+---+---+---+---+---+                                          |
|   | P |  DPL  | S | E | DC| RW| A |                                          |
|   | 1 | 0   0 | 1 | 1 | 0 | 1 | 0 |  = 0x9A                                 |
|   +---+---+---+---+---+---+---+---+                                          |
|                                                                              |
|   64-bit Flags (high nibble of byte 6):                                      |
|   +---+---+---+---+                                                          |
|   | G | D | L |AVL|                                                          |
|   | 0 | 0 | 1 | 0 |  = 0x2 (upper nibble of 0x20)                           |
|   +---+---+---+---+                                                          |
|   L=1: 64-bit code segment (REQUIRED for Long Mode)                         |
|   D=0: MUST be 0 when L=1 (Intel SDM requirement)                           |
|                                                                              |
|   COMPLETE 64-bit GDT:                                                       |
|   gdt64:                                                                     |
|       dq 0                        ; Null descriptor (offset 0x00)           |
|       dq 0x00209A0000000000       ; 64-bit code (offset 0x08)              |
|       dq 0x0000920000000000       ; 64-bit data (offset 0x10)              |
|                                                                              |
|   Code segment raw bytes:                                                    |
|   00 00 00 00 00 9A 20 00                                                    |
|   ^^^^^^^^^^^^^^ ^^  ^^                                                      |
|   Base/Limit=0   Access  Flags (L=1, D=0)                                   |
|                                                                              |
+-----------------------------------------------------------------------------+
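
To make the magic numbers above easier to audit, the descriptors can be composed from their fields at assembly time. This is a sketch under stated assumptions: the constant names, the `gdt64_code`/`gdt64_data` labels, and the `CODE_SEL`/`DATA_SEL` symbols are illustrative and not part of the project specification. The field values are the ones from the diagram.

```nasm
; Field values from the diagram above; the names are illustrative
ACCESS_CODE64 equ 0x9A          ; P=1, DPL=0, S=1, E=1, DC=0, RW=1, A=0
ACCESS_DATA   equ 0x92          ; P=1, DPL=0, S=1, E=0, writable
FLAGS_LONG    equ 0x20          ; byte 6: G=0, D=0, L=1, AVL=0, limit 19:16 = 0

gdt64:
    dq 0                                            ; null descriptor
gdt64_code:
    dq (ACCESS_CODE64 << 40) | (FLAGS_LONG << 48)   ; = 0x00209A0000000000
gdt64_data:
    dq (ACCESS_DATA << 40)                          ; = 0x0000920000000000
gdt64_end:

; Selectors are byte offsets into the table (RPL=0, TI=0)
CODE_SEL equ gdt64_code - gdt64                     ; 0x08
DATA_SEL equ gdt64_data - gdt64                     ; 0x10
```

The access byte occupies bits 47:40 of the quadword and the flags/limit byte occupies bits 55:48, which is why the shifts are 40 and 48.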

2.2 Why This Matters

This project matters because:

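  1. Every 64-bit x86 boot path performs this transition: Linux, Windows, macOS (on Intel), FreeBSD all depend on the CPU traversing Real -> Protected -> Long Mode, whether the OS's own boot code does it or UEFI firmware does it before handing over control. Understanding this is foundational.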

  2. You’ll understand modern CPU architecture: The interplay between CR0, CR3, CR4, and EFER reveals how modern x86 CPUs manage modes, memory, and features.

  3. Paging fundamentals become concrete: Rather than reading about page tables abstractly, you’ll build them byte by byte and watch the CPU use them.

  4. Interview differentiator: Very few candidates can explain the Long Mode transition at this depth. This knowledge sets you apart.

  5. Foundation for kernel development: If you want to write an OS kernel, you MUST master this transition.

  6. Security implications: Understanding page tables is crucial for understanding kernel exploits, virtual memory isolation, and hypervisor design.

2.3 Historical Context

The journey through CPU modes reflects 40+ years of x86 evolution:

  • 1978 (8086): Real Mode only, 20-bit addresses (1MB), no protection
  • 1982 (80286): Protected Mode introduced, but awkward to switch back to Real Mode
  • 1985 (80386): 32-bit Protected Mode, paging introduced (optional), easy mode switching
  • 1995 (Pentium Pro): PAE introduced for >4GB physical memory
  • 2003 (AMD Opteron): AMD64/x86-64 introduces Long Mode, making paging mandatory
  • 2004 (Intel EM64T): Intel adopts AMD’s 64-bit extensions

Every 64-bit x86 system today still powers on in Real Mode (for backward compatibility), then transitions through Protected Mode to reach Long Mode.

2.4 Common Misconceptions

Misconception 1: “Long Mode is just 64-bit Protected Mode”
Reality: Long Mode is architecturally distinct. Segmentation is essentially disabled, paging is mandatory, and the GDT format differs. You can’t just set a 64-bit flag in Protected Mode.

Misconception 2: “You can skip Protected Mode and go directly to Long Mode”
Reality: The CPU must be in Protected Mode to enable Long Mode. The sequence is: Set LME in EFER (while in Protected Mode) -> Enable paging -> CPU enters Compatibility Mode -> Far jump to 64-bit segment -> True Long Mode.

Misconception 3: “Page tables are the same in 32-bit PAE and 64-bit Long Mode”
Reality: While similar, Long Mode uses 4 levels (PML4, PDPT, PDT, PT) while 32-bit PAE uses 3 levels (PDPT, PDT, PT). The entry formats are also slightly different.

Misconception 4: “You need 4KB pages for the transition”
Reality: 2MB huge pages work perfectly and are simpler. For a bootloader, you don’t need fine-grained memory protection.

Misconception 5: “Setting CR0.PG immediately enters Long Mode”
Reality: Setting CR0.PG with LME=1 enters Compatibility Mode (a sub-mode of Long Mode that runs 32-bit code). You must far jump to a 64-bit code segment to enter true Long Mode.


3. Project Specification

3.1 What You Will Build

A bootloader that:

  1. Starts in 16-bit Real Mode (loaded at 0x7C00 by BIOS)
  2. Transitions to 32-bit Protected Mode (same as Project 3)
  3. Builds 4-level page tables for identity mapping
  4. Enables PAE, Long Mode, and paging
  5. Transitions to true 64-bit Long Mode
  6. Demonstrates 64-bit capabilities (using RAX-R15 registers)
  7. Outputs status messages at each stage via VGA memory

3.2 Functional Requirements

Requirement Description
FR-1 Boot from MBR (512 bytes) or use two-stage loading for larger code
FR-2 Display “[16-bit Real Mode]” status using INT 10h
FR-3 Successfully transition to 32-bit Protected Mode
FR-4 Display “[32-bit Protected Mode]” via VGA memory
FR-5 Build 4-level page tables (PML4, PDPT, PDT) at known addresses
FR-6 Use 2MB huge pages for identity mapping (at minimum, first 2MB)
FR-7 Enable PAE in CR4
FR-8 Load PML4 address into CR3
FR-9 Enable LME in IA32_EFER MSR using RDMSR/WRMSR
FR-10 Enable paging in CR0 (entering Compatibility Mode)
FR-11 Load 64-bit GDT and far jump to 64-bit code segment
FR-12 Display “[64-bit Long Mode]” via VGA memory
FR-13 Demonstrate 64-bit register usage (store/display 64-bit value)
FR-14 Halt cleanly after displaying success
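
FR-4 and FR-12 call for status text written directly to VGA text memory, because BIOS INT 10h services are not usable once the CPU has left Real Mode. The routine below is a minimal 32-bit sketch of that idea. The label `print_pm`, the register convention, and the white-on-black attribute are assumptions for illustration. It relies on two facts about text mode: each cell is two bytes (character, then attribute), and an 80-column row is therefore 160 bytes.

```nasm
VGA_TEXT equ 0xB8000            ; start of VGA text memory

[BITS 32]
; In: esi = address of a zero-terminated string
;     edi = screen row (0-24)            (hypothetical calling convention)
print_pm:
    imul edi, edi, 160          ; row * 80 columns * 2 bytes per cell
    add edi, VGA_TEXT
    mov ah, 0x0F                ; attribute: white on black
    cld                         ; make LODSB/STOSW move forward
.next_char:
    lodsb                       ; al = next character from [esi]
    test al, al
    jz .done                    ; stop at the terminating zero
    stosw                       ; store character (al) + attribute (ah) at [edi]
    jmp .next_char
.done:
    ret
```

The same logic carries over to 64-bit code by assembling under `[BITS 64]` and using RSI/RDI, since the first 2MB, including 0xB8000, is identity mapped.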

3.3 Non-Functional Requirements

Requirement Description
NFR-1 Must work in QEMU (qemu-system-x86_64)
NFR-2 Should work in Bochs with x86-64 support
NFR-3 Code must be well-commented for educational purposes
NFR-4 Page tables must be 4KB aligned (required by architecture)
NFR-5 Each mode transition should have visible confirmation

3.4 Example Usage / Output

# Assemble the bootloader (assuming two-stage for size)
$ nasm -f bin stage1.asm -o stage1.bin
$ nasm -f bin stage2.asm -o stage2.bin
$ cat stage1.bin stage2.bin > longmode.bin

# Create a blank 1MB disk image, then write the bootloader at its start
$ dd if=/dev/zero of=disk.img bs=1M count=1
$ dd if=longmode.bin of=disk.img bs=512 conv=notrunc

# Run in QEMU
$ qemu-system-x86_64 -drive format=raw,file=disk.img -monitor stdio

# Verify with QEMU monitor
(qemu) info registers
# Should show RIP (not EIP), 64-bit register values

3.5 Real World Outcome

When you run the bootloader in QEMU, you’ll see on screen:

[16-bit Real Mode] Starting...
[16-bit Real Mode] Entering Protected Mode...
[32-bit Protected Mode] Setting up paging...
[32-bit Protected Mode] PML4 at 0x1000
[32-bit Protected Mode] Enabling PAE and Long Mode...
[64-bit Long Mode] Welcome to 64-bit mode!
[64-bit Long Mode] RAX = 0x123456789ABCDEF0 (full 64-bit register!)

The final line proves you’re in true Long Mode - only 64-bit mode can store and display a full 64-bit value in a single register. You’ve completed the same journey that every 64-bit operating system performs during boot.
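
Displaying RAX as hex needs a small conversion loop. The following is one possible sketch, not the required implementation: the label `print_hex64` and its register convention are assumptions, and it writes straight to VGA text cells without any bounds checking.

```nasm
[BITS 64]
; In:  rax = value to print
;      rdi = address of the first VGA cell to write   (hypothetical convention)
; Out: rax unchanged (16 rotations of 4 bits), rdi advanced by 32 bytes
; Clobbers: rcx, rdx
print_hex64:
    mov ecx, 16                 ; a 64-bit value has 16 hex digits
.next_digit:
    rol rax, 4                  ; rotate the highest remaining nibble into bits 3:0
    mov edx, eax
    and edx, 0x0F               ; isolate that nibble
    add edx, '0'                ; 0-9 -> '0'-'9'
    cmp edx, '9'
    jbe .store
    add edx, 7                  ; 10-15 -> 'A'-'F' ('A' - '9' - 1 = 7)
.store:
    mov dh, 0x0F                ; attribute: white on black
    mov [rdi], dx               ; character in the low byte, attribute in the high byte
    add rdi, 2                  ; next screen cell
    dec ecx
    jnz .next_digit
    ret
```

To use it, load RAX with the value and RDI with a cell address such as 0xB8000 + 6*160 (the start of the seventh row), then `call print_hex64`. The routine needs a valid stack, so call it after RSP has been set.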


4. Solution Architecture

4.1 High-Level Design

+-----------------------------------------------------------------------------+
|                    LONG MODE BOOTLOADER ARCHITECTURE                         |
+-----------------------------------------------------------------------------+
|                                                                              |
|   MEMORY LAYOUT:                                                             |
|   +------------------------------------------------------------------------+ |
|   | 0x0000 - 0x04FF  | IVT + BIOS Data Area (DO NOT TOUCH)                 | |
|   +------------------+-----------------------------------------------------+ |
|   | 0x0500 - 0x0FFF  | Free (we use 0x1000-0x3FFF for page tables)        | |
|   +------------------+-----------------------------------------------------+ |
|   | 0x1000 - 0x1FFF  | PML4 Table (4KB, 512 entries)                       | |
|   +------------------+-----------------------------------------------------+ |
|   | 0x2000 - 0x2FFF  | PDPT Table (4KB, 512 entries)                       | |
|   +------------------+-----------------------------------------------------+ |
|   | 0x3000 - 0x3FFF  | PDT Table (4KB, 512 entries)                        | |
|   +------------------+-----------------------------------------------------+ |
|   | 0x7C00 - 0x7DFF  | Stage 1 Bootloader (512 bytes, MBR)                 | |
|   +------------------+-----------------------------------------------------+ |
|   | 0x7E00 - 0x8FFF  | Stage 2 Bootloader (if using two-stage)             | |
|   +------------------+-----------------------------------------------------+ |
|   | 0x9000 - 0x9FFF  | GDT (32-bit and 64-bit)                             | |
|   +------------------+-----------------------------------------------------+ |
|   | 0xA000 - 0x8FFFF | Stack area (grows down from 0x90000)                | |
|   +------------------+-----------------------------------------------------+ |
|   | 0xB8000          | VGA Text Memory (for output)                        | |
|   +------------------------------------------------------------------------+ |
|                                                                              |
|   PAGE TABLE STRUCTURE (for 2MB identity mapping):                           |
|                                                                              |
|   CR3 = 0x1000                                                               |
|        |                                                                     |
|        v                                                                     |
|   +------------+  PML4 at 0x1000                                            |
|   | Entry 0    |---> 0x2003 (Present, R/W, points to 0x2000)               |
|   | Entry 1-511| = 0 (not present)                                          |
|   +------------+                                                             |
|        |                                                                     |
|        v                                                                     |
|   +------------+  PDPT at 0x2000                                            |
|   | Entry 0    |---> 0x3003 (Present, R/W, points to 0x3000)               |
|   | Entry 1-511| = 0 (not present)                                          |
|   +------------+                                                             |
|        |                                                                     |
|        v                                                                     |
|   +------------+  PDT at 0x3000                                             |
|   | Entry 0    |---> 0x0083 (Present, R/W, PS=1, 2MB page at 0x0)          |
|   | Entry 1-511| = 0 (not present)                                          |
|   +------------+                                                             |
|                                                                              |
|   This maps virtual 0x000000-0x1FFFFF to physical 0x000000-0x1FFFFF        |
|   (first 2MB identity mapped with a single 2MB huge page)                   |
|                                                                              |
+-----------------------------------------------------------------------------+

4.2 Key Components

Component Location Purpose
Stage 1 0x7C00 MBR, loads Stage 2, minimal setup
Stage 2 0x7E00+ Mode transitions, page table setup
PML4 0x1000 Level 4 page table (top level)
PDPT 0x2000 Level 3 page table
PDT 0x3000 Level 2 page table (with 2MB pages)
GDT32 Stage 2 data GDT for Protected Mode
GDT64 Stage 2 data GDT for Long Mode
Stack 0x90000 down Stack for all modes
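
The fixed addresses in this table can be captured once as preprocessor constants, which also allows an assembly-time check of the 4KB alignment rule (NFR-4). This is a sketch: the macro names and the `link_tables` label are illustrative assumptions, while the addresses and the 0x03 flag value (Present, R/W) come from the layout above.

```nasm
%define PML4_ADDR 0x1000
%define PDPT_ADDR 0x2000
%define PDT_ADDR  0x3000
%define STACK_TOP 0x90000

; Refuse to assemble if any table address has low 12 bits set
%if (PML4_ADDR | PDPT_ADDR | PDT_ADDR) & 0xFFF
    %error "page tables must be 4KB aligned"
%endif

[BITS 32]
link_tables:
    mov dword [PML4_ADDR], PDPT_ADDR | 0x03      ; PML4[0] -> PDPT (Present, R/W)
    mov dword [PML4_ADDR + 4], 0                 ; upper 32 bits of the entry
    mov dword [PDPT_ADDR], PDT_ADDR | 0x03       ; PDPT[0] -> PDT (Present, R/W)
    mov dword [PDPT_ADDR + 4], 0
    ret
```

Keeping the addresses in one place means a later change to the memory layout cannot silently leave CR3 or a table entry pointing at the old location.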

4.3 Data Structures

PML4 Entry (for pointing to PDPT):

; PML4[0] = PDPT address | flags
; = 0x2000 | 0x03 (Present=1, R/W=1)
; = 0x0000_0000_0000_2003
dq 0x0000000000002003

PDPT Entry (for pointing to PDT):

; PDPT[0] = PDT address | flags
; = 0x3000 | 0x03 (Present=1, R/W=1)
; = 0x0000_0000_0000_3003
dq 0x0000000000003003

PDT Entry (for 2MB page):

; PDT[0] = Physical base | flags
; = 0x0000_0000 | 0x83 (Present=1, R/W=1, PS=1)
; PS=1 (bit 7) makes this a 2MB page
; = 0x0000_0000_0000_0083
dq 0x0000000000000083

64-bit GDT:

gdt64:
    dq 0                         ; Null descriptor
    dq 0x00209A0000000000        ; Code segment (L=1, D=0)
    dq 0x0000920000000000        ; Data segment
gdt64_end:

gdt64_descriptor:
    dw gdt64_end - gdt64 - 1    ; Limit
    dq gdt64                     ; Base (8 bytes; LGDT in 32-bit code reads the low 4)

4.4 Algorithm Overview

  1. Stage 1 (16-bit): Load Stage 2, jump to it
  2. Stage 2 Real Mode part: Print status, set up for Protected Mode
  3. Enter Protected Mode: Set CR0.PE, far jump
  4. Stage 2 Protected Mode part:
    a. Zero page table memory (0x1000-0x3FFF)
    b. Set PML4[0] = 0x2003 (points to PDPT)
    c. Set PDPT[0] = 0x3003 (points to PDT)
    d. Set PDT[0] = 0x0083 (2MB identity map)
    e. Set CR4.PAE = 1
    f. Set CR3 = 0x1000 (PML4 base)
    g. Set IA32_EFER.LME = 1 (via RDMSR/WRMSR)
    h. Set CR0.PG = 1 (now in Compatibility Mode)
    i. Load 64-bit GDT
    j. Far jump to 64-bit code segment
  5. Stage 2 Long Mode part:
    a. Reload data segments
    b. Set up 64-bit stack
    c. Print success message
    d. Demonstrate 64-bit register
    e. Halt

5. Implementation Guide

5.1 Development Environment Setup

Required Tools:

# Ubuntu/Debian
sudo apt install nasm qemu-system-x86 gdb make

# Verify QEMU supports x86-64
qemu-system-x86_64 --version

# macOS
brew install nasm qemu

# Verify installation
nasm -v    # Should be 2.14+

Optional but Recommended:

# Bochs with x86-64 support (stricter emulation)
sudo apt install bochs bochs-x

# For debugging page tables
# QEMU's "info tlb" and "info mem" commands are invaluable

5.2 Project Structure

long-mode-bootloader/
|-- stage1.asm              # MBR bootloader (loads stage 2)
|-- stage2.asm              # Main bootloader (mode transitions)
|-- include/
|   |-- gdt.inc             # GDT definitions
|   |-- print.inc           # Print macros
|   `-- pagetable.inc       # Page table setup macros
|-- Makefile
|-- run.sh                  # QEMU launch script
|-- debug.sh                # QEMU with GDB server
`-- README.md

Makefile:

all: disk.img

stage1.bin: stage1.asm
	nasm -f bin stage1.asm -o stage1.bin

stage2.bin: stage2.asm
	nasm -f bin stage2.asm -o stage2.bin

disk.img: stage1.bin stage2.bin
	cat stage1.bin stage2.bin > bootloader.bin
	dd if=/dev/zero of=disk.img bs=1M count=1
	dd if=bootloader.bin of=disk.img conv=notrunc

run: disk.img
	qemu-system-x86_64 -drive format=raw,file=disk.img

debug: disk.img
	qemu-system-x86_64 -s -S -drive format=raw,file=disk.img &
	gdb -ex "target remote :1234" -ex "set architecture i386:x86-64"

clean:
	rm -f *.bin disk.img

5.3 The Core Question You’re Answering

“Why does Long Mode require paging, and how do you set up 4-level page tables to enable the transition from 32-bit Protected Mode to 64-bit Long Mode?”

This project forces you to understand:

  • The architectural decision that segmentation cannot scale to 64-bit addressing
  • How the CPU uses multiple levels of indirection to translate virtual addresses
  • The sequence of register manipulations that unlock 64-bit mode
  • Why identity mapping is crucial during mode transitions

5.4 Concepts You Must Understand First

Concept Self-Check Question Book Reference
Protected Mode Can you explain why we need a GDT to enter Protected Mode? “Low-Level Programming” Ch. 4
Paging basics What is a page fault and when does it occur? CS:APP Ch. 9
Page table entries What bits must be set for a valid page table entry? Intel SDM Vol. 3A, Ch. 4
MSRs How do RDMSR and WRMSR differ from MOV? Intel SDM Vol. 3A, Ch. 9
CR0/CR3/CR4 What does each control register control? Intel SDM Vol. 3A, Ch. 2
64-bit registers What new registers are available in Long Mode? AMD64 APM Vol. 1, Ch. 1

5.5 Questions to Guide Your Design

  1. Page Table Placement
    • At what physical addresses will you place your page tables?
    • How will you ensure they’re 4KB aligned?
    • How much memory do the page tables consume?
  2. Identity Mapping Scope
    • How much memory needs to be identity mapped?
    • Will you use 4KB pages or 2MB huge pages?
    • What happens to addresses above your identity-mapped region?
  3. GDT Management
    • Do you need separate GDTs for Protected Mode and Long Mode?
    • Where will the 64-bit GDT live in memory?
    • What’s the selector value for your 64-bit code segment?
  4. Transition Timing
    • What must happen BEFORE enabling paging?
    • What must happen AFTER enabling paging but BEFORE the far jump?
    • What does “Compatibility Mode” mean and when are you in it?
  5. Verification
    • How will you confirm each transition succeeded?
    • How can you tell you’re in Long Mode vs Compatibility Mode?
    • What QEMU commands help debug page table issues?

5.6 Thinking Exercise

Before writing any code, perform these exercises:

Exercise 1: Address Translation by Hand

Given:

  • PML4 at physical 0x1000
  • PDPT at physical 0x2000
  • PDT at physical 0x3000
  • PT at physical 0x4000 (if using 4KB pages)

Trace the translation of virtual address 0x0000000000008765:

  1. Extract bits 47:39 (PML4 index) = ?
  2. Extract bits 38:30 (PDPT index) = ?
  3. Extract bits 29:21 (PDT index) = ?
  4. Extract bits 20:12 (PT index, if 4KB pages) = ?
  5. Extract bits 11:0 (page offset) = ?

For a 2MB huge page setup, the translation stops at PDT. What’s the final physical address?
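
Once you have worked the exercise by hand, the assembler can check your arithmetic. The sketch below expresses the bit-field extraction as assembly-time constants. The symbol names and the idea of emitting the values with `dq` are assumptions for illustration, and no results are shown so the exercise stays intact.

```nasm
%define VADDR 0x0000000000008765        ; address from the exercise; change to try others

PML4_INDEX equ (VADDR >> 39) & 0x1FF    ; bits 47:39
PDPT_INDEX equ (VADDR >> 30) & 0x1FF    ; bits 38:30
PDT_INDEX  equ (VADDR >> 21) & 0x1FF    ; bits 29:21
PT_INDEX   equ (VADDR >> 12) & 0x1FF    ; bits 20:12 (4KB pages only)
OFFSET_4K  equ VADDR & 0xFFF            ; bits 11:0
OFFSET_2M  equ VADDR & 0x1FFFFF         ; with PS=1 the low 21 bits are the page offset

; Emit the six values as quadwords so they can be read back from the output file
dq PML4_INDEX, PDPT_INDEX, PDT_INDEX, PT_INDEX, OFFSET_4K, OFFSET_2M
```

Assemble the file with `nasm -f bin` and inspect the output with a hex dump tool such as `hexdump -C`. Each value occupies 8 bytes in little-endian order.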

Exercise 2: Page Table Entry Construction

Construct the 64-bit value for:

  1. PML4 entry pointing to PDPT at 0x2000 (Present, R/W)
  2. PDPT entry pointing to PDT at 0x3000 (Present, R/W)
  3. PDT entry for a 2MB page at physical 0 (Present, R/W, PS=1)

Write each as a hex value (e.g., 0x0000000000002003).

Exercise 3: Register State Diagram

Draw the state of these registers at each stage:

  • CR0 (PE, PG bits)
  • CR3 (page table base)
  • CR4 (PAE bit)
  • IA32_EFER (LME, LMA bits)
  • CS (segment selector, L bit in descriptor)

Fill in:

| Stage | CR0.PE | CR0.PG | CR3 | CR4.PAE | EFER.LME | EFER.LMA | CS.L |
|-------|--------|--------|-----|---------|----------|----------|------|
| Real Mode | 0 | 0 | ? | 0 | 0 | 0 | 0 |
| Protected Mode | ? | ? | ? | ? | ? | ? | ? |
| After LME set | ? | ? | ? | ? | ? | ? | ? |
| After PG set | ? | ? | ? | ? | ? | ? | ? |
| After far jump | ? | ? | ? | ? | ? | ? | ? |

5.7 Hints in Layers

Hint 1: Page Table Memory Setup (Conceptual)

Place your page tables at fixed, known addresses:

  • PML4 at 0x1000 (4KB aligned)
  • PDPT at 0x2000 (4KB aligned)
  • PDT at 0x3000 (4KB aligned)

First, zero out all page table memory. This ensures all “not present” entries are properly cleared.

For 2MB huge pages (recommended for simplicity), you only need these three tables. Set the PS bit (bit 7) in the PDT entry to indicate it’s a 2MB page, not a pointer to a PT.

Hint 2: Setting Up Page Table Entries (Practical)

; After entering Protected Mode, set up page tables

; First, zero out page table memory (12KB total)
mov edi, 0x1000
mov ecx, 0x3000 / 4   ; 12KB = 0x3000 bytes = 0xC00 dwords
xor eax, eax
cld                   ; Ensure STOSD increments EDI
rep stosd             ; Zero memory, 4 bytes at a time

; PML4[0] = address of PDPT | Present | R/W
mov dword [0x1000], 0x2003
mov dword [0x1004], 0x00000000   ; Upper 32 bits (address < 4GB)

; PDPT[0] = address of PDT | Present | R/W
mov dword [0x2000], 0x3003
mov dword [0x2004], 0x00000000

; PDT[0] = 0 (physical base) | PS | Present | R/W
; PS (bit 7) = 0x80, Present = 0x01, R/W = 0x02
; 0x80 | 0x02 | 0x01 = 0x83
mov dword [0x3000], 0x00000083
mov dword [0x3004], 0x00000000

Hint 3: Enabling PAE and Loading CR3 (Step-by-Step)

; Step 1: Enable PAE in CR4
mov eax, cr4
or eax, (1 << 5)      ; PAE is bit 5
mov cr4, eax

; Step 2: Load CR3 with PML4 physical address
mov eax, 0x1000       ; PML4 at 0x1000
mov cr3, eax

; At this point:
; - PAE is enabled
; - CR3 points to our page tables
; - But paging is NOT yet enabled (CR0.PG = 0)
; - We're still in 32-bit Protected Mode

Hint 4: Enabling Long Mode via EFER MSR (Detailed)

; Enable Long Mode in IA32_EFER MSR
; MSR address 0xC0000080

mov ecx, 0xC0000080   ; IA32_EFER MSR address
rdmsr                  ; Read into EDX:EAX

or eax, (1 << 8)      ; Set LME (Long Mode Enable) - bit 8
wrmsr                  ; Write back

; At this point:
; - LME is set, but Long Mode is NOT yet active
; - LMA (Long Mode Active) is still 0
; - When we enable paging, the CPU will check LME
; - If LME=1 and paging enabled, CPU enters Long Mode

; NOW enable paging - this activates Long Mode
mov eax, cr0
or eax, (1 << 31)     ; PG is bit 31
mov cr0, eax

; We're now in COMPATIBILITY MODE (sub-mode of Long Mode)
; - 64-bit paging is active
; - But we're still running 32-bit code
; - To get to true Long Mode, we need to far jump to 64-bit code
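
Before attempting the far jump, it can help to confirm that the CPU really activated Long Mode. The hardware sets EFER.LMA (bit 10) itself once paging is enabled with LME=1, so reading the MSR back is a direct check. This is a diagnostic sketch only: the label `verify_long_mode_active`, the constant names, and the choice of a red '!' as the failure marker are assumptions.

```nasm
EFER_MSR equ 0xC0000080
EFER_LMA equ 1 << 10            ; Long Mode Active, set by the CPU (read-only)

[BITS 32]
; Call after CR0.PG has been set, while still executing 32-bit code
verify_long_mode_active:
    mov ecx, EFER_MSR
    rdmsr                       ; EDX:EAX = IA32_EFER
    test eax, EFER_LMA
    jnz .active
    mov word [0xB8000], 0x4F21  ; white-on-red '!' in the top-left cell
.halt:
    hlt
    jmp .halt                   ; stay halted so the marker remains visible
.active:
    ret
```

If the marker appears, LME was not set or paging was not enabled, so the later far jump to the 64-bit code segment would not reach 64-bit mode.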

Hint 5: 64-bit GDT and Far Jump (Complete Sequence)

; 64-bit GDT structure
align 16
gdt64:
    dq 0                         ; Null descriptor (offset 0x00)
    dq 0x00209A0000000000        ; 64-bit code (offset 0x08)
                                 ; L=1, D=0, Present, Ring 0, Execute/Read
    dq 0x0000920000000000        ; 64-bit data (offset 0x10)
                                 ; Present, Ring 0, Read/Write
gdt64_end:

gdt64_ptr:
    dw gdt64_end - gdt64 - 1    ; Limit
    dq gdt64                     ; Base address (8 bytes; LGDT in 32-bit code reads the low 4)

; ... after enabling paging (still in 32-bit Compatibility Mode) ...

; Load 64-bit GDT
lgdt [gdt64_ptr]

; Far jump to 64-bit code segment
; Selector 0x08 = second entry in GDT (64-bit code)
jmp 0x08:long_mode_start

; This is where 64-bit code begins
[BITS 64]
long_mode_start:
    ; We're now in true 64-bit Long Mode!

    ; Reload data segments (optional but clean)
    mov ax, 0x10          ; Data selector
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax

    ; Set up 64-bit stack
    mov rsp, 0x90000

    ; Now we can use all 64-bit registers!
    mov rax, 0x123456789ABCDEF0
    ; ... continue execution ...

Hint 6: Debugging Tips (Troubleshooting)

Triple fault on enabling paging:

  • Page tables not properly set up
  • PML4 not 4KB aligned
  • Missing Present bit in any entry along the path
  • Check with QEMU: info mem, info tlb

Triple fault on far jump to 64-bit:

  • GDT64 not properly formatted
  • L bit not set in code segment
  • D bit set when L is set (must be 0)
  • GDT base address wrong (LGDT takes a linear address, which equals the physical address only under identity mapping)

Stuck in Compatibility Mode:

  • Far jump target address wrong
  • Using 32-bit code selector instead of 64-bit

QEMU debugging commands:

(qemu) info registers     # Show all registers
(qemu) info mem           # Show memory mappings from page tables
(qemu) info tlb           # Show TLB contents
(qemu) x /4xg 0x1000     # Examine PML4 entries (8 bytes each)
(qemu) x /4xg 0x2000     # Examine PDPT entries
(qemu) x /4xg 0x3000     # Examine PDT entries

GDB debugging:

(gdb) info registers           # All registers
(gdb) p/x $cr0                 # CR0 value
(gdb) p/x $cr3                 # CR3 value
(gdb) p/x $cr4                 # CR4 value
(gdb) x/8xg 0x1000            # PML4 contents

5.8 Interview Questions They’ll Ask

If you list this project on your resume, expect these questions:

  1. “Why does Long Mode require paging while Protected Mode doesn’t?”

    What they’re testing: Understanding of x86-64 architecture decisions

    Good answer: “In 32-bit Protected Mode, segmentation provides memory protection with base+limit checks. The segment descriptor format can describe any memory range up to 4GB. In Long Mode, the address space is 64-bit (theoretically 16 exabytes). Segment descriptors can’t describe this range - their base and limit fields are effectively ignored. AMD designed Long Mode to use paging exclusively for memory management, making segmentation vestigial. This simplifies the memory model and aligns with how modern OSes (which use flat memory models anyway) prefer to work.”

    Red flag answer: “I don’t know, that’s just how it works.” (No understanding of architectural rationale)

  2. “Walk me through the exact sequence of steps to enter Long Mode from Real Mode.”

    What they’re testing: Detailed knowledge of the transition

    Good answer: “Starting in Real Mode: (1) Set up and load a 32-bit GDT, (2) Enable A20 gate, (3) Set CR0.PE to enter Protected Mode, (4) Far jump to flush the prefetch queue and load the 32-bit code selector. In Protected Mode: (5) Set up 4-level page tables with identity mapping, (6) Enable PAE in CR4, (7) Load PML4 address into CR3, (8) Set LME bit in IA32_EFER MSR, (9) Enable paging by setting CR0.PG, which puts us in Compatibility Mode. (10) Load 64-bit GDT, (11) Far jump to 64-bit code segment, and now we’re in true Long Mode. Order matters: CR4.PAE, CR3, and EFER.LME must all be in place before CR0.PG is set (setting PG with LME=1 and PAE=0 raises a general-protection fault), while the relative order of those three steps is flexible.”

    Red flag answer: “Just set a 64-bit flag somewhere.” (No understanding of the complexity)

  3. “What’s the difference between Compatibility Mode and Long Mode? How do you get from one to the other?”

    What they’re testing: Understanding of Long Mode sub-modes

    Good answer: “Compatibility Mode is a sub-mode of Long Mode where 64-bit paging is active but the CPU executes 32-bit code. The CS descriptor has L=0 and D=1, just like Protected Mode. You enter Compatibility Mode when you enable paging (CR0.PG=1) with LME already set, but you’re still using a 32-bit code segment. To enter true 64-bit Long Mode, you far jump to a code segment with L=1 and D=0. The CPU then switches to 64-bit instruction decoding and makes all 64-bit registers available.”

    Red flag answer: “They’re the same thing.” (Doesn’t understand the sub-mode structure)

  4. “Explain what each level of the 4-level page table hierarchy does. Why 4 levels?”

    What they’re testing: Page table architecture knowledge

    Good answer: “Each level indexes with 9 bits of the 48-bit virtual address (bits 63-48 must be copies of bit 47, the canonical form), and the low 12 bits are the offset within a 4KB page. PML4 (bits 47-39) is the top level; there’s only one per address space and CR3 points to it. PDPT (bits 38-30) can either point to a PDT or, with PS=1, describe a 1GB huge page. PDT (bits 29-21) points to a PT or describes a 2MB huge page. PT (bits 20-12) describes 4KB pages. Four levels cover 2^48 bytes (256TB) with manageable table sizes: each table is 4KB with 512 8-byte entries. The depth balances memory overhead against TLB miss penalties.”

    Red flag answer: “It’s for more memory.” (No structural understanding)

  5. “What happens if you enable paging without proper identity mapping for your current code?”

    What they’re testing: Practical understanding of paging activation

    Good answer: “Triple fault, causing a system reset. When you set CR0.PG=1, the very next instruction fetch uses the page tables. If your code is at physical address 0x8000 but there’s no mapping for virtual 0x8000, the CPU takes a page fault. With no IDT set up (or improper IDT), the page fault can’t be handled, causing a double fault. The double fault handler also faults, causing a triple fault. The CPU has no recovery mechanism, so it resets. That’s why identity mapping - where virtual equals physical - is essential during the transition.”

    Red flag answer: “It probably crashes.” (No understanding of the specific failure mechanism)

  6. “How would you extend your page tables to map more memory? What about kernel-user separation?”

    What they’re testing: Ability to extend the concept

    Good answer: “For more memory, add entries to the PDT. Each PDT entry with PS=1 maps 2MB. For 4GB, I’d need 2048 entries across 4 PDTs, each referenced by its own PDPT entry. For kernel-user separation, I’d set up a different page table hierarchy for user space. The kernel typically maps itself into the higher half (e.g., at 0xFFFF800000000000) in every address space, so it stays accessible after a CR3 switch. User pages have the U/S bit set; kernel pages don’t. When switching to user mode, CR3 changes (or stays the same if using the same address space), but user code can’t access supervisor pages.”

    Red flag answer: “Just add more entries.” (No concrete understanding)
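The index arithmetic behind question 4 can be checked at assembly time. The following is a minimal sketch using NASM preprocessor arithmetic; the sample address and the macro names are illustrative assumptions, not part of the project code.

```nasm
; Split a 48-bit canonical virtual address into its four 9-bit table
; indices plus the 12-bit page offset. VADDR is an arbitrary example.
%assign VADDR       0xFFFF800000201000
%assign PML4_INDEX  (VADDR >> 39) & 0x1FF   ; bits 47-39
%assign PDPT_INDEX  (VADDR >> 30) & 0x1FF   ; bits 38-30
%assign PDT_INDEX   (VADDR >> 21) & 0x1FF   ; bits 29-21
%assign PT_INDEX    (VADDR >> 12) & 0x1FF   ; bits 20-12
%assign PAGE_OFFSET VADDR & 0xFFF           ; bits 11-0

; Emit the results as 16-bit words so they can be read from the
; listing file or the assembled binary.
; For this VADDR the five values are 256, 0, 1, 1, 0.
    dw PML4_INDEX, PDPT_INDEX, PDT_INDEX, PT_INDEX, PAGE_OFFSET
```

Assembling with a listing (for example `nasm -f bin -l idx.lst idx.asm`, file names being placeholders) shows the computed words next to the source line.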

5.9 Books That Will Help

| Concept | Book | Specific Chapters/Pages | Why It Helps |
|---|---|---|---|
| Long Mode Transition | “Low-Level Programming” by Igor Zhirkov | Chapter 5 (pp. 191-220) | Step-by-step Long Mode entry with NASM examples |
| 4-Level Paging | Intel 64 and IA-32 SDM, Volume 3A | Chapter 4 (Paging) | Authoritative page table format reference |
| AMD64 Architecture | AMD64 Architecture Programmer’s Manual, Vol. 2 | Chapter 5 (Page Translation) | AMD’s perspective on Long Mode paging |
| Virtual Memory Concepts | CS:APP (3rd ed.) by Bryant & O’Hallaron | Chapter 9 (pp. 801-840) | Conceptual understanding of address translation |
| OS Paging Implementation | “Operating Systems: Three Easy Pieces” | Chapters 18-20 | High-level paging concepts with OS context |
| Linux Boot Process | Linux Inside by 0xax | Boot chapters | How Linux actually does this transition |
| x86 Assembly Reference | “Modern X86 Assembly” by Kusswurm | Chapter 11 | 64-bit assembly specifics |
| MSR Reference | Intel SDM Volume 4 | IA32_EFER documentation | All MSR details |

Recommended Reading Order:

  1. CS:APP Chapter 9 for paging concepts
  2. “Low-Level Programming” Chapter 5 for practical implementation
  3. Intel SDM Vol. 3A Chapter 4 for precise page table formats
  4. Linux Inside for real-world reference

5.10 Implementation Phases

Phase 1: Protected Mode Foundation (Days 1-3)

  • Review Project 3 (Real to Protected Mode)
  • Ensure you have a working two-stage bootloader
  • Add status printing at each stage
  • Test: Successfully in Protected Mode with VGA output

Phase 2: Page Table Setup (Days 4-7)

  • Allocate page table memory (0x1000-0x3FFF)
  • Implement memory zeroing for page tables
  • Set up PML4, PDPT, PDT entries manually
  • Test: Page table memory contains correct values (verify with QEMU monitor)
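Phase 2 can be sketched in a few instructions of 32-bit code. This is a minimal example assuming the table addresses listed above (0x1000, 0x2000, 0x3000), a flat ES data segment, and a single 2MB identity-mapped page; the label name is hypothetical.

```nasm
bits 32

PML4_ADDR equ 0x1000            ; table layout used in this project
PDPT_ADDR equ 0x2000
PDT_ADDR  equ 0x3000

setup_page_tables:              ; hypothetical label name
    ; Zero all three tables: 3 * 4096 bytes, written as dwords
    cld
    mov edi, PML4_ADDR
    xor eax, eax
    mov ecx, (3 * 4096) / 4
    rep stosd                   ; stores through ES:EDI (ES assumed flat)

    ; Only the low dword of each 8-byte entry needs a value here,
    ; because the high dwords were just zeroed.
    mov dword [PML4_ADDR], PDPT_ADDR | 0x03   ; PML4[0] -> PDPT, Present | R/W
    mov dword [PDPT_ADDR], PDT_ADDR  | 0x03   ; PDPT[0] -> PDT,  Present | R/W
    mov dword [PDT_ADDR],  0x000000  | 0x83   ; PDT[0]: 2MB page at 0, Present | R/W | PS
    ret
```

After this routine runs, `x /8xg 0x1000` in the QEMU monitor should show 0x2003 in the first entry, matching the verification checklist.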

Phase 3: Enable PAE and Load CR3 (Days 8-10)

  • Enable PAE in CR4
  • Load CR3 with PML4 address
  • Add verification output
  • Test: info registers shows correct CR4 and CR3 values

Phase 4: Enable Long Mode (Days 11-14)

  • Use RDMSR/WRMSR to set LME in EFER
  • Enable paging in CR0
  • Test: System doesn’t triple fault (you’re now in Compatibility Mode)
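Phases 3 and 4 together amount to four register updates. The sketch below assumes the PML4 sits at 0x1000 as in Phase 2 and that the code and stack lie inside the identity-mapped region; the label and constant names are illustrative.

```nasm
bits 32

PML4_ADDR equ 0x1000            ; PML4 location used in this project
IA32_EFER equ 0xC0000080        ; MSR address of IA32_EFER

enter_compatibility_mode:       ; hypothetical label name
    ; Phase 3: enable PAE, then point CR3 at the PML4
    mov eax, cr4
    or  eax, 1 << 5             ; CR4.PAE
    mov cr4, eax

    mov eax, PML4_ADDR
    mov cr3, eax

    ; Phase 4: set EFER.LME, then turn on paging
    mov ecx, IA32_EFER
    rdmsr                       ; EDX:EAX = current EFER value
    or  eax, 1 << 8             ; EFER.LME
    wrmsr

    mov eax, cr0
    or  eax, 1 << 31            ; CR0.PG (CR0.PE is already set)
    mov cr0, eax

    ; The CPU is now in Compatibility Mode. Every fetch and stack access
    ; is translated, so this code and the stack must be identity-mapped
    ; for the following RET to work.
    ret
```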

Phase 5: Jump to Long Mode (Days 15-21)

  • Create 64-bit GDT
  • Implement far jump to 64-bit code
  • Write 64-bit code section
  • Test: info registers shows RIP (not EIP), RAX contains 64-bit value
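A compact sketch of Phase 5 follows. It assumes Compatibility Mode is already active, that the file is assembled with an `org` matching its load address, and that a separate data descriptor is wanted; all label names are hypothetical.

```nasm
bits 32

align 8
gdt64:
    dq 0                         ; null descriptor
.code: equ $ - gdt64             ; selector 0x08
    dq 0x00209A0000000000        ; code: present, executable, readable, L=1, D=0
.data: equ $ - gdt64             ; selector 0x10
    dq 0x0000920000000000        ; data: present, writable
.pointer:
    dw $ - gdt64 - 1             ; limit = size of table minus one
    dd gdt64                     ; 32-bit base, because LGDT runs in 32-bit code

enter_long_mode:                 ; hypothetical entry point
    lgdt [gdt64.pointer]
    jmp gdt64.code:long_mode_start   ; far jump loads CS with the L=1 descriptor

bits 64
long_mode_start:
    mov ax, gdt64.data
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov rax, 0x123456789ABCDEF0  ; a 64-bit immediate only decodes in 64-bit mode
    mov r15, rax                 ; R8-R15 exist only in 64-bit mode
.halt:
    hlt
    jmp .halt
```

Breaking at `.halt` and running `info registers` should then show RIP and the full 64-bit value in RAX and R15.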

Phase 6: Polish and Verify (Days 22-30)

  • Add comprehensive status messages
  • Test in Bochs for strict validation
  • Clean up code, add comments
  • Demonstrate 64-bit register usage visually
  • Test on real hardware (if available)

6. Testing Strategy

6.1 Staged Verification

| Stage | Test | Expected Result | QEMU Command |
|---|---|---|---|
| 1 | Boot and Real Mode message | “[16-bit Real Mode]” appears | Visual |
| 2 | Protected Mode transition | “[32-bit Protected Mode]” appears | Visual |
| 3 | Page tables set up | Memory at 0x1000-0x3FFF contains entries | x /8xg 0x1000 |
| 4 | PAE enabled | CR4 bit 5 set | info registers |
| 5 | CR3 loaded | CR3 = 0x1000 | info registers |
| 6 | LME set | EFER bit 8 set (EFER value includes 0x100) | info registers |
| 7 | Paging enabled | CR0 bit 31 set, no triple fault | info registers |
| 8 | Long Mode active | RIP register visible (not EIP) | info registers |
| 9 | 64-bit registers | RAX shows 64-bit value | info registers |

6.2 Testing Commands

# Basic run
qemu-system-x86_64 -drive format=raw,file=disk.img

# With monitor access
qemu-system-x86_64 -drive format=raw,file=disk.img -monitor stdio

# Debug with GDB
qemu-system-x86_64 -s -S -drive format=raw,file=disk.img &
gdb -ex "target remote :1234"

QEMU Monitor Commands:

(qemu) info registers
# Look for:
# - RIP (not EIP) for Long Mode
# - CR0, CR3, CR4 values
# - EFER value, printed as hex (bit 8 = LME, bit 10 = LMA; 0x500 when only those two are set)

(qemu) info mem
# Shows page table mappings

(qemu) x /8xg 0x1000
# Examine PML4 (8-byte entries)

(qemu) x /8xg 0x2000
# Examine PDPT

(qemu) x /8xg 0x3000
# Examine PDT

GDB Commands:

# Set architecture for each mode
(gdb) set architecture i8086        # Real Mode
(gdb) set architecture i386         # Protected Mode
(gdb) set architecture i386:x86-64  # Long Mode

# Examine registers
(gdb) info registers
(gdb) p/x $cr0
(gdb) p/x $cr3
(gdb) p/x $cr4
(gdb) p/x $rax   # 64-bit in Long Mode

# Step through code
(gdb) si         # Single instruction
(gdb) ni         # Next instruction (over calls)

6.3 Verification Checklist

  • Binary is correct size and boots
  • Real Mode message displays
  • Protected Mode message displays
  • Page table memory is zeroed
  • PML4[0] = 0x2003
  • PDPT[0] = 0x3003
  • PDT[0] = 0x83 (2MB page, present, R/W)
  • CR4.PAE = 1
  • CR3 = 0x1000
  • EFER.LME = 1
  • CR0.PG = 1 (no triple fault)
  • Far jump to 64-bit segment succeeds
  • Long Mode message displays
  • 64-bit register demonstration works
  • info registers shows RIP and 64-bit RAX
  • Works in QEMU without issues
  • (Bonus) Works in Bochs

7. Common Pitfalls & Debugging

Pitfall 1: Triple Fault When Enabling Paging

Symptoms: System resets immediately after mov cr0, eax with PG bit set.

Root Cause: Page tables don’t identity map the currently executing code.

Why: Once paging is enabled, the very next instruction fetch goes through the page tables. If the virtual address of your code isn’t mapped, the CPU raises a page fault; with no usable IDT that escalates to a double fault and then a triple fault.

Fix:

  • Ensure identity mapping covers your bootloader’s physical address
  • For code at 0x7C00-0x8FFF, a 2MB page at 0x000000-0x1FFFFF covers it
  • Verify page table entries have Present bit set

Quick Debug:

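(qemu) x /8xg 0x1000  # Check PML4
# Should see 0x0000000000002003 (or similar with your PDPT address)
# Once paging has run, the CPU may have set the Accessed bit (0x20), giving 0x2023
(qemu) x /8xg 0x3000  # Check PDT
# Should see 0x0000000000000083 for 2MB page
# (0xA3 or 0xE3 after the CPU sets Accessed/Dirty)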

Pitfall 2: Triple Fault on Far Jump to 64-bit

Symptoms: Paging works, but far jump to 64-bit code causes reset.

Root Cause: 64-bit GDT incorrectly formatted, or L/D bits wrong.

Why: A 64-bit code segment must have L=1 and D=0. L=1 with D=1 is a reserved combination that faults when loaded into CS, and L=0 selects Compatibility Mode rather than 64-bit mode.

Fix:

  • Verify GDT64 code segment has L=1, D=0
  • Raw bytes should be: 0x00209A0000000000
  • Check byte 6: should have 0x20 (L=1, D=0, G=0)
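To see where 0x9A and 0x20 come from, the descriptor can be assembled field by field. This is an illustrative breakdown; the constant and label names are assumptions, while the bit positions follow the standard segment descriptor layout.

```nasm
; Access byte (descriptor byte 5)
SEG_PRESENT   equ 1 << 7         ; P
SEG_CODEDATA  equ 1 << 4         ; S = 1: code or data descriptor
SEG_EXEC      equ 1 << 3         ; executable (code segment)
SEG_READABLE  equ 1 << 1         ; readable
ACCESS_CODE64 equ SEG_PRESENT | SEG_CODEDATA | SEG_EXEC | SEG_READABLE  ; 0x9A

; Flags live in the upper nibble of descriptor byte 6
FLAG_LONG     equ 1 << 5         ; L = 1: 64-bit code segment
FLAG_DB       equ 1 << 6         ; D/B, must stay 0 when L = 1 (not used below)
FLAGS_CODE64  equ FLAG_LONG      ; 0x20, so G = 0 and D = 0

gdt64_code_descriptor:           ; hypothetical label
    dw 0                         ; limit 15:0  (ignored in 64-bit mode)
    dw 0                         ; base 15:0   (ignored in 64-bit mode)
    db 0                         ; base 23:16
    db ACCESS_CODE64             ; byte 5: access byte
    db FLAGS_CODE64              ; byte 6: flags and limit 19:16
    db 0                         ; base 31:24
```

Read back as a little-endian quadword, these eight bytes are 0x00209A0000000000.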

Quick Debug:

(qemu) x /24xb GDT64_ADDR  # Examine raw GDT bytes; replace GDT64_ADDR with the numeric address of gdt64 (the monitor does not resolve symbols)
# Code segment (bytes 8-15) should be:
# 00 00 00 00 00 9A 20 00
#                ^^ ^^ = access byte and flags

Pitfall 3: Stuck in Compatibility Mode

Symptoms: Paging works, no crash, but info registers shows EIP not RIP.

Root Cause: Far jump didn’t use 64-bit code selector, or jump target wrong.

Why: After enabling paging with LME=1, you’re in Compatibility Mode (still executing 32-bit code). Only a far jump to a segment with L=1 transitions to true Long Mode.

Fix:

  • Ensure far jump uses correct selector: jmp 0x08:long_mode_target
  • Verify selector 0x08 points to 64-bit code segment (not 32-bit)
  • Target address must be correct and reachable

Quick Debug:

(gdb) p/x $cs
# Should be 0x08 (or your 64-bit code selector)
# Check CS descriptor in GDT

Pitfall 4: Page Table Entries Incorrect

Symptoms: Page fault or unexpected memory corruption.

Root Cause: Entries missing Present bit, or pointing to wrong address.

Why: Each page table entry must have bit 0 (Present) set. The address field must be properly aligned and point to the next level (or physical page).

Fix:

  • PML4 entry: (PDPT_address & 0xFFFFF000) | 0x03
  • PDPT entry: (PDT_address & 0xFFFFF000) | 0x03
  • PDT entry for 2MB page: (page_base & 0xFFE00000) | 0x83
  • Always set Present (0x01) and R/W (0x02) bits
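These formulas can be evaluated by the assembler itself, which catches a wrong mask or flag before the code ever runs. A small sketch using NASM's preprocessor, with macro names chosen for illustration and the addresses taken from this project's layout:

```nasm
%assign PAGE_PRESENT 0x01
%assign PAGE_WRITE   0x02
%assign PAGE_PS      0x80        ; page-size bit: 2MB page in a PDT entry

%assign PDPT_ADDR    0x2000
%assign PDT_ADDR     0x3000
%assign PAGE_BASE    0x000000    ; physical base of the first 2MB page

%assign PML4_ENTRY (PDPT_ADDR & 0xFFFFF000) | PAGE_PRESENT | PAGE_WRITE
%assign PDPT_ENTRY (PDT_ADDR  & 0xFFFFF000) | PAGE_PRESENT | PAGE_WRITE
%assign PDT_ENTRY  (PAGE_BASE & 0xFFE00000) | PAGE_PRESENT | PAGE_WRITE | PAGE_PS

; Fail the build if an entry drifts from the checklist values.
%if PML4_ENTRY != 0x2003
  %error "PML4[0] should be 0x2003"
%endif
%if PDPT_ENTRY != 0x3003
  %error "PDPT[0] should be 0x3003"
%endif
%if PDT_ENTRY != 0x83
  %error "PDT[0] should be 0x83"
%endif
```

The resulting constants can be stored directly, for example with `mov dword [0x1000], PML4_ENTRY`.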

Quick Debug:

; After setting up tables, verify:
mov eax, [0x1000]
; Should be 0x2003 (PDPT at 0x2000, Present, R/W)

Pitfall 5: EFER Read/Write Failure

Symptoms: RDMSR/WRMSR causes exception, or LME not set.

Root Cause: Wrong MSR address, or RDMSR/WRMSR usage error.

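Why: MSR 0xC0000080 is IA32_EFER. RDMSR and WRMSR take the MSR address in ECX and transfer the value through EDX:EAX, not just EAX, so WRMSR writes whatever is in EDX as the upper 32 bits. Accessing an MSR the CPU does not implement raises a general-protection fault.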

Fix:

mov ecx, 0xC0000080   ; MSR address in ECX
rdmsr                  ; Result in EDX:EAX
or eax, (1 << 8)      ; Set LME (bit 8) - it's in EAX, not EDX
wrmsr                  ; Write from EDX:EAX

Quick Debug:

(qemu) info registers
# Look for the EFER= field (printed as hex); bit 8 (0x100) should be set after your code runs
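One way to rule out an unsupported CPU before touching EFER is to ask CPUID first. The routine below is a sketch under the assumption that the CPUID instruction itself is available; the label name and the carry-flag convention are illustrative choices.

```nasm
bits 32

; Returns with CF clear if Long Mode is supported, CF set otherwise.
; Clobbers EAX, EBX, ECX, EDX (CPUID overwrites all four).
check_long_mode:                 ; hypothetical label name
    mov eax, 0x80000000          ; highest extended CPUID leaf
    cpuid
    cmp eax, 0x80000001
    jb  .no_long_mode            ; extended feature leaf not available

    mov eax, 0x80000001          ; extended feature flags
    cpuid
    test edx, 1 << 29            ; EDX bit 29 = Long Mode (LM)
    jz  .no_long_mode

    clc
    ret
.no_long_mode:
    stc
    ret
```

A caller would typically follow `call check_long_mode` with `jc` to an error path that prints a message and halts.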

8. Extensions & Challenges

After completing the basic transition, try these extensions:

Extension 1: Map More Memory (Beginner)

Extend your PDT to map the first 16MB (8 entries with 2MB pages). Update the page table setup and verify memory access above 2MB works.
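A loop keeps this short. The sketch assumes the PDT at 0x3000 from the base project and runs in 32-bit code before paging is enabled; the label name is hypothetical.

```nasm
bits 32

PDT_ADDR equ 0x3000              ; PDT location used in this project

map_first_16mb:                  ; hypothetical label name
    mov edi, PDT_ADDR
    mov eax, 0x83                ; Present | R/W | PS, physical base 0
    mov ecx, 8                   ; 8 entries * 2MB = 16MB
.next:
    mov [edi], eax               ; low dword: frame address and flags
    mov dword [edi + 4], 0       ; high dword: address bits 63-32
    add eax, 0x200000            ; advance to the next 2MB frame
    add edi, 8                   ; advance to the next 8-byte entry
    loop .next
    ret
```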

Extension 2: 4KB Pages (Intermediate)

Implement a full 4-level hierarchy with 4KB pages instead of 2MB huge pages. Create a PT level and map individual pages. This prepares you for fine-grained memory management.

Extension 3: Higher Half Kernel (Intermediate)

Map your bootloader at both physical address AND at a high virtual address (e.g., 0xFFFF800000000000). This is how real kernels set up their address space.
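Because 0xFFFF800000000000 has PML4 index 256 (bits 47-39), the simplest higher-half alias reuses the existing PDPT under a second PML4 slot. A minimal sketch, assuming the table addresses from the base project and a hypothetical label name:

```nasm
bits 32

PML4_ADDR         equ 0x1000
PDPT_ADDR         equ 0x2000
HIGHER_HALF_INDEX equ 256        ; bits 47-39 of 0xFFFF800000000000

map_higher_half:                 ; hypothetical label name
    ; PML4[256] points at the same PDPT as PML4[0], so the first 2MB of
    ; physical memory appears at both 0x0 and 0xFFFF800000000000.
    mov dword [PML4_ADDR + HIGHER_HALF_INDEX * 8], PDPT_ADDR | 0x03
    mov dword [PML4_ADDR + HIGHER_HALF_INDEX * 8 + 4], 0
    ret
```

Once in 64-bit mode, code can load 0xFFFF800000000000 plus a label's physical offset into a register and jump through it to continue running from the high alias.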

Extension 4: No-Execute Bit (Intermediate)

Enable NX/XD support by setting EFER.NXE, then mark data pages as non-executable. Test that executing code from an NX page causes a fault.
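The NX bit is bit 63 of a page table entry, which in 32-bit code is bit 31 of the entry's high dword. The sketch below assumes the PDT at 0x3000 and that PDT[1] (the second 2MB page, as mapped in Extension 1) holds only data; both the label and that choice of page are illustrative.

```nasm
bits 32

PDT_ADDR  equ 0x3000
IA32_EFER equ 0xC0000080

enable_nx:                       ; hypothetical label name
    mov eax, 0x80000001
    cpuid                        ; clobbers EAX, EBX, ECX, EDX
    test edx, 1 << 20            ; EDX bit 20 = NX support
    jz  .done                    ; leave the tables untouched if unsupported

    mov ecx, IA32_EFER
    rdmsr
    or  eax, 1 << 11             ; EFER.NXE
    wrmsr                        ; must happen before bit 63 is used:
                                 ; with NXE=0 that bit is reserved

    ; Mark PDT[1] non-executable: bit 63 = bit 31 of the high dword
    or  dword [PDT_ADDR + 1 * 8 + 4], 1 << 31
.done:
    ret
```

If paging is already active when the entry changes, reload CR3 or use INVLPG so a stale TLB entry does not hide the new permission.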

Extension 5: Multiple Address Spaces (Advanced)

Create two different PML4 tables and switch between them by changing CR3. This is the foundation of process isolation.

Extension 6: 1GB Huge Pages (Advanced)

If your CPU supports it (CPUID leaf 0x80000001, EDX bit 26), use 1GB huge pages in PDPT entries. You’ll need only PML4 and PDPT levels.
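A sketch of the check and the single PDPT entry follows, assuming the PDPT at 0x2000 from the base project. The label name and the carry-flag return convention are illustrative.

```nasm
bits 32

PDPT_ADDR equ 0x2000

; Returns with CF clear after mapping the first 1GB, CF set if the CPU
; lacks 1GB page support. Clobbers EAX, EBX, ECX, EDX.
map_first_1gb:                   ; hypothetical label name
    mov eax, 0x80000001
    cpuid
    test edx, 1 << 26            ; EDX bit 26 = 1GB page support
    jz  .unsupported

    ; PDPT[0] maps physical 0 to 1GB directly: Present | R/W | PS
    mov dword [PDPT_ADDR], 0x83
    mov dword [PDPT_ADDR + 4], 0
    clc
    ret
.unsupported:
    stc
    ret
```

With this entry in place the PDT at 0x3000 is no longer consulted for the first gigabyte.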

Extension 7: Return to Protected Mode (Expert)

Implement the reverse transition: Long Mode -> Compatibility Mode -> Protected Mode -> Real Mode. This is useful for BIOS calls from 64-bit mode (some real-world bootloaders do this).


9. Real-World Connections

How Linux Does It

Linux’s kernel boot transition is in arch/x86/boot/compressed/head_64.S. The sequence:

  1. Real mode setup in arch/x86/boot/
  2. Protected mode entry in arch/x86/boot/pm.c
  3. Page table setup in arch/x86/boot/compressed/head_64.S
  4. Long mode entry with identity mapping
  5. Later, switch to kernel’s own page tables

Your bootloader implements the same conceptual sequence, simplified for learning.

GRUB and Modern Bootloaders

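GRUB2’s 64-bit loading:

  1. On BIOS systems, the boot sector image in the MBR (boot.img) loads GRUB’s core image
  2. GRUB itself runs in Protected Mode on BIOS and in Long Mode on 64-bit UEFI
  3. For some 64-bit targets, GRUB sets up minimal page tables before handing off; Linux and Multiboot kernels booted on BIOS make the Long Mode transition themselves
  4. Kernel then sets up its own page tables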

UEFI Path

On 64-bit systems, UEFI firmware has already switched the CPU to Long Mode by the time it runs a bootloader, so UEFI bootloaders skip the Real -> Protected -> Long transition entirely. Understanding the legacy path helps appreciate what UEFI simplifies.

Industry Applications

  1. OS Development: Every 64-bit x86 OS must implement this transition
  2. Hypervisors: VMware, Xen, KVM manage guest CPU mode transitions
  3. Firmware: BIOS/UEFI implementations handle mode transitions
  4. Security Research: Understanding paging is crucial for exploit development/defense
  5. Embedded x86: Some x86 embedded systems run in various modes

10. Resources

Official Documentation

Online References

Tools

  • NASM - Assembler
  • QEMU - x86-64 emulator with excellent debugging
  • Bochs - Strict x86-64 emulator for validation
  • GDB - Debugger with QEMU integration

Books (Priority Order)

  1. “Low-Level Programming” by Igor Zhirkov - Best practical Long Mode guide
  2. Intel SDM Volume 3A - Authoritative page table reference
  3. CS:APP by Bryant & O’Hallaron - Virtual memory concepts
  4. AMD64 APM Volume 2 - AMD’s paging documentation

11. Self-Assessment Checklist

Understanding Check

  • I can explain why Long Mode requires paging (segmentation doesn’t scale)
  • I understand the 4-level page table hierarchy (PML4 -> PDPT -> PDT -> PT)
  • I know what each bit in a page table entry means
  • I can explain PAE and why it must be enabled before Long Mode
  • I understand the IA32_EFER MSR and LME/LMA bits
  • I know the difference between Compatibility Mode and Long Mode
  • I can explain why identity mapping is crucial during transition

Implementation Check

  • Page tables are 4KB aligned
  • PML4, PDPT, PDT entries are correctly formatted
  • 2MB huge pages work (PS bit set in PDT)
  • PAE enabled before loading CR3
  • CR3 loaded with PML4 address
  • LME set via RDMSR/WRMSR before enabling paging
  • Paging enabled without triple fault
  • 64-bit GDT has L=1, D=0 for code segment
  • Far jump to 64-bit segment succeeds
  • 64-bit code executes and uses full registers

Testing Check

  • Works in QEMU without triple faults
  • info registers shows RIP (not EIP)
  • 64-bit register demonstration works
  • Status messages appear at each stage
  • Works in Bochs (if available)

Interview Readiness

  • I can whiteboard the complete transition sequence
  • I can explain page table entry format from memory
  • I can discuss Long Mode vs Compatibility Mode
  • I can troubleshoot a page fault scenario
  • I understand how this relates to Linux boot process

12. Summary

This project represents the pinnacle of x86 bootloader development: transitioning from Real Mode through Protected Mode to Long Mode. You’ve learned:

  • Why paging is mandatory: Long Mode’s architectural decision to abandon segmentation
  • 4-level page tables: PML4 -> PDPT -> PDT -> PT/huge page hierarchy
  • Control register orchestration: CR0, CR3, CR4, and IA32_EFER working together
  • Identity mapping necessity: Why virtual=physical is essential during transition
  • GDT differences: 32-bit vs 64-bit code segment requirements (L and D bits)
  • MSR usage: RDMSR/WRMSR for accessing non-memory-mapped CPU configuration

This knowledge is foundational for:

  • Operating system kernel development
  • Hypervisor and virtualization technology
  • Security research (paging exploits, privilege escalation)
  • UEFI and firmware development
  • Understanding how every 64-bit x86 system boots

What you’ve achieved: You’ve completed the same CPU mode journey that Linux, Windows, and every 64-bit x86 operating system performs. Your code transitions through 25 years of x86 history: from 1978’s 8086 (Real Mode) through 1985’s 80386 (32-bit Protected Mode) to 2003’s AMD64 (Long Mode). Few developers ever see this code, let alone write it themselves.

Next Steps: With Long Mode mastered, you’re ready for Project 7 (UEFI Hello World) which skips this transition entirely - UEFI starts in Long Mode. Or extend this project to build a minimal 64-bit kernel.


Project 6 of 17 in the Bootloader Deep Dive series
