Project 6: x86 Protected Mode Kernel
Build a kernel that transitions from 16-bit real mode to 32-bit protected mode, sets up a Global Descriptor Table (GDT), and executes C code with full 4GB memory access - the foundation for an operating system.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 2-3 weeks |
| Language | x86 Assembly + C |
| Prerequisites | Project 5 (bootloader), GCC cross-compiler, basic C |
| Key Topics | GDT, protected mode, CR0, VGA text mode (0xB8000), calling conventions |
Table of Contents
- 1. Learning Objectives
- 2. Theoretical Foundation
- 3. Project Specification
- 4. Solution Architecture
- 5. Implementation Guide
- 6. Testing Strategy
- 7. Common Pitfalls & Debugging
- 8. Extensions & Challenges
- 9. Real-World Connections
- 10. Resources
- 11. Self-Assessment Checklist
- 12. Submission / Completion Criteria
1. Learning Objectives
By completing this project, you will:
- Understand x86 segmentation: Master the Global Descriptor Table and segment descriptors
- Perform CPU mode transitions: Switch from 16-bit real mode to 32-bit protected mode
- Control CPU state via control registers: Manipulate CR0 to enable protected mode
- Link assembly and C code: Use proper calling conventions and linker scripts
- Write directly to video memory: Display output via VGA text mode at 0xB8000
- Create a flat memory model: Set up segments for 32-bit flat addressing
- Build a cross-compilation toolchain: Use i686-elf-gcc for freestanding C
2. Theoretical Foundation
2.1 Core Concepts
What is Protected Mode?
Protected mode is the native 32-bit operating mode of x86 processors (80386+). It provides features essential for modern operating systems:
+-----------------------------------------------------------------------+
| PROTECTED MODE vs REAL MODE |
+-----------------------------------------------------------------------+
| |
| Feature Real Mode Protected Mode |
| ─────────────────────────────────────────────────────────────────── |
| Address Space 1 MB (20-bit) 4 GB (32-bit) |
| Registers 16-bit 32-bit |
| Segmentation Segment * 16 + Offset Descriptor-based |
| Memory Protection None Ring levels (0-3) |
| Paging Not available Optional (in CR0) |
| BIOS Services INT instructions Not available* |
| I/O Ports Always accessible Controlled by IOPL |
| |
| * BIOS services are 16-bit; can't be called from 32-bit mode |
| |
+-----------------------------------------------------------------------+
| |
| Memory Addressing Comparison: |
| |
| Real Mode: |
| Physical Address = Segment Register * 16 + Offset |
| Maximum: 0xFFFF * 16 + 0xFFFF = 0x10FFEF (~1 MB + 64 KB) |
| |
| Protected Mode (Flat Model): |
| Physical Address = Linear Address (segment base = 0) |
| Maximum: 0xFFFFFFFF (4 GB) |
| |
+-----------------------------------------------------------------------+
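The two addressing formulas in the box above can be checked on a host machine with a few lines of C. This is a minimal sketch, not kernel code: the function names `real_mode_phys` and `flat_linear` are hypothetical, and it assumes a hosted compiler so that `assert.h` is available.

```c
#include <assert.h>
#include <stdint.h>

/* Real mode: physical = segment * 16 + offset (A20 masking not modeled). */
static uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Protected mode, flat model: the segment base is 0, so linear == offset. */
static uint32_t flat_linear(uint32_t segment_base, uint32_t offset)
{
    assert(segment_base == 0); /* flat-model assumption from the text */
    return segment_base + offset;
}
```

For example, `real_mode_phys(0x07C0, 0x0000)` and `real_mode_phys(0x0000, 0x7C00)` name the same byte, which is why the boot sector can be addressed either way.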
The Global Descriptor Table (GDT)
The GDT is a table in memory that defines memory segments. In protected mode, segment registers hold selectors that index into the GDT:
+-----------------------------------------------------------------------+
| GLOBAL DESCRIPTOR TABLE |
+-----------------------------------------------------------------------+
| |
| Memory Layout: |
| ┌───────────────────────────────────────────────────────────────┐ |
| │ GDT Pointer (GDTR Register - loaded with LGDT) │ |
| │ ┌─────────────────────┬─────────────────────────────────────┐│ |
| │ │ Limit (16-bit) │ Base Address (32-bit) ││ |
| │ │ Size of GDT - 1 │ Physical address of GDT ││ |
| │ └─────────────────────┴─────────────────────────────────────┘│ |
| └───────────────────────────────────────────────────────────────┘ |
| |
| GDT Entries (each 8 bytes): |
| ┌───────────────────────────────────────────────────────────────┐ |
| │ Entry 0: Null Descriptor (Required, all zeros) │ |
| ├───────────────────────────────────────────────────────────────┤ |
| │ Entry 1: Code Segment (0x08) │ |
| │ Base = 0x00000000, Limit = 0xFFFFF (4GB with granularity) │ |
| │ Executable, Readable, Ring 0 │ |
| ├───────────────────────────────────────────────────────────────┤ |
| │ Entry 2: Data Segment (0x10) │ |
| │ Base = 0x00000000, Limit = 0xFFFFF (4GB with granularity) │ |
| │ Read/Write, Ring 0 │ |
| └───────────────────────────────────────────────────────────────┘ |
| |
| Segment Selector (value in segment register): |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Bits 15-3 │ Bit 2 │ Bits 1-0 │ |
| │ Index │ TI │ RPL │ |
| │ (GDT entry) │ 0=GDT │ Requested Privilege Level │ |
| └─────────────────────────────────────────────────────────────────┘ |
| |
| Example selectors: |
| 0x08 = Binary 0000000000001|0|00 = Index 1, GDT, Ring 0 (Code) |
| 0x10 = Binary 0000000000010|0|00 = Index 2, GDT, Ring 0 (Data) |
| |
+-----------------------------------------------------------------------+
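The selector layout above (index, TI, RPL) is easy to get wrong by hand, so a small encoder and decoder can help while experimenting. The helper names below are hypothetical, and the sketch assumes a hosted compiler for the `assert` checks.

```c
#include <assert.h>
#include <stdint.h>

/* Selector layout: bits 15-3 index, bit 2 TI (0 = GDT, 1 = LDT), bits 1-0 RPL. */
static uint16_t make_selector(uint16_t index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

static uint16_t selector_index(uint16_t sel) { return (uint16_t)(sel >> 3); }
static unsigned selector_ti(uint16_t sel)    { return (sel >> 2) & 1u; }
static unsigned selector_rpl(uint16_t sel)   { return sel & 3u; }
```

With TI = 0 and RPL = 0 the selector is simply the index times 8, which is where the 0x08 and 0x10 values used throughout this project come from.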
GDT Descriptor Entry Format
Each GDT entry is 8 bytes with a complex, historical layout:
+-----------------------------------------------------------------------+
| GDT DESCRIPTOR FORMAT (8 bytes) |
+-----------------------------------------------------------------------+
| |
| Byte layout (why so complicated? Intel added bits in later CPUs): |
| |
| Byte 7 Byte 6 Byte 5 Byte 4 Byte 3-2 Byte 1-0 |
| ┌─────────┬─────────┬─────────┬─────────┬───────────┬───────────┐ |
| │Base 31:24│Flags+Lim│ Access │Base 23:16│ Base 15:0 │ Limit 15:0│ |
| └─────────┴─────────┴─────────┴─────────┴───────────┴───────────┘ |
| |
| Reconstructed fields: |
| - Base (32-bit): Bytes 7, 4, 3-2 (split across descriptor!) |
| - Limit (20-bit): Bytes 6[3:0], 1-0 |
| - Flags (4-bit): Byte 6[7:4] |
| - Access (8-bit): Byte 5 |
| |
| Access Byte (Byte 5): |
| ┌───┬───┬───┬───┬───┬───┬───┬───┐ |
| │ P │DPL│ S │ E │DC │RW │ A │ Bit numbers: 7 6 5 4 3 2 1 0 |
| └───┴───┴───┴───┴───┴───┴───┴───┘ |
| |
| P = Present (1 = valid segment) |
| DPL = Descriptor Privilege Level (0 = kernel, 3 = user) |
| S = Descriptor type (1 = code/data, 0 = system) |
| E = Executable (1 = code, 0 = data) |
| DC = Direction/Conforming |
| Data: 0 = grows up, 1 = grows down |
| Code: 0 = non-conforming, 1 = conforming |
| RW = Readable/Writable |
| Code: 1 = readable, 0 = execute-only |
| Data: 1 = writable, 0 = read-only |
| A = Accessed (CPU sets this when segment is used) |
| |
| Common Access values: |
| 0x9A = 10011010 = Present, Ring 0, Code, Readable |
| 0x92 = 10010010 = Present, Ring 0, Data, Writable |
| 0xFA = 11111010 = Present, Ring 3, Code, Readable (user) |
| 0xF2 = 11110010 = Present, Ring 3, Data, Writable (user) |
| |
| Flags (Byte 6, upper nibble): |
| ┌───┬───┬───┬───┐ |
| │ G │D/B│ L │AVL│ Bit numbers: 7 6 5 4 |
| └───┴───┴───┴───┘ |
| |
| G = Granularity (0 = 1 byte, 1 = 4KB units for limit) |
| D/B = Default operation size (1 = 32-bit, 0 = 16-bit) |
| L = Long mode (1 = 64-bit code, 0 = 32-bit) - not used here |
| AVL = Available for system software |
| |
| Common Flag values: |
| 0xC = 1100 = 4KB granularity, 32-bit mode |
| 0x4 = 0100 = 1-byte granularity, 32-bit mode |
| |
+-----------------------------------------------------------------------+
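To see how the split base and limit fields fit together, the sketch below packs a descriptor into a single 64-bit value following the byte layout in the box above. The function name `gdt_encode` is an assumption for illustration, and the code is meant to be compiled on a host to check values, not linked into the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Pack base, 20-bit limit, access byte, and 4-bit flags into one descriptor. */
static uint64_t gdt_encode(uint32_t base, uint32_t limit,
                           uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF); /* limit is a 20-bit field */
    assert(flags <= 0xF);     /* flags is a 4-bit field  */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             /* bytes 0-1: limit 15:0     */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* bytes 2-4: base 23:0      */
    d |= (uint64_t)access << 40;                 /* byte 5: access byte       */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* byte 6 low: limit 19:16   */
    d |= (uint64_t)flags << 52;                  /* byte 6 high: G, D/B, L, AVL */
    d |= (uint64_t)(base >> 24) << 56;           /* byte 7: base 31:24        */
    return d;
}
```

Calling `gdt_encode(0, 0xFFFFF, 0x9A, 0xC)` produces the flat ring-0 code descriptor, and swapping the access byte for 0x92 produces the matching data descriptor.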
Mode Switching Procedure
+-----------------------------------------------------------------------+
| REAL MODE TO PROTECTED MODE SWITCH |
+-----------------------------------------------------------------------+
| |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 1: Disable Interrupts (CLI) │ |
| │ - Interrupts must be disabled during transition │ |
| │ - IDT not set up yet; interrupt would crash │ |
| └─────────────────────────┬───────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 2: Enable A20 Line │ |
| │ - Allow access to memory above 1MB │ |
| │ - Multiple methods: BIOS INT 0x15, keyboard controller, etc. │ |
| └─────────────────────────┬───────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 3: Load GDT (LGDT instruction) │ |
| │ - Point GDTR to your GDT structure │ |
| │ - Format: 16-bit limit, 32-bit base address │ |
| └─────────────────────────┬───────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 4: Set CR0.PE bit (enable protected mode) │ |
| │ mov eax, cr0 │ |
| │ or eax, 1 │ |
| │ mov cr0, eax │ |
| └─────────────────────────┬───────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 5: Far Jump to 32-bit Code (flush CPU pipeline) │ |
| │ jmp 0x08:protected_mode_entry │ |
| │ - 0x08 = code segment selector │ |
| │ - Flushes prefetch queue, loads CS with selector │ |
| └─────────────────────────┬───────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 6: Set Up Segment Registers [BITS 32] │ |
| │ mov ax, 0x10 ; Data segment selector │ |
| │ mov ds, ax │ |
| │ mov es, ax │ |
| │ mov fs, ax │ |
| │ mov gs, ax │ |
| │ mov ss, ax │ |
| └─────────────────────────┬───────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 7: Set Up Stack │ |
| │ mov esp, 0x90000 ; Stack pointer (high address) │ |
| └─────────────────────────┬───────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Step 8: Call C Kernel │ |
| │ call kernel_main ; Jump to C code! │ |
| └─────────────────────────────────────────────────────────────────┘ |
| |
+-----------------------------------------------------------------------+
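Step 4 uses OR rather than a plain move so that every other CR0 bit keeps its value. The sketch below models that on a plain integer; it does not touch the real register. The names `CR0_PE`, `CR0_PG`, `cr0_enable_pe`, and `cr0_in_protected_mode` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE (1u << 0)   /* Protection Enable */
#define CR0_PG (1u << 31)  /* Paging; left untouched in this project */

/* Model of "or eax, 1": set PE and leave every other bit as it was. */
static uint32_t cr0_enable_pe(uint32_t cr0)
{
    uint32_t out = cr0 | CR0_PE;
    assert((out & ~CR0_PE) == (cr0 & ~CR0_PE)); /* other bits unchanged */
    return out;
}

static int cr0_in_protected_mode(uint32_t cr0)
{
    return (cr0 & CR0_PE) != 0;
}
```

In the kernel itself the read and write still have to be done with `mov eax, cr0` and `mov cr0, eax`, since CR0 is not reachable from portable C.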
2.2 Why This Matters
Protected mode is the foundation of all modern x86 operating systems:
- Memory protection: Prevent processes from corrupting each other
- Privilege separation: Kernel runs at Ring 0, user code at Ring 3
- Virtual memory support: Paging builds on protected mode
- 32-bit addressing: Access 4GB of RAM (vs. 1MB in real mode)
- Modern instruction set: Full use of 80386+ instructions
2.3 Historical Context
+-----------------------------------------------------------------------+
| EVOLUTION OF x86 MODES |
+-----------------------------------------------------------------------+
| |
| 1978: 8086 - Real Mode Only |
| ├── 16-bit registers, 1MB addressing |
| └── Segment:offset addressing |
| |
| 1982: 80286 - Protected Mode Introduced |
| ├── 24-bit addressing (16MB) |
| ├── Privilege levels and memory protection |
| └── No way to return to real mode without reset! |
| |
| 1985: 80386 - Modern Protected Mode |
| ├── 32-bit registers and addressing (4GB) |
| ├── Paging support added |
| ├── Virtual 8086 mode for DOS compatibility |
| └── Can switch between modes freely |
| |
| 2003: AMD64 - Long Mode (64-bit) |
| ├── 64-bit addressing (16 exabytes theoretical) |
| ├── Compatibility mode for 32-bit code |
| └── Legacy mode keeps real, protected, and virtual-8086 modes |
| |
| Modern systems: |
| Boot: Real Mode → Protected Mode → Long Mode (64-bit) |
| |
+-----------------------------------------------------------------------+
2.4 Common Misconceptions
| Misconception | Reality |
|---|---|
| “GDT entries must describe real memory regions” | No, with flat model all segments have base=0, limit=4GB |
| “You need many GDT entries” | Minimum is 3 (null, code, data); more for TSS, user mode |
| “Protected mode is automatic after setting CR0” | You must also do a far jump and reload segment registers |
| “BIOS calls still work” | No, BIOS is 16-bit; you need your own drivers now |
| “Paging is required in protected mode” | Paging is optional; we use pure segmentation first |
3. Project Specification
3.1 What You Will Build
A two-stage bootloader and kernel that:
- Stage 1 (512 bytes): Boot sector that loads Stage 2 and jumps to it
- Stage 2 (assembly): Enables A20, loads the GDT, switches to protected mode, sets up the 32-bit environment, and calls C code
- C Kernel: Main kernel code that outputs to VGA text mode
3.2 Functional Requirements
| Requirement | Description |
|---|---|
| FR-1 | Create a valid GDT with null, code, and data segments |
| FR-2 | Successfully transition from real mode to protected mode |
| FR-3 | Execute C code in 32-bit protected mode |
| FR-4 | Display output using VGA text mode (direct 0xB8000 access) |
| FR-5 | Work in QEMU with the multiboot specification (optional) |
| FR-6 | Build with a cross-compiler (i686-elf-gcc) |
3.3 Non-Functional Requirements
| Requirement | Description |
|---|---|
| NFR-1 | Code organized into separate assembly and C files |
| NFR-2 | Use a linker script to control memory layout |
| NFR-3 | Makefile for reproducible builds |
| NFR-4 | Comments explaining each GDT entry field |
3.4 Example Usage / Output
# Build the kernel
$ make
nasm -f elf32 boot.asm -o boot.o
i686-elf-gcc -c kernel.c -o kernel.o -ffreestanding -O2 -Wall
i686-elf-ld -T linker.ld -o kernel.elf boot.o kernel.o
i686-elf-objcopy -O binary kernel.elf kernel.bin
# Create bootable disk image
$ dd if=/dev/zero of=disk.img bs=512 count=2880
$ dd if=kernel.bin of=disk.img conv=notrunc
# Run in QEMU
$ qemu-system-i386 -kernel kernel.elf
# Or with disk image:
$ qemu-system-i386 -drive format=raw,file=disk.img
# Screen output:
================================================================================
MyOS v0.1 - Protected Mode Kernel
================================================================================
[OK] GDT loaded at 0x00000800
[OK] Switched to 32-bit protected mode
[OK] VGA driver initialized (80x25 color text mode)
[OK] C kernel executing at 0x00100000
Welcome to protected mode!
This text is written directly to VGA memory at 0xB8000.
Free memory: 3,932,160 bytes (kernel ends at 0x00101000)
3.5 Real World Outcome
After completing this project, you will have:
- Working 32-bit kernel that runs without any OS support
- Understanding of GDT that applies to all x86 OS development
- Cross-compilation skills essential for embedded/OS work
- Foundation for adding: interrupts (Project 7), paging (Project 8), multitasking
4. Solution Architecture
4.1 High-Level Design
+-----------------------------------------------------------------------+
| PROTECTED MODE KERNEL STRUCTURE |
+-----------------------------------------------------------------------+
| |
| Disk Layout: |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Sector 1: Boot Sector (512 bytes) │ |
| │ - Loaded by BIOS at 0x7C00 │ |
| │ - Loads remaining sectors │ |
| │ - Jumps to second stage │ |
| ├─────────────────────────────────────────────────────────────────┤ |
| │ Sectors 2-N: Kernel Image │ |
| │ - 32-bit protected mode code │ |
| │ - GDT definition │ |
| │ - Mode switch code │ |
| │ - C kernel code │ |
| └─────────────────────────────────────────────────────────────────┘ |
| |
| Memory Layout After Boot: |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ 0x00000000 - 0x000003FF: Interrupt Vector Table (unused now) │ |
| │ 0x00000400 - 0x000004FF: BIOS Data Area (unused now) │ |
| │ 0x00000500 - 0x00007BFF: Free (for stack, temp data) │ |
| │ 0x00007C00 - 0x00007DFF: Boot sector (512 bytes) │ |
| │ 0x00007E00 - 0x0000FFFF: Free or loaded second stage │ |
| │ 0x00010000 - 0x0009FFFF: Kernel loaded here (typical) │ |
| │ 0x000A0000 - 0x000BFFFF: VGA Memory │ |
| │ 0x000B8000 - 0x000B8F9F: VGA Text Mode Buffer (4000 bytes) │ |
| │ 0x000C0000 - 0x000FFFFF: BIOS ROM (read-only) │ |
| │ 0x00100000+: Extended memory (kernel can be here once A20 is on) │ |
| └─────────────────────────────────────────────────────────────────┘ |
| |
| Alternative: Use Multiboot so that GRUB loads the kernel at 0x00100000 |
| |
+-----------------------------------------------------------------------+
4.2 Key Components
+-----------------------------------------------------------------------+
| COMPONENT INTERACTION |
+-----------------------------------------------------------------------+
| |
| ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ |
| │ boot.asm │ │ setup.asm │ │ kernel.c │ |
| │ (16-bit) │────▶│ (16/32-bit) │────▶│ (32-bit C) │ |
| │ │ │ │ │ │ |
| │ - Load sectors │ │ - Define GDT │ │ - VGA output │ |
| │ - Jump to setup │ │ - Enable A20 │ │ - Kernel main │ |
| │ │ │ - Switch mode │ │ - String funcs │ |
| │ │ │ - Set up stack │ │ │ |
| │ │ │ - Call kernel │ │ │ |
| └─────────────────┘ └─────────────────┘ └─────────────────┘ |
| |
| Files: |
| - boot.asm: First stage bootloader (512 bytes) |
| - setup.asm: Second stage, GDT setup, mode switch |
| - kernel.c: Main C kernel |
| - vga.c/h: VGA text mode driver |
| - linker.ld: Linker script for memory layout |
| |
+-----------------------------------------------------------------------+
4.3 Data Structures
GDT Structure
// GDT Entry (8 bytes)
struct gdt_entry {
uint16_t limit_low; // Limit bits 0-15
uint16_t base_low; // Base bits 0-15
uint8_t base_middle; // Base bits 16-23
uint8_t access; // Access flags
uint8_t granularity; // Flags + limit bits 16-19
uint8_t base_high; // Base bits 24-31
} __attribute__((packed));
// GDT Pointer (6 bytes)
struct gdt_ptr {
uint16_t limit; // Size of GDT - 1
uint32_t base; // Address of GDT
} __attribute__((packed));
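A short way to confirm that the packed structs really are 8 and 6 bytes, and to fill them without hand-computing each field, is sketched below. It repeats the two struct definitions so it compiles on its own; `gdt_set_entry` and `gdt_limit` are hypothetical helper names, and the `packed` attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct gdt_entry {
    uint16_t limit_low;
    uint16_t base_low;
    uint8_t  base_middle;
    uint8_t  access;
    uint8_t  granularity;
    uint8_t  base_high;
} __attribute__((packed));

struct gdt_ptr {
    uint16_t limit;
    uint32_t base;
} __attribute__((packed));

_Static_assert(sizeof(struct gdt_entry) == 8, "GDT entry must be 8 bytes");
_Static_assert(sizeof(struct gdt_ptr) == 6, "GDT pointer must be 6 bytes");

/* Split base and limit across the fields the CPU expects. */
static void gdt_set_entry(struct gdt_entry *e, uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF && flags <= 0xF);
    e->limit_low   = (uint16_t)(limit & 0xFFFF);
    e->base_low    = (uint16_t)(base & 0xFFFF);
    e->base_middle = (uint8_t)((base >> 16) & 0xFF);
    e->access      = access;
    e->granularity = (uint8_t)((flags << 4) | ((limit >> 16) & 0x0F));
    e->base_high   = (uint8_t)((base >> 24) & 0xFF);
}

/* The GDTR limit is the table size in bytes minus one. */
static uint16_t gdt_limit(size_t entry_count)
{
    assert(entry_count >= 1 && entry_count <= 8192);
    return (uint16_t)(entry_count * sizeof(struct gdt_entry) - 1);
}
```

For the three-entry table in this project (null, code, data), `gdt_limit(3)` gives the value that belongs in the `limit` field of `struct gdt_ptr`.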
VGA Text Mode Character
// VGA character entry (2 bytes)
// At 0xB8000: 80 columns x 25 rows = 2000 characters = 4000 bytes
// Byte 0: ASCII character
// Byte 1: Attribute (foreground/background colors)
// Bits 0-3: Foreground color
// Bits 4-6: Background color
// Bit 7: Blink (if enabled)
#define VGA_COLOR_BLACK 0
#define VGA_COLOR_BLUE 1
#define VGA_COLOR_GREEN 2
#define VGA_COLOR_CYAN 3
#define VGA_COLOR_RED 4
#define VGA_COLOR_MAGENTA 5
#define VGA_COLOR_BROWN 6
#define VGA_COLOR_LIGHT_GREY 7
#define VGA_COLOR_DARK_GREY 8
#define VGA_COLOR_LIGHT_BLUE 9
#define VGA_COLOR_LIGHT_GREEN 10
#define VGA_COLOR_LIGHT_CYAN 11
#define VGA_COLOR_LIGHT_RED 12
#define VGA_COLOR_LIGHT_MAGENTA 13
#define VGA_COLOR_LIGHT_BROWN 14
#define VGA_COLOR_WHITE 15
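Given the attribute layout described in the comments above, the attribute byte and the full 16-bit cell can be built with two small helpers. This is a sketch with hypothetical names (`vga_attr`, `vga_cell`); it limits the background to 0-7 on the assumption that bit 7 is left for blink.

```c
#include <assert.h>
#include <stdint.h>

/* Attribute byte: bits 0-3 foreground, bits 4-6 background, bit 7 blink. */
static uint8_t vga_attr(uint8_t fg, uint8_t bg)
{
    assert(fg <= 15 && bg <= 7); /* keep bit 7 (blink) clear */
    return (uint8_t)(fg | (bg << 4));
}

/* Cell: low byte is the character, high byte is the attribute. */
static uint16_t vga_cell(unsigned char ch, uint8_t attr)
{
    return (uint16_t)(ch | ((uint16_t)attr << 8));
}
```

Writing `vga_cell('A', vga_attr(15, 0))` to the first 16-bit word of the text buffer would show a white "A" on black in the top-left corner.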
4.4 Algorithm Overview
+-----------------------------------------------------------------------+
| EXECUTION FLOW |
+-----------------------------------------------------------------------+
| |
| BIOS loads boot sector (0x7C00) |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Boot Sector (16-bit Real Mode) │ |
| │ 1. Set up segments (DS, ES, SS) │ |
| │ 2. Save boot drive (DL) │ |
| │ 3. Load remaining sectors (INT 0x13) │ |
| │ 4. Jump to loaded code │ |
| └────────────────────────────┬────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ Setup Code (16-bit → 32-bit transition) │ |
| │ 1. CLI (disable interrupts) │ |
| │ 2. Enable A20 line │ |
| │ 3. Load GDT (LGDT gdt_descriptor) │ |
| │ 4. Set CR0.PE = 1 │ |
| │ 5. Far jump to 32-bit code (jmp 0x08:start32) │ |
| │ 6. [Now in 32-bit mode] │ |
| │ 7. Load data segment registers with 0x10 │ |
| │ 8. Set up stack (ESP) │ |
| │ 9. Call kernel_main() │ |
| └────────────────────────────┬────────────────────────────────────┘ |
| │ |
| ▼ |
| ┌─────────────────────────────────────────────────────────────────┐ |
| │ C Kernel (32-bit Protected Mode) │ |
| │ 1. Initialize VGA driver │ |
| │ 2. Clear screen │ |
| │ 3. Print welcome message │ |
| │ 4. Initialize other subsystems │ |
| │ 5. Enter main loop or halt │ |
| └─────────────────────────────────────────────────────────────────┘ |
| |
+-----------------------------------------------------------------------+
5. Implementation Guide
5.1 Development Environment Setup
Build Cross-Compiler (i686-elf-gcc)
This is the most important step. You need a cross-compiler that targets bare-metal x86:
# Install build dependencies
# On Ubuntu/Debian:
sudo apt install build-essential bison flex libgmp3-dev libmpc-dev \
libmpfr-dev texinfo libisl-dev
# On macOS:
brew install gmp mpfr libmpc
# Set up directories
export PREFIX="$HOME/opt/cross"
export TARGET=i686-elf
export PATH="$PREFIX/bin:$PATH"
mkdir -p $HOME/src
cd $HOME/src
# Download and build binutils
wget https://ftp.gnu.org/gnu/binutils/binutils-2.41.tar.xz
tar xf binutils-2.41.tar.xz
mkdir build-binutils && cd build-binutils
../binutils-2.41/configure --target=$TARGET --prefix="$PREFIX" \
--with-sysroot --disable-nls --disable-werror
make -j$(nproc)
make install
cd ..
# Download and build GCC
wget https://ftp.gnu.org/gnu/gcc/gcc-13.2.0/gcc-13.2.0.tar.xz
tar xf gcc-13.2.0.tar.xz
mkdir build-gcc && cd build-gcc
../gcc-13.2.0/configure --target=$TARGET --prefix="$PREFIX" \
--disable-nls --enable-languages=c,c++ --without-headers
make -j$(nproc) all-gcc all-target-libgcc
make install-gcc install-target-libgcc
# Verify
i686-elf-gcc --version
Alternative: Use pre-built toolchains from osdev.org or Docker images.
5.2 Project Structure
protected-mode-kernel/
├── Makefile
├── linker.ld
├── src/
│ ├── boot.asm # First-stage bootloader
│ ├── setup.asm # Mode switch and GDT
│ └── kernel.c # Main C kernel
├── include/
│ └── vga.h # VGA constants
├── iso/ # For GRUB boot (optional)
│ └── boot/
│ └── grub/
│ └── grub.cfg
└── README.md
Linker Script (linker.ld)
/* Linker script for protected mode kernel */
ENTRY(_start)
SECTIONS
{
/* Kernel loaded at 1MB for multiboot, or lower for custom boot */
. = 0x00100000; /* 1 MB mark */
.text BLOCK(4K) : ALIGN(4K)
{
*(.multiboot) /* Multiboot header first */
*(.text.boot) /* Then entry point */
*(.text) /* Then all other code */
}
.rodata BLOCK(4K) : ALIGN(4K)
{
*(.rodata)
}
.data BLOCK(4K) : ALIGN(4K)
{
*(.data)
}
.bss BLOCK(4K) : ALIGN(4K)
{
*(COMMON)
*(.bss)
}
_kernel_end = .;
}
5.3 The Core Question You’re Answering
“How do you transition from the CPU’s initial 16-bit mode to the full-featured 32-bit mode that real operating systems use?”
This involves understanding:
- Why segmentation exists and how GDT works
- What the CPU needs to switch modes
- How to write code for a CPU without any OS support
- The calling conventions between assembly and C
5.4 Concepts You Must Understand First
| Concept | Self-Assessment Question | Book Reference |
|---|---|---|
| GDT Structure | What goes in bytes 5-6 of a descriptor? | Intel SDM Vol 3, Ch 3 |
| Segment Selectors | Why is the code segment 0x08 and data 0x10? | OSDev Wiki - GDT |
| Control Registers | What does CR0 bit 0 (PE) do? | Intel SDM Vol 3, Ch 2 |
| Flat Memory Model | How do you make segments “invisible”? | OS Three Easy Pieces |
| C Calling Convention | How are function arguments passed in cdecl? | System V i386 ABI |
5.5 Questions to Guide Your Design
GDT Design:
- How many entries do you need for a minimal kernel?
- What base and limit should code and data segments have?
- Why is the null descriptor required?
Mode Switching:
- Why must you disable interrupts before switching?
- What’s special about the far jump after setting CR0.PE?
- Why reload segment registers immediately after the jump?
Linking C and Assembly:
- What symbol naming conventions does your C compiler use?
- How do you export symbols from assembly?
- What stack layout does your C compiler expect?
VGA Output:
- How do you calculate the address of character at (row, col)?
- What attribute byte gives white text on blue background?
- How do you scroll the screen when it fills up?
5.6 Thinking Exercise
Before coding, manually trace through this GDT setup:
gdt_start:
dq 0 ; Null descriptor
gdt_code:
dw 0xFFFF ; Limit (low)
dw 0x0000 ; Base (low)
db 0x00 ; Base (middle)
db 10011010b ; Access byte
db 11001111b ; Flags + Limit (high)
db 0x00 ; Base (high)
gdt_data:
dw 0xFFFF ; Limit (low)
dw 0x0000 ; Base (low)
db 0x00 ; Base (middle)
db 10010010b ; Access byte
db 11001111b ; Flags + Limit (high)
db 0x00 ; Base (high)
gdt_end:
gdt_descriptor:
dw gdt_end - gdt_start - 1 ; Size
dd gdt_start ; Address
Questions:
- What is the base address of the code segment?
- What is the limit in bytes (considering granularity)?
- Is the code segment readable? How do you know?
- What privilege level (ring) is this for?
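Once you have traced the descriptor by hand, a small decoder can confirm your answers. The sketch below takes the 8 descriptor bytes in memory order and reassembles the fields; the struct and function names are hypothetical, and it is intended to run on a host, with the bytes copied from your own GDT listing.

```c
#include <assert.h>
#include <stdint.h>

struct seg_info {
    uint32_t base;
    uint32_t limit_raw;  /* 20-bit limit field as stored */
    uint32_t last_byte;  /* highest valid offset after applying G */
    unsigned dpl;
    int present, executable, rw, gran_4k, is_32bit;
};

/* d[0] is the first byte in memory (limit low), d[7] the last (base high). */
static struct seg_info gdt_decode(const uint8_t d[8])
{
    struct seg_info s;
    assert(d != 0);
    s.base = (uint32_t)d[2] | ((uint32_t)d[3] << 8) |
             ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
    s.limit_raw = (uint32_t)d[0] | ((uint32_t)d[1] << 8) |
                  ((uint32_t)(d[6] & 0x0F) << 16);
    s.present    = (d[5] >> 7) & 1;
    s.dpl        = (d[5] >> 5) & 3u;
    s.executable = (d[5] >> 3) & 1;
    s.rw         = (d[5] >> 1) & 1;
    s.gran_4k    = (d[6] >> 7) & 1;
    s.is_32bit   = (d[6] >> 6) & 1;
    /* With G = 1 the limit counts 4 KB pages, so the low 12 bits are all ones. */
    s.last_byte  = s.gran_4k ? ((s.limit_raw << 12) | 0xFFFu) : s.limit_raw;
    return s;
}
```

Feed it the bytes of `gdt_code` in the order NASM emits them and compare each field with your hand-traced answers.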
5.7 Hints in Layers
Hint 1: Starting Point (Conceptual Direction)
You need:
- Boot sector that loads more code
- GDT with flat memory model (base=0, limit=4GB)
- Assembly routine that: CLI, enable A20, load GDT, set CR0.PE, far jump
- 32-bit setup: reload segments, set stack, call C
Hint 2: Next Level (More Specific Guidance)
Assembly structure (setup.asm):
[BITS 16]
; ... 16-bit code to enable A20, load GDT, set CR0 ...
[BITS 32]
start_protected:
; Set up segments and stack
; Call C function
; GDT goes here
; GDT descriptor goes here
C kernel structure (kernel.c):
void kernel_main(void) {
    // VGA text buffer is at 0xB8000
    // Each character cell is 2 bytes (char + attribute)
    // volatile keeps the compiler from optimizing the stores away
    volatile char* video = (volatile char*) 0xB8000;
    video[0] = 'H';
    video[1] = 0x0F; // White on black
    // ...
}
Hint 3: Technical Details (Approach/Pseudocode)
Enable A20 (fast method):
enable_a20:
    in al, 0x92
    or al, 2 ; Set the A20 gate bit
    and al, 0xFE ; Keep bit 0 clear; writing 1 there resets the CPU
    out 0x92, al
    ret
Mode switch:
[BITS 16]
switch_to_pm:
    cli ; No IDT yet, so keep interrupts off
    call enable_a20
    lgdt [gdt_descriptor]
    mov eax, cr0
    or eax, 1 ; Set CR0.PE
    mov cr0, eax
    jmp 0x08:init_pm ; Far jump!
[BITS 32]
init_pm:
    mov ax, 0x10 ; Data segment
    mov ds, ax
    mov ss, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov esp, 0x90000 ; Stack
    call kernel_main ; C entry point
.hang:
    hlt ; Halt if kernel_main ever returns
    jmp .hang
Hint 4: Tools/Debugging (Verification Methods)
# Check that the GDT is loaded using the QEMU monitor
# Press Ctrl+Alt+2 to access the monitor
info registers # The GDT= line shows the GDTR base and limit
# Debug with GDB
# Boot the disk image here so that the boot sector is actually at 0x7C00
qemu-system-i386 -drive format=raw,file=disk.img -s -S &
gdb -ex "target remote :1234" \
-ex "set architecture i386" \
-ex "break *0x7c00" \
-ex "continue"
# Step through mode switch
(gdb) break *0x7c50 # wherever switch happens
(gdb) info registers
(gdb) p/x $cr0
# View VGA memory
(gdb) x/20hx 0xB8000
5.8 The Interview Questions They’ll Ask
- “Explain the GDT and what happens during LGDT”
- GDT defines memory segments (base, limit, permissions)
- LGDT loads the GDTR register with table location
- Segment registers then hold selectors into this table
- “Why is the far jump necessary after setting CR0.PE?”
- CPU has prefetch queue with 16-bit decoded instructions
- Far jump flushes the queue and reloads CS
- Without it, CPU would execute garbage
- “What’s the difference between flat and segmented models?”
- Segmented: Different base addresses for different segments
- Flat: All segments have base=0, limit=4GB
- Flat is standard for modern OSes (Linux, Windows)
- “How does VGA text mode work?”
- Memory-mapped I/O at 0xB8000
- 2 bytes per character: ASCII + attribute
- 80 columns x 25 rows = 2000 cells = 4000 bytes total
- Hardware automatically displays this memory
- “Why use a cross-compiler instead of native GCC?”
- Native GCC targets host OS with libraries
- Cross-compiler produces freestanding code
- No C library calls, no OS dependencies
- Proper calling conventions for bare metal
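For the VGA text mode answer above, it helps to be able to state the address arithmetic exactly. The sketch below computes the byte address of the cell at a given row and column; the macro and function names are hypothetical, and the 80x25 geometry is the one this project assumes.

```c
#include <assert.h>
#include <stdint.h>

#define VGA_BASE 0xB8000u
#define VGA_COLS 80u
#define VGA_ROWS 25u

/* Each cell is 2 bytes, and rows are stored one after another. */
static uint32_t vga_cell_addr(unsigned row, unsigned col)
{
    assert(row < VGA_ROWS && col < VGA_COLS);
    return VGA_BASE + (row * VGA_COLS + col) * 2u;
}
```

The last cell, `vga_cell_addr(24, 79)`, starts two bytes before the end of the 4000-byte buffer.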
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| GDT and Protected Mode | Intel SDM Volume 3 | Chapter 3 |
| Segmentation | “Operating Systems: Three Easy Pieces” | Chapter 16 |
| x86 Architecture | “Write Great Code, Volume 2” | Chapters 3-4 |
| OS Development | “Operating Systems: From 0 to 1” | Chapters 3-4 |
| VGA Programming | OSDev Wiki | Text Mode section |
5.10 Implementation Phases
Phase 1: Cross-Compiler Setup (1-2 days)
- Build or obtain i686-elf-gcc
- Verify with simple freestanding C compilation
- Test linking with NASM object files
Phase 2: Boot Sector Enhancement (2-3 days)
- Modify Project 5 bootloader to load more sectors
- Test loading code to 0x10000 or similar
Phase 3: GDT and Mode Switch (3-4 days)
- Define GDT entries
- Implement A20 enable
- Write mode switch code
- Verify 32-bit execution
Phase 4: C Kernel and VGA (3-4 days)
- Write linker script
- Implement VGA text output
- Create print functions
- Add formatting (colors, positions)
Phase 5: Polish and Documentation (2-3 days)
- Clean up code
- Add comments
- Write build instructions
- Test on multiple QEMU versions
5.11 Key Implementation Decisions
| Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Boot method | Custom bootloader | Multiboot/GRUB | Custom for learning, Multiboot for features |
| Kernel load address | 0x10000 (below 1MB) | 0x100000 (1MB+) | 0x10000 initially (simpler); 1MB+ needs A20 enabled, not paging |
| A20 enable method | BIOS INT 0x15 | Fast A20 (0x92) | Fast A20 is simpler, works on most hardware |
| VGA driver | Direct memory | Abstraction layer | Abstraction for clean code |
6. Testing Strategy
Verification Points
#!/bin/bash
# test_protected_mode.sh
echo "Test 1: Verify kernel binary"
file kernel.elf
# Should show: ELF 32-bit LSB executable, Intel 80386
echo "Test 2: Check multiboot header (if using)"
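# The linker script folds .multiboot into .text, so check the header itself
grub-file --is-x86-multiboot kernel.elf && echo "multiboot header found"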
echo "Test 3: Verify entry point"
readelf -h kernel.elf | grep "Entry point"
# Should match linker script
echo "Test 4: Run in QEMU with debug output"
qemu-system-i386 -kernel kernel.elf -d int 2>&1 | head -50
# Should not show unexpected interrupts
echo "Test 5: QEMU monitor GDT check"
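# -nographic already claims stdio, so use -display none with -monitor stdio
(sleep 2; echo "info registers"; echo "quit") | \
    qemu-system-i386 -kernel kernel.elf -display none -monitor stdio | grep -E "GDT|IDT"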
GDB Test Session
# Terminal 1: Start QEMU with GDB server
qemu-system-i386 -kernel kernel.elf -s -S -nographic
# Terminal 2: Connect GDB
gdb kernel.elf
(gdb) target remote :1234
(gdb) break kernel_main
(gdb) continue
# Should break in C code
(gdb) info registers
# EIP should be in kernel code
(gdb) x/20hx 0xB8000
# Should show your VGA output
7. Common Pitfalls & Debugging
Pitfall 1: GDT Not Aligned or Wrong Size
Symptom: Triple fault at the far jump or the first segment register load after LGDT (LGDT itself does not validate the table).
Cause: GDT descriptor has wrong size or address.
Fix:
gdt_descriptor:
dw gdt_end - gdt_start - 1 ; Size - 1 (not size!)
dd gdt_start ; Linear address, not segment:offset
Pitfall 2: Missing Far Jump After CR0.PE
Symptom: Strange crashes or garbage execution.
Cause: CS still holds its real-mode value and 16-bit default operand size, so code assembled with [BITS 32] is decoded as 16-bit instructions.
Fix:
mov eax, cr0
or eax, 1
mov cr0, eax
jmp 0x08:start32 ; MUST be far jump with code segment selector
Pitfall 3: Wrong Segment Selector Values
Symptom: General protection fault.
Cause: Using GDT index instead of selector value.
Fix:
; Selector = (Index * 8) | TI | RPL
; For GDT entry 1 (code): 1 * 8 = 0x08
; For GDT entry 2 (data): 2 * 8 = 0x10
mov ax, 0x10 ; Not 0x02!
mov ds, ax
Pitfall 4: Calling C Without Proper Stack
Symptom: Crash when C function executes or returns.
Cause: Stack pointer not set or in wrong location.
Fix:
; Before calling C
    mov esp, 0x90000 ; Pick an address in usable memory; the stack grows down
    call kernel_main ; CALL pushes the return address itself
.hang:
    hlt ; kernel_main should not return; halt if it does
    jmp .hang
Pitfall 5: A20 Line Not Enabled
Symptom: Memory access wraps at 1MB boundary.
Cause: A20 line still disabled.
Fix:
; Fast A20 method (works on most hardware)
in al, 0x92
test al, 2
jnz .a20_done
or al, 2
and al, 0xFE ; Don't accidentally reset!
out 0x92, al
.a20_done:
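The wraparound symptom is easier to recognize once you see what a disabled A20 line does to an address: bit 20 is forced to zero on the way to memory. The following sketch models that masking on a host; `bus_address` is a hypothetical name and the model ignores everything except address bit 20.

```c
#include <assert.h>
#include <stdint.h>

#define A20_BIT (1u << 20)

/* With A20 disabled, address bit 20 is forced to 0 before reaching memory. */
static uint32_t bus_address(uint32_t addr, int a20_enabled)
{
    assert(a20_enabled == 0 || a20_enabled == 1);
    return a20_enabled ? addr : (addr & ~A20_BIT);
}
```

With A20 off, a write to 0x100000 lands at 0x000000, which is one way to test the line: write different values to the two addresses and check whether they alias.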
Debugging with QEMU Monitor
# Press Ctrl+Alt+2 for QEMU monitor
# Useful commands:
info registers # Show CPU registers, including the GDT= and IDT= base and limit
x/10i $eip # Disassemble at EIP
xp /20x 0xB8000 # Show VGA memory
log cpu_reset # Log CPU resets
8. Extensions & Challenges
Extension 1: Add Color Printing (Easy)
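#include <stddef.h>
#include <stdint.h>
enum vga_color {
    VGA_BLACK = 0, VGA_BLUE = 1, /* ... */
};
void print_colored(const char* str, uint8_t fg, uint8_t bg) {
    uint8_t color = fg | (bg << 4);
    volatile uint16_t* vga = (volatile uint16_t*) 0xB8000;
    for (size_t i = 0; str[i] != '\0'; i++) {
        // Cast through unsigned char so bytes above 0x7F are not sign-extended
        vga[i] = (uint16_t) ((unsigned char) str[i] | (color << 8));
    }
}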
Extension 2: Scrolling Console (Medium)
void scroll(void) {
volatile uint16_t* vga = (uint16_t*) 0xB8000;
// Move lines 1-24 to lines 0-23
for (int i = 0; i < 24 * 80; i++) {
vga[i] = vga[i + 80];
}
// Clear line 24
for (int i = 24 * 80; i < 25 * 80; i++) {
vga[i] = ' ' | (0x0F << 8);
}
}
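To turn the scroll routine into a usable console, you also need a cursor that advances, wraps at column 80, and scrolls when it runs off the last row. One possible shape is sketched below. The `struct console` type and the `con_*` names are hypothetical; the buffer pointer is a parameter so the same code can point at 0xB8000 in the kernel or at an ordinary array when testing on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CON_COLS 80
#define CON_ROWS 25

struct console {
    volatile uint16_t *buf; /* 0xB8000 in the kernel; any array on a host */
    int row, col;
    uint8_t attr;
};

/* Move rows 1-24 up by one, blank the last row, park the cursor there. */
static void con_scroll(struct console *c)
{
    for (int i = 0; i < (CON_ROWS - 1) * CON_COLS; i++)
        c->buf[i] = c->buf[i + CON_COLS];
    for (int i = (CON_ROWS - 1) * CON_COLS; i < CON_ROWS * CON_COLS; i++)
        c->buf[i] = (uint16_t)(' ' | (c->attr << 8));
    c->row = CON_ROWS - 1;
}

static void con_putc(struct console *c, char ch)
{
    assert(c != NULL && c->buf != NULL);
    if (ch == '\n') {
        c->col = 0;
        c->row++;
    } else {
        c->buf[c->row * CON_COLS + c->col] =
            (uint16_t)((unsigned char)ch | (c->attr << 8));
        if (++c->col == CON_COLS) { /* wrap at the right edge */
            c->col = 0;
            c->row++;
        }
    }
    if (c->row == CON_ROWS)
        con_scroll(c);
}

static void con_puts(struct console *c, const char *s)
{
    while (*s)
        con_putc(c, *s++);
}
```

Passing the buffer in, rather than hard-coding 0xB8000, is a design choice that makes the cursor logic testable without booting QEMU.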
Extension 3: Printf Implementation (Medium)
Implement a basic printf for formatted output. Handle %d, %x, %s, %c.
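The core of %d and %x is integer-to-text conversion, since there is no C library to call. A minimal sketch of that piece follows; the names `fmt_uint` and `fmt_int` are hypothetical, lowercase hex digits are an arbitrary choice, and the `assert` calls assume a host build (a kernel build would need its own check or none).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write value in the given base (2..16) into out; returns the digit count. */
static size_t fmt_uint(char *out, size_t cap, uint32_t value, unsigned base)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[32]; /* 32 binary digits is the worst case */
    size_t n = 0;

    assert(out != NULL && base >= 2 && base <= 16);
    do { /* digits come out least significant first */
        tmp[n++] = digits[value % base];
        value /= base;
    } while (value != 0);

    assert(n + 1 <= cap); /* room for the digits plus the terminator */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}

/* Signed decimal; returns the number of characters written. */
static size_t fmt_int(char *out, size_t cap, int32_t value)
{
    size_t n = 0;
    uint32_t mag = (uint32_t)value;

    assert(out != NULL && cap >= 2);
    if (value < 0) {
        out[n++] = '-';
        mag = 0u - mag; /* negate as unsigned; safe for INT32_MIN */
    }
    return n + fmt_uint(out + n, cap - n, mag, 10);
}
```

A printf built on top of this would walk the format string, pull arguments with `<stdarg.h>` (which is available in freestanding builds), and hand each number to one of these helpers before printing the resulting string.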
Extension 4: Keyboard Input (Advanced)
Read scan codes from keyboard I/O port 0x60 (requires interrupt setup - see Project 7).
9. Real-World Connections
How This Relates to Professional Software
| This Project | Real-World Equivalent |
|---|---|
| GDT setup | Linux kernel GDT in arch/x86/kernel/cpu/common.c |
| Mode switch | Linux boot code in arch/x86/boot/pmjump.S (real mode → protected mode) |
| VGA output | Early Linux printk, BIOS console |
| Linker script | Linux arch/x86/kernel/vmlinux.lds.S |
Industry Applications
- Operating System Development: Windows, Linux, BSD kernel startup
- BIOS/UEFI Firmware: AMI, Phoenix, Intel BIOS source
- Hypervisors: VMware, Xen, KVM host code
- Embedded x86: Industrial controllers, kiosks
- Security Research: Rootkit/bootkit analysis
10. Resources
Official Documentation
- Intel Software Developer Manual Vol 3 - System Programming Guide
- System V i386 ABI - Calling conventions
11. Self-Assessment Checklist
Understanding
- I can explain each field in a GDT entry
- I understand why base=0, limit=0xFFFFF creates 4GB segments
- I know what CR0.PE does and why the far jump is needed
- I can describe the difference between 16-bit and 32-bit execution
- I understand why BIOS calls don’t work in protected mode
Implementation
- GDT has correct null, code, and data segments
- Mode switch sequence is complete (CLI, A20, LGDT, CR0.PE, far jump)
- Segment registers are reloaded after switch
- Stack is properly set up before calling C
- VGA output works correctly
Skills
- I can build with the cross-compiler
- I can debug with QEMU and GDB
- I can read and write linker scripts
- I can link assembly and C code
- I understand memory layout of my kernel
12. Submission / Completion Criteria
Your project is complete when:
- Build Requirements
- Builds with make using i686-elf-gcc
- Produces a valid ELF or binary kernel image
- Linker script controls memory layout
- Functional Requirements
- Boots in QEMU (with -kernel or custom boot)
- Successfully switches to protected mode
- Executes C code
- Displays output via VGA text mode
- Code Quality
- GDT entries are documented
- Assembly and C are properly separated
- Build process is reproducible
- Documentation
- README explains the project
- Memory map is documented
- Build requirements are listed
Verification Commands
# Build
make clean && make
# Verify ELF file
file kernel.elf
readelf -h kernel.elf
# Run
qemu-system-i386 -kernel kernel.elf
# Debug
qemu-system-i386 -kernel kernel.elf -s -S &
# Load kernel.elf so GDB can resolve the kernel_main symbol
gdb kernel.elf -ex "target remote :1234" -ex "break kernel_main" -ex "continue"
Congratulations! Completing this project means you understand the most critical transition in x86 system programming. You now have a 32-bit kernel that’s the foundation for interrupts (Project 7), paging (Project 8), and eventually a full operating system with multitasking.