Project 6: x86 Protected Mode Kernel

Build a kernel that transitions from 16-bit real mode to 32-bit protected mode, sets up a Global Descriptor Table (GDT), and executes C code with full 4GB memory access - the foundation for an operating system.


Quick Reference

Attribute      Value
-------------  ----------------------------------------------------------------------
Difficulty     Expert
Time Estimate  2-3 weeks
Language       x86 Assembly + C
Prerequisites  Project 5 (bootloader), GCC cross-compiler, basic C
Key Topics     GDT, protected mode, CR0, VGA text mode (0xB8000), calling conventions

1. Learning Objectives

By completing this project, you will:

  1. Understand x86 segmentation: Master the Global Descriptor Table and segment descriptors
  2. Perform CPU mode transitions: Switch from 16-bit real mode to 32-bit protected mode
  3. Control CPU state via control registers: Manipulate CR0 to enable protected mode
  4. Link assembly and C code: Use proper calling conventions and linker scripts
  5. Write directly to video memory: Display output via VGA text mode at 0xB8000
  6. Create a flat memory model: Set up segments for 32-bit flat addressing
  7. Build a cross-compilation toolchain: Use i686-elf-gcc for freestanding C

2. Theoretical Foundation

2.1 Core Concepts

What is Protected Mode?

Protected mode is the native 32-bit operating mode of x86 processors (80386+). It provides features essential for modern operating systems:

+-----------------------------------------------------------------------+
|                    PROTECTED MODE vs REAL MODE                         |
+-----------------------------------------------------------------------+
|                                                                        |
|  Feature              Real Mode              Protected Mode            |
|  ─────────────────────────────────────────────────────────────────── |
|  Address Space        1 MB (20-bit)          4 GB (32-bit)            |
|  Registers            16-bit                 32-bit                    |
|  Segmentation         Segment * 16 + Offset  Descriptor-based         |
|  Memory Protection    None                   Ring levels (0-3)        |
|  Paging               Not available          Optional (in CR0)        |
|  BIOS Services        INT instructions       Not available*           |
|  I/O Ports            Always accessible      Controlled by IOPL       |
|                                                                        |
|  * BIOS services are 16-bit; can't be called from 32-bit mode         |
|                                                                        |
+-----------------------------------------------------------------------+
|                                                                        |
|  Memory Addressing Comparison:                                         |
|                                                                        |
|  Real Mode:                                                            |
|    Physical Address = Segment Register * 16 + Offset                   |
|    Maximum: 0xFFFF * 16 + 0xFFFF = 0x10FFEF (~1 MB + 64 KB)           |
|                                                                        |
|  Protected Mode (Flat Model):                                          |
|    Physical Address = Linear Address (segment base = 0)                |
|    Maximum: 0xFFFFFFFF (4 GB)                                         |
|                                                                        |
+-----------------------------------------------------------------------+
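The two addressing formulas in the box can be checked with a few lines of C. This is a minimal sketch that runs on any host; the helper names `real_mode_phys` and `flat_linear` are made up for illustration, and the real-mode helper assumes the A20 line is enabled (with A20 disabled, results above 1 MB wrap around to low memory).

```c
#include <stdint.h>

/* Real mode: physical = segment * 16 + offset.
 * Hypothetical helper; assumes A20 is enabled, so no wraparound at 1 MB. */
static uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Protected mode: linear = segment base + offset.
 * In the flat model the base is 0, so the offset is the linear address. */
static uint32_t flat_linear(uint32_t base, uint32_t offset)
{
    return base + offset;
}
```

For example, `real_mode_phys(0x07C0, 0x0000)` and `real_mode_phys(0x0000, 0x7C00)` both name the boot sector at 0x7C00, while `real_mode_phys(0xFFFF, 0xFFFF)` gives the 0x10FFEF maximum shown in the box.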

The Global Descriptor Table (GDT)

The GDT is a table in memory that defines memory segments. In protected mode, segment registers hold selectors that index into the GDT:

+-----------------------------------------------------------------------+
|                    GLOBAL DESCRIPTOR TABLE                             |
+-----------------------------------------------------------------------+
|                                                                        |
|  Memory Layout:                                                        |
|  ┌───────────────────────────────────────────────────────────────┐    |
|  │  GDT Pointer (GDTR Register - loaded with LGDT)               │    |
|  │  ┌─────────────────────┬─────────────────────────────────────┐│    |
|  │  │  Limit (16-bit)     │  Base Address (32-bit)              ││    |
|  │  │  Size of GDT - 1    │  Physical address of GDT            ││    |
|  │  └─────────────────────┴─────────────────────────────────────┘│    |
|  └───────────────────────────────────────────────────────────────┘    |
|                                                                        |
|  GDT Entries (each 8 bytes):                                          |
|  ┌───────────────────────────────────────────────────────────────┐    |
|  │ Entry 0: Null Descriptor (Required, all zeros)                │    |
|  ├───────────────────────────────────────────────────────────────┤    |
|  │ Entry 1: Code Segment (0x08)                                  │    |
|  │   Base = 0x00000000, Limit = 0xFFFFF (4GB with granularity)   │    |
|  │   Executable, Readable, Ring 0                                │    |
|  ├───────────────────────────────────────────────────────────────┤    |
|  │ Entry 2: Data Segment (0x10)                                  │    |
|  │   Base = 0x00000000, Limit = 0xFFFFF (4GB with granularity)   │    |
|  │   Read/Write, Ring 0                                          │    |
|  └───────────────────────────────────────────────────────────────┘    |
|                                                                        |
|  Segment Selector (value in segment register):                        |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │  Bits 15-3    │  Bit 2  │  Bits 1-0                             │  |
|  │  Index        │  TI     │  RPL                                  │  |
|  │  (GDT entry)  │  0=GDT  │  Requested Privilege Level            │  |
|  └─────────────────────────────────────────────────────────────────┘  |
|                                                                        |
|  Example selectors:                                                    |
|    0x08 = Binary 0000000000001|0|00 = Index 1, GDT, Ring 0 (Code)     |
|    0x10 = Binary 0000000000010|0|00 = Index 2, GDT, Ring 0 (Data)     |
|                                                                        |
+-----------------------------------------------------------------------+
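The selector bit layout above can be made concrete with a small decoder. The following is an illustrative sketch, not part of the kernel itself; `decode_selector` and `make_selector` are hypothetical names.

```c
#include <stdint.h>

/* Fields of a segment selector, as laid out in the diagram above. */
struct selector_fields {
    uint16_t index;  /* bits 15-3: descriptor index in the table */
    uint8_t  ti;     /* bit 2:     table indicator, 0 = GDT, 1 = LDT */
    uint8_t  rpl;    /* bits 1-0:  requested privilege level */
};

/* Split a 16-bit selector into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti    = (uint8_t)((sel >> 2) & 1);
    f.rpl   = (uint8_t)(sel & 3);
    return f;
}

/* Build a selector from its fields (the inverse of decode_selector). */
static uint16_t make_selector(uint16_t index, uint8_t ti, uint8_t rpl)
{
    return (uint16_t)((index << 3) | ((ti & 1) << 2) | (rpl & 3));
}
```

Because each descriptor is 8 bytes and the low three bits are TI and RPL, a Ring 0 GDT selector is simply the entry's byte offset in the table: `make_selector(1, 0, 0)` is 0x08 and `make_selector(2, 0, 0)` is 0x10.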

GDT Descriptor Entry Format

Each GDT entry is 8 bytes with a complex, historical layout:

+-----------------------------------------------------------------------+
|                    GDT DESCRIPTOR FORMAT (8 bytes)                     |
+-----------------------------------------------------------------------+
|                                                                        |
|  Byte layout (why so complicated? Intel added bits in later CPUs):    |
|                                                                        |
|  Byte 7     Byte 6     Byte 5     Byte 4     Byte 3-2    Byte 1-0    |
|  ┌─────────┬─────────┬─────────┬─────────┬───────────┬───────────┐   |
|  │Base 31:24│Flags+Lim│ Access  │Base 23:16│ Base 15:0 │ Limit 15:0│   |
|  └─────────┴─────────┴─────────┴─────────┴───────────┴───────────┘   |
|                                                                        |
|  Reconstructed fields:                                                 |
|  - Base (32-bit): Bytes 7, 4, 3-2 (split across descriptor!)          |
|  - Limit (20-bit): Bytes 6[3:0], 1-0                                  |
|  - Flags (4-bit): Byte 6[7:4]                                         |
|  - Access (8-bit): Byte 5                                              |
|                                                                        |
|  Access Byte (Byte 5):                                                 |
|  ┌───┬───────┬───┬───┬───┬───┬───┐                                   |
|  │ P │  DPL  │ S │ E │DC │RW │ A │  Bits: 7, 6-5, 4, 3, 2, 1, 0      |
|  └───┴───────┴───┴───┴───┴───┴───┘                                   |
|                                                                        |
|  P   = Present (1 = valid segment)                                    |
|  DPL = Descriptor Privilege Level (0 = kernel, 3 = user)              |
|  S   = Descriptor type (1 = code/data, 0 = system)                    |
|  E   = Executable (1 = code, 0 = data)                                |
|  DC  = Direction/Conforming                                           |
|        Data: 0 = grows up, 1 = grows down                             |
|        Code: 0 = non-conforming, 1 = conforming                       |
|  RW  = Readable/Writable                                              |
|        Code: 1 = readable, 0 = execute-only                           |
|        Data: 1 = writable, 0 = read-only                              |
|  A   = Accessed (CPU sets this when segment is used)                  |
|                                                                        |
|  Common Access values:                                                 |
|    0x9A = 10011010 = Present, Ring 0, Code, Readable                  |
|    0x92 = 10010010 = Present, Ring 0, Data, Writable                  |
|    0xFA = 11111010 = Present, Ring 3, Code, Readable (user)           |
|    0xF2 = 11110010 = Present, Ring 3, Data, Writable (user)           |
|                                                                        |
|  Flags (Byte 6, upper nibble):                                        |
|  ┌───┬───┬───┬───┐                                                   |
|  │ G │D/B│ L │AVL│    Bit numbers: 7 6 5 4                            |
|  └───┴───┴───┴───┘                                                   |
|                                                                        |
|  G   = Granularity (0 = 1 byte, 1 = 4KB units for limit)             |
|  D/B = Default operation size (1 = 32-bit, 0 = 16-bit)                |
|  L   = Long mode (1 = 64-bit code, 0 = 32-bit) - not used here        |
|  AVL = Available for system software                                   |
|                                                                        |
|  Common Flag values:                                                   |
|    0xC = 1100 = 4KB granularity, 32-bit mode                          |
|    0x4 = 0100 = 1-byte granularity, 32-bit mode                       |
|                                                                        |
+-----------------------------------------------------------------------+
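Since the base and limit are scattered across the descriptor, it can help to see the packing done in code. Below is a hedged sketch that assembles the 8 bytes into one 64-bit value (byte 0 in the least significant position, matching little-endian memory order); `gdt_encode` and `gdt_limit_bytes` are hypothetical helper names.

```c
#include <stdint.h>

/* Pack base, 20-bit limit, access byte, and 4-bit flags into a descriptor.
 * Byte 0 of the descriptor is the least significant byte of the result. */
static uint64_t gdt_encode(uint32_t base, uint32_t limit,
                           uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);              /* bytes 1-0: limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;       /* bytes 4-2: base 23:0   */
    d |= (uint64_t)access << 40;                  /* byte 5:    access      */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;   /* byte 6 low:  limit 19:16 */
    d |= (uint64_t)(flags & 0xF) << 52;           /* byte 6 high: G, D/B, L, AVL */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* byte 7:    base 31:24  */
    return d;
}

/* Highest valid offset in the segment. With G = 1 the limit counts
 * 4 KB pages, so the low 12 bits of the effective limit are all ones. */
static uint32_t gdt_limit_bytes(uint32_t limit, uint8_t flags)
{
    limit &= 0xFFFFF;
    return (flags & 0x8) ? ((limit << 12) | 0xFFF) : limit;
}
```

With the values from the tables above, `gdt_encode(0, 0xFFFFF, 0x9A, 0xC)` yields 0x00CF9A000000FFFF for the flat code segment and `gdt_encode(0, 0xFFFFF, 0x92, 0xC)` yields 0x00CF92000000FFFF for the data segment. `gdt_limit_bytes(0xFFFFF, 0xC)` returns 0xFFFFFFFF, which is why a 20-bit limit covers the full 4 GB.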

Mode Switching Procedure

+-----------------------------------------------------------------------+
|                    REAL MODE TO PROTECTED MODE SWITCH                  |
+-----------------------------------------------------------------------+
|                                                                        |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 1: Disable Interrupts (CLI)                                │  |
|  │   - Interrupts must be disabled during transition               │  |
|  │   - IDT not set up yet; interrupt would crash                   │  |
|  └─────────────────────────┬───────────────────────────────────────┘  |
|                            │                                           |
|                            ▼                                           |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 2: Enable A20 Line                                         │  |
|  │   - Allow access to memory above 1MB                            │  |
|  │   - Multiple methods: BIOS INT 0x15, keyboard controller, etc.  │  |
|  └─────────────────────────┬───────────────────────────────────────┘  |
|                            │                                           |
|                            ▼                                           |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 3: Load GDT (LGDT instruction)                             │  |
|  │   - Point GDTR to your GDT structure                            │  |
|  │   - Format: 16-bit limit, 32-bit base address                   │  |
|  └─────────────────────────┬───────────────────────────────────────┘  |
|                            │                                           |
|                            ▼                                           |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 4: Set CR0.PE bit (enable protected mode)                  │  |
|  │   mov eax, cr0                                                  │  |
|  │   or eax, 1                                                     │  |
|  │   mov cr0, eax                                                  │  |
|  └─────────────────────────┬───────────────────────────────────────┘  |
|                            │                                           |
|                            ▼                                           |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 5: Far Jump to 32-bit Code (flush CPU pipeline)            │  |
|  │   jmp 0x08:protected_mode_entry                                 │  |
|  │   - 0x08 = code segment selector                                │  |
|  │   - Flushes prefetch queue, loads CS with selector              │  |
|  └─────────────────────────┬───────────────────────────────────────┘  |
|                            │                                           |
|                            ▼                                           |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 6: Set Up Segment Registers [BITS 32]                      │  |
|  │   mov ax, 0x10          ; Data segment selector                 │  |
|  │   mov ds, ax                                                    │  |
|  │   mov es, ax                                                    │  |
|  │   mov fs, ax                                                    │  |
|  │   mov gs, ax                                                    │  |
|  │   mov ss, ax                                                    │  |
|  └─────────────────────────┬───────────────────────────────────────┘  |
|                            │                                           |
|                            ▼                                           |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 7: Set Up Stack                                            │  |
|  │   mov esp, 0x90000      ; Stack pointer (high address)          │  |
|  └─────────────────────────┬───────────────────────────────────────┘  |
|                            │                                           |
|                            ▼                                           |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Step 8: Call C Kernel                                           │  |
|  │   call kernel_main       ; Jump to C code!                      │  |
|  └─────────────────────────────────────────────────────────────────┘  |
|                                                                        |
+-----------------------------------------------------------------------+

2.2 Why This Matters

Protected mode is the foundation of all modern x86 operating systems:

  1. Memory protection: Prevent processes from corrupting each other
  2. Privilege separation: Kernel runs at Ring 0, users at Ring 3
  3. Virtual memory support: Paging builds on protected mode
  4. 32-bit addressing: Access 4GB of RAM (vs. 1MB in real mode)
  5. Modern instruction set: Full use of 80386+ instructions

2.3 Historical Context

+-----------------------------------------------------------------------+
|                    EVOLUTION OF x86 MODES                              |
+-----------------------------------------------------------------------+
|                                                                        |
|  1978: 8086 - Real Mode Only                                          |
|  ├── 16-bit registers, 1MB addressing                                 |
|  └── Segment:offset addressing                                         |
|                                                                        |
|  1982: 80286 - Protected Mode Introduced                               |
|  ├── 24-bit addressing (16MB)                                         |
|  ├── Privilege levels and memory protection                            |
|  └── No way to return to real mode without reset!                      |
|                                                                        |
|  1985: 80386 - Modern Protected Mode                                   |
|  ├── 32-bit registers and addressing (4GB)                            |
|  ├── Paging support added                                              |
|  ├── Virtual 8086 mode for DOS compatibility                           |
|  └── Can switch between modes freely                                   |
|                                                                        |
|  2003: AMD64 - Long Mode (64-bit)                                      |
|  ├── 64-bit addressing (16 exabytes theoretical)                      |
|  ├── Compatibility mode for 32-bit code                                |
|  └── Legacy mode for real/protected mode                               |
|                                                                        |
|  Modern systems:                                                        |
|  Boot: Real Mode → Protected Mode → Long Mode (64-bit)                 |
|                                                                        |
+-----------------------------------------------------------------------+

2.4 Common Misconceptions

Misconception                                     Reality
------------------------------------------------  --------------------------------------------------------
“GDT entries must describe real memory regions”   No, with flat model all segments have base=0, limit=4GB
“You need many GDT entries”                       Minimum is 3 (null, code, data); more for TSS, user mode
“Protected mode is automatic after setting CR0”   You must also do a far jump and reload segment registers
“BIOS calls still work”                           No, BIOS is 16-bit; you need your own drivers now
“Paging is required in protected mode”            Paging is optional; we use pure segmentation first

3. Project Specification

3.1 What You Will Build

A two-stage bootloader plus a C kernel:

  1. Stage 1 (512 bytes): Boot sector that loads Stage 2 and the kernel from disk, then jumps to Stage 2
  2. Stage 2 (assembly): Enables A20, loads the GDT, switches to protected mode, sets up the 32-bit environment, and calls C code
  3. C Kernel: Main kernel code that outputs to VGA text mode

3.2 Functional Requirements

Requirement  Description
-----------  ------------------------------------------------------------
FR-1         Create a valid GDT with null, code, and data segments
FR-2         Successfully transition from real mode to protected mode
FR-3         Execute C code in 32-bit protected mode
FR-4         Display output using VGA text mode (direct 0xB8000 access)
FR-5         Work in QEMU with the multiboot specification (optional)
FR-6         Build with a cross-compiler (i686-elf-gcc)

3.3 Non-Functional Requirements

Requirement  Description
-----------  ------------------------------------------------------------
NFR-1        Code organized into separate assembly and C files
NFR-2        Use a linker script to control memory layout
NFR-3        Makefile for reproducible builds
NFR-4        Comments explaining each GDT entry field

3.4 Example Usage / Output

# Build the kernel
$ make
nasm -f elf32 boot.asm -o boot.o
i686-elf-gcc -c kernel.c -o kernel.o -ffreestanding -O2 -Wall
i686-elf-ld -T linker.ld -o kernel.elf boot.o kernel.o
i686-elf-objcopy -O binary kernel.elf kernel.bin

# Create bootable disk image
$ dd if=/dev/zero of=disk.img bs=512 count=2880
$ dd if=kernel.bin of=disk.img conv=notrunc

# Run in QEMU (the -kernel path needs a Multiboot header in kernel.elf)
$ qemu-system-i386 -kernel kernel.elf

# Or with disk image:
$ qemu-system-i386 -drive format=raw,file=disk.img

# Screen output:
================================================================================
        MyOS v0.1 - Protected Mode Kernel
================================================================================

[OK] GDT loaded at 0x00000800
[OK] Switched to 32-bit protected mode
[OK] VGA driver initialized (80x25 color text mode)
[OK] C kernel executing at 0x00100000

Welcome to protected mode!
This text is written directly to VGA memory at 0xB8000.

Free memory: 3,932,160 bytes (kernel ends at 0x00101000)

3.5 Real World Outcome

After completing this project, you will have:

  1. Working 32-bit kernel that runs without any OS support
  2. Understanding of GDT that applies to all x86 OS development
  3. Cross-compilation skills essential for embedded/OS work
  4. Foundation for adding: interrupts (Project 7), paging (Project 8), multitasking

4. Solution Architecture

4.1 High-Level Design

+-----------------------------------------------------------------------+
|                    PROTECTED MODE KERNEL STRUCTURE                     |
+-----------------------------------------------------------------------+
|                                                                        |
|  Disk Layout:                                                          |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Sector 1: Boot Sector (512 bytes)                               │  |
|  │   - Loaded by BIOS at 0x7C00                                    │  |
|  │   - Loads remaining sectors                                     │  |
|  │   - Jumps to second stage                                       │  |
|  ├─────────────────────────────────────────────────────────────────┤  |
|  │ Sectors 2-N: Kernel Image                                       │  |
|  │   - 32-bit protected mode code                                  │  |
|  │   - GDT definition                                              │  |
|  │   - Mode switch code                                            │  |
|  │   - C kernel code                                               │  |
|  └─────────────────────────────────────────────────────────────────┘  |
|                                                                        |
|  Memory Layout After Boot:                                             |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ 0x00000000 - 0x000003FF: Interrupt Vector Table (unused now)    │  |
|  │ 0x00000400 - 0x000004FF: BIOS Data Area (unused now)            │  |
|  │ 0x00000500 - 0x00007BFF: Free (for stack, temp data)            │  |
|  │ 0x00007C00 - 0x00007DFF: Boot sector (512 bytes)                │  |
|  │ 0x00007E00 - 0x0000FFFF: Free or loaded second stage            │  |
|  │ 0x00010000 - 0x0009FFFF: Kernel loaded here (typical)           │  |
|  │ 0x000A0000 - 0x000BFFFF: VGA Memory                             │  |
|  │ 0x000B8000 - 0x000B8F9F: VGA Text Mode Buffer                   │  |
|  │ 0x000C0000 - 0x000FFFFF: BIOS ROM (read-only)                   │  |
|  │ 0x00100000+: Extended memory (kernel can be here; needs A20)    │  |
|  └─────────────────────────────────────────────────────────────────┘  |
|                                                                        |
|  Alternative: with Multiboot, GRUB loads the kernel at 0x00100000     |
|                                                                        |
+-----------------------------------------------------------------------+

4.2 Key Components

+-----------------------------------------------------------------------+
|                    COMPONENT INTERACTION                               |
+-----------------------------------------------------------------------+
|                                                                        |
|  ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐  |
|  │   boot.asm      │     │   setup.asm     │     │   kernel.c      │  |
|  │   (16-bit)      │────▶│   (16/32-bit)   │────▶│   (32-bit C)    │  |
|  │                 │     │                 │     │                 │  |
|  │ - Load sectors  │     │ - Define GDT    │     │ - VGA output    │  |
|  │ - Jump to setup │     │ - Enable A20    │     │ - Kernel main   │  |
|  │                 │     │ - Switch mode   │     │ - String funcs  │  |
|  │                 │     │ - Set up stack  │     │                 │  |
|  │                 │     │ - Call kernel   │     │                 │  |
|  └─────────────────┘     └─────────────────┘     └─────────────────┘  |
|                                                                        |
|  Files:                                                                |
|  - boot.asm:     First stage bootloader (512 bytes)                   |
|  - setup.asm:    Second stage, GDT setup, mode switch                 |
|  - kernel.c:     Main C kernel                                        |
|  - vga.c/h:      VGA text mode driver                                 |
|  - linker.ld:    Linker script for memory layout                      |
|                                                                        |
+-----------------------------------------------------------------------+

4.3 Data Structures

GDT Structure

#include <stdint.h>          // uint8_t, uint16_t, uint32_t

// GDT Entry (8 bytes)
struct gdt_entry {
    uint16_t limit_low;      // Limit bits 0-15
    uint16_t base_low;       // Base bits 0-15
    uint8_t  base_middle;    // Base bits 16-23
    uint8_t  access;         // Access flags
    uint8_t  granularity;    // Flags + limit bits 16-19
    uint8_t  base_high;      // Base bits 24-31
} __attribute__((packed));

// GDT Pointer (6 bytes)
struct gdt_ptr {
    uint16_t limit;          // Size of GDT - 1
    uint32_t base;           // Address of GDT
} __attribute__((packed));
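A sketch of how these structures might be filled in from C follows. It repeats the two definitions so that it compiles on its own; `gdt_set_entry` and `make_gdt_ptr` are hypothetical helper names, and the pointer cast assumes a 32-bit target (on a 64-bit host it truncates the address, which is acceptable only for experimenting).

```c
#include <stdint.h>

struct gdt_entry {
    uint16_t limit_low;      /* Limit bits 0-15 */
    uint16_t base_low;       /* Base bits 0-15 */
    uint8_t  base_middle;    /* Base bits 16-23 */
    uint8_t  access;         /* Access byte */
    uint8_t  granularity;    /* Flags (high nibble) + limit bits 16-19 */
    uint8_t  base_high;      /* Base bits 24-31 */
} __attribute__((packed));

struct gdt_ptr {
    uint16_t limit;          /* Size of GDT in bytes, minus 1 */
    uint32_t base;           /* Linear address of the GDT */
} __attribute__((packed));

/* The CPU expects exactly these sizes; catch padding mistakes at compile time. */
_Static_assert(sizeof(struct gdt_entry) == 8, "GDT entry must be 8 bytes");
_Static_assert(sizeof(struct gdt_ptr) == 6, "GDT pointer must be 6 bytes");

/* Scatter base, 20-bit limit, access byte, and 4-bit flags into one entry. */
static void gdt_set_entry(struct gdt_entry *e, uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    e->limit_low   = (uint16_t)(limit & 0xFFFF);
    e->base_low    = (uint16_t)(base & 0xFFFF);
    e->base_middle = (uint8_t)((base >> 16) & 0xFF);
    e->access      = access;
    e->granularity = (uint8_t)(((flags & 0x0F) << 4) | ((limit >> 16) & 0x0F));
    e->base_high   = (uint8_t)((base >> 24) & 0xFF);
}

/* Build the 6-byte operand for LGDT; the limit is the table size minus 1. */
static struct gdt_ptr make_gdt_ptr(const struct gdt_entry *table, uint32_t count)
{
    struct gdt_ptr p;
    p.limit = (uint16_t)(count * sizeof(struct gdt_entry) - 1);
    p.base  = (uint32_t)(uintptr_t)table;   /* assumes a 32-bit address space */
    return p;
}
```

For the three-entry table used in this project (null, code, data), the limit works out to 3 * 8 - 1 = 23.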

VGA Text Mode Character

// VGA character entry (2 bytes)
// At 0xB8000: 80 columns x 25 rows = 2000 characters = 4000 bytes

// Byte 0: ASCII character
// Byte 1: Attribute (foreground/background colors)
//   Bits 0-3: Foreground color
//   Bits 4-6: Background color
//   Bit 7:    Blink (if enabled)

#define VGA_COLOR_BLACK         0
#define VGA_COLOR_BLUE          1
#define VGA_COLOR_GREEN         2
#define VGA_COLOR_CYAN          3
#define VGA_COLOR_RED           4
#define VGA_COLOR_MAGENTA       5
#define VGA_COLOR_BROWN         6
#define VGA_COLOR_LIGHT_GREY    7
#define VGA_COLOR_DARK_GREY     8
#define VGA_COLOR_LIGHT_BLUE    9
#define VGA_COLOR_LIGHT_GREEN   10
#define VGA_COLOR_LIGHT_CYAN    11
#define VGA_COLOR_LIGHT_RED     12
#define VGA_COLOR_LIGHT_MAGENTA 13
#define VGA_COLOR_LIGHT_BROWN   14
#define VGA_COLOR_WHITE         15
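The attribute layout above translates into two small packing helpers. This is a minimal sketch with hypothetical names (`vga_attr`, `vga_cell`); it keeps the background to bits 4-6 and leaves the blink bit clear, as described in the comments above.

```c
#include <stdint.h>

/* Combine foreground (bits 0-3) and background (bits 4-6) into one
 * attribute byte. Bit 7 (blink) is left clear. */
static inline uint8_t vga_attr(uint8_t fg, uint8_t bg)
{
    return (uint8_t)((fg & 0x0F) | ((bg & 0x07) << 4));
}

/* Combine a character and an attribute into one 16-bit cell.
 * x86 is little-endian, so the character lands in byte 0 and the
 * attribute in byte 1 when the cell is stored to video memory. */
static inline uint16_t vga_cell(char ch, uint8_t attr)
{
    return (uint16_t)((uint8_t)ch | ((uint16_t)attr << 8));
}
```

White (15) on blue (1) gives `vga_attr(15, 1) == 0x1F`, so the cell for a white-on-blue 'A' is `vga_cell('A', 0x1F) == 0x1F41`.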

4.4 Algorithm Overview

+-----------------------------------------------------------------------+
|                    EXECUTION FLOW                                      |
+-----------------------------------------------------------------------+
|                                                                        |
|  BIOS loads boot sector (0x7C00)                                      |
|       │                                                                |
|       ▼                                                                |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Boot Sector (16-bit Real Mode)                                  │  |
|  │   1. Set up segments (DS, ES, SS)                               │  |
|  │   2. Save boot drive (DL)                                       │  |
|  │   3. Load remaining sectors (INT 0x13)                          │  |
|  │   4. Jump to loaded code                                        │  |
|  └────────────────────────────┬────────────────────────────────────┘  |
|                               │                                        |
|                               ▼                                        |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ Setup Code (16-bit → 32-bit transition)                         │  |
|  │   1. CLI (disable interrupts)                                   │  |
|  │   2. Enable A20 line                                            │  |
|  │   3. Load GDT (LGDT gdt_descriptor)                             │  |
|  │   4. Set CR0.PE = 1                                             │  |
|  │   5. Far jump to 32-bit code (jmp 0x08:start32)                 │  |
|  │   6. [Now in 32-bit mode]                                       │  |
|  │   7. Load data segment registers with 0x10                      │  |
|  │   8. Set up stack (ESP)                                         │  |
|  │   9. Call kernel_main()                                         │  |
|  └────────────────────────────┬────────────────────────────────────┘  |
|                               │                                        |
|                               ▼                                        |
|  ┌─────────────────────────────────────────────────────────────────┐  |
|  │ C Kernel (32-bit Protected Mode)                                │  |
|  │   1. Initialize VGA driver                                      │  |
|  │   2. Clear screen                                               │  |
|  │   3. Print welcome message                                      │  |
|  │   4. Initialize other subsystems                                │  |
|  │   5. Enter main loop or halt                                    │  |
|  └─────────────────────────────────────────────────────────────────┘  |
|                                                                        |
+-----------------------------------------------------------------------+
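The first three C kernel steps (initialize, clear, print) could look roughly like the sketch below. The function names are hypothetical, and the buffer is passed in as a parameter rather than hard-coded, which is an assumption made here so the same code can be exercised against an ordinary array on a development machine; in the kernel the argument would be `(volatile uint16_t *)0xB8000`.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_COLS  ((size_t)80)
#define VGA_ROWS  ((size_t)25)
#define VGA_CELLS (VGA_COLS * VGA_ROWS)   /* 2000 cells, 4000 bytes */

/* Index of the cell at (row, col); each cell is one uint16_t. */
static size_t vga_index(size_t row, size_t col)
{
    return row * VGA_COLS + col;
}

/* Fill the whole screen with spaces in the given attribute. */
static void vga_clear(volatile uint16_t *buf, uint8_t attr)
{
    for (size_t i = 0; i < VGA_CELLS; i++)
        buf[i] = (uint16_t)(' ' | ((uint16_t)attr << 8));
}

/* Write a NUL-terminated string starting at (row, col). Text runs on to
 * the next row and stops at the end of the screen (no scrolling here).
 * Returns the number of characters actually written. */
static size_t vga_write_at(volatile uint16_t *buf, size_t row, size_t col,
                           const char *s, uint8_t attr)
{
    size_t i = vga_index(row, col);
    size_t n = 0;
    while (s[n] != '\0' && i < VGA_CELLS) {
        buf[i++] = (uint16_t)((uint8_t)s[n] | ((uint16_t)attr << 8));
        n++;
    }
    return n;
}
```

A `kernel_main` built on this would call `vga_clear(vga, 0x07)` followed by `vga_write_at(vga, 0, 0, "Welcome to protected mode!", 0x07)`. The `volatile` qualifier keeps the compiler from optimizing away stores that look unused but are in fact the screen contents.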

5. Implementation Guide

5.1 Development Environment Setup

Build Cross-Compiler (i686-elf-gcc)

This is the most important step. You need a cross-compiler that targets bare-metal x86:

# Install build dependencies
# On Ubuntu/Debian:
sudo apt install build-essential bison flex libgmp3-dev libmpc-dev \
    libmpfr-dev texinfo libisl-dev

# On macOS (nproc is not installed by default; use $(sysctl -n hw.ncpu) for -j below):
brew install gmp mpfr libmpc

# Set up directories
export PREFIX="$HOME/opt/cross"
export TARGET=i686-elf
export PATH="$PREFIX/bin:$PATH"

mkdir -p $HOME/src
cd $HOME/src

# Download and build binutils
wget https://ftp.gnu.org/gnu/binutils/binutils-2.41.tar.xz
tar xf binutils-2.41.tar.xz
mkdir build-binutils && cd build-binutils
../binutils-2.41/configure --target=$TARGET --prefix="$PREFIX" \
    --with-sysroot --disable-nls --disable-werror
make -j$(nproc)
make install
cd ..

# Download and build GCC
wget https://ftp.gnu.org/gnu/gcc/gcc-13.2.0/gcc-13.2.0.tar.xz
tar xf gcc-13.2.0.tar.xz
mkdir build-gcc && cd build-gcc
../gcc-13.2.0/configure --target=$TARGET --prefix="$PREFIX" \
    --disable-nls --enable-languages=c,c++ --without-headers
make -j$(nproc) all-gcc all-target-libgcc
make install-gcc install-target-libgcc

# Verify
i686-elf-gcc --version

Alternative: Use pre-built toolchains from osdev.org or Docker images.

5.2 Project Structure

protected-mode-kernel/
├── Makefile
├── linker.ld
├── src/
│   ├── boot.asm          # First-stage bootloader
│   ├── setup.asm         # Mode switch and GDT
│   └── kernel.c          # Main C kernel
├── include/
│   └── vga.h             # VGA constants
├── iso/                  # For GRUB boot (optional)
│   └── boot/
│       └── grub/
│           └── grub.cfg
└── README.md

Linker Script (linker.ld)

/* Linker script for protected mode kernel */
ENTRY(_start)

SECTIONS
{
    /* Kernel loaded at 1MB for multiboot, or lower for custom boot */
    . = 0x00100000;  /* 1 MB mark */

    .text BLOCK(4K) : ALIGN(4K)
    {
        *(.multiboot)    /* Multiboot header first */
        *(.text.boot)    /* Then entry point */
        *(.text)         /* Then all other code */
    }

    .rodata BLOCK(4K) : ALIGN(4K)
    {
        *(.rodata)
    }

    .data BLOCK(4K) : ALIGN(4K)
    {
        *(.data)
    }

    .bss BLOCK(4K) : ALIGN(4K)
    {
        *(COMMON)
        *(.bss)
    }

    _kernel_end = .;
}

5.3 The Core Question You’re Answering

“How do you transition from the CPU’s initial 16-bit mode to the full-featured 32-bit mode that real operating systems use?”

This involves understanding:

  • Why segmentation exists and how GDT works
  • What the CPU needs to switch modes
  • How to write code for a CPU without any OS support
  • The calling conventions between assembly and C

5.4 Concepts You Must Understand First

| Concept | Self-Assessment Question | Book Reference |
|---|---|---|
| GDT Structure | What goes in bytes 5-6 of a descriptor? | Intel SDM Vol 3, Ch 3 |
| Segment Selectors | Why is the code segment 0x08 and data 0x10? | OSDev Wiki - GDT |
| Control Registers | What does CR0 bit 0 (PE) do? | Intel SDM Vol 3, Ch 2 |
| Flat Memory Model | How do you make segments “invisible”? | OS Three Easy Pieces |
| C Calling Convention | How are function arguments passed in cdecl? | System V i386 ABI |

5.5 Questions to Guide Your Design

GDT Design:

  • How many entries do you need for a minimal kernel?
  • What base and limit should code and data segments have?
  • Why is the null descriptor required?

Mode Switching:

  • Why must you disable interrupts before switching?
  • What’s special about the far jump after setting CR0.PE?
  • Why reload segment registers immediately after the jump?

Linking C and Assembly:

  • What symbol naming conventions does your C compiler use?
  • How do you export symbols from assembly?
  • What stack layout does your C compiler expect?

VGA Output:

  • How do you calculate the address of character at (row, col)?
  • What attribute byte gives white text on blue background?
  • How do you scroll the screen when it fills up?
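One way to check your answers to the first two VGA questions is a small set of helpers like the sketch below. The names (`vga_offset`, `vga_address`, `vga_attr`) are hypothetical, and the sketch assumes the standard 80x25 text mode with the usual palette indices (1 = blue, 15 = white).

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_BASE 0xB8000u  /* start of the VGA text buffer */
#define VGA_COLS 80        /* assumed: standard 80x25 text mode */
#define VGA_ROWS 25

/* Byte offset of cell (row, col): each cell is 2 bytes (char + attribute). */
static inline size_t vga_offset(size_t row, size_t col) {
    return (row * VGA_COLS + col) * 2;
}

/* Physical address of cell (row, col). */
static inline uint32_t vga_address(size_t row, size_t col) {
    return VGA_BASE + (uint32_t)vga_offset(row, col);
}

/* Attribute byte: background in the high nibble, foreground in the low. */
static inline uint8_t vga_attr(uint8_t fg, uint8_t bg) {
    return (uint8_t)((bg << 4) | (fg & 0x0F));
}
```

With these definitions, `vga_attr(15, 1)` yields 0x1F (white on blue), and the last cell on screen, (24, 79), sits at byte offset 3998.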

5.6 Thinking Exercise

Before coding, manually trace through this GDT setup:

gdt_start:
    dq 0                        ; Null descriptor

gdt_code:
    dw 0xFFFF       ; Limit (low)
    dw 0x0000       ; Base (low)
    db 0x00         ; Base (middle)
    db 10011010b    ; Access byte
    db 11001111b    ; Flags + Limit (high)
    db 0x00         ; Base (high)

gdt_data:
    dw 0xFFFF       ; Limit (low)
    dw 0x0000       ; Base (low)
    db 0x00         ; Base (middle)
    db 10010010b    ; Access byte
    db 11001111b    ; Flags + Limit (high)
    db 0x00         ; Base (high)

gdt_end:

gdt_descriptor:
    dw gdt_end - gdt_start - 1  ; Size
    dd gdt_start                 ; Address

Questions:

  1. What is the base address of the code segment?
  2. What is the limit in bytes (considering granularity)?
  3. Is the code segment readable? How do you know?
  4. What privilege level (ring) is this for?
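After tracing the bytes by hand, you can compare your answers against a decoder like the sketch below. `struct seg_info` and `decode_descriptor` are hypothetical names for illustration; the function takes the 8 descriptor bytes in memory order, exactly as the `dw`/`db` lines lay them out.

```c
#include <stdint.h>

/* Hypothetical helper type: decoded view of one 8-byte segment descriptor. */
struct seg_info {
    uint32_t base;        /* 32-bit base address */
    uint32_t limit;       /* raw 20-bit limit field */
    uint64_t size_bytes;  /* effective segment size after granularity */
    uint8_t  dpl;         /* privilege level (ring 0-3) */
    uint8_t  present;     /* P bit */
    uint8_t  executable;  /* 1 = code segment, 0 = data segment */
    uint8_t  rw;          /* readable (code) or writable (data) */
    uint8_t  is32;        /* D/B bit: 32-bit default operand size */
    uint8_t  granular;    /* G bit: limit counted in 4 KB pages */
};

static struct seg_info decode_descriptor(const uint8_t d[8]) {
    struct seg_info s;
    /* Limit: bytes 0-1 plus the low nibble of byte 6. */
    s.limit = (uint32_t)d[0] | ((uint32_t)d[1] << 8)
            | ((uint32_t)(d[6] & 0x0F) << 16);
    /* Base: bytes 2-4 plus byte 7. */
    s.base = (uint32_t)d[2] | ((uint32_t)d[3] << 8)
           | ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
    /* Access byte (byte 5). */
    s.present    = (d[5] >> 7) & 1;
    s.dpl        = (d[5] >> 5) & 3;
    s.executable = (d[5] >> 3) & 1;
    s.rw         = (d[5] >> 1) & 1;
    /* Flags (high nibble of byte 6). */
    s.is32     = (d[6] >> 6) & 1;
    s.granular = (d[6] >> 7) & 1;
    /* With G=1 the limit counts 4 KB pages, so shift by 12. */
    s.size_bytes = s.granular ? (((uint64_t)s.limit + 1) << 12)
                              : ((uint64_t)s.limit + 1);
    return s;
}
```

Feeding it the `gdt_code` bytes `{0xFF, 0xFF, 0x00, 0x00, 0x00, 0x9A, 0xCF, 0x00}` should match what you derived on paper for all four questions.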

5.7 Hints in Layers

Hint 1: Starting Point (Conceptual Direction)

You need:

  1. Boot sector that loads more code
  2. GDT with flat memory model (base=0, limit=4GB)
  3. Assembly routine that: CLI, enable A20, load GDT, set CR0.PE, far jump
  4. 32-bit setup: reload segments, set stack, call C

Hint 2: Next Level (More Specific Guidance)

Assembly structure (setup.asm):

[BITS 16]
; ... 16-bit code to enable A20, load GDT, set CR0 ...

[BITS 32]
start_protected:
    ; Set up segments and stack
    ; Call C function

; GDT goes here
; GDT descriptor goes here

C kernel structure (kernel.c):

void kernel_main(void) {
    // VGA is at 0xB8000
    // Each character is 2 bytes (char + attribute)
    // volatile: the compiler must not optimize away or reorder these writes
    volatile char* video = (volatile char*) 0xB8000;
    video[0] = 'H';
    video[1] = 0x0F;  // White on black
    // ...
}

Hint 3: Technical Details (Approach/Pseudocode)

Enable A20 (fast method):

enable_a20:
    in al, 0x92
    or al, 2              ; Set bit 1 (A20 gate)
    and al, 0xFE          ; Keep bit 0 clear (setting it resets the CPU)
    out 0x92, al
    ret

Mode switch:

switch_to_pm:
    cli
    lgdt [gdt_descriptor]
    mov eax, cr0
    or eax, 1
    mov cr0, eax
    jmp 0x08:init_pm      ; Far jump!

[BITS 32]
init_pm:
    mov ax, 0x10          ; Data segment
    mov ds, ax
    mov ss, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov esp, 0x90000      ; Stack
    call kernel_main      ; C entry point
.hang:
    hlt                   ; kernel_main should not return; halt if it does
    jmp .hang

Hint 4: Tools/Debugging (Verification Methods)

# Check that the GDT is loaded using the QEMU monitor
# Press Ctrl+Alt+2 to access monitor
info registers     # The GDT= line shows the base and limit set by LGDT

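# Debug with GDB (the 0x7c00 breakpoint below only fires when QEMU boots
# your boot-sector image; with -kernel, break on kernel_main instead)
qemu-system-i386 -kernel kernel.elf -s -S &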
gdb -ex "target remote :1234" \
    -ex "set architecture i386" \
    -ex "break *0x7c00" \
    -ex "continue"

# Step through mode switch
(gdb) break *0x7c50  # wherever switch happens
(gdb) info registers
(gdb) p/x $cr0

# View VGA memory
(gdb) x/20hx 0xB8000

5.8 The Interview Questions They’ll Ask

  1. “Explain the GDT and what happens during LGDT”
    • GDT defines memory segments (base, limit, permissions)
    • LGDT loads the GDTR register with table location
    • Segment registers then hold selectors into this table
  2. “Why is the far jump necessary after setting CR0.PE?”
    • CPU has prefetch queue with 16-bit decoded instructions
    • Far jump flushes the queue and loads CS with the 32-bit code descriptor
    • Without it, CS keeps its 16-bit real-mode attributes and 32-bit code is misdecoded
  3. “What’s the difference between flat and segmented models?”
    • Segmented: Different base addresses for different segments
    • Flat: All segments have base=0, limit=4GB
    • Flat is standard for modern OSes (Linux, Windows)
  4. “How does VGA text mode work?”
    • Memory-mapped I/O at 0xB8000
    • 2 bytes per character: ASCII + attribute
    • 80 columns x 25 rows x 2 bytes = 4000 bytes total
    • Hardware automatically displays this memory
  5. “Why use a cross-compiler instead of native GCC?”
    • Native GCC targets host OS with libraries
    • Cross-compiler produces freestanding code
    • No C library calls, no OS dependencies
    • Proper calling conventions for bare metal

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| GDT and Protected Mode | Intel SDM Volume 3 | Chapter 3 |
| Segmentation | “Operating Systems: Three Easy Pieces” | Chapter 16 |
| x86 Architecture | “Write Great Code, Volume 2” | Chapters 3-4 |
| OS Development | “Operating Systems: From 0 to 1” | Chapters 3-4 |
| VGA Programming | OSDev Wiki | Text Mode section |

5.10 Implementation Phases

Phase 1: Cross-Compiler Setup (1-2 days)

  • Build or obtain i686-elf-gcc
  • Verify with simple freestanding C compilation
  • Test linking with NASM object files

Phase 2: Boot Sector Enhancement (2-3 days)

  • Modify Project 5 bootloader to load more sectors
  • Test loading code to 0x10000 or similar

Phase 3: GDT and Mode Switch (3-4 days)

  • Define GDT entries
  • Implement A20 enable
  • Write mode switch code
  • Verify 32-bit execution

Phase 4: C Kernel and VGA (3-4 days)

  • Write linker script
  • Implement VGA text output
  • Create print functions
  • Add formatting (colors, positions)

Phase 5: Polish and Documentation (2-3 days)

  • Clean up code
  • Add comments
  • Write build instructions
  • Test on multiple QEMU versions

5.11 Key Implementation Decisions

| Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Boot method | Custom bootloader | Multiboot/GRUB | Custom for learning, Multiboot for features |
| Kernel load address | 0x10000 (below 1MB) | 0x100000 (1MB+) | 0x10000 initially (simpler), 1MB+ with paging |
| A20 enable method | BIOS INT 0x15 | Fast A20 (0x92) | Fast A20 is simpler, works on most hardware |
| VGA driver | Direct memory | Abstraction layer | Abstraction for clean code |

6. Testing Strategy

Verification Points

#!/bin/bash
# test_protected_mode.sh

echo "Test 1: Verify kernel binary"
file kernel.elf
# Should show: ELF 32-bit LSB executable, Intel 80386

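echo "Test 2: Check multiboot header (if using)"
# The linker script merges .multiboot into .text, so objdump -h will not list it.
# grub-file (shipped with GRUB 2) checks for the header itself.
grub-file --is-x86-multiboot kernel.elf && echo "multiboot header found"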

echo "Test 3: Verify entry point"
readelf -h kernel.elf | grep "Entry point"
# Should match linker script

echo "Test 4: Run in QEMU with debug output"
qemu-system-i386 -kernel kernel.elf -d int 2>&1 | head -50
# Should not show unexpected interrupts

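echo "Test 5: QEMU monitor register check (GDT base/limit, CR0)"
# QEMU's monitor has no "info gdt"; "info registers" prints the GDT= line.
(sleep 2; echo "info registers"; echo "quit") | \
    qemu-system-i386 -kernel kernel.elf -display none -monitor stdio | grep -E "GDT|CR0"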

GDB Test Session

# Terminal 1: Start QEMU with GDB server
qemu-system-i386 -kernel kernel.elf -s -S -nographic

# Terminal 2: Connect GDB
gdb kernel.elf
(gdb) target remote :1234
(gdb) break kernel_main
(gdb) continue
# Should break in C code
(gdb) info registers
# EIP should be in kernel code
(gdb) x/20hx 0xB8000
# Should show your VGA output

7. Common Pitfalls & Debugging

Pitfall 1: GDT Not Aligned or Wrong Size

Symptom: Triple fault on the far jump or the first segment register load after LGDT (LGDT itself does not validate the table).

Cause: GDT descriptor has wrong size or address.

Fix:

gdt_descriptor:
    dw gdt_end - gdt_start - 1   ; Size - 1 (not size!)
    dd gdt_start                  ; Linear address, not segment:offset
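If you ever build the same pseudo-descriptor from C, the two mistakes above (size instead of size minus one, and padding between the fields) can be caught at compile time. The sketch below assumes GCC or Clang for `__attribute__((packed))` and C11 for `_Static_assert`; `struct gdt_ptr` and `make_gdt_ptr` are hypothetical names.

```c
#include <stdint.h>

/* The 6-byte operand that LGDT reads in 32-bit mode: a 16-bit limit
 * followed immediately by a 32-bit linear base address. Without `packed`
 * the compiler would insert 2 bytes of padding after `limit`. */
struct __attribute__((packed)) gdt_ptr {
    uint16_t limit;  /* table size in bytes, minus 1 */
    uint32_t base;   /* linear address of the first descriptor */
};

_Static_assert(sizeof(struct gdt_ptr) == 6, "gdt_ptr must be exactly 6 bytes");

/* Build the pseudo-descriptor for a table of `entry_count` 8-byte entries. */
static struct gdt_ptr make_gdt_ptr(uint32_t table_addr, unsigned entry_count) {
    struct gdt_ptr p;
    p.limit = (uint16_t)(entry_count * 8u - 1u);
    p.base  = table_addr;
    return p;
}
```

For the three-entry GDT in this project (null, code, data), the limit works out to 3 * 8 - 1 = 23 (0x17), which is the value `info registers` should report in QEMU after LGDT.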

Pitfall 2: Missing Far Jump After CR0.PE

Symptom: Strange crashes or garbage execution.

Cause: CS still holds its 16-bit real-mode attributes, so the CPU decodes the 32-bit code that follows as 16-bit instructions.

Fix:

mov eax, cr0
or eax, 1
mov cr0, eax
jmp 0x08:start32   ; MUST be far jump with code segment selector

Pitfall 3: Wrong Segment Selector Values

Symptom: General protection fault.

Cause: Using GDT index instead of selector value.

Fix:

; Selector = (Index * 8) | TI | RPL
; For GDT entry 1 (code): 1 * 8 = 0x08
; For GDT entry 2 (data): 2 * 8 = 0x10
mov ax, 0x10   ; Not 0x02!
mov ds, ax
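The selector formula in the comments can be written as a one-line C helper. This is only an illustrative sketch; `make_selector` is a hypothetical name, with `ti` = 0 selecting the GDT and `rpl` the requested privilege level.

```c
#include <stdint.h>

/* Selector layout: bits 3-15 = table index, bit 2 = TI (0 = GDT, 1 = LDT),
 * bits 0-1 = RPL (requested privilege level). */
static inline uint16_t make_selector(uint16_t index, uint16_t ti, uint16_t rpl) {
    return (uint16_t)((index << 3) | ((ti & 1u) << 2) | (rpl & 3u));
}
```

`make_selector(1, 0, 0)` gives 0x08 and `make_selector(2, 0, 0)` gives 0x10, matching the kernel code and data selectors used throughout this project.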

Pitfall 4: Calling C Without Proper Stack

Symptom: Crash when C function executes or returns.

Cause: Stack pointer not set or in wrong location.

Fix:

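; Before calling C
mov esp, 0x90000   ; Pick a 16-byte-aligned address in usable memory
xor ebp, ebp       ; Zero EBP so stack traces end here
call kernel_main   ; CALL pushes the return address itself; no fake one needed
hang:
    hlt            ; If kernel_main returns, halt instead of running garbage
    jmp hang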

Pitfall 5: A20 Line Not Enabled

Symptom: Memory access wraps at 1MB boundary.

Cause: A20 line still disabled.

Fix:

; Fast A20 method (works on most hardware)
in al, 0x92
test al, 2
jnz .a20_done
or al, 2
and al, 0xFE      ; Don't accidentally reset!
out 0x92, al
.a20_done:

Debugging with QEMU Monitor

# Press Ctrl+Alt+2 for QEMU monitor
# Useful commands:
info registers     # Show CPU registers (includes GDT= and IDT= base/limit)
xp /6xw <gdt_base> # Dump 3 GDT entries; substitute the base from the GDT= line
x/10i $eip        # Disassemble at EIP
xp /20x 0xB8000   # Show VGA memory
log cpu_reset     # Log CPU resets

8. Extensions & Challenges

Extension 1: Add Color Printing (Easy)

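#include <stddef.h>
#include <stdint.h>

enum vga_color {
    VGA_BLACK = 0, VGA_BLUE = 1, /* ... */
};

void print_colored(const char* str, uint8_t fg, uint8_t bg) {
    // Attribute byte: background in the high nibble, foreground in the low
    uint8_t color = (uint8_t)((fg & 0x0F) | (bg << 4));
    volatile uint16_t* vga = (volatile uint16_t*) 0xB8000;
    // Stop at the end of the 80x25 buffer so long strings can't overrun it
    for (size_t i = 0; str[i] != '\0' && i < 80 * 25; i++) {
        // Cast through uint8_t so chars >= 0x80 are not sign-extended
        vga[i] = (uint16_t)((uint8_t)str[i] | (color << 8));
    }
}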

Extension 2: Scrolling Console (Medium)

void scroll(void) {
    volatile uint16_t* vga = (uint16_t*) 0xB8000;
    // Move lines 1-24 to lines 0-23
    for (int i = 0; i < 24 * 80; i++) {
        vga[i] = vga[i + 80];
    }
    // Clear line 24
    for (int i = 24 * 80; i < 25 * 80; i++) {
        vga[i] = ' ' | (0x0F << 8);
    }
}

Extension 3: Printf Implementation (Medium)

Implement a basic printf for formatted output. Handle %d, %x, %s, %c.
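As a starting point, the sketch below formats into a caller-supplied buffer using only freestanding headers (`<stdarg.h>`, `<stddef.h>`), so it can be unit-tested on the host before being wired to your VGA print routine. The name `kformat` and its buffer-based interface are assumptions for illustration, not a required design; width and padding specifiers are deliberately left out.

```c
#include <stdarg.h>
#include <stddef.h>

/* Append one character, always leaving room for the terminating '\0'. */
static void emit(char *out, size_t cap, size_t *len, char c) {
    if (*len + 1 < cap) {
        out[(*len)++] = c;
    }
}

/* Append an unsigned value in the given base (10 or 16). */
static void emit_uint(char *out, size_t cap, size_t *len,
                      unsigned int v, unsigned int base) {
    char tmp[32];
    size_t n = 0;
    do {                       /* digits come out least significant first */
        tmp[n++] = "0123456789abcdef"[v % base];
        v /= base;
    } while (v != 0);
    while (n > 0) {            /* so emit them in reverse */
        emit(out, cap, len, tmp[--n]);
    }
}

/* Format into `out` (capacity `cap`); supports %d %x %s %c %%.
 * Output is truncated if it does not fit. Returns characters written. */
size_t kformat(char *out, size_t cap, const char *fmt, ...) {
    va_list ap;
    size_t len = 0;
    if (cap == 0) return 0;
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') { emit(out, cap, &len, *fmt); continue; }
        fmt++;
        switch (*fmt) {
        case 'd': {
            int v = va_arg(ap, int);
            unsigned int mag = (unsigned int)v;
            if (v < 0) { emit(out, cap, &len, '-'); mag = 0u - mag; }
            emit_uint(out, cap, &len, mag, 10);
            break;
        }
        case 'x': emit_uint(out, cap, &len, va_arg(ap, unsigned int), 16); break;
        case 's': {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0') emit(out, cap, &len, *s++);
            break;
        }
        case 'c': emit(out, cap, &len, (char)va_arg(ap, int)); break;
        case '%': emit(out, cap, &len, '%'); break;
        case '\0': fmt--; break;   /* lone '%' at end of string: stop cleanly */
        default:  emit(out, cap, &len, '%'); emit(out, cap, &len, *fmt); break;
        }
    }
    va_end(ap);
    out[len] = '\0';
    return len;
}
```

In the kernel, a `printf` wrapper could call `kformat` on a small stack buffer and pass the result to your existing string-printing function. Formatting into a buffer first keeps the parsing logic independent of the VGA driver.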

Extension 4: Keyboard Input (Advanced)

Read scan codes from keyboard I/O port 0x60 (requires interrupt setup - see Project 7).


9. Real-World Connections

How This Relates to Professional Software

| This Project | Real-World Equivalent |
|---|---|
| GDT setup | Linux kernel GDT in arch/x86/kernel/cpu/common.c |
| Mode switch | Linux real-mode setup code in arch/x86/boot/pm.c and pmjump.S (real mode → protected mode) |
| VGA output | Early Linux printk, BIOS console |
| Linker script | Linux arch/x86/kernel/vmlinux.lds.S |

Industry Applications

  1. Operating System Development: Windows, Linux, BSD kernel startup
  2. BIOS/UEFI Firmware: AMI, Phoenix, Intel BIOS source
  3. Hypervisors: VMware, Xen, KVM host code
  4. Embedded x86: Industrial controllers, kiosks
  5. Security Research: Rootkit/bootkit analysis

10. Resources

Official Documentation

Tutorials

Tools


11. Self-Assessment Checklist

Understanding

  • I can explain each field in a GDT entry
  • I understand why base=0, limit=0xFFFFF with 4 KB granularity (G=1) creates 4GB segments
  • I know what CR0.PE does and why the far jump is needed
  • I can describe the difference between 16-bit and 32-bit execution
  • I understand why BIOS calls don’t work in protected mode

Implementation

  • GDT has correct null, code, and data segments
  • Mode switch sequence is complete (CLI, A20, LGDT, CR0.PE, far jump)
  • Segment registers are reloaded after switch
  • Stack is properly set up before calling C
  • VGA output works correctly

Skills

  • I can build with the cross-compiler
  • I can debug with QEMU and GDB
  • I can read and write linker scripts
  • I can link assembly and C code
  • I understand memory layout of my kernel

12. Submission / Completion Criteria

Your project is complete when:

  1. Build Requirements
    • Builds with make using i686-elf-gcc
    • Produces a valid ELF or binary kernel image
    • Linker script controls memory layout
  2. Functional Requirements
    • Boots in QEMU (with -kernel or custom boot)
    • Successfully switches to protected mode
    • Executes C code
    • Displays output via VGA text mode
  3. Code Quality
    • GDT entries are documented
    • Assembly and C are properly separated
    • Build process is reproducible
  4. Documentation
    • README explains the project
    • Memory map is documented
    • Build requirements are listed

Verification Commands

# Build
make clean && make

# Verify ELF file
file kernel.elf
readelf -h kernel.elf

# Run
qemu-system-i386 -kernel kernel.elf

# Debug
qemu-system-i386 -kernel kernel.elf -s -S &
gdb kernel.elf -ex "target remote :1234" -ex "break kernel_main" -ex "continue"

Congratulations! Completing this project means you understand the most critical transition in x86 system programming. You now have a 32-bit kernel that’s the foundation for interrupts (Project 7), paging (Project 8), and eventually a full operating system with multitasking.