Project 2: Bootloader that Loads a Kernel

Build a two-stage bootloader that loads a kernel image, switches to protected mode, and jumps into C code.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Intermediate |
| Time Estimate | 12-18 hours |
| Main Programming Language | x86 Assembly + C (freestanding) |
| Alternative Programming Languages | NASM or GAS; Rust for kernel stub |
| Coolness Level | Very High |
| Business Potential | Low, but foundational |
| Prerequisites | Project 1, basic linker scripts |
| Key Topics | Mode switching, GDT, disk reads, ELF loading |

1. Learning Objectives

By completing this project, you will:

  1. Implement a multi-stage boot path with a size-constrained stage 1.
  2. Enable the A20 line and enter 32-bit protected mode safely.
  3. Load a kernel image into memory and jump to a C entry point.
  4. Explain how GDT selectors and segment registers enable flat memory.

2. All Theory Needed (Per-Concept Breakdown)

Mode Switching and Kernel Loading Pipeline

Fundamentals

A multi-stage bootloader exists because the boot sector is too small for real work. Stage 1 fits in 512 bytes and loads a larger stage 2 from disk. Stage 2 sets up a proper execution environment: it enables the A20 line (allowing access above 1 MB), builds a Global Descriptor Table (GDT), switches the CPU into 32-bit protected mode, and loads a kernel image into memory. Only after these steps can you jump into C code. The key idea is that each stage establishes a stronger contract for the next: more memory, more instructions, and a clearer execution model. You must also choose a kernel image format (raw binary or ELF) and place it at a safe physical address that does not overlap the bootloader itself.

Deep Dive into the concept

The bootloader pipeline is a carefully ordered sequence of machine state transitions. Stage 1 runs in 16-bit real mode, so it must use BIOS interrupts to read disk sectors. It cannot fit a full loader or parser, which is why it typically loads a fixed number of sectors containing stage 2. Stage 2 still starts in real mode but can be much larger, so it can implement disk reading logic, basic diagnostics, and memory map handling.

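Before you can use memory above 1 MB, you must enable the A20 line. Historically this was done by programming the keyboard controller (ports 0x60 and 0x64); many newer systems also support the "fast A20" gate, bit 1 of port 0x92. While A20 is disabled, address bit 20 is forced to zero, so an access to 0x100000 wraps around to 0x000000. A kernel copied to 1 MB would land on the interrupt vector table and low memory instead, usually corrupting state you still depend on. The A20 enable step is small but critical, and it is worth verifying rather than assuming, because some firmware and emulators start with A20 already enabled.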
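To make the wrap-around concrete, here is a small host-side model in C. It is not boot code: it only mimics what the address bus sees while A20 is masked. The function names `a20_masked` and `real_mode_linear` are made up for this illustration.

```c
#include <stdint.h>

/* Host-side model only: with the A20 gate disabled, address bit 20 is
 * forced to 0 before the address reaches memory. */
static uint32_t a20_masked(uint32_t phys_addr)
{
    return phys_addr & ~(UINT32_C(1) << 20);
}

/* Real-mode address translation: segment * 16 + offset.
 * The largest result, FFFF:FFFF, is 0x10FFEF, just above 1 MB. */
static uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

Under this model, `a20_masked(0x100000)` is `0x000000`, and the real-mode address FFFF:0010 aliases 0000:0000. That aliasing is the basis of a common A20 test: write different values through two such aliases and check whether they collide.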

Next, you must build and load a GDT. In protected mode, segment registers contain selectors into the GDT; each descriptor defines a base, limit, and flags. The common teaching pattern is a flat memory model: base = 0x0 and limit = 0xFFFFF with 4 KiB granularity (4 GB) for both code and data segments. This gives you linear addressing and avoids segmentation complexity. However, you must still load segment registers correctly, because protected mode requires valid selectors. The mode switch is not just a flag: disable interrupts first (no protected-mode IDT exists yet), load the GDT with lgdt, set CR0.PE, and then immediately perform a far jump. The far jump loads CS with the new code selector and discards instructions that were already fetched as 16-bit code. If you skip it, the CPU keeps running with stale real-mode CS attributes, which typically ends in a triple fault and a reset.
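The descriptor layout is easier to see in code. The sketch below packs the flat-model descriptors into 8-byte entries on the host; `gdt_entry` and `build_flat_gdt` are hypothetical helper names, and the access bytes 0x9A and 0x92 are the usual ring-0 code and data values rather than anything this project mandates.

```c
#include <stdint.h>

/* Pack one 8-byte segment descriptor.
 * limit:  20-bit value
 * access: present / DPL / S / type byte
 * flags:  4-bit nibble (G, D/B, L, AVL) */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             /* limit 15:0   */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* base 23:0    */
    d |= (uint64_t)access << 40;                 /* access byte  */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* limit 19:16  */
    d |= (uint64_t)(flags & 0xF) << 52;          /* flags nibble */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;  /* base 31:24   */
    return d;
}

/* Flat model: null descriptor, code (selector 0x08), data (selector 0x10). */
static void build_flat_gdt(uint64_t gdt[3])
{
    gdt[0] = 0;                                  /* mandatory null descriptor      */
    gdt[1] = gdt_entry(0, 0xFFFFF, 0x9A, 0xC);   /* code: execute/read, 4 KiB, 32-bit */
    gdt[2] = gdt_entry(0, 0xFFFFF, 0x92, 0xC);   /* data: read/write, 4 KiB, 32-bit   */
}
```

The two non-null entries come out as 0x00CF9A000000FFFF and 0x00CF92000000FFFF, which are the constants many tutorials write out by hand with dw/db directives.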

Once in protected mode, you must set up a stack in a safe region of memory. The kernel entry point follows a C calling convention, which depends on a valid stack. You also need to decide where to load the kernel. Many teaching systems load at 0x10000 or 0x100000. The load address must match the address the kernel is linked at in your linker script, and it must not overlap your bootloader, its stack, or the BIOS data areas. Keep in mind that BIOS disk services are real-mode calls: a common approach is to read the kernel file into a low-memory buffer with INT 0x13 before the mode switch and move it to its final address afterwards. If you choose ELF as the kernel format, the loader reads the ELF header and copies each PT_LOAD segment to its target physical address, zeroing the part of each segment beyond its file size (the BSS). This approach mirrors how real kernels are loaded and teaches you about executable formats.
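A minimal sketch of the PT_LOAD copy loop follows, written so it can run on a host. The struct and function names (`Elf32Header`, `Elf32ProgHeader`, `elf32_load`) are illustrative; the field layout follows the ELF32 format. The `mem` parameter stands in for physical address 0, which is an assumption made only for testability. A freestanding loader would write to `p_paddr` directly and would have to supply its own `memcpy` and `memset`.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define ELF_PT_LOAD 1u

typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Elf32Header;

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32ProgHeader;

/* Copy every PT_LOAD segment to mem + p_paddr and zero its BSS tail.
 * Returns the entry address, or 0 if the image is rejected. */
static uint32_t elf32_load(const uint8_t *image, size_t image_len,
                           uint8_t *mem, size_t mem_len)
{
    Elf32Header eh;
    if (image_len < sizeof eh) return 0;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0) return 0;  /* magic      */
    if (eh.e_ident[4] != 1) return 0;                         /* ELFCLASS32 */
    if (eh.e_phentsize != sizeof(Elf32ProgHeader)) return 0;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf32ProgHeader ph;
        size_t off = (size_t)eh.e_phoff + (size_t)i * sizeof ph;
        if (off > image_len || image_len - off < sizeof ph) return 0;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != ELF_PT_LOAD) continue;   /* skip notes, etc. */

        /* Reject segments that fall outside the file or the target memory. */
        if (ph.p_filesz > ph.p_memsz) return 0;
        if (ph.p_offset > image_len || image_len - ph.p_offset < ph.p_filesz) return 0;
        if (ph.p_paddr > mem_len || mem_len - ph.p_paddr < ph.p_memsz) return 0;

        memcpy(mem + ph.p_paddr, image + ph.p_offset, ph.p_filesz);
        memset(mem + ph.p_paddr + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
    }
    return eh.e_entry;
}
```

Using 0 as the error value keeps the sketch short, but it cannot distinguish a rejected image from an entry point at address 0; a fuller loader would return a status code separately.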

A robust loader also communicates boot information to the kernel, such as memory size, boot device, or command line. For this project, a small struct passed in a register or on the stack is enough. The main goal is to prove that you can cleanly bridge real mode firmware into 32-bit C code without undefined behavior.

The kernel stub should be freestanding. You compile with -ffreestanding and avoid libc. That forces you to implement your own basic printing (VGA or serial) and understand the low-level ABI. The first time you see your C code print a line after a mode switch, you are holding the entire boot pipeline in your hands. Later projects, such as xv6, build on the same chain.
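As a sketch of what "your own basic printing" can look like, the code below encodes VGA text-mode cells and writes a string into a cell buffer. The names `vga_cell` and `vga_puts_at` are hypothetical. The buffer is passed as a parameter so the code can be exercised on a host; in the kernel stub it would be `(volatile uint16_t *)0xB8000`, assuming the standard 80x25 colour text mode.

```c
#include <stdint.h>
#include <stddef.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };

/* One text-mode cell: low byte = character, high byte = attribute
 * (background colour in bits 7:4, foreground colour in bits 3:0). */
static uint16_t vga_cell(char c, uint8_t fg, uint8_t bg)
{
    uint8_t attr = (uint8_t)(((bg & 0x0F) << 4) | (fg & 0x0F));
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)c);
}

/* Write a NUL-terminated string starting at (row, col), clipping at the
 * right edge of the row. Returns the number of cells written. */
static size_t vga_puts_at(volatile uint16_t *buf, int row, int col,
                          const char *s, uint8_t fg, uint8_t bg)
{
    size_t n = 0;
    if (row < 0 || row >= VGA_ROWS || col < 0) return 0;
    while (*s != '\0' && col < VGA_COLS) {
        buf[row * VGA_COLS + col] = vga_cell(*s, fg, bg);
        s++;
        col++;
        n++;
    }
    return n;
}
```

Only `<stdint.h>` and `<stddef.h>` are used, and both are among the headers a freestanding C implementation is required to provide, so the same code compiles with -ffreestanding.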

How this fits into the project

This concept is used in Section 3.1 (boot image layout), Section 3.7 (golden path boot log), and Section 5.10 Phases 1-2. It directly prepares you for Project 14.

Definitions & key terms

  • A20 line: Gate that enables access above 1 MB.
  • GDT: Table of segment descriptors used in protected mode.
  • Protected mode: 32-bit x86 mode with segmentation and privilege levels.
  • ELF: Executable format with loadable segments.
  • Far jump: Jump that reloads CS and flushes the instruction pipeline.

Mental model diagram (ASCII)

Stage 1 (real) -> Stage 2 (real) -> Protected mode -> Kernel C
   |                |                 |                 |
   | read sectors   | set A20 + GDT   | load ELF        | print

How it works (step-by-step)

  1. BIOS loads stage 1 at 0x7C00 and jumps.
  2. Stage 1 reads stage 2 into memory and jumps.
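  3. Stage 2 reads the kernel image into a low-memory buffer with BIOS calls, enables A20, and sets up the GDT.
  4. Stage 2 disables interrupts, sets CR0.PE, and performs a far jump.
  5. Protected-mode code sets up stack and segments.
  6. The kernel image is copied (or its ELF segments unpacked) to its load address.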
  7. Control transfers to kernel entry point.

Minimal concrete example

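; after loading the GDT with lgdt (still in 16-bit real mode)
cli                 ; no protected-mode IDT yet, so keep interrupts off
mov eax, cr0
or eax, 1           ; set CR0.PE
mov cr0, eax
jmp 0x08:pm_entry   ; far jump: load CS with the 32-bit code selector

[bits 32]
pm_entry:
    mov ax, 0x10    ; flat data selector
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
    mov esp, 0x90000
    call kernel_entry
.hang:
    hlt             ; kernel_entry should not return; halt if it does
    jmp .hang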
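
The constants 0x08 and 0x10 in the example are segment selectors, not addresses. The short C sketch below shows how a selector is composed; the helper names are made up for illustration.

```c
#include <stdint.h>

/* A segment selector: bits 15:3 = descriptor index, bit 2 = table
 * indicator (0 = GDT, 1 = LDT), bits 1:0 = requested privilege level. */
static uint16_t make_selector(uint16_t index, unsigned ti, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((ti & 1u) << 2) | (rpl & 3u));
}

static uint16_t selector_index(uint16_t sel) { return (uint16_t)(sel >> 3); }
static unsigned selector_rpl(uint16_t sel)   { return sel & 3u; }
```

With GDT entry 0 reserved as the null descriptor, entry 1 (code) gives selector 0x08 and entry 2 (data) gives selector 0x10, both at privilege level 0.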

Common misconceptions

  • “Just set CR0.PE”: you must also far-jump and reload CS.
  • “The stack is already valid”: you must define SS:ESP.
  • “ELF is required”: a raw binary is enough to boot and is easier to load; ELF is the more realistic option, not a prerequisite.

Check-your-understanding questions

  1. Why does enabling A20 matter for loading a kernel above 1 MB?
  2. What does the GDT enable that real mode cannot?
  3. Why is a far jump required during mode switching?
  4. What does -ffreestanding change in the build?

Check-your-understanding answers

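  1. Without A20, address bit 20 is forced to zero, so a kernel copied to 0x100000 wraps to 0x000000 and corrupts low memory.
  2. It supplies the segment descriptors (base, limit, access rights, 32-bit default operand size) that protected mode requires; a real-mode segment is only a value shifted left by 4.
  3. It loads CS with a selector for the 32-bit code descriptor and discards instructions already fetched as 16-bit code.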
  4. It disables assumptions about libc and hosted environments.

Real-world applications

  • Bootloaders like GRUB and Syslinux.
  • Firmware chains in embedded x86 systems.

Where you’ll apply it

  • This project: Section 3.1, Section 3.7, Section 5.10 Phases 1-2.
  • Also used in: Project 1, Project 14.

References

  • Intel SDM Vol. 3A (Protected mode)
  • “Operating Systems: Three Easy Pieces” (boot overview)
  • “CS:APP” Ch. 7 (linking and ELF)

Key insights

The bootloader is a controlled state transition machine; each step makes the next step possible.

Summary

You are building the shortest possible bridge between firmware and a freestanding C kernel.

Homework/Exercises to practice the concept

  1. Add a visual progress log during the boot flow.
  2. Add a second C function to the kernel and call it from assembly.
  3. Parse ELF headers to load only PT_LOAD segments.

Solutions to the homework/exercises

  1. Write to VGA memory at 0xB8000 after each step.
  2. Add a second symbol in kernel.c and call it.
  3. Read the ELF header, iterate program headers, copy segments.

3. Project Specification

3.1 What You Will Build

A two-stage bootloader image that reads a kernel image from disk, switches the CPU into 32-bit protected mode, and executes a freestanding C kernel entry point that prints a banner.

3.2 Functional Requirements

  1. Stage 1 loads stage 2 reliably from disk.
  2. Stage 2 enables A20 and switches to protected mode.
  3. Kernel loading copies the kernel image to its load address.
  4. Kernel entry prints a banner proving C execution.

3.3 Non-Functional Requirements

  • Performance: Boot log prints in under 1 second.
  • Reliability: No triple faults; repeatable boot.
  • Usability: Single make builds full image.

3.4 Example Usage / Output

$ make
$ qemu-system-x86_64 -drive format=raw,file=os.bin

Bootloader: reading kernel...
Bootloader: enabling A20...
Bootloader: entering protected mode...
Kernel: hello from 32-bit C

3.5 Data Formats / Schemas / Protocols

  • Stage 1/2 layout: fixed sector count or simple header.
  • Kernel: raw binary or ELF with loadable segments.

3.6 Edge Cases

  • Wrong sector count (kernel not fully loaded).
  • A20 not enabled (memory wrap).
  • GDT misconfigured (triple fault).

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

make
qemu-system-x86_64 -drive format=raw,file=os.bin

3.7.2 Golden Path Demo (Deterministic)

  • Fixed kernel size and fixed load address.
  • Deterministic boot log with fixed strings.

3.7.3 Exact Terminal Transcript

$ make
nasm -f bin stage1.asm -o stage1.bin
nasm -f bin stage2.asm -o stage2.bin
gcc -m32 -ffreestanding -fno-pie -c kernel.c -o kernel.o
ld -m elf_i386 -T linker.ld --oformat binary -o kernel.bin kernel.o
cat stage1.bin stage2.bin kernel.bin > os.bin
$ qemu-system-x86_64 -drive format=raw,file=os.bin

Bootloader: reading kernel...
Bootloader: enabling A20...
Bootloader: entering protected mode...
Kernel: hello from 32-bit C

Failure demo (deterministic):

$ dd if=/dev/zero of=os.bin bs=512 count=1
$ qemu-system-x86_64 -drive format=raw,file=os.bin

[QEMU]
Boot failed: not a bootable disk
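
The failure demo works because a zero-filled first sector lacks the boot signature a legacy BIOS checks before jumping to 0x7C00. A small host-side check, which could be run against `os.bin` before launching QEMU, might look like the sketch below; the function name is an assumption.

```c
#include <stdint.h>
#include <stddef.h>

enum { BOOT_SECTOR_SIZE = 512 };

/* Returns 1 if the first sector ends with the bytes 0x55, 0xAA at
 * offsets 510 and 511, and 0 otherwise. */
static int has_boot_signature(const uint8_t *sector, size_t len)
{
    if (len < BOOT_SECTOR_SIZE) return 0;
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```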

Exit codes:

  • 0 success
  • 2 build error
  • 3 QEMU boot failure

4. Solution Architecture

4.1 High-Level Design

Stage1 -> Stage2 -> Mode Switch -> Kernel

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Stage 1 | Load stage 2 | Fixed sector count |
| Stage 2 | Enable A20 + GDT | Flat 32-bit segments |
| Loader | Copy kernel | Raw or ELF |
| Kernel stub | Print banner | VGA text output |

4.3 Data Structures (No Full Code)

struct boot_info {
    uint32_t mem_kb;
    uint32_t boot_device;
};
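
Because this struct is filled in by assembly and read by C, both sides must agree on its exact layout. The sketch below pins the layout down with compile-time checks and adds two hypothetical accessors. The interpretation of `boot_device` as a BIOS drive number (0x80 and above for hard disks) is an assumption, since the project does not define the field's contents.

```c
#include <stdint.h>
#include <stddef.h>

struct boot_info {
    uint32_t mem_kb;       /* memory size in KiB */
    uint32_t boot_device;  /* assumed: BIOS drive number, e.g. 0x80 */
};

/* Fail the build if the compiler lays the struct out differently from
 * what the assembly side writes. */
_Static_assert(sizeof(struct boot_info) == 8, "boot_info must be 8 bytes");
_Static_assert(offsetof(struct boot_info, boot_device) == 4, "unexpected padding");

/* Hypothetical helpers for the kernel side. */
static uint32_t boot_mem_mib(const struct boot_info *bi)
{
    return bi->mem_kb / 1024;
}

static int boot_is_hard_disk(const struct boot_info *bi)
{
    return (bi->boot_device & 0x80) != 0;
}
```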

4.4 Algorithm Overview

Key Algorithm: Load kernel sectors

  1. Calculate sector count.
  2. Read sectors using BIOS INT 0x13.
  3. Copy to load address.

Complexity Analysis:

  • Time: O(n) sectors
  • Space: O(1) beyond buffers
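
The arithmetic behind steps 1 and 2 can be sketched in C as below. `sectors_for` and `lba_to_chs` are illustrative names. The CHS conversion applies to the classic INT 0x13, AH=0x02 read, and the geometry (heads, sectors per track) is passed in as a parameter because it depends on the drive.

```c
#include <stdint.h>

enum { SECTOR_SIZE = 512 };

struct chs {
    uint16_t cylinder;
    uint8_t  head;
    uint8_t  sector;   /* 1-based, as the BIOS expects */
};

/* Round a byte count up to whole 512-byte sectors. */
static uint32_t sectors_for(uint32_t bytes)
{
    return bytes / SECTOR_SIZE + (bytes % SECTOR_SIZE != 0);
}

/* Convert a 0-based logical block address to cylinder/head/sector. */
static struct chs lba_to_chs(uint32_t lba, uint32_t heads, uint32_t spt)
{
    struct chs r;
    r.cylinder = (uint16_t)(lba / (heads * spt));
    r.head     = (uint8_t)((lba / spt) % heads);
    r.sector   = (uint8_t)(lba % spt + 1);
    return r;
}
```

For a 1.44 MB floppy geometry (2 heads, 18 sectors per track), LBA 18 maps to cylinder 0, head 1, sector 1. If the BIOS supports the INT 0x13 extensions (AH=0x42), the loader can pass the LBA directly and skip this conversion.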

5. Implementation Guide

5.1 Development Environment Setup

sudo apt-get install nasm gcc qemu-system-x86

5.2 Project Structure

project-root/
|-- stage1.asm
|-- stage2.asm
|-- kernel.c
|-- linker.ld
|-- Makefile
`-- os.bin

5.3 The Core Question You’re Answering

“How do I move from firmware-controlled real mode to my own protected-mode kernel with a clean ABI boundary?”

5.4 Concepts You Must Understand First

  1. BIOS disk I/O via INT 0x13.
  2. GDT layout and selectors.
  3. Kernel image layout and load address.

5.5 Questions to Guide Your Design

  1. What is the safe kernel load address?
  2. How will you verify A20 is enabled?
  3. Will you load a raw binary or parse ELF?

5.6 Thinking Exercise

Sketch memory from 0x0000 to 0x200000 and mark stage1, stage2, kernel.
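
Once the sketch exists on paper, it can be checked mechanically. The host-side code below tests a list of regions for overlap. The region names and the idea of encoding the map as a table are assumptions for illustration, and the caller supplies the addresses and sizes from their own layout.

```c
#include <stdint.h>
#include <stddef.h>

struct region {
    const char *name;
    uint32_t    start;
    uint32_t    size;
};

/* Two half-open ranges [start, start + size) overlap if each one begins
 * before the other ends. 64-bit sums avoid overflow near 4 GB. */
static int regions_overlap(const struct region *a, const struct region *b)
{
    uint64_t a_end = (uint64_t)a->start + a->size;
    uint64_t b_end = (uint64_t)b->start + b->size;
    return a->start < b_end && b->start < a_end;
}

/* Returns the index of the first region that overlaps an earlier one,
 * or -1 if the map is free of overlaps. */
static int first_overlap(const struct region *r, size_t n)
{
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (regions_overlap(&r[i], &r[j]))
                return (int)i;
    return -1;
}
```

A map with the IVT and BIOS data area at 0x0000-0x04FF, stage 1 at 0x7C00 (512 bytes), a stage 2 placed at 0x7E00, and a kernel at 0x100000 passes the check; moving the kernel to 0x7D00 does not.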

5.7 The Interview Questions They’ll Ask

  1. Why is a far jump required after setting CR0.PE?
  2. What does -ffreestanding do?
  3. Why does a bootloader exist at all?

5.8 Hints in Layers

Hint 1: Start with a stage2 that only prints.

Hint 2: Add A20 enable and GDT setup.

Hint 3: Load a raw kernel and jump to it.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Boot flow | OSTEP | 36 |
| Linking | CS:APP | 7 |
| x86 modes | Intel SDM | 3A |

5.10 Implementation Phases

Phase 1: Stage 1 loader (3-4 hours)

  • Goals: Load stage2 from disk.
  • Tasks: BIOS read loop, fixed sector count.
  • Checkpoint: Stage2 prints message.

Phase 2: Mode switch (4-6 hours)

  • Goals: Enter protected mode safely.
  • Tasks: A20 enable, GDT build, far jump.
  • Checkpoint: Protected-mode code runs.

Phase 3: Kernel handoff (4-6 hours)

  • Goals: Load kernel and run C entry.
  • Tasks: Build linker script, jump to entry.
  • Checkpoint: C banner prints.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Kernel format | Raw vs ELF | Raw first, ELF later | Simpler initial success |
| Load address | 0x10000 vs 0x100000 | 0x100000 | Matches many tutorials |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Boot tests | Verify stages load | QEMU boot log |
| Mode tests | Confirm protected mode | Print 32-bit register value |
| Kernel tests | Verify C entry | Banner output |

6.2 Critical Test Cases

  1. Stage1 loads stage2 without corrupting itself.
  2. Protected-mode entry runs after far jump.
  3. Kernel banner prints from C.

6.3 Test Data

Expected boot log:
- reading kernel
- enabling A20
- entering protected mode
- kernel banner

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Bad GDT | Triple fault | Verify descriptor flags and limits |
| A20 disabled | Weird wrap-around | Use port 0x92 fast A20 |
| Wrong entry address | Hang | Check linker script and symbol |

7.2 Debugging Strategies

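  • Run QEMU with -d int to log interrupts and exceptions, and add -no-reboot so a triple fault stops the VM instead of silently resetting it.
  • Run QEMU with -S -s to pause the CPU at startup and open a GDB server on TCP port 1234, then attach from GDB with target remote :1234.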
  • Add VGA prints before and after mode switch.

7.3 Performance Traps

Not applicable; boot correctness is primary.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a progress bar in VGA text mode.

8.2 Intermediate Extensions

  • Load kernel as ELF and support multiple segments.

8.3 Advanced Extensions

  • Switch to 64-bit long mode and run a 64-bit kernel stub.

9. Real-World Connections

9.1 Industry Applications

  • Bootloaders like GRUB and iPXE use multi-stage pipelines.

9.2 Related Open Source Projects

  • GRUB: Full-featured bootloader.
  • Limine: Modern boot protocol and loader.

9.3 Interview Relevance

  • Explaining mode switching and boot chain integrity.

10. Resources

10.1 Essential Reading

  • Intel SDM Vol. 3A
  • “CS:APP” Ch. 7

10.2 Video Resources

  • OSDev wiki bootloader series (video or written)

10.3 Tools & Documentation

  • NASM and QEMU manuals

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the role of the GDT.
  • I can describe the A20 line problem.
  • I can trace the mode switch sequence.

11.2 Implementation

  • Stage1, stage2, and kernel all load correctly.
  • Protected mode is entered reliably.
  • Kernel banner prints from C.

11.3 Growth

  • I can explain my loader to another engineer.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Two-stage loader boots and prints kernel banner.

Full Completion:

  • A20 enable, protected mode switch, and kernel load all verified.

Excellence (Going Above & Beyond):

  • ELF parsing and long mode support.