Project 13: Operating System Kernel (Capstone)
The ultimate C++ challenge: writing bare-metal code that runs with no safety net
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Master |
| Time Estimate | 6-12 months |
| Language | C++ (with C and Assembly) |
| Prerequisites | All previous projects, assembly basics, computer architecture |
| Key Topics | Bare metal, paging, scheduling, interrupts, filesystems |
1. Learning Objectives
By completing this capstone project, you will:
- Understand the boot process from BIOS/UEFI through bootloader to kernel initialization
- Implement memory management including physical/virtual memory, paging, and heap allocation
- Master interrupt handling for hardware events, exceptions, and system calls
- Build a process scheduler with context switching and multiple execution contexts
- Create a simple filesystem for persistent storage and file operations
- Write C++ without the standard library using freestanding mode and custom allocators
- Use inline assembly to interface directly with CPU instructions and hardware
- Debug bare-metal code using QEMU, GDB, and serial output
This project synthesizes everything from the previous 12 projects: memory management from Project 3, data structures from Project 4, concurrency from Project 5, parsing from Projects 1/6/10, and low-level I/O from Projects 2/7/8.
2. Theoretical Foundation
2.1 Core Concepts
What is an Operating System Kernel?
The kernel is the core of an operating system. It’s the first program that runs after the bootloader, and it provides the fundamental abstractions that all other programs depend on:
Hardware Layer (CPU, Memory, Devices)
|
v
+------------------------+
| KERNEL |
| - Memory Management |
| - Process Scheduling |
| - Device Drivers |
| - System Calls |
| - Interrupt Handling |
+------------------------+
|
v
User Programs
Key Abstractions the Kernel Provides:
- Processes - Illusion of dedicated CPU for each program
- Virtual Memory - Illusion of private address space for each process
- Files - Uniform interface to persistent storage
- System Calls - Safe way for user programs to request kernel services
The Privilege Model:
Modern CPUs have protection rings. x86 has 4 rings, but most OSes use only 2:
Ring 0 (Kernel Mode)
+------------------------------------------+
| - Full hardware access |
| - Can execute privileged instructions |
| - Can access all memory |
| - Handles interrupts and exceptions |
+------------------------------------------+
Ring 3 (User Mode)
+------------------------------------------+
| - Restricted hardware access |
| - Cannot execute privileged instructions |
| - Can only access permitted memory |
| - Must use syscalls for kernel services |
+------------------------------------------+
Memory Architecture:
Physical Memory Layout (simplified):
+----------------+ 0x00000000
| Reserved/BIOS |
+----------------+ 0x00100000 (1 MB)
| Kernel Code |
| Kernel Data |
| Kernel Heap |
+----------------+
| Free Memory |
| (for processes)|
+----------------+ End of RAM
Virtual Address Space (per process):
+----------------+ 0x00000000
| User Code |
| User Data |
| User Heap |
| v |
| |
| ^ |
| User Stack |
+----------------+ 0xC0000000 (typical split)
| Kernel Space |
| (mapped same |
| in all procs) |
+----------------+ 0xFFFFFFFF
2.2 Why This Matters
Understanding the Foundation:
Every program you’ve ever written runs on top of an operating system. When you call malloc(), the kernel’s memory manager handles the request. When you read a file, the kernel’s filesystem and device drivers do the work. When your program runs concurrently with others, the kernel’s scheduler makes it possible.
Writing a kernel gives you:
- Complete understanding of how software interacts with hardware
- Debugging superpowers - you’ll understand crashes, memory issues, and performance at the deepest level
- Systems intuition - you’ll make better design decisions in all future projects
- Ultimate C++ mastery - no safety net means you truly understand the language
Industry Relevance:
- Embedded systems - IoT devices, automotive, aerospace all need kernel-level code
- Cloud infrastructure - Hypervisors, containers, and orchestration require kernel knowledge
- Security - Kernel vulnerabilities are the most critical; understanding prevents them
- Performance - When milliseconds matter, kernel-level optimization is essential
2.3 Historical Context
The Evolution of Operating Systems:
1960s: Batch Processing
Programs run one at a time, no interaction
1970s: Time-Sharing (UNIX born)
Multiple users share a single computer
Ken Thompson and Dennis Ritchie at Bell Labs
1980s: Personal Computers
MS-DOS, Mac OS, early Windows
Single-user, limited multitasking
1990s: Modern Kernels
Linux kernel (1991), Windows NT (1993)
True multitasking, virtual memory, networking
2000s-Present: Cloud and Containers
Hypervisors, containers, microservices
Kernels now virtualized and containerized
Key Kernel Architectures:
Monolithic Kernel (Linux, BSD):
+------------------------------------------+
| All kernel services in one address space |
| Drivers, filesystem, networking, etc. |
| Fast (no IPC overhead) |
| Large attack surface |
+------------------------------------------+
Microkernel (MINIX, QNX, seL4):
+------------------+
| Minimal kernel |
| IPC, scheduling |
| memory basics |
+------------------+
^
| IPC
v
+-------+-------+-------+
| FS | Net |Drivers| <- User-space servers
+-------+-------+-------+
Small attack surface, but IPC overhead
Hybrid (Windows NT, macOS):
Mix of both approaches
2.4 Common Misconceptions
Misconception 1: “Writing an OS requires years of work” Reality: A minimal kernel that boots, manages memory, and runs simple programs can be built in months. Linux started as a hobby project.
Misconception 2: “You can’t use C++ for kernel development” Reality: Many modern kernels use C++. You just can’t use features that require runtime support (exceptions, RTTI) without implementing that support first.
Misconception 3: “Kernels are only for x86” Reality: The concepts are universal. ARM, RISC-V, MIPS all work similarly. We use x86/x86-64 because QEMU emulation is excellent.
Misconception 4: “I need real hardware to develop a kernel” Reality: QEMU provides perfect emulation. You can develop entirely in a virtual machine with full debugging support.
Misconception 5: “Kernel development is dangerous and could break my computer” Reality: In an emulator, the worst case is restarting QEMU. Your host system is completely isolated.
3. Project Specification
3.1 What You Will Build
A minimal but complete operating system kernel for x86-64 architecture that:
- Boots via GRUB using the Multiboot2 specification
- Sets up the Global Descriptor Table (GDT) for memory segmentation
- Configures the Interrupt Descriptor Table (IDT) for interrupt handling
- Implements paging for virtual memory with 4KB pages
- Provides a heap allocator with custom
operator newandoperator delete - Includes a process scheduler with round-robin scheduling and context switching
- Has a simple filesystem for reading files from a RAM disk
- Runs user-space programs with proper privilege separation
- Provides system calls for I/O, process control, and memory allocation
3.2 Functional Requirements
| ID | Requirement | Description |
|---|---|---|
| F1 | Boot | Kernel boots via GRUB and reaches kernel_main() |
| F2 | Terminal | VGA text-mode output with scrolling and colors |
| F3 | GDT | Proper memory segmentation for kernel and user mode |
| F4 | IDT | Interrupt handlers for CPU exceptions and hardware IRQs |
| F5 | Paging | Virtual memory with kernel mapped in high memory |
| F6 | Heap | Dynamic memory allocation with new/delete |
| F7 | Keyboard | PS/2 keyboard input via IRQ1 |
| F8 | Timer | PIT or APIC timer for scheduling at ~100Hz |
| F9 | Scheduler | Round-robin scheduling with context switching |
| F10 | Filesystem | Simple filesystem (initrd or ext2-like) |
| F11 | User Mode | Transition to Ring 3 for user programs |
| F12 | Syscalls | System call interface via software interrupt |
| F13 | Shell | Basic command interpreter for user interaction |
3.3 Non-Functional Requirements
| ID | Requirement | Target |
|---|---|---|
| NF1 | Boot Time | < 1 second from GRUB to shell prompt |
| NF2 | Memory | Support up to 4GB RAM (32-bit) or 16GB (64-bit) |
| NF3 | Stability | No kernel panics during normal operation |
| NF4 | Debuggability | Serial console output for QEMU debugging |
| NF5 | Portability | Clean separation between arch-specific and generic code |
3.4 Example Usage / Output
$ make run
qemu-system-x86_64 -kernel kernel.bin -serial stdio
Booting MyOS v0.1.0...
[ 0.000] Kernel loaded at 0xFFFFFFFF80100000
[ 0.001] CPU: GenuineIntel, 64-bit mode
[ 0.002] Memory: 128 MB detected
[ 0.003] Initializing subsystems...
[ 0.003] GDT: OK (5 entries)
[ 0.004] IDT: OK (256 entries)
[ 0.005] Paging: OK (kernel mapped at 0xFFFFFFFF80000000)
[ 0.010] Heap: OK (64 MB available)
[ 0.011] Timer: OK (PIT at 100 Hz)
[ 0.012] Keyboard: OK (PS/2)
[ 0.015] Filesystem: OK (initrd mounted at /)
[ 0.020] Scheduler: OK
[ 0.025] Starting init process (PID 1)...
Welcome to MyOS!
myos:/$ ls
bin/ dev/ etc/ home/ usr/
myos:/$ cat /etc/motd
Welcome to MyOS - A minimal operating system built from scratch!
myos:/$ ps
PID STATE CMD
1 RUNNING init
2 RUNNING shell
myos:/$ /bin/hello
Hello from user space!
System call test: write() works!
myos:/$ /bin/forktest
Parent process: PID 3
Child process: PID 4
Fork and exec working correctly!
myos:/$ free
Total: 128 MB
Used: 12 MB
Free: 116 MB
Kernel: 4 MB
Heap: 8 MB
myos:/$ uptime
System uptime: 0d 0h 5m 23s
Tick count: 32300
myos:/$ exit
Halting system...
3.5 Real World Outcome
When you complete this project, you will have:
- A bootable kernel that runs in QEMU (and potentially real hardware)
- Full source code for memory management, scheduling, and I/O
- Understanding of OS concepts that’s impossible to get from reading alone
- A foundation for building more advanced OS features
- Interview-ready knowledge of systems programming fundamentals
You’ll be able to explain:
- How a computer boots from power-on to running your code
- How virtual memory protects processes from each other
- How the CPU switches between processes
- How system calls transition between user and kernel mode
- How interrupts enable hardware communication
4. Solution Architecture
4.1 High-Level Design
+------------------------------------------------------------------+
| User Space (Ring 3) |
| +----------+ +----------+ +----------+ +----------+ |
| | shell | | hello | | forktest | | ... | |
| +----+-----+ +----+-----+ +----+-----+ +----+-----+ |
| | | | | |
+-------|-------------|-------------|-------------|----------------+
| | | |
v v v v
+----------------------------------------------------------+
| System Call Interface |
| int 0x80 / syscall instruction |
+----------------------------------------------------------+
| | | |
+-------|-------------|-------------|-------------|----------------+
| Kernel Space (Ring 0) |
| +-------------+ +-------------+ +-------------+ |
| | Scheduler | | Memory Mgr | | Filesystem | |
| +------+------+ +------+------+ +------+------+ |
| | | | |
| +------v------+ +------v------+ +------v------+ |
| | Process | | Page | | VFS | |
| | Control | | Allocator | | Layer | |
| +------+------+ +------+------+ +------+------+ |
| | | | |
| +------v----------------v----------------v------+ |
| | Hardware Abstraction | |
| | +-------+ +-------+ +-------+ +-------+ | |
| | | GDT | | IDT | | Timer | | KBD | | |
| | +-------+ +-------+ +-------+ +-------+ | |
| +------------------------------------------------+ |
+------------------------------------------------------------------+
| | | |
v v v v
+------------------------------------------------------------------+
| Hardware Layer |
| CPU RAM VGA PIT PIC PS/2 Keyboard |
+------------------------------------------------------------------+
4.2 Key Components
Boot Sequence:
1. BIOS/UEFI -> GRUB Bootloader
|
2. GRUB loads kernel.bin at 0x100000 (1MB)
|
3. boot.asm: _start entry point
- Set up initial stack
- Check multiboot magic
- Call kernel_main()
|
4. kernel_main() (C++)
- Initialize GDT (segmentation)
- Initialize IDT (interrupts)
- Enable paging (virtual memory)
- Initialize heap (dynamic allocation)
- Initialize drivers (keyboard, timer)
- Start scheduler
- Launch init process
Memory Map After Boot:
Virtual Address Space (Higher-Half Kernel):
0xFFFFFFFF'FFFFFFFF +------------------------+
| Kernel Stack |
0xFFFFFFFF'80100000 +------------------------+
| Kernel Code/Data |
0xFFFFFFFF'80000000 +------------------------+
| Kernel Heap |
| v |
| |
| ^ |
0x00007FFF'FFFFFFFF +------------------------+
| User Stack |
| v |
| |
| ^ |
| User Heap |
+------------------------+
| User Code/Data |
0x00000000'00400000 +------------------------+
| Reserved |
0x00000000'00000000 +------------------------+
4.3 Data Structures
Process Control Block (PCB):
enum class ProcessState {
READY,
RUNNING,
BLOCKED,
TERMINATED
};
struct Context {
uint64_t rax, rbx, rcx, rdx;
uint64_t rsi, rdi, rbp;
uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
uint64_t rip; // Instruction pointer
uint64_t cs; // Code segment
uint64_t rflags; // CPU flags
uint64_t rsp; // Stack pointer
uint64_t ss; // Stack segment
};
struct Process {
uint32_t pid;
ProcessState state;
Context context;
uint64_t* page_table; // CR3 value
uint64_t kernel_stack;
uint64_t user_stack;
char name[64];
Process* next; // For scheduler queue
};
Page Table Entry (x86-64):
struct PageTableEntry {
uint64_t present : 1; // Page is in memory
uint64_t writable : 1; // Page is writable
uint64_t user : 1; // User-mode accessible
uint64_t pwt : 1; // Page write-through
uint64_t pcd : 1; // Page cache disable
uint64_t accessed : 1; // Page was accessed
uint64_t dirty : 1; // Page was written to
uint64_t pat : 1; // Page attribute table
uint64_t global : 1; // Global page
uint64_t available : 3; // Available for OS use
uint64_t address : 40; // Physical address (bits 12-51)
uint64_t reserved : 11; // Reserved
uint64_t nx : 1; // No-execute bit
};
GDT Entry:
struct GDTEntry {
uint16_t limit_low;
uint16_t base_low;
uint8_t base_middle;
uint8_t access;
uint8_t granularity;
uint8_t base_high;
} __attribute__((packed));
// Access byte bits:
// Bit 7: Present (1 = valid)
// Bit 6-5: Ring (0 = kernel, 3 = user)
// Bit 4: Descriptor type (1 = code/data)
// Bit 3: Executable (1 = code, 0 = data)
// Bit 2: Direction/Conforming
// Bit 1: Readable/Writable
// Bit 0: Accessed
IDT Entry:
struct IDTEntry {
uint16_t offset_low; // Offset bits 0-15
uint16_t selector; // Code segment selector
uint8_t ist; // Interrupt Stack Table offset
uint8_t type_attr; // Type and attributes
uint16_t offset_mid; // Offset bits 16-31
uint32_t offset_high; // Offset bits 32-63
uint32_t zero; // Reserved
} __attribute__((packed));
4.4 Algorithm Overview
Context Switching:
1. Timer interrupt fires (IRQ0)
2. CPU pushes interrupt frame (RIP, CS, RFLAGS, RSP, SS)
3. Interrupt handler saves remaining registers
4. Scheduler selects next process
5. Scheduler loads next process's registers
6. IRETQ restores RIP, CS, RFLAGS, RSP, SS
7. Execution continues in new process
Page Fault Handling:
1. CPU generates #PF exception
2. CR2 register contains faulting address
3. Error code indicates:
- Bit 0: Present (0 = not present, 1 = protection violation)
- Bit 1: Write (0 = read, 1 = write)
- Bit 2: User (0 = kernel, 1 = user mode)
4. Handler checks if address is valid for process
5. If valid: allocate page, map it, return
6. If invalid: kill process (segfault)
System Call Flow:
User code:
mov rax, SYSCALL_WRITE ; syscall number
mov rdi, fd ; arg1
mov rsi, buffer ; arg2
mov rdx, length ; arg3
int 0x80 ; trigger syscall
Kernel:
1. IDT entry 0x80 points to syscall_handler
2. Handler saves user context
3. Dispatch based on rax value
4. Execute kernel function
5. Store return value in rax
6. Restore user context
7. IRETQ back to user code
5. Implementation Guide
5.1 Development Environment Setup
Required Tools:
# Cross-compiler (builds code for target, not host)
# On Ubuntu/Debian:
sudo apt install build-essential bison flex libgmp3-dev \
libmpc-dev libmpfr-dev texinfo libisl-dev
# Or install pre-built cross-compiler:
sudo apt install gcc-x86-64-linux-gnu g++-x86-64-linux-gnu
# Assembler and linker
sudo apt install nasm
# QEMU for testing
sudo apt install qemu-system-x86
# GRUB tools for bootable image
sudo apt install grub-pc-bin xorriso mtools
# Debugging
sudo apt install gdb
# macOS (using Homebrew):
brew install x86_64-elf-gcc nasm qemu xorriso mtools
Verify Setup:
x86_64-elf-gcc --version # Cross-compiler
nasm --version # Assembler
qemu-system-x86_64 --version # Emulator
grub-mkrescue --version # GRUB tools
5.2 Project Structure
myos/
├── Makefile
├── linker.ld # Linker script
├── grub.cfg # GRUB configuration
│
├── boot/
│ ├── boot.asm # Multiboot header and entry point
│ ├── gdt.asm # GDT setup (assembly)
│ └── interrupts.asm # Interrupt stubs (assembly)
│
├── kernel/
│ ├── kernel.cpp # kernel_main()
│ ├── terminal.cpp # VGA text output
│ ├── terminal.h
│ ├── string.cpp # String utilities (no libc!)
│ ├── string.h
│ ├── port.h # I/O port access (inb, outb)
│ └── panic.cpp # Kernel panic handler
│
├── cpu/
│ ├── gdt.cpp # GDT initialization
│ ├── gdt.h
│ ├── idt.cpp # IDT initialization
│ ├── idt.h
│ ├── isr.cpp # Interrupt service routines
│ └── isr.h
│
├── memory/
│ ├── pmm.cpp # Physical memory manager
│ ├── pmm.h
│ ├── vmm.cpp # Virtual memory manager (paging)
│ ├── vmm.h
│ ├── heap.cpp # Kernel heap
│ └── heap.h
│
├── drivers/
│ ├── keyboard.cpp # PS/2 keyboard driver
│ ├── keyboard.h
│ ├── timer.cpp # PIT timer driver
│ └── timer.h
│
├── process/
│ ├── process.cpp # Process management
│ ├── process.h
│ ├── scheduler.cpp # Round-robin scheduler
│ └── scheduler.h
│
├── fs/
│ ├── vfs.cpp # Virtual filesystem layer
│ ├── vfs.h
│ ├── initrd.cpp # Initial RAM disk
│ └── initrd.h
│
├── syscall/
│ ├── syscall.cpp # System call dispatcher
│ └── syscall.h
│
└── user/
├── shell.cpp # Simple shell
├── init.cpp # Init process
└── programs/
├── hello.cpp # Hello world user program
└── forktest.cpp # Fork test program
5.3 The Core Question You’re Answering
“How does a computer execute code when there is no operating system to provide services?”
When you power on a computer, there’s nothing to call malloc() on, no printf() to call, no filesystem to read from. The kernel must create all of these services from scratch, using only:
- CPU instructions
- Memory (as raw bytes)
- I/O ports (for hardware communication)
You’re building the layer that makes everything else possible.
5.4 Concepts You Must Understand First
Before writing code, ensure you can answer:
Computer Architecture:
- What is the difference between real mode and protected mode? (x86)
- What are privilege rings and how do they work?
- How does the CPU switch from user mode to kernel mode?
Memory:
- What is the difference between physical and virtual addresses?
- How does a page table translate virtual to physical addresses?
- What is a TLB and why does it matter?
Interrupts:
- What happens when an interrupt fires?
- What is the difference between an interrupt and an exception?
- How does the CPU know where to jump when an interrupt occurs?
C++ Specifics:
- What is a freestanding environment vs hosted environment?
- What C++ features require runtime support (and thus won’t work)?
- How does operator new work when there’s no malloc?
Book References:
- “Operating Systems: Three Easy Pieces” - Chapters 1-6, 12-16, 36-39
- “Computer Systems: A Programmer’s Perspective” - Chapter 9 (Virtual Memory)
- “The Linux Programming Interface” - Chapters 2, 6 (concepts, not Linux-specific)
5.5 Questions to Guide Your Design
Boot Process:
- How will GRUB find and load your kernel?
- What state is the CPU in when your code starts executing?
- What’s the minimum setup needed before calling C++ code?
Memory Management:
- How will you track which physical pages are free?
- Will you use a bitmap, linked list, or buddy allocator?
- How will you handle memory fragmentation in the heap?
Interrupts:
- How will you handle CPU exceptions (divide by zero, page fault)?
- How will you handle hardware interrupts (keyboard, timer)?
- What’s your interrupt handler calling convention?
Scheduling:
- What scheduling algorithm will you use?
- How will you handle processes that block (waiting for I/O)?
- What data do you need to save during context switch?
User Space:
- How will you transition from kernel mode to user mode?
- How will user programs request kernel services?
- How will you protect kernel memory from user programs?
5.6 Thinking Exercise
Before writing any code, trace through this scenario mentally:
Exercise: A key is pressed on the keyboard
- Where does the electrical signal go first?
- How does the CPU know a key was pressed?
- What happens to the currently running process?
- How does the kernel get control?
- How is the key data read?
- How does the data get to the waiting process?
- How does the waiting process resume?
Write out each step with the specific registers, instructions, and data structures involved. This exercise will reveal gaps in your understanding before you start coding.
5.7 Hints in Layers
Hint 1 - Starting Point: Begin with boot.asm that sets up a Multiboot header. The header tells GRUB this is a valid kernel. Set up a stack and call kernel_main(). In kernel_main(), just write “Hello” to VGA memory at 0xB8000.
Hint 2 - Next Level (GDT/IDT): The GDT defines memory segments. Set up null, kernel code, kernel data, user code, and user data segments. The IDT maps interrupt numbers to handlers. Create stub handlers that just print “Interrupt X” and hang.
Hint 3 - Technical Details (Memory): For physical memory, use a bitmap where each bit represents a 4KB page. Parse the multiboot memory map to find usable regions. For paging, set up a 4-level page table (PML4 -> PDPT -> PD -> PT) and enable paging by setting CR3 and the PG bit in CR0.
Hint 4 - Tools/Debugging:
Use QEMU with -d int -no-reboot -no-shutdown to see interrupts. Use -serial stdio for serial output (easier to debug than VGA). Use GDB with qemu-system-x86_64 -s -S to single-step through code.
5.8 The Interview Questions They’ll Ask
- “Explain the x86 boot process from power-on to your kernel’s entry point.”
- Expected: BIOS POST, MBR/GPT, bootloader stages, protected/long mode transition
- “How does virtual memory protect processes from each other?”
- Expected: Separate page tables, user/supervisor bit, page fault handling
- “Walk me through what happens when a process makes a system call.”
- Expected: int/syscall instruction, privilege switch, register conventions, iret
- “How does your scheduler choose the next process to run?”
- Expected: Ready queue, round-robin, time quantum, context switch mechanics
- “What happens when a page fault occurs?”
- Expected: CR2 contains faulting address, error code interpretation, demand paging
- “How do you handle the ‘no standard library’ constraint in C++?”
- Expected: Freestanding mode, custom operator new/delete, no exceptions/RTTI
- “What’s the difference between an interrupt and an exception?”
- Expected: External vs internal, IRQs vs traps/faults, PIC/APIC vs CPU-generated
- “How would you add support for multiprocessing (SMP)?”
- Expected: Per-CPU structures, APIC, spinlocks, scheduler per CPU
5.9 Books That Will Help
| Topic | Book | Chapter(s) |
|---|---|---|
| OS Concepts | “Operating Systems: Three Easy Pieces” | All, especially Part I & II |
| x86 Architecture | Intel Manual Volume 3A | Chapters 2-6 |
| Virtual Memory | “CS:APP” | Chapter 9 |
| Boot Process | “Low-Level Programming” by Igor Zhirkov | Chapters 17-19 |
| C++ Freestanding | “Effective Modern C++” | Items relevant to embedded |
| Filesystems | “Operating Systems: Three Easy Pieces” | Chapters 39-42 |
| Scheduling | “Operating Systems: Three Easy Pieces” | Chapters 7-10 |
5.10 Implementation Phases
Phase 1: Boot and Print (Week 1-2)
Goal: Kernel boots and prints “Hello” to screen
; boot.asm - Multiboot header
section .multiboot
align 4
dd 0x1BADB002 ; Magic number
dd 0x00 ; Flags (no special requirements)
dd -(0x1BADB002 + 0x00) ; Checksum (must equal zero when added)
section .bss
align 16
stack_bottom:
resb 16384 ; 16 KB stack
stack_top:
section .text
global _start
extern kernel_main
_start:
; Set up stack
mov esp, stack_top
; Disable interrupts
cli
; Call C++ kernel
call kernel_main
; Halt if kernel returns
cli
.hang:
hlt
jmp .hang
// kernel.cpp
extern "C" void kernel_main() {
// VGA text buffer at 0xB8000
volatile char* video = (volatile char*)0xB8000;
const char* hello = "Hello from kernel!";
for (int i = 0; hello[i] != '\0'; i++) {
video[i * 2] = hello[i]; // Character
video[i * 2 + 1] = 0x0F; // White on black
}
while (true) {
asm volatile("hlt");
}
}
Phase 2: GDT and Protected Mode (Week 3-4)
Goal: Proper memory segmentation
// gdt.h
struct GDTEntry {
uint16_t limit_low;
uint16_t base_low;
uint8_t base_middle;
uint8_t access;
uint8_t granularity;
uint8_t base_high;
} __attribute__((packed));
struct GDTPointer {
uint16_t limit;
uint64_t base;
} __attribute__((packed));
namespace GDT {
void initialize();
}
// gdt.cpp
static GDTEntry gdt[5];
static GDTPointer gdtr;
void setEntry(int num, uint32_t base, uint32_t limit,
uint8_t access, uint8_t gran) {
gdt[num].base_low = base & 0xFFFF;
gdt[num].base_middle = (base >> 16) & 0xFF;
gdt[num].base_high = (base >> 24) & 0xFF;
gdt[num].limit_low = limit & 0xFFFF;
gdt[num].granularity = ((limit >> 16) & 0x0F) | (gran & 0xF0);
gdt[num].access = access;
}
void GDT::initialize() {
gdtr.limit = sizeof(gdt) - 1;
gdtr.base = (uint64_t)&gdt;
setEntry(0, 0, 0, 0, 0); // Null
setEntry(1, 0, 0xFFFFFFFF, 0x9A, 0xCF); // Kernel code
setEntry(2, 0, 0xFFFFFFFF, 0x92, 0xCF); // Kernel data
setEntry(3, 0, 0xFFFFFFFF, 0xFA, 0xCF); // User code
setEntry(4, 0, 0xFFFFFFFF, 0xF2, 0xCF); // User data
// Load GDT (assembly helper)
gdt_flush(&gdtr);
}
Phase 3: Interrupts (Week 5-6)
Goal: Handle CPU exceptions and hardware interrupts
// idt.h
struct IDTEntry {
uint16_t offset_low;
uint16_t selector;
uint8_t ist;
uint8_t type_attr;
uint16_t offset_mid;
uint32_t offset_high;
uint32_t zero;
} __attribute__((packed));
namespace IDT {
void initialize();
void setEntry(uint8_t vector, void (*handler)(), uint8_t type);
}
// isr.cpp
extern "C" void isr_handler(uint64_t vector, uint64_t error_code) {
if (vector == 14) {
// Page fault
uint64_t faulting_address;
asm volatile("mov %%cr2, %0" : "=r"(faulting_address));
Terminal::printf("Page Fault at 0x%x\n", faulting_address);
} else {
Terminal::printf("Exception %d (error: %d)\n", vector, error_code);
}
while (true) asm volatile("hlt");
}
extern "C" void irq_handler(uint64_t irq) {
if (irq == 0) {
// Timer
Scheduler::tick();
} else if (irq == 1) {
// Keyboard
uint8_t scancode = inb(0x60);
Keyboard::handle(scancode);
}
// Send EOI to PIC
if (irq >= 8) outb(0xA0, 0x20); // Slave PIC
outb(0x20, 0x20); // Master PIC
}
Phase 4: Memory Management (Week 7-10)
Goal: Physical and virtual memory working
// pmm.cpp - Physical Memory Manager
class PhysicalMemoryManager {
uint8_t* bitmap;
size_t total_pages;
size_t used_pages;
public:
void initialize(uint64_t memory_size) {
total_pages = memory_size / 4096;
used_pages = 0;
// Bitmap needs 1 bit per page
size_t bitmap_size = total_pages / 8;
bitmap = (uint8_t*)BITMAP_ADDRESS;
// Mark all pages as used initially
memset(bitmap, 0xFF, bitmap_size);
}
void markFree(uint64_t base, uint64_t length) {
uint64_t start_page = base / 4096;
uint64_t num_pages = length / 4096;
for (uint64_t i = 0; i < num_pages; i++) {
uint64_t page = start_page + i;
bitmap[page / 8] &= ~(1 << (page % 8));
}
}
uint64_t allocate() {
for (size_t i = 0; i < total_pages; i++) {
if (!(bitmap[i / 8] & (1 << (i % 8)))) {
bitmap[i / 8] |= (1 << (i % 8));
used_pages++;
return i * 4096;
}
}
return 0; // Out of memory
}
void free(uint64_t address) {
uint64_t page = address / 4096;
bitmap[page / 8] &= ~(1 << (page % 8));
used_pages--;
}
};
// vmm.cpp - Virtual Memory Manager
void mapPage(uint64_t* pml4, uint64_t virtual_addr, uint64_t physical_addr,
uint64_t flags) {
uint64_t pml4_idx = (virtual_addr >> 39) & 0x1FF;
uint64_t pdpt_idx = (virtual_addr >> 30) & 0x1FF;
uint64_t pd_idx = (virtual_addr >> 21) & 0x1FF;
uint64_t pt_idx = (virtual_addr >> 12) & 0x1FF;
// Traverse/create page table hierarchy
uint64_t* pdpt = getOrCreateTable(pml4, pml4_idx);
uint64_t* pd = getOrCreateTable(pdpt, pdpt_idx);
uint64_t* pt = getOrCreateTable(pd, pd_idx);
// Set the final mapping
pt[pt_idx] = (physical_addr & ~0xFFF) | flags | PAGE_PRESENT;
// Invalidate TLB for this address
asm volatile("invlpg (%0)" :: "r"(virtual_addr) : "memory");
}
Phase 5: Heap and Allocators (Week 11-12)
Goal: operator new/delete working
// heap.cpp
struct BlockHeader {
size_t size;
bool free;
BlockHeader* next;
};
static BlockHeader* heap_start = nullptr;
static BlockHeader* heap_end = nullptr;
void Heap::initialize(uint64_t start, size_t size) {
heap_start = (BlockHeader*)start;
heap_start->size = size - sizeof(BlockHeader);
heap_start->free = true;
heap_start->next = nullptr;
heap_end = heap_start;
}
void* Heap::allocate(size_t size) {
// Align to 16 bytes
size = (size + 15) & ~15;
BlockHeader* current = heap_start;
while (current) {
if (current->free && current->size >= size) {
// Split block if significantly larger
if (current->size > size + sizeof(BlockHeader) + 32) {
BlockHeader* new_block = (BlockHeader*)
((uint8_t*)current + sizeof(BlockHeader) + size);
new_block->size = current->size - size - sizeof(BlockHeader);
new_block->free = true;
new_block->next = current->next;
current->size = size;
current->next = new_block;
}
current->free = false;
return (void*)((uint8_t*)current + sizeof(BlockHeader));
}
current = current->next;
}
return nullptr; // Out of heap
}
void Heap::free(void* ptr) {
if (!ptr) return;
BlockHeader* header = (BlockHeader*)((uint8_t*)ptr - sizeof(BlockHeader));
header->free = true;
// Coalesce with next block if free
if (header->next && header->next->free) {
header->size += sizeof(BlockHeader) + header->next->size;
header->next = header->next->next;
}
}
// Custom operators
void* operator new(size_t size) {
return Heap::allocate(size);
}
void* operator new[](size_t size) {
return Heap::allocate(size);
}
void operator delete(void* ptr) noexcept {
Heap::free(ptr);
}
void operator delete[](void* ptr) noexcept {
Heap::free(ptr);
}
void operator delete(void* ptr, size_t) noexcept {
Heap::free(ptr);
}
void operator delete[](void* ptr, size_t) noexcept {
Heap::free(ptr);
}
// Placement new (always available)
inline void* operator new(size_t, void* ptr) noexcept {
return ptr;
}
Phase 6: Scheduling (Week 13-16)
Goal: Multiple processes with context switching
// scheduler.cpp
static Process* current_process = nullptr;
static Process* ready_queue = nullptr;
Process* Scheduler::createProcess(void (*entry)()) {
Process* proc = new Process();
proc->pid = next_pid++;
proc->state = ProcessState::READY;
// Allocate kernel stack
proc->kernel_stack = (uint64_t)Heap::allocate(KERNEL_STACK_SIZE);
// Allocate page table
proc->page_table = VMM::createAddressSpace();
// Set up initial context
proc->context.rip = (uint64_t)entry;
proc->context.cs = KERNEL_CODE_SELECTOR;
proc->context.rflags = 0x202; // IF set
proc->context.rsp = proc->kernel_stack + KERNEL_STACK_SIZE;
proc->context.ss = KERNEL_DATA_SELECTOR;
// Add to ready queue
addToReadyQueue(proc);
return proc;
}
void Scheduler::schedule() {
if (!ready_queue) return;
// Round-robin: move current to end of queue
if (current_process && current_process->state == ProcessState::RUNNING) {
current_process->state = ProcessState::READY;
addToReadyQueue(current_process);
}
// Get next process
Process* next = ready_queue;
ready_queue = ready_queue->next;
next->next = nullptr;
next->state = ProcessState::RUNNING;
Process* prev = current_process;
current_process = next;
if (prev != next) {
// Switch page tables
asm volatile("mov %0, %%cr3" :: "r"(next->page_table));
// Context switch (assembly helper)
context_switch(&prev->context, &next->context);
}
}
void Scheduler::tick() {
static int ticks = 0;
if (++ticks >= TIME_QUANTUM) {
ticks = 0;
schedule();
}
}
; context_switch.asm
global context_switch
; void context_switch(Context* old, Context* new)
context_switch:
; Save old context
mov [rdi + 0], rax
mov [rdi + 8], rbx
mov [rdi + 16], rcx
mov [rdi + 24], rdx
mov [rdi + 32], rsi
mov [rdi + 40], rdi
mov [rdi + 48], rbp
mov [rdi + 56], r8
mov [rdi + 64], r9
mov [rdi + 72], r10
mov [rdi + 80], r11
mov [rdi + 88], r12
mov [rdi + 96], r13
mov [rdi + 104], r14
mov [rdi + 112], r15
; Save RIP (return address is on stack)
mov rax, [rsp]
mov [rdi + 120], rax
; Save RSP
mov [rdi + 144], rsp
; Load new context
mov rax, [rsi + 0]
mov rbx, [rsi + 8]
mov rcx, [rsi + 16]
mov rdx, [rsi + 24]
; rsi loaded last (we need it for loads)
mov rdi, [rsi + 40]
mov rbp, [rsi + 48]
mov r8, [rsi + 56]
mov r9, [rsi + 64]
mov r10, [rsi + 72]
mov r11, [rsi + 80]
mov r12, [rsi + 88]
mov r13, [rsi + 96]
mov r14, [rsi + 104]
mov r15, [rsi + 112]
; Load RSP
mov rsp, [rsi + 144]
; Load RIP (push and ret)
push qword [rsi + 120]
; Finally load rsi
mov rsi, [rsi + 32]
ret
Phase 7: System Calls (Week 17-18)
Goal: User programs can request kernel services
// syscall.cpp
#define SYSCALL_EXIT 0
#define SYSCALL_WRITE 1
#define SYSCALL_READ 2
#define SYSCALL_FORK 3
#define SYSCALL_EXEC 4
#define SYSCALL_GETPID 5
extern "C" int64_t syscall_handler(uint64_t num, uint64_t arg1,
uint64_t arg2, uint64_t arg3) {
switch (num) {
case SYSCALL_EXIT:
Scheduler::exit((int)arg1);
return 0;
case SYSCALL_WRITE: {
int fd = (int)arg1;
const char* buf = (const char*)arg2;
size_t count = (size_t)arg3;
if (fd == 1) { // stdout
for (size_t i = 0; i < count; i++) {
Terminal::putchar(buf[i]);
}
return count;
}
return -1;
}
case SYSCALL_FORK:
return Scheduler::fork();
case SYSCALL_GETPID:
return Scheduler::getCurrentPID();
default:
return -1;
}
}
Phase 8: User Space and Shell (Week 19-24)
Goal: User programs run in Ring 3
// user/shell.cpp - Compiled as separate user binary
int main() {
char buffer[256];
const char* prompt = "myos:/$ ";
while (true) {
write(1, prompt, strlen(prompt));
int len = read(0, buffer, sizeof(buffer) - 1);
if (len <= 0) continue;
buffer[len] = '\0';
if (strcmp(buffer, "exit\n") == 0) {
break;
} else if (strcmp(buffer, "ps\n") == 0) {
// System call to list processes
ps();
} else if (strncmp(buffer, "/bin/", 5) == 0) {
// Execute program
int pid = fork();
if (pid == 0) {
exec(buffer);
exit(1); // exec failed
} else {
wait(pid);
}
} else {
write(1, "Unknown command\n", 16);
}
}
return 0;
}
5.11 Key Implementation Decisions
Decision 1: Higher-Half Kernel
Map the kernel to high virtual addresses (e.g., 0xFFFFFFFF80000000 for x86-64). This leaves the lower half for user programs and simplifies memory layout.
Decision 2: Physical Memory Allocator
Use a bitmap for simplicity. Each bit represents one 4KB page. Set bits indicate used pages. Buddy allocator is more complex but better for larger allocations.
Decision 3: Process Representation
Keep process control blocks in kernel heap. Use a singly-linked list for ready queue (simple round-robin). Consider priority queues for more sophisticated scheduling.
Decision 4: Interrupt Handling Strategy
Use a two-phase approach: quick handler in assembly that saves state and calls C++ handler. This keeps interrupt handlers clean and debuggable.
6. Testing Strategy
Unit Testing (Limited in Kernel):
// test.cpp - Early tests before scheduler works
void test_heap() {
void* p1 = Heap::allocate(100);
void* p2 = Heap::allocate(200);
void* p3 = Heap::allocate(300);
assert(p1 != nullptr);
assert(p2 != nullptr);
assert(p3 != nullptr);
assert(p1 != p2 && p2 != p3);
Heap::free(p2);
void* p4 = Heap::allocate(150);
assert(p4 == p2); // Reused freed block
Terminal::println("Heap tests passed!");
}
Integration Testing with QEMU:
# Test boot
qemu-system-x86_64 -kernel kernel.bin -serial stdio 2>&1 | grep "Kernel loaded"
# Test with GDB
qemu-system-x86_64 -kernel kernel.bin -s -S &
gdb -ex "target remote :1234" -ex "break kernel_main" -ex "continue" kernel.bin
# Test filesystem
echo "test content" > initrd/test.txt
./make_initrd initrd/ initrd.img
qemu-system-x86_64 -kernel kernel.bin -initrd initrd.img -serial stdio
Automated Test Script:
#!/bin/bash
# test.sh
TIMEOUT=10
EXPECTED_STRINGS=("Kernel loaded" "GDT initialized" "IDT initialized" "Heap initialized")
OUTPUT=$(timeout $TIMEOUT qemu-system-x86_64 -kernel kernel.bin -nographic -serial mon:stdio 2>&1)
for expected in "${EXPECTED_STRINGS[@]}"; do
if echo "$OUTPUT" | grep -q "$expected"; then
echo "PASS: Found '$expected'"
else
echo "FAIL: Missing '$expected'"
exit 1
fi
done
echo "All tests passed!"
7. Common Pitfalls & Debugging
| Problem | Symptom | Root Cause | Fix |
|---|---|---|---|
| Triple fault on boot | QEMU resets immediately | Invalid GDT, stack overflow, or page fault with no handler | Use -d int -no-reboot to see the exception |
| Page fault in kernel | Exception 14 | Accessing unmapped virtual address | Check page table setup, ensure kernel is mapped |
| General protection fault | Exception 13 | Segment selector issue or privilege violation | Verify GDT entries and segment registers |
| No keyboard input | Key presses ignored | IRQ1 not enabled or PIC not configured | Initialize PIC and unmask IRQ1 |
| Scheduler deadlock | System hangs | All processes blocked or ready queue corrupted | Add debug output in scheduler, check queue pointers |
| Heap corruption | Random crashes | Buffer overflow or double free | Add canary values around allocations |
| User program crashes | Page fault in Ring 3 | User page table missing mapping | Verify user space is mapped with user flag |
Debugging Techniques:
# See all interrupts
qemu-system-x86_64 -d int -no-reboot
# Serial output to terminal
qemu-system-x86_64 -serial stdio
# GDB debugging
qemu-system-x86_64 -s -S -kernel kernel.bin &
gdb kernel.bin -ex "target remote :1234"
# View registers at interrupt
(gdb) info registers
# Examine memory
(gdb) x/16xg 0xFFFFFFFF80100000
8. Extensions & Challenges
After completing the basic kernel:
- SMP Support - Boot multiple CPU cores, per-CPU schedulers
- ext2 Filesystem - Real filesystem with directories and permissions
- Network Stack - Ethernet driver, IP, UDP, TCP
- Graphics Mode - VESA framebuffer, basic GUI
- Shared Memory - IPC between processes
- Signals - Unix-style signal handling
- Dynamic Linking - Shared libraries (.so)
- UEFI Boot - Modern boot method instead of BIOS/GRUB
- Device Tree - For ARM port
- Port to RISC-V - Different architecture, same concepts
9. Real-World Connections
This project relates to:
- Linux Kernel - Many of the same concepts, vastly more complexity
- Xen/KVM - Hypervisors that run below operating systems
- RTOS (FreeRTOS, Zephyr) - Real-time kernels for embedded systems
- Container Runtimes - Use kernel features like namespaces and cgroups
- Game Console Development - Bare-metal or minimal OS environments
Job Roles Using These Skills:
- Kernel Developer (Linux, Windows, embedded)
- Systems Programmer
- Embedded Software Engineer
- Security Researcher (kernel exploits)
- Hypervisor/Virtualization Engineer
- Firmware Developer
10. Resources
Essential Reading:
| Resource | URL | Purpose |
|---|---|---|
| OSDev Wiki | wiki.osdev.org | Comprehensive OS development reference |
| Intel Manual Vol 3 | intel.com | x86 architecture specification |
| OSDEV Bare Bones | wiki.osdev.org/Bare_Bones | Minimal kernel tutorial |
Video Courses:
- “Writing a Simple Operating System from Scratch” by Nick Blundell (free PDF)
- YouTube: “Build an OS” series by various creators
Example Projects to Study:
- xv6 - MIT’s teaching OS (simple, well-documented)
- ToaruOS - Hobby OS with full GUI
- SerenityOS - Modern C++ hobby OS
- MINIX 3 - Microkernel design
Books:
- “Operating Systems: Three Easy Pieces” - ostep.org (free online)
- “The Design of the UNIX Operating System” by Maurice Bach
- “Linux Kernel Development” by Robert Love
11. Self-Assessment Checklist
Before marking this project complete, verify:
Boot Process:
- Kernel boots via GRUB without errors
- Multiboot info structure is parsed correctly
- Memory map is extracted and used
Memory Management:
- Physical memory manager tracks free/used pages
- Paging is enabled with kernel mapped
- Heap allocator works correctly (no leaks, no corruption)
newanddeleteoperators work
Interrupts:
- CPU exceptions print error info (don’t just hang)
- Timer interrupt fires at regular intervals
- Keyboard input is received
Scheduling:
- Multiple processes can be created
- Context switching works correctly
- Round-robin scheduling is fair
User Space:
- Processes run in Ring 3
- System calls work
- User programs can print output
Filesystem:
- initrd is loaded and parsed
- Files can be read
12. Submission / Completion Criteria
Your kernel is complete when:
- Boot to Shell - Boots from GRUB to interactive shell in < 2 seconds
- Run Programs - At least 3 user programs (hello, ps, forktest) work
- Stable - No crashes during 10 minutes of interactive use
- Documented - README explains build process and design decisions
- Clean Code - Organized structure, consistent style, comments for tricky parts
Deliverables:
myos/
├── README.md # How to build and run
├── DESIGN.md # Architecture decisions explained
├── Makefile # Build system
├── kernel.bin # Compiled kernel
├── myos.iso # Bootable ISO image
└── src/ # All source code
Build Verification:
# Clean build
make clean && make
# Run in QEMU
make run
# Create bootable ISO
make iso
# Run ISO
qemu-system-x86_64 -cdrom myos.iso
Conclusion
Congratulations on reaching the capstone project of the C++ Mastery path. Building an operating system kernel represents the pinnacle of systems programming. You’re not just using the computer - you’re telling it exactly what to do at the lowest level.
This project synthesizes everything you’ve learned:
- Memory management from Project 3 (Smart Pointers) and Project 4 (Containers)
- Concurrency from Project 5 (Thread Pool)
- Parsing from Project 1 (Calculator), Project 6 (JSON), and Project 10 (Compiler)
- Networking patterns from Project 7 (TCP Chat) and Project 11 (HTTP Server)
- Low-level control from Project 2 (Memory-Mapped Files) and Project 8 (Database)
- Data structures from Project 4 (Generic Containers)
- Performance mindset from Project 9 (Game Engine)
- Template mastery from Project 12 (TMP Library)
When you complete this project, you’ll have achieved true C++ mastery. You’ll understand not just how to use the language, but how the entire software stack works - from transistors to high-level applications.
The journey doesn’t end here. Your kernel can grow into a real operating system. Or you can apply this knowledge to kernel modules, embedded systems, security research, or any domain where deep systems understanding matters.
Good luck, and enjoy writing code that runs on bare metal!
“In the beginning, there was the machine, and the machine was without software, and void. And the programmer said, ‘Let there be code,’ and there was code. And the programmer saw the code, and it was good.”