Learn Linux Kernel Development: From Zero to Contributor

Goal: Build a deep, working mental model of the Linux kernel from the syscall boundary down to core subsystems (scheduling, memory, VFS, and drivers), and then convert that understanding into real contributions. You will learn how kernel data structures map to user-visible behavior, how performance and correctness trade off in low-level code, and how to debug failures that only appear under real workloads. By the end, you will be able to read kernel code with intent, design fixes that respect kernel conventions, and submit patches that stand up to maintainer review.


Why Linux Kernel Development Matters

Linux is the OS kernel for servers, cloud infrastructure, Android, embedded devices, and most of the internet. The kernel is the boundary where the machine’s reality becomes the software’s guarantees: isolation, scheduling fairness, I/O correctness, and security. If you can reason about the kernel, you can reason about every other layer.

What changes when you understand the kernel:

  • You stop guessing and start tracing (syscalls, scheduling, memory pressure)
  • You explain failures with root causes, not symptoms
  • You can read performance like a story (cache misses, reclaim, I/O waits)
  • You ship safer code because you know how memory, concurrency, and I/O break

A kernel bug is rarely “random.” It is usually one of a few repeatable patterns:

  • A pointer crosses a boundary without validation
  • A lock is taken in the wrong context
  • Memory is allocated in a path that cannot sleep
  • A refcount is dropped too early or too late
User Programs                         Kernel Subsystems
┌──────────────────────────┐          ┌──────────────────────────┐
│  shell / app / daemon    │          │  Scheduler   (CFS)       │
│  libc / runtime          │  syscalls│  Memory (MM, SLUB)       │
└──────────────┬───────────┘─────────▶│  VFS / Filesystems       │
               │                      │  Networking / Netfilter │
               │                      │  Device Drivers         │
               ▼                      └──────────────┬──────────┘
     User/Kernel Boundary                           │
                                                    ▼
                                            Hardware Devices

What You Will Be Able to Do

  • Explain why a kernel log line appears and which subsystem emitted it
  • Trace a syscall from userspace through the kernel and back
  • Read data structures like task_struct, mm_struct, inode, and sk_buff
  • Build, boot, and debug a custom kernel in QEMU
  • Submit clean patches with correct style and rationale

The Big Picture: From Request to Hardware and Back

Every kernel project is a slice of a single pipeline. The kernel receives requests, schedules work, talks to hardware, and reports results back to user space.

User Space                                Kernel Space
┌──────────────────────────┐              ┌────────────────────────────┐
│  app → libc → syscall    │  transition  │  entry → subsystem → driver│
└──────────────┬───────────┘────────────▶│  IRQ → wakeup → return      │
               │                         └──────────────┬─────────────┘
               ▼                                        ▼
            Result                                  Hardware

The learning path mirrors real kernel work:

Read → Build → Boot → Trace → Patch → Test → Send

You will learn to answer these questions consistently:

  • Where is the state stored? (which struct, which cache, which list)
  • Who owns the lifetime? (who allocates, who frees, who holds refs)
  • What is the context? (process vs interrupt, can it sleep?)
  • What is the boundary? (user/kernel, VFS, netfilter, driver)

Kernel Navigation Cheatsheet

When you are lost, go to these directories first. They map directly to the concepts in this guide.

Core subsystems:

  • kernel/ — scheduler, fork/exec, core task management
  • mm/ — memory management, buddy allocator, SLUB
  • fs/ — VFS core, filesystems, path lookup
  • net/ — networking stack and netfilter
  • drivers/ — device drivers, hardware interfaces
  • arch/x86/ or arch/arm64/ — architecture-specific entry and syscall code

Commonly used headers:

  • include/linux/sched.h — task_struct, scheduler interfaces
  • include/linux/mm.h — memory management interfaces
  • include/linux/fs.h — VFS and file operations
  • include/linux/netdevice.h — network devices
  • include/linux/interrupt.h — IRQ handling

Entry points and paths to bookmark:

  • init/main.c — start_kernel() and boot flow
  • arch/x86/entry/entry_64.S — syscall entry (x86-64)
  • arch/x86/entry/syscalls/syscall_64.tbl — syscall table
  • kernel/sched/ — CFS internals and scheduling decisions
  • mm/slub.c — slab allocator internals
  • fs/open.c — open syscall path
  • fs/read_write.c — read/write syscall path
  • drivers/char/ — sample char drivers

Search patterns that work:

# Find where a syscall lands
$ rg "SYSCALL_DEFINE" -n kernel/ fs/ mm/ net/

# Find a struct definition
$ rg "struct task_struct" -n include/

# Find where a log came from
$ rg "Your log text" -n

# Trace a function implementation from a prototype
$ rg "^ssize_t vfs_read" -n

Mental map rule: If you cannot locate a behavior in 5 minutes, search kernel/, mm/, or fs/ first. Most core behavior lives there.


Debugging Toolkit

This is the minimum set of tools you will use across projects. Each one maps to a specific kind of kernel failure.

1) printk / dmesg — fast, low‑friction visibility

  • Use pr_info, pr_warn, pr_err for structured logs
  • View with dmesg -w or journalctl -k -f
$ sudo dmesg -w
[  123.456] mymodule: init ok

2) ftrace — function‑level tracing

  • Trace scheduling, memory allocation, and I/O paths
$ echo vfs_read | sudo tee /sys/kernel/debug/tracing/set_ftrace_filter
$ echo function | sudo tee /sys/kernel/debug/tracing/current_tracer
$ sudo cat /sys/kernel/debug/tracing/trace_pipe

3) tracepoints — low‑overhead structured events

  • Use sched:sched_switch, kmem:kmalloc, syscalls:sys_enter_*
$ echo 1 | sudo tee /sys/kernel/debug/tracing/events/sched/sched_switch/enable
$ sudo cat /sys/kernel/debug/tracing/trace_pipe

4) perf — performance counters and flame graphs

  • Profile kernel hot paths and stalls
$ sudo perf record -g -a -- sleep 5
$ sudo perf report

5) GDB + QEMU — stop the kernel safely

  • Always debug in a VM, not your host
# In one terminal
$ qemu-system-x86_64 -kernel bzImage -initrd initramfs.cpio.gz -append "console=ttyS0 nokaslr" -s -S

# In another
$ gdb vmlinux
(gdb) target remote :1234
(gdb) b start_kernel
(gdb) c

6) kgdb — interactive kernel debugging

  • For deeper breakpoints and stepping in kernel code
# Add to cmdline (kgdbwait needs a kgdb I/O driver, e.g. kgdboc on the serial console)
kgdboc=ttyS0,115200 kgdbwait

7) Lockdep — deadlock detection

  • Enable CONFIG_PROVE_LOCKING to detect lock ordering issues

Core Subsystem Walkthrough

This is the single most useful mental model for kernel work: follow one real request from userspace to hardware and back. If you can trace this, you can trace almost anything.

Example: read() from a file

userspace read()
  → libc wrapper
  → syscall entry (arch/x86/entry/)
  → vfs_read()
  → file_operations.read() or .read_iter()
  → filesystem driver (e.g., ext4)
  → page cache / block layer
  → device driver (NVMe/SATA)
  → interrupt
  → wakeup
  → return to userspace

Where to look in the tree:

  • Syscall entry: arch/x86/entry/entry_64.S
  • VFS path: fs/read_write.c
  • Filesystem: fs/ext4/
  • Block layer: block/
  • Driver: drivers/nvme/ or drivers/ata/

What to ask at each step:

  • Which struct carries the state here?
  • Can this code sleep?
  • Who owns the memory right now?
  • Where are errors propagated?

Example: write() to a network socket

userspace write()
  → syscall entry
  → sock_write_iter()
  → tcp_sendmsg()
  → IP layer
  → qdisc / netfilter
  → driver → NIC
  → IRQ → softirq → completion

Where to look in the tree:

  • Socket layer: net/socket.c
  • TCP stack: net/ipv4/tcp.c
  • Netfilter: net/netfilter/
  • Drivers: drivers/net/

Kernel Build Optimization

Kernel builds can be slow. These practices keep the feedback loop tight.

1) Use ccache

# The kernel Makefile assigns CC and HOSTCC itself, so pass them on the make command line
$ make CC="ccache gcc" HOSTCC="ccache gcc" -j$(nproc)

2) Start from a minimal config

$ make defconfig
$ make menuconfig

3) Keep a small, fast QEMU config

$ qemu-system-x86_64 -kernel bzImage -initrd initramfs.cpio.gz -append "console=ttyS0 nokaslr" -nographic -enable-kvm -m 1G

4) Use make olddefconfig after pulling updates

$ make olddefconfig

5) Build only what changed

$ make -j$(nproc)

6) Keep a separate build directory

$ make O=../linux-build defconfig
$ make O=../linux-build -j$(nproc)

Kernel Data Structure Atlas

These are the core structs you will see in nearly every subsystem. Learn what they represent and where they live.

Process & scheduling

  • task_struct — the process/thread descriptor
    • Where: include/linux/sched.h
    • Why it matters: state, PID, scheduling class, memory map pointer
  • mm_struct — process address space
    • Where: include/linux/mm_types.h
    • Why it matters: page tables, VMAs, memory accounting

Filesystem

  • inode — metadata for a file
    • Where: include/linux/fs.h
    • Why it matters: permissions, size, block mapping
  • dentry — directory entry (name → inode)
    • Where: include/linux/dcache.h
    • Why it matters: path lookup and caching
  • file — open file instance
    • Where: include/linux/fs.h
    • Why it matters: file position, ops table, flags

Networking

  • sk_buff — packet buffer
    • Where: include/linux/skbuff.h
    • Why it matters: every packet flows through this
  • sock — socket state
    • Where: include/net/sock.h
    • Why it matters: protocol state and queues

Drivers / Devices

  • device — kernel device model object
    • Where: include/linux/device.h
    • Why it matters: sysfs exposure, driver binding
  • net_device — network device abstraction
    • Where: include/linux/netdevice.h
    • Why it matters: driver hooks and network stats

Synchronization

  • spinlock_t, mutex, rcu_head
    • Where: include/linux/spinlock.h, include/linux/mutex.h, include/linux/rcupdate.h
    • Why it matters: correctness and lock context rules

Kernel Contribution Checklist

Use this before you send any patch. It mirrors what maintainers expect.

Patch quality

  1. Does the patch fix one thing? (small, reviewable)
  2. Does the commit message explain what and why?
  3. Did you run scripts/checkpatch.pl?

Testing discipline

  1. Reproduce the bug before the fix (if possible)
  2. Validate after the fix (logs, tests, or repro script)
  3. Note any testing limitations in the commit message

Formatting & submission

$ git format-patch -1
$ ./scripts/checkpatch.pl 0001-*.patch
$ ./scripts/get_maintainer.pl -f path/to/file.c

Email checklist

  • Subject line: subsystem: short summary
  • Description includes: problem, root cause, fix
  • CC the right maintainers and lists
  • Reply inline to review comments, then send the revised patch as a new version (v2, v3)

Red flags to avoid

  • Mixing unrelated changes in one patch
  • Renaming variables without justification
  • Skipping the “why” in the commit message

Glossary of Kernel Terms

  • Syscall: Controlled entry into the kernel from user space.
  • Task: Kernel unit of scheduling (process or thread).
  • Context switch: CPU stops one task, starts another.
  • Preemption: Kernel stops a task to run a higher-priority one.
  • RCU: Read-Copy-Update; readers run lock-free, writers defer frees.
  • VFS: Virtual File System; unified filesystem API.
  • Dentry: Name lookup entry (path component → inode).
  • Inode: File metadata representation.
  • Page cache: Cached file data stored in memory.
  • Buddy allocator: Physical page allocator using power-of-2 blocks.
  • SLUB: Object cache allocator for kernel objects.
  • Softirq: Deferred interrupt processing context.
  • IRQ: Hardware interrupt.
  • KASLR: Kernel Address Space Layout Randomization.
  • Kconfig: Kernel configuration system.
  • Kbuild: Kernel build system.

Minimal Patch Walkthrough

This is the smallest “real” kernel patch flow. Use it before tackling bigger fixes.

1) Find a tiny issue

  • A checkpatch warning or a single missing error check

2) Make the change

  • Keep the diff small and focused

3) Build or run targeted test

$ make -j$(nproc)

4) Create the patch

$ git add path/to/file.c
$ git commit -s -m "subsystem: fix warning in foo"
$ git format-patch -1

5) Check style

$ ./scripts/checkpatch.pl 0001-*.patch

6) Find maintainers

$ ./scripts/get_maintainer.pl -f path/to/file.c

7) Send email

  • Include the problem, root cause, and fix in the commit body
  • CC maintainers and lists from get_maintainer

Example commit message (good):

subsystem: fix null deref in foo()

foo() can dereference a NULL pointer when bar() fails, which
leads to a kernel oops during device probe. Add a NULL check
and return -ENOMEM. No behavior change on success paths.

Signed-off-by: Your Name <you@example.com>

Kernel Reading Plan

Use this to build daily familiarity with the source tree. The goal is muscle memory, not mastery on day one.

Week 1: Entry and boundaries

  • Day 1: init/main.c (start_kernel)
  • Day 2: arch/x86/entry/entry_64.S
  • Day 3: kernel/sys.c and fs/open.c
  • Day 4: fs/read_write.c
  • Day 5: kernel/exit.c

Week 2: Scheduling and memory

  • Day 1: kernel/sched/core.c
  • Day 2: kernel/sched/fair.c
  • Day 3: mm/memory.c
  • Day 4: mm/page_alloc.c
  • Day 5: mm/slub.c

Week 3: VFS and drivers

  • Day 1: fs/inode.c
  • Day 2: fs/dcache.c
  • Day 3: drivers/char/
  • Day 4: drivers/block/
  • Day 5: drivers/net/

Week 4: Networking and tracing

  • Day 1: net/core/dev.c
  • Day 2: net/ipv4/tcp.c
  • Day 3: kernel/trace/
  • Day 4: kernel/rcu/
  • Day 5: Documentation/ (pick one subsystem doc)

Rule: Read 30–45 minutes per day. Always take notes on what struct holds the state and what function performs the decision.


Common Failure Signatures

Learn these patterns and you will recognize the most common kernel failures much faster.

1) NULL pointer dereference

  • Symptom: Unable to handle kernel NULL pointer dereference at ...
  • Usually means: Missing NULL check or use-after-free
  • First check: The call trace for the first dereference

2) BUG: sleeping function called from invalid context

  • Symptom: BUG: sleeping function called from invalid context
  • Usually means: Allocation or mutex in atomic context
  • First check: Was the code running in interrupt or softirq?

3) WARNING: possible recursive locking detected

  • Symptom: lockdep warning
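  • Usually means: The same lock (or another lock of the same class) is taken twice in one call path; an ordering inversion is reported as “possible circular locking dependency detected” instead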
  • First check: lock order and lock class in the trace

4) Kernel panic - not syncing

  • Symptom: panic with backtrace, system halt
  • Usually means: fatal error path, BUG_ON, or corruption
  • First check: panic message and last subsystem in trace

5) Oops with stack trace

  • Symptom: Oops: 0000 with call trace
  • Usually means: Invalid memory access
  • First check: the first function in the trace and its inputs

6) hung_task_timeout_secs

  • Symptom: task blocked for more than N seconds
  • Usually means: deadlock or I/O wait stall
  • First check: lock stack and wait channel

Prerequisites & Background Knowledge

Before starting these projects, you should have foundational understanding in these areas:

Essential Prerequisites (Must Have)

Programming Skills:

  • Confident C programming (pointers, memory, structs, function pointers)
  • Comfort reading Makefiles and shell scripts
  • Basic debugging with gdb and strace

Operating System Fundamentals:

  • Processes vs threads and basic scheduling concepts
  • Virtual memory, page tables, and address spaces
  • File descriptors and standard Unix I/O
  • Recommended Reading: Operating Systems: Three Easy Pieces — Ch. 4-6, 13-15

Linux Basics:

  • Familiarity with /proc, /sys, permissions, and common CLI tools
  • Comfort with git (branching, rebasing, sending patches)
  • Recommended Reading: How Linux Works — Ch. 4-6

Helpful But Not Required

Assembly & ABI knowledge: helps in Project 2 and debugging

  • Can learn during: Project 2, Project 10

Device driver experience: helps in Projects 5, 11, 12

  • Can learn during: Project 4 and Project 5

Networking internals: helps in Project 9 and 13

  • Can learn during: Project 9 and Project 13

Self-Assessment Questions

  1. ✅ Can you explain what a system call does at a high level?
  2. ✅ Can you read a C struct definition and reason about its memory layout?
  3. ✅ Can you compile a small C program and inspect its output with strace?
  4. ✅ Can you explain what /proc and /sys contain?
  5. ✅ Can you use git format-patch and git send-email?

If you answered “no” to questions 1-3, spend 1-2 weeks with Linux System Programming and Operating Systems: Three Easy Pieces first.

Development Environment Setup

Required Tools:

  • A Linux machine or VM (Ubuntu 22.04+ or Fedora 38+ recommended)
  • gcc, make, binutils, git, bc, bison, flex, libssl-dev, libelf-dev
  • qemu-system-x86, gdb, dwarves

Recommended Tools:

  • ccache (build speed), rg (source search), perf (profiling)

Testing Your Setup:

$ gcc --version
gcc (Ubuntu ...) 11+

$ qemu-system-x86_64 --version
QEMU emulator version 7+

$ gdb --version
GNU gdb (GDB) 12+

Time Investment

  • Beginner projects (1-2, 4): weekend each (4-8 hours)
  • Intermediate projects (3, 5, 7, 8): 1-2 weeks each
  • Advanced projects (6, 9-13): 2-3 weeks each
  • Contribution projects (14-15): 1-3 weeks depending on scope

Important Reality Check

Kernel development is not “learn it in a weekend.” It is a repeated loop of building, testing, crashing, and reading source. The learning happens in layers:

  1. Make it work (minimal behavior)
  2. Make it correct (race-free, error-aware)
  3. Make it kernel-grade (style, conventions, review readiness)

Core Concept Analysis

1. The System Call Boundary

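System calls are the only way userspace code can deliberately enter kernel mode (interrupts and exceptions such as page faults also enter the kernel, but a program does not request them). Understanding the boundary explains why kernel code must never trust userspace pointers and must validate every argument.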

User Space                 Kernel Space
┌───────────────┐          ┌───────────────────────────┐
│  libc wrapper │  syscall │  do_syscall_64()          │
│  open(), read │ ───────▶ │  do_sys_open(), vfs_read  │
└───────────────┘          └───────────────────────────┘

Key insight: Every bug at this boundary is a security bug.

2. Process & Scheduling Model

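The kernel represents each process or thread as a task_struct. Normal (non-realtime) tasks are scheduled by the fair class in kernel/sched/fair.c: CFS historically, and the EEVDF algorithm since Linux 6.6. The scheduler’s job is fairness under load, not just speed.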

Runnable tasks → CFS runqueue → context switch → CPU executes

Key insight: Performance regressions often come from scheduling behavior, not “slow code.”

3. Memory Management & Allocation

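Userspace only ever sees virtual addresses. The kernel also runs on virtual addresses, but it owns the page tables and the physical pages behind them. The buddy allocator hands out physical pages in power-of-2 blocks; SLUB carves those pages into caches of small objects, and kmalloc draws from those caches.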

Physical pages → Buddy allocator → SLUB cache → kmalloc

Key insight: Memory allocation behavior explains latency spikes and fragmentation.

4. VFS and the File Abstraction

The VFS makes every filesystem look like the same set of operations. VFS glue code is where errors and performance costs appear.

read() → vfs_read() → file_operations.read() or .read_iter() → filesystem driver

Key insight: Kernel file I/O isn’t one function; it’s a layered pipeline.

5. Device Model & Drivers

Drivers are where kernel policy meets hardware reality. You register devices, expose sysfs, and handle interrupts safely.

Device driver ↔ device model ↔ bus subsystem ↔ hardware

Key insight: Drivers are mostly state machines with strict lifecycle rules.

6. Concurrency & Synchronization

Kernel code runs concurrently on many CPUs. It uses locks, atomic ops, and RCU to protect shared state without killing performance.

CPU0: writer → lock → update
CPU1: reader → RCU → read without blocking

Key insight: Correctness is about where you can sleep and what you can lock.

7. Execution Contexts & Error Discipline

Kernel code runs in distinct contexts: process context (can sleep) and atomic/interrupt context (cannot sleep). Many bugs are simply violations of this rule.

Process context → may sleep, may allocate (GFP_KERNEL)
Interrupt context → must not sleep, use GFP_ATOMIC

Key insight: Kernel correctness is often about where you are running, not just what you are doing.

8. Kernel Infrastructure & Contribution Workflow

Kconfig, Kbuild, and coding style are not optional. Kernel patches are judged as much by clarity as by correctness.


Concept Summary Table

Concept Cluster       What You Need to Internalize
--------------------  --------------------------------------------------------
Syscall boundary      User/kernel transitions, pointer safety, ABI conventions
Process & scheduling  task_struct, runqueues, fairness vs latency
Memory management     Pages, zones, buddy allocator, SLUB caches
VFS & I/O             file_operations, inodes, dentries, error paths
Device model          Major/minor, sysfs, lifecycle, interrupts
Kernel concurrency    spinlocks, mutexes, RCU, preemption rules
Execution contexts    process vs interrupt context, sleeping rules, GFP flags
Build & debugging     Kconfig, QEMU, GDB, printk, tracepoints
Contribution flow     patches, reviews, style, regression discipline

Deep Dive Reading by Concept

Fundamentals & Architecture

Concept            Book & Chapter                              Why This Matters
-----------------  ------------------------------------------  ------------------------------------------
Syscall boundary   The Linux Programming Interface — Ch. 3, 4  Understand how userspace enters the kernel
Process model      Linux Kernel Development — Ch. 3            task_struct and process lifecycle
Memory management  Understanding the Linux Kernel — Ch. 8      Page tables, buddy, SLUB
VFS                Linux Kernel Development — Ch. 12           Filesystem abstraction and hooks

Debugging & Tooling

Concept              Book & Chapter                                  Why This Matters
-------------------  ----------------------------------------------  ---------------------------
Kernel debugging     The Art of Debugging with GDB — Ch. 7-9         Core debugging techniques
Systems programming  Linux System Programming — Ch. 1-3              Real syscall usage patterns
OS fundamentals      Operating Systems: Three Easy Pieces — Ch. 5-6  Scheduling and memory

Drivers & I/O

Concept                Book & Chapter                     Why This Matters
---------------------  ---------------------------------  ----------------------------------
Device drivers         Linux Kernel Development — Ch. 14  Driver model and device lifecycles
I/O and block devices  Linux System Programming — Ch. 12  How kernel I/O is structured

Quick Start: Your First 48 Hours

Day 1 (4 hours):

  1. Read the “Core Concept Analysis” section above
  2. Skim /proc/uptime, /proc/meminfo, /proc/[pid]/stat
  3. Start Project 1 and parse only /proc/uptime + /proc/version

Day 2 (4 hours):

  1. Add /proc/[pid]/stat parsing to Project 1
  2. Run strace ls and compare to your output in Project 2
  3. Read “The Core Question” sections for Projects 1 and 2

End of Weekend: You can explain what /proc exposes and how syscalls flow. That is a major kernel mental model.


Path 1:

  1. Project 1 → Project 2 → Project 3
  2. Project 4 → Project 5 → Project 7
  3. Project 8 → Project 9 → Project 10
  4. Project 14 → Project 15

Path 2: The Driver Builder

  1. Project 3 → Project 4 → Project 5
  2. Project 11 → Project 12
  3. Project 8 (filesystem) → Project 9 (netfilter)

Path 3: The Performance & Debugging Track

  1. Project 2 → Project 6 → Project 7
  2. Project 10 → Project 13
  3. Project 14 → Project 15

Project List

Projects progress from userspace exploration to kernel internals to contribution readiness.


Project 1: Kernel Interface Explorer (The /proc and /sys Spelunker)

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Python (for initial exploration), Rust
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Kernel Interfaces / Pseudo-filesystems
  • Software or Tool: procfs, sysfs
  • Main Book: “The Linux Programming Interface” by Michael Kerrisk

What you’ll build: A comprehensive tool that reads and interprets data from /proc and /sys, presenting kernel information (CPU info, memory stats, process details, device configuration) in a human-readable dashboard format.

Why it teaches Linux kernel: Before diving into the kernel, you need to understand how it exposes information to userspace. The /proc and /sys filesystems are the kernel’s “API” to the outside world. Understanding what data is available and how it’s structured gives you a mental map of kernel subsystems.

Core challenges you’ll face:

  • Parsing /proc/[pid]/ directories → maps to understanding task_struct and process representation
  • Interpreting /proc/meminfo fields → maps to memory management subsystem concepts
  • Navigating /sys/devices hierarchy → maps to the kernel device model
  • Understanding /proc/interrupts format → maps to interrupt handling architecture

Key Concepts:

  • procfs structure: “The Linux Programming Interface” Chapter 12 - Michael Kerrisk
  • sysfs and the device model: “Linux Device Drivers, 3rd Edition” Chapter 14 - Corbet, Rubini, Kroah-Hartman
  • Process representation in kernel: “Linux Kernel Development” Chapter 3 - Robert Love
  • Memory management overview: “Understanding the Linux Kernel” Chapter 8 - Bovet & Cesati

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Basic C programming, familiarity with Linux command line, understanding of file I/O

Real world outcome:

$ ./kernel_explorer --dashboard
╔══════════════════════ KERNEL EXPLORER v1.0 ══════════════════════╗
║ Kernel: 6.8.0-generic    Uptime: 3d 14h 22m    Load: 0.52 0.48 0.51
╠══════════════════════════════════════════════════════════════════╣
║ CPU: Intel i7-12700K (20 cores)     Memory: 31.2 GB / 64 GB (48%)
║ Context Switches: 1,234,567,890     Interrupts: 987,654,321
╠══════════════════════════════════════════════════════════════════╣
║ TOP PROCESSES BY MEMORY:
║   PID     NAME              RSS        STATE    THREADS
║   1234    firefox          2.1 GB     S        156
║   5678    code             1.8 GB     S        89
║   9012    chrome           1.2 GB     S        45
╠══════════════════════════════════════════════════════════════════╣
║ DEVICES:
║   [block] nvme0n1 (Samsung SSD 980 PRO) - 1TB
║   [net]   eth0 (Intel I225-V) - 1000Mb/s, UP
║   [usb]   1-1: Logitech USB Receiver
╚══════════════════════════════════════════════════════════════════╝

$ ./kernel_explorer --process 1234
Process: firefox (PID 1234)
├── State: Sleeping (interruptible)
├── Parent: 1 (systemd)
├── Threads: 156
├── Memory:
│   ├── Virtual: 15.2 GB
│   ├── Resident: 2.1 GB
│   ├── Shared: 234 MB
│   └── Memory Maps: 847 regions
├── File Descriptors: 234 open
├── CPU Affinity: 0-19
└── Cgroups: /user.slice/user-1000.slice

Implementation Hints:

The /proc filesystem exposes kernel data structures as files. Key files to parse:

For system-wide information:

  • /proc/version - Kernel version string
  • /proc/uptime - System uptime in seconds
  • /proc/loadavg - Load averages and running processes
  • /proc/meminfo - Detailed memory statistics
  • /proc/cpuinfo - CPU information
  • /proc/stat - Kernel/system statistics (context switches, interrupts)
  • /proc/interrupts - Interrupt counts per CPU per IRQ

For per-process information (/proc/[pid]/):

  • stat - Process status (state, ppid, nice, threads)
  • status - Human-readable status
  • maps - Memory mappings
  • fd/ - Open file descriptors (directory of symlinks)
  • cmdline - Command line arguments
  • cgroup - Control group membership

For device information (/sys/):

  • /sys/class/ - Devices organized by class (block, net, tty)
  • /sys/devices/ - Actual device hierarchy
  • /sys/bus/ - Devices organized by bus type (pci, usb)

Questions to guide your implementation:

  1. How are process states encoded in /proc/[pid]/stat? (Hint: single character codes)
  2. What’s the difference between VmRSS, VmSize, and VmData in /proc/[pid]/status?
  3. How does /sys/class/net/eth0 relate to /sys/devices/.../net/eth0?
  4. Why are some /proc files zero-sized but still contain data?

Architecture considerations:

  • Create data structures that mirror kernel structures conceptually
  • Parse files in a streaming manner (don’t assume they fit in memory)
  • Handle the case where processes disappear between directory listing and reading

Learning milestones:

  1. You can parse /proc/[pid]/stat correctly → You understand how the kernel represents process state
  2. You interpret memory statistics accurately → You grasp virtual vs physical memory concepts
  3. You navigate the sysfs device hierarchy → You understand the kernel device model
  4. Your tool handles race conditions gracefully → You understand the dynamic nature of kernel data

The Core Question You’re Answering

“How does the kernel expose its internal state as files, and what can those files tell you about real kernel data structures?”

/proc and /sys are not just text files; they are live views into kernel state. If you can interpret them correctly, you can explain behavior without a debugger.

Concepts You Must Understand First

Stop and research these before coding:

  1. procfs semantics
    • Why do zero-length proc files still return data?
    • How are process states encoded in /proc/[pid]/stat?
    • Book Reference: The Linux Programming Interface — Ch. 12
  2. sysfs and the device model
    • How does /sys/class relate to /sys/devices?
    • What does a kobject represent?
    • Book Reference: Linux Kernel Development — Ch. 14
  3. task_struct basics
    • What fields map to PID, state, and parent?
    • Why are threads just tasks?
    • Book Reference: Linux Kernel Development — Ch. 3

Questions to Guide Your Design

  1. Data collection
    • Which files are stable across kernel versions?
    • How will you handle permission errors or missing files?
  2. Parsing strategy
    • How will you parse /proc/[pid]/stat given that the command name can contain spaces?
    • Will you stream or slurp file contents?
  3. Presentation
    • How will you normalize values (KiB vs bytes)?
    • How will you highlight outliers (top memory or CPU usage)?

Thinking Exercise

The /proc/[pid]/stat Puzzle

1234 (my process) R 1 2 3 4 0 0 0 0 0 0 0 0 0 0 20 0 1 0 123456 7890123 ...

Questions while tracing:

  • Which field is the state and how does it map to scheduling?
  • Which field represents the parent PID?
  • Why is the command name wrapped in parentheses?

The Interview Questions They’ll Ask

  1. “What’s the difference between /proc and /sys?”
  2. “Why can reading /proc race with a process exit?”
  3. “How does the kernel expose device state to user space?”
  4. “What does VmRSS vs VmSize mean?”
  5. “Why are some proc files generated on read?”

Hints in Layers

Hint 1: Start with static files Parse /proc/version and /proc/uptime first.

Hint 2: Add process iteration List /proc, filter numeric directories, then read /proc/[pid]/status.

Hint 3: Handle parsing edge cases Treat the command name in /proc/[pid]/stat as a parenthesized token.

Hint 4: Graceful failure If a process disappears, skip it and continue; never crash the scan.

Books That Will Help

Topic               Book                                  Chapter
------------------  ------------------------------------  -------
procfs              The Linux Programming Interface       Ch. 12
process model       Linux Kernel Development              Ch. 3
memory metrics      Operating Systems: Three Easy Pieces  Ch. 13
sysfs/device model  Linux Kernel Development              Ch. 14

Common Pitfalls & Debugging

Problem 1: “Random parse errors in /proc/[pid]/stat”

  • Why: Command names with spaces shift fields
  • Fix: Parse the command name between the first ‘(’ and the last ‘)’
  • Quick test: Compare with ps -o pid,stat,comm

Problem 2: “Permission denied on /proc/[pid]/fd”

  • Why: Access is restricted for other users’ processes
  • Fix: Skip or use sudo for debugging
  • Quick test: ls /proc/1/fd as non-root

Problem 3: “Process disappears mid-read”

  • Why: The kernel removes entries at exit
  • Fix: Handle ENOENT and continue
  • Quick test: Trace a short-lived process like true

Project 2: System Call Tracer (Build Your Own strace)

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: System Calls / ptrace Interface
  • Software or Tool: ptrace, strace
  • Main Book: “The Linux Programming Interface” by Michael Kerrisk

What you’ll build: A tool that traces system calls made by a process, showing the syscall name, arguments, and return value—like a simplified version of strace.

Why it teaches Linux kernel: System calls are the only way userspace can request kernel services. Understanding the syscall interface means understanding the boundary between user and kernel mode. Building a tracer forces you to learn the ptrace interface, which kernel debuggers also use.

Core challenges you’ll face:

  • Using ptrace to intercept syscalls → maps to understanding the user/kernel boundary
  • Decoding syscall numbers to names → maps to the syscall table architecture
  • Parsing syscall arguments from registers → maps to calling conventions and ABI
  • Handling different syscall ABIs (32-bit vs 64-bit) → maps to architecture-specific kernel code

Key Concepts:

  • System call mechanism: “The Linux Programming Interface” Chapters 3 and 44 - Michael Kerrisk
  • ptrace interface: “Linux System Programming” Chapter 10 - Robert Love
  • x86-64 calling convention: System V AMD64 ABI specification
  • Syscall table organization: Kernel source arch/x86/entry/syscalls/

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Project 1 completed, understanding of process execution, basic assembly knowledge helpful

Real world outcome:

$ ./mytrace ls -la
[1234] execve("/bin/ls", ["ls", "-la"], [...]) = 0
[1234] brk(NULL) = 0x55a1e2f3d000
[1234] arch_prctl(0x3001, 0x7ffd2a3f3160) = -1 EINVAL
[1234] access("/etc/ld.so.preload", R_OK) = -1 ENOENT
[1234] openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[1234] fstat(3, {st_mode=S_IFREG|0644, st_size=87441, ...}) = 0
[1234] mmap(NULL, 87441, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8c12345000
[1234] close(3) = 0
... (file operations for directory listing)
[1234] write(1, "total 48\ndrwxr-xr-x 5 user...", 234) = 234
[1234] close(1) = 0
[1234] exit_group(0) = ?
+++ exited with 0 +++

$ ./mytrace -c ls    # Summary mode
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.23    0.000234          11        21           mmap
 22.11    0.000114           8        14           close
 15.67    0.000081           6        13           openat
  8.45    0.000044           7         6           fstat
  5.32    0.000027          27         1           execve
  3.22    0.000017           8         2         2 access
------ ----------- ----------- --------- --------- ----------------
100.00    0.000517                    57         2 total

Implementation Hints:

The ptrace system call is the foundation of debuggers and tracers:

ptrace(PTRACE_TRACEME, ...)      - Child requests to be traced
ptrace(PTRACE_SYSCALL, pid, ...) - Continue until next syscall entry/exit
ptrace(PTRACE_GETREGS, pid, ...) - Read registers (syscall number & args)
ptrace(PTRACE_PEEKDATA, pid, addr, ...) - Read traced process memory

Architecture for your tracer:

  1. Fork a child process
  2. Child calls ptrace(PTRACE_TRACEME) then execve() the target
  3. Parent waits for the child's first stop, the SIGTRAP raised once execve succeeds
  4. Parent loops: ptrace(PTRACE_SYSCALL), wait, read registers, decode, print
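The four steps can be sketched as one function. This is a minimal outline under several assumptions: an x86-64 Linux host, a single tracee with no fork following, and syscall numbers printed raw instead of decoded. The function name trace_command is hypothetical. Because the first stop arrives after execve has completed, this sketch starts printing at the first syscall of the new program.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs cmd under ptrace and prints one line per syscall (numbers only).
 * Returns the count of completed syscalls, or -1 if tracing failed. */
static long trace_command(char *const cmd[], FILE *out)
{
    pid_t child = fork();                       /* step 1 */
    if (child < 0)
        return -1;

    if (child == 0) {                           /* step 2: the tracee */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        execvp(cmd[0], cmd);
        _exit(127);                             /* exec failed */
    }

    int status;                                 /* step 3: first stop */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* Syscall stops now report SIGTRAP|0x80, unlike ordinary signals. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0)
        return -1;

    long completed = 0;
    int at_entry = 1;       /* stops alternate: entry, exit, entry, ... */
    int pending_sig = 0;    /* signal to forward on the next resume */

    for (;;) {                                  /* step 4 */
        if (ptrace(PTRACE_SYSCALL, child, NULL,
                   (void *)(long)pending_sig) < 0)
            return -1;
        pending_sig = 0;
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (!WIFSTOPPED(status))
            continue;
        if (WSTOPSIG(status) != (SIGTRAP | 0x80)) {
            pending_sig = WSTOPSIG(status);     /* not a syscall stop */
            continue;
        }

        struct user_regs_struct regs;
        if (ptrace(PTRACE_GETREGS, child, NULL, &regs) < 0)
            return -1;
        if (at_entry) {
            fprintf(out, "[%d] syscall_%llu(...)", (int)child, regs.orig_rax);
        } else {
            fprintf(out, " = %lld\n", (long long)regs.rax);
            completed++;
        }
        at_entry = !at_entry;
    }
    if (!at_entry)
        fprintf(out, " = ?\n");     /* e.g. exit_group never returns */
    return completed;
}
```

Using PTRACE_O_TRACESYSGOOD is a design choice rather than a requirement: it lets the loop tell syscall stops apart from signal-delivery stops instead of relying on stop counting alone.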

On x86-64, syscall arguments are in registers:

  • Syscall number: rax at the syscall instruction (a ptrace tracer reads it from orig_rax, because rax is overwritten during the call) / return value: rax (after)
  • Arguments: rdi, rsi, rdx, r10, r8, r9 (in order)
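As a rough sketch of this register convention, assuming an x86-64 tracer built against glibc's <sys/user.h>: the struct filled by PTRACE_GETREGS exposes the syscall number as orig_rax, while rax at the entry stop typically holds -ENOSYS and only carries the return value at the exit stop. The struct syscall_entry type and both helper names are illustrative, not part of any library.

```c
#include <sys/user.h>   /* struct user_regs_struct (x86-64 layout assumed) */

/* A decoded snapshot of a syscall-entry stop. */
struct syscall_entry {
    long long          nr;        /* syscall number */
    unsigned long long args[6];   /* raw argument values, in order */
};

/* Call at a syscall-entry stop, after PTRACE_GETREGS filled *regs. */
static void decode_entry(const struct user_regs_struct *regs,
                         struct syscall_entry *out)
{
    out->nr      = (long long)regs->orig_rax;   /* not rax */
    out->args[0] = regs->rdi;
    out->args[1] = regs->rsi;
    out->args[2] = regs->rdx;
    out->args[3] = regs->r10;   /* r10, not rcx: syscall clobbers rcx */
    out->args[4] = regs->r8;
    out->args[5] = regs->r9;
}

/* Call at the matching syscall-exit stop. */
static long long decode_return(const struct user_regs_struct *regs)
{
    return (long long)regs->rax;
}
```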

Questions to guide implementation:

  1. How do you distinguish syscall entry from syscall exit? (Hint: count stops)
  2. How do you read string arguments like filenames? (Hint: PTRACE_PEEKDATA)
  3. How do you decode flags like O_RDONLY | O_CLOEXEC?
  4. What happens when the traced process forks?

Decoding syscall numbers: Look at /usr/include/asm/unistd_64.h or the kernel’s arch/x86/entry/syscalls/syscall_64.tbl
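One simple way to turn numbers into names is a sparse array indexed by syscall number. The sketch below uses the SYS_* constants from <sys/syscall.h> so the numbers come from the build host's headers; it covers only a handful of calls and assumes an x86-64 build, where all of these constants exist. A fuller tool would generate the table from the header or the .tbl file. The names syscall_names and syscall_name are illustrative.

```c
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_* numbers for the build architecture */

/* Sparse table: entries not listed here stay NULL. */
static const char *const syscall_names[] = {
    [SYS_read]       = "read",
    [SYS_write]      = "write",
    [SYS_close]      = "close",
    [SYS_fstat]      = "fstat",
    [SYS_mmap]       = "mmap",
    [SYS_brk]        = "brk",
    [SYS_access]     = "access",
    [SYS_execve]     = "execve",
    [SYS_exit_group] = "exit_group",
    [SYS_openat]     = "openat",
};

/* Returns the name for nr, or "unknown" if the table has no entry. */
static const char *syscall_name(long nr)
{
    size_t count = sizeof(syscall_names) / sizeof(syscall_names[0]);

    if (nr < 0 || (size_t)nr >= count || syscall_names[nr] == NULL)
        return "unknown";
    return syscall_names[nr];
}
```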

Learning milestones:

  1. You can trace a simple program’s syscalls → You understand the ptrace mechanism
  2. You decode syscall arguments correctly → You understand the x86-64 ABI
  3. You handle string and buffer arguments → You understand process memory access
  4. You follow child processes → You understand fork/clone from the kernel’s perspective

The Core Question You’re Answering

“What does a program actually ask the kernel to do, and how do those requests map to ABI details?”

A syscall trace turns kernel behavior into a readable timeline. This makes performance and correctness issues visible.

Concepts You Must Understand First

  1. Syscall ABI on x86-64
    • Which registers carry arguments?
    • How do you read return values?
    • Book Reference: The Linux Programming Interface — Ch. 3
  2. ptrace basics
    • What does PTRACE_SYSCALL do?
    • How does the tracer synchronize with the tracee?
    • Book Reference: Linux System Programming — Ch. 10
  3. ELF exec flow
    • What happens at execve?
    • Book Reference: Computer Systems: A Programmer’s Perspective — Ch. 8

Questions to Guide Your Design

  1. Trace accuracy
    • How will you distinguish syscall entry vs exit?
    • How will you decode errno values?
  2. Argument decoding
    • How will you read string arguments from tracee memory?
    • How will you handle pointers to structs?
  3. Process tree
    • Will you follow forks? How?
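For the errno question, one common approach relies on the Linux convention that a raw syscall return value in the range -4095 to -1 encodes a negated error code. The sketch below applies that rule; the helper names are hypothetical, and it prints the strerror() text because mapping to symbolic names such as ENOENT needs a table of your own.

```c
#include <stdio.h>
#include <string.h>

/* Linux reserves raw return values in [-4095, -1] for error codes. */
static int is_syscall_error(long long ret)
{
    return ret < 0 && ret >= -4095;
}

/* Formats a raw return value the way a tracer might print it. */
static void format_return(long long ret, char *buf, size_t len)
{
    if (is_syscall_error(ret))
        snprintf(buf, len, "-1 (errno %lld: %s)", -ret, strerror((int)-ret));
    else
        snprintf(buf, len, "%lld", ret);
}
```

Pointer-like results such as the address returned by mmap fall outside the error range, so they pass through unchanged and can be printed in hex by the caller.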

Thinking Exercise

openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

Questions while tracing:

  • Which register holds the pathname pointer?
  • What does AT_FDCWD represent?
  • Why is openat used instead of open?

The Interview Questions They’ll Ask

  1. “How does ptrace intercept syscalls?”
  2. “What is the difference between syscall entry and exit stops?”
  3. “How do you safely read user memory from the tracer?”
  4. “Why are there different syscall numbers across architectures?”
  5. “What happens when the tracee forks?”

Hints in Layers

Hint 1: Trace only one process. Start with PTRACE_TRACEME and a single child.

Hint 2: Track entry/exit. Keep a boolean toggle for entry vs exit stops.

Hint 3: Decode syscall names. Load a table from /usr/include/asm/unistd_64.h.

Hint 4: Read strings. Use PTRACE_PEEKDATA in a loop until \0.

Books That Will Help

Topic        | Book                                          | Chapter
Syscalls     | The Linux Programming Interface               | Ch. 3-4
ptrace       | Linux System Programming                      | Ch. 10
ABI basics   | Computer Systems: A Programmer’s Perspective  | Ch. 3
Process exec | Linux Kernel Development                      | Ch. 3

Common Pitfalls & Debugging

Problem 1: “All syscalls show the same number”

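  • Why: Reading rax instead of orig_rax, or reading registers at the wrong stop (rax holds -ENOSYS at entry and the return value at exit)
  • Fix: Read orig_rax at the syscall-entry stop, not after return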
  • Quick test: Trace getpid and verify the syscall number

Problem 2: “Strings are garbage”

  • Why: Wrong pointer width or address space
  • Fix: Read from the tracee’s address space, not yours
  • Quick test: Trace openat and dump the pathname
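Reading a string from the tracee's address space can be sketched as a word-by-word loop. To keep the loop testable without a live tracee, this illustration takes the word reader as a callback: ptrace_peek wraps PTRACE_PEEKDATA, and fake_peek reads from a local buffer. All names here (peek_word_fn, read_tracee_string, ptrace_peek, fake_mem, fake_peek) are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Reads one machine word at addr in the tracee; returns 0 on success. */
typedef int (*peek_word_fn)(void *ctx, unsigned long addr, long *word);

/* Copies a NUL-terminated string into buf (always terminated, possibly
 * truncated). Returns the length copied, or -1 if nothing was readable. */
static long read_tracee_string(peek_word_fn peek, void *ctx,
                               unsigned long addr, char *buf, size_t len)
{
    size_t n = 0;

    if (len == 0)
        return -1;
    while (n + 1 < len) {
        long word;
        if (peek(ctx, addr + n, &word) != 0) {
            if (n == 0)
                return -1;
            break;                      /* keep what was read so far */
        }
        const char *bytes = (const char *)&word;
        for (size_t i = 0; i < sizeof(word) && n + 1 < len; i++) {
            buf[n] = bytes[i];
            if (bytes[i] == '\0')
                return (long)n;
            n++;
        }
    }
    buf[n] = '\0';
    return (long)n;
}

/* Real reader: ctx points to the tracee's pid. PEEKDATA returns the word
 * itself, so -1 is only an error when errno is also set. */
int ptrace_peek(void *ctx, unsigned long addr, long *word)
{
    pid_t pid = *(pid_t *)ctx;

    errno = 0;
    long value = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
    if (value == -1 && errno != 0)
        return -1;
    *word = value;
    return 0;
}

/* Test double: treats addr as an offset into a local buffer. */
struct fake_mem { const char *data; size_t size; };

static int fake_peek(void *ctx, unsigned long addr, long *word)
{
    const struct fake_mem *mem = ctx;

    if (addr >= mem->size)
        return -1;
    size_t avail = mem->size - addr;
    *word = 0;
    memcpy(word, mem->data + addr,
           avail < sizeof(*word) ? avail : sizeof(*word));
    return 0;
}
```

In a tracer, the address passed in is the raw register value from the tracee (for openat, the second argument), never a pointer the tracer dereferences itself.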

Problem 3: “Tracer hangs”

  • Why: Missing waitpid or not continuing the child
  • Fix: Always wait and resume with PTRACE_SYSCALL

Project 3: Build and Boot Your First Custom Kernel

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C (kernel), Shell (scripting)
  • Alternative Programming Languages: N/A (the kernel is written in C, with Rust support now being added)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Kernel Build System / Boot Process
  • Software or Tool: QEMU, GCC, Make, Kconfig
  • Main Book: “Linux Kernel Programming” by Kaiwan N. Billimoria

What you’ll build: A complete workflow for building the Linux kernel from source, creating a minimal root filesystem, and booting it in QEMU—with modifications you can observe (like a custom kernel version string).

Why it teaches Linux kernel: You cannot contribute to the kernel without being able to build and test it. This project establishes your development workflow. You’ll understand Kconfig, the build system, and how the kernel boots—foundational knowledge for everything that follows.

Core challenges you’ll face:

  • Configuring the kernel with Kconfig → maps to understanding kernel compile-time options
  • Creating a bootable system with initramfs → maps to the boot process and init
  • Setting up QEMU with kernel debugging → maps to kernel debugging techniques
  • Modifying kernel code and seeing effects → maps to the development feedback loop

Key Concepts:

  • Kernel build system: “Linux Kernel Programming” Chapter 2-3 - Kaiwan N. Billimoria
  • Kconfig system: Kernel documentation Documentation/kbuild/
  • Boot process: “How Linux Works” Chapter 5 - Brian Ward
  • QEMU kernel development: LWN.net “Speeding up kernel development with QEMU”

Difficulty: Intermediate Time estimate: Weekend Prerequisites: Projects 1-2 completed, basic understanding of makefiles

Real world outcome:

$ ./build_kernel.sh
Configuring kernel for QEMU...
Building kernel (this takes a few minutes)...
  [kernel build output]
Creating minimal initramfs...
Kernel ready at: bzImage
Initramfs ready at: initramfs.cpio.gz

$ ./boot_qemu.sh
[    0.000000] Linux version 6.8.0-MYKERNEL-learning (you@yourhost) ...
[    0.000000] Command line: console=ttyS0 nokaslr
[    0.012345] x86/fpu: Supporting XSAVE feature...
...
[    0.234567] Run /init as init process

Welcome to your custom kernel!
Kernel version: 6.8.0-MYKERNEL-learning
# uname -a
Linux (none) 6.8.0-MYKERNEL-learning #1 SMP PREEMPT_DYNAMIC ...

# cat /proc/version
Linux version 6.8.0-MYKERNEL-learning (you@yourhost) ...

# dmesg | grep -i "hello"
[    0.001234] Hello from my modified kernel startup!

Implementation Hints:

Kernel configuration flow:

  1. make defconfig - Create default config
  2. make kvm_guest.config - Add KVM-specific options
  3. make menuconfig - Interactively customize (optional)
  4. The config is stored in .config

Build the kernel:

make -j$(nproc) bzImage    # Builds arch/x86/boot/bzImage

Creating a minimal initramfs: You need at least an init program. Options:

  • Use BusyBox (static build) as /init
  • Or create a simple C program that prints info and spawns a shell
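The second option might look like the sketch below. It assumes the initramfs already contains /proc and /sys directories and a statically linked shell at /bin/sh; the function names print_banner and init_main are illustrative. The body is written as init_main so it can be called from a one-line main when you build the real /init.

```c
#include <stdio.h>
#include <sys/mount.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Prints a greeting plus the running kernel release; 0 on success. */
static int print_banner(FILE *out)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    fprintf(out, "Welcome to your custom kernel!\n");
    fprintf(out, "Kernel version: %s\n", u.release);
    return 0;
}

/* Body of /init. PID 1 must never exit, or the kernel panics. */
int init_main(void)
{
    /* Mount failures are ignored here; the shell still starts. */
    mount("proc", "/proc", "proc", 0, NULL);
    mount("sysfs", "/sys", "sysfs", 0, NULL);

    print_banner(stdout);
    fflush(stdout);

    execl("/bin/sh", "sh", (char *)NULL);
    perror("execl /bin/sh");    /* only reached if exec failed */
    for (;;)
        pause();                /* park instead of returning */
}
```

Build it statically (for example with gcc -static) so it runs without shared libraries in the initramfs, and add int main(void) { return init_main(); } to the file.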

QEMU command for kernel development:

qemu-system-x86_64 \
    -kernel bzImage \
    -initrd initramfs.cpio.gz \
    -append "console=ttyS0 nokaslr" \
    -nographic \
    -enable-kvm \
    -m 1G \
    -s  # Enable GDB stub on port 1234

Making your first kernel modification: Edit init/main.c and find start_kernel(). Add a printk:

pr_info("Hello from my modified kernel startup!\n");

Customizing the version string: Edit the top-level Makefile and modify EXTRAVERSION:

EXTRAVERSION = -MYKERNEL-learning

Questions to guide implementation:

  1. What’s the difference between bzImage, vmlinux, and vmlinuz?
  2. Why do we need nokaslr for debugging?
  3. What does the -s flag in QEMU enable?
  4. How does the kernel find and run /init?

Learning milestones:

  1. You successfully build the kernel → You understand Kconfig and the build system
  2. QEMU boots your kernel → You understand the boot process basics
  3. You see your custom version string → You’ve modified kernel source successfully
  4. You can attach GDB to the running kernel → You’re ready for kernel debugging

The Core Question You’re Answering

“How does a kernel go from source code to a running system you can debug?”

Building and booting teaches you the full pipeline: configuration, compilation, bootloader handoff, and early kernel init.

Concepts You Must Understand First

  1. Kernel build artifacts
    • What is bzImage vs vmlinux?
    • Book Reference: How Linux Works — Ch. 5
  2. Boot flow
    • How does the kernel find /init?
    • Book Reference: Linux Kernel Development — Ch. 2
  3. Kconfig
    • What does .config control?
    • Book Reference: Linux Kernel Development — Ch. 2

Questions to Guide Your Design

  1. Build speed
    • How will you reduce rebuild time (ccache, minimal config)?
  2. Debuggability
    • Will you disable KASLR? Will you enable debug symbols?
  3. Reproducibility
    • How will you script the build and boot steps?

Thinking Exercise

qemu-system-x86_64 -kernel bzImage -initrd initramfs.cpio.gz -append "console=ttyS0"

Questions while tracing:

  • Which component passes the kernel command line?
  • What happens if /init is missing?

The Interview Questions They’ll Ask

  1. “What is Kconfig and why is it critical?”
  2. “How do you debug early boot failures?”
  3. “Why disable KASLR for debugging?”
  4. “What is initramfs and why use it?”
  5. “How does QEMU help kernel development?”

Hints in Layers

Hint 1: Start with defconfig. Use make defconfig and boot before changing anything.

Hint 2: Add a custom printk. Modify init/main.c to verify your build actually runs.

Hint 3: Script boot. A single boot_qemu.sh file saves hours.

Hint 4: Add GDB support. Use -s in QEMU and target remote :1234.

Books That Will Help

Topic        | Book                                          | Chapter
Kernel build | Linux Kernel Development                      | Ch. 2
Boot process | How Linux Works                               | Ch. 5
ELF basics   | Computer Systems: A Programmer’s Perspective  | Ch. 8
Debugging    | The Art of Debugging with GDB                 | Ch. 7

Common Pitfalls & Debugging

Problem 1: “Kernel boots to panic: no init found”

  • Why: initramfs missing /init
  • Fix: Add a static BusyBox or custom init
  • Quick test: zcat initramfs.cpio.gz | cpio -t | grep -E '^(\./)?init$'

Problem 2: “QEMU shows nothing”

  • Why: Console not set to ttyS0
  • Fix: Add console=ttyS0 to the kernel cmdline
  • Quick test: Verify QEMU output on the terminal

Problem 3: “GDB symbols missing”

  • Why: Booting bzImage without vmlinux for symbols
  • Fix: Load vmlinux in GDB

Project 4: Hello World Kernel Module

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust (kernel Rust is emerging)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Loadable Kernel Modules
  • Software or Tool: insmod, rmmod, modprobe, dmesg
  • Main Book: “Linux Device Drivers, 3rd Edition” by Corbet, Rubini, Kroah-Hartman

What you’ll build: A series of increasingly sophisticated kernel modules—from “hello world” to modules that create entries in /proc and /sys, accept parameters, and demonstrate proper cleanup.

Why it teaches Linux kernel: Kernel modules are the gateway to kernel development. They let you write and test kernel code without rebooting. You’ll learn kernel memory allocation, the module lifecycle, and how to expose data to userspace—skills essential for driver development.

Core challenges you’ll face:

  • Understanding the module lifecycle (init/exit) → maps to kernel object lifecycle management
  • Using kernel APIs (printk, kmalloc) → maps to kernel programming conventions
  • Creating /proc entries → maps to the procfs interface
  • Handling module parameters → maps to kernel configuration mechanisms
  • Proper cleanup and error handling → maps to kernel resource management

Key Concepts:

  • Module basics: “Linux Device Drivers, 3rd Edition” Chapter 2 - Corbet et al.
  • procfs interface: “Linux Kernel Programming” Chapter 7 - Kaiwan N. Billimoria
  • Kernel memory allocation: “Understanding the Linux Kernel” Chapter 8 - Bovet & Cesati
  • Module parameters: Kernel documentation Documentation/admin-guide/kernel-parameters.txt

Difficulty: Intermediate Time estimate: 1 week Prerequisites: Project 3 completed (you need a kernel build environment)

Real world outcome:

# Load the basic module
$ sudo insmod hello.ko
$ dmesg | tail -3
[12345.678] hello: module loaded
[12345.678] hello: Hello from kernel space!
[12345.678] hello: Running on CPU 0, PID 1234 (insmod)

# Unload it
$ sudo rmmod hello
$ dmesg | tail -1
[12346.789] hello: Goodbye from kernel space!

# Load module with parameters
$ sudo insmod hello_params.ko debug_level=2 message="Custom greeting"
$ dmesg | tail
[12350.123] hello_params: Debug level set to 2
[12350.123] hello_params: Message: Custom greeting

# The /proc interface module
$ sudo insmod procfs_example.ko
$ cat /proc/kernel_explorer/info
Module: procfs_example
Loaded at: 1234567890
Read count: 1

$ cat /proc/kernel_explorer/info
Module: procfs_example
Loaded at: 1234567890
Read count: 2

# The sysfs module
$ sudo insmod sysfs_example.ko
$ ls /sys/kernel/my_module/
value  description

$ cat /sys/kernel/my_module/value
42

$ echo 100 > /sys/kernel/my_module/value
$ cat /sys/kernel/my_module/value
100

Implementation Hints:

Basic module structure:

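#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("Learning kernel modules");

/* Runs once at insmod time; a non-zero return aborts the load. */
static int __init my_init(void)
{
    pr_info("hello: module loaded\n");
    return 0;
}

/* Runs once at rmmod time; undo everything my_init() set up. */
static void __exit my_exit(void)
{
    pr_info("hello: Goodbye from kernel space!\n");
}

module_init(my_init);
module_exit(my_exit);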

Makefile for out-of-tree module:

obj-m += hello.o
KDIR := /lib/modules/$(shell uname -r)/build

# Recipe lines must begin with a tab, not spaces.
all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean

Key kernel APIs to learn:

  • printk() / pr_info(), pr_err(), pr_debug() - Kernel logging
  • kmalloc() / kfree() - Kernel memory allocation
  • proc_create() / proc_remove() - procfs entries
  • kobject_create_and_add() - sysfs objects
  • module_param() - Module parameters

Questions to guide implementation:

  1. Why must kernel modules use kmalloc instead of malloc?
  2. What happens if you forget to call cleanup functions in module_exit?
  3. Why do we need MODULE_LICENSE("GPL")?
  4. How do you debug a module that causes a kernel panic?

Build a progression:

  1. hello.ko - Just prints messages on load/unload
  2. hello_params.ko - Accepts module parameters
  3. procfs_example.ko - Creates a readable /proc entry
  4. sysfs_example.ko - Creates read/write sysfs attributes
  5. timer_example.ko - Uses kernel timers

Learning milestones:

  1. Your module loads and unloads cleanly → You understand the module lifecycle
  2. Module parameters work → You understand kernel configuration mechanisms
  3. procfs entries appear and function → You can expose data to userspace
  4. sysfs read/write works → You understand the kernel device model foundation
  5. No memory leaks after rmmod → You’ve mastered kernel resource management

The Core Question You’re Answering

“How do kernel modules load, run, and safely clean up without rebooting the system?”

Modules are the safest way to practice kernel code because they run in-kernel but can be removed.

Concepts You Must Understand First

  1. Module lifecycle
    • What is module_init and module_exit?
    • Book Reference: Linux Kernel Development — Ch. 7
  2. Kernel logging
    • When to use pr_info vs pr_err?
    • Book Reference: Linux Kernel Development — Ch. 1
  3. Kernel memory allocation
    • Why use kmalloc?
    • Book Reference: Understanding the Linux Kernel — Ch. 8

Questions to Guide Your Design

  1. Safety
    • How will you guarantee cleanup on failure paths?
  2. Interfaces
    • Will you expose data via procfs or sysfs?
  3. Observability
    • What logs prove the module does what you think?
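For the safety question, kernel code conventionally unwinds with goto labels in reverse order of acquisition. The sketch below models that pattern in plain userspace C so it can be run and tested: acquire and release are hypothetical stand-ins for calls such as kmalloc/kfree or proc_create/proc_remove, and fail_at is a test knob that forces a chosen step to fail.

```c
#include <errno.h>
#include <stdlib.h>

static int live_resources;      /* leak counter for the demo only */
static void *buf, *proc_entry, *sysfs_obj;

/* Stand-in for an allocation or registration that may fail. */
static void *acquire(int should_fail)
{
    void *res = should_fail ? NULL : malloc(1);

    if (res)
        live_resources++;
    return res;
}

static void release(void *res)
{
    free(res);
    live_resources--;
}

/* Models module init: fail_at selects the failing step (0 = none). */
static int demo_init(int fail_at)
{
    buf = acquire(fail_at == 1);
    if (!buf)
        goto err_out;
    proc_entry = acquire(fail_at == 2);
    if (!proc_entry)
        goto err_free_buf;
    sysfs_obj = acquire(fail_at == 3);
    if (!sysfs_obj)
        goto err_remove_proc;
    return 0;

    /* Unwind in reverse order; each label undoes one earlier step. */
err_remove_proc:
    release(proc_entry);
err_free_buf:
    release(buf);
err_out:
    return -ENOMEM;
}

/* Models module exit: release everything init acquired, newest first. */
static void demo_exit(void)
{
    release(sysfs_obj);
    release(proc_entry);
    release(buf);
}
```

The property worth checking in your own module is the one the counter makes visible: after any failed init, and after every exit, nothing remains allocated.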

Thinking Exercise

static int __init hello_init(void) {
    pr_info("hello: init\n");
    return 0;
}

Questions while tracing:

  • What happens if hello_init returns non-zero?
  • Where does that error show up in userspace?

The Interview Questions They’ll Ask

  1. “Why do modules need a GPL license tag?”
  2. “What’s the difference between procfs and sysfs?”
  3. “Why is kmalloc required in kernel space?”
  4. “How do you debug a module crash?”
  5. “When would you prefer a module vs built-in code?”

Hints in Layers

Hint 1: Start with printk. Load/unload and watch dmesg for messages.

Hint 2: Add a parameter. Use module_param to pass config on load.

Hint 3: Add /proc. Expose a read-only proc entry first.

Hint 4: Add sysfs. Use kobject_create_and_add to create attributes.

Books That Will Help

Topic         | Book                            | Chapter
Modules       | Linux Kernel Development        | Ch. 7
procfs/sysfs  | Linux System Programming        | Ch. 8
Kernel memory | Understanding the Linux Kernel  | Ch. 8

Common Pitfalls & Debugging

Problem 1: “Module won’t load: invalid module format”

  • Why: Kernel version mismatch
  • Fix: Build against the running kernel headers
  • Quick test: modinfo hello.ko | grep vermagic

Problem 2: “System freezes on rmmod”

  • Why: Missing cleanup or stuck references
  • Fix: Ensure module_exit releases resources
  • Quick test: Reboot and inspect logs

Problem 3: “No logs show in dmesg”

  • Why: Wrong log level or dmesg restricted
  • Fix: Use pr_info and run sudo dmesg -w

Project 5: Character Device Driver

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust (experimental kernel Rust support)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Device Drivers / Character Devices
  • Software or Tool: mknod, /dev, udev
  • Main Book: “Linux Device Drivers, 3rd Edition” by Corbet, Rubini, Kroah-Hartman

What you’ll build: A fully functional character device driver that creates a device in /dev, supports open, read, write, ioctl, and proper file operations. The device will act as a simple message buffer—write messages in, read them out.

Why it teaches Linux kernel: Device drivers are the largest part of the kernel codebase. Character devices are the simplest driver type and teach you the fundamentals: device registration, file operations callbacks, user/kernel data copying, and concurrency. This is the stepping stone to real hardware drivers.

Core challenges you’ll face:

  • Registering a character device → maps to major/minor numbers and device registration
  • Implementing file_operations callbacks → maps to the VFS interface
  • Copying data between user and kernel space → maps to security and memory isolation
  • Handling concurrent access → maps to kernel synchronization primitives
  • Proper ioctl implementation → maps to device control interfaces

Key Concepts:

  • Character device basics: “Linux Device Drivers, 3rd Edition” Chapter 3 - Corbet et al.
  • File operations structure: Kernel source include/linux/fs.h
  • User/kernel data transfer: “Linux Kernel Programming” Chapter 8 - Kaiwan N. Billimoria
  • Device number allocation: “Linux Device Drivers, 3rd Edition” Chapter 3

Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Project 4 completed, understanding of file operations

Real world outcome:

# Load the driver
$ sudo insmod chardev.ko
$ dmesg | tail -3
[12345.678] chardev: Registered with major number 240
[12345.678] chardev: Device class created
[12345.679] chardev: Device node /dev/msgbuffer created

# The device appears automatically (udev integration)
$ ls -la /dev/msgbuffer
crw-rw-r-- 1 root root 240, 0 Dec 20 10:00 /dev/msgbuffer

# Write a message to the device
$ echo "Hello, kernel!" > /dev/msgbuffer
$ dmesg | tail -1
[12350.123] chardev: Received 15 bytes from user

# Read it back
$ cat /dev/msgbuffer
Hello, kernel!

# Multiple messages (acts like a queue)
$ echo "First message" > /dev/msgbuffer
$ echo "Second message" > /dev/msgbuffer
$ cat /dev/msgbuffer
First message
Second message

# Use ioctl to control the device
$ ./chardev_test
Device opened successfully
Setting buffer size to 4096... OK
Getting buffer info:
  Current size: 4096
  Messages stored: 2
  Total bytes: 29
Clearing buffer... OK
Messages stored: 0

Implementation Hints:

Device registration flow:

  1. Allocate major/minor numbers: alloc_chrdev_region()
  2. Initialize cdev structure: cdev_init(), cdev_add()
  3. Create device class: class_create()
  4. Create device node: device_create()

The file_operations structure:

static const struct file_operations fops = {
    .owner = THIS_MODULE,
    .open = device_open,
    .release = device_release,
    .read = device_read,
    .write = device_write,
    .unlocked_ioctl = device_ioctl,
};

Critical: Copying data safely between user and kernel:

  • NEVER dereference user pointers directly in kernel code!
  • Use copy_to_user() and copy_from_user()
  • These functions handle page faults and return bytes NOT copied

Concurrency protection:

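static DEFINE_MUTEX(device_mutex);

static ssize_t device_read(struct file *file, char __user *buf,
                           size_t len, loff_t *off)
{
    ssize_t ret = 0;

    /* Sleep until the lock is free, but let signals interrupt the wait. */
    if (mutex_lock_interruptible(&device_mutex))
        return -ERESTARTSYS;
    /* ... copy data with copy_to_user() and set ret to the byte count ... */
    mutex_unlock(&device_mutex);
    return ret;
}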

ioctl implementation: Define commands using macros from <linux/ioctl.h>:

#define MYDEV_IOC_MAGIC 'k'
#define MYDEV_IOC_CLEAR    _IO(MYDEV_IOC_MAGIC, 0)
#define MYDEV_IOC_SETSIZE  _IOW(MYDEV_IOC_MAGIC, 1, int)
#define MYDEV_IOC_GETINFO  _IOR(MYDEV_IOC_MAGIC, 2, struct device_info)
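These macros pack a direction, a size, the magic byte, and a command number into one integer, and the same header lets userspace unpack them again. The sketch below repeats the definitions so it compiles on its own; the field layout of struct device_info is an assumption made for illustration, and ioctl_direction is a hypothetical helper.

```c
#include <linux/ioctl.h>   /* _IO, _IOR, _IOW and the _IOC_* decoders */

/* Assumed layout; share one header between driver and test program. */
struct device_info {
    int size;
    int messages;
    int total_bytes;
};

#define MYDEV_IOC_MAGIC 'k'
#define MYDEV_IOC_CLEAR    _IO(MYDEV_IOC_MAGIC, 0)
#define MYDEV_IOC_SETSIZE  _IOW(MYDEV_IOC_MAGIC, 1, int)
#define MYDEV_IOC_GETINFO  _IOR(MYDEV_IOC_MAGIC, 2, struct device_info)

/* Direction is named from userspace's point of view. */
static const char *ioctl_direction(unsigned int cmd)
{
    switch (_IOC_DIR(cmd)) {
    case _IOC_NONE:  return "none";
    case _IOC_READ:  return "read (kernel to user)";
    case _IOC_WRITE: return "write (user to kernel)";
    default:         return "read and write";
    }
}
```

A driver can use the same decoders defensively, for example rejecting any command whose _IOC_TYPE does not match its magic byte before touching the argument.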

Questions to guide implementation:

  1. What’s the difference between major and minor device numbers?
  2. Why do we need separate open and release callbacks?
  3. What happens if copy_from_user fails partway through?
  4. How do you handle a read when no data is available? (Hint: blocking vs non-blocking)
  5. Why use unlocked_ioctl instead of the old ioctl?

Learning milestones:

  1. Device appears in /dev → You understand device registration
  2. read/write work correctly → You understand user/kernel data transfer
  3. Multiple processes can access the device safely → You understand kernel locking
  4. ioctl commands work → You understand device control interfaces
  5. Module unload cleans up completely → You’ve mastered the device lifecycle

The Core Question You’re Answering

“How does the kernel expose a device as a file, and how do you safely move data across that boundary?”

Character devices teach the exact contract between userspace and kernelspace.

Concepts You Must Understand First

  1. VFS file operations
    • What is struct file_operations?
    • Book Reference: Linux Kernel Development — Ch. 12
  2. User/kernel memory safety
    • Why must you use copy_to_user?
    • Book Reference: Linux System Programming — Ch. 2
  3. Device numbers
    • What are major/minor numbers?
    • Book Reference: Linux Kernel Development — Ch. 14

Questions to Guide Your Design

  1. Concurrency
    • Will the device be exclusive or shared?
    • How will you lock access?
  2. Buffering model
    • Is it a ring buffer, queue, or single message?
  3. ioctl surface
    • Which operations should be read/write vs ioctl?

Thinking Exercise

ssize_t device_read(struct file *f, char __user *buf, size_t len, loff_t *off);

Questions while tracing:

  • What should happen if len > available data?
  • How do you handle partial reads?
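One way to reason about these two questions is a userspace model of the read logic, where memcpy stands in for copy_to_user() and the function name model_read is made up. The rule it encodes is the usual one: return at most what is available from the current offset, advance the offset by that amount, and return 0 at end of data. In a real driver, the kernel helper simple_read_from_buffer() implements this same clamping.

```c
#include <string.h>
#include <sys/types.h>

/* Model of a read callback over a flat buffer of `available` bytes.
 * Returns bytes copied, 0 at end of data, or -1 for a bad offset
 * (kernel code would return -EINVAL instead). */
static ssize_t model_read(const char *data, size_t available,
                          char *dst, size_t len, off_t *off)
{
    if (*off < 0)
        return -1;
    if ((size_t)*off >= available)
        return 0;                           /* nothing left: EOF */

    size_t remaining = available - (size_t)*off;
    if (len > remaining)
        len = remaining;                    /* short read, not an error */

    memcpy(dst, data + *off, len);          /* copy_to_user() in a driver */
    *off += (off_t)len;
    return (ssize_t)len;
}
```

A caller such as cat keeps calling read until it sees 0, which is why returning a short count followed by 0 is enough to make partial reads work.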

The Interview Questions They’ll Ask

  1. “What does copy_from_user protect against?”
  2. “How do you prevent race conditions in a device driver?”
  3. “What’s the difference between character and block devices?”
  4. “Why use ioctl at all?”
  5. “How does udev create /dev nodes?”

Hints in Layers

Hint 1: Register the device. Use alloc_chrdev_region and cdev_add.

Hint 2: Minimal read/write. Implement read and write with a static buffer.

Hint 3: Add locking. Guard the buffer with a mutex.

Hint 4: Add ioctl. Define ioctl commands with _IO, _IOR, _IOW.

Books That Will Help

Topic            | Book                      | Chapter
VFS              | Linux Kernel Development  | Ch. 12
Device model     | Linux Kernel Development  | Ch. 14
User/kernel copy | Linux System Programming  | Ch. 2

Common Pitfalls & Debugging

Problem 1: “Kernel OOPS on read”

  • Why: Dereferencing userspace pointer directly
  • Fix: Use copy_to_user
  • Quick test: Run with CONFIG_DEBUG_KERNEL and check logs

Problem 2: “Device node missing”

  • Why: device_create failed or udev rules missing
  • Fix: Create with mknod manually as a test

Problem 3: “Race conditions with multiple writers”

  • Why: No locking around shared buffer
  • Fix: Add mutex or spinlock depending on context

Project 6: Memory Allocator Visualizer

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Python (for visualization frontend)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Memory Management / Kernel Memory Allocators
  • Software or Tool: /proc/buddyinfo, /proc/slabinfo, ftrace
  • Main Book: “Understanding the Linux Kernel” by Bovet & Cesati

What you’ll build: A tool that visualizes how the Linux kernel allocates memory—showing the buddy allocator’s page pools, slab/slub cache status, and memory fragmentation over time. It will include a kernel module that tracks allocations and a userspace visualizer.

Why it teaches Linux kernel: Memory management is one of the kernel’s most complex subsystems. Understanding how kmalloc actually works—from the slab allocator to the buddy system—gives you insight into performance characteristics and helps you write efficient kernel code. This knowledge is essential for debugging memory-related kernel issues.

Core challenges you’ll face:

  • Parsing /proc/buddyinfo → maps to understanding the buddy allocator
  • Understanding /proc/slabinfo → maps to slab/slub allocator internals
  • Tracking allocations with tracepoints → maps to kernel tracing infrastructure
  • Detecting memory fragmentation → maps to memory compaction concepts

Key Concepts:

  • Buddy allocator: “Understanding the Linux Kernel” Chapter 8 - Bovet & Cesati
  • Slab allocator: “Linux Kernel Development” Chapter 12 - Robert Love
  • SLUB internals: Kernel documentation Documentation/mm/slub.rst
  • Memory zones: “Linux Kernel Programming” Chapter 11 - Kaiwan N. Billimoria

Difficulty: Advanced Time estimate: 2 weeks Prerequisites: Projects 4-5 completed, understanding of virtual memory concepts

Real world outcome:

$ ./mem_visualizer --buddy
═══════════════════ BUDDY ALLOCATOR STATUS ═══════════════════
Zone: DMA (0-16MB)
  Order:    0    1    2    3    4    5    6    7    8    9   10
  Free:     1    1    0    1    1    1    0    0    1    1    3

Zone: DMA32 (16MB-4GB)
  Order:    0    1    2    3    4    5    6    7    8    9   10
  Free:   234  156   89   45   23   12    6    3    1    0    0

Zone: Normal (4GB+)
  Order:    0    1    2    3    4    5    6    7    8    9   10
  Free:  1823  945  234  123   67   34   12    5    2    1    0

Fragmentation Index: 0.23 (low)
Largest contiguous block: 4MB (order 10, DMA)

$ ./mem_visualizer --slab
═══════════════════ SLAB CACHE STATUS ═══════════════════
Cache Name            Active    Total    ObjSize    SlabSize
─────────────────────────────────────────────────────────────
kmalloc-8                156      256        8        4096
kmalloc-16               234      512       16        4096
kmalloc-32               567     1024       32        4096
kmalloc-64              1234     2048       64        4096
kmalloc-128              456      768      128        4096
kmalloc-256              123      256      256        4096
task_struct               89       96     9792       32768
files_cache              234      256      704        4096
inode_cache             1567     2048      600        4096
dentry                  4523     5120      192        4096

Memory in slabs: 45.6 MB
Fragmentation: 12.3%

$ sudo ./mem_visualizer --trace &
[Tracking memory allocations...]

$ stress --vm 1 --vm-bytes 100M &
$ ./mem_visualizer --report
═══════════════════ ALLOCATION TRACE REPORT ═══════════════════
Last 10 seconds:
  kmalloc calls: 12,456
  kfree calls: 11,234
  Peak allocated: 234 MB

Top allocating call sites:
  1. ext4_inode_alloc (fs/ext4/ialloc.c:789)     - 2,345 calls
  2. alloc_skb (net/core/skbuff.c:234)          - 1,890 calls
  3. kmem_cache_alloc (mm/slub.c:2890)          - 1,234 calls

Implementation Hints:

Understanding the buddy allocator: The buddy allocator manages physical pages. Pages are grouped by “order” (power of 2):

  • Order 0 = 1 page (4KB)
  • Order 1 = 2 pages (8KB)
  • Order 10 = 1024 pages (4MB)

/proc/buddyinfo shows free pages at each order for each zone.
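Parsing one line of that file can be sketched as follows. The sketch assumes the usual "Node N, zone NAME count count ..." layout and a 4KB page size when you convert pages to bytes; MAX_ORDERS, struct buddy_zone, and the three function names are illustrative choices.

```c
#include <stdio.h>

#define MAX_ORDERS 16   /* generous bound; kernels commonly report 11 */

struct buddy_zone {
    int           node;
    char          zone[16];
    unsigned long free[MAX_ORDERS];   /* free blocks per order */
    int           orders;             /* number of columns parsed */
};

/* Parses one /proc/buddyinfo line; returns 0 on success, -1 otherwise. */
static int parse_buddyinfo_line(const char *line, struct buddy_zone *out)
{
    int consumed = 0;

    if (sscanf(line, "Node %d, zone %15s%n",
               &out->node, out->zone, &consumed) != 2)
        return -1;

    const char *p = line + consumed;
    out->orders = 0;
    while (out->orders < MAX_ORDERS) {
        unsigned long count;
        int used = 0;

        if (sscanf(p, "%lu%n", &count, &used) != 1)
            break;
        out->free[out->orders++] = count;
        p += used;
    }
    return out->orders > 0 ? 0 : -1;
}

/* Total free pages: each block at order k holds 2^k pages. */
static unsigned long free_pages(const struct buddy_zone *z)
{
    unsigned long total = 0;

    for (int order = 0; order < z->orders; order++)
        total += z->free[order] << order;
    return total;
}

/* Highest order with at least one free block, or -1 if none. */
static int largest_free_order(const struct buddy_zone *z)
{
    for (int order = z->orders - 1; order >= 0; order--)
        if (z->free[order] > 0)
            return order;
    return -1;
}
```

The largest free order is a quick fragmentation signal: plenty of free pages with a low largest order means large contiguous allocations may fail.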

Understanding slab allocators: Slabs cache kernel objects of fixed sizes to avoid fragmentation:

Page frames (from buddy) → Slabs → Objects (task_struct, inode, etc.)

/proc/slabinfo shows cache statistics (it is usually readable only by root). /sys/kernel/slab/ provides per-cache details on SLUB kernels; some of its attributes appear only when CONFIG_SLUB_DEBUG is enabled.

Tracing allocations: Use ftrace tracepoints:

echo 1 > /sys/kernel/debug/tracing/events/kmem/kmalloc/enable
cat /sys/kernel/debug/tracing/trace_pipe

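Or write a kernel module that uses register_kprobe() on __kmalloc or a related out-of-line symbol listed in /proc/kallsyms. kmalloc itself is an inline wrapper, so it has no symbol to probe.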

Visualization approach:

  1. Kernel module exports data via debugfs or procfs
  2. Userspace tool reads and formats the data
  3. Consider using ASCII art or ncurses for terminal visualization
  4. For graphical output, export data to JSON and use a web frontend

Questions to guide implementation:

  1. Why does the buddy allocator use powers of 2?
  2. What happens when you kmalloc(100)? What size does the slab give you?
  3. How does the kernel decide which zone to allocate from?
  4. What causes external fragmentation vs internal fragmentation?

Learning milestones:

  1. You interpret /proc/buddyinfo correctly → You understand physical page allocation
  2. You explain slab cache utilization → You understand object caching
  3. You detect fragmentation patterns → You understand memory pressure scenarios
  4. Your tracer captures allocation call sites → You understand kernel tracing

The Core Question You’re Answering

“How does the kernel allocate memory under pressure, and what does fragmentation look like in reality?”

This project turns allocator internals into observable behavior.

Concepts You Must Understand First

  1. Buddy allocator
    • Why does it allocate in powers of two?
    • Book Reference: Understanding the Linux Kernel — Ch. 8
  2. SLUB/SLAB caches
    • What problem do object caches solve?
    • Book Reference: Linux Kernel Development — Ch. 12
  3. Memory zones
    • Why do DMA/DMA32/Normal exist?
    • Book Reference: Operating Systems: Three Easy Pieces — Ch. 13

Questions to Guide Your Design

  1. Sampling frequency
    • How often do you sample /proc and tracing?
  2. Data format
    • Will you expose JSON for a frontend?
  3. Overhead
    • How do you avoid making allocations to trace allocations?
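One common answer to the overhead question is to reserve all storage up front and record events into a fixed-size ring. The sketch below is a single-threaded userspace model of that idea; the alloc_event layout, RING_CAPACITY, and function names are hypothetical, and a real in-kernel tracer would also need per-CPU buffers or atomic operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 1024   /* assumption: must be a power of two */

struct alloc_event {          /* hypothetical record layout */
    uint64_t timestamp_ns;
    uint64_t call_site;
    uint32_t bytes;
};

struct event_ring {
    struct alloc_event slots[RING_CAPACITY];  /* storage reserved up front */
    uint64_t head;      /* total events ever recorded */
    uint64_t tail;      /* total events consumed or overwritten */
    uint64_t dropped;   /* events overwritten before being read */
};

/* Record an event without allocating; overwrite the oldest slot when full. */
static void ring_record(struct event_ring *r, const struct alloc_event *ev)
{
    assert(r && ev);
    if (r->head - r->tail == RING_CAPACITY) {
        r->tail++;          /* discard the oldest unread event */
        r->dropped++;
    }
    r->slots[r->head & (RING_CAPACITY - 1)] = *ev;
    r->head++;
}

/* Pop the oldest unread event; returns 1 on success, 0 if the ring is empty. */
static int ring_consume(struct event_ring *r, struct alloc_event *out)
{
    assert(r && out);
    if (r->tail == r->head)
        return 0;
    *out = r->slots[r->tail & (RING_CAPACITY - 1)];
    r->tail++;
    return 1;
}
```

Because ring_record never calls an allocator, tracing an allocation cannot trigger another traced allocation. The dropped counter makes data loss visible instead of silent.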

Thinking Exercise

kmalloc(100) → which cache? what internal size? what fragmentation?

Questions while tracing:

  • What size class would satisfy 100 bytes?
  • What is internal fragmentation in this case?
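To check your answer, here is a small model of size-class rounding. The class list is an assumption based on the general-purpose kmalloc caches commonly seen on x86-64 SLUB systems; the real set depends on architecture and configuration, so compare it with /proc/slabinfo on your machine. The function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed kmalloc size classes; verify against /proc/slabinfo on the target. */
static const size_t kmalloc_classes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the smallest size class that fits `size`, or 0 if none does. */
static size_t kmalloc_size_class(size_t size)
{
    size_t n = sizeof(kmalloc_classes) / sizeof(kmalloc_classes[0]);

    assert(size > 0);   /* kmalloc(0) is a special case not modeled here */
    for (size_t i = 0; i < n; i++)
        if (size <= kmalloc_classes[i])
            return kmalloc_classes[i];
    return 0;
}

/* Bytes handed out but never requested: the internal fragmentation. */
static size_t internal_waste(size_t size)
{
    size_t cls = kmalloc_size_class(size);
    return cls ? cls - size : 0;
}
```

Under these assumptions a 100-byte request lands in the 128-byte class and wastes 28 bytes, roughly 22% of the object.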

The Interview Questions They’ll Ask

  1. “Why does the buddy allocator use orders?”
  2. “What is slab fragmentation?”
  3. “How does the kernel handle memory pressure?”
  4. “What is the difference between SLAB and SLUB?”
  5. “Why can kmalloc fail even with free memory?”

Hints in Layers

Hint 1: Parse /proc/buddyinfo
Start with a simple parser and print orders.

Hint 2: Add /proc/slabinfo
Summarize the top 5 caches by memory use.

Hint 3: Add tracing
Use tracepoints kmem/kmalloc and kmem/kfree.

Hint 4: Visualize trends
Plot fragmentation over time rather than a single snapshot.
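One metric worth plotting is the share of free memory that is too fragmented to satisfy a request of a given order. The sketch below follows the idea behind the kernel's unusable free space index; it is a simplified floating-point model, and the function name and signature are assumptions. The "Fragmentation Index" in the sample output above may be computed differently.

```c
#include <assert.h>

/*
 * Fraction of free pages that sit in blocks smaller than 2^order pages.
 * 0.0 means every free page can serve a request of this order; values near
 * 1.0 mean memory is free but too fragmented to use for it.
 * counts[k] is the number of free blocks of order k, as in /proc/buddyinfo.
 */
static double unusable_index(const unsigned long counts[], int n_orders, int order)
{
    unsigned long long total = 0, usable = 0;

    assert(counts && order >= 0 && order < n_orders);
    for (int k = 0; k < n_orders; k++) {
        unsigned long long pages = (unsigned long long)counts[k] << k;
        total += pages;
        if (k >= order)
            usable += pages;    /* blocks at least as large as the request */
    }
    if (total == 0)
        return 1.0;             /* no free memory: treat everything as unusable */
    return (double)(total - usable) / (double)total;
}
```

Sampling this once per second for a few orders (for example 0, 4, and 9) shows fragmentation building up under load far more clearly than a single snapshot.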

Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Buddy allocator | Understanding the Linux Kernel | Ch. 8 |
| SLUB | Linux Kernel Development | Ch. 12 |
| Memory pressure | Operating Systems: Three Easy Pieces | Ch. 13 |

Common Pitfalls & Debugging

Problem 1: “Numbers don’t match reality”

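  • Why: Units are blocks and pages, not bytes (each /proc/buddyinfo count is a number of free blocks of 2^order pages)
  • Fix: Multiply each count by PAGE_SIZE << order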
  • Quick test: Compare against free -m

Problem 2: “Tracing floods the system”

  • Why: Tracepoints enabled without filtering
  • Fix: Filter by PID or sampling window

Problem 3: “Missing slab caches”

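  • Why: SLUB merges caches with compatible sizes and flags, or the kernel was built without CONFIG_SLUB_DEBUG (which /proc/slabinfo depends on under SLUB)
  • Fix: Boot with slab_nomerge to see every cache, or rebuild the kernel with CONFIG_SLUB_DEBUG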

Project 7: Process Scheduler Analyzer

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Python (for analysis scripts)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Process Scheduling / CFS
  • Software or Tool: /proc/sched_debug, perf sched, ftrace
  • Main Book: “Linux Kernel Development” by Robert Love

What you’ll build: A tool that analyzes Linux’s Completely Fair Scheduler (CFS) in action—visualizing run queues, tracking process vruntime, measuring scheduling latency, and demonstrating how priorities affect scheduling decisions.

Why it teaches Linux kernel: The scheduler is the heart of multitasking. Understanding CFS means understanding how the kernel decides what runs and when. This knowledge is crucial for performance tuning and understanding why processes behave the way they do.

Core challenges you’ll face:

  • Understanding vruntime and the red-black tree → maps to CFS core algorithm
  • Parsing /proc/[pid]/sched data → maps to per-process scheduler statistics
  • Measuring scheduling latency → maps to real-time considerations
  • Observing nice values effects → maps to priority and weight calculations

Key Concepts:

  • CFS algorithm: “Linux Kernel Development” Chapter 4 - Robert Love
  • Scheduler implementation: Kernel source kernel/sched/fair.c
  • Real-time scheduling: “Operating Systems: Three Easy Pieces” Scheduling chapters
  • perf sched tool: perf sched record and perf sched latency

  • Difficulty: Advanced
  • Time estimate: 1-2 weeks
  • Prerequisites: Projects 1-4 completed, understanding of process concepts

Real world outcome:

$ ./sched_analyzer --overview
═══════════════════ CFS SCHEDULER STATUS ═══════════════════
System Load: 2.34 (4 CPUs)
Total runnable tasks: 46
Context switches (last second): 12,456

Per-CPU Run Queue Status:
  CPU0: 12 tasks, min_vruntime=123456789012, load=45%
  CPU1: 15 tasks, min_vruntime=123456789234, load=52%
  CPU2:  8 tasks, min_vruntime=123456789001, load=38%
  CPU3: 11 tasks, min_vruntime=123456789123, load=41%

$ ./sched_analyzer --trace-process firefox
═══════════════════ PROCESS SCHEDULING TRACE ═══════════════════
Process: firefox (PID 1234)
  Nice value: 0 (weight: 1024)
  Policy: SCHED_NORMAL (CFS)

Time     CPU  Event        Duration    vruntime delta
──────────────────────────────────────────────────────────
10.001   2    scheduled     3.2ms      +3200000
10.004   2    preempted     -          -
10.007   1    scheduled     4.1ms      +4100000
10.011   1    sleep(io)     -          -
10.015   1    wakeup        -          -
10.016   1    scheduled     2.8ms      +2800000
...

Statistics:
  Total CPU time: 234.5ms
  Wait time: 12.3ms
  Average timeslice: 3.4ms
  Preemption count: 23
  Voluntary switches: 145

$ ./sched_analyzer --experiment nice
═══════════════════ NICE VALUE EXPERIMENT ═══════════════════
Running CPU-bound tasks with different nice values...

Process    Nice   Weight   Expected%   Actual%   Delta
─────────────────────────────────────────────────────────
task_0      0     1024     24.6%       24.8%     +0.2%
task_19    19       15      0.4%        0.5%     +0.1%
task_-5    -5     3121     75.0%       74.7%     -0.3%

Analysis: CFS weighted fair sharing working correctly.
task_-5 receives ~3x the CPU of task_0 (weight ratio: 3121/1024 ≈ 3.05x)

Implementation Hints:

Understanding CFS: CFS tracks “virtual runtime” (vruntime) for each process. The process with the smallest vruntime runs next. Higher-priority (lower nice) processes accumulate vruntime more slowly.

vruntime_delta = actual_runtime * (NICE_0_WEIGHT / process_weight)

Weight values (from kernel source):

nice  -20: weight = 88761
nice    0: weight =  1024  (baseline)
nice   19: weight =    15
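The formula and weights above can be checked with a few lines of C. This is a sketch, not the kernel's code path: the table values are the ones in the kernel's sched_prio_to_weight[] array, but the helper names are made up for illustration, and the kernel multiplies by a precomputed fixed-point inverse instead of dividing, so its rounding can differ slightly.

```c
#include <assert.h>
#include <stdint.h>

#define NICE_0_WEIGHT 1024

/* Weights for nice -20 .. +19; each step changes CPU share by roughly 10%. */
static const uint32_t nice_to_weight[40] = {
    /* -20 */ 88761, 71755, 56483, 46273, 36291,
    /* -15 */ 29154, 23254, 18705, 14949, 11916,
    /* -10 */  9548,  7620,  6100,  4904,  3906,
    /*  -5 */  3121,  2501,  1991,  1586,  1277,
    /*   0 */  1024,   820,   655,   526,   423,
    /*   5 */   335,   272,   215,   172,   137,
    /*  10 */   110,    87,    70,    56,    45,
    /*  15 */    36,    29,    23,    18,    15,
};

static uint32_t weight_for_nice(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return nice_to_weight[nice + 20];
}

/* vruntime_delta = actual_runtime * NICE_0_WEIGHT / weight (integer division). */
static uint64_t vruntime_delta_ns(uint64_t runtime_ns, int nice)
{
    return runtime_ns * NICE_0_WEIGHT / weight_for_nice(nice);
}
```

For a 3.2 ms run, a nice 0 task advances its vruntime by exactly 3,200,000 ns, a nice -5 task by about 1,049,919 ns, and a nice 19 task by about 218,453,333 ns. That spread is what pushes low-priority tasks to the right of the tree.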

Key files to analyze:

  • /proc/[pid]/sched - Per-process scheduler statistics
  • /proc/sched_debug - Scheduler debug info (needs CONFIG_SCHED_DEBUG; on Linux 5.13 and later look in /sys/kernel/debug/sched/debug instead)
  • /sys/kernel/debug/sched/ - More scheduler internals

Using perf for scheduler analysis:

perf sched record -- sleep 5       # Record scheduler events
perf sched latency                  # Show wakeup latencies
perf sched map                      # Visual CPU usage map

Using ftrace:

echo 1 > /sys/kernel/debug/tracing/events/sched/sched_switch/enable
echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
cat /sys/kernel/debug/tracing/trace_pipe

Experiment design:

  1. Fork multiple CPU-bound processes with different nice values
  2. Let them run for a fixed time
  3. Measure actual CPU time received by each
  4. Compare to expected ratios based on weights
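Step 4 needs an expected value to compare against. A minimal sketch, assuming every task is CPU-bound and all of them compete for the same CPU; expected_share is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expected CPU fraction for task i when n CPU-bound tasks share one CPU. */
static double expected_share(const uint32_t weights[], size_t n, size_t i)
{
    uint64_t total = 0;

    assert(weights && i < n);
    for (size_t k = 0; k < n; k++)
        total += weights[k];
    return total ? (double)weights[i] / (double)total : 0.0;
}
```

With weights 1024, 15, and 3121 the total is 4160, so the expected shares are about 24.6%, 0.4%, and 75.0%. The single-CPU assumption matters: on a multi-core machine each task may simply get its own core, so pin the whole experiment to one CPU (for example with taskset) before comparing.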

Questions to guide implementation:

  1. Why does CFS use a red-black tree? What’s the time complexity?
  2. How does CFS handle processes that sleep? (Hint: sleeper fairness)
  3. What’s the difference between preemption and voluntary context switch?
  4. How do cgroups affect CFS decisions?

Learning milestones:

  1. You explain vruntime calculation → You understand CFS fundamentals
  2. Your experiments match theory → You understand weight-based scheduling
  3. You measure scheduling latency → You understand real-time concerns
  4. You observe load balancing → You understand multi-CPU scheduling

The Core Question You’re Answering

“How does the scheduler decide who runs next, and how do you measure that decision?”

Schedulers are policy. This project forces you to treat scheduling as measurable behavior.

Concepts You Must Understand First

  1. CFS basics
    • What is virtual runtime?
    • Book Reference: Operating Systems: Three Easy Pieces — Ch. 7
  2. task_struct fields
    • Which fields expose scheduling state?
    • Book Reference: Linux Kernel Development — Ch. 3
  3. Timing and accounting
    • How does the kernel measure CPU time?
    • Book Reference: Linux Kernel Development — Ch. 6

Questions to Guide Your Design

  1. Data sources
    • Will you read /proc/schedstat or use tracepoints?
  2. Visualization
    • How will you show fairness vs latency?
  3. Workload design
    • What synthetic workloads reveal scheduler behavior?

Thinking Exercise

Two CPU-bound processes and one IO-bound process

Questions while tracing:

  • Which process gets CPU time after IO completes?
  • How does CFS try to keep fairness?

The Interview Questions They’ll Ask

  1. “What is CFS and why was it designed?”
  2. “What does virtual runtime mean?”
  3. “How does the kernel prevent starvation?”
  4. “What is a context switch and why is it expensive?”
  5. “How do you measure scheduler latency?”

Hints in Layers

Hint 1: Start with /proc/schedstat
Parse and display per-CPU stats.

Hint 2: Add workload markers
Run CPU-bound and IO-bound workloads and capture deltas.

Hint 3: Use tracepoints
Enable sched/sched_switch and sched/sched_wakeup.

Hint 4: Compare nice values
Run tasks with different nice values and show runtimes.
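For per-task numbers, a simpler starting point than the system-wide file is /proc/<pid>/schedstat, which holds three fields: time on CPU in ns, time spent waiting on a run queue in ns, and the number of timeslices run. The sketch below assumes that layout and a kernel built with scheduler accounting enabled; the struct and function names are illustrative.

```c
#include <assert.h>
#include <stdio.h>

struct task_schedstat {
    unsigned long long run_ns;      /* time spent running on a CPU */
    unsigned long long wait_ns;     /* time spent runnable but waiting */
    unsigned long long timeslices;  /* number of times scheduled in */
};

/* Parse the three space-separated fields; returns 0 on success, -1 otherwise. */
static int parse_task_schedstat(const char *text, struct task_schedstat *out)
{
    assert(text && out);
    if (sscanf(text, "%llu %llu %llu",
               &out->run_ns, &out->wait_ns, &out->timeslices) != 3)
        return -1;
    return 0;
}

/* Read /proc/<pid>/schedstat; returns 0 on success, -1 on any failure. */
static int read_task_schedstat(int pid, struct task_schedstat *out)
{
    char path[64], buf[128];

    snprintf(path, sizeof(path), "/proc/%d/schedstat", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char *line = fgets(buf, sizeof(buf), f);
    fclose(f);
    return line ? parse_task_schedstat(buf, out) : -1;
}

/* Average run-queue wait per timeslice: a rough per-task latency figure. */
static unsigned long long avg_wait_ns(const struct task_schedstat *s)
{
    return s->timeslices ? s->wait_ns / s->timeslices : 0;
}
```

Take two samples a known interval apart and subtract them; the deltas give CPU time and wait time for that window, which is usually more useful than the lifetime totals.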

Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Scheduling | Operating Systems: Three Easy Pieces | Ch. 7 |
| Kernel scheduler | Linux Kernel Development | Ch. 4 |
| Process model | Linux Kernel Development | Ch. 3 |

Common Pitfalls & Debugging

Problem 1: “Stats look unchanged”

  • Why: Sampling interval too short
  • Fix: Increase measurement window

Problem 2: “Trace output too noisy”

  • Why: Tracepoints enabled globally
  • Fix: Filter by PID or cgroup

Problem 3: “Results vary wildly”

  • Why: CPU frequency scaling
  • Fix: Pin CPU frequency or use performance governor

Project 8: Simple Filesystem (FUSE, then Kernel)

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go (for FUSE version)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Filesystems / VFS
  • Software or Tool: FUSE, VFS
  • Main Book: “Understanding the Linux Kernel” by Bovet & Cesati

What you’ll build: A simple in-memory filesystem, first using FUSE (Filesystem in Userspace) to learn the concepts, then porting it to a real kernel module that registers with VFS.

Why it teaches Linux kernel: Filesystems are a major kernel subsystem. The VFS (Virtual File System) is an elegant abstraction that all filesystems implement. Understanding VFS means understanding inodes, dentries, superblocks, and file operations—concepts that appear throughout the kernel.

Core challenges you’ll face:

  • Implementing the FUSE interface → maps to understanding VFS operations
  • Managing inodes and dentries → maps to kernel filesystem data structures
  • Porting FUSE code to kernel module → maps to kernel vs userspace programming
  • Proper error handling and concurrency → maps to kernel filesystem requirements

Key Concepts:

  • VFS architecture: “Understanding the Linux Kernel” Chapter 12 - Bovet & Cesati
  • FUSE tutorial: libfuse documentation and examples
  • Inode operations: Kernel source include/linux/fs.h
  • Simple filesystem examples: Kernel source fs/ramfs/, fs/minix/

  • Difficulty: Expert
  • Time estimate: 3-4 weeks
  • Prerequisites: Projects 4-5 completed, understanding of file system concepts

Real world outcome:

# Phase 1: FUSE version
$ ./simplefs_fuse /mnt/simple &
$ mount | grep simple
simplefs on /mnt/simple type fuse.simplefs (rw,nosuid,nodev)

$ ls /mnt/simple
(empty)

$ echo "Hello filesystem!" > /mnt/simple/greeting.txt
$ ls -la /mnt/simple
total 0
drwxr-xr-x 2 user user  0 Dec 20 10:00 .
drwxr-xr-x 3 root root  0 Dec 20 09:00 ..
-rw-r--r-- 1 user user 18 Dec 20 10:01 greeting.txt

$ cat /mnt/simple/greeting.txt
Hello filesystem!

$ mkdir /mnt/simple/subdir
$ touch /mnt/simple/subdir/file.txt
$ tree /mnt/simple
/mnt/simple
├── greeting.txt
└── subdir
    └── file.txt

$ fusermount -u /mnt/simple

# Phase 2: Kernel module version
$ sudo insmod simplefs_kernel.ko
$ sudo mount -t simplefs none /mnt/simple
$ ls /mnt/simple
(empty)

$ echo "Now in kernel space!" > /mnt/simple/kernel.txt
$ cat /mnt/simple/kernel.txt
Now in kernel space!

$ df -h /mnt/simple
Filesystem      Size  Used Avail Use% Mounted on
none            64M   12K   64M   1% /mnt/simple

$ sudo umount /mnt/simple
$ sudo rmmod simplefs_kernel

Implementation Hints:

FUSE filesystem structure:

static struct fuse_operations simplefs_ops = {
    .getattr  = simplefs_getattr,   // stat()
    .readdir  = simplefs_readdir,   // ls
    .open     = simplefs_open,
    .read     = simplefs_read,
    .write    = simplefs_write,
    .create   = simplefs_create,    // touch, create new file
    .mkdir    = simplefs_mkdir,
    .unlink   = simplefs_unlink,    // rm
    .rmdir    = simplefs_rmdir,
};

In-memory data structures:

struct simplefs_inode {
    ino_t ino;
    mode_t mode;        // file type and permissions
    uid_t uid;
    gid_t gid;
    size_t size;
    time_t atime, mtime, ctime;

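    // For directories: list of entries
    // For files: data buffer
    // (struct list_head is the kernel's list type from <linux/list.h>;
    //  the FUSE version needs its own userspace list instead)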
    union {
        struct list_head children;  // if S_ISDIR
        char *data;                  // if S_ISREG
    };
};
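A FUSE getattr or open handler receives a full path, so the in-memory tree needs a lookup routine. The sketch below shows one way it might work in userspace; struct fs_node and the fs_* helpers are hypothetical and deliberately simpler than the struct above (no permissions, timestamps, or locking).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct fs_node {                    /* hypothetical userspace tree node */
    char name[64];
    int is_dir;
    struct fs_node *first_child;    /* head of this directory's entries */
    struct fs_node *next_sibling;   /* next entry in the parent directory */
};

static struct fs_node *fs_new_node(const char *name, int is_dir)
{
    struct fs_node *n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    strncpy(n->name, name, sizeof(n->name) - 1);  /* calloc keeps it terminated */
    n->is_dir = is_dir;
    return n;
}

static void fs_add_child(struct fs_node *dir, struct fs_node *child)
{
    assert(dir && dir->is_dir && child);
    child->next_sibling = dir->first_child;
    dir->first_child = child;
}

/* Walk an absolute path one component at a time; NULL means "not found". */
static struct fs_node *fs_lookup(struct fs_node *root, const char *path)
{
    struct fs_node *cur = root;

    while (cur && *path) {
        while (*path == '/')             /* skip separators */
            path++;
        size_t len = strcspn(path, "/");
        if (len == 0)
            break;                       /* trailing slash or end of path */
        if (!cur->is_dir)
            return NULL;                 /* cannot descend into a regular file */
        struct fs_node *c = cur->first_child;
        while (c && !(strlen(c->name) == len && strncmp(c->name, path, len) == 0))
            c = c->next_sibling;
        cur = c;
        path += len;
    }
    return cur;
}
```

A handler would map a NULL result to -ENOENT. The kernel version does not walk paths itself: VFS resolves one component at a time through the dentry cache and calls the filesystem's lookup operation only on a miss.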

Kernel VFS registration:

static struct file_system_type simplefs_type = {
    .owner = THIS_MODULE,
    .name = "simplefs",
    .mount = simplefs_mount,      // older-style hook; recent kernels favor .init_fs_context
    .kill_sb = simplefs_kill_sb,
};

static struct super_operations simplefs_super_ops = { ... };
static struct inode_operations simplefs_inode_ops = { ... };
static struct file_operations simplefs_file_ops = { ... };

Questions to guide implementation:

  1. What’s the difference between an inode and a dentry?
  2. Why does VFS separate inode_operations from file_operations?
  3. How do you handle the root directory specially?
  4. What’s the relationship between superblock, inode, and dentry?
  5. How does the kernel find a file given a path like “/foo/bar/baz”?

Development approach:

  1. Start with FUSE read-only filesystem (just getattr, readdir, read)
  2. Add write support (write, create)
  3. Add directory support (mkdir, rmdir, unlink)
  4. Port to kernel module, using ramfs/minix as reference

Learning milestones:

  1. FUSE read-only works → You understand basic VFS operations
  2. FUSE read-write works → You understand file lifecycle
  3. Kernel module mounts → You understand filesystem registration
  4. Kernel module keeps file data for the life of the mount → You’ve built a working kernel filesystem

The Core Question You’re Answering

“How does the kernel translate file operations into filesystem-specific behavior?”

The filesystem project makes the VFS stack tangible.

Concepts You Must Understand First

  1. VFS objects
    • What is an inode vs dentry?
    • Book Reference: Linux Kernel Development — Ch. 12
  2. Filesystem operations
    • What does file_operations do?
    • Book Reference: Linux Kernel Development — Ch. 12
  3. User/kernel I/O flow
    • How does read() reach your code?
    • Book Reference: Linux System Programming — Ch. 13

Questions to Guide Your Design

  1. Data model
    • Will you build an in-memory filesystem first?
  2. Consistency
    • How will you handle concurrent writes?
  3. Failure behavior
    • What errors will you return on missing paths?

Thinking Exercise

open() → read() → close()

Questions while tracing:

  • Which VFS hooks are called in sequence?
  • Where should you validate permissions?

The Interview Questions They’ll Ask

  1. “What is the role of the VFS?”
  2. “What are inodes and dentries?”
  3. “Why start with FUSE before kernel?”
  4. “How do page cache and filesystem interact?”
  5. “How do you ensure filesystem consistency?”

Hints in Layers

Hint 1: Build a FUSE version first
Implement getattr, readdir, read, write.

Hint 2: Map VFS operations
Create a table that maps FUSE ops to kernel ops.

Hint 3: Port to kernel
Start with a read-only filesystem.

Hint 4: Add write support
Implement basic file creation and write paths.

Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| VFS | Linux Kernel Development | Ch. 12 |
| Filesystem I/O | Linux System Programming | Ch. 13 |
| OS storage | Operating Systems: Three Easy Pieces | Ch. 39 |

Common Pitfalls & Debugging

Problem 1: “Mount succeeds but reads fail”

  • Why: Missing read or lookup operations
  • Fix: Add minimal ops and return -ENOENT correctly

Problem 2: “Kernel crash on unmount”

  • Why: Missing cleanup of inode/dentry structures
  • Fix: Free resources in superblock teardown

Problem 3: “Writes silently truncate”

  • Why: Not updating file size and inode metadata
  • Fix: Update inode size and timestamps

Project 9: Network Packet Filter (Netfilter Module)

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: eBPF/XDP (modern alternative)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Networking Stack / Netfilter
  • Software or Tool: iptables, nftables, Netfilter
  • Main Book: “Understanding Linux Network Internals” by Christian Benvenuti

What you’ll build: A kernel module that hooks into Netfilter to inspect, modify, and filter network packets. You’ll implement a simple firewall that can block/allow based on IP addresses, ports, and packet content.

Why it teaches Linux kernel: The networking stack is a major kernel subsystem. Netfilter is how iptables/nftables work. Understanding packet processing, socket buffers (sk_buff), and the network hooks teaches you about one of the kernel’s most performance-critical paths.

Core challenges you’ll face:

  • Registering Netfilter hooks → maps to understanding the packet flow path
  • Parsing sk_buff structures → maps to network packet representation
  • Implementing packet filtering logic → maps to network protocols at the kernel level
  • Handling performance considerations → maps to critical path optimization

Key Concepts:

  • Netfilter architecture: “Understanding Linux Network Internals” Chapter 19 - Benvenuti
  • sk_buff structure: Kernel source include/linux/skbuff.h
  • Network hooks: Kernel documentation Documentation/networking/netfilter-sysctl.txt
  • Packet processing: “Linux Kernel Networking” by Rami Rosen

  • Difficulty: Advanced
  • Time estimate: 2 weeks
  • Prerequisites: Projects 4-5 completed, understanding of TCP/IP networking

Real world outcome:

$ sudo insmod packetfilter.ko
$ dmesg | tail
[12345.678] packetfilter: Module loaded
[12345.679] packetfilter: Netfilter hooks registered

# View current rules (via procfs interface)
$ cat /proc/packetfilter/rules
(empty - all traffic allowed)

# Block all traffic from a specific IP
$ echo "block src 192.168.1.100" > /proc/packetfilter/rules

# Block SSH port
$ echo "block dst_port 22" > /proc/packetfilter/rules

# Show active rules
$ cat /proc/packetfilter/rules
1: BLOCK src=192.168.1.100/32
2: BLOCK dst_port=22

# View statistics
$ cat /proc/packetfilter/stats
Packets inspected: 12,456
Packets blocked: 234
Packets allowed: 12,222

Blocked breakdown:
  By IP: 123
  By port: 111
  By content: 0

# Test it
$ ping 8.8.8.8
PING 8.8.8.8 ...
64 bytes from 8.8.8.8: icmp_seq=1 ttl=118

$ # From another machine at 192.168.1.100:
$ ping [your-machine]
# (no response - blocked!)

$ dmesg | tail
[12400.123] packetfilter: BLOCKED packet from 192.168.1.100 -> 10.0.0.1 (ICMP)

# Log mode
$ echo "log" > /proc/packetfilter/mode
$ cat /proc/packetfilter/log
[12401.234] OUT: 10.0.0.1:52345 -> 192.168.1.1:443 TCP SYN
[12401.235] IN: 192.168.1.1:443 -> 10.0.0.1:52345 TCP SYN-ACK
[12401.236] OUT: 10.0.0.1:52345 -> 192.168.1.1:443 TCP ACK

Implementation Hints:

Netfilter hook registration:

static struct nf_hook_ops my_hook_ops = {
    .hook = my_hook_func,
    .pf = PF_INET,                    // IPv4
    .hooknum = NF_INET_PRE_ROUTING,   // or POST_ROUTING, LOCAL_IN, etc.
    .priority = NF_IP_PRI_FIRST,
};

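// In module init (returns 0 on success, a negative errno on failure):
nf_register_net_hook(&init_net, &my_hook_ops);
// In module exit:
nf_unregister_net_hook(&init_net, &my_hook_ops);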

Hook function signature:

static unsigned int my_hook_func(
    void *priv,
    struct sk_buff *skb,
    const struct nf_hook_state *state)
{
    // Inspect skb here, then return a verdict:
    // NF_ACCEPT, NF_DROP, or NF_QUEUE
    return NF_ACCEPT;   // default: let the packet continue
}

Parsing packet headers:

struct iphdr *ip_header = ip_hdr(skb);
// ip_header->saddr, ip_header->daddr, ip_header->protocol

if (ip_header->protocol == IPPROTO_TCP) {
    struct tcphdr *tcp_header = tcp_hdr(skb);
    // tcp_header->source, tcp_header->dest (in network byte order!)
}

Important considerations:

  • Network byte order: use ntohs(), ntohl() for port/IP comparisons
  • Performance: this code runs for EVERY packet - keep it fast
  • Memory: don’t allocate memory in the fast path
  • Locking: consider RCU for rule list access
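The byte-order and fast-path points can be exercised in userspace before any kernel code exists. The sketch below models rule matching only: struct filter_rule, the verdict enum, and apply_rules are hypothetical stand-ins (the enum mirrors NF_ACCEPT and NF_DROP), and a real module would read the values from iphdr and tcphdr and protect the rule list with RCU or a lock.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>   /* htons, ntohs, inet_pton */

enum verdict { VERDICT_ACCEPT, VERDICT_DROP };  /* stand-ins for NF_ACCEPT / NF_DROP */

struct filter_rule {           /* hypothetical rule layout */
    uint32_t src_addr_be;      /* network byte order; 0 matches any source */
    uint16_t dst_port;         /* host byte order; 0 matches any port */
};

/*
 * saddr_be and dport_be are passed exactly as they sit in the packet headers,
 * that is, in network byte order. No memory is allocated on this path.
 */
static enum verdict apply_rules(const struct filter_rule *rules, int n,
                                uint32_t saddr_be, uint16_t dport_be)
{
    assert(rules || n == 0);
    uint16_t dport = ntohs(dport_be);   /* convert once, compare many times */

    for (int i = 0; i < n; i++) {
        int src_ok  = rules[i].src_addr_be == 0 || rules[i].src_addr_be == saddr_be;
        int port_ok = rules[i].dst_port == 0 || rules[i].dst_port == dport;
        if (src_ok && port_ok)
            return VERDICT_DROP;
    }
    return VERDICT_ACCEPT;
}
```

Addresses are compared in network order because equality does not depend on byte order, while the port is converted once so rules can be written with ordinary numbers such as 22. Forgetting the ntohs() call is the classic bug: on a little-endian machine port 22 would be compared against 5632.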

Questions to guide implementation:

  1. What are the different Netfilter hook points? When would you use each?
  2. What’s the difference between NF_DROP and NF_STOLEN?
  3. How do you access the payload data in an sk_buff?
  4. Why is network byte order different from host byte order?

Learning milestones:

  1. Hook is called for packets → You understand Netfilter registration
  2. You parse IP headers correctly → You understand sk_buff structure
  3. You block packets by rule → You understand packet filtering logic
  4. Your module handles high traffic → You understand performance in kernel

The Core Question You’re Answering

“How does a packet flow through the kernel, and where can you safely intercept it?”

Netfilter exposes the kernel networking path as hooks you can instrument and control.

Concepts You Must Understand First

  1. Network stack layers
    • What is sk_buff?
    • Book Reference: Linux Kernel Development — Ch. 17
  2. Netfilter hooks
    • Which hooks run for inbound vs outbound?
    • Book Reference: Linux Kernel Development — Ch. 17
  3. Packet filtering semantics
    • When do you DROP vs ACCEPT?
    • Book Reference: Computer Networks — Ch. 4

Questions to Guide Your Design

  1. Hook selection
    • Will you filter at PRE_ROUTING or LOCAL_IN?
  2. Rule management
    • Will you implement a rules table or hardcode?
  3. Performance
    • How do you avoid per-packet allocations?

Thinking Exercise

Incoming packet → PREROUTING → LOCAL_IN → socket

Questions while tracing:

  • Where would you block a port scan?
  • Where would you block outbound DNS?

The Interview Questions They’ll Ask

  1. “What is netfilter and how does it relate to iptables?”
  2. “What is an sk_buff?”
  3. “Where would you implement a firewall rule?”
  4. “How do you avoid performance regressions in a packet filter?”
  5. “How does the networking stack handle local delivery?”

Hints in Layers

Hint 1: Start with logging
Register a hook that only logs packets.

Hint 2: Add a simple rule
Drop packets to a single port.

Hint 3: Add configuration
Use procfs or sysfs to update rules.

Hint 4: Benchmark
Use iperf or wrk to measure overhead.

Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Netfilter | Linux Kernel Development | Ch. 17 |
| Networking | Computer Networks | Ch. 4 |
| sk_buff | Linux Kernel Development | Ch. 17 |

Common Pitfalls & Debugging

Problem 1: “No packets captured”

  • Why: Hook registered for wrong protocol or hook point
  • Fix: Use NF_INET_PRE_ROUTING for inbound traffic

Problem 2: “Kernel warning about sleeping”

  • Why: Allocations in atomic context
  • Fix: Use GFP_ATOMIC or avoid allocation

Problem 3: “Network throughput drops”

  • Why: Heavy per-packet logging
  • Fix: Sample or rate-limit logs
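In a module the usual fix is one of the kernel's own helpers, such as pr_info_ratelimited() or net_ratelimit(). To see what an interval-and-burst limiter does, here is a userspace model; struct log_ratelimit and ratelimit_allow are hypothetical names, and the logic is a simplification of the kernel's behavior, not a copy of it.

```c
#include <assert.h>
#include <stdint.h>

struct log_ratelimit {          /* hypothetical userspace model */
    uint64_t interval_ns;       /* length of one window */
    uint32_t burst;             /* messages allowed per window */
    uint64_t window_start_ns;   /* when the current window began */
    uint32_t printed;           /* messages emitted in the current window */
    uint64_t suppressed;        /* messages dropped so far */
};

/* Returns 1 if the caller may log at time now_ns, 0 if it should stay quiet. */
static int ratelimit_allow(struct log_ratelimit *rl, uint64_t now_ns)
{
    assert(rl && rl->interval_ns > 0);
    if (now_ns - rl->window_start_ns >= rl->interval_ns) {
        rl->window_start_ns = now_ns;   /* start a new window */
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }
    rl->suppressed++;
    return 0;
}
```

With a 5-second interval and a burst of 10, a flood of blocked packets produces at most 10 log lines per window, and the suppressed counter can be reported through the stats file so nothing is lost silently.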

Project 10: Kernel Debugger Experience (GDB + QEMU + kgdb)

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C (debugging kernel code)
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Kernel Debugging / Development Tools
  • Software or Tool: GDB, QEMU, kgdb, ftrace
  • Main Book: “Linux Kernel Debugging” by Kaiwan N. Billimoria

What you’ll build: A complete kernel debugging workflow—setting breakpoints in kernel code, stepping through kernel functions, inspecting data structures, and using various debugging facilities. You’ll debug your own modules and also explore mainline kernel code.

Why it teaches Linux kernel: You cannot effectively develop for or contribute to the kernel without debugging skills. This project teaches you the tools maintainers use: GDB attached to QEMU, kgdb, ftrace, printk, and crash dump analysis. These skills are essential for understanding kernel behavior.

Core challenges you’ll face:

  • Setting up GDB with QEMU → maps to kernel debugging infrastructure
  • Navigating kernel data structures in GDB → maps to understanding kernel internals
  • Using ftrace effectively → maps to kernel tracing subsystem
  • Analyzing kernel panics → maps to understanding crash dumps

Key Concepts:

  • GDB kernel debugging: Kernel documentation Documentation/dev-tools/gdb-kernel-debugging.rst
  • kgdb setup: Kernel documentation Documentation/dev-tools/kgdb.rst
  • ftrace usage: “Linux Kernel Debugging” Chapter 9 - Kaiwan N. Billimoria
  • crash analysis: crash utility documentation

  • Difficulty: Advanced
  • Time estimate: 1 week
  • Prerequisites: Projects 3-5 completed

Real world outcome:

# Terminal 1: Start QEMU with GDB stub
$ qemu-system-x86_64 -kernel bzImage -initrd initramfs.cpio.gz \
    -append "console=ttyS0 nokaslr" -nographic -s -S

# Terminal 2: Connect GDB
$ gdb vmlinux
(gdb) target remote :1234
Remote debugging using :1234
0x000000000000fff0 in ?? ()

(gdb) hbreak start_kernel
Hardware assisted breakpoint 1 at 0xffffffff81000000

(gdb) continue
Continuing.
Breakpoint 1, start_kernel () at init/main.c:878
878     set_task_stack_end_magic(&init_task);

(gdb) list
873 asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
874 {
875     char *command_line;
876     char *after_dashes;
877
878     set_task_stack_end_magic(&init_task);

(gdb) print init_task.comm
$1 = "swapper/0"

(gdb) print init_task.pid
$2 = 0

# Set breakpoint in scheduler
(gdb) break schedule
Breakpoint 2 at 0xffffffff81a23450

(gdb) continue
Breakpoint 2, schedule () at kernel/sched/core.c:5678

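# Examine the current task (current is a macro GDB cannot evaluate;
# $lx_current() comes from scripts/gdb/vmlinux-gdb.py)
(gdb) print $lx_current().comm
$3 = "init"

(gdb) print $lx_current().__state
$4 = 0  # TASK_RUNNING (the field is named state before Linux 5.14)

# Examine run queue (rq is a local variable of __schedule(), so step into it first)
(gdb) print rq->nr_running
$5 = 3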

# Using ftrace from the guest
# cat /sys/kernel/debug/tracing/available_tracers
function function_graph nop

# echo function_graph > /sys/kernel/debug/tracing/current_tracer
# echo schedule > /sys/kernel/debug/tracing/set_ftrace_filter
# echo 1 > /sys/kernel/debug/tracing/tracing_on
# cat /sys/kernel/debug/tracing/trace

# ...
 0)               |  schedule() {
 0)               |    rcu_note_context_switch() {
 0)   0.123 us    |      rcu_qs();
 0)   0.456 us    |    }
 0)               |    __schedule() {
 0)   0.234 us    |      pick_next_task_fair();

Implementation Hints:

QEMU/GDB setup:

  1. Build kernel with debug info: CONFIG_DEBUG_INFO=y (plus CONFIG_GDB_SCRIPTS=y for the lx-* helper commands)
  2. Disable KASLR: add nokaslr to kernel command line
  3. QEMU flags: -s (gdb stub on 1234), -S (pause at start)

Essential GDB commands for kernel debugging:

# Load the kernel's GDB helper scripts first:
(gdb) source scripts/gdb/vmlinux-gdb.py

# Kernel-specific commands provided by those scripts
lx-dmesg         # Show dmesg output
lx-lsmod         # List loaded modules
lx-ps            # Show process list
lx-symbols       # Load module symbols

Setting up symbol loading for modules:

# The load address comes from /sys/module/mymodule/sections/.text in the guest
(gdb) add-symbol-file /path/to/mymodule.ko 0xffffffffa0000000

Using ftrace:

# Function tracer
echo function > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_on
# ... do something ...
cat /sys/kernel/debug/tracing/trace

# Function graph tracer (with call graphs)
echo function_graph > /sys/kernel/debug/tracing/current_tracer

# Filter to specific functions
echo "schedule*" > /sys/kernel/debug/tracing/set_ftrace_filter

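# Dynamic events (kmalloc is inlined, so probe __kmalloc or the variant listed
# in /proc/kallsyms; %di is the first argument on x86-64)
echo 'p:myprobe __kmalloc size=%di' > /sys/kernel/debug/tracing/kprobe_events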

Questions to guide exploration:

  1. What’s the difference between hardware and software breakpoints?
  2. Why do we need nokaslr for debugging?
  3. How do you debug a function that runs with interrupts disabled?
  4. How do you debug a module before it’s loaded?

Learning milestones:

  1. GDB connects and you hit start_kernel → You have basic debugging working
  2. You step through schedule() → You understand scheduler operation
  3. You use ftrace to trace function calls → You understand kernel tracing
  4. You debug your own module → You can develop with full debugging support

The Core Question You’re Answering

“How do you debug a live kernel without crashing the system?”

Kernel debugging is about controlled observation. QEMU + GDB + kgdb let you stop the kernel safely.

Concepts You Must Understand First

  1. Kernel debug symbols
    • Why use vmlinux for symbols?
    • Book Reference: The Art of Debugging with GDB — Ch. 7
  2. kgdb basics
    • What is the kgdb stub?
    • Book Reference: Linux Kernel Development — Ch. 21
  3. Early boot debugging
    • Why disable KASLR (boot with nokaslr) when debugging?
    • Book Reference: Linux Kernel Development — Ch. 2

Questions to Guide Your Design

  1. Breakpoints
    • Where will you set the first breakpoint (start_kernel)?
  2. Safety
    • How will you avoid stopping the host kernel?
  3. Repeatability
    • How will you restart and reattach quickly?

Thinking Exercise

(gdb) b start_kernel
(gdb) c

Questions while tracing:

  • What registers indicate the current CPU?
  • How do you inspect a task_struct?
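Inspecting a task_struct often means starting from a list link embedded inside it, since tasks are chained through a list_head member. The userspace sketch below re-creates that pattern to show the pointer arithmetic involved; struct demo_task, struct list_node, and the helper functions are simplified stand-ins, and the container_of macro here omits the type checking the kernel version performs.

```c
#include <assert.h>
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to one member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node {                 /* stand-in for struct list_head */
    struct list_node *next, *prev;
};

struct demo_task {                 /* hypothetical, far smaller than task_struct */
    int pid;
    char comm[16];
    struct list_node tasks;        /* link embedded inside the object */
};

/* Insert `item` right after `head` in a circular doubly linked list. */
static void list_add_after(struct list_node *head, struct list_node *item)
{
    item->next = head->next;
    item->prev = head;
    head->next->prev = item;
    head->next = item;
}

/* Walk the list and return the task with the given pid, or NULL. */
static struct demo_task *find_task(struct list_node *head, int pid)
{
    assert(head);
    for (struct list_node *n = head->next; n != head; n = n->next) {
        struct demo_task *t = container_of(n, struct demo_task, tasks);
        if (t->pid == pid)
            return t;
    }
    return NULL;
}
```

In GDB the same subtraction has to be done by hand or by a helper, because the list pointers point into the middle of each object, not at its start. The kernel's GDB scripts wrap this for you, which is why lx-ps can print the whole process list.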

The Interview Questions They’ll Ask

  1. “Why can’t you use gdb on a live host kernel directly?”
  2. “What does kgdb do?”
  3. “How do you debug early boot failures?”
  4. “What is KASLR and why disable it?”
  5. “How do you interpret a kernel backtrace?”

Hints in Layers

Hint 1: Use QEMU
Always debug in a VM with -s and -S for GDB.

Hint 2: Load vmlinux
Use file vmlinux in GDB for symbols.

Hint 3: Add printk breadcrumbs
Use printk to narrow breakpoints.

Hint 4: Use kgdb
Enable CONFIG_KGDB and kgdbwait for entry.

Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| GDB | The Art of Debugging with GDB | Ch. 7-9 |
| Kernel debugging | Linux Kernel Development | Ch. 21 |
| Boot flow | How Linux Works | Ch. 5 |

Common Pitfalls & Debugging

Problem 1: “No symbols in GDB”

  • Why: Loading bzImage instead of vmlinux
  • Fix: Use file vmlinux in GDB

Problem 2: “Breakpoints never hit”

  • Why: CPU already passed that stage
  • Fix: Start QEMU with -S to stop at reset

Problem 3: “Kernel hangs at kgdbwait”

  • Why: No debugger attached
  • Fix: Attach GDB and issue continue

Project 11: Block Device Driver

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Block Layer / Storage Drivers
  • Software or Tool: Block layer, request queues
  • Main Book: “Linux Device Drivers, 3rd Edition” by Corbet, Rubini, Kroah-Hartman

What you’ll build: A RAM-backed block device that appears as a real disk—you can partition it, format it with any filesystem, and mount it. This teaches the block I/O subsystem that underlies all storage.

Why it teaches Linux kernel: Block devices are fundamental to storage. Understanding the block layer—request queues, bio structures, and the I/O scheduler interface—gives you insight into how the kernel handles storage devices. This knowledge applies to SSDs, HDDs, NVMe, and virtual disks.

Core challenges you’ll face:

  • Registering a block device → maps to gendisk and block_device registration
  • Handling block I/O requests → maps to the request queue and bio structures
  • Implementing partition support → maps to partition table parsing
  • Performance optimization → maps to I/O scheduling concepts

Key Concepts:

  • Block device basics: “Linux Device Drivers, 3rd Edition” Chapter 16 - Corbet et al.
  • Request queue API: Kernel documentation Documentation/block/
  • Multi-queue block layer: Kernel source block/blk-mq.c
  • Bio structure: Kernel source include/linux/bio.h

Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Projects 5, 6 completed, understanding of block I/O concepts

Real world outcome:

$ sudo insmod ramblock.ko size_mb=64
$ dmesg | tail
[12345.678] ramblock: Module loaded
[12345.679] ramblock: Creating 64MB block device
[12345.680] ramblock: Registered block device major 252
[12345.681] ramblock: /dev/ramblock0 created (64 MB, 131072 sectors)

$ lsblk | grep ramblock
ramblock0  252:0    0    64M  0 disk

$ sudo fdisk /dev/ramblock0
Command (m for help): n
Partition type: p
Partition number: 1
First sector: 2048
Last sector: 131071
Created a new partition 1 of type 'Linux' and of size 63 MB.

Command (m for help): w

$ lsblk | grep ramblock
ramblock0    252:0    0    64M  0 disk
└─ramblock0p1 252:1    0    63M  0 part

$ sudo mkfs.ext4 /dev/ramblock0p1
Creating filesystem with 64512 1k blocks and 16128 inodes

$ sudo mount /dev/ramblock0p1 /mnt/ramblock
$ df -h /mnt/ramblock
Filesystem         Size  Used Avail Use% Mounted on
/dev/ramblock0p1    62M   24K   57M   1% /mnt/ramblock

$ sudo dd if=/dev/urandom of=/mnt/ramblock/testfile bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0234 s, 448 MB/s

$ ls -la /mnt/ramblock/
total 10256
drwxr-xr-x 2 root root     1024 Dec 20 10:00 .
drwxr-xr-x 3 root root     4096 Dec 20 09:00 ..
drwx------ 2 root root    12288 Dec 20 10:00 lost+found
-rw-r--r-- 1 root root 10485760 Dec 20 10:01 testfile

$ cat /sys/block/ramblock0/stat
    1234     567    8901    234    5678    2345   12345    567     0    789    801

$ sudo umount /mnt/ramblock
$ sudo rmmod ramblock

Implementation Hints:

Modern block device registration (blk-mq):

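static const struct blk_mq_ops my_mq_ops = {
    .queue_rq = my_queue_rq,  // Handle I/O requests
};

static struct blk_mq_tag_set tag_set;  // Must outlive the disk
static int major;  // From register_blkdev(0, "ramblock") in module init
static struct gendisk *create_block_device(sector_t sectors)
{
    struct gendisk *disk;
    int err;

    // Allocate tag set (depth 128 is an example value)
    tag_set.ops = &my_mq_ops;
    tag_set.nr_hw_queues = 1;
    tag_set.queue_depth = 128;
    tag_set.numa_node = NUMA_NO_NODE;
    err = blk_mq_alloc_tag_set(&tag_set);
    if (err)
        return ERR_PTR(err);

    // Create disk (5.14-6.8 signature; 6.9+ adds a queue_limits argument)
    disk = blk_mq_alloc_disk(&tag_set, NULL);
    if (IS_ERR(disk))
        goto out_free_tags;
    disk->major = major;
    disk->first_minor = 0;
    disk->minors = 16;  // Support partitions
    disk->fops = &my_block_ops;
    snprintf(disk->disk_name, DISK_NAME_LEN, "ramblock0");
    set_capacity(disk, sectors);
    // Add disk to system (returns an error code since 5.15)
    err = add_disk(disk);
    if (!err)
        return disk;
    put_disk(disk);  // blk_cleanup_disk() on kernels before 6.0
    disk = ERR_PTR(err);
out_free_tags:
    blk_mq_free_tag_set(&tag_set);
    return disk;
}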

Handling I/O requests:

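static blk_status_t my_queue_rq(struct blk_mq_hw_ctx *hctx,
                                 const struct blk_mq_queue_data *bd)
{
    struct request *rq = bd->rq;
    struct bio_vec bvec;
    struct req_iterator iter;
    // Byte offset of the request's first sector in the backing store
    loff_t pos = (loff_t)blk_rq_pos(rq) << SECTOR_SHIFT;
    blk_status_t status = BLK_STS_OK;

    blk_mq_start_request(rq);

    rq_for_each_segment(bvec, rq, iter) {
        size_t len = bvec.bv_len;
        void *buffer;

        // ramdisk_data / ramdisk_size: the driver's backing store (placeholder names)
        if (pos + len > ramdisk_size) {
            status = BLK_STS_IOERR;
            break;
        }

        // kmap_local_page() (5.11+) replaces the deprecated kmap_atomic();
        // the data starts bv_offset bytes into the page
        buffer = kmap_local_page(bvec.bv_page) + bvec.bv_offset;
        if (rq_data_dir(rq) == READ)
            memcpy(buffer, ramdisk_data + pos, len);  // RAM disk -> buffer
        else
            memcpy(ramdisk_data + pos, buffer, len);  // buffer -> RAM disk
        kunmap_local(buffer);
        pos += len;
    }

    // The request is completed here, so queue_rq itself reports success
    blk_mq_end_request(rq, status);
    return BLK_STS_OK;
}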

Block device operations:

static const struct block_device_operations my_block_ops = {
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_release,
    .ioctl = my_ioctl,
    .getgeo = my_getgeo,  // For fdisk geometry
};
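
A RAM disk has no physical geometry, so .getgeo usually reports an invented one that multiplies out to the device capacity. The user-space sketch below shows only that arithmetic. The names struct fake_geometry and fake_geometry_for are hypothetical; the struct mirrors the fields of the kernel's struct hd_geometry, and the choice of 4 heads and 16 sectors per track is an assumption (any pair works if the product divides the capacity).

```c
#include <assert.h>
#include <stdint.h>

/* User-space copy of the fields in struct hd_geometry (hypothetical name). */
struct fake_geometry {
    uint8_t  heads;
    uint8_t  sectors;    /* sectors per track */
    uint16_t cylinders;
    uint64_t start;      /* first sector of the device or partition */
};

/* Invent a geometry whose product matches capacity_sectors where possible. */
static struct fake_geometry fake_geometry_for(uint64_t capacity_sectors)
{
    struct fake_geometry geo = { .heads = 4, .sectors = 16, .start = 0 };
    uint64_t cyl;

    assert(capacity_sectors > 0);  /* a registered disk has a nonzero size */
    cyl = capacity_sectors / ((uint64_t)geo.heads * geo.sectors);

    /* cylinders is only 16 bits wide, so clamp on large devices */
    geo.cylinders = cyl > UINT16_MAX ? UINT16_MAX : (uint16_t)cyl;
    return geo;
}
```

For the 64 MB device in the example output (131072 sectors), this yields 2048 cylinders. In the driver, the same values would be written into the struct hd_geometry passed to .getgeo.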

Questions to guide implementation:

  1. What’s the difference between request-based and bio-based drivers?
  2. How does the kernel handle sector size vs block size?
  3. Why did blk-mq (multi-queue) replace the old single-queue API, which was removed in Linux 5.0?
  4. How do you report disk geometry for partition tools?

Learning milestones:

  1. Device appears in /dev and lsblk → You understand block device registration
  2. You can partition and format it → You understand the block layer interface
  3. I/O works correctly → You understand request handling
  4. Performance is reasonable → You understand block layer optimization

The Core Question You’re Answering

“How does the kernel expose block storage as a file-like device?”

Block devices are the backbone of storage. This project teaches the I/O pipeline.

Concepts You Must Understand First

  1. Block layer basics
    • What is a request queue?
    • Book Reference: Linux Kernel Development — Ch. 14
  2. I/O scheduling
    • Why do we merge and reorder requests?
    • Book Reference: Operating Systems: Three Easy Pieces, Ch. 37
  3. Page cache
    • How does buffered I/O interact with devices?
    • Book Reference: Linux Kernel Development (3rd Edition), Ch. 16

Questions to Guide Your Design

  1. Storage backend
    • Will you implement a RAM disk or file-backed device?
  2. Queue depth
    • How will you size your request queue?
  3. Error handling
    • What errors will you return on overflow?

Thinking Exercise

write() → page cache → block layer → device

Questions while tracing:

  • Where can requests be merged?
  • How does the kernel handle writeback?
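
One way to reason about the merge question is to model the simplest case, a back merge, in user space. The sketch below is a toy model and not the kernel's merge logic: struct sim_request, can_back_merge, and try_back_merge are hypothetical names, and real merging also checks segment counts, request flags, and other queue limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a block request (not struct request). */
struct sim_request {
    uint64_t sector;      /* first 512-byte sector */
    uint32_t nr_sectors;  /* length in sectors */
    bool     is_write;
};

/* Back merge: next starts exactly where req ends, same direction. */
static bool can_back_merge(const struct sim_request *req,
                           const struct sim_request *next)
{
    return req->is_write == next->is_write &&
           req->sector + req->nr_sectors == next->sector;
}

/* Try to append next to req; returns true and grows req on success. */
static bool try_back_merge(struct sim_request *req,
                           const struct sim_request *next,
                           uint32_t max_sectors)
{
    assert(max_sectors > 0);
    if (!can_back_merge(req, next))
        return false;
    if ((uint64_t)req->nr_sectors + next->nr_sectors > max_sectors)
        return false;  /* would exceed the queue's size limit */
    req->nr_sectors += next->nr_sectors;
    return true;
}
```

Two 8-sector writes at sectors 0 and 8 collapse into one 16-sector request, while a write at sector 32 or a read at sector 16 stays separate.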

The Interview Questions They’ll Ask

  1. “What is the difference between block and char devices?”
  2. “What is the block request queue?”
  3. “How does the page cache interact with block devices?”
  4. “What is I/O scheduling?”
  5. “How do you test a block device driver?”

Hints in Layers

Hint 1: Start with a RAM disk Implement a memory-backed block device.

Hint 2: Implement request handling Complete requests with blk_mq_end_request (the legacy blk_end_request_all was removed along with the single-queue API).

Hint 3: Add partitioning Expose a simple partition table to test tools.

Hint 4: Benchmark Use dd and fio to measure throughput.

Books That Will Help

Topic             Book                                     Chapter
Block layer       Linux Kernel Development (3rd Edition)   Ch. 14
Storage I/O       Operating Systems: Three Easy Pieces     Ch. 37
VFS + page cache  Linux Kernel Development (3rd Edition)   Ch. 13, 16

Common Pitfalls & Debugging

Problem 1: “Device appears but I/O fails”

  • Why: Tag set or request queue not fully initialized
  • Fix: Set .queue_rq in your blk_mq_ops, allocate the tag set, and set the queue's queuedata before calling add_disk

Problem 2: “Data corruption”

  • Why: Incorrect sector offset math
  • Fix: Verify offset = sector * 512

Problem 3: “Unmount hangs”

  • Why: Requests not completed
  • Fix: Always complete requests on error paths
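
The offset math from Problem 2 can be checked in user space before it goes near the kernel. The sketch below is a plain C model under stated assumptions: RAMDISK_SECTOR_SHIFT, sector_to_offset, and ramdisk_range_ok are hypothetical names, and the 512-byte unit reflects the sector size the block layer uses for request positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Request positions are counted in 512-byte sectors (2^9 bytes). */
#define RAMDISK_SECTOR_SHIFT 9

/* Convert a starting sector to a byte offset into the backing store. */
static uint64_t sector_to_offset(uint64_t sector)
{
    return sector << RAMDISK_SECTOR_SHIFT;
}

/*
 * Return true if len_bytes starting at sector fits inside a device of
 * capacity_sectors sectors. Ordered so that no step can overflow.
 */
static bool ramdisk_range_ok(uint64_t sector, uint64_t len_bytes,
                             uint64_t capacity_sectors)
{
    uint64_t capacity_bytes;

    assert(capacity_sectors <= (UINT64_MAX >> RAMDISK_SECTOR_SHIFT));
    capacity_bytes = capacity_sectors << RAMDISK_SECTOR_SHIFT;

    if (sector > capacity_sectors)
        return false;
    /* sector <= capacity_sectors, so the subtraction cannot wrap */
    return len_bytes <= capacity_bytes - sector_to_offset(sector);
}
```

On the 131072-sector device from the example, a 512-byte access at sector 131071 is the last one that fits; a 1024-byte access at the same sector must be rejected with an I/O error instead of being copied.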

Project 12: USB Device Driver

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 4: Expert
  • Knowledge Area: USB Subsystem / Hardware Drivers
  • Software or Tool: USB core, usbmon
  • Main Book: “Linux Device Drivers, 3rd Edition” by Corbet, Rubini, Kroah-Hartman

What you’ll build: A driver for a simple USB device—ideally a real device like a USB LED, custom microcontroller (Arduino/STM32), or a USB-to-serial adapter. You’ll handle device enumeration, setup, and I/O.

Why it teaches Linux kernel: USB is one of the most common hardware interfaces. Writing a USB driver teaches you device enumeration, endpoint configuration, and URB (USB Request Block) handling. These concepts apply to any bus-based driver (PCI, I2C, SPI).

Core challenges you’ll face:

  • USB device enumeration → maps to device/driver matching
  • Understanding USB descriptors → maps to hardware discovery
  • URB submission and completion → maps to asynchronous I/O in kernel
  • Handling hot-plug → maps to device lifecycle management

Key Concepts:

  • USB driver basics: “Linux Device Drivers, 3rd Edition” Chapter 13 - Corbet et al.
  • USB protocol: USB specification (usb.org)
  • USB core API: Kernel source include/linux/usb.h
  • usbmon debugging: Kernel documentation Documentation/usb/usbmon.rst

Difficulty: Expert Time estimate: 3-4 weeks Prerequisites: Projects 5, 11 completed, basic USB protocol knowledge, access to a USB device

Real world outcome:

# Plug in your custom USB device (e.g., Arduino with custom firmware)
$ dmesg | tail
[12345.678] usb 1-2: new full-speed USB device number 4 using xhci_hcd
[12345.789] usb 1-2: New USB device found, idVendor=1234, idProduct=5678
[12345.789] usb 1-2: Product: My Custom Device
[12345.790] my_usb_driver: Device connected!
[12345.790] my_usb_driver: Found 2 endpoints (1 IN, 1 OUT)
[12345.791] my_usb_driver: Device registered as /dev/myusb0

$ ls -la /dev/myusb0
crw-rw-r-- 1 root root 180, 192 Dec 20 10:00 /dev/myusb0

# Communicate with the device
$ echo -n "LED ON" > /dev/myusb0
$ dmesg | tail -1
[12350.123] my_usb_driver: Sent 6 bytes to device

$ cat /dev/myusb0
Device Status: OK
Temperature: 23.5C
LED: ON

# USB debugging with usbmon
$ sudo modprobe usbmon
$ sudo cat /sys/kernel/debug/usb/usbmon/1u | head -20
ffff888123456780 1234567890 S Bo:1:004:1 -115 6 = 4c454420 4f4e
ffff888123456780 1234567891 C Bo:1:004:1 0 6 >

# View device info via sysfs
$ cat /sys/bus/usb/devices/1-2/product
My Custom Device

$ cat /sys/bus/usb/devices/1-2/manufacturer
Your Name

$ sudo rmmod my_usb_driver
$ dmesg | tail -1
[12400.456] my_usb_driver: Device disconnected, cleanup complete

Implementation Hints:

USB driver structure:

static const struct usb_device_id my_device_table[] = {
    { USB_DEVICE(VENDOR_ID, PRODUCT_ID) },
    { }  // Terminating entry
};
MODULE_DEVICE_TABLE(usb, my_device_table);

static struct usb_driver my_driver = {
    .name = "my_usb_driver",
    .probe = my_probe,
    .disconnect = my_disconnect,
    .id_table = my_device_table,
};

module_usb_driver(my_driver);

Probe function:

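static int my_probe(struct usb_interface *interface,
                    const struct usb_device_id *id)
{
    struct usb_device *udev = interface_to_usbdev(interface);
    struct usb_host_interface *iface_desc = interface->cur_altsetting;
    struct usb_endpoint_descriptor *endpoint;
    u8 bulk_in_addr = 0, bulk_out_addr = 0;  // 0 = not found yet
    int i;

    // Find the first bulk IN and bulk OUT endpoints
    // (usb_find_common_endpoints() performs the same search in one call)
    for (i = 0; i < iface_desc->desc.bNumEndpoints; i++) {
        endpoint = &iface_desc->endpoint[i].desc;
        if (!bulk_in_addr && usb_endpoint_is_bulk_in(endpoint))
            bulk_in_addr = endpoint->bEndpointAddress;
        if (!bulk_out_addr && usb_endpoint_is_bulk_out(endpoint))
            bulk_out_addr = endpoint->bEndpointAddress;
    }
    if (!bulk_in_addr || !bulk_out_addr) {
        dev_err(&interface->dev, "missing bulk endpoints\n");
        return -ENODEV;
    }

    // Register character device interface (e.g. with usb_register_dev())
    // ...

    return 0;  // Non-zero tells the USB core the driver did not bind
}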
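
The endpoint helpers used in the probe function are thin wrappers over two descriptor bytes. The user-space sketch below shows what they test. The struct and function names (ep_desc, ep_is_bulk_in, and so on) are hypothetical stand-ins; the bit masks and transfer-type codes are the ones defined by the USB specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout defined by the USB specification, chapter 9. */
#define EP_DIR_MASK      0x80  /* bEndpointAddress bit 7: 1 = IN (device to host) */
#define EP_NUM_MASK      0x0f  /* bEndpointAddress bits 0-3: endpoint number */
#define EP_XFERTYPE_MASK 0x03  /* bmAttributes bits 0-1: transfer type */

enum ep_xfer_type {
    EP_XFER_CONTROL = 0,
    EP_XFER_ISOC    = 1,
    EP_XFER_BULK    = 2,
    EP_XFER_INT     = 3,
};

/* Only the two descriptor fields the checks need (hypothetical struct). */
struct ep_desc {
    uint8_t bEndpointAddress;
    uint8_t bmAttributes;
};

static enum ep_xfer_type ep_type(const struct ep_desc *d)
{
    assert(d);
    return (enum ep_xfer_type)(d->bmAttributes & EP_XFERTYPE_MASK);
}

static bool ep_is_in(const struct ep_desc *d)
{
    return (d->bEndpointAddress & EP_DIR_MASK) != 0;
}

static unsigned int ep_number(const struct ep_desc *d)
{
    return d->bEndpointAddress & EP_NUM_MASK;
}

static bool ep_is_bulk_in(const struct ep_desc *d)
{
    return ep_type(d) == EP_XFER_BULK && ep_is_in(d);
}

static bool ep_is_bulk_out(const struct ep_desc *d)
{
    return ep_type(d) == EP_XFER_BULK && !ep_is_in(d);
}
```

An endpoint with address 0x81 and attributes 0x02 decodes as bulk IN on endpoint 1; address 0x02 with the same attributes is bulk OUT on endpoint 2. Comparing these fields against lsusb -v output is a quick way to confirm which endpoints your device exposes.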

URB submission:

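struct urb *urb = usb_alloc_urb(0, GFP_KERNEL);  /* 0 = no isochronous packets */
if (!urb)
    return -ENOMEM;
usb_fill_bulk_urb(urb, dev, pipe, buffer, length,  /* buffer: kmalloc'd, never on the stack */
                   completion_callback, context);
int ret = usb_submit_urb(urb, GFP_KERNEL);  /* use GFP_ATOMIC in atomic context */
if (ret)
    usb_free_urb(urb);  /* submission failed: the callback will never run */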

Questions to guide implementation:

  1. What’s the difference between control, bulk, interrupt, and isochronous endpoints?
  2. How does the kernel match a USB device to your driver?
  3. What happens if you unplug the device while a URB is pending?
  4. How do you handle USB device resets?

Learning milestones:

  1. Driver loads when device plugs in → You understand USB enumeration
  2. You can send data to device → You understand USB bulk OUT transfers
  3. You can receive data from device → You understand USB bulk IN transfers
  4. Hot-plug works correctly → You understand device lifecycle

The Core Question You’re Answering

“How does the kernel communicate with USB devices and manage hotplug state?”

USB drivers are complex because the hardware is dynamic and asynchronous.

Concepts You Must Understand First

  1. USB device model
    • What are endpoints and interfaces?
    • Book Reference: Linux Device Drivers, 3rd Edition, Ch. 13
  2. URBs
    • How does a USB request block work?
    • Book Reference: Linux Device Drivers, 3rd Edition, Ch. 13
  3. Hotplug lifecycle
    • What happens when a device is disconnected?
    • Book Reference: Linux Device Drivers, 3rd Edition, Ch. 14

Questions to Guide Your Design

  1. Device matching
    • Which VID/PID will you bind to?
  2. Data path
    • Will you implement control transfers or bulk transfers?
  3. Resilience
    • How will you handle disconnects mid-transfer?

Thinking Exercise

Device plugged → probe() → endpoints discovered → URB submitted

Questions while tracing:

  • Where do you allocate URBs?
  • What happens if the device disappears?

The Interview Questions They’ll Ask

  1. “What is a USB endpoint?”
  2. “How does the kernel bind drivers to devices?”
  3. “What is an URB?”
  4. “How do you handle disconnects safely?”
  5. “What are control vs bulk transfers?”

Hints in Layers

Hint 1: Use a known device Start with a simple HID or USB serial device.

Hint 2: Implement probe/disconnect Make sure cleanup happens on disconnect.

Hint 3: Submit an URB Send a simple control message.

Hint 4: Add user interface Expose a char device for user interaction.

Books That Will Help

Topic             Book                                Chapter
USB driver model  Linux Device Drivers, 3rd Edition   Ch. 13
Device lifecycle  Linux Device Drivers, 3rd Edition   Ch. 14

Common Pitfalls & Debugging

Problem 1: “Driver never binds”

  • Why: Wrong VID/PID table
  • Fix: Verify with lsusb -v

Problem 2: “Kernel crash on disconnect”

  • Why: URBs still active
  • Fix: Cancel URBs in disconnect handler

Problem 3: “Transfers time out”

  • Why: Wrong endpoint type or size
  • Fix: Inspect descriptors and adjust URB settings

Project 13: Kernel Tracing with eBPF

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C (eBPF programs), Python (BCC tools)
  • Alternative Programming Languages: Rust (using libbpf-rs)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced
  • Knowledge Area: eBPF / Kernel Tracing
  • Software or Tool: BCC, bpftrace, libbpf
  • Main Book: “Learning eBPF” by Liz Rice

What you’ll build: A collection of eBPF programs that trace kernel events—system calls, network packets, filesystem operations—without modifying kernel code or loading traditional kernel modules.

Why it teaches Linux kernel: eBPF is revolutionizing how we interact with the kernel. It’s used for performance analysis, security monitoring, and networking. Understanding eBPF means understanding where to hook into the kernel and what data structures matter. This is essential knowledge for modern kernel work.

Core challenges you’ll face:

  • Writing eBPF programs → maps to understanding kernel hook points
  • Accessing kernel data structures safely → maps to BTF and CO-RE
  • Communicating results to userspace → maps to eBPF maps
  • Understanding eBPF verifier → maps to kernel safety requirements

Key Concepts:

  • eBPF fundamentals: “Learning eBPF” Chapters 1-5 - Liz Rice
  • BCC tools: BCC documentation (iovisor/bcc)
  • libbpf and CO-RE: Kernel documentation Documentation/bpf/
  • bpftrace: bpftrace documentation

Difficulty: Advanced Time estimate: 2 weeks Prerequisites: Projects 1-2, 7 completed, understanding of kernel concepts

Real world outcome:

# Install tools
$ sudo apt install bpfcc-tools bpftrace linux-headers-$(uname -r)

# Example 1: Trace all openat() syscalls
$ sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat {
    printf("%s opened %s\n", comm, str(args->filename));
}'
Attaching 1 probe...
firefox opened /home/user/.mozilla/cookies.sqlite
code opened /home/user/project/main.c

# Example 2: Your custom syscall latency tracer
$ sudo ./syscall_latency.py
Tracing syscall latency... Hit Ctrl-C to end.

SYSCALL               COUNT    AVG_NS      MAX_NS
read                  12456    1234        45678
write                  8901    2345        56789
openat                 1234    5678        123456
close                  9012    567         12345

# Example 3: Network packet tracer
$ sudo ./net_tracer.py
Tracing TCP connections... Hit Ctrl-C to end.

TIME     PID    COMM         SADDR:SPORT        DADDR:DPORT
10:00:01 1234   firefox      192.168.1.10:52345 142.250.80.46:443
10:00:02 5678   curl         192.168.1.10:52346 151.101.1.69:80

# Example 4: Filesystem latency histogram
$ sudo ./fs_latency.py ext4
Tracing ext4 operations... Hit Ctrl-C to end.

     usecs           : count     distribution
       0 -> 1        : 0        |                    |
       1 -> 2        : 12       |**                  |
       2 -> 4        : 89       |*****************   |
       4 -> 8        : 234      |********************|
       8 -> 16       : 156      |**************      |
      16 -> 32       : 45       |****                |
      32 -> 64       : 12       |*                   |

# Your custom eBPF program
$ cat trace_malloc.bpf.c
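#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char LICENSE[] SEC("license") = "GPL";  // bpf_printk needs a GPL-compatible license

SEC("uprobe/libc.so.6:malloc")
int trace_malloc(struct pt_regs *ctx) {
    size_t size = PT_REGS_PARM1(ctx);
    bpf_printk("malloc(%lu)\n", size);  // bpf_printk does not accept %zu
    return 0;
}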

$ sudo ./trace_malloc
Attaching probe...
PID 1234: malloc(4096)
PID 1234: malloc(256)
PID 5678: malloc(1024)

Implementation Hints:

bpftrace one-liners to start:

# Count syscalls by process
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'

# Trace file opens
bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s\n", str(args->filename)); }'

# Histogram of read sizes
bpftrace -e 'tracepoint:syscalls:sys_exit_read /args->ret > 0/ { @bytes = hist(args->ret); }'

# Trace context switches
bpftrace -e 'tracepoint:sched:sched_switch { printf("%s -> %s\n", args->prev_comm, args->next_comm); }'

BCC Python template:

from bcc import BPF

prog = """
int trace_func(struct pt_regs *ctx) {
    bpf_trace_printk("function called\\n");
    return 0;
}
"""

b = BPF(text=prog)
b.attach_kprobe(event="do_sys_openat2", fn_name="trace_func")
b.trace_print()

libbpf C template:

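// trace.bpf.c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";  // required for bpf_printk

SEC("tracepoint/syscalls/sys_enter_write")
int trace_write(struct trace_event_raw_sys_enter *ctx) {
    int fd = ctx->args[0];  // first syscall argument
    bpf_printk("write to fd %d\n", fd);
    return 0;
}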

Questions to guide implementation:

  1. What’s the difference between kprobes, tracepoints, and uprobes?
  2. What restrictions does the eBPF verifier enforce?
  3. How do you share data between eBPF programs and userspace?
  4. What is BTF and why is it important for portability?

Learning milestones:

  1. bpftrace one-liners work → You understand basic tracing
  2. BCC script traces your chosen event → You understand eBPF programming
  3. You write a libbpf program → You understand CO-RE and BTF
  4. You build useful debugging tools → You can apply eBPF to real problems

The Core Question You’re Answering

“How do you observe kernel behavior safely without modifying core logic?”

eBPF and tracing provide visibility with minimal disruption.

Concepts You Must Understand First

  1. BPF program types
    • Which program types attach to tracepoints?
    • Book Reference: Learning eBPF, Ch. 7
  2. BPF maps
    • How do you store state across events?
    • Book Reference: Learning eBPF, Ch. 2
  3. Tracepoints vs kprobes
    • When do you use each?
    • Book Reference: Learning eBPF, Ch. 7
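
The "state across events" question is easiest to see with the syscall latency tracer: one map remembers when each thread entered a syscall, and a second map accumulates the results. The sketch below models that pattern in ordinary user-space C with fixed arrays. It is not an eBPF program; in a real tracer the arrays would be BPF hash maps and the timestamps would come from bpf_ktime_get_ns(). All names and the table sizes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TIDS     64   /* toy limits for the sketch */
#define MAX_SYSCALLS 512

struct lat_stats {
    uint64_t count;
    uint64_t total_ns;
    uint64_t max_ns;
};

/* "Map" 1: entry timestamp per thread; 0 means no syscall in flight. */
static uint64_t start_ns[MAX_TIDS];
/* "Map" 2: aggregated latency per syscall number. */
static struct lat_stats stats[MAX_SYSCALLS];

/* Entry event: remember when this thread started the syscall. */
static void on_sys_enter(uint32_t tid, uint64_t now_ns)
{
    assert(tid < MAX_TIDS);
    start_ns[tid] = now_ns;
}

/* Exit event: compute the delta and fold it into the per-syscall stats. */
static void on_sys_exit(uint32_t tid, uint32_t nr, uint64_t now_ns)
{
    uint64_t delta;

    assert(tid < MAX_TIDS && nr < MAX_SYSCALLS);
    if (start_ns[tid] == 0)
        return;             /* entry was never seen: skip this event */
    delta = now_ns - start_ns[tid];
    start_ns[tid] = 0;      /* equivalent to deleting the map entry */

    stats[nr].count++;
    stats[nr].total_ns += delta;
    if (delta > stats[nr].max_ns)
        stats[nr].max_ns = delta;
}

/* Average latency for one syscall number, 0 if it was never seen. */
static uint64_t avg_ns(uint32_t nr)
{
    assert(nr < MAX_SYSCALLS);
    return stats[nr].count ? stats[nr].total_ns / stats[nr].count : 0;
}
```

The exit handler skips threads with no recorded entry because tracing can start while a syscall is already in flight. User space then only needs to read the second table to print the COUNT, AVG_NS, and MAX_NS columns.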

Questions to Guide Your Design

  1. Target events
    • Which tracepoints reveal the subsystem behavior?
  2. Data volume
    • How will you aggregate to avoid overhead?
  3. Safety
    • How will you ensure the BPF verifier accepts the program?

Thinking Exercise

tracepoint:syscalls:sys_enter_openat

Questions while tracing:

  • What arguments can you read safely?
  • How will you group by PID or command?

The Interview Questions They’ll Ask

  1. “What is eBPF and why is it safe?”
  2. “What does the BPF verifier enforce?”
  3. “How do you pass data to userspace?”
  4. “Why are tracepoints preferred over kprobes?”
  5. “What are common eBPF use cases?”

Hints in Layers

Hint 1: Start with bpftrace Write a one-liner to count syscalls.

Hint 2: Use a map Aggregate by PID or syscall number.

Hint 3: Build a C BPF program Use libbpf to load and attach.

Hint 4: Add filtering Only trace a specific process to reduce noise.

Books That Will Help

Topic                          Book           Chapter
eBPF fundamentals              Learning eBPF  Ch. 1-5
Program and attachment types   Learning eBPF  Ch. 7

Common Pitfalls & Debugging

Problem 1: “Verifier rejects program”

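  • Why: Unsafe pointer access, an unchecked map lookup, or a loop the verifier cannot bound
  • Fix: Read kernel memory through bpf_probe_read_kernel() or BPF_CORE_READ(), check map lookups for NULL, and bound every loop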

Problem 2: “No output”

  • Why: Tracepoint not firing or permissions
  • Fix: Run as root and confirm event names

Problem 3: “High overhead”

  • Why: Printing per-event
  • Fix: Aggregate in a map and print periodically
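
The aggregation fix for Problem 3 amounts to replacing one printed line per event with one counter increment. The user-space sketch below shows power-of-two bucketing of the kind used for latency histograms. It is an illustration under assumptions: log2_slot, hist_record, and the slot layout are hypothetical, and the exact bucket boundaries printed by bpftrace or BCC may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Slot 0 holds the value 0; slot k (1..64) holds 2^(k-1) .. 2^k - 1. */
#define HIST_SLOTS 65

/* Map a value to its slot: 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 3, ... */
static unsigned int log2_slot(uint64_t v)
{
    unsigned int slot = 0;

    while (v) {   /* count the significant bits */
        slot++;
        v >>= 1;
    }
    return slot;
}

static uint64_t hist[HIST_SLOTS];

/* Per-event work: a single increment, no output. */
static void hist_record(uint64_t value)
{
    unsigned int slot = log2_slot(value);

    assert(slot < HIST_SLOTS);
    hist[slot]++;
}
```

In an eBPF tracer the hist array would live in a BPF map, and user space would read and print it once per interval or on exit.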

Project 14: Kernel Coding Style Cleanup (Staging Cleanup)

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Kernel Patch Submission / Community Process
  • Software or Tool: checkpatch.pl, git format-patch, git send-email
  • Main Book: Kernel documentation (Documentation/process/)

What you’ll build: Your first actual kernel patches—coding style cleanups in the drivers/staging/ directory, submitted to the Linux kernel mailing list and (hopefully) accepted.

Why it teaches Linux kernel: This is where the rubber meets the road. The staging tree is explicitly designed for new contributors. You’ll learn checkpatch.pl, git format-patch, git send-email, and how to respond to maintainer feedback. This is the exact process for any kernel contribution.

Core challenges you’ll face:

  • Finding suitable cleanup opportunities → maps to reading and understanding kernel code
  • Using checkpatch.pl correctly → maps to kernel coding style requirements
  • Formatting patches properly → maps to git workflows for kernel
  • Responding to review feedback → maps to community interaction

Key Concepts:

  • Coding style: Kernel documentation Documentation/process/coding-style.rst
  • Submitting patches: Kernel documentation Documentation/process/submitting-patches.rst
  • First patch checklist: KernelNewbies FirstKernelPatch
  • Git email setup: Kernel documentation Documentation/process/email-clients.rst

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Projects 3-4 completed, git proficiency

Real world outcome:

# Find checkpatch issues in staging
$ cd linux
$ ./scripts/checkpatch.pl --file drivers/staging/rtl8723bs/core/*.c | head -50
WARNING: line length of 85 exceeds 80 columns
#42: FILE: drivers/staging/rtl8723bs/core/rtw_cmd.c:42:
+       u8 *pcmd_para;          /* cmd parameter */

ERROR: space prohibited before that close parenthesis ')'
#156: FILE: drivers/staging/rtl8723bs/core/rtw_cmd.c:156:
+       if (pcmdpriv == NULL )

# Fix issues and check your changes
$ git diff drivers/staging/rtl8723bs/core/rtw_cmd.c
-       if (pcmdpriv == NULL )
+       if (!pcmdpriv)

$ git diff | ./scripts/checkpatch.pl -
total: 0 errors, 0 warnings, 12 lines checked

# Create commit with proper format
$ git commit -s

# (Your editor opens with this template)
staging: rtl8723bs: fix coding style issues in rtw_cmd.c

Fix the following issues reported by checkpatch.pl:
- Replace 'foo == NULL' with '!foo'
- Remove space before closing parenthesis
- Break lines exceeding 80 characters

Signed-off-by: Your Name <your@email.com>

# Find maintainers
$ ./scripts/get_maintainer.pl drivers/staging/rtl8723bs/
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (maintainer:STAGING SUBSYSTEM)
linux-staging@lists.linux.dev (open list:STAGING SUBSYSTEM)

# Format patch for email
$ git format-patch -1
0001-staging-rtl8723bs-fix-coding-style-issues-in-rtw_cmd.patch

# Send to mailing list (after configuring git send-email)
$ git send-email --to=gregkh@linuxfoundation.org \
                 --cc=linux-staging@lists.linux.dev \
                 0001-staging-rtl8723bs-fix-coding-style-issues-in-rtw_cmd.patch

# Watch for replies on the mailing list or your email
# You might get:
# - Accepted: "Applied, thanks!"
# - Feedback: "Please split this into multiple patches"
# - Rejection: "This doesn't actually improve the code"

Implementation Hints:

Finding good first patches:

  1. Look at TODO files in staging drivers
  2. Run checkpatch.pl on staging drivers
  3. Search for “Fix checkpatch” commits for examples
  4. Start with cosmetic changes (whitespace, comments)

checkpatch.pl common fixes:

// Before
if (ptr == NULL)
if (ptr != NULL)
while (x == true)

// After
if (!ptr)
if (ptr)
while (x)

Commit message format:

subsystem: component: short description (at most 70-75 chars)

Longer explanation of what and why (not how).
Wrap at 75 columns.

Signed-off-by: Your Name <email@example.com>

git send-email setup:

git config --global sendemail.smtpserver smtp.gmail.com
git config --global sendemail.smtpserverport 587
git config --global sendemail.smtpencryption tls
git config --global sendemail.smtpuser your@gmail.com

Important tips:

  • ALWAYS run checkpatch.pl before submitting
  • One logical change per patch
  • Don’t mix coding style fixes with functional changes
  • Be patient—maintainers are busy
  • Be gracious when receiving feedback

Learning milestones:

  1. checkpatch.pl passes on your changes → You understand kernel style
  2. You send your first patch → You understand the submission process
  3. You respond to feedback appropriately → You can work with the community
  4. Your patch is accepted → You are officially a kernel contributor!

The Core Question You’re Answering

“What does kernel-quality code look like, and how do you make small changes that matter?”

Cleanup work teaches you kernel style, API usage, and review discipline.

Concepts You Must Understand First

  1. Kernel coding style
    • What does checkpatch.pl enforce?
    • Reference: Kernel documentation Documentation/process/coding-style.rst
  2. Staging drivers
    • Why are they in drivers/staging?
    • Reference: Kernel documentation Documentation/process/2.Process.rst (staging trees)
  3. Patch workflow
    • How do you send a patch?
    • Reference: Kernel documentation Documentation/process/submitting-patches.rst

Questions to Guide Your Design

  1. Scope
    • Can the patch be reviewed in under 5 minutes?
  2. Testing
    • How will you prove behavior did not change?
  3. Communication
    • How will you explain the change in the commit message?

Thinking Exercise

Fix a single warning in checkpatch output without changing behavior.

Questions while tracing:

  • Does the patch change semantics?
  • Did you update only what is necessary?

The Interview Questions They’ll Ask

  1. “Why are staging drivers different?”
  2. “What makes a good kernel commit message?”
  3. “How do you run kernel style checks?”
  4. “What is the difference between cleanup and refactor?”
  5. “How do you pick a subsystem to contribute to?”

Hints in Layers

Hint 1: Run checkpatch Use ./scripts/checkpatch.pl on your patch.

Hint 2: Start tiny Fix a single warning or whitespace issue.

Hint 3: Explain the change Write a commit message that says why it matters.

Hint 4: Send to staging tree Use the staging tree mailing list for first patches.

Books That Will Help

Topic Book Chapter
Kernel build/workflow Linux Kernel Development Ch. 2
Patch discipline The Pragmatic Programmer Ch. 3

Common Pitfalls & Debugging

Problem 1: “Patch rejected for style”

  • Why: Missed checkpatch warnings
  • Fix: Run checkpatch.pl and address each issue

Problem 2: “Too many changes”

  • Why: Mixed cleanups in one patch
  • Fix: Split into multiple small patches

Problem 3: “No maintainer response”

  • Why: Wrong mailing list or missing CC
  • Fix: Use scripts/get_maintainer.pl

Project 15: Bug Fix Contribution

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: N/A
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Kernel Debugging / Bug Fixing
  • Software or Tool: bugzilla, syzkaller, syzbot
  • Main Book: “Linux Kernel Debugging” by Kaiwan N. Billimoria

What you’ll build: A real bug fix for the Linux kernel—finding a bug (or picking one from bug trackers), analyzing the root cause, writing a fix, and submitting it upstream.

Why it teaches Linux kernel: Bug fixing is how most kernel developers start making meaningful contributions. It requires understanding the code deeply, using debugging tools, and thinking critically about edge cases. Successfully fixing a kernel bug demonstrates real competence.

Core challenges you’ll face:

  • Finding and reproducing bugs → maps to bug tracking and test case creation
  • Root cause analysis → maps to deep kernel debugging
  • Writing correct fixes → maps to understanding subsystem behavior
  • Testing thoroughly → maps to kernel testing methodology

Key Concepts:

  • Bug sources: kernel.org bugzilla, syzbot reports
  • Analysis techniques: “Linux Kernel Debugging” - Kaiwan N. Billimoria
  • Testing: Kernel documentation Documentation/dev-tools/
  • Fix patterns: Study existing bug fix commits

Difficulty: Expert Time estimate: 2-4 weeks Prerequisites: Projects 10, 13, 14 completed

Real world outcome:

# Example: Finding and fixing a bug from syzbot

# 1. Browse syzbot for open bugs
# https://syzkaller.appspot.com/

# 2. Pick a bug you understand (e.g., null pointer dereference)
# Example: "KASAN: null-ptr-deref in foo_function"

# 3. Reproduce it inside your test VM (syzbot attaches a C reproducer when it has one)
$ gcc -pthread -o repro repro.c && ./repro
[  123.456789] BUG: KASAN: null-ptr-deref in foo_function+0x123/0x456
[  123.456790] Read of size 8 at addr 0000000000000000 by task repro/1234

# 4. Analyze the code path
$ addr2line -e vmlinux ffffffff81234567
drivers/foo/bar.c:123

# 5. Examine the code
$ vim drivers/foo/bar.c +123
// Line 123: ptr->member = value;  // ptr could be NULL!

# 6. Find where ptr should have been validated
# Trace back through the call stack

# 7. Write and test your fix
$ git diff
diff --git a/drivers/foo/bar.c b/drivers/foo/bar.c
--- a/drivers/foo/bar.c
+++ b/drivers/foo/bar.c
@@ -120,4 +120,7 @@ static int process_request(struct foo *ptr)
 {
     int ret;

+    if (!ptr)
+        return -EINVAL;
+
     ptr->member = value;

# 8. Build and test
$ make -j$(nproc)
$ ./boot_qemu.sh
# Run reproducer - should no longer crash

# 9. Run kernel self-tests
$ make kselftest TARGETS=drivers/foo

# 10. Submit the patch
$ git commit -s -m "foo: fix null pointer dereference in process_request

Add a null check for ptr before dereferencing it. This fixes a
KASAN-reported null pointer dereference that can occur when
foo_init() fails to allocate the structure.

Reported-by: syzbot+abc123@syzkaller.appspotmail.com
Signed-off-by: Your Name <your@email.com>"

$ git format-patch -1
$ ./scripts/get_maintainer.pl drivers/foo/bar.c
$ git send-email ...

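# 11. Ask syzbot to test the fix: reply to the report with a "#syz test" command and your patch
# (It reruns the reproducer and replies with the result and a tested-by tag if the crash is gone)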

Implementation Hints:

Finding bugs to fix:

  1. Kernel.org Bugzilla: https://bugzilla.kernel.org/
  2. syzbot dashboard: https://syzkaller.appspot.com/
  3. Mailing list reports: search the LKML archives (lore.kernel.org) for “BUG:” or “WARNING:”
  4. Your own testing: Run syzkaller yourself

Analyzing crash reports:

# Decode stack trace
$ ./scripts/decode_stacktrace.sh vmlinux < crash.txt

# Map an address to its function name (-f) and source file:line
$ addr2line -e vmlinux -f ffffffff81234567

# Disassemble around the crash
$ objdump -d vmlinux | grep -A20 "foo_function>:"
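
A frame such as foo_function+0x123/0x456 reads as "0x123 bytes into foo_function, which is 0x456 bytes long". As a minimal sketch of how that notation splits apart (plain userspace C, not kernel code; struct frame and parse_frame are hypothetical names chosen for this illustration):

```c
#include <stdio.h>

/* Parsed form of one "symbol+0xOFFSET/0xSIZE" frame from a crash report. */
struct frame {
    char symbol[64];
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total length of the function in bytes */
};

/*
 * Split a frame string into its three parts.
 * Returns 0 on success, -1 if the text does not have the expected shape.
 */
static int parse_frame(const char *text, struct frame *out)
{
    int consumed = 0;
    int matched;

    /* %lx accepts the optional 0x prefix; %n records how much was consumed. */
    matched = sscanf(text, "%63[^+]+%lx/%lx%n",
                     out->symbol, &out->offset, &out->size, &consumed);
    if (matched != 3 || text[consumed] != '\0')
        return -1;

    /* An offset beyond the function's end cannot be a valid frame. */
    if (out->offset > out->size)
        return -1;

    return 0;
}
```

Calling parse_frame("foo_function+0x123/0x456", &f) fills f.symbol with foo_function, f.offset with 0x123, and f.size with 0x456. The symbol and offset together are what you look up in the disassembly to find the faulting instruction.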

Common bug patterns:

  • Null pointer dereference (missing validation)
  • Use-after-free (wrong refcounting)
  • Buffer overflow (bounds check missing)
  • Race condition (missing locking)
  • Resource leak (missing cleanup path)
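
The "missing cleanup path" pattern is usually avoided in kernel code with a goto-based unwind: each failure jumps to a label that releases only what was acquired so far, in reverse order. The sketch here is a userspace illustration of that idiom under stated assumptions; foo_ctx, foo_setup, foo_teardown, and the allocation counter are hypothetical and exist only to make the leak observable.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Demo-only accounting so a leak shows up as a non-zero count. */
static int live_allocs;

static void *demo_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

struct foo_ctx {
    void *ring;
    void *scratch;
};

/* fail_late stands in for a later step (e.g. registration) that can fail. */
static int foo_setup(struct foo_ctx *ctx, int fail_late)
{
    int ret;

    ctx->ring = demo_alloc(4096);
    if (!ctx->ring)
        return -ENOMEM;

    ctx->scratch = demo_alloc(256);
    if (!ctx->scratch) {
        ret = -ENOMEM;
        goto err_free_ring;
    }

    if (fail_late) {
        ret = -EINVAL;
        goto err_free_scratch;
    }

    return 0;

    /* Unwind in reverse order of acquisition. */
err_free_scratch:
    demo_free(ctx->scratch);
    ctx->scratch = NULL;
err_free_ring:
    demo_free(ctx->ring);
    ctx->ring = NULL;
    return ret;
}

static void foo_teardown(struct foo_ctx *ctx)
{
    demo_free(ctx->scratch);
    demo_free(ctx->ring);
    ctx->scratch = NULL;
    ctx->ring = NULL;
}
```

When reviewing a suspected leak, walk every early return in the function and check that it reaches the label matching what has been acquired at that point. A bare return in the middle of such a function is a common place for a leak to hide.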

Testing your fix:

  • Verify the reproducer no longer crashes
  • Run related kernel self-tests
  • Check for regressions (doesn’t break normal use)
  • Test edge cases you identified

Questions to consider:

  1. Is this the right place to fix, or is there a deeper issue?
  2. Could this fix introduce other problems?
  3. Are there similar bugs in related code?
  4. Should this be backported to stable kernels?
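
The first question matters for the walkthrough's own example: a NULL check in process_request() hides the crash, but if the pointer is NULL because an init function ignored an allocation failure, the deeper fix is to make init fail. A minimal userspace sketch of that contrast follows; every name in it (foo_dev, foo_state, foo_init_buggy, foo_init_fixed, inject_alloc_failure) is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct foo_state {
    int member;
};

struct foo_dev {
    struct foo_state *state;
    int registered;         /* non-zero once other code can reach the device */
};

/* Test hook: when non-zero, allocation fails (simulates memory pressure). */
static int inject_alloc_failure;

static struct foo_state *foo_alloc_state(void)
{
    if (inject_alloc_failure)
        return NULL;
    return calloc(1, sizeof(struct foo_state));
}

/* Buggy shape: the failure is ignored and a half-built device is published. */
static int foo_init_buggy(struct foo_dev *dev)
{
    dev->state = foo_alloc_state();
    dev->registered = 1;
    return 0;
}

/* Root-cause fix: report the error and never publish the device. */
static int foo_init_fixed(struct foo_dev *dev)
{
    dev->registered = 0;
    dev->state = foo_alloc_state();
    if (!dev->state)
        return -ENOMEM;

    dev->registered = 1;
    return 0;
}
```

With the fixed init, later code can rely on state being valid whenever the device is registered, so it needs no defensive check. Which of the two fixes a maintainer prefers depends on the subsystem, so treat this as one possible outcome of the analysis, not a rule.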

Learning milestones:

  1. You reproduce a reported bug → You can set up testing environments
  2. You identify the root cause → You can analyze kernel code
  3. Your fix works without regressions → You understand the subsystem
  4. Your patch is accepted → You’ve made a real contribution

The Core Question You’re Answering

“Can you identify a real kernel bug, build a minimal fix, and communicate it clearly?”

This project is about real contribution behavior, not toy code.

Concepts You Must Understand First

  1. Bug triage
    • How to reduce to a minimal repro?
    • Book Reference: Linux Kernel Development, Ch. 18 (Debugging)
  2. Regression discipline
    • Why do kernel maintainers care about regressions?
    • Book Reference: Linux Kernel Development — Ch. 20
  3. Patch submission
    • How to format and send patches?
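    • Book Reference: Linux Kernel Development, Ch. 20 (Patches, Hacking, and the Community); see also Documentation/process/submitting-patches.rst in the kernel tree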

Questions to Guide Your Design

  1. Repro
    • Can you reproduce on mainline?
  2. Scope
    • Is the fix minimal and targeted?
  3. Validation
    • How will you prove the fix works?

Thinking Exercise

A bug appears only after 2 hours of uptime. How do you isolate it?

Questions while tracing:

  • How would you reduce runtime?
  • What logs or tracepoints would you add?
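
One class of uptime-dependent bug is counter wraparound, and one way to reduce runtime for it is to start the counter close to its wrap point. The kernel does this itself: jiffies is initialised so that its low 32 bits wrap about five minutes after boot. The sketch here assumes a 32-bit tick counter and uses hypothetical function names; the wrap-safe variant follows the signed-difference style of the kernel's time_after_eq().

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive deadline check: gives wrong answers once the counter wraps past zero. */
static bool expired_naive(uint32_t now, uint32_t deadline)
{
    return now >= deadline;
}

/*
 * Wrap-safe check: compare the signed difference instead of the raw values.
 * Correct as long as now and deadline are less than 2^31 ticks apart.
 */
static bool expired_wrapsafe(uint32_t now, uint32_t deadline)
{
    return (int32_t)(now - deadline) >= 0;
}
```

To exercise the wrap in seconds instead of hours, start the counter at a value such as UINT32_MAX - 10 and set a deadline 100 ticks ahead. Five ticks later the naive check already reports expiry, while the wrap-safe check does not.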

The Interview Questions They’ll Ask

  1. “How do you approach a kernel bug report?”
  2. “What does a good reproduction look like?”
  3. “How do you justify the fix in a commit message?”
  4. “What is the difference between a bug fix and a workaround?”
  5. “How do you validate a kernel patch?”

Hints in Layers

Hint 1: Start with a small bug
Choose a warning or small correctness issue.

Hint 2: Build a minimal repro
Write a tiny userspace trigger program.

Hint 3: Write the fix
Keep the diff minimal and focused.

Hint 4: Communicate clearly
Explain symptoms, root cause, and fix in the commit message.

Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Bug fixing | Linux Kernel Development | Ch. 20 |
| Debugging | The Art of Debugging with GDB | Ch. 7 |

Common Pitfalls & Debugging

Problem 1: “Fix changes behavior”

  • Why: You adjusted logic instead of guarding edge cases
  • Fix: Minimize diff, add targeted checks

Problem 2: “Patch rejected for missing context”

  • Why: Commit message does not explain root cause
  • Fix: State what the bug is, why it happens, and how the patch fixes it

Problem 3: “Can’t reproduce”

  • Why: Race conditions or timing issues
  • Fix: Add tracing and reduce to minimal repro

Project Comparison Table

| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
| --- | --- | --- | --- | --- |
| 1. Kernel Interface Explorer | Beginner | Weekend | ⭐⭐ | ⭐⭐⭐ |
| 2. System Call Tracer | Intermediate | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| 3. Custom Kernel Build | Intermediate | Weekend | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| 4. Kernel Module | Intermediate | 1 week | ⭐⭐⭐ | ⭐⭐⭐ |
| 5. Character Device Driver | Advanced | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 6. Memory Allocator Visualizer | Advanced | 2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| 7. Scheduler Analyzer | Advanced | 1-2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| 8. Simple Filesystem | Expert | 3-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 9. Netfilter Module | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 10. Kernel Debugging | Advanced | 1 week | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 11. Block Device Driver | Expert | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 12. USB Device Driver | Expert | 3-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 13. eBPF Tracing | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 14. Style Cleanup Patches | Intermediate | 1-2 weeks | ⭐⭐ | ⭐⭐ |
| 15. Bug Fix Contribution | Expert | 2-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

Based on the goal of becoming a kernel contributor, here’s the recommended order:

Phase 1: Foundation (2-4 weeks)

  1. Project 1: Kernel Interface Explorer - Understand what the kernel exposes
  2. Project 2: System Call Tracer - Understand the user/kernel boundary
  3. Project 3: Build and Boot Custom Kernel - Establish your development workflow

Phase 2: Kernel Programming (4-6 weeks)

  1. Project 4: Kernel Modules - Learn kernel programming fundamentals
  2. Project 5: Character Device Driver - Understand driver basics
  3. Project 10: Kernel Debugging - Essential skill for all kernel work

Phase 3: Subsystem Deep Dives (6-10 weeks)

Pick 2-3 based on your interests:

  • Project 6: Memory Allocator - If interested in core kernel
  • Project 7: Scheduler Analyzer - If interested in performance
  • Project 8: Filesystem - If interested in storage
  • Project 9: Netfilter - If interested in networking
  • Project 11: Block Device - If interested in storage drivers
  • Project 12: USB Driver - If interested in hardware

Phase 4: Modern Tools & Contribution (4-6 weeks)

  1. Project 13: eBPF Tracing - Modern kernel interaction
  2. Project 14: Coding Style Cleanup - Your first real patch
  3. Project 15: Bug Fix - Your first meaningful contribution

Total estimated time: 4-6 months of focused learning


Final Capstone Project: Contribute a New Feature or Subsystem Enhancement

  • File: LEARN_LINUX_KERNEL_DEEP_DIVE.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust (for supported subsystems)
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 1. The “Resume Gold” / 5. The “Industry Disruptor”
  • Difficulty: Level 5: Master
  • Knowledge Area: Full Kernel Development
  • Software or Tool: Full kernel development stack
  • Main Book: Linux kernel source code itself

What you’ll build: A genuine new feature or significant enhancement to a kernel subsystem—something that requires RFC discussions, multiple patch series, and extended interaction with maintainers.

Why it teaches Linux kernel: This is the culmination of your learning. You’re not just fixing bugs—you’re improving the kernel’s capabilities. This requires understanding a subsystem deeply, designing solutions that work with existing architecture, and convincing maintainers that your approach is sound.

Example projects:

  • Add a new scheduler class for a specific workload type
  • Implement a new filesystem feature (e.g., new ioctl, new mount option)
  • Add hardware support for a new device
  • Improve performance of an existing subsystem
  • Add new tracepoints for better observability
  • Implement a new security feature

Core challenges you’ll face:

  • Design discussion with maintainers → maps to community collaboration
  • Writing an RFC patch series → maps to technical communication
  • Handling design feedback and iteration → maps to engineering compromise
  • Documentation and testing → maps to production-quality development
  • Shepherding through review → maps to persistence and professionalism

Real world outcome:

Subject: [RFC PATCH 0/5] Add foo feature to bar subsystem

This series adds support for foo in the bar subsystem, addressing
the need for [specific use case].

The implementation adds:
- UAPI definitions for configuring foo from userspace (patch 1)
- Core foo logic in bar subsystem (patch 2)
- A sysfs interface for foo configuration (patch 3)
- Integration with existing baz feature (patch 4)
- Documentation (patch 5)

Performance results show [X% improvement] in [benchmark] for
[workload type].

This has been tested on x86_64 and arm64 with [test suite].

Feedback welcome, especially on the API design in patch 1.

[Your Name] (5):
  bar: add UAPI definitions for foo feature
  bar: implement core foo functionality
  bar: add sysfs interface for foo configuration
  bar: integrate foo with existing baz codepath
  Documentation: bar: document foo feature and usage

 Documentation/admin-guide/bar.rst |  45 +++++
 drivers/bar/core.c                | 234 ++++++++++++++++++++++
 drivers/bar/foo.c                 | 567 ++++++++++++++++++++++++++++
 include/uapi/linux/bar.h          |  23 ++
 4 files changed, 869 insertions(+)
 create mode 100644 drivers/bar/foo.c

---

[Months of discussion, iteration, and refinement later...]

Subject: [PATCH v7 0/5] Add foo feature to bar subsystem

Applied, thanks!

Signed-off-by: Maintainer <maintainer@kernel.org>

Prerequisites for this capstone:

  • All previous projects completed
  • Multiple accepted patches (style fixes and bug fixes)
  • Good relationship with at least one subsystem community
  • Deep understanding of your target subsystem
  • Time and patience (this can take 6+ months)

Essential Resources

Books (In Priority Order)

  1. “Linux Kernel Programming” by Kaiwan N. Billimoria - Best modern introduction
  2. “Linux Device Drivers, 3rd Edition” by Corbet, Rubini, Kroah-Hartman - Classic, still relevant
  3. “Understanding the Linux Kernel” by Bovet & Cesati - Deep internals reference
  4. “The Linux Programming Interface” by Michael Kerrisk - Essential for syscall interface
  5. “Learning eBPF” by Liz Rice - Modern tracing and observability

Community

  • Linux Kernel Mailing List (LKML)
  • #kernelnewbies on OFTC IRC
  • linux-kernel subreddit

Summary

| # | Project | Main Language |
| --- | --- | --- |
| 1 | Kernel Interface Explorer (/proc /sys) | C |
| 2 | System Call Tracer (strace clone) | C |
| 3 | Build and Boot Custom Kernel | C + Shell |
| 4 | Hello World Kernel Module | C |
| 5 | Character Device Driver | C |
| 6 | Memory Allocator Visualizer | C |
| 7 | Process Scheduler Analyzer | C |
| 8 | Simple Filesystem (FUSE + Kernel) | C |
| 9 | Network Packet Filter (Netfilter) | C |
| 10 | Kernel Debugger Experience (GDB + QEMU) | C |
| 11 | Block Device Driver | C |
| 12 | USB Device Driver | C |
| 13 | Kernel Tracing with eBPF | C + Python |
| 14 | Kernel Coding Style Cleanup (First Patch) | C |
| 15 | Bug Fix Contribution | C |
| Capstone | New Feature Contribution | C |

Final Notes

Becoming a Linux kernel contributor is a marathon, not a sprint. The kernel community values:

  • Quality over quantity - One well-crafted patch beats ten sloppy ones
  • Persistence - Patches often require multiple revisions
  • Humility - Maintainers know more about their subsystem than you
  • Communication - Clear explanations matter as much as correct code
  • Patience - Review can take weeks or months

The journey is challenging but rewarding. You’ll join a community that has shaped computing for three decades and continues to push the boundaries of what’s possible with software.

Good luck, and happy hacking! 🐧


Sources consulted: