Sprint: Linux and Unix Internals Mastery - Real World Projects
Goal: Build a durable mental model of how Linux and Unix systems move from power-on to user space, from a keystroke to a syscall, and from a file name to a disk block. You will learn the contracts that make the OS predictable: the syscall boundary, the process model, virtual memory, and the VFS layer. You will be able to trace behavior in /proc, reason about why a process blocks, and explain how containers are assembled from namespaces and cgroups. By the end, you can build, debug, and explain core system behaviors instead of guessing.
Introduction
- What is Linux and Unix internals? The concrete rules, data structures, and interfaces that connect firmware, kernel, and user space.
- What problem does it solve today? It turns nondeterministic “system behavior” into explainable cause-and-effect.
- What will you build across the projects? A bootable image, a shell, a syscall tracer, a process inspector, a TTY tool, a page cache lab, a FUSE filesystem, and a mini container runtime.
- In scope: boot flow, syscalls, processes, memory, VFS, signals, terminals, namespaces, cgroups.
- Out of scope: kernel hacking, device drivers, networking internals beyond sockets, and full OS design.
Big-picture mental model (from input to disk and back):
User Input
|
v
Shell -> libc wrapper -> syscall gate
| |
| v
| Kernel boundary
| |
v v
Process model -> Scheduler -> VFS -> Page cache -> Block layer -> Device
^ |
| v
+-------------------- result, errno, data ----------------+
How to Use This Guide
- Read the Theory Primer first to build the mental model for each system layer.
- Pick a learning path that matches your background and time constraints.
- After each project, verify results against the Definition of Done and compare behavior with expected CLI output.
Prerequisites & Background Knowledge
Essential Prerequisites (Must Have)
- C basics: pointers, structs, memory allocation, and string handling.
- Shell fluency: pipes, redirection, job control, and manpages.
- Basic process vocabulary: PID, file descriptor, exit code.
- Recommended Reading: “The Linux Programming Interface” by Michael Kerrisk - Ch. 1-6.
Helpful But Not Required
- Assembly basics (you will encounter this in Project 1).
- Filesystem basics (blocks, inodes, directories) - learned in Projects 4 and 8.
- Containers overview - learned in Projects 9 and 10.
Self-Assessment Questions
- What is the difference between a file descriptor and a pathname?
- Why can a child process not change the parent working directory?
- What does it mean for a signal to be pending?
Development Environment Setup
Required Tools:
- Linux host or VM (x86_64).
- C compiler (gcc 10+ or clang 12+).
- Build tools: make and ld.
- Tracing tools: strace, ltrace.
- Debugger: gdb.
Recommended Tools:
- QEMU (for boot project).
- perf or bpftrace (for performance inspection).
- tmux (for multi-pane tracing sessions).
Testing Your Setup:
$ uname -r
6.x.y
$ strace -c true
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 0 1 exit_group
Time Investment
- Simple projects: 4-8 hours each
- Moderate projects: 10-20 hours each
- Complex projects: 20-40 hours each
- Total sprint: 2-4 months
Important Reality Check You will hit silent failures: buffered output, missing permissions, and races. Treat them as signals to refine your mental model, not as personal failure.
Big Picture / Mental Model
Linux and Unix internals are best viewed as layered contracts. Each layer hides complexity behind a narrow interface. When a program misbehaves, it is usually violating a contract or making a wrong assumption about a layer boundary.
+-----------------------------------------------------+
| User Space |
| - Shell, apps, libraries |
+---------------------------+-------------------------+
|
v
+---------------------------+-------------------------+
| Syscall Interface (ABI) |
| - numbers, args, return values, errno |
+---------------------------+-------------------------+
|
v
+---------------------------+-------------------------+
| Kernel Core |
| - Process model, scheduler, memory, VFS |
+---------------------------+-------------------------+
|
v
+---------------------------+-------------------------+
| Devices and Firmware |
| - block I/O, terminals, timers, boot firmware |
+-----------------------------------------------------+
Theory Primer
Chapter 1: Boot and Early Userspace
Fundamentals The boot path is a chain of responsibility from firmware to kernel to user space. Firmware (BIOS or UEFI) initializes hardware and loads a bootloader, which then loads the kernel image and optional initramfs. The kernel is not the first code executed, but it is the first general-purpose code that owns the machine. Early kernel work includes setting up memory management, drivers, and the scheduler, then mounting a root filesystem. The transition to user space happens when the kernel launches PID 1, which is responsible for starting the rest of the system. Understanding this sequence matters because failures before PID 1 require a different debugging approach than failures after user space starts.
Deep Dive Boot is a pipeline with strict handoff points and explicit contracts. Firmware performs basic hardware initialization and defines the starting environment for the bootloader. In UEFI systems, the firmware provides standardized boot services and runtime services, along with tables that describe hardware and boot options. In legacy BIOS flows, the firmware loads a boot sector from disk into memory and jumps to it. Regardless of the path, the bootloader is the piece that understands how to locate and load the kernel image, often along with an initramfs that provides temporary root filesystem contents for early user space.
The Linux kernel expects a defined boot protocol on each architecture. On x86, the boot protocol defines where the kernel image is loaded, how the command line is passed, and where the bootloader communicates memory layout, video info, and other parameters. This is crucial: the kernel cannot guess. If the bootloader fails to pass correct metadata, the kernel will misinterpret memory or fail to mount the root filesystem. After decompression and early setup, the kernel configures memory zones, initializes the interrupt descriptor table, sets up the scheduler, and probes drivers needed to access the root filesystem.
The initramfs is a compressed archive that the kernel unpacks into a temporary root. Its job is to provide minimal tools and drivers to locate the real root filesystem. The initramfs init process runs as PID 1 in early user space; it can load modules, assemble storage (for example, RAID or LUKS), and then switch to the final root using a pivot operation. Once the real root is ready, the kernel (or the initramfs init) performs a root switch and executes the real PID 1, which is typically systemd or another init system. From that point onward, the system behaves like the environment most users are familiar with: services start, loggers run, and the shell becomes available.
This chain is fragile because each step depends on the previous one. If the kernel cannot mount the root filesystem, it panics because it has no place to execute user space from. If PID 1 exits, the kernel also panics because there is no supervisor left. Debugging early boot therefore relies on kernel logs, boot parameters, and visibility into the initramfs environment. You can often diagnose issues by inspecting the kernel command line, verifying that required drivers are built-in or present in initramfs, and using emergency shell hooks in the initramfs.
A key insight is that boot is not a monolithic black box. Each stage is small, testable, and replaceable. You can run a tiny bootloader in QEMU, print directly to video memory, and halt the CPU to verify that the firmware-to-bootloader handoff works. You can alter kernel command line arguments to change root mounts and debug options. You can break the initramfs on purpose to practice recovery. This conceptual clarity is what allows systems engineers to debug “machine will not boot” with confidence instead of guesswork.
How this fits into the projects Projects 1 and 10 rely on understanding early boot, PID 1 expectations, and how user space is launched.
Definitions and key terms
- Firmware: Low-level code that initializes hardware and starts the boot chain.
- Bootloader: Program that loads the kernel and passes metadata to it.
- initramfs: Temporary root filesystem used for early user space.
- PID 1: The first user space process; it must stay alive.
- Kernel command line: Boot parameters passed to the kernel.
Mental model diagram
Power on
|
v
Firmware (BIOS/UEFI)
|
v
Bootloader
|
v
Kernel + initramfs
|
v
Early user space (initramfs init)
|
v
Real root mounted
|
v
PID 1 (init/systemd)
How it works
- Firmware initializes hardware and finds a boot target.
- Bootloader loads kernel image and initramfs, passes boot params.
- Kernel initializes CPU, memory, interrupts, and essential drivers.
- Kernel mounts temporary root and starts early user space.
- Early user space mounts real root and hands off to PID 1.
Minimal concrete example
Boot log excerpt (conceptual):
[boot] firmware OK
[boot] bootloader loads kernel + initramfs
[boot] kernel cmdline: root=/dev/sda2 ro
[boot] initramfs: loading storage drivers
[boot] switch_root -> /sbin/init
[boot] PID 1 started
Common misconceptions
- “The kernel is the first code that runs.” It is not; firmware and bootloader run first.
- “initramfs is optional.” It is optional for some systems, mandatory for many modern ones.
- “PID 1 is just another process.” It has special responsibilities and failure semantics.
Check-your-understanding questions
- Why does the kernel panic if PID 1 exits?
- What information does the bootloader provide to the kernel?
- Why can a system boot without an initramfs in some cases but not others?
Check-your-understanding answers
- The kernel has no user space supervisor left; it cannot proceed safely.
- Memory layout, boot parameters, and location of command line or initramfs.
- If all necessary drivers are built into the kernel and root is simple, initramfs may be unnecessary.
Real-world applications
- Debugging boot failures in cloud images or embedded devices.
- Building minimal appliances with a custom initramfs.
- Understanding secure boot flows.
Where you’ll apply it
- Project 1: Do-Nothing Bootloader and Kernel
- Project 10: Container Runtime (PID 1 behavior)
References
- Linux x86 boot protocol: https://docs.kernel.org/arch/x86/boot.html
- UEFI specification overview: https://uefi.org/specs/UEFI/2.10/
- “How Linux Works” by Brian Ward - Ch. 1-3
Key insights Boot is a chain of contracts, not a magic moment.
Summary If you can describe the handoff from firmware to bootloader to kernel to PID 1, you can debug the hardest “it does not boot” problems.
Homework/Exercises to practice the concept
- Draw your machine’s boot chain from firmware to login prompt.
- Identify which step owns the kernel command line and where it is stored.
Solutions to the homework/exercises
- Firmware -> bootloader -> kernel + initramfs -> PID 1 -> services -> login.
- Bootloader writes it into the protocol-defined field and points the kernel to it.
Chapter 2: System Call Interface and ABI
Fundamentals The system call interface is the only legal bridge from user space to kernel space. It is defined by the ABI: which registers hold arguments, how the syscall number is specified, and how return values and errors are communicated. Libraries like libc provide wrappers so most programs do not issue raw syscalls. This boundary exists to enforce privilege separation and provide a stable contract across kernel versions. Understanding syscalls lets you map user-level behavior to kernel actions, which is the basis of tracing, debugging, and performance diagnosis.
Deep Dive A system call is a controlled privilege transition. When user space needs a service that only the kernel can provide, it triggers an architecture-specific instruction that causes a switch to kernel mode. The CPU changes privilege level, switches stacks, and jumps to a well-defined entry point. The kernel then reads the syscall number and arguments according to the ABI. The ABI is not optional: it defines the binary contract between compiled user programs and the kernel. On Linux, this is documented by kernel and libc interfaces, and the details vary by architecture. The important takeaway is that the ABI makes syscalls a stable, testable interface even as kernel internals evolve.
Most programs do not call syscalls directly. The C library provides wrappers that validate arguments, set up registers, and handle error conventions. When a syscall fails, the kernel typically returns a negative error code, which libc converts into a -1 return value with errno set. This design allows error handling to be uniform across functions. It also allows the kernel to add new syscalls without forcing every user program to know the register-level details.
The syscall boundary is visible in tools like strace, which show each syscall and its result. This makes it possible to answer questions like “why does this program block” or “what file does it open.” It also reveals hidden costs: a simple user command can produce dozens of syscalls due to library initialization, dynamic linking, and filesystem checks. That is why tracing is such a powerful diagnostic tool. You are not guessing what the program does; you are observing the kernel-level truth.
The ABI also connects to binary formats and the loader. When you call exec to start a program, the kernel inspects the executable format (ELF on Linux). It maps segments into memory, sets up the initial stack with arguments and environment variables, and then begins execution at the entry point. The ABI specifies calling conventions, data layout, and how shared libraries are resolved. This is why a binary compiled for one architecture will not run on another: the ABI assumptions are different.
Syscalls are intentionally narrow. They do not expose kernel data structures directly, and they are designed to maintain a stable contract over time. When higher-level features are needed, they are built in user space by composing syscalls. For example, shells implement pipelines by combining fork, pipe, dup, and exec. Filesystems in user space (FUSE) are built by exposing a file operation API through a device interface. Containers are built by composing namespace and cgroup syscalls. This composability is the Unix design philosophy in action.
A common failure mode is confusing library calls with syscalls. Not every function that touches the OS is a syscall. Some functions are pure library code, and others combine multiple syscalls internally. When debugging, always trace what the kernel sees instead of assuming a direct mapping from a function name to a single syscall.
How this fits into the projects Projects 2, 3, 4, 8, and 10 depend on understanding the syscall boundary and ABI conventions.
Definitions and key terms
- Syscall: A privileged transition to the kernel to request a service.
- ABI: Application Binary Interface; binary-level contract between user space and kernel.
- libc: Standard C library providing syscall wrappers and utilities.
- errno: Thread-local error indicator set by libc when a syscall fails.
Mental model diagram
User program
|
v
libc wrapper
|
v
syscall gate (CPU switches to kernel)
|
v
kernel service
|
v
return value + errno
How it works
- User code calls a libc wrapper.
- Wrapper loads syscall number and arguments.
- CPU enters kernel mode and jumps to syscall handler.
- Kernel validates, executes, and returns a result.
- libc converts result into return value and errno.
Minimal concrete example
Syscall trace excerpt (conceptual):
openat("/etc/hostname") -> fd=3
read(fd=3, bytes=64) -> 12 bytes
close(fd=3) -> 0
Common misconceptions
- “Every C function is a syscall.” Most are not.
- “Syscalls are slow because of their names.” The overhead is the mode switch and validation.
- “ABI changes break all programs.” The ABI is designed to be stable across kernel versions.
Check-your-understanding questions
- Why do syscalls have to cross a privilege boundary?
- How does libc report syscall errors to user code?
- Why can tracing reveal performance bottlenecks?
Check-your-understanding answers
- The boundary enforces protection and prevents user code from touching hardware directly.
- It returns -1 and sets errno based on the kernel error code.
- Tracing shows exact kernel operations and their frequency.
Real-world applications
- Diagnosing file access failures and permission errors.
- Understanding why a program blocks or consumes CPU.
- Building sandboxes by limiting syscalls.
Where you’ll apply it
- Project 2: Syscall Tracer
- Project 3: Build Your Own Shell
- Project 4: Filesystem Explorer
- Project 8: FUSE Mirror Filesystem
- Project 10: Container Runtime
References
- syscall(2) man page: https://man7.org/linux/man-pages/man2/syscall.2.html
- syscalls(2) overview: https://man7.org/linux/man-pages/man2/syscalls.2.html
- “The Linux Programming Interface” by Michael Kerrisk - Ch. 3-6
Key insights Syscalls are the OS contract you can observe and trust.
Summary If you can trace syscalls, you can explain system behavior precisely.
Homework/Exercises to practice the concept
- Use a syscall tracer on a simple command and count unique syscalls.
- Identify which syscalls represent file system work and which represent process control.
Solutions to the homework/exercises
- You should see a small set of file and process syscalls repeated.
- File syscalls include open, read, write; process syscalls include fork and exec.
Chapter 3: Process Model, Scheduling, and Exec
Fundamentals A process is not just “a running program”; it is a kernel data structure that holds memory mappings, open files, credentials, and scheduling state. The kernel schedules runnable threads onto CPUs based on policy and priority. Process creation usually follows a fork-exec pattern: duplicate the process state, then replace the child with a new program image. Understanding process states and scheduling helps you explain why a process is running, sleeping, or blocked, and why it may appear unresponsive even when it is not dead.
Deep Dive The kernel represents each process (and thread) using internal structures that track identity, resources, and execution state. These structures include the process ID, parent/child relationships, open file descriptors, signal handlers, and memory mappings. When you call fork, the kernel creates a new task that initially shares many resources with the parent; modern kernels use copy-on-write so that memory is not duplicated until it is modified. This is why fork is relatively cheap and why it enables efficient process creation in shells and servers.
The exec family of syscalls replaces the current process image with a new program. The kernel loads the executable (usually ELF), maps its segments into memory, sets up the stack with arguments and environment variables, and jumps to the program entry point. At this moment, the process retains its PID and some attributes (like open file descriptors that are not marked close-on-exec) but its code, data, and stack are replaced. This separation is powerful: fork preserves process identity and relationships, while exec changes the program that runs within that identity.
Scheduling is the mechanism that decides which thread runs on which CPU. Linux uses a combination of policies. The default policy (SCHED_OTHER) is a time-sharing scheduler that aims to balance responsiveness and throughput. Real-time policies (SCHED_FIFO, SCHED_RR) provide deterministic scheduling at the cost of fairness. The scheduler tracks runnable tasks, applies priorities, and preempts running threads when higher priority work arrives. Process states explain why a task is not running: it may be runnable, sleeping in an interruptible wait, blocked in uninterruptible I/O, stopped by a signal, or a zombie waiting for its parent to reap it.
The process hierarchy matters. When a parent does not wait for a child, the child becomes a zombie after exit because the kernel must retain its exit status until the parent collects it. This is why shells and service managers must call wait. Orphaned children are reparented to PID 1, which reaps them. If PID 1 fails to reap, the system accumulates zombies. These are not active processes, but they still occupy kernel table entries. Understanding this behavior makes debugging “mysterious PIDs” much easier.
Scheduling and blocking are also closely tied to I/O and synchronization. A process waiting for disk I/O may appear “D” state (uninterruptible sleep) because the kernel cannot safely interrupt that wait. A process waiting for a user input event is usually in interruptible sleep and will respond to signals. Understanding these differences is critical when diagnosing unresponsive processes and when interpreting tools like top or ps.
Finally, the process model is the foundation for isolation. Namespaces and cgroups (discussed later) operate by changing the process view of system resources or limiting their usage. But they still rely on the same core process structures and scheduler decisions. That is why a solid process model is necessary before you can understand containers.
How this fits into the projects Projects 3, 5, 9, and 10 rely directly on the fork-exec model, process states, and scheduling behavior.
Definitions and key terms
- Task: A schedulable entity (process or thread) in the kernel.
- Zombie: A terminated process waiting to be reaped.
- Fork: Create a new process by copying the parent.
- Exec: Replace the current process image with a new program.
- Scheduling policy: Rules for CPU time allocation.
Mental model diagram
Parent process
|
| fork
v
Child process (same PID tree, new task)
|
| exec
v
New program image
How it works
- Parent calls fork; kernel creates a new task with shared memory pages.
- Child optionally modifies state, then calls exec.
- Kernel loads ELF, maps segments, sets stack, and jumps to entry point.
- Scheduler chooses runnable tasks based on policy and priority.
Minimal concrete example
Process state trace (conceptual):
PID 1042 R (running)
PID 1050 S (sleeping, waiting for input)
PID 1055 Z (zombie, waiting for parent)
Common misconceptions
- “Fork copies all memory immediately.” It uses copy-on-write.
- “A zombie uses CPU.” It does not; it only occupies a table entry.
- “Exec creates a new PID.” The PID stays the same; the program changes.
Check-your-understanding questions
- Why is fork cheap on modern kernels?
- What causes a process to become a zombie?
- Why do shells need to call wait?
Check-your-understanding answers
- Copy-on-write defers memory duplication until a write occurs.
- The process exits but the parent does not reap it.
- To collect exit status and release kernel resources.
Real-world applications
- Building shells, service managers, and job control.
- Diagnosing unresponsive processes and CPU starvation.
- Understanding how containers manage process trees.
Where you’ll apply it
- Project 3: Build Your Own Shell
- Project 5: Process Psychic
- Project 9: Cgroup Resource Governor
- Project 10: Container Runtime
References
- sched(7) man page: https://man7.org/linux/man-pages/man7/sched.7.html
- procfs process state fields: https://docs.kernel.org/filesystems/proc.html
- “Operating Systems: Three Easy Pieces” - Virtualization and CPU scheduling chapters
Key insights Processes are kernel data structures first, running programs second.
Summary Understanding fork, exec, and scheduling turns “why is it stuck” into a solvable question.
Homework/Exercises to practice the concept
- Observe process states in /proc/[pid]/stat for a running process.
- Use ps to find zombie processes and explain their parent relationship.
Solutions to the homework/exercises
- The state character indicates running, sleeping, or blocked.
- Zombies show as Z; their parent must call wait to remove them.
Chapter 4: Virtual Memory and Page Cache
Fundamentals Virtual memory gives each process the illusion of a contiguous address space. The kernel uses page tables to map virtual addresses to physical frames, enabling protection and isolation. Page faults occur when a mapping is missing or invalid; the kernel resolves them by allocating memory or loading data from disk. The page cache sits between the VFS and storage, storing recently accessed file data in memory. This is why repeated file reads become fast and why I/O performance can appear unpredictable without understanding caching.
Deep Dive Virtual memory is the contract that separates a process’s view of memory from physical RAM. Each process sees a private address space that includes code, data, heap, stack, and mapped files. The kernel maintains page tables that translate virtual addresses to physical frames. When a process accesses a virtual address, the CPU checks the page tables. If the entry is missing or marked invalid, a page fault occurs and the kernel takes over.
Page faults are not always errors. A minor fault happens when the page is already in memory but the page table needs to be updated. A major fault happens when the data must be fetched from disk, which is much slower. This distinction matters when diagnosing performance: frequent major faults indicate I/O bottlenecks, while minor faults are usually acceptable. Tools that show faults can reveal whether memory pressure or slow storage is the root cause.
Memory mappings are created by both the kernel and user programs. When a program starts, its executable segments are mapped into memory. Shared libraries are mapped on demand. When you use memory-mapped files, the file contents are mapped into the process address space and the kernel treats file I/O as page faults. This is powerful because it allows the same data to be accessed via both file I/O and memory access semantics. It also means that the page cache becomes the central buffer between storage and memory access.
The page cache stores file-backed pages in memory. When you read a file, the kernel often reads entire pages into the cache and then satisfies your read from memory. Subsequent reads can be served from cache with no disk I/O. Writes are also buffered: data is written into cached pages and later flushed to disk. This improves performance but introduces complexity: a successful write does not always mean data is on disk. The kernel may delay the actual writeback, and a crash can lose buffered data unless an explicit sync occurs. Understanding this behavior explains why benchmarks can be misleading and why databases rely on fsync.
Memory pressure triggers eviction. The kernel maintains lists of active and inactive pages and chooses victims based on access patterns. File-backed pages can be dropped if the data can be reloaded from disk; anonymous pages must be swapped or reclaimed by killing processes. When memory is scarce, the kernel may swap to disk or invoke the OOM killer. This is where performance collapses if the system starts thrashing. Recognizing the symptoms of thrashing and understanding page cache effects allows you to interpret I/O latencies correctly.
Virtual memory also enforces protection. Each page has permissions (read, write, execute). Attempting to write to a read-only page or execute data will cause a fault, which the kernel converts into a signal. This is the basis of memory safety features like W^X and stack guards. The memory system is therefore both a performance layer and a security boundary.
How this fits into the projects Projects 7 and 8 rely on virtual memory behavior, page faults, and page cache visibility.
Definitions and key terms
- Page: Fixed-size unit of memory mapping.
- Page fault: Trap when a page is missing or not permitted.
- Page cache: Kernel cache of file-backed pages.
- mmap: Map a file into a process address space.
- OOM killer: Kernel mechanism to reclaim memory by killing processes.
Mental model diagram
Virtual Address -> Page Table -> Physical Frame
| |
| v
| Page cache (file-backed)
v
Process view of memory
How it works
- Process accesses a virtual address.
- CPU checks page tables.
- If missing, kernel handles fault.
- Kernel loads or allocates a page.
- Access resumes with mapping in place.
Minimal concrete example
Memory access trace (conceptual):
read file -> page fault -> page cached
read same file -> no fault, served from cache
Common misconceptions
- “More RAM always means faster I/O.” It depends on cache behavior.
- “A write is durable after write returns.” It may be buffered.
- “A page fault is always bad.” Minor faults are normal.
Check-your-understanding questions
- What is the difference between a major and minor page fault?
- Why can a read become fast after the first access?
- What causes the OOM killer to run?
Check-your-understanding answers
- Major faults require disk I/O; minor faults only update page tables.
- The data is now in the page cache.
- The kernel cannot reclaim enough memory and must free it by killing a process.
Real-world applications
- Explaining why database benchmarks change after a warm-up.
- Diagnosing slow applications caused by swapping or cache misses.
- Designing memory-mapped file workflows.
Where you’ll apply it
- Project 7: Page Cache and mmap Lab
- Project 8: FUSE Mirror Filesystem (page cache effects)
References
- “Operating Systems: Three Easy Pieces” - Memory virtualization chapters
- “The Linux Programming Interface” - Memory mapping chapter
Key insights Virtual memory is both performance infrastructure and a protection boundary.
Summary Once you understand page faults and page cache behavior, I/O latency becomes predictable.
Homework/Exercises to practice the concept
- Measure file read times before and after cache warm-up.
- Observe page fault counters for a simple program.
Solutions to the homework/exercises
- The second read should be much faster due to cache.
- Minor faults rise with memory mapping; major faults rise with disk access.
Chapter 5: VFS, Inodes, and Virtual Filesystems
Fundamentals The Virtual File System (VFS) provides a uniform interface for files regardless of the underlying filesystem. It translates pathnames into inodes and dispatches operations to the correct filesystem driver. A directory maps names to inode numbers; the inode stores metadata and pointers to file data. File descriptors are references to open file objects, not pathnames. Virtual filesystems like procfs expose kernel data structures as files, and user-space filesystems like FUSE plug into the same VFS interface.
Deep Dive When a program opens a file, the kernel does not immediately read file data. It first resolves the pathname. Path resolution walks directories, consulting dentries (directory entry caches) and inodes to locate the file. Each component in the path is looked up in the VFS cache or read from disk if missing. This is why long paths can cause multiple metadata reads. The VFS layer abstracts these operations so that the same open, read, and write syscalls work on ext4, xfs, procfs, and more.
Once a file is opened, the kernel creates an open file description that contains the current file offset and flags. A file descriptor is just an integer reference to this open file description. Two file descriptors can refer to the same open file description (for example, after a dup), sharing offsets and flags. This is why redirection works: the shell duplicates a file descriptor onto standard output, and the program writes to a file instead of the terminal without knowing the difference.
Inodes represent the identity of a file: ownership, permissions, size, timestamps, and pointers to file data. Directory entries map names to inode numbers. This separation explains why hard links can exist: multiple directory entries can point to the same inode. Deleting a file removes a directory entry, not the inode itself. The inode is removed only when the link count drops to zero and no process has the file open. This is why a file can be deleted but still consume disk space until the last handle closes.
The VFS layer defines a set of operations that filesystems must implement. These include lookup, read, write, and attribute queries. The kernel calls these operations through function pointers in the filesystem implementation. This design allows the same syscall to work across many filesystems, which is why you can mount a FUSE filesystem and still use ls, cat, and cp without modification. The VFS also handles permissions checks and caching rules, allowing filesystem developers to focus on storage-specific logic.
Virtual filesystems expose kernel data structures as file-like interfaces. Procfs is the classic example. It does not store data on disk; instead, it generates file contents on demand when read. This makes it a powerful debugging interface, but it also means the data is ephemeral and can change between reads. When you read /proc/[pid]/stat, you are seeing a snapshot of process state at that instant.
FUSE is the user-space counterpart to this design. It provides a device interface that allows user programs to implement filesystem operations in user space. The kernel forwards VFS operations to a user-space daemon, which responds with data or errors. This is slower than in-kernel filesystems but dramatically easier to implement and safer to debug. It also allows experimentation with new filesystem features without writing kernel code.
How this fits into the projects: Projects 4, 5, and 8 depend on VFS concepts, inode identity, and virtual filesystems.
Definitions and key terms
- VFS: Virtual File System layer that abstracts filesystem operations.
- Inode: File identity and metadata structure.
- Dentry: Directory entry mapping names to inodes.
- File descriptor: Integer reference to an open file description.
- procfs: Virtual filesystem exposing kernel data structures.
Mental model diagram
Pathname -> Dentry cache -> Inode -> Filesystem ops
|
v
File data blocks
How it works
- Resolve path components into dentries and inodes.
- Create an open file description with offset and flags.
- Read/write via filesystem operations defined by VFS.
- Cache metadata and data for performance.
Minimal concrete example
Directory entry -> inode number
name: "report.txt" -> inode 192341
inode 192341 -> size, permissions, data blocks
Common misconceptions
- “The filename is stored in the file.” It is stored in the directory entry.
- “A file descriptor is a path.” It is a handle to an open file object.
- “procfs files exist on disk.” They are generated by the kernel.
Check-your-understanding questions
- Why can a deleted file still take disk space?
- What does a hard link share with the original file?
- Why can a process read /proc/self without knowing its PID?
Check-your-understanding answers
- The inode remains as long as there are open file descriptions or link count > 0.
- The inode; different names point to the same file identity.
- The kernel resolves /proc/self to the calling process automatically.
Real-world applications
- Debugging permission errors and unexpected file behavior.
- Building user-space filesystems with custom behavior.
- Understanding how tools like ls, stat, and find work.
Where you’ll apply it
- Project 4: Filesystem Explorer
- Project 5: Process Psychic
- Project 8: FUSE Mirror Filesystem
References
- VFS documentation: https://docs.kernel.org/filesystems/vfs.html
- procfs documentation: https://docs.kernel.org/filesystems/proc.html
- libfuse reference: https://github.com/libfuse/libfuse
Key insights VFS separates file identity, naming, and storage so that everything looks like a file.
Summary If you can explain inodes, dentries, and file descriptors, you can explain almost any file behavior.
Homework/Exercises to practice the concept
- Create a hard link and observe how link counts change.
- Compare /proc/self and /proc/<pid> output.
Solutions to the homework/exercises
- Link count increases; both names point to the same inode.
- They show the same process data, but one is resolved dynamically.
Chapter 6: Signals, IPC, and Terminals
Fundamentals Signals are asynchronous notifications delivered to processes by the kernel. IPC (inter-process communication) allows processes to exchange data or synchronize: pipes, FIFOs, shared memory, and message queues are common forms. Terminals are special device interfaces that convert keystrokes into bytes, sometimes applying line discipline and signal generation. Understanding how signals, IPC, and TTYs interact is essential for shells, terminal UIs, and any program that handles user input directly.
Deep Dive Signals are the kernel’s way of injecting control flow into a process. Each signal has a disposition: default action, ignore, or custom handler. Signals can be blocked, pending, or delivered. When a signal is delivered, the kernel arranges for the process to run a handler in user space. This is not a kernel callback in the classic sense; the kernel simply adjusts the process state so that when it resumes execution, it executes the handler. This explains why signal handlers must be reentrant and why only async-signal-safe operations are allowed.
Signals interact with process groups and terminals. When you press Ctrl+C in a terminal, the terminal driver does not send a literal character to the program by default. Instead, it interprets that key combination and sends SIGINT to the foreground process group. This is why shells can stop or terminate programs and why they must manage process groups carefully. If a shell does not handle SIGINT correctly, it will kill itself along with the child, which is not what the user expects.
IPC mechanisms provide different guarantees and tradeoffs. Pipes are unidirectional byte streams with kernel buffering, ideal for simple producer-consumer patterns. FIFOs extend pipes to unrelated processes through the filesystem namespace. Message queues provide discrete messages with priorities and boundaries. Shared memory is the fastest IPC because it avoids copying, but it requires explicit synchronization (such as semaphores or futexes) to avoid races. Sockets generalize IPC across the network but also work locally as a flexible communication primitive.
Terminals add another layer of complexity. The terminal driver maintains a line discipline, which by default implements canonical mode: input is buffered until a newline, echoing is enabled, and special control characters generate signals. Raw mode disables these transformations so that each keypress is delivered directly. This is how text editors and full-screen terminal UIs work. Raw mode can also disable signal generation, so programs must handle interrupts explicitly. Restoring terminal state on exit is essential to avoid leaving the user in a broken terminal.
The intersection of signals, IPC, and terminals is where many subtle bugs arise. A pipeline of processes involves pipes and process groups. A signal can interrupt a blocking read from a pipe, leading to partial reads or EINTR errors. A terminal resize sends SIGWINCH, which programs must handle to redraw properly. These interactions are not optional details; they are the behavior you must expect and handle if you build shells, TUI programs, or server daemons.
How this fits into the projects: Projects 3 and 6 rely on signal handling, pipes, and raw terminal behavior.
Definitions and key terms
- Signal: Asynchronous notification delivered to a process.
- IPC: Communication or synchronization between processes.
- TTY: Terminal device interface with line discipline.
- Canonical mode: Line-buffered terminal input.
- Raw mode: Direct, byte-by-byte terminal input.
Mental model diagram
Keyboard -> TTY driver -> (signals + bytes) -> process group
|
v
line discipline
How it works
- User input arrives at TTY driver.
- Line discipline may buffer or translate input.
- Special keys trigger signals to the foreground group.
- Processes read bytes or handle signals accordingly.
Minimal concrete example
Terminal behavior (conceptual):
Ctrl+C -> SIGINT to foreground process group
Ctrl+Z -> SIGTSTP (suspend)
Common misconceptions
- “Ctrl+C sends a character to the program.” It sends a signal by default.
- “Signals always interrupt syscalls.” Some are restartable, others are not.
- “Raw mode is just no echo.” It also changes signal and line buffering behavior.
Check-your-understanding questions
- Why must signal handlers be minimal?
- What is the difference between a pipe and shared memory?
- Why does a shell need to manage process groups?
Check-your-understanding answers
- Handlers can run at any point and must avoid unsafe operations.
- Pipes copy data through the kernel; shared memory does not.
- To control which processes receive terminal-generated signals.
Real-world applications
- Building interactive shells and TUI applications.
- Designing IPC for multi-process systems.
- Handling graceful shutdown in servers.
Where you’ll apply it
- Project 3: Build Your Own Shell
- Project 6: Raw Mode Terminal
References
- signal(7) man page: https://man7.org/linux/man-pages/man7/signal.7.html
- termios(3) man page: https://man7.org/linux/man-pages/man3/termios.3.html
- “Advanced Programming in the UNIX Environment” - Signals and terminals chapters
Key insights Signals and TTYs are the control plane of Unix processes.
Summary If you understand signals, IPC, and terminal modes, you can build robust interactive programs.
Homework/Exercises to practice the concept
- Identify which signals your shell sends when you press Ctrl+C or Ctrl+Z.
- Draw a pipe diagram for a three-command pipeline.
Solutions to the homework/exercises
- Ctrl+C sends SIGINT, Ctrl+Z sends SIGTSTP to the foreground group.
- Each pipe connects stdout of one process to stdin of the next.
Chapter 7: Namespaces and Cgroups (Isolation and Resource Control)
Fundamentals Namespaces provide isolated views of system resources, such as PIDs, mounts, and hostnames. Cgroups provide hierarchical resource control, limiting CPU, memory, and I/O usage. Containers are built by combining namespaces with cgroups and a minimal filesystem. Understanding these mechanisms reveals that containers are just processes with constrained views and budgets, not virtual machines.
Deep Dive Namespaces wrap global kernel resources in per-process views. A PID namespace gives a process its own PID tree, where it can see itself as PID 1. A mount namespace provides a separate view of the filesystem mount table. A UTS namespace isolates hostname and domain name. Network namespaces provide separate network stacks. These namespaces are created by system calls that clone or unshare a process into new namespaces, and they can be joined by other processes using setns. The critical point is that namespaces do not duplicate the kernel; they partition the view of kernel data structures.
Cgroups manage resources. A cgroup is a node in a hierarchy that controls a set of processes. Controllers enforce limits, such as CPU time, memory usage, or I/O bandwidth. The cgroup v2 design provides a unified hierarchy and consistent controller semantics. When a process belongs to a cgroup, the kernel accounts for its resource usage and enforces the limits of the group. This is how a container can be restricted to a memory budget and how systemd can manage service resource usage.
Isolation and resource control are complementary. Namespaces prevent a process from seeing or interfering with unrelated system resources. Cgroups prevent it from consuming too much of a shared resource. Both rely on the core process model and scheduler. For example, CPU throttling in cgroups is implemented by scheduling decisions that delay runnable tasks. Memory limits are enforced by reclaiming pages or invoking the OOM killer within the cgroup. This means that isolation is not just about visibility; it is about enforcement.
The container runtime uses these primitives to assemble a container: it creates a new process, moves it into new namespaces, sets up a root filesystem, mounts /proc inside the namespace, and then applies cgroup limits. The process inside the container sees a new PID tree and a new filesystem root. But the kernel is the same; the isolation is a view, not a hardware separation. This is why containers start quickly and are lightweight compared to virtual machines.
Understanding cgroup delegation is important for security. Only privileged processes can create and manage cgroups by default, but delegation allows a parent to hand off a subtree to a less privileged process. The kernel enforces containment rules to prevent escape. This is why container managers have careful logic around cgroup ownership and why some operations require root privileges.
How this fits into the projects: Projects 9 and 10 rely on namespaces, cgroups, and their interaction with process management.
Definitions and key terms
- Namespace: Isolation of a global kernel resource.
- Cgroup: Control group for resource accounting and limits.
- Container: A process with isolated namespaces and limited resources.
- setns/unshare/clone: Syscalls to manipulate namespaces.
Mental model diagram
Host kernel
|
+-- Namespace A (PID, mount, UTS)
| |
| +-- Process tree (PID 1 inside)
|
+-- Namespace B (separate view)
Cgroup tree
/ (root)
/services
/web (cpu, memory limits)
How it works
- Create a new process with namespace flags.
- Set up mounts and root filesystem inside the namespace.
- Move the process into a cgroup with limits.
- Start the target program as PID 1 inside the namespace.
Minimal concrete example
Namespace view (conceptual):
Inside container: PID 1 -> /bin/sh
Outside container: PID 32450 -> /bin/sh
Common misconceptions
- “Containers are lightweight VMs.” They share the host kernel.
- “Namespaces alone are enough.” Resource limits require cgroups.
- “cgroup limits are optional.” Without them, noisy neighbors can starve the host.
Check-your-understanding questions
- What does a PID namespace change about process IDs?
- Why does a container need its own /proc mount?
- How does cgroup v2 enforce CPU limits?
Check-your-understanding answers
- It gives processes a separate PID number space; the first process inside sees itself as PID 1.
- /proc must reflect the container’s PID view, not the host’s.
- It throttles runnable tasks based on configured CPU bandwidth.
Real-world applications
- Building container runtimes and sandboxes.
- Limiting resource usage of services in production.
- Providing isolation for untrusted workloads.
Where you’ll apply it
- Project 9: Cgroup Resource Governor
- Project 10: Container Runtime
References
- namespaces(7) man page: https://man7.org/linux/man-pages/man7/namespaces.7.html
- cgroups(7) man page: https://man7.org/linux/man-pages/man7/cgroups.7.html
- cgroup v2 documentation: https://docs.kernel.org/admin-guide/cgroup-v2.html
Key insights Containers are processes with constrained views and budgets.
Summary Namespaces and cgroups explain containers without magic: isolation plus limits.
Homework/Exercises to practice the concept
- List the namespace types available on your system.
- Identify which cgroup controllers are enabled.
Solutions to the homework/exercises
- Use ls /proc/self/ns and compare entries.
- Inspect the cgroup filesystem for controller files at the root.
Glossary
- ABI: Binary contract between compiled code and the kernel or CPU.
- Cgroup: Kernel mechanism to group and limit resource usage.
- Dentry: Directory entry mapping names to inode numbers.
- ELF: Executable and Linkable Format for binaries.
- Inode: File identity and metadata structure.
- Namespace: Kernel feature that isolates a global resource view.
- PID 1: The first user space process, responsible for system init.
- Procfs: Virtual filesystem exposing kernel data structures.
- Syscall: Controlled transition to the kernel for a service.
- TTY: Terminal device interface with line discipline.
Why Linux and Unix Internals Matter
- Modern motivation and real-world use cases: cloud servers, containers, and embedded devices depend on predictable kernel behavior.
- Real-world statistics and impact:
- Linux is used by 59.8% of websites whose operating system is known (W3Techs, Jan 2, 2026).
- Unix-like systems overall account for 90.7% of known website OS usage (W3Techs, Jan 3, 2026).
- Context and evolution: Unix introduced the process model and file abstraction; Linux scaled it across hardware and cloud.
ASCII comparison (old vs modern operations model):
Then (monolithic servers) Now (containers and clouds)
+-----------------------+ +--------------------------+
| One big OS instance | | Many isolated processes |
| Manual configuration | | Declarative orchestration|
| Few services per host | | Thousands per host |
+-----------------------+ +--------------------------+
Sources:
- https://w3techs.com/technologies/comparison/os-Linux
- https://w3techs.com/technologies/history_overview/operating_system
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Boot and Early Userspace | Boot is a chain of contracts from firmware to PID 1, not a single event. |
| System Call Interface and ABI | Syscalls are the only legal bridge and the ABI defines the binary contract. |
| Process Model, Scheduling, and Exec | Processes are kernel records; fork/exec and scheduling explain states. |
| Virtual Memory and Page Cache | Address spaces are contracts; page cache explains I/O behavior. |
| VFS, Inodes, and Virtual Filesystems | Names map to inodes; VFS unifies filesystems and procfs. |
| Signals, IPC, and Terminals | Signals are async control; IPC and TTYs drive interaction. |
| Namespaces and Cgroups | Isolation is a view plus limits, not a VM. |
Project-to-Concept Map
| Project | Concepts Applied |
|---|---|
| Project 1 | Boot and Early Userspace |
| Project 2 | System Call Interface and ABI; Process Model |
| Project 3 | System Call Interface and ABI; Signals, IPC, and Terminals |
| Project 4 | VFS, Inodes, and Virtual Filesystems |
| Project 5 | Process Model, Scheduling, and Exec; VFS |
| Project 6 | Signals, IPC, and Terminals |
| Project 7 | Virtual Memory and Page Cache |
| Project 8 | VFS, Inodes, and Virtual Filesystems; System Calls |
| Project 9 | Namespaces and Cgroups; Process Model |
| Project 10 | Namespaces and Cgroups; Process Model; Boot |
Deep Dive Reading by Concept
| Concept | Book and Chapter | Why This Matters |
|---|---|---|
| Boot and Early Userspace | “How Linux Works” by Brian Ward - Ch. 1-3 | Clear, practical boot narrative. |
| System Call Interface and ABI | “The Linux Programming Interface” - Ch. 1-6 | Syscall boundary and error model. |
| Process Model, Scheduling, and Exec | “Operating Systems: Three Easy Pieces” - CPU chapters | Scheduling and process model fundamentals. |
| Virtual Memory and Page Cache | “Operating Systems: Three Easy Pieces” - Memory chapters | Page faults and caching behavior. |
| VFS, Inodes, and Virtual Filesystems | “Linux Kernel Development” - VFS chapter | Kernel-level view of VFS design. |
| Signals, IPC, and Terminals | “Advanced Programming in the UNIX Environment” - Signals and terminals | Correct control-flow mental model. |
| Namespaces and Cgroups | “The Linux Programming Interface” - namespaces/cgroups sections | Practical API usage for isolation. |
Quick Start
Day 1:
- Read Theory Primer chapters 1 and 2.
- Start Project 3 and get a simple prompt running.
Day 2:
- Validate Project 3 against the Definition of Done.
- Read Chapter 6 and practice signal handling scenarios.
Recommended Learning Paths
Path 1: The Systems Newcomer
- Project 3 -> Project 4 -> Project 5 -> Project 6 -> Project 7 -> Project 8 -> Project 1 -> Project 9 -> Project 10
Path 2: The Linux Power User
- Project 2 -> Project 3 -> Project 5 -> Project 7 -> Project 8 -> Project 9 -> Project 10
Path 3: The Container-Focused Engineer
- Project 5 -> Project 9 -> Project 10 -> Project 8 -> Project 2
Success Metrics
- You can trace a CLI command to its syscalls and explain each one.
- You can explain why a process is sleeping or blocked using /proc output.
- You can build a container-like process with isolated PID and mount namespaces.
Project Overview Table
| # | Project | Core Topics | Difficulty |
|---|---|---|---|
| 1 | Do-Nothing Bootloader and Kernel | Boot chain, firmware, early init | Advanced |
| 2 | Syscall Tracer | Syscalls, ptrace, process state | Advanced |
| 3 | Build Your Own Shell | fork/exec, pipes, signals | Intermediate |
| 4 | Filesystem Explorer | VFS, inodes, directory traversal | Intermediate |
| 5 | Process Psychic | /proc, process states, CPU usage | Intermediate |
| 6 | Raw Mode Terminal | termios, TTY, signals | Advanced |
| 7 | Page Cache and mmap Lab | page faults, cache, memory | Intermediate |
| 8 | FUSE Mirror Filesystem | VFS callbacks, user space FS | Advanced |
| 9 | Cgroup Resource Governor | cgroup v2, resource limits | Advanced |
| 10 | Container Runtime | namespaces, pivot_root, PID 1 | Expert |
Project List
The following projects guide you from basic syscall tracing to container-level isolation.
Project 1: The Do-Nothing Bootloader and Kernel
- File: P01-do-nothing-bootloader-kernel.md
- Main Programming Language: Assembly and C
- Alternative Programming Languages: Rust, Zig
- Coolness Level: See REFERENCE.md (Level 4)
- Business Potential: See REFERENCE.md (Level 1)
- Difficulty: See REFERENCE.md (Level 3)
- Knowledge Area: Boot Process / Low Level
- Software or Tool: QEMU, assembler, linker
- Main Book: “How Linux Works” by Brian Ward
What you will build: A bootable image that prints a message directly to screen memory and halts.
Why it teaches Linux internals: It reveals the exact handoff between firmware, bootloader, and kernel and makes the boot protocol tangible.
Core challenges you will face:
- Boot sector constraints -> Boot protocol understanding
- Real mode limitations -> CPU mode awareness
- Binary layout -> Linker and memory placement
Real World Outcome
You will build an image os-image.bin that boots in QEMU and displays a message.
For CLI projects - exact output:
$ qemu-system-x86_64 -drive format=raw,file=os-image.bin
# A QEMU window opens with a black screen.
# Text appears:
HELLO FROM BARE METAL
# CPU halts; nothing else happens.
The Core Question You Are Answering
“What happens before the kernel has any services to rely on?”
This project shows that the boot chain is explicit, small, and testable.
Concepts You Must Understand First
- Firmware and Bootloader Handoff
- What data does the bootloader pass to the kernel?
- Book Reference: “How Linux Works” - Ch. 1
- Real Mode vs Protected Mode
- What limitations exist before protected mode is enabled?
- Book Reference: “Operating Systems: Three Easy Pieces” - Intro
Questions to Guide Your Design
- How will you guarantee the boot signature is correct?
- Where in memory will your message be written?
Thinking Exercise
Trace the First Instruction
Explain, step by step, how the CPU finds your first instruction after power-on.
Questions to answer:
- What address does the firmware jump to?
- Why does a wrong signature prevent boot?
The Interview Questions They Will Ask
- “What is the difference between firmware and a bootloader?”
- “Why does early boot code run in real mode?”
- “What is the purpose of the boot signature?”
- “Why is PID 1 special?”
Hints in Layers
Hint 1: The Image Format The first sector must include a boot signature and code the firmware can execute.
Hint 2: Video Memory Text mode can be written by placing character and color bytes into the VGA text buffer.
Hint 3: Minimal Flow Load -> write message -> halt in an infinite loop.
Hint 4: Verification If your image does not boot, check image size, signature, and start address.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Boot flow | “How Linux Works” | Ch. 1-2 |
| CPU modes | “Operating Systems: Three Easy Pieces” | Intro |
Common Pitfalls and Debugging
Problem 1: “Black screen, no text”
- Why: Wrong load address or missing boot signature.
- Fix: Verify last two bytes are 0x55 0xAA and code starts at correct offset.
- Quick test: Use hexdump to confirm the signature at the end of the sector.
Definition of Done
- Image boots in QEMU reliably
- Message appears exactly as expected
- CPU halts without reboot loop
- Boot sector size and signature are correct
Project 2: The Syscall Tracer (Mini-strace)
- File: P02-syscall-tracer.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: See REFERENCE.md (Level 4)
- Business Potential: See REFERENCE.md (Level 2)
- Difficulty: See REFERENCE.md (Level 3)
- Knowledge Area: Syscalls / Process Control
- Software or Tool: ptrace
- Main Book: “The Linux Programming Interface”
What you will build: A syscall tracer that attaches to a process and logs each syscall with its result.
Why it teaches Linux internals: It forces you to interact with the syscall boundary and process states directly.
Core challenges you will face:
- Process attach/detach -> ptrace semantics
- Syscall entry/exit -> register inspection
- Signal handling -> avoid losing or misordering signals
Real World Outcome
You will run a program and see a live syscall log.
$ ./mytrace /bin/ls
[pid 22101] openat("/etc/ld.so.cache") -> fd=3
[pid 22101] openat("/lib/x86_64-linux-gnu/libc.so.6") -> fd=3
[pid 22101] getdents64(".") -> 7 entries
[pid 22101] write(1, "README.md\n") -> 10
[pid 22101] exit_group(0)
The Core Question You Are Answering
“What does the kernel actually see when a program runs?”
Concepts You Must Understand First
- Syscall ABI
- How are syscall numbers and arguments passed?
- Book Reference: “The Linux Programming Interface” - Ch. 3-4
- ptrace basics
- How does a tracer control a tracee?
- Book Reference: “The Linux Programming Interface” - process control chapters
Questions to Guide Your Design
- How will you distinguish syscall entry from syscall exit?
- How will you format arguments without full type info?
Thinking Exercise
State Machine for Tracing
Draw the tracer loop: attach -> wait -> inspect -> resume -> repeat.
Questions to answer:
- What happens if the tracee exits between waits?
- How do you avoid blocking forever?
The Interview Questions They Will Ask
- “How does strace work under the hood?”
- “What is ptrace used for besides tracing?”
- “Why do tracers need to handle signals carefully?”
- “How do you know a syscall failed?”
Hints in Layers
Hint 1: Start Simple Trace only syscall numbers and return values first.
Hint 2: Entry vs Exit Alternate between entry and exit stops to capture arguments and results.
Hint 3: Errors A negative return value usually maps to errno.
Hint 4: Debugging
Compare your output with strace -f on the same command.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Syscalls | “The Linux Programming Interface” | Ch. 3-6 |
| Process control | “Advanced Programming in the UNIX Environment” | Ch. 8 |
Common Pitfalls and Debugging
Problem 1: “Tracee never starts”
- Why: You attached but did not resume the tracee.
- Fix: Ensure the tracer always issues a resume after each stop.
- Quick test: Compare your tracer control flow with strace -f behavior.
Definition of Done
- Can trace a short command like ls
- Handles process exit without crashing
- Output is deterministic and readable
Project 3: Build Your Own Shell
- File: P03-build-your-own-shell.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: See REFERENCE.md (Level 3)
- Business Potential: See REFERENCE.md (Level 2)
- Difficulty: See REFERENCE.md (Level 2)
- Knowledge Area: Process Management / IPC
- Software or Tool: POSIX APIs
- Main Book: “The Linux Programming Interface”
What you will build: A minimal shell with built-ins, pipelines, and redirection.
Why it teaches Linux internals: It forces you to implement the fork-exec model and manage file descriptors.
Core challenges you will face:
- Parsing input -> tokenization and quoting
- Process control -> fork/exec/wait
- Pipes and redirection -> FD manipulation
Real World Outcome
$ ./myshell
myshell> pwd
/home/user/projects
myshell> ls -l | grep .c
main.c
myshell> echo hello > out.txt
myshell> cat out.txt
hello
myshell> exit
The Core Question You Are Answering
“How does the OS run programs and connect them together?”
Concepts You Must Understand First
- fork and exec
- Why are they separate steps?
- Book Reference: “The Linux Programming Interface” - Ch. 24-27
- File descriptors
- How does redirection work?
- Book Reference: “The Linux Programming Interface” - Ch. 4-5
Questions to Guide Your Design
- How will you handle built-ins like cd and exit?
- How will you parse quoted strings safely?
Thinking Exercise
Pipeline Trace
Trace ls | grep txt using a process diagram.
Questions to answer:
- Which process owns each pipe end?
- When should unused pipe ends be closed?
The Interview Questions They Will Ask
- “Why does cd have to be a built-in?”
- “What happens if you do not wait for children?”
- “How does a pipe connect two processes?”
- “What is a zombie process?”
Hints in Layers
Hint 1: The Loop Read a line -> parse -> decide built-in or external -> execute.
Hint 2: Execution Flow Parent spawns child; child replaces itself with the target program.
Hint 3: Redirection Adjust standard FDs before starting the child program.
Hint 4: Debugging Use your Project 2 tracer to confirm expected syscalls.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Process control | “The Linux Programming Interface” | Ch. 24-27 |
| Signals | “Advanced Programming in the UNIX Environment” | Ch. 10 |
Common Pitfalls and Debugging
Problem 1: “Shell exits on Ctrl+C”
- Why: The shell and child are in the same process group.
- Fix: Put children in their own group and forward signals.
- Quick test: Run a sleep command and press Ctrl+C; shell should survive.
Definition of Done
- Built-ins (cd, exit) work
- Pipes and redirection work
- No zombie processes remain
Project 4: Filesystem Explorer (ls -R clone)
- File: P04-filesystem-explorer.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: See REFERENCE.md (Level 2)
- Business Potential: See REFERENCE.md (Level 1)
- Difficulty: See REFERENCE.md (Level 2)
- Knowledge Area: Filesystem / Inodes
- Software or Tool: POSIX directory APIs
- Main Book: “The Linux Programming Interface”
What you will build: A recursive directory lister that prints permissions, owners, and sizes.
Why it teaches Linux internals: It reveals how directory entries map names to inode metadata.
Core challenges you will face:
- Directory traversal -> recursion and filtering
- Metadata formatting -> permission bits and time
- Symlink handling -> avoid loops
Real World Outcome
$ ./myls -R -l .
./
-rw-r--r-- 1 user staff 512 Jan 02 12:01 main.c
lrwxrwxrwx 1 user staff 10 Jan 02 12:02 latest -> main.c
./subdir
-rw-r--r-- 1 user staff 128 Jan 02 12:03 notes.txt
The Core Question You Are Answering
“Where is the filename actually stored?”
Concepts You Must Understand First
- Inodes vs directory entries
- What is stored in each?
- Book Reference: “The Linux Programming Interface” - Ch. 15-18
- Permission bits
- How do mode bits map to rwx strings?
- Book Reference: “The Linux Programming Interface” - Ch. 15
Questions to Guide Your Design
- How will you detect and avoid symlink loops?
- How will you map uid/gid to names?
Thinking Exercise
Metadata Walk
Pick a file and list which metadata fields are stored in its inode.
Questions to answer:
- Which fields come from the inode?
- Which fields come from the directory entry?
The Interview Questions They Will Ask
- “What is the difference between a hard link and a soft link?”
- “Why is stat separate from readdir?”
- “What is the inode number used for?”
- “Why can ls be slow on network filesystems?”
Hints in Layers
Hint 1: Start with current directory List entries, then add recursion.
Hint 2: Permissions Build the rwx string by checking mode bit flags.
Hint 3: Symlinks Do not follow symlinks by default in recursive mode.
Hint 4: Debugging
Compare your output to ls -lR on the same directory.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| File metadata | “The Linux Programming Interface” | Ch. 15 |
| Directories | “The Linux Programming Interface” | Ch. 18 |
Common Pitfalls and Debugging
Problem 1: “Infinite recursion”
- Why: Following symlinks to parent directories.
- Fix: Skip symlinks or track visited inodes.
- Quick test: Run on a directory containing a symlink to itself.
Definition of Done
- Recursion works and avoids cycles
- Permissions and ownership are correct
- Output matches ls -lR for key fields
- Handles empty directories gracefully
Project 5: Process Psychic (procfs Inspector)
- File: P05-process-psychic.md
- Main Programming Language: C or Python
- Alternative Programming Languages: Go, Rust
- Coolness Level: See REFERENCE.md (Level 3)
- Business Potential: See REFERENCE.md (Level 3)
- Difficulty: See REFERENCE.md (Level 2)
- Knowledge Area: /proc, process states
- Software or Tool: procfs
- Main Book: “The Linux Programming Interface”
What you will build: A ps-like tool that reads /proc to show process state, memory, and CPU time.
Why it teaches Linux internals: It shows how the kernel exposes process state via virtual files.
Core challenges you will face:
- Enumerating processes -> scanning /proc
- Parsing state -> /proc/[pid]/stat
- Handling races -> processes that exit mid-read
Real World Outcome
$ ./myps
PID USER STATE CPU(ms) RSS(KB) CMD
1 root S 12345 4567 /sbin/init
220 user R 120 2048 ./myps
350 user S 234 5120 bash
The Core Question You Are Answering
"How does top know what is running?"
Concepts You Must Understand First
- procfs layout
- What is stored in /proc/[pid]/stat?
- Book Reference: “The Linux Programming Interface” - procfs sections
- Process states
- What do the states R, S, D, and Z mean?
- Book Reference: “Operating Systems: Three Easy Pieces” - CPU scheduling
Questions to Guide Your Design
- How will you handle a PID directory disappearing mid-read?
- How will you convert ticks to milliseconds?
Thinking Exercise
Manual ps
Read /proc/self/stat and map fields to human-readable values.
Questions to answer:
- Which fields represent CPU time?
- Which field is the process state?
The Interview Questions They Will Ask
- "What is /proc and why is it special?"
- "How do you calculate CPU usage from /proc?"
- "What does state D mean?"
- "Why can /proc reads race with process exit?"
Hints in Layers
Hint 1: Filter PIDs
Only directory names that are all digits represent processes.
Hint 2: Parsing
/proc/[pid]/stat is space-separated but the comm field is in parentheses.
Hint 3: Timing
Use the system clock tick size to convert jiffies to ms.
Hint 4: Debugging
Compare a single PID output with ps -o for the same PID.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| procfs | “The Linux Programming Interface” | procfs sections |
| Process model | “Operating Systems: Three Easy Pieces” | CPU chapters |
Common Pitfalls and Debugging
Problem 1: “Parse errors on comm field”
- Why: The command name can contain spaces inside parentheses.
- Fix: Parse until the matching closing parenthesis first.
- Quick test: Compare with processes that have spaces in their names.
Definition of Done
- Lists at least PID, state, RSS, and command
- Handles PID exit without crashing
- Output matches ps for sample PIDs
- Runs without root privileges
Project 6: Raw Mode Terminal (TUI Core)
- File: P06-raw-mode-terminal.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: See REFERENCE.md (Level 3)
- Business Potential: See REFERENCE.md (Level 2)
- Difficulty: See REFERENCE.md (Level 3)
- Knowledge Area: TTY / Termios
- Software or Tool: termios
- Main Book: “The Linux Programming Interface”
What you will build: A raw-mode terminal tool that reads keystrokes byte-by-byte and draws a simple screen.
Why it teaches Linux internals: It forces you to understand terminal line discipline and signals.
Core challenges you will face:
- Terminal modes -> canonical vs raw
- Escape sequences -> cursor control
- Cleanup -> restoring terminal state
Real World Outcome
$ ./rawmode
[screen clears]
RAW MODE ACTIVE - press 'q' to quit
KEY: a (97)
KEY: Ctrl+C (3) - intercepted, not exiting
KEY: ArrowUp -> ESC [ A
The Core Question You Are Answering
“Why do Backspace and Ctrl+C work in a normal terminal?”
Concepts You Must Understand First
- Canonical mode
- Why input is line-buffered by default.
- Book Reference: “The Linux Programming Interface” - terminals chapter
- Signal generation
- How terminals generate SIGINT and SIGTSTP.
- Book Reference: “Advanced Programming in the UNIX Environment” - signals
Questions to Guide Your Design
- How will you ensure the terminal restores on crash?
- How will you decode escape sequences for arrows?
Thinking Exercise
The Stuck Terminal
Predict what happens if echo is disabled and not restored.
Questions to answer:
- What commands restore the terminal?
- Why is cleanup critical?
The Interview Questions They Will Ask
- “What is canonical mode and why does it exist?”
- “How does Ctrl+C become SIGINT?”
- “What is a pseudo-terminal?”
- “Why do TUI programs need to redraw on SIGWINCH?”
Hints in Layers
Hint 1: Save and Restore
Store the original terminal settings and restore them on exit.
Hint 2: Disable Canonical Mode
Turn off line buffering and echo for raw input.
Hint 3: Escape Sequences
Arrow keys start with ESC [; parse multi-byte sequences.
Hint 4: Debugging
If the terminal is broken, use stty sane to recover.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Terminals | “The Linux Programming Interface” | Ch. 62 |
| Signals | “Advanced Programming in the UNIX Environment” | Ch. 10 |
Common Pitfalls and Debugging
Problem 1: “Terminal stays broken after exit”
- Why: Raw mode not restored.
- Fix: Ensure cleanup runs on normal exit and on signals.
- Quick test: Use stty -a before and after.
Definition of Done
- Raw input works for letters and control keys
- Terminal state is restored on exit
- Arrow keys are decoded correctly
- Screen redraw works after resize
Project 7: Page Cache and mmap Lab
- File: P07-page-cache-mmap-lab.md
- Main Programming Language: C or Python
- Alternative Programming Languages: Rust, Go
- Coolness Level: See REFERENCE.md (Level 2)
- Business Potential: See REFERENCE.md (Level 2)
- Difficulty: See REFERENCE.md (Level 2)
- Knowledge Area: Virtual Memory / Page Cache
- Software or Tool: time, sync, /proc/meminfo
- Main Book: “Operating Systems: Three Easy Pieces”
What you will build: A repeatable experiment that measures cached vs uncached file reads and page faults.
Why it teaches Linux internals: It reveals how the page cache and memory mappings change I/O behavior.
Core challenges you will face:
- Cache warm-up -> repeated measurements
- Page fault accounting -> interpreting counters
- Reproducibility -> controlling system noise
Real World Outcome
$ ./cachelab sample.dat
cold read: 1.82s
warm read: 0.06s
minor faults: +1200
major faults: +45
The Core Question You Are Answering
“Why does the second read of a file feel instant?”
Concepts You Must Understand First
- Page cache behavior
- What is cached and when?
- Book Reference: “Operating Systems: Three Easy Pieces” - memory chapters
- Page faults
- What do major and minor faults mean?
- Book Reference: “The Linux Programming Interface” - memory mapping
Questions to Guide Your Design
- How will you separate cold and warm reads?
- How will you measure faults per run?
Thinking Exercise
Predict the Trend
Predict how read times change after repeated reads.
Questions to answer:
- Why does the first read cost more?
- What happens after a sync?
The Interview Questions They Will Ask
- “What is the page cache?”
- “What is a major page fault?”
- “Why can benchmarks be misleading?”
- “How does mmap change I/O behavior?”
Hints in Layers
Hint 1: Control the Cache
Measure cold and warm reads separately.
Hint 2: Observe Counters
Read /proc/meminfo and fault counters before and after.
Hint 3: Separate Effects
Test normal read vs mmap-based access.
Hint 4: Debugging
Repeat runs and compute averages to reduce noise.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Memory | “Operating Systems: Three Easy Pieces” | Memory chapters |
| mmap | “The Linux Programming Interface” | Memory mapping chapter |
Common Pitfalls and Debugging
Problem 1: “Results vary wildly”
- Why: Background I/O or cache activity.
- Fix: Use a quiet system or repeat measurements and average.
- Quick test: Run the test multiple times and compare variance.
Definition of Done
- Cold and warm read times are measured
- Fault counts are captured and explained
- Results are reproducible across multiple runs
- You can explain the difference in a paragraph
Project 8: The Mirror Filesystem (FUSE)
- File: P08-mirror-filesystem-fuse.md
- Main Programming Language: C or Python
- Alternative Programming Languages: Rust, Go
- Coolness Level: See REFERENCE.md (Level 4)
- Business Potential: See REFERENCE.md (Level 4)
- Difficulty: See REFERENCE.md (Level 3)
- Knowledge Area: VFS / Filesystems
- Software or Tool: libfuse
- Main Book: “Linux Kernel Development”
What you will build: A FUSE filesystem that mirrors a directory while transforming file contents (for example, reversing text).
Why it teaches Linux internals: It exposes the VFS operations the kernel expects from any filesystem.
Core challenges you will face:
- Filesystem callbacks -> open, read, write, getattr
- Permissions -> pass-through metadata
- Concurrency -> multi-threaded requests
Real World Outcome
$ ./mirrorfs real_dir/ mount_point/
$ echo "hello" > mount_point/test.txt
$ cat real_dir/test.txt
olleh
The Core Question You Are Answering
“How does the kernel support many filesystems with one API?”
Concepts You Must Understand First
- VFS operations
- What does the kernel call for open/read/write?
- Book Reference: “Linux Kernel Development” - VFS chapter
- User space filesystem boundary
- How does /dev/fuse mediate requests?
- Book Reference: libfuse documentation
Questions to Guide Your Design
- What metadata should be passed through unchanged?
- How will you handle file permissions and ownership?
Thinking Exercise
Trace a cat
List the VFS operations triggered by cat file.
Questions to answer:
- Which operation supplies file size?
- Which operation returns file contents?
The Interview Questions They Will Ask
- “Why is FUSE slower than in-kernel filesystems?”
- “What is the role of the VFS layer?”
- "What does getattr represent?"
- "How do you handle concurrent writes?"
Hints in Layers
Hint 1: Start with passthrough
Implement a mirror that forwards operations without transformation.
Hint 2: Add transformation
Only modify the data path, not metadata.
Hint 3: Handle errors
Return correct error codes for missing files.
Hint 4: Debugging
Use strace on a cat command to see expected operations.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| VFS | “Linux Kernel Development” | VFS chapter |
| FUSE | libfuse documentation | Reference |
Common Pitfalls and Debugging
Problem 1: “Permissions behave strangely”
- Why: Metadata not passed through correctly.
- Fix: Mirror uid, gid, and mode values from the underlying file.
- Quick test: Compare ls -l on the mount and the real directory.
Definition of Done
- Files can be read and written through the mount
- Transformations apply only to file contents
- Permissions and timestamps are preserved
- Unmount works cleanly
Project 9: Cgroup Resource Governor
- File: P09-cgroup-resource-governor.md
- Main Programming Language: C or Python
- Alternative Programming Languages: Go, Rust
- Coolness Level: See REFERENCE.md (Level 4)
- Business Potential: See REFERENCE.md (Level 4)
- Difficulty: See REFERENCE.md (Level 3)
- Knowledge Area: Resource Control / Cgroups
- Software or Tool: cgroup v2
- Main Book: “The Linux Programming Interface”
What you will build: A tool that launches a program inside a cgroup with CPU and memory limits and reports usage.
Why it teaches Linux internals: It shows how the kernel enforces resource budgets and exposes metrics.
Core challenges you will face:
- Cgroup hierarchy -> creating and cleaning up cgroups
- Controller settings -> CPU and memory limits
- Metrics -> reading usage files
Real World Outcome
$ ./cg-run --cpu=20% --mem=200M ./stress
[info] cgroup created: /sys/fs/cgroup/lab
[info] cpu.max = 20000 100000
[info] memory.max = 209715200
[stats] cpu.usage_usec=512345
[stats] memory.current=154320128
The Core Question You Are Answering
“How does the kernel actually enforce resource limits?”
Concepts You Must Understand First
- cgroup v2 hierarchy
- How are processes attached to a cgroup?
- Book Reference: “The Linux Programming Interface” - resource limits
- Scheduler interaction
- How does CPU throttling work?
- Book Reference: “Operating Systems: Three Easy Pieces” - scheduling
Questions to Guide Your Design
- How will you create and remove cgroup directories safely?
- How will you map friendly limits to cgroup values?
Thinking Exercise
Budget Math
Translate a CPU percentage into a cgroup cpu.max pair.
Questions to answer:
- What period should you choose?
- How does burst behavior appear?
The Interview Questions They Will Ask
- “What is the difference between cgroup v1 and v2?”
- “How does CPU throttling affect latency?”
- “What happens when memory.max is exceeded?”
- “Why are cgroups critical for containers?”
Hints in Layers
Hint 1: Use cgroup v2
Verify the cgroup2 filesystem is mounted.
Hint 2: Attach a Process
Write the PID to cgroup.procs.
Hint 3: Set Limits
Update cpu.max and memory.max before starting workload.
Hint 4: Debugging
Read cpu.stat and memory.current to confirm enforcement.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Resource limits | “The Linux Programming Interface” | Resource control sections |
| Scheduling | “Operating Systems: Three Easy Pieces” | CPU scheduling chapters |
Common Pitfalls and Debugging
Problem 1: “Limits not enforced”
- Why: Controllers not enabled or cgroup v2 not mounted.
- Fix: Verify the unified hierarchy and enable controllers.
- Quick test: Read controller lists in the cgroup root.
Definition of Done
- Can place a process into a cgroup
- CPU and memory limits are enforced
- Metrics are reported accurately
- cgroup cleanup is safe and repeatable
Project 10: The Poor Man’s Docker (Container Runtime)
- File: P10-container-runtime.md
- Main Programming Language: Go or C
- Alternative Programming Languages: Rust, Python
- Coolness Level: See REFERENCE.md (Level 5)
- Business Potential: See REFERENCE.md (Level 4)
- Difficulty: See REFERENCE.md (Level 4)
- Knowledge Area: Namespaces / Cgroups / Mounts
- Software or Tool: Linux namespaces
- Main Book: “The Linux Programming Interface”
What you will build: A minimal container runtime that launches a command in isolated PID and mount namespaces with resource limits.
Why it teaches Linux internals: It ties together process creation, namespaces, cgroups, and filesystem mounts.
Core challenges you will face:
- Namespace setup -> PID, mount, UTS
- Root filesystem -> chroot or pivot_root
- Procfs -> remount /proc inside the container
Real World Outcome
$ sudo ./mycontainer run /bin/sh
container# hostname
sandbox
container# ps
PID USER CMD
1 root /bin/sh
2 root ps
container# exit
The Core Question You Are Answering
“What is a container in kernel terms?”
Concepts You Must Understand First
- Namespaces
- Which resources do they isolate?
- Book Reference: “The Linux Programming Interface” - namespaces sections
- Cgroups
- How are limits applied to a process tree?
- Book Reference: “The Linux Programming Interface” - resource control
Questions to Guide Your Design
- How will you create a PID namespace where the child is PID 1?
- How will you populate a minimal root filesystem?
Thinking Exercise
The /proc Trap
Explain what happens if you do not mount /proc inside the container.
Questions to answer:
- What does ps show?
- Why is this misleading?
The Interview Questions They Will Ask
- “How do namespaces differ from VMs?”
- “Why is PID 1 special inside a container?”
- “What is the role of cgroups in containers?”
- “What is pivot_root used for?”
- “How does Docker isolate processes?”
Hints in Layers
Hint 1: Start with unshare
Experiment with a shell created in new namespaces.
Hint 2: Minimal Root
Use a small directory tree with a shell and libraries.
Hint 3: Mount /proc
Inside the namespace, mount procfs to get accurate process views.
Hint 4: Debugging
Compare ps output inside and outside the container.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Namespaces | “The Linux Programming Interface” | namespaces sections |
| Containers | “Container Security” by Liz Rice | Ch. 2-3 |
Common Pitfalls and Debugging
Problem 1: “ps shows host processes”
- Why: /proc is not remounted inside the namespace.
- Fix: Mount procfs inside the container namespace.
- Quick test: Check that PID 1 is the container init inside /proc.
Definition of Done
- Container has isolated PID and mount namespaces
- /proc shows only container processes
- Resource limits apply via cgroups
- Container exits cleanly without host impact
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Bootloader | Advanced | Weekend | Boot chain | High |
| 2. Syscall Tracer | Advanced | 1 week | Syscalls and tracing | High |
| 3. Shell | Intermediate | 1 week | Process control | High |
| 4. Filesystem Explorer | Intermediate | Weekend | VFS and inodes | Medium |
| 5. Process Psychic | Intermediate | Weekend | /proc and states | Medium |
| 6. Raw Terminal | Advanced | 1 week | TTY and signals | High |
| 7. Page Cache Lab | Intermediate | Weekend | Memory and cache | Medium |
| 8. FUSE Mirror | Advanced | 1-2 weeks | VFS callbacks | High |
| 9. Cgroup Governor | Advanced | 1 week | Resource limits | High |
| 10. Container Runtime | Expert | 2 weeks | Isolation stack | Very High |
Recommendation
If you are new to Linux internals: Start with Project 3 for a practical view of fork/exec and signals.
If you are a backend engineer: Start with Project 5 to learn how /proc exposes system truth.
If you want containers: Focus on Projects 9 and 10 after reading the Namespaces chapter.
Final Overall Project
Final Overall Project: The System Supervisor
The Goal: Combine Project 3 (shell), Project 5 (process inspector), and Project 10 (container runtime) into a minimal init-like supervisor.
- Boot into a minimal system image.
- Start a containerized shell as the main service.
- Monitor the service using procfs data and restart on failure.
Success Criteria: The supervisor starts, launches the service, reports status, and restarts it after a crash.
From Learning to Production
| Your Project | Production Equivalent | Gap to Fill |
|---|---|---|
| Syscall Tracer | strace | Robust argument decoding and formatting |
| Shell | bash/zsh | Job control, scripting, globbing |
| Process Psychic | top/ps | Performance optimizations and UI |
| FUSE Mirror | sshfs/encfs | Security, caching, consistency |
| Cgroup Governor | systemd | Policy management, delegation |
| Container Runtime | runc | Full OCI runtime spec support |
Summary
This learning path covers Linux and Unix internals through 10 hands-on projects.
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Bootloader | Assembly/C | Advanced | Weekend |
| 2 | Syscall Tracer | C | Advanced | 1 week |
| 3 | Shell | C | Intermediate | 1 week |
| 4 | Filesystem Explorer | C | Intermediate | Weekend |
| 5 | Process Psychic | C/Python | Intermediate | Weekend |
| 6 | Raw Terminal | C | Advanced | 1 week |
| 7 | Page Cache Lab | C/Python | Intermediate | Weekend |
| 8 | FUSE Mirror | C/Python | Advanced | 1-2 weeks |
| 9 | Cgroup Governor | C/Python | Advanced | 1 week |
| 10 | Container Runtime | Go/C | Expert | 2 weeks |
Expected Outcomes
- You can trace syscalls and map them to kernel behavior.
- You can explain file identity via inodes and VFS.
- You can build a minimal container from namespaces and cgroups.
Additional Resources and References
Standards and Specifications
- POSIX Base Specifications (Issue 7, 2018): https://pubs.opengroup.org/onlinepubs/9699919799/
- UEFI Specification: https://uefi.org/specs/UEFI/2.10/
- ELF Object File Format (gABI): https://gabi.xinuos.com/elf/
Industry Analysis
- W3Techs Linux usage statistics: https://w3techs.com/technologies/comparison/os-Linux
- W3Techs OS historical trends: https://w3techs.com/technologies/history_overview/operating_system
Books
- “The Linux Programming Interface” by Michael Kerrisk - System call and process fundamentals
- “Operating Systems: Three Easy Pieces” by Remzi and Andrea Arpaci-Dusseau - Process and memory models
- “Advanced Programming in the UNIX Environment” by Stevens and Rago - Signals, terminals, and IPC
- “How Linux Works” by Brian Ward - Boot and user space overview