LEARN LINUX UNIX INTERNALS DEEP DIVE
Learn Linux and Unix Internals: From Boot to User Space
Goal: Deeply understand the Linux and Unix operating systems by traversing the entire stack—from the first instruction the CPU executes at boot, through the kernel’s management of processes and memory, down to the filesystem structures on disk and the shell that interprets your commands.
Why Linux & Unix Internals Matter
Linux runs the world. From the Android phone in your pocket to the servers powering the internet, and likely the embedded device in your car, the Unix philosophy and Linux kernel are ubiquitous.
Understanding internals transforms you from a “user” of APIs to a “master” of the system. You stop guessing why a process crashed or why I/O is slow. You know exactly which data structure in the kernel is bottlenecked, or which signal wasn’t handled.
You will move from:
- “I run `ls` to list files.”
- To: “I know `ls` calls `getdents64()`, reads `struct linux_dirent64` entries, and uses `stat()` to retrieve inode metadata from the filesystem driver.”
Core Concept Analysis
1. The Rings of Power (Kernel vs. User Space)
The CPU enforces a strict boundary between “Kernel Mode” (Ring 0) and “User Mode” (Ring 3).
+-----------------------------------------+
| USER SPACE |
| (Applications: Shell, Browser, ls) |
| |
| Restricted Access: No hardware I/O |
+--------------------+--------------------+
| System Calls (INT 0x80 / syscall)
v
+--------------------+--------------------+
| KERNEL SPACE |
| (Process Mgmt, VFS, Network Stack) |
| |
| Full Access: Hardware, Memory, CPU |
+-----------------------------------------+
| Drivers
v
+-----------------------------------------+
| HARDWARE |
| (CPU, RAM, Disk, Network Card, GPU) |
+-----------------------------------------+
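To make the boundary concrete, here is a minimal sketch, assuming Linux with glibc, that asks the kernel for the process ID twice: once through the ordinary libc wrapper and once through the generic `syscall()` entry point. The helper names `raw_getpid` and `wrapper_matches_raw` are made up for illustration.

```c
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* syscall(), getpid() */

/* Hypothetical helper: invoke the getpid system call by number,
 * bypassing the named libc wrapper. */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}

/* Returns 1 when the wrapper and the raw call agree, which they
 * should, since both end up at the same kernel entry point. */
int wrapper_matches_raw(void) {
    return (long)getpid() == raw_getpid();
}
```

Running a program that calls these under `strace` should show each crossing from user space into the kernel.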
2. The Process Life Cycle
A process isn’t just a running program. It’s a kernel data structure (task_struct in Linux).
fork() exec() exit()
[Parent] ----> [Child] ------------> [New Program] ----> [Zombie]
| ^ |
| (Copy of Parent) | (Replaces Memory)|
| | |
+-----------------------+ |
v
wait() by Parent
(Zombie Reaped)
3. The Filesystem Abstraction (VFS)
“Everything is a file” is the Unix mantra. The Virtual Filesystem (VFS) provides a unified interface.
User App: read(fd, buf, len)
|
System Call: sys_read()
|
VFS Layer: (Common Interface)
+-----------+-----------+-----------+
| | | |
[ext4] [xfs] [proc] [devtmpfs]
| | | |
Disk Disk RAM Device
Inodes Inodes Structs Drivers
4. The Boot Sequence
How does a computer wake up?
- BIOS/UEFI: Initializes hardware, finds bootloader.
- Bootloader (GRUB): Loads the kernel image (`vmlinuz`) and the initial RAM disk (`initrd`) into memory.
- Kernel Start: Probes hardware, mounts root filesystem.
- Init Process: The first process (PID 1) starts (`systemd` or `sysvinit`).
- User Space: Login prompts, graphical interfaces.
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| System Calls | The only way user space talks to the kernel. A function call that triggers a CPU interrupt/trap to switch rings. |
| Processes (PID) | An instance of a program. Has memory, file descriptors, and a state. Created via fork(). |
| File Descriptors | Integers representing open files/sockets. Indices into a per-process table pointing to kernel file structs. |
| Inodes | The identity of a file. Contains metadata (permissions, size, blocks) but not the name. |
| Signals | Software interrupts. The kernel notifies a process of an event (Ctrl+C -> SIGINT, Segfault -> SIGSEGV). |
| Permissions | rwx (Read/Write/Execute) for User/Group/Others. Checked by the kernel at open/exec time. |
Deep Dive Reading by Concept
Foundation & Architecture
| Concept | Book & Chapter |
|---|---|
| Architecture | “The Linux Programming Interface” by Michael Kerrisk — Ch. 1-3 (Fundamentals) |
| Boot Process | “How Linux Works” by Brian Ward — Ch. 1 (The Big Picture) |
Processes & Signals
| Concept | Book & Chapter |
|---|---|
| Process API | “The Linux Programming Interface” by Michael Kerrisk — Ch. 24-28 (Creation, Termination, Execution) |
| Signals | “Advanced Programming in the UNIX Environment” (APUE) — Ch. 10 (Signals) |
Filesystems & I/O
| Concept | Book & Chapter |
|---|---|
| VFS & I/O | “Linux Kernel Development” by Robert Love — Ch. 13 (The Virtual Filesystem) |
| Directories | “C Programming: A Modern Approach” (or generic) — How dirent.h works |
Essential Reading Order
- Foundation:
- “How Linux Works” Ch. 1-3 (Overview)
- “The Linux Programming Interface” Ch. 4 (File I/O)
- Process Mastery:
- “The Linux Programming Interface” Ch. 24 (Process Creation)
- “APUE” Ch. 8 (Process Control)
Project List
Project 1: The “Do-Nothing” Bootloader & Kernel
- File: LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md
- Main Programming Language: Assembly (x86) & C
- Alternative Programming Languages: Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Boot Process / Low Level
- Software or Tool: QEMU, NASM/GAS, GCC
- Main Book: “Operating Systems: Three Easy Pieces” (Virtualization Part)
What you’ll build: A minimal “kernel” that boots from a disk image (in QEMU), prints “Hello from Bare Metal!” to the VGA video memory, and halts. No Linux, no libraries, just you and the hardware.
Why it teaches boot process: You assume Linux “just starts.” This project forces you to be the one starting it. You’ll see the magic number 0xAA55 that marks a bootable disk, and you’ll understand what “Ring 0” actually feels like.
Core challenges you’ll face:
- The 512-byte limit: The BIOS only loads the first sector. You have very little space.
- VGA Memory Mapping: Writing to `0xB8000` to make text appear on screen.
- Linking: Understanding how code is arranged in the binary file.
Key Concepts:
- Boot Sector: MBR format and the 0x7C00 load address.
- Real Mode vs Protected Mode: 16-bit vs 32-bit addressing.
- Bare Metal IO: Writing directly to memory addresses.
Difficulty: Advanced (Conceptually) / Intermediate (Lines of code)
Time estimate: Weekend
Prerequisites: Basic Assembly knowledge (registers, mov instructions).
Real World Outcome
You will create a 512-byte disk image (`boot.bin` in the example below). When you run it in an emulator, you will see your message on a raw black screen.
Example Output:
$ nasm -f bin boot.asm -o boot.bin
$ qemu-system-x86_64 boot.bin
# A QEMU window opens.
# Inside, white text on black background:
# "Hello from Bare Metal!"
The Core Question You’re Answering
“What happens before main()?”
Most programmers live inside an OS. This project answers: “How does the computer get from power-on to executing code?”
Concepts You Must Understand First
Stop and research these before coding:
- BIOS vs UEFI (We will use BIOS/Legacy mode for simplicity)
- What address does BIOS jump to? (Answer: 0x7C00)
- VGA Text Mode
- What is the memory address for the screen buffer?
- How is a character + color stored? (2 bytes)
- Reference: OSDev Wiki (“Printing to Screen”)
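The two-byte cell layout can be written down as a small C helper. This is a sketch assuming the standard 80-column color text mode; `vga_cell` and `vga_offset` are illustrative names, not part of any library.

```c
#include <stdint.h>

/* Pack one VGA text-mode cell: low byte = character code, high byte =
 * attribute (foreground in the lower nibble, background in the upper
 * nibble; bit 7 usually means "blink" rather than a bright background). */
uint16_t vga_cell(char ch, uint8_t fg, uint8_t bg) {
    uint8_t attr = (uint8_t)((bg << 4) | (fg & 0x0F));
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}

/* Byte offset of (row, col) from the start of the buffer at 0xB8000,
 * assuming 80 columns and 2 bytes per cell. */
uint32_t vga_offset(unsigned row, unsigned col) {
    return (row * 80u + col) * 2u;
}
```

For example, a white-on-black `A` is `vga_cell('A', 0x0F, 0x00)`, which is the 16-bit value `0x0F41`.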
Questions to Guide Your Design
- Boot Signature: What are the last two bytes of a boot sector?
- Infinite Loop: Once your code finishes, what should the CPU do? (Don’t let it execute garbage memory).
Thinking Exercise
Trace the Boot
Imagine you are the CPU.
- Power on. Reset vector jumps to BIOS.
- BIOS checks disks. Finds one whose sector 0 ends with the bytes `0x55 0xAA` (the little-endian word `0xAA55`).
- BIOS copies those 512 bytes to `0x7C00`.
- BIOS sets `IP` (Instruction Pointer) to `0x7C00`.
- YOUR CODE STARTS HERE.
The Interview Questions They’ll Ask
- “What is the difference between a bootloader and a kernel?”
- “What is ‘memory mapped I/O’?”
- “Why do operating systems switch from Real Mode to Protected Mode?”
Hints in Layers
Hint 1: The Magic Number
Your file must be exactly 512 bytes. The last two bytes must be 0x55 and 0xAA. Pad the rest with zeros.
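The layout from this hint can be sketched in C instead of NASM. The function below (the name `make_boot_sector` is hypothetical) fills a 512-byte buffer with a two-byte infinite loop (`EB FE`, a short jump to itself), zero padding, and the signature bytes at offsets 510 and 511.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Fill `sector` with a minimal bootable image: an infinite loop,
 * zero padding, and the boot signature in the last two bytes. */
void make_boot_sector(uint8_t sector[SECTOR_SIZE]) {
    memset(sector, 0, SECTOR_SIZE);
    sector[0] = 0xEB;     /* jmp short ...                 */
    sector[1] = 0xFE;     /* ... with offset -2: to itself */
    sector[510] = 0x55;   /* signature, first byte  */
    sector[511] = 0xAA;   /* signature, second byte */
}
```

Writing those 512 bytes to a file with `fwrite` should give an image that QEMU accepts as bootable; it prints nothing and simply spins, which is a useful baseline before adding the message.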
Hint 2: Writing to Screen
In 16-bit Real Mode, you can use BIOS interrupts (int 0x10) OR write directly to video memory. Writing directly is “cooler”. Segment 0xB800.
Hint 3: Assembly Loop
To write a string, you need a loop. Load a character, move it to video memory, increment pointer, repeat until null terminator.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Bootloading | “Operating Systems: From 0 to 1” (Free online book) | Ch. 2-3 |
| Assembly | “Programming from the Ground Up” by Jonathan Bartlett | Ch. 1-3 |
Project 2: Build Your Own Shell (BYOS)
- File: LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Process Management / Syscalls
- Software or Tool: Linux Terminal, GCC
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A functional shell (like bash or zsh, but simpler). It will show a prompt, accept commands (ls -la), handle built-ins (cd, exit), and support input/output redirection (>) and pipes (|).
Why it teaches processes: You will implement the core lifecycle: fork() a new process, exec() a command, and wait() for it to finish. You’ll learn why cd must be a shell built-in and not a program.
Core challenges you’ll face:
- Parsing: Breaking “ls -la | grep foo” into tokens.
- Fork/Exec pattern: The standard Unix way to run programs.
- File Descriptors: Making `>` work by manipulating `stdout` before `exec`.
- Pipes: Connecting the `stdout` of one process to the `stdin` of another.
Key Concepts:
- Process Creation: `fork()`, `execvp()`, `waitpid()`.
- File Descriptors: `dup2()`, `open()`, `close()`.
- Signals: Handling Ctrl+C (`SIGINT`) without killing the shell itself.
Difficulty: Intermediate
Time estimate: 1 week
Prerequisites: C basics, Pointers.
Real World Outcome
You will have a program myshell.
Example Output:
$ ./myshell
myshell> ls
file1.txt file2.c myshell
myshell> pwd
/home/user/projects/myshell
myshell> ls -l > output.txt
myshell> cat output.txt
(lists files)
myshell> exit
The Core Question You’re Answering
“How does the OS run programs?”
And also: “What actually happens when I type a command?” It’s not magic; it’s specific syscalls.
Concepts You Must Understand First
Stop and research these before coding:
- Fork vs Exec
- `fork()` creates a copy. `exec()` replaces the copy with new code. Why do we need both?
- Book Reference: “The Linux Programming Interface” Ch. 24 & 27.
- File Descriptors (FDs)
- FD 0 is Stdin, 1 is Stdout, 2 is Stderr.
- What happens if I close FD 1 and open a file? (The file becomes FD 1, because `open()` returns the lowest unused descriptor.)
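The "file becomes FD 1" behavior follows from one rule: `open()` returns the lowest-numbered descriptor that is not in use. A small sketch of that rule, with the hypothetical helper name `reopen_reuses_fd` and assuming a single-threaded program:

```c
#include <fcntl.h>
#include <unistd.h>

/* Open a file, close it, and open it again. Returns 1 if the second
 * open() got the same descriptor number, 0 if not, -1 on error. */
int reopen_reuses_fd(const char *path) {
    int first = open(path, O_RDONLY);
    if (first < 0) return -1;
    close(first);                        /* frees that slot in the FD table */

    int second = open(path, O_RDONLY);   /* lowest free slot: the same one */
    if (second < 0) return -1;
    close(second);
    return first == second;
}
```

The same rule is what older shells relied on for redirection; `dup2()` is the more explicit tool because it names the target descriptor instead of depending on which slot happens to be free.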
Questions to Guide Your Design
- Built-ins: Why doesn’t `exec("cd")` work? (Hint: A child process cannot change the parent’s directory).
- Zombies: What happens if you don’t call `wait()`?
- Parsing: How do you split a string by spaces in C? (`strtok` or manual parsing).
Thinking Exercise
Trace a Pipe: ls | grep c
- Parent (Shell) creates a pipe (array of 2 ints).
- Parent forks Child A (for `ls`).
- Parent forks Child B (for `grep`).
- Child A: Closes Stdout. Duplicates Pipe-Write-End to Stdout. Execs `ls`.
- Child B: Closes Stdin. Duplicates Pipe-Read-End to Stdin. Execs `grep`.
- Parent closes pipe ends and waits.
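The trace above can be sketched as one C function. `run_pipeline` is a hypothetical name, error handling is kept minimal, and `dup2()` performs the "close, then duplicate" step in a single call.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `left | right` and return the exit status of `right`, or -1 on error. */
int run_pipeline(char *const left[], char *const right[]) {
    int fds[2];                          /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0) return -1;

    pid_t a = fork();
    if (a < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (a == 0) {                        /* Child A: writes into the pipe */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);
        _exit(127);                      /* only reached if exec failed */
    }

    pid_t b = fork();
    if (b == 0) {                        /* Child B: reads from the pipe */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends, or Child B never sees end-of-file. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(a, NULL, 0);
    if (b < 0 || waitpid(b, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Forgetting the parent's two `close()` calls is the classic bug here: the reader blocks forever because a write end is still open somewhere.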
The Interview Questions They’ll Ask
- “Write a simplified implementation of `system()`.”
- “What is a zombie process?”
- “Why does `cd` have to be a built-in?”
Hints in Layers
Hint 1: The Loop
while (1) { print prompt; read line; parse; execute; }
Hint 2: Executing
Split the line into char **args. Call fork().
If pid == 0 (child), call execvp(args[0], args).
If pid > 0 (parent), call waitpid(pid, &status, 0).
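Put together, this hint might look like the sketch below. `run_command` is an illustrative name, and the exit code 127 for a failed `exec` follows the convention shells use for "command not found".

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec args[0] with args, wait for it.
 * Returns the child's exit status, or -1 on error or abnormal exit. */
int run_command(char *const args[]) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                  /* child */
        execvp(args[0], args);
        perror(args[0]);             /* only reached if exec failed */
        _exit(127);
    }
    int status;                      /* parent */
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls `_exit()` rather than `exit()` so that it does not flush stdio buffers it inherited from the shell a second time.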
Hint 3: Redirection
For ls > out.txt:
In the child process, before calling exec:
fd = open("out.txt", ...)
dup2(fd, STDOUT_FILENO)
close(fd)
Then exec prints to stdout, which is now the file.
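A compact sketch of this hint, with the `open()` flags filled in as one reasonable choice (truncate or create, mode 0644). The function name `run_redirected` and the exit codes 126 and 127 are assumptions made for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run args with stdout redirected to out_path (like `cmd > out_path`).
 * Returns the child's exit status, or -1 on error. */
int run_redirected(char *const args[], const char *out_path) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) _exit(126);
        dup2(fd, STDOUT_FILENO);     /* FD 1 now refers to the file */
        close(fd);                   /* the original number is no longer needed */
        execvp(args[0], args);
        _exit(127);                  /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The redirection happens in the child only, so the shell's own stdout still points at the terminal when the prompt comes back.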
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Process Control | “The Linux Programming Interface” | Ch. 24-27 |
| Signals | “The Linux Programming Interface” | Ch. 20-22 |
Project 3: The Filesystem Explorer (ls -R) Clone
- File: LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Filesystem / Inodes
- Software or Tool: GCC, `man` pages
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A robust clone of the ls command that supports recursion (-R) and detailed listing (-l). You will read directories, query file metadata (inodes), and handle permissions formatting.
Why it teaches filesystems: You’ll learn that a directory is just a special file containing a list of names and inode numbers. You’ll understand the difference between a filename (in a directory) and the file metadata (in the inode).
Core challenges you’ll face:
- Directory Traversal: Using `opendir`, `readdir`, `closedir`.
- Stat System Call: Converting `stat` struct data into human-readable strings (e.g., `rwxr-xr-x`).
- Time Formatting: Handling Unix timestamps.
- Recursion: Safely walking directory trees without infinite loops (symlinks).
Key Concepts:
- Inodes: `struct stat` and what it contains.
- Directory Entries: `struct dirent`.
- Bitmasks: Decoding permission bits (`st_mode`).
Difficulty: Intermediate
Time estimate: Weekend
Prerequisites: C structs, recursion.
Real World Outcome
You will have a tool myls.
Example Output:
$ ./myls -l
drwxr-xr-x 2 user group 4096 Dec 22 10:00 .
drwxr-xr-x 10 user group 4096 Dec 21 14:00 ..
-rw-r--r-- 1 user group 512 Dec 22 10:01 main.c
The Core Question You’re Answering
“Where is the file name stored?”
Spoiler: NOT in the file itself. It’s stored in the directory. This project proves it.
Concepts You Must Understand First
Stop and research these before coding:
- Inodes vs Directory Entries
- An Inode holds the file’s metadata (size, permissions).
- A Directory Entry maps a “Name” to an “Inode Number”.
- Stat Struct
- Look up `man 2 stat`. Understand `st_mode`, `st_uid`, `st_size`.
Questions to Guide Your Design
- Hidden Files: How does `ls` know to hide files starting with `.`? (It’s manual filtering in user space!).
- User Names: `stat` gives you `uid` (integer). How do you get “root” or “douglas”? (Hint: `/etc/passwd` or `getpwuid`).
Thinking Exercise
Struct dirent vs Struct stat
readdir gives you a name, an inode number, and (on most filesystems) a type. That’s it. To get the size, you MUST call stat on that name.
Trace:
- Open directory `.`.
- Read entry `main.c`.
- Call `stat("main.c", &mystat)`.
- Read `mystat.st_size`.
The Interview Questions They’ll Ask
- “What is a hard link vs a soft link?” (Hard link = same inode, new directory entry).
- “Why does `ls` take a long time on a network mount?” (Stat calls are expensive).
Hints in Layers
Hint 1: Decoding Permissions
st_mode is a bitfield.
(mode & S_IRUSR) checks if User has Read permission.
Construct the rwxr-xr-x string character by character.
Hint 2: Recursion
If S_ISDIR(mode) is true, and the name isn’t . or .., construct the new path (current/child) and recurse.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| File I/O | “The Linux Programming Interface” | Ch. 15 (File Attributes) |
| Directories | “The Linux Programming Interface” | Ch. 18 (Directories) |
Project 4: The Process Psychic (Process Inspector)
- File: LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md
- Main Programming Language: C or Python
- Alternative Programming Languages: Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model (Monitoring Agent)
- Difficulty: Level 2: Intermediate
- Knowledge Area: Linux `/proc` filesystem
- Software or Tool: Linux
- Main Book: “Linux Kernel Development” by Robert Love
What you’ll build: A tool similar to ps or top that lists running processes, their state (Running, Sleeping, Zombie), memory usage, and command line arguments. You will do this by parsing the /proc filesystem directly.
Why it teaches Linux Internals: Linux exposes kernel statistics to user space via the /proc virtual filesystem. By parsing it, you see exactly what the kernel tracks for each process.
Core challenges you’ll face:
- Scanning `/proc`: Iterating over numeric directories.
- Parsing `status` files: Reading `/proc/[pid]/stat` or `/proc/[pid]/status`.
- Calculating CPU usage: Reading `/proc/stat` (system wide) and process time to calculate percentages.
Key Concepts:
- Virtual Filesystems: `/proc` exists only in RAM.
- Process States: `R` (Running), `S` (Sleeping), `Z` (Zombie).
- UID/GID: Mapping numeric IDs to names.
Difficulty: Beginner/Intermediate
Time estimate: Weekend
Prerequisites: File I/O, String parsing.
Real World Outcome
You will have a CLI tool myps.
Example Output:
$ ./myps
PID USER STATE CMD
1 root S /sbin/init
1042 douglas R ./myps
1043 douglas S bash
The Core Question You’re Answering
“How does `top` know what’s running?”
It doesn’t use magic system calls. It reads files. Everything in Unix is a file.
Concepts You Must Understand First
Stop and research these before coding:
- The `/proc` directory
- Run `ls /proc` on your machine.
- Run `cat /proc/self/status`. Look at the output.
- Ticks vs Seconds
- The kernel counts time in “jiffies” or ticks. You need `sysconf(_SC_CLK_TCK)` to convert to seconds.
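The conversion itself is one division. A minimal sketch with the hypothetical helper name `ticks_to_seconds`, meant for values such as the `utime` and `stime` fields (fields 14 and 15) of `/proc/[pid]/stat`:

```c
#include <unistd.h>

/* Convert a tick count from /proc into seconds.
 * Returns -1.0 if the tick rate cannot be determined. */
double ticks_to_seconds(unsigned long long ticks) {
    long hz = sysconf(_SC_CLK_TCK);   /* ticks per second, commonly 100 */
    if (hz <= 0) return -1.0;
    return (double)ticks / (double)hz;
}
```

CPU usage over an interval is then the change in `utime + stime`, converted to seconds, divided by the wall-clock time that elapsed between the two samples.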
Questions to Guide Your Design
- Filtering: How do you distinguish process directories (numbers) from other info (like `cpuinfo`) in `/proc`?
- Race Conditions: What happens if a process dies while you are reading its `/proc` file? (Handle `ENOENT`).
Thinking Exercise
Manual ps
- `cd /proc`
- Find a numbered folder, e.g., `1234`.
- `cat 1234/cmdline` -> See the command (null separated).
- `cat 1234/stat` -> See the raw state characters.
The Interview Questions They’ll Ask
- “What is the `/proc` filesystem?”
- “How do you find a process’s memory usage without `top`?”
Hints in Layers
Hint 1: Iteration
Use opendir("/proc"). Check if entry->d_name is all digits (isdigit).
Hint 2: Reading CMDLINE
/proc/[pid]/cmdline arguments are separated by \0 (null bytes), not spaces. Replace \0 with spaces for display.
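A sketch of reading and cleaning up the command line. Both function names are hypothetical; the separator replacement is split out so it can be reasoned about on its own.

```c
#include <stdio.h>
#include <sys/types.h>

/* Replace the NUL separators between arguments with spaces, in place.
 * `n` is the number of bytes in the buffer; the last byte is left alone
 * so a trailing NUL keeps terminating the string. */
void join_cmdline(char *buf, size_t n) {
    for (size_t i = 0; i + 1 < n; i++)
        if (buf[i] == '\0') buf[i] = ' ';
}

/* Read /proc/<pid>/cmdline into buf as a printable string.
 * Returns the number of bytes read, or -1 if the file cannot be opened
 * (for example because the process has already exited). */
long read_cmdline(pid_t pid, char *buf, size_t size) {
    char path[64];
    snprintf(path, sizeof path, "/proc/%d/cmdline", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    size_t n = fread(buf, 1, size - 1, f);
    fclose(f);
    buf[n] = '\0';
    join_cmdline(buf, n);
    return (long)n;
}
```

Kernel threads have an empty `cmdline`, so a return value of 0 is normal and worth handling in the display code.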
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| /proc | “The Linux Programming Interface” | Ch. 9 (Process Credentials) & Ch. 12 (System and Process Information) |
Project 5: The Raw Mode Terminal (Text Editor Base)
- File: LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. Micro-SaaS (Custom CLI Tools)
- Difficulty: Level 3: Advanced
- Knowledge Area: TTY / Termios
- Software or Tool: `termios.h`
- Main Book: “The Linux Programming Interface”
What you’ll build: A program that enters “Raw Mode”. It disables the terminal’s default behavior (echoing characters, line buffering, handling Ctrl+C). You will read input byte-by-byte and render output manually, moving the cursor using VT100 escape sequences. This is the foundation of vim or nano.
Why it teaches TTY: The terminal is a complex legacy beast. Normally, the terminal driver “cooks” input (processes Backspace, Enter, signals). To build advanced TUIs, you must talk directly to the TTY driver.
Core challenges you’ll face:
- Termios Struct: Modifying flags `ECHO`, `ICANON`, `ISIG`.
- Escape Sequences: Sending `\x1b[2J` to clear screen.
- Restoring State: If your program crashes in Raw Mode, your terminal is broken. You must implement `atexit` handlers, and remember that termination by a signal bypasses them.
Key Concepts:
- Cooked vs Raw Mode: Canonical vs Non-canonical input.
- Control Characters: What Ctrl+C (3) and Enter (10/13) actually send.
- VT100/ANSI Codes: Controlling the cursor.
Difficulty: Intermediate
Time estimate: 1 week
Prerequisites: C basics, Bitwise operations.
Real World Outcome
A program that lets you type, and draws characters, but Ctrl+C prints “I caught Ctrl+C!” instead of exiting, and Backspace requires manual handling.
Example Output:
$ ./raw_term
(Screen clears)
[Type 'q' to quit]
You pressed: 'a' (97)
You pressed: 'Ctrl+C' (3) - Not exiting!
...
The Core Question You’re Answering
“Why does Backspace work?”
In Raw Mode, Backspace is just byte 127. It deletes nothing. You have to move the cursor back, print a space, and move back again.
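That three-step dance is a three-byte write. A minimal sketch, with `erase_last_char` as an illustrative name:

```c
#include <unistd.h>

/* Visually erase the character left of the cursor: step back,
 * overwrite it with a space, step back again.
 * Returns the number of bytes written, or -1 on error. */
ssize_t erase_last_char(int fd) {
    static const char seq[] = "\b \b";
    return write(fd, seq, sizeof seq - 1);   /* exclude the trailing NUL */
}
```

In an editor loop you would call this when `read()` returns byte 127, and also remove the character from your own input buffer, since the terminal no longer keeps one for you.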
Concepts You Must Understand First
Stop and research these before coding:
- Canonical Mode (Line buffering).
- Why input doesn’t appear until you hit Enter.
- `termios` flags
- `ECHO`: Prints what you type.
- `ICANON`: Enables line buffering.
- `ISIG`: Handles signals (Ctrl+C/Z).
Questions to Guide Your Design
- Safety: How do you ensure the terminal returns to normal mode when the user quits?
- Mapping: Does pressing `Enter` send `\n` (10) or `\r` (13)? (It’s messy).
Thinking Exercise
The “Stuck” Terminal
Run stty -echo in your terminal. Type. Nothing appears.
Run stty echo blindly to fix it.
This is what you are manipulating programmatically.
The Interview Questions They’ll Ask
- “What is the difference between `stdout` and a TTY?”
- “How do text editors detect window resize events?” (SIGWINCH).
Hints in Layers
Hint 1: Disabling Echo
struct termios raw; tcgetattr(STDIN_FILENO, &raw); raw.c_lflag &= ~(ECHO); ...
Hint 2: Read Byte-by-Byte
Turn off ICANON. read(STDIN_FILENO, &c, 1) returns immediately after a keypress.
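Both hints combine into the sketch below. The function names are hypothetical; the flag set is the minimum discussed in this project (full raw mode in real editors clears more flags), and `VMIN = 1`, `VTIME = 0` is what makes `read()` return after a single byte.

```c
#include <stdlib.h>    /* atexit */
#include <termios.h>
#include <unistd.h>

static struct termios saved_termios;   /* settings to restore on exit */

/* Clear the "cooked" flags and make read() return after one byte. */
void make_raw(struct termios *t) {
    t->c_lflag &= ~(tcflag_t)(ECHO | ICANON | ISIG);
    t->c_cc[VMIN] = 1;    /* wait for at least one byte */
    t->c_cc[VTIME] = 0;   /* no read timeout */
}

/* Put the terminal back the way we found it. */
void restore_terminal(void) {
    tcsetattr(STDIN_FILENO, TCSAFLUSH, &saved_termios);
}

/* Switch stdin to raw mode. Returns 0 on success, -1 on failure
 * (for example when stdin is not a terminal). */
int enable_raw_mode(void) {
    if (tcgetattr(STDIN_FILENO, &saved_termios) == -1) return -1;
    atexit(restore_terminal);          /* runs on normal exit only */
    struct termios raw = saved_termios;
    make_raw(&raw);
    return tcsetattr(STDIN_FILENO, TCSAFLUSH, &raw);
}
```

Because `ISIG` is cleared, Ctrl+C arrives as byte 3 instead of raising `SIGINT`, so the program needs its own quit key.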
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Terminals | “The Linux Programming Interface” | Ch. 62 (Terminals) |
| TTY Driver | “Build Your Own Text Editor” (Online Guide by Jeremy Ruten) | All |
Project 6: The “Mirror” Filesystem (FUSE)
- File: LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md
- Main Programming Language: C or Python
- Alternative Programming Languages: Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. Open Core Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: VFS / Filesystems
- Software or Tool: `libfuse`
- Main Book: “Linux Kernel Development” (Filesystems chapter)
What you’ll build: A user-space filesystem that mounts a folder. When you write to it, it reverses the text (or encrypts it) before saving to the underlying disk. This uses FUSE (Filesystem in Userspace) to hook into kernel VFS calls.
Why it teaches VFS: You will implement the callbacks: getattr, read, write, readdir. You will see exactly what the kernel asks for when you run cat file.txt.
Core challenges you’ll face:
- Callback Implementation: Mapping `read()` requests to underlying file operations.
- Permissions: Handling `chmod`/`chown` correctly.
- Latency: User-space context switching overhead.
Key Concepts:
- VFS Interface: The common API for all filesystems.
- Mount Points: How the kernel attaches filesystems.
- User-Kernel Bridge: How `/dev/fuse` works.
Difficulty: Advanced
Time estimate: 1-2 weeks
Prerequisites: C, Pointers, Project 3 (ls clone).
Real World Outcome
You will mount your filesystem on a directory (`mount_point/` in the example below).
Example Output:
$ ./mirrorfs root_dir/ mount_point/
$ echo "Hello" > mount_point/test.txt
$ cat root_dir/test.txt
olleH
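The transformation behind this output is independent of FUSE and can be written and tested first. The sketch below is plain C with no libfuse calls; `reverse_bytes` is a hypothetical helper that a `write` callback could apply to its buffer before forwarding the data to the backing file.

```c
#include <stddef.h>

/* Reverse `len` bytes of `buf` in place. */
void reverse_bytes(char *buf, size_t len) {
    if (len < 2) return;
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}
```

Two design questions follow from this. `echo` appends a newline, so a byte-exact reversal puts that newline first unless you special-case it. And a large file arrives as several write requests, so reversing each request separately is not the same as reversing the whole file.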
The Core Question You’re Answering
“How does the kernel support ext4, fat32, and ntfs simultaneously?”
It uses a standard interface (VFS). FUSE lets you be that interface.
Concepts You Must Understand First
Stop and research these before coding:
- FUSE Architecture
- Kernel module -> `/dev/fuse` -> libfuse (User space) -> Your Code.
- Error Codes
- Returning `-ENOENT` when file is missing.
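The negated-errno convention can be sketched without linking against libfuse. `backing_getattr` is a hypothetical helper of the kind a passthrough `getattr` callback might delegate to: it stats the backing file and translates failure into the negative error code the callback is expected to return.

```c
#include <errno.h>
#include <sys/stat.h>

/* Stat the backing file. Returns 0 on success, or a negated errno
 * value such as -ENOENT on failure. */
int backing_getattr(const char *backing_path, struct stat *st) {
    if (lstat(backing_path, st) == -1)
        return -errno;
    return 0;
}
```

Returning a positive errno, or -1, is an easy mistake to make here; the caller of `cat` would then see the wrong error or none at all.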
Questions to Guide Your Design
- Stat: When `ls` runs, it calls `getattr`. What do you return for a file that doesn’t exist yet?
- Thread Safety: FUSE is multithreaded. Do you need locks?
Thinking Exercise
Trace a cat
- `cat` calls `open()`.
- Kernel VFS sees mount point.
- Kernel sends `OPEN` op to FUSE.
- Your code receives `open`.
- Your code returns Success.
- Kernel sends `READ`.
The Interview Questions They’ll Ask
- “Why is FUSE slower than a kernel driver?” (Context switches).
- “What is the VFS?”
Hints in Layers
Hint 1: Hello World FUSE
Start with the “Hello World” example in libfuse documentation. It creates a virtual file that doesn’t exist on disk.
Hint 2: Passthrough
First, make a “passthrough” filesystem that just forwards every call to the underlying directory. open -> open, read -> read.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| VFS | “Linux Kernel Development” | Ch. 13 |
| FUSE | “FUSE Documentation” | Official Wiki |
Project 7: The “Poor Man’s Docker” (Container Runtime)
- File: LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md
- Main Programming Language: Go or C
- Alternative Programming Languages: Rust, Python
- Coolness Level: Level 5: Pure Magic
- Business Potential: 4. Open Core Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Namespaces / Cgroups
- Software or Tool: Linux Namespaces
- Main Book: “The Linux Programming Interface”
What you’ll build: A program that runs a command in an isolated environment. It will have its own Process ID tree (PID 1), its own mount table, and its own hostname. It’s a mini-Docker.
Why it teaches isolation: Docker isn’t magic; it uses Linux Namespaces (CLONE_NEWPID, CLONE_NEWNS, CLONE_NEWUTS) and cgroups. You will call these directly.
Core challenges you’ll face:
- Namespaces: Using `unshare()` or `clone()` with flags.
- Root Filesystem: Setting up `chroot` or `pivot_root` (the “jail”).
- ProcFS: Mounting a fresh `/proc` so `ps` inside the container only shows container processes.
Key Concepts:
- PID Namespace: Process isolation.
- Mount Namespace: Filesystem isolation.
- Chroot/Pivot_root: Root directory isolation.
Difficulty: Expert
Time estimate: 2 weeks
Prerequisites: Project 2 (Shell), Root access.
Real World Outcome
You run a shell inside your container. ps shows only two processes.
Example Output:
$ sudo ./mycontainer run /bin/bash
container# ps aux
PID USER COMMAND
1 root /bin/bash
2 root ps aux
container# hostname
container-host
container# exit
The Core Question You’re Answering
“What IS a container?”
It is NOT a virtual machine. It is a process with a restricted view of the kernel’s data structures.
Concepts You Must Understand First
Stop and research these before coding:
- Namespaces
- Read `man 7 namespaces`. PID, UTS, MNT, NET.
- `chroot` vs `pivot_root`
- Why `chroot` is not enough for security.
Questions to Guide Your Design
- PID 1: Who handles signals inside the container? (Your starter process).
- Networking: By default, you have no network. Do you want to share the host’s or create a veth pair? (Share for simplicity first).
Thinking Exercise
The ps Lie
If you chroot but don’t use PID namespaces, ps will show all host processes (if /proc is mounted).
If you use PID namespaces but don’t remount /proc, ps still reads the old /proc, so it shows the host’s processes or fails outright.
You need BOTH.
The Interview Questions They’ll Ask
- “What is the difference between a VM and a Container?” (Kernel sharing).
- “What are cgroups used for?” (Resource limiting).
- “How does Docker hide processes?” (PID Namespaces).
Hints in Layers
Hint 1: unshare Tool
Play with the command line tool unshare first. sudo unshare --fork --pid --mount-proc /bin/bash.
Hint 2: Go Syscalls
In Go: cmd.SysProcAttr = &syscall.SysProcAttr{Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID ...}.
Hint 3: Mounting Proc
Inside the container setup: mount("proc", "/proc", "proc", 0, "").
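Before tackling PID and mount namespaces, the UTS namespace is the easiest one to try from C. The sketch below assumes Linux and normally needs root (or `CAP_SYS_ADMIN`); `try_private_hostname` is a hypothetical name, and passing the child's `errno` back through its exit status is a shortcut for the demo, not a general pattern.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>      /* unshare(), CLONE_NEWUTS */
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, move it into a new UTS namespace, and give it a private
 * hostname. Returns 0 on success or a negated errno value (commonly
 * -EPERM when run without privileges). */
int try_private_hostname(const char *name) {
    pid_t pid = fork();
    if (pid < 0) return -errno;
    if (pid == 0) {
        if (unshare(CLONE_NEWUTS) == -1) _exit(errno & 0xFF);
        if (sethostname(name, strlen(name)) == -1) _exit(errno & 0xFF);
        char seen[256] = {0};
        gethostname(seen, sizeof seen - 1);
        /* Inside the namespace, the new name should be visible. */
        _exit(strcmp(seen, name) == 0 ? 0 : EINVAL);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -errno;
    return WIFEXITED(status) ? -WEXITSTATUS(status) : -ECHILD;
}
```

Whether or not the call succeeds, the hostname seen by the parent should be unchanged afterwards, which is the isolation property a container runtime builds on.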
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Namespaces | “The Linux Programming Interface” | Ch. 28 (clone()); for namespaces themselves, see man 7 namespaces |
| Containers | “Container Security” by Liz Rice | Ch. 2-3 |
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Boot Kernel | ⭐⭐⭐⭐ | Weekend | Ring 0, BIOS, ASM | ⭐⭐⭐⭐⭐ |
| 2. Build Shell | ⭐⭐ | 1 week | Fork/Exec, Signals | ⭐⭐⭐⭐ |
| 3. ls Clone | ⭐⭐ | Weekend | Inodes, Dirents | ⭐⭐ |
| 4. Process Psychic | ⭐⭐ | Weekend | /proc, Parsing | ⭐⭐⭐ |
| 5. Raw Terminal | ⭐⭐⭐ | 1 week | TTY, Termios | ⭐⭐⭐⭐ |
| 6. FUSE Filesystem | ⭐⭐⭐⭐ | 1-2 weeks | VFS, Kernel Hooks | ⭐⭐⭐⭐⭐ |
| 7. Container Runtime | ⭐⭐⭐⭐⭐ | 2 weeks | Namespaces, Isolation | ⭐⭐⭐⭐⭐ |
Recommendation
Where to Start
- For the Absolute Best Foundation: Start with Project 2 (Build Shell). It covers the most important daily concepts (`fork`, `exec`, `pipes`, `signals`). It bridges the gap between being a user and a programmer.
- For the Low-Level Curious: Start with Project 4 (Process Psychic). It’s easier than the Shell but shows you the “magic” behind the curtain (`/proc`).
The Progression Path
Shell -> ls Clone -> Process Psychic -> Container Runtime.
This path takes you from managing processes, to inspecting files, to inspecting processes, to isolating all of them.
Final Overall Project: The “System-Z” (Mini-OS Supervisor)
What you’ll build: A minimal Init System (PID 1) that combines your shell, your container runtime, and your process monitor.
The Goal: You boot your Linux kernel (using init=/path/to/your/program). Your program starts. It:
- Mounts `/proc`, `/sys`.
- Reads a config file (like `systemd` units).
- Starts services (like a web server or your shell) in isolated containers (using your Project 7 code).
- Monitors them (restarting if they crash).
- Provides a command socket to query status (using your Project 4 code).
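The "monitor and restart" step can be prototyped as an ordinary program long before it runs as PID 1. The sketch below is deliberately simplified: `supervise` is a hypothetical name, it watches a single service, treats any non-zero exit or signal death as a crash, and has no backoff between restarts.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv. If it exits non-zero or dies from a signal, start it again,
 * up to max_restarts times. Returns how many restarts were performed,
 * or -1 if fork() or waitpid() fails. */
int supervise(char *const argv[], int max_restarts) {
    int restarts = 0;
    for (;;) {
        pid_t pid = fork();
        if (pid < 0) return -1;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);              /* only reached if exec failed */
        }
        int status;
        if (waitpid(pid, &status, 0) < 0) return -1;

        int clean = WIFEXITED(status) && WEXITSTATUS(status) == 0;
        if (clean || restarts == max_restarts) return restarts;
        restarts++;                  /* service crashed: go around again */
    }
}
```

A real init process would also call `waitpid(-1, ...)` to reap orphans that get re-parented to it, and would never return from this loop.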
Why this is the ultimate test: You are the system. If your code crashes, the kernel panics. You handle the orphans, the signals, the mounts, and the logs. You effectively replace the entire user-space OS layer.
Summary
This learning path covers Linux and Unix Internals through 7 hands-on projects.
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Boot Sector Kernel | Assembly/C | Advanced | Weekend |
| 2 | Build Your Own Shell | C | Intermediate | 1 week |
| 3 | ls -R Clone | C | Intermediate | Weekend |
| 4 | Process Psychic (ps) | C | Intermediate | Weekend |
| 5 | Raw Mode Terminal | C | Advanced | 1 week |
| 6 | Mirror Filesystem (FUSE) | C | Advanced | 1-2 weeks |
| 7 | Container Runtime | Go/C | Expert | 2 weeks |
Expected Outcomes
After completing these projects, you will:
- Understand Boot Process by writing one.
- Master Syscalls by calling them directly.
- Demystify Containers by building one from scratch.
- Visualize Filesystems by parsing inodes manually.
- Debug Processes by reading their raw kernel state.
You will no longer view Linux as a “black box” but as a collection of data structures and well-defined interfaces that you can inspect, manipulate, and reimplement.