Project 1: Syscall Tracer (strace clone)
A tool that intercepts and logs every system call a program makes, showing each syscall's name, arguments, and return value in real time.
Quick Reference
| Attribute | Value |
|---|---|
| Primary Language | See main guide |
| Alternative Languages | N/A |
| Difficulty | Level 3: Advanced |
| Time Estimate | 2-3 weeks |
| Knowledge Area | OS Internals / Debugging |
| Tooling | ptrace |
| Prerequisites | C programming, basic understanding of processes, comfort reading man pages |
What You Will Build
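A command-line tracer, run as ./mystrace followed by a command and its arguments, that launches the command and prints one line per system call: the syscall name, its decoded arguments, and its return value. When the traced program ends, the tracer reports its exit status.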
Why It Matters
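Writing a tracer forces you to work directly with ptrace, the x86-64 calling convention, the kernel's syscall table, and the process lifecycle. The same mechanisms underpin debuggers and other tracing tools, so the skills carry over to real-world systems work.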
Core Challenges
- Decoding syscall numbers to names (maps to syscall table structure)
- Reading arguments from registers and memory (maps to calling conventions, kernel/user boundary)
- Handling multi-threaded programs (maps to process lifecycle, threading model)
- Following child processes through fork/exec (maps to process lifecycle)
- Dealing with restartable syscalls after signals (maps to interrupt handling)
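Reading arguments from registers can be sketched as follows. This is a minimal illustration, assuming an x86-64 Linux target and a tracer that has already filled a struct user_regs_struct (for example with PTRACE_GETREGS at a syscall stop). The names struct syscall_info and decode_regs are hypothetical and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/user.h>   /* struct user_regs_struct, the layout PTRACE_GETREGS fills */

/* Hypothetical container for one decoded syscall stop. */
struct syscall_info {
    long long nr;                /* syscall number */
    unsigned long long args[6];  /* up to six raw argument values */
    long long ret;               /* return value; only meaningful at a syscall-exit stop */
};

#if defined(__x86_64__)
/* Copy the syscall-related registers into a syscall_info (x86-64 only). */
static void decode_regs(const struct user_regs_struct *regs,
                        struct syscall_info *out)
{
    assert(regs != NULL && out != NULL);

    /* rax is overwritten with the return value, so the syscall number
     * is read from orig_rax instead. */
    out->nr = (long long)regs->orig_rax;

    /* Kernel argument order on x86-64: rdi, rsi, rdx, r10, r8, r9.
     * The fourth argument is in r10, not rcx as in the user-space
     * function-call convention. */
    out->args[0] = regs->rdi;
    out->args[1] = regs->rsi;
    out->args[2] = regs->rdx;
    out->args[3] = regs->r10;
    out->args[4] = regs->r8;
    out->args[5] = regs->r9;

    /* A negative value here encodes an error as -errno. */
    out->ret = (long long)regs->rax;
}
#endif
```

Pointer arguments, such as the path passed to openat, are addresses in the traced process, so the tracer still has to copy the bytes out of the tracee's memory (for example with PTRACE_PEEKDATA) before it can print them.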
Key Concepts
- ptrace mechanism: “The Linux Programming Interface” Ch. 26 - Michael Kerrisk
- x86-64 calling conventions: “Computer Systems: A Programmer’s Perspective” Ch. 3 - Bryant & O’Hallaron
- Syscall tables: Linux kernel source, arch/x86/entry/syscalls/syscall_64.tbl
- Process states: “Operating Systems: Three Easy Pieces” Ch. 4 - Arpaci-Dusseau
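One way to turn syscall numbers into names is to generate a lookup table from syscall_64.tbl. The sketch below parses a single line of that file, assuming the whitespace-separated layout of number, ABI, name, and entry point that the file's header comment describes. The function name parse_tbl_line and the 64-byte name buffer are illustrative choices, not part of the kernel or any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one line of syscall_64.tbl: "<number> <abi> <name> <entry point>".
 * Returns 1 and fills *nr and name for entries usable by native 64-bit
 * programs (abi "common" or "64"). Returns 0 for comments, blank lines,
 * malformed lines, and x32-only entries; *nr and name may still have been
 * written in that case and should be ignored. */
static int parse_tbl_line(const char *line, long *nr, char name[64])
{
    char abi[16];

    assert(line != NULL && nr != NULL && name != NULL);

    /* Comment lines start with '#', so the %ld conversion fails on them. */
    if (sscanf(line, "%ld %15s %63s", nr, abi, name) != 3)
        return 0;

    if (strcmp(abi, "common") != 0 && strcmp(abi, "64") != 0)
        return 0;

    return 1;
}
```

Feeding every line of the file through a function like this at build time or startup gives an array indexed by syscall number, which avoids maintaining the name list by hand.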
Real-World Outcome
```
$ ./mystrace ls -la /tmp
execve("/usr/bin/ls", ["ls", "-la", "/tmp"], 0x7ffd4c2e3b80 /* 67 vars */) = 0
brk(NULL) = 0x555555756000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fffffffea00) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7fb7000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=107619, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 107619, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ffff7f9c000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0"..., 832) = 832
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=163200, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 174600, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ffff7f71000
...
write(1, "total 48\n", 9) = 9
write(1, "drwxrwxrwt 13 root root 4096 "..., 75) = 75
write(1, "drwxr-xr-x 19 root root 4096 "..., 72) = 72
...
+++ exited with 0 +++
```
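The return values in the output above come in three shapes: plain integers, hexadecimal addresses, and errors printed as -1 with an errno name and message. The kernel reports an error by returning a value between -4095 and -1, which is the negated errno. The following is a minimal sketch of that formatting; format_retval, errno_name, and the as_hex flag are hypothetical names, and the errno list is deliberately partial.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Symbolic names for a few errno values; a real tracer needs the full list. */
static const char *errno_name(int err)
{
    switch (err) {
    case EPERM:  return "EPERM";
    case ENOENT: return "ENOENT";
    case EINTR:  return "EINTR";
    case EBADF:  return "EBADF";
    case EACCES: return "EACCES";
    case EINVAL: return "EINVAL";
    default:     return NULL;
    }
}

/* Format the raw value read from the return register at a syscall-exit stop.
 * as_hex selects address-style output for syscalls such as brk and mmap. */
static void format_retval(long long raw, int as_hex, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    if (raw >= -4095 && raw <= -1) {
        /* Error case: the raw value is -errno. */
        int err = (int)-raw;
        const char *name = errno_name(err);
        if (name != NULL)
            snprintf(buf, cap, "-1 %s (%s)", name, strerror(err));
        else
            snprintf(buf, cap, "-1 errno %d", err);  /* fallback for unlisted codes */
    } else if (as_hex) {
        snprintf(buf, cap, "%#llx", (unsigned long long)raw);
    } else {
        snprintf(buf, cap, "%lld", raw);
    }
}
```

Which syscalls return addresses is something the tracer has to know per syscall; here the caller passes that in. Kernel-internal restart codes such as 512 (ERESTARTSYS), which a tracer can observe when a signal interrupts a syscall, are not defined in the user-space errno.h and fall through to the fallback branch in this sketch.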
Implementation Guide
- Reproduce the simplest happy-path scenario.
- Build the smallest working version of the core feature.
- Add input validation and error handling.
- Add instrumentation/logging to confirm behavior.
- Refactor into clean modules with tests.
Milestones
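- Milestone 1: Minimal working tracer that runs a single-process command end-to-end and logs each syscall number and return value.
- Milestone 2: Correct syscall names, decoded arguments, and return values for typical commands such as ls.
- Milestone 3: Robust handling of edge cases: multi-threaded programs, child processes created through fork/exec, and syscalls restarted after signals.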
- Milestone 4: Clean structure and documented usage.
Validation Checklist
- Output follows the format of the real-world outcome example (addresses, sizes, and environment variable counts will differ between machines)
- Handles invalid inputs safely
- Provides clear errors and exit codes
- Repeatable results across runs
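Matching the example output also means quoting buffer contents the same way, as in the read and write lines of the real-world outcome. The sketch below is modeled on the style visible in that example: printable bytes are copied, a few characters get C-style escapes, everything else becomes an octal escape, and a truncated buffer is followed by three dots. The function name quote_bytes and its parameters are hypothetical, and the exact rules of the real strace may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write a quoted, escaped rendering of at most max_bytes bytes of data
 * into out. Returns 0 on success, -1 if out is too small. */
static int quote_bytes(const unsigned char *data, size_t len, size_t max_bytes,
                       char *out, size_t cap)
{
    assert(data != NULL && out != NULL);

    size_t shown = len < max_bytes ? len : max_bytes;
    /* Worst case: 4 characters per byte, two quotes, "...", and the NUL. */
    if (cap < 4 * shown + 6)
        return -1;

    char *p = out;
    *p++ = '"';
    for (size_t i = 0; i < shown; i++) {
        unsigned char c = data[i];
        const char *simple = NULL;

        switch (c) {
        case '\n': simple = "\\n";  break;
        case '\t': simple = "\\t";  break;
        case '\r': simple = "\\r";  break;
        case '"':  simple = "\\\""; break;
        case '\\': simple = "\\\\"; break;
        }

        if (simple != NULL) {
            memcpy(p, simple, 2);
            p += 2;
        } else if (c >= 0x20 && c <= 0x7e) {
            *p++ = (char)c;               /* printable ASCII is copied as-is */
        } else {
            /* Use the shortest octal escape, but pad to three digits when the
             * next byte is an octal digit so the escape stays unambiguous. */
            int pad = i + 1 < shown && data[i + 1] >= '0' && data[i + 1] <= '7';
            *p++ = '\\';
            if (pad || (c >> 6) != 0)
                *p++ = (char)('0' + (c >> 6));
            if (pad || (c >> 3) != 0)
                *p++ = (char)('0' + ((c >> 3) & 7));
            *p++ = (char)('0' + (c & 7));
        }
    }
    *p++ = '"';
    if (len > shown) {                    /* mark truncated buffers */
        memcpy(p, "...", 3);
        p += 3;
    }
    *p = '\0';
    return 0;
}
```

The max_bytes limit keeps large reads and writes from flooding the log; the trailing dots tell the reader that only a prefix of the buffer is shown.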
References
- Main guide: TRACK_A_OS_KERNEL_PROJECTS.md
- “The Linux Programming Interface” by Michael Kerrisk