Project 1: Syscall Tracer (strace clone)

A tool that intercepts and logs every system call a program makes, showing the syscall name, arguments, and return value in real time.

Quick Reference

Attribute              Value
Primary Language       See main guide
Alternative Languages  N/A
Difficulty             Level 3: Advanced
Time Estimate          2-3 weeks
Knowledge Area         OS Internals / Debugging
Tooling                ptrace
Prerequisites          C programming, basic understanding of processes, comfort reading man pages

What You Will Build

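A command-line tool, run as ./mystrace followed by a command and its arguments, that starts the command under ptrace and logs each system call as it happens: the syscall name, its decoded arguments, and the return value (with the errno name and message on failure). When the traced program finishes, the tool prints a final exit-status line.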

Why It Matters

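Writing a tracer makes you work directly with the syscall table, the x86-64 calling convention, the kernel/user boundary, and the process lifecycle. These are the same mechanisms that debuggers and tracing tools such as strace are built on.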

Core Challenges

  • Decoding syscall numbers to names (maps to syscall table structure)
  • Reading arguments from registers and memory (maps to calling conventions, kernel/user boundary)
  • Handling multi-threaded programs (maps to process lifecycle, threading model)
  • Following child processes through fork/exec (maps to process lifecycle)
  • Dealing with restartable syscalls after signals (maps to interrupt handling)
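Reading arguments from memory, the second challenge above, usually means copying a NUL-terminated string out of the tracee one word at a time. The following is a minimal sketch, assuming Linux and PTRACE_PEEKDATA. The names peek_fn, peek_tracee, peek_local, and read_string are hypothetical, and the callback indirection is there only so that the copy loop can be exercised without a live tracee.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical callback type: fetch one word at addr, return 0 on success. */
typedef int (*peek_fn)(void *ctx, unsigned long addr, long *word);

/* Reads one word from a stopped tracee; ctx points at the tracee's pid_t. */
int peek_tracee(void *ctx, unsigned long addr, long *word)
{
    pid_t pid = *(pid_t *)ctx;

    errno = 0; /* PEEKDATA returns the data itself, so -1 alone is ambiguous */
    *word = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
    return (*word == -1 && errno != 0) ? -1 : 0;
}

/* Same interface, but reads the tracer's own memory (useful in tests). */
int peek_local(void *ctx, unsigned long addr, long *word)
{
    (void)ctx;
    memcpy(word, (const void *)addr, sizeof *word);
    return 0;
}

/* Copies a NUL-terminated string starting at addr into buf, truncating to
 * cap - 1 bytes. Returns the number of bytes copied (excluding the NUL),
 * or -1 if a word could not be read. Limitation of this sketch: it reads
 * whole words, so a read that runs past the end of a mapping can fail even
 * when the terminating NUL lies before the boundary. */
long read_string(peek_fn peek, void *ctx, unsigned long addr,
                 char *buf, size_t cap)
{
    size_t len = 0;

    assert(buf != NULL && cap > 0);
    while (len + 1 < cap) {
        long word;
        unsigned char bytes[sizeof word];

        if (peek(ctx, addr + len, &word) < 0)
            return -1;
        memcpy(bytes, &word, sizeof word);
        for (size_t i = 0; i < sizeof word && len + 1 < cap; i++) {
            if (bytes[i] == '\0') {
                buf[len] = '\0';
                return (long)len;
            }
            buf[len++] = (char)bytes[i];
        }
    }
    buf[len] = '\0'; /* truncated: the caller may want to print "..." */
    return (long)len;
}
```

In a tracer, the call would look like read_string(peek_tracee, &pid, regs.rsi, buf, sizeof buf) at the entry stop of a call such as openat, where the second argument register holds the path pointer.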

Key Concepts

  • ptrace mechanism: “The Linux Programming Interface” Ch. 26 - Michael Kerrisk
  • x86-64 calling conventions: “Computer Systems: A Programmer’s Perspective” Ch. 3 - Bryant & O’Hallaron
  • Syscall tables: Linux kernel source arch/x86/entry/syscalls/syscall_64.tbl
  • Process states: “Operating Systems: Three Easy Pieces” Ch. 4 - Arpaci-Dusseau

Real-World Outcome

$ ./mystrace ls -la /tmp
execve("/usr/bin/ls", ["ls", "-la", "/tmp"], 0x7ffd4c2e3b80 /* 67 vars */) = 0
brk(NULL)                               = 0x555555756000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fffffffea00) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffff7fb7000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=107619, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 107619, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ffff7f9c000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0"..., 832) = 832
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=163200, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 174600, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ffff7f71000
...
write(1, "total 48\n", 9)               = 9
write(1, "drwxrwxrwt 13 root root  4096 "..., 75) = 75
write(1, "drwxr-xr-x 19 root root  4096 "..., 72) = 72
...
+++ exited with 0 +++
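Two details of the output above are easy to miss: the call text is padded so that the equals sign starts in a fixed column, and a failing call is shown as -1 followed by the errno name and message. On Linux the kernel reports failure as a negative errno value in the range -4095 to -1 in the return register. The sketch below reproduces that layout under stated assumptions: format_result and errno_name are hypothetical helpers, the errno table is deliberately tiny, the "errno N" fallback is an invented placeholder, and every successful result is printed in decimal, whereas the sample prints pointer results such as those of mmap in hex.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, deliberately tiny errno-name table; a real tool needs all. */
static const char *errno_name(int e)
{
    switch (e) {
    case ENOENT: return "ENOENT";
    case EINVAL: return "EINVAL";
    case EACCES: return "EACCES";
    default:     return NULL;
    }
}

/* Formats "call = result" into buf, padding the call text to 39 columns so
 * that " = " ends at column 42 for short calls. ret is the raw value
 * from the return register at the syscall-exit stop. Returns what snprintf
 * returns. */
int format_result(char *buf, size_t cap, const char *call, long long ret)
{
    assert(buf != NULL && call != NULL && cap > 0);

    if (ret < 0 && ret >= -4095) { /* kernel convention: -errno on failure */
        int e = (int)-ret;
        const char *name = errno_name(e);

        if (name != NULL)
            return snprintf(buf, cap, "%-39s = -1 %s (%s)",
                            call, name, strerror(e));
        /* Placeholder format for names missing from the table above. */
        return snprintf(buf, cap, "%-39s = -1 errno %d (%s)",
                        call, e, strerror(e));
    }
    return snprintf(buf, cap, "%-39s = %lld", call, ret);
}
```

Calls longer than 39 characters are not truncated; the padding simply disappears, which matches the longer openat and mmap lines in the sample.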

Implementation Guide

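  1. Reproduce the simplest happy-path scenario: trace one single-threaded command (for example ls) and print the raw number and return value of each syscall.
  2. Build the smallest working version of the core feature: decode syscall numbers to names and print the arguments held in registers.
  3. Add input validation and error handling: a missing or nonexistent command, and failed ptrace or wait calls.
  4. Add instrumentation/logging to confirm behavior, for example by comparing your output with strace on the same command.
  5. Refactor into clean modules with tests.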
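For the first two steps, a happy-path tracer can be quite small. The sketch below assumes Linux on x86-64 and traces a single process only: no child following, no threads, no name or argument decoding. trace_command is a hypothetical entry point, and both the printed line format and the 128-plus-signal return value are choices made for this example rather than anything strace prescribes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv under ptrace, writes one raw line per syscall to out, and
 * returns the tracee's exit status (-1 on tracer error). */
int trace_command(char *const argv[], FILE *out)
{
    int status, in_syscall = 0;
    long sig = 0;
    pid_t child;

    assert(argv != NULL && argv[0] != NULL && out != NULL);
    child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(126);
        kill(getpid(), SIGSTOP); /* let the parent set options before exec */
        execvp(argv[0], argv);
        _exit(127);
    }

    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    /* TRACESYSGOOD marks syscall stops with SIGTRAP|0x80; TRACEEXEC turns
     * the post-exec SIGTRAP into an event stop that is not injected. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)(PTRACE_O_TRACESYSGOOD | PTRACE_O_TRACEEXEC)) < 0)
        goto fail;

    for (;;) {
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)sig) < 0)
            goto fail;
        if (waitpid(child, &status, 0) < 0)
            goto fail;
        sig = 0;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            struct user_regs_struct regs;

            if (ptrace(PTRACE_GETREGS, child, NULL, &regs) < 0)
                goto fail;
            if (!in_syscall) /* entry stop: number and six argument registers */
                fprintf(out, "syscall_%lld(%#llx, %#llx, %#llx, %#llx, %#llx, %#llx)",
                        (long long)regs.orig_rax, regs.rdi, regs.rsi,
                        regs.rdx, regs.r10, regs.r8, regs.r9);
            else             /* exit stop: result is in rax */
                fprintf(out, " = %lld\n", (long long)regs.rax);
            in_syscall = !in_syscall;
        } else if ((status >> 16) == 0) {
            sig = WSTOPSIG(status); /* ordinary signal: re-inject it */
        }
    }

    if (in_syscall) /* e.g. exit_group never reaches its exit stop */
        fputs(" = ?\n", out);
    if (WIFEXITED(status)) {
        fprintf(out, "+++ exited with %d +++\n", WEXITSTATUS(status));
        return WEXITSTATUS(status);
    }
    fprintf(out, "+++ killed by signal %d +++\n", WTERMSIG(status));
    return 128 + WTERMSIG(status);

fail:
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return -1;
}
```

A main function that checks argc and returns trace_command(argv + 1, stderr) is enough to turn this into a runnable ./mystrace. Writing the trace to stderr keeps it separate from the traced program's own output.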

Milestones

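  • Milestone 1: Minimal working program that runs end-to-end: it traces a single process and logs every syscall until the process exits.
  • Milestone 2: Correct outputs for typical inputs: names, arguments, and return values match the format of the real-world outcome example.
  • Milestone 3: Robust handling of edge cases: multi-threaded programs, child processes created through fork/exec, and syscalls restarted after signals.
  • Milestone 4: Clean structure and documented usage.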

Validation Checklist

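  • Output for ./mystrace ls -la /tmp matches the format of the real-world outcome example (syscall name, arguments, return value)
  • Handles invalid inputs safely (for example a missing or nonexistent command)
  • Provides clear errors and exit codes
  • Produces repeatable results across runs of the same command, allowing for values such as addresses that can legitimately differ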

References

  • Main guide: TRACK_A_OS_KERNEL_PROJECTS.md
  • “The Linux Programming Interface” by Michael Kerrisk