Project 8: System Call Interface
Implement raw syscalls in assembly and build a tiny syscall tracer.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 10-14 hours |
| Main Programming Language | C + x86-64 assembly |
| Alternative Programming Languages | Rust (inline asm) |
| Coolness Level | High |
| Business Potential | Medium (debug tooling) |
| Prerequisites | ABI knowledge, registers, ptrace basics |
| Key Topics | syscall ABI, user/kernel boundary, tracing |
1. Learning Objectives
By completing this project, you will:
- Invoke syscalls directly via the x86-64 syscall instruction.
- Map syscall numbers to names and decode arguments.
- Use ptrace to intercept syscall entry and exit.
- Explain why the kernel validates user pointers.
2. All Theory Needed (Per-Concept Breakdown)
Syscall ABI and Tracing
Fundamentals
System calls are the controlled entry points from user space into the kernel. On x86-64 Linux, arguments are passed in specific registers and the syscall instruction triggers a transition to kernel mode. The kernel validates arguments, performs the requested operation, and returns a result in RAX. A tracer can use ptrace to stop a process at each syscall entry and exit, inspect registers, and print a human-readable log. This is the foundation of tools like strace.
Deep Dive into the concept
The syscall ABI is a contract between user space and the kernel. For x86-64, the syscall number goes in RAX, and the first six arguments go in RDI, RSI, RDX, R10, R8, and R9. The syscall instruction switches the CPU to kernel mode and jumps to the kernel entry point, saving the user return address in RCX and the flags in R11, so both registers are clobbered across the call. The kernel's entry code then switches to the current thread's kernel stack, saves register state, validates pointers, performs the syscall, and returns to user space with sysret or iret.
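The register contract above can be sketched as a GCC/Clang inline-assembly wrapper. This is a minimal sketch, assuming x86-64 Linux; the name raw_syscall6 is hypothetical and not part of any library.

```c
/* Hypothetical helper: issue a Linux x86-64 syscall with up to six
 * arguments and return the raw kernel value (negative errno on failure). */
static inline long raw_syscall6(long nr, long a1, long a2, long a3,
                                long a4, long a5, long a6)
{
    long ret;
    /* R10, R8 and R9 have no constraint letters, so bind them explicitly. */
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    __asm__ volatile(
        "syscall"
        : "=a"(ret)                            /* result returns in RAX */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3),  /* RAX, RDI, RSI, RDX */
          "r"(r10), "r"(r8), "r"(r9)
        : "rcx", "r11", "memory");             /* syscall overwrites RCX, R11 */
    return ret;
}
```

For example, raw_syscall6(39, 0, 0, 0, 0, 0, 0) performs getpid, since 39 is the getpid number on x86-64. Unused argument slots can be passed as 0 because the kernel ignores registers the syscall does not read.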
Direct syscalls bypass libc. This is useful for understanding the minimal OS API and for debugging. However, direct syscalls do not set errno; you must interpret the return value yourself. On failure the kernel returns a negated error code in RAX (a value from -4095 to -1), and libc's wrappers translate that into errno and a return value of -1. Your raw syscall layer should mimic this behavior if you want a libc-like API.
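One way to mimic the libc convention is a small translation step applied to every raw return value. The sketch below assumes the Linux rule that only values from -4095 to -1 denote errors; the helper name syscall_ret is hypothetical.

```c
#include <errno.h>

/* Hypothetical helper: convert a raw kernel return value into the
 * libc convention (-1 on failure, with errno set). */
static inline long syscall_ret(long raw)
{
    /* Only -4095..-1 are error codes. Other "negative" values can be
     * valid results, for example a high address returned by mmap. */
    if ((unsigned long)raw >= (unsigned long)-4095L) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

The range check matters more than a plain `raw < 0` test because some syscalls return pointers or large unsigned values that look negative when viewed as a signed long.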
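Tracing requires understanding ptrace’s stop semantics. With PTRACE_SYSCALL, the traced process stops twice per syscall: once at entry, after it has trapped into the kernel but before the handler runs, and once at exit, before it returns to user space. Both stops look the same to waitpid, so the tracer must alternate between them to capture both arguments and return values. Setting PTRACE_O_TRACESYSGOOD makes syscall stops report SIGTRAP | 0x80, which distinguishes them from ordinary SIGTRAP deliveries. A tracer must also handle signals and process creation if it wants to trace forked children. For this project, tracing a single process is enough, but you should still handle SIGTRAP and SIGSTOP correctly.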
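Sorting out which kind of stop waitpid just reported is easier with a small classifier. The sketch below assumes the tracer has set PTRACE_O_TRACESYSGOOD, so that syscall stops arrive as SIGTRAP | 0x80; the enum and function names are hypothetical.

```c
#include <signal.h>
#include <sys/wait.h>

enum stop_kind { STOP_EXITED, STOP_KILLED, STOP_SYSCALL, STOP_SIGNAL, STOP_OTHER };

/* Classify a waitpid status. *detail receives the exit code, the
 * terminating signal, or the stopping signal, depending on the kind.
 * Assumes PTRACE_O_TRACESYSGOOD is in effect for the tracee. */
static enum stop_kind classify_status(int status, int *detail)
{
    *detail = 0;
    if (WIFEXITED(status)) {
        *detail = WEXITSTATUS(status);
        return STOP_EXITED;
    }
    if (WIFSIGNALED(status)) {
        *detail = WTERMSIG(status);
        return STOP_KILLED;
    }
    if (WIFSTOPPED(status)) {
        int sig = WSTOPSIG(status);
        if (sig == (SIGTRAP | 0x80))
            return STOP_SYSCALL;        /* entry or exit stop */
        /* The tracer decides what to do with the signal: the initial
         * SIGSTOP and the plain SIGTRAP after execve are usually
         * swallowed, other signals are forwarded on resume. */
        *detail = sig;
        return STOP_SIGNAL;
    }
    return STOP_OTHER;
}
```

Keeping this logic in one pure function makes it testable without a live tracee, because status words can be constructed by hand.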
Decoding syscalls requires a mapping from syscall numbers to names and argument formats. You can build a small table for common syscalls (openat, read, write, execve, mmap). When printing arguments, you can show raw values for pointers and sizes. For strings, you can use ptrace(PTRACE_PEEKDATA) to read memory from the traced process, but that is optional. Focus on correctness and determinism.
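A small table plus a formatter is enough to get readable output with raw hex arguments. The sketch below uses the x86-64 Linux numbers for the syscalls named above; the function names lookup_syscall and format_entry, and the fallback label syscall_<nr>, are illustrative choices.

```c
#include <stddef.h>
#include <stdio.h>

struct syscall_info {
    int nr;
    const char *name;
    int argc;
};

/* x86-64 Linux syscall numbers for a handful of common calls. */
static const struct syscall_info syscall_table[] = {
    {   0, "read",       3 },
    {   1, "write",      3 },
    {   9, "mmap",       6 },
    {  59, "execve",     3 },
    { 231, "exit_group", 1 },
    { 257, "openat",     4 },
};

/* Linear search is fine for a table this small. */
static const struct syscall_info *lookup_syscall(long nr)
{
    size_t n = sizeof syscall_table / sizeof syscall_table[0];
    for (size_t i = 0; i < n; i++)
        if (syscall_table[i].nr == nr)
            return &syscall_table[i];
    return NULL;
}

/* Write "name(0x.., 0x..)" into buf. Unknown numbers are shown as
 * syscall_<nr> with all six argument registers. */
static int format_entry(char *buf, size_t cap, long nr,
                        const unsigned long long args[6])
{
    const struct syscall_info *si = lookup_syscall(nr);
    int argc = si ? si->argc : 6;
    int len = si ? snprintf(buf, cap, "%s(", si->name)
                 : snprintf(buf, cap, "syscall_%ld(", nr);

    for (int i = 0; i < argc && len >= 0 && (size_t)len < cap; i++)
        len += snprintf(buf + len, cap - (size_t)len, "%s0x%llx",
                        i ? ", " : "", args[i]);
    if (len >= 0 && (size_t)len < cap)
        len += snprintf(buf + len, cap - (size_t)len, ")");
    return len;
}
```

Printing every argument as hex keeps the output deterministic; string decoding can be layered on later for the arguments known to be pointers to text.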
This project exposes the OS boundary: a system call is not a function call. It changes privilege level, uses a different stack, and must defensively validate user inputs. That is why syscalls are the security boundary for the OS.
How this fits into the projects
This concept supports Section 3.2 and Section 3.7 and is used later in Project 15 (kernel module interactions) and Project 16 (network syscalls).
Definitions & key terms
- Syscall: request from user space to kernel.
- ABI: application binary interface (register and calling conventions).
- ptrace: debugging API to control another process.
- errno: thread-local error code set by libc.
Mental model diagram (ASCII)
User code -> syscall instruction -> kernel entry -> kernel work -> return
How it works (step-by-step)
- Load syscall number and args into registers.
- Execute syscall.
- Kernel validates and executes.
- Return value in RAX.
- Tracer stops at entry/exit and logs.
Minimal concrete example
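mov rax, 1      ; syscall number: write
mov rdi, 1      ; arg 1: fd (stdout)
mov rsi, msg    ; arg 2: buffer address
mov rdx, len    ; arg 3: byte count
syscall         ; result (or negative errno) returns in RAX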
Common misconceptions
- “syscall is just a function”: it changes privilege level.
- “errno is a syscall”: errno is set by libc.
- “ptrace sees only entry”: it stops at entry and exit.
Check-your-understanding questions
- Which registers hold syscall arguments on x86-64?
- Why must the kernel validate user pointers?
- Why does ptrace stop twice per syscall?
Check-your-understanding answers
- RDI, RSI, RDX, R10, R8, R9.
- To avoid kernel memory corruption or leakage.
- One stop for entry (args) and one for exit (return).
Real-world applications
- Debugging tools like strace.
- Security monitoring of syscalls.
Where you’ll apply it
- This project: Section 3.2, Section 3.7, Section 5.10 Phase 2.
- Also used in: Project 15, Project 16.
References
- “TLPI” Ch. 3, 26
- x86-64 SysV ABI documentation
Key insights
The syscall ABI is the real OS API, and tracing reveals its exact behavior.
Summary
By invoking syscalls directly and tracing them, you demystify the user/kernel boundary.
Homework/Exercises to practice the concept
- Add a raw syscall for getpid and gettid.
- Decode openat flags into names.
- Trace execve and print argv length.
Solutions to the homework/exercises
- Use syscall numbers 39 and 186 on x86-64.
- Map O_RDONLY/O_WRONLY/O_CREAT bits.
- Read argv pointers via ptrace PEEKDATA.
3. Project Specification
3.1 What You Will Build
A small library that performs raw syscalls and a tracer that logs syscalls for a target process.
3.2 Functional Requirements
- Implement raw syscalls for at least two functions.
- Build a syscall table for common syscalls.
- Trace a target process and log entry/exit.
- Print return values and errors.
3.3 Non-Functional Requirements
- Performance: trace simple programs without huge overhead.
- Reliability: consistent output for fixed commands.
- Usability: ./syscall_lab raw and ./syscall_lab trace /bin/ls.
3.4 Example Usage / Output
$ ./syscall_lab raw
[raw] write(1, "Hello\n", 6) -> 6
3.5 Data Formats / Schemas / Protocols
- syscall table: number, name, arg count.
3.6 Edge Cases
- A raw syscall returns -EINTR when a signal interrupts it and must be retried.
- Tracee exits quickly (handle exit events).
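The -EINTR case can be handled with a retry loop around the raw call. The sketch below assumes x86-64 Linux, where write is syscall 1; raw_write and write_all are hypothetical names, and the loop also covers short writes.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical raw wrapper: write is syscall 1 on x86-64 Linux.
 * Returns the byte count or a negative errno. */
static long raw_write(int fd, const void *buf, size_t len)
{
    long ret;
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                     : "rcx", "r11", "memory");
    return ret;
}

/* Write the whole buffer, retrying on -EINTR and continuing after
 * short writes. Returns len on success or a negative errno. */
static long write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        long r = raw_write(fd, buf + done, len - done);
        if (r == -EINTR)
            continue;          /* interrupted before any byte moved */
        if (r < 0)
            return r;          /* any other error goes to the caller */
        if (r == 0)
            break;             /* no progress: avoid spinning forever */
        done += (size_t)r;
    }
    return (long)done;
}
```

Because the raw layer never touches errno, the comparison is against -EINTR directly rather than against -1 followed by an errno check.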
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
./syscall_lab raw
./syscall_lab trace /bin/echo --seed 42
3.7.2 Golden Path Demo (Deterministic)
- Use --seed 42 for any randomized formatting.
3.7.3 If CLI: exact terminal transcript
$ ./syscall_lab trace /bin/echo hello
execve("/bin/echo", ["echo","hello"], [/* env */]) = 0
write(1, "hello\n", 6) = 6
exit_group(0) = ?
Failure demo (deterministic):
$ ./syscall_lab trace /nope
error: execve failed (ENOENT)
Exit codes:
- 0: success
- 2: invalid args
- 3: trace error
4. Solution Architecture
4.1 High-Level Design
Raw syscall lib -> tracer -> syscall table -> formatter
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Raw syscall | inline asm wrapper | x86-64 ABI |
| Tracer | ptrace loop | entry/exit state |
| Formatter | pretty output | minimal parsing |
4.3 Data Structures (No Full Code)
struct syscall_info {
int nr;
const char *name;
int argc;
};
4.4 Algorithm Overview
Key Algorithm: ptrace loop
- fork and ptrace child.
- wait for entry stop.
- read registers, print syscall.
- resume and wait for exit stop.
- print return value.
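The loop above can be sketched as follows. This is a minimal single-process sketch, assuming x86-64 Linux (it reads struct user_regs_struct) and using PTRACE_O_TRACESYSGOOD to tell syscall stops from signal stops; the name trace_command and the log format are hypothetical.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv under ptrace and log each syscall to `log`. Returns the
 * number of completed entry/exit pairs, or -1 if tracing could not start. */
static long trace_command(char *const argv[], FILE *log)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(126);              /* tracing not permitted */
        raise(SIGSTOP);              /* let the parent set options first */
        execvp(argv[0], argv);
        _exit(127);                  /* exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    ptrace(PTRACE_SETOPTIONS, pid, NULL, (void *)(long)PTRACE_O_TRACESYSGOOD);

    long pairs = 0;
    int in_syscall = 0;              /* toggles between entry and exit */
    int sig = 0;                     /* signal to forward on resume */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) < 0)
            break;
        if (waitpid(pid, &status, 0) < 0)
            break;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        sig = 0;
        if (WSTOPSIG(status) != (SIGTRAP | 0x80)) {
            /* Signal stop: swallow the plain SIGTRAP sent after execve,
             * forward everything else to the tracee. */
            if (WSTOPSIG(status) != SIGTRAP)
                sig = WSTOPSIG(status);
            continue;
        }
        struct user_regs_struct regs;
        if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) < 0)
            break;
        if (!in_syscall) {
            /* At entry, orig_rax holds the syscall number. */
            fprintf(log, "syscall_%llu(0x%llx, 0x%llx, 0x%llx)",
                    regs.orig_rax, regs.rdi, regs.rsi, regs.rdx);
        } else {
            /* At exit, rax holds the result or a negative errno. */
            fprintf(log, " = %lld\n", (long long)regs.rax);
            pairs++;
        }
        in_syscall = !in_syscall;
    }
    if (in_syscall)
        fprintf(log, " = ?\n");      /* tracee exited inside a syscall */
    return pairs;
}
```

A call such as trace_command((char *[]){"/bin/echo", "hello", NULL}, stderr) would log one line per syscall. The final exit_group never reaches its exit stop, which is why the sketch prints "= ?" for it.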
Complexity Analysis:
- Time: O(n) syscalls
- Space: O(1)
5. Implementation Guide
5.1 Development Environment Setup
sudo apt-get install build-essential
5.2 Project Structure
project-root/
|-- syscall_lab.c
|-- syscalls.h
`-- Makefile
5.3 The Core Question You’re Answering
“What actually happens at the user/kernel boundary, and how can I observe it?”
5.4 Concepts You Must Understand First
- x86-64 syscall ABI.
- ptrace stop states.
- errno conventions.
5.5 Questions to Guide Your Design
- How will you map numbers to names?
- How will you handle multi-threaded targets?
- How will you read string arguments safely?
5.6 Thinking Exercise
Predict the first 5 syscalls made by /bin/ls.
5.7 The Interview Questions They’ll Ask
- Why does the kernel validate user pointers?
- What registers carry syscall args?
5.8 Hints in Layers
Hint 1: Start with raw getpid syscall.
Hint 2: Add ptrace loop for entry/exit.
Hint 3: Decode a handful of syscalls.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Syscalls | TLPI | 3 |
| ptrace | TLPI | 26 |
5.10 Implementation Phases
Phase 1: Raw syscalls (2-3 hours)
Goals: execute write/getpid.
Phase 2: Tracer loop (3-5 hours)
Goals: entry/exit logging.
Phase 3: Formatting (3-4 hours)
Goals: readable output and errors.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Arg decoding | raw hex vs strings | raw first | simplicity |
| Trace scope | single process | single process | avoid fork handling |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit | syscall wrapper | getpid == libc getpid |
| Integration | trace /bin/echo | compare with strace |
| Error | invalid exec | ENOENT behavior |
6.2 Critical Test Cases
- Raw syscall returns negative errno.
- Tracee exits immediately.
- Tracee writes to stdout.
6.3 Test Data
cmd: /bin/echo hello
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong syscall numbers | -ENOSYS | verify table |
| Missing entry/exit toggle | duplicate logs | track state |
| ptrace permission error | EPERM | run as same user |
7.2 Debugging Strategies
- Compare output with strace -o ref.
- Print raw register values on each stop.
7.3 Performance Traps
- Reading large strings with ptrace is slow.
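PTRACE_PEEKDATA transfers one machine word per call, so a long string costs many syscalls. One common alternative on Linux is the process_vm_readv syscall, which copies a whole buffer at once. The sketch below is an assumption about how that could look, not part of the project's required design; read_remote_string is a hypothetical name, and the call is made through syscall(2) so that no feature-test macro is needed.

```c
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Copy a NUL-terminated string from address `addr` in process `pid`
 * into buf (capacity cap, truncating if needed). Returns the string
 * length, or -1 if nothing could be read. */
static ssize_t read_remote_string(pid_t pid, unsigned long addr,
                                  char *buf, size_t cap)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 0;

    if (cap == 0)
        return -1;
    while (len < cap - 1) {
        /* Stay within one page per request: a remote iovec element that
         * touches an unmapped page fails as a whole. */
        size_t chunk = page - ((addr + len) % page);
        if (chunk > cap - 1 - len)
            chunk = cap - 1 - len;

        struct iovec local  = { buf + len, chunk };
        struct iovec remote = { (void *)(addr + len), chunk };
        long n = syscall(SYS_process_vm_readv, (long)pid,
                         &local, 1UL, &remote, 1UL, 0UL);
        if (n <= 0) {
            if (len == 0)
                return -1;
            break;
        }
        char *nul = memchr(buf + len, '\0', (size_t)n);
        if (nul)
            return nul - buf;          /* terminator already copied */
        len += (size_t)n;
    }
    buf[len] = '\0';
    return (ssize_t)len;
}
```

The call is subject to the same permission checks as attaching with ptrace, so it can fail with EPERM in restricted environments. Falling back to PTRACE_PEEKDATA in that case keeps the tracer working, only more slowly.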
8. Extensions & Challenges
8.1 Beginner Extensions
- Add decoding for read/write/open.
8.2 Intermediate Extensions
- Trace forked children (PTRACE_O_TRACEFORK).
8.3 Advanced Extensions
- Implement seccomp filter demo using your syscall table.
9. Real-World Connections
9.1 Industry Applications
- Observability and security tooling.
9.2 Related Open Source Projects
- strace: full syscall tracer.
9.3 Interview Relevance
- Syscall ABI questions.
10. Resources
10.1 Essential Reading
- TLPI Ch. 3, 26
10.2 Video Resources
- Kernel entry path lectures
10.3 Tools & Documentation
- man syscall, man ptrace
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain syscall argument registers.
- I can explain ptrace stops.
11.2 Implementation
- Raw syscalls and tracing work.
11.3 Growth
- I can explain syscalls in an interview.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Raw syscall and basic trace output.
Full Completion:
- Entry/exit logging with error decoding.
Excellence (Going Above & Beyond):
- Multi-process tracing and string decoding.