Project 8: System Call Interface

Implement raw syscalls in assembly and build a tiny syscall tracer.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Intermediate |
| Time Estimate | 10-14 hours |
| Main Programming Language | C + x86-64 assembly |
| Alternative Programming Languages | Rust (inline asm) |
| Coolness Level | High |
| Business Potential | Medium (debug tooling) |
| Prerequisites | ABI knowledge, registers, ptrace basics |
| Key Topics | syscall ABI, user/kernel boundary, tracing |

1. Learning Objectives

By completing this project, you will:

  1. Invoke syscalls directly via the x86-64 syscall instruction.
  2. Map syscall numbers to names and decode arguments.
  3. Use ptrace to intercept syscall entry and exit.
  4. Explain why the kernel validates user pointers.

2. All Theory Needed (Per-Concept Breakdown)

Syscall ABI and Tracing

Fundamentals

System calls are the controlled entry points from user space into the kernel. On x86-64 Linux, arguments are passed in specific registers and the syscall instruction triggers a transition to kernel mode. The kernel validates arguments, performs the requested operation, and returns a result in RAX. A tracer can use ptrace to stop a process at each syscall entry and exit, inspect registers, and print a human-readable log. This is the foundation of tools like strace.

Deep Dive into the concept

The syscall ABI is a contract between user space and the kernel. For x86-64, the syscall number goes in RAX, and the first six arguments go in RDI, RSI, RDX, R10, R8, and R9. R10 takes the place that RCX has in the function-call convention because the syscall instruction overwrites RCX. The syscall instruction switches the CPU to kernel mode, saves the return address in RCX and RFLAGS in R11, and jumps to the kernel entry point. It does not switch stacks itself; the kernel's entry code switches to the kernel stack of the current thread. The kernel then saves register state, validates pointers, performs the syscall, and returns to user space with sysret or iret.
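A minimal sketch of this register contract in C, assuming x86-64 Linux and a GCC- or Clang-compatible compiler. The helper names `raw_syscall3` and `raw_write` are illustrative and not part of any library; the constraints pin each value to the register the ABI requires and declare RCX and R11 as clobbered.

```c
#include <stddef.h>

/* Hypothetical helper: issue a syscall with up to three arguments.
 * Only valid on x86-64 Linux. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                           /* result comes back in RAX */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)  /* RAX, RDI, RSI, RDX */
        : "rcx", "r11", "memory"              /* overwritten by syscall */
    );
    return ret;
}

/* write(2) is syscall number 1 on x86-64 Linux. */
static long raw_write(int fd, const void *buf, size_t len)
{
    return raw_syscall3(1, fd, (long)buf, (long)len);
}
```

Calling `raw_write(1, "Hello\n", 6)` should return 6 on success. A bad descriptor yields a negative error code such as -EBADF instead of -1, because no libc layer is involved.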

Direct syscalls bypass libc. This is useful for understanding the minimal OS API and for debugging. However, direct syscalls do not set errno; you must interpret the return value yourself. On failure the kernel returns a negative error code in RAX (a value from -4095 to -1, such as -ENOENT), and libc translates it by storing the positive code in errno and returning -1. Your raw syscall layer should mimic this behavior if you want a libc-like API.
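The translation libc performs can be sketched as a small helper. The name `libc_style_result` is hypothetical; the range check (rather than a plain `raw < 0`) follows the convention that only values from -4095 to -1 denote errors.

```c
#include <errno.h>

/* Hypothetical helper: convert a raw kernel return value into the
 * libc convention (-1 with errno set on failure). */
static long libc_style_result(long raw)
{
    /* The kernel reports errors as values in [-4095, -1]. */
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;   /* store the positive error code */
        return -1;
    }
    return raw;              /* success: pass the value through */
}
```

A wrapper built on a raw syscall would return `libc_style_result(ret)` so that callers can keep using the familiar `if (rc == -1) perror(...)` pattern.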

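Tracing requires understanding ptrace’s stop semantics. With PTRACE_SYSCALL, the traced process stops twice per syscall: once at syscall entry, before the kernel executes the call, and once at syscall exit, before control returns to user space. You must alternate between these stops to capture both arguments and return values. At the entry stop on x86-64, read the syscall number from orig_rax, because RAX already holds -ENOSYS at that point; at the exit stop, RAX holds the return value. A tracer must also handle signals and process creation if it wants to trace forked children. For this project, tracing a single process is enough, but you should still handle SIGTRAP and SIGSTOP correctly: suppress the stops that tracing itself causes, and re-inject any other signal when you resume the tracee.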
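One way to keep these stop types apart is to classify each `waitpid()` status before touching registers. The sketch below assumes the tracer has set PTRACE_O_TRACESYSGOOD, so syscall stops are reported as `SIGTRAP | 0x80`; the enum and the function name `classify_stop` are illustrative. Suppressing every SIGSTOP is a simplification that suits a single-process tracer.

```c
#include <signal.h>
#include <sys/wait.h>

enum stop_kind { STOP_EXITED, STOP_SYSCALL, STOP_SIGNAL };

/* Classify a waitpid() status. On STOP_SIGNAL, *sig_to_inject is the
 * signal to pass back when resuming the tracee (0 means suppress it). */
static enum stop_kind classify_stop(int status, int *sig_to_inject)
{
    *sig_to_inject = 0;
    if (WIFEXITED(status) || WIFSIGNALED(status))
        return STOP_EXITED;
    if (WIFSTOPPED(status)) {
        int sig = WSTOPSIG(status);
        if (sig == (SIGTRAP | 0x80))
            return STOP_SYSCALL;          /* syscall entry or exit */
        /* Plain SIGTRAP (for example after exec) and the initial
         * SIGSTOP come from tracing itself, so they are not forwarded. */
        if (sig != SIGTRAP && sig != SIGSTOP)
            *sig_to_inject = sig;
    }
    return STOP_SIGNAL;
}
```

The tracer loop would pass `*sig_to_inject` as the data argument of the next PTRACE_SYSCALL request, so that signals such as SIGUSR1 still reach the tracee.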

Decoding syscalls requires a mapping from syscall numbers to names and argument formats. You can build a small table for common syscalls (openat, read, write, execve, mmap). When printing arguments, you can show raw values for pointers and sizes. For strings, you can use ptrace(PTRACE_PEEKDATA) to read memory from the traced process one machine word at a time, but that is optional. Focus on correctness and determinism.

This project exposes the OS boundary: a system call is not a function call. It changes privilege level, uses a different stack, and must defensively validate user inputs. That is why syscalls are the security boundary for the OS.

How this fits into the projects

This concept supports Section 3.2 and Section 3.7 and is used later in Project 15 (kernel module interactions) and Project 16 (network syscalls).

Definitions & key terms

  • Syscall: request from user space to kernel.
  • ABI: application binary interface (register and calling conventions).
  • ptrace: debugging API to control another process.
  • errno: thread-local error code set by libc.

Mental model diagram (ASCII)

User code -> syscall instruction -> kernel entry -> kernel work -> return

How it works (step-by-step)

  1. Load syscall number and args into registers.
  2. Execute syscall.
  3. Kernel validates and executes.
  4. Return value in RAX.
  5. Tracer stops at entry/exit and logs.

Minimal concrete example

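mov rax, 1      ; syscall number: write
mov rdi, 1      ; arg 1: fd (stdout)
mov rsi, msg    ; arg 2: buffer address (msg defined elsewhere)
mov rdx, len    ; arg 3: byte count (len defined elsewhere)
syscall         ; result in rax; rcx and r11 are overwritten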

Common misconceptions

  • “syscall is just a function”: it changes privilege level.
  • “errno is a syscall”: errno is set by libc.
  • “ptrace sees only entry”: it stops at entry and exit.

Check-your-understanding questions

  1. Which registers hold syscall arguments on x86-64?
  2. Why must the kernel validate user pointers?
  3. Why does ptrace stop twice per syscall?

Check-your-understanding answers

  1. RDI, RSI, RDX, R10, R8, R9.
  2. To avoid kernel memory corruption or leakage.
  3. One stop for entry (args) and one for exit (return).

Real-world applications

  • Debugging tools like strace.
  • Security monitoring of syscalls.

Where you’ll apply it

  • This project: Section 3.2, Section 3.7, Section 5.10 Phase 2.
  • Also used in: Project 15, Project 16.

References

  • “The Linux Programming Interface” (TLPI), Ch. 3 and 26
  • x86-64 SysV ABI documentation

Key insights

The syscall ABI is the real OS API, and tracing reveals its exact behavior.

Summary

By invoking syscalls directly and tracing them, you demystify the user/kernel boundary.

Homework/Exercises to practice the concept

  1. Add a raw syscall for getpid and gettid.
  2. Decode openat flags into names.
  3. Trace execve and print argv length.

Solutions to the homework/exercises

  1. Use syscall numbers 39 and 186 on x86-64.
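  2. Decode the access mode with flags & O_ACCMODE (O_RDONLY is 0, so it cannot be tested as a bit), then map remaining bits such as O_CREAT to names.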
  3. Read argv pointers via ptrace PEEKDATA.
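For exercise 2, one possible decoder is sketched below. The function name `decode_open_flags` and the choice of which flag bits to include are illustrative; a fuller table would cover more of the flags in `<fcntl.h>`.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render open/openat flags as "MODE|FLAG|FLAG". */
static void decode_open_flags(int flags, char *out, size_t cap)
{
    static const struct { int bit; const char *name; } bits[] = {
        { O_CREAT, "O_CREAT" },   { O_EXCL, "O_EXCL" },
        { O_TRUNC, "O_TRUNC" },   { O_APPEND, "O_APPEND" },
        { O_NONBLOCK, "O_NONBLOCK" }, { O_CLOEXEC, "O_CLOEXEC" },
    };
    const char *mode;

    if (cap == 0)
        return;
    /* The access mode is a small field, not a set of independent bits. */
    switch (flags & O_ACCMODE) {
    case O_RDONLY: mode = "O_RDONLY"; break;
    case O_WRONLY: mode = "O_WRONLY"; break;
    case O_RDWR:   mode = "O_RDWR";   break;
    default:       mode = "O_ACCMODE?"; break;
    }
    snprintf(out, cap, "%s", mode);

    /* Append the name of every known flag bit that is set. */
    for (size_t i = 0; i < sizeof bits / sizeof bits[0]; i++) {
        if (flags & bits[i].bit) {
            size_t used = strlen(out);
            snprintf(out + used, cap - used, "|%s", bits[i].name);
        }
    }
}
```

In the tracer, the flags value for openat is the third argument, so it would be read from RDX at the entry stop.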

3. Project Specification

3.1 What You Will Build

A small library that performs raw syscalls and a tracer that logs syscalls for a target process.

3.2 Functional Requirements

  1. Implement raw syscalls for at least two functions.
  2. Build a syscall table for common syscalls.
  3. Trace a target process and log entry/exit.
  4. Print return values and errors.

3.3 Non-Functional Requirements

  • Performance: trace simple programs without huge overhead.
  • Reliability: consistent output for fixed commands.
  • Usability: ./syscall_lab raw and ./syscall_lab trace /bin/ls.

3.4 Example Usage / Output

$ ./syscall_lab raw
[raw] write(1, "Hello\n", 6) -> 6

3.5 Data Formats / Schemas / Protocols

  • syscall table: number, name, arg count.

3.6 Edge Cases

  • A raw syscall returns -EINTR when a signal interrupts it; the caller must retry.
  • Tracee exits quickly (handle exit events).
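The -EINTR case can be handled with a small retry loop around any raw call. Everything in this sketch is hypothetical naming: `raw_call_fn` stands for whatever raw wrapper you write, and `flaky_call` / `flaky_step` form a test double that fails a fixed number of times so the loop can be exercised without real signals.

```c
#include <errno.h>

/* Hypothetical callback type: performs one raw syscall attempt and
 * returns the kernel's raw result (negative error code on failure). */
typedef long (*raw_call_fn)(void *ctx);

/* Repeat the call for as long as the kernel reports -EINTR. */
static long retry_on_eintr(raw_call_fn fn, void *ctx)
{
    long ret;
    do {
        ret = fn(ctx);
    } while (ret == -EINTR);
    return ret;
}

/* Test double: returns -EINTR failures_left times, then final_result. */
struct flaky_call {
    int failures_left;
    long final_result;
};

static long flaky_step(void *ctx)
{
    struct flaky_call *f = ctx;
    if (f->failures_left > 0) {
        f->failures_left--;
        return -EINTR;
    }
    return f->final_result;
}
```

Note that the comparison is against -EINTR, not against -1 with errno, because the raw layer sees the kernel's value before any libc-style translation.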

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

./syscall_lab raw
./syscall_lab trace /bin/echo hello

3.7.2 Golden Path Demo (Deterministic)

  • Use --seed 42 for any randomized formatting.

3.7.3 If CLI: exact terminal transcript

$ ./syscall_lab trace /bin/echo hello
execve("/bin/echo", ["echo","hello"], [/* env */]) = 0
write(1, "hello\n", 6) = 6
exit_group(0) = ?

Failure demo (deterministic):

$ ./syscall_lab trace /nope
error: execve failed (ENOENT)

Exit codes:

  • 0 success
  • 2 invalid args
  • 3 trace error

4. Solution Architecture

4.1 High-Level Design

Raw syscall lib -> tracer -> syscall table -> formatter

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Raw syscall | inline asm wrapper | x86-64 ABI |
| Tracer | ptrace loop | entry/exit state |
| Formatter | pretty output | minimal parsing |

4.3 Data Structures (No Full Code)

struct syscall_info {
    int nr;
    const char *name;
    int argc;
};
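A table built from this struct, with a linear lookup, could look like the following sketch. The entries use x86-64 Linux syscall numbers; the names `syscall_table` and `lookup_syscall` are illustrative, and the struct is repeated so the example compiles on its own.

```c
#include <stddef.h>

struct syscall_info {
    int nr;
    const char *name;
    int argc;
};

/* A handful of common syscalls (x86-64 Linux numbering). */
static const struct syscall_info syscall_table[] = {
    {   0, "read",       3 },
    {   1, "write",      3 },
    {   9, "mmap",       6 },
    {  39, "getpid",     0 },
    {  59, "execve",     3 },
    { 231, "exit_group", 1 },
    { 257, "openat",     4 },
};

/* Return the table entry for nr, or NULL if the syscall is unknown. */
static const struct syscall_info *lookup_syscall(long nr)
{
    size_t count = sizeof syscall_table / sizeof syscall_table[0];
    for (size_t i = 0; i < count; i++) {
        if (syscall_table[i].nr == nr)
            return &syscall_table[i];
    }
    return NULL;
}
```

A linear scan is enough for a table this small. For unknown numbers the formatter can fall back to printing the raw number, which keeps the output deterministic.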

4.4 Algorithm Overview

Key Algorithm: ptrace loop

  1. fork and ptrace child.
  2. wait for entry stop.
  3. read registers, print syscall.
  4. resume and wait for exit stop.
  5. print return value.

Complexity Analysis:

  • Time: O(n) syscalls
  • Space: O(1)
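The five steps above can be sketched as a single function, assuming x86-64 Linux. The name `trace_command` and the output format are illustrative only. The child stops itself with SIGSTOP so the parent can set PTRACE_O_TRACESYSGOOD before the exec, and only the first three arguments are printed as raw hex.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical entry point: run argv under ptrace and log each syscall.
 * Returns the number of completed syscalls, or -1 on setup failure. */
static long trace_command(char *const argv[], FILE *out)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                      /* step 1: child asks to be traced */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(126);
        raise(SIGSTOP);                  /* wait for the parent to attach options */
        execvp(argv[0], argv);
        _exit(127);                      /* exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;
    long opts = PTRACE_O_TRACESYSGOOD | PTRACE_O_EXITKILL;
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL, (void *)opts) < 0) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    long completed = 0;
    int in_syscall = 0;                  /* 0: next stop is entry, 1: exit */
    long sig = 0;                        /* signal to re-inject on resume */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)sig) < 0)
            break;
        if (waitpid(pid, &status, 0) != pid)
            break;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;                       /* tracee is gone */
        sig = 0;
        if (WSTOPSIG(status) != (SIGTRAP | 0x80)) {
            /* Not a syscall stop: forward real signals, drop plain SIGTRAP. */
            if (WSTOPSIG(status) != SIGTRAP)
                sig = WSTOPSIG(status);
            continue;
        }
        struct user_regs_struct regs;
        if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) < 0)
            break;
        if (!in_syscall) {               /* steps 2-3: entry stop */
            fprintf(out, "syscall_%llu(0x%llx, 0x%llx, 0x%llx)",
                    regs.orig_rax, regs.rdi, regs.rsi, regs.rdx);
        } else {                         /* steps 4-5: exit stop */
            fprintf(out, " = %lld\n", (long long)regs.rax);
            completed++;
        }
        in_syscall = !in_syscall;
    }
    if (in_syscall)
        fprintf(out, " = ?\n");          /* e.g. exit_group never returns */
    return completed;
}
```

A caller could invoke it as `trace_command(cmd, stdout)` with `cmd` set to `{ "/bin/echo", "hello", NULL }`. This sketch leaves out name lookup, error decoding, and cleanup when a ptrace request fails mid-loop; those belong to the later phases.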

5. Implementation Guide

5.1 Development Environment Setup

sudo apt-get install build-essential

5.2 Project Structure

project-root/
|-- syscall_lab.c
|-- syscalls.h
`-- Makefile

5.3 The Core Question You’re Answering

“What actually happens at the user/kernel boundary, and how can I observe it?”

5.4 Concepts You Must Understand First

  1. x86-64 syscall ABI.
  2. ptrace stop states.
  3. errno conventions.

5.5 Questions to Guide Your Design

  1. How will you map numbers to names?
  2. How will you handle multi-threaded targets?
  3. How will you read string arguments safely?

5.6 Thinking Exercise

Predict the first 5 syscalls made by /bin/ls.

5.7 The Interview Questions They’ll Ask

  1. Why does the kernel validate user pointers?
  2. What registers carry syscall args?

5.8 Hints in Layers

Hint 1: Start with raw getpid syscall.

Hint 2: Add ptrace loop for entry/exit.

Hint 3: Decode a handful of syscalls.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Syscalls | TLPI | 3 |
| ptrace | TLPI | 26 |

5.10 Implementation Phases

Phase 1: Raw syscalls (2-3 hours)

Goals: execute write/getpid.

Phase 2: Tracer loop (3-5 hours)

Goals: entry/exit logging.

Phase 3: Formatting (3-4 hours)

Goals: readable output and errors.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Arg decoding | raw hex vs strings | raw first | simplicity |
| Trace scope | single process | single process | avoid fork handling |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit | syscall wrapper | getpid == libc getpid |
| Integration | trace /bin/echo | compare with strace |
| Error | invalid exec | ENOENT behavior |

6.2 Critical Test Cases

  1. Raw syscall returns negative errno.
  2. Tracee exits immediately.
  3. Tracee writes to stdout.

6.3 Test Data

cmd: /bin/echo hello

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Wrong syscall numbers | -ENOSYS | verify table |
| Missing entry/exit toggle | duplicate logs | track state |
| ptrace permission error | EPERM | run as same user |

7.2 Debugging Strategies

  • Compare your output with a reference trace written by strace -o ref /bin/echo hello.
  • Print raw register values on each stop.

7.3 Performance Traps

  • Reading large strings with ptrace is slow.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add decoding for read/write/open.

8.2 Intermediate Extensions

  • Trace forked children (PTRACE_O_TRACEFORK).

8.3 Advanced Extensions

  • Implement seccomp filter demo using your syscall table.

9. Real-World Connections

9.1 Industry Applications

  • Observability and security tooling.
  • strace: full syscall tracer.

9.3 Interview Relevance

  • Syscall ABI questions.

10. Resources

10.1 Essential Reading

  • TLPI Ch. 3, 26

10.2 Video Resources

  • Kernel entry path lectures

10.3 Tools & Documentation

  • man syscall, man ptrace

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain syscall argument registers.
  • I can explain ptrace stops.

11.2 Implementation

  • Raw syscalls and tracing work.

11.3 Growth

  • I can explain syscalls in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Raw syscall and basic trace output.

Full Completion:

  • Entry/exit logging with error decoding.

Excellence (Going Above & Beyond):

  • Multi-process tracing and string decoding.