Project 9: Dynamic Analysis with strace/ltrace

Expanded deep-dive guide for Project 9 from the Binary Analysis sprint.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Level 1: Beginner |
| Time Estimate | 3-5 days |
| Main Programming Language | Command line tools |
| Alternative Programming Languages | Python for automation |
| Coolness Level | Level 2: Practical but Forgettable |
| Business Potential | 1. The “Resume Gold” |
| Knowledge Area | Dynamic Analysis / System Calls |
| Software or Tool | strace, ltrace, Linux |
| Main Book | “The Linux Programming Interface” by Michael Kerrisk |

1. Learning Objectives

  1. Build a working implementation with reproducible outputs.
  2. Justify key design choices with binary-analysis principles.
  3. Produce an evidence-backed report of findings and limitations.
  4. Document hardening or next-step improvements.

2. All Theory Needed (Per-Concept Breakdown)

This project depends on concepts from the main sprint primer: loader semantics, control/data-flow recovery, runtime observation, and mitigation-aware vulnerability reasoning. Before implementation, restate the project’s core assumptions in your own words and define how you will validate them.

3. Project Specification

3.1 What You Will Build

Analyze unknown binaries using only system call and library call tracing, without disassembly.

3.2 Functional Requirements

  1. Accept the target binary/input and validate format assumptions.
  2. Produce analyzable outputs (console report and/or artifacts).
  3. Handle malformed inputs safely with explicit errors.

3.3 Non-Functional Requirements

  • Reproducibility: same input should produce equivalent findings.
  • Safety: unknown samples run only in isolated lab contexts.
  • Clarity: separate facts, hypotheses, and inferred conclusions.

3.4 Expanded Project Brief

  • File: P09-dynamic-analysis-with-strace-ltrace.md


What you’ll build: Analyze unknown binaries using only system call and library call tracing, without disassembly.

Why it teaches binary analysis: Sometimes you don’t need disassembly. Seeing what files a program opens and what APIs it calls reveals a lot.

Core challenges you’ll face:

  • Understanding syscall output → maps to knowing what each syscall does
  • Filtering noise → maps to focusing on interesting calls
  • Following child processes → maps to fork/exec tracing
  • Interpreting library calls → maps to understanding libc functions

Resources for key challenges:

Key Concepts:

  • System Calls: “The Linux Programming Interface” Ch. 3
  • Library Calls: ltrace man page
  • Process Tracing: strace man page

Difficulty: Beginner
Time estimate: 3-5 days
Prerequisites: Basic Linux command line

Real World Outcome

Deliverables:

  • Analysis output or tooling scripts
  • Report with control/data flow notes

Validation checklist:

  • Parses sample binaries correctly
  • Findings are reproducible in debugger
  • No unsafe execution outside lab

```bash
$ strace -f ./suspicious_binary 2>&1 | head -50
execve("./suspicious_binary", …) = 0
openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3   # Reading password file!
read(3, "root:x:0:0:…", 4096) = 2847
close(3) = 0
socket(AF_INET, SOCK_STREAM, 0) = 3             # Opening socket!
connect(3, {sa_family=AF_INET, sin_port=htons(1337), sin_addr=inet_addr("10.0.0.1")}, 16) = 0   # Connecting to C2!
write(3, "root:x:0:0:…", 2847) = 2847           # Exfiltrating data!

$ ltrace ./crackme
__libc_start_main(…)
puts("Enter password: ")
fgets("test\n", 100, stdin)
strlen("test\n") = 5
strcmp("test", "s3cr3t_p4ss") = -1              # Password revealed!
puts("Wrong!")
```

Hints in Layers

Useful strace options:

```bash
strace -f                 # Follow child processes and threads
strace -e trace=openat    # Only trace openat() (what open()/fopen() use on modern glibc)
strace -e trace=%file     # All file-related calls
strace -e trace=%network  # All network-related calls
strace -s 1000            # Show up to 1000 bytes of each string
strace -o log.txt         # Write the trace to a file
strace -p PID             # Attach to a running process
```

Useful ltrace options:

```bash
ltrace -e strcmp   # Only trace strcmp
ltrace -e '*'      # All library calls
ltrace -C          # Demangle C++ names
ltrace -n 2        # Indent nested calls by 2 spaces per level
```

Analysis workflow:

  1. Run with strace to see syscalls
  2. Run with ltrace to see library calls
  3. Look for interesting patterns:
    • File operations (what does it read/write?)
    • Network operations (where does it connect?)
    • String comparisons (password checks?)

Learning milestones:

  1. Trace basic program → Understand output format
  2. Find password checks → strcmp/memcmp in ltrace
  3. Trace network activity → socket/connect/send
  4. Analyze malware behavior → Without disassembly

The Core Question You Are Answering

“Can we understand what a program does by watching it interact with the operating system, without ever looking at its source code or disassembly?”

This project explores the power of behavioral analysis through system call and library call tracing. You’ll learn that sometimes the most revealing information about a program comes not from what it is, but from what it does—every file it touches, every network connection it makes, every string it compares.

Concepts You Must Understand First

  1. System Calls (syscalls)
    • The boundary between user space and kernel space—how programs request services from the OS
    • Every file operation, network connection, or process creation goes through syscalls
    • Understanding syscalls reveals a program’s interactions with the outside world

    Guiding Questions:

    • Why can’t user-space programs directly access hardware or files?
    • What’s the difference between a library call like fopen() and a syscall like open()?
    • How does the kernel validate syscall arguments to prevent malicious programs from harming the system?

    Book References:

    • “The Linux Programming Interface” by Michael Kerrisk - Chapter 3: System Programming Concepts
    • “Computer Systems: A Programmer’s Perspective” (CS:APP) - Chapter 8.4: Process Control (syscall mechanics)
    • “Low-Level Programming” by Igor Zhirkov - Chapter 2.5: System Calls
  2. Process Memory Layout
    • How programs are loaded into memory (text, data, stack, heap segments)
    • Understanding memory addresses in strace output (e.g., mmap() calls)
    • Why programs request memory from the OS via brk() or mmap()

    Guiding Questions:

    • What does it mean when strace shows brk(0x5555555a2000) = 0x5555555a2000?
    • Why does malloc() sometimes obtain memory with mmap() instead of brk()?
    • How can you tell from syscall traces whether a program is leaking memory?

    Book References:

    • “Computer Systems: A Programmer’s Perspective” - Chapter 9: Virtual Memory
    • “The Linux Programming Interface” - Chapter 6: Processes (memory layout)
    • “Practical Binary Analysis” by Dennis Andriesse - Chapter 1: Anatomy of a Binary (loading and executing a binary)
  3. Library Calls vs. System Calls
    • Library calls (what ltrace shows) run in user space; some wrap syscalls (fopen() ends in openat()), while others, such as strcmp(), never enter the kernel
    • One fread() might generate multiple read() syscalls due to buffering
    • Understanding the libc abstraction layer

    Guiding Questions:

    • Why does printf("hello") not immediately issue a write() syscall?
    • How does libc’s buffering affect what you see in strace vs. ltrace?
    • When would you use ltrace instead of strace (and vice versa)?

    Book References:

    • “The Linux Programming Interface” - Chapter 13: File I/O Buffering
    • “Computer Systems: A Programmer’s Perspective” - Chapter 10: System-Level I/O
  4. File Descriptors and File Operations
    • Understanding fd numbers: 0=stdin, 1=stdout, 2=stderr, 3+=open files
    • How openat(), read(), write(), close() work together
    • Interpreting flags like O_RDONLY, O_WRONLY, O_CREAT

    Guiding Questions:

    • What does openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3 tell you?
    • How can you track which fd corresponds to which file in a long trace?
    • What’s suspicious about a program opening /dev/urandom or /etc/shadow?

    Book References:

    • “The Linux Programming Interface” - Chapter 4: File I/O: The Universal I/O Model
    • “The Linux Programming Interface” - Chapter 18: Directories and Links
  5. Process Lifecycle (fork/exec/wait)
    • How processes create children with fork(), replace themselves with execve()
    • Following child processes with strace -f
    • Understanding return values: fork() returns twice (parent gets child PID, child gets 0)

    Guiding Questions:

    • Why does fork() return different values in parent and child?
    • What happens to file descriptors when a process calls execve()?
    • How would you trace a shell script that spawns multiple child processes?

    Book References:

    • “The Linux Programming Interface” - Chapter 24: Process Creation
    • “The Linux Programming Interface” - Chapter 27: Program Execution
    • “Computer Systems: A Programmer’s Perspective” - Chapter 8.4: Process Control
  6. Network Socket API
    • Understanding socket(), connect(), bind(), listen(), accept(), send(), recv()
    • Reading sockaddr structures to extract IP addresses and ports
    • Identifying client vs. server behavior from syscall patterns

    Guiding Questions:

    • What syscall sequence indicates a program is acting as a server?
    • How do you extract the destination IP and port from a connect() call?
    • What’s the difference between AF_INET (IPv4) and AF_INET6 (IPv6)?

    Book References:

    • “The Linux Programming Interface” - Chapter 56-61: Sockets and Network Programming
    • “Computer Systems: A Programmer’s Perspective” - Chapter 11: Network Programming
  7. Signal Handling
    • How programs respond to events (Ctrl+C sends SIGINT, segfault triggers SIGSEGV)
    • Seeing rt_sigaction() and rt_sigprocmask() in traces
    • Understanding signal delivery and handler installation

    Guiding Questions:

    • What does it mean when a program installs a handler for SIGSEGV?
    • Why might malware install signal handlers to detect debugging?
    • How can you tell if a program is ignoring SIGTERM?

    Book References:

    • “The Linux Programming Interface” - Chapter 20-22: Signals
    • “Computer Systems: A Programmer’s Perspective” - Chapter 8.5: Signals
  8. Dynamic Linking and Shared Libraries
    • How programs load .so files at runtime
    • Understanding LD_PRELOAD and library injection
    • Seeing dlopen(), dlsym() for runtime loading

    Guiding Questions:

    • What’s happening when you see multiple openat() calls to .so files?
    • How could an attacker use LD_PRELOAD maliciously?
    • Why do some programs use dlopen() instead of linking at compile time?

    Book References:

    • “Computer Systems: A Programmer’s Perspective” - Chapter 7: Linking
    • “Practical Binary Analysis” - Chapter 1: Anatomy of a Binary (loading and executing a binary)
    • “The Linux Programming Interface” - Chapter 41-42: Shared Libraries
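One of the guiding questions above asks how to track which fd corresponds to which file in a long trace. A minimal Python sketch of that bookkeeping follows, assuming default-format strace lines without PID prefixes. The `annotate_fds` helper is a hypothetical name, and it only follows `openat`, `socket`, and `close`; calls such as `dup2()` or `accept()` would need extra rules.

```python
import re

OPEN_RE = re.compile(r'^openat\(\w+, "([^"]*)".*\)\s+=\s+(\d+)$')
SOCKET_RE = re.compile(r"^socket\(([^)]*)\)\s+=\s+(\d+)$")
CLOSE_RE = re.compile(r"^close\((\d+)\)\s+=\s+0$")
IO_RE = re.compile(r"^(read|write)\((\d+),")


def annotate_fds(trace_lines):
    """Return (syscall, fd, target) for each read/write, resolving the fd."""
    table = {0: "stdin", 1: "stdout", 2: "stderr"}
    events = []
    for line in trace_lines:
        line = line.strip()
        if m := OPEN_RE.match(line):
            table[int(m.group(2))] = m.group(1)          # fd now names a path
        elif m := SOCKET_RE.match(line):
            table[int(m.group(2))] = f"socket({m.group(1)})"
        elif m := CLOSE_RE.match(line):
            table.pop(int(m.group(1)), None)             # fd number can be reused
        elif m := IO_RE.match(line):
            fd = int(m.group(2))
            events.append((m.group(1), fd, table.get(fd, "unknown")))
    return events
```

The sketch makes fd reuse visible: the same number can name a file early in the trace and a socket later. strace can also do this resolution itself with the `-y` option, which prints the path associated with each fd argument.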

Questions to Guide Your Design

  1. How can you automatically filter out “boring” syscalls (like mmap() for library loading) to focus on interesting behavior?
    • Consider writing a Python script that parses strace output and highlights file/network operations
    • What heuristics distinguish initialization syscalls from runtime behavior?
  2. How would you detect anti-debugging or anti-tracing techniques in a program?
    • Programs can check if they’re being traced using ptrace(PTRACE_TRACEME)
    • What syscall patterns indicate a program is checking for analysis tools?
  3. How can you reconstruct a program’s command-line parsing logic from ltrace output alone?
    • Watch for strcmp(), strncmp(), getopt() calls
    • Can you build a decision tree of program behavior based on arguments?
  4. What’s the difference between tracing a statically-linked binary vs. a dynamically-linked binary?
    • Both kinds of binary still make syscalls, so strace works on either; a static binary has libc linked in and no PLT stubs, so ltrace has nothing to hook
    • How does this affect what you see in strace vs. ltrace?
  5. How would you trace a multi-threaded program with strace?
    • Use strace -f to follow threads created by clone()
    • How do you distinguish thread creation from process creation in the output?
  6. Can you identify a program’s cryptographic operations from syscall traces?
    • Look for reads from /dev/urandom (entropy source)
    • High-entropy, unreadable payloads in write()/sendto() buffers might indicate encrypted communication
  7. How would you use strace to diagnose why a program is slow or hanging?
    • Look for blocking syscalls: read() on network sockets, wait() on child processes
    • Use strace -T to show time spent in each syscall
  8. How can you determine if a binary is packed or obfuscated by examining its syscalls?
    • Self-modifying code might use mprotect() to change memory permissions
    • Packed binaries often unpack themselves in memory before executing
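Several of the questions above reduce to spotting known patterns in a trace. A minimal Python sketch of such a rule pass follows. The `RULES` list and the `flag_lines` helper are hypothetical, the patterns assume default strace formatting, and a hit is a hypothesis to investigate rather than a verdict.

```python
import re

# Hypothetical rule set for illustration: (label, pattern applied to one line).
RULES = [
    ("anti-tracing: ptrace self-check",
     re.compile(r"^ptrace\(PTRACE_TRACEME")),
    ("anti-tracing: reads /proc/self/status",
     re.compile(r'^openat\(.*"/proc/self/status"')),
    ("possible unpacking: memory made executable",
     re.compile(r"^mprotect\(.*PROT_EXEC")),
    ("entropy source opened",
     re.compile(r'^openat\(.*"/dev/u?random"')),
]


def flag_lines(trace_lines):
    """Return (line_number, label) for every rule that matches a trace line."""
    findings = []
    for number, line in enumerate(trace_lines, start=1):
        stripped = line.strip()
        for label, pattern in RULES:
            if pattern.match(stripped):
                findings.append((number, label))
    return findings
```

Keeping the rules in a data table rather than in code makes it easy to record, in your report, exactly which heuristics produced each finding.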

Thinking Exercise

Exercise 1: Manual Syscall Trace Analysis

Before running any tools, examine this strace output from an unknown binary:

execve("./mystery", ["./mystery"], 0x7ffc...) = 0
openat(AT_FDCWD, "/home/user/.ssh/id_rsa", O_RDONLY) = 3
read(3, "-----BEGIN RSA PRIVATE KEY-----\n"..., 4096) = 1679
close(3) = 0
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(443),
        sin_addr=inet_addr("203.0.113.45")}, 16) = 0
write(3, "-----BEGIN RSA PRIVATE KEY-----\n"..., 1679) = 1679
close(3) = 0
unlink("/home/user/.ssh/id_rsa") = 0

Questions to answer:

  1. What is this program doing? (Be specific about each step)
  2. What type of malware behavior does this exhibit?
  3. What Indicators of Compromise (IOCs) can you extract?
  4. How would you write a YARA rule to detect similar behavior?
  5. What syscall would you set a breakpoint on if debugging this?
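For question 3, the IOCs in a trace like this one can be pulled out mechanically. The following is a minimal Python sketch, assuming IPv4 `connect()` calls in default strace formatting. The `extract_iocs` helper is a hypothetical name, and it deliberately tolerates a `connect()` that wraps across two lines.

```python
import re

# The sockaddr body between "{" and "}" may span lines, hence DOTALL.
CONNECT_RE = re.compile(r"connect\(\d+, \{(.*?)\}", re.DOTALL)
PORT_RE = re.compile(r"sin_port=htons\((\d+)\)")
ADDR_RE = re.compile(r'sin_addr=inet_addr\("([^"]+)"\)')
PATH_RE = re.compile(r'^(openat|unlink)\((?:\w+, )?"([^"]+)"')


def extract_iocs(trace_text):
    """Collect network endpoints and touched paths from raw strace text."""
    endpoints = []
    for body in CONNECT_RE.findall(trace_text):
        port, addr = PORT_RE.search(body), ADDR_RE.search(body)
        if port and addr:
            endpoints.append(f"{addr.group(1)}:{port.group(1)}")

    paths = {}  # path -> list of syscalls that referenced it, in order
    for line in trace_text.splitlines():
        if m := PATH_RE.match(line.strip()):
            paths.setdefault(m.group(2), []).append(m.group(1))
    return {"endpoints": endpoints, "paths": paths}
```

A path that is first opened and later unlinked, as in the trace above, stands out immediately in the `paths` mapping.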

Exercise 2: ltrace Password Extraction

Given this ltrace output from a crackme:

__libc_start_main(...)
puts("Enter password: ")
fgets("my_guess\n", 100, 0x7f...)
strlen("my_guess\n") = 9
strcmp("my_guess", "sup3r_s3cr3t") = -1
puts("Wrong password!")

Tasks:

  1. Extract the correct password (even though we guessed wrong)
  2. Explain why ltrace is more useful than strace for this crackme
  3. What would strace show instead? (Describe the syscalls)
  4. How could the developer prevent this ltrace attack?
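Task 1 can be scripted once you know what you typed. A minimal Python sketch follows, assuming ltrace prints both `strcmp`/`strncmp` arguments as quoted strings, as in the output above. The `comparison_candidates` helper is a hypothetical name.

```python
import re

# Matches e.g.  strcmp("my_guess", "sup3r_s3cr3t") = -1
# Each quoted argument may contain backslash escapes such as \n or \".
CMP_RE = re.compile(
    r'^(strcmp|strncmp)\("((?:[^"\\]|\\.)*)", "((?:[^"\\]|\\.)*)"'
)


def comparison_candidates(ltrace_lines, user_input):
    """Return strings that the program compared against, excluding our input."""
    candidates = []
    for line in ltrace_lines:
        match = CMP_RE.match(line.strip())
        if not match:
            continue
        _, left, right = match.groups()
        for side in (left, right):
            if side != user_input and side not in candidates:
                candidates.append(side)
    return candidates
```

Every returned string is only a candidate: the program may compare against several values, so confirm each one by running the binary again.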

Exercise 3: Network Protocol Reconstruction

Analyze this strace excerpt and reconstruct the network protocol:

socket(AF_INET, SOCK_STREAM, 0) = 3
connect(3, {sin_addr=inet_addr("10.0.0.5"), sin_port=htons(9999)}, 16) = 0
write(3, "HELLO\n", 6) = 6
read(3, "OK\n", 4096) = 3
write(3, "GET /data\n", 10) = 10
read(3, "DATA:12345\n", 4096) = 11
write(3, "BYE\n", 4) = 4
close(3) = 0

Questions:

  1. Is this a text-based or binary protocol?
  2. What’s the message flow? (Draw a sequence diagram)
  3. How would you fuzz this protocol?
  4. What’s missing from this trace that would help with analysis?
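For question 2, the message flow can be recovered by filtering I/O calls on the socket's fd. A minimal Python sketch follows, assuming the traced process is the client and that each call sits on one line of the trace. The `message_flow` helper is a hypothetical name, and payloads are kept exactly as strace escaped them.

```python
import re

# Captures the syscall name, the fd, and the quoted (still escaped) payload.
IO_RE = re.compile(r'^(read|write|sendto|recvfrom)\((\d+), "((?:[^"\\]|\\.)*)"')

OUTBOUND = {"write", "sendto"}


def message_flow(trace_lines, fd):
    """Return (direction, payload) pairs for all I/O on the given fd."""
    flow = []
    for line in trace_lines:
        match = IO_RE.match(line.strip())
        if not match or int(match.group(2)) != fd:
            continue
        direction = (
            "client -> server" if match.group(1) in OUTBOUND else "server -> client"
        )
        flow.append((direction, match.group(3)))
    return flow
```

The resulting list is effectively the sequence diagram in text form. Run strace with a larger `-s` value first, otherwise long payloads are truncated in the trace.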

The Interview Questions They’ll Ask

  1. “You’re analyzing a suspicious binary. It produces no output, but you suspect it’s exfiltrating data. How would you use strace to confirm this?”
    • Expected Answer: Use strace -e network to trace network syscalls. Look for socket(), connect(), send(), or write() to network fds. Check destination IPs. Use strace -s 1000 to see full data buffers. Alternatively, combine with Wireshark for full packet capture.
  2. “Explain the difference between strace and ltrace. When would you use each?”
    • Expected Answer: strace traces system calls (kernel boundary), ltrace traces library calls (user-space functions). Use strace for file/network I/O, process management. Use ltrace for string operations (strcmp), crypto functions (MD5), library-level logic. Sometimes you need both: strace shows what happens, ltrace shows how the program logic works.
  3. “A program is reading from /dev/urandom. What does this tell you, and what should you investigate next?”
    • Expected Answer: It’s generating random numbers, likely for cryptography or nonce generation. Check how much entropy it reads. Look for subsequent crypto operations (OpenSSL functions in ltrace, or network writes that might be encrypted data). Could be legitimate (TLS) or malicious (ransomware generating encryption keys).
  4. “How does strace work under the hood? What syscall does strace itself use?”
    • Expected Answer: strace uses ptrace() syscall to attach to a process and intercept its syscalls. When the traced process makes a syscall, the kernel stops it and notifies strace. This is the same mechanism debuggers use. This is why anti-debugging malware often checks for ptrace() or looks for parent processes named “strace”.
  5. “You see hundreds of mmap() and mprotect() calls in a trace. What might this indicate?”
    • Expected Answer: Could be normal (loading shared libraries, allocating memory). Or could indicate packing/obfuscation—malware unpacking itself, self-modifying code, or JIT compilation. Check if mprotect() is changing memory to executable (PROT_EXEC). Packed malware often mmap()s space, writes unpacked code, then mprotect()s it to RWX.
  6. “How would you trace a program that uses fork() to create multiple child processes?”
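    • Expected Answer: Use strace -f (follow forks). Output can be confusing with interleaved processes. Use -ff -o trace.log to write each process to a separate file (trace.log.PID), then analyze each child’s behavior independently. Note that glibc implements fork() with the clone() syscall, so strace usually shows clone() (or clone3()) in both cases: thread creation carries flags such as CLONE_VM|CLONE_THREAD, while a fork-style clone() lacks them.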
  7. “A program calls unlink() on its own executable. What’s likely happening?”
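    • Expected Answer: It’s deleting itself, common in malware to hide tracks. On Linux, unlink() only removes the directory entry; the file’s data persists until the last reference (an open fd or the running process’s mapping of the executable) is released, so the program keeps running. This is an anti-forensics technique. While the process is alive you can often recover the binary by copying /proc/<pid>/exe; otherwise you’d need to dump the process memory.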
  8. “You trace a crackme and see strcmp("my_input", "secretpass") = -1. Is this always the password?”
    • Expected Answer: Usually yes, but not always! Some crackmes compare hashes instead of plaintext, or perform multiple checks that must all pass. Switching to memcmp() does not help the author, because it is still a library call that ltrace can intercept. The real evasion is a comparison that never crosses a library boundary: a strcmp() inlined by the compiler, or a custom comparison loop written in C or assembly.
  9. “How can a program detect that it’s being traced by strace, and how would you bypass this detection?”
    • Expected Answer: Programs can call ptrace(PTRACE_TRACEME) which fails if already traced (strace uses ptrace). They can check /proc/self/status for “TracerPid”. They can use timing attacks (strace is slow). Bypasses: Use kernel modules that hook syscalls without ptrace. Use emulation (QEMU user-mode). Patch the binary to remove checks. Use LD_PRELOAD to fake ptrace return values.
  10. “You need to analyze a binary but it’s statically linked. How does this affect your strace/ltrace strategy?”
    • Expected Answer: ltrace is useless—no library calls to intercept. strace still works (syscalls are unavoidable). You’ll see raw syscalls instead of nice library wrappers. For string operations, you’ll need to disassemble or use dynamic instrumentation (Frida, DynamoRIO) to hook internal functions.
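The TracerPid check mentioned in question 9 is simple enough to reproduce. The sketch below, in Python for consistency with the other examples, shows what a self-check of this kind looks like; real samples would do the equivalent in C or assembly. The `tracer_pid` and `is_traced` helpers are hypothetical names.

```python
def tracer_pid(status_text):
    """Return the TracerPid value from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return None


def is_traced(status_path="/proc/self/status"):
    """True when the kernel reports a non-zero tracer for this process."""
    with open(status_path) as handle:
        return (tracer_pid(handle.read()) or 0) != 0
```

In a trace, this check shows up as an `openat()` of `/proc/self/status` followed by a `read()`, which is why that path is worth flagging when you see it.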

Books That Will Help

| Topic | Book | Chapter/Section | Why It Matters |
|---|---|---|---|
| System Call Fundamentals | “The Linux Programming Interface” by Michael Kerrisk | Ch. 3: System Programming Concepts | Background for the syscalls you’ll see in traces |
| System Call Mechanics | “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron | Ch. 8.1: Exceptions; Ch. 8.4: Process Control | Understand how syscalls transition from user to kernel mode |
| File I/O Operations | “The Linux Programming Interface” by Michael Kerrisk | Ch. 4-5: File I/O | Decode file-related syscalls (open, read, write, ioctl) |
| Process Management | “The Linux Programming Interface” by Michael Kerrisk | Ch. 24-27: Process Creation, Monitoring, Execution | Understand fork(), exec(), wait() patterns in traces |
| Network Programming | “The Linux Programming Interface” by Michael Kerrisk | Ch. 56-61: Sockets | Interpret socket(), connect(), bind(), listen(), accept() |
| Network Internals | “Computer Systems: A Programmer’s Perspective” | Ch. 11: Network Programming | Client-server architecture, protocol design |
| Signals | “The Linux Programming Interface” by Michael Kerrisk | Ch. 20-22: Signals | Understand signal handlers in malware |
| Dynamic Linking | “Computer Systems: A Programmer’s Perspective” | Ch. 7: Linking | Why you see library loads in strace |
| Binary Loading | “Practical Binary Analysis” by Dennis Andriesse | Ch. 1: Anatomy of a Binary (loading and executing a binary) | How programs load and what syscalls this generates |
| Low-Level System Calls | “Low-Level Programming” by Igor Zhirkov | Ch. 2: Assembly Language | Direct syscall invocation via the syscall instruction |
| Ptrace Internals | ptrace(2) manual page (Linux man-pages) | PTRACE_ATTACH and PTRACE_SYSCALL requests | How strace itself works |
| Anti-Debugging Techniques | “Practical Malware Analysis” by Sikorski & Honig | Ch. 15: Anti-Disassembly; Ch. 16: Anti-Debugging | Detect and bypass tracing countermeasures |
| Behavioral Analysis Methodology | “Practical Malware Analysis” by Sikorski & Honig | Ch. 3: Basic Dynamic Analysis | Professional workflow for using dynamic analysis tools |
| Assembly & Syscalls | “Hacking: The Art of Exploitation” by Jon Erickson | Ch. 0x200: Programming (syscalls section) | Raw syscall invocation in assembly |

Common Pitfalls and Debugging

Problem 1: “Your interpretation does not match runtime behavior”

  • Why: Static analysis can hide runtime-resolved addresses, lazy binding, and input-dependent branches.
  • Fix: Reproduce the path with debugger or tracer, then compare static assumptions against live register/memory state.
  • Quick test: Run the same sample through both your static workflow and a debugger transcript, and confirm control-flow decisions align.

Problem 2: “Tool output is inconsistent across machines”

  • Why: ASLR, tool version drift, and different binary build flags (PIE, RELRO, symbols stripped) change observed addresses and metadata.
  • Fix: Pin tool versions, capture checksec/metadata, and document environment assumptions in your report.
  • Quick test: Re-run analysis in a container or VM with pinned tools and compare hashes of generated outputs.
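Comparing hashes of raw traces will fail even for identical behavior, because PIDs and ASLR-dependent addresses change on every run. A minimal Python sketch of normalizing before hashing follows. The `_NORMALIZERS` rules, `normalize`, and `trace_digest` are hypothetical names, and the two rules shown cover only the most common sources of noise.

```python
import hashlib
import re

# Hypothetical normalization rules; tune them to the noise you actually observe.
_NORMALIZERS = [
    (re.compile(r"^\[pid\s+\d+\]\s+"), "[pid N] "),  # per-run process IDs
    (re.compile(r"0x[0-9a-f]+"), "0xADDR"),          # ASLR-dependent addresses
]


def normalize(line):
    """Replace run-specific values in one trace line with stable placeholders."""
    for pattern, replacement in _NORMALIZERS:
        line = pattern.sub(replacement, line)
    return line.rstrip()


def trace_digest(trace_lines):
    """SHA-256 over the normalized trace, suitable for run-to-run comparison."""
    joined = "\n".join(normalize(line) for line in trace_lines)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Two runs with equal digests made the same calls with the same non-address arguments; when digests differ, diff the normalized lines to see where behavior diverged.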

Problem 3: “Analysis accidentally executes unsafe code”

  • Why: Dynamic workflows run binaries in host context without sufficient isolation.
  • Fix: Use disposable snapshots, no-network execution, and non-privileged users for all unknown samples.
  • Quick test: Validate isolation controls first (network disabled, snapshot active, unprivileged user), then execute sample.

Definition of Done

  • Core functionality works on reference inputs
  • Edge cases are tested and documented
  • Results are reproducible (same binary, same tools, same report output)
  • Analysis notes clearly separate observations, assumptions, and conclusions
  • Lab safety controls were applied for any dynamic execution

4. Solution Architecture

Input Artifact -> Parse/Decode -> Analysis Engine -> Validation Layer -> Report

Design each stage so intermediate artifacts are inspectable (JSON/text/notes), which makes debugging and peer review much easier.
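As one possible shape for the Parse/Decode stage, the sketch below turns strace lines into JSON records that later stages can consume. It is a minimal Python sketch under stated assumptions: default strace line format, no PID prefixes, and no unfinished/resumed call pairs. The `parse_line` and `to_json` helpers and the record field names are hypothetical.

```python
import json
import re

# name(args) = ret [error text]; ret is a number, a hex address, or "?".
LINE_RE = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+"
    r"(?P<ret>-?\d+|0x[0-9a-f]+|\?)(?:\s+(?P<err>.+))?$"
)


def parse_line(line):
    """Parse one strace line into a dict, or return None if it is not a call."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    return {
        "syscall": match["name"],
        "args": match["args"],   # kept as raw text; argument parsing is separate
        "ret": match["ret"],
        "error": match["err"],   # e.g. "ENOENT (No such file or directory)"
    }


def to_json(trace_lines):
    """Serialize all parseable lines as an inspectable JSON artifact."""
    records = [rec for rec in map(parse_line, trace_lines) if rec is not None]
    return json.dumps(records, indent=2)
```

Leaving `args` as raw text keeps this stage simple and lossless; category-specific decoding (paths, sockaddr structures) can then live in the analysis stage.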

5. Implementation Phases

Phase 1: Foundation

  • Define input assumptions and format checks.
  • Produce a minimal golden output on one known sample.

Phase 2: Core Functionality

  • Implement full analysis pass for normal cases.
  • Add validation against an external ground-truth tool.

Phase 3: Hard Cases and Reporting

  • Add malformed/edge-case handling.
  • Finalize report template and reproducibility notes.

6. Testing Strategy

  • Unit-level checks for parser/decoder helpers.
  • Integration checks against known binaries/challenges.
  • Regression tests for previously failing cases.

7. Extensions & Challenges

  • Add automation for batch analysis and comparative reports.
  • Add confidence scoring for each major finding.
  • Add export formats suitable for CI/security pipelines.

8. Production Reflection

Map your project output to a production analogue: what reliability, observability, and security controls would be required to run this continuously in an engineering organization?