Project 9: Dynamic Analysis with strace/ltrace
Expanded deep-dive guide for Project 9 from the Binary Analysis sprint.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 1: Beginner |
| Time Estimate | 3-5 days |
| Main Programming Language | Command line tools |
| Alternative Programming Languages | Python for automation |
| Coolness Level | Level 2: Practical but Forgettable |
| Business Potential | 1. The “Resume Gold” |
| Knowledge Area | Dynamic Analysis / System Calls |
| Software or Tool | strace, ltrace, Linux |
| Main Book | “The Linux Programming Interface” by Michael Kerrisk |
1. Learning Objectives
- Build a working implementation with reproducible outputs.
- Justify key design choices with binary-analysis principles.
- Produce an evidence-backed report of findings and limitations.
- Document hardening or next-step improvements.
2. All Theory Needed (Per-Concept Breakdown)
This project depends on concepts from the main sprint primer: loader semantics, control/data-flow recovery, runtime observation, and mitigation-aware vulnerability reasoning. Before implementation, restate the project’s core assumptions in your own words and define how you will validate them.
3. Project Specification
3.1 What You Will Build
Analyze unknown binaries using only system call and library call tracing, without disassembly.
3.2 Functional Requirements
- Accept the target binary/input and validate format assumptions.
- Produce analyzable outputs (console report and/or artifacts).
- Handle malformed inputs safely with explicit errors.
3.3 Non-Functional Requirements
- Reproducibility: same input should produce equivalent findings.
- Safety: unknown samples run only in isolated lab contexts.
- Clarity: separate facts, hypotheses, and inferred conclusions.
3.4 Expanded Project Brief
- File: P09-dynamic-analysis-with-strace-ltrace.md
What you’ll build: Analyze unknown binaries using only system call and library call tracing, without disassembly.
Why it teaches binary analysis: Sometimes you don’t need disassembly. Seeing what files a program opens and what APIs it calls reveals a lot.
Core challenges you’ll face:
- Understanding syscall output → maps to knowing what each syscall does
- Filtering noise → maps to focusing on interesting calls
- Following child processes → maps to fork/exec tracing
- Interpreting library calls → maps to understanding libc functions
Resources for key challenges:
- Packt - Using ltrace and strace
- Red Hat - ltrace Guide
- “The Linux Programming Interface” - Syscall reference
Key Concepts:
- System Calls: “The Linux Programming Interface” Ch. 3
- Library Calls: ltrace man page
- Process Tracing: strace man page
Difficulty: Beginner
Time estimate: 3-5 days
Prerequisites: Basic Linux command line
Real World Outcome
Deliverables:
- Analysis output or tooling scripts
- Report with control/data flow notes
Validation checklist:
- Parses sample binaries correctly
- Findings are reproducible in debugger
- No unsafe execution outside lab

```bash
$ strace -f ./suspicious_binary 2>&1 | head -50
execve("./suspicious_binary", …) = 0
openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3    # Reading password file!
read(3, "root:x:0:0:…", 4096) = 2847
close(3)
socket(AF_INET, SOCK_STREAM, 0) = 4    # Opening socket!
connect(4, {sa_family=AF_INET, sin_port=htons(1337), sin_addr=inet_addr("10.0.0.1")}, 16) = 0    # Connecting to C2!
write(4, "root:x:0:0:…", 2847) = 2847    # Exfiltrating data!

$ ltrace ./crackme
__libc_start_main(…)
puts("Enter password: ")
fgets("test\n", 100, stdin)
strlen("test\n") = 5
strcmp("test", "s3cr3t_p4ss") = -1    # Password revealed!
puts("Wrong!")
```
#### Hints in Layers
Useful strace options:
```bash
strace -f                     # Follow child processes
strace -e trace=open,openat   # Only trace open()/openat() calls (modern libc usually calls openat)
strace -e trace=file          # All file-related calls
strace -e trace=network       # All network-related calls
strace -s 1000                # Show up to 1000 chars of each string
strace -o log.txt             # Write the trace to a file
strace -p PID                 # Attach to running process
```
Useful ltrace options:
```bash
ltrace -e strcmp    # Only trace strcmp
ltrace -e '*'       # All library calls
ltrace -C           # Demangle C++ names
ltrace -n 2         # Indent nested calls by 2 spaces per level
```
Analysis workflow:
- Run with strace to see syscalls
- Run with ltrace to see library calls
- Look for interesting patterns:
- File operations (what does it read/write?)
- Network operations (where does it connect?)
- String comparisons (password checks?)
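The pattern hunt in the workflow above can be partly automated. The following is a minimal Python sketch, assuming strace's default text output. The category sets, the `triage` function, and the sample lines are hypothetical names and data chosen for illustration; the syscall lists are not exhaustive.

```python
import re
from collections import defaultdict

# Illustrative (not exhaustive) syscall groups; extend them for your samples.
CATEGORIES = {
    "file": {"open", "openat", "unlink", "rename", "stat"},
    "network": {"socket", "connect", "bind", "listen", "accept", "sendto", "recvfrom"},
    "process": {"execve", "fork", "vfork", "clone", "clone3"},
}

# Optional "[pid N] " or "N " prefix (both appear with strace -f), then the name.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")

def triage(lines):
    """Group strace lines by category; lines in no category are dropped."""
    hits = defaultdict(list)
    for line in lines:
        m = SYSCALL_RE.match(line)
        if not m:
            continue
        for category, names in CATEGORIES.items():
            if m.group(1) in names:
                hits[category].append(line.rstrip())
    return dict(hits)

# Hypothetical trace lines for demonstration.
sample = [
    'execve("./suspicious_binary", ["./suspicious_binary"], 0x7ffc0) = 0',
    'mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0000',
    'openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3',
    'socket(AF_INET, SOCK_STREAM, 0) = 4',
]
for category, found in triage(sample).items():
    print(category, len(found))
```

`read()` and `write()` are left out on purpose: whether they are file or network activity depends on which descriptor they use, so they need fd tracking rather than name matching.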
Learning milestones:
- Trace basic program → Understand output format
- Find password checks → strcmp/memcmp in ltrace
- Trace network activity → socket/connect/send
- Analyze malware behavior → Without disassembly
The Core Question You Are Answering
“Can we understand what a program does by watching it interact with the operating system, without ever looking at its source code or disassembly?”
This project explores the power of behavioral analysis through system call and library call tracing. You’ll learn that sometimes the most revealing information about a program comes not from what it is, but from what it does—every file it touches, every network connection it makes, every string it compares.
Concepts You Must Understand First
- System Calls (syscalls)
- The boundary between user space and kernel space—how programs request services from the OS
- Every file operation, network connection, or process creation goes through syscalls
- Understanding syscalls reveals a program’s interactions with the outside world
Guiding Questions:
- Why can’t user-space programs directly access hardware or files?
- What’s the difference between a library call like `fopen()` and a syscall like `open()`?
- How does the kernel validate syscall arguments to prevent malicious programs from harming the system?
Book References:
- “The Linux Programming Interface” by Michael Kerrisk - Chapter 3: System Programming Concepts
- “Computer Systems: A Programmer’s Perspective” (CS:APP) - Chapter 8.4: Process Control (syscall mechanics)
- “Low-Level Programming” by Igor Zhirkov - Chapter 2.5: System Calls
- Process Memory Layout
- How programs are loaded into memory (text, data, stack, heap segments)
- Understanding memory addresses in strace output (e.g., `mmap()` calls)
- Why programs request memory from the OS via `brk()` or `mmap()`
Guiding Questions:
- What does it mean when strace shows `brk(0x5555555a2000) = 0x5555555a2000`?
- Why do programs use `mmap()` instead of just allocating with `malloc()`?
- How can you tell from syscall traces whether a program is leaking memory?
Book References:
- “Computer Systems: A Programmer’s Perspective” - Chapter 9: Virtual Memory
- “The Linux Programming Interface” - Chapter 6: Processes (memory layout)
- “Practical Binary Analysis” by Dennis Andriesse - Chapter 1: Anatomy of a Binary (Section 1.4: Loading and Executing a Binary)
- Library Calls vs. System Calls
- Library calls (what ltrace shows) run in user space; some wrap syscalls, while others such as `strcmp()` never enter the kernel
- One `fread()` may trigger zero, one, or several `read()` syscalls, depending on stdio buffering
- Understanding the libc abstraction layer
Guiding Questions:
- Why does `printf("hello")` not immediately call the `write()` syscall?
- How does libc’s buffering affect what you see in strace vs. ltrace?
- When would you use ltrace instead of strace (and vice versa)?
Book References:
- “The Linux Programming Interface” - Chapter 13: File I/O Buffering
- “Computer Systems: A Programmer’s Perspective” - Chapter 10: System-Level I/O
- File Descriptors and File Operations
- Understanding fd numbers: 0=stdin, 1=stdout, 2=stderr, 3+=open files
- How `openat()`, `read()`, `write()`, `close()` work together
- Interpreting flags like `O_RDONLY`, `O_WRONLY`, `O_CREAT`
Guiding Questions:
- What does `openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3` tell you?
- How can you track which fd corresponds to which file in a long trace?
- What’s suspicious about a program opening `/dev/urandom` or `/etc/shadow`?
Book References:
- “The Linux Programming Interface” - Chapter 4: File I/O: The Universal I/O Model
- “The Linux Programming Interface” - Chapter 18: Directories and Links
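One way to approach the fd-tracking question above is to replay the trace and keep a table of which path each descriptor currently refers to. This is a rough Python sketch under simplifying assumptions: it only understands successful `open()`/`openat()` and `close()` lines, ignores `dup()`, sockets, and pipes, and uses hypothetical names (`annotate_fds`) and sample data.

```python
import re

# Successful open()/openat(): capture the quoted path and the returned fd.
OPEN_RE = re.compile(r'^open(?:at)?\((?:AT_FDCWD, )?"([^"]*)".*\)\s+=\s+(\d+)')
CLOSE_RE = re.compile(r'^close\((\d+)\)\s+=\s+0')
IO_RE = re.compile(r'^(read|write)\((\d+),')

def annotate_fds(lines):
    """Return (syscall, fd, path) for each read/write, resolving fd via earlier opens."""
    table = {0: "<stdin>", 1: "<stdout>", 2: "<stderr>"}
    events = []
    for line in lines:
        if m := OPEN_RE.match(line):
            table[int(m.group(2))] = m.group(1)
        elif m := CLOSE_RE.match(line):
            table.pop(int(m.group(1)), None)
        elif m := IO_RE.match(line):
            fd = int(m.group(2))
            events.append((m.group(1), fd, table.get(fd, "<unknown>")))
    return events

# Hypothetical trace: fd 3 is reused for a second file after close().
sample = [
    'openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3',
    'read(3, "root:x:0:0:"..., 4096) = 2847',
    'close(3) = 0',
    'openat(AT_FDCWD, "/tmp/out", O_WRONLY|O_CREAT, 0644) = 3',
    'write(3, "hello", 5) = 5',
]
for event in annotate_fds(sample):
    print(event)
```

The sample shows why a table is needed: the same number, 3, refers to two different files at different points in the trace.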
- Process Lifecycle (fork/exec/wait)
- How processes create children with `fork()`, replace themselves with `execve()`
- Following child processes with `strace -f`
- Understanding return values: `fork()` returns twice (parent gets child PID, child gets 0)
Guiding Questions:
- Why does `fork()` return different values in parent and child?
- What happens to file descriptors when a process calls `execve()`?
- How would you trace a shell script that spawns multiple child processes?
Book References:
- “The Linux Programming Interface” - Chapter 24: Process Creation
- “The Linux Programming Interface” - Chapter 27: Program Execution
- “Computer Systems: A Programmer’s Perspective” - Chapter 8.4: Process Control
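The "returns twice" behavior of `fork()` is easier to believe once you have watched it. Below is a small Python sketch for Unix-like systems only; the helper name `fork_return_values` is made up for this example. The child reports the value it saw through a pipe so the parent can print both.

```python
import os

def fork_return_values():
    """Fork once; return (value fork() gave the parent, value it gave the child)."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Report it through the pipe and exit
        # immediately so the child never runs the parent's remaining code.
        os.close(read_end)
        os.write(write_end, str(pid).encode())
        os._exit(0)
    # Parent: fork() returned the child's PID.
    os.close(write_end)
    child_saw = int(os.read(read_end, 16).decode())
    os.close(read_end)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return pid, child_saw

parent_saw, child_saw = fork_return_values()
print(f"parent saw {parent_saw}, child saw {child_saw}")
```

Running the script under `strace -f` should show the process-creation syscall (on many Linux systems it appears as `clone()`), the child's `write()` on the pipe, and the parent's `wait4()`.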
- Network Socket API
- Understanding `socket()`, `connect()`, `bind()`, `listen()`, `accept()`, `send()`, `recv()`
- Reading sockaddr structures to extract IP addresses and ports
- Identifying client vs. server behavior from syscall patterns
Guiding Questions:
- What syscall sequence indicates a program is acting as a server?
- How do you extract the destination IP and port from a `connect()` call?
- What’s the difference between `AF_INET` (IPv4) and `AF_INET6` (IPv6)?
Book References:
- “The Linux Programming Interface” - Chapters 56-61: Sockets and Network Programming
- “Computer Systems: A Programmer’s Perspective” - Chapter 11: Network Programming
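Extracting the destination from a `connect()` line, as asked in the guiding questions above, is mostly pattern matching. Here is a minimal Python sketch that assumes the IPv4 `sockaddr` layout strace prints in the sample output earlier in this guide; `extract_endpoints` is a hypothetical helper name, and IPv6 addresses (which strace prints with a different layout) are not handled.

```python
import re

# Matches the IPv4 form: connect(FD, {sa_family=AF_INET, sin_port=htons(P), sin_addr=inet_addr("IP")}
CONNECT_RE = re.compile(
    r'connect\((\d+), \{sa_family=AF_INET, '
    r'sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([\d.]+)"\)\}'
)

def extract_endpoints(lines):
    """Return (fd, ip, port) for each IPv4 connect() found in strace lines."""
    endpoints = []
    for line in lines:
        m = CONNECT_RE.search(line)
        if m:
            endpoints.append((int(m.group(1)), m.group(3), int(m.group(2))))
    return endpoints

line = ('connect(4, {sa_family=AF_INET, sin_port=htons(1337), '
        'sin_addr=inet_addr("10.0.0.1")}, 16) = 0')
print(extract_endpoints([line]))
```

The `htons()` wrapper in the trace is only strace's notation for network byte order; the number inside it is already the human-readable port.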
- Signal Handling
- How programs respond to events (Ctrl+C sends SIGINT, segfault triggers SIGSEGV)
- Seeing `rt_sigaction()` and `rt_sigprocmask()` in traces
- Understanding signal delivery and handler installation
Guiding Questions:
- What does it mean when a program installs a handler for SIGSEGV?
- Why might malware install signal handlers to detect debugging?
- How can you tell if a program is ignoring SIGTERM?
Book References:
- “The Linux Programming Interface” - Chapters 20-22: Signals
- “Computer Systems: A Programmer’s Perspective” - Chapter 8.5: Signals
- Dynamic Linking and Shared Libraries
- How programs load `.so` files at runtime
- Understanding `LD_PRELOAD` and library injection
- Seeing `dlopen()`, `dlsym()` for runtime loading
Guiding Questions:
- What’s happening when you see multiple `openat()` calls to `.so` files?
- How could an attacker use `LD_PRELOAD` maliciously?
- Why do some programs use `dlopen()` instead of linking at compile time?
Book References:
- “Computer Systems: A Programmer’s Perspective” - Chapter 7: Linking
- “Practical Binary Analysis” - Chapter 1: Anatomy of a Binary (Section 1.4: Loading and Executing a Binary)
- “The Linux Programming Interface” - Chapters 41-42: Shared Libraries
Questions to Guide Your Design
- How can you automatically filter out “boring” syscalls (like `mmap()` for library loading) to focus on interesting behavior?
- Consider writing a Python script that parses strace output and highlights file/network operations
- What heuristics distinguish initialization syscalls from runtime behavior?
- How would you detect anti-debugging or anti-tracing techniques in a program?
- Programs can check if they’re being traced using `ptrace(PTRACE_TRACEME)`
- What syscall patterns indicate a program is checking for analysis tools?
- How can you reconstruct a program’s command-line parsing logic from ltrace output alone?
- Watch for `strcmp()`, `strncmp()`, `getopt()` calls
- Can you build a decision tree of program behavior based on arguments?
- What’s the difference between tracing a statically-linked binary vs. a dynamically-linked binary?
- Static binaries embed their libc code, so there are no dynamic library calls for ltrace to intercept; dynamic binaries reach libc through the dynamic linker
- How does this affect what you see in strace vs. ltrace?
- How would you trace a multi-threaded program with strace?
- Use `strace -f` to follow threads created by `clone()`
- How do you distinguish thread creation from process creation in the output?
- Can you identify a program’s cryptographic operations from syscall traces?
- Look for reads from `/dev/urandom` (entropy source)
- Large writes to network sockets might indicate encrypted communication
- How would you use strace to diagnose why a program is slow or hanging?
- Look for blocking syscalls: `read()` on network sockets, `wait()` on child processes
- Use `strace -T` to show time spent in each syscall
- How can you determine if a binary is packed or obfuscated by examining its syscalls?
- Self-modifying code might use `mprotect()` to change memory permissions
- Packed binaries often unpack themselves in memory before executing
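For the packing question that closes the list above, one heuristic is to flag memory that becomes executable outside normal library loading. The sketch below is only a heuristic under stated assumptions: it treats file-backed executable `mmap()` as ordinary loader activity and reports `mprotect()` calls and anonymous `mmap()` calls that request `PROT_EXEC`. JIT compilers trigger the same pattern, so a hit is a lead, not a verdict. `find_exec_changes` and the sample lines are hypothetical.

```python
import re

CALL_RE = re.compile(r'^(mmap|mprotect)\((.*)\)\s+=\s+\S+')

def find_exec_changes(lines):
    """Return (syscall, also_writable, line) for suspicious PROT_EXEC requests."""
    findings = []
    for line in lines:
        m = CALL_RE.match(line)
        if not m:
            continue
        name, args = m.group(1), m.group(2)
        if "PROT_EXEC" not in args:
            continue
        # File-backed executable mmap() is how shared libraries are normally loaded.
        if name == "mmap" and "MAP_ANONYMOUS" not in args:
            continue
        findings.append((name, "PROT_WRITE" in args, line))
    return findings

# Hypothetical trace lines for demonstration.
sample = [
    'mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1234560000',
    'mmap(NULL, 163840, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0x28000) = 0x7f1234000000',
    'mprotect(0x7f1234560000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC) = 0',
]
for name, writable, line in find_exec_changes(sample):
    print(name, "W+X" if writable else "X only", line)
```

Memory that is writable and executable at the same time is the stronger signal, which is why the second tuple field is reported separately.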
Thinking Exercise
Exercise 1: Manual Syscall Trace Analysis
Before running any tools, examine this strace output from an unknown binary:
execve("./mystery", ["./mystery"], 0x7ffc...) = 0
openat(AT_FDCWD, "/home/user/.ssh/id_rsa", O_RDONLY) = 3
read(3, "-----BEGIN RSA PRIVATE KEY-----\n"..., 4096) = 1679
close(3) = 0
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(443),
sin_addr=inet_addr("203.0.113.45")}, 16) = 0
write(3, "-----BEGIN RSA PRIVATE KEY-----\n"..., 1679) = 1679
close(3) = 0
unlink("/home/user/.ssh/id_rsa") = 0
Questions to answer:
- What is this program doing? (Be specific about each step)
- What type of malware behavior does this exhibit?
- What Indicators of Compromise (IOCs) can you extract?
- How would you write a YARA rule to detect similar behavior?
- What syscall would you set a breakpoint on if debugging this?
Exercise 2: ltrace Password Extraction
Given this ltrace output from a crackme:
__libc_start_main(...)
puts("Enter password: ")
fgets("my_guess\n", 100, 0x7f...)
strlen("my_guess\n") = 9
strcmp("my_guess", "sup3r_s3cr3t") = -1
puts("Wrong password!")
Tasks:
- Extract the correct password (even though we guessed wrong)
- Explain why ltrace is more useful than strace for this crackme
- What would strace show instead? (Describe the syscalls)
- How could the developer prevent this ltrace attack?
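Pulling compared strings out of a long ltrace log by eye gets tedious, so here is a minimal Python sketch of the idea. It assumes ltrace printed both arguments as quoted strings, as in the output above; `extract_comparisons` is a hypothetical helper name, and only a few comparison functions are listed.

```python
import re

# A quoted ltrace string: any run of non-quote characters or backslash escapes.
QUOTED = r'"((?:[^"\\]|\\.)*)"'
COMPARE_RE = re.compile(r'\b(strcmp|strncmp|strcasecmp)\(' + QUOTED + r', ' + QUOTED)

def extract_comparisons(lines):
    """Return (function, first_arg, second_arg) for each string comparison found."""
    results = []
    for line in lines:
        m = COMPARE_RE.search(line)
        if m:
            results.append((m.group(1), m.group(2), m.group(3)))
    return results

trace = [
    'puts("Enter password: ")',
    'strlen("my_guess\\n") = 9',
    'strcmp("my_guess", "sup3r_s3cr3t") = -1',
    'puts("Wrong password!")',
]
typed = "my_guess"
for func, left, right in extract_comparisons(trace):
    # Whichever side is not our own input is the candidate secret.
    candidates = [arg for arg in (left, right) if arg != typed]
    print(func, candidates)
```

Treat the result as a candidate only: the compared value may be a hash, a decoy, or one of several checks.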
Exercise 3: Network Protocol Reconstruction
Analyze this strace excerpt and reconstruct the network protocol:
socket(AF_INET, SOCK_STREAM, 0) = 3
connect(3, {sin_addr=inet_addr("10.0.0.5"), sin_port=htons(9999)}, 16) = 0
write(3, "HELLO\n", 6) = 6
read(3, "OK\n", 4096) = 3
write(3, "GET /data\n", 10) = 10
read(3, "DATA:12345\n", 4096) = 11
write(3, "BYE\n", 4) = 4
close(3) = 0
Questions:
- Is this a text-based or binary protocol?
- What’s the message flow? (Draw a sequence diagram)
- How would you fuzz this protocol?
- What’s missing from this trace that would help with analysis?
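As a starting point for the sequence diagram, the reads and writes on the socket can be turned into an ordered message list. This is a small Python sketch, assuming the traced process is the client and that payloads appear as quoted strings; `message_flow` is a hypothetical name, and the trace lines are the ones from the excerpt above.

```python
import re

IO_RE = re.compile(
    r'^(read|write|send|recv|sendto|recvfrom)\((\d+), "((?:[^"\\]|\\.)*)"'
)
OUTBOUND = {"write", "send", "sendto"}

def message_flow(lines, fd):
    """Return (direction, payload) pairs for I/O on one descriptor, in trace order."""
    flow = []
    for line in lines:
        m = IO_RE.match(line)
        if m and int(m.group(2)) == fd:
            direction = "client -> server" if m.group(1) in OUTBOUND else "server -> client"
            flow.append((direction, m.group(3)))
    return flow

# Raw strings keep strace's "\n" escapes as literal backslash-n text.
trace = [
    r'socket(AF_INET, SOCK_STREAM, 0) = 3',
    r'write(3, "HELLO\n", 6) = 6',
    r'read(3, "OK\n", 4096) = 3',
    r'write(3, "GET /data\n", 10) = 10',
    r'read(3, "DATA:12345\n", 4096) = 11',
    r'write(3, "BYE\n", 4) = 4',
    r'close(3) = 0',
]
for direction, payload in message_flow(trace, fd=3):
    print(f"{direction}: {payload}")
```

Payloads longer than the `-s` limit are truncated by strace, so rerun with a larger `-s` value before trusting a reconstructed message.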
The Interview Questions They’ll Ask
- “You’re analyzing a suspicious binary. It produces no output, but you suspect it’s exfiltrating data. How would you use strace to confirm this?”
- Expected Answer: Use `strace -e network` to trace network syscalls. Look for `socket()`, `connect()`, `send()`, or `write()` to network fds. Check destination IPs. Use `strace -s 1000` to see full data buffers. Alternatively, combine with Wireshark for full packet capture.
- “Explain the difference between strace and ltrace. When would you use each?”
- Expected Answer: strace traces system calls (kernel boundary), ltrace traces library calls (user-space functions). Use strace for file/network I/O, process management. Use ltrace for string operations (strcmp), crypto functions (MD5), library-level logic. Sometimes you need both: strace shows what happens, ltrace shows how the program logic works.
- “A program is reading from `/dev/urandom`. What does this tell you, and what should you investigate next?”
- Expected Answer: It’s generating random numbers, likely for cryptography or nonce generation. Check how much entropy it reads. Look for subsequent crypto operations (OpenSSL functions in ltrace, or network writes that might be encrypted data). Could be legitimate (TLS) or malicious (ransomware generating encryption keys).
- “How does strace work under the hood? What syscall does strace itself use?”
- Expected Answer: strace uses the `ptrace()` syscall to attach to a process and intercept its syscalls. When the traced process makes a syscall, the kernel stops it and notifies strace. This is the same mechanism debuggers use. This is why anti-debugging malware often checks for `ptrace()` or looks for parent processes named “strace”.
- “You see hundreds of `mmap()` and `mprotect()` calls in a trace. What might this indicate?”
- Expected Answer: Could be normal (loading shared libraries, allocating memory). Or could indicate packing/obfuscation: malware unpacking itself, self-modifying code, or JIT compilation. Check if `mprotect()` is changing memory to executable (`PROT_EXEC`). Packed malware often `mmap()`s space, writes unpacked code, then `mprotect()`s it to RWX.
- “How would you trace a program that uses fork() to create multiple child processes?”
- Expected Answer: Use `strace -f` (follow forks). Output can be confusing with interleaved processes. Use `-ff -o trace.log` to write each process to a separate file (trace.log.PID), then analyze each child’s behavior independently. On modern Linux both threads and `fork()` children usually appear as `clone()` (or `clone3()`) in the trace; the `CLONE_THREAD` flag marks a thread.
- “A program calls `unlink()` on its own executable. What’s likely happening?”
- Expected Answer: It’s deleting itself, common in malware to hide tracks. On Linux, a file that is still in use can be deleted; its data stays on disk until the last reference (open fd or memory mapping) goes away. The program continues running from memory. This is an anti-forensics technique. While the process is alive you can often recover the binary through `/proc/<pid>/exe`, or by dumping the process memory.
- “You trace a crackme and see `strcmp("my_input", "secretpass") = -1`. Is this always the password?”
- Expected Answer: Usually yes, but not always! Some crackmes use tricks: comparing hashes instead of plaintext, doing multiple checks (must pass all), or running timing checks that detect a tracer’s slowdown. Switching to `memcmp()` alone does not hide the comparison, because ltrace can trace it too (though the buffers may be shown as addresses rather than strings). Crackmes that want to avoid library calls entirely implement a custom comparison inline or in assembly.
- “How can a program detect that it’s being traced by strace, and how would you bypass this detection?”
- Expected Answer: Programs can call `ptrace(PTRACE_TRACEME)`, which fails if already traced (strace uses ptrace). They can check `/proc/self/status` for “TracerPid”. They can use timing attacks (strace is slow). Bypasses: Use kernel modules that hook syscalls without ptrace. Use emulation (QEMU user-mode). Patch the binary to remove checks. Use `LD_PRELOAD` to fake ptrace return values.
- “You need to analyze a binary but it’s statically linked. How does this affect your strace/ltrace strategy?”
- Expected Answer: ltrace is useless—no library calls to intercept. strace still works (syscalls are unavoidable). You’ll see raw syscalls instead of nice library wrappers. For string operations, you’ll need to disassemble or use dynamic instrumentation (Frida, DynamoRIO) to hook internal functions.
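The `TracerPid` check mentioned in the anti-tracing answer above is simple enough to reproduce, which helps when you need to recognize it in a trace (look for an `openat()` of `/proc/self/status` followed by a `read()`). Below is a minimal Python sketch; `tracer_pid` is a hypothetical helper, and the status text is a made-up excerpt rather than real output.

```python
def tracer_pid(status_text):
    """Return the TracerPid value from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1])
    return None

# Hypothetical excerpt of /proc/self/status for a process traced by PID 4242.
sample_status = (
    "Name:\tcrackme\n"
    "State:\tS (sleeping)\n"
    "TracerPid:\t4242\n"
    "Uid:\t1000\t1000\t1000\t1000\n"
)
print(tracer_pid(sample_status))
```

On Linux you can feed it the real file with `tracer_pid(open("/proc/self/status").read())`; a value of 0 means no tracer is attached, and running the same script under strace should report strace's PID instead.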
Books That Will Help
| Topic | Book | Chapter/Section | Why It Matters |
|---|---|---|---|
| System Call Fundamentals | “The Linux Programming Interface” by Michael Kerrisk | Ch. 3: System Programming Concepts | Complete reference for every syscall you’ll see in traces |
| System Call Mechanics | “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron | Ch. 8.1: Exceptions; Ch. 8.4: Process Control | Understand how syscalls transition from user to kernel mode |
| File I/O Operations | “The Linux Programming Interface” by Michael Kerrisk | Ch. 4-5: File I/O | Decode all file-related syscalls (open, read, write, ioctl) |
| Process Management | “The Linux Programming Interface” by Michael Kerrisk | Ch. 24-27: Process Creation, Monitoring, Execution | Understand fork(), exec(), wait() patterns in traces |
| Network Programming | “The Linux Programming Interface” by Michael Kerrisk | Ch. 56-61: Sockets | Interpret socket(), connect(), bind(), listen(), accept() |
| Network Internals | “Computer Systems: A Programmer’s Perspective” | Ch. 11: Network Programming | Client-server architecture, protocol design |
| Signals | “The Linux Programming Interface” by Michael Kerrisk | Ch. 20-22: Signals | Understand signal handlers in malware |
| Dynamic Linking | “Computer Systems: A Programmer’s Perspective” | Ch. 7: Linking | Why you see library loads in strace |
| Binary Loading | “Practical Binary Analysis” by Dennis Andriesse | Ch. 1: Anatomy of a Binary (Section 1.4: Loading and Executing a Binary) | How programs load and what syscalls this generates |
| Low-Level System Calls | “Low-Level Programming” by Igor Zhirkov | Ch. 2: Assembly Language | Direct syscall invocation via syscall instruction |
| Ptrace Internals | ptrace(2) Linux manual page | `PTRACE_TRACEME` and `PTRACE_SYSCALL` requests | How strace itself works |
| Anti-Debugging Techniques | “Practical Malware Analysis” by Sikorski & Honig | Ch. 15: Anti-Disassembly; Ch. 16: Anti-Debugging | Detect and bypass tracing countermeasures |
| Behavioral Analysis Methodology | “Practical Malware Analysis” by Sikorski & Honig | Ch. 3: Basic Dynamic Analysis | Professional workflow for using dynamic analysis tools |
| Assembly & Syscalls | “Hacking: The Art of Exploitation” by Jon Erickson | Ch. 0x200: Programming (syscalls section) | Raw syscall invocation in assembly |
Common Pitfalls and Debugging
Problem 1: “Your interpretation does not match runtime behavior”
- Why: Static analysis can hide runtime-resolved addresses, lazy binding, and input-dependent branches.
- Fix: Reproduce the path with debugger or tracer, then compare static assumptions against live register/memory state.
- Quick test: Run the same sample through both your static workflow and a debugger transcript, and confirm control-flow decisions align.
Problem 2: “Tool output is inconsistent across machines”
- Why: ASLR, tool version drift, and different binary build flags (PIE, RELRO, symbols stripped) change observed addresses and metadata.
- Fix: Pin tool versions, capture `checksec`/metadata, and document environment assumptions in your report.
- Quick test: Re-run analysis in a container or VM with pinned tools and compare hashes of generated outputs.
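Comparing hashes of raw traces rarely works, because ASLR and process IDs change on every run. One possible approach, sketched here in Python with hypothetical helper names (`normalize`, `trace_fingerprint`), is to mask the volatile parts before hashing. The masking rules below are assumptions that will need tuning for your own traces.

```python
import hashlib
import re

def normalize(line):
    """Mask run-specific details so equivalent traces compare equal."""
    line = re.sub(r'^\[pid\s+\d+\]\s+', '', line)      # drop strace -f pid prefix
    line = re.sub(r'0x[0-9a-fA-F]+', '0xADDR', line)   # mask addresses (ASLR)
    return line.rstrip()

def trace_fingerprint(lines):
    """Return a SHA-256 hex digest of the normalized trace."""
    digest = hashlib.sha256()
    for line in lines:
        digest.update(normalize(line).encode() + b"\n")
    return digest.hexdigest()

# Two hypothetical runs of the same binary: only addresses and PIDs differ.
run_a = ['brk(NULL) = 0x55d0c1a2b000',
         '[pid  4321] openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3']
run_b = ['brk(NULL) = 0x5581f0e4d000',
         '[pid  9876] openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3']
print(trace_fingerprint(run_a) == trace_fingerprint(run_b))
```

Masking trades precision for stability: two traces that differ only in an address will now hash the same, so keep the raw logs alongside the fingerprint.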
Problem 3: “Analysis accidentally executes unsafe code”
- Why: Dynamic workflows run binaries in host context without sufficient isolation.
- Fix: Use disposable snapshots, no-network execution, and non-privileged users for all unknown samples.
- Quick test: Validate isolation controls first (network disabled, snapshot active, unprivileged user), then execute sample.
Definition of Done
- Core functionality works on reference inputs
- Edge cases are tested and documented
- Results are reproducible (same binary, same tools, same report output)
- Analysis notes clearly separate observations, assumptions, and conclusions
- Lab safety controls were applied for any dynamic execution
4. Solution Architecture
Input Artifact -> Parse/Decode -> Analysis Engine -> Validation Layer -> Report
Design each stage so intermediate artifacts are inspectable (JSON/text/notes), which makes debugging and peer review much easier.
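To make the Parse/Decode stage and its inspectable artifact concrete, here is a minimal Python sketch that turns strace text into JSON records. It assumes strace's default single-line format; `SyscallRecord` and `parse_line` are hypothetical names. Lines it does not understand (signal notices, exit markers, `<unfinished ...>` fragments) are skipped rather than guessed at.

```python
import json
import re
from dataclasses import dataclass, asdict

# name(args) = ret [note], e.g. "= -1 EACCES (Permission denied)"
LINE_RE = re.compile(
    r'^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+'
    r'(?P<ret>0x[0-9a-f]+|-?\d+|\?)(?:\s+(?P<note>.*))?$'
)

@dataclass
class SyscallRecord:
    name: str
    args: str   # raw argument text, left unparsed in this sketch
    ret: str
    note: str   # errno name and message, if any

def parse_line(line):
    """Parse one strace line into a record, or return None if it does not fit."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    return SyscallRecord(m["name"], m["args"], m["ret"], m["note"] or "")

# Hypothetical trace lines for demonstration.
lines = [
    'execve("./mystery", ["./mystery"], 0x7ffc0 /* 20 vars */) = 0',
    'openat(AT_FDCWD, "/etc/shadow", O_RDONLY) = -1 EACCES (Permission denied)',
    '+++ exited with 1 +++',
]
records = [r for r in map(parse_line, lines) if r is not None]
print(json.dumps([asdict(r) for r in records], indent=2))
```

Writing this JSON to disk between stages gives the later analysis and validation steps a stable input that a reviewer can open and check by hand.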
5. Implementation Phases
Phase 1: Foundation
- Define input assumptions and format checks.
- Produce a minimal golden output on one known sample.
Phase 2: Core Functionality
- Implement full analysis pass for normal cases.
- Add validation against an external ground-truth tool.
Phase 3: Hard Cases and Reporting
- Add malformed/edge-case handling.
- Finalize report template and reproducibility notes.
6. Testing Strategy
- Unit-level checks for parser/decoder helpers.
- Integration checks against known binaries/challenges.
- Regression tests for previously failing cases.
7. Extensions & Challenges
- Add automation for batch analysis and comparative reports.
- Add confidence scoring for each major finding.
- Add export formats suitable for CI/security pipelines.
8. Production Reflection
Map your project output to a production analogue: what reliability, observability, and security controls would be required to run this continuously in an engineering organization?