Project 3: LD_PRELOAD Function Interceptor
Build a shared library that intercepts libc calls via LD_PRELOAD and logs behavior.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | Weekend to 1 week |
| Language | C |
| Prerequisites | Function pointers, dlopen/dlsym basics |
| Key Topics | Symbol interposition, RTLD_NEXT, thread safety |
1. Learning Objectives
By completing this project, you will:
- Explain symbol interposition and the loader search order.
- Implement function interceptors with LD_PRELOAD.
- Call the original function using dlsym(RTLD_NEXT, ...).
- Avoid recursion and maintain thread safety.
2. Theoretical Foundation
2.1 Core Concepts
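- Symbol interposition: The dynamic loader binds each symbol reference to the first matching definition in its search order: the executable first, then LD_PRELOAD libraries, then the remaining dependencies in load order. A preloaded definition of malloc therefore wins over the one in libc.
- RTLD_NEXT: A pseudo-handle for dlsym that returns the next definition of a symbol in the search order after the calling object. From inside an interceptor, that is normally the real libc function.
- Reentrancy pitfalls: Interceptors can accidentally call the function they intercept, for example by logging with printf from inside a malloc wrapper.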
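The concepts above can be sketched as a minimal interposer for puts. This assumes glibc on Linux, where `_GNU_SOURCE` must be defined before `<dlfcn.h>` to expose `RTLD_NEXT`. The counter `puts_calls` is a hypothetical name added only to make the effect observable; it is not thread-safe.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <dlfcn.h>
#include <unistd.h>

/* Hypothetical counter, present only so the interception is observable. */
static unsigned long puts_calls;

/* Same prototype as libc's puts, so the loader binds callers to this copy. */
int puts(const char *s)
{
    /* Cached pointer to the next definition of puts in the search order. */
    static int (*real_puts)(const char *);
    static const char tag[] = "[intercept] puts\n";
    ssize_t n;

    if (real_puts == NULL)
        real_puts = (int (*)(const char *))dlsym(RTLD_NEXT, "puts");
    if (real_puts == NULL)
        return -1;             /* EOF: no further definition was found */

    puts_calls++;
    /* Log with write(2) rather than printf to stay out of stdio. */
    n = write(STDERR_FILENO, tag, sizeof tag - 1);
    (void)n;

    return real_puts(s);       /* forward to the real function */
}
```

Built with something like gcc -shared -fPIC intercept.c -o libintercept.so -ldl (the -ldl is only needed on older glibc versions), the library can then be tried with LD_PRELOAD=./libintercept.so on any dynamically linked program that calls puts.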
2.2 Why This Matters
Many user-space tracing, profiling, and fault-injection tools work this way, with no ptrace or kernel support required. It gives you visibility into a program's library calls without recompiling or relinking it.
2.3 Historical Context / Background
LD_PRELOAD was introduced to override symbols dynamically and has been used for debugging, testing, and sometimes exploitation.
2.4 Common Misconceptions
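- “Interception is universal”: Statically linked binaries ignore LD_PRELOAD, and the loader restricts it for setuid/setgid binaries (secure-execution mode). Calls that never go through dynamic symbol resolution, such as raw syscalls, are not intercepted either.
- “printf is safe”: printf can allocate memory and call other functions you intercept, so using it inside a wrapper can cause recursion.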
3. Project Specification
3.1 What You Will Build
A shared library that intercepts malloc, free, and open, logs usage, and reports totals on exit.
3.2 Functional Requirements
- Implement wrappers for at least two libc functions.
- Use dlsym(RTLD_NEXT, ...) to call the original.
- Prevent recursion with a guard or minimal syscalls.
- Print summary statistics at program exit.
3.3 Non-Functional Requirements
- Performance: Minimal overhead per call.
- Reliability: Should not crash target programs.
- Usability: Simple LD_PRELOAD=... usage.
3.4 Example Usage / Output
$ LD_PRELOAD=./libintercept.so /bin/ls
[intercept] open("/etc/ld.so.cache") = 3
[intercept] malloc(1024) = 0x7f8a...
[summary] allocs=120 bytes=98304
3.5 Real World Outcome
You can trace real programs without recompilation:
$ LD_PRELOAD=./libintercept.so /usr/bin/curl https://example.com
[intercept] connect(fd=5, addr=93.184.216.34:443)
[intercept] malloc(4096) = 0x7f8a...
[summary] allocs=847 bytes=2304000
4. Solution Architecture
4.1 High-Level Design
┌──────────────┐ ┌─────────────────┐ ┌──────────────┐
│ target app │────▶│ libintercept.so │────▶│ libc real fn │
└──────────────┘ └─────────────────┘ └──────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Interceptor funcs | Log and forward | Use RTLD_NEXT |
| Guard | Prevent recursion | Thread-local flag |
| Reporter | Print totals at exit | atexit hook |
4.3 Data Structures
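#include <stddef.h>  /* size_t */

typedef struct {
    size_t alloc_count;  /* number of successful allocations */
    size_t alloc_bytes;  /* total bytes requested */
} alloc_stats_t;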
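Because interceptors run on whichever thread made the call, plain size_t counters can lose updates. One possible variant, sketched here with C11 atomics, keeps the same two fields as alloc_stats_t. The names `atomic_alloc_stats_t`, `g_stats`, and the helper functions are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Atomic variant of the counters; field names mirror alloc_stats_t. */
typedef struct {
    atomic_size_t alloc_count;
    atomic_size_t alloc_bytes;
} atomic_alloc_stats_t;

/* Static storage duration, so both counters start at zero. */
static atomic_alloc_stats_t g_stats;

/* Record one successful allocation of `size` bytes. */
static void stats_record_alloc(size_t size)
{
    /* Relaxed ordering is enough: the counters are independent totals. */
    atomic_fetch_add_explicit(&g_stats.alloc_count, 1, memory_order_relaxed);
    atomic_fetch_add_explicit(&g_stats.alloc_bytes, size, memory_order_relaxed);
}

static size_t stats_alloc_count(void)
{
    return atomic_load_explicit(&g_stats.alloc_count, memory_order_relaxed);
}

static size_t stats_alloc_bytes(void)
{
    return atomic_load_explicit(&g_stats.alloc_bytes, memory_order_relaxed);
}
```

The two counters are updated separately, so a reader may briefly see a count that is one allocation ahead of the byte total. For an end-of-run summary that is usually acceptable.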
4.4 Algorithm Overview
Key Algorithm: Interpose and forward
- Resolve original symbol with dlsym(RTLD_NEXT, ...).
- Log parameters.
- Call original function.
- Update stats and return.
Complexity Analysis:
- Time: O(1) per intercepted call.
- Space: O(1) global state.
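The four steps can be sketched for malloc as follows. This is one possible shape, assuming glibc on Linux and GCC or Clang (for `__thread`). The names `real_malloc`, `in_hook`, `alloc_count`, and `alloc_bytes` are hypothetical. If malloc is re-entered while the original is still being resolved, this sketch returns NULL rather than recursing. A production interposer would more likely serve that case from a small static buffer.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <dlfcn.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

static void *(*real_malloc)(size_t);      /* cached original */
static atomic_size_t alloc_count;          /* successful allocations */
static atomic_size_t alloc_bytes;          /* total bytes requested */
static __thread int in_hook;               /* per-thread recursion guard */

void *malloc(size_t size)
{
    static const char tag[] = "[intercept] malloc\n";
    void *p;
    ssize_t n;

    /* Step 1: resolve the original symbol once. */
    if (real_malloc == NULL) {
        if (in_hook)
            return NULL;       /* re-entered during resolution: do not recurse */
        in_hook = 1;
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        in_hook = 0;
        if (real_malloc == NULL)
            return NULL;
    }

    /* A call made from inside our own hook is forwarded without bookkeeping. */
    if (in_hook)
        return real_malloc(size);

    in_hook = 1;
    /* Step 2: log with write(2), which does not allocate. */
    n = write(STDERR_FILENO, tag, sizeof tag - 1);
    (void)n;
    /* Step 3: call the original function. */
    p = real_malloc(size);
    /* Step 4: update stats and return. */
    if (p != NULL) {
        atomic_fetch_add(&alloc_count, 1);
        atomic_fetch_add(&alloc_bytes, size);
    }
    in_hook = 0;
    return p;
}
```

Only malloc is wrapped here, so memory obtained through calloc or realloc is not counted. free can be left to libc because the returned pointers come straight from the real allocator.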
5. Implementation Guide
5.1 Development Environment Setup
gcc --version
man ld.so
5.2 Project Structure
project-root/
├── intercept.c
└── Makefile
5.3 The Core Question You’re Answering
“How does the dynamic loader choose which function implementation to call?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Symbol resolution order
  - Preload libraries vs default paths
- Function pointers
  - Matching signatures exactly
- Reentrancy hazards
  - Logging functions calling intercepted functions
5.5 Questions to Guide Your Design
- How will you avoid recursive calls while logging?
- Which functions are safe to call inside interceptors?
- How will you store state across threads?
5.6 Thinking Exercise
Design an interceptor for connect() that blocks connections to a specific IP.
5.7 The Interview Questions They’ll Ask
- What does LD_PRELOAD do?
- Why is RTLD_NEXT necessary?
- What breaks when you intercept malloc without care?
5.8 Hints in Layers
Hint 1: Minimal logging
- Use write(2, ...) instead of printf.
Hint 2: Guard
- Use a __thread guard variable.
Hint 3: Resolve once
- Cache the original function pointer.
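Hint 1 can be sketched as a small logger that formats into a stack buffer and emits it with a single write call, so no stdio or heap allocation is involved. The helper names `fmt_dec`, `fmt_event`, and `log_event` are hypothetical. The line format follows the example output shown earlier.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Writes the decimal digits of v into out (no terminator).
 * Returns the digit count; out must have room for 20 bytes. */
static size_t fmt_dec(char *out, unsigned long long v)
{
    char tmp[20];
    size_t n = 0;
    size_t i;

    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    for (i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];   /* digits were produced in reverse */
    return n;
}

/* Builds "[intercept] <name>(<value>)\n" in buf.
 * Returns the length, or 0 if buf is too small. */
static size_t fmt_event(char *buf, size_t cap, const char *name,
                        unsigned long long value)
{
    static const char prefix[] = "[intercept] ";
    size_t name_len = strlen(name);
    size_t n = sizeof prefix - 1;

    /* prefix + name + '(' + up to 20 digits + ')' + '\n' */
    if (cap < n + name_len + 1 + 20 + 2)
        return 0;

    memcpy(buf, prefix, n);
    memcpy(buf + n, name, name_len);
    n += name_len;
    buf[n++] = '(';
    n += fmt_dec(buf + n, value);
    buf[n++] = ')';
    buf[n++] = '\n';
    return n;
}

/* Formats one event and writes it to stderr in a single call. */
static void log_event(const char *name, unsigned long long value)
{
    char buf[128];
    size_t n = fmt_event(buf, sizeof buf, name, value);

    if (n > 0) {
        ssize_t w = write(STDERR_FILENO, buf, n);
        (void)w;
    }
}
```

A malloc wrapper could call log_event("malloc", size) between setting and clearing its guard. Emitting each line with one write call also keeps lines from different threads from interleaving mid-line in most cases.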
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Interposition | TLPI | Ch. 42 |
| Function pointers | “C Programming: A Modern Approach” | Ch. 17 |
| Loader behavior | Drepper PDF | symbol lookup |
5.10 Implementation Phases
Phase 1: Foundation (1-2 days)
Goals:
- Intercept a simple function like puts.
Tasks:
- Create shared library with -shared -fPIC.
- Override puts and forward to original.
Checkpoint: LD_PRELOAD logs each call.
Phase 2: Core Functionality (2-3 days)
Goals:
- Intercept malloc and track stats.
Tasks:
- Add guard to avoid recursion.
- Track total allocations.
Checkpoint: Summary printed at exit.
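A possible shape for the exit-time summary, assuming GCC or Clang for the constructor attribute, is sketched below. The names `total_allocs`, `total_bytes`, `format_summary`, `report_summary`, and `intercept_init` are hypothetical. In a real library the two counters would be the ones updated by the malloc wrapper.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical totals; the malloc wrapper would update these. */
static size_t total_allocs;
static size_t total_bytes;

/* Formats the summary line into buf; returns snprintf's result. */
static int format_summary(char *buf, size_t cap, size_t allocs, size_t bytes)
{
    return snprintf(buf, cap, "[summary] allocs=%zu bytes=%zu\n",
                    allocs, bytes);
}

/* Runs at normal process exit and writes one line to stderr. */
static void report_summary(void)
{
    char buf[96];
    int n = format_summary(buf, sizeof buf, total_allocs, total_bytes);

    if (n > 0 && (size_t)n < sizeof buf) {
        ssize_t w = write(STDERR_FILENO, buf, (size_t)n);
        (void)w;
    }
}

/* Runs when the library is loaded, before main, and registers the hook. */
__attribute__((constructor))
static void intercept_init(void)
{
    atexit(report_summary);
}
```

Handlers registered with atexit run only on a normal exit. A target that calls _exit or dies from a signal will not print the summary. snprintf is used here on the assumption that formatting integers into a stack buffer does not allocate. A hand-rolled formatter avoids relying on that.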
Phase 3: Polish & Edge Cases (2-3 days)
Goals:
- Thread safety and stability.
Tasks:
- Add thread-local guard.
- Avoid unsafe logging.
Checkpoint: Works on multi-threaded apps.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Logging method | printf vs write | write | Avoid recursion |
| Guard type | global vs thread-local | thread-local | Multi-thread safety |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Interception | Confirm override | LD_PRELOAD on /bin/ls |
| Forwarding | Ensure real fn called | Output still correct |
| Stability | Avoid recursion | No crashes on curl |
6.2 Critical Test Cases
- Intercepted function logs and returns correct value.
- No infinite recursion when intercepting malloc.
- Works on real programs without crash.
6.3 Test Data
/bin/ls, /usr/bin/curl
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong signature | Crash or garbage | Match exact prototype |
| Using printf | Recursion | Use write |
| Missing RTLD_NEXT | Calls itself | Use RTLD_NEXT |
7.2 Debugging Strategies
- Use LD_DEBUG=libs,bindings to see loader decisions.
- Test with a simple program before big apps.
7.3 Performance Traps
Intercepting hot functions adds overhead; keep logging minimal.
8. Extensions & Challenges
8.1 Beginner Extensions
- Intercept open and log file paths.
8.2 Intermediate Extensions
- Build a per-thread allocation tracker.
8.3 Advanced Extensions
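- Implement a policy engine to block disallowed libc calls. Programs that issue raw syscalls bypass LD_PRELOAD, so treat this as a debugging aid rather than a security boundary.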
9. Real-World Connections
9.1 Industry Applications
- Profilers: Track allocations and I/O patterns.
- Security tooling: Enforce runtime policy without kernel hooks.
9.2 Related Open Source Projects
- libeatmydata: Interposes fsync for performance.
- jemalloc: Advanced allocator that can replace the default malloc at run time via LD_PRELOAD.
9.3 Interview Relevance
- Shows deep understanding of symbol resolution and runtime behavior.
10. Resources
10.1 Essential Reading
- TLPI - Shared libraries advanced features.
- Drepper - Symbol lookup details.
10.2 Video Resources
- Search: “LD_PRELOAD tutorial”.
10.3 Tools & Documentation
- ld.so: man ld.so
- dlopen/dlsym: man dlopen
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain loader symbol order.
- I can describe RTLD_NEXT behavior.
11.2 Implementation
- Interceptors log and forward correctly.
- No recursion or crashes on real programs.
11.3 Growth
- I can design a custom tracing tool for my stack.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Intercept one libc function and log it.
Full Completion:
- Track allocation stats and print a summary.
Excellence (Going Above & Beyond):
- Add policy-based blocking or per-thread reporting.
This guide was generated from SHARED_LIBRARIES_LEARNING_PROJECTS.md. For the complete learning path, see the parent directory README.