Project 2: Library Dependency Visualizer
Build a tool that parses binaries and renders their shared library dependency graph.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1-2 weeks |
| Language | C |
| Prerequisites | Binary file I/O, basic ELF/Mach-O concepts |
| Key Topics | DT_NEEDED, search paths, recursion, graph output |
1. Learning Objectives
By completing this project, you will:
- Parse ELF (Linux) or Mach-O (macOS) headers.
- Extract dynamic dependencies (
DT_NEEDEDor LC_LOAD_DYLIB). - Resolve library paths using real search rules.
- Render a dependency graph as SVG or ASCII.
2. Theoretical Foundation
2.1 Core Concepts
- Dynamic section entries:
DT_NEEDEDrecords required shared libraries. - Search path resolution:
rpath,runpath, and environment variables determine actual locations. - Dependency graphs: Libraries depend on other libraries transitively.
2.2 Why This Matters
Understanding dependencies explains many real-world loader failures and ABI mismatches. It also reveals why packaging issues happen.
2.3 Historical Context / Background
ldd is a thin wrapper around the dynamic loader. This project exposes the underlying data and logic that ldd hides.
2.4 Common Misconceptions
- “ldd is magic”: It follows defined rules and metadata.
- “Dependencies are flat”: They are recursive graphs.
3. Project Specification
3.1 What You Will Build
A CLI tool that reads a binary, finds all dependencies, and outputs a dependency graph as SVG (Graphviz) or ASCII.
3.2 Functional Requirements
- Parse a target binary and extract its direct dependencies.
- Resolve each dependency to a full path.
- Recursively parse dependencies without infinite loops.
- Emit output in a chosen format.
3.3 Non-Functional Requirements
- Performance: Avoid re-parsing libraries (use a visited set).
- Reliability: Handle missing libraries gracefully.
- Usability: Provide a clean CLI interface.
3.4 Example Usage / Output
$ ./libviz /usr/bin/python3 --output deps.svg
[ok] wrote deps.svg
3.5 Real World Outcome
You open the SVG and see exactly why a binary depends on libc.so, libpthread.so, and libdl.so:
$ ./libviz /usr/bin/python3 --output deps.svg
[scan] /usr/bin/python3
[found] libpython3.11.so
[found] libc.so.6
[found] libdl.so.2
[ok] wrote deps.svg
4. Solution Architecture
4.1 High-Level Design
┌───────────────┐ ┌────────────────┐ ┌──────────────┐
│ input binary │────▶│ dependency scan│────▶│ graph output │
└───────────────┘ └────────────────┘ └──────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Binary parser | Read headers and dynamic section | ELF vs Mach-O path |
| Resolver | Apply search rules | Respect rpath/runpath |
| Graph builder | Track edges and nodes | Avoid duplicates |
4.3 Data Structures
typedef struct {
char *name;
char *path;
} lib_node_t;
typedef struct {
int from;
int to;
} lib_edge_t;
4.4 Algorithm Overview
Key Algorithm: Dependency discovery
- Parse binary, extract dependency names.
- Resolve each name to a path.
- Recurse into each dependency if not visited.
- Emit graph edges.
Complexity Analysis:
- Time: O(N + E) for nodes and edges.
- Space: O(N) for visited set.
5. Implementation Guide
5.1 Development Environment Setup
gcc --version
readelf --version # useful for validation
5.2 Project Structure
project-root/
├── src/
│ ├── main.c
│ ├── elf_parser.c
│ ├── resolver.c
│ └── graph.c
└── Makefile
5.3 The Core Question You’re Answering
“What exact chain of shared libraries does this program depend on, and why?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- ELF dynamic section
DT_NEEDED,DT_RPATH,DT_RUNPATH
- Search path algorithm
LD_LIBRARY_PATHand default paths
- Graph traversal
- Avoid cycles with a visited set
5.5 Questions to Guide Your Design
- Which metadata fields are required to resolve paths?
- How will you handle duplicate dependencies?
- What should the tool do when a library is missing?
5.6 Thinking Exercise
How would the graph change if you set LD_LIBRARY_PATH to include a custom directory?
5.7 The Interview Questions They’ll Ask
- What is
DT_NEEDEDand where is it stored? - What is the difference between
rpathandrunpath? - How do you avoid infinite loops in dependency graphs?
5.8 Hints in Layers
Hint 1: Start small
- Parse a single ELF header and list
DT_NEEDEDentries.
Hint 2: Use a visited set
- Hash the resolved library path.
Hint 3: Graph output
- Emit DOT format for Graphviz.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Linking mechanics | “Linkers and Loaders” | Ch. 10 |
| ELF basics | CSAPP | Ch. 7 |
| Binary formats | “Practical Binary Analysis” | Ch. 2-3 |
5.10 Implementation Phases
Phase 1: Foundation (3-4 days)
Goals:
- Parse a binary and list dependencies.
Tasks:
- Read ELF headers.
- Extract
DT_NEEDEDnames.
Checkpoint: Print direct dependencies for one binary.
Phase 2: Core Functionality (4-5 days)
Goals:
- Resolve paths and recurse.
Tasks:
- Implement search path resolution.
- Build recursive traversal.
Checkpoint: Full dependency list without duplicates.
Phase 3: Polish & Edge Cases (3-4 days)
Goals:
- Graph output and error handling.
Tasks:
- Emit DOT/SVG.
- Handle missing libraries.
Checkpoint: Graph renders correctly.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Output format | DOT/SVG/ASCII | DOT + Graphviz | Easy visualization |
| Parsing approach | Manual vs libelf | Manual first | Learning value |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Parsing | Confirm DT_NEEDED | readelf -d comparison |
| Resolution | Validate paths | Compare with ldd |
| Graph | Verify edges | DOT has correct node links |
6.2 Critical Test Cases
- A simple binary with known dependencies.
- A binary with recursive dependencies.
- A missing library path scenario.
6.3 Test Data
/usr/bin/ls
/usr/bin/python3
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Incorrect offsets | Garbage names | Validate with readelf |
| Missing runpath | Wrong path | Implement runpath rules |
| Infinite recursion | Hang | Track visited nodes |
7.2 Debugging Strategies
- Compare output against
ldd. - Dump parsed offsets while developing.
7.3 Performance Traps
Repeated parsing of the same library without caching.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add ASCII graph output for terminal use.
8.2 Intermediate Extensions
- Support both ELF and Mach-O.
8.3 Advanced Extensions
- Highlight version mismatches and missing symbols.
9. Real-World Connections
9.1 Industry Applications
- Packaging: Track dependency closure for distribution.
- Debugging: Diagnose “library not found” errors.
9.2 Related Open Source Projects
- lddtree: Similar tool for dependency trees.
- patchelf: Modifies RPATH and interpreter.
9.3 Interview Relevance
- Demonstrates knowledge of ELF metadata and loader behavior.
10. Resources
10.1 Essential Reading
- “Linkers and Loaders” - Dynamic linking chapter.
- TLPI - Shared library fundamentals.
10.2 Video Resources
- Search: “ELF DT_NEEDED tutorial”.
10.3 Tools & Documentation
- readelf: Inspect dynamic section.
- objdump: Disassembly and headers.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how the loader finds libraries.
- I can read DT_NEEDED entries.
11.2 Implementation
- My tool matches
lddoutput for basic binaries. - My graph renders without duplicates.
11.3 Growth
- I can troubleshoot dependency issues faster.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Extract direct dependencies for a binary.
Full Completion:
- Recursive graph with correct search paths.
Excellence (Going Above & Beyond):
- Support multiple binary formats and version annotations.
This guide was generated from SHARED_LIBRARIES_LEARNING_PROJECTS.md. For the complete learning path, see the parent directory README.