Project 8: CPU Stack Profiler (profile Clone)
A sampling profiler that captures stack traces at regular intervals to show where CPU time is being spent. It writes its output in a format suitable for flame graph generation.
Quick Reference
| Attribute | Value |
|---|---|
| Primary Language | C (libbpf) |
| Alternative Languages | Go (cilium/ebpf), Rust (aya) |
| Difficulty | Level 3: Advanced |
| Time Estimate | 2 weeks |
| Knowledge Area | Performance Profiling / CPU Analysis |
| Tooling | libbpf, perf_event |
| Prerequisites | Projects 1-7 completed |
What You Will Build
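You will build a sampling profiler that interrupts the target at a fixed rate (99 Hz in the example run) and records the stack trace at each sample. It can print a summary of the functions with the most samples, or emit folded stacks that flame graph tooling can consume.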
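To make the output format concrete, here is a minimal sketch of writing one line in the folded-stack format, where frames are joined by `;` from root to leaf and followed by a space and the sample count. The helper name `fold_stack` and the call chain used in the usage note are hypothetical, and the sketch assumes the frames arrive leaf-first, which is the order stack walkers usually produce.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write one folded-stack line into buf.
 * frames[] is leaf-first; the folded format wants root-first frames
 * joined by ';', then a space and the sample count.
 * Returns the number of characters written, or -1 if buf is too small. */
int fold_stack(char *buf, size_t cap, const char *const *frames,
               int nframes, unsigned long count)
{
    assert(buf != NULL && frames != NULL);

    size_t used = 0;
    /* Walk from the outermost caller (last entry) down to the leaf. */
    for (int i = nframes - 1; i >= 0; i--) {
        int n = snprintf(buf + used, cap - used, "%s%s",
                         frames[i], i > 0 ? ";" : "");
        if (n < 0 || (size_t)n >= cap - used)
            return -1;              /* output would be truncated */
        used += (size_t)n;
    }

    int n = snprintf(buf + used, cap - used, " %lu", count);
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

For example, calling it with the leaf-first frames `parse_json`, `process_request`, `main` and a count of 89 yields the line `main;process_request;parse_json 89`. In a real profiler, identical stacks would first be aggregated so that each unique stack is written once with its total count.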
Why It Matters
This project combines BPF with perf events for CPU sampling, teaches you about stack unwinding, and produces data for visualization. Timed sampling plus stack capture is the foundation of production profilers.
Core Challenges
- CPU sampling with perf events → maps to PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK
- Capturing kernel and user stacks → maps to bpf_get_stack(), stack traces
- Symbol resolution → maps to /proc/kallsyms, DWARF, frame pointers
- Flame graph generation → maps to folded stacks format
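The first challenge above can be sketched in user space as follows. This is a minimal sketch, not a full loader: the function names `init_cpu_clock_attr` and `open_cpu_clock_sampler` are hypothetical, and the code only prepares and opens the perf event. It assumes a Linux system with the kernel UAPI headers installed.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill attr for frequency-based sampling on the software CPU clock. */
void init_cpu_clock_attr(struct perf_event_attr *attr, unsigned long long hz)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_SOFTWARE;
    attr->config = PERF_COUNT_SW_CPU_CLOCK;
    attr->freq = 1;          /* treat sample_freq as samples per second */
    attr->sample_freq = hz;  /* e.g. 99 for the example run */
}

/* glibc has no wrapper for perf_event_open(2), so call it via syscall(2).
 * Returns a file descriptor, or -1 with errno set. */
int open_cpu_clock_sampler(pid_t pid, int cpu, unsigned long long hz)
{
    struct perf_event_attr attr;

    init_cpu_clock_attr(&attr, hz);
    return (int)syscall(SYS_perf_event_open, &attr, pid, cpu,
                        -1 /* group_fd */, PERF_FLAG_FD_CLOEXEC);
}
```

Per perf_event_open(2), passing a positive `pid` with `cpu` set to -1 samples that process on any CPU, while `pid` set to -1 with a specific `cpu` samples everything on that CPU and usually needs elevated privileges. The returned descriptor is what a BPF program of the perf_event type gets attached to, for instance with libbpf's `bpf_program__attach_perf_event()`. Which of the two modes the profiler should use is a design choice the project leaves open.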
Key Concepts
- CPU Profiling: “BPF Performance Tools” Chapter 6 - Brendan Gregg
- Flame Graphs: Brendan Gregg Flame Graphs
- Stack Walking: “BPF Performance Tools” Chapter 2, Section 2.7 - Brendan Gregg
- perf_event Integration: “Learning eBPF” Chapter 7 - Liz Rice
Real-World Outcome
$ sudo ./stackprof -p 1234 -d 10 # Profile PID 1234 for 10 seconds
Profiling PID 1234 for 10 seconds at 99 Hz...
Collected 990 samples
# Output folded stacks for flame graph
$ sudo ./stackprof -p 1234 -d 10 -f > stacks.folded
$ flamegraph.pl stacks.folded > profile.svg
Top functions by sample count:
pthread_mutex_lock 234 (23.6%)
__GI___libc_read 198 (20.0%)
__memcpy_avx_unaligned 156 (15.8%)
process_request 123 (12.4%)
parse_json 89 (9.0%)
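The function names in a report like the one above come from mapping each sampled address back to a symbol. The sketch below shows one common way to do that lookup: a binary search over a symbol table sorted by start address. The `struct sym` layout, the name `sym_lookup`, and the addresses in the usage note are hypothetical. Loading the table (from /proc/kallsyms for kernel frames, or from a binary's symbol table for user frames) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry: start address and name. */
struct sym {
    uint64_t addr;
    const char *name;
};

/* Return the symbol with the greatest start address <= addr,
 * or NULL if addr lies below the first symbol.
 * syms must be sorted by addr in ascending order. */
const struct sym *sym_lookup(const struct sym *syms, size_t n, uint64_t addr)
{
    assert(syms != NULL || n == 0);

    size_t lo = 0, hi = n;   /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].addr <= addr)
            lo = mid + 1;    /* candidate found, look further right */
        else
            hi = mid;
    }
    /* lo is now the index of the first symbol starting after addr. */
    return lo == 0 ? NULL : &syms[lo - 1];
}
```

With a table holding `parse_json` at 0x1000 and `process_request` at 0x1400, an address of 0x13ff resolves to `parse_json` and 0x1400 resolves to `process_request`. Because this sketch stores no symbol sizes, any address past the last entry is attributed to that last symbol, so a fuller implementation would also check an end address.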
Implementation Guide
- Reproduce the simplest happy-path scenario.
- Build the smallest working version of the core feature.
- Add input validation and error handling.
- Add instrumentation/logging to confirm behavior.
- Refactor into clean modules with tests.
Milestones
- Milestone 1: Minimal working program that runs end-to-end.
- Milestone 2: Correct outputs for typical inputs.
- Milestone 3: Robust handling of edge cases.
- Milestone 4: Clean structure and documented usage.
Validation Checklist
- Output matches the real-world outcome example
- Handles invalid inputs safely
- Provides clear errors and exit codes
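- Results are consistent across runs (exact sample counts vary because sampling is statistical, but the same functions should dominate)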
References
- Main guide: LEARN_BPF_EBPF_LINUX.md
- “BPF Performance Tools” by Brendan Gregg