Learn Continuous Profiling: From Zero to Observability Master
Goal: Deeply understand the mechanics of continuous profiling—how to observe code execution in production with near-zero overhead. You will learn to build agents that bypass traditional instrumentation, using eBPF and runtime hooks to walk the stack, resolve symbols from binary metadata, and generate actionable insights from the “living” code of high-performance systems.
Why Continuous Profiling Matters
For decades, profiling was something you did in your IDE or on a staging server when things felt slow. Continuous Profiling (CP) changes the game: it runs 24/7/365 in production.
- The “Invisible” Tax: Traditional tracers can add 10-50% overhead. CP aims for <1%, making it safe for even the most sensitive production environments.
- Solving the “Heisenbug”: Many performance regressions only happen under specific production loads. Without CP, you’re guessing. With it, you have the exact stack trace of the bottleneck.
- Cost Optimization: Large-scale companies like Google and Netflix use CP to find “micro-optimizations” that save millions in compute costs.
- Beyond Metrics: Metrics tell you that something is slow; CP tells you exactly which line of code is responsible.
Core Concept Analysis
1. The Profiling Lifecycle
Profiling isn’t just one step; it’s a pipeline of data transformation.
```
[ Running Process ] -> [ Collection Agent ] -> [ Storage/Database ] -> [ Visualization ]
         |                      |                      |                     |
 Instruction Pointer      Stack Walking           Aggregation          Flame Graphs
 & Memory Allocations     Symbolication           Compression          Iceberg Charts
```
2. Sampling vs. Tracing
- Tracing: Hooks every function entry/exit. High overhead, high detail.
- Sampling: Wakes up periodically (e.g., 99 times a second), checks what the CPU is doing, and goes back to sleep. Low overhead, statistically accurate.
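The statistical claim behind sampling can be checked with a tiny simulation (the function names and time shares below are invented for illustration): draw enough samples of “which function is on-CPU” and the observed hit ratio converges on the true share of time.

```python
import random

def sample_profile(time_shares, n_samples, seed=0):
    """Draw n_samples observations of 'which function is on-CPU',
    weighted by the true share of time each function runs."""
    rng = random.Random(seed)
    funcs = list(time_shares)
    weights = [time_shares[f] for f in funcs]
    counts = {f: 0 for f in funcs}
    for _ in range(n_samples):
        counts[rng.choices(funcs, weights=weights)[0]] += 1
    return {f: counts[f] / n_samples for f in funcs}

# Hypothetical workload: parse_json truly uses 70% of CPU time.
true_shares = {"parse_json": 0.70, "idle_poll": 0.30}
estimate = sample_profile(true_shares, n_samples=10_000)
```

At 10,000 samples the estimate lands within a fraction of a percent of the truth, which is why 99 Hz over minutes of wall time is plenty for a CPU profile.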
3. Stack Walking: The Hard Part
How does the profiler know the “path” to the current line of code? It must “walk” the stack.
```
High Memory
+-------------------+
|      main()       |   [Frame 0]
+-------------------+
|   handle_req()    |   [Frame 1]
+-------------------+
|   parse_json()    |   [Frame 2]  <-- Current Instruction Pointer (RIP)
+-------------------+
Low Memory
```
Methods of Walking:
- Frame Pointers (FP): Uses the `RBP` register. Fast, but often optimized away by compilers (`-fomit-frame-pointer`).
- DWARF: Uses debug information sections in the binary. Very accurate but heavy and hard to do in-kernel.
- ORC (Oops Rewind Capability): A Linux-specific compromise for kernel stack walking.
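A frame-pointer walk is simple enough to sketch in a few lines. This toy model (hypothetical addresses, with a dict standing in for process memory) follows the x86-64 convention that `*(RBP)` holds the caller's saved RBP and `*(RBP + 8)` holds the return address:

```python
# Toy "memory": maps an address to the 8-byte word stored there.
# Each frame stores [saved RBP, return address] per the x86-64 convention.
MEMORY = {
    0x7FFC00: 0x7FFC20,  # parse_json's saved RBP -> handle_req's frame
    0x7FFC08: 0x401150,  # return address into handle_req
    0x7FFC20: 0x7FFC40,  # handle_req's saved RBP -> main's frame
    0x7FFC28: 0x401080,  # return address into main
    0x7FFC40: 0x0,       # main: a saved RBP of 0 terminates the walk
    0x7FFC48: 0x400F00,  # return address into _start
}

def walk_frame_pointers(rip, rbp, memory):
    """Walk the saved-RBP chain: at each frame, *(rbp) is the caller's
    RBP and *(rbp + 8) is the return address inside the caller."""
    stack = [rip]
    while rbp and rbp in memory:
        stack.append(memory[rbp + 8])
        rbp = memory[rbp]
    return stack

trace = walk_frame_pointers(rip=0x4012AB, rbp=0x7FFC00, memory=MEMORY)
```

When `-fomit-frame-pointer` is in effect, `RBP` holds arbitrary data and this chain is simply gone, which is exactly why DWARF or ORC unwinding exists.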
4. Symbolication: Turning Hex into Names
The kernel sees memory addresses like 0x4012ab. You need to know that this corresponds to src/parser.c:142.
```
Address: 0x4012ab
        |
        V
[ Look up in ELF Symbol Table / DWARF ]
        |
        V
Function: json_parse_internal
File:     parser.c
Line:     142
```
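The lookup itself is typically a binary search over symbol start addresses: find the last symbol that starts at or before the sampled address. A minimal sketch with an invented symbol table (the names and addresses are illustrative, not read from a real binary):

```python
import bisect

# Hypothetical .symtab excerpt: (start_address, function_name), sorted.
SYMTAB = [
    (0x401000, "main"),
    (0x401100, "handle_request"),
    (0x401280, "json_parse_internal"),
    (0x401500, "free_buffers"),
]

def symbolicate(addr, symtab):
    """Return the name of the last symbol whose start address <= addr."""
    starts = [start for start, _ in symtab]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return "<unknown>"
    return symtab[i][1]
```

A real symbolicator also checks the symbol's size field so an address in a gap between functions is reported as unknown rather than attributed to the previous symbol.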
The eBPF Advantage
eBPF allows us to run “sandboxed” code inside the Linux kernel. This is the secret sauce for modern profilers (like Parca or Pyroscope).
```
User Space                     Kernel Space
+-----------+              +--------------------------+
| Profiler  |  <-------    | eBPF Program             |
| Collector |   (Maps)     | (Hooked to Perf Events)  |
+-----------+              +--------------------------+
      ^                                |
      |                                v
      +----------------------- [ CPU / Hardware ]
```
The eBPF program is triggered by hardware timers or kernel events, records the stack into a shared “Map,” and the user-space agent periodically reads and clears that map.
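That record-then-drain cycle can be modeled without any kernel code. In this sketch `FakeBpfMap` is a user-space stand-in for a real BPF hash map, not an actual eBPF API; it just shows the division of labor between the in-kernel hot path and the agent's periodic read:

```python
from collections import Counter

class FakeBpfMap:
    """Stand-in for a BPF hash map: the 'kernel side' increments a
    counter per stack ID; 'user space' periodically reads and clears."""
    def __init__(self):
        self.counts = Counter()

    def kernel_record(self, stack_id):   # runs on every perf event
        self.counts[stack_id] += 1

    def user_drain(self):                # runs on the agent's tick
        snapshot = dict(self.counts)
        self.counts.clear()
        return snapshot

m = FakeBpfMap()
for sid in [42, 42, 42, 89, 42]:
    m.kernel_record(sid)
batch = m.user_drain()
```

The key property: the per-event work is a single hash-map increment, while the expensive serialization and symbolication happen off the hot path, at the agent's leisure.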
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Sampling Theory | A sampling frequency high enough for statistical accuracy, but low enough to avoid the “observer effect.” |
| Stack Unwinding | Moving from a raw Instruction Pointer to a full chain of calls (Frame Pointers vs DWARF). |
| Symbolication | Translating virtual addresses to human-readable strings using ELF metadata and debug symbols. |
| eBPF Maps | The high-performance bridge between kernel-level collection and user-level aggregation. |
| Aggregation | How to group millions of samples into a single “Flame Graph” using hash trees or pprof formats. |
Deep Dive Reading by Concept
This section maps each concept from above to specific book chapters for deeper understanding. Read these before or alongside the projects to build strong mental models.
Foundations & Performance Theory
| Concept | Book & Chapter |
|---|---|
| Profiling Methodology | Systems Performance by Brendan Gregg — Ch. 2: “Methodology” |
| CPU Performance Analysis | Systems Performance by Brendan Gregg — Ch. 6: “CPUs” |
| Memory Performance Analysis | Systems Performance by Brendan Gregg — Ch. 7: “Memory” |
eBPF & Kernel Tracing
| Concept | Book & Chapter |
|---|---|
| eBPF Architecture | Learning eBPF by Liz Rice — Ch. 2: “eBPF Programs and Maps” |
| Performance Sampling | BPF Performance Tools by Brendan Gregg — Ch. 4: “Working with BPF” |
| Stack Tracing with BPF | BPF Performance Tools by Brendan Gregg — Ch. 6: “CPUs” (Section: Profile) |
Binary Internals & Symbolication
| Concept | Book & Chapter |
|---|---|
| ELF File Format | How Linux Works by Brian Ward — Ch. 11: “How the Kernel Manages Memory” (Binary sections) |
| Debug Symbols & DWARF | The Linux Programming Interface by Michael Kerrisk — Ch. 41: “Fundamentals of Shared Libraries” |
| Instruction Pointers | Write Great Code, Vol 1 by Randall Hyde — Ch. 11: “CPU Architecture” |
Essential Reading Order
For maximum comprehension, follow the table order above: Foundations & Performance Theory first, then eBPF & Kernel Tracing, and finally Binary Internals & Symbolication.
Project List
Projects are ordered from fundamental understanding to advanced eBPF implementations.
Project 1: The “Poor Man’s” Profiler (Sampling with ptrace)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Linux Process Management / Signals
- Software or Tool: `ptrace`, `waitpid`
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A tool that attaches to a running process, interrupts it periodically using ptrace(PTRACE_INTERRUPT), reads the Instruction Pointer (RIP), and records which functions are being executed.
Why it teaches continuous profiling: This project forces you to grapple with the “Observer Effect.” You’ll see how stopping a process to inspect it adds latency, and why high-frequency sampling requires a more performant approach than ptrace.
Core challenges you’ll face:
- Attaching to a PID → maps to understanding process permissions and namespaces
- Reading CPU Registers → maps to the `user_regs_struct` and hardware state
- Signal handling → maps to how to resume a process without breaking it
- Rate limiting → maps to calculating overhead (e.g., if sampling takes 10ms, how many Hz can you support?)
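The rate-limiting challenge reduces to simple arithmetic. Assuming a hypothetical 10 ms cost per ptrace stop/inspect/resume cycle and a 1% overhead budget (both numbers are illustrative, not measured):

```python
def max_sampling_hz(cost_per_sample_s, overhead_budget):
    """If each stop/inspect/resume cycle costs cost_per_sample_s
    seconds, how many samples per second stay under the budget?"""
    return int(overhead_budget / cost_per_sample_s)

# A 10 ms ptrace round trip and a 1% budget allow only 1 Hz --
# far below the usual 99 Hz. A ~1 us in-kernel eBPF sample does not.
hz_ptrace = max_sampling_hz(0.010, 0.01)
hz_ebpf = max_sampling_hz(0.000001, 0.01)
```

This is the quantitative version of the “Observer Effect” lesson: the sampling mechanism's per-sample cost, not the desired frequency, is what bounds your profiler.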
Key Concepts:
- ptrace syscall: TLPI Ch. 11 & 12 - Michael Kerrisk
- Instruction Pointer (RIP): “Computer Systems: A Programmer’s Perspective” Ch. 3 - Bryant & O’Hallaron
Difficulty: Intermediate Time estimate: Weekend Prerequisites: Basic C, knowledge of Linux process IDs.
Real World Outcome
You will have a CLI tool that takes a PID and outputs a frequency list of memory addresses that were “active” during the sampling period.
Example Output:
```
$ sudo ./pm-profiler -p 1234 -f 99
Attaching to process 1234...
Sampling at 99Hz...
[Samples Collected: 1000]

Address    | Hits | Percent
-----------|------|---------
0x4012ab   | 450  | 45.0%
0x4012c4   | 200  | 20.0%
0x7ff120   | 100  | 10.0%
```
The Core Question You’re Answering
“How do I look inside a running program without modifying its source code?”
Before you write any code, sit with this question. Most debugging happens with source-code access. Continuous profiling must work on “black boxes.”
Concepts You Must Understand First
Stop and research these before coding:
- The `ptrace` Syscall
  - How do you “seize” a process versus “attach” to it?
  - What happens to the process state when it is stopped?
  - Book Reference: “The Linux Programming Interface” Ch. 11
- CPU Registers (x86_64)
  - What is the difference between RIP, RBP, and RSP?
  - How are registers mapped to the `user_regs_struct`?
Questions to Guide Your Design
Before implementing, think through these:
- Timing
  - How will you ensure exactly 99 samples per second? (`nanosleep`? `timerfd`?)
- Safety
  - What happens if the process dies while you are attached?
  - How do you ensure you call `PTRACE_DETACH` on exit?
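One common answer to the timing question: schedule against absolute deadlines rather than sleeping a fixed period, so the time spent taking each sample does not accumulate as drift. A sketch using `time.monotonic` (the same idea `timerfd` or `clock_nanosleep` with `TIMER_ABSTIME` gives you natively):

```python
import time

def sample_at_hz(hz, n_samples, take_sample):
    """Fire take_sample() at absolute deadlines so per-sample work
    does not push every later sample progressively later."""
    period = 1.0 / hz
    next_deadline = time.monotonic()
    for _ in range(n_samples):
        take_sample()
        next_deadline += period
        delay = next_deadline - time.monotonic()
        if delay > 0:              # skip sleeping if we are behind
            time.sleep(delay)

ticks = []
sample_at_hz(1000, 5, lambda: ticks.append(time.monotonic()))
```

With a naive `sleep(period)` after each sample, a 10 ms sample cost at 99 Hz would silently turn your 99 Hz profiler into a ~50 Hz one.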
Project 2: The eBPF Stack Collector (Kernel-Level Observation)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: C (eBPF) / Go or Rust (Loader)
- Alternative Programming Languages: C++, Python (BCC)
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: eBPF / Kernel Internals
- Software or Tool: `libbpf`, `bpftool`
- Main Book: “Learning eBPF” by Liz Rice
What you’ll build: An eBPF program that attaches to a PERF_COUNT_SW_CPU_CLOCK event. Every time the timer fires, the kernel code will use bpf_get_stackid to capture the entire call stack and store it in a BPF Hash Map.
Why it teaches continuous profiling: This is how modern production profilers work. By moving collection into the kernel, you eliminate the context-switch overhead of ptrace.
Core challenges you’ll face:
- The BPF Verifier → maps to writing code the kernel can prove is safe
- BPF Maps → maps to efficiently passing data from kernel to user space
- Stack IDs → maps to how the kernel deduplicates identical stack traces
- Perf Events → maps to hooking hardware/software timers
Key Concepts:
- BPF Programs and Maps: “Learning eBPF” Ch. 2 - Liz Rice
- Stack Tracing with BPF: “BPF Performance Tools” Ch. 6 - Brendan Gregg
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Understanding of Project 1, basic Go or Rust for the “loader” program.
Real World Outcome
A tool that runs in the background and aggregates “unique” stack traces it sees across the whole system (or a specific PID).
Example Output:
```
$ sudo ./bpf-prof --pid 5678
Collecting samples... (Ctrl+C to stop)

[Stack ID 42] Hits: 550
  0x7ff001
  0x7ff022
  0x401005

[Stack ID 89] Hits: 12
  0x7ff001
  0x7ff099
```
Thinking Exercise
The Deduplication Puzzle
Before coding, imagine you sample a process 10,000 times. 9,000 of those samples show the exact same stack: main -> loop -> work.
- If you send 10,000 stack traces to user space, how much bandwidth is wasted?
- How could you use a BPF Map to “count” occurrences in-kernel before the user space tool even looks at it?
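The puzzle becomes concrete with a dict-based stand-in for the BPF map (stacks represented as tuples of frame names, which is what makes them hashable map keys):

```python
from collections import Counter

# 10,000 samples, but only two distinct stacks ever appear.
samples = ([("main", "loop", "work")] * 9000
           + [("main", "loop", "idle")] * 1000)

# Naive design: ship every raw stack trace to user space.
naive_records = len(samples)            # 10000 records crossing the boundary

# In-kernel design: a BPF-map-style counter keyed by the stack, so
# user space reads one (stack, count) entry per unique stack.
dedup = Counter(samples)
dedup_records = len(dedup)              # 2 records crossing the boundary
savings = 1 - dedup_records / naive_records
```

This is exactly what `bpf_get_stackid` plus a hash map gives you: the kernel deduplicates identical stacks and user space pays per unique stack, not per sample.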
Project 3: The Symbolicator (Converting Addresses to Names)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: Rust
- Alternative Programming Languages: C++, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Binary Analysis / ELF / DWARF
- Software or Tool: `gimli`, `object` crate, or `libelf`
- Main Book: “How Linux Works” by Brian Ward
What you’ll build: A tool that takes a list of memory addresses and a path to a binary file, parses the ELF symbol tables (.symtab, .dynsym) and DWARF debug sections, and returns the function names and line numbers.
Why it teaches continuous profiling: Profiling data is useless as raw hex addresses. This project teaches you how programs are laid out on disk and in memory, and how “mapping” works (/proc/PID/maps).
Core challenges you’ll face:
- Parsing ELF Headers → maps to finding where the symbol table lives
- Handling ASLR → maps to calculating the “load bias” (offset) of a running process
- DWARF State Machine → maps to decoding the complex compressed line number program
- Address Translation → maps to mapping a virtual address back to a file offset
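The ASLR challenge boils down to one subtraction: the load bias is the runtime mapping start minus the segment's link-time virtual address, and subtracting that bias turns a sampled runtime address back into the address space the ELF symbol table uses. The mapping addresses below are invented for illustration:

```python
def load_bias(runtime_map_start, elf_segment_vaddr):
    """With ASLR/PIE the binary loads at a random base; the bias is
    the /proc/PID/maps start minus the ELF segment's p_vaddr."""
    return runtime_map_start - elf_segment_vaddr

def to_file_address(runtime_addr, bias):
    """Translate a sampled runtime address into symbol-table space."""
    return runtime_addr - bias

# Hypothetical values: /proc/PID/maps shows the text segment mapped
# at 0x55d2c0000000, while the ELF header says its vaddr is 0x400000.
bias = load_bias(0x55D2C0000000, 0x400000)
file_addr = to_file_address(0x55D2C00012AB, bias)   # back to 0x4012ab
```

Getting this subtraction wrong is the classic symbolication bug: every frame resolves, but to the wrong function.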
Key Concepts:
- ELF Layout: “How Linux Works” Ch. 11
- DWARF Specification: DWARF Standard (Introduction to the Debugging Format)
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Knowledge of Projects 1 & 2.
Real World Outcome
You’ll have a library/tool where you can input (Address, BinaryPath) and get (Function, File, Line).
Example Usage:
```
$ ./symbolicator --binary ./my-app --addr 0x4012ab

Result:
  Function: handle_request
  File:     src/server.c
  Line:     42
```
Project 4: The Flame Graph Generator (Data Visualization)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: JavaScript/TypeScript (D3.js or Canvas)
- Alternative Programming Languages: Python (Matplotlib), Go
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Data Visualization / Tree Algorithms
- Software or Tool: D3.js, SVG
- Main Book: “Systems Performance” by Brendan Gregg
What you’ll build: A web-based visualizer that takes the output from your eBPF collector (Project 2) and symbolicator (Project 3) and renders a “Flame Graph.”
Why it teaches continuous profiling: You’ll learn that profiling isn’t just about collecting data; it’s about making it understandable. You’ll implement the algorithm that aggregates hierarchical stack traces into a visual representation of time spent.
Core challenges you’ll face:
- Converting Stacks to Trees → maps to hierarchical data aggregation
- Layout Algorithm → maps to calculating the width of blocks based on frequency
- Interactivity → maps to zooming into specific sub-trees of the profile
- Search & Filter → maps to highlighting functions that match a regex
Key Concepts:
- Flame Graphs: “Systems Performance” Ch. 2.5
- Brendan Gregg’s original `flamegraph.pl` logic
Real World Outcome
A browser-based tool where you can upload a profile file and see a beautiful, interactive Flame Graph of your process.
Example Visualization Logic:
- Width: Represents the number of samples (Total CPU time).
- Y-Axis: Represents the stack depth.
- Color: Usually randomized within a hue to distinguish adjacent blocks.
Hints in Layers
Hint 1: The Input Format
Start by converting your data into a “folded” format: `main;foo;bar 42` (meaning the stack main->foo->bar was seen 42 times).
Hint 2: The Tree Structure
Build a prefix tree (Trie). Each node represents a function. The “weight” of the node is the sum of all samples that passed through it.
Hint 3: Calculating Rectangles
When rendering, the root takes 100% width. Each child’s width is (child_samples / parent_samples) * parent_width.
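All three hints combine into a short sketch: parse folded lines, build the trie, and compute a child rectangle's width from its sample share (the stacks and counts below are made up):

```python
def build_trie(folded_lines):
    """Build a prefix tree from 'a;b;c COUNT' folded-stack lines.
    Each node is {'value': samples_through_here, 'children': {...}}."""
    root = {"value": 0, "children": {}}
    for line in folded_lines:
        stack, count = line.rsplit(" ", 1)
        count = int(count)
        node = root
        node["value"] += count
        for frame in stack.split(";"):
            node = node["children"].setdefault(
                frame, {"value": 0, "children": {}})
            node["value"] += count
    return root

def width_px(node, parent, parent_width):
    """A child's rectangle width is its share of the parent's samples."""
    return parent_width * node["value"] / parent["value"]

root = build_trie(["main;foo;bar 42", "main;foo;baz 8", "main;qux 50"])
main = root["children"]["main"]
foo = main["children"]["foo"]
```

Rendering is then a pre-order walk: each node becomes a rectangle at depth = stack depth, with children laid side by side inside the parent's span.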
Project 5: The “Off-CPU” Profiler (Where did the time go?)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: C (eBPF)
- Alternative Programming Languages: Rust (aya)
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 4: Expert
- Knowledge Area: Kernel Scheduling / Context Switching
- Software or Tool: `tp_btf/sched_switch`
- Main Book: “BPF Performance Tools” by Brendan Gregg
What you’ll build: A profiler that captures stacks not when the CPU is busy, but when a thread stops running (e.g., waiting for a lock, disk I/O, or network).
Why it teaches continuous profiling: Standard profilers only tell you what’s slow while running. Off-CPU profiling tells you why your application is “hanging” or why latency is high despite low CPU usage.
Core challenges you’ll face:
- Tracking State Transitions → maps to storing the “start wait” time in a BPF Map
- Scheduler Hooks → maps to the `sched_switch` tracepoint
- Delta Calculation → maps to subtracting timestamps in-kernel
- Filtering Noise → maps to ignoring “idle” threads
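The state-transition bookkeeping can be prototyped against a fake scheduler event stream (timestamps, TIDs, and stacks all invented) before writing any eBPF: store the stack and timestamp on switch-out, compute the delta on switch-in.

```python
# Simulated sched_switch stream: (timestamp_ns, event, tid, stack)
events = [
    (1_000, "switch_out", 7, ("main", "read_file")),  # tid 7 blocks on I/O
    (1_500, "switch_out", 8, ("main", "lock_wait")),  # tid 8 waits on a lock
    (5_000, "switch_in",  7, None),                   # tid 7 runs again
    (9_500, "switch_in",  8, None),                   # tid 8 runs again
]

def off_cpu_times(events):
    """On switch-out, record (stack, timestamp) keyed by tid; on
    switch-in, the delta is the time that thread spent off-CPU."""
    start = {}     # tid -> (blocking stack, t_out)
    totals = {}    # stack -> total ns spent off-CPU
    for t, kind, tid, stack in events:
        if kind == "switch_out":
            start[tid] = (stack, t)
        elif tid in start:
            blocked_stack, t_out = start.pop(tid)
            totals[blocked_stack] = totals.get(blocked_stack, 0) + (t - t_out)
    return totals

blocked = off_cpu_times(events)
```

In the real eBPF version, `start` is a BPF hash map keyed by TID, the timestamps come from `bpf_ktime_get_ns()`, and the stack comes from `bpf_get_stackid` at switch-out time.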
The Core Question You’re Answering
“If my CPU usage is only 5%, why is my request taking 2 seconds?”
This is the ultimate observability question. It moves you from “CPU profiling” to “Latency profiling.”
Concepts You Must Understand First
- Context Switches
- What is the difference between voluntary and involuntary switches?
- How does the kernel represent a thread’s state (Running vs Sleeping)?
- eBPF Timestamping
  - Using `bpf_ktime_get_ns()` for high-precision timing.
Project 6: The Heap Allocator Tracer (Memory Profiling)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: C (uprobes) / Rust
- Alternative Programming Languages: Go
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Memory Management / Shared Libraries
- Software or Tool: `uprobes`, `malloc`/`free` hooks
- Main Book: “BPF Performance Tools” by Brendan Gregg
What you’ll build: A tool that uses eBPF uprobes to hook into libc.so’s malloc and free functions. You will record every allocation, its size, and the stack trace that triggered it.
Why it teaches continuous profiling: This introduces “User-space Probes” (uprobes). You’ll understand how to profile high-level library calls without kernel-level code changes.
Core challenges you’ll face:
- uprobes Performance → maps to the overhead of switching to kernel for every malloc
- Tracking “Live” Allocations → maps to matching `free` calls with their original `malloc` in a Map
- Handling Fragmentation → maps to understanding how allocators work under the hood
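The malloc/free matching is a hash map keyed by the returned pointer. A user-space model of the state the two uprobe handlers would maintain (pointers, sizes, and stacks here are illustrative):

```python
class AllocationTracker:
    """Track live heap allocations the way a malloc/free uprobe pair
    would: keyed by pointer, keeping size and the allocating stack."""
    def __init__(self):
        self.live = {}                    # ptr -> (size, stack)

    def on_malloc(self, ptr, size, stack):
        self.live[ptr] = (size, stack)

    def on_free(self, ptr):
        self.live.pop(ptr, None)          # frees of unknown ptrs are ignored

    def leak_report(self):
        """Sum still-live bytes per allocating stack."""
        by_stack = {}
        for size, stack in self.live.values():
            by_stack[stack] = by_stack.get(stack, 0) + size
        return by_stack

t = AllocationTracker()
t.on_malloc(0x1000, 64, ("main", "load_config"))
t.on_malloc(0x2000, 4096, ("main", "handle_req", "parse_json"))
t.on_free(0x1000)
leaks = t.leak_report()
```

Anything still in `live` when you snapshot is either a leak or a long-lived allocation, and the stored stack tells you which code path created it.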
Project 7: The JIT Symbolicator (Profiling High-Level Languages)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: Python or Node.js (for the target) / C++ or Rust (for the profiler)
- Alternative Programming Languages: Java (using async-profiler concepts)
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 5. The “Industry Disruptor”
- Difficulty: Level 4: Expert
- Knowledge Area: JIT Compilation / Runtime Internals
- Software or Tool: `/tmp/perf-PID.map`, V8/JVM
- Main Book: “Systems Performance” by Brendan Gregg
What you’ll build: A tool that can symbolicate stack traces from a Just-In-Time (JIT) compiled language like Node.js or Java. Since these functions aren’t in the ELF binary, you must parse the “perf map” files generated by the runtime.
Why it teaches continuous profiling: Most production code is JITed. You’ll learn that binary symbolication is only half the battle; the other half is cooperating with runtimes to find where they put their dynamic machine code.
Core challenges you’ll face:
- Dynamic Code Generation → maps to addresses that change during execution
- Perf-Map Format → maps to reading `/tmp/perf-<pid>.map` files
- Instruction Pointer Mapping → maps to associating a random memory address with a JS/Java method name
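The perf-map format is plain text: one `STARTADDR SIZE symbol-name` line per JITed function, with both numbers in hex. A minimal parser and resolver against invented Node.js-style content (the addresses and method names are made up):

```python
def parse_perf_map(text):
    """Parse 'START SIZE name' lines from a /tmp/perf-<pid>.map-style
    file into (start, end, name) tuples."""
    entries = []
    for line in text.strip().splitlines():
        start_hex, size_hex, name = line.split(" ", 2)
        start, size = int(start_hex, 16), int(size_hex, 16)
        entries.append((start, start + size, name))
    return entries

def resolve(addr, entries):
    for start, end, name in entries:
        if start <= addr < end:
            return name
    return "<unknown jit>"

# Hypothetical perf-map content for a Node.js process.
perf_map = """\
3f8e4c10 120 LazyCompile:~handleRequest server.js:14
3f8e4e00 80 LazyCompile:~parseBody server.js:52
"""
entries = parse_perf_map(perf_map)
```

Unlike an ELF symbol table, this file can grow and go stale while the process runs (JITed code gets moved or recompiled), so real profilers re-read it and treat matches as best-effort.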
Real World Outcome
You will be able to profile a Node.js application from the outside (using your eBPF tool) and see function names from the JavaScript code, not just node::v8::... internal C++ names.
Project 8: Container-Aware Profiler (Namespaces & Cgroups)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: Docker / Kubernetes / Linux Namespaces
- Software or Tool: `containerd`, mount namespaces
- Main Book: “How Linux Works” by Brian Ward
What you’ll build: A profiler that runs on the host but correctly identifies which Kubernetes Pod or Docker Container a sampled address belongs to.
Why it teaches continuous profiling: In the cloud, “PIDs” are messy. You’ll learn how to cross the “Container Boundary” to find the right binary for symbolication by looking into the container’s mount namespace.
Core challenges you’ll face:
- Mount Namespaces → maps to finding the binary file at `/proc/PID/root/...`
- Container IDs → maps to mapping a kernel task to a container runtime ID
- Shared Libraries in Containers → maps to handling different versions of `libc` across containers
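Crossing the mount-namespace boundary is mostly path construction: the host-side view of any file inside the container is reachable through the task's root link (reading it still requires appropriate privileges on the host).

```python
def host_path_for(pid, path_in_container):
    """Given a PID and a path from its /proc/PID/maps (which is in
    the container's namespace), build the host-visible path through
    the task's mount-namespace root."""
    return f"/proc/{pid}/root{path_in_container}"

# A libc mapped inside container PID 5678 becomes readable from the
# host at this path, so the symbolicator can open the right binary.
p = host_path_for(5678, "/usr/lib/libc.so.6")
```

This is why a host-side profiler never needs an agent inside each container: `/proc/PID/root` already gives it a window into every container's filesystem.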
Project 9: The pprof Exporter (Interoperability)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: Go
- Alternative Programming Languages: Python, Rust
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Protocol Buffers / Data Serialization
- Software or Tool: `pprof` (Google’s profiling tool), Protobuf
- Main Book: “Observability Engineering” by Charity Majors
What you’ll build: A converter that takes your custom binary profile format and exports it as a Gzipped Protobuf file compatible with Google’s pprof tool.
Why it teaches continuous profiling: You’ll learn the industry standard for profile data exchange. Implementing this allows you to use established tools like go tool pprof or “Google Cloud Profiler” to view your own data.
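The heart of the pprof format is a string table (index 0 must be the empty string) plus samples that reference it by index, so no string is stored twice. This sketch builds that in-memory model from folded stacks; a real exporter would additionally split names into the Location/Function tables of `profile.proto` and serialize the whole thing as gzipped protobuf:

```python
def to_pprof_model(folded):
    """Build a simplified pprof-style profile from folded stacks:
    deduplicated string table + index-referencing samples."""
    strings = [""]              # pprof requires string_table[0] == ""
    index = {"": 0}

    def intern(s):
        if s not in index:
            index[s] = len(strings)
            strings.append(s)
        return index[s]

    samples = []
    for line in folded:
        stack, count = line.rsplit(" ", 1)
        samples.append({
            "location_ids": [intern(frame) for frame in stack.split(";")],
            "value": int(count),
        })
    return {"string_table": strings, "samples": samples}

model = to_pprof_model(["main;foo;bar 42", "main;foo 8"])
```

Notice how `main` and `foo` each appear once in the string table no matter how many samples reference them; that deduplication is what keeps real profiles small.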
Project 10: The Multi-Process Aggregator (Fleet-wide View)
- File: CONTINUOUS_PROFILING_DEEP_DIVE.md
- Main Programming Language: Rust or Go
- Alternative Programming Languages: C++
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Distributed Systems / High-Throughput Ingestion
- Software or Tool: gRPC, ClickHouse or Prometheus-style storage
- Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann
What you’ll build: A central server that accepts profile streams from multiple agents (Project 2) and aggregates them by “Service Name” and “Version,” allowing you to see the aggregate CPU usage of a whole microservice fleet.
Why it teaches continuous profiling: This is the “Continuous” part of Continuous Profiling. You’ll deal with high-volume data ingestion, storage strategies for profiles, and how to query millions of stacks efficiently.
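Aggregation by service and version is the core of the fleet view: profiles from many agents merge into one counter per (service, version), so a regression can be pinned to a rollout. A minimal in-memory sketch (service names, versions, and stacks invented; real storage would be columnar):

```python
def merge_profiles(agent_batches):
    """Merge folded-stack counts reported by many agents, keyed by
    (service, version)."""
    fleet = {}
    for service, version, stacks in agent_batches:
        bucket = fleet.setdefault((service, version), {})
        for stack, count in stacks.items():
            bucket[stack] = bucket.get(stack, 0) + count
    return fleet

fleet = merge_profiles([
    ("checkout", "v1.2", {"main;charge": 40}),
    ("checkout", "v1.2", {"main;charge": 25, "main;refund": 5}),
    ("checkout", "v1.3", {"main;charge": 90}),
])
```

Diffing `fleet[("checkout", "v1.2")]` against `fleet[("checkout", "v1.3")]` is exactly the "compare two time ranges / versions" feature commercial profilers sell.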
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Poor Man’s Profiler | Level 2 | Weekend | 🟢 Basic Process Control | 😐 Educational |
| 2. eBPF Stack Collector | Level 3 | 1-2 Weeks | 🔵 Kernel/eBPF Internals | 😎 Hardcore |
| 3. The Symbolicator | Level 3 | 1-2 Weeks | 🟣 Binary/ELF Analysis | 🧐 Intellectual |
| 4. Flame Graph Gen | Level 2 | 1 Week | 🟡 Data Visualization | 🎨 Creative |
| 5. Off-CPU Profiler | Level 4 | 2 Weeks | 🔴 Scheduler/Latency | 🤯 Mind-bending |
| 6. Memory Tracer | Level 3 | 1 Week | 🔵 Heap Management | 🔍 Insightful |
| 7. JIT Symbolicator | Level 4 | 2 Weeks | 🔴 Runtime/JIT Internals | 🧙♂️ Magical |
| 8. Container Profiler | Level 3 | 1 Week | 🔵 K8s/Namespaces | 🏗️ Structural |
| 9. pprof Exporter | Level 2 | Weekend | 🟢 Interop/Standards | 🛠️ Useful |
| 10. Fleet Aggregator | Level 4 | 1 Month+ | 🔴 Distributed Systems | 🚀 Enterprise |
Recommendation
Where to start?
Start with Project 1 (Poor Man’s Profiler). It provides the “Aha!” moment where you realize that a program is just a series of instructions you can pause and inspect. Once you see the limitations of ptrace, move immediately to Project 2 (eBPF) to see how the industry solves it.
Final Overall Project: “The Observability Forge”
The Goal: Combine all previous projects into a single, production-ready continuous profiling platform.
What you’ll build:
- The Agent: A low-overhead eBPF agent (Project 2) that auto-discovers containers (Project 8).
- The Symbolicator: A sidecar service that pulls debug symbols from a central “Symbol Server” or S3 bucket (Project 3).
- The Ingestor: A high-performance collector (Project 10) that stores profiles in a columnar format.
- The UI: An interactive dashboard showing Flame Graphs (Project 4) with the ability to “diff” two time ranges to find regressions.
Why this makes you a Master: You aren’t just building a tool; you’re building an infrastructure. You’ll have to solve the “Profile-Symbol-Mismatch” problem, handle agent crashes, and ensure that your own profiler doesn’t become the bottleneck.
Summary
This learning path covers Continuous Profiling through 10 hands-on projects. Here’s the complete list:
| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|---|---|---|---|
| 1 | Poor Man’s Profiler | C | Level 2 | Weekend |
| 2 | eBPF Stack Collector | C (eBPF) / Go | Level 3 | 1-2 Weeks |
| 3 | The Symbolicator | Rust | Level 3 | 1-2 Weeks |
| 4 | Flame Graph Generator | TypeScript | Level 2 | 1 Week |
| 5 | Off-CPU Profiler | C (eBPF) | Level 4 | 2 Weeks |
| 6 | Memory Allocation Tracer | C (eBPF) | Level 3 | 1 Week |
| 7 | JIT Symbolicator | Rust / C++ | Level 4 | 2 Weeks |
| 8 | Container-Aware Profiler | Go | Level 3 | 1 Week |
| 9 | pprof Exporter | Go | Level 2 | Weekend |
| 10 | Fleet Aggregator | Rust | Level 4 | 1 Month+ |
Recommended Learning Path
For beginners: Start with projects #1, #4, and #9. Focus on the data format and basic collection.
For intermediate: Jump to #2, #3, and #6. Master eBPF and binary analysis.
For advanced: Focus on #5, #7, #8, and #10. Build for scale and complex runtimes.
Expected Outcomes
After completing these projects, you will:
- Understand exactly how eBPF programs interact with kernel events.
- Be able to parse ELF/DWARF/JIT metadata from scratch.
- Know how to walk the stack without frame pointers.
- Implement high-performance data visualization for hierarchical data.
- Design distributed systems capable of handling millions of performance samples per second.
You’ll have built 10 working projects that demonstrate deep understanding of Continuous Profiling from first principles.