Project 14: Performance Profiler

A profiling tool that instruments Go programs, collects CPU and memory samples, and generates flame graphs, so you can see where a program spends time and allocates memory.

Quick Reference

Attribute               Value
Primary Language        Go
Alternative Languages   Rust, C++
Difficulty              Level 4: Expert
Time Estimate           2-3 weeks
Knowledge Area          Profiling, Runtime Internals, Visualization
Tooling                 go tool pprof (as reference)
Prerequisites           Completed Projects 1-9. Deep understanding of the Go runtime. Familiarity with profiling concepts.

What You Will Build

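A command-line profiler for Go programs with four subcommands: record collects CPU or memory samples from a target program and saves them as a pprof profile (.pb.gz), analyze reports the top functions and hot paths, flamegraph renders the profile as an SVG flame graph, and diff compares two profiles.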

Why It Matters

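A profiler replaces guesswork about performance with measurements. Building one exercises skills that recur in real-world systems and tooling: sampling a running program through the runtime, decoding a binary profile format, aggregating stack traces into call graphs, and presenting the result visually.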

Core Challenges

  • Collecting profiling data → maps to runtime package, signal handling
  • Parsing pprof format → maps to protocol buffers, binary data
  • Stack trace analysis → maps to call graphs, aggregation
  • Flame graph generation → maps to SVG generation, visualization

Key Concepts

  • Go runtime: runtime package documentation
  • pprof format: Protocol buffer definition
  • Flame graphs: Brendan Gregg’s work
  • Performance analysis: Dave Cheney’s blog posts

Real-World Outcome

$ ./profiler record --cpu --duration 30s ./myapp
Recording CPU profile for 30s...
Profile saved to profile.pb.gz

$ ./profiler analyze profile.pb.gz
Top 10 by CPU time:
  45.2%  encoding/json.Marshal
  22.1%  net/http.(*conn).serve
  12.3%  runtime.mallocgc
   8.7%  myapp.processRequest
   4.2%  database/sql.(*DB).Query
   ...

Hot paths:
  main.handleRequest
  └── myapp.processRequest (8.7%)
      └── encoding/json.Marshal (45.2%)
          └── encoding/json.(*encodeState).marshal
              └── encoding/json.(*encodeState).reflectValue

$ ./profiler flamegraph profile.pb.gz > flame.svg
Flame graph written to flame.svg

# Open flame.svg in browser - interactive!

$ ./profiler record --mem --allocs ./myapp
Recording memory profile...

$ ./profiler analyze --top 5 mem.pb.gz
Top 5 allocations:
  1.2 GB  []byte allocations in encoding/json
  800 MB  string allocations in net/http
  256 MB  map[string]interface{} in myapp
  128 MB  *sql.Rows in database/sql
   64 MB  goroutine stacks

$ ./profiler diff profile1.pb.gz profile2.pb.gz
Diff (profile2 vs profile1):
  +15.2%  myapp.newFeature
   -8.3%  encoding/json.Marshal (optimization worked!)
   +2.1%  runtime.mallocgc
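
The percentages in the analyze and diff output above can be derived from stack sample counts. The following is a minimal sketch, assuming samples have already been decoded into a map from folded stack ("root;...;leaf") to sample count; that representation, the function names, and the sample data are illustrative assumptions, not part of the pprof format.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// flatPercent attributes each sample to the leaf frame of its stack and
// returns every function's share of the total, in percent.
func flatPercent(samples map[string]int64) map[string]float64 {
	self := map[string]int64{}
	var total int64
	for stack, n := range samples {
		leaf := stack[strings.LastIndex(stack, ";")+1:]
		self[leaf] += n
		total += n
	}
	out := map[string]float64{}
	if total == 0 {
		return out
	}
	for fn, n := range self {
		out[fn] = 100 * float64(n) / float64(total)
	}
	return out
}

type entry struct {
	Func  string
	Value float64
}

// topN returns the n largest entries; ties are broken by name so the
// output is stable across runs.
func topN(pct map[string]float64, n int) []entry {
	entries := make([]entry, 0, len(pct))
	for fn, v := range pct {
		entries = append(entries, entry{fn, v})
	}
	sort.Slice(entries, func(i, j int) bool {
		if entries[i].Value != entries[j].Value {
			return entries[i].Value > entries[j].Value
		}
		return entries[i].Func < entries[j].Func
	})
	if n >= 0 && n < len(entries) {
		entries = entries[:n]
	}
	return entries
}

// diffPercent returns after minus before, in percentage points, for every
// function that appears in either profile.
func diffPercent(before, after map[string]float64) map[string]float64 {
	out := map[string]float64{}
	for fn, v := range after {
		out[fn] = v
	}
	for fn, v := range before {
		out[fn] -= v
	}
	return out
}

func main() {
	// Made-up sample counts for illustration only.
	before := flatPercent(map[string]int64{"main;a;b": 60, "main;a": 30, "main;c": 10})
	after := flatPercent(map[string]int64{"main;a;b": 40, "main;a": 30, "main;d": 30})

	fmt.Println("Top 2 by CPU time:")
	for _, e := range topN(before, 2) {
		fmt.Printf("%6.1f%%  %s\n", e.Value, e.Func)
	}
	fmt.Println("Diff (after vs before):")
	for _, e := range topN(diffPercent(before, after), 10) {
		fmt.Printf("%+6.1f%%  %s\n", e.Value, e.Func)
	}
}
```

This computes flat (self) time only. A cumulative view would credit every frame on the stack rather than just the leaf.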

Implementation Guide

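  1. Reproduce the simplest happy-path scenario: record a CPU profile of a small test program and confirm that profile.pb.gz is written.
  2. Build the smallest working version of the core feature: parse the profile and print the top functions by CPU time.
  3. Add input validation and error handling for missing, empty, or corrupt profile files.
  4. Add instrumentation/logging to confirm behavior, such as the number of samples read and the recording duration.
  5. Refactor into clean modules (record, analyze, flamegraph, diff) with tests.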

Milestones

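  • Milestone 1: Minimal working program that runs end-to-end: record a CPU profile and print the top functions.
  • Milestone 2: Correct outputs for typical inputs, checked against go tool pprof on the same profile.
  • Milestone 3: Robust handling of edge cases such as empty profiles, very short recordings, and corrupt files.
  • Milestone 4: Clean structure and documented usage for every subcommand.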

Validation Checklist

  • Output matches the real-world outcome example
  • Handles invalid inputs safely
  • Provides clear errors and exit codes
  • Repeatable results across runs on the same profile file (recordings are sampled, so two separate recordings will differ slightly)

References

  • Main guide: LEARN_GO_DEEP_DIVE.md
  • “High Performance Go” (online resources)