Project 15: eBPF-based Observability Agent

A complete observability agent that uses eBPF to collect metrics (CPU, memory, network, disk), traces (function calls, syscalls), and events, and exposes them as Prometheus metrics and structured logs.

Quick Reference

Attribute              Value
Primary Language       Go (cilium/ebpf)
Alternative Languages  Rust (aya), C (libbpf)
Difficulty             Level 4: Expert
Time Estimate          4-6 weeks
Knowledge Area         Observability / Distributed Systems
Tooling                cilium/ebpf, Prometheus, Grafana
Prerequisites          Projects 1-10 completed, familiarity with Go

What You Will Build

An agent that loads several BPF programs (tracepoints, kprobes, XDP), aggregates the data they collect (CPU, memory, network, and disk metrics, function-call and syscall traces, and events), and exposes the results through a Prometheus metrics endpoint and structured log files.

Why It Matters

This project ties together the earlier material: multiple BPF program types, several map types, userspace integration, and production concerns such as performance and reliability.

Core Challenges

  • Multiple data sources → maps to combining tracepoints, kprobes, XDP
  • Metric aggregation → maps to in-kernel vs userspace aggregation
  • Configuration management → maps to dynamic program loading
  • Production reliability → maps to error handling, resource limits
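
The in-kernel vs userspace aggregation trade-off can be illustrated without any BPF dependency. One common split is to let each CPU increment its own slot in the kernel (the way a per-CPU map works) and to sum the slots in userspace at scrape time. The sketch below assumes the per-CPU values have already been read into Go slices; `sumPerCPU`, `aggregate`, and the sample data are hypothetical names and values, not part of cilium/ebpf.

```go
package main

import (
	"fmt"
	"sort"
)

// sumPerCPU collapses one per-CPU counter (one slot per CPU) into a single total.
func sumPerCPU(slots []uint64) uint64 {
	var total uint64
	for _, v := range slots {
		total += v
	}
	return total
}

// aggregate sums the per-CPU slots of every key.
// Assumed input shape: key -> one counter value per CPU.
func aggregate(perCPU map[string][]uint64) map[string]uint64 {
	out := make(map[string]uint64, len(perCPU))
	for key, slots := range perCPU {
		out[key] = sumPerCPU(slots)
	}
	return out
}

func main() {
	// Made-up sample data for a 4-CPU machine.
	sample := map[string][]uint64{
		"read":  {10, 20, 30, 40},
		"write": {1, 2, 3, 4},
	}
	totals := aggregate(sample)

	// Sort keys so the output is stable across runs.
	keys := make([]string, 0, len(totals))
	for k := range totals {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%d\n", k, totals[k])
	}
}
```

Keeping the kernel side to a plain per-CPU increment avoids contention between CPUs on the hot path; the cost is a small summation loop in userspace each time the metrics are read.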

Key Concepts

Real-World Outcome

$ sudo ./ebpf-agent --config agent.yaml
eBPF Observability Agent v1.0.0
Loading BPF programs...
  ✓ syscall_counter (tracepoint)
  ✓ tcp_tracker (kprobe)
  ✓ file_monitor (kprobe)
  ✓ net_stats (XDP)

Metrics server: http://localhost:9090/metrics
Log output: /var/log/ebpf-agent/

# Prometheus metrics
$ curl localhost:9090/metrics
# HELP ebpf_syscalls_total Total system calls by type
# TYPE ebpf_syscalls_total counter
ebpf_syscalls_total{syscall="read"} 1234567
ebpf_syscalls_total{syscall="write"} 987654
ebpf_syscalls_total{syscall="openat"} 123456

# HELP ebpf_tcp_connections Active TCP connections
# TYPE ebpf_tcp_connections gauge
ebpf_tcp_connections{direction="outgoing"} 234
ebpf_tcp_connections{direction="incoming"} 567

# HELP ebpf_network_bytes_total Network bytes by direction
# TYPE ebpf_network_bytes_total counter
ebpf_network_bytes_total{direction="rx"} 12345678901
ebpf_network_bytes_total{direction="tx"} 9876543210
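
The metrics above use the Prometheus text exposition format. A real agent would more likely rely on a Prometheus client library to produce it, but rendering one counter family by hand with the standard library shows what the endpoint has to return. This is a minimal sketch; `renderCounter` and `labelEscaper` are hypothetical helpers, and only single-label counters are handled.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labelEscaper escapes the three characters that need escaping inside a
// label value in the text exposition format: backslash, double quote, newline.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// renderCounter formats one counter family that has a single label.
func renderCounter(name, help, label string, values map[string]uint64) string {
	var b strings.Builder
	fmt.Fprintf(&b, "# HELP %s %s\n", name, help)
	fmt.Fprintf(&b, "# TYPE %s counter\n", name)

	// Sort label values so repeated scrapes list samples in the same order.
	keys := make([]string, 0, len(values))
	for k := range values {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		fmt.Fprintf(&b, "%s{%s=\"%s\"} %d\n", name, label, labelEscaper.Replace(k), values[k])
	}
	return b.String()
}

func main() {
	fmt.Print(renderCounter(
		"ebpf_syscalls_total",
		"Total system calls by type",
		"syscall",
		map[string]uint64{"read": 1234567, "write": 987654, "openat": 123456},
	))
}
```

The returned string can be written directly to the response of an HTTP handler registered on `/metrics`. Samples are emitted in sorted label order, which differs from the order in the example output but is equally valid to Prometheus.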

Implementation Guide

  1. Reproduce the simplest happy-path scenario.
  2. Build the smallest working version of the core feature.
  3. Add input validation and error handling.
  4. Add instrumentation/logging to confirm behavior.
  5. Refactor into clean modules with tests.
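
For this agent, step 3 largely means validating the configuration before any BPF program is loaded. The sketch below checks a parsed config against the program names shown in the example startup output. The `Config` struct and its fields are hypothetical, since the schema of agent.yaml is not defined here, and YAML parsing itself is left out. `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical in-memory form of agent.yaml after parsing.
type Config struct {
	MetricsAddr string   // listen address for the metrics server
	Programs    []string // names of the BPF programs to load
}

// knownPrograms lists the programs from the example startup output.
var knownPrograms = map[string]bool{
	"syscall_counter": true,
	"tcp_tracker":     true,
	"file_monitor":    true,
	"net_stats":       true,
}

// Validate collects every problem it finds instead of stopping at the
// first one, so a single run reports everything the user must fix.
func (c Config) Validate() error {
	var errs []error
	if c.MetricsAddr == "" {
		errs = append(errs, errors.New("metrics address must not be empty"))
	}
	if len(c.Programs) == 0 {
		errs = append(errs, errors.New("at least one program must be enabled"))
	}
	seen := make(map[string]bool)
	for _, p := range c.Programs {
		switch {
		case !knownPrograms[p]:
			errs = append(errs, fmt.Errorf("unknown program %q", p))
		case seen[p]:
			errs = append(errs, fmt.Errorf("program %q listed twice", p))
		}
		seen[p] = true
	}
	return errors.Join(errs...) // nil when no errors were collected
}

func main() {
	cfg := Config{
		MetricsAddr: "localhost:9090",
		Programs:    []string{"syscall_counter", "bogus"},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```

Rejecting a bad config up front keeps failures cheap: nothing has been attached to the kernel yet, so there is nothing to clean up.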

Milestones

  • Milestone 1: Minimal working program that runs end-to-end.
  • Milestone 2: Correct outputs for typical inputs.
  • Milestone 3: Robust handling of edge cases.
  • Milestone 4: Clean structure and documented usage.

Validation Checklist

  • Output matches the real-world outcome example
  • Handles invalid inputs safely
  • Provides clear errors and exit codes
  • Repeatable results across runs

References

  • Main guide: LEARN_BPF_EBPF_LINUX.md
  • “Learning eBPF” by Liz Rice