Project 4: Page Fault Analyzer

Build a tool that captures page faults, classifies them (major/minor/COW), and maps fault addresses to memory regions.

Quick Reference

Attribute	Value
Difficulty	Level 4: Expert
Time Estimate	2-3 weeks
Main Programming Language	C
Alternative Programming Languages	Rust, Python
Coolness Level	Level 4: Hardcore
Business Potential	Level 3: Performance tooling
Prerequisites	C, virtual memory concepts, `/proc` familiarity
Key Topics	Page faults, perf events, address mapping

1. Learning Objectives

By completing this project, you will:

Distinguish minor, major, and COW page faults.
Capture page-fault events with perf/tracepoints.
Map fault addresses to VMAs via /proc/<pid>/maps.
Correlate faults with file-backed vs anonymous regions.
Produce deterministic summaries and histograms.
Explain how faults impact performance.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Page Faults and Demand Paging

Fundamentals

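A page fault occurs when the CPU references a virtual address whose page-table entry is not present or does not permit the access. The kernel resolves a minor fault without disk I/O, for example by mapping a page that is already in the page cache or by zero-filling a new anonymous page. It resolves a major fault by reading the page from disk or swap. A copy-on-write (COW) fault happens on the first write to a page that is shared but mapped write-protected.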

Deep Dive into the concept

Page tables translate virtual addresses to physical frames. If the present bit is clear, or the access violates the entry's permissions, the CPU raises a fault. The kernel examines the VMA and decides whether the access is valid. For anonymous memory, it may allocate a new zero-filled page. For file-backed memory, it maps the page from the page cache (a minor fault) or, if the page is not cached, reads it from disk (a major fault). COW faults occur after fork, when parent and child share pages that are mapped read-only; the first write by either process triggers a copy. Understanding this path lets you interpret which faults are truly expensive.
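
The COW path described above can be observed from user space. The following is a minimal sketch, not part of the project code: the helper name cow_faults_after_fork is hypothetical, and it assumes Linux with getrusage(2) reporting minor faults. The parent populates some anonymous pages, forks, and the child writes one byte per page while measuring its own minor-fault count.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Minor-fault count of the calling process so far. */
static long minor_faults(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

/*
 * Hypothetical helper: fork, write one byte to each of `pages` pages in the
 * child, and return how many minor faults the child took while writing.
 * Returns -1 on error.
 */
long cow_faults_after_fork(size_t pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = pages * page;
    long delta = -1;
    int fds[2];

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memset(buf, 1, len);            /* populate every page in the parent */

    if (pipe(fds) != 0) {
        munmap(buf, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {                 /* child: pages are shared with the parent */
        volatile char *p = buf;
        long before = minor_faults();

        for (size_t i = 0; i < pages; i++)
            p[i * page] = 2;        /* first write to a shared page */
        long d = minor_faults() - before;
        ssize_t w = write(fds[1], &d, sizeof d);
        _exit(w == (ssize_t)sizeof d ? 0 : 1);
    }

    close(fds[1]);
    if (pid > 0) {
        if (read(fds[0], &delta, sizeof delta) != (ssize_t)sizeof delta)
            delta = -1;
        waitpid(pid, NULL, 0);
        assert(buf[0] == 1);        /* the parent's copy is untouched */
    }
    close(fds[0]);
    munmap(buf, len);
    return delta;
}
```

On a typical Linux system the returned count is expected to be close to the number of pages written, since each first write needs its own copy, but the exact value depends on kernel version and configuration.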

How this fits in the project

Fault classification powers Section 3.2 and the output in Section 3.7.

Definitions & key terms

minor fault -> resolved without disk I/O
major fault -> requires disk I/O
COW -> copy-on-write fault
present bit -> page table flag indicating page is resident

Mental model diagram (ASCII)

VA -> page table -> present? no -> page fault -> kernel -> map page

How it works (step-by-step)

CPU accesses a virtual address.
Page table lookup finds page not present.
CPU triggers page fault exception.
Kernel resolves (allocate or read from disk).
Execution resumes.

Minimal concrete example

mmap file -> first read -> minor/major fault -> page loaded
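
The one-line example above can be turned into a small measurement. This is a hedged sketch under stated assumptions: touch_file_pages and struct fault_delta are hypothetical names, and the fault counts come from getrusage(2), which reports totals per process rather than per address.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

struct fault_delta { long minor, major; };

/*
 * Hypothetical helper: map `path` read-only, read one byte from every page,
 * and report how many minor and major faults the process took meanwhile.
 * Returns the number of pages touched, or -1 on error.
 */
long touch_file_pages(const char *path, struct fault_delta *out)
{
    struct rusage before, after;
    struct stat st;
    long page = sysconf(_SC_PAGESIZE);
    long pages = 0;
    unsigned sum = 0;

    assert(out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    volatile const unsigned char *p = map;

    getrusage(RUSAGE_SELF, &before);
    for (off_t off = 0; off < st.st_size; off += page, pages++)
        sum += p[off];              /* first read of a page may fault */
    getrusage(RUSAGE_SELF, &after);
    (void)sum;

    munmap(map, (size_t)st.st_size);
    out->minor = after.ru_minflt - before.ru_minflt;
    out->major = after.ru_majflt - before.ru_majflt;
    return pages;
}
```

The fault count is usually lower than the page count for file mappings, because the kernel may map several neighbouring cached pages while handling a single fault.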

Common misconceptions

Misconception: all page faults are bad. Correction: minor faults are normal and comparatively cheap; major faults are the ones that stall on I/O.

Check-your-understanding questions

What distinguishes a major from a minor fault?
When do COW faults happen?

Check-your-understanding answers

Major faults require disk I/O.
When a shared read-only page is written.

Real-world applications

Performance analysis of databases and caches
Understanding cold-start latency

Where you’ll apply it

This project: Section 3.2 Functional Requirements, Section 3.7 Output.
Also used in: P02-memory-allocator-malloc-free-from-scratch.

References

OSTEP, VM chapters
perf_event_open(2) docs

Key insights

Not all faults are equal; classification matters.

Summary

Page faults are how virtual memory becomes physical reality.

Homework/Exercises to practice the concept

Use mmap and touch a file to observe faults.
Compare fault counts before and after dropping the page cache (as root: sync; echo 3 > /proc/sys/vm/drop_caches).

Solutions to the homework/exercises

First touch triggers faults; subsequent touches do not.
After cache drop, major faults increase.

2.2 perf_event_open and Tracepoint Sampling

Fundamentals

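perf_event_open(2) can subscribe to the page-faults software event (PERF_COUNT_SW_PAGE_FAULTS) or to kernel tracepoints such as exceptions:page_fault_user on x86, and it delivers events through a ring buffer. You can use it in counting mode or sampling mode.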

Deep Dive into the concept

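perf events are configured with a perf_event_attr struct. In sampling mode, the kernel writes event records into a ring buffer that you mmap and parse. Sample records carry the metadata you request in sample_type, such as the faulting address (PERF_SAMPLE_ADDR). Major and minor faults can be told apart by opening the separate PERF_COUNT_SW_PAGE_FAULTS_MAJ and PERF_COUNT_SW_PAGE_FAULTS_MIN events. Properly handling buffer overruns and variable record sizes is critical for correctness.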

How this fits in the project

This is the event capture engine in Section 4.2 and Section 5.10 Phase 1.

Definitions & key terms

perf event -> kernel performance counter or tracepoint
ring buffer -> circular buffer used for events
sample -> event record containing metadata

Mental model diagram (ASCII)

kernel tracepoint -> perf ring buffer -> user parser

How it works (step-by-step)

Configure perf_event_attr for the page-fault tracepoint.
mmap the ring buffer.
Poll and read event records.
Decode address and fault type.
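
The "poll and read event records" step comes down to walking variable-sized records in a circular buffer. The sketch below shows one way to do that, including records that wrap around the end of the buffer. The names drain_records, record_fn, tally_record, and struct tally are hypothetical; only struct perf_event_header and the PERF_RECORD_* constants come from linux/perf_event.h.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*record_fn)(const struct perf_event_header *hdr,
                          const uint8_t *payload, void *ctx);

/* Copy `len` bytes starting at ring offset `off`, wrapping at `size`. */
static void copy_wrapped(void *dst, const uint8_t *data, uint64_t size,
                         uint64_t off, size_t len)
{
    size_t first = (size - off < len) ? (size_t)(size - off) : len;

    memcpy(dst, data + off, first);
    memcpy((uint8_t *)dst + first, data, len - first);
}

/*
 * Deliver every complete record between *tail and head to `fn`.
 * `data` is the start of the data pages and `size` their total size,
 * which must be a power of two. Advances *tail; returns the record count.
 * Not reentrant: records are staged in a static scratch buffer.
 */
size_t drain_records(const uint8_t *data, uint64_t size, uint64_t head,
                     uint64_t *tail, record_fn fn, void *ctx)
{
    static uint8_t scratch[1 << 16];    /* hdr.size is 16 bits wide */
    size_t n = 0;

    assert(size != 0 && (size & (size - 1)) == 0);

    while (head - *tail >= sizeof(struct perf_event_header)) {
        struct perf_event_header hdr;
        uint64_t off = *tail & (size - 1);

        copy_wrapped(&hdr, data, size, off, sizeof hdr);
        if (hdr.size < sizeof hdr || hdr.size > head - *tail ||
            hdr.size > size)
            break;                      /* malformed or incomplete record */

        copy_wrapped(scratch, data, size, off, hdr.size);
        fn(&hdr, scratch + sizeof hdr, ctx);
        *tail += hdr.size;
        n++;
    }
    return n;
}

/* Example callback: count samples and add up reported lost events. */
struct tally { uint64_t samples, lost; };

void tally_record(const struct perf_event_header *hdr,
                  const uint8_t *payload, void *ctx)
{
    struct tally *t = ctx;

    if (hdr->type == PERF_RECORD_SAMPLE) {
        t->samples++;
    } else if (hdr->type == PERF_RECORD_LOST &&
               hdr->size >= sizeof *hdr + 2 * sizeof(uint64_t)) {
        uint64_t lost;

        memcpy(&lost, payload + sizeof(uint64_t), sizeof lost); /* skip id */
        t->lost += lost;
    }
}
```

With a live buffer, head would come from data_head in struct perf_event_mmap_page (loaded with acquire semantics), and the updated tail would be stored back to data_tail (with release semantics) so the kernel can reuse the space.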

Minimal concrete example

/* glibc provides no perf_event_open() wrapper; invoke the syscall directly. */
int fd = syscall(SYS_perf_event_open, &attr, pid, -1, -1, 0);
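
For context, here is a fuller sketch of how the attr passed in that call might be filled in for sampling every user-space page fault with its address. The function names init_page_fault_attr and open_page_fault_event are hypothetical, and the field choices (sample period of 1, user-mode only, this particular sample_type) are assumptions to adjust for your design.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill `attr` to sample each page fault of a task, recording the address. */
void init_page_fault_attr(struct perf_event_attr *attr)
{
    assert(attr != NULL);

    memset(attr, 0, sizeof *attr);
    attr->type = PERF_TYPE_SOFTWARE;
    attr->size = sizeof *attr;
    attr->config = PERF_COUNT_SW_PAGE_FAULTS;
    attr->sample_period = 1;            /* one sample per fault */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
                        PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR;
    attr->disabled = 1;                 /* enable later with ioctl() */
    attr->exclude_kernel = 1;           /* user-mode faults only */
    attr->exclude_hv = 1;
    attr->wakeup_events = 1;            /* wake poll() after each sample */
}

/*
 * Open the event for `pid` on any CPU (cpu = -1), with no group leader
 * (group_fd = -1) and no flags. Returns an fd, or -1 with errno set.
 */
int open_page_fault_event(pid_t pid)
{
    struct perf_event_attr attr;

    init_page_fault_attr(&attr);
    return (int)syscall(SYS_perf_event_open, &attr, pid, -1, -1, 0UL);
}
```

Because the event starts disabled, a caller would typically mmap the ring buffer first and then enable the event with ioctl(fd, PERF_EVENT_IOC_ENABLE, 0).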

Common misconceptions

Misconception: perf buffers are ordered across CPUs. Correction: they’re per-CPU; ordering is best-effort.

Check-your-understanding questions

What happens if the ring buffer overflows?
Why use sampling instead of counting?

Check-your-understanding answers

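You lose events. In the default mode the kernel drops new records while the buffer is full and reports the number dropped in a PERF_RECORD_LOST record; in overwrite mode (buffer mapped read-only) the oldest records are overwritten instead.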
Sampling gives per-fault metadata (addresses).

Real-world applications

Profiling (perf, bcc, eBPF tools)

Where you’ll apply it

This project: Section 4.4 Algorithm Overview, Section 5.10 Phase 2.
Also used in: P07-interrupt-latency-profiler.

References

perf_event_open(2) man page
Kernel perf documentation

Key insights

perf is the bridge from kernel events to user insight.

Summary

Without perf, fault analysis is guesswork.

Homework/Exercises to practice the concept

Count page faults with perf stat.
Write a program that reads raw perf events.

Solutions to the homework/exercises

perf stat -e page-faults ./prog.
Use perf_event_open with a tiny ring buffer and parse records.

2.3 Mapping Fault Addresses to VMAs

Fundamentals

A fault address only becomes meaningful when mapped to a region: stack, heap, or file-backed segment. /proc/<pid>/maps lists these regions with permissions and file names.

Deep Dive into the concept

Each VMA includes start/end, permissions, offset, device, inode, and pathname. By scanning /proc/<pid>/maps, you can determine which region contains a fault address and whether it’s anonymous or file-backed. Combining this with ELF symbolization (optional) lets you see which binary segment caused the fault. You must refresh maps periodically because VMAs change as the process allocates or maps files.

How this fits in the project

This is how you turn fault addresses into human-readable output in Section 3.7.

Definitions & key terms

VMA -> virtual memory area entry
anonymous mapping -> no backing file
file-backed mapping -> region mapped from a file

Mental model diagram (ASCII)

fault addr 0x7f... -> /proc/pid/maps -> libc.so.6 [text]

How it works (step-by-step)

Read /proc/<pid>/maps into a list of regions.
For each fault address, find matching region.
Label as stack/heap/file/anon.
Optionally symbolize within ELF.
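
The first two steps above can be sketched as follows. This is one possible approach, not a reference implementation: struct map_region, parse_maps_line, load_maps, and find_region are hypothetical names, the buffer sizes are arbitrary, and the lookup relies on maps being listed in ascending address order.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct map_region {
    uint64_t start, end;    /* [start, end) */
    char perms[5];          /* e.g. "r-xp" */
    uint64_t offset;
    char path[256];         /* empty for anonymous mappings */
};

/* Parse one line of /proc/<pid>/maps. Returns 1 on success, 0 otherwise. */
int parse_maps_line(const char *line, struct map_region *r)
{
    unsigned dev_major, dev_minor;
    unsigned long long inode;
    int consumed = 0;

    assert(line != NULL && r != NULL);

    int n = sscanf(line,
                   "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %x:%x %llu %n",
                   &r->start, &r->end, r->perms, &r->offset,
                   &dev_major, &dev_minor, &inode, &consumed);
    if (n < 7)
        return 0;

    /* The rest of the line is the pathname; it may contain spaces. */
    snprintf(r->path, sizeof r->path, "%s", line + consumed);
    r->path[strcspn(r->path, "\n")] = '\0';
    return 1;
}

/* Read up to `cap` regions from a maps file. Returns the number stored. */
size_t load_maps(const char *maps_path, struct map_region *v, size_t cap)
{
    char line[512];         /* assumption: longer lines are not handled */
    size_t n = 0;
    FILE *f = fopen(maps_path, "r");

    if (!f)
        return 0;
    while (n < cap && fgets(line, sizeof line, f))
        if (parse_maps_line(line, &v[n]))
            n++;
    fclose(f);
    return n;
}

/* Binary search over regions sorted by address. NULL if none contains addr. */
const struct map_region *find_region(const struct map_region *v, size_t n,
                                     uint64_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addr < v[mid].start)
            hi = mid;
        else if (addr >= v[mid].end)
            lo = mid + 1;
        else
            return &v[mid];
    }
    return NULL;
}
```

A NULL result from find_region is the signal to reload the snapshot and retry once, since the fault may have landed in a mapping created after the last read.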

Minimal concrete example

7f2c7b100000-7f2c7b200000 r-xp ... /lib/x86_64-linux-gnu/libc.so.6

Common misconceptions

Misconception: /proc/<pid>/maps is static. Correction: it changes as mappings change.

Check-your-understanding questions

How do you detect a stack fault?
What does an empty pathname mean?

Check-your-understanding answers

The fault address falls in the region labeled [stack] in maps (this label marks the main thread's stack only).
Anonymous mapping.
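
Both answers translate into a small labeling function over the pathname column. The sketch below is an illustration only: the enum, classify_path, and region_name are hypothetical, and the extra REGION_OTHER bucket for bracketed pseudo-regions such as [vdso] is an assumption rather than something this spec requires.

```c
#include <assert.h>
#include <string.h>

enum region_kind {
    REGION_STACK,
    REGION_HEAP,
    REGION_FILE,
    REGION_ANON,
    REGION_OTHER            /* assumption: [vdso], [vvar], and similar */
};

/* Classify a region by the pathname column of /proc/<pid>/maps. */
enum region_kind classify_path(const char *path)
{
    assert(path != NULL);

    if (path[0] == '\0')
        return REGION_ANON;             /* empty pathname: anonymous */
    if (strcmp(path, "[stack]") == 0)
        return REGION_STACK;
    if (strcmp(path, "[heap]") == 0)
        return REGION_HEAP;
    if (path[0] == '/')
        return REGION_FILE;             /* backed by a file on disk */
    return REGION_OTHER;
}

/* Short label for output. */
const char *region_name(enum region_kind k)
{
    switch (k) {
    case REGION_STACK: return "stack";
    case REGION_HEAP:  return "heap";
    case REGION_FILE:  return "file";
    case REGION_ANON:  return "anon";
    default:           return "other";
    }
}
```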

Real-world applications

Memory profiling and leak analysis

Where you’ll apply it

This project: Section 3.4 Example Output, Section 5.10 Phase 2.
Also used in: P02-memory-allocator-malloc-free-from-scratch.

References

proc(5) man page

Key insights

A fault address is meaningless without its VMA context.

Summary

Mapping faults to VMAs makes the data actionable.

Homework/Exercises to practice the concept

Trigger a stack growth fault and observe maps.
Map a file and identify its region.

Solutions to the homework/exercises

Recurse deeply with a large local array in each frame and watch the [stack] region grow.
mmap a file and find its pathname in maps.

3. Project Specification

3.1 What You Will Build

A CLI tool pagefault-analyzer that attaches to a PID and streams page-fault events with classification and region labeling. It also provides summary stats and histograms.

3.2 Functional Requirements

Attach to target PID and capture page-fault events.
Classify faults into minor/major/COW.
Map fault addresses to VMAs and label regions.
Output live stream and summary report.
Provide deterministic output with --fixed-ts.

3.3 Non-Functional Requirements

Performance: handle 10k faults/sec without dropping events.
Reliability: graceful exit on target termination.
Usability: readable output and clear error messages.

3.4 Example Usage / Output

$ sudo ./pagefault-analyzer --fixed-ts -p 4321
[000000.001] MINOR 0x7f2c7b100000 libc.so.6 .text
[000000.012] MAJOR 0x400000 demo.bin .text 12.4ms
Summary: major=12 minor=542 cow=3

3.5 Data Formats / Schemas / Protocols

Text stream format: [ts] <type> <addr> <region> [<latency>] (latency is optional; the example in Section 3.4 prints it only for the major fault)
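
A formatter for this line format might look like the sketch below. It is an illustration under assumptions: struct fault_event and format_event are hypothetical, the timestamp is taken to be seconds and milliseconds since tracing started (inferred from the sample output in Section 3.4), and a negative latency means "not measured".

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct fault_event {
    uint64_t ts_us;         /* microseconds since tracing started */
    const char *type;       /* "MINOR", "MAJOR", or "COW" */
    uint64_t addr;          /* faulting virtual address */
    const char *region;     /* e.g. "libc.so.6 .text" */
    double latency_ms;      /* negative when no latency was measured */
};

/*
 * Format one event as "[ssssss.mmm] TYPE 0xaddr region [latency]".
 * Returns the length written, or a negative value on error.
 */
int format_event(char *buf, size_t len, const struct fault_event *e)
{
    assert(buf != NULL && e != NULL);

    int n = snprintf(buf, len,
                     "[%06" PRIu64 ".%03" PRIu64 "] %s 0x%" PRIx64 " %s",
                     e->ts_us / 1000000, (e->ts_us / 1000) % 1000,
                     e->type, e->addr, e->region);
    if (n < 0 || (size_t)n >= len || e->latency_ms < 0)
        return n;

    int m = snprintf(buf + n, len - (size_t)n, " %.1fms", e->latency_ms);
    return m < 0 ? m : n + m;
}
```

Keeping formatting in one function makes the deterministic-output requirement easier to test, because the same event struct always produces the same bytes.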

3.6 Edge Cases

Target exits while tracing.
Missing permissions (perf_event_paranoid).
Fault address outside current maps snapshot.

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

sudo ./pagefault-analyzer --fixed-ts -p 4321

3.7.2 Golden Path Demo (Deterministic)

Use --fixed-ts and a fixed workload to produce stable output.

3.7.3 CLI Transcript (Success + Failure)

$ sudo ./pagefault-analyzer --fixed-ts -p 4321
[000000.001] MINOR 0x7f... libc.so.6 .text
Summary: major=1 minor=12 cow=0

$ ./pagefault-analyzer -p 1
error: perf_event_paranoid too strict (see /proc/sys/kernel/perf_event_paranoid)
exit code: 2

3.7.4 Exit Codes

0 success
2 permission/config error

4. Solution Architecture

4.1 High-Level Design

perf event reader -> classifier -> VMA mapper -> reporter

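4.2 Key Components

perf reader (perf_reader.c): opens the page-fault event and drains the ring buffer.
classifier: tags each event as minor, major, or COW.
VMA mapper (maps.c): parses /proc/<pid>/maps and resolves fault addresses to regions.
reporter (report.c): prints the live stream and the final summary.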

4.3 Data Structures (No Full Code)

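#include <stdint.h>

struct vma {
    uint64_t start, end;   /* [start, end) address range from maps */
    char perms[5];         /* e.g. "r-xp" */
    char path[128];        /* pathname, "[stack]", "[heap]", or "" if anonymous; long paths are truncated */
};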

4.4 Algorithm Overview

Setup perf event.
Read events and classify.
Map address to VMA.
Emit output and update summary.
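
The "update summary" step can be as small as a counter per fault type. The following sketch uses hypothetical names (enum fault_type, struct summary, summary_add, summary_format) and mirrors the summary line shown in Section 3.4.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum fault_type { FAULT_MINOR, FAULT_MAJOR, FAULT_COW, FAULT_TYPE_COUNT };

struct summary {
    uint64_t count[FAULT_TYPE_COUNT];
};

/* Record one classified fault; out-of-range types are ignored. */
void summary_add(struct summary *s, enum fault_type t)
{
    assert(s != NULL);

    if ((unsigned)t < FAULT_TYPE_COUNT)
        s->count[t]++;
}

/* Write "Summary: major=N minor=N cow=N" into buf. Returns snprintf's result. */
int summary_format(char *buf, size_t len, const struct summary *s)
{
    assert(buf != NULL && s != NULL);

    return snprintf(buf, len,
                    "Summary: major=%" PRIu64 " minor=%" PRIu64
                    " cow=%" PRIu64,
                    s->count[FAULT_MAJOR], s->count[FAULT_MINOR],
                    s->count[FAULT_COW]);
}
```

A zero-initialized struct summary is a valid empty tally, so the reader loop only needs to call summary_add once per classified event.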

5. Implementation Guide

5.1 Development Environment Setup

sudo apt install build-essential linux-tools-common linux-tools-$(uname -r)

5.2 Project Structure

pagefault-analyzer/
|-- src/
|   |-- main.c
|   |-- perf_reader.c
|   |-- maps.c
|   `-- report.c
`-- Makefile

5.3 The Core Question You’re Answering

“Where do page faults happen, and how do they map to a program’s memory layout?”

5.4 Concepts You Must Understand First

Page faults and demand paging.
perf_event_open and tracepoints.
/proc/<pid>/maps parsing.

5.5 Questions to Guide Your Design

What sampling frequency is safe?
How often should you refresh VMAs?
How will you handle missing mappings?

5.6 Thinking Exercise

Why might a program have many minor faults but no major faults?

5.7 The Interview Questions They’ll Ask

Explain major vs minor faults.
What is copy-on-write and why is it used?

5.8 Hints in Layers

Hint 1: start in counting mode to verify events.
Hint 2: switch to sampling and parse ring buffer.
Hint 3: add VMA labeling.
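
Hint 1 can be tried on the current process before attaching to anything else. The sketch below is a minimal counting-mode example under assumptions: count_own_page_faults is a hypothetical helper, and it counts user-mode faults only so that it has a chance of working without root when perf_event_paranoid allows it. It returns -1 when the kernel or sandbox refuses to open the event.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Count the page faults this process takes while touching `pages` fresh
 * anonymous pages. Returns 0 and stores the count in *out, or -1 if the
 * counter could not be opened or read.
 */
int count_own_page_faults(size_t pages, uint64_t *out)
{
    struct perf_event_attr attr;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int rc = -1;

    assert(out != NULL);

    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_SOFTWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_SW_PAGE_FAULTS;
    attr.disabled = 1;                  /* start stopped; enable below */
    attr.exclude_kernel = 1;            /* user-mode faults only */
    attr.exclude_hv = 1;

    /* pid = 0: this process; cpu = -1: any CPU; no group; no flags. */
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;

    char *mem = mmap(NULL, pages * page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem != MAP_FAILED) {
        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
        for (size_t i = 0; i < pages; i++)
            ((volatile char *)mem)[i * page] = 1;   /* first touch faults */
        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

        /* With read_format == 0, read() returns a single 64-bit count. */
        if (read(fd, out, sizeof *out) == (ssize_t)sizeof *out)
            rc = 0;
        munmap(mem, pages * page);
    }
    close(fd);
    return rc;
}
```

If this returns -1 with errno set to EACCES or EPERM, checking /proc/sys/kernel/perf_event_paranoid is a reasonable first step.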

5.9 Books That Will Help

5.10 Implementation Phases

Phase 1: collect fault counts.
Phase 2: add address sampling + maps.
Phase 3: add summary + histograms.

6. Testing Strategy

6.1 Test Categories

6.2 Critical Test Cases

Faults captured for file-backed reads.
Address mapping works for stack/heap.

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

8. Extensions & Challenges

Add symbolization with libbfd or addr2line.
Export CSV for offline analysis.

9. Real-World Connections

Cold-start latency analysis for services

10. Resources

proc(5), perf_event_open(2)

11. Self-Assessment Checklist

I can classify page faults correctly.

12. Submission / Completion Criteria

Minimum: capture and classify faults.
Full: map to VMAs and produce summary.
Excellence: histogram + symbolization.

13. Determinism Notes

Use --fixed-ts and a fixed workload.