Project 1: Locality Benchmarker

A program that sums a large 2D array of integers in two ways: once iterating row-by-row, and once iterating column-by-column. You will then measure and display the massive performance difference.

Quick Reference

Attribute Value
Primary Language C
Alternative Languages C++, Rust
Difficulty Level 1: Beginner
Time Estimate Weekend
Knowledge Area CPU Caches / Performance Measurement
Tooling GCC/Clang
Prerequisites Basic C programming (loops, arrays, functions)

What You Will Build

A program that sums a large 2D array of integers in two ways: once iterating row-by-row, and once iterating column-by-column. You will then measure and display the massive performance difference.

Why It Matters

This project builds core skills that appear repeatedly in real-world systems and tooling.

Core Challenges

  • Setting up a large 2D array → maps to dynamic memory allocation
  • Writing a high-resolution timer → maps to measuring performance accurately
  • Implementing the two loop orders → maps to understanding row-major vs. column-major access
  • Preventing compiler optimizations → maps to ensuring the compiler doesn’t optimize away your benchmark

Key Concepts

  • Spatial Locality: “Computer Systems: A Programmer’s Perspective” Ch. 6 - Bryant & O’Hallaron
  • Memory Hierarchy: “Computer Systems: A Programmer’s Perspective” Ch. 6 - Bryant & O’Hallaron
  • Row-Major Layout: C programming tutorials on 2D arrays

Real-World Outcome

$ ./locality_benchmark
Initializing 10000x10000 matrix...
Row-major access (good spatial locality):
  Sum: 122713344, Time: 0.08 seconds
Column-major access (bad spatial locality):
  Sum: 122713344, Time: 1.12 seconds

Performance Ratio (bad/good): 14.00x

Implementation Guide

  1. Reproduce the simplest happy-path scenario.
  2. Build the smallest working version of the core feature.
  3. Add input validation and error handling.
  4. Add instrumentation/logging to confirm behavior.
  5. Refactor into clean modules with tests.

Milestones

  • Milestone 1: Minimal working program that runs end-to-end.
  • Milestone 2: Correct outputs for typical inputs.
  • Milestone 3: Robust handling of edge cases.
  • Milestone 4: Clean structure and documented usage.

Validation Checklist

  • Output matches the real-world outcome example
  • Handles invalid inputs safely
  • Provides clear errors and exit codes
  • Repeatable results across runs

References

  • Main guide: LEARN_C_PERFORMANCE_DEEP_DIVE.md
  • “Computer Systems: A Programmer’s Perspective” by Randal E. Bryant & David R. O’Hallaron