Project 8: Multi-dimensional Array Navigator

Build a visualization tool showing how multi-dimensional arrays are laid out in memory, demonstrating row-major ordering and the relationship between arrays, pointers, and indexing.


Quick Reference

Attribute Value
Language C
Difficulty Intermediate (Level 3)
Time 1 Week
Chapters Expert C Programming Ch. 9-10
Coolness 4/5 - Genuinely Enlightening
Portfolio Value Interview Ready

Learning Objectives

By completing this project, you will:

  1. Understand row-major memory layout: Visualize how arr[i][j] is stored contiguously in memory
  2. Master the array indexing formula: Calculate addresses manually: base + icolssizeof(T) + j*sizeof(T)
  3. Distinguish array of pointers from pointer to array: Know why int ** is different from int (*)[N]
  4. Pass 2D arrays to functions correctly: Understand parameter decay and dimension requirements
  5. Perform pointer arithmetic in multi-dimensional contexts: Navigate arrays using pointer manipulation
  6. Debug memory layout issues: Use tools to verify your mental model matches reality

The Core Question You’re Answering

“How does a 2D array exist in 1D memory, and what does that mean for how I access and pass it?”

C stores all arrays in contiguous, one-dimensional memory. A 2D array is a convenient abstraction that maps to this linear storage. Understanding this mapping is essential for:

  • Efficient cache-friendly access patterns
  • Correct function parameter types
  • Low-level memory manipulation
  • Understanding why certain operations are undefined behavior

Concepts You Must Understand First

Concept Why It Matters Where to Learn
Basic pointer arithmetic Foundation for array access Expert C Ch. 4
sizeof operator behavior Size calculations differ for arrays vs pointers Expert C Ch. 4
Array decay to pointer Why arr[i] works like *(arr+i) Expert C Ch. 4
Memory addresses Understanding hex addresses CS:APP Ch. 3
Type system for pointers int* vs int** vs int(*)[N] Expert C Ch. 10

Key Concepts Deep Dive

  1. Row-Major vs Column-Major Ordering
    • How are elements of a 2D array stored in memory?
    • What is the difference between row-major (C) and column-major (Fortran)?
    • Why does access pattern affect cache performance?
    • Expert C Programming Ch. 9
  2. The Array Indexing Formula
    • How does arr[i][j] translate to a memory address?
    • What is the general formula for N-dimensional arrays?
    • How does pointer arithmetic implement this formula?
    • Expert C Programming Ch. 9-10
  3. Array of Pointers vs Pointer to Array
    • What does int *arr[N] declare?
    • What does int (*arr)[N] declare?
    • Why can’t you pass int arr[M][N] as int **?
    • Expert C Programming Ch. 10
  4. Function Parameter Decay
    • What type does int arr[M][N] become in a function parameter?
    • Why must you specify all dimensions except the first?
    • How do VLAs (C99) change this?
    • C Standard, Expert C Ch. 9
  5. Cache-Friendly Access Patterns
    • Why is row-major traversal faster than column-major in C?
    • What is spatial locality?
    • How do you optimize nested loops for cache?
    • CS:APP Ch. 6

Theoretical Foundation

Row-Major Memory Layout

┌─────────────────────────────────────────────────────────────────────────────┐
│                    ROW-MAJOR MEMORY LAYOUT                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  CONCEPTUAL VIEW (2D):                                                      │
│                                                                             │
│  int arr[3][4] = {                                                          │
│      { 0,  1,  2,  3},    // Row 0                                          │
│      { 4,  5,  6,  7},    // Row 1                                          │
│      { 8,  9, 10, 11}     // Row 2                                          │
│  };                                                                         │
│                                                                             │
│        Column:  0    1    2    3                                            │
│               ┌────┬────┬────┬────┐                                         │
│       Row 0:  │  0 │  1 │  2 │  3 │                                         │
│               ├────┼────┼────┼────┤                                         │
│       Row 1:  │  4 │  5 │  6 │  7 │                                         │
│               ├────┼────┼────┼────┤                                         │
│       Row 2:  │  8 │  9 │ 10 │ 11 │                                         │
│               └────┴────┴────┴────┘                                         │
│                                                                             │
│  ACTUAL MEMORY (1D, contiguous):                                            │
│                                                                             │
│  Address:  0x100  0x104  0x108  0x10C  0x110  0x114  0x118  0x11C  ...      │
│           ┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬───────┐ │
│  Memory:  │   0  │   1  │   2  │   3  │   4  │   5  │   6  │   7  │  ...  │ │
│           └──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴───────┘ │
│            \_____________  ____________/\____________  ____________/        │
│                         \/                          \/                      │
│                       Row 0                       Row 1                     │
│                                                                             │
│  KEY INSIGHT: Each complete row is stored before the next row begins.      │
│               This is "row-major" order.                                    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Row-Major Memory Layout

The Array Indexing Formula

┌─────────────────────────────────────────────────────────────────────────────┐
│                    ARRAY INDEXING FORMULA                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  For a 2D array: Type arr[ROWS][COLS]                                       │
│                                                                             │
│  Address of arr[i][j] = base + (i * COLS + j) * sizeof(Type)                │
│                                                                             │
│  BREAKDOWN:                                                                 │
│  ──────────                                                                 │
│  base           = starting address of arr                                   │
│  i * COLS       = skip i complete rows                                      │
│  + j            = position within the row                                   │
│  * sizeof(Type) = scale by element size                                     │
│                                                                             │
│  EXAMPLE:                                                                   │
│  ─────────                                                                  │
│  int arr[3][4];  // 3 rows, 4 columns, sizeof(int) = 4                      │
│  Base address: 0x1000                                                       │
│                                                                             │
│  &arr[0][0] = 0x1000 + (0 * 4 + 0) * 4 = 0x1000                             │
│  &arr[0][3] = 0x1000 + (0 * 4 + 3) * 4 = 0x100C                             │
│  &arr[1][0] = 0x1000 + (1 * 4 + 0) * 4 = 0x1010                             │
│  &arr[1][2] = 0x1000 + (1 * 4 + 2) * 4 = 0x1018                             │
│  &arr[2][3] = 0x1000 + (2 * 4 + 3) * 4 = 0x102C                             │
│                                                                             │
│  POINTER ARITHMETIC EQUIVALENT:                                             │
│  ───────────────────────────────                                            │
│  arr[i][j]  ≡  *(*(arr + i) + j)                                            │
│  arr[i][j]  ≡  *((int*)arr + i * COLS + j)                                  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Array of Pointers vs Pointer to Array

┌─────────────────────────────────────────────────────────────────────────────┐
│                    ARRAY OF POINTERS VS POINTER TO ARRAY                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  DECLARATION 1: int *arr[3]  (Array of 3 pointers to int)                   │
│  ─────────────────────────────────────────────────────────                  │
│                                                                             │
│  Memory layout:                                                             │
│                                                                             │
│    arr (on stack):                                                          │
│    ┌─────────────────┐                                                      │
│    │ arr[0] ─────────┼───────────▶ [some ints somewhere]                    │
│    ├─────────────────┤                                                      │
│    │ arr[1] ─────────┼───────────▶ [some other ints]                        │
│    ├─────────────────┤                                                      │
│    │ arr[2] ─────────┼───────────▶ [more ints elsewhere]                    │
│    └─────────────────┘                                                      │
│                                                                             │
│    - Each row can be different length (jagged array)                        │
│    - Rows are NOT contiguous in memory                                      │
│    - Two indirections to access element: arr[i][j]                          │
│    - sizeof(arr) = 3 * sizeof(int*) = 24 bytes (on 64-bit)                  │
│                                                                             │
│  DECLARATION 2: int (*arr)[4]  (Pointer to array of 4 ints)                 │
│  ─────────────────────────────────────────────────────────                  │
│                                                                             │
│  Memory layout:                                                             │
│                                                                             │
│    arr (pointer):     Points to contiguous memory:                          │
│    ┌───────────┐      ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐   │
│    │     ──────┼─────▶│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │... │    │   │
│    └───────────┘      └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘   │
│                        \___ row 0 ___/\___ row 1 ___/\_ row 2 _/            │
│                                                                             │
│    - All rows same length (4 ints)                                          │
│    - Rows ARE contiguous in memory                                          │
│    - One indirection + offset to access                                     │
│    - sizeof(arr) = sizeof(int*) = 8 bytes (on 64-bit)                       │
│    - arr + 1 moves by sizeof(int[4]) = 16 bytes!                            │
│                                                                             │
│  DECLARATION 3: int arr[3][4]  (True 2D array)                              │
│  ─────────────────────────────────────────────────                          │
│                                                                             │
│    - 12 ints contiguous in memory                                           │
│    - sizeof(arr) = 3 * 4 * sizeof(int) = 48 bytes                           │
│    - Decays to int (*)[4] when passed to function                           │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Array of Pointers vs Pointer to Array

Passing 2D Arrays to Functions

┌─────────────────────────────────────────────────────────────────────────────┐
│                    PASSING 2D ARRAYS TO FUNCTIONS                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  RULE: The first dimension "decays" to a pointer, but all other             │
│        dimensions MUST be specified.                                        │
│                                                                             │
│  For: int arr[3][4];                                                        │
│                                                                             │
│  CORRECT WAYS TO PASS:                                                      │
│  ─────────────────────                                                      │
│  void func1(int arr[3][4]);      // All dimensions (first ignored)          │
│  void func2(int arr[][4]);       // First dimension as []                   │
│  void func3(int (*arr)[4]);      // Pointer to array of 4 ints              │
│                                                                             │
│  All three are EQUIVALENT! They all become: int (*arr)[4]                   │
│                                                                             │
│  INCORRECT:                                                                 │
│  ──────────                                                                 │
│  void bad1(int **arr);           // Different type entirely!                │
│  void bad2(int arr[3][]);        // Must specify last dimension             │
│  void bad3(int arr[][]);         // Must specify last dimension             │
│                                                                             │
│  WHY int** DOESN'T WORK:                                                    │
│  ───────────────────────                                                    │
│                                                                             │
│  int arr[3][4];                                                             │
│  int **pp = (int**)arr;  // Wrong! Compiles with warning, but...            │
│                                                                             │
│  pp[1][2]:  Treats memory at pp+1 as an address to dereference              │
│             But that memory contains the VALUE 4, not an address!           │
│             → Dereferences 0x00000004 → CRASH                               │
│                                                                             │
│  arr[1][2]: Calculates 1*4 + 2 = 6, scales by sizeof(int)                   │
│             → Correct offset into contiguous memory                         │
│                                                                             │
│  C99 VARIABLE LENGTH ARRAYS (VLAs):                                         │
│  ──────────────────────────────────                                         │
│  void process(int rows, int cols, int arr[rows][cols]) {                    │
│      // Dimensions known at runtime                                         │
│      arr[1][2] = 42;  // Works!                                             │
│  }                                                                          │
│                                                                             │
│  // Call with any dimensions:                                               │
│  int matrix[5][10];                                                         │
│  process(5, 10, matrix);                                                    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Cache Efficiency and Access Patterns

┌─────────────────────────────────────────────────────────────────────────────┐
│                    CACHE-FRIENDLY ACCESS PATTERNS                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  int arr[1000][1000];                                                       │
│                                                                             │
│  FAST (Row-Major Traversal):            SLOW (Column-Major Traversal):      │
│  ────────────────────────────           ──────────────────────────────      │
│  for (i = 0; i < 1000; i++)             for (j = 0; j < 1000; j++)          │
│      for (j = 0; j < 1000; j++)             for (i = 0; i < 1000; i++)      │
│          sum += arr[i][j];                      sum += arr[i][j];           │
│                                                                             │
│  Memory access pattern:                 Memory access pattern:              │
│                                                                             │
│  [0,0][0,1][0,2]...[0,999]              [0,0][1,0][2,0]...[999,0]           │
│  [1,0][1,1][1,2]...[1,999]              [0,1][1,1][2,1]...[999,1]           │
│  ...                                    ...                                 │
│                                                                             │
│  ┌──────────────────────────┐           ┌──────────────────────────┐        │
│  │ Sequential in memory     │           │ Jumping 4000 bytes each  │        │
│  │ Cache line reuse: HIGH   │           │ Cache line reuse: LOW    │        │
│  │ Cache misses: FEW        │           │ Cache misses: MANY       │        │
│  └──────────────────────────┘           └──────────────────────────┘        │
│                                                                             │
│  SPEEDUP: Row-major can be 10-100x faster for large arrays!                 │
│                                                                             │
│  WHY?                                                                       │
│  ─────                                                                      │
│  - CPU cache loads 64 bytes at a time (cache line)                          │
│  - Sequential access uses all loaded data                                   │
│  - Strided access wastes loaded data, causes cache misses                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Project Specification

What You Will Build

An interactive visualization tool that displays:

  1. Memory layout of multi-dimensional arrays
  2. Address calculations for any element access
  3. Differences between various pointer/array declarations
  4. Cache access patterns for different traversal orders

Functional Requirements

  1. Memory Layout Visualization (--layout):
    • Show how any declared array is laid out in memory
    • Display addresses for each element
    • Highlight row boundaries
  2. Address Calculator (--address):
    • Calculate address of arr[i][j] given base and dimensions
    • Show step-by-step formula application
    • Support 2D, 3D, and higher dimensions
  3. Type Comparator (--types):
    • Compare int arr[M][N], int *arr[M], int (*arr)[N]
    • Show sizeof differences
    • Demonstrate pointer arithmetic differences
  4. Function Parameter Analyzer (--params):
    • Show what happens when arrays are passed to functions
    • Demonstrate decay behavior
    • Show VLA parameter syntax
  5. Cache Simulator (--cache):
    • Simulate cache behavior for different access patterns
    • Show cache hit/miss rates
    • Visualize which cache lines are loaded

Non-Functional Requirements

  • Clear, educational output with ASCII visualizations
  • Works with user-specified dimensions
  • Interactive mode for exploration
  • Generates C code examples demonstrating concepts

Real World Outcome

When complete, you will have a tool that provides deep insight into array memory layout:

$ ./array_navigator --layout "int arr[3][4]" --base 0x1000

================================================================================
                    MULTI-DIMENSIONAL ARRAY MEMORY LAYOUT
================================================================================

Declaration: int arr[3][4]
Base Address: 0x1000
Element Size: 4 bytes (sizeof(int))
Total Size: 48 bytes (3 * 4 * 4)

CONCEPTUAL VIEW (2D Grid):
═══════════════════════════

         Column 0    Column 1    Column 2    Column 3
        ┌───────────┬───────────┬───────────┬───────────┐
Row 0   │  arr[0][0]│  arr[0][1]│  arr[0][2]│  arr[0][3]│
        │  0x1000   │  0x1004   │  0x1008   │  0x100C   │
        ├───────────┼───────────┼───────────┼───────────┤
Row 1   │  arr[1][0]│  arr[1][1]│  arr[1][2]│  arr[1][3]│
        │  0x1010   │  0x1014   │  0x1018   │  0x101C   │
        ├───────────┼───────────┼───────────┼───────────┤
Row 2   │  arr[2][0]│  arr[2][1]│  arr[2][2]│  arr[2][3]│
        │  0x1020   │  0x1024   │  0x1028   │  0x102C   │
        └───────────┴───────────┴───────────┴───────────┘

LINEAR MEMORY VIEW:
═══════════════════

Address │ Offset │ Index    │ Value Slot
────────┼────────┼──────────┼───────────
0x1000  │ 0      │ [0][0]   │ █████████
0x1004  │ 4      │ [0][1]   │ █████████
0x1008  │ 8      │ [0][2]   │ █████████
0x100C  │ 12     │ [0][3]   │ █████████
        │        │          │ ─── end row 0 ───
0x1010  │ 16     │ [1][0]   │ █████████
0x1014  │ 20     │ [1][1]   │ █████████
0x1018  │ 24     │ [1][2]   │ █████████
0x101C  │ 28     │ [1][3]   │ █████████
        │        │          │ ─── end row 1 ───
0x1020  │ 32     │ [2][0]   │ █████████
0x1024  │ 36     │ [2][1]   │ █████████
0x1028  │ 40     │ [2][2]   │ █████████
0x102C  │ 44     │ [2][3]   │ █████████
        │        │          │ ─── end row 2 ───

ADDRESSING FORMULA:
═══════════════════

  &arr[i][j] = base + (i * COLS + j) * sizeof(element)
  &arr[i][j] = 0x1000 + (i * 4 + j) * 4

Examples:
  &arr[0][0] = 0x1000 + (0*4 + 0)*4 = 0x1000
  &arr[1][2] = 0x1000 + (1*4 + 2)*4 = 0x1000 + 24 = 0x1018
  &arr[2][3] = 0x1000 + (2*4 + 3)*4 = 0x1000 + 44 = 0x102C

================================================================================

$ ./array_navigator --types

================================================================================
                    COMPARING ARRAY/POINTER TYPES
================================================================================

TYPE 1: int arr[3][4]  (True 2D Array)
──────────────────────────────────────
- sizeof(arr) = 48 bytes  (3 * 4 * sizeof(int))
- sizeof(arr[0]) = 16 bytes  (4 * sizeof(int))
- arr decays to: int (*)[4]  (pointer to array of 4 ints)
- &arr has type: int (*)[3][4]  (pointer to whole array)
- arr + 1 advances by: 16 bytes (one row)

Memory:
┌────────────────────────────────────────────────────┐
│ [0,0][0,1][0,2][0,3][1,0][1,1][1,2][1,3][2,0]...   │
│  contiguous, 48 bytes total                        │
└────────────────────────────────────────────────────┘

TYPE 2: int *arr[3]  (Array of 3 Pointers)
──────────────────────────────────────────
- sizeof(arr) = 24 bytes  (3 * sizeof(int*))
- sizeof(arr[0]) = 8 bytes  (sizeof(int*))
- arr decays to: int **
- arr + 1 advances by: 8 bytes (one pointer)

Memory:
┌──────────────────────┐
│ ptr[0] ──────────────┼──▶ [dynamically allocated ints]
│ ptr[1] ──────────────┼──▶ [more ints, different location]
│ ptr[2] ──────────────┼──▶ [more ints, yet another place]
└──────────────────────┘
Rows are NOT contiguous! Different sizes possible!

TYPE 3: int (*arr)[4]  (Pointer to Array of 4 Ints)
────────────────────────────────────────────────────
- sizeof(arr) = 8 bytes  (sizeof pointer)
- sizeof(*arr) = 16 bytes  (sizeof(int[4]))
- arr + 1 advances by: 16 bytes (size of int[4])
- Can point to: int matrix[N][4] for any N

Memory (when pointing to int matrix[3][4]):
┌────────┐      ┌────────────────────────────────────────────┐
│   ─────┼──────▶[0,0][0,1][0,2][0,3][1,0][1,1][1,2][1,3]... │
└────────┘      └────────────────────────────────────────────┘
                 ^ arr points here, arr+1 points 16 bytes later

KEY DIFFERENCES:
════════════════

                    int arr[3][4]  │  int *arr[3]  │  int (*arr)[4]
────────────────────────────────────┼───────────────┼───────────────
sizeof(arr)         48 bytes       │  24 bytes     │  8 bytes
Memory layout       Contiguous     │  Scattered    │  Depends on target
Rows same size?     Yes (fixed)    │  No (jagged)  │  Yes (fixed)
Can reassign arr?   No (not lvalue)│  No           │  Yes (is pointer)
Passing to func     Decays         │  Decays       │  Stays same

$ ./array_navigator --cache "int arr[1000][1000]" --pattern row-major

================================================================================
                    CACHE ACCESS SIMULATION
================================================================================

Array: int arr[1000][1000]
Total Size: 4,000,000 bytes (3.8 MB)
Cache Line Size: 64 bytes (16 integers per line)
L1 Cache Size: 32 KB (512 cache lines)

ACCESS PATTERN: Row-Major (arr[i][j] with i outer loop)

Simulation:
  Accessing row 0, column 0-999:
    Load cache line @ 0x0000 → covers arr[0][0] to arr[0][15]
    ✓ Hit arr[0][0], ✓ Hit arr[0][1], ... ✓ Hit arr[0][15]
    Load cache line @ 0x0040 → covers arr[0][16] to arr[0][31]
    ✓ Hit arr[0][16], ✓ Hit arr[0][17], ...

RESULTS:
  Total accesses: 1,000,000
  Cache loads:    62,500 (one per 16 elements)
  Cache hits:     937,500 (15/16 = 93.75% hit rate)

COMPARISON WITH COLUMN-MAJOR:

Row-Major:     ████████████████████████████████████████████████░░░░  93.75%
Column-Major:  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.00%

Column-major would load a new cache line for EVERY access
(because arr[1][0] is 4000 bytes away from arr[0][0])!

PERFORMANCE IMPACT: Row-major is ~16x faster for this array size!

Questions to Guide Your Design

  1. How will you parse array declarations like “int arr[3][4][5]” from user input?

  2. How will you handle the display for very large arrays where printing every element is impractical?

  3. How will you demonstrate the difference between static arrays and dynamically allocated “2D” arrays?

  4. How will you show pointer arithmetic visually (arr+1 vs &arr[0]+1 vs &arr+1)?

  5. For the cache simulation, what simplifying assumptions will you make?

  6. How will you generate compilable example code that users can run to verify your explanations?


Thinking Exercise

Before coding, work through these exercises:

Exercise 1: Address Calculation

Given: int matrix[4][5] at base address 0x2000

Calculate by hand:

  • &matrix[0][0] = ?
  • &matrix[1][3] = ?
  • &matrix[3][4] = ?
  • What happens if you access matrix[4][0]? (out of bounds)

Exercise 2: Pointer Type Analysis

For each expression, determine the type:

int arr[3][4];

typeof(arr)      = ?    // The array itself
typeof(arr[0])   = ?    // First row
typeof(arr[0][0])= ?    // First element
typeof(&arr)     = ?    // Address of whole array
typeof(&arr[0])  = ?    // Address of first row
typeof(arr + 1)  = ?    // Pointer to second row

Exercise 3: Function Parameter Decay

Which of these function signatures are equivalent?

void f1(int arr[10][20]);
void f2(int arr[][20]);
void f3(int (*arr)[20]);
void f4(int **arr);
void f5(int *arr[20]);

Exercise 4: Why This Crashes

int arr[3][4] = {{1,2,3,4}, {5,6,7,8}, {9,10,11,12}};
int **pp = (int**)arr;
printf("%d\n", pp[1][2]);  // CRASH! Why?

Trace what the code actually does step by step.


Hints in Layers

Hint 1: Parsing Array Declarations

Use a simple regex or manual parsing for declarations:

// Parse "int arr[3][4][5]"
typedef struct {
    char type[32];      // "int"
    char name[64];      // "arr"
    int dims[8];        // {3, 4, 5}
    int ndims;          // 3
} ArrayDecl;

// Parsing pseudocode:
// 1. Extract type (before first space)
// 2. Extract name (between space and first '[')
// 3. Extract dimensions (between each '[' and ']')
Hint 2: Calculating Addresses Generically

For N-dimensional array, the address formula generalizes:

// arr[i₀][i₁][i₂]...[iₙ₋₁] for dims[n]
size_t offset = 0;
size_t stride = elem_size;
for (int d = ndims - 1; d >= 0; d--) {
    offset += indices[d] * stride;
    stride *= dims[d];
}
address = base + offset;
Hint 3: Visualizing Large Arrays

For large arrays, show:

  • First few elements of first row
  • “…” indicator
  • Last few elements of last row
  • Summary statistics
if (rows > 5) {
    show_rows(0, 2);      // First 2 rows
    printf("... (%d rows omitted) ...\n", rows - 4);
    show_rows(rows-2, 2); // Last 2 rows
}
Hint 4: Cache Simulation

Simple cache simulation:

typedef struct {
    size_t line_size;     // e.g., 64 bytes
    size_t num_lines;     // e.g., 512
    size_t *tags;         // Which memory block is in each line
    int *valid;           // Is line valid?
} Cache;

int access(Cache *c, size_t address) {
    size_t line_addr = address / c->line_size;
    size_t index = line_addr % c->num_lines;  // Direct-mapped

    if (c->valid[index] && c->tags[index] == line_addr) {
        return 1;  // Hit
    }
    c->tags[index] = line_addr;
    c->valid[index] = 1;
    return 0;  // Miss
}
Hint 5: Generating Example Code

Generate runnable C code that verifies concepts:

void generate_verification_code(ArrayDecl *decl, FILE *out) {
    fprintf(out, "#include <stdio.h>\n\n");
    fprintf(out, "int main(void) {\n");
    fprintf(out, "    %s %s", decl->type, decl->name);
    for (int i = 0; i < decl->ndims; i++) {
        fprintf(out, "[%d]", decl->dims[i]);
    }
    fprintf(out, ";\n\n");
    fprintf(out, "    printf(\"Base address: %%p\\n\", (void*)%s);\n", decl->name);
    // ... more verification code
    fprintf(out, "}\n");
}

Solution Architecture

High-Level Design

┌─────────────────────────────────────────────────────────────────────────────┐
│                         ARRAY NAVIGATOR                                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌───────────────┐│
│  │  Declaration │   │    Layout    │   │    Type      │   │    Cache      ││
│  │    Parser    │──▶│  Visualizer  │   │  Comparator  │   │  Simulator    ││
│  └──────────────┘   └──────────────┘   └──────────────┘   └───────────────┘│
│         │                  │                  │                   │        │
│         ▼                  ▼                  ▼                   ▼        │
│  ┌─────────────────────────────────────────────────────────────────────────┐│
│  │                        Report Generator                                 ││
│  │            (ASCII tables, diagrams, code examples)                      ││
│  └─────────────────────────────────────────────────────────────────────────┘│
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Data Structures

typedef struct {
    char element_type[32];  // "int", "double", etc.
    size_t element_size;    // sizeof(element_type)
    int dimensions[8];      // e.g., {3, 4, 5} for [3][4][5]
    int num_dimensions;     // e.g., 3
    char name[64];          // e.g., "matrix"
} ArrayDeclaration;

typedef struct {
    size_t base_address;
    int indices[8];         // e.g., {1, 2} for [1][2]
    size_t calculated_address;
    char formula[256];      // Human-readable formula
} AddressCalculation;

typedef struct {
    size_t line_size;
    size_t total_size;
    size_t accesses;
    size_t hits;
    size_t misses;
} CacheStats;

typedef struct {
    char signature[256];
    char decayed_type[128];
    char explanation[512];
} FunctionParameter;

Implementation Guide

Phase 1: Declaration Parser (Day 1)

Tasks:

  1. Parse “int arr[M][N]” style declarations
  2. Support various types (int, char, double, struct types)
  3. Calculate total size and per-dimension strides

Phase 2: Layout Visualizer (Days 2-3)

Tasks:

  1. Generate 2D grid view for 2D arrays
  2. Generate linear memory view
  3. Show address formula and examples
  4. Handle 3D and higher dimensions

Phase 3: Type Comparator (Days 4-5)

Tasks:

  1. Compare arr[M][N] vs arr[M] vs (arr)[N]
  2. Show sizeof differences
  3. Demonstrate pointer arithmetic differences
  4. Generate code examples

Phase 4: Cache Simulator (Days 6-7)

Tasks:

  1. Implement simple direct-mapped cache model
  2. Simulate row-major vs column-major access
  3. Display hit/miss statistics
  4. Show visual comparison

Testing Strategy

Test Cases

  1. 2D Array Layout
    • int arr[3][4] - standard case
    • char arr[2][8] - different element size
    • double arr[10][10] - larger array
  2. Address Calculations
    • Verify calculated addresses match actual addresses
    • Test boundary elements (first, last, corners)
  3. Type Comparisons
    • Verify sizeof calculations
    • Verify pointer arithmetic steps
  4. Cache Simulation
    • Known patterns with predictable hit rates
    • Compare with theoretical analysis

Verification Approach

Generate C code that users can compile and run to verify:

// Generated verification code
int arr[3][4];
printf("Calculated &arr[1][2]: 0x%lx\n", CALCULATED);
printf("Actual &arr[1][2]: %p\n", (void*)&arr[1][2]);
// Should match!

Common Pitfalls & Debugging

Pitfall Symptom Solution
Forgetting element size Addresses wrong by factor of 4 Multiply by sizeof(type)
Confusing row/column major Wrong elements accessed Verify formula matches C’s layout
Wrong pointer arithmetic Unexpected address Remember arr+1 moves by sizeof(*arr)
Type confusion Compilation errors Draw out the types carefully
Off-by-one in formulas Wrong addresses at boundaries Test with known small arrays

Extensions & Challenges

Beginner Extensions

  • Add support for typedef’d types
  • Colorize output for different sections
  • Export visualizations as HTML

Intermediate Extensions

  • Support for struct arrays
  • Visualization of padding in structs
  • Interactive mode with REPL

Advanced Extensions

  • Full cache hierarchy simulation (L1/L2/L3)
  • SIMD access pattern analysis
  • Integration with compiler output (compare with gcc -S)

Self-Assessment Checklist

Understanding

  • I can calculate array element addresses by hand
  • I know why int** is not the same as int[][N]
  • I understand row-major layout and its cache implications
  • I can write correct function signatures for 2D arrays
  • I understand VLA parameter syntax

Implementation

  • Layout visualization works for 2D and 3D arrays
  • Address calculations are verified correct
  • Type comparison shows meaningful differences
  • Cache simulation produces reasonable results

The Interview Questions They’ll Ask

  1. “How are 2D arrays stored in memory in C?”
    • Row-major order: complete rows stored consecutively
    • arr[i][j] = base + (i * cols + j) * sizeof(element)
    • All elements contiguous in memory
  2. “Why can’t you pass int arr[][] to a function?”
    • Compiler needs column count for address calculation
    • arr[i][j] requires knowing stride
    • Must be: int arr[][COLS] or int (*arr)[COLS]
  3. “What’s the difference between int arr[N] and int (arr)[N]?”
    • First: array of N pointers to int
    • Second: pointer to array of N ints
    • Different sizeof, different memory layout
  4. “Why is row-major traversal faster than column-major?”
    • Cache lines load contiguous memory
    • Row-major accesses sequential addresses
    • Column-major jumps by row_size each access

Resources

Essential Reading

  • Expert C Programming Ch. 9-10: Arrays and pointers in depth
  • CS:APP Ch. 6.2-6.3: Memory hierarchy and cache optimization
  • K&R Ch. 5.7-5.9: Multi-dimensional arrays

Books That Will Help

Topic Book Chapter
Array/pointer relationship Expert C Programming Ch. 4, 9, 10
Memory layout Write Great Code Vol 1 Ch. 4
Cache optimization CS:APP Ch. 6
C99 VLAs C Programming: A Modern Approach Ch. 8

Submission / Completion Criteria

Minimum Viable:

  • Layout visualization for 2D arrays
  • Address calculation with formula display
  • Basic type comparison

Full Completion:

  • Support for 2D, 3D arrays
  • Type comparator with all three major types
  • Cache simulation with statistics
  • Generated verification code

Excellence:

  • Interactive REPL mode
  • Full cache hierarchy simulation
  • HTML export with diagrams
  • Support for struct arrays and padding

This project is part of the Expert C Programming Mastery series. For the complete learning path, see the project index.