Project 8: Multi-dimensional Array Navigator
Build a visualization tool showing how multi-dimensional arrays are laid out in memory, demonstrating row-major ordering and the relationship between arrays, pointers, and indexing.
Quick Reference
| Attribute | Value |
|---|---|
| Language | C |
| Difficulty | Intermediate (Level 3) |
| Time | 1 Week |
| Chapters | Expert C Programming Ch. 9-10 |
| Coolness | 4/5 - Genuinely Enlightening |
| Portfolio Value | Interview Ready |
Learning Objectives
By completing this project, you will:
- Understand row-major memory layout: Visualize how arr[i][j] is stored contiguously in memory
- Master the array indexing formula: Calculate addresses manually: base + icolssizeof(T) + j*sizeof(T)
- Distinguish array of pointers from pointer to array: Know why
int **is different fromint (*)[N] - Pass 2D arrays to functions correctly: Understand parameter decay and dimension requirements
- Perform pointer arithmetic in multi-dimensional contexts: Navigate arrays using pointer manipulation
- Debug memory layout issues: Use tools to verify your mental model matches reality
The Core Question You’re Answering
“How does a 2D array exist in 1D memory, and what does that mean for how I access and pass it?”
C stores all arrays in contiguous, one-dimensional memory. A 2D array is a convenient abstraction that maps to this linear storage. Understanding this mapping is essential for:
- Efficient cache-friendly access patterns
- Correct function parameter types
- Low-level memory manipulation
- Understanding why certain operations are undefined behavior
Concepts You Must Understand First
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| Basic pointer arithmetic | Foundation for array access | Expert C Ch. 4 |
| sizeof operator behavior | Size calculations differ for arrays vs pointers | Expert C Ch. 4 |
| Array decay to pointer | Why arr[i] works like *(arr+i) | Expert C Ch. 4 |
| Memory addresses | Understanding hex addresses | CS:APP Ch. 3 |
| Type system for pointers | int* vs int** vs int(*)[N] | Expert C Ch. 10 |
Key Concepts Deep Dive
- Row-Major vs Column-Major Ordering
- How are elements of a 2D array stored in memory?
- What is the difference between row-major (C) and column-major (Fortran)?
- Why does access pattern affect cache performance?
- Expert C Programming Ch. 9
- The Array Indexing Formula
- How does arr[i][j] translate to a memory address?
- What is the general formula for N-dimensional arrays?
- How does pointer arithmetic implement this formula?
- Expert C Programming Ch. 9-10
- Array of Pointers vs Pointer to Array
- What does
int *arr[N]declare? - What does
int (*arr)[N]declare? - Why can’t you pass
int arr[M][N]asint **? - Expert C Programming Ch. 10
- What does
- Function Parameter Decay
- What type does
int arr[M][N]become in a function parameter? - Why must you specify all dimensions except the first?
- How do VLAs (C99) change this?
- C Standard, Expert C Ch. 9
- What type does
- Cache-Friendly Access Patterns
- Why is row-major traversal faster than column-major in C?
- What is spatial locality?
- How do you optimize nested loops for cache?
- CS:APP Ch. 6
Theoretical Foundation
Row-Major Memory Layout
┌─────────────────────────────────────────────────────────────────────────────┐
│ ROW-MAJOR MEMORY LAYOUT │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ CONCEPTUAL VIEW (2D): │
│ │
│ int arr[3][4] = { │
│ { 0, 1, 2, 3}, // Row 0 │
│ { 4, 5, 6, 7}, // Row 1 │
│ { 8, 9, 10, 11} // Row 2 │
│ }; │
│ │
│ Column: 0 1 2 3 │
│ ┌────┬────┬────┬────┐ │
│ Row 0: │ 0 │ 1 │ 2 │ 3 │ │
│ ├────┼────┼────┼────┤ │
│ Row 1: │ 4 │ 5 │ 6 │ 7 │ │
│ ├────┼────┼────┼────┤ │
│ Row 2: │ 8 │ 9 │ 10 │ 11 │ │
│ └────┴────┴────┴────┘ │
│ │
│ ACTUAL MEMORY (1D, contiguous): │
│ │
│ Address: 0x100 0x104 0x108 0x10C 0x110 0x114 0x118 0x11C ... │
│ ┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬───────┐ │
│ Memory: │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ ... │ │
│ └──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴───────┘ │
│ \_____________ ____________/\____________ ____________/ │
│ \/ \/ │
│ Row 0 Row 1 │
│ │
│ KEY INSIGHT: Each complete row is stored before the next row begins. │
│ This is "row-major" order. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

The Array Indexing Formula
┌─────────────────────────────────────────────────────────────────────────────┐
│ ARRAY INDEXING FORMULA │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ For a 2D array: Type arr[ROWS][COLS] │
│ │
│ Address of arr[i][j] = base + (i * COLS + j) * sizeof(Type) │
│ │
│ BREAKDOWN: │
│ ────────── │
│ base = starting address of arr │
│ i * COLS = skip i complete rows │
│ + j = position within the row │
│ * sizeof(Type) = scale by element size │
│ │
│ EXAMPLE: │
│ ───────── │
│ int arr[3][4]; // 3 rows, 4 columns, sizeof(int) = 4 │
│ Base address: 0x1000 │
│ │
│ &arr[0][0] = 0x1000 + (0 * 4 + 0) * 4 = 0x1000 │
│ &arr[0][3] = 0x1000 + (0 * 4 + 3) * 4 = 0x100C │
│ &arr[1][0] = 0x1000 + (1 * 4 + 0) * 4 = 0x1010 │
│ &arr[1][2] = 0x1000 + (1 * 4 + 2) * 4 = 0x1018 │
│ &arr[2][3] = 0x1000 + (2 * 4 + 3) * 4 = 0x102C │
│ │
│ POINTER ARITHMETIC EQUIVALENT: │
│ ─────────────────────────────── │
│ arr[i][j] ≡ *(*(arr + i) + j) │
│ arr[i][j] ≡ *((int*)arr + i * COLS + j) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Array of Pointers vs Pointer to Array
┌─────────────────────────────────────────────────────────────────────────────┐
│ ARRAY OF POINTERS VS POINTER TO ARRAY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ DECLARATION 1: int *arr[3] (Array of 3 pointers to int) │
│ ───────────────────────────────────────────────────────── │
│ │
│ Memory layout: │
│ │
│ arr (on stack): │
│ ┌─────────────────┐ │
│ │ arr[0] ─────────┼───────────▶ [some ints somewhere] │
│ ├─────────────────┤ │
│ │ arr[1] ─────────┼───────────▶ [some other ints] │
│ ├─────────────────┤ │
│ │ arr[2] ─────────┼───────────▶ [more ints elsewhere] │
│ └─────────────────┘ │
│ │
│ - Each row can be different length (jagged array) │
│ - Rows are NOT contiguous in memory │
│ - Two indirections to access element: arr[i][j] │
│ - sizeof(arr) = 3 * sizeof(int*) = 24 bytes (on 64-bit) │
│ │
│ DECLARATION 2: int (*arr)[4] (Pointer to array of 4 ints) │
│ ───────────────────────────────────────────────────────── │
│ │
│ Memory layout: │
│ │
│ arr (pointer): Points to contiguous memory: │
│ ┌───────────┐ ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐ │
│ │ ──────┼─────▶│ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │... │ │ │
│ └───────────┘ └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘ │
│ \___ row 0 ___/\___ row 1 ___/\_ row 2 _/ │
│ │
│ - All rows same length (4 ints) │
│ - Rows ARE contiguous in memory │
│ - One indirection + offset to access │
│ - sizeof(arr) = sizeof(int*) = 8 bytes (on 64-bit) │
│ - arr + 1 moves by sizeof(int[4]) = 16 bytes! │
│ │
│ DECLARATION 3: int arr[3][4] (True 2D array) │
│ ───────────────────────────────────────────────── │
│ │
│ - 12 ints contiguous in memory │
│ - sizeof(arr) = 3 * 4 * sizeof(int) = 48 bytes │
│ - Decays to int (*)[4] when passed to function │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

Passing 2D Arrays to Functions
┌─────────────────────────────────────────────────────────────────────────────┐
│ PASSING 2D ARRAYS TO FUNCTIONS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ RULE: The first dimension "decays" to a pointer, but all other │
│ dimensions MUST be specified. │
│ │
│ For: int arr[3][4]; │
│ │
│ CORRECT WAYS TO PASS: │
│ ───────────────────── │
│ void func1(int arr[3][4]); // All dimensions (first ignored) │
│ void func2(int arr[][4]); // First dimension as [] │
│ void func3(int (*arr)[4]); // Pointer to array of 4 ints │
│ │
│ All three are EQUIVALENT! They all become: int (*arr)[4] │
│ │
│ INCORRECT: │
│ ────────── │
│ void bad1(int **arr); // Different type entirely! │
│ void bad2(int arr[3][]); // Must specify last dimension │
│ void bad3(int arr[][]); // Must specify last dimension │
│ │
│ WHY int** DOESN'T WORK: │
│ ─────────────────────── │
│ │
│ int arr[3][4]; │
│ int **pp = (int**)arr; // Wrong! Compiles with warning, but... │
│ │
│ pp[1][2]: Treats memory at pp+1 as an address to dereference │
│ But that memory contains the VALUE 4, not an address! │
│ → Dereferences 0x00000004 → CRASH │
│ │
│ arr[1][2]: Calculates 1*4 + 2 = 6, scales by sizeof(int) │
│ → Correct offset into contiguous memory │
│ │
│ C99 VARIABLE LENGTH ARRAYS (VLAs): │
│ ────────────────────────────────── │
│ void process(int rows, int cols, int arr[rows][cols]) { │
│ // Dimensions known at runtime │
│ arr[1][2] = 42; // Works! │
│ } │
│ │
│ // Call with any dimensions: │
│ int matrix[5][10]; │
│ process(5, 10, matrix); │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Cache Efficiency and Access Patterns
┌─────────────────────────────────────────────────────────────────────────────┐
│ CACHE-FRIENDLY ACCESS PATTERNS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ int arr[1000][1000]; │
│ │
│ FAST (Row-Major Traversal): SLOW (Column-Major Traversal): │
│ ──────────────────────────── ────────────────────────────── │
│ for (i = 0; i < 1000; i++) for (j = 0; j < 1000; j++) │
│ for (j = 0; j < 1000; j++) for (i = 0; i < 1000; i++) │
│ sum += arr[i][j]; sum += arr[i][j]; │
│ │
│ Memory access pattern: Memory access pattern: │
│ │
│ [0,0][0,1][0,2]...[0,999] [0,0][1,0][2,0]...[999,0] │
│ [1,0][1,1][1,2]...[1,999] [0,1][1,1][2,1]...[999,1] │
│ ... ... │
│ │
│ ┌──────────────────────────┐ ┌──────────────────────────┐ │
│ │ Sequential in memory │ │ Jumping 4000 bytes each │ │
│ │ Cache line reuse: HIGH │ │ Cache line reuse: LOW │ │
│ │ Cache misses: FEW │ │ Cache misses: MANY │ │
│ └──────────────────────────┘ └──────────────────────────┘ │
│ │
│ SPEEDUP: Row-major can be 10-100x faster for large arrays! │
│ │
│ WHY? │
│ ───── │
│ - CPU cache loads 64 bytes at a time (cache line) │
│ - Sequential access uses all loaded data │
│ - Strided access wastes loaded data, causes cache misses │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Project Specification
What You Will Build
An interactive visualization tool that displays:
- Memory layout of multi-dimensional arrays
- Address calculations for any element access
- Differences between various pointer/array declarations
- Cache access patterns for different traversal orders
Functional Requirements
- Memory Layout Visualization (
--layout):- Show how any declared array is laid out in memory
- Display addresses for each element
- Highlight row boundaries
- Address Calculator (
--address):- Calculate address of arr[i][j] given base and dimensions
- Show step-by-step formula application
- Support 2D, 3D, and higher dimensions
- Type Comparator (
--types):- Compare
int arr[M][N],int *arr[M],int (*arr)[N] - Show sizeof differences
- Demonstrate pointer arithmetic differences
- Compare
- Function Parameter Analyzer (
--params):- Show what happens when arrays are passed to functions
- Demonstrate decay behavior
- Show VLA parameter syntax
- Cache Simulator (
--cache):- Simulate cache behavior for different access patterns
- Show cache hit/miss rates
- Visualize which cache lines are loaded
Non-Functional Requirements
- Clear, educational output with ASCII visualizations
- Works with user-specified dimensions
- Interactive mode for exploration
- Generates C code examples demonstrating concepts
Real World Outcome
When complete, you will have a tool that provides deep insight into array memory layout:
$ ./array_navigator --layout "int arr[3][4]" --base 0x1000
================================================================================
MULTI-DIMENSIONAL ARRAY MEMORY LAYOUT
================================================================================
Declaration: int arr[3][4]
Base Address: 0x1000
Element Size: 4 bytes (sizeof(int))
Total Size: 48 bytes (3 * 4 * 4)
CONCEPTUAL VIEW (2D Grid):
═══════════════════════════
Column 0 Column 1 Column 2 Column 3
┌───────────┬───────────┬───────────┬───────────┐
Row 0 │ arr[0][0]│ arr[0][1]│ arr[0][2]│ arr[0][3]│
│ 0x1000 │ 0x1004 │ 0x1008 │ 0x100C │
├───────────┼───────────┼───────────┼───────────┤
Row 1 │ arr[1][0]│ arr[1][1]│ arr[1][2]│ arr[1][3]│
│ 0x1010 │ 0x1014 │ 0x1018 │ 0x101C │
├───────────┼───────────┼───────────┼───────────┤
Row 2 │ arr[2][0]│ arr[2][1]│ arr[2][2]│ arr[2][3]│
│ 0x1020 │ 0x1024 │ 0x1028 │ 0x102C │
└───────────┴───────────┴───────────┴───────────┘
LINEAR MEMORY VIEW:
═══════════════════
Address │ Offset │ Index │ Value Slot
────────┼────────┼──────────┼───────────
0x1000 │ 0 │ [0][0] │ █████████
0x1004 │ 4 │ [0][1] │ █████████
0x1008 │ 8 │ [0][2] │ █████████
0x100C │ 12 │ [0][3] │ █████████
│ │ │ ─── end row 0 ───
0x1010 │ 16 │ [1][0] │ █████████
0x1014 │ 20 │ [1][1] │ █████████
0x1018 │ 24 │ [1][2] │ █████████
0x101C │ 28 │ [1][3] │ █████████
│ │ │ ─── end row 1 ───
0x1020 │ 32 │ [2][0] │ █████████
0x1024 │ 36 │ [2][1] │ █████████
0x1028 │ 40 │ [2][2] │ █████████
0x102C │ 44 │ [2][3] │ █████████
│ │ │ ─── end row 2 ───
ADDRESSING FORMULA:
═══════════════════
&arr[i][j] = base + (i * COLS + j) * sizeof(element)
&arr[i][j] = 0x1000 + (i * 4 + j) * 4
Examples:
&arr[0][0] = 0x1000 + (0*4 + 0)*4 = 0x1000
&arr[1][2] = 0x1000 + (1*4 + 2)*4 = 0x1000 + 24 = 0x1018
&arr[2][3] = 0x1000 + (2*4 + 3)*4 = 0x1000 + 44 = 0x102C
================================================================================
$ ./array_navigator --types
================================================================================
COMPARING ARRAY/POINTER TYPES
================================================================================
TYPE 1: int arr[3][4] (True 2D Array)
──────────────────────────────────────
- sizeof(arr) = 48 bytes (3 * 4 * sizeof(int))
- sizeof(arr[0]) = 16 bytes (4 * sizeof(int))
- arr decays to: int (*)[4] (pointer to array of 4 ints)
- &arr has type: int (*)[3][4] (pointer to whole array)
- arr + 1 advances by: 16 bytes (one row)
Memory:
┌────────────────────────────────────────────────────┐
│ [0,0][0,1][0,2][0,3][1,0][1,1][1,2][1,3][2,0]... │
│ contiguous, 48 bytes total │
└────────────────────────────────────────────────────┘
TYPE 2: int *arr[3] (Array of 3 Pointers)
──────────────────────────────────────────
- sizeof(arr) = 24 bytes (3 * sizeof(int*))
- sizeof(arr[0]) = 8 bytes (sizeof(int*))
- arr decays to: int **
- arr + 1 advances by: 8 bytes (one pointer)
Memory:
┌──────────────────────┐
│ ptr[0] ──────────────┼──▶ [dynamically allocated ints]
│ ptr[1] ──────────────┼──▶ [more ints, different location]
│ ptr[2] ──────────────┼──▶ [more ints, yet another place]
└──────────────────────┘
Rows are NOT contiguous! Different sizes possible!
TYPE 3: int (*arr)[4] (Pointer to Array of 4 Ints)
────────────────────────────────────────────────────
- sizeof(arr) = 8 bytes (sizeof pointer)
- sizeof(*arr) = 16 bytes (sizeof(int[4]))
- arr + 1 advances by: 16 bytes (size of int[4])
- Can point to: int matrix[N][4] for any N
Memory (when pointing to int matrix[3][4]):
┌────────┐ ┌────────────────────────────────────────────┐
│ ─────┼──────▶[0,0][0,1][0,2][0,3][1,0][1,1][1,2][1,3]... │
└────────┘ └────────────────────────────────────────────┘
^ arr points here, arr+1 points 16 bytes later
KEY DIFFERENCES:
════════════════
int arr[3][4] │ int *arr[3] │ int (*arr)[4]
────────────────────────────────────┼───────────────┼───────────────
sizeof(arr) 48 bytes │ 24 bytes │ 8 bytes
Memory layout Contiguous │ Scattered │ Depends on target
Rows same size? Yes (fixed) │ No (jagged) │ Yes (fixed)
Can reassign arr? No (not lvalue)│ No │ Yes (is pointer)
Passing to func Decays │ Decays │ Stays same
$ ./array_navigator --cache "int arr[1000][1000]" --pattern row-major
================================================================================
CACHE ACCESS SIMULATION
================================================================================
Array: int arr[1000][1000]
Total Size: 4,000,000 bytes (3.8 MB)
Cache Line Size: 64 bytes (16 integers per line)
L1 Cache Size: 32 KB (512 cache lines)
ACCESS PATTERN: Row-Major (arr[i][j] with i outer loop)
Simulation:
Accessing row 0, column 0-999:
Load cache line @ 0x0000 → covers arr[0][0] to arr[0][15]
✓ Hit arr[0][0], ✓ Hit arr[0][1], ... ✓ Hit arr[0][15]
Load cache line @ 0x0040 → covers arr[0][16] to arr[0][31]
✓ Hit arr[0][16], ✓ Hit arr[0][17], ...
RESULTS:
Total accesses: 1,000,000
Cache loads: 62,500 (one per 16 elements)
Cache hits: 937,500 (15/16 = 93.75% hit rate)
COMPARISON WITH COLUMN-MAJOR:
Row-Major: ████████████████████████████████████████████████░░░░ 93.75%
Column-Major: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0.00%
Column-major would load a new cache line for EVERY access
(because arr[1][0] is 4000 bytes away from arr[0][0])!
PERFORMANCE IMPACT: Row-major is ~16x faster for this array size!
Questions to Guide Your Design
-
How will you parse array declarations like “int arr[3][4][5]” from user input?
-
How will you handle the display for very large arrays where printing every element is impractical?
-
How will you demonstrate the difference between static arrays and dynamically allocated “2D” arrays?
-
How will you show pointer arithmetic visually (arr+1 vs &arr[0]+1 vs &arr+1)?
-
For the cache simulation, what simplifying assumptions will you make?
-
How will you generate compilable example code that users can run to verify your explanations?
Thinking Exercise
Before coding, work through these exercises:
Exercise 1: Address Calculation
Given: int matrix[4][5] at base address 0x2000
Calculate by hand:
- &matrix[0][0] = ?
- &matrix[1][3] = ?
- &matrix[3][4] = ?
- What happens if you access matrix[4][0]? (out of bounds)
Exercise 2: Pointer Type Analysis
For each expression, determine the type:
int arr[3][4];
typeof(arr) = ? // The array itself
typeof(arr[0]) = ? // First row
typeof(arr[0][0])= ? // First element
typeof(&arr) = ? // Address of whole array
typeof(&arr[0]) = ? // Address of first row
typeof(arr + 1) = ? // Pointer to second row
Exercise 3: Function Parameter Decay
Which of these function signatures are equivalent?
void f1(int arr[10][20]);
void f2(int arr[][20]);
void f3(int (*arr)[20]);
void f4(int **arr);
void f5(int *arr[20]);
Exercise 4: Why This Crashes
int arr[3][4] = {{1,2,3,4}, {5,6,7,8}, {9,10,11,12}};
int **pp = (int**)arr;
printf("%d\n", pp[1][2]); // CRASH! Why?
Trace what the code actually does step by step.
Hints in Layers
Hint 1: Parsing Array Declarations
Use a simple regex or manual parsing for declarations:
// Parse "int arr[3][4][5]"
typedef struct {
char type[32]; // "int"
char name[64]; // "arr"
int dims[8]; // {3, 4, 5}
int ndims; // 3
} ArrayDecl;
// Parsing pseudocode:
// 1. Extract type (before first space)
// 2. Extract name (between space and first '[')
// 3. Extract dimensions (between each '[' and ']')
Hint 2: Calculating Addresses Generically
For N-dimensional array, the address formula generalizes:
// arr[i₀][i₁][i₂]...[iₙ₋₁] for dims[n]
size_t offset = 0;
size_t stride = elem_size;
for (int d = ndims - 1; d >= 0; d--) {
offset += indices[d] * stride;
stride *= dims[d];
}
address = base + offset;
Hint 3: Visualizing Large Arrays
For large arrays, show:
- First few elements of first row
- “…” indicator
- Last few elements of last row
- Summary statistics
if (rows > 5) {
show_rows(0, 2); // First 2 rows
printf("... (%d rows omitted) ...\n", rows - 4);
show_rows(rows-2, 2); // Last 2 rows
}
Hint 4: Cache Simulation
Simple cache simulation:
typedef struct {
size_t line_size; // e.g., 64 bytes
size_t num_lines; // e.g., 512
size_t *tags; // Which memory block is in each line
int *valid; // Is line valid?
} Cache;
int access(Cache *c, size_t address) {
size_t line_addr = address / c->line_size;
size_t index = line_addr % c->num_lines; // Direct-mapped
if (c->valid[index] && c->tags[index] == line_addr) {
return 1; // Hit
}
c->tags[index] = line_addr;
c->valid[index] = 1;
return 0; // Miss
}
Hint 5: Generating Example Code
Generate runnable C code that verifies concepts:
void generate_verification_code(ArrayDecl *decl, FILE *out) {
fprintf(out, "#include <stdio.h>\n\n");
fprintf(out, "int main(void) {\n");
fprintf(out, " %s %s", decl->type, decl->name);
for (int i = 0; i < decl->ndims; i++) {
fprintf(out, "[%d]", decl->dims[i]);
}
fprintf(out, ";\n\n");
fprintf(out, " printf(\"Base address: %%p\\n\", (void*)%s);\n", decl->name);
// ... more verification code
fprintf(out, "}\n");
}
Solution Architecture
High-Level Design
┌─────────────────────────────────────────────────────────────────────────────┐
│ ARRAY NAVIGATOR │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌───────────────┐│
│ │ Declaration │ │ Layout │ │ Type │ │ Cache ││
│ │ Parser │──▶│ Visualizer │ │ Comparator │ │ Simulator ││
│ └──────────────┘ └──────────────┘ └──────────────┘ └───────────────┘│
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ Report Generator ││
│ │ (ASCII tables, diagrams, code examples) ││
│ └─────────────────────────────────────────────────────────────────────────┘│
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Data Structures
typedef struct {
char element_type[32]; // "int", "double", etc.
size_t element_size; // sizeof(element_type)
int dimensions[8]; // e.g., {3, 4, 5} for [3][4][5]
int num_dimensions; // e.g., 3
char name[64]; // e.g., "matrix"
} ArrayDeclaration;
typedef struct {
size_t base_address;
int indices[8]; // e.g., {1, 2} for [1][2]
size_t calculated_address;
char formula[256]; // Human-readable formula
} AddressCalculation;
typedef struct {
size_t line_size;
size_t total_size;
size_t accesses;
size_t hits;
size_t misses;
} CacheStats;
typedef struct {
char signature[256];
char decayed_type[128];
char explanation[512];
} FunctionParameter;
Implementation Guide
Phase 1: Declaration Parser (Day 1)
Tasks:
- Parse “int arr[M][N]” style declarations
- Support various types (int, char, double, struct types)
- Calculate total size and per-dimension strides
Phase 2: Layout Visualizer (Days 2-3)
Tasks:
- Generate 2D grid view for 2D arrays
- Generate linear memory view
- Show address formula and examples
- Handle 3D and higher dimensions
Phase 3: Type Comparator (Days 4-5)
Tasks:
- Compare arr[M][N] vs arr[M] vs (arr)[N]
- Show sizeof differences
- Demonstrate pointer arithmetic differences
- Generate code examples
Phase 4: Cache Simulator (Days 6-7)
Tasks:
- Implement simple direct-mapped cache model
- Simulate row-major vs column-major access
- Display hit/miss statistics
- Show visual comparison
Testing Strategy
Test Cases
- 2D Array Layout
- int arr[3][4] - standard case
- char arr[2][8] - different element size
- double arr[10][10] - larger array
- Address Calculations
- Verify calculated addresses match actual addresses
- Test boundary elements (first, last, corners)
- Type Comparisons
- Verify sizeof calculations
- Verify pointer arithmetic steps
- Cache Simulation
- Known patterns with predictable hit rates
- Compare with theoretical analysis
Verification Approach
Generate C code that users can compile and run to verify:
// Generated verification code
int arr[3][4];
printf("Calculated &arr[1][2]: 0x%lx\n", CALCULATED);
printf("Actual &arr[1][2]: %p\n", (void*)&arr[1][2]);
// Should match!
Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Forgetting element size | Addresses wrong by factor of 4 | Multiply by sizeof(type) |
| Confusing row/column major | Wrong elements accessed | Verify formula matches C’s layout |
| Wrong pointer arithmetic | Unexpected address | Remember arr+1 moves by sizeof(*arr) |
| Type confusion | Compilation errors | Draw out the types carefully |
| Off-by-one in formulas | Wrong addresses at boundaries | Test with known small arrays |
Extensions & Challenges
Beginner Extensions
- Add support for typedef’d types
- Colorize output for different sections
- Export visualizations as HTML
Intermediate Extensions
- Support for struct arrays
- Visualization of padding in structs
- Interactive mode with REPL
Advanced Extensions
- Full cache hierarchy simulation (L1/L2/L3)
- SIMD access pattern analysis
- Integration with compiler output (compare with gcc -S)
Self-Assessment Checklist
Understanding
- I can calculate array element addresses by hand
- I know why int** is not the same as int[][N]
- I understand row-major layout and its cache implications
- I can write correct function signatures for 2D arrays
- I understand VLA parameter syntax
Implementation
- Layout visualization works for 2D and 3D arrays
- Address calculations are verified correct
- Type comparison shows meaningful differences
- Cache simulation produces reasonable results
The Interview Questions They’ll Ask
- “How are 2D arrays stored in memory in C?”
- Row-major order: complete rows stored consecutively
- arr[i][j] = base + (i * cols + j) * sizeof(element)
- All elements contiguous in memory
- “Why can’t you pass int arr[][] to a function?”
- Compiler needs column count for address calculation
- arr[i][j] requires knowing stride
- Must be: int arr[][COLS] or int (*arr)[COLS]
- “What’s the difference between int arr[N] and int (arr)[N]?”
- First: array of N pointers to int
- Second: pointer to array of N ints
- Different sizeof, different memory layout
- “Why is row-major traversal faster than column-major?”
- Cache lines load contiguous memory
- Row-major accesses sequential addresses
- Column-major jumps by row_size each access
Resources
Essential Reading
- Expert C Programming Ch. 9-10: Arrays and pointers in depth
- CS:APP Ch. 6.2-6.3: Memory hierarchy and cache optimization
- K&R Ch. 5.7-5.9: Multi-dimensional arrays
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Array/pointer relationship | Expert C Programming | Ch. 4, 9, 10 |
| Memory layout | Write Great Code Vol 1 | Ch. 4 |
| Cache optimization | CS:APP | Ch. 6 |
| C99 VLAs | C Programming: A Modern Approach | Ch. 8 |
Submission / Completion Criteria
Minimum Viable:
- Layout visualization for 2D arrays
- Address calculation with formula display
- Basic type comparison
Full Completion:
- Support for 2D, 3D arrays
- Type comparator with all three major types
- Cache simulation with statistics
- Generated verification code
Excellence:
- Interactive REPL mode
- Full cache hierarchy simulation
- HTML export with diagrams
- Support for struct arrays and padding
This project is part of the Expert C Programming Mastery series. For the complete learning path, see the project index.