Project 9: Pointer Arithmetic Visualizer

Build an interactive tool that visualizes how pointer arithmetic works, demonstrating sizeof scaling, type-aware increments, and the relationship between array indexing and pointer operations.


Quick Reference

Attribute        Value
Language         C
Difficulty       Intermediate (Level 3)
Time             1 Week
Chapters         Expert C Programming Ch. 4
Coolness         4/5 - Eye-Opening
Portfolio Value  Interview Ready

Learning Objectives

By completing this project, you will:

  1. Understand sizeof scaling: Know why, given int *p, p + 1 advances by sizeof(int) bytes (typically 4), not 1
  2. Master type-aware pointer arithmetic: Predict address changes for any pointer type
  3. Navigate void pointer arithmetic: Understand why it’s non-standard and how compilers handle it
  4. Calculate pointer differences correctly: Know why p2 - p1 gives element count, not bytes
  5. Identify array bounds violations: Recognize undefined behavior from out-of-bounds pointer arithmetic
  6. Connect indexing to pointer math: Prove that arr[i] equals *(arr + i)

The Core Question You’re Answering

“When I add 1 to a pointer, what actually happens at the byte level, and why?”

Pointer arithmetic is one of C’s most powerful features and one of its most misunderstood. The key insight is that pointer arithmetic is type-aware: adding 1 to a pointer advances it by sizeof(*pointer) bytes, not by 1 byte. This project makes this behavior visceral and intuitive through visualization.
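
To see the type-aware step for yourself, you can measure it rather than assume it. The sketch below is only an illustration: the helper names int_step_bytes, double_step_bytes, and print_step_sizes are hypothetical and not part of the project specification. Each helper forms arr + n, then casts both pointers to char * so the subtraction reports raw bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers (not part of the project spec): byte distance
 * covered by advancing a pointer n elements. The array must hold at
 * least n elements so the result is at most one past the end. */
size_t int_step_bytes(const int *arr, size_t n) {
    assert(arr != NULL);
    const int *q = arr + n;  /* compiler scales n by sizeof(int) */
    return (size_t)((const char *)q - (const char *)arr);
}

size_t double_step_bytes(const double *arr, size_t n) {
    assert(arr != NULL);
    const double *q = arr + n;  /* compiler scales n by sizeof(double) */
    return (size_t)((const char *)q - (const char *)arr);
}

/* Print the step sizes measured on the machine running the code. */
void print_step_sizes(void) {
    int ia[4] = {0};
    double da[4] = {0};
    printf("int *    + 1 moves %zu bytes\n", int_step_bytes(ia, 1));
    printf("double * + 1 moves %zu bytes\n", double_step_bytes(da, 1));
}
```

Calling print_step_sizes() from main reports the sizes for your platform; the exact numbers depend on sizeof(int) and sizeof(double) there.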


Concepts You Must Understand First

Concept                    Why It Matters                                Where to Learn
Basic pointer declaration  int *p declares a pointer to int              K&R Ch. 5
sizeof operator            Returns size in bytes                         Expert C Ch. 4
Memory addresses           Addresses are byte-granular                   CS:APP Ch. 3
Array decay                Arrays decay to pointers in most expressions  Expert C Ch. 4
Types and their sizes      int=4, char=1, double=8 on most systems       C Standard

Key Concepts Deep Dive

  1. The Scaling Rule
    • When you compute ptr + n, the resulting byte address is (address in ptr) + n * sizeof(*ptr)
    • This makes array indexing work: arr[i] becomes *(arr + i)
    • Scaling happens automatically based on pointer type
    • Expert C Programming Ch. 4
  2. Pointer Subtraction
    • ptr2 - ptr1 gives the number of ELEMENTS between them, not bytes
    • Result has type ptrdiff_t (signed integer)
    • Both pointers must point into the same array (or one past its end); otherwise the behavior is undefined
    • C Standard section 6.5.6
  3. Void Pointer Arithmetic
    • Standard C does not permit arithmetic on void * (it is a constraint violation, because void is an incomplete type)
    • GCC/Clang treat it as char * (1-byte increments) as an extension
    • Common source of portability bugs
    • GCC documentation, C Standard
  4. Array Indexing Equivalence
    • arr[i] is defined as *(arr + i)
    • Therefore arr[i] equals i[arr] (both valid!)
    • Indexing is plain pointer arithmetic, which is why C performs no bounds checks
    • K&R, Expert C Ch. 4
  5. Undefined Behavior in Pointer Arithmetic
    • Pointer arithmetic past array bounds is UB
    • Only exception: one past the end is allowed (but not dereferenced)
    • Relational comparison (<, >, <=, >=) of pointers into different arrays is UB; == and != are allowed
    • C Standard section 6.5.6

Theoretical Foundation

The Scaling Rule

┌─────────────────────────────────────────────────────────────────────────────┐
│                    POINTER ARITHMETIC SCALING                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  THE FUNDAMENTAL RULE:                                                      │
│  ─────────────────────                                                      │
│  For pointer p of type T*:                                                  │
│                                                                             │
│      p + n  =  (address in p) + n * sizeof(T)                               │
│                                                                             │
│  EXAMPLE WITH int* (sizeof(int) = 4):                                       │
│  ─────────────────────────────────────                                      │
│                                                                             │
│  int arr[5] = {10, 20, 30, 40, 50};                                         │
│  int *p = arr;  // p = 0x1000                                               │
│                                                                             │
│  Address:  0x1000    0x1004    0x1008    0x100C    0x1010                   │
│           ┌────────┬────────┬────────┬────────┬────────┐                    │
│  Memory:  │   10   │   20   │   30   │   40   │   50   │                    │
│           └────────┴────────┴────────┴────────┴────────┘                    │
│             ▲         ▲         ▲         ▲         ▲                       │
│             │         │         │         │         │                       │
│             p        p+1       p+2       p+3       p+4                      │
│         (0x1000)  (0x1004)  (0x1008)  (0x100C)  (0x1010)                    │
│                                                                             │
│  Notice: p+1 is 0x1004, not 0x1001!                                         │
│          The +1 means "one int forward" = 4 bytes                           │
│                                                                             │
│  DIFFERENT POINTER TYPES, DIFFERENT SCALING:                                │
│  ─────────────────────────────────────────────                              │
│                                                                             │
│  char *cp = 0x1000;    cp + 1 = 0x1001  (scale by 1)                        │
│  short *sp = 0x1000;   sp + 1 = 0x1002  (scale by 2)                        │
│  int *ip = 0x1000;     ip + 1 = 0x1004  (scale by 4)                        │
│  double *dp = 0x1000;  dp + 1 = 0x1008  (scale by 8)                        │
│  void *vp = 0x1000;    vp + 1 = 0x1001  (GCC extension! Not standard!)      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Pointer Subtraction

┌─────────────────────────────────────────────────────────────────────────────┐
│                    POINTER SUBTRACTION                                       │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  RULE: Subtracting two pointers gives ELEMENT COUNT, not bytes              │
│                                                                             │
│  int arr[5];                                                                │
│  int *p1 = &arr[1];  // 0x1004                                              │
│  int *p2 = &arr[4];  // 0x1010                                              │
│                                                                             │
│  Address:  0x1000    0x1004    0x1008    0x100C    0x1010                   │
│           ┌────────┬────────┬────────┬────────┬────────┐                    │
│           │ arr[0] │ arr[1] │ arr[2] │ arr[3] │ arr[4] │                    │
│           └────────┴────────┴────────┴────────┴────────┘                    │
│                      ▲                           ▲                          │
│                      │                           │                          │
│                     p1                          p2                          │
│                                                                             │
│  p2 - p1 = ?                                                                │
│                                                                             │
│  Bytes: 0x1010 - 0x1004 = 12 bytes                                          │
│  But result is: 3 (three int-sized elements)                                │
│                                                                             │
│  Formula: (p2 - p1) = (addr2 - addr1) / sizeof(*p1)                         │
│           = 12 / 4 = 3                                                      │
│                                                                             │
│  TYPE OF RESULT: ptrdiff_t (signed integer, platform-specific size)         │
│                                                                             │
│  DANGER: Subtracting pointers to different arrays is UNDEFINED BEHAVIOR!    │
│                                                                             │
│  int arr1[5], arr2[5];                                                      │
│  int *p = arr1, *q = arr2;                                                  │
│  ptrdiff_t diff = q - p;  // UB! Compilers may produce garbage              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
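
The element-versus-byte distinction in the box above can be checked directly. This is a minimal sketch, assuming both pointers point into the same array; elem_distance and byte_distance are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Element distance. Valid only when both pointers point into the same
 * array (or one past its end). */
ptrdiff_t elem_distance(const int *from, const int *to) {
    assert(from != NULL && to != NULL);
    return to - from;  /* result is scaled down by sizeof(int) */
}

/* Byte distance between the same two positions, via char * casts. */
ptrdiff_t byte_distance(const int *from, const int *to) {
    assert(from != NULL && to != NULL);
    return (const char *)to - (const char *)from;
}
```

With int arr[5], elem_distance(&arr[1], &arr[4]) is 3, while byte_distance for the same pair is 3 * sizeof(int). Swapping the arguments gives a negative result, which is why the type is signed. When printing a ptrdiff_t, use the %td conversion.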

Array Indexing Equivalence

┌─────────────────────────────────────────────────────────────────────────────┐
│                    ARRAY INDEXING AS POINTER ARITHMETIC                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  THE DEFINITION (from C Standard):                                          │
│  ─────────────────────────────────                                          │
│                                                                             │
│      arr[i]  is DEFINED as  *(arr + i)                                      │
│                                                                             │
│  This means:                                                                │
│  1. arr decays to pointer to first element                                  │
│  2. Add i (scaled by sizeof element)                                        │
│  3. Dereference the result                                                  │
│                                                                             │
│  CONSEQUENCE:                                                               │
│  ─────────────                                                              │
│  Since addition is commutative:                                             │
│                                                                             │
│      arr[i] = *(arr + i) = *(i + arr) = i[arr]                              │
│                                                                             │
│  Yes, i[arr] is valid C! (Never write this in real code.)                   │
│                                                                             │
│  PROOF BY EXAMPLE:                                                          │
│  ─────────────────                                                          │
│                                                                             │
│  int arr[3] = {100, 200, 300};                                              │
│                                                                             │
│  printf("%d\n", arr[1]);      // prints 200                                 │
│  printf("%d\n", *(arr + 1));  // prints 200                                 │
│  printf("%d\n", 1[arr]);      // prints 200                                 │
│  printf("%d\n", *(1 + arr));  // prints 200                                 │
│                                                                             │
│  All four are IDENTICAL to the compiler!                                    │
│                                                                             │
│  THE ASSEMBLY:                                                              │
│  ─────────────                                                              │
│                                                                             │
│  // For arr[i] where arr is at %rbx and i is in %eax:                       │
│  movslq %eax, %rax           // sign-extend i to 64-bit                     │
│  movl (%rbx,%rax,4), %eax    // load from base + i*4                        │
│                                                                             │
│  The compiler generates the SAME code for arr[i] and *(arr+i)               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Void Pointer Arithmetic (Non-Standard)

┌─────────────────────────────────────────────────────────────────────────────┐
│                    VOID POINTER ARITHMETIC                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  THE PROBLEM:                                                               │
│  ─────────────                                                              │
│  void *vp;                                                                  │
│  vp + 1 = ?                                                                 │
│                                                                             │
│  sizeof(void) is INVALID in standard C!                                     │
│  So what does vp + 1 scale by?                                              │
│                                                                             │
│  STANDARD C SAYS:                                                           │
│  ─────────────────                                                          │
│  Pointer arithmetic on void* is a constraint violation.                     │
│  The code should not compile (or at least warn).                            │
│                                                                             │
│  WHAT GCC/CLANG DO (Extension):                                             │
│  ───────────────────────────────                                            │
│  They treat sizeof(void) as 1, so void* arithmetic works like char*.        │
│                                                                             │
│  void *vp = (void*)0x1000;                                                  │
│  vp + 1;  // GCC: 0x1001 (scales by 1)                                      │
│           // Other compilers: error or warning                              │
│                                                                             │
│  COMMON USE CASE (and the portable fix):                                    │
│  ─────────────────────────────────────────                                  │
│                                                                             │
│  void *ptr = malloc(100);                                                   │
│                                                                             │
│  // Non-portable (GCC extension):                                           │
│  void *next = ptr + 10;  // Skip 10 bytes                                   │
│                                                                             │
│  // Portable:                                                               │
│  void *next = (char*)ptr + 10;  // Explicitly cast to char*                 │
│                                                                             │
│  DETECTION:                                                                 │
│  ──────────                                                                 │
│  gcc -Wpointer-arith    // Warns about void* arithmetic                     │
│  gcc -pedantic-errors   // Errors on non-standard extensions                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
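
One way to keep generic code portable is to hide the cast in a small helper, so no expression ever does arithmetic on a void * directly. The sketch below is an assumption about how you might structure this; byte_offset and elem_at are hypothetical names, not functions from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a void * by nbytes without relying on
 * the GCC/Clang extension. The caller must ensure the result stays
 * inside (or one past the end of) the same object. */
void *byte_offset(void *base, size_t nbytes) {
    assert(base != NULL);
    return (unsigned char *)base + nbytes;
}

/* Address of element i in a raw buffer whose elements are elem_size
 * bytes each, the way a generic routine would index it. */
void *elem_at(void *base, size_t elem_size, size_t i) {
    return byte_offset(base, elem_size * i);
}
```

For int arr[3], elem_at(arr, sizeof(int), 2) yields the same address as &arr[2], so the scaling that the compiler normally does for typed pointers is done by hand here.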

Out-of-Bounds Pointer Arithmetic

┌─────────────────────────────────────────────────────────────────────────────┐
│                    POINTER ARITHMETIC BOUNDS                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  VALID POINTER POSITIONS:                                                   │
│  ─────────────────────────                                                  │
│                                                                             │
│  int arr[5];                                                                │
│                                                                             │
│  Valid pointers:                                                            │
│   ┌─────────────────────────────────────────────────────────────┐           │
│   │ &arr[0] │ &arr[1] │ &arr[2] │ &arr[3] │ &arr[4] │ &arr[5]? │           │
│   └─────────────────────────────────────────────────────────────┘           │
│       ▲         ▲         ▲         ▲         ▲         ▲                   │
│       │         │         │         │         │         │                   │
│     Valid     Valid     Valid     Valid     Valid   "One past end"          │
│                                                   (valid to FORM,           │
│                                                   NOT to dereference)       │
│                                                                             │
│  RULE: You may form a pointer to "one past the last element"                │
│        but dereferencing it is UNDEFINED BEHAVIOR.                          │
│                                                                             │
│  UNDEFINED BEHAVIOR:                                                        │
│  ───────────────────                                                        │
│                                                                             │
│  int *p = arr + 6;    // UB: more than one past end                         │
│  int *q = arr - 1;    // UB: before the array                               │
│  *(&arr[5]);          // UB: dereferencing one-past-end                     │
│  arr[5] = 42;         // UB: dereferencing one-past-end                     │
│                                                                             │
│  WHY ONE-PAST-END IS ALLOWED:                                               │
│  ─────────────────────────────                                              │
│  It enables idiomatic C patterns like:                                      │
│                                                                             │
│  int *end = arr + 5;  // One past last element                              │
│  for (int *p = arr; p != end; p++) {                                        │
│      // Process *p                                                          │
│  }                                                                          │
│                                                                             │
│  The comparison p != end is valid; we never dereference 'end'.              │
│                                                                             │
│  WHAT COMPILERS MAY DO WITH UB:                                             │
│  ───────────────────────────────                                            │
│  - Produce expected result (by luck)                                        │
│  - Crash                                                                    │
│  - Format your hard drive (theoretically)                                   │
│  - Optimize away the entire code path                                       │
│  - Modern compilers actively exploit UB for optimization!                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
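
The one-past-the-end idiom from the box above, written as a complete function. This is a sketch under the assumption that the caller passes the true element count; sum_ints is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints using a one-past-the-end sentinel. `end` is formed and
 * compared against, but never dereferenced. */
long sum_ints(const int *arr, size_t n) {
    assert(arr != NULL);
    const int *end = arr + n;  /* valid to form when arr has n elements */
    long total = 0;
    for (const int *p = arr; p != end; p++) {
        total += *p;           /* p is always strictly before end here */
    }
    return total;
}
```

Passing n = 0 is fine: end equals arr, the loop body never runs, and nothing is dereferenced. Passing an n larger than the real array length makes arr + n itself undefined, before any element is read.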

Pointer Increment vs Address-of Increment

┌─────────────────────────────────────────────────────────────────────────────┐
│                    COMMON POINTER EXPRESSIONS                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Given: int arr[5]; int *p = arr;                                           │
│                                                                             │
│  EXPRESSION              MEANING                    RESULT TYPE             │
│  ──────────────────────────────────────────────────────────────────────     │
│  p                       Address of arr[0]          int*                    │
│  *p                      Value at arr[0]            int                     │
│  p + 1                   Address of arr[1]          int*                    │
│  *(p + 1)                Value at arr[1]            int                     │
│  p++                     Advance p, return old      int*                    │
│  ++p                     Advance p, return new      int*                    │
│  *p++                    Get *p, then advance p     int (value)             │
│  (*p)++                  Increment value at *p      int (old value)         │
│  *++p                    Advance p, then get *p     int (value)             │
│  ++*p                    Increment value at *p      int (new value)         │
│                                                                             │
│  THE TRICKY ONES:                                                           │
│  ─────────────────                                                          │
│                                                                             │
│  *p++ is parsed as *(p++), NOT (*p)++                                       │
│                                                                             │
│  Example:                                                                   │
│  int arr[3] = {10, 20, 30};                                                 │
│  int *p = arr;                                                              │
│                                                                             │
│  int a = *p++;   // a = 10, p now points to arr[1]                          │
│  int b = *p++;   // b = 20, p now points to arr[2]                          │
│  int c = *p++;   // c = 30, p now points past arr (one-past-end)            │
│                                                                             │
│  This idiom is common in string copying:                                    │
│  while (*dst++ = *src++);  // Copy until null terminator                    │
│                                                                             │
│  PRECEDENCE TABLE (relevant operators):                                     │
│  ───────────────────────────────────────                                    │
│  1. () []                 (highest)                                         │
│  2. ++ -- (postfix)                                                         │
│  3. ++ -- * & (prefix/unary)                                                │
│  4. + - (binary)                                                            │
│                                                                             │
│  So: *p++ = *(p++) because postfix ++ binds tighter than *                  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
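
The four look-alike expressions in the table can be run back to back on one array to see how they differ. The function names below are hypothetical, and the sketch assumes the caller supplies an output buffer with room for four ints.

```c
#include <assert.h>
#include <stddef.h>

/* Record the value each expression yields, in order. `out` must have
 * room for 4 ints. */
void demo_increment_forms(int *out) {
    assert(out != NULL);
    int arr[3] = {10, 20, 30};
    int *p = arr;
    out[0] = *p++;    /* yields 10; p now points to arr[1]           */
    out[1] = (*p)++;  /* yields 20; arr[1] becomes 21; p unchanged   */
    out[2] = *++p;    /* p now points to arr[2]; yields 30           */
    out[3] = ++*p;    /* arr[2] becomes 31; yields 31; p unchanged   */
}

/* The string-copy idiom with the assignment made explicit. The loop
 * copies each character, including the terminating '\0', then stops. */
char *copy_string(char *dst, const char *src) {
    assert(dst != NULL && src != NULL);
    char *start = dst;
    while ((*dst++ = *src++) != '\0') {
        /* body intentionally empty */
    }
    return start;
}
```

Writing the comparison against '\0' explicitly, with the extra parentheses, keeps compilers from warning that the assignment might have been meant as ==.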

Project Specification

What You Will Build

An interactive visualization tool that:

  1. Shows pointer values and memory at each step
  2. Traces pointer arithmetic operations
  3. Demonstrates the scaling rule
  4. Catches and explains undefined behavior
  5. Compares different pointer types side-by-side

Functional Requirements

  1. Basic Visualization (--trace):
    • Accept C-like expressions
    • Show before/after memory state
    • Display address calculations
  2. Type Comparison (--compare-types):
    • Show same operation on different pointer types
    • Display scaling factors
    • Highlight size differences
  3. Expression Analyzer (--analyze):
    • Parse expressions like *p++, *(p+i)
    • Show step-by-step evaluation
    • Explain precedence
  4. Bounds Checker (--check-bounds):
    • Detect out-of-bounds arithmetic
    • Warn about undefined behavior
    • Show valid pointer range
  5. Interactive Mode (--interactive):
    • REPL for exploring pointer operations
    • Maintain state between commands
    • Show memory visualization

Non-Functional Requirements

  • Educational output with clear explanations
  • Generates verifiable C code examples
  • Handles all standard pointer types
  • Warns about non-portable constructs (void* arithmetic)

Real World Outcome

When complete, you’ll have a tool like this:

$ ./ptr_viz --trace "int arr[5] = {10,20,30,40,50}; int *p = arr; p = p + 2;"

================================================================================
                    POINTER ARITHMETIC TRACE
================================================================================

INITIAL STATE:
──────────────

Declaration: int arr[5] = {10, 20, 30, 40, 50}
             int *p = arr

Memory Layout:
  Address    Value     Variable
  ────────   ─────     ────────
  0x1000     10        arr[0]
  0x1004     20        arr[1]
  0x1008     30        arr[2]
  0x100C     40        arr[3]
  0x1010     50        arr[4]

Pointer State:
  p = 0x1000 (points to arr[0], value = 10)
      └───────────▲

OPERATION: p = p + 2
─────────────────────

Step-by-step calculation:
  1. Current p = 0x1000
  2. Adding 2 to int* pointer
  3. Scale factor: sizeof(int) = 4 bytes
  4. Byte offset: 2 * 4 = 8 bytes
  5. New address: 0x1000 + 8 = 0x1008

AFTER OPERATION:
────────────────

Memory Layout (unchanged):
  Address    Value     Variable
  ────────   ─────     ────────
  0x1000     10        arr[0]
  0x1004     20        arr[1]
  0x1008     30        arr[2]  ◄── p now points here
  0x100C     40        arr[3]
  0x1010     50        arr[4]

Pointer State:
  p = 0x1008 (points to arr[2], value = 30)
              └───────────────────▲

VERIFICATION:
─────────────
  *p = 30 ✓
  p - arr = 2 (element offset)

================================================================================

$ ./ptr_viz --compare-types "ptr + 3" --base 0x1000

================================================================================
                    TYPE COMPARISON: ptr + 3
================================================================================

Starting address for all pointers: 0x1000

Type           sizeof    Scale    Address after +3
────────────   ──────    ─────    ────────────────
char *         1 byte    1        0x1000 + 3*1 = 0x1003
short *        2 bytes   2        0x1000 + 3*2 = 0x1006
int *          4 bytes   4        0x1000 + 3*4 = 0x100C
long *         8 bytes   8        0x1000 + 3*8 = 0x1018
float *        4 bytes   4        0x1000 + 3*4 = 0x100C
double *       8 bytes   8        0x1000 + 3*8 = 0x1018
void *         1 byte*   1*       0x1000 + 3*1 = 0x1003  ⚠️  GCC extension!

* void* arithmetic is non-standard; GCC treats as char*

VISUAL:
───────

        0x1000    0x1003    0x1006         0x100C              0x1018
           │         │         │              │                    │
char*      ●─────────●                                            │
           │         │                        │                    │
short*     ●─────────┼─────────●              │                    │
           │         │         │              │                    │
int*       ●─────────┼─────────┼──────────────●                    │
           │         │         │              │                    │
long*      ●─────────┼─────────┼──────────────┼────────────────────●
           │         │         │              │                    │
           Base    +3 char   +3 short      +3 int              +3 long

================================================================================

$ ./ptr_viz --analyze "*p++"

================================================================================
                    EXPRESSION ANALYSIS: *p++
================================================================================

Given: int *p pointing to some integer

PARSING:
────────

Expression: *p++

Operator precedence (high to low):
  1. ++ (postfix) - binds tightest
  2. * (dereference)

Therefore parsed as: *(p++)

STEP-BY-STEP EVALUATION:
────────────────────────

Assume: p = 0x1000, *p = 42

Step 1: Evaluate p++
        - Returns OLD value of p (0x1000)
        - Side effect: p becomes 0x1004 (advances by sizeof(int))

Step 2: Dereference the returned value
        - *(0x1000) = 42

RESULT:
───────
  Expression value: 42 (the value at the ORIGINAL location)
  Side effect: p is now 0x1004

COMMON MISTAKES:
────────────────
  ✗ Thinking *p is incremented (that would be (*p)++)
  ✗ Thinking the NEW location is dereferenced (that would be *++p)

COMPARISON TABLE:
─────────────────
  Expression    Value Returned    p after    Memory after
  ──────────    ──────────────    ───────    ────────────
  *p++          42 (old *p)       0x1004     unchanged
  (*p)++        42 (old *p)       0x1000     *p is now 43
  *++p          value at 0x1004   0x1004     unchanged
  ++*p          43 (new *p)       0x1000     *p is now 43

================================================================================

$ ./ptr_viz --check-bounds "int arr[5]; int *p = arr + 7;"

================================================================================
                    BOUNDS CHECK WARNING
================================================================================

Declaration: int arr[5]
             int *p = arr + 7

⚠️  UNDEFINED BEHAVIOR DETECTED!

ANALYSIS:
─────────

Array bounds:
  arr[0] at offset 0  (valid)
  arr[1] at offset 1  (valid)
  arr[2] at offset 2  (valid)
  arr[3] at offset 3  (valid)
  arr[4] at offset 4  (valid)
  arr[5] at offset 5  (ONE-PAST-END: valid to form, not to dereference)

Pointer calculation: arr + 7
  This is offset 7, which is 2 past the allowed limit.

         arr    arr+1  arr+2  arr+3  arr+4  arr+5  arr+6  arr+7
          │      │      │      │      │      │      │      │
  ────────●──────●──────●──────●──────●──────○──────✗──────✗────────
          │◄───── valid array ───────▶│     │      │      │
                                      │     │      │      │
                                   one-past │    OUT OF BOUNDS!
                                   (OK for  │
                                   pointer, │
                                   not *ptr)│

CONSEQUENCES:
─────────────
  - The C standard says this is undefined behavior
  - The compiler may:
    • Produce "expected" result (by luck)
    • Crash when dereferenced
    • Optimize away code that depends on this
    • Do literally anything

RECOMMENDATION:
───────────────
  Always keep pointers within array bounds (or at one-past-end).
  Use array length constants or sizeof to stay safe:

    #define ARRLEN(arr) (sizeof(arr) / sizeof((arr)[0]))
    for (int *p = arr; p < arr + ARRLEN(arr); p++) { ... }

================================================================================
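
A table like the --compare-types output above could come from a small table-driven loop. This is only a sketch of one possible approach: TypeRow, ROWS, scaled_address, and print_comparison are hypothetical names, simulated addresses are plain 64-bit integers rather than real pointers, and the sizes are whatever sizeof reports on the machine that builds the tool.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    const char *name;  /* label shown in the table */
    size_t size;       /* sizeof the pointed-to type on the build machine */
} TypeRow;

static const TypeRow ROWS[] = {
    {"char *",   sizeof(char)},
    {"short *",  sizeof(short)},
    {"int *",    sizeof(int)},
    {"long *",   sizeof(long)},
    {"float *",  sizeof(float)},
    {"double *", sizeof(double)},
};

/* Apply the scaling rule to a simulated address. Unsigned arithmetic
 * wraps, so a negative n moves the address backward as expected. */
uint64_t scaled_address(uint64_t base, size_t elem_size, int64_t n) {
    assert(elem_size > 0);
    return base + (uint64_t)n * (uint64_t)elem_size;
}

/* Print one line per type: size, calculation, resulting address. */
void print_comparison(uint64_t base, int64_t n) {
    for (size_t i = 0; i < sizeof ROWS / sizeof ROWS[0]; i++) {
        printf("%-10s %zu byte(s)   0x%" PRIX64 " + %" PRId64 "*%zu = 0x%" PRIX64 "\n",
               ROWS[i].name, ROWS[i].size, base, n, ROWS[i].size,
               scaled_address(base, ROWS[i].size, n));
    }
}
```

Working on integers instead of real pointers means the tool can display results such as base + 3 * 8 without ever forming a pointer outside a real object. A void * row would need special handling, since sizeof(void) is not valid in standard C.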

Questions to Guide Your Design

  1. How will you simulate memory without actually allocating at arbitrary addresses?

  2. How will you parse C-like expressions to understand the operations?

  3. How will you handle the difference between compile-time and runtime knowledge?

  4. For the interactive mode, how will you track variable state?

  5. How will you detect undefined behavior while still showing what would happen?

  6. How will you generate verifiable C code from visualizations?
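
For question 5, one option is to do the bounds reasoning on element offsets rather than on real pointers, so the tool can describe undefined behavior without committing it. The sketch below assumes that design; PtrStatus, classify_offset, and status_text are hypothetical names.

```c
#include <stddef.h>

typedef enum {
    PTR_IN_BOUNDS,     /* valid to form and to dereference */
    PTR_ONE_PAST_END,  /* valid to form, not to dereference */
    PTR_OUT_OF_BOUNDS  /* forming the pointer is already undefined */
} PtrStatus;

/* Classify element offset `index` relative to an array of `len`
 * elements. Only integers are compared, never pointers. */
PtrStatus classify_offset(long index, size_t len) {
    if (index < 0) {
        return PTR_OUT_OF_BOUNDS;
    }
    if ((size_t)index < len) {
        return PTR_IN_BOUNDS;
    }
    if ((size_t)index == len) {
        return PTR_ONE_PAST_END;
    }
    return PTR_OUT_OF_BOUNDS;
}

/* Short explanation suitable for the tool's report. */
const char *status_text(PtrStatus s) {
    switch (s) {
    case PTR_IN_BOUNDS:    return "valid";
    case PTR_ONE_PAST_END: return "one past end: valid to form, not to dereference";
    default:               return "out of bounds: undefined behavior";
    }
}
```

For int arr[5], offsets 0 through 4 classify as valid, 5 as one past the end, and both 7 and -1 as out of bounds. The tool can still compute and print the simulated address for an out-of-bounds offset, because that calculation is ordinary integer arithmetic.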


Thinking Exercise

Before coding, work through these exercises on paper:

Exercise 1: Address Calculation

Given:

double arr[4];  // at address 0x2000, sizeof(double) = 8
double *p = arr + 1;
double *q = &arr[3];

Calculate:

  • Value of p (the address it holds): ?
  • Value of q (the address it holds): ?
  • Value of (q - p): ?
  • Value of (char *)q - (char *)p: ?

Exercise 2: Expression Evaluation

Given:

int arr[4] = {10, 20, 30, 40};
int *p = arr + 1;  // p points to arr[1]

Evaluate each expression independently, starting from the state above each time (give value AND side effects):

  • *p
  • *p++
  • *(p++)
  • (*p)++
  • *++p
  • ++*p
  • *(p + 1)
  • *(++p + 1)

Exercise 3: Void Pointer Puzzle

int arr[3] = {0x11223344, 0x55667788, 0x99AABBCC};
void *vp = arr;
char *cp = (char*)arr;

// On little-endian, what are:
*(int*)vp           = ?
*(int*)(vp + 4)     = ?  // GCC extension
*(char*)vp          = ?
*(cp + 1)           = ?

Exercise 4: Bounds Analysis

int arr[10];
int *p = arr;

// Which of these are valid? Mark V=valid, U=undefined:
p + 0    = ?
p + 9    = ?
p + 10   = ?
p + 11   = ?
p - 1    = ?
&arr[10] = ?
arr[10]  = ?

Hints in Layers

Hint 1: Simulating Memory

Create a simple memory model:

typedef struct {
    size_t address;     // Simulated address
    size_t size;        // Size of allocation
    char *data;         // Actual storage
    char *name;         // Variable name
} MemoryRegion;

typedef struct {
    MemoryRegion regions[100];
    int count;
    size_t next_address;  // Next "address" to assign
} SimulatedMemory;

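// Allocate a new region at a simulated address.
// Needs <stdlib.h> (calloc, free) and <string.h> (strdup, a POSIX function).
// Returns (size_t)-1 if the region table is full or allocation fails.
size_t sim_alloc(SimulatedMemory *mem, const char *name, size_t size) {
    if (mem->count >= 100) return (size_t)-1;
    MemoryRegion *r = &mem->regions[mem->count];
    size_t addr = mem->next_address;
    r->address = addr;
    r->size = size;
    r->data = calloc(1, size);
    r->name = strdup(name);
    if (!r->data || !r->name) {
        free(r->data);
        free(r->name);
        return (size_t)-1;
    }
    mem->count++;
    mem->next_address = (addr + size + 15) & ~(size_t)15;  // Round up to 16 bytes
    return addr;
}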

Hint 2: Parsing Expressions

For simple expression parsing, use recursive descent:

// Simplified: handle ptr, ptr+n, ptr-n, *ptr, ptr++
typedef enum {
    TOK_IDENT, TOK_NUMBER, TOK_PLUS, TOK_MINUS,
    TOK_STAR, TOK_PLUSPLUS, TOK_LPAREN, TOK_RPAREN
} TokenType;

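typedef struct Expr Expr;  // Forward declaration so nodes can point to other nodes

struct Expr {
    enum { EXPR_IDENT, EXPR_NUM, EXPR_ADD, EXPR_SUB, EXPR_DEREF, EXPR_POSTINC } type;
    union {
        char *ident;                            // EXPR_IDENT
        long number;                            // EXPR_NUM
        struct { Expr *left, *right; } binary;  // EXPR_ADD, EXPR_SUB
        Expr *unary;                            // EXPR_DEREF, EXPR_POSTINC
    };                                          // Anonymous union: requires C11
};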

Hint 3: Type-Aware Calculations

typedef struct {
    char *base_type;   // "int", "char", "double", etc.
    int pointer_level; // 0 = value, 1 = pointer, 2 = pointer-to-pointer
    size_t base_size;  // sizeof(base_type)
} PointerType;

size_t get_scale_factor(PointerType *type) {
    if (type->pointer_level == 0) return 0;  // Can't do arithmetic on value
    if (type->pointer_level > 1) return sizeof(void*);  // Pointer to pointer
    return type->base_size;  // Regular pointer
}

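// n may be negative: converting it to size_t wraps modulo 2^N, and adding
// the wrapped product to addr still yields the correct lower address.
size_t add_to_pointer(size_t addr, int n, PointerType *type) {
    return addr + (size_t)n * get_scale_factor(type);
}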

Hint 4: ASCII Visualization

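// Draw one row of 4-byte cells. Every cell is 8 columns wide so that the
// marker row lines up with the address row beneath it.
void draw_memory_bar(size_t start, size_t end, size_t highlight) {
    printf("  ");
    for (size_t addr = start; addr < end; addr += 4) {
        // ▼ marks the cell whose address equals highlight
        fputs(addr == highlight ? "▼───────" : "────────", stdout);
    }
    printf("\n");

    printf("  ");
    for (size_t addr = start; addr < end; addr += 4) {
        printf("│%6zx ", addr);  // "│" + 6 hex digits + space = 8 columns
    }
    printf("│\n");               // Close the last cell
}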

Solution Architecture

Data Structures

// Represent a pointer variable
typedef struct {
    char name[32];
    char type[32];          // "int*", "char*", etc.
    size_t address;         // Current value (address it points to)
    size_t element_size;    // sizeof pointed-to type
} Pointer;

// Represent an array in memory
typedef struct {
    char name[32];
    char element_type[32];
    size_t base_address;
    size_t element_size;
    size_t count;
    char *data;             // Actual values
} Array;

// Visualization state
typedef struct {
    Array arrays[10];
    int array_count;
    Pointer pointers[10];
    int pointer_count;
} VizState;

Key Functions

// Calculate new address after pointer arithmetic
size_t ptr_add(Pointer *p, int offset);

// Check if pointer is in valid bounds, counting one-past-end as valid
// (bool requires <stdbool.h>)
bool check_bounds(VizState *state, Pointer *p);

// Parse and evaluate expression
int eval_expression(VizState *state, const char *expr);

// Generate visualization output
void visualize_state(VizState *state, FILE *out);

// Generate verifiable C code
void generate_c_code(VizState *state, const char *expr, FILE *out);

Testing Strategy

Test Cases

  1. Basic Arithmetic
    • p + 1, p - 1 for various types
    • Verify address calculation matches formula
  2. Complex Expressions
    • *p++, *(p+1), ++*p
    • Verify both value and side effects
  3. Bounds Checking
    • Exactly at bounds (valid)
    • One past (valid for pointer)
    • Beyond (undefined)
  4. Type Variations
    • All standard types (char through long double)
    • void* (with warning)
    • Struct pointers

Verification

Generate C code that can be compiled and run:

// Generated verification
#include <stdio.h>
int main(void) {
    int arr[5] = {10, 20, 30, 40, 50};
    int *p = arr;

    printf("Before: p = %p, *p = %d\n", (void*)p, *p);
    p = p + 2;
    printf("After p+2: p = %p, *p = %d\n", (void*)p, *p);

    // Expected from visualization:
    // p should advance by 8 bytes (2 * sizeof(int))
    // *p should now be 30
}

Common Pitfalls & Debugging

Pitfall              | Symptom                                           | Solution
Forgetting scaling   | Addresses off by a factor of the element size     | Multiply the offset by sizeof
Wrong precedence     | Unexpected expression results                     | Use parentheses to clarify
Void pointer math    | Compile errors (strict) or byte-sized steps (GCC) | Cast to char* explicitly
Off-by-one in bounds | False positives on one-past-end                   | Remember one-past is valid
Pointer subtraction  | Getting bytes instead of elements                 | Result is already scaled (in elements)

Extensions & Challenges

Beginner Extensions

  • Colorize output for different variable types
  • Add support for struct pointers
  • Generate GDB commands to verify

Intermediate Extensions

  • Support for pointer-to-pointer
  • Handle function pointers
  • Interactive debugger-like stepping

Advanced Extensions

  • Parse actual C source files
  • Integration with valgrind output
  • Detect undefined behavior statically

Self-Assessment Checklist

Understanding

  • I can calculate pointer addresses by hand
  • I know the scaling factor for every basic type
  • I understand why void* arithmetic is non-standard
  • I can trace expressions like *p++ step by step
  • I know what pointer operations are undefined behavior

Implementation

  • Basic visualization works for int pointers
  • All standard types supported
  • Expression parser handles common cases
  • Bounds checking detects violations
  • Generated C code verifies results

The Interview Questions They’ll Ask

  1. “What happens when you add 1 to an int pointer?”
    • Address increases by sizeof(int), typically 4 bytes
    • Scaling is automatic based on pointer type
    • This makes arr[i] work as *(arr+i)
  2. “What’s the difference between *p++ and (*p)++?”
    • *p++: yields the value p points to, then advances the pointer
    • (*p)++: increments the value pointed to, pointer unchanged
    • Postfix ++ has higher precedence than unary *
  3. “What’s undefined behavior with pointers?”
    • Dereferencing null or uninitialized pointers
    • Pointer arithmetic that lands before the array or beyond one-past-the-end
    • Relational comparison (<, >, <=, >=) of pointers into different arrays
    • Dereferencing a one-past-end pointer
  4. “Why is void* arithmetic a problem?”
    • void is an incomplete type, so standard C gives it no size to scale by
    • As an extension, GCC treats sizeof(void) as 1, so void* steps like char*
    • Code using it isn’t portable
    • Solution: cast to char* explicitly

Resources

Essential Reading

  • Expert C Programming Ch. 4: Pointers and arrays
  • K&R Ch. 5: Pointers and Arrays
  • C Standard 6.5.6: Additive operators (pointer arithmetic)

Books That Will Help

Topic                  | Book                 | Chapter
Pointer fundamentals   | K&R                  | Ch. 5
Deep pointer semantics | Expert C Programming | Ch. 4
Memory and addresses   | CS:APP               | Ch. 3
Undefined behavior     | Effective C          | Ch. 8

Submission / Completion Criteria

Minimum Viable:

  • Basic trace visualization for int pointers
  • Address calculation display
  • Type comparison for basic types

Full Completion:

  • All standard types supported
  • Expression analyzer for complex expressions
  • Bounds checking with warnings
  • Interactive mode

Excellence:

  • Parse real C expressions
  • Integration with actual memory (via /proc)
  • Web interface for visualization
  • Detect UB statically

This project is part of the Expert C Programming Mastery series. For the complete learning path, see the project index.