Project 1: Memory Inspector Tool

The Core Question: “What IS memory? Where do my variables actually live, and how can I see them?”

Project Overview

Attribute	Value
Difficulty	Intermediate
Time Estimate	Weekend (8-16 hours)
Language	C
Prerequisites	Basic C syntax, compiling with gcc/clang
Main Book	“Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron

Learning Objectives

By completing this project, you will:

Understand memory as numbered bytes - Not abstract “variables” but actual addresses in memory
Visualize stack vs heap - See how local variables and malloc’d memory occupy different regions
Master pointer semantics - Know exactly what &x, *p, and p + 1 mean at the hardware level
Use debuggers effectively - Verify your understanding with lldb/gdb
Develop memory intuition - Instinctively think of memory as a big array of bytes

Theoretical Foundation

What Memory Actually Is

At the hardware level, your computer’s RAM is a giant array of bytes. Each byte has:

An address: A number from 0 to (RAM_SIZE - 1)
A value: An 8-bit number (0-255)

When you write int x = 42; in C, you’re saying:

“Reserve 4 consecutive bytes somewhere, and store the binary representation of 42 in them.”

Memory Address    Contents (hex)    Contents (decimal)
0x7ffeefbff4ac    2A                42    <- x lives here (4 bytes: 2A 00 00 00)
0x7ffeefbff4ad    00                0
0x7ffeefbff4ae    00                0
0x7ffeefbff4af    00                0

The Address-of Operator (&)

The & operator returns the address where a variable is stored:

int x = 42;
int *p = &x;  // p contains the ADDRESS of x

// If x is at 0x7ffeefbff4ac:
// - x contains 42 (the VALUE)
// - &x returns 0x7ffeefbff4ac (the ADDRESS)
// - p contains 0x7ffeefbff4ac (same as &x)
// - *p returns 42 (the value AT that address)

The Process Memory Layout

When your program runs, the operating system creates a virtual address space:

High addresses (0xFFFFFFFF...)
┌────────────────────────────┐
│         Kernel Space       │  ← OS code (you can't touch this)
├────────────────────────────┤
│                            │
│          Stack             │  ← Local variables, return addresses
│            ↓               │    GROWS DOWNWARD
│                            │
│         (empty)            │
│                            │
│            ↑               │
│          Heap              │  ← malloc'd memory
│                            │    GROWS UPWARD
├────────────────────────────┤
│          BSS               │  ← Uninitialized globals
├────────────────────────────┤
│          Data              │  ← Initialized globals
├────────────────────────────┤
│          Text              │  ← Your compiled code (read-only)
└────────────────────────────┘
Low addresses (0x00000000...)

Stack vs Heap: The Key Distinction

Aspect	Stack	Heap
Allocation	Automatic (when function called)	Manual (`malloc()`)
Deallocation	Automatic (when function returns)	Manual (`free()`)
Speed	Very fast (just move stack pointer)	Slower (allocator overhead)
Size	Limited (~8MB default on Linux)	Limited by available virtual memory (RAM plus swap)
Growth	Downward (toward lower addresses)	Upward (toward higher addresses)
Typical addresses	High (0x7fff… on x86-64)	Lower (0x6000… on macOS; varies by platform)

Why Stack Grows Downward

When you call a function, a new “stack frame” is pushed:

void bar() {
    int z = 30;    // Lives at lower address than y
}

void foo() {
    int y = 20;    // Lives at lower address than x
    bar();
}

int main() {
    int x = 10;    // Lives at high address
    foo();
}

Stack during bar():
┌─────────────────────┐ High addresses
│   main's frame      │
│   int x = 10        │  ← 0x7ffeefbff4bc
├─────────────────────┤
│   foo's frame       │
│   int y = 20        │  ← 0x7ffeefbff49c
├─────────────────────┤
│   bar's frame       │
│   int z = 30        │  ← 0x7ffeefbff47c
└─────────────────────┘ Low addresses (stack grows down)

Pointer Arithmetic

C pointers are “type-aware”—arithmetic moves by the size of the pointed-to type:

int arr[3] = {10, 20, 30};
int *p = arr;

// If p points to address 0x1000:
// p + 0  →  0x1000  →  arr[0] = 10
// p + 1  →  0x1004  →  arr[1] = 20  (moved by 4 bytes = sizeof(int))
// p + 2  →  0x1008  →  arr[2] = 30

This is why char *p and int *p behave differently:

char *p: p + 1 moves by 1 byte
int *p: p + 1 moves by sizeof(int) bytes (4 on most platforms)
double *p: p + 1 moves by sizeof(double) bytes (typically 8)

Endianness: How Multi-Byte Values Are Stored

On x86/x64 (little-endian), the least significant byte comes first:

int x = 0x12345678;

Memory layout (little-endian):
Address   Value
0x1000    0x78    ← Least significant byte first
0x1001    0x56
0x1002    0x34
0x1003    0x12    ← Most significant byte last

Project Specification

What You’re Building

A command-line tool that visualizes the memory layout of a C program, showing:

Stack variables and their addresses
Heap allocations and their addresses
How addresses change during function calls
Raw byte contents of variables
(Optional) Memory corruption demonstrations

Core Features

Feature 1: Stack Variable Visualization

$ ./memory_inspector --stack
[STACK VARIABLES]
Variable 'x' (int):
  Address: 0x7ffeefbff4ac
  Value: 42
  Size: 4 bytes
  Raw bytes: 2a 00 00 00

Feature 2: Heap Allocation Visualization

$ ./memory_inspector --heap
[HEAP ALLOCATIONS]
Pointer 'p' points to:
  Address: 0x600000004000
  Value: 100
  Size: 4 bytes
  Location: HEAP

Feature 3: Stack Frame Inspection

$ ./memory_inspector --frames
[STACK FRAME VISUALIZATION]
Calling sequence: main() → foo() → bar()

In bar(): z at 0x7ffeefbff47c = 30
In foo(): y at 0x7ffeefbff49c = 20
In main(): x at 0x7ffeefbff4bc = 10

Notice: Addresses DECREASE as we go deeper!

Feature 4: Raw Byte Dump

$ ./memory_inspector --bytes
Integer: 0x12345678
Byte-by-byte (little-endian):
  Byte 0: 0x78 (least significant)
  Byte 1: 0x56
  Byte 2: 0x34
  Byte 3: 0x12 (most significant)

Solution Architecture

Module Design

memory_inspector/
├── main.c              # Entry point, argument parsing
├── stack_demo.c        # Stack visualization functions
├── heap_demo.c         # Heap allocation demonstrations
├── frame_demo.c        # Stack frame hierarchy
├── bytes_demo.c        # Raw byte inspection
├── utils.c             # Printing utilities
├── utils.h             # Shared declarations
└── Makefile

Key Data Structures

// For tracking memory regions
typedef enum {
    REGION_STACK,
    REGION_HEAP,
    REGION_BSS,
    REGION_DATA,
    REGION_TEXT,
    REGION_UNKNOWN
} MemoryRegion;

// For describing a variable's memory location
typedef struct {
    const char *name;
    void *address;
    size_t size;
    const char *type_name;
    MemoryRegion region;
} MemoryInfo;

Core Functions to Implement

// Determine which memory region an address belongs to
MemoryRegion classify_address(void *addr);

// Print variable information
void inspect_variable(const char *name, void *addr, size_t size, const char *type);

// Dump raw bytes of a variable
void dump_bytes(void *addr, size_t size);

// Demonstrate stack frame hierarchy
void demonstrate_stack_frames(void);

// Show heap allocation behavior
void demonstrate_heap(void);

Implementation Guide

Phase 1: Basic Address Printing (2-3 hours)

Goal: Print the address of a single variable.

Start with the simplest possible program:

#include <stdio.h>

int main(void) {
    int x = 42;
    printf("x is at address %p, value = %d\n", (void*)&x, x);
    return 0;
}

Checkpoint Questions:

What format does %p print in? (Hexadecimal)
Why cast to (void*)? (%p requires a void * argument; passing any other pointer type is technically undefined behavior)
Run it 3 times. Does the address change? (Usually yes, due to ASLR)

Extension: Add more variables and observe their relative positions:

int a = 1;
int b = 2;
int c = 3;
printf("a: %p, b: %p, c: %p\n", (void*)&a, (void*)&b, (void*)&c);
// Observe the relative positions. The compiler chooses the order of locals
// within a single frame, so these addresses may or may not decrease.

Phase 2: Stack vs Heap Comparison (2-3 hours)

Goal: Show the difference between stack and heap addresses.

#include <stdio.h>
#include <stdlib.h>

void compare_stack_heap(void) {
    int stack_var = 100;
    int *heap_ptr = malloc(sizeof(int));
    if (heap_ptr == NULL) {
        return;  // allocation failed; nothing to compare
    }
    *heap_ptr = 200;

    printf("Stack variable at: %p\n", (void*)&stack_var);
    printf("Heap allocation at: %p\n", (void*)heap_ptr);

    // On many 64-bit systems the stack address is much higher:
    // Stack: 0x7fff... (high addresses)
    // Heap:  0x6000... (lower addresses; exact values vary by platform)

    free(heap_ptr);
}

Key Insight: On a given platform you can often tell whether memory is stack or heap by looking at the address prefix. These prefixes are platform conventions, not guarantees:

Stack addresses typically start with 0x7ff... on 64-bit x86 Linux/macOS
Heap addresses are typically lower, for example 0x6... on macOS

Phase 3: Function Call Stack Demonstration (2-3 hours)

Goal: Visualize how function calls create stack frames.

void bar(void) {
    int z = 30;
    printf("    In bar(): z at %p = %d\n", (void*)&z, z);
}

void foo(void) {
    int y = 20;
    printf("  In foo(): y at %p = %d\n", (void*)&y, y);
    bar();
    printf("  Back in foo()\n");
}

int main(void) {
    int x = 10;
    printf("In main(): x at %p = %d\n", (void*)&x, x);
    foo();
    printf("Back in main()\n");
    return 0;
}

Expected Output:

In main(): x at 0x7ffeefbff4bc = 10
  In foo(): y at 0x7ffeefbff49c = 20
    In bar(): z at 0x7ffeefbff47c = 30
  Back in foo()
Back in main()

Estimate the frame spacing: 0x7ffeefbff4bc - 0x7ffeefbff49c = 0x20 = 32 bytes between x in main and y in foo. Your addresses and spacing will differ.

Phase 4: Raw Byte Inspection (2-3 hours)

Goal: See exactly how multi-byte values are stored.

void dump_bytes(void *ptr, size_t size) {
    unsigned char *bytes = (unsigned char *)ptr;
    for (size_t i = 0; i < size; i++) {
        printf("  Byte %zu at %p: 0x%02x\n", i, (void*)(bytes + i), bytes[i]);
    }
}

int main(void) {
    int x = 0x12345678;
    printf("Integer 0x%08x at %p:\n", x, (void*)&x);
    dump_bytes(&x, sizeof(x));
    return 0;
}

Expected Output (on little-endian system):

Integer 0x12345678 at 0x7ffeefbff4ac:
  Byte 0 at 0x7ffeefbff4ac: 0x78
  Byte 1 at 0x7ffeefbff4ad: 0x56
  Byte 2 at 0x7ffeefbff4ae: 0x34
  Byte 3 at 0x7ffeefbff4af: 0x12

Phase 5: Struct Padding Demonstration (2-3 hours)

Goal: See how compilers add padding for alignment.

#include <stddef.h>   // offsetof
#include <stdio.h>

// Reuses dump_bytes() from Phase 4.
struct Padded {
    char a;     // 1 byte
    int b;      // 4 bytes
    char c;     // 1 byte
};

int main(void) {
    struct Padded p = {'A', 100, 'B'};

    printf("sizeof(struct Padded) = %zu\n", sizeof(struct Padded));
    printf("Expected without padding: %zu\n", sizeof(char) + sizeof(int) + sizeof(char));

    printf("\nField addresses:\n");
    printf("  a at offset %zu: %p\n", offsetof(struct Padded, a), (void*)&p.a);
    printf("  b at offset %zu: %p\n", offsetof(struct Padded, b), (void*)&p.b);
    printf("  c at offset %zu: %p\n", offsetof(struct Padded, c), (void*)&p.c);

    printf("\nRaw bytes:\n");
    dump_bytes(&p, sizeof(p));

    return 0;
}

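Expected Output (typical x86-64; addresses omitted and byte annotations added by hand; padding bytes are unspecified and may not be zero):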

sizeof(struct Padded) = 12
Expected without padding: 6

Field addresses:
  a at offset 0
  b at offset 4
  c at offset 8

Raw bytes:
  Byte 0: 0x41 ('A')
  Byte 1: 0x00 (padding)
  Byte 2: 0x00 (padding)
  Byte 3: 0x00 (padding)
  Byte 4: 0x64 (100, least significant)
  Byte 5: 0x00
  Byte 6: 0x00
  Byte 7: 0x00
  Byte 8: 0x42 ('B')
  Byte 9: 0x00 (padding)
  Byte 10: 0x00 (padding)
  Byte 11: 0x00 (padding)

Testing Strategy

Test 1: Address Consistency

Run the program multiple times and verify:

Stack addresses change (ASLR)
Relative positions within a function remain consistent
Stack grows downward (addresses decrease with depth)

Test 2: Stack vs Heap Verification

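#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

void test_stack_vs_heap(void) {
    int stack_var;
    int *heap_ptr = malloc(sizeof(int));
    assert(heap_ptr != NULL);

    // On typical Linux/macOS layouts the stack sits above the heap.
    // This is a platform observation, not a guarantee of the C standard.
    assert((uintptr_t)&stack_var > (uintptr_t)heap_ptr);

    free(heap_ptr);
    printf("Stack vs Heap test: PASS\n");
}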

Test 3: Endianness Verification

void test_endianness(void) {
    int x = 0x01;
    unsigned char *bytes = (unsigned char*)&x;

    if (bytes[0] == 0x01) {
        printf("System is little-endian (e.g. x86/x64)\n");
    } else {
        printf("System is big-endian\n");
    }
}

Test 4: Using lldb for Verification

$ clang -g memory_inspector.c -o memory_inspector
$ lldb ./memory_inspector
(lldb) breakpoint set --name main
(lldb) run
(lldb) frame variable         # Show local variables
(lldb) memory read &x         # Show raw bytes at x's address
(lldb) register read rsp      # Show stack pointer (x86-64; on arm64 use: register read sp)

Common Pitfalls and Debugging Tips

Pitfall 1: Forgetting (void*) Cast with %p

// WRONG - undefined behavior
printf("%p\n", &x);

// CORRECT
printf("%p\n", (void*)&x);

Pitfall 2: Confusing & and *

int x = 42;
int *p = &x;

// &x  = address of x     (a number like 0x7fff...)
// x   = value of x       (42)
// p   = address of x     (same as &x)
// *p  = value at address p (42, same as x)
// &p  = address of p     (different from &x!)

Pitfall 3: Returning Pointer to Local Variable

// WRONG - undefined behavior!
int* bad_function(void) {
    int local = 42;
    return &local;  // local dies when function returns!
}

// CORRECT - allocate on heap
int* good_function(void) {
    int *ptr = malloc(sizeof(int));
    if (ptr == NULL) {
        return NULL;  // allocation failed; caller must check
    }
    *ptr = 42;
    return ptr;  // caller must free
}

Debugging with AddressSanitizer

$ clang -fsanitize=address -g memory_inspector.c -o memory_inspector
$ ./memory_inspector

AddressSanitizer will catch:

Use-after-free
Buffer overflows
Stack use after return (some toolchain versions need ASAN_OPTIONS=detect_stack_use_after_return=1 at run time)

Extensions and Challenges

Challenge 1: Memory Region Classifier

Implement a function that determines which region an address belongs to:

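MemoryRegion classify_address(void *addr) {
    // Use heuristics based on address ranges
    // Stack: 0x7fff... range
    // Heap: 0x6... range
    // etc.
    (void)addr;              // unused until the heuristics are filled in
    return REGION_UNKNOWN;   // placeholder so the stub compiles
}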

Challenge 2: Pointer Validity Detector

Create a function that attempts to detect dangling pointers:

#include <stdbool.h>  // bool
#include <stddef.h>   // size_t

// Track allocations and frees
void* tracked_malloc(size_t size);
void tracked_free(void *ptr);
bool is_valid_pointer(void *ptr);

Challenge 3: Memory Layout Visualizer

Create an ASCII art visualization of the process memory:

=== MEMORY LAYOUT ===
0x7fff... [####----] Stack (4KB used, 8KB total)
          ...
0x6000... [##------] Heap (2KB used, 8KB total)
          ...
0x4000... [########] Code (read-only)

Challenge 4: ASLR Demonstration

Show how Address Space Layout Randomization works:

$ for i in {1..5}; do ./memory_inspector --stack-addr; done
# Show that addresses change each run

Real-World Connections

Connection 1: Debugger Internals

Debuggers like lldb and gdb use these same concepts to:

Display variable values
Show memory contents
Set breakpoints at specific addresses

Connection 2: Exploit Development

Understanding memory layout is essential for:

Buffer overflow exploitation
Return-oriented programming (ROP)
Understanding how ASLR protects against attacks

Connection 3: Performance Optimization

Memory layout affects:

Cache utilization (struct packing)
Memory bandwidth (alignment)
False sharing in multi-threaded code

Interview Questions You Can Now Answer

“What is the difference between &x and x?”
- &x is the address where x is stored; x is the value at that address
“How can you tell if an address is on the stack or the heap?”
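- Stack addresses are typically much higher (0x7fff… range on 64-bit x86)
- Heap addresses are typically lower (0x6… range on macOS); the exact ranges are platform-specific, so the OS memory map is the reliable source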
“What happens to a local variable when a function returns?”
- Its stack frame is “popped”: the bytes may still hold the old value, but the variable's lifetime has ended and accessing it is undefined behavior
“What is a pointer, really?”
- A number that represents a memory address, with type information for arithmetic
“Why does the stack grow downward on x86?”
- Historical convention; allows stack and heap to grow toward each other
“What is ASLR and why does it exist?”
- Address Space Layout Randomization; prevents attackers from knowing where code/data is located

Resources

Books

Computer Systems: A Programmer’s Perspective - Ch. 2-3
Understanding and Using C Pointers by Richard Reese - Ch. 1-2
The Linux Programming Interface by Michael Kerrisk - Ch. 6

Tools

lldb or gdb - Debuggers
AddressSanitizer (-fsanitize=address)
objdump -d - Disassembly

Self-Assessment Checklist

Before moving to the next project, you should be able to:

Explain why &x and x are different
Predict whether a variable is on stack or heap by looking at its address
Draw a diagram of a stack frame with local variables
Explain why int *p and char *p behave differently with p + 1
Demonstrate struct padding with actual numbers
Use lldb to inspect memory at a given address
Explain what ASLR does and why it matters

Final Milestone: You instinctively think of memory as numbered bytes, not abstract “variables.”