Project 3: Memory Layout Visualizer

Build a tool that displays the memory layout of a running C program, showing exactly where different types of data live in the process address space.


Quick Reference

Attribute        Value
Language         C
Difficulty       Level 3 (Intermediate)
Time             Weekend (10-16 hours)
Book Reference   Expert C Programming Ch. 6; CS:APP Ch. 7, 9
Coolness         Essential Systems Knowledge
Portfolio Value  High - Demonstrates deep understanding

Learning Objectives

By completing this project, you will:

  1. Map the process address space - Understand text, data, bss, heap, and stack segments
  2. Use /proc filesystem on Linux - Parse /proc/self/maps for memory regions
  3. Understand ELF segment loading - Connect binary format to runtime layout
  4. Distinguish between sections and segments - Know what the linker vs loader sees
  5. Analyze variable placement - Predict where any variable will be allocated
  6. Use objdump and nm effectively - Examine binary contents
  7. Understand ASLR - See address space layout randomization in action
  8. Debug memory-related issues - Know where to look for different problems

The Core Question You’re Answering

“How does the operating system organize a process’s virtual address space, and where does each type of data live?”

When you write C code, you create different kinds of data:

  • Executable code (functions)
  • Initialized global variables
  • Uninitialized global variables
  • String literals
  • Heap allocations
  • Stack variables

Each of these lives in a specific region of memory with specific properties (readable, writable, executable). Understanding this layout is fundamental to:

  • Debugging segmentation faults
  • Understanding security vulnerabilities
  • Optimizing memory usage
  • Working with embedded systems

Theoretical Foundation

The Process Virtual Address Space

Every process has its own virtual address space, mapped by the OS to physical memory:

PROCESS VIRTUAL ADDRESS SPACE (64-bit Linux, typical layout)
═══════════════════════════════════════════════════════════════════

HIGH ADDRESS (0xffffffffffffffff on x86-64)
┌─────────────────────────────────────────────────────────────────┐
│                         KERNEL SPACE                            │
│              (Not accessible from user mode)                    │
│                        ~ top 128TB ~                            │
├─────────────────────────────────────────────────────────────────┤ 0x7fffffffffff
│                                                                 │
│                           STACK                                 │
│                                                                 │
│  • Function local variables                                     │
│  • Function parameters                                          │
│  • Return addresses                                             │
│  • Saved registers                                              │
│                                                                 │
│                         ↓ grows down                            │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                                 │
│                    [ UNMAPPED REGION ]                          │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│                   MEMORY-MAPPED REGION                          │
│  • Shared libraries (.so files)                                │
│  • mmap() allocations                                          │
│  • Shared memory segments                                       │
│                                                                 │
│                         ↓ grows down                            │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                                                                 │
│                    [ UNMAPPED REGION ]                          │
│                                                                 │
│           Stack and mmap region grow down, heap grows up        │
│           This gap provides protection                          │
│                                                                 │
├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
│                         ↑ grows up                              │
│                                                                 │
│                           HEAP                                  │
│                                                                 │
│  • malloc/calloc/realloc allocations                           │
│  • Dynamic memory                                               │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│                          BSS                                    │
│  • Uninitialized global variables                              │
│  • Uninitialized static variables                              │
│  • Zero-initialized at program start                           │
├─────────────────────────────────────────────────────────────────┤
│                         DATA                                    │
│  • Initialized global variables                                │
│  • Initialized static variables                                │
│  • Non-const global data                                       │
├─────────────────────────────────────────────────────────────────┤
│                        RODATA                                   │
│  • String literals                                             │
│  • const global variables                                      │
│  • Read-only data                                              │
├─────────────────────────────────────────────────────────────────┤
│                         TEXT                                    │
│  • Executable code                                             │
│  • Function machine code                                       │
│  • Read-only, executable                                       │
└─────────────────────────────────────────────────────────────────┘
LOW ADDRESS (0x400000 is typical for non-PIE binaries; PIE binaries load at a randomized base, often beginning with 0x55 or 0x56)

Memory Segment Properties

Segment  Contents                Permissions          Lifetime
TEXT     Executable code         r-x (read, execute)  Program lifetime
RODATA   String literals, const  r-- (read only)      Program lifetime
DATA     Initialized globals     rw- (read, write)    Program lifetime
BSS      Uninitialized globals   rw- (read, write)    Program lifetime
HEAP     malloc’d memory         rw- (read, write)    Until free’d
STACK    Local variables         rw- (read, write)    Until function returns

C Code to Memory Mapping

/* file: example.c */

#include <stdio.h>
#include <stdlib.h>

/* TEXT: Function code goes here */
int add(int a, int b) {
    return a + b;
}

/* RODATA: the string literal itself; the pointer variable 'greeting' is in DATA */
const char *greeting = "Hello, World!";

/* DATA: Initialized global */
int initialized_global = 42;

/* BSS: Uninitialized global */
int uninitialized_global;

/* DATA: Initialized static at file scope */
static int static_initialized = 100;

/* BSS: Uninitialized static at file scope */
static int static_uninitialized;

int main(void) {
    /* STACK: Local variables */
    int local_var = 10;
    char local_array[100];

    /* HEAP: Dynamic allocation */
    int *heap_ptr = malloc(sizeof(int) * 10);

    /* DATA/BSS: Static inside function */
    static int func_static_init = 200;
    static int func_static_uninit;

    free(heap_ptr);
    return 0;
}

MAPPING TO MEMORY SEGMENTS:

┌─────────────────────────────────────────────────────────────────┐
│ TEXT SEGMENT (.text)                                           │
├─────────────────────────────────────────────────────────────────┤
│ • add() function machine code                                  │
│ • main() function machine code                                 │
│ • Other libc functions (linked)                                │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ RODATA SEGMENT (.rodata)                                       │
├─────────────────────────────────────────────────────────────────┤
│ • "Hello, World!" string                                       │
│ • (The pointer 'greeting' is in DATA, the string is here)     │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ DATA SEGMENT (.data)                                           │
├─────────────────────────────────────────────────────────────────┤
│ • initialized_global (42)                                      │
│ • static_initialized (100)                                     │
│ • func_static_init (200)                                       │
│ • greeting pointer (points to RODATA)                         │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ BSS SEGMENT (.bss)                                             │
├─────────────────────────────────────────────────────────────────┤
│ • uninitialized_global                                         │
│ • static_uninitialized                                         │
│ • func_static_uninit                                           │
│ (All zero-initialized at load time)                           │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ HEAP (runtime)                                                 │
├─────────────────────────────────────────────────────────────────┤
│ • Memory from malloc(sizeof(int) * 10)                        │
│ • heap_ptr points here                                         │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ STACK (runtime)                                                │
├─────────────────────────────────────────────────────────────────┤
│ • local_var (10)                                               │
│ • local_array[100]                                             │
│ • heap_ptr (the pointer itself, not what it points to)        │
│ • main's return address                                        │
│ • Saved frame pointer                                          │
└─────────────────────────────────────────────────────────────────┘

/proc/self/maps Format

On Linux, /proc/self/maps shows the memory layout:

$ cat /proc/self/maps

address          perms offset   dev   inode   pathname
─────────────────────────────────────────────────────────────────
00400000-00401000 r--p 00000000 08:01 1234567 /path/to/program
00401000-00402000 r-xp 00001000 08:01 1234567 /path/to/program
00402000-00403000 r--p 00002000 08:01 1234567 /path/to/program
00403000-00404000 r--p 00002000 08:01 1234567 /path/to/program
00404000-00405000 rw-p 00003000 08:01 1234567 /path/to/program
00405000-00406000 rw-p 00000000 00:00 0       [heap]
7f0000000000-7f0000200000 r-xp 00000000 08:01 2345678 /lib/x86_64-linux-gnu/libc.so.6
...
7ffc00000000-7ffc00021000 rw-p 00000000 00:00 0       [stack]

FIELD EXPLANATION:
──────────────────
address:  Start-End virtual addresses
perms:    r=read, w=write, x=execute, p=private, s=shared
offset:   Offset in the file
dev:      Device number (major:minor)
inode:    Inode number on the device
pathname: File backing this mapping (or special like [heap], [stack])

ELF Sections vs Segments

Important distinction:

SECTIONS (linking view - seen by linker):
=========================================
.text     - Code
.rodata   - Read-only data
.data     - Initialized data
.bss      - Uninitialized data
.symtab   - Symbol table
.strtab   - String table
.rela.text - Relocations for code (.rel.text on 32-bit x86)

SEGMENTS (execution view - seen by loader):
===========================================
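LOAD (r-x) - Contains .text (plus .rodata when the linker does not separate code)
LOAD (r--) - Contains .rodata when linked with -z separate-code (the GNU ld default on x86 Linux)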
LOAD (rw)  - Contains .data, .bss
DYNAMIC    - Dynamic linking info
NOTE       - Auxiliary info

The loader uses SEGMENTS to create memory mappings.
Multiple SECTIONS can be part of one SEGMENT.

┌─────────────────────────────────────────────────────────────────┐
│                      SEGMENT LOAD (r-x)                         │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ .init  │  .text  │  .fini  │  .rodata  │                   ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                      SEGMENT LOAD (rw-)                         │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │  .data  │  .bss  │                                          ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

ASLR (Address Space Layout Randomization)

Modern systems randomize memory layout for security:

WITHOUT ASLR (predictable):
══════════════════════════════
Run 1: Stack at 0x7fff00000000
Run 2: Stack at 0x7fff00000000
Run 3: Stack at 0x7fff00000000

WITH ASLR (randomized):
══════════════════════════════
Run 1: Stack at 0x7ffd12345000
Run 2: Stack at 0x7ffe98765000
Run 3: Stack at 0x7ffc00bad000

ASLR makes exploits harder because attackers can't
predict where code and data are located.

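Disable for this project (easier debugging), system-wide:
$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Restore the usual default (2) when you are done:
$ echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
Or disable it for a single run only:
$ setarch "$(uname -m)" -R ./memlayout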

Project Specification

What You Will Build

A command-line tool called memlayout that displays the memory layout of the current process:

$ ./memlayout
╔════════════════════════════════════════════════════════════════╗
║              PROCESS MEMORY LAYOUT                             ║
║              PID: 12345                                        ║
╠════════════════════════════════════════════════════════════════╣
║                                                                ║
║  SEGMENT        RANGE                    SIZE       PERM       ║
║  ─────────────────────────────────────────────────────────────  ║
║  STACK          0x7ffe8a5d0000-0x7ffe8a5f2000   136 KB   rw-   ║
║    local_var    @ 0x7ffe8a5d2f4c                               ║
║    local_arr    @ 0x7ffe8a5d2e00                               ║
║                                                                ║
║  [gap]          0x7f0000000000-0x7ffe8a5d0000   ~1 TB          ║
║                                                                ║
║  HEAP           0x5620a1a4d000-0x5620a1a6e000   132 KB   rw-   ║
║    malloc'd ptr @ 0x5620a1a4d260                               ║
║                                                                ║
║  BSS            0x5620a0a3c040-0x5620a0a3d000   4 KB     rw-   ║
║    uninit_glob  @ 0x5620a0a3c044                               ║
║                                                                ║
║  DATA           0x5620a0a3c000-0x5620a0a3c040   64 B     rw-   ║
║    init_global  @ 0x5620a0a3c000                               ║
║                                                                ║
║  RODATA         0x5620a0a3b000-0x5620a0a3c000   4 KB     r--   ║
║    "Hello"      @ 0x5620a0a3b000                               ║
║                                                                ║
║  TEXT           0x5620a0a39000-0x5620a0a3b000   8 KB     r-x   ║
║    main()       @ 0x5620a0a39160                               ║
║    helper()     @ 0x5620a0a391a0                               ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝

Functional Requirements

  1. Parse /proc/self/maps:
    • Read and parse the memory map file
    • Identify each region’s purpose
    • Calculate sizes
  2. Display Segment Information:
    • Show address ranges
    • Show permissions (r/w/x)
    • Show sizes in human-readable format
  3. Show Variable Locations:
    • Display addresses of sample variables
    • Categorize by segment type
    • Show which segment each variable belongs to
  4. Compare Modes:
    • Static view (current process)
    • Compare two runs (show ASLR effect)
    • objdump comparison mode
  5. Interactive Mode:
    • Accept variable address to classify
    • Show segment for any address

Non-Functional Requirements

  • Portability: Work on Linux (primary), macOS (stretch goal)
  • Performance: Fast parsing and display
  • Accuracy: Correctly identify all major segments
  • Educational: Clear output with explanations

Real World Outcome

Complete tool session:

$ ./memlayout --help
Usage: memlayout [options]
  --all          Show all memory regions
  --summary      Show segment summary only
  --variables    Show sample variables in each segment
  --compare      Run twice and show ASLR differences
  --lookup ADDR  Identify which segment contains address
  --objdump      Compare with objdump output

$ ./memlayout --variables

╔════════════════════════════════════════════════════════════════╗
║                    VARIABLE PLACEMENT                           ║
╠════════════════════════════════════════════════════════════════╣
║                                                                ║
║  STACK (Local Variables):                                      ║
║  ─────────────────────────────────────────────────────────────  ║
║    int local_int         @ 0x7ffd12345678   (4 bytes)          ║
║    char local_arr[100]   @ 0x7ffd12345600   (100 bytes)        ║
║    int *local_ptr        @ 0x7ffd123455f0   (8 bytes)          ║
║                                                                ║
║  HEAP (Dynamic Allocations):                                   ║
║  ─────────────────────────────────────────────────────────────  ║
║    malloc(40)            @ 0x561234567890   (40 bytes)         ║
║    calloc(10, 8)         @ 0x5612345678c0   (80 bytes)         ║
║                                                                ║
║  BSS (Uninitialized Globals):                                  ║
║  ─────────────────────────────────────────────────────────────  ║
║    int uninit_global     @ 0x561234566040   (4 bytes)          ║
║    static int uninit_s   @ 0x561234566044   (4 bytes)          ║
║                                                                ║
║  DATA (Initialized Globals):                                   ║
║  ─────────────────────────────────────────────────────────────  ║
║    int init_global=42    @ 0x561234565000   (4 bytes)          ║
║    static int init_s=10  @ 0x561234565004   (4 bytes)          ║
║                                                                ║
║  RODATA (Constants):                                           ║
║  ─────────────────────────────────────────────────────────────  ║
║    "Hello, World!"       @ 0x561234564000   (14 bytes)         ║
║    const int CONSTANT    @ 0x561234564010   (4 bytes)          ║
║                                                                ║
║  TEXT (Code):                                                  ║
║  ─────────────────────────────────────────────────────────────  ║
║    main()                @ 0x561234563000                      ║
║    helper_function()     @ 0x5612345630a0                      ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝

$ ./memlayout --lookup 0x7ffd12345678
Address 0x7ffd12345678 is in: STACK
  Region: 0x7ffd12340000-0x7ffd12361000
  Permissions: rw-
  Offset from region start: 0x5678 (22136 bytes)

$ ./memlayout --compare
Run 1:
  TEXT:  0x561234560000
  STACK: 0x7ffd12340000
  HEAP:  0x561234580000

Run 2:
  TEXT:  0x55a234560000  (delta: -0x7000000000)
  STACK: 0x7ffe98760000  (delta: +0x186420000)
  HEAP:  0x55a234580000  (delta: -0x7000000000)

ASLR is ENABLED - addresses change between runs

Solution Architecture

Project Structure

memlayout/
├── src/
│   ├── main.c              # Entry point, CLI
│   ├── maps_parser.c       # Parse /proc/self/maps
│   ├── segment_info.c      # Segment identification
│   ├── variable_demo.c     # Sample variables
│   ├── display.c           # Output formatting
│   └── objdump_compare.c   # Compare with objdump
├── include/
│   ├── memlayout.h         # Main header
│   ├── maps_parser.h       # Parser declarations
│   └── segment.h           # Segment types
├── Makefile
└── README.md

Key Data Structures

/* Memory region from /proc/self/maps */
typedef struct {
    unsigned long start;
    unsigned long end;
    char perms[5];          /* rwxp */
    unsigned long offset;
    char dev[16];
    unsigned long inode;
    char pathname[256];
} MemoryRegion;

/* Segment classification */
typedef enum {
    SEG_TEXT,
    SEG_RODATA,
    SEG_DATA,
    SEG_BSS,
    SEG_HEAP,
    SEG_STACK,
    SEG_MMAP,
    SEG_VDSO,
    SEG_VSYSCALL,
    SEG_UNKNOWN
} SegmentType;

/* Segment with classification */
typedef struct {
    MemoryRegion region;
    SegmentType type;
    const char *description;
} Segment;

/* Variable location info */
typedef struct {
    const char *name;
    void *address;
    size_t size;
    SegmentType expected_segment;
} VariableInfo;

Key Algorithms

Parsing /proc/self/maps:

int parse_maps(Segment *segments, int max_segments) {
    FILE *fp = fopen("/proc/self/maps", "r");
    if (!fp) return -1;

    char line[512];
    int count = 0;

    while (fgets(line, sizeof(line), fp) && count < max_segments) {
        MemoryRegion *r = &segments[count].region;

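        /* Parse: address-address perms offset dev inode [pathname].
           Field widths keep sscanf inside the fixed-size buffers, and
           %255[^\n] keeps pathnames that contain spaces intact. */
        int fields = sscanf(line,
            "%lx-%lx %4s %lx %15s %lu %255[^\n]",
            &r->start, &r->end, r->perms,
            &r->offset, r->dev, &r->inode, r->pathname);

        /* The first six fields are always present; pathname is optional */
        if (fields >= 6) {
            if (fields < 7) r->pathname[0] = '\0';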
            segments[count].type = classify_segment(&segments[count]);
            count++;
        }
    }

    fclose(fp);
    return count;
}

Segment Classification:

SegmentType classify_segment(Segment *seg) {
    MemoryRegion *r = &seg->region;

    /* Special kernel mappings */
    if (strstr(r->pathname, "[stack]"))   return SEG_STACK;
    if (strstr(r->pathname, "[heap]"))    return SEG_HEAP;
    if (strstr(r->pathname, "[vdso]"))    return SEG_VDSO;
    if (strstr(r->pathname, "[vsyscall]")) return SEG_VSYSCALL;

    /* Executable code (r-x) */
    if (r->perms[0] == 'r' && r->perms[2] == 'x') {
        return SEG_TEXT;
    }

    /* Read-only data (r--) */
    if (r->perms[0] == 'r' && r->perms[1] == '-' && r->perms[2] == '-') {
        return SEG_RODATA;
    }

    /* Writable data (rw-) */
    if (r->perms[0] == 'r' && r->perms[1] == 'w') {
        /* DATA is backed by a file (nonzero inode); BSS is anonymous.
           This is a heuristic: anonymous rw- regions can also be mmap()
           allocations, and shared libraries have writable data too. */
        if (r->inode != 0) {
            return SEG_DATA;
        }
        return SEG_BSS;
    }

    return SEG_UNKNOWN;
}

Address Lookup:

SegmentType lookup_address(unsigned long addr, Segment *segments, int count) {
    for (int i = 0; i < count; i++) {
        if (addr >= segments[i].region.start &&
            addr < segments[i].region.end) {
            return segments[i].type;
        }
    }
    return SEG_UNKNOWN;
}

Implementation Guide

Phase 1: Parse /proc/self/maps (2-3 hours)

Goal: Read and parse the memory map.

/* maps_parser.c */
#include <stdio.h>
#include <stdlib.h>   /* strtoul */
#include <string.h>
#include "memlayout.h"

int parse_proc_maps(MemoryRegion *regions, int max_count) {
    FILE *fp = fopen("/proc/self/maps", "r");
    if (!fp) {
        perror("Failed to open /proc/self/maps");
        return -1;
    }

    char line[512];
    int count = 0;

    while (fgets(line, sizeof(line), fp) && count < max_count) {
        MemoryRegion *r = &regions[count];

        /* Parse the line; field widths guard the fixed-size buffers */
        char start_str[32], end_str[32];
        int n = sscanf(line, "%31[^-]-%31s %4s %lx %15s %lu %255[^\n]",
                       start_str, end_str, r->perms,
                       &r->offset, r->dev, &r->inode, r->pathname);

        if (n < 6) continue;                 /* malformed line: skip it */
        if (n < 7) r->pathname[0] = '\0';    /* anonymous mapping: no pathname */

        r->start = strtoul(start_str, NULL, 16);
        r->end = strtoul(end_str, NULL, 16);

        count++;
    }

    fclose(fp);
    return count;
}

Phase 2: Classify Segments (2 hours)

Goal: Identify what each region is used for.

Key classification logic:

  • [stack] -> Stack
  • [heap] -> Heap
  • Executable (r-xp) -> Text
  • Read-only (r--p) -> Rodata
  • Read-write from file -> Data
  • Anonymous read-write -> BSS or heap

Phase 3: Display Formatting (2 hours)

Goal: Create clean, readable output.

/* display.c */
#include <stdio.h>
#include "memlayout.h"

void print_segment_summary(Segment *seg) {
    const char *type_names[] = {
        "TEXT", "RODATA", "DATA", "BSS",
        "HEAP", "STACK", "MMAP", "VDSO",
        "VSYSCALL", "UNKNOWN"
    };

    unsigned long size = seg->region.end - seg->region.start;
    char size_str[32];

    if (size >= 1024 * 1024) {
        snprintf(size_str, sizeof(size_str), "%lu MB", size / (1024*1024));
    } else if (size >= 1024) {
        snprintf(size_str, sizeof(size_str), "%lu KB", size / 1024);
    } else {
        snprintf(size_str, sizeof(size_str), "%lu B", size);
    }

    printf("  %-12s  0x%012lx-0x%012lx  %8s  %s\n",
           type_names[seg->type],
           seg->region.start, seg->region.end,
           size_str, seg->region.perms);
}

Phase 4: Variable Demonstration (2 hours)

Goal: Show actual variables in each segment.

/* variable_demo.c */
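#include <stdio.h>
#include <stdlib.h>
#include "memlayout.h"

/* Declared so its address can be printed; adjust to match main.c */
int main(int argc, char **argv);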

/* Sample variables in different segments */
const char *rodata_string = "I am in RODATA";
int data_var = 42;
int bss_var;

void demonstrate_variables(void) {
    /* Stack variable */
    int stack_var = 100;

    /* Heap allocation */
    int *heap_var = malloc(sizeof(int));
    if (!heap_var) {
        perror("malloc");
        return;
    }
    *heap_var = 200;

    printf("Variable Locations:\n");
    printf("  rodata_string @ %p (RODATA)\n", (void*)rodata_string);
    printf("  data_var      @ %p (DATA)\n", (void*)&data_var);
    printf("  bss_var       @ %p (BSS)\n", (void*)&bss_var);
    printf("  stack_var     @ %p (STACK)\n", (void*)&stack_var);
    printf("  heap_var      @ %p (HEAP)\n", (void*)heap_var);
    printf("  main()        @ %p (TEXT)\n", (void*)main);

    free(heap_var);
}

Phase 5: objdump Comparison (2 hours)

Goal: Compare runtime layout with binary analysis.

/* objdump_compare.c */
#include <stdio.h>
#include <stdlib.h>

void compare_with_objdump(const char *executable) {
    char cmd[512];

    printf("Comparing with objdump output:\n\n");

    /* Show section headers. The path is single-quoted for the shell;
       a path that itself contains a single quote is not handled. */
    snprintf(cmd, sizeof(cmd), "objdump -h '%s' | grep -E '\\.text|\\.rodata|\\.data|\\.bss'",
             executable);
    printf("Section Headers (from objdump -h):\n");
    fflush(stdout);  /* keep our output ahead of the child's when piped */
    system(cmd);
    printf("\n");

    /* Show symbols */
    snprintf(cmd, sizeof(cmd), "nm '%s' | head -20", executable);
    printf("Symbols (from nm):\n");
    fflush(stdout);
    system(cmd);
}

Phase 6: ASLR Comparison (1 hour)

Goal: Show address randomization effect.

void compare_aslr(void) {
    /* Record addresses */
    printf("Current run addresses:\n");
    printf("  main():        %p\n", (void*)main);
    printf("  data_var:      %p\n", (void*)&data_var);
    printf("  stack:         %p\n", (void*)&(int){0});

    printf("\nRun this program multiple times to see ASLR effect.\n");
    printf("To disable ASLR for testing:\n");
    printf("  echo 0 | sudo tee /proc/sys/kernel/randomize_va_space\n");
}

Hints in Layers

Hint 1: Getting Started with /proc/self/maps

Start by simply printing the maps file:

#include <stdio.h>

int main(void) {
    FILE *fp = fopen("/proc/self/maps", "r");
    if (!fp) {
        perror("/proc/self/maps");
        return 1;
    }
    char line[256];

    while (fgets(line, sizeof(line), fp)) {
        printf("%s", line);
    }

    fclose(fp);
    return 0;
}

Notice the patterns - [stack], [heap], and permission flags like r-xp, rw-p.

Hint 2: Getting Variable Addresses

Use the address-of operator and cast to void* for printing:

#include <stdio.h>

int global = 42;

int main(void) {
    int local = 10;
    printf("global: %p\n", (void*)&global);
    printf("local:  %p\n", (void*)&local);
    printf("main:   %p\n", (void*)main);
    return 0;
}

Hint 3: Using objdump and nm

Compare these commands with your runtime output:

# Show sections and their addresses
objdump -h ./memlayout

# Show all symbols
nm ./memlayout

# Show only global symbols
nm -g ./memlayout

# Show section contents
objdump -s -j .rodata ./memlayout

Hint 4: Classifying Regions

The key is combining permissions with pathname:

if (strstr(pathname, "[stack]")) return SEG_STACK;
if (strstr(pathname, "[heap]"))  return SEG_HEAP;
if (perms[2] == 'x')             return SEG_TEXT;  // executable
if (perms[1] == '-')             return SEG_RODATA; // not writable
if (perms[1] == 'w' && offset>0) return SEG_DATA;
// etc.

Hint 5: Distinguishing DATA from BSS

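The key insight: DATA is loaded from the executable file (nonzero inode and file offset), while BSS is anonymous zero-filled memory. The start of BSS often shares the last file-backed page of DATA, so only the remainder of BSS shows up as a separate anonymous rw-p mapping: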

/* From objdump output, DATA is at file offset > 0 */
/* BSS has offset 0 and no file inode */

if (region.offset > 0 && region.inode > 0) {
    /* Backed by file = DATA or TEXT */
} else if (region.inode == 0) {
    /* Anonymous mapping = BSS or heap */
}

Testing Strategy

Test Cases

Test            Description             Expected Outcome
Parse maps      Read /proc/self/maps    All regions parsed
Classify TEXT   Find executable region  main() in TEXT
Classify STACK  Find stack region       Local var in STACK
Classify HEAP   malloc() allocation     Allocation in HEAP
ASLR test       Run twice               Addresses differ
objdump match   Compare with binary     Sections match

Validation Script

#!/bin/bash

echo "Testing memlayout tool..."

# Basic execution
./memlayout --summary || { echo "FAIL: Basic execution"; exit 1; }

# Check for expected segments
./memlayout --all | grep -q "TEXT" || { echo "FAIL: No TEXT segment"; exit 1; }
./memlayout --all | grep -q "STACK" || { echo "FAIL: No STACK segment"; exit 1; }
./memlayout --all | grep -q "HEAP" || { echo "FAIL: No HEAP segment"; exit 1; }

# Compare with objdump
./memlayout --objdump 2>/dev/null

# ASLR test (run twice)
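addr1=$(./memlayout --summary | grep TEXT | head -n 1 | awk '{print $2}')
addr2=$(./memlayout --summary | grep TEXT | head -n 1 | awk '{print $2}')
if [ "$addr1" != "$addr2" ]; then
    echo "ASLR is working - addresses differ"
else
    echo "NOTE: addresses identical - ASLR is disabled or the binary is not PIE"
fi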

echo "All tests passed!"

Common Pitfalls & Debugging

Pitfall                    Symptom                        Solution
Parsing errors             Missing regions                Check sscanf format
Classification wrong       HEAP shown as BSS              Check inode and pathname
Address format             Truncated addresses            Use %lx not %x
File not found             No /proc/self/maps             Linux-only feature
ASLR confusion             Addresses don’t match objdump  Disable ASLR for comparison
Position-independent code  TEXT at unexpected address     Normal with PIE binaries

Extensions & Challenges

Beginner Extensions

  • Color-coded output (different colors for segments)
  • Show shared library regions
  • Calculate total memory usage

Intermediate Extensions

  • macOS support using vmmap command
  • Compare two processes’ layouts
  • Track allocations over time

Advanced Extensions

  • Direct ELF parsing (no procfs dependency)
  • DWARF debug info for symbol resolution
  • Memory protection change detection
  • Integrate with /proc/self/smaps for detailed stats

Real-World Connections

Security Applications

  • Stack overflow detection: Know where stack ends
  • Heap overflow analysis: Understand heap boundaries
  • ASLR verification: Confirm randomization is working
  • ROP gadget finding: Locate executable regions

Debugging Applications

  • Segfault analysis: Which segment did we hit?
  • Memory leak investigation: Where is memory allocated?
  • Performance analysis: Cache behavior by segment

Interview Questions

  1. “Draw the process memory layout” - This project gives you the answer
  2. “What is ASLR and why is it used?” - Security through randomization
  3. “What’s the difference between .data and .bss?” - Initialized vs uninitialized
  4. “Where are local variables stored?” - Stack
  5. “What happens when you malloc()?” - Heap allocation

Books That Will Help

Topic              Book                            Chapter
Memory Layout      Expert C Programming            Ch. 6
Virtual Memory     CS:APP                          Ch. 9
ELF Format         CS:APP                          Ch. 7
Linking & Loading  Linkers and Loaders             Ch. 3-5
Linux Memory       Understanding the Linux Kernel  Ch. 9

Self-Assessment Checklist

Understanding

  • I can draw the process memory layout from memory
  • I know which segment each variable type goes in
  • I understand the difference between sections and segments
  • I can explain ASLR and its purpose
  • I know how to use /proc/self/maps

Implementation

  • Tool parses /proc/self/maps correctly
  • All major segments are classified
  • Variable addresses match expected segments
  • Output is clear and well-formatted
  • objdump comparison works

Growth

  • I can debug segfaults by reasoning about memory layout
  • I can predict where variables will be allocated
  • I understand security implications of memory layout

Submission / Completion Criteria

Minimum Viable Completion

  • Parses /proc/self/maps successfully
  • Identifies TEXT, STACK, and HEAP segments
  • Shows addresses of sample variables

Full Completion

  • All segment types correctly classified
  • Clean formatted output
  • objdump comparison mode
  • ASLR demonstration

Excellence

  • macOS support
  • Direct ELF parsing
  • Integration with debugging tools
  • Memory usage statistics

This guide was expanded from EXPERT_C_PROGRAMMING_DEEP_DIVE.md. For the complete learning path, see the project index.