Project 4: Stack Frame Inspector
Build a tool that walks the call stack and displays activation records, revealing how functions maintain their execution context.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 4 (Advanced) |
| Time Estimate | 1 Week |
| Language | C (with inline assembly) |
| Prerequisites | Memory layout (P03), calling conventions basics, assembly reading |
| Key Topics | Stack frames, frame pointers, return addresses, calling conventions |
| Book Reference | CS:APP Ch. 3, Expert C Programming Ch. 6 |
| Portfolio Value | High - Demonstrates systems mastery |
1. Learning Objectives
By completing this project, you will:
-
Understand stack frame anatomy - Know exactly what data is stored in each activation record: return address, saved registers, local variables, and function arguments
-
Master the frame pointer chain - Understand how RBP/EBP links stack frames together and enables stack walking
-
Learn calling conventions deeply - Distinguish between cdecl, System V AMD64 ABI, and Microsoft x64 calling conventions
-
Implement stack walking - Build a tool that traverses the call stack without relying on library functions
-
Read and write inline assembly - Use GCC/Clang inline assembly to access CPU registers directly
-
Debug at the assembly level - Use GDB/LLDB to examine stack frames and verify your understanding
-
Understand compiler optimization effects - See how
-fomit-frame-pointerand optimizations affect stack layout -
Connect to real debugging tools - Understand how GDB’s
backtrace, Valgrind, and crash reporters work
2. Theoretical Foundation
2.1 Core Concepts: What Is a Stack Frame?
A stack frame (also called an activation record) is the portion of the stack allocated for a single function invocation. It contains everything the function needs to execute and return to its caller.
THE STACK FRAME CONCEPT
========================================================================
When you call a function:
main() calls foo() calls bar()
Each function gets its own "workspace" on the stack:
HIGH ADDRESS (stack grows DOWN)
┌────────────────────────────────────────────────────────────────────┐
│ main()'s stack frame │
│ - main's local variables │
│ - saved registers main needs to preserve │
├────────────────────────────────────────────────────────────────────┤
│ Return address to __libc_start_main │
│ Saved RBP (frame pointer) ◄─────────────────────────┐ │
├────────────────────────────────────────────────────────│────────────┤
│ foo()'s stack frame │ │
│ - foo's local variables │ │
│ - arguments for bar() (if any go on stack) │ │
├───────────────────────────────────────────────────────│────────────┤
│ Return address to main │ │
│ Saved RBP ──────────────────────────────────────────┘ ◄────┐ │
├──────────────────────────────────────────────────────────────│─────┤
│ bar()'s stack frame │ │
│ - bar's local variables │ │
├──────────────────────────────────────────────────────────────│─────┤
│ Return address to foo │ │
│ Saved RBP ─────────────────────────────────────────────────┘ │
│ │
│ ──────────────────────────── RSP (current stack pointer) │
└────────────────────────────────────────────────────────────────────┘
LOW ADDRESS
KEY INSIGHT: The saved RBP values form a LINKED LIST through the stack!
Following this chain = stack walking

2.2 Why Stack Frames Matter
Understanding stack frames is essential for:
PRACTICAL APPLICATIONS
========================================================================
DEBUGGING:
• Stack traces in crash reports (how debuggers show "where you are")
• GDB's `backtrace` command walks the frame pointer chain
• Understanding "stack overflow" errors
SECURITY:
• Stack buffer overflow exploits overwrite return addresses
• Stack canaries detect corruption before return
• ASLR randomizes stack location
• Understanding "Return-Oriented Programming" (ROP) attacks
PERFORMANCE:
• Function call overhead (stack manipulation costs)
• Tail call optimization eliminates stack growth
• Understanding why recursion can be expensive
SYSTEMS PROGRAMMING:
• Implementing exception handling (C++, Rust panics)
• Signal handlers need separate stacks
• Debugger implementation
• Profilers sample the stack to attribute time
2.3 Historical Context: Calling Conventions Evolution
CALLING CONVENTIONS THROUGH HISTORY
========================================================================
1970s - EARLY C (PDP-11):
• All arguments passed on stack
• Simple and consistent
• Slower (memory access for every argument)
1980s - cdecl (x86 32-bit):
• Arguments pushed right-to-left on stack
• Caller cleans up stack (allows variadic functions)
• Return value in EAX
foo(1, 2, 3);
Assembly:
push 3 ; Third argument
push 2 ; Second argument
push 1 ; First argument
call foo
add esp, 12 ; Caller cleans up (3 × 4 bytes)
2000s - System V AMD64 ABI (Linux/macOS x86-64):
• First 6 INTEGER args in: RDI, RSI, RDX, RCX, R8, R9
• First 8 FLOAT args in: XMM0-XMM7
• Remaining args on stack
• Return value in RAX (or RAX:RDX for 128-bit)
foo(1, 2, 3, 4, 5, 6, 7, 8);
Assembly:
mov r9d, 6 ; Sixth arg in R9
mov r8d, 5 ; Fifth arg in R8
mov ecx, 4 ; Fourth arg in RCX
mov edx, 3 ; Third arg in RDX
mov esi, 2 ; Second arg in RSI
mov edi, 1 ; First arg in RDI
push 8 ; Eighth arg on stack
push 7 ; Seventh arg on stack
call foo
add rsp, 16 ; Clean up stack args
WHY THE CHANGE?
• Registers are much faster than memory
• Modern x86-64 has more registers (16 vs 8)
• Cache misses on stack access are expensive
2.4 Common Misconceptions
MISCONCEPTION 1: "The stack grows up"
────────────────────────────────────────────────────────────────────────
REALITY: On x86/x86-64 (and most architectures), the stack grows DOWN.
push decreases RSP; pop increases RSP.
"Higher addresses" are at the TOP (earlier in the call chain).
MISCONCEPTION 2: "Arguments are always on the stack"
────────────────────────────────────────────────────────────────────────
REALITY: On x86-64, most arguments go in REGISTERS.
Only overflow arguments (7th+ integer arg) go on stack.
Variadic functions are special (va_start needs them somewhere).
MISCONCEPTION 3: "Frame pointers are always present"
────────────────────────────────────────────────────────────────────────
REALITY: Modern compilers often use -fomit-frame-pointer by default at -O2.
This frees up RBP as a general-purpose register.
Debugging info (DWARF) can still enable stack unwinding.
MISCONCEPTION 4: "Return addresses are in a fixed location"
────────────────────────────────────────────────────────────────────────
REALITY: Without frame pointers, return address offset varies per function.
Each function may have different local variable sizes.
Only DWARF unwind info knows the exact layout.
MISCONCEPTION 5: "Local variables are in the order declared"
────────────────────────────────────────────────────────────────────────
REALITY: Compiler can reorder locals for alignment and optimization.
-O0 usually preserves order; -O2+ may change it entirely.
3. Project Specification
3.1 What You Will Build
A command-line tool called stackwalk that walks the call stack and displays detailed information about each activation record:
$ ./stackwalk
╔════════════════════════════════════════════════════════════════════╗
║ STACK FRAME INSPECTOR ║
║ Call Stack Analysis ║
╠════════════════════════════════════════════════════════════════════╣
║ ║
║ Frame 0: bar() at 0x401234 ║
║ ──────────────────────────────────────────────────────────────── ║
║ Frame pointer (RBP): 0x7fffffffe100 ║
║ Return address: 0x4012ab (in foo+0x2b) ║
║ Saved RBP: 0x7fffffffe130 ║
║ Local variables: 32 bytes ║
║ Stack frame size: 48 bytes ║
║ ║
║ Frame 1: foo() at 0x401280 ║
║ ──────────────────────────────────────────────────────────────── ║
║ Frame pointer (RBP): 0x7fffffffe130 ║
║ Return address: 0x40134f (in main+0x1f) ║
║ Saved RBP: 0x7fffffffe160 ║
║ Local variables: 16 bytes ║
║ Stack frame size: 48 bytes ║
║ ║
║ Frame 2: main() at 0x401330 ║
║ ──────────────────────────────────────────────────────────────── ║
║ Frame pointer (RBP): 0x7fffffffe160 ║
║ Return address: 0x7ffff7a2d830 (in __libc_start_main) ║
║ Saved RBP: 0x0000000000000000 ║
║ Local variables: 8 bytes ║
║ Stack frame size: 32 bytes ║
║ ║
╚════════════════════════════════════════════════════════════════════╝
Call chain: bar() <- foo() <- main() <- __libc_start_main
Total stack depth: 3 user frames
Total stack usage: 128 bytes
3.2 Functional Requirements
- Stack Walking:
- Walk the frame pointer chain from current frame to main()
- Handle both 32-bit and 64-bit builds (via conditional compilation)
- Detect end of chain (NULL frame pointer or invalid address)
- Frame Information Display:
- Current frame pointer (RBP/EBP)
- Return address for each frame
- Saved frame pointer value
- Estimated local variable size
- Total frame size
- Symbol Resolution (Optional but Recommended):
- Resolve return addresses to function names
- Show offset within function
- Use dladdr() or similar for symbol lookup
- Modes of Operation:
- Default: Walk stack from current position
- Demo mode: Set up a known call chain for testing
- Verbose mode: Show raw hex dumps of frame data
- Comparison mode: Show effect of -fomit-frame-pointer
- Safety Features:
- Validate addresses before dereferencing
- Limit maximum frames to prevent infinite loops
- Handle corrupted frame pointer chains gracefully
3.3 Non-Functional Requirements
- Platform: Linux x86-64 (primary), macOS x86-64 (stretch goal)
- Compiler: GCC or Clang with -fno-omit-frame-pointer
- Performance: Stack walk should complete in < 1ms for typical stacks
- Portability: Use conditional compilation for architecture-specific code
- Educational: Clear output with explanatory comments
3.4 Example Usage / Output
# Basic usage - walk current stack
$ ./stackwalk
# With verbose hex dump
$ ./stackwalk --verbose
Frame 0: bar() at 0x401234
Raw frame data (48 bytes):
0x7fffffffe100: 30 e1 ff ff ff 7f 00 00 ab 12 40 00 00 00 00 00
0x7fffffffe110: 0a 00 00 00 14 00 00 00 00 00 00 00 00 00 00 00
...
# Demo mode with known call chain
$ ./stackwalk --demo
Setting up: main() -> foo() -> bar() -> inspect_stack()
[Stack trace output]
# Compare with/without frame pointer
$ ./stackwalk --compare
With frame pointer (-fno-omit-frame-pointer):
[Full trace]
Without frame pointer (-fomit-frame-pointer):
WARNING: Cannot walk stack - no frame pointer chain
Only __builtin_return_address(0) available
# Get specific frame info
$ ./stackwalk --frame 2
Frame 2: main()
[Detailed info for main's frame only]
3.5 Real World Outcome
After completing this project, you will be able to:
- Debug without a debugger: Understand crash dumps and manually trace execution
- Implement crash handlers: Build your own stack trace printer for crash reports
- Understand security vulnerabilities: See exactly what stack buffer overflows corrupt
- Optimize function calls: Know the real cost of deep call chains
- Read compiler output: Understand assembly prologue/epilogue sequences
4. Solution Architecture
4.1 High-Level Design
STACK FRAME INSPECTOR ARCHITECTURE
========================================================================
┌─────────────────────────────────────────────────┐
│ User Interface │
│ CLI parsing, output formatting, modes │
└────────────────────────┬────────────────────────┘
│
┌────────────────────────▼────────────────────────┐
│ Stack Walker Core │
│ Walk frame chain, collect frame data │
└────────────────────────┬────────────────────────┘
│
┌──────────────────────────────────┼──────────────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ Register Access │ │ Memory Reader │ │ Symbol Resolver │
│ Get RBP, RSP │ │ Validate & read │ │ dladdr, nm │
│ Inline assembly │ │ frame data │ │ addr2line │
└─────────────────────┘ └─────────────────────┘ └─────────────────────┘
│ │ │
└──────────────────────────────────┼──────────────────────────────┘
│
┌────────────────────────▼────────────────────────┐
│ Output Formatter │
│ Pretty print, hex dump, summary │
└─────────────────────────────────────────────────┘
STACK FRAME LAYOUT (x86-64 System V ABI)
========================================================================
When foo() calls bar():
BEFORE CALL (in foo): AFTER CALL PROLOGUE (in bar):
┌───────────────────────┐ ┌───────────────────────┐
│ foo's frame │ │ foo's frame │
│ ... │ │ ... │
│ foo's locals │ │ foo's locals │
├───────────────────────┤ ├───────────────────────┤
│ ... │ │ arg 7 (if any) │
│ │ ├───────────────────────┤
│ RSP ──────────┼──┐ │ return address │◄── call pushes this
└───────────────────────┘ │ ├───────────────────────┤
│ │ saved RBP (foo's) │◄── push rbp
│ ├───────────────────────┤
│ │ bar's locals │◄── sub rsp, N
│ │ │
│ │ RSP ──────────┼──┐
│ └───────────────────────┘ │
│ │
└────► Stack grew by call+prologue ─┘
STANDARD PROLOGUE:
push rbp ; Save caller's frame pointer
mov rbp, rsp ; Set up new frame pointer
sub rsp, N ; Allocate space for locals
STANDARD EPILOGUE:
mov rsp, rbp ; Deallocate locals (or just "leave")
pop rbp ; Restore caller's frame pointer
ret ; Pop return address, jump to it
4.2 Key Components
| Component | Responsibility | Key Functions |
|---|---|---|
| main.c | CLI parsing, mode selection | main(), parse_args() |
| stackwalk.c | Core stack walking logic | walk_stack(), get_frame_info() |
| registers.c | Architecture-specific register access | get_rbp(), get_rsp() |
| memory.c | Safe memory reading with validation | safe_read(), is_valid_address() |
| symbols.c | Symbol resolution | resolve_address(), get_function_name() |
| output.c | Formatting and display | print_frame(), print_summary() |
4.3 Data Structures
/* Information about a single stack frame */
typedef struct {
void *frame_pointer; /* Current frame's RBP value */
void *return_address; /* Where to return after this function */
void *saved_rbp; /* Previous frame's RBP (forms the chain) */
void *stack_pointer; /* RSP at time of inspection (frame 0 only) */
size_t local_var_size; /* Estimated size of local variables */
size_t frame_size; /* Total frame size (to next frame) */
int frame_number; /* 0 = current, 1 = caller, etc. */
/* Symbol information (if available) */
char function_name[128]; /* Resolved function name */
void *function_start; /* Start address of function */
size_t offset_in_func; /* Offset of return addr within function */
char source_file[256]; /* Source file (if debug info available) */
int source_line; /* Line number (if debug info available) */
} FrameInfo;
/* Collection of all frames in the stack */
typedef struct {
FrameInfo *frames; /* Array of frame info */
int count; /* Number of frames */
int capacity; /* Allocated capacity */
void *stack_bottom; /* Highest valid stack address */
void *stack_top; /* RSP at start of walk */
size_t total_stack_usage; /* Sum of all frame sizes */
} StackTrace;
/* Options for stack walking */
typedef struct {
int max_frames; /* Maximum frames to walk (default: 100) */
int verbose; /* Show hex dumps */
int resolve_symbols; /* Attempt symbol resolution */
int show_raw_addresses; /* Show addresses without symbolication */
} WalkOptions;
4.4 Algorithm Overview
STACK WALKING ALGORITHM
========================================================================
INPUT: None (reads current CPU registers)
OUTPUT: StackTrace with all frame information
ALGORITHM:
1. GET CURRENT REGISTERS
rbp = current frame pointer (via inline assembly)
rsp = current stack pointer (via inline assembly)
2. INITIALIZE
frames = empty list
current_rbp = rbp
frame_num = 0
3. WALK THE CHAIN
WHILE current_rbp is valid AND frame_num < max_frames:
a. VALIDATE ADDRESS
IF current_rbp is NULL OR not aligned OR outside stack bounds:
BREAK (end of chain or corruption)
b. READ FRAME DATA
saved_rbp = memory[current_rbp] // Previous frame pointer
return_addr = memory[current_rbp + 8] // Return address
c. CALCULATE FRAME SIZE
IF frame_num > 0:
frame_size = previous_rbp - current_rbp
ELSE:
frame_size = estimate from rsp
d. RESOLVE SYMBOL (optional)
function_name = dladdr(return_addr - 1) // -1 to get call site
e. STORE FRAME INFO
frames.append(FrameInfo{
frame_pointer: current_rbp,
return_address: return_addr,
saved_rbp: saved_rbp,
frame_size: frame_size,
function_name: function_name,
...
})
f. MOVE TO NEXT FRAME
current_rbp = saved_rbp
frame_num++
4. RETURN frames
WHY return_addr - 1?
========================================================================
The return address points to the instruction AFTER the call.
To find which function made the call, we need an address WITHIN
the call instruction. Subtracting 1 gives us an address inside
the CALL instruction, which dladdr() can resolve correctly.
0x401230: mov edi, 42
0x401235: call bar <- call instruction (5 bytes on x86-64)
0x40123a: mov eax, 0 <- return address points HERE
return_addr = 0x40123a
return_addr - 1 = 0x401239 (inside the call instruction's address range)
dladdr(0x401239) -> "foo" (correct!)
5. Implementation Guide
5.1 Development Environment Setup
# Required packages (Ubuntu/Debian)
sudo apt-get install gcc gdb binutils
# Required packages (macOS)
xcode-select --install
# Verify tools
gcc --version
gdb --version # or lldb --version on macOS
# Create project directory
mkdir stackwalk && cd stackwalk
mkdir src include
# CRITICAL: We need frame pointers for stack walking
# Add this to your Makefile:
CFLAGS = -Wall -Wextra -g -O0 -fno-omit-frame-pointer
5.2 Project Structure
stackwalk/
├── src/
│ ├── main.c # Entry point, CLI parsing
│ ├── stackwalk.c # Core stack walking implementation
│ ├── registers.c # Register access (inline assembly)
│ ├── memory.c # Safe memory operations
│ ├── symbols.c # Symbol resolution (dladdr)
│ └── output.c # Output formatting
├── include/
│ ├── stackwalk.h # Public API
│ ├── registers.h # Register access declarations
│ ├── memory.h # Memory operations
│ ├── symbols.h # Symbol resolution
│ └── output.h # Output declarations
├── tests/
│ ├── test_basic.c # Basic stack walking test
│ ├── test_deep.c # Deep recursion test
│ └── test_no_fp.c # Test without frame pointers
├── Makefile
└── README.md
5.3 The Core Question You’re Answering
“What exactly is stored in a stack frame, and how can we walk the chain of frames?”
Every function call creates a stack frame. Your tool will reveal:
- Where the return address is stored (how does the CPU know where to go back?)
- How frame pointers link together (the linked list through the stack)
- What happens to local variables (where do they live?)
- How calling conventions affect layout (registers vs stack)
5.4 Concepts You Must Understand First
Before implementing, ensure you can answer:
- What is RBP (or EBP on 32-bit)?
- The frame pointer / base pointer register
- Points to the “base” of the current stack frame
- Used as a stable reference for accessing locals and parameters
- Reference: CS:APP Section 3.7
- What does
calldo at the machine level?- Pushes the return address (next instruction) onto the stack
- Jumps to the target address
- Does NOT set up the frame pointer (that’s the prologue’s job)
- What is the function prologue?
push rbp ; Save caller's frame pointer mov rbp, rsp ; Establish our frame pointer sub rsp, N ; Allocate space for local variables - What is the function epilogue?
mov rsp, rbp ; Or use "leave" instruction pop rbp ; Restore caller's frame pointer ret ; Pop return address and jump - Why do we need -fno-omit-frame-pointer?
- Without it, RBP is used as a general-purpose register
- The frame pointer chain doesn’t exist
- Stack unwinding requires DWARF debug info instead
5.5 Questions to Guide Your Design
Work through these before coding:
- How do you get the current RBP value in C?
- Inline assembly? Compiler builtins? Something else?
- How do you know when to stop walking?
- What does the frame pointer look like at main()?
- How do you detect a corrupted chain?
- How do you validate addresses before reading?
- What makes an address “valid”?
- What’s in /proc/self/maps that can help?
- How do you handle optimized code?
- What if frame pointers are omitted?
- Can you detect this condition?
- How do you calculate frame sizes?
- The chain gives you pointers, not sizes
- How do you determine where one frame ends and another begins?
5.6 Thinking Exercise
Before writing any code, trace through this scenario on paper:
#include <stdio.h>
void bar(int x) {
int local = x * 2;
printf("In bar, local = %d\n", local);
// <-- Imagine stack walk starts here
}
void foo(int a, int b) {
int sum = a + b;
bar(sum);
}
int main(void) {
foo(10, 20);
return 0;
}
Questions to answer:
- Draw the stack layout when we’re inside
bar(), just before printf.- Where is main’s frame? foo’s frame? bar’s frame?
- Where is each return address?
- Where is each saved RBP?
- Starting from bar’s RBP, trace the chain:
- bar’s RBP points to… (what address/value?)
- That location contains… (foo’s saved RBP)
- Following that pointer leads to… (main’s frame)
- For each frame, identify:
- The return address (where in the caller do we return to?)
- The saved RBP (what was the caller’s frame pointer?)
- The local variables (where are sum, local stored?)
5.7 Hints in Layers
Hint 1: Getting Started - Reading RBP
The simplest way to get RBP in GCC/Clang:
void *get_rbp(void) {
void *rbp;
__asm__ volatile ("mov %%rbp, %0" : "=r" (rbp));
return rbp;
}
Or use the compiler builtin:
void *frame = __builtin_frame_address(0); // Current frame
void *caller_frame = __builtin_frame_address(1); // Caller's frame
Note: __builtin_frame_address(N) for N > 0 requires frame pointers!
Hint 2: Basic Frame Walking
The core walking loop is simple once you understand the layout:
void walk_stack(void) {
void **rbp = (void **)get_rbp();
int frame = 0;
while (rbp != NULL && frame < 100) {
void *return_addr = rbp[1]; // Return address is at rbp+8
void *saved_rbp = rbp[0]; // Saved RBP is at rbp+0
printf("Frame %d:\n", frame);
printf(" RBP: %p\n", (void *)rbp);
printf(" Return addr: %p\n", return_addr);
printf(" Saved RBP: %p\n", saved_rbp);
rbp = (void **)saved_rbp; // Move to caller's frame
frame++;
}
}
But this is dangerous! We’re dereferencing arbitrary pointers.
Hint 3: Validating Addresses
Before dereferencing, validate the address:
#include <sys/mman.h>
#include <signal.h>
/* Check if address is likely valid stack address */
int is_valid_stack_address(void *addr) {
/* Must be aligned to 8 bytes (x86-64) */
if ((uintptr_t)addr % 8 != 0) return 0;
/* Must be within reasonable stack range */
/* Stack is typically in high memory, growing down */
if ((uintptr_t)addr < 0x7f0000000000ULL) return 0;
if ((uintptr_t)addr > 0x7ffffffffff0ULL) return 0;
return 1;
}
/* Safer approach: parse /proc/self/maps to find stack bounds */
int get_stack_bounds(void **low, void **high) {
FILE *f = fopen("/proc/self/maps", "r");
char line[256];
while (fgets(line, sizeof(line), f)) {
if (strstr(line, "[stack]")) {
sscanf(line, "%p-%p", low, high);
fclose(f);
return 1;
}
}
fclose(f);
return 0;
}
Hint 4: Symbol Resolution with dladdr
Use dladdr() to convert addresses to function names:
#define _GNU_SOURCE
#include <dlfcn.h>
void resolve_address(void *addr, char *buf, size_t buflen) {
Dl_info info;
if (dladdr(addr, &info) && info.dli_sname) {
snprintf(buf, buflen, "%s+0x%lx (%s)",
info.dli_sname,
(unsigned long)((char *)addr - (char *)info.dli_saddr),
info.dli_fname ? info.dli_fname : "?");
} else {
snprintf(buf, buflen, "%p (unknown)", addr);
}
}
Remember to compile with -rdynamic to export symbols for dladdr:
gcc -rdynamic -o stackwalk stackwalk.c -ldl
Hint 5: Full Implementation Structure
Here’s the skeleton for the complete implementation:
/* stackwalk.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <dlfcn.h>
#define MAX_FRAMES 100
typedef struct {
void *rbp;
void *return_addr;
void *saved_rbp;
char symbol[256];
} Frame;
/* Get current frame pointer */
__attribute__((always_inline))
static inline void *get_rbp(void) {
void *rbp;
__asm__ volatile ("mov %%rbp, %0" : "=r" (rbp));
return rbp;
}
/* Get current stack pointer */
__attribute__((always_inline))
static inline void *get_rsp(void) {
void *rsp;
__asm__ volatile ("mov %%rsp, %0" : "=r" (rsp));
return rsp;
}
/* Validate stack address */
static int is_valid(void *addr) {
uintptr_t a = (uintptr_t)addr;
return (a > 0x1000 && a < 0x7fffffffffffULL && (a % 8) == 0);
}
/* Walk the stack */
int walk_stack(Frame *frames, int max_frames) {
void **rbp = (void **)get_rbp();
int count = 0;
/* Skip this function's frame */
if (is_valid(rbp)) {
rbp = (void **)*rbp;
}
while (is_valid(rbp) && count < max_frames) {
frames[count].rbp = rbp;
frames[count].return_addr = rbp[1];
frames[count].saved_rbp = rbp[0];
/* Resolve symbol */
Dl_info info;
if (dladdr(rbp[1], &info) && info.dli_sname) {
snprintf(frames[count].symbol, sizeof(frames[count].symbol),
"%s", info.dli_sname);
} else {
strcpy(frames[count].symbol, "???");
}
/* Check for end of chain */
if (rbp[0] == NULL) {
count++;
break;
}
rbp = (void **)rbp[0];
count++;
}
return count;
}
/* Pretty print */
void print_frames(Frame *frames, int count) {
printf("═══════════════════════════════════════════════════════\n");
printf(" STACK TRACE \n");
printf("═══════════════════════════════════════════════════════\n");
for (int i = 0; i < count; i++) {
printf("Frame %d: %s\n", i, frames[i].symbol);
printf(" RBP: %p\n", frames[i].rbp);
printf(" Return addr: %p\n", frames[i].return_addr);
printf(" Saved RBP: %p\n", frames[i].saved_rbp);
printf("\n");
}
}
/* Demo functions */
__attribute__((noinline)) void bar(void) {
Frame frames[MAX_FRAMES];
int count = walk_stack(frames, MAX_FRAMES);
print_frames(frames, count);
}
__attribute__((noinline)) void foo(void) {
bar();
}
int main(void) {
foo();
return 0;
}
Compile with:
gcc -Wall -g -O0 -fno-omit-frame-pointer -rdynamic stackwalk.c -o stackwalk -ldl
5.8 The Interview Questions They’ll Ask
After completing this project, you’ll confidently answer:
- “How does a debugger show a stack trace?”
- Walks the frame pointer chain (if available)
- Or uses DWARF unwind information
- Resolves addresses to symbols via debug info or dladdr
- Frame 0 = current function, Frame N = caller N levels up
- “What happens when a function is called?”
- Caller: Place arguments (registers first on x64, stack if overflow)
call: Push return address, jump to target- Callee prologue: Save RBP, set new RBP, allocate locals
- Callee epilogue: Restore RSP, restore RBP,
ret
- “How does a buffer overflow exploit work?”
- Local buffer on stack, adjacent to saved RBP and return address
- Overflow writes past buffer into return address
- When function returns, CPU jumps to attacker-controlled address
- Show them exactly where return address is in your stack walk output
- “What’s the difference between RSP and RBP?”
- RSP: Stack pointer, always points to top of stack, changes frequently
- RBP: Frame/base pointer, stable within a function, used for addressing
- With -fomit-frame-pointer, RBP becomes general-purpose, RSP used for everything
- “Why might a stack trace be incomplete or wrong?”
- Frame pointer omitted (-O2 default)
- Tail call optimization (callee’s frame replaces caller’s)
- Corrupted stack (buffer overflow, use-after-free)
- Signal handlers have separate stack frames
- setjmp/longjmp can create confusing traces
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Calling conventions | CS:APP | Ch. 3.7 “Procedures” |
| Stack layout | CS:APP | Ch. 3.7.1 “Runtime Stack” |
| x86-64 ABI | System V AMD64 ABI | Section 3.2 (online) |
| Memory segments | Expert C Programming | Ch. 6 “Runtime Data Structures” |
| Debugging | GDB Internals | Stack frames section |
5.10 Implementation Phases
Phase 1: Basic Register Access and Frame Walking (2-3 hours)
Goal: Get RBP and walk the basic frame chain.
/* Get this working first: */
void demo(void) {
void *rbp = get_rbp();
printf("Current RBP: %p\n", rbp);
/* Walk one level manually */
void **frame = (void **)rbp;
printf("Return address: %p\n", frame[1]);
printf("Saved RBP: %p\n", frame[0]);
}
Checkpoint: You can print frame pointers for a simple call chain.
Phase 2: Safe Memory Access and Validation (1-2 hours)
Goal: Don’t crash on invalid addresses.
- Implement address validation
- Parse /proc/self/maps for stack bounds
- Add maximum frame limit
Checkpoint: Program handles NULL frame pointers without crashing.
Phase 3: Symbol Resolution (2-3 hours)
Goal: Show function names instead of raw addresses.
- Integrate dladdr()
- Handle missing symbols gracefully
- Calculate offset within function
Checkpoint: Output shows “main”, “foo”, “bar” instead of hex addresses.
Phase 4: Output Formatting and Analysis (2-3 hours)
Goal: Professional, educational output.
- Pretty-printed frame information
- Frame size calculations
- Total stack usage
- Optional hex dump mode
Checkpoint: Output matches the example in section 3.4.
Phase 5: Advanced Features (Optional, 2-4 hours)
Goal: Handle edge cases and add features.
- Compare with/without frame pointers
- Show effect of optimization levels
- Source file and line resolution (addr2line integration)
- Support for recursive functions
Checkpoint: Can demonstrate the effect of -fomit-frame-pointer.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation |
|---|---|---|
| Register access | Inline assembly vs builtins | Inline assembly for clarity and control |
| Symbol resolution | dladdr vs backtrace_symbols | dladdr for more control and less allocation |
| Address validation | Simple range check vs /proc/maps | Both: quick check first, then maps if needed |
| Output format | Plain text vs structured | Plain text for learning, JSON option for tools |
| Error handling | Abort vs graceful | Graceful with clear error messages |
6. Testing Strategy
Test Cases
| Test | Description | Expected Outcome |
|---|---|---|
| Basic chain | main -> foo -> bar | 3+ frames shown with correct nesting |
| Deep recursion | Recursive function 50 calls deep | All 50 frames captured |
| Symbol resolution | All demo functions | Function names resolved correctly |
| Invalid input | Manually corrupt frame pointer | Graceful termination, no crash |
| Edge: no frame ptr | Compiled with -fomit-frame-pointer | Clear warning, limited output |
| Edge: optimized | Compiled with -O2 | Detect and report optimization issues |
Verification Script
#!/bin/bash
echo "Testing stack walker..."
# Test 1: Basic execution
./stackwalk || { echo "FAIL: Basic execution"; exit 1; }
# Test 2: Check for expected symbols
./stackwalk | grep -q "main" || { echo "FAIL: main not found"; exit 1; }
./stackwalk | grep -q "foo" || { echo "FAIL: foo not found"; exit 1; }
./stackwalk | grep -q "bar" || { echo "FAIL: bar not found"; exit 1; }
# Test 3: Verify frame ordering (bar before foo before main)
OUTPUT=$(./stackwalk)
BAR_LINE=$(echo "$OUTPUT" | grep -n "bar" | head -1 | cut -d: -f1)
FOO_LINE=$(echo "$OUTPUT" | grep -n "foo" | head -1 | cut -d: -f1)
MAIN_LINE=$(echo "$OUTPUT" | grep -n "main" | head -1 | cut -d: -f1)
if [ "$BAR_LINE" -lt "$FOO_LINE" ] && [ "$FOO_LINE" -lt "$MAIN_LINE" ]; then
echo "PASS: Frame ordering correct"
else
echo "FAIL: Frame ordering wrong"
exit 1
fi
# Test 4: Compare with GDB backtrace
echo "Comparing with GDB..."
GDB_BT=$(echo -e "break bar\nrun\nbt\nquit" | gdb -q ./stackwalk 2>/dev/null | grep -E "#[0-9]")
echo "$GDB_BT"
echo "All tests passed!"
Debugging with GDB
Use GDB to verify your understanding:
$ gdb ./stackwalk
(gdb) break bar
(gdb) run
(gdb) info frame # Show current frame details
(gdb) bt # Compare with your output
(gdb) x/8gx $rbp # Examine memory at RBP
(gdb) print $rbp # Current frame pointer
(gdb) print *(void**)$rbp # Saved RBP (caller's frame)
(gdb) x/i *($rbp+8) # Instruction at return address
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Missing -fno-omit-frame-pointer | Empty or single-frame output | Add flag to CFLAGS |
| Forgetting -rdynamic | dladdr returns NULL for your functions | Add -rdynamic to link flags |
| Not skipping your own frame | First frame is walk_stack itself |
Skip one frame at start |
| Unaligned access | SIGBUS on some architectures | Validate 8-byte alignment |
| Infinite loop | Circular frame pointer chain | Check for NULL AND limit count |
| Wrong return address offset | Symbols don’t match | It’s rbp[1], not rbp[-1] |
| Optimized away locals | Can’t see local variables | Use -O0 for debugging |
| Inlined functions | Missing expected frames | Use __attribute__((noinline)) |
Debugging Strategies
If frames are missing:
# Check if frame pointers are enabled
objdump -d ./stackwalk | grep "push.*rbp" | head -5
# Should see "push %rbp" at function start
# Check with GDB
gdb ./stackwalk -ex "break bar" -ex "run" -ex "bt" -ex "quit"
If symbols are ???:
# Check symbol table
nm ./stackwalk | grep " T "
# Check if built with -rdynamic
readelf -d ./stackwalk | grep SYMBOLIC
If it crashes:
# Run with address sanitizer
gcc -fsanitize=address -g ./stackwalk.c -o stackwalk -ldl -rdynamic
./stackwalk
# Check where it fails
gdb ./stackwalk -ex "run" -ex "bt"
8. Extensions & Challenges
Beginner Extensions
- Add color-coded output (different colors for each frame)
- Show frame sizes in human-readable format (e.g., “48 bytes”)
- Add JSON output mode for use with other tools
- Show time spent in each frame (requires sampling profiler approach)
Intermediate Extensions
- Integrate addr2line for source file and line numbers
- Support both 32-bit and 64-bit (via conditional compilation)
- Add DWARF unwind support for optimized binaries
- Show register values saved in each frame
- Detect and report leaf functions (no prologue)
Advanced Extensions
- Build a mini-profiler using stack sampling
- Implement exception unwinding (like C++ throw)
- Support for signal handler stacks
- Compare with libunwind or libbacktrace
- Add support for stripped binaries (no symbols)
- Detect stack corruption and buffer overflows
9. Real-World Connections
How Production Tools Do This
GDB:
- Uses DWARF .debug_frame or .eh_frame sections
- Falls back to frame pointer chain if available
- Supports “pseudo-frame pointers” for optimized code
Valgrind:
- Instruments every function entry/exit
- Maintains shadow stack for accurate traces
- Doesn’t rely on frame pointers
Linux Kernel (perf):
- Uses frame pointers when available (kernel built with them)
- Or LBR (Last Branch Record) hardware feature
- Or DWARF unwinding for userspace
Crash Reporting (like Crashpad):
- Captures register state at crash time
- Uses platform-specific unwinders
- Symbols resolved server-side from debug packages
Related Projects
- libunwind: Portable stack unwinding library
- libbacktrace: Ian Lance Taylor’s backtrace library (used in GCC)
- elfutils: DWARF unwinding support
- simpleperf: Android’s profiler with stack sampling
10. Resources
Essential Reading
- CS:APP Chapter 3.7: “Procedures” - Complete coverage of calling conventions
- System V AMD64 ABI: Official x86-64 calling convention specification
- GDB Internals: Stack frame handling documentation
Online Resources
- Eli Bendersky’s Stack Frame Layout - Excellent visual explanation
- OSDev Wiki: Stack - Low-level details
- DWARF Standard - Debug information format
Tools
gdb/lldb- Reference implementations of stack walkingobjdump -d- Disassemble to see prologues/epiloguesreadelf -wF- Show .eh_frame (unwind info)addr2line- Convert addresses to source lines
11. Self-Assessment Checklist
Understanding
- I can draw a stack frame showing RBP, return address, locals
- I know the difference between RSP and RBP
- I understand why frame pointers form a linked list
- I can explain the function prologue and epilogue
- I know why -fomit-frame-pointer breaks stack walking
- I understand System V AMD64 ABI register passing order
Implementation
- My tool walks frames correctly for a known call chain
- Symbol resolution works (function names appear)
- Invalid addresses don’t cause crashes
- Output matches GDB’s
btcommand - Code handles edge cases (NULL frame pointer, max depth)
Growth
- I can debug stack traces in crash reports
- I understand buffer overflow exploits at the stack level
- I can read GCC output and identify prologue/epilogue
- I know when frame pointers are present or absent
- I can explain this to someone learning systems programming
12. Submission / Completion Criteria
Minimum Viable Completion
- Walk at least 3 frames of a test program
- Display frame pointer and return address for each
- Compile and run without crashes
Full Completion
- All functional requirements implemented
- Symbol resolution working
- Clean, formatted output
- Handle edge cases gracefully
- Works with provided test cases
- Compare mode showing -fomit-frame-pointer effect
Excellence
- Source line resolution (addr2line integration)
- Support for both 32-bit and 64-bit
- DWARF unwind support for optimized code
- Performance profiler extension
- Comprehensive documentation and examples
This guide was expanded from EXPERT_C_PROGRAMMING_DEEP_DIVE.md. For the complete learning path, see the project index.