Project 5: The Assembly Level - Disassemble and Stepi
Debug an optimized program at the instruction level and learn what the CPU actually executes.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 3-4 hours |
| Language | GDB commands / x86-64 assembly |
| Prerequisites | Project 1, basic registers, calling convention |
| Key Topics | disassemble, stepi, registers, optimizations |
1. Learning Objectives
By completing this project, you will:
- Navigate disassembly and correlate it to source code.
- Step instruction-by-instruction with `stepi` and `nexti`.
- Inspect register state and calling convention behavior.
- Explain how compiler optimizations change program structure.
2. Theoretical Foundation
2.1 Core Concepts
- Instruction pointer (RIP): Points to the next instruction to execute.
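- Calling convention (System V AMD64): The first integer or pointer arguments arrive in `rdi`, `rsi`, `rdx`, `rcx`, `r8`, `r9`, in that order; the return value comes back in `rax`.
- Optimization effects: Variables may never exist in memory; code can be reordered or removed.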
2.2 Why This Matters
When debugging optimized code, source-level stepping can mislead: lines appear to run out of order, repeat, or never run at all. The disassembly is the source of truth for what actually executed.
2.3 Historical Context / Background
Compiler optimization research matured in the 1970s and 1980s. Modern compilers aggressively transform code for speed, which makes assembly-level debugging essential.
2.4 Common Misconceptions
- “C lines map 1:1 to assembly”: Often false under optimization.
- “Variables always exist”: Many are optimized into registers or removed.
3. Project Specification
3.1 What You Will Build
A simple C program compiled with -O2. You will step through instructions, inspect registers, and prove how optimizations rewrite the code.
3.2 Functional Requirements
- Compile with debug symbols and optimizations.
- Disassemble `main` and a helper function.
- Step into and through the helper using `stepi`.
- Read register values to confirm computations.
3.3 Non-Functional Requirements
- Performance: Small program for clear disassembly.
- Reliability: Same binary used for analysis.
- Usability: Use TUI for readability.
3.4 Example Usage / Output
(gdb) disassemble main
(gdb) layout asm
(gdb) stepi
(gdb) info registers rax rdi rsi
3.5 Real World Outcome
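You will see assembly and register changes during execution. The session below assumes `calculate` survives as a separate function; at -O2 the compiler may inline it into `main`, in which case the disassembly will differ.
$ gcc -g -O2 -o optimized optimized.c
$ gdb ./optimized
(gdb) break calculate
(gdb) run
(gdb) disassemble calculate
Dump of assembler code for function calculate:
0x... <+0>: lea (%rdi,%rsi,1),%eax
0x... <+3>: add %eax,%eax
0x... <+5>: ret
(gdb) stepi
(gdb) stepi
(gdb) info registers rax
rax 0x3c 60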
4. Solution Architecture
4.1 High-Level Design
┌──────────────┐ ┌───────────────┐ ┌──────────────┐
│ optimized.c │────▶│ gcc -O2 -g │────▶│ gdb assembly │
└──────────────┘ └───────────────┘ └──────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Optimized binary | Provide realistic assembly | Use -O2 |
| GDB session | Inspect instructions | Use layout asm |
| Register view | Verify computation | Inspect rax, rdi, rsi |
4.3 Data Structures
struct RegisterSnapshot {
    unsigned long rax;
    unsigned long rdi;
    unsigned long rsi;
};
4.4 Algorithm Overview
Key Algorithm: Instruction trace
- Stop at `main`.
- Disassemble functions.
- Step instruction-by-instruction.
- Read registers after each instruction.
Complexity Analysis:
- Time: O(I) for number of instructions stepped.
- Space: O(1).
5. Implementation Guide
5.1 Development Environment Setup
gcc -g -O2 -o optimized optimized.c
5.2 Project Structure
project-root/
├── optimized.c
└── optimized
5.3 The Core Question You’re Answering
“What is the CPU actually executing, and how does it differ from my source code?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Calling convention
  - Argument registers and return register
- Basic x86-64 instructions
  - `mov`, `lea`, `add`, `call`, `ret`
- Optimization levels
  - `-O0` vs `-O2` vs `-O3`
5.5 Questions to Guide Your Design
- Which register holds the return value?
- How do you map an instruction to a C line?
- Why might the compiler remove a variable?
5.6 Thinking Exercise
How would you force the compiler to keep a variable in memory so it is visible in GDB?
5.7 The Interview Questions They’ll Ask
- What is the calling convention on x86-64?
- Why does optimized code make debugging harder?
- How do you step a single instruction in GDB?
5.8 Hints in Layers
Hint 1: Use TUI
`layout asm`, `layout regs`
Hint 2: Mixed view
`disassemble /m main` (recent GDB releases prefer the `/s` modifier, which keeps instructions in address order)
Hint 3: Inspect registers
info registers rax rdi rsi
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Assembly basics | CSAPP | Ch. 3 |
| Optimizations | CSAPP | Ch. 5 |
| GDB assembly features | GDB Manual | “Examining Source” |
5.10 Implementation Phases
Phase 1: Foundation (45 minutes)
Goals:
- Produce an optimized binary.
Tasks:
- Compile with `-O2 -g`.
- Disassemble `main`.
Checkpoint: You can see assembly output.
Phase 2: Core Functionality (60 minutes)
Goals:
- Step and inspect.
Tasks:
- Step through instructions with `stepi`.
- Record register values.
Checkpoint: You can explain how calculate works in assembly.
Phase 3: Polish & Edge Cases (45 minutes)
Goals:
- Compare with `-O0` build.
Tasks:
- Build a non-optimized binary.
- Compare disassembly and variable visibility.
Checkpoint: You can explain optimization differences.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Optimization | -O0, -O2 | -O2 | Demonstrates real-world issues |
| View mode | TUI vs plain | TUI | Easier to follow |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Disassembly | Confirm view | disassemble main works |
| Stepping | Validate instruction stepping | stepi advances RIP |
| Register read | Confirm state | info registers rax |
6.2 Critical Test Cases
- Disassembly shows `calculate` instructions.
- `stepi` moves across `call` and `ret`.
- `rax` holds expected result.
6.3 Test Data
calculate(10, 20) -> 60
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Optimized out variables | <optimized out> | Inspect registers instead |
| Confusing source lines | Stepping jumps | Use disassemble /m |
| Missing symbols | No source mapping | Compile with -g |
7.2 Debugging Strategies
- Track `rip` and use `x/10i $rip`.
- Compare `-O0` and `-O2` builds to learn changes.
7.3 Performance Traps
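None for this small target, but be aware that single-stepping is slow in general: every `stepi` stops the process and hands control back to GDB, so tracing a long instruction sequence takes far longer than running it.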
8. Extensions & Challenges
8.1 Beginner Extensions
- Use `nexti` to skip over function calls.
8.2 Intermediate Extensions
- Debug a loop unrolled by the compiler.
8.3 Advanced Extensions
- Inspect SIMD instructions and vector registers.
9. Real-World Connections
9.1 Industry Applications
- Performance debugging: Identify compiler reordering and optimization effects.
- Security: Understand how binary-level behavior differs from source.
9.2 Related Open Source Projects
- Compiler Explorer: Compare source and assembly visually.
- GDB: Official debugger.
9.3 Interview Relevance
- Demonstrates low-level understanding of how code executes.
10. Resources
10.1 Essential Reading
- CSAPP - Machine-level representation and optimizations.
- Intel SDM - Instruction reference.
10.2 Video Resources
- Search: “gdb assembly stepi”.
10.3 Tools & Documentation
- GDB: https://sourceware.org/gdb/
- objdump: For offline disassembly.
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how a `call` sets up a return.
- I can map an instruction to a C statement.
11.2 Implementation
- I stepped through an optimized function.
- I verified results in registers.
11.3 Growth
- I can debug code even if source is misleading.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Disassemble and step through `calculate`.
Full Completion:
- Explain register values at each instruction.
Excellence (Going Above & Beyond):
- Compare instruction sequences for `-O0` and `-O2` builds.
This guide was generated from LEARN_GDB_DEEP_DIVE.md. For the complete learning path, see the parent directory README.