Project 17: Calling Convention Visualizer
Build a tool that shows how function arguments are passed (registers vs stack), revealing the ABI layer between C and machine code.
Quick Reference
| Attribute | Value |
|---|---|
| Language | C |
| Difficulty | Level 4 (Advanced) |
| Time | 2 Weeks |
| Book Reference | CS:APP Ch. 3, Expert C Programming Ch. 6 |
| Key Concepts | Calling conventions, ABI, registers, stack |
| Prerequisites | Assembly basics, stack frames, function calls |
| Portfolio Value | High - Demonstrates systems understanding |
1. Learning Objectives
By completing this project, you will:
- Master the System V AMD64 ABI - Understand how the most common 64-bit Unix calling convention passes arguments and returns values
- Understand register allocation for parameters - Know exactly which registers (RDI, RSI, RDX, RCX, R8, R9, XMM0-XMM7) are used for which argument types
- Know when arguments spill to the stack - Understand why the 7th integer argument goes on the stack, and how large structs are handled
- Compare calling conventions - Understand differences between System V AMD64, Windows x64, cdecl, and ARM64 ABIs
- Parse C function prototypes - Build a parser that extracts return type, function name, and parameter types
- Generate verification assembly - Create assembly code that proves your analysis is correct
- Understand struct passing rules - Learn the complex classification algorithm for passing structs (INTEGER, SSE, MEMORY classes)
- Debug FFI issues - Apply this knowledge to debug foreign function interface problems in languages like Python, Rust, and Go
2. Theoretical Foundation
2.1 Core Concepts
What is a Calling Convention?
A calling convention is a contract between the caller and callee that specifies:
+------------------------------------------------------------------+
| CALLING CONVENTION CONTRACT |
+------------------------------------------------------------------+
| |
| 1. ARGUMENT PASSING |
| - Which arguments go in which registers? |
| - Which arguments go on the stack? |
| - In what order? |
| |
| 2. RETURN VALUES |
| - Which register holds the return value? |
| - How are large return values handled? |
| |
| 3. REGISTER PRESERVATION |
| - Which registers must the callee save? (callee-saved) |
| - Which registers can the callee trash? (caller-saved) |
| |
| 4. STACK MANAGEMENT |
| - Who cleans up the stack after the call? |
| - What alignment is required? |
| |
+------------------------------------------------------------------+
System V AMD64 ABI - The Linux/macOS Standard
The System V AMD64 ABI is used on Linux, macOS, FreeBSD, and other Unix-like systems:
+------------------------------------------------------------------+
| SYSTEM V AMD64 ABI - ARGUMENT PASSING |
+------------------------------------------------------------------+
| |
| INTEGER/POINTER ARGUMENTS (in order): |
| +--------+--------+--------+--------+--------+--------+ |
| | 1st | 2nd | 3rd | 4th | 5th | 6th | |
| | RDI | RSI | RDX | RCX | R8 | R9 | |
| +--------+--------+--------+--------+--------+--------+ |
| |
| FLOATING-POINT ARGUMENTS (in order): |
| +--------+--------+--------+--------+--------+--------+--------+ |
| | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | |
| | XMM0 | XMM1 | XMM2 | XMM3 | XMM4 | XMM5 | XMM6 | |
| +--------+--------+--------+--------+--------+--------+--------+ |
| | 8th | |
| | XMM7 | |
| +--------+ |
| |
| OVERFLOW ARGUMENTS: Pushed on stack right-to-left |
| |
| RETURN VALUE: |
| - Integer/pointer: RAX (and RDX for 128-bit) |
| - Floating-point: XMM0 (and XMM1 for 128-bit) |
| - Large structs: Caller allocates, passes hidden pointer in RDI |
| |
+------------------------------------------------------------------+
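The two register sequences above advance independently, which is easy to miss when reading the table. The following is a minimal sketch of that bookkeeping, not the project's reference implementation. The function name `assign_sysv` and the one-letter kind codes ('i' for INTEGER class, 'f' for SSE class) are assumptions made for this illustration, and only scalar arguments of 8 bytes or less are modeled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: kinds[k] is 'i' (INTEGER class) or 'f' (SSE class).
 * Writes the location of argument k, as seen at function entry, into out[k].
 * The caller must provide at least strlen(kinds) rows in out. */
void assign_sysv(const char *kinds, char out[][16])
{
    static const char *int_regs[] = {"RDI", "RSI", "RDX", "RCX", "R8", "R9"};
    size_t n = strlen(kinds);
    size_t next_int = 0, next_sse = 0;   /* two independent counters */
    int stack_off = 8;                   /* [RSP+0] holds the return address */

    for (size_t k = 0; k < n; k++) {
        assert(kinds[k] == 'i' || kinds[k] == 'f');
        if (kinds[k] == 'i' && next_int < 6) {
            snprintf(out[k], 16, "%s", int_regs[next_int++]);
        } else if (kinds[k] == 'f' && next_sse < 8) {
            snprintf(out[k], 16, "XMM%zu", next_sse++);
        } else {
            /* Registers of this class are exhausted: use the stack. */
            snprintf(out[k], 16, "[RSP+%d]", stack_off);
            stack_off += 8;              /* each scalar takes one 8-byte slot */
        }
    }
}
```

Calling it with "ifif" assigns RDI, XMM0, RSI, XMM1: the second integer still gets RSI even though a floating-point argument sits between the two integers.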
Register Sizes and Subregisters
+------------------------------------------------------------------+
| x86-64 REGISTER NAMING |
+------------------------------------------------------------------+
| |
| 64-bit 32-bit 16-bit 8-bit |
| ------ ------ ------ ----- |
| RAX EAX AX AL |
| RBX EBX BX BL |
| RCX ECX CX CL |
| RDX EDX DX DL |
| RSI ESI SI SIL |
| RDI EDI DI DIL |
| RBP EBP BP BPL |
| RSP ESP SP SPL |
| R8 R8D R8W R8B |
| R9 R9D R9W R9B |
| R10 R10D R10W R10B |
| R11 R11D R11W R11B |
| R12 R12D R12W R12B |
| R13 R13D R13W R13B |
| R14 R14D R14W R14B |
| R15 R15D R15W R15B |
| |
| IMPORTANT: For 32-bit operations (int), use EDI, ESI, etc. |
| Writing to EAX automatically zero-extends to RAX. |
| |
+------------------------------------------------------------------+
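For the visualizer, the naming table boils down to a lookup keyed by argument register and operand size. Here is one possible sketch; the function name `subreg_name` and the table layout are illustrative assumptions, and only the six System V integer argument registers are covered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup: reg_index selects the argument register
 * (0 = RDI ... 5 = R9), size_bytes is the operand width. */
const char *subreg_name(int reg_index, size_t size_bytes)
{
    static const char *names[6][4] = {
        /* 8 bytes, 4 bytes, 2 bytes, 1 byte */
        {"RDI", "EDI", "DI",  "DIL"},
        {"RSI", "ESI", "SI",  "SIL"},
        {"RDX", "EDX", "DX",  "DL"},
        {"RCX", "ECX", "CX",  "CL"},
        {"R8",  "R8D", "R8W", "R8B"},
        {"R9",  "R9D", "R9W", "R9B"},
    };

    assert(reg_index >= 0 && reg_index < 6);
    switch (size_bytes) {
    case 8: return names[reg_index][0];
    case 4: return names[reg_index][1];
    case 2: return names[reg_index][2];
    case 1: return names[reg_index][3];
    default: return NULL;   /* not a scalar integer width */
    }
}
```

With this shape, an `int` in the first slot resolves through `subreg_name(0, 4)` and a `long` through `subreg_name(0, 8)`, so the formatter never has to special-case register names.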
Caller-Saved vs Callee-Saved Registers
+------------------------------------------------------------------+
| REGISTER PRESERVATION IN SYSTEM V AMD64 |
+------------------------------------------------------------------+
| |
| CALLER-SAVED (volatile - callee can modify freely): |
| +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ |
| | RAX | RCX | RDX | RSI | RDI | R8 | R9 | R10 | R11 | | |
| +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ |
| | XMM0-XMM15 | (all SSE registers are caller-saved) | |
| +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ |
| |
| If caller needs these values after the call, it must save them. |
| |
| CALLEE-SAVED (non-volatile - callee must preserve): |
| +-----+-----+-----+-----+-----+-----+-----+ |
| | RBX | RBP | R12 | R13 | R14 | R15 | RSP | |
| +-----+-----+-----+-----+-----+-----+-----+ |
| |
| Callee must save these on entry and restore before returning. |
| |
+------------------------------------------------------------------+
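If the tool later annotates generated assembly with save/restore hints, the preservation table can be encoded as a small predicate. This is only a sketch; the name `is_callee_saved` and the choice of upper-case 64-bit register names as input are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `reg` (upper-case 64-bit name) must be preserved by the
 * callee under System V AMD64, 0 otherwise. */
int is_callee_saved(const char *reg)
{
    static const char *saved[] = {
        "RBX", "RBP", "R12", "R13", "R14", "R15", "RSP"
    };

    assert(reg != NULL);
    for (size_t i = 0; i < sizeof saved / sizeof saved[0]; i++) {
        if (strcmp(reg, saved[i]) == 0)
            return 1;
    }
    return 0;   /* everything else, including all argument registers */
}
```

Note that every argument register (RDI, RSI, RDX, RCX, R8, R9) falls in the caller-saved group, so a callee may overwrite its own incoming arguments freely.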
Stack Frame Layout
+------------------------------------------------------------------+
| STACK FRAME LAYOUT (System V AMD64) |
+------------------------------------------------------------------+
| |
| HIGH ADDRESSES |
| +------------------------------------------+ |
| | Argument 8 (if any) | [RBP + 24] |
| +------------------------------------------+ |
| | Argument 7 (if any) | [RBP + 16] |
| +------------------------------------------+ |
| | Return address | [RBP + 8] |
| +------------------------------------------+ |
| | Saved RBP (frame pointer) | [RBP] <-- RBP |
| +------------------------------------------+ |
| | Local variable 1 | [RBP - 8] |
| +------------------------------------------+ |
| | Local variable 2 | [RBP - 16] |
| +------------------------------------------+ |
| | ... | |
| +------------------------------------------+ |
| | Red zone (128 bytes) | <-- RSP |
| +------------------------------------------+ |
| LOW ADDRESSES |
| |
| NOTES: |
| - Stack must be 16-byte aligned before CALL instruction |
| - Red zone: 128 bytes below RSP that leaf functions can use |
| without adjusting RSP (System V only, not Windows) |
| - CALL pushes an 8-byte return address, so at function entry |
| RSP+8 (not RSP itself) is 16-byte aligned |
| |
+------------------------------------------------------------------+
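The offsets and the alignment rule in the notes above reduce to a little arithmetic. The sketch below works through it; both function names are hypothetical, and it assumes every stack argument occupies one 8-byte slot.

```c
#include <assert.h>
#include <stdint.h>

/* Offset from RSP, at function entry, of the k-th stack-passed argument
 * (k = 0 is the first argument that did not fit in a register).
 * [RSP+0] holds the return address pushed by CALL. */
int stack_arg_offset_at_entry(int k)
{
    assert(k >= 0);
    return 8 + 8 * k;
}

/* Extra bytes the caller subtracts from RSP before pushing `arg_bytes`
 * bytes of stack arguments, so that RSP is a multiple of 16 at the CALL. */
uint64_t call_site_padding(uint64_t rsp, uint64_t arg_bytes)
{
    assert(arg_bytes % 8 == 0);
    return (rsp - arg_bytes) % 16;
}
```

After the callee runs `push rbp; mov rbp, rsp`, each entry offset grows by 8, which is why the diagram shows argument 7 at [RBP + 16] while the tool's entry-time view shows it at [RSP + 8].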
2.2 Why This Matters
Understanding calling conventions is essential for:
1. Foreign Function Interface (FFI)
// Python calling C code (ctypes/cffi)
// Must match the ABI exactly or crash/corrupt data
// If you misunderstand the ABI:
// - Arguments get wrong values
// - Return values are garbage
// - Stack corruption causes crashes
2. Writing Assembly
; If you're writing hand-optimized assembly:
; - Must know where arguments arrive
; - Must preserve the right registers
; - Must return values in the right place
3. Debugging Core Dumps
When examining a crash in GDB:
- Register values tell you argument values
- Understanding the ABI helps trace the call
4. Performance Optimization
// Knowing that args 1-6 go in registers:
// - Functions with <= 6 integer/pointer args pass all of them in registers (no stack traffic)
// - Reordering parameters can improve cache behavior
// - Choosing types affects register usage (int vs float)
5. Security Research
ROP/JOP exploits require understanding:
- Where return addresses are stored
- How to pass "arguments" to gadgets
- Register state at function entry
2.3 Historical Context
+------------------------------------------------------------------+
| EVOLUTION OF CALLING CONVENTIONS |
+------------------------------------------------------------------+
| |
| 1970s-1980s: CHAOS |
| - Each compiler had its own convention |
| - Calling code from different compilers was painful |
| |
| 1980s: CDECL (C Declaration) |
| - Arguments pushed on stack right-to-left |
| - Caller cleans up stack |
| - Works for variadic functions (printf) |
| - EAX holds return value |
| |
| 1990s: FASTCALL variants |
| - Pass first 2-3 args in registers (ECX, EDX) |
| - Rest on stack |
| - Faster but incompatible implementations |
| |
| 2000s: AMD64/x86-64 |
| - Two major ABIs diverged: |
| * System V AMD64: Linux, macOS, BSD (6 integer regs) |
| * Microsoft x64: Windows (4 integer regs, shadow space) |
| |
| 2010s-present: ARM64 |
| - x0-x7 for integer arguments (8 registers!) |
| - Cleaner design, learned from x86 mistakes |
| |
+------------------------------------------------------------------+
2.4 Common Misconceptions
+------------------------------------------------------------------+
| CALLING CONVENTION MYTHS |
+------------------------------------------------------------------+
| |
| MYTH 1: "All arguments go on the stack" |
| ------- |
| REALITY: In System V AMD64, first 6 integer args go in registers. |
| Stack is only used for overflow arguments. |
| This is much faster than pure stack passing. |
| |
| MYTH 2: "Calling conventions are the same everywhere" |
| ------- |
| REALITY: Windows x64 uses different registers (RCX, RDX, R8, R9) |
| and requires 32-byte shadow space. Linux code won't work.|
| |
| MYTH 3: "The compiler handles it, I don't need to know" |
| ------- |
| REALITY: True for pure C. But FFI, assembly, debugging, and |
| reverse engineering all require this knowledge. |
| |
| MYTH 4: "Structs are always passed by pointer" |
| ------- |
| REALITY: Small structs (up to 16 bytes) can be passed in |
| registers. The classification algorithm is complex. |
| |
| MYTH 5: "Return values are always in RAX" |
| ------- |
| REALITY: Floats return in XMM0. Large structs use a hidden |
| pointer parameter. Complex numbers use XMM0:XMM1. |
| |
+------------------------------------------------------------------+
3. Project Specification
3.1 What You Will Build
A calling convention visualizer that parses C function prototypes and shows exactly where each argument is passed:
$ ./callconv "int add(int a, int b, int c, int d, int e, int f, int g)"
Function: int add(int a, int b, int c, int d, int e, int f, int g)
Argument Passing (System V AMD64):
+------+-------+-------+-----------+---------------------------+
| Arg | Name | Type | Passed In | Notes |
+------+-------+-------+-----------+---------------------------+
| 1 | a | int | EDI | Integer arg #1 (32-bit) |
| 2 | b | int | ESI | Integer arg #2 (32-bit) |
| 3 | c | int | EDX | Integer arg #3 (32-bit) |
| 4 | d | int | ECX | Integer arg #4 (32-bit) |
| 5 | e | int | R8D | Integer arg #5 (32-bit) |
| 6 | f | int | R9D | Integer arg #6 (32-bit) |
| 7 | g | int | [RSP+8] | Stack overflow argument |
+------+-------+-------+-----------+---------------------------+
Return Value: EAX (32-bit integer)
Stack Frame at Function Entry:
+------------------+
| Argument 7 (g) | [RSP + 8]
+------------------+
| Return Address | [RSP]
+------------------+
3.2 Functional Requirements
FR1: Parse C Function Prototypes
- Handle basic types: void, char, short, int, long, long long
- Handle floating-point: float, double, long double
- Handle pointers: int *, char **, void *
- Handle const qualifiers: const char *
- Handle unsigned: unsigned int, unsigned long
- Handle typedefs: size_t, int64_t (common ones)
FR2: Show Register Allocation
- Display which register each argument uses
- Show correct subregister names (EDI vs RDI for int vs long)
- Mark stack overflow arguments with offset
- Distinguish INTEGER, SSE, and MEMORY class arguments
FR3: Show Return Value Location
- Integer returns in RAX/EAX
- Floating-point returns in XMM0
- Large struct returns via hidden pointer
FR4: Handle Mixed Arguments
$ ./callconv "double compute(int x, double y, int z, float w)"
+------+-------+---------+-----------+---------------------------+
| Arg | Name | Type | Passed In | Notes |
+------+-------+---------+-----------+---------------------------+
| 1 | x | int | EDI | Integer arg #1 |
| 2 | y | double | XMM0 | SSE arg #1 (64-bit) |
| 3 | z | int | ESI | Integer arg #2 |
| 4 | w | float | XMM1 | SSE arg #2 (32-bit) |
+------+-------+---------+-----------+---------------------------+
Return Value: XMM0 (double, 64-bit)
FR5: Generate Verification Assembly
$ ./callconv "int add(int a, int b)" --verify
; Verification assembly (paste into Godbolt):
; Arguments should appear in EDI and ESI
global verify_add
verify_add:
; int a is in EDI
; int b is in ESI
mov eax, edi ; Copy first arg to return
add eax, esi ; Add second arg
ret
FR6: Compare with Windows x64
$ ./callconv "int add(int a, int b, int c, int d, int e)" --windows
Comparison: System V AMD64 vs Windows x64
System V AMD64:
arg1: EDI arg2: ESI arg3: EDX arg4: ECX arg5: R8D
Windows x64:
arg1: ECX arg2: EDX arg3: R8D arg4: R9D arg5: [RSP+40]
(Note: Windows requires 32-byte shadow space)
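The key difference behind this comparison is that Windows x64 assigns registers by argument position, while System V keeps separate integer and SSE counters. A rough sketch of the Windows rule follows; the helper name `assign_win64` is an assumption for illustration, and only scalar arguments are modeled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: position is the 0-based argument index,
 * is_float is nonzero for float/double arguments.
 * Writes the location at function entry into out. */
void assign_win64(int position, int is_float, char *out, size_t out_size)
{
    static const char *int_regs[] = {"RCX", "RDX", "R8", "R9"};

    assert(position >= 0 && out != NULL && out_size > 0);
    if (position < 4) {
        /* The slot number alone picks the register: a double in
         * position 1 uses XMM1, even if it is the first float. */
        if (is_float)
            snprintf(out, out_size, "XMM%d", position);
        else
            snprintf(out, out_size, "%s", int_regs[position]);
    } else {
        /* 8 bytes of return address, then one 8-byte slot per earlier
         * argument; slots 0-3 form the 32-byte shadow space. */
        snprintf(out, out_size, "[RSP+%d]", 8 + 8 * position);
    }
}
```

For the fifth argument (position 4) this yields [RSP+40], matching the comparison output above. For `double compute(int x, double y, ...)` it places y in XMM1, whereas System V would use XMM0.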
3.3 Non-Functional Requirements
- Accuracy: Must match actual compiler behavior (verify with GCC/Clang)
- Performance: Parsing should be instantaneous for single prototypes
- Educational Value: Output should teach, not just inform
- Portability: Should compile on Linux and macOS
3.4 Example Usage / Output
Example 1: Simple Function
$ ./callconv "void hello(void)"
Function: void hello(void)
No arguments to pass.
Return Value: None (void)
Example 2: Variadic Function
$ ./callconv "int printf(const char *fmt, ...)"
Function: int printf(const char *fmt, ...)
+------+-------+---------------+-----------+---------------------------+
| Arg | Name | Type | Passed In | Notes |
+------+-------+---------------+-----------+---------------------------+
| 1 | fmt | const char * | RDI | Integer arg #1 (pointer) |
| ... | ... | variadic | varies | See below |
+------+-------+---------------+-----------+---------------------------+
Variadic Arguments:
- For variadic functions, AL must hold an upper bound (0-8) on the number of XMM registers used
- Integer variadic args continue in RSI, RDX, RCX, R8, R9, then stack
- Float variadic args use XMM0-XMM7, then stack
Return Value: EAX (32-bit integer)
Example 3: Struct Parameters
$ ./callconv "void process(struct point p)" --struct "struct point { int x; int y; }"
Function: void process(struct point p)
Struct Analysis:
struct point { int x; int y; }
Size: 8 bytes
Classification: INTEGER, INTEGER
+------+-------+----------------+-----------+---------------------------+
| Arg | Name | Type | Passed In | Notes |
+------+-------+----------------+-----------+---------------------------+
| 1 | p | struct point | RDI | Packed into single reg |
| | | (8 bytes) | | x in low 32, y in high 32 |
+------+-------+----------------+-----------+---------------------------+
Return Value: None (void)
Example 4: Large Struct Return
$ ./callconv "struct big getBig(int x)" --struct "struct big { long a; long b; long c; }"
Function: struct big getBig(int x)
Struct Analysis (Return Type):
struct big { long a; long b; long c; }
Size: 24 bytes (> 16 bytes)
Classification: MEMORY (too large for registers)
+------+-------+---------------+-----------+---------------------------+
| Arg | Name | Type | Passed In | Notes |
+------+-------+---------------+-----------+---------------------------+
| 0 | (ret) | struct big * | RDI | HIDDEN: caller allocates |
| 1 | x | int | ESI | Shifted to arg #2 slot! |
+------+-------+---------------+-----------+---------------------------+
Return Value:
- RAX contains pointer to result (same as hidden arg)
- Callee writes to memory pointed by hidden RDI
NOTE: The hidden pointer shifts all other arguments by one position!
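One way to internalize the hidden pointer is to write out, in plain C, roughly what the ABI turns the function into. The sketch below is an illustration under that assumption: `getBig_lowered` is a hypothetical name, and a real compiler performs this rewrite internally rather than exposing a second function.

```c
#include <assert.h>
#include <stddef.h>

struct big { long a; long b; long c; };   /* 24 bytes on LP64: MEMORY class */

/* What the programmer writes. */
struct big getBig(int x)
{
    struct big r = { x, x + 1, x + 2 };
    return r;
}

/* Roughly what the ABI makes of it: the caller supplies the result
 * buffer as a hidden first argument (RDI), x moves to the second
 * integer slot (ESI), and the same pointer comes back in RAX. */
struct big *getBig_lowered(struct big *ret, int x)
{
    assert(ret != NULL);
    ret->a = x;
    ret->b = x + 1;
    ret->c = x + 2;
    return ret;
}
```

Both versions leave the same three values in the caller's buffer; the second simply makes the extra argument visible, which is the view the tool presents as argument 0.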
3.5 Real World Outcome
After building this tool, you will be able to:
- Debug FFI calls when Python/Ruby/Go calls C code and gets garbage values
- Write correct inline assembly that receives arguments properly
- Read disassembly and immediately know what arguments a function received
- Understand ABI breaks when library updates change struct sizes
- Answer interview questions about low-level function call mechanics
4. Solution Architecture
4.1 High-Level Design
+------------------------------------------------------------------+
| CALLCONV ARCHITECTURE |
+------------------------------------------------------------------+
| |
| Input: "int add(int a, int b, int c)" |
| | |
| v |
| +----------------------------+ |
| | PROTOTYPE PARSER | |
| | - Tokenize the string | |
| | - Parse return type | |
| | - Parse function name | |
| | - Parse parameters | |
| +----------------------------+ |
| | |
| v |
| +----------------------------+ |
| | TYPE CLASSIFIER | |
| | - Classify each type | |
| | - INTEGER, SSE, MEMORY | |
| | - Handle structs | |
| +----------------------------+ |
| | |
| v |
| +----------------------------+ |
| | REGISTER ALLOCATOR | |
| | - Assign registers | |
| | - Track overflow to stack | |
| | - Handle hidden pointers | |
| +----------------------------+ |
| | |
| v |
| +----------------------------+ |
| | OUTPUT FORMATTER | |
| | - Generate table | |
| | - Generate assembly | |
| | - Compare ABIs | |
| +----------------------------+ |
| | |
| v |
| Output: Formatted table and optional assembly |
| |
+------------------------------------------------------------------+
4.2 Key Components
1. Prototype Parser (parser.c)
- Lexer to tokenize C declarations
- Simple recursive descent parser
- Handles type specifiers, qualifiers, pointers, arrays
2. Type Classifier (classifier.c)
- Maps C types to ABI classes (INTEGER, SSE, SSEUP, X87, MEMORY, etc.)
- Implements struct classification algorithm
- Handles alignment requirements
3. Register Allocator (allocator.c)
- Tracks available integer registers (RDI, RSI, RDX, RCX, R8, R9)
- Tracks available SSE registers (XMM0-XMM7)
- Calculates stack offsets for overflow arguments
4. Output Formatter (output.c)
- Pretty-prints tables
- Generates verification assembly
- Produces comparison output for different ABIs
4.3 Data Structures
/* Type classification per System V AMD64 ABI */
typedef enum {
CLASS_INTEGER, /* Passed in integer registers */
CLASS_SSE, /* Passed in a vector (XMM) register */
CLASS_SSEUP, /* Upper bytes of the preceding SSE vector register */
CLASS_X87, /* x87 floating point (long double) */
CLASS_X87UP, /* x87 second eightbyte */
CLASS_COMPLEX_X87,
CLASS_MEMORY, /* Passed on stack */
CLASS_NO_CLASS /* Padding, empty */
} TypeClass;
/* Basic type information */
typedef enum {
TYPE_VOID,
TYPE_CHAR,
TYPE_SHORT,
TYPE_INT,
TYPE_LONG,
TYPE_LONGLONG,
TYPE_FLOAT,
TYPE_DOUBLE,
TYPE_LONGDOUBLE,
TYPE_POINTER,
TYPE_STRUCT,
TYPE_UNION,
TYPE_ARRAY
} BaseType;
/* Parsed type representation */
typedef struct Type {
BaseType base;
int is_unsigned;
int is_const;
int pointer_depth; /* 0 = not pointer, 1 = *, 2 = **, etc. */
size_t size; /* Size in bytes */
size_t alignment; /* Alignment requirement */
struct Type *pointee; /* For pointers, what we point to */
/* For structs/arrays */
struct StructDef *struct_def;
size_t array_size;
} Type;
/* Parsed parameter */
typedef struct {
char name[64]; /* Parameter name (may be empty) */
Type type; /* Parameter type */
TypeClass classes[2]; /* ABI classification (for 16-byte types) */
} Parameter;
/* Parsed function prototype */
typedef struct {
Type return_type;
char name[64];
Parameter params[32];
int param_count;
int is_variadic;
} FunctionPrototype;
/* Register allocation result */
typedef struct {
int param_index; /* Which parameter */
int is_stack; /* True if passed on stack */
union {
struct {
const char *reg_name; /* "RDI", "XMM0", etc. */
const char *sub_reg; /* "EDI", etc. for 32-bit */
} reg;
struct {
int offset; /* Offset from RSP after CALL */
} stack;
} location;
TypeClass class;
} AllocationResult;
4.4 Algorithm Overview
The Classification Algorithm for Structs (Simplified)
The System V AMD64 ABI has a complex algorithm for classifying structs. Here is the essence:
+------------------------------------------------------------------+
| STRUCT CLASSIFICATION ALGORITHM (SIMPLIFIED) |
+------------------------------------------------------------------+
| |
| 1. If size > 16 bytes (or has unaligned fields): |
| -> CLASS_MEMORY (argument is copied onto the stack; |
| a return value uses a hidden pointer instead) |
| |
| 2. Split struct into 8-byte "eightbytes" |
| |
| 3. For each eightbyte, determine class: |
| - If all fields are float/double -> CLASS_SSE |
| - If any field is an integer or pointer -> CLASS_INTEGER |
| (INTEGER wins when merged with SSE) |
| - long double fields are CLASS_X87 (passed in memory) |
| - Apply merge rules for overlapping cases |
| |
| 4. Post-process: |
| - If any eightbyte is MEMORY -> whole struct is MEMORY |
| - If all INTEGER -> pass in integer registers |
| - If SSE -> pass in SSE registers |
| |
+------------------------------------------------------------------+
Example 1: struct { int x; int y; } (8 bytes)
-> One eightbyte, both ints -> CLASS_INTEGER
-> Pass in RDI
Example 2: struct { double x; double y; } (16 bytes)
-> Two eightbytes, both doubles -> CLASS_SSE, CLASS_SSE
-> Pass x in XMM0, y in XMM1 (SSEUP applies only to the upper half of a single 16-byte vector type such as __m128)
Example 3: struct { int x; double y; } (16 bytes)
-> First eightbyte: int (4) + padding (4) -> CLASS_INTEGER
-> Second eightbyte: double -> CLASS_SSE
-> Pass x in RDI, y in XMM0
Example 4: struct { long a; long b; long c; } (24 bytes)
-> Size > 16 bytes -> CLASS_MEMORY
-> As an argument: copied onto the stack. As a return value: written through a hidden pointer in RDI
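To make the eightbyte rule concrete, here is a deliberately reduced sketch of the classifier. It is an assumption-laden illustration rather than the full ABI algorithm: it handles only flat structs of scalar fields, ignores SSEUP, X87, unions, arrays and nested structs, and uses hypothetical names (`Field`, `EbClass`, `classify_struct_simple`). Per the ABI merge rules, an eightbyte holding both an integer and a float becomes INTEGER, and a struct of two doubles yields SSE for each eightbyte.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EB_NO_CLASS, EB_INTEGER, EB_SSE, EB_MEMORY } EbClass;

typedef struct {
    size_t offset;   /* byte offset within the struct */
    size_t size;     /* size of the scalar field in bytes (1, 2, 4 or 8) */
    int is_float;    /* nonzero for float/double */
} Field;

/* Merge two classes that share one eightbyte (subset of the ABI rules). */
static EbClass merge(EbClass a, EbClass b)
{
    if (a == b) return a;
    if (a == EB_NO_CLASS) return b;
    if (b == EB_NO_CLASS) return a;
    if (a == EB_MEMORY || b == EB_MEMORY) return EB_MEMORY;
    if (a == EB_INTEGER || b == EB_INTEGER) return EB_INTEGER;
    return EB_SSE;
}

/* out[0] and out[1] receive the classes of the first and second eightbyte. */
void classify_struct_simple(const Field *fields, size_t n,
                            size_t struct_size, EbClass out[2])
{
    out[0] = out[1] = EB_NO_CLASS;
    if (struct_size > 16) {                 /* too large for two registers */
        out[0] = out[1] = EB_MEMORY;
        return;
    }
    for (size_t i = 0; i < n; i++) {
        const Field *f = &fields[i];
        assert(f->size > 0 && f->size <= 8);
        assert(f->offset + f->size <= struct_size);
        if (f->offset % f->size != 0) {     /* unaligned field */
            out[0] = out[1] = EB_MEMORY;
            return;
        }
        size_t eb = f->offset / 8;          /* which eightbyte holds it */
        out[eb] = merge(out[eb], f->is_float ? EB_SSE : EB_INTEGER);
    }
}
```

Feeding it `{ int x; double y; }` as fields at offsets 0 and 8 gives INTEGER for the first eightbyte and SSE for the second, which the allocator then maps to one integer register and one XMM register.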
5. Implementation Guide
5.1 Development Environment Setup
Required Tools:
# Compiler and debugger
gcc --version # or clang --version
gdb --version # or lldb on macOS
# For verification
objdump --version
# Or Compiler Explorer (godbolt.org) for online verification
Recommended Compiler Flags:
CFLAGS = -Wall -Wextra -Wpedantic -std=c11 -g -O0
5.2 Project Structure
calling-convention-visualizer/
|-- include/
| |-- callconv.h # Main header
| |-- parser.h # Prototype parser
| |-- classifier.h # Type classifier
| |-- allocator.h # Register allocator
| |-- output.h # Output formatting
| |-- types.h # Type definitions
|-- src/
| |-- main.c # Entry point, CLI
| |-- parser.c # Prototype parsing
| |-- lexer.c # Tokenization
| |-- classifier.c # ABI classification
| |-- allocator.c # Register allocation
| |-- output.c # Table formatting
| |-- verify.c # Assembly generation
| |-- windows_abi.c # Windows x64 rules
|-- tests/
| |-- test_parser.c # Parser tests
| |-- test_allocator.c # Allocation tests
| |-- verify/ # Verification programs
| | |-- verify_int_args.c
| | |-- verify_float_args.c
| | |-- verify_mixed.c
| | |-- verify_structs.c
|-- Makefile
|-- README.md
5.3 The Core Question You’re Answering
“How does the calling convention determine where function arguments go, and how can we visualize this mapping for any C function prototype?”
5.4 Concepts You Must Understand First
Before implementing, ensure you understand:
1. INTEGER class arguments in System V AMD64:
- Arguments go in: RDI, RSI, RDX, RCX, R8, R9 (in that order)
- After 6 integer args, overflow to stack
- Pointers are INTEGER class (64-bit)
- int uses 32-bit subregisters (EDI, ESI, etc.)
2. SSE class arguments:
- Arguments go in: XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7
- After 8 SSE args, overflow to stack
- float uses low 32 bits of XMM register
- double uses low 64 bits of XMM register
3. Return value location:
- INTEGER class: RAX (and RDX for 128-bit)
- SSE class: XMM0 (and XMM1 for 128-bit)
- Large structs (>16 bytes): Hidden first parameter (pointer in RDI)
4. Struct classification:
- Structs <= 16 bytes are split into “eightbytes”
- Each eightbyte gets classified
- If any part is MEMORY, whole struct is MEMORY
- Structs > 16 bytes always use MEMORY class
5.5 Questions to Guide Your Design
Parsing:
- How will you tokenize type specifiers (unsigned, long, const)?
- How will you handle arbitrary pointer depth (char ***)?
- How will you parse unnamed parameters (void foo(int, int))?
Classification:
- How will you handle long vs long long (may differ by platform)?
- How will you implement the struct classification algorithm?
- How will you handle arrays (decay to pointers)?
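On the long vs long long question: one option is to make the data model an explicit parameter instead of relying on `sizeof` in the host compiler. The sketch below assumes LP64 for System V targets and LLP64 for Windows x64; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MODEL_LP64, MODEL_LLP64 } DataModel;   /* System V vs Windows x64 */

typedef enum {
    K_CHAR, K_SHORT, K_INT, K_LONG, K_LONGLONG,
    K_POINTER, K_FLOAT, K_DOUBLE
} Kind;

/* Size in bytes of a scalar type under the given data model. */
size_t scalar_size(Kind k, DataModel m)
{
    switch (k) {
    case K_CHAR:     return 1;
    case K_SHORT:    return 2;
    case K_INT:      return 4;
    case K_LONG:     return m == MODEL_LP64 ? 8 : 4;   /* the one that differs */
    case K_LONGLONG: return 8;
    case K_POINTER:  return 8;
    case K_FLOAT:    return 4;
    case K_DOUBLE:   return 8;
    }
    assert(!"unknown kind");
    return 0;
}
```

Keeping the model explicit lets the `--windows` comparison report `long` as a 32-bit ECX argument while the System V column reports 64-bit RDI, regardless of which machine the tool itself was built on.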
Allocation:
- How will you track separate counts for INTEGER and SSE registers?
- How will you calculate stack offsets for overflow arguments?
- How will you handle the hidden return pointer for large structs?
Output:
- How will you generate correct subregister names (RDI vs EDI)?
- How will you show struct packing in registers?
- How will you verify correctness against actual compiler output?
5.6 Thinking Exercise
Before coding, manually classify these functions:
// Exercise 1: Where do the arguments go?
void func1(int a, long b, int *c, double d, float e, long long f, int g, double h);
// Exercise 2: What about the return value?
double func2(void);
struct small { int x; int y; };
struct small func3(void);
struct big { long a; long b; long c; };
struct big func4(void);
// Exercise 3: Hidden parameters?
struct big func5(int x); // Which register holds x?
Answers:
Exercise 1:
a: EDI (int, 32-bit in RDI)
b: RSI (long, 64-bit)
c: RDX (pointer, 64-bit)
d: XMM0 (double)
e: XMM1 (float, low 32 bits)
f: RCX (long long, 64-bit)
g: R8D (int, 32-bit in R8)
h: XMM2 (double)
Integer regs used: 5 (RDI, RSI, RDX, RCX, R8)
SSE regs used: 3 (XMM0, XMM1, XMM2)
No stack overflow.
Exercise 2:
func2: returns in XMM0
func3: struct is 8 bytes, one INTEGER eightbyte -> returns in RAX
func4: struct is 24 bytes > 16 -> MEMORY class -> hidden pointer
Exercise 3:
func5 has hidden return pointer in RDI
Therefore x is in ESI (not EDI!)
5.7 Hints in Layers
Hint 1: Starting the Lexer
Start with a simple tokenizer that recognizes:
- Type keywords: `void`, `char`, `short`, `int`, `long`, `float`, `double`
- Qualifiers: `const`, `unsigned`, `signed`, `volatile`
- Symbols: `*`, `(`, `)`, `,`, `...`
- Identifiers: parameter names, function name

```c
typedef enum {
    TOK_VOID, TOK_CHAR, TOK_SHORT, TOK_INT, TOK_LONG,
    TOK_FLOAT, TOK_DOUBLE,
    TOK_UNSIGNED, TOK_SIGNED, TOK_CONST, TOK_VOLATILE,
    TOK_STRUCT, TOK_UNION,
    TOK_STAR, TOK_LPAREN, TOK_RPAREN, TOK_COMMA, TOK_ELLIPSIS,
    TOK_IDENT, TOK_EOF
} TokenType;
```
Hint 2: Parsing Types
Types in C declarations have:
1. Declaration specifiers (const, unsigned, int, etc.)
2. Optional pointer declarator (*, **, etc.)

```c
Type parse_type(Lexer *lex) {
    Type t = {0};   /* base starts as TYPE_VOID (0), i.e. "not set yet" */

    // Parse specifiers
    while (is_type_specifier(peek(lex))) {
        Token tok = next(lex);
        switch (tok.type) {
        case TOK_CONST:    t.is_const = 1; break;
        case TOK_UNSIGNED: t.is_unsigned = 1; break;
        case TOK_INT:
            // Do not overwrite "long int" or "short int" with plain int
            if (t.base == TYPE_VOID) t.base = TYPE_INT;
            break;
        case TOK_LONG:
            if (t.base == TYPE_LONG) t.base = TYPE_LONGLONG;
            else t.base = TYPE_LONG;
            break;
        // ... etc
        }
    }

    // Parse pointer depth
    while (peek(lex).type == TOK_STAR) {
        next(lex);
        t.pointer_depth++;
    }
    return t;
}
```
Hint 3: Type Classification
The basic classification is simpler than full ABI compliance:

```c
TypeClass classify_type(Type *t) {
    // Pointers are always INTEGER
    if (t->pointer_depth > 0) return CLASS_INTEGER;

    switch (t->base) {
    case TYPE_VOID:
        return CLASS_NO_CLASS;
    case TYPE_CHAR:
    case TYPE_SHORT:
    case TYPE_INT:
    case TYPE_LONG:
    case TYPE_LONGLONG:
        return CLASS_INTEGER;
    case TYPE_FLOAT:
    case TYPE_DOUBLE:
        return CLASS_SSE;
    case TYPE_LONGDOUBLE:
        return CLASS_X87;
    case TYPE_STRUCT:
        return classify_struct(t->struct_def);
    default:
        return CLASS_MEMORY;
    }
}
```
Hint 4: Register Allocation
Track register usage separately for INTEGER and SSE:

```c
typedef struct {
    int next_int_reg;   /* 0-5 for RDI,RSI,RDX,RCX,R8,R9 */
    int next_sse_reg;   /* 0-7 for XMM0-XMM7 */
    int stack_offset;   /* Bytes of stack arguments assigned so far */
} AllocationState;

static const char *int_regs_64[] = {"RDI","RSI","RDX","RCX","R8","R9"};
static const char *int_regs_32[] = {"EDI","ESI","EDX","ECX","R8D","R9D"};

AllocationResult allocate_param(AllocationState *state, Parameter *param) {
    AllocationResult result = {0};
    TypeClass class = param->classes[0];
    result.class = class;

    if (class == CLASS_INTEGER) {
        if (state->next_int_reg < 6) {
            result.is_stack = 0;
            result.location.reg.reg_name = int_regs_64[state->next_int_reg];
            // Use 32-bit name for int (but not for pointers to int)
            if (param->type.base == TYPE_INT && param->type.pointer_depth == 0) {
                result.location.reg.sub_reg = int_regs_32[state->next_int_reg];
            }
            state->next_int_reg++;
        } else {
            result.is_stack = 1;
            // +8 skips the return address: first stack argument is at [RSP+8]
            result.location.stack.offset = 8 + state->stack_offset;
            state->stack_offset += 8;   // Always 8-byte aligned
        }
    } else if (class == CLASS_SSE) {
        // Similar logic for XMM registers
    }
    return result;
}
```
Hint 5: Handling Hidden Return Pointer
Large struct returns need a hidden first parameter:

```c
/* Defined before its first use so the call below has a visible declaration */
int needs_hidden_pointer(Type *ret_type) {
    if (ret_type->base != TYPE_STRUCT) return 0;
    if (ret_type->pointer_depth > 0) return 0;   // Pointer to struct fits in RAX
    return ret_type->size > 16;                  // Structs > 16 bytes
}

void process_function(FunctionPrototype *proto) {
    AllocationState state = {0};

    // Check if return type needs hidden pointer
    if (needs_hidden_pointer(&proto->return_type)) {
        // Hidden parameter takes RDI
        printf("Hidden return pointer in RDI\n");
        state.next_int_reg = 1;   // Start from RSI for user params
    }

    // Now allocate user parameters
    for (int i = 0; i < proto->param_count; i++) {
        AllocationResult res = allocate_param(&state, &proto->params[i]);
        // ...
    }
}
```
5.8 The Interview Questions They’ll Ask
- “Walk me through what happens at the ABI level when I call printf("Hello %d", 42)”
- "Hello %d" pointer goes in RDI
- 42 goes in ESI (second integer arg)
- AL is set to 0 (number of XMM registers used for varargs)
- CALL instruction pushes return address and jumps
- “Why does Windows use different registers than Linux for the same CPU?”
- Historical: Different teams designed ABIs independently
- Windows: RCX, RDX, R8, R9 (4 regs), plus shadow space
- System V: RDI, RSI, RDX, RCX, R8, R9 (6 regs), no shadow space
- Both are valid; they just made different trade-offs
- “How would you debug an FFI call that’s returning garbage?”
- Check if types match exactly (especially size and signedness)
- Verify ABI convention matches (System V vs Windows)
- Check if struct is being passed by value or pointer
- Look for hidden return pointer issues
- “Why can small structs be passed in registers but large ones can’t?”
- Registers are limited and fixed-size
- Copying large data to registers would be slow and wasteful
- Hidden pointer approach is more efficient for large data
- 16-byte limit chosen because two 8-byte registers can hold it
- “What is the ‘red zone’ and when can you use it?”
- 128 bytes below RSP that leaf functions can use without moving RSP
- Only safe for leaf functions (no function calls)
- Signal handlers must not corrupt it
- Windows x64 has no red zone (shadow space instead)
- “How does variadic argument passing work?”
- Named args follow normal rules
- Variadic args continue the same sequence
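- AL must hold an upper bound on the number of XMM registers used (so the callee knows whether to save them for va_arg)
- Once the registers are exhausted, va_arg reads the remaining arguments from the stack (the overflow area tracked in va_list)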
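The variadic answer is easier to discuss with a concrete callee in hand. The following small function uses only the standard `<stdarg.h>` interface; the name `sum_mixed` and its one-letter format codes are made up for this illustration. On System V AMD64, each `va_arg` call below pulls from the saved integer registers, the saved XMM registers, or the stack overflow area, depending on the requested type and how many registers are already consumed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Sums the variadic arguments described by `kinds`:
 * 'i' reads an int, 'd' reads a double. Other characters are ignored
 * and consume no argument. */
double sum_mixed(const char *kinds, ...)
{
    va_list ap;
    double total = 0.0;

    assert(kinds != NULL);
    va_start(ap, kinds);
    for (const char *k = kinds; *k != '\0'; k++) {
        if (*k == 'i')
            total += va_arg(ap, int);      /* INTEGER class argument */
        else if (*k == 'd')
            total += va_arg(ap, double);   /* SSE class argument */
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_mixed("iiiiiiii", 1, 2, 3, 4, 5, 6, 7, 8)` is a handy test case: the format pointer takes RDI, five integers fit in the remaining registers, and the last three are fetched from the stack without the callee's source code changing at all.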
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| x86-64 calling convention | CS:APP 3rd Ed | Chapter 3.7 (Procedures) |
| System V ABI details | System V AMD64 ABI Spec | Section 3.2 (Function Calling) |
| Assembly programming | “x64 Assembly Language” by Ray Seyfarth | Chapter 5 |
| Calling conventions | Expert C Programming | Chapter 6 (Runtime Data Structures) |
| Low-level C details | Write Great Code Vol 2 | Chapter 5 |
5.10 Implementation Phases
Phase 1: Basic Parser (Days 1-3)
- Implement lexer for C type syntax
- Parse simple types (int, char, long, float, double)
- Parse pointer types
- Parse function prototypes with named parameters
- Test with:
int add(int a, int b)
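As a starting point for Phase 1, a string-based tokenizer is enough to get the first test passing before typed tokens are introduced. The sketch below is one possible shape, not a prescribed design: `tokenize` and `MAX_TOKEN_LEN` are hypothetical names, and any character it does not recognize is simply emitted as a one-character token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN_LEN 32

/* Splits a prototype into tokens: identifiers/keywords, "...", and
 * single-character symbols such as * ( ) and the comma.
 * Returns the number of tokens stored (at most max_tokens). */
size_t tokenize(const char *src, char tokens[][MAX_TOKEN_LEN], size_t max_tokens)
{
    size_t count = 0;

    assert(src != NULL);
    while (*src != '\0' && count < max_tokens) {
        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        size_t len = 1;                       /* default: one-char symbol */
        if (isalpha((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else if (strncmp(src, "...", 3) == 0) {
            len = 3;
        }
        /* Copy at most MAX_TOKEN_LEN - 1 bytes, but skip the whole token. */
        size_t copy = len < MAX_TOKEN_LEN ? len : MAX_TOKEN_LEN - 1;
        memcpy(tokens[count], src, copy);
        tokens[count][copy] = '\0';
        count++;
        src += len;
    }
    return count;
}
```

Mapping each string to a token kind (keyword, identifier, symbol) can then be a separate pass, which keeps the scanning logic above independent of the type grammar.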
Phase 2: Type Classification (Days 4-5)
- Classify basic types (INTEGER, SSE, X87)
- Determine type sizes and alignments
- Handle unsigned/signed variants
- Test with:
void mixed(int x, double y, long z)
Phase 3: Register Allocation (Days 6-8)
- Allocate INTEGER class registers
- Allocate SSE class registers
- Handle stack overflow arguments
- Test with:
int many(int a, int b, int c, int d, int e, int f, int g)
Phase 4: Output Formatting (Days 9-10)
- Generate ASCII tables
- Show correct subregister names
- Add stack frame visualization
- Test output against expected format
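Choosing the subregister name is a lookup on the register's position in the argument order and the operand size. The sketch below is one way to do it; the table and the function name subregister_name are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Rows follow the System V argument order; columns are 8, 4, 2, 1 byte widths. */
static const char *const INT_ARG_REG_NAMES[6][4] = {
    {"RDI", "EDI", "DI",  "DIL"},
    {"RSI", "ESI", "SI",  "SIL"},
    {"RDX", "EDX", "DX",  "DL"},
    {"RCX", "ECX", "CX",  "CL"},
    {"R8",  "R8D", "R8W", "R8B"},
    {"R9",  "R9D", "R9W", "R9B"},
};

/*
 * Name of the part of an integer argument register that holds a value
 * of the given size. Returns NULL for sizes that are not 1, 2, 4, or 8.
 */
const char *subregister_name(int reg_index, size_t size_bytes) {
    assert(reg_index >= 0 && reg_index < 6);
    switch (size_bytes) {
    case 8: return INT_ARG_REG_NAMES[reg_index][0];
    case 4: return INT_ARG_REG_NAMES[reg_index][1];
    case 2: return INT_ARG_REG_NAMES[reg_index][2];
    case 1: return INT_ARG_REG_NAMES[reg_index][3];
    default: return NULL;
    }
}
```

With this table, an int in the first slot prints as EDI and a pointer in the same slot prints as RDI.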
Phase 5: Verification (Days 11-12)
- Generate verification assembly
- Compare against gcc -S output
- Test on Compiler Explorer (godbolt.org)
- Fix any discrepancies
Phase 6: Advanced Features (Days 13-14)
- Add struct support (basic)
- Add Windows x64 comparison mode
- Handle variadic functions
- Add hidden return pointer support
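For the Windows x64 comparison mode, the main difference worth modeling is that registers are assigned by argument position rather than by per-class counters. The sketch below illustrates this for scalar arguments of a non-variadic function; the names (WinArgClass, WinLocation, win64_locate) are hypothetical, and structs and variadic calls are not handled.

```c
#include <assert.h>

typedef enum { WIN_INT, WIN_FLOAT } WinArgClass;
typedef enum { WLOC_INT_REG, WLOC_XMM_REG, WLOC_STACK } WinLocKind;

typedef struct {
    WinLocKind kind;
    int index;   /* slot number for registers, byte offset from RSP at entry for stack */
} WinLocation;

/* Integer registers by slot; float slots use XMM0-XMM3 with the same numbering. */
static const char *const WIN_INT_REGS[4] = {"RCX", "RDX", "R8", "R9"};

/*
 * Locate the argument at a zero-based position. The first four positions
 * each own one slot: an integer argument in slot i uses WIN_INT_REGS[i],
 * a float or double uses XMMi. Later arguments go on the stack above the
 * return address (8 bytes) and the 32-byte shadow space.
 */
WinLocation win64_locate(int position, WinArgClass cls) {
    assert(position >= 0);
    WinLocation loc;
    if (position < 4) {
        loc.kind = (cls == WIN_FLOAT) ? WLOC_XMM_REG : WLOC_INT_REG;
        loc.index = position;
    } else {
        loc.kind = WLOC_STACK;
        loc.index = 8 + 32 + 8 * (position - 4);
    }
    return loc;
}

/* Integer register name for a slot, for display purposes. */
const char *win64_int_reg_name(int slot) {
    assert(slot >= 0 && slot < 4);
    return WIN_INT_REGS[slot];
}
```

For void f(int a, double b, int c), this gives a in RCX, b in XMM1, and c in R8, whereas the System V rules give EDI, XMM0, and ESI. Showing the two results side by side makes the positional rule easy to see.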
5.11 Key Implementation Decisions
Decision 1: How to represent types?
- Option A: String-based (easy parsing, hard manipulation)
- Option B: AST-based (harder parsing, flexible manipulation) [Recommended]
- Option C: Direct enum mapping (simple, limited)
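To make Option B concrete, here is a minimal sketch of what an AST-style type representation might look like. The TypeNode layout and the helper names are assumptions for illustration, and the sizes follow the System V AMD64 (LP64) data model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    TY_VOID, TY_CHAR, TY_SHORT, TY_INT, TY_LONG,
    TY_FLOAT, TY_DOUBLE, TY_POINTER
} TypeKind;

/* One node per type; a pointer type links to the type it points at. */
typedef struct TypeNode {
    TypeKind kind;
    bool is_unsigned;                 /* meaningful for integer kinds only */
    const struct TypeNode *pointee;   /* non-NULL only for TY_POINTER */
} TypeNode;

/* Size in bytes under the LP64 data model. */
size_t type_size(const TypeNode *t) {
    assert(t != NULL);
    switch (t->kind) {
    case TY_VOID:    return 0;
    case TY_CHAR:    return 1;
    case TY_SHORT:   return 2;
    case TY_INT:
    case TY_FLOAT:   return 4;
    case TY_LONG:
    case TY_DOUBLE:
    case TY_POINTER: return 8;
    }
    return 0;
}

/* Number of pointer levels wrapped around the base type. */
int pointer_depth(const TypeNode *t) {
    int depth = 0;
    while (t != NULL && t->kind == TY_POINTER) {
        depth++;
        t = t->pointee;
    }
    return depth;
}
```

The type char ** becomes two TY_POINTER nodes chained to a TY_CHAR node. The flexibility of this option shows up later, since struct and array kinds can be added as new node kinds without changing how existing types are stored.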
Decision 2: How to verify correctness?
- Option A: Manual verification against docs (error-prone)
- Option B: Generate test programs and compare gcc output [Recommended]
- Option C: Use existing tools like ABI compliance checkers
Decision 3: How much struct support?
- Option A: No struct support (simpler)
- Option B: Simple structs (<= 16 bytes) [Recommended for MVP]
- Option C: Full ABI compliance (complex)
6. Testing Strategy
Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Parser Tests | Verify prototype parsing | Simple types, pointers, qualifiers |
| Classification Tests | Verify type classification | int->INTEGER, double->SSE |
| Allocation Tests | Verify register assignment | First 6 INTEGER and first 8 SSE args in registers |
| Overflow Tests | Verify stack handling | 7+ arguments |
| Integration Tests | End-to-end verification | Compare with gcc -S output |
Verification Programs
Create test C programs and compare your output with actual compiler output:
/* tests/verify/verify_int_args.c */
/* Compile: gcc -S -O0 verify_int_args.c -o verify_int_args.s */
int test_int_args(int a, int b, int c, int d, int e, int f, int g) {
/*
* Your tool should predict:
* a: EDI, b: ESI, c: EDX, d: ECX, e: R8D, f: R9D,
* g: [RSP+8] at function entry (16(%rbp) once the -O0 prologue has run)
*/
return a + b + c + d + e + f + g;
}
/* tests/verify/verify_float_args.c */
double test_float_args(double a, float b, double c, float d) {
/*
* Your tool should predict:
* a: XMM0, b: XMM1 (low 32), c: XMM2, d: XMM3 (low 32)
*/
return a + b + c + d;
}
/* tests/verify/verify_mixed.c */
void test_mixed(int a, double b, int c, float d, long e, double f) {
/*
* Integer regs: a->EDI, c->ESI, e->RDX
* SSE regs: b->XMM0, d->XMM1, f->XMM2
*/
}
Verification Script
#!/bin/bash
# tests/verify_all.sh
# Compare the tool's predictions against GCC's unoptimized assembly.

# check NAME "PROTOTYPE": compile tests/verify/verify_NAME.c, then print the
# tool's prediction followed by the start of GCC's code for test_NAME.
check() {
    local name="$1" proto="$2"
    gcc -S -O0 "tests/verify/verify_${name}.c" -o "/tmp/verify_${name}.s"
    echo ""
    echo "=== verify_${name} ==="
    echo "Tool predicts:"
    ./callconv "$proto"
    echo ""
    echo "GCC generated:"
    # 20 lines of context covers the prologue plus all register spills at -O0
    grep -A 20 "test_${name}:" "/tmp/verify_${name}.s"
}

echo "Compiling verification programs..."
check int_args "int test_int_args(int a, int b, int c, int d, int e, int f, int g)"
check float_args "double test_float_args(double a, float b, double c, float d)"
check mixed "void test_mixed(int a, double b, int c, float d, long e, double f)"
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Confusing 32-bit and 64-bit reg names | Shows RDI for int instead of EDI | Check type size; int uses 32-bit subregister |
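| Wrong register order | Arguments in wrong registers | System V: RDI, RSI, RDX, RCX, R8, R9 (not the Windows x64 order RCX, RDX, R8, R9) |
| Forgetting hidden return pointer | Large struct return has wrong args | Check whether the return type is larger than 16 bytes; if so, the hidden pointer takes RDI and every INTEGER argument shifts by one register |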
| Mixing INTEGER and SSE counts | Floats using integer regs | Track separate counts for int vs float arguments |
| Stack offset wrong | Stack args at wrong offset | After CALL: first stack arg at [RSP+8], not [RSP] |
| Ignoring variadic rules | Variadic args wrong | AL must specify XMM register count |
| Assuming all platforms same | Works on Linux, fails on Windows | Different ABIs for different platforms |
Debugging Your Tool
/* Add debug output mode */
#ifdef DEBUG
#define DBG(fmt, ...) fprintf(stderr, "[DBG] " fmt "\n", ##__VA_ARGS__)
#else
#define DBG(fmt, ...)
#endif
void allocate_params(FunctionPrototype *proto) {
AllocationState state = {0};
DBG("Starting allocation for %s", proto->name);
DBG("Return type class: %d", classify_type(&proto->return_type));
if (needs_hidden_pointer(&proto->return_type)) {
DBG("Hidden return pointer uses RDI");
state.next_int_reg = 1;
}
for (int i = 0; i < proto->param_count; i++) {
TypeClass class = classify_type(&proto->params[i].type);
DBG("Param %d (%s): class=%d", i, proto->params[i].name, class);
AllocationResult res = allocate_param(&state, &proto->params[i]);
DBG(" -> %s", res.is_stack ? "STACK" : res.location.reg.reg_name);
}
}
8. Extensions & Challenges
Beginner Extensions
- Add color output (green for registers, yellow for stack)
- Support typedef aliases (e.g., size_t -> unsigned long)
- Show stack frame diagram for overflow arguments
- Add --help with examples
Intermediate Extensions
- Support basic struct definitions inline
- Add Windows x64 ABI comparison mode
- Support ARM64 ABI (x0-x7 for integers, v0-v7 for floats)
- Parse and display function attributes (__attribute__((cdecl)))
Advanced Extensions
- Full struct classification algorithm per ABI spec
- Support union types
- Handle arrays as parameters (decay to pointers)
- Generate working assembly stubs that can be linked
- Interactive mode: modify prototype and see changes
- Web interface using WebAssembly
Research Extensions
- Compare optimization levels (-O0 vs -O2 calling convention differences)
- Analyze real-world libraries (glibc, openssl) for ABI patterns
- Study tail call optimization effects on calling conventions
9. Real-World Connections
FFI in Modern Languages
Python ctypes/cffi:
from ctypes import CDLL, c_int, c_double
lib = CDLL("./mylib.so")
lib.compute.argtypes = [c_int, c_double, c_int]
lib.compute.restype = c_double
# If argtypes is wrong, you get garbage or crashes
# Your tool helps you verify the expected ABI
result = lib.compute(1, 3.14, 2)
Rust FFI:
extern "C" {
// extern "C" selects the platform's C ABI (System V AMD64 on x86-64 Unix)
fn compute(x: i32, y: f64, z: i32) -> f64;
}
// Incorrect declaration can cause UB
// Your tool helps verify argument passing
Go cgo:
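package main

// #include <stdlib.h>
// double compute(int x, double y, int z);
import "C"

import "fmt"

func main() {
	// cgo calls compute through the platform's C ABI, so the
	// declaration above must match the C definition exactly
	result := C.compute(1, 3.14, 2)
	fmt.Println(float64(result))
}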
Debugging with GDB
When debugging, knowing the ABI helps:
(gdb) break compute
(gdb) run
Breakpoint 1, compute (x=1, y=3.14, z=2)
# Verify arguments are where we expect:
(gdb) info registers rdi rsi xmm0
rdi 0x1 # x = 1 (int, in EDI portion of RDI)
rsi 0x2 # z = 2 (int, in ESI portion of RSI)
xmm0 {v4_float = {...}, v2_double = {3.14, ...}} # y = 3.14
# Your tool predicted: x->EDI, y->XMM0, z->ESI
# GDB confirms!
Security Research: ROP Gadgets
Understanding calling conventions helps in exploit development:
# To "call" a function via ROP, you need to:
# 1. Place each argument in the register the ABI assigns to it
# 2. Find gadgets that pop values from the stack into RDI, RSI, etc.
# Example ROP chain to call execve("/bin/sh", NULL, NULL):
# pop rdi; ret; -> address of "/bin/sh"
# pop rsi; ret; -> NULL (argv)
# pop rdx; ret; -> NULL (envp)
# execve address
# Your tool helps understand what values go where!
10. Resources
Official Specifications
- System V AMD64 ABI - The authoritative source
- Microsoft x64 Calling Convention - Windows specifics
- ARM64 ABI (AAPCS64) - ARM calling convention
Tools for Verification
- Compiler Explorer (Godbolt) - See actual assembly output
- cdecl.org - Parse C declarations
- ABI Compliance Checker - Check library ABI
Related Reading
- What Every Programmer Should Know About Memory - Ulrich Drepper
- Calling Conventions Demystified - CodeProject article
- System V ABI for Dummies - OSDev Wiki
11. Self-Assessment Checklist
Understanding Verification
- I can list the 6 integer argument registers in order (RDI, RSI, RDX, RCX, R8, R9)
- I can list the 8 SSE argument registers (XMM0-XMM7)
- I know where the 7th integer argument goes (stack)
- I understand the difference between EDI and RDI (32-bit vs 64-bit)
- I can explain why large structs use a hidden pointer
- I know what the “red zone” is and when it’s safe to use
- I can explain caller-saved vs callee-saved registers
- I understand how variadic functions work at the ABI level
Implementation Verification
- Parser correctly handles basic types (int, long, double, etc.)
- Parser correctly handles pointers (int *, char **)
- Classification matches expected ABI classes
- Register allocation matches gcc -S output
- Stack overflow arguments have correct offsets
- Output formatting is clear and readable
- Verification assembly works when compiled
Quality Verification
- Tool produces correct results for all test cases
- Edge cases handled (void functions, variadic, etc.)
- Output is educational, not just informative
- Code is well-structured and documented
12. Submission / Completion Criteria
Minimum Viable Completion
- Parses simple function prototypes (basic types, pointers)
- Shows register allocation for INTEGER class arguments
- Shows register allocation for SSE class arguments
- Shows stack overflow for 7+ arguments
- Output format is clear and correct
Full Completion
- Handles all basic C types correctly
- Generates verification assembly
- Matches gcc -S output for test cases
- Shows subregister names (EDI vs RDI)
- Documents variadic function behavior
- Handles hidden return pointer for large returns
Excellence
- Implements basic struct classification
- Compares System V vs Windows x64
- Provides ARM64 ABI option
- Interactive mode or web interface
- Comprehensive test suite with automated verification
- Used to debug a real FFI issue
The Core Question You’re Answering
“How does the System V AMD64 ABI determine where function arguments are passed, and how can we build a tool that visualizes this mapping for any C function prototype?”
By completing this project, you will have internalized the fundamental truth that function calls are not magic - they follow precise, documented rules that govern exactly where data travels between caller and callee. This knowledge transforms how you debug FFI issues, write assembly code, and understand what happens beneath your high-level code.
This guide was expanded from EXPERT_C_PROGRAMMING_DEEP_DIVE.md. For the complete learning path, see the project index.