Project 17: Calling Convention Visualizer

Build a tool that shows how function arguments are passed (registers vs stack), revealing the ABI layer between C and machine code.


Quick Reference

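+-----------------+-----------------------------------------------+
| Attribute       | Value                                         |
+-----------------+-----------------------------------------------+
| Language        | C                                             |
| Difficulty      | Level 4 (Advanced)                            |
| Time            | 2 Weeks                                       |
| Book Reference  | CS:APP Ch. 3, Expert C Programming Ch. 6      |
| Key Concepts    | Calling conventions, ABI, registers, stack    |
| Prerequisites   | Assembly basics, stack frames, function calls |
| Portfolio Value | High - Demonstrates systems understanding     |
+-----------------+-----------------------------------------------+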

1. Learning Objectives

By completing this project, you will:

  1. Master the System V AMD64 ABI - Understand how the most common 64-bit Unix calling convention passes arguments and returns values

  2. Understand register allocation for parameters - Know exactly which registers (RDI, RSI, RDX, RCX, R8, R9, XMM0-XMM7) are used for which argument types

  3. Know when arguments spill to the stack - Understand why the 7th integer argument goes on the stack, and how large structs are handled

  4. Compare calling conventions - Understand differences between System V AMD64, Windows x64, cdecl, and ARM64 ABIs

  5. Parse C function prototypes - Build a parser that extracts return type, function name, and parameter types

  6. Generate verification assembly - Create assembly code that proves your analysis is correct

  7. Understand struct passing rules - Learn the complex classification algorithm for passing structs (INTEGER, SSE, MEMORY classes)

  8. Debug FFI issues - Apply this knowledge to debug foreign function interface problems in languages like Python, Rust, and Go


2. Theoretical Foundation

2.1 Core Concepts

What is a Calling Convention?

A calling convention is a contract between the caller and callee that specifies:

+------------------------------------------------------------------+
|                    CALLING CONVENTION CONTRACT                     |
+------------------------------------------------------------------+
|                                                                    |
|  1. ARGUMENT PASSING                                               |
|     - Which arguments go in which registers?                       |
|     - Which arguments go on the stack?                             |
|     - In what order?                                               |
|                                                                    |
|  2. RETURN VALUES                                                  |
|     - Which register holds the return value?                       |
|     - How are large return values handled?                         |
|                                                                    |
|  3. REGISTER PRESERVATION                                          |
|     - Which registers must the callee save? (callee-saved)         |
|     - Which registers can the callee trash? (caller-saved)         |
|                                                                    |
|  4. STACK MANAGEMENT                                               |
|     - Who cleans up the stack after the call?                      |
|     - What alignment is required?                                  |
|                                                                    |
+------------------------------------------------------------------+

System V AMD64 ABI - The Linux/macOS Standard

The System V AMD64 ABI is used on Linux, macOS, FreeBSD, and other Unix-like systems:

+------------------------------------------------------------------+
|               SYSTEM V AMD64 ABI - ARGUMENT PASSING                |
+------------------------------------------------------------------+
|                                                                    |
|  INTEGER/POINTER ARGUMENTS (in order):                             |
|  +--------+--------+--------+--------+--------+--------+           |
|  |  1st   |  2nd   |  3rd   |  4th   |  5th   |  6th   |           |
|  |  RDI   |  RSI   |  RDX   |  RCX   |   R8   |   R9   |           |
|  +--------+--------+--------+--------+--------+--------+           |
|                                                                    |
|  FLOATING-POINT ARGUMENTS (in order):                              |
|  +--------+--------+--------+--------+--------+--------+--------+  |
|  |  1st   |  2nd   |  3rd   |  4th   |  5th   |  6th   |  7th   |  |
|  | XMM0   | XMM1   | XMM2   | XMM3   | XMM4   | XMM5   | XMM6   |  |
|  +--------+--------+--------+--------+--------+--------+--------+  |
|  |  8th   |                                                        |
|  | XMM7   |                                                        |
|  +--------+                                                        |
|                                                                    |
|  OVERFLOW ARGUMENTS: Pushed on stack right-to-left                 |
|                                                                    |
|  RETURN VALUE:                                                     |
|  - Integer/pointer: RAX (and RDX for 128-bit)                      |
|  - Floating-point: XMM0 (and XMM1 for a second SSE eightbyte)      |
|  - Large structs: Caller allocates, passes hidden pointer in RDI   |
|                                                                    |
+------------------------------------------------------------------+

Register Sizes and Subregisters

+------------------------------------------------------------------+
|                    x86-64 REGISTER NAMING                          |
+------------------------------------------------------------------+
|                                                                    |
|  64-bit   32-bit   16-bit   8-bit                                  |
|  ------   ------   ------   -----                                  |
|  RAX      EAX      AX       AL                                     |
|  RBX      EBX      BX       BL                                     |
|  RCX      ECX      CX       CL                                     |
|  RDX      EDX      DX       DL                                     |
|  RSI      ESI      SI       SIL                                    |
|  RDI      EDI      DI       DIL                                    |
|  RBP      EBP      BP       BPL                                    |
|  RSP      ESP      SP       SPL                                    |
|  R8       R8D      R8W      R8B                                    |
|  R9       R9D      R9W      R9B                                    |
|  R10      R10D     R10W     R10B                                   |
|  R11      R11D     R11W     R11B                                   |
|  R12      R12D     R12W     R12B                                   |
|  R13      R13D     R13W     R13B                                   |
|  R14      R14D     R14W     R14B                                   |
|  R15      R15D     R15W     R15B                                   |
|                                                                    |
|  IMPORTANT: For 32-bit operations (int), use EDI, ESI, etc.        |
|  Writing to EAX automatically zero-extends to RAX.                 |
|                                                                    |
+------------------------------------------------------------------+
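The naming table above maps directly onto a lookup that a visualizer could use to print EDI for an int and RDI for a long. The following is a minimal sketch, not a required design; the function name int_arg_reg and its slot/size parameters are hypothetical and introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: name of the (sub)register that carries a value of
 * `size` bytes (1, 2, 4, or 8) in System V integer argument slot 0..5.
 * Returns NULL for an unknown slot or size. */
static const char *int_arg_reg(int slot, size_t size)
{
    /* Rows: argument slot. Columns: 64-, 32-, 16-, 8-bit names. */
    static const char *const names[6][4] = {
        {"RDI", "EDI", "DI",  "DIL"},
        {"RSI", "ESI", "SI",  "SIL"},
        {"RDX", "EDX", "DX",  "DL"},
        {"RCX", "ECX", "CX",  "CL"},
        {"R8",  "R8D", "R8W", "R8B"},
        {"R9",  "R9D", "R9W", "R9B"},
    };

    if (slot < 0 || slot >= 6)
        return NULL;
    switch (size) {
    case 8: return names[slot][0];
    case 4: return names[slot][1];
    case 2: return names[slot][2];
    case 1: return names[slot][3];
    default: return NULL;
    }
}
```

For example, int_arg_reg(0, 4) yields "EDI" and int_arg_reg(4, 4) yields "R8D", matching the sample output in Section 3.1.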

Caller-Saved vs Callee-Saved Registers

+------------------------------------------------------------------+
|              REGISTER PRESERVATION IN SYSTEM V AMD64               |
+------------------------------------------------------------------+
|                                                                    |
|  CALLER-SAVED (volatile - callee can modify freely):               |
|  +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+     |
|  | RAX | RCX | RDX | RSI | RDI | R8  | R9  | R10 | R11 |     |     |
|  +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+     |
|  | XMM0-XMM15 | (all SSE registers are caller-saved)        |     |
|  +-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+     |
|                                                                    |
|  If caller needs these values after the call, it must save them.   |
|                                                                    |
|  CALLEE-SAVED (non-volatile - callee must preserve):               |
|  +-----+-----+-----+-----+-----+-----+-----+                       |
|  | RBX | RBP | R12 | R13 | R14 | R15 | RSP |                       |
|  +-----+-----+-----+-----+-----+-----+-----+                       |
|                                                                    |
|  Callee must save these on entry and restore before returning.     |
|                                                                    |
+------------------------------------------------------------------+
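If the tool needs to annotate registers as preserved or volatile, the callee-saved list above is small enough to check with a linear scan. This is a rough sketch under that assumption; is_callee_saved is a hypothetical name, and it only recognizes the upper-case 64-bit names used in this document.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if `reg` (64-bit name, upper case) is
 * callee-saved under System V AMD64, 0 otherwise. */
static int is_callee_saved(const char *reg)
{
    static const char *const saved[] = {
        "RBX", "RBP", "R12", "R13", "R14", "R15", "RSP"
    };

    for (size_t i = 0; i < sizeof saved / sizeof saved[0]; i++)
        if (strcmp(reg, saved[i]) == 0)
            return 1;
    return 0;   /* everything else is treated as caller-saved */
}
```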

Stack Frame Layout

+------------------------------------------------------------------+
|            STACK FRAME LAYOUT (System V AMD64)                     |
+------------------------------------------------------------------+
|                                                                    |
|  HIGH ADDRESSES                                                    |
|  +------------------------------------------+                      |
|  |         Argument 8 (if any)              | [RBP + 24]           |
|  +------------------------------------------+                      |
|  |         Argument 7 (if any)              | [RBP + 16]           |
|  +------------------------------------------+                      |
|  |         Return address                   | [RBP + 8]            |
|  +------------------------------------------+                      |
|  |         Saved RBP (frame pointer)        | [RBP]     <-- RBP    |
|  +------------------------------------------+                      |
|  |         Local variable 1                 | [RBP - 8]            |
|  +------------------------------------------+                      |
|  |         Local variable 2                 | [RBP - 16]           |
|  +------------------------------------------+                      |
|  |         ...                              | <-- RSP              |
|  +------------------------------------------+                      |
|  |         Red zone (128 bytes)             | [RSP - 128, RSP)     |
|  +------------------------------------------+                      |
|  LOW ADDRESSES                                                     |
|                                                                    |
|  NOTES:                                                            |
|  - Stack must be 16-byte aligned before CALL instruction           |
|  - Red zone: 128 bytes below RSP that leaf functions can use       |
|    without adjusting RSP (System V only, not Windows)              |
|  - CALL pushes an 8-byte return address, so at function entry      |
|    RSP + 8 is a multiple of 16 (RSP itself is not)                 |
|                                                                    |
+------------------------------------------------------------------+
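The offsets in the diagram follow a simple pattern, which can be written down as arithmetic. The sketch below assumes every stack-passed argument occupies exactly one 8-byte slot (true for scalars up to 8 bytes, not for long double or structs) and that RSP was 16-byte aligned before the caller placed any stack arguments. All three function names are hypothetical.

```c
/* Offset of the k-th (0-based) stack-passed argument from RSP at function
 * entry: the return address occupies [RSP], so arguments start at +8. */
static int stack_arg_offset_rsp(int k)
{
    return 8 + 8 * k;
}

/* Same argument relative to RBP after the usual "push rbp; mov rbp, rsp"
 * prologue: saved RBP at [RBP], return address at [RBP + 8]. */
static int stack_arg_offset_rbp(int k)
{
    return 16 + 8 * k;
}

/* Padding bytes the caller needs so RSP is 16-byte aligned at the CALL,
 * given n stack arguments of 8 bytes each. */
static int call_padding(int n_stack_args)
{
    return (n_stack_args % 2 != 0) ? 8 : 0;
}
```

With these, the first overflow argument (argument 7 of an all-integer function) lands at [RSP + 8] on entry and [RBP + 16] after the prologue, as the diagram shows.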

2.2 Why This Matters

Understanding calling conventions is essential for:

1. Foreign Function Interface (FFI)

// Python calling C code (ctypes/cffi)
// Must match the ABI exactly or crash/corrupt data

// If you misunderstand the ABI:
// - Arguments get wrong values
// - Return values are garbage
// - Stack corruption causes crashes

2. Writing Assembly

; If you're writing hand-optimized assembly:
; - Must know where arguments arrive
; - Must preserve the right registers
; - Must return values in the right place

3. Debugging Core Dumps

When examining a crash in GDB:
- Register values tell you argument values
- Understanding the ABI helps trace the call

4. Performance Optimization

// Knowing that integer args 1-6 go in registers:
// - Functions with <= 6 integer/pointer args pass none of them through memory
// - Forwarding parameters in the same order avoids register shuffling in wrappers
// - Choosing types affects register usage (int vs float)

5. Security Research

ROP/JOP exploits require understanding:
- Where return addresses are stored
- How to pass "arguments" to gadgets
- Register state at function entry

2.3 Historical Context

+------------------------------------------------------------------+
|                  EVOLUTION OF CALLING CONVENTIONS                  |
+------------------------------------------------------------------+
|                                                                    |
|  1970s-1980s: CHAOS                                                |
|  - Each compiler had its own convention                            |
|  - Calling code from different compilers was painful               |
|                                                                    |
|  1980s: CDECL (C Declaration)                                      |
|  - Arguments pushed on stack right-to-left                         |
|  - Caller cleans up stack                                          |
|  - Works for variadic functions (printf)                           |
|  - EAX holds return value                                          |
|                                                                    |
|  1990s: FASTCALL variants                                          |
|  - Pass first 2-3 args in registers (ECX, EDX)                     |
|  - Rest on stack                                                   |
|  - Faster but incompatible implementations                         |
|                                                                    |
|  2000s: AMD64/x86-64                                               |
|  - Two major ABIs diverged:                                        |
|    * System V AMD64: Linux, macOS, BSD (6 integer regs)            |
|    * Microsoft x64: Windows (4 integer regs, shadow space)         |
|                                                                    |
|  2010s-present: ARM64                                              |
|  - x0-x7 for integer arguments (8 registers!)                      |
|  - Cleaner design, learned from x86 mistakes                       |
|                                                                    |
+------------------------------------------------------------------+

2.4 Common Misconceptions

+------------------------------------------------------------------+
|                    CALLING CONVENTION MYTHS                        |
+------------------------------------------------------------------+
|                                                                    |
|  MYTH 1: "All arguments go on the stack"                           |
|  -------                                                           |
|  REALITY: In System V AMD64, first 6 integer args go in registers. |
|           Stack is only used for overflow arguments.               |
|           This is much faster than pure stack passing.             |
|                                                                    |
|  MYTH 2: "Calling conventions are the same everywhere"             |
|  -------                                                           |
|  REALITY: Windows x64 uses different registers (RCX, RDX, R8, R9)  |
|           and requires 32-byte shadow space. Linux code won't work.|
|                                                                    |
|  MYTH 3: "The compiler handles it, I don't need to know"           |
|  -------                                                           |
|  REALITY: True for pure C. But FFI, assembly, debugging, and       |
|           reverse engineering all require this knowledge.          |
|                                                                    |
|  MYTH 4: "Structs are always passed by pointer"                    |
|  -------                                                           |
|  REALITY: Small structs (up to 16 bytes) can be passed in          |
|           registers. The classification algorithm is complex.      |
|                                                                    |
|  MYTH 5: "Return values are always in RAX"                         |
|  -------                                                           |
|  REALITY: Floats return in XMM0. Large structs use a hidden        |
|           pointer parameter. double _Complex uses XMM0:XMM1.       |
|                                                                    |
+------------------------------------------------------------------+

3. Project Specification

3.1 What You Will Build

A calling convention visualizer that parses C function prototypes and shows exactly where each argument is passed:

$ ./callconv "int add(int a, int b, int c, int d, int e, int f, int g)"

Function: int add(int a, int b, int c, int d, int e, int f, int g)

Argument Passing (System V AMD64):
+------+-------+-------+-----------+---------------------------+
| Arg  | Name  | Type  | Passed In | Notes                     |
+------+-------+-------+-----------+---------------------------+
|  1   | a     | int   | EDI       | Integer arg #1 (32-bit)   |
|  2   | b     | int   | ESI       | Integer arg #2 (32-bit)   |
|  3   | c     | int   | EDX       | Integer arg #3 (32-bit)   |
|  4   | d     | int   | ECX       | Integer arg #4 (32-bit)   |
|  5   | e     | int   | R8D       | Integer arg #5 (32-bit)   |
|  6   | f     | int   | R9D       | Integer arg #6 (32-bit)   |
|  7   | g     | int   | [RSP+8]   | Overflow arg, on stack    |
+------+-------+-------+-----------+---------------------------+

Return Value: EAX (32-bit integer)

Stack Frame at Function Entry:
  +------------------+
  | Argument 7 (g)   | [RSP + 8]
  +------------------+
  | Return Address   | [RSP]
  +------------------+

3.2 Functional Requirements

FR1: Parse C Function Prototypes

  • Handle basic types: void, char, short, int, long, long long
  • Handle floating-point: float, double, long double
  • Handle pointers: int *, char **, void *
  • Handle const qualifiers: const char *
  • Handle unsigned: unsigned int, unsigned long
  • Handle typedefs: size_t, int64_t (common ones)

FR2: Show Register Allocation

  • Display which register each argument uses
  • Show correct subregister names (EDI vs RDI for int vs long)
  • Mark stack-passed (overflow) arguments with their offset
  • Distinguish INTEGER, SSE, and MEMORY class arguments
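To choose between EDI and RDI, or between an integer register and an XMM register, the tool first needs the size and class of each scalar type. One possible shape for that is a small lookup table, sketched below. The ScalarInfo struct, scalar_table, and lookup_scalar are hypothetical names, the sizes assume the LP64 data model used by System V AMD64, and long double and pointers are deliberately left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: size and register class of a scalar type
 * under LP64 (long and pointers are 8 bytes). */
typedef struct {
    const char *name;
    size_t size;     /* bytes */
    int is_sse;      /* 1 = SSE class, 0 = INTEGER class */
} ScalarInfo;

static const ScalarInfo scalar_table[] = {
    {"char", 1, 0},  {"short", 2, 0},     {"int", 4, 0},
    {"long", 8, 0},  {"long long", 8, 0}, {"size_t", 8, 0},
    {"int64_t", 8, 0}, {"float", 4, 1},   {"double", 8, 1},
};

/* Returns the entry for `name`, or NULL if the type is not in the table. */
static const ScalarInfo *lookup_scalar(const char *name)
{
    for (size_t i = 0; i < sizeof scalar_table / sizeof scalar_table[0]; i++)
        if (strcmp(scalar_table[i].name, name) == 0)
            return &scalar_table[i];
    return NULL;
}
```

A real parser would normalize the specifier list first (for example, treating "unsigned long int" as "long" for sizing purposes) before consulting a table like this.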

FR3: Show Return Value Location

  • Integer returns in RAX/EAX
  • Floating-point returns in XMM0
  • Large struct returns via hidden pointer

FR4: Handle Mixed Arguments

$ ./callconv "double compute(int x, double y, int z, float w)"

+------+-------+---------+-----------+---------------------------+
| Arg  | Name  | Type    | Passed In | Notes                     |
+------+-------+---------+-----------+---------------------------+
|  1   | x     | int     | EDI       | Integer arg #1            |
|  2   | y     | double  | XMM0      | SSE arg #1 (64-bit)       |
|  3   | z     | int     | ESI       | Integer arg #2            |
|  4   | w     | float   | XMM1      | SSE arg #2 (32-bit)       |
+------+-------+---------+-----------+---------------------------+

Return Value: XMM0 (double, 64-bit)
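The key point in the mixed example is that integer and SSE arguments draw from two independent counters. A minimal sketch of that allocation loop follows; it handles scalar arguments only, prints full 64-bit register names rather than sub-registers, and assumes each stack argument takes one 8-byte slot. ArgClass and assign_locations are hypothetical names.

```c
#include <stdio.h>

typedef enum { ARG_INTEGER, ARG_SSE } ArgClass;   /* scalars only */

/* Writes the location of argument i into out[i] (16 bytes each).
 * Stack offsets are relative to RSP at function entry. */
static void assign_locations(const ArgClass *args, int n, char out[][16])
{
    static const char *const int_regs[] = {
        "RDI", "RSI", "RDX", "RCX", "R8", "R9"
    };
    int next_int = 0;    /* next free integer register */
    int next_sse = 0;    /* next free XMM register */
    int stack_off = 8;   /* [RSP] holds the return address */

    for (int i = 0; i < n; i++) {
        if (args[i] == ARG_INTEGER && next_int < 6) {
            snprintf(out[i], 16, "%s", int_regs[next_int++]);
        } else if (args[i] == ARG_SSE && next_sse < 8) {
            snprintf(out[i], 16, "XMM%d", next_sse++);
        } else {
            /* Register file for this class is exhausted: use the stack. */
            snprintf(out[i], 16, "[RSP+%d]", stack_off);
            stack_off += 8;
        }
    }
}
```

For the classes {INTEGER, SSE, INTEGER, SSE} this produces RDI, XMM0, RSI, XMM1, the same assignment as the table above.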

FR5: Generate Verification Assembly

$ ./callconv "int add(int a, int b)" --verify

; Verification assembly (paste into Godbolt):
; Arguments should appear in EDI and ESI

global verify_add
verify_add:
    ; int a is in EDI
    ; int b is in ESI
    mov eax, edi      ; Copy first arg to return
    add eax, esi      ; Add second arg
    ret
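Generating this kind of listing is mostly string formatting. The sketch below emits a NASM-style routine that sums up to six int arguments, in the same style as the output above. The function name emit_verify_sum, its buffer-based interface, and the "sum all arguments" body are assumptions made for illustration, not part of the specification.

```c
#include <stdio.h>

/* Hypothetical generator: writes a routine that returns the sum of its
 * first n_args int arguments (1..6) into buf. Returns the number of
 * characters written, or -1 on bad input or if buf is too small. */
static int emit_verify_sum(char *buf, size_t cap, const char *name, int n_args)
{
    static const char *const regs32[] = {
        "edi", "esi", "edx", "ecx", "r8d", "r9d"
    };
    size_t len;
    int n;

    if (n_args < 1 || n_args > 6)
        return -1;

    n = snprintf(buf, cap, "global verify_%s\nverify_%s:\n    mov eax, edi\n",
                 name, name);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    len = (size_t)n;

    for (int i = 1; i < n_args; i++) {
        n = snprintf(buf + len, cap - len, "    add eax, %s\n", regs32[i]);
        if (n < 0 || (size_t)n >= cap - len)
            return -1;
        len += (size_t)n;
    }

    n = snprintf(buf + len, cap - len, "    ret\n");
    if (n < 0 || (size_t)n >= cap - len)
        return -1;
    return (int)(len + (size_t)n);
}
```

Calling emit_verify_sum(buf, sizeof buf, "add", 2) reproduces the mov/add/ret body shown above, without the explanatory comments.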

FR6: Compare with Windows x64

$ ./callconv "int add(int a, int b, int c, int d, int e)" --windows

Comparison: System V AMD64 vs Windows x64

System V AMD64:
  arg1: EDI    arg2: ESI    arg3: EDX    arg4: ECX    arg5: R8D

Windows x64:
  arg1: ECX    arg2: EDX    arg3: R8D    arg4: R9D    arg5: [RSP+40]
  (Note: Windows requires 32-byte shadow space)
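Unlike System V, Windows x64 assigns registers by argument position: the first four positions each map to one fixed integer register or one fixed XMM register, and everything after that goes on the stack above the 32-byte shadow space. A rough sketch of that rule for scalar arguments is shown below; win64_location is a hypothetical name, and structs and variadic calls are not handled.

```c
#include <stdio.h>

/* Hypothetical helper: location of the argument at 0-based position
 * `index` under Windows x64. `is_float` selects the XMM register for
 * float/double. Stack offsets are relative to RSP at function entry. */
static void win64_location(int index, int is_float, char *out, size_t out_size)
{
    static const char *const int_regs[] = {"RCX", "RDX", "R8", "R9"};

    if (index < 4) {
        if (is_float)
            snprintf(out, out_size, "XMM%d", index);   /* position-based */
        else
            snprintf(out, out_size, "%s", int_regs[index]);
    } else {
        /* 8 bytes of return address, then one 8-byte slot per position;
         * slots 0..3 are the shadow space. */
        snprintf(out, out_size, "[RSP+%d]", 8 + 8 * index);
    }
}
```

Position 4 (the fifth argument) comes out as [RSP+40], matching the comparison output above. Note that a double in position 1 uses XMM1 even if position 0 was an integer.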

3.3 Non-Functional Requirements

  • Accuracy: Must match actual compiler behavior (verify with GCC/Clang)
  • Performance: Parsing should be instantaneous for single prototypes
  • Educational Value: Output should teach, not just inform
  • Portability: Should compile on Linux and macOS

3.4 Example Usage / Output

Example 1: Simple Function

$ ./callconv "void hello(void)"

Function: void hello(void)

No arguments to pass.
Return Value: None (void)

Example 2: Variadic Function

$ ./callconv "int printf(const char *fmt, ...)"

Function: int printf(const char *fmt, ...)

+------+-------+---------------+-----------+---------------------------+
| Arg  | Name  | Type          | Passed In | Notes                     |
+------+-------+---------------+-----------+---------------------------+
|  1   | fmt   | const char *  | RDI       | Integer arg #1 (pointer)  |
|  ... | ...   | variadic      | varies    | See below                 |
+------+-------+---------------+-----------+---------------------------+

Variadic Arguments:
  - For variadic calls, AL must hold an upper bound on the number of vector registers used (0-8)
  - Integer variadic args continue in RSI, RDX, RCX, R8, R9, then stack
  - Floating-point variadic args (float is promoted to double) use XMM0-XMM7, then stack

Return Value: EAX (32-bit integer)
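For a specific call site, the value to place in AL can be computed by counting floating-point arguments, capped at the eight available XMM registers. The sketch below assumes the caller has already reduced each argument to a flag (1 for float/double, 0 otherwise); al_for_variadic_call is a hypothetical name.

```c
/* Hypothetical helper: value a caller could load into AL before calling
 * a variadic function. is_fp[i] is 1 if argument i is floating-point. */
static int al_for_variadic_call(const int *is_fp, int n)
{
    int used = 0;

    for (int i = 0; i < n && used < 8; i++)
        if (is_fp[i])
            used++;      /* each FP argument takes one XMM register */
    return used;         /* never more than 8 (XMM0-XMM7) */
}
```

For a call like printf("%d %f %f", 1, 2.0, 3.0) the flags are {0, 0, 1, 1}, so the result is 2.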

Example 3: Struct Parameters

$ ./callconv "void process(struct point p)" --struct "struct point { int x; int y; }"

Function: void process(struct point p)

Struct Analysis:
  struct point { int x; int y; }
  Size: 8 bytes
  Classification: INTEGER, INTEGER

+------+-------+----------------+-----------+---------------------------+
| Arg  | Name  | Type           | Passed In | Notes                     |
+------+-------+----------------+-----------+---------------------------+
|  1   | p     | struct point   | RDI       | Packed into single reg    |
|      |       | (8 bytes)      |           | x in low 32, y in high 32 |
+------+-------+----------------+-----------+---------------------------+

Return Value: None (void)

Example 4: Large Struct Return

$ ./callconv "struct big getBig(int x)" --struct "struct big { long a; long b; long c; }"

Function: struct big getBig(int x)

Struct Analysis (Return Type):
  struct big { long a; long b; long c; }
  Size: 24 bytes (> 16 bytes)
  Classification: MEMORY (too large for registers)

+------+-------+---------------+-----------+---------------------------+
| Arg  | Name  | Type          | Passed In | Notes                     |
+------+-------+---------------+-----------+---------------------------+
|  0   | (ret) | struct big *  | RDI       | HIDDEN: caller allocates  |
|  1   | x     | int           | ESI       | Shifted to arg #2 slot!   |
+------+-------+---------------+-----------+---------------------------+

Return Value:
  - RAX contains pointer to result (same as hidden arg)
  - Callee writes to memory pointed by hidden RDI

NOTE: The hidden pointer shifts all other arguments by one position!
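In an allocator, the shift described in the note can be handled by starting the integer register counter at 1 instead of 0 whenever the return value lives in memory. The sketch below uses the simplified "larger than 16 bytes" test from this section; the full ABI rule has more cases (for example, structs with unaligned fields). Both function names are hypothetical.

```c
#include <stddef.h>

/* Simplified rule: aggregates larger than 16 bytes are returned in
 * memory through a hidden pointer. */
static int returns_in_memory(size_t ret_size)
{
    return ret_size > 16;
}

/* Index into {RDI, RSI, RDX, RCX, R8, R9} of the first register that is
 * available to visible integer arguments. */
static int first_visible_int_slot(size_t ret_size)
{
    return returns_in_memory(ret_size) ? 1 : 0;   /* RDI taken by hidden ptr */
}
```

For struct big (24 bytes) the first visible integer argument starts at slot 1, which is why x appears in ESI in the table above.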

3.5 Real World Outcome

After building this tool, you will be able to:

  1. Debug FFI calls when Python/Ruby/Go calls C code and gets garbage values
  2. Write correct inline assembly that receives arguments properly
  3. Read disassembly and immediately know what arguments a function received
  4. Understand ABI breaks when library updates change struct sizes
  5. Answer interview questions about low-level function call mechanics

4. Solution Architecture

4.1 High-Level Design

+------------------------------------------------------------------+
|                    CALLCONV ARCHITECTURE                           |
+------------------------------------------------------------------+
|                                                                    |
|  Input: "int add(int a, int b, int c)"                             |
|                |                                                   |
|                v                                                   |
|  +----------------------------+                                    |
|  |      PROTOTYPE PARSER      |                                    |
|  |  - Tokenize the string     |                                    |
|  |  - Parse return type       |                                    |
|  |  - Parse function name     |                                    |
|  |  - Parse parameters        |                                    |
|  +----------------------------+                                    |
|                |                                                   |
|                v                                                   |
|  +----------------------------+                                    |
|  |     TYPE CLASSIFIER        |                                    |
|  |  - Classify each type      |                                    |
|  |  - INTEGER, SSE, MEMORY    |                                    |
|  |  - Handle structs          |                                    |
|  +----------------------------+                                    |
|                |                                                   |
|                v                                                   |
|  +----------------------------+                                    |
|  |    REGISTER ALLOCATOR      |                                    |
|  |  - Assign registers        |                                    |
|  |  - Track overflow to stack |                                    |
|  |  - Handle hidden pointers  |                                    |
|  +----------------------------+                                    |
|                |                                                   |
|                v                                                   |
|  +----------------------------+                                    |
|  |      OUTPUT FORMATTER      |                                    |
|  |  - Generate table          |                                    |
|  |  - Generate assembly       |                                    |
|  |  - Compare ABIs            |                                    |
|  +----------------------------+                                    |
|                |                                                   |
|                v                                                   |
|  Output: Formatted table and optional assembly                     |
|                                                                    |
+------------------------------------------------------------------+

4.2 Key Components

1. Prototype Parser (parser.c)

  • Lexer to tokenize C declarations
  • Simple recursive descent parser
  • Handles type specifiers, qualifiers, pointers, arrays

2. Type Classifier (classifier.c)

  • Maps C types to ABI classes (INTEGER, SSE, SSEUP, X87, MEMORY, etc.)
  • Implements struct classification algorithm
  • Handles alignment requirements

3. Register Allocator (allocator.c)

  • Tracks available integer registers (RDI, RSI, RDX, RCX, R8, R9)
  • Tracks available SSE registers (XMM0-XMM7)
  • Calculates stack offsets for overflow arguments

4. Output Formatter (output.c)

  • Pretty-prints tables
  • Generates verification assembly
  • Produces comparison output for different ABIs

4.3 Data Structures

/* Type classification per System V AMD64 ABI */
typedef enum {
    CLASS_INTEGER,   /* Passed in integer registers */
    CLASS_SSE,       /* Passed in the low eightbyte of an SSE register */
    CLASS_SSEUP,     /* Upper eightbyte of that same register (e.g. __m128) */
    CLASS_X87,       /* x87 floating point (long double) */
    CLASS_X87UP,     /* x87 second eightbyte */
    CLASS_COMPLEX_X87,
    CLASS_MEMORY,    /* Passed on stack */
    CLASS_NO_CLASS   /* Padding, empty */
} TypeClass;

/* Basic type information */
typedef enum {
    TYPE_VOID,
    TYPE_CHAR,
    TYPE_SHORT,
    TYPE_INT,
    TYPE_LONG,
    TYPE_LONGLONG,
    TYPE_FLOAT,
    TYPE_DOUBLE,
    TYPE_LONGDOUBLE,
    TYPE_POINTER,
    TYPE_STRUCT,
    TYPE_UNION,
    TYPE_ARRAY
} BaseType;

/* Parsed type representation */
typedef struct Type {
    BaseType base;
    int is_unsigned;
    int is_const;
    int pointer_depth;      /* 0 = not pointer, 1 = *, 2 = **, etc. */
    size_t size;            /* Size in bytes */
    size_t alignment;       /* Alignment requirement */
    struct Type *pointee;   /* For pointers, what we point to */
    /* For structs/arrays */
    struct StructDef *struct_def;
    size_t array_size;
} Type;

/* Parsed parameter */
typedef struct {
    char name[64];          /* Parameter name (may be empty) */
    Type type;              /* Parameter type */
    TypeClass classes[2];   /* ABI classification (for 16-byte types) */
} Parameter;

/* Parsed function prototype */
typedef struct {
    Type return_type;
    char name[64];
    Parameter params[32];
    int param_count;
    int is_variadic;
} FunctionPrototype;

/* Register allocation result */
typedef struct {
    int param_index;        /* Which parameter */
    int is_stack;           /* True if passed on stack */
    union {
        struct {
            const char *reg_name;  /* "RDI", "XMM0", etc. */
            const char *sub_reg;   /* "EDI", etc. for 32-bit */
        } reg;
        struct {
            int offset;     /* Offset from RSP after CALL */
        } stack;
    } location;
    TypeClass class;
} AllocationResult;

4.4 Algorithm Overview

The Classification Algorithm for Structs (Simplified)

The System V AMD64 ABI has a complex algorithm for classifying structs. Here is the essence:

+------------------------------------------------------------------+
|             STRUCT CLASSIFICATION ALGORITHM (SIMPLIFIED)           |
+------------------------------------------------------------------+
|                                                                    |
|  1. If size > 16 bytes (or has unaligned fields):                  |
|     -> CLASS_MEMORY (arg: copied to stack; return: hidden ptr)     |
|                                                                    |
|  2. Split struct into 8-byte "eightbytes"                          |
|                                                                    |
|  3. For each eightbyte, determine class:                           |
|     - If any field is an integer/pointer -> CLASS_INTEGER          |
|     - If all fields are float/double -> CLASS_SSE                  |
|     - A long double field -> CLASS_X87, CLASS_X87UP                |
|     - Mixed INTEGER + SSE in one eightbyte merges to INTEGER       |
|                                                                    |
|  4. Post-process:                                                  |
|     - If any eightbyte is MEMORY -> whole struct is MEMORY         |
|     - Each INTEGER eightbyte -> next free integer register         |
|     - Each SSE eightbyte -> next free SSE register                 |
|                                                                    |
+------------------------------------------------------------------+

Example 1: struct { int x; int y; }   (8 bytes)
  -> One eightbyte, both ints -> CLASS_INTEGER
  -> Pass in RDI

Example 2: struct { double x; double y; }  (16 bytes)
  -> Two eightbytes, both doubles -> CLASS_SSE, CLASS_SSE
  -> Pass x in XMM0, y in XMM1 (SSEUP applies only to vector types such as __m128)

Example 3: struct { int x; double y; }  (16 bytes)
  -> First eightbyte: int (4) + padding (4) -> CLASS_INTEGER
  -> Second eightbyte: double -> CLASS_SSE
  -> Pass x in RDI, y in XMM0

Example 4: struct { long a; long b; long c; }  (24 bytes)
  -> Size > 16 bytes -> CLASS_MEMORY
  -> As an argument: copied onto the stack
  -> As a return value: hidden pointer in RDI
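The per-eightbyte classification can be sketched in a few lines of C. This is a simplified version under stated assumptions: it handles only structs whose fields are integer, pointer, float, or double scalars (no nested structs, arrays, or long double), it takes field offsets and sizes as input rather than computing layout, and the names EbClass, Field, merge, and classify_struct are hypothetical. In the merge step, an eightbyte that mixes an integer and a float is classified INTEGER.

```c
#include <stddef.h>

typedef enum { EB_NO_CLASS, EB_INTEGER, EB_SSE, EB_MEMORY } EbClass;

typedef struct {
    size_t offset;   /* byte offset within the struct */
    size_t size;     /* field size in bytes (1, 2, 4, or 8) */
    int is_float;    /* 1 for float/double, 0 for integer/pointer */
} Field;

/* Merge rule for this subset: MEMORY wins, then INTEGER, then SSE. */
static EbClass merge(EbClass a, EbClass b)
{
    if (a == b) return a;
    if (a == EB_NO_CLASS) return b;
    if (b == EB_NO_CLASS) return a;
    if (a == EB_MEMORY || b == EB_MEMORY) return EB_MEMORY;
    if (a == EB_INTEGER || b == EB_INTEGER) return EB_INTEGER;
    return EB_SSE;
}

/* Classifies a struct; out[0] and out[1] describe its two eightbytes. */
static void classify_struct(const Field *f, int n, size_t total_size,
                            EbClass out[2])
{
    out[0] = out[1] = EB_NO_CLASS;

    if (total_size > 16) {                     /* too large for registers */
        out[0] = out[1] = EB_MEMORY;
        return;
    }
    for (int i = 0; i < n; i++) {
        if (f[i].offset % f[i].size != 0) {    /* unaligned field */
            out[0] = out[1] = EB_MEMORY;
            return;
        }
        size_t eb = f[i].offset / 8;           /* which eightbyte: 0 or 1 */
        out[eb] = merge(out[eb], f[i].is_float ? EB_SSE : EB_INTEGER);
    }
}
```

For struct { int x; double y; } the fields are {0, 4, 0} and {8, 8, 1} with a total size of 16, giving INTEGER for the first eightbyte and SSE for the second, as in Example 3.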

5. Implementation Guide

5.1 Development Environment Setup

Required Tools:

# Compiler and debugger
gcc --version          # or clang --version
gdb --version          # or lldb on macOS

# For verification
objdump --version
# Or Compiler Explorer (godbolt.org) for online verification

Recommended Compiler Flags:

CFLAGS = -Wall -Wextra -Wpedantic -std=c11 -g -O0

5.2 Project Structure

calling-convention-visualizer/
|-- include/
|   |-- callconv.h         # Main header
|   |-- parser.h           # Prototype parser
|   |-- classifier.h       # Type classifier
|   |-- allocator.h        # Register allocator
|   |-- output.h           # Output formatting
|   |-- types.h            # Type definitions
|-- src/
|   |-- main.c             # Entry point, CLI
|   |-- parser.c           # Prototype parsing
|   |-- lexer.c            # Tokenization
|   |-- classifier.c       # ABI classification
|   |-- allocator.c        # Register allocation
|   |-- output.c           # Table formatting
|   |-- verify.c           # Assembly generation
|   |-- windows_abi.c      # Windows x64 rules
|-- tests/
|   |-- test_parser.c      # Parser tests
|   |-- test_allocator.c   # Allocation tests
|   |-- verify/            # Verification programs
|   |   |-- verify_int_args.c
|   |   |-- verify_float_args.c
|   |   |-- verify_mixed.c
|   |   |-- verify_structs.c
|-- Makefile
|-- README.md

5.3 The Core Question You’re Answering

“How does the calling convention determine where function arguments go, and how can we visualize this mapping for any C function prototype?”

5.4 Concepts You Must Understand First

Before implementing, ensure you understand:

1. INTEGER class arguments in System V AMD64:

  • Arguments go in: RDI, RSI, RDX, RCX, R8, R9 (in that order)
  • After 6 integer args, overflow to stack
  • Pointers are INTEGER class (64-bit)
  • int uses 32-bit subregisters (EDI, ESI, etc.)

2. SSE class arguments:

  • Arguments go in: XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7
  • After 8 SSE args, overflow to stack
  • float uses low 32 bits of XMM register
  • double uses low 64 bits of XMM register

3. Return value location:

  • INTEGER class: RAX (and RDX for 128-bit)
  • SSE class: XMM0 (and XMM1 for 128-bit)
  • Large structs (>16 bytes): Hidden first parameter (pointer in RDI)

4. Struct classification:

  • Structs <= 16 bytes are split into “eightbytes”
  • Each eightbyte gets classified
  • If any part is MEMORY, whole struct is MEMORY
  • Structs > 16 bytes always use MEMORY class
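
The subregister note in item 1 extends to every scalar width, not just int. As a sketch, a lookup keyed by argument index and size in bytes could look like the following; the function name `int_arg_reg` and the table layout are illustrative assumptions, while the register names are the standard x86-64 ones.

```c
#include <stddef.h>

/* Rows: argument index 0..5. Columns: 8-, 4-, 2-, 1-byte views of one register. */
static const char *const INT_ARG_REGS[6][4] = {
    {"RDI", "EDI", "DI",  "DIL"},
    {"RSI", "ESI", "SI",  "SIL"},
    {"RDX", "EDX", "DX",  "DL"},
    {"RCX", "ECX", "CX",  "CL"},
    {"R8",  "R8D", "R8W", "R8B"},
    {"R9",  "R9D", "R9W", "R9B"},
};

/*
 * Returns the register name for the index-th INTEGER class argument of the
 * given size, or NULL when the argument overflows to the stack or the size
 * is not a scalar width.
 */
static const char *int_arg_reg(int index, size_t size) {
    int col;
    switch (size) {
    case 8: col = 0; break;
    case 4: col = 1; break;
    case 2: col = 2; break;
    case 1: col = 3; break;
    default: return NULL;
    }
    if (index < 0 || index >= 6) return NULL;
    return INT_ARG_REGS[index][col];
}
```

For example, `int_arg_reg(0, 4)` gives "EDI" for a leading int, and `int_arg_reg(6, 4)` gives NULL because the seventh integer argument lives on the stack.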

5.5 Questions to Guide Your Design

Parsing:

  1. How will you tokenize type specifiers (unsigned, long, const)?
  2. How will you handle arbitrary pointer depth (char ***)?
  3. How will you parse unnamed parameters (void foo(int, int))?

Classification:

  1. How will you handle long vs long long (may differ by platform)?
  2. How will you implement the struct classification algorithm?
  3. How will you handle arrays (decay to pointers)?

Allocation:

  1. How will you track separate counts for INTEGER and SSE registers?
  2. How will you calculate stack offsets for overflow arguments?
  3. How will you handle the hidden return pointer for large structs?

Output:

  1. How will you generate correct subregister names (RDI vs EDI)?
  2. How will you show struct packing in registers?
  3. How will you verify correctness against actual compiler output?

5.6 Thinking Exercise

Before coding, manually classify these functions:

// Exercise 1: Where do the arguments go?
void func1(int a, long b, int *c, double d, float e, long long f, int g, double h);

// Exercise 2: What about the return value?
double func2(void);
struct small { int x; int y; };
struct small func3(void);
struct big { long a; long b; long c; };
struct big func4(void);

// Exercise 3: Hidden parameters?
struct big func5(int x);  // How many registers does x use?

Answers:

Exercise 1:

a: EDI (int, 32-bit in RDI)
b: RSI (long, 64-bit)
c: RDX (pointer, 64-bit)
d: XMM0 (double)
e: XMM1 (float, low 32 bits)
f: RCX (long long, 64-bit)
g: R8D (int, 32-bit in R8)
h: XMM2 (double)

Integer regs used: 5 (RDI, RSI, RDX, RCX, R8)
SSE regs used: 3 (XMM0, XMM1, XMM2)
No stack overflow.

Exercise 2:

func2: returns in XMM0
func3: struct is 8 bytes, one INTEGER eightbyte -> returns in RAX
func4: struct is 24 bytes > 16 -> MEMORY class -> hidden pointer

Exercise 3:

func5 has hidden return pointer in RDI
Therefore x is in ESI (not EDI!)
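
Manual answers like these can be checked mechanically by simulating the two register counters. The sketch below assumes each parameter has already been reduced to a class character ('i' for INTEGER, 's' for SSE); the function `assign_locations` and its string encoding are hypothetical conveniences, and structs, X87, and subregister naming are left out.

```c
#include <stdio.h>

/*
 * classes:    one char per parameter, 'i' = INTEGER, 's' = SSE.
 * hidden_ret: nonzero when the return value is MEMORY class (consumes RDI).
 * out:        receives a space-separated list such as "RDI XMM0 RSI".
 */
static void assign_locations(const char *classes, int hidden_ret,
                             char *out, size_t out_size) {
    static const char *const int_regs[] = {"RDI", "RSI", "RDX", "RCX", "R8", "R9"};
    int next_int = hidden_ret ? 1 : 0;
    int next_sse = 0;
    int stack_off = 8;   /* at function entry, [RSP] holds the return address */
    size_t used = 0;

    if (out_size == 0) return;
    out[0] = '\0';
    for (const char *p = classes; *p; p++) {
        char loc[16];
        if (*p == 'i' && next_int < 6) {
            snprintf(loc, sizeof loc, "%s", int_regs[next_int++]);
        } else if (*p == 's' && next_sse < 8) {
            snprintf(loc, sizeof loc, "XMM%d", next_sse++);
        } else {
            /* Register class exhausted: the argument takes one 8-byte stack slot */
            snprintf(loc, sizeof loc, "[RSP+%d]", stack_off);
            stack_off += 8;
        }
        int n = snprintf(out + used, out_size - used, "%s%s", used ? " " : "", loc);
        if (n < 0 || (size_t)n >= out_size - used) break;   /* output buffer full */
        used += (size_t)n;
    }
}
```

Exercise 1 corresponds to the class string "iiissiis" with no hidden pointer, and Exercise 3 to "i" with `hidden_ret` set, which yields RSI for x.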

5.7 Hints in Layers

Hint 1: Starting the Lexer

Start with a simple tokenizer that recognizes:

- Type keywords: `void`, `char`, `short`, `int`, `long`, `float`, `double`
- Qualifiers: `const`, `unsigned`, `signed`, `volatile`
- Symbols: `*`, `(`, `)`, `,`, `...`
- Identifiers: parameter names, function name

```c
typedef enum {
    TOK_VOID, TOK_CHAR, TOK_SHORT, TOK_INT, TOK_LONG,
    TOK_FLOAT, TOK_DOUBLE,
    TOK_UNSIGNED, TOK_SIGNED, TOK_CONST, TOK_VOLATILE,
    TOK_STRUCT, TOK_UNION,
    TOK_STAR, TOK_LPAREN, TOK_RPAREN, TOK_COMMA, TOK_ELLIPSIS,
    TOK_IDENT, TOK_EOF
} TokenType;
```

Hint 2: Parsing Types

Types in C declarations have:

1. Declaration specifiers (const, unsigned, int, etc.)
2. Optional pointer declarator (*, **, etc.)

```c
Type parse_type(Lexer *lex) {
    Type t = {0};   // Assumes base value 0 (here called TYPE_NONE) means "unset"

    // Parse specifiers
    while (is_type_specifier(peek(lex))) {
        Token tok = next(lex);
        switch (tok.type) {
        case TOK_CONST:    t.is_const = 1; break;
        case TOK_UNSIGNED: t.is_unsigned = 1; break;
        case TOK_INT:
            // Keep a wider base already seen, so "long int" stays long
            if (t.base == TYPE_NONE) t.base = TYPE_INT;
            break;
        case TOK_LONG:
            if (t.base == TYPE_LONG) t.base = TYPE_LONGLONG;
            else t.base = TYPE_LONG;
            break;
        // ... etc
        }
    }

    // Parse pointer depth
    while (peek(lex).type == TOK_STAR) {
        next(lex);
        t.pointer_depth++;
    }
    return t;
}
```

Hint 3: Type Classification

The basic classification is simpler than full ABI compliance:

```c
TypeClass classify_type(Type *t) {
    // Pointers are always INTEGER
    if (t->pointer_depth > 0) return CLASS_INTEGER;

    switch (t->base) {
    case TYPE_VOID:
        return CLASS_NO_CLASS;
    case TYPE_CHAR:
    case TYPE_SHORT:
    case TYPE_INT:
    case TYPE_LONG:
    case TYPE_LONGLONG:
        return CLASS_INTEGER;
    case TYPE_FLOAT:
    case TYPE_DOUBLE:
        return CLASS_SSE;
    case TYPE_LONGDOUBLE:
        return CLASS_X87;
    case TYPE_STRUCT:
        return classify_struct(t->struct_def);
    default:
        return CLASS_MEMORY;
    }
}
```

Hint 4: Register Allocation

Track register usage separately for INTEGER and SSE:

```c
typedef struct {
    int next_int_reg;   /* 0-5 for RDI,RSI,RDX,RCX,R8,R9 */
    int next_sse_reg;   /* 0-7 for XMM0-XMM7 */
    int stack_offset;   /* Offset for stack arguments */
} AllocationState;

static const char *int_regs_64[] = {"RDI","RSI","RDX","RCX","R8","R9"};
static const char *int_regs_32[] = {"EDI","ESI","EDX","ECX","R8D","R9D"};
static const char *sse_regs[]    = {"XMM0","XMM1","XMM2","XMM3",
                                    "XMM4","XMM5","XMM6","XMM7"};

AllocationResult allocate_param(AllocationState *state, Parameter *param) {
    AllocationResult result = {0};
    TypeClass class = param->classes[0];

    if (class == CLASS_INTEGER) {
        if (state->next_int_reg < 6) {
            result.is_stack = 0;
            result.location.reg.reg_name = int_regs_64[state->next_int_reg];
            // Use 32-bit name for int
            if (param->type.base == TYPE_INT) {
                result.location.reg.sub_reg = int_regs_32[state->next_int_reg];
            }
            state->next_int_reg++;
        } else {
            result.is_stack = 1;
            result.location.stack.offset = state->stack_offset;
            state->stack_offset += 8;  // Always 8-byte aligned
        }
    } else if (class == CLASS_SSE) {
        // Same pattern, driven by the XMM counter
        if (state->next_sse_reg < 8) {
            result.is_stack = 0;
            result.location.reg.reg_name = sse_regs[state->next_sse_reg];
            state->next_sse_reg++;
        } else {
            result.is_stack = 1;
            result.location.stack.offset = state->stack_offset;
            state->stack_offset += 8;
        }
    }
    return result;
}
```

Hint 5: Handling Hidden Return Pointer

Large struct returns need a hidden first parameter:

```c
/* Defined first so process_function can call it without a separate prototype */
int needs_hidden_pointer(Type *ret_type) {
    if (ret_type->pointer_depth > 0) return 0;  // A pointer to a struct fits in RAX
    if (ret_type->base != TYPE_STRUCT) return 0;
    return ret_type->size > 16;  // Structs > 16 bytes
}

void process_function(FunctionPrototype *proto) {
    AllocationState state = {0};

    // Check if return type needs hidden pointer
    if (needs_hidden_pointer(&proto->return_type)) {
        // Hidden parameter takes RDI
        printf("Hidden return pointer in RDI\n");
        state.next_int_reg = 1;  // Start from RSI for user params
    }

    // Now allocate user parameters
    for (int i = 0; i < proto->param_count; i++) {
        AllocationResult res = allocate_param(&state, &proto->params[i]);
        // ...
    }
}
```

5.8 The Interview Questions They’ll Ask

  1. “Walk me through what happens at the ABI level when I call printf("Hello %d", 42)?”
    • "Hello %d" pointer goes in RDI
    • 42 goes in ESI (second integer arg)
    • AL is set to 0 (number of XMM registers used for varargs)
    • CALL instruction pushes return address and jumps
  2. “Why does Windows use different registers than Linux for the same CPU?”
    • Historical: Different teams designed ABIs independently
    • Windows: RCX, RDX, R8, R9 (4 regs), plus shadow space
    • System V: RDI, RSI, RDX, RCX, R8, R9 (6 regs), no shadow space
    • Both are valid; they just made different trade-offs
  3. “How would you debug an FFI call that’s returning garbage?”
    • Check if types match exactly (especially size and signedness)
    • Verify ABI convention matches (System V vs Windows)
    • Check if struct is being passed by value or pointer
    • Look for hidden return pointer issues
  4. “Why can small structs be passed in registers but large ones can’t?”
    • Registers are limited and fixed-size
    • Copying large data to registers would be slow and wasteful
    • Hidden pointer approach is more efficient for large data
    • 16-byte limit chosen because two 8-byte registers can hold it
  5. “What is the ‘red zone’ and when can you use it?”
    • 128 bytes below RSP that leaf functions can use without moving RSP
    • Only safe for leaf functions (no function calls)
    • Signal handlers must not corrupt it
    • Windows x64 has no red zone (shadow space instead)
  6. “How does variadic argument passing work?”
    • Named args follow normal rules
    • Variadic args continue the same sequence
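    • AL must hold an upper bound on the number of XMM registers used, so the callee knows whether to save them for va_arg
    • Once the registers for a class run out, the remaining arguments of that class are passed on the stack, and va_arg reads them from there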
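
From the C side, the variadic rules in question 6 surface through `<stdarg.h>`. The small sketch below (the function `sum_mixed` and its kind-string encoding are made up for illustration) shows a callee pulling INTEGER and SSE class arguments with `va_arg`; where each value physically comes from is decided by the compiler's `va_arg` implementation, not by this code.

```c
#include <stdarg.h>

/*
 * kinds: one char per variadic argument, 'i' = int, 'd' = double.
 * Returns the sum of all variadic arguments as a double.
 */
static double sum_mixed(const char *kinds, ...) {
    va_list ap;
    double total = 0.0;

    va_start(ap, kinds);
    for (const char *p = kinds; *p; p++) {
        if (*p == 'i')
            total += va_arg(ap, int);       /* INTEGER class argument */
        else if (*p == 'd')
            total += va_arg(ap, double);    /* SSE class; float promotes to double */
    }
    va_end(ap);
    return total;
}
```

On System V AMD64, a call such as `sum_mixed("idi", 1, 2.5, 3)` passes the `kinds` pointer in RDI, 1 in ESI, 3 in EDX, and 2.5 in XMM0, with AL set to an upper bound on the vector registers used (at least 1 here). With eight ints after the named argument, the last three travel on the stack.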

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| x86-64 calling convention | CS:APP 3rd Ed | Chapter 3.7 (Procedures) |
| System V ABI details | System V AMD64 ABI Spec | Section 3.2 (Function Calling) |
| Assembly programming | “x64 Assembly Language” by Ray Seyfarth | Chapter 5 |
| Calling conventions | Expert C Programming | Chapter 6 (Runtime Data Structures) |
| Low-level C details | Write Great Code Vol 2 | Chapter 5 |

5.10 Implementation Phases

Phase 1: Basic Parser (Days 1-3)

  • Implement lexer for C type syntax
  • Parse simple types (int, char, long, float, double)
  • Parse pointer types
  • Parse function prototypes with named parameters
  • Test with: int add(int a, int b)

Phase 2: Type Classification (Days 4-5)

  • Classify basic types (INTEGER, SSE, X87)
  • Determine type sizes and alignments
  • Handle unsigned/signed variants
  • Test with: void mixed(int x, double y, long z)

Phase 3: Register Allocation (Days 6-8)

  • Allocate INTEGER class registers
  • Allocate SSE class registers
  • Handle stack overflow arguments
  • Test with: int many(int a, int b, int c, int d, int e, int f, int g)

Phase 4: Output Formatting (Days 9-10)

  • Generate ASCII tables
  • Show correct subregister names
  • Add stack frame visualization
  • Test output against expected format

Phase 5: Verification (Days 11-12)

  • Generate verification assembly
  • Compare against gcc -S output
  • Test on Compiler Explorer (godbolt.org)
  • Fix any discrepancies

Phase 6: Advanced Features (Days 13-14)

  • Add struct support (basic)
  • Add Windows x64 comparison mode
  • Handle variadic functions
  • Add hidden return pointer support

5.11 Key Implementation Decisions

Decision 1: How to represent types?

  • Option A: String-based (easy parsing, hard manipulation)
  • Option B: AST-based (harder parsing, flexible manipulation) [Recommended]
  • Option C: Direct enum mapping (simple, limited)

Decision 2: How to verify correctness?

  • Option A: Manual verification against docs (error-prone)
  • Option B: Generate test programs and compare gcc output [Recommended]
  • Option C: Use existing tools like ABI compliance checkers

Decision 3: How much struct support?

  • Option A: No struct support (simpler)
  • Option B: Simple structs (<= 16 bytes) [Recommended for MVP]
  • Option C: Full ABI compliance (complex)

6. Testing Strategy

Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Parser Tests | Verify prototype parsing | Simple types, pointers, qualifiers |
| Classification Tests | Verify type classification | int->INTEGER, double->SSE |
| Allocation Tests | Verify register assignment | First 6 args in registers |
| Overflow Tests | Verify stack handling | 7+ arguments |
| Integration Tests | End-to-end verification | Compare with gcc -S output |

Verification Programs

Create test C programs and compare your output with actual compiler output:

/* tests/verify/verify_int_args.c */
/* Compile: gcc -S -O0 verify_int_args.c -o verify_int_args.s */

int test_int_args(int a, int b, int c, int d, int e, int f, int g) {
    /*
     * Your tool should predict:
     * a: EDI, b: ESI, c: EDX, d: ECX, e: R8D, f: R9D,
     * g: [RSP+8] at function entry (16(%rbp) after the -O0 prologue)
     */
    return a + b + c + d + e + f + g;
}

/* tests/verify/verify_float_args.c */
double test_float_args(double a, float b, double c, float d) {
    /*
     * Your tool should predict:
     * a: XMM0, b: XMM1 (low 32), c: XMM2, d: XMM3 (low 32)
     */
    return a + b + c + d;
}

/* tests/verify/verify_mixed.c */
void test_mixed(int a, double b, int c, float d, long e, double f) {
    /*
     * Integer regs: a->EDI, c->ESI, e->RDX
     * SSE regs: b->XMM0, d->XMM1, f->XMM2
     */
}
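
The project structure also lists `verify_structs.c`. A possible sketch for that file follows; the struct and function names are illustrative, the predictions in the comments assume System V AMD64 on an LP64 platform (8-byte `long`), and each should be confirmed against `gcc -S` output rather than taken on trust.

```c
/* tests/verify/verify_structs.c (sketch; names are illustrative) */
struct small { int x; int y; };            /* 8 bytes:  INTEGER      */
struct pair  { double x; double y; };      /* 16 bytes: SSE, SSE     */
struct mixed { int x; double y; };         /* 16 bytes: INTEGER, SSE */
struct big   { long a; long b; long c; };  /* 24 bytes: MEMORY       */

/* Predict: s packed in RDI (x in the low 32 bits, y in the high 32 bits) */
int test_small(struct small s) { return s.x + s.y; }

/* Predict: p.x in XMM0, p.y in XMM1; result returned in XMM0 */
double test_pair(struct pair p) { return p.x + p.y; }

/* Predict: m.x in EDI, m.y in XMM0 */
double test_mixed_struct(struct mixed m) { return m.x + m.y; }

/* Predict: b copied onto the stack by the caller, starting at [RSP+8] on entry */
long test_big_arg(struct big b) { return b.a + b.b + b.c; }

/* Predict: hidden return pointer in RDI, so seed arrives in RSI */
struct big test_big_ret(long seed) {
    struct big r = { seed, seed + 1, seed + 2 };
    return r;
}
```

Compiling with `gcc -S -O0` and reading the first few instructions of each function shows which registers the prologue spills, which is the quickest way to confirm or refute each prediction.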

Verification Script

#!/bin/bash
# tests/verify_all.sh

echo "Compiling verification programs..."
gcc -S -O0 tests/verify/verify_int_args.c -o /tmp/verify_int_args.s
gcc -S -O0 tests/verify/verify_float_args.c -o /tmp/verify_float_args.s
gcc -S -O0 tests/verify/verify_mixed.c -o /tmp/verify_mixed.s

echo ""
echo "=== verify_int_args ==="
echo "Tool predicts:"
./callconv "int test_int_args(int a, int b, int c, int d, int e, int f, int g)"
echo ""
echo "GCC generated:"
grep -A 10 "test_int_args:" /tmp/verify_int_args.s | head -15

echo ""
echo "=== verify_float_args ==="
echo "Tool predicts:"
./callconv "double test_float_args(double a, float b, double c, float d)"
echo ""
echo "GCC generated:"
grep -A 10 "test_float_args:" /tmp/verify_float_args.s | head -15

echo ""
echo "=== verify_mixed ==="
echo "Tool predicts:"
./callconv "void test_mixed(int a, double b, int c, float d, long e, double f)"
echo ""
echo "GCC generated:"
grep -A 10 "test_mixed:" /tmp/verify_mixed.s | head -15

7. Common Pitfalls & Debugging

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Confusing 32-bit and 64-bit reg names | Shows RDI for int instead of EDI | Check type size; int uses 32-bit subregister |
| Wrong register order | Arguments in wrong registers | System V: RDI,RSI,RDX,RCX,R8,R9 (not RCX,RDX) |
| Forgetting hidden return pointer | Large struct return has wrong args | Check if return type > 16 bytes; shifts all args |
| Mixing INTEGER and SSE counts | Floats using integer regs | Track separate counts for int vs float arguments |
| Stack offset wrong | Stack args at wrong offset | After CALL: first stack arg at [RSP+8], not [RSP] |
| Ignoring variadic rules | Variadic args wrong | AL must specify XMM register count |
| Assuming all platforms same | Works on Linux, fails on Windows | Different ABIs for different platforms |
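
The stack-offset pitfall reduces to simple arithmetic. Assuming every overflow argument occupies one 8-byte slot (true for scalar INTEGER and SSE arguments, but not for long double or MEMORY class structs), two hypothetical helpers could compute the address at function entry and after the usual frame-pointer prologue:

```c
/* k: zero-based index among the arguments that overflowed to the stack. */

/* Offset from RSP at function entry: [RSP] holds the return address. */
static int stack_arg_offset_rsp(int k) {
    return 8 + 8 * k;
}

/* Offset from RBP after `push rbp; mov rbp, rsp`: the saved RBP adds 8 bytes. */
static int stack_arg_offset_rbp(int k) {
    return 16 + 8 * k;
}
```

For `test_int_args`, g is the first overflow argument (k = 0), so it sits at [RSP+8] on entry, and GCC at -O0 typically reads it as 16(%rbp).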

Debugging Your Tool

/* Add debug output mode */
#ifdef DEBUG
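/* Standard variadic macro: accepts a lone format string even under -Wpedantic */
#define DBG(...) \
    do { fprintf(stderr, "[DBG] "); fprintf(stderr, __VA_ARGS__); fputc('\n', stderr); } while (0)
#else
#define DBG(...) do { } while (0)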
#endif

void allocate_params(FunctionPrototype *proto) {
    AllocationState state = {0};

    DBG("Starting allocation for %s", proto->name);
    DBG("Return type class: %d", classify_type(&proto->return_type));

    if (needs_hidden_pointer(&proto->return_type)) {
        DBG("Hidden return pointer uses RDI");
        state.next_int_reg = 1;
    }

    for (int i = 0; i < proto->param_count; i++) {
        TypeClass class = classify_type(&proto->params[i].type);
        DBG("Param %d (%s): class=%d", i, proto->params[i].name, class);

        AllocationResult res = allocate_param(&state, &proto->params[i]);
        DBG("  -> %s", res.is_stack ? "STACK" : res.location.reg.reg_name);
    }
}

8. Extensions & Challenges

Beginner Extensions

  • Add color output (green for registers, yellow for stack)
  • Support typedef aliases (e.g., size_t -> unsigned long)
  • Show stack frame diagram for overflow arguments
  • Add --help with examples

Intermediate Extensions

  • Support basic struct definitions inline
  • Add Windows x64 ABI comparison mode
  • Support ARM64 ABI (x0-x7 for integers, v0-v7 for floats)
  • Parse and display function attributes (__attribute__((cdecl)))

Advanced Extensions

  • Full struct classification algorithm per ABI spec
  • Support union types
  • Handle arrays as parameters (decay to pointers)
  • Generate working assembly stubs that can be linked
  • Interactive mode: modify prototype and see changes
  • Web interface using WebAssembly

Research Extensions

  • Compare optimization levels (-O0 vs -O2 calling convention differences)
  • Analyze real-world libraries (glibc, openssl) for ABI patterns
  • Study tail call optimization effects on calling conventions

9. Real-World Connections

FFI in Modern Languages

Python ctypes/cffi:

from ctypes import CDLL, c_int, c_double

lib = CDLL("./mylib.so")
lib.compute.argtypes = [c_int, c_double, c_int]
lib.compute.restype = c_double

# If argtypes is wrong, you get garbage or crashes
# Your tool helps you verify the expected ABI
result = lib.compute(1, 3.14, 2)

Rust FFI:

extern "C" {
    // "C" selects the platform's C ABI (System V AMD64 on x86-64 Unix)
    fn compute(x: i32, y: f64, z: i32) -> f64;
}

// Incorrect declaration can cause UB
// Your tool helps verify argument passing

Go cgo:

package main

// #include <stdlib.h>
// double compute(int x, double y, int z);
import "C"

import "fmt"

func main() {
    // Go's cgo must match C ABI exactly
    result := C.compute(1, 3.14, 2)
    fmt.Println(result) // Go rejects unused variables
}
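
The Python and Rust declarations above assume a C function with the shape below. This `mylib.c` is hypothetical: the body is invented purely so the example links, and only the signature matters for the ABI discussion.

```c
/* mylib.c (hypothetical). Build: gcc -shared -fPIC mylib.c -o mylib.so */

/* System V AMD64: x arrives in EDI, y in XMM0, z in ESI; the result returns in XMM0. */
double compute(int x, double y, int z) {
    return x * y + z;
}
```

If a binding declares the return type as an integer instead of a double, the caller reads RAX rather than XMM0 and sees garbage, which is the kind of mismatch the tool is meant to expose.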

Debugging with GDB

When debugging, knowing the ABI helps:

(gdb) break compute
(gdb) run
Breakpoint 1, compute (x=1, y=3.14, z=2)

# Verify arguments are where we expect:
(gdb) info registers rdi rsi xmm0
rdi    0x1        # x = 1 (int, in EDI portion of RDI)
rsi    0x2        # z = 2 (int, in ESI portion of RSI)
xmm0   {v4_float = {...}, v2_double = {3.14, ...}}  # y = 3.14

# Your tool predicted: x->EDI, y->XMM0, z->ESI
# GDB confirms!

Security Research: ROP Gadgets

Understanding calling conventions helps in exploit development:

# To "call" a function via ROP, you need to:
# 1. Set up the arguments in the correct registers
# 2. Find gadgets that pop values into RDI, RSI, etc.

# Example ROP chain to call execve("/bin/sh", NULL, NULL):
# pop rdi; ret;  -> address of "/bin/sh"
# pop rsi; ret;  -> NULL (argv)
# pop rdx; ret;  -> NULL (envp)
# execve address

# Your tool helps understand what values go where!

10. Resources

Official Specifications

Tools for Verification


11. Self-Assessment Checklist

Understanding Verification

  • I can list the 6 integer argument registers in order (RDI, RSI, RDX, RCX, R8, R9)
  • I can list the 8 SSE argument registers (XMM0-XMM7)
  • I know where the 7th integer argument goes (stack)
  • I understand the difference between EDI and RDI (32-bit vs 64-bit)
  • I can explain why large structs use a hidden pointer
  • I know what the “red zone” is and when it’s safe to use
  • I can explain caller-saved vs callee-saved registers
  • I understand how variadic functions work at the ABI level

Implementation Verification

  • Parser correctly handles basic types (int, long, double, etc.)
  • Parser correctly handles pointers (int *, char **)
  • Classification matches expected ABI classes
  • Register allocation matches gcc -S output
  • Stack overflow arguments have correct offsets
  • Output formatting is clear and readable
  • Verification assembly works when compiled

Quality Verification

  • Tool produces correct results for all test cases
  • Edge cases handled (void functions, variadic, etc.)
  • Output is educational, not just informative
  • Code is well-structured and documented

12. Submission / Completion Criteria

Minimum Viable Completion

  • Parses simple function prototypes (basic types, pointers)
  • Shows register allocation for INTEGER class arguments
  • Shows register allocation for SSE class arguments
  • Shows stack overflow for 7+ arguments
  • Output format is clear and correct

Full Completion

  • Handles all basic C types correctly
  • Generates verification assembly
  • Matches gcc -S output for test cases
  • Shows subregister names (EDI vs RDI)
  • Documents variadic function behavior
  • Handles hidden return pointer for large returns

Excellence

  • Implements basic struct classification
  • Compares System V vs Windows x64
  • Provides ARM64 ABI option
  • Interactive mode or web interface
  • Comprehensive test suite with automated verification
  • Used to debug a real FFI issue

The Core Question You’re Answering

“How does the System V AMD64 ABI determine where function arguments are passed, and how can we build a tool that visualizes this mapping for any C function prototype?”

By completing this project, you will have internalized the fundamental truth that function calls are not magic - they follow precise, documented rules that govern exactly where data travels between caller and callee. This knowledge transforms how you debug FFI issues, write assembly code, and understand what happens beneath your high-level code.


This guide was expanded from EXPERT_C_PROGRAMMING_DEEP_DIVE.md. For the complete learning path, see the project index.