Project 10: Function Pointer Dispatch Table
Build a command interpreter using function pointer tables for O(1) dispatch, demonstrating polymorphism in C and the power of treating functions as first-class data.
Quick Reference
| Attribute | Value |
|---|---|
| Language | C |
| Difficulty | Level 3 (Intermediate) |
| Time | 1 Week |
| Book Reference | Expert C Programming, Ch. 3, 9 |
| Coolness | 5/5 - Real-World Pattern |
| Portfolio Value | High - Used in shells, VMs, interpreters |
1. Learning Objectives
By completing this project, you will:
- Master function pointer syntax: Declare, initialize, and invoke function pointers with confidence
- Understand dispatch tables: Build lookup tables that map keys to function handlers
- Implement O(1) dispatch: Use array indexing or hash tables for constant-time command lookup
- Create extensible systems: Design software where new commands can be added without modifying core logic
- Recognize the callback pattern: See how function pointers enable event-driven and modular design
- Understand C polymorphism: Achieve runtime behavior switching without classes or inheritance
- Build real-world systems: Apply patterns used in shells, interpreters, game engines, and protocol handlers
2. Theoretical Foundation
2.1 Core Concepts
What Are Function Pointers?
In C, functions have addresses just like variables. A function pointer stores the address of a function, allowing you to call functions indirectly:
FUNCTION POINTERS: THE BASICS
=============================
Regular function call:
int result = add(3, 4);
The compiler generates:
call add ; Direct jump to add's address
Function pointer call:
int (*operation)(int, int) = add; // Store function's address
int result = operation(3, 4); // Call through pointer
The compiler generates:
mov rax, [operation] ; Load function address from variable
call rax ; Indirect jump through register
KEY INSIGHT:
============
Functions ARE addresses. The function name without () is already a pointer.
int add(int a, int b) { return a + b; }
add → Address of the add function (e.g., 0x401100)
add() → CALL the add function
This is similar to arrays:
int arr[5];
arr → Address of first element
arr[0] → ACCESS the first element
Function Pointer Declaration Syntax
READING FUNCTION POINTER DECLARATIONS
=====================================
Use the clockwise/spiral rule from Chapter 3:
int (*fp)(int, int);
─┬─
│
Start at identifier 'fp'
1. fp is...
2. * → a pointer to...
3. (int, int) → a function taking (int, int)...
4. int → returning int
DECLARATION MEANING
───────────────────────────────────────────────────────────
int (*fp)(); fp is pointer to function() returning int
int *(*fp)(); fp is pointer to function() returning int*
int (*fp)(int); fp is pointer to function(int) returning int
int (*fp)(int, char*); fp is pointer to function(int, char*) returning int
void (*fp)(void); fp is pointer to function(void) returning void
int (*arr[10])(int); arr is array[10] of pointers to function(int) returning int
int (*(*fp)(int))(double); fp is pointer to function(int) returning pointer to
function(double) returning int
TYPEDEF MAKES IT READABLE
=========================
Instead of:
int (*operations[10])(int, int);
Use:
typedef int (*BinaryOp)(int, int);
BinaryOp operations[10];
or (C11+):
typedef int BinaryOp(int, int);
BinaryOp *operations[10];
Dispatch Tables
DISPATCH TABLES: THE PATTERN
============================
A dispatch table maps keys (commands, opcodes, message types) to handlers (functions).
Traditional approach (O(n) in number of commands):
┌───────────────────────────────────────────────────────────────┐
│ if (strcmp(cmd, "add") == 0) │
│ return do_add(args); │
│ else if (strcmp(cmd, "sub") == 0) │
│ return do_sub(args); │
│ else if (strcmp(cmd, "mul") == 0) │
│ return do_mul(args); │
│ // ... N more comparisons │
└───────────────────────────────────────────────────────────────┘
Problems:
- O(n) time to find the command
- Adding new commands requires modifying the if/else chain
- Hard to add commands at runtime
- Mixing control flow with data
Dispatch table approach (O(1) lookup):
┌───────────────────────────────────────────────────────────────┐
│ // Data: command table │
│ typedef struct { │
│ const char *name; │
│ cmd_func handler; │
│ } Command; │
│ │
│ Command commands[] = { │
│ {"add", do_add}, │
│ {"sub", do_sub}, │
│ {"mul", do_mul}, │
│ }; │
│ │
│ // Lookup (O(1) with hash table, O(log n) with sorted array) │
│ Command *cmd = lookup(commands, user_input); │
│ return cmd->handler(args); │
└───────────────────────────────────────────────────────────────┘
Advantages:
- Fast lookup with proper data structure
- Data-driven: add commands without changing logic
- Can add/remove commands at runtime
- Separation of concerns

2.2 Why This Matters
REAL-WORLD USES OF DISPATCH TABLES
==================================
1. COMMAND-LINE SHELLS (bash, zsh, fish)
┌─────────────────────────────────────────────────────────┐
│ User input: "ls -la /home" │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────┐ │
│ │ builtin_commands[] = { │ │
│ │ {"cd", builtin_cd}, │ │
│ │ {"export", builtin_export}, │ │
│ │ {"alias", builtin_alias}, │ │
│ │ ... │ │
│ │ } │ │
│ └─────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Lookup "ls" → not found → exec external program │
│ Lookup "cd" → found → call builtin_cd() │
└─────────────────────────────────────────────────────────┘
2. BYTECODE INTERPRETERS (Python, Lua, JVM)
┌─────────────────────────────────────────────────────────┐
│ Bytecode: LOAD_CONST 0, LOAD_CONST 1, BINARY_ADD │
│ │
│ // Dispatch table indexed by opcode (O(1)) │
│ void (*opcode_handlers[256])(VM *vm) = { │
│ [OP_LOAD_CONST] = op_load_const, │
│ [OP_BINARY_ADD] = op_binary_add, │
│ [OP_CALL] = op_call, │
│ ... │
│ }; │
│ │
│ // The interpreter loop (blazingly fast) │
│ while (running) { │
│ opcode_handlers[*ip++](vm); │
│ } │
└─────────────────────────────────────────────────────────┘
3. NETWORK PROTOCOL HANDLERS
┌─────────────────────────────────────────────────────────┐
│ Packet: [TYPE: 0x03] [LENGTH: 256] [DATA: ...] │
│ │ │
│ ▼ │
│ PacketHandler handlers[] = { │
│ [PKT_CONNECT] = handle_connect, │
│ [PKT_DATA] = handle_data, │
│ [PKT_DISCONNECT] = handle_disconnect, │
│ }; │
│ │
│ handlers[packet->type](conn, packet); │
└─────────────────────────────────────────────────────────┘
4. GUI EVENT HANDLING
┌─────────────────────────────────────────────────────────┐
│ Event: {type: MOUSE_CLICK, x: 100, y: 200} │
│ │
│ EventHandler handlers[] = { │
│ [EVENT_MOUSE_CLICK] = on_mouse_click, │
│ [EVENT_KEY_PRESS] = on_key_press, │
│ [EVENT_RESIZE] = on_resize, │
│ }; │
│ │
│ handlers[event.type](widget, &event); │
└─────────────────────────────────────────────────────────┘
5. STATE MACHINES
┌─────────────────────────────────────────────────────────┐
│ Current state: STATE_CONNECTED │
│ Event: EVT_TIMEOUT │
│ │
│ // 2D dispatch table: state × event → handler │
│ StateHandler transitions[NUM_STATES][NUM_EVENTS] = { │
│ [STATE_IDLE][EVT_CONNECT] = do_connect, │
│ [STATE_CONNECTED][EVT_TIMEOUT] = do_reconnect, │
│ [STATE_CONNECTED][EVT_DATA] = do_receive, │
│ }; │
│ │
│ transitions[current_state][event](ctx); │
└─────────────────────────────────────────────────────────┘
6. PLUGIN SYSTEMS
┌─────────────────────────────────────────────────────────┐
│ // Plugin interface │
│ typedef struct { │
│ const char *name; │
│ int (*init)(void); │
│ int (*process)(Data *); │
│ void (*cleanup)(void); │
│ } Plugin; │
│ │
│ // Load plugin at runtime │
│ void *handle = dlopen("plugin.so", RTLD_NOW); │
│ Plugin *p = dlsym(handle, "plugin_info"); │
│ p->init(); │
│ // Register in dispatch table │
│ register_command(p->name, p->process); │
└─────────────────────────────────────────────────────────┘
2.3 Historical Context
THE EVOLUTION OF DISPATCH MECHANISMS
====================================
1960s: FORTRAN COMPUTED GOTO
────────────────────────────
GOTO (100, 200, 300), I
! Jump to label 100, 200, or 300 based on I
1970s: C SWITCH STATEMENT
─────────────────────────
switch (opcode) {
case ADD: ... break;
case SUB: ... break;
}
! Compiler might use jump table internally
1970s: C FUNCTION POINTERS
──────────────────────────
void (*handlers[])() = {add, sub, mul};
handlers[opcode]();
! Explicit dispatch table in source code
1980s: C++ VIRTUAL FUNCTIONS
────────────────────────────
class Animal {
virtual void speak() = 0;
};
! Compiler generates vtable (dispatch table)
1990s-Now: INTERPRETED LANGUAGE VMs
───────────────────────────────────
Python, Lua, Ruby all use dispatch tables
for bytecode interpretation
INSIGHT: Virtual functions ARE dispatch tables!
================================================
When you write:
class Animal { virtual void speak(); };
class Dog : public Animal { void speak() override; };
The compiler generates:
struct AnimalVTable {
void (*speak)(Animal*);
};
AnimalVTable dog_vtable = { &Dog::speak };
// animal->speak() becomes:
animal->vtable->speak(animal);
C doesn't have virtual functions, but you can build them yourself!
That's exactly what this project teaches.
2.4 Common Misconceptions
MISCONCEPTION 1: "Function pointers are slow"
─────────────────────────────────────────────
Reality: An indirect call (through function pointer) is typically just ONE
extra memory load compared to a direct call.
Direct call:
call do_add ; Immediate address in instruction
Indirect call:
mov rax, [handler] ; Load address from memory (1 extra instruction)
call rax ; Jump to loaded address
Cost: ~1-3 CPU cycles extra (negligible for most applications)
Exception: Branch prediction. If the function pointer changes frequently,
the CPU can't predict where to jump, causing pipeline stalls.
But for dispatch tables, the same command usually calls the
same function, so prediction works well.
MISCONCEPTION 2: "Function pointers are just for callbacks"
───────────────────────────────────────────────────────────
Reality: Function pointers enable multiple patterns:
1. Callbacks (what most tutorials show)
qsort(arr, n, sizeof(int), compare_func);
2. Dispatch tables (this project)
handlers[opcode](args);
3. Strategy pattern (runtime algorithm selection)
sorter->algorithm(data); // quicksort, mergesort, etc.
4. Object-oriented C (manual vtables)
obj->vtable->method(obj, args);
5. State machines (state-dependent behavior)
states[current]->on_event(machine, event);
MISCONCEPTION 3: "You can't have polymorphism in C"
───────────────────────────────────────────────────
Reality: C has polymorphism through function pointers. It's just explicit
rather than implicit (like C++ virtual).
// C "polymorphism"
typedef struct {
void (*speak)(void *self);
void (*move)(void *self, int x, int y);
} AnimalOps;
typedef struct {
AnimalOps *ops;
char name[32];
} Animal;
void animal_speak(Animal *a) {
a->ops->speak(a); // Dispatches to Dog::speak, Cat::speak, etc.
}
This is EXACTLY how Linux kernel implements polymorphism for drivers,
filesystems, and more. See: struct file_operations, struct inode_operations
MISCONCEPTION 4: "The function pointer syntax is too complicated"
─────────────────────────────────────────────────────────────────
Reality: Use typedef to make it readable!
// Ugly
int (*(*fp)(int, int))(char)
// Readable
typedef int (*CharFunc)(char);
typedef CharFunc (*IntIntFunc)(int, int);
IntIntFunc fp;
Rule: If your function pointer declaration needs the spiral rule,
use typedef to give the inner types names.
3. Project Specification
3.1 What You Will Build
A fully-featured command interpreter that demonstrates function pointer dispatch:
$ ./cmdinterp
Welcome to CmdInterp v1.0
Type 'help' for available commands.
> help
Available commands:
help Show this help message
echo <args> Echo arguments back
add <n1> <n2> Add two numbers
sub <n1> <n2> Subtract two numbers
mul <n1> <n2> Multiply two numbers
div <n1> <n2> Divide two numbers
repeat <n> <cmd> Repeat a command n times
history Show command history
alias <n>=<cmd> Create command alias
quit Exit the interpreter
> add 5 3
Result: 8
> echo hello world
hello world
> mul 7 6
Result: 42
> repeat 3 echo testing
testing
testing
testing
> alias double=mul 2
Alias 'double' created.
> double 21
Result: 42
> history
1: help
2: add 5 3
3: echo hello world
4: mul 7 6
5: repeat 3 echo testing
6: alias double=mul 2
7: double 21
> unknown_command
Error: Unknown command 'unknown_command'. Type 'help' for available commands.
> quit
Goodbye!
3.2 Functional Requirements
- Core Command System:
- Parse user input into command and arguments
- Look up commands in dispatch table (O(1) or O(log n))
- Execute handler function with arguments
- Handle unknown commands gracefully
- Built-in Commands:
help- List all available commands with descriptionsecho- Print argumentsadd,sub,mul,div- Basic arithmeticquit- Exit the interpreter
- Advanced Features:
repeat <n> <cmd>- Execute a command multiple timeshistory- Show command historyalias <name>=<cmd>- Create command aliases (shows dispatch table modification)
- Error Handling:
- Unknown command messages
- Invalid argument messages
- Division by zero handling
3.3 Non-Functional Requirements
- Extensibility: Adding a new command should require only adding an entry to a table
- Performance: Command lookup should be O(1) using hash table or O(log n) using binary search
- Clean Code: Each command handler is a separate function with consistent signature
- Memory Safety: No buffer overflows, proper input validation
3.4 Example Usage / Output
# Startup
$ ./cmdinterp
Welcome to CmdInterp v1.0
Type 'help' for available commands.
# Help system
> help
Available commands:
help Show this help message
echo <args> Echo arguments back
add <n1> <n2> Add two numbers
sub <n1> <n2> Subtract two numbers
mul <n1> <n2> Multiply two numbers
div <n1> <n2> Divide two numbers
repeat <n> <cmd> Repeat a command n times
history Show command history
alias <n>=<cmd> Create command alias
quit Exit the interpreter
# Arithmetic
> add 10 20
Result: 30
> sub 100 42
Result: 58
> mul 6 7
Result: 42
> div 100 4
Result: 25
> div 10 0
Error: Division by zero
# String operations
> echo The quick brown fox
The quick brown fox
> echo
(empty line)
# Repetition
> repeat 5 echo Hello
Hello
Hello
Hello
Hello
Hello
> repeat 3 add 1 1
Result: 2
Result: 2
Result: 2
# Aliases (modifying dispatch table at runtime)
> alias square=mul
Alias 'square' created. Usage: square <n> <n>
> square 5 5
Result: 25
> alias greet=echo Hello,
Alias 'greet' created. Usage: greet <additional args>
> greet World!
Hello, World!
# History
> history
1: help
2: add 10 20
3: sub 100 42
4: mul 6 7
5: div 100 4
6: div 10 0
7: echo The quick brown fox
8: repeat 5 echo Hello
9: alias square=mul
10: square 5 5
# Error handling
> nonexistent
Error: Unknown command 'nonexistent'. Type 'help' for available commands.
> add 5
Error: add requires 2 arguments, got 1
> add five three
Error: Invalid argument 'five' - expected a number
# Exit
> quit
Goodbye!
3.5 Real World Outcome
After completing this project, you will:
- Understand dispatch tables deeply - Not just use them, but know when and why to apply them
- Write extensible systems - Design software that grows without massive refactoring
- Read real-world C code - Recognize dispatch patterns in shells, VMs, kernels
- Pass systems interviews - Function pointers are a classic interview topic
- Build plugin systems - Know how to add functionality at runtime
- Implement state machines - Use 2D dispatch tables for state × event → action
4. Solution Architecture
4.1 High-Level Design
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMMAND INTERPRETER ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ USER INPUT │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ LEXER │ Split input into tokens │
│ │ │ "add 5 3" → ["add", "5", "3"] │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ DISPATCHER │ Look up command in dispatch table │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ │ DISPATCH TABLE │ │
│ │ ├───────────┬─────────────────────┬───────────────────────────────┤ │
│ │ │ Name │ Handler Function │ Help Text │ │
│ │ ├───────────┼─────────────────────┼───────────────────────────────┤ │
│ │ │ "help" │ cmd_help │ "Show this help message" │ │
│ │ │ "echo" │ cmd_echo │ "Echo arguments back" │ │
│ │ │ "add" │ cmd_add │ "Add two numbers" │ │
│ │ │ "sub" │ cmd_sub │ "Subtract two numbers" │ │
│ │ │ "quit" │ cmd_quit │ "Exit the interpreter" │ │
│ │ │ ... │ ... │ ... │ │
│ │ └───────────┴─────────────────────┴───────────────────────────────┘ │
│ │ │ │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ EXECUTOR │ Call handler with arguments │
│ │ │ cmd->handler(argc, argv) │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ OUTPUT │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
4.2 Key Components
COMPONENT BREAKDOWN
===================
1. COMMAND HANDLER TYPE
┌─────────────────────────────────────────────────────────┐
│ // All command handlers have this signature │
│ typedef int (*cmd_func)(int argc, char **argv); │
│ │
│ // Returns: 0 = success, non-zero = error code │
│ // argc: number of arguments (including command name) │
│ // argv: array of argument strings │
└─────────────────────────────────────────────────────────┘
2. COMMAND ENTRY STRUCTURE
┌─────────────────────────────────────────────────────────┐
│ typedef struct { │
│ const char *name; // "add", "echo", etc. │
│ cmd_func handler; // Function to call │
│ const char *help; // Help text │
│ int min_args; // Minimum required args │
│ int max_args; // Maximum args (-1 = any) │
│ } Command; │
└─────────────────────────────────────────────────────────┘
3. DISPATCH TABLE
┌─────────────────────────────────────────────────────────┐
│ // Static table of built-in commands │
│ static Command commands[] = { │
│ {"help", cmd_help, "Show help", 0, 0}, │
│ {"echo", cmd_echo, "Echo arguments", 0, -1}, │
│ {"add", cmd_add, "Add numbers", 2, 2}, │
│ {"quit", cmd_quit, "Exit", 0, 0}, │
│ {NULL, NULL, NULL, 0, 0} // Sentinel │
│ }; │
└─────────────────────────────────────────────────────────┘
4. LOOKUP FUNCTION
┌─────────────────────────────────────────────────────────┐
│ Command *find_command(const char *name) { │
│ // Linear search (simple, O(n)) │
│ for (int i = 0; commands[i].name != NULL; i++) { │
│ if (strcmp(commands[i].name, name) == 0) { │
│ return &commands[i]; │
│ } │
│ } │
│ return NULL; // Not found │
│ } │
│ │
│ // Or: hash table for O(1) │
│ // Or: binary search on sorted table for O(log n) │
└─────────────────────────────────────────────────────────┘
5. MAIN LOOP
┌─────────────────────────────────────────────────────────┐
│ void repl(void) { │
│ char line[256]; │
│ while (1) { │
│ printf("> "); │
│ if (!fgets(line, sizeof(line), stdin)) break; │
│ │
│ int argc; │
│ char **argv = tokenize(line, &argc); │
│ if (argc == 0) continue; │
│ │
│ Command *cmd = find_command(argv[0]); │
│ if (cmd == NULL) { │
│ printf("Unknown command: %s\n", argv[0]); │
│ continue; │
│ } │
│ │
│ int result = cmd->handler(argc, argv); │
│ // Handle result... │
│ │
│ free_tokens(argv); │
│ } │
│ } │
└─────────────────────────────────────────────────────────┘
4.3 Data Structures
/* Function pointer type for command handlers */
typedef int (*cmd_func)(int argc, char **argv);
/* Command entry in dispatch table */
typedef struct Command {
const char *name; /* Command name (e.g., "add") */
cmd_func handler; /* Function to call */
const char *usage; /* Usage string (e.g., "add <n1> <n2>") */
const char *help; /* Help description */
int min_args; /* Minimum arguments (excluding command name) */
int max_args; /* Maximum arguments (-1 for unlimited) */
} Command;
/* Alias entry for runtime-added commands */
typedef struct Alias {
char name[32]; /* Alias name */
char expansion[256]; /* What it expands to */
struct Alias *next; /* Linked list for dynamic aliases */
} Alias;
/* Interpreter state */
typedef struct {
Command *commands; /* Dispatch table */
int num_commands; /* Number of commands */
Alias *aliases; /* Runtime aliases */
char **history; /* Command history */
int history_count; /* Number of history entries */
int history_capacity; /* History buffer size */
int running; /* 0 = quit requested */
} Interpreter;
/* Token buffer for parsed input */
typedef struct {
char *tokens[64]; /* Array of token pointers */
int count; /* Number of tokens */
char buffer[1024]; /* Storage for token strings */
} TokenBuffer;
4.4 Algorithm Overview
DISPATCH ALGORITHM
==================
Input: User command string "add 5 3"
Output: Execute appropriate handler and return result
1. TOKENIZE
─────────
Input: "add 5 3"
Output: argc=3, argv=["add", "5", "3"]
Algorithm:
- Skip leading whitespace
- For each character:
- If whitespace: terminate current token, start new
- If quote: read until matching quote
- Otherwise: append to current token
2. LOOKUP
───────
Input: command name "add"
Output: Command* or NULL
Methods:
a) Linear search: O(n) - Simple, good for <20 commands
b) Binary search: O(log n) - Requires sorted table
c) Hash table: O(1) - Best for many commands
d) Array index: O(1) - If commands are numbered (opcodes)
3. VALIDATE
─────────
Check argument count against min_args and max_args
Return error if validation fails
4. DISPATCH
─────────
Call: cmd->handler(argc, argv)
The magic line! This is the indirect function call:
int result = cmd->handler(argc, argv);
In assembly:
mov rax, [cmd + offset_of_handler] ; Load function pointer
mov rdi, argc ; First argument
mov rsi, argv ; Second argument
call rax ; Indirect call
5. HANDLE RESULT
──────────────
- result == 0: Success
- result > 0: Error code (print message)
- result < 0: Special (e.g., -1 = quit requested)
HASH TABLE FOR O(1) LOOKUP
==========================
For production interpreters with many commands:
typedef struct {
Command *buckets[HASH_SIZE]; // Array of command pointers
} CommandTable;
unsigned int hash(const char *str) {
unsigned int h = 0;
while (*str) {
h = h * 31 + *str++;
}
return h % HASH_SIZE;
}
Command *lookup(CommandTable *table, const char *name) {
unsigned int h = hash(name);
Command *cmd = table->buckets[h];
while (cmd) {
if (strcmp(cmd->name, name) == 0) return cmd;
cmd = cmd->next; // Handle collisions with chaining
}
return NULL;
}
5. Implementation Guide
5.1 Development Environment Setup
Required Tools:
- GCC or Clang compiler
- Make build system
- gdb/lldb debugger
Compiler Flags:
CFLAGS = -Wall -Wextra -Wpedantic -std=c11 -g
# -Wall: Enable all common warnings
# -Wextra: Enable extra warnings
# -Wpedantic: Strict ISO C compliance
# -std=c11: Use C11 standard
# -g: Include debug symbols
5.2 Project Structure
cmdinterp/
├── Makefile
├── include/
│ ├── cmdinterp.h # Main header with types
│ ├── commands.h # Command handler declarations
│ └── tokenizer.h # Tokenizer interface
├── src/
│ ├── main.c # Entry point and REPL
│ ├── dispatch.c # Dispatch table and lookup
│ ├── tokenizer.c # Input tokenization
│ ├── commands/
│ │ ├── builtin.c # help, echo, quit
│ │ ├── math.c # add, sub, mul, div
│ │ ├── meta.c # repeat, alias, history
│ │ └── commands.c # Command registration
│ └── history.c # History tracking
└── tests/
└── test_dispatch.c # Unit tests
5.3 The Core Question You’re Answering
“How can we use function pointers to create extensible, polymorphic behavior in C?”
This question goes to the heart of C’s power: you can achieve the same flexibility that object-oriented languages provide through inheritance and virtual functions, but with explicit control over the mechanism.
The answer is: by storing function pointers in data structures, you can:
- Select which function to call at runtime (polymorphism)
- Add new behaviors without changing existing code (extensibility)
- Pass behaviors as data (callbacks, strategies, plugins)
5.4 Concepts You Must Understand First
| Concept | Why It Matters | Self-Assessment Question |
|---|---|---|
| Function pointer syntax | You’ll write typedef int (*cmd_func)(int, char**) |
Can you declare a pointer to a function taking two ints and returning void? |
| Struct with function pointer | Commands need both data (name) and behavior (handler) | Can you write a struct containing a function pointer member? |
| Array of structs | The dispatch table is an array of Command structs | Can you iterate over an array of structs? |
| strcmp() and string comparison | Command lookup compares strings | Why can’t you use == to compare strings in C? |
| Indirect function calls | cmd->handler(args) calls through a pointer |
What assembly instruction does an indirect call use? |
5.5 Questions to Guide Your Design
- How will you handle commands with different argument counts?
- Fixed:
add <n1> <n2>always takes 2 - Variable:
echotakes 0 or more - Consider: min_args, max_args fields
- Fixed:
- How will you make the dispatch table extensible?
- Static table: Fast, but fixed at compile time
- Dynamic table: Can add commands at runtime (for aliases)
- Hybrid: Static built-ins + dynamic extensions
- What should the command handler signature be?
- Option A:
int handler(int argc, char **argv)- Like main() - Option B:
int handler(char *args)- Raw argument string - Option C:
int handler(Interpreter *interp, int argc, char **argv)- With context
- Option A:
- How will you implement ‘repeat’ and ‘alias’?
repeat: Needs to parse and call another commandalias: Needs to modify or shadow the dispatch table
- What return values indicate success/failure/quit?
- Design a consistent error handling strategy
5.6 Thinking Exercise
Before coding, trace through this scenario on paper:
Scenario: User types add 5 3
Step 1: main() calls repl()
Current state: _______________
Step 2: repl() reads line "add 5 3\n"
line buffer contains: _______________
Step 3: tokenize() parses line
argc = ___
argv[0] = ___
argv[1] = ___
argv[2] = ___
Step 4: find_command("add") searches table
Compare "add" with "help": ___
Compare "add" with "echo": ___
Compare "add" with "add": ___
Returns: _______________
Step 5: Validate arguments
min_args = ___, max_args = ___
argc - 1 = ___ (excluding command name)
Validation: _______________
Step 6: Call cmd->handler(3, argv)
Handler function address: _______________
Inside cmd_add():
argv[1] = "5" → atoi → ___
argv[2] = "3" → atoi → ___
result = ___
Step 7: Return to repl()
Print: _______________
5.7 Hints in Layers
Hint 1: Starting the Dispatch Table
Start with the simplest possible dispatch table:
typedef int (*cmd_func)(int argc, char **argv);
typedef struct {
const char *name;
cmd_func handler;
} Command;
int cmd_hello(int argc, char **argv) {
printf("Hello, World!\n");
return 0;
}
int cmd_quit(int argc, char **argv) {
return -1; // Signal to quit
}
Command commands[] = {
{"hello", cmd_hello},
{"quit", cmd_quit},
{NULL, NULL} // End marker
};
Get this working before adding complexity.
Hint 2: The Lookup Function
Linear search is fine for small tables:
Command *find_command(const char *name) {
for (int i = 0; commands[i].name != NULL; i++) {
if (strcmp(commands[i].name, name) == 0) {
return &commands[i];
}
}
return NULL;
}
The key insight: return a pointer to the Command, not a copy.
This lets you call cmd->handler(...) directly.
Hint 3: The Main Loop
void repl(void) {
char line[256];
int running = 1;
while (running) {
printf("> ");
fflush(stdout);
if (fgets(line, sizeof(line), stdin) == NULL) {
break; // EOF
}
// Remove newline
line[strcspn(line, "\n")] = '\0';
// Skip empty lines
if (line[0] == '\0') continue;
// Tokenize
char *argv[64];
int argc = tokenize(line, argv, 64);
if (argc == 0) continue;
// Dispatch
Command *cmd = find_command(argv[0]);
if (cmd == NULL) {
printf("Unknown command: %s\n", argv[0]);
continue;
}
int result = cmd->handler(argc, argv);
if (result < 0) running = 0; // Quit signal
}
}
Hint 4: Simple Tokenizer
// In-place tokenization (modifies the input string)
int tokenize(char *line, char **argv, int max_args) {
int argc = 0;
char *p = line;
while (*p && argc < max_args) {
// Skip whitespace
while (*p == ' ' || *p == '\t') p++;
if (*p == '\0') break;
// Start of token
argv[argc++] = p;
// Find end of token
while (*p && *p != ' ' && *p != '\t') p++;
if (*p) *p++ = '\0'; // Null-terminate token
}
return argc;
}
Hint 5: Implementing 'repeat'
repeat needs to call another command. This shows the power of dispatch tables - you can recursively use the same mechanism:
int cmd_repeat(int argc, char **argv) {
if (argc < 3) {
printf("Usage: repeat <n> <command> [args...]\n");
return 1;
}
int count = atoi(argv[1]);
if (count <= 0) {
printf("Error: repeat count must be positive\n");
return 1;
}
// Build new argv for the sub-command
// argv: ["repeat", "3", "echo", "hello"]
// sub: ["echo", "hello"]
int sub_argc = argc - 2;
char **sub_argv = &argv[2];
Command *cmd = find_command(sub_argv[0]);
if (cmd == NULL) {
printf("Unknown command: %s\n", sub_argv[0]);
return 1;
}
for (int i = 0; i < count; i++) {
cmd->handler(sub_argc, sub_argv);
}
return 0;
}
Hint 6: Implementing Aliases
Aliases can be implemented as a separate lookup that happens before the main dispatch:
typedef struct Alias {
char name[32];
char expansion[256];
struct Alias *next;
} Alias;
Alias *aliases = NULL;
char *expand_alias(const char *name) {
for (Alias *a = aliases; a != NULL; a = a->next) {
if (strcmp(a->name, name) == 0) {
return a->expansion;
}
}
return NULL;
}
// In the main loop, before find_command:
char *expansion = expand_alias(argv[0]);
if (expansion) {
// Prepend expansion to the argument list
// "double 5" with alias "double=mul 2" becomes "mul 2 5"
// ... reparse with expansion ...
}
5.8 The Interview Questions They’ll Ask
After completing this project, be ready for:
- “Explain how function pointers work in C”
- Functions have addresses; pointers store addresses
- Indirect call vs direct call
- The compiler generates a
callinstruction with a register operand
- “What is a dispatch table and when would you use one?”
- Maps keys to handlers
- Use for: command interpreters, bytecode VMs, protocol handlers, event systems
- Benefits: extensibility, data-driven design, O(1) lookup possible
- “How does C achieve polymorphism without classes?”
- Function pointers in structs (manual vtables)
- Example: Linux kernel’s file_operations, inode_operations
- Trade-off: explicit vs implicit (C++ virtual)
- “What’s the performance cost of function pointers?”
- One extra memory load for the address
- Potential branch prediction miss if pointer changes frequently
- Usually negligible; profile before optimizing
- “Design a plugin system in C”
- Define interface as struct of function pointers
- Load shared library with dlopen/dlsym
- Get plugin’s function table
- Register handlers in dispatch table
- “How would you implement a finite state machine in C?”
- 2D dispatch table:
handlers[state][event] - Or struct per state with entry/exit/event handlers
- Function pointers enable clean state-dependent behavior
- 2D dispatch table:
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Function pointer syntax | Expert C Programming | Ch. 3, 9 |
| Callbacks and function pointers | C Interfaces and Implementations | Ch. 1-2 |
| Linux kernel dispatch patterns | Linux Device Drivers | Ch. 3 |
| VM dispatch optimization | Crafting Interpreters | Ch. 15 (switch vs computed goto) |
| State machines in C | Practical Statecharts in C/C++ | Ch. 2-3 |
5.10 Implementation Phases
Phase 1: Minimal Viable Interpreter (Day 1-2)
- Basic REPL loop
- Tokenizer
- Dispatch table with 3-4 commands (help, echo, quit)
- Linear search lookup
Phase 2: Arithmetic Commands (Day 3)
- Add, sub, mul, div commands
- Argument validation
- Error handling
Phase 3: Advanced Features (Day 4-5)
- History tracking
- Repeat command
- Alias system
Phase 4: Polish (Day 6-7)
- Improve error messages
- Add more commands
- Optimize lookup (optional: hash table)
- Write tests
5.11 Key Implementation Decisions
| Decision | Simple Approach | Advanced Approach |
|---|---|---|
| Lookup | Linear search O(n) | Hash table O(1) |
| Table storage | Static array | Dynamic array/linked list |
| Handler signature | int f(int, char**) |
int f(Interp*, int, char**) |
| Tokenizer | In-place modification | Copy to new buffer |
| Aliases | Separate linked list | Merge into main table |
For this project, start with the simple approaches. Optimization can come later.
6. Testing Strategy
Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Test individual components | Tokenizer, lookup function |
| Integration Tests | Test command execution | Full command invocation |
| Error Tests | Test error handling | Invalid commands, bad arguments |
| Edge Cases | Test boundaries | Empty input, very long input |
Critical Test Cases
// Tokenizer tests
void test_tokenizer(void) {
char line[256];
char *argv[64];
int argc;
// Basic tokenization
strcpy(line, "add 5 3");
argc = tokenize(line, argv, 64);
assert(argc == 3);
assert(strcmp(argv[0], "add") == 0);
assert(strcmp(argv[1], "5") == 0);
assert(strcmp(argv[2], "3") == 0);
// Multiple spaces
strcpy(line, "echo hello world");
argc = tokenize(line, argv, 64);
assert(argc == 3);
// Empty input
strcpy(line, "");
argc = tokenize(line, argv, 64);
assert(argc == 0);
// Whitespace only
strcpy(line, " \t ");
argc = tokenize(line, argv, 64);
assert(argc == 0);
}
// Dispatch tests
void test_dispatch(void) {
// Known command
Command *cmd = find_command("add");
assert(cmd != NULL);
assert(strcmp(cmd->name, "add") == 0);
// Unknown command
cmd = find_command("nonexistent");
assert(cmd == NULL);
// Case sensitivity
cmd = find_command("ADD");
assert(cmd == NULL); // Commands are case-sensitive
}
// Command execution tests
void test_commands(void) {
char *argv_add[] = {"add", "10", "20"};
int result = cmd_add(3, argv_add);
assert(result == 0); // Success
// Check output was "Result: 30"
char *argv_bad[] = {"add", "10"};
result = cmd_add(2, argv_bad);
assert(result != 0); // Error: missing argument
char *argv_div0[] = {"div", "10", "0"};
result = cmd_div(3, argv_div0);
assert(result != 0); // Error: division by zero
}
Test Script
#!/bin/bash
# test_cmdinterp.sh
INTERP=./cmdinterp
# Test basic commands
echo "Testing basic commands..."
echo "help" | $INTERP | grep -q "Available commands" || { echo "FAIL: help"; exit 1; }
echo "echo hello world" | $INTERP | grep -q "hello world" || { echo "FAIL: echo"; exit 1; }
echo "add 5 3" | $INTERP | grep -q "8" || { echo "FAIL: add"; exit 1; }
echo "sub 10 4" | $INTERP | grep -q "6" || { echo "FAIL: sub"; exit 1; }
echo "mul 6 7" | $INTERP | grep -q "42" || { echo "FAIL: mul"; exit 1; }
echo "div 20 5" | $INTERP | grep -q "4" || { echo "FAIL: div"; exit 1; }
# Test error handling
echo "Testing error handling..."
echo "unknown_cmd" | $INTERP | grep -q "Unknown command" || { echo "FAIL: unknown"; exit 1; }
echo "add 5" | $INTERP | grep -q -i "error\|usage" || { echo "FAIL: add missing arg"; exit 1; }
echo "div 10 0" | $INTERP | grep -q -i "error\|zero" || { echo "FAIL: div by zero"; exit 1; }
# Test repeat
echo "Testing repeat..."
result=$(echo "repeat 3 echo test" | $INTERP | grep -c "test")
[ "$result" -eq 3 ] || { echo "FAIL: repeat count"; exit 1; }
echo "All tests passed!"
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Forgot sentinel in table | Infinite loop or crash | Always end array with {NULL, NULL, ...} |
| Wrong function signature | Compiler warning or crash | Ensure all handlers match cmd_func typedef |
| Modifying string literals | Segfault | Use char[] not char* for tokenizer buffer |
| Off-by-one in argc | Wrong argument parsing | Remember: argc includes command name |
| Not null-checking lookup result | Crash on unknown command | Always check find_command() != NULL |
| Forgetting to flush stdout | No prompt shown | fflush(stdout) after printf("> ") |
| Alias infinite loop | Stack overflow | Detect and prevent alias cycles |
Debugging Strategies
- Print the dispatch table at startup:
void debug_print_commands(void) { printf("Registered commands:\n"); for (int i = 0; commands[i].name != NULL; i++) { printf(" %s -> %p\n", commands[i].name, (void*)commands[i].handler); } } - Trace function pointer calls:
// Before calling handler printf("DEBUG: Calling %s at %p with argc=%d\n", cmd->name, (void*)cmd->handler, argc); for (int i = 0; i < argc; i++) { printf("DEBUG: argv[%d] = '%s'\n", i, argv[i]); } - Use GDB to inspect function pointers:
(gdb) print commands[0] $1 = {name = "help", handler = 0x401234 <cmd_help>, ...} (gdb) print cmd->handler $2 = (int (*)(int, char **)) 0x401234 <cmd_help> (gdb) break *cmd->handler Breakpoint 1 at 0x401234: file commands.c, line 10.
8. Extensions & Challenges
Beginner Extensions
- Add
clearcommand to clear the screen - Add
versioncommand to show interpreter version - Add command abbreviations (e.g.,
qforquit) - Colorize output (errors in red, success in green)
Intermediate Extensions
- Implement
ifcommand for conditional execution - Add variables:
set x 10,echo $x - Implement piping:
history | grep add - Load commands from a file:
source script.cmd - Implement tab completion
Advanced Extensions
- Plugin system: Load commands from shared libraries at runtime
// plugin.c int plugin_init(void) { /* register commands */ } int cmd_plugin_func(int argc, char **argv) { ... } // main interpreter void *handle = dlopen("./plugin.so", RTLD_NOW); void (*init)(void) = dlsym(handle, "plugin_init"); init(); - State machine mode: Use 2D dispatch for
state x event -> handlertypedef int (*state_handler)(Context *ctx, Event *evt); state_handler transitions[NUM_STATES][NUM_EVENTS]; - Bytecode compiler: Compile commands to bytecode, execute with dispatch
// Compile "add 5 3" to bytecode: [OP_PUSH, 5, OP_PUSH, 3, OP_ADD, OP_PRINT] // Execute with: handlers[opcode](vm); - Coroutine support: Implement
yieldfor cooperative multitasking
9. Real-World Connections
How Real Systems Use Dispatch Tables
1. Bash Shell
// From bash source code (builtins.h)
struct builtin {
char *name;
sh_builtin_func_t *function;
int flags;
char * const *long_doc;
const char *short_doc;
char *handle;
};
// Lookup and execute
builtin = find_shell_builtin(words->word->word);
result = (*(builtin->function))(words->next);
2. Linux Kernel (syscall dispatch)
// arch/x86/entry/syscall_64.c
const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
[0] = sys_read,
[1] = sys_write,
[2] = sys_open,
// ...
};
// Dispatch
sys_call_table[nr](regs);
3. CPython VM
// Python/ceval.c
switch (opcode) {
case LOAD_CONST: ... break;
case BINARY_ADD: ... break;
// Or with computed goto:
static void *opcode_targets[] = {
&&TARGET_LOAD_CONST,
&&TARGET_BINARY_ADD,
};
DISPATCH() { goto *opcode_targets[*ip++]; }
4. Redis Command Handling
// src/server.h
struct redisCommand {
char *name;
redisCommandProc *proc;
int arity;
char *sflags;
// ...
};
// Command execution
c->cmd->proc(c);
5. Nginx Event Handling
// Event dispatch for different OSes
typedef struct {
ngx_int_t (*add)(ngx_event_t *ev, ngx_int_t event, ngx_uint_t flags);
ngx_int_t (*del)(ngx_event_t *ev, ngx_int_t event, ngx_uint_t flags);
ngx_int_t (*process_events)(ngx_cycle_t *cycle, ...);
// ...
} ngx_event_actions_t;
ngx_event_actions_t ngx_event_actions;
// Dispatch: ngx_event_actions.process_events(cycle, ...);
10. Resources
Essential Reading
- Expert C Programming, Chapter 3: “Unscrambling Declarations in C” (function pointer syntax)
- Expert C Programming, Chapter 9: “More about Arrays” (arrays of function pointers)
- C Interfaces and Implementations, Chapter 1-2 (callback patterns)
Online Resources
- cdecl.org - Decode function pointer declarations
- Compiler Explorer - See how function pointers compile to assembly
- Linux kernel source:
include/linux/fs.h(struct file_operations)
Related Projects
- readline library - For real line editing
- linenoise - Lightweight readline alternative
- Crafting Interpreters - Full interpreter implementation
11. Self-Assessment Checklist
Understanding
- I can declare function pointers without looking up the syntax
- I understand why
handlers[i](args)works - I can explain the performance characteristics of dispatch tables
- I know when to use dispatch tables vs switch statements
- I understand how this relates to C++ virtual functions
Implementation
- My interpreter correctly parses commands and arguments
- Command lookup works correctly (returns NULL for unknown commands)
- All arithmetic commands handle their arguments properly
- Error handling provides helpful messages
- The
repeatcommand correctly executes sub-commands - The
aliassystem allows creating new commands
Testing
- All built-in commands work correctly
- Error cases are handled gracefully
- Edge cases (empty input, long input) are handled
- I can add a new command by just adding a table entry
Growth
- I can identify dispatch table patterns in other codebases
- I understand how to extend this to a plugin system
- I can explain this pattern in an interview
12. Submission / Completion Criteria
Minimum Viable Completion
- REPL loop reads and responds to commands
- At least 5 commands: help, echo, add, quit, and one other
- Unknown commands produce error messages
- Commands are looked up in a dispatch table
Full Completion
- All specified commands implemented (help, echo, add, sub, mul, div, quit)
repeatcommand works correctlyhistorycommand shows previous commandsaliascommand creates runtime aliases- Proper error handling for all edge cases
- Clean code with consistent style
Excellence (Going Above & Beyond)
- Hash table for O(1) command lookup
- Plugin system with dlopen/dlsym
- Tab completion
- Command history with arrow key navigation (using readline)
- Variables and interpolation
- Comprehensive test suite
Thinking Exercise
Before writing code, design your command table on paper:
Exercise 1: Define the Handler Signature
What should the function signature be? Consider:
- How will handlers know their arguments?
- How will handlers signal success/failure/quit?
- Do handlers need access to interpreter state?
Write your typedef:
typedef _____________ (*cmd_func)(_____________);
Exercise 2: Design the Command Structure
What fields does each command entry need?
typedef struct {
_____________ // Name
_____________ // Handler
_____________ // Help text
_____________ // Anything else?
} Command;
Exercise 3: Trace Through Dispatch
For the input "mul 6 7", trace:
- How is it tokenized?
- How is the command found?
- How is the handler called?
- What does the handler do?
- What is printed?
Exercise 4: Design the Alias System
How would you implement:
alias double=mul 2(prepend “mul 2” when “double” is used)alias greet=echo Hello,(prepend “echo Hello,” when “greet” is used)
Sketch the data structure and lookup algorithm.
This project is part of the Expert C Programming Mastery series. For the complete learning path, see the project index.