Project 10: Function Pointer Dispatch Table

Build a command interpreter using function pointer tables for O(1) dispatch, demonstrating polymorphism in C and the power of treating functions as first-class data.


Quick Reference

Attribute Value
Language C
Difficulty Level 3 (Intermediate)
Time 1 Week
Book Reference Expert C Programming, Ch. 3, 9
Coolness 5/5 - Real-World Pattern
Portfolio Value High - Used in shells, VMs, interpreters

1. Learning Objectives

By completing this project, you will:

  1. Master function pointer syntax: Declare, initialize, and invoke function pointers with confidence
  2. Understand dispatch tables: Build lookup tables that map keys to function handlers
  3. Implement O(1) dispatch: Use array indexing or hash tables for constant-time command lookup
  4. Create extensible systems: Design software where new commands can be added without modifying core logic
  5. Recognize the callback pattern: See how function pointers enable event-driven and modular design
  6. Understand C polymorphism: Achieve runtime behavior switching without classes or inheritance
  7. Build real-world systems: Apply patterns used in shells, interpreters, game engines, and protocol handlers

2. Theoretical Foundation

2.1 Core Concepts

What Are Function Pointers?

In C, functions have addresses just like variables. A function pointer stores the address of a function, allowing you to call functions indirectly:

FUNCTION POINTERS: THE BASICS
=============================

Regular function call:
    int result = add(3, 4);

    The compiler generates:
    call add        ; Direct jump to add's address

Function pointer call:
    int (*operation)(int, int) = add;  // Store function's address
    int result = operation(3, 4);      // Call through pointer

    The compiler generates:
    mov rax, [operation]  ; Load function address from variable
    call rax              ; Indirect jump through register

KEY INSIGHT:
============
Functions ARE addresses. The function name without () is already a pointer.

    int add(int a, int b) { return a + b; }

    add    → Address of the add function (e.g., 0x401100)
    add()  → CALL the add function

This is similar to arrays:
    int arr[5];
    arr    → Address of first element
    arr[0] → ACCESS the first element

Function Pointer Declaration Syntax

READING FUNCTION POINTER DECLARATIONS
=====================================

Use the clockwise/spiral rule from Chapter 3:

    int (*fp)(int, int);
        ─┬─
         │
    Start at identifier 'fp'

    1. fp is...
    2. * → a pointer to...
    3. (int, int) → a function taking (int, int)...
    4. int → returning int

DECLARATION                     MEANING
───────────────────────────────────────────────────────────
int (*fp)();                    fp is pointer to function() returning int
int *(*fp)();                   fp is pointer to function() returning int*
int (*fp)(int);                 fp is pointer to function(int) returning int
int (*fp)(int, char*);          fp is pointer to function(int, char*) returning int
void (*fp)(void);               fp is pointer to function(void) returning void
int (*arr[10])(int);            arr is array[10] of pointers to function(int) returning int
int (*(*fp)(int))(double);      fp is pointer to function(int) returning pointer to
                                   function(double) returning int

TYPEDEF MAKES IT READABLE
=========================

Instead of:
    int (*operations[10])(int, int);

Use:
    typedef int (*BinaryOp)(int, int);
    BinaryOp operations[10];

    or (C11+):
    typedef int BinaryOp(int, int);
    BinaryOp *operations[10];

Dispatch Tables

DISPATCH TABLES: THE PATTERN
============================

A dispatch table maps keys (commands, opcodes, message types) to handlers (functions).

Traditional approach (O(n) in number of commands):
┌───────────────────────────────────────────────────────────────┐
│ if (strcmp(cmd, "add") == 0)                                  │
│     return do_add(args);                                      │
│ else if (strcmp(cmd, "sub") == 0)                            │
│     return do_sub(args);                                      │
│ else if (strcmp(cmd, "mul") == 0)                            │
│     return do_mul(args);                                      │
│ // ... N more comparisons                                     │
└───────────────────────────────────────────────────────────────┘

Problems:
- O(n) time to find the command
- Adding new commands requires modifying the if/else chain
- Hard to add commands at runtime
- Mixing control flow with data

Dispatch table approach (O(1) lookup):
┌───────────────────────────────────────────────────────────────┐
│ // Data: command table                                        │
│ typedef struct {                                              │
│     const char *name;                                         │
│     cmd_func handler;                                         │
│ } Command;                                                    │
│                                                               │
│ Command commands[] = {                                        │
│     {"add", do_add},                                          │
│     {"sub", do_sub},                                          │
│     {"mul", do_mul},                                          │
│ };                                                            │
│                                                               │
│ // Lookup (O(1) with hash table, O(log n) with sorted array) │
│ Command *cmd = lookup(commands, user_input);                  │
│ return cmd->handler(args);                                    │
└───────────────────────────────────────────────────────────────┘

Advantages:
- Fast lookup with proper data structure
- Data-driven: add commands without changing logic
- Can add/remove commands at runtime
- Separation of concerns

Dispatch Table Pattern

2.2 Why This Matters

REAL-WORLD USES OF DISPATCH TABLES
==================================

1. COMMAND-LINE SHELLS (bash, zsh, fish)
   ┌─────────────────────────────────────────────────────────┐
   │ User input: "ls -la /home"                              │
   │                │                                        │
   │                ▼                                        │
   │ ┌─────────────────────────────────────┐                │
   │ │ builtin_commands[] = {              │                │
   │ │   {"cd",     builtin_cd},           │                │
   │ │   {"export", builtin_export},       │                │
   │ │   {"alias",  builtin_alias},        │                │
   │ │   ...                               │                │
   │ │ }                                   │                │
   │ └─────────────────────────────────────┘                │
   │                │                                        │
   │                ▼                                        │
   │ Lookup "ls" → not found → exec external program        │
   │ Lookup "cd" → found → call builtin_cd()                │
   └─────────────────────────────────────────────────────────┘

2. BYTECODE INTERPRETERS (Python, Lua, JVM)
   ┌─────────────────────────────────────────────────────────┐
   │ Bytecode: LOAD_CONST 0, LOAD_CONST 1, BINARY_ADD       │
   │                                                         │
   │ // Dispatch table indexed by opcode (O(1))              │
   │ void (*opcode_handlers[256])(VM *vm) = {               │
   │     [OP_LOAD_CONST] = op_load_const,                   │
   │     [OP_BINARY_ADD] = op_binary_add,                   │
   │     [OP_CALL]       = op_call,                         │
   │     ...                                                 │
   │ };                                                      │
   │                                                         │
   │ // The interpreter loop (blazingly fast)                │
   │ while (running) {                                       │
   │     opcode_handlers[*ip++](vm);                        │
   │ }                                                       │
   └─────────────────────────────────────────────────────────┘

3. NETWORK PROTOCOL HANDLERS
   ┌─────────────────────────────────────────────────────────┐
   │ Packet: [TYPE: 0x03] [LENGTH: 256] [DATA: ...]         │
   │                │                                        │
   │                ▼                                        │
   │ PacketHandler handlers[] = {                           │
   │     [PKT_CONNECT]    = handle_connect,                 │
   │     [PKT_DATA]       = handle_data,                    │
   │     [PKT_DISCONNECT] = handle_disconnect,              │
   │ };                                                      │
   │                                                         │
   │ handlers[packet->type](conn, packet);                  │
   └─────────────────────────────────────────────────────────┘

4. GUI EVENT HANDLING
   ┌─────────────────────────────────────────────────────────┐
   │ Event: {type: MOUSE_CLICK, x: 100, y: 200}             │
   │                                                         │
   │ EventHandler handlers[] = {                            │
   │     [EVENT_MOUSE_CLICK] = on_mouse_click,              │
   │     [EVENT_KEY_PRESS]   = on_key_press,                │
   │     [EVENT_RESIZE]      = on_resize,                   │
   │ };                                                      │
   │                                                         │
   │ handlers[event.type](widget, &event);                  │
   └─────────────────────────────────────────────────────────┘

5. STATE MACHINES
   ┌─────────────────────────────────────────────────────────┐
   │ Current state: STATE_CONNECTED                         │
   │ Event: EVT_TIMEOUT                                     │
   │                                                         │
   │ // 2D dispatch table: state × event → handler          │
   │ StateHandler transitions[NUM_STATES][NUM_EVENTS] = {   │
   │     [STATE_IDLE][EVT_CONNECT] = do_connect,            │
   │     [STATE_CONNECTED][EVT_TIMEOUT] = do_reconnect,     │
   │     [STATE_CONNECTED][EVT_DATA] = do_receive,          │
   │ };                                                      │
   │                                                         │
   │ transitions[current_state][event](ctx);                │
   └─────────────────────────────────────────────────────────┘

6. PLUGIN SYSTEMS
   ┌─────────────────────────────────────────────────────────┐
   │ // Plugin interface                                     │
   │ typedef struct {                                        │
   │     const char *name;                                   │
   │     int (*init)(void);                                  │
   │     int (*process)(Data *);                             │
   │     void (*cleanup)(void);                              │
   │ } Plugin;                                               │
   │                                                         │
   │ // Load plugin at runtime                               │
   │ void *handle = dlopen("plugin.so", RTLD_NOW);          │
   │ Plugin *p = dlsym(handle, "plugin_info");              │
   │ p->init();                                              │
   │ // Register in dispatch table                           │
   │ register_command(p->name, p->process);                 │
   └─────────────────────────────────────────────────────────┘

2.3 Historical Context

THE EVOLUTION OF DISPATCH MECHANISMS
====================================

1960s: FORTRAN COMPUTED GOTO
────────────────────────────
      GOTO (100, 200, 300), I
      ! Jump to label 100, 200, or 300 based on I

1970s: C SWITCH STATEMENT
─────────────────────────
      switch (opcode) {
          case ADD: ... break;
          case SUB: ... break;
      }
      ! Compiler might use jump table internally

1970s: C FUNCTION POINTERS
──────────────────────────
      void (*handlers[])() = {add, sub, mul};
      handlers[opcode]();
      ! Explicit dispatch table in source code

1980s: C++ VIRTUAL FUNCTIONS
────────────────────────────
      class Animal {
          virtual void speak() = 0;
      };
      ! Compiler generates vtable (dispatch table)

1990s-Now: INTERPRETED LANGUAGE VMs
───────────────────────────────────
      Python, Lua, Ruby all use dispatch tables
      for bytecode interpretation

INSIGHT: Virtual functions ARE dispatch tables!
================================================

When you write:
    class Animal { virtual void speak(); };
    class Dog : public Animal { void speak() override; };

The compiler generates:
    struct AnimalVTable {
        void (*speak)(Animal*);
    };

    AnimalVTable dog_vtable = { &Dog::speak };

    // animal->speak() becomes:
    animal->vtable->speak(animal);

C doesn't have virtual functions, but you can build them yourself!
That's exactly what this project teaches.

2.4 Common Misconceptions

MISCONCEPTION 1: "Function pointers are slow"
─────────────────────────────────────────────

Reality: An indirect call (through function pointer) is typically just ONE
         extra memory load compared to a direct call.

Direct call:
    call do_add          ; Immediate address in instruction

Indirect call:
    mov rax, [handler]   ; Load address from memory (1 extra instruction)
    call rax             ; Jump to loaded address

Cost: ~1-3 CPU cycles extra (negligible for most applications)

Exception: Branch prediction. If the function pointer changes frequently,
           the CPU can't predict where to jump, causing pipeline stalls.
           But for dispatch tables, the same command usually calls the
           same function, so prediction works well.


MISCONCEPTION 2: "Function pointers are just for callbacks"
───────────────────────────────────────────────────────────

Reality: Function pointers enable multiple patterns:

1. Callbacks (what most tutorials show)
   qsort(arr, n, sizeof(int), compare_func);

2. Dispatch tables (this project)
   handlers[opcode](args);

3. Strategy pattern (runtime algorithm selection)
   sorter->algorithm(data);  // quicksort, mergesort, etc.

4. Object-oriented C (manual vtables)
   obj->vtable->method(obj, args);

5. State machines (state-dependent behavior)
   states[current]->on_event(machine, event);


MISCONCEPTION 3: "You can't have polymorphism in C"
───────────────────────────────────────────────────

Reality: C has polymorphism through function pointers. It's just explicit
         rather than implicit (like C++ virtual).

// C "polymorphism"
typedef struct {
    void (*speak)(void *self);
    void (*move)(void *self, int x, int y);
} AnimalOps;

typedef struct {
    AnimalOps *ops;
    char name[32];
} Animal;

void animal_speak(Animal *a) {
    a->ops->speak(a);  // Dispatches to Dog::speak, Cat::speak, etc.
}

This is EXACTLY how Linux kernel implements polymorphism for drivers,
filesystems, and more. See: struct file_operations, struct inode_operations


MISCONCEPTION 4: "The function pointer syntax is too complicated"
─────────────────────────────────────────────────────────────────

Reality: Use typedef to make it readable!

// Ugly
int (*(*fp)(int, int))(char)

// Readable
typedef int (*CharFunc)(char);
typedef CharFunc (*IntIntFunc)(int, int);
IntIntFunc fp;

Rule: If your function pointer declaration needs the spiral rule,
      use typedef to give the inner types names.

3. Project Specification

3.1 What You Will Build

A fully-featured command interpreter that demonstrates function pointer dispatch:

$ ./cmdinterp
Welcome to CmdInterp v1.0
Type 'help' for available commands.

> help
Available commands:
  help              Show this help message
  echo <args>       Echo arguments back
  add <n1> <n2>     Add two numbers
  sub <n1> <n2>     Subtract two numbers
  mul <n1> <n2>     Multiply two numbers
  div <n1> <n2>     Divide two numbers
  repeat <n> <cmd>  Repeat a command n times
  history           Show command history
  alias <n>=<cmd>   Create command alias
  quit              Exit the interpreter

> add 5 3
Result: 8

> echo hello world
hello world

> mul 7 6
Result: 42

> repeat 3 echo testing
testing
testing
testing

> alias double=mul 2
Alias 'double' created.

> double 21
Result: 42

> history
  1: help
  2: add 5 3
  3: echo hello world
  4: mul 7 6
  5: repeat 3 echo testing
  6: alias double=mul 2
  7: double 21

> unknown_command
Error: Unknown command 'unknown_command'. Type 'help' for available commands.

> quit
Goodbye!

3.2 Functional Requirements

  1. Core Command System:
    • Parse user input into command and arguments
    • Look up commands in dispatch table (O(1) or O(log n))
    • Execute handler function with arguments
    • Handle unknown commands gracefully
  2. Built-in Commands:
    • help - List all available commands with descriptions
    • echo - Print arguments
    • add, sub, mul, div - Basic arithmetic
    • quit - Exit the interpreter
  3. Advanced Features:
    • repeat <n> <cmd> - Execute a command multiple times
    • history - Show command history
    • alias <name>=<cmd> - Create command aliases (shows dispatch table modification)
  4. Error Handling:
    • Unknown command messages
    • Invalid argument messages
    • Division by zero handling

3.3 Non-Functional Requirements

  • Extensibility: Adding a new command should require only adding an entry to a table
  • Performance: Command lookup should be O(1) using hash table or O(log n) using binary search
  • Clean Code: Each command handler is a separate function with consistent signature
  • Memory Safety: No buffer overflows, proper input validation

3.4 Example Usage / Output

# Startup
$ ./cmdinterp
Welcome to CmdInterp v1.0
Type 'help' for available commands.

# Help system
> help
Available commands:
  help              Show this help message
  echo <args>       Echo arguments back
  add <n1> <n2>     Add two numbers
  sub <n1> <n2>     Subtract two numbers
  mul <n1> <n2>     Multiply two numbers
  div <n1> <n2>     Divide two numbers
  repeat <n> <cmd>  Repeat a command n times
  history           Show command history
  alias <n>=<cmd>   Create command alias
  quit              Exit the interpreter

# Arithmetic
> add 10 20
Result: 30

> sub 100 42
Result: 58

> mul 6 7
Result: 42

> div 100 4
Result: 25

> div 10 0
Error: Division by zero

# String operations
> echo The quick brown fox
The quick brown fox

> echo
(empty line)

# Repetition
> repeat 5 echo Hello
Hello
Hello
Hello
Hello
Hello

> repeat 3 add 1 1
Result: 2
Result: 2
Result: 2

# Aliases (modifying dispatch table at runtime)
> alias square=mul
Alias 'square' created. Usage: square <n> <n>

> square 5 5
Result: 25

> alias greet=echo Hello,
Alias 'greet' created. Usage: greet <additional args>

> greet World!
Hello, World!

# History
> history
  1: help
  2: add 10 20
  3: sub 100 42
  4: mul 6 7
  5: div 100 4
  6: div 10 0
  7: echo The quick brown fox
  8: repeat 5 echo Hello
  9: alias square=mul
 10: square 5 5

# Error handling
> nonexistent
Error: Unknown command 'nonexistent'. Type 'help' for available commands.

> add 5
Error: add requires 2 arguments, got 1

> add five three
Error: Invalid argument 'five' - expected a number

# Exit
> quit
Goodbye!

3.5 Real World Outcome

After completing this project, you will:

  1. Understand dispatch tables deeply - Not just use them, but know when and why to apply them
  2. Write extensible systems - Design software that grows without massive refactoring
  3. Read real-world C code - Recognize dispatch patterns in shells, VMs, kernels
  4. Pass systems interviews - Function pointers are a classic interview topic
  5. Build plugin systems - Know how to add functionality at runtime
  6. Implement state machines - Use 2D dispatch tables for state × event → action

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────────────────────────────────────────────────────────────┐
│                         COMMAND INTERPRETER ARCHITECTURE                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   USER INPUT                                                                │
│       │                                                                     │
│       ▼                                                                     │
│   ┌───────────────┐                                                         │
│   │    LEXER      │  Split input into tokens                                │
│   │               │  "add 5 3" → ["add", "5", "3"]                          │
│   └───────┬───────┘                                                         │
│           │                                                                 │
│           ▼                                                                 │
│   ┌───────────────┐                                                         │
│   │   DISPATCHER  │  Look up command in dispatch table                      │
│   │               │                                                         │
│   │  ┌──────────────────────────────────────────────────────────────────┐   │
│   │  │                    DISPATCH TABLE                                │   │
│   │  ├───────────┬─────────────────────┬───────────────────────────────┤   │
│   │  │  Name     │  Handler Function   │  Help Text                    │   │
│   │  ├───────────┼─────────────────────┼───────────────────────────────┤   │
│   │  │  "help"   │  cmd_help          │  "Show this help message"     │   │
│   │  │  "echo"   │  cmd_echo          │  "Echo arguments back"        │   │
│   │  │  "add"    │  cmd_add           │  "Add two numbers"            │   │
│   │  │  "sub"    │  cmd_sub           │  "Subtract two numbers"       │   │
│   │  │  "quit"   │  cmd_quit          │  "Exit the interpreter"       │   │
│   │  │  ...      │  ...               │  ...                          │   │
│   │  └───────────┴─────────────────────┴───────────────────────────────┘   │
│   │               │                                                         │
│   └───────┬───────┘                                                         │
│           │                                                                 │
│           ▼                                                                 │
│   ┌───────────────┐                                                         │
│   │   EXECUTOR    │  Call handler with arguments                            │
│   │               │  cmd->handler(argc, argv)                               │
│   └───────┬───────┘                                                         │
│           │                                                                 │
│           ▼                                                                 │
│       OUTPUT                                                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

4.2 Key Components

COMPONENT BREAKDOWN
===================

1. COMMAND HANDLER TYPE
   ┌─────────────────────────────────────────────────────────┐
   │ // All command handlers have this signature             │
   │ typedef int (*cmd_func)(int argc, char **argv);        │
   │                                                         │
   │ // Returns: 0 = success, non-zero = error code         │
   │ // argc: number of arguments (including command name)   │
   │ // argv: array of argument strings                      │
   └─────────────────────────────────────────────────────────┘

2. COMMAND ENTRY STRUCTURE
   ┌─────────────────────────────────────────────────────────┐
   │ typedef struct {                                        │
   │     const char *name;      // "add", "echo", etc.      │
   │     cmd_func handler;      // Function to call         │
   │     const char *help;      // Help text                │
   │     int min_args;          // Minimum required args    │
   │     int max_args;          // Maximum args (-1 = any)  │
   │ } Command;                                              │
   └─────────────────────────────────────────────────────────┘

3. DISPATCH TABLE
   ┌─────────────────────────────────────────────────────────┐
   │ // Static table of built-in commands                    │
   │ static Command commands[] = {                           │
   │     {"help", cmd_help, "Show help", 0, 0},             │
   │     {"echo", cmd_echo, "Echo arguments", 0, -1},       │
   │     {"add",  cmd_add,  "Add numbers", 2, 2},           │
   │     {"quit", cmd_quit, "Exit", 0, 0},                  │
   │     {NULL, NULL, NULL, 0, 0}  // Sentinel               │
   │ };                                                      │
   └─────────────────────────────────────────────────────────┘

4. LOOKUP FUNCTION
   ┌─────────────────────────────────────────────────────────┐
   │ Command *find_command(const char *name) {              │
   │     // Linear search (simple, O(n))                    │
   │     for (int i = 0; commands[i].name != NULL; i++) {   │
   │         if (strcmp(commands[i].name, name) == 0) {     │
   │             return &commands[i];                        │
   │         }                                               │
   │     }                                                   │
   │     return NULL;  // Not found                          │
   │ }                                                       │
   │                                                         │
   │ // Or: hash table for O(1)                              │
   │ // Or: binary search on sorted table for O(log n)      │
   └─────────────────────────────────────────────────────────┘

5. MAIN LOOP
   ┌─────────────────────────────────────────────────────────┐
   │ void repl(void) {                                       │
   │     char line[256];                                     │
   │     while (1) {                                         │
   │         printf("> ");                                   │
   │         if (!fgets(line, sizeof(line), stdin)) break;  │
   │                                                         │
   │         int argc;                                       │
   │         char **argv = tokenize(line, &argc);           │
   │         if (argc == 0) continue;                        │
   │                                                         │
   │         Command *cmd = find_command(argv[0]);          │
   │         if (cmd == NULL) {                              │
   │             printf("Unknown command: %s\n", argv[0]);  │
   │             continue;                                   │
   │         }                                               │
   │                                                         │
   │         int result = cmd->handler(argc, argv);         │
   │         // Handle result...                             │
   │                                                         │
   │         free_tokens(argv);                              │
   │     }                                                   │
   │ }                                                       │
   └─────────────────────────────────────────────────────────┘

4.3 Data Structures

/* Function pointer type for command handlers */
typedef int (*cmd_func)(int argc, char **argv);

/* Command entry in dispatch table */
typedef struct Command {
    const char *name;        /* Command name (e.g., "add") */
    cmd_func handler;        /* Function to call */
    const char *usage;       /* Usage string (e.g., "add <n1> <n2>") */
    const char *help;        /* Help description */
    int min_args;            /* Minimum arguments (excluding command name) */
    int max_args;            /* Maximum arguments (-1 for unlimited) */
} Command;

/* Alias entry for runtime-added commands */
typedef struct Alias {
    char name[32];           /* Alias name */
    char expansion[256];     /* What it expands to */
    struct Alias *next;      /* Linked list for dynamic aliases */
} Alias;

/* Interpreter state */
typedef struct {
    Command *commands;       /* Dispatch table */
    int num_commands;        /* Number of commands */
    Alias *aliases;          /* Runtime aliases */
    char **history;          /* Command history */
    int history_count;       /* Number of history entries */
    int history_capacity;    /* History buffer size */
    int running;             /* 0 = quit requested */
} Interpreter;

/* Token buffer for parsed input */
typedef struct {
    char *tokens[64];        /* Array of token pointers */
    int count;               /* Number of tokens */
    char buffer[1024];       /* Storage for token strings */
} TokenBuffer;

4.4 Algorithm Overview

DISPATCH ALGORITHM
==================

Input: User command string "add 5 3"
Output: Execute appropriate handler and return result

1. TOKENIZE
   ─────────
   Input:  "add 5 3"
   Output: argc=3, argv=["add", "5", "3"]

   Algorithm:
   - Skip leading whitespace
   - For each character:
     - If whitespace: terminate current token, start new
     - If quote: read until matching quote
     - Otherwise: append to current token

2. LOOKUP
   ───────
   Input:  command name "add"
   Output: Command* or NULL

   Methods:
   a) Linear search:  O(n)  - Simple, good for <20 commands
   b) Binary search:  O(log n) - Requires sorted table
   c) Hash table:     O(1)  - Best for many commands
   d) Array index:    O(1)  - If commands are numbered (opcodes)

3. VALIDATE
   ─────────
   Check argument count against min_args and max_args
   Return error if validation fails

4. DISPATCH
   ─────────
   Call: cmd->handler(argc, argv)

   The magic line! This is the indirect function call:

       int result = cmd->handler(argc, argv);

   In assembly:
       mov rax, [cmd + offset_of_handler]  ; Load function pointer
       mov rdi, argc                        ; First argument
       mov rsi, argv                        ; Second argument
       call rax                             ; Indirect call

5. HANDLE RESULT
   ──────────────
   - result == 0:  Success
   - result > 0:   Error code (print message)
   - result < 0:   Special (e.g., -1 = quit requested)


HASH TABLE FOR O(1) LOOKUP
==========================

For production interpreters with many commands:

typedef struct {
    Command *buckets[HASH_SIZE];  // Array of command pointers
} CommandTable;

unsigned int hash(const char *str) {
    unsigned int h = 0;
    while (*str) {
        h = h * 31 + *str++;
    }
    return h % HASH_SIZE;
}

Command *lookup(CommandTable *table, const char *name) {
    unsigned int h = hash(name);
    Command *cmd = table->buckets[h];
    while (cmd) {
        if (strcmp(cmd->name, name) == 0) return cmd;
        cmd = cmd->next;  // Handle collisions with chaining
    }
    return NULL;
}

5. Implementation Guide

5.1 Development Environment Setup

Required Tools:

  • GCC or Clang compiler
  • Make build system
  • gdb/lldb debugger

Compiler Flags:

CFLAGS = -Wall -Wextra -Wpedantic -std=c11 -g
# -Wall: Enable all common warnings
# -Wextra: Enable extra warnings
# -Wpedantic: Strict ISO C compliance
# -std=c11: Use C11 standard
# -g: Include debug symbols

5.2 Project Structure

cmdinterp/
├── Makefile
├── include/
│   ├── cmdinterp.h      # Main header with types
│   ├── commands.h       # Command handler declarations
│   └── tokenizer.h      # Tokenizer interface
├── src/
│   ├── main.c           # Entry point and REPL
│   ├── dispatch.c       # Dispatch table and lookup
│   ├── tokenizer.c      # Input tokenization
│   ├── commands/
│   │   ├── builtin.c    # help, echo, quit
│   │   ├── math.c       # add, sub, mul, div
│   │   ├── meta.c       # repeat, alias, history
│   │   └── commands.c   # Command registration
│   └── history.c        # History tracking
└── tests/
    └── test_dispatch.c  # Unit tests

5.3 The Core Question You’re Answering

“How can we use function pointers to create extensible, polymorphic behavior in C?”

This question goes to the heart of C’s power: you can achieve the same flexibility that object-oriented languages provide through inheritance and virtual functions, but with explicit control over the mechanism.

The answer is: by storing function pointers in data structures, you can:

  1. Select which function to call at runtime (polymorphism)
  2. Add new behaviors without changing existing code (extensibility)
  3. Pass behaviors as data (callbacks, strategies, plugins)

5.4 Concepts You Must Understand First

Concept Why It Matters Self-Assessment Question
Function pointer syntax You’ll write typedef int (*cmd_func)(int, char**) Can you declare a pointer to a function taking two ints and returning void?
Struct with function pointer Commands need both data (name) and behavior (handler) Can you write a struct containing a function pointer member?
Array of structs The dispatch table is an array of Command structs Can you iterate over an array of structs?
strcmp() and string comparison Command lookup compares strings Why can’t you use == to compare strings in C?
Indirect function calls cmd->handler(args) calls through a pointer What assembly instruction does an indirect call use?

5.5 Questions to Guide Your Design

  1. How will you handle commands with different argument counts?
    • Fixed: add <n1> <n2> always takes 2
    • Variable: echo takes 0 or more
    • Consider: min_args, max_args fields
  2. How will you make the dispatch table extensible?
    • Static table: Fast, but fixed at compile time
    • Dynamic table: Can add commands at runtime (for aliases)
    • Hybrid: Static built-ins + dynamic extensions
  3. What should the command handler signature be?
    • Option A: int handler(int argc, char **argv) - Like main()
    • Option B: int handler(char *args) - Raw argument string
    • Option C: int handler(Interpreter *interp, int argc, char **argv) - With context
  4. How will you implement ‘repeat’ and ‘alias’?
    • repeat: Needs to parse and call another command
    • alias: Needs to modify or shadow the dispatch table
  5. What return values indicate success/failure/quit?
    • Design a consistent error handling strategy

5.6 Thinking Exercise

Before coding, trace through this scenario on paper:

Scenario: User types add 5 3

Step 1: main() calls repl()
        Current state: _______________

Step 2: repl() reads line "add 5 3\n"
        line buffer contains: _______________

Step 3: tokenize() parses line
        argc = ___
        argv[0] = ___
        argv[1] = ___
        argv[2] = ___

Step 4: find_command("add") searches table
        Compare "add" with "help": ___
        Compare "add" with "echo": ___
        Compare "add" with "add":  ___
        Returns: _______________

Step 5: Validate arguments
        min_args = ___, max_args = ___
        argc - 1 = ___ (excluding command name)
        Validation: _______________

Step 6: Call cmd->handler(3, argv)
        Handler function address: _______________
        Inside cmd_add():
          argv[1] = "5" → atoi → ___
          argv[2] = "3" → atoi → ___
          result = ___

Step 7: Return to repl()
        Print: _______________

5.7 Hints in Layers

Hint 1: Starting the Dispatch Table

Start with the simplest possible dispatch table:

typedef int (*cmd_func)(int argc, char **argv);

typedef struct {
    const char *name;
    cmd_func handler;
} Command;

int cmd_hello(int argc, char **argv) {
    printf("Hello, World!\n");
    return 0;
}

int cmd_quit(int argc, char **argv) {
    return -1;  // Signal to quit
}

Command commands[] = {
    {"hello", cmd_hello},
    {"quit", cmd_quit},
    {NULL, NULL}  // End marker
};

Get this working before adding complexity.

Hint 2: The Lookup Function

Linear search is fine for small tables:

Command *find_command(const char *name) {
    for (int i = 0; commands[i].name != NULL; i++) {
        if (strcmp(commands[i].name, name) == 0) {
            return &commands[i];
        }
    }
    return NULL;
}

The key insight: return a pointer to the Command, not a copy. This lets you call cmd->handler(...) directly.

Hint 3: The Main Loop
void repl(void) {
    char line[256];
    int running = 1;

    while (running) {
        printf("> ");
        fflush(stdout);

        if (fgets(line, sizeof(line), stdin) == NULL) {
            break;  // EOF
        }

        // Remove newline
        line[strcspn(line, "\n")] = '\0';

        // Skip empty lines
        if (line[0] == '\0') continue;

        // Tokenize
        char *argv[64];
        int argc = tokenize(line, argv, 64);
        if (argc == 0) continue;

        // Dispatch
        Command *cmd = find_command(argv[0]);
        if (cmd == NULL) {
            printf("Unknown command: %s\n", argv[0]);
            continue;
        }

        int result = cmd->handler(argc, argv);
        if (result < 0) running = 0;  // Quit signal
    }
}
Hint 4: Simple Tokenizer
// In-place tokenization (modifies the input string)
int tokenize(char *line, char **argv, int max_args) {
    int argc = 0;
    char *p = line;

    while (*p && argc < max_args) {
        // Skip whitespace
        while (*p == ' ' || *p == '\t') p++;
        if (*p == '\0') break;

        // Start of token
        argv[argc++] = p;

        // Find end of token
        while (*p && *p != ' ' && *p != '\t') p++;
        if (*p) *p++ = '\0';  // Null-terminate token
    }

    return argc;
}
Hint 5: Implementing 'repeat'

repeat needs to call another command. This shows the power of dispatch tables - you can recursively use the same mechanism:

int cmd_repeat(int argc, char **argv) {
    if (argc < 3) {
        printf("Usage: repeat <n> <command> [args...]\n");
        return 1;
    }

    int count = atoi(argv[1]);
    if (count <= 0) {
        printf("Error: repeat count must be positive\n");
        return 1;
    }

    // Build new argv for the sub-command
    // argv: ["repeat", "3", "echo", "hello"]
    // sub:  ["echo", "hello"]
    int sub_argc = argc - 2;
    char **sub_argv = &argv[2];

    Command *cmd = find_command(sub_argv[0]);
    if (cmd == NULL) {
        printf("Unknown command: %s\n", sub_argv[0]);
        return 1;
    }

    for (int i = 0; i < count; i++) {
        cmd->handler(sub_argc, sub_argv);
    }

    return 0;
}
Hint 6: Implementing Aliases

Aliases can be implemented as a separate lookup that happens before the main dispatch:

typedef struct Alias {
    char name[32];
    char expansion[256];
    struct Alias *next;
} Alias;

Alias *aliases = NULL;

char *expand_alias(const char *name) {
    for (Alias *a = aliases; a != NULL; a = a->next) {
        if (strcmp(a->name, name) == 0) {
            return a->expansion;
        }
    }
    return NULL;
}

// In the main loop, before find_command:
char *expansion = expand_alias(argv[0]);
if (expansion) {
    // Prepend expansion to the argument list
    // "double 5" with alias "double=mul 2" becomes "mul 2 5"
    // ... reparse with expansion ...
}

5.8 The Interview Questions They’ll Ask

After completing this project, be ready for:

  1. “Explain how function pointers work in C”
    • Functions have addresses; pointers store addresses
    • Indirect call vs direct call
    • The compiler generates a call instruction with a register operand
  2. “What is a dispatch table and when would you use one?”
    • Maps keys to handlers
    • Use for: command interpreters, bytecode VMs, protocol handlers, event systems
    • Benefits: extensibility, data-driven design, O(1) lookup possible
  3. “How does C achieve polymorphism without classes?”
    • Function pointers in structs (manual vtables)
    • Example: Linux kernel’s file_operations, inode_operations
    • Trade-off: explicit vs implicit (C++ virtual)
  4. “What’s the performance cost of function pointers?”
    • One extra memory load for the address
    • Potential branch prediction miss if pointer changes frequently
    • Usually negligible; profile before optimizing
  5. “Design a plugin system in C”
    • Define interface as struct of function pointers
    • Load shared library with dlopen/dlsym
    • Get plugin’s function table
    • Register handlers in dispatch table
  6. “How would you implement a finite state machine in C?”
    • 2D dispatch table: handlers[state][event]
    • Or struct per state with entry/exit/event handlers
    • Function pointers enable clean state-dependent behavior

5.9 Books That Will Help

Topic Book Chapter
Function pointer syntax Expert C Programming Ch. 3, 9
Callbacks and function pointers C Interfaces and Implementations Ch. 1-2
Linux kernel dispatch patterns Linux Device Drivers Ch. 3
VM dispatch optimization Crafting Interpreters Ch. 15 (switch vs computed goto)
State machines in C Practical Statecharts in C/C++ Ch. 2-3

5.10 Implementation Phases

Phase 1: Minimal Viable Interpreter (Day 1-2)

  • Basic REPL loop
  • Tokenizer
  • Dispatch table with 3-4 commands (help, echo, quit)
  • Linear search lookup

Phase 2: Arithmetic Commands (Day 3)

  • Add, sub, mul, div commands
  • Argument validation
  • Error handling

Phase 3: Advanced Features (Day 4-5)

  • History tracking
  • Repeat command
  • Alias system

Phase 4: Polish (Day 6-7)

  • Improve error messages
  • Add more commands
  • Optimize lookup (optional: hash table)
  • Write tests

5.11 Key Implementation Decisions

Decision Simple Approach Advanced Approach
Lookup Linear search O(n) Hash table O(1)
Table storage Static array Dynamic array/linked list
Handler signature int f(int, char**) int f(Interp*, int, char**)
Tokenizer In-place modification Copy to new buffer
Aliases Separate linked list Merge into main table

For this project, start with the simple approaches. Optimization can come later.


6. Testing Strategy

Test Categories

Category Purpose Examples
Unit Tests Test individual components Tokenizer, lookup function
Integration Tests Test command execution Full command invocation
Error Tests Test error handling Invalid commands, bad arguments
Edge Cases Test boundaries Empty input, very long input

Critical Test Cases

// Tokenizer tests
void test_tokenizer(void) {
    char line[256];
    char *argv[64];
    int argc;

    // Basic tokenization
    strcpy(line, "add 5 3");
    argc = tokenize(line, argv, 64);
    assert(argc == 3);
    assert(strcmp(argv[0], "add") == 0);
    assert(strcmp(argv[1], "5") == 0);
    assert(strcmp(argv[2], "3") == 0);

    // Multiple spaces
    strcpy(line, "echo   hello    world");
    argc = tokenize(line, argv, 64);
    assert(argc == 3);

    // Empty input
    strcpy(line, "");
    argc = tokenize(line, argv, 64);
    assert(argc == 0);

    // Whitespace only
    strcpy(line, "   \t  ");
    argc = tokenize(line, argv, 64);
    assert(argc == 0);
}

// Dispatch tests
void test_dispatch(void) {
    // Known command
    Command *cmd = find_command("add");
    assert(cmd != NULL);
    assert(strcmp(cmd->name, "add") == 0);

    // Unknown command
    cmd = find_command("nonexistent");
    assert(cmd == NULL);

    // Case sensitivity
    cmd = find_command("ADD");
    assert(cmd == NULL);  // Commands are case-sensitive
}

// Command execution tests
void test_commands(void) {
    char *argv_add[] = {"add", "10", "20"};
    int result = cmd_add(3, argv_add);
    assert(result == 0);  // Success
    // Check output was "Result: 30"

    char *argv_bad[] = {"add", "10"};
    result = cmd_add(2, argv_bad);
    assert(result != 0);  // Error: missing argument

    char *argv_div0[] = {"div", "10", "0"};
    result = cmd_div(3, argv_div0);
    assert(result != 0);  // Error: division by zero
}

Test Script

#!/bin/bash
# test_cmdinterp.sh

INTERP=./cmdinterp

# Test basic commands
echo "Testing basic commands..."
echo "help" | $INTERP | grep -q "Available commands" || { echo "FAIL: help"; exit 1; }
echo "echo hello world" | $INTERP | grep -q "hello world" || { echo "FAIL: echo"; exit 1; }
echo "add 5 3" | $INTERP | grep -q "8" || { echo "FAIL: add"; exit 1; }
echo "sub 10 4" | $INTERP | grep -q "6" || { echo "FAIL: sub"; exit 1; }
echo "mul 6 7" | $INTERP | grep -q "42" || { echo "FAIL: mul"; exit 1; }
echo "div 20 5" | $INTERP | grep -q "4" || { echo "FAIL: div"; exit 1; }

# Test error handling
echo "Testing error handling..."
echo "unknown_cmd" | $INTERP | grep -q "Unknown command" || { echo "FAIL: unknown"; exit 1; }
echo "add 5" | $INTERP | grep -q -i "error\|usage" || { echo "FAIL: add missing arg"; exit 1; }
echo "div 10 0" | $INTERP | grep -q -i "error\|zero" || { echo "FAIL: div by zero"; exit 1; }

# Test repeat
echo "Testing repeat..."
result=$(echo "repeat 3 echo test" | $INTERP | grep -c "test")
[ "$result" -eq 3 ] || { echo "FAIL: repeat count"; exit 1; }

echo "All tests passed!"

7. Common Pitfalls & Debugging

Pitfall Symptom Solution
Forgot sentinel in table Infinite loop or crash Always end array with {NULL, NULL, ...}
Wrong function signature Compiler warning or crash Ensure all handlers match cmd_func typedef
Modifying string literals Segfault Use char[] not char* for tokenizer buffer
Off-by-one in argc Wrong argument parsing Remember: argc includes command name
Not null-checking lookup result Crash on unknown command Always check find_command() != NULL
Forgetting to flush stdout No prompt shown fflush(stdout) after printf("> ")
Alias infinite loop Stack overflow Detect and prevent alias cycles

Debugging Strategies

  1. Print the dispatch table at startup:
    void debug_print_commands(void) {
        printf("Registered commands:\n");
        for (int i = 0; commands[i].name != NULL; i++) {
            printf("  %s -> %p\n", commands[i].name,
                   (void*)commands[i].handler);
        }
    }
    
  2. Trace function pointer calls:
    // Before calling handler
    printf("DEBUG: Calling %s at %p with argc=%d\n",
           cmd->name, (void*)cmd->handler, argc);
    for (int i = 0; i < argc; i++) {
        printf("DEBUG:   argv[%d] = '%s'\n", i, argv[i]);
    }
    
  3. Use GDB to inspect function pointers:
    (gdb) print commands[0]
    $1 = {name = "help", handler = 0x401234 <cmd_help>, ...}
    
    (gdb) print cmd->handler
    $2 = (int (*)(int, char **)) 0x401234 <cmd_help>
    
    (gdb) break *cmd->handler
    Breakpoint 1 at 0x401234: file commands.c, line 10.
    

8. Extensions & Challenges

Beginner Extensions

  • Add clear command to clear the screen
  • Add version command to show interpreter version
  • Add command abbreviations (e.g., q for quit)
  • Colorize output (errors in red, success in green)

Intermediate Extensions

  • Implement if command for conditional execution
  • Add variables: set x 10, echo $x
  • Implement piping: history | grep add
  • Load commands from a file: source script.cmd
  • Implement tab completion

Advanced Extensions

  • Plugin system: Load commands from shared libraries at runtime
    // plugin.c
    int plugin_init(void) { /* register commands */ }
    int cmd_plugin_func(int argc, char **argv) { ... }
    
    // main interpreter
    void *handle = dlopen("./plugin.so", RTLD_NOW);
    void (*init)(void) = dlsym(handle, "plugin_init");
    init();
    
  • State machine mode: Use 2D dispatch for state x event -> handler
    typedef int (*state_handler)(Context *ctx, Event *evt);
    state_handler transitions[NUM_STATES][NUM_EVENTS];
    
  • Bytecode compiler: Compile commands to bytecode, execute with dispatch
    // Compile "add 5 3" to bytecode: [OP_PUSH, 5, OP_PUSH, 3, OP_ADD, OP_PRINT]
    // Execute with: handlers[opcode](vm);
    
  • Coroutine support: Implement yield for cooperative multitasking

9. Real-World Connections

How Real Systems Use Dispatch Tables

1. Bash Shell

// From bash source code (builtins.h)
struct builtin {
    char *name;
    sh_builtin_func_t *function;
    int flags;
    char * const *long_doc;
    const char *short_doc;
    char *handle;
};

// Lookup and execute
builtin = find_shell_builtin(words->word->word);
result = (*(builtin->function))(words->next);

2. Linux Kernel (syscall dispatch)

// arch/x86/entry/syscall_64.c
const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
    [0] = sys_read,
    [1] = sys_write,
    [2] = sys_open,
    // ...
};

// Dispatch
sys_call_table[nr](regs);

3. CPython VM

// Python/ceval.c
switch (opcode) {
    case LOAD_CONST: ... break;
    case BINARY_ADD: ... break;
    // Or with computed goto:
    static void *opcode_targets[] = {
        &&TARGET_LOAD_CONST,
        &&TARGET_BINARY_ADD,
    };
    DISPATCH() { goto *opcode_targets[*ip++]; }

4. Redis Command Handling

// src/server.h
struct redisCommand {
    char *name;
    redisCommandProc *proc;
    int arity;
    char *sflags;
    // ...
};

// Command execution
c->cmd->proc(c);

5. Nginx Event Handling

// Event dispatch for different OSes
typedef struct {
    ngx_int_t (*add)(ngx_event_t *ev, ngx_int_t event, ngx_uint_t flags);
    ngx_int_t (*del)(ngx_event_t *ev, ngx_int_t event, ngx_uint_t flags);
    ngx_int_t (*process_events)(ngx_cycle_t *cycle, ...);
    // ...
} ngx_event_actions_t;

ngx_event_actions_t ngx_event_actions;
// Dispatch: ngx_event_actions.process_events(cycle, ...);

10. Resources

Essential Reading

  • Expert C Programming, Chapter 3: “Unscrambling Declarations in C” (function pointer syntax)
  • Expert C Programming, Chapter 9: “More about Arrays” (arrays of function pointers)
  • C Interfaces and Implementations, Chapter 1-2 (callback patterns)

Online Resources

  • cdecl.org - Decode function pointer declarations
  • Compiler Explorer - See how function pointers compile to assembly
  • Linux kernel source: include/linux/fs.h (struct file_operations)

11. Self-Assessment Checklist

Understanding

  • I can declare function pointers without looking up the syntax
  • I understand why handlers[i](args) works
  • I can explain the performance characteristics of dispatch tables
  • I know when to use dispatch tables vs switch statements
  • I understand how this relates to C++ virtual functions

Implementation

  • My interpreter correctly parses commands and arguments
  • Command lookup works correctly (returns NULL for unknown commands)
  • All arithmetic commands handle their arguments properly
  • Error handling provides helpful messages
  • The repeat command correctly executes sub-commands
  • The alias system allows creating new commands

Testing

  • All built-in commands work correctly
  • Error cases are handled gracefully
  • Edge cases (empty input, long input) are handled
  • I can add a new command by just adding a table entry

Growth

  • I can identify dispatch table patterns in other codebases
  • I understand how to extend this to a plugin system
  • I can explain this pattern in an interview

12. Submission / Completion Criteria

Minimum Viable Completion

  • REPL loop reads and responds to commands
  • At least 5 commands: help, echo, add, quit, and one other
  • Unknown commands produce error messages
  • Commands are looked up in a dispatch table

Full Completion

  • All specified commands implemented (help, echo, add, sub, mul, div, quit)
  • repeat command works correctly
  • history command shows previous commands
  • alias command creates runtime aliases
  • Proper error handling for all edge cases
  • Clean code with consistent style

Excellence (Going Above & Beyond)

  • Hash table for O(1) command lookup
  • Plugin system with dlopen/dlsym
  • Tab completion
  • Command history with arrow key navigation (using readline)
  • Variables and interpolation
  • Comprehensive test suite

Thinking Exercise

Before writing code, design your command table on paper:

Exercise 1: Define the Handler Signature

What should the function signature be? Consider:

  • How will handlers know their arguments?
  • How will handlers signal success/failure/quit?
  • Do handlers need access to interpreter state?

Write your typedef:

typedef _____________ (*cmd_func)(_____________);

Exercise 2: Design the Command Structure

What fields does each command entry need?

typedef struct {
    _____________  // Name
    _____________  // Handler
    _____________  // Help text
    _____________  // Anything else?
} Command;

Exercise 3: Trace Through Dispatch

For the input "mul 6 7", trace:

  1. How is it tokenized?
  2. How is the command found?
  3. How is the handler called?
  4. What does the handler do?
  5. What is printed?

Exercise 4: Design the Alias System

How would you implement:

  • alias double=mul 2 (prepend “mul 2” when “double” is used)
  • alias greet=echo Hello, (prepend “echo Hello,” when “greet” is used)

Sketch the data structure and lookup algorithm.


This project is part of the Expert C Programming Mastery series. For the complete learning path, see the project index.