
PROJECT 16 ENHANCEMENTS

**How do bootloaders implement interactive command-line interfaces in a constrained environment without an operating system, standard library, or shell infrastructure?**

Project 16 Educational Enhancements

Insert this content after “You’ve built a GRUB/U-Boot-like interactive environment!” and before “Implementation Hints:”

The Core Question You’re Answering

“How do bootloaders implement interactive command-line interfaces in a constrained environment without an operating system, standard library, or shell infrastructure?”

This project answers the fundamental question of how to build a REPL (Read-Eval-Print Loop) from scratch when you have nothing: no printf, no scanf, no readline, no malloc. You’re implementing the same user experience as the bash or U-Boot prompt, but without any of their dependencies. You’ll discover why bootloaders use command tables instead of if-else chains, how to safely parse user input that could contain anything, and how to maintain state (environment variables) across command invocations without a filesystem.

Concepts You Must Understand First

Before starting this project, verify your understanding with these self-assessment questions:

1. Terminal I/O at BIOS/UART Level

  • Question: How does BIOS INT 16h differ from INT 10h for character input/output?
  • Why it matters: You can’t use C’s getchar()—you need raw firmware calls
  • Book reference: “The BIOS Companion” by Phil Croucher, Chapter 7: Console I/O Services

2. String Processing Without stdlib

  • Question: How would you implement strchr() and strtok() without including string.h?
  • Why it matters: Every string operation must be hand-written or copied
  • Book reference: “Computer Systems: A Programmer’s Perspective” Chapter 3.9: String Operations
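As a rough illustration of what "hand-written" means here, the following sketch shows three small helpers. The names my_strlen, my_strcmp, and my_strchr are hypothetical (chosen to avoid clashing with the standard names), and the only header used is stddef.h, which is normally available even in freestanding builds.

```c
#include <stddef.h>  /* size_t and NULL; usable in freestanding builds */

/* Length of a NUL-terminated string, not counting the terminator. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') n++;
    return n;
}

/* Returns 0 if equal, negative if a sorts before b, positive otherwise. */
int my_strcmp(const char *a, const char *b) {
    while (*a != '\0' && *a == *b) { a++; b++; }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Pointer to the first occurrence of c in s, or NULL if absent. */
char *my_strchr(const char *s, char c) {
    for (; *s != '\0'; s++) {
        if (*s == c) return (char *)s;
    }
    return (c == '\0') ? (char *)s : NULL;
}
```

A strtok replacement is usually folded into the command tokenizer itself, since the shell only ever splits one line at a time.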

3. Function Pointers and Dispatch Tables

  • Question: What is a dispatch table, and how does it enable extensible command handling?
  • Why it matters: This is how you’ll register commands without giant switch statements
  • Book reference: “Expert C Programming: Deep C Secrets” Chapter 9: Function Pointers

4. Buffer Management and Safety

  • Question: What happens if a user types more characters than your input buffer can hold?
  • Why it matters: Buffer overflows can corrupt your bootloader’s stack
  • Book reference: “Effective C, 2nd Edition” Chapter 6: Dynamically Allocated Memory

5. State Management Without a Filesystem

  • Question: How would you store environment variables in RAM that persist across command invocations?
  • Why it matters: You need a simple key-value store in a fixed memory region
  • Book reference: “Data Structures and Algorithm Analysis in C” Chapter 5: Hash Tables (simplified)

6. ASCII Control Characters and Terminal Behavior

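  • Question: Which control characters do backspace (0x08), carriage return (0x0D), and escape (0x1B) correspond to, and how do multi-byte escape sequences encode keys such as the arrows on a serial terminal?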
  • Why it matters: You’re implementing raw terminal control
  • Book reference: VT100 User Guide (freely available), ANSI X3.64 escape codes

Questions to Guide Your Design

Input Handling Architecture

  1. Should you implement input buffering character-by-character or line-by-line?
  2. How will you handle special keys (backspace, arrows, delete) differently from printable characters?
  3. What’s the maximum line length you’ll support, and what happens when exceeded?
  4. Will you support command history (up/down arrows), and if so, how many entries?

Command Parsing Strategy

  1. Should you parse commands during input or after the user presses Enter?
  2. How will you tokenize the input string into command name and arguments?
  3. Will you support quoted arguments (e.g., load "my file.bin") or only space-separated?
  4. How will you handle environment variable substitution (e.g., $bootfile)?

Command Registration and Dispatch

  1. Should commands be registered in a static array or linked list structure?
  2. What information should each command entry contain (name, handler, help text, min/max args)?
  3. How will you implement command name lookup—linear search or binary search?
  4. Should built-in commands and user-defined commands use the same mechanism?

Memory Safety Considerations

  1. How will you validate memory addresses before allowing md or mm commands?
  2. What ranges should be considered “safe” to read/write (bootloader data, free RAM)?
  3. How will you prevent users from overwriting the bootloader code itself?
  4. Should you implement read-only regions that prevent modification?

Environment Variable Storage

  1. What data structure will you use for environment variables (array, hash table)?
  2. How will you handle variable name collisions and updates?
  3. Should variables persist across reboots (saved to disk) or only in RAM?
  4. What’s the maximum number of variables and maximum value length?

Thinking Exercise

Before writing any code, perform this mental execution:

Scenario: A user types the following sequence:

boot> set k[backspace][backspace] bootfile=kernel.bin[enter]
boot> load $bootfile[enter]

Trace through the execution:

  1. Character-by-character processing:
    • User types ‘s’ → echo ‘s’, add to buffer → buffer: “s”
    • User types ‘e’ → echo ‘e’, add to buffer → buffer: “se”
    • User types ‘t’ → echo ‘t’, add to buffer → buffer: “set”
    • User types ‘ ‘ → echo ‘ ‘, add to buffer → buffer: “set “
    • User types ‘k’ → echo ‘k’, add to buffer → buffer: “set k”
    • User types backspace → echo backspace sequence, remove from buffer → buffer: “set “
    • User types backspace → echo backspace sequence, remove from buffer → buffer: “set”
    • User types “ bootfile=kernel.bin” (the leading space replaces the one just erased) → echo each char, add to buffer → buffer: “set bootfile=kernel.bin”
    • User presses Enter → process command
  2. Command parsing:
    • Extract command name: “set”
    • Extract arguments: “bootfile=kernel.bin”
    • Split on ‘=’ → key: “bootfile”, value: “kernel.bin”
  3. Environment variable storage:
    • Search environment array for existing “bootfile” entry → not found
    • Add new entry: env[0] = “bootfile=kernel.bin”
    • Return success
  4. Second command:
    • Parse: command=“load”, arg=“$bootfile”
    • Detect ‘$’ → perform variable expansion
    • Lookup “bootfile” → find “kernel.bin”
    • Replace in argument → arg=“kernel.bin”
    • Call load command handler with “kernel.bin”

Questions to consider:

  • What happens if the buffer fills up mid-input?
  • How do you echo the backspace (move cursor back, write space, move back again)?
  • What if the user types ‘$unknown_var’—do you expand to empty string or leave literal?
  • If load fails, should the bootloader crash or return to prompt?
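One way to check a trace like this is to separate the editing logic from the hardware I/O so it can run on a host machine. The sketch below is a hypothetical state machine (struct line_editor, line_editor_feed, and EDIT_BUF_SIZE are illustrative names, not part of the project); it deliberately omits echoing, which a real bootloader would do next to each buffer update.

```c
#include <stddef.h>

#define EDIT_BUF_SIZE 64  /* assumed capacity, including the terminator */

struct line_editor {
    char buf[EDIT_BUF_SIZE];
    size_t len;           /* number of characters currently in buf */
};

/* Feed one key. Returns 1 when Enter completes the line, 0 otherwise. */
int line_editor_feed(struct line_editor *ed, char c) {
    if (c == '\r') {                 /* Enter: terminate and report completion */
        ed->buf[ed->len] = '\0';
        return 1;
    }
    if (c == '\b') {                 /* Backspace: drop the last char, if any */
        if (ed->len > 0) ed->len--;
        return 0;
    }
    if (c >= 0x20 && c <= 0x7E && ed->len < EDIT_BUF_SIZE - 1) {
        ed->buf[ed->len++] = c;      /* Printable and there is still room */
    }
    return 0;                        /* Anything else is silently ignored */
}
```

Feeding the keys "set k", two backspaces, " bootfile=kernel.bin", and Enter leaves the buffer holding "set bootfile=kernel.bin". When the buffer is full, further printable keys are dropped while backspace and Enter keep working, which is one possible answer to the first question above.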

The Interview Questions They’ll Ask

Question 1: “How would you implement command-line editing with backspace support in a bootloader without a terminal library?”

  • What they’re testing: Understanding of low-level I/O and terminal control
  • Strong answer: “I’d use BIOS INT 16h, AH=0x00 to read keystrokes (returns the scancode in AH and the ASCII code in AL). For backspace (scancode 0x0E, ASCII 0x08), I’d: (1) check buffer isn’t empty, (2) decrement buffer pointer, (3) use INT 10h to move cursor back one position, (4) write a space to erase the character, (5) move cursor back again. For UEFI, I’d read keys with ConIn->ReadKeyStroke() and erase by printing backspace, space, backspace through ConOut->OutputString(), or by repositioning with ConOut->SetCursorPosition().”
  • Follow-up: “How do you handle backspace at the start of the line?” → Check buffer position > 0 before processing

Question 2: “Explain how you’d implement a command dispatch table in C for a bootloader shell.”

  • What they’re testing: Knowledge of function pointers and table-driven design
  • Strong answer: “I’d define a struct like struct cmd { const char *name; int (*handler)(int argc, char **argv); const char *help; } and create a static array: static struct cmd commands[] = { {"md", cmd_md, "Memory dump"}, ... }. When parsing input, I’d loop through the table comparing input against cmd->name using strcmp. On match, call cmd->handler(argc, argv). This is extensible—adding commands is just adding table entries.”
  • Follow-up: “Why not use a switch statement or if-else chain?” → Doesn’t scale, table is data-driven and easier to modify

Question 3: “How would you safely implement a memory dump command that doesn’t crash on invalid addresses?”

  • What they’re testing: Understanding of memory safety in bare-metal environments
  • Strong answer: “First, define valid ranges based on the memory map from BIOS INT 15h (AX=E820h) or UEFI GetMemoryMap(). For each request, check that the whole range (address plus length, without wrapping around) falls within usable RAM or a known readable region such as BIOS ROM or VGA memory. For write commands, additionally exclude the bootloader’s own code and data and the interrupt vector table at address 0. If the range is invalid, return an error instead of accessing it. For extra safety, you could install a fault handler in protected mode (a page fault handler once paging is enabled) to catch unexpected accesses.”
  • Follow-up: “What about MMIO regions?” → MMIO reads can have side effects (trigger device actions), so mark them as dangerous and warn user
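The range check described in that answer could look roughly like the sketch below. The region table is entirely made up for illustration (a real bootloader would build it from the firmware memory map), and is_valid_range with its for_write flag is a hypothetical variant of the is_valid_address() helper used elsewhere in this project.

```c
#include <stdint.h>
#include <stddef.h>

struct mem_region {
    uint32_t base;
    uint32_t size;
    int writable;
};

/* Assumed layout for illustration only; fill from INT 15h or GetMemoryMap(). */
static const struct mem_region region_table[] = {
    { 0x00000500u, 0x00007700u, 1 },  /* conventional RAM below the boot sector */
    { 0x00007C00u, 0x00000200u, 0 },  /* boot sector: readable, never writable */
    { 0x000F0000u, 0x00010000u, 0 },  /* BIOS ROM: read-only */
    { 0x00100000u, 0x00F00000u, 1 },  /* extended RAM (size assumed) */
};

/* Returns 1 if [addr, addr+len) lies inside one region that permits the access. */
int is_valid_range(uint32_t addr, uint32_t len, int for_write) {
    if (len == 0) return 0;
    if (addr > UINT32_MAX - (len - 1)) return 0;  /* addr + len - 1 would wrap */
    uint32_t last = addr + (len - 1);
    for (size_t i = 0; i < sizeof(region_table) / sizeof(region_table[0]); i++) {
        const struct mem_region *r = &region_table[i];
        uint32_t r_last = r->base + (r->size - 1);
        if (addr >= r->base && last <= r_last) {
            return for_write ? r->writable : 1;
        }
    }
    return 0;  /* not covered by any known region */
}
```

This version rejects a range that straddles two regions, which is stricter than necessary but keeps the check simple. Anything absent from the table, including MMIO, is refused by default.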

Question 4: “How would you implement environment variable storage without malloc or a filesystem?”

  • What they’re testing: Data structure design in constrained environments
  • Strong answer: “I’d allocate a static buffer like char env_storage[4096] and store variables as null-terminated ‘name=value’ strings consecutively. Maintain a pointer to the next free space. To get a variable, scan the buffer for strings starting with ‘name=’, parse the value after ‘=’. To set, check if it exists (update in place if same length, or mark old as deleted and append new). This is simple but wastes space. For better performance, use a fixed array of struct { char name[32]; char value[128]; bool used; } entries.”
  • Follow-up: “How do you persist variables across reboots?” → Write the buffer to a reserved disk sector, reload on startup
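The fixed-array alternative mentioned at the end of that answer might be sketched as follows. The slot count and the names env_table, env_get, and env_set are assumptions; string.h is used only so the sketch runs on a host, and a bootloader would substitute its own strlen, strcmp, and strcpy.

```c
#include <stddef.h>
#include <string.h>  /* host build only; replace with hand-written helpers */

#define ENV_SLOTS     16   /* assumed table size */
#define ENV_NAME_MAX  32
#define ENV_VALUE_MAX 128

struct env_entry {
    char name[ENV_NAME_MAX];
    char value[ENV_VALUE_MAX];
    int used;
};

static struct env_entry env_table[ENV_SLOTS];

/* Returns the stored value, or NULL if the name is not set. */
const char *env_get(const char *name) {
    for (size_t i = 0; i < ENV_SLOTS; i++) {
        if (env_table[i].used && strcmp(env_table[i].name, name) == 0)
            return env_table[i].value;
    }
    return NULL;
}

/* Returns 0 on success, -1 if a string is too long or the table is full. */
int env_set(const char *name, const char *value) {
    if (strlen(name) >= ENV_NAME_MAX || strlen(value) >= ENV_VALUE_MAX) return -1;
    struct env_entry *slot = NULL;
    for (size_t i = 0; i < ENV_SLOTS; i++) {
        if (env_table[i].used && strcmp(env_table[i].name, name) == 0) {
            slot = &env_table[i];      /* existing entry: update in place */
            break;
        }
        if (!env_table[i].used && slot == NULL) slot = &env_table[i];  /* first free slot */
    }
    if (slot == NULL) return -1;
    strcpy(slot->name, name);          /* lengths were checked above */
    strcpy(slot->value, value);
    slot->used = 1;
    return 0;
}
```

The trade-off is wasted space for short values in exchange for updates that never fragment the store, since every value has a full-size slot reserved.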

Question 5: “What challenges arise when implementing command-line parsing without the C standard library?”

  • What they’re testing: Practical systems programming skills
  • Strong answer: “Main challenges: (1) No strtok—must manually scan for delimiters, (2) No malloc—must use fixed-size buffers for argv, (3) No isspace—check against ‘ ‘, ‘\t’ manually, (4) Quoting support requires state machine (inside/outside quotes), (5) Variable expansion ($VAR) requires scanning and string substitution. I’d write minimal versions: a split_string(char *str, char **argv, int max_args) that modifies str in-place by replacing spaces with ‘\0’ and populating argv.”
  • Follow-up: “How do you handle escaped characters?” → Implement a simple state machine that tracks backslash escapes
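A small state machine covering both quoting and backslash escapes could be sketched like this. The function name split_quoted and its error convention are assumptions made for illustration; it rewrites the line in place, so argv entries point into the original buffer.

```c
#include <stddef.h>

/* Tokenizes line in place. Double quotes group words; a backslash makes the
   next character literal. Returns the token count, or -1 on an unterminated
   quote. Tokens beyond max_args are ignored. */
int split_quoted(char *line, char **argv, int max_args) {
    int argc = 0;
    char *src = line;
    while (argc < max_args) {
        while (*src == ' ' || *src == '\t') src++;   /* skip separators */
        if (*src == '\0') break;
        char *dst = src;              /* output cursor; never passes src */
        int in_quotes = 0;
        argv[argc++] = dst;
        while (*src != '\0') {
            if (*src == '\\' && src[1] != '\0') {
                src++;                /* drop the backslash, keep the next char */
                *dst++ = *src++;
            } else if (*src == '"') {
                in_quotes = !in_quotes;   /* toggle state, drop the quote */
                src++;
            } else if (!in_quotes && (*src == ' ' || *src == '\t')) {
                break;                /* unquoted separator ends the token */
            } else {
                *dst++ = *src++;
            }
        }
        if (in_quotes) return -1;
        if (*src != '\0') src++;      /* step past the separator before overwriting */
        *dst = '\0';
    }
    return argc;
}
```

With this sketch, the line load "my file.bin" 0x1000 yields three tokens, the second being my file.bin without the quotes.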

Question 6: “Describe how you’d implement a REPL (Read-Eval-Print Loop) in a bootloader context.”

  • What they’re testing: Software architecture understanding
  • Strong answer: “The main loop is: (1) Print prompt (‘boot> ‘), (2) Read line into buffer with editing support (ReadLine function), (3) Parse line into command + args (ParseCommand function), (4) Lookup command in dispatch table, (5) Execute handler, (6) Print result/error, (7) Repeat. Critical design choices: buffer overflow protection (stop at max length), error handling (invalid commands return to prompt, don’t crash), interrupt-key handling (Ctrl+C arrives as character 0x03 and resets to the prompt in advanced versions). This is essentially a mini-interpreter.”
  • Follow-up: “How does this compare to a full shell like bash?” → Bash has job control, pipes, redirection, scripting—we only need basic command execution

Hints in Layers

Hint 1: Starting Point - Build the Input System First

Don’t try to build everything at once—start with a working line editor. Create a read_line(char *buffer, int max_len) function that:

  • Calls BIOS INT 16h, AH=0 (or UEFI ConIn->ReadKeyStroke) in a loop
  • Handles printable ASCII (0x20-0x7E): echo character, add to buffer
  • Handles backspace (0x08): remove last character if buffer not empty
  • Handles Enter (0x0D): return the line
  • Stops accepting input when buffer is full (always leave space for null terminator)

Test this thoroughly before adding commands—type characters, backspace, test buffer limits. Once this works, you have the foundation for everything else. The line editor should feel responsive—if backspace is slow or glitchy, debug the terminal control codes.

Hint 2: Next Level - Implement Command Parsing

Create a parse_command(char *line, char **argv, int *argc) function that tokenizes the input. Simplest approach:

  • Scan the string for space characters
  • Replace each space with ‘\0’ to create separate strings
  • Store pointers to the start of each token in argv array
  • Set argc to the number of tokens found

Example: input “md 0x7C00 64” becomes argv[0]=“md”, argv[1]=“0x7C00”, argv[2]=“64”, argc=3.
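The steps above can be sketched as a single in-place pass. This is a minimal version under the assumption that only spaces and tabs separate tokens; split_args is a hypothetical name, and it returns the token count instead of writing through an argc pointer. Unlike a naive "replace every space" loop, it skips runs of whitespace so that repeated spaces do not create empty tokens.

```c
#include <stddef.h>

/* Splits line in place on spaces and tabs. Returns the number of tokens
   stored in argv (at most max_args). */
int split_args(char *line, char **argv, int max_args) {
    int argc = 0;
    char *p = line;
    while (*p != '\0' && argc < max_args) {
        while (*p == ' ' || *p == '\t') p++;        /* skip leading whitespace */
        if (*p == '\0') break;
        argv[argc++] = p;                            /* token starts here */
        while (*p != '\0' && *p != ' ' && *p != '\t') p++;
        if (*p != '\0') *p++ = '\0';                 /* terminate the token */
    }
    return argc;
}
```

Because argv points into the caller's line buffer, that buffer must stay alive and unmodified until the command handler returns.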

Then create your command table:

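// Handlers are defined elsewhere; declare them before the table uses them
int cmd_help(int argc, char **argv);
int cmd_memory_dump(int argc, char **argv);

struct command {
    const char *name;                        // Name typed at the prompt
    int (*handler)(int argc, char **argv);   // Returns 0 on success
    const char *help;                        // One-line description for "help"
};

struct command cmd_table[] = {
    {"help", cmd_help, "Show help"},
    {"md", cmd_memory_dump, "Memory dump"},
    {NULL, NULL, NULL}  // Terminator
};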

Your main loop: read line → parse → find command in table → call handler.

Hint 3: Technical Details - Implement Core Commands

For memory dump (md):

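int cmd_memory_dump(int argc, char **argv) {
    if (argc < 2) return error("Usage: md <addr> [len]");

    uint32_t addr = parse_hex(argv[1]);  // Write parse_hex() to convert "0x7C00"
    // Note: len is also parsed as hex, so "40" means 64 bytes
    uint32_t len = (argc >= 3) ? parse_hex(argv[2]) : 64;

    if (!is_valid_address(addr, len)) return error("Invalid address");

    const volatile uint8_t *ptr = (const volatile uint8_t *)(uintptr_t)addr;
    for (uint32_t i = 0; i < len; i += 16) {
        uint32_t n = (len - i < 16) ? len - i : 16;  // Bytes on this row
        printf("%08X: ", addr + i);  // Implement simple printf
        for (uint32_t j = 0; j < n; j++) printf("%02X ", ptr[i + j]);
        for (uint32_t j = n; j < 16; j++) printf("   ");  // Pad a short last row
        for (uint32_t j = 0; j < n; j++) {
            uint8_t b = ptr[i + j];
            printf("%c", (b >= 0x20 && b <= 0x7E) ? b : '.');  // '.' for non-printable
        }
        printf("\n");
    }
    return 0;
}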
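The parse_hex() helper that md relies on is left for you to write. One possible shape is sketched below under a different, hypothetical name (parse_hex32) because it reports failure through its return value instead of returning the number directly, which lets the caller tell a typo apart from address 0.

```c
#include <stdint.h>

/* Parses hex digits with an optional 0x/0X prefix. Returns 1 and stores the
   value on success; returns 0 on empty input, a bad digit, or too many digits. */
int parse_hex32(const char *s, uint32_t *out) {
    uint32_t value = 0;
    int digits = 0;
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) s += 2;
    for (; *s != '\0'; s++) {
        uint32_t d;
        if (*s >= '0' && *s <= '9')      d = (uint32_t)(*s - '0');
        else if (*s >= 'a' && *s <= 'f') d = (uint32_t)(*s - 'a' + 10);
        else if (*s >= 'A' && *s <= 'F') d = (uint32_t)(*s - 'A' + 10);
        else return 0;                 /* not a hex digit */
        if (++digits > 8) return 0;    /* reject anything longer than 8 digits */
        value = (value << 4) | d;
    }
    if (digits == 0) return 0;         /* "" or a bare "0x" */
    *out = value;
    return 1;
}
```

Limiting input to eight digits is a simplification: it also rejects values padded with extra leading zeros, which seems acceptable for a bootloader prompt.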

For environment variables:

char env_storage[4096];  // Fixed storage
char *env_ptr = env_storage;

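void set_env(const char *name, const char *value) {
    // Format: "name=value\0"; refuse the write if it would overrun env_storage
    size_t needed = strlen(name) + 1 + strlen(value) + 1;
    if (needed > (size_t)(env_storage + sizeof(env_storage) - env_ptr)) return;
    sprintf(env_ptr, "%s=%s", name, value);
    env_ptr += needed;
}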

const char* get_env(const char *name) {
    char *p = env_storage;
    while (p < env_ptr) {
        if (strncmp(p, name, strlen(name)) == 0 && p[strlen(name)] == '=') {
            return p + strlen(name) + 1;
        }
        p += strlen(p) + 1;
    }
    return NULL;
}

Hint 4: Tools and Debugging - Verify with Test Inputs

Debug your shell by testing these scenarios systematically:

  1. Input edge cases:
    • Empty input (just pressing Enter) → should return to prompt
    • Maximum length input → should stop accepting characters gracefully
    • Backspace at start of line → should do nothing
    • Backspace all characters → should allow typing again
  2. Command parsing edge cases:
    • Single word (“help”) → argc=1
    • Multiple spaces (“md    0x7C00”) → should handle extra spaces
    • No arguments when required → command should return error
  3. Memory safety:
    • md 0x0 → test NULL address handling
    • md 0xFFFFFFFF → test out-of-range address
    • mm 0x7C00 0xFF → verify memory write works (use caution!)
  4. Environment variables:
    • Set and immediately print → verify storage
    • Set same variable twice → should update, not duplicate
    • Reference undefined variable → should handle gracefully
  5. Integration:
    • Set bootfile, load using $bootfile, verify expansion works
    • Chain multiple commands (if you implement ; separator)

Use QEMU with -monitor stdio to inspect memory from outside the VM (for example, the monitor command xp /64xb 0x7c00 dumps 64 bytes of physical memory) and verify your commands show correct data. Add debug prints (can be removed later) to trace execution flow.

Books That Will Help

| Topic | Book | Specific Chapters/Sections |
|---|---|---|
| BIOS Interrupts | “The BIOS Companion” by Phil Croucher | Chapter 7: Console I/O Services (INT 10h, INT 16h) |
| String Processing | “Computer Systems: A Programmer’s Perspective” (3rd ed.) by Bryant & O’Hallaron | Chapter 3.9: String Operations and Representation |
| Function Pointers | “Expert C Programming: Deep C Secrets” by Peter van der Linden | Chapter 9: More About Function Pointers and Arrays |
| Memory Safety | “Effective C, 2nd Edition” by Robert C. Seacord | Chapter 6: Dynamically Allocated Memory, Chapter 7: Characters and Strings |
| Terminal Control | VT100 User Guide (DEC, freely available) | Section 4: Control Sequences, ANSI Escape Codes |
| Command-Line Parsing | “The UNIX Programming Environment” by Kernighan & Pike | Chapter 5: Shell Programming (design philosophy) |
| U-Boot Architecture | U-Boot source code (cmd/ directory) | Study cmd/mem.c, cmd/bootm.c, common/cli.c, common/command.c |
| REPL Design | “Structure and Interpretation of Computer Programs” by Abelson & Sussman | Introduction: The Elements of Programming (REPL concept) |
| Data Structures | “Data Structures and Algorithm Analysis in C” by Mark Allen Weiss | Chapter 5: Hashing (for environment variable lookup) |

Common Pitfalls & Debugging

Problem 1: Backspace Doesn’t Erase Character Visually

Symptom: When you press backspace, the cursor moves but the character remains on screen.

Root cause: You’re only moving the cursor back, not erasing. On most terminals, and with BIOS teletype output, the backspace character (0x08) only moves the cursor left, so erasing takes three operations: move back, write space, move back again.

Fix:

void handle_backspace(char *buffer, int *pos) {
    if (*pos == 0) return;  // Can't backspace at start

    (*pos)--;
    buffer[*pos] = '\0';

    // VT100 erase sequence: \b \b (back, space, back)
    putchar('\b');   // Move cursor back
    putchar(' ');    // Erase character with space
    putchar('\b');   // Move cursor back again
}

Quick test: Type “hello”, backspace 3 times, type “p”. You should see “hep”.


Problem 2: Commands Randomly Fail or Show Garbage Data

Symptom: md command sometimes works, sometimes shows random bytes or crashes.

Root cause: You’re not null-terminating your input buffer, so parsing reads past the actual input.

Fix:

int read_line(char *buffer, int max_len) {
    int pos = 0;
    for (;;) {  // Keep reading until Enter, even when the buffer is full
        char c = getchar();
        if (c == '\r') {
            buffer[pos] = '\0';  // CRITICAL: null terminate
            putchar('\n');
            return pos;
        }
        if (c == '\b') {
            if (pos > 0) {  // Nothing to erase at start of line
                pos--;
                putchar('\b'); putchar(' '); putchar('\b');
            }
        } else if (c >= 0x20 && c <= 0x7E && pos < max_len - 1) {
            buffer[pos++] = c;  // Leave space for \0; extra input is dropped
            putchar(c);
        }
    }
}

Quick test: Enable debug prints showing buffer contents as hex before parsing.


Problem 3: Environment Variables Don’t Update or Duplicate

Symptom: Setting bootfile twice creates two entries, or the old value persists.

Root cause: Your set_env() doesn’t check for existing variables before appending.

Fix:

void set_env(const char *name, const char *value) {
    size_t name_len = strlen(name);
    char *p = env_storage;

    // First, remove any existing entry so the name never appears twice
    while (p < env_ptr) {
        size_t entry_len = strlen(p) + 1;  // Include the '\0'
        if (strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            // Slide the following entries down over the old one
            memmove(p, p + entry_len, (size_t)(env_ptr - (p + entry_len)));
            env_ptr -= entry_len;
            break;
        }
        p += entry_len;
    }

    // Append the new entry only if "name=value\0" fits in the remaining space
    size_t needed = name_len + 1 + strlen(value) + 1;
    if (needed > (size_t)(env_storage + sizeof(env_storage) - env_ptr)) {
        return;  // Out of space (the old value, if any, is already gone)
    }
    sprintf(env_ptr, "%s=%s", name, value);
    env_ptr += needed;
}

Quick test: Run set x=1, set x=2, print—should only show x=2.


Problem 4: Shell Crashes When User Types Invalid Command

Symptom: Bootloader hangs or reboots when you type a command that doesn’t exist.

Root cause: Your command lookup doesn’t handle the “not found” case, so it falls through and calls a NULL function pointer.

Fix:

void execute_command(int argc, char **argv) {
    if (argc == 0) return;  // Empty command

    // Search command table
    for (int i = 0; cmd_table[i].name != NULL; i++) {
        if (strcmp(argv[0], cmd_table[i].name) == 0) {
            // Found—call handler
            int ret = cmd_table[i].handler(argc, argv);
            if (ret != 0) {
                printf("Command failed with code %d\n", ret);
            }
            return;
        }
    }

    // Not found
    printf("Unknown command: %s\n", argv[0]);
    printf("Type 'help' for available commands\n");
}

Quick test: Type “asdfasdf”—should show error, not crash.


Problem 5: Variable Expansion Causes Buffer Overflow

Symptom: Using $variable_with_long_value causes corruption or crashes.

Root cause: You’re expanding variables in place without checking if the result fits in the buffer.

Fix:

void expand_variables(char *input, char *output, int max_len) {
    int out_pos = 0;

    for (int i = 0; input[i] != '\0'; i++) {
        if (input[i] == '$') {
            // Extract variable name (until space or end)
            char var_name[64];
            int j = 0;
            i++;  // Skip $
            while (input[i] && input[i] != ' ' && j < 63) {
                var_name[j++] = input[i++];
            }
            var_name[j] = '\0';
            i--;  // Back up one (loop will increment)

            // Look up value
            const char *value = get_env(var_name);
            if (value) {
                // Copy value to output (with bounds check)
                while (*value && out_pos < max_len - 1) {
                    output[out_pos++] = *value++;
                }
            }
        } else {
            // Regular character
            if (out_pos < max_len - 1) {
                output[out_pos++] = input[i];
            }
        }
    }
    output[out_pos] = '\0';
}

Quick test: set long=AAAAAAAAAAAA..., then load $long—should handle gracefully.

Learning Milestones

Milestone 1: You can type, edit, and submit commands

Evidence: Typing “hello world”, backspacing to “hello wo”, typing “rld”, pressing Enter shows “hello world” in your debug output. The input buffer correctly reflects edits.

What this means: You’ve mastered low-level terminal I/O and state management. You can read keyboard input without an OS, handle control characters, and maintain an editable buffer. These skills transfer to writing any kind of interactive firmware.

Milestone 2: Commands execute from a dispatch table

Evidence: Typing “help” shows command list, “unknown” shows error message, “md 0x7C00” calls your memory dump handler. Adding a new command requires only adding an entry to the table.

What this means: You understand function pointers and table-driven design. This is the same pattern used in interrupt handlers, system call tables, and protocol handlers in network stacks, making it a fundamental systems programming pattern.

Milestone 3: Memory dump shows correct data at any address

Evidence: md 0x7C00 0x200 shows your boot sector, ending with the signature bytes 0x55 0xAA at offsets 510 and 511; md 0x400 shows BIOS data area contents; invalid addresses like md 0xFFFFFFFF show errors instead of crashing.

What this means: You can safely manipulate memory in a bare-metal environment. You understand address validation, pointer arithmetic, and hex formatting, all of which are critical for debugging hardware and writing drivers.

Milestone 4: Environment variables persist and expand correctly

Evidence: Running set bootfile=kernel.bin, print bootfile shows “kernel.bin”, load $bootfile attempts to load “kernel.bin” (substitution works). Setting the same variable twice updates instead of duplicating.

What this means: You’ve implemented a key-value store without malloc or a database. You understand string manipulation, variable lifetime, and state persistence, which are the skills needed for configuration management in embedded systems.

Milestone 5: You can boot a kernel from the shell

Evidence: Running load kernel.bin, boot 0x100000 successfully transfers control to your loaded kernel. The shell correctly parses addresses, loads files, and performs far jumps.

What this means: You’ve built a complete bootloader workflow from interactive input to kernel execution. You understand the full boot chain and can integrate multiple subsystems (UI, filesystem, loader, execution) into a cohesive tool, which is the hallmark of a systems architect.