Project 16: Bootloader with Interactive Shell

Build a U-Boot-style command-line environment that runs before any operating system, providing memory inspection, disk operations, environment variables, and extensible command infrastructure.

Quick Reference

Attribute               Value
Difficulty              ★★★★☆ Expert
Time Estimate           2-3 weeks
Language                C (with assembly startup)
Alternative Languages   Rust (no_std), Pure Assembly
Prerequisites           Projects 1-4, C string handling, basic terminal I/O
Key Topics              Command parsing, line editing, memory safety, REPL design, extensible architecture
Portfolio Value         Strong side project demonstrating systems programming depth

1. Learning Objectives

By completing this project, you will:

  1. Understand REPL Architecture: Learn how Read-Eval-Print Loops work at the lowest level, without any standard library support
  2. Master Terminal I/O in Bare Metal: Implement keyboard input and screen output using only BIOS interrupts or direct hardware access
  3. Build a Command Parser Without stdlib: Parse commands and arguments using only your own string manipulation code
  4. Implement Safe Memory Inspection: Create tools to safely examine and modify memory without crashing
  5. Design Extensible Systems: Create a modular command registration system that makes adding new commands trivial
  6. Handle State Management: Implement environment variables and persistent state in a constrained environment
  7. Apply Software Engineering in Constraints: Write clean, maintainable code with no heap, no OS, and minimal resources

2. Theoretical Foundation

2.1 Core Concepts

The REPL Pattern

Every interactive system follows the Read-Eval-Print Loop:

┌─────────────────────────────────────────────────────────────────┐
│                        REPL Architecture                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│    ┌─────────┐                                                  │
│    │  START  │                                                  │
│    └────┬────┘                                                  │
│         │                                                       │
│         ▼                                                       │
│    ┌─────────┐     User types                                  │
│    │  READ   │◄────command────┐                                │
│    │         │                 │                                │
│    └────┬────┘                 │                                │
│         │                      │                                │
│         │ Parse input          │                                │
│         ▼                      │                                │
│    ┌─────────┐                 │                                │
│    │ EVALUATE│                 │                                │
│    │         │                 │                                │
│    └────┬────┘                 │                                │
│         │                      │                                │
│         │ Execute command      │                                │
│         ▼                      │                                │
│    ┌─────────┐                 │                                │
│    │  PRINT  │─────────────────┘                                │
│    │         │                                                  │
│    └─────────┘     Show result, prompt again                    │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Line Editor Components

A proper line editor needs multiple components working together:

┌─────────────────────────────────────────────────────────────────┐
│                      Line Editor Components                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Input Buffer:                                                   │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ m │ d │   │ 0 │ x │ 7 │ C │ 0 │ 0 │ _ │ _ │ _ │ _ │ _ │ _ │ │
│  └────────────────────────────────────────────────────────────┘ │
│    0   1   2   3   4   5   6   7   8   ▲                        │
│                                        │                         │
│                              cursor_pos = 9                      │
│                                                                  │
│  State Variables:                                                │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ buffer[MAX_LINE_LENGTH]  - Character storage              │   │
│  │ buffer_length            - Current content length         │   │
│  │ cursor_pos               - Current cursor position        │   │
│  │ history[MAX_HISTORY]     - Previous commands              │   │
│  │ history_index            - Current history position       │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
│  Key Handlers (ASCII for 0x08/0x09/0x0D, scan codes otherwise):  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ Backspace (0x08) → Delete char before cursor              │   │
│  │ Delete    (0x53) → Delete char at cursor                  │   │
│  │ Left      (0x4B) → Move cursor left                       │   │
│  │ Right     (0x4D) → Move cursor right                      │   │
│  │ Up        (0x48) → Previous history entry                 │   │
│  │ Down      (0x50) → Next history entry                     │   │
│  │ Home      (0x47) → Move cursor to start                   │   │
│  │ End       (0x4F) → Move cursor to end                     │   │
│  │ Enter     (0x0D) → Submit command                         │   │
│  │ Tab       (0x09) → Command completion (optional)          │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
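The buffer and cursor state above can be exercised with two small operations: insert at the cursor and delete before the cursor. The following is a minimal sketch, not a definitive implementation; the `LineBuf` type and the function names `line_insert` and `line_backspace` are assumptions made for illustration, and screen redraw is deliberately left out.

```c
#include <stdint.h>

#define MAX_LINE_LENGTH 256

/* Hypothetical reduced editor state: buffer, length, cursor only. */
typedef struct {
    char     buffer[MAX_LINE_LENGTH]; /* always NUL-terminated */
    uint16_t length;                  /* characters currently stored */
    uint16_t cursor;                  /* 0..length */
} LineBuf;

/* Insert c at the cursor, shifting the tail right.
 * Returns 0 on success, -1 if the buffer is full
 * (one byte stays reserved for the terminator). */
int line_insert(LineBuf *lb, char c)
{
    if (lb->length >= MAX_LINE_LENGTH - 1)
        return -1;
    for (uint16_t i = lb->length; i > lb->cursor; i--)
        lb->buffer[i] = lb->buffer[i - 1];
    lb->buffer[lb->cursor++] = c;
    lb->buffer[++lb->length] = '\0';
    return 0;
}

/* Delete the character before the cursor (Backspace).
 * Returns 0 on success, -1 if the cursor is at the start. */
int line_backspace(LineBuf *lb)
{
    if (lb->cursor == 0)
        return -1;
    for (uint16_t i = lb->cursor; i < lb->length; i++)
        lb->buffer[i - 1] = lb->buffer[i];
    lb->cursor--;
    lb->buffer[--lb->length] = '\0';
    return 0;
}
```

Keeping the buffer NUL-terminated after every edit means the parser can consume it directly when Enter is pressed.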

Command Dispatch Architecture

Professional bootloaders use a table-driven command system:

┌─────────────────────────────────────────────────────────────────┐
│                    Command Dispatch System                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Command Table (Static Array):                                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ struct command {                                          │   │
│  │     const char *name;        // "md"                      │   │
│  │     const char *help;        // "Memory dump"             │   │
│  │     const char *usage;       // "md <addr> [len]"         │   │
│  │     int (*handler)(int argc, char *argv[]);               │   │
│  │ };                                                        │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ Index │ Name   │ Handler          │ Help                   │ │
│  ├───────┼────────┼──────────────────┼────────────────────────┤ │
│  │   0   │ help   │ cmd_help()       │ Show available...      │ │
│  │   1   │ md     │ cmd_memdump()    │ Memory dump            │ │
│  │   2   │ mm     │ cmd_memmod()     │ Memory modify          │ │
│  │   3   │ disk   │ cmd_disk()       │ Disk operations        │ │
│  │   4   │ load   │ cmd_load()       │ Load file to memory    │ │
│  │   5   │ set    │ cmd_set()        │ Set env variable       │ │
│  │   6   │ print  │ cmd_print()      │ Print env variables    │ │
│  │   7   │ boot   │ cmd_boot()       │ Boot kernel            │ │
│  │  ...  │  ...   │ ...              │ ...                    │ │
│  │   N   │ NULL   │ NULL             │ NULL (sentinel)        │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                                                  │
│  Dispatch Flow:                                                  │
│                                                                  │
│  Input: "md 0x7C00 64"                                          │
│         │                                                       │
│         ▼                                                       │
│  ┌─────────────┐                                                │
│  │   Tokenize  │  →  argv[0]="md", argv[1]="0x7C00", argv[2]="64"│
│  └──────┬──────┘     argc=3                                     │
│         │                                                       │
│         ▼                                                       │
│  ┌─────────────┐                                                │
│  │ Find in     │  →  for each cmd: strcmp(cmd.name, argv[0])==0 │
│  │ table       │                                                │
│  └──────┬──────┘                                                │
│         │                                                       │
│         ▼                                                       │
│  ┌─────────────┐                                                │
│  │ Call handler│  →  cmd.handler(argc, argv)                    │
│  └─────────────┘                                                │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
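As a rough illustration of the table-driven dispatch shown above, the sketch below walks a sentinel-terminated array and compares names with a hand-written equality helper, since no C library is available. The handler bodies are placeholders and `str_eq` is an assumed helper name; only the table shape comes from the diagram.

```c
#include <stddef.h>

typedef int (*cmd_handler_t)(int argc, char *argv[]);

struct command {
    const char   *name;
    const char   *help;
    cmd_handler_t handler;
};

/* Placeholder handlers; real ones would do the actual work. */
static int cmd_help(int argc, char *argv[])    { (void)argc; (void)argv; return 0; }
static int cmd_memdump(int argc, char *argv[]) { (void)argc; (void)argv; return 0; }

static const struct command commands[] = {
    { "help", "Show available commands", cmd_help    },
    { "md",   "Memory dump",             cmd_memdump },
    { NULL,   NULL,                      NULL        },  /* sentinel */
};

/* Freestanding stand-in for strcmp(a, b) == 0. */
static int str_eq(const char *a, const char *b)
{
    while (*a && *a == *b) { a++; b++; }
    return *a == *b;
}

/* Linear search; stops at the NULL-name sentinel. */
const struct command *find_command(const char *name)
{
    for (const struct command *c = commands; c->name != NULL; c++)
        if (str_eq(c->name, name))
            return c;
    return NULL;
}

/* Returns the handler's result, or -1 for an unknown or empty command. */
int dispatch_command(int argc, char *argv[])
{
    const struct command *c = (argc > 0) ? find_command(argv[0]) : NULL;
    return c ? c->handler(argc, argv) : -1;
}
```

With this layout, adding a command means writing one handler and adding one row above the sentinel.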

Memory Safety in Bootloaders

Unlike user-space programs, memory errors in bootloaders are catastrophic:

┌─────────────────────────────────────────────────────────────────┐
│                      Memory Safety Zones                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Real Mode Memory Map (1MB):                                     │
│                                                                  │
│  0x00000 ┌───────────────────────────────┐                      │
│          │ Interrupt Vector Table (IVT)  │ ← DO NOT MODIFY      │
│  0x00400 ├───────────────────────────────┤   (unless you know   │
│          │ BIOS Data Area (BDA)          │    what you're doing)│
│  0x00500 ├───────────────────────────────┤                      │
│          │ Free memory (usable)          │ ← SAFE for variables │
│          │                               │                      │
│  0x07C00 ├───────────────────────────────┤                      │
│          │ YOUR BOOTLOADER CODE          │ ← Don't overwrite!   │
│          │ (512-byte boot sector)        │                      │
│  0x07E00 ├───────────────────────────────┤                      │
│          │ Free memory (usable)          │ ← SAFE for data      │
│          │                               │                      │
│  0x80000 ├───────────────────────────────┤                      │
│          │ Extended BIOS Data Area       │ ← Variable location  │
│  0xA0000 ├───────────────────────────────┤                      │
│          │ Video Memory (VGA)            │ ← Write = screen     │
│  0xC0000 ├───────────────────────────────┤                      │
│          │ Video BIOS ROM                │ ← READ ONLY          │
│  0xF0000 ├───────────────────────────────┤                      │
│          │ System BIOS ROM               │ ← READ ONLY          │
│  0xFFFFF └───────────────────────────────┘                      │
│                                                                  │
│  Safety Rules for Memory Commands:                               │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ 1. Validate address range before access                   │   │
│  │ 2. Warn before writing to known critical regions          │   │
│  │ 3. Never dereference NULL (0x0000-0x00FF especially)      │   │
│  │ 4. ROM writes silently fail - not an error                │   │
│  │ 5. Video memory writes are visible - useful for testing   │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
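Rules 1 and 2 can be approximated by classifying an address against the map before touching it. This is a sketch under the assumption that the boundaries are exactly those drawn above; on real hardware the EBDA start varies, so a production version would query the BIOS rather than hard-code 0x80000. The enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    REGION_IVT, REGION_BDA, REGION_FREE_LOW, REGION_BOOT_SECTOR,
    REGION_FREE, REGION_EBDA, REGION_VIDEO, REGION_ROM,
    REGION_OUT_OF_RANGE
} mem_region_t;

/* Map a linear address onto the real-mode regions from the diagram. */
mem_region_t mem_classify(uint32_t addr)
{
    if (addr < 0x00400)  return REGION_IVT;
    if (addr < 0x00500)  return REGION_BDA;
    if (addr < 0x07C00)  return REGION_FREE_LOW;
    if (addr < 0x07E00)  return REGION_BOOT_SECTOR;
    if (addr < 0x80000)  return REGION_FREE;
    if (addr < 0xA0000)  return REGION_EBDA;
    if (addr < 0xC0000)  return REGION_VIDEO;
    if (addr <= 0xFFFFF) return REGION_ROM;
    return REGION_OUT_OF_RANGE;
}

/* Returns 1 if a write command should warn or ask for confirmation. */
int mem_write_needs_warning(uint32_t addr)
{
    switch (mem_classify(addr)) {
    case REGION_FREE_LOW:
    case REGION_FREE:
    case REGION_VIDEO:      /* visible but harmless */
        return 0;
    default:
        return 1;
    }
}
```

A command such as mm could call `mem_write_needs_warning()` first and print a warning instead of refusing outright, which keeps the tool useful for deliberate experiments.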

2.2 Why This Matters

Real-World Applications:

  1. U-Boot: The most widely used bootloader for embedded systems (routers, phones, IoT devices) provides exactly this interface. Understanding U-Boot’s shell helps you work with billions of devices.

  2. GRUB Command Line: When Linux fails to boot, the GRUB shell is your recovery tool. Knowing how it works helps you fix systems.

  3. Debugging Hardware: Memory inspection commands let you examine hardware registers, debug device drivers, and understand what firmware has configured.

  4. Firmware Development: Many firmware systems (UEFI Shell, coreboot) provide interactive environments for testing and configuration.

  5. Bootkit/Rootkit Analysis: Security researchers need to understand bootloader shells to analyze malware that persists at this level.

Career Impact:

  • Embedded systems engineers use U-Boot daily
  • Kernel developers debug boot issues through bootloader shells
  • Security researchers analyze firmware through similar interfaces
  • DevOps engineers troubleshoot boot failures using these tools

2.3 Historical Context

The concept of an interactive bootloader shell evolved from:

  1. Monitor Programs (1960s): Early computers had simple “monitor” programs that let operators examine memory and load programs
  2. ROM BASIC (1980s): IBM PCs could boot directly into BASIC if no disk was present
  3. LILO (1992): Early Linux bootloader with command-line options
  4. GRUB (1995): GNU Grand Unified Bootloader introduced a rich command-line interface
  5. U-Boot (2000): Universal Bootloader became the standard for embedded systems
  6. UEFI Shell (2005): Modern firmware provides a DOS-like shell environment

2.4 Common Misconceptions

Misconception 1: “A bootloader is just 512 bytes”

  • Reality: The 512-byte limit is only for the MBR. Stage 2 can be any size, and that’s where the shell lives.

Misconception 2: “You need an OS for a command line”

  • Reality: A REPL needs only keyboard input, screen output, and basic parsing. No OS required.

Misconception 3: “C requires a standard library”

  • Reality: You can write C with no stdlib. You just implement what you need (or nothing at all).

Misconception 4: “Memory access is always safe”

  • Reality: In a bootloader, bad memory access = instant reboot or freeze. No segfaults, no error messages.

Misconception 5: “This is obsolete with UEFI”

  • Reality: UEFI Shell exists and is widely used. The concepts transfer directly.

3. Project Specification

3.1 What You Will Build

A fully interactive bootloader shell that provides:

  1. Line Editor: Full-featured input with backspace, cursor movement, and command history
  2. Command Parser: Tokenizes input into command and arguments, handles quoting
  3. Memory Commands: md (dump), mm (modify), mf (find), mc (compare)
  4. Disk Commands: disk read, disk info, sector inspection
  5. File Commands: load (available only if you have filesystem support from Project 5)
  6. Environment System: set, print, unset, variable substitution
  7. Boot Command: Load and execute kernel with configurable address
  8. Extensible Architecture: Adding new commands requires only a table entry

3.2 Functional Requirements

ID    Requirement                                         Priority
FR1   Shell displays prompt and accepts keyboard input    Must Have
FR2   Backspace deletes previous character correctly      Must Have
FR3   Enter key executes current command                  Must Have
FR4   Unknown commands display error message              Must Have
FR5   help command lists all available commands           Must Have
FR6   md <addr> [len] displays hexdump of memory          Must Have
FR7   mm <addr> <val> modifies single byte in memory      Must Have
FR8   set <var>=<value> stores environment variable       Must Have
FR9   print [var] displays environment variables          Must Have
FR10  boot [addr] jumps to specified address              Must Have
FR11  Up/Down arrows navigate command history             Should Have
FR12  Left/Right arrows move cursor within line           Should Have
FR13  disk read <sector> [count] reads disk sectors       Should Have
FR14  Variable substitution in commands ($var)            Should Have
FR15  Tab completion for commands                         Nice to Have
FR16  mf <addr> <len> <pattern> finds pattern in memory   Nice to Have

3.3 Non-Functional Requirements

ID    Requirement                                  Target
NFR1  Total shell code fits in 32KB                < 32,768 bytes
NFR2  Command execution latency                    < 100ms for simple commands
NFR3  Maximum command line length                  256 characters
NFR4  Maximum environment variables                32 variables
NFR5  Command history depth                        16 entries
NFR6  Works in both Real Mode and Protected Mode   Cross-mode compatible

3.4 Example Usage / Output

Bootloader Shell v1.0
Type 'help' for available commands

boot> help
Available commands:
  help              - Display this help message
  md <addr> [len]   - Memory dump (default len=256)
  mm <addr> <val>   - Memory modify (byte)
  mw <addr> <val>   - Memory modify (word)
  md.l <addr> [len] - Memory dump (long/32-bit)
  mf <a> <l> <pat>  - Memory find pattern
  mc <a1> <a2> <l>  - Memory compare
  disk info         - Show disk information
  disk read <s> [n] - Read sector(s) to buffer
  load <filename>   - Load file to memory
  set <var>=<val>   - Set environment variable
  print [var]       - Print environment variable(s)
  unset <var>       - Remove environment variable
  boot [addr]       - Boot kernel at address (default: $loadaddr)
  reset             - Reset system

boot> md 0x7C00 64
00007C00: EB 3C 90 4D 53 44 4F 53  35 2E 30 00 02 01 01 00  |.<.MSDOS5.0.....|
00007C10: 02 E0 00 40 0B F0 09 00  12 00 02 00 00 00 00 00  |...@............|
00007C20: 00 00 00 00 00 00 29 12  34 56 78 4E 4F 20 4E 41  |......).4VxNO NA|
00007C30: 4D 45 20 20 20 20 46 41  54 31 32 20 20 20 8E D0  |ME    FAT12   ..|

boot> mm 0x500 0x41
Writing 0x41 to 0x00000500
Verify: 0x41 OK

boot> set loadaddr=0x100000
boot> set kernel=kernel.bin
boot> print
Environment:
  loadaddr=0x100000
  kernel=kernel.bin

boot> load $kernel
Loading kernel.bin to $loadaddr (0x100000)...
Read 131072 bytes (256 sectors)
Done.

boot> boot
Booting from 0x100000...

[Kernel takes over]
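The md output format in the session above (8-digit address, 16 hex bytes with a gap after the eighth, then an ASCII column) can be produced without printf. The sketch below formats one line into a caller-supplied buffer; `format_dump_line` is a hypothetical name, and printing the buffer is left to whatever console routine you have.

```c
#include <stdint.h>
#include <stddef.h>

static const char HEX[] = "0123456789ABCDEF";

/* Format up to 16 bytes as one dump line into out (at least 80 bytes).
 * Returns the number of characters written, excluding the terminator. */
size_t format_dump_line(char *out, uint32_t addr, const uint8_t *data, size_t n)
{
    char *p = out;
    if (n > 16)
        n = 16;

    for (int shift = 28; shift >= 0; shift -= 4)   /* 8-digit address */
        *p++ = HEX[(addr >> shift) & 0xF];
    *p++ = ':';

    for (size_t i = 0; i < 16; i++) {
        *p++ = ' ';
        if (i == 8)
            *p++ = ' ';                            /* gap after 8 bytes */
        if (i < n) {
            *p++ = HEX[data[i] >> 4];
            *p++ = HEX[data[i] & 0xF];
        } else {
            *p++ = ' ';                            /* pad short lines */
            *p++ = ' ';
        }
    }

    *p++ = ' ';
    *p++ = ' ';
    *p++ = '|';
    for (size_t i = 0; i < n; i++)                 /* printable ASCII or '.' */
        *p++ = (data[i] >= 0x20 && data[i] <= 0x7E) ? (char)data[i] : '.';
    *p++ = '|';
    *p = '\0';
    return (size_t)(p - out);
}
```

Each byte maps to exactly one character in the ASCII column, so a full line always shows 16 characters between the bars.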

3.5 Real World Outcome

When complete, you will have:

  1. A Working Bootloader Shell: An interactive environment that boots before any OS
  2. Debugging Capability: Tools to inspect memory, useful for kernel development
  3. Understanding of U-Boot: Direct experience with how the most popular embedded bootloader works
  4. Portable Components: String handling, parsing code usable in other bare-metal projects
  5. Portfolio Piece: Impressive demonstration of systems programming skills

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────────────────────────────────────────────────────────────┐
│                        Bootloader Shell Architecture                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                         Application Layer                            │    │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐  │    │
│  │  │ cmd_help │ │ cmd_md   │ │ cmd_mm   │ │ cmd_disk │ │ cmd_boot │  │    │
│  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘  │    │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐              │    │
│  │  │ cmd_set  │ │cmd_print │ │ cmd_load │ │cmd_reset │  ... more    │    │
│  │  └──────────┘ └──────────┘ └──────────┘ └──────────┘              │    │
│  └────────────────────────────────┬────────────────────────────────────┘    │
│                                   │                                          │
│                                   ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                         Shell Core Layer                             │    │
│  │  ┌────────────────┐  ┌────────────────┐  ┌────────────────────────┐ │    │
│  │  │  Line Editor   │  │ Command Parser │  │  Command Dispatcher    │ │    │
│  │  │                │  │                │  │                        │ │    │
│  │  │ - read_line()  │  │ - tokenize()   │  │ - find_command()       │ │    │
│  │  │ - handle_key() │  │ - parse_int()  │  │ - dispatch()           │ │    │
│  │  │ - history[]    │  │ - expand_vars()│  │ - commands[]           │ │    │
│  │  └────────────────┘  └────────────────┘  └────────────────────────┘ │    │
│  └────────────────────────────────┬────────────────────────────────────┘    │
│                                   │                                          │
│                                   ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                         Services Layer                               │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │    │
│  │  │ Environment  │  │   Memory     │  │    Disk/Filesystem       │   │    │
│  │  │              │  │              │  │                          │   │    │
│  │  │ - env[]      │  │ - mem_read() │  │ - disk_read_sectors()    │   │    │
│  │  │ - get_env()  │  │ - mem_write()│  │ - fs_read_file()         │   │    │
│  │  │ - set_env()  │  │ - mem_dump() │  │ - fs_list_dir()          │   │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────────┘   │    │
│  └────────────────────────────────┬────────────────────────────────────┘    │
│                                   │                                          │
│                                   ▼                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                      Hardware Abstraction Layer                      │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │    │
│  │  │   Console    │  │    Disk      │  │       Timer              │   │    │
│  │  │              │  │              │  │                          │   │    │
│  │  │ - getchar()  │  │ - int_13h()  │  │ - get_ticks()            │   │    │
│  │  │ - putchar()  │  │ - ata_read() │  │ - delay_ms()             │   │    │
│  │  │ - puts()     │  │              │  │                          │   │    │
│  │  │ - printf()   │  │              │  │                          │   │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

4.2 Key Components

Component 1: Console I/O (console.c)

Provides keyboard input and screen output:

// Console functions needed
int console_getchar(void);           // Blocking read, returns ASCII or extended key
void console_putchar(char c);        // Output single character
void console_puts(const char *s);    // Output string
void console_clear(void);            // Clear screen
void console_set_cursor(int x, int y); // Position cursor
int console_printf(const char *fmt, ...); // Formatted output
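One way for console_getchar() to return "ASCII or extended key" in a single int is to flag scan codes with a bit above the byte range. BIOS INT 16h, AH=00h returns the scan code in AH and the ASCII character in AL; for keys with no ASCII value AL is 0 (or 0xE0 when the enhanced-keyboard call AH=10h is used). The sketch below only decodes that 16-bit AX value; the `KEY_*` names and `decode_bios_key` are assumptions, and the actual interrupt call would live in assembly.

```c
#include <stdint.h>

#define KEY_EXTENDED 0x100                  /* flag: low byte is a scan code */
#define KEY_UP       (KEY_EXTENDED | 0x48)
#define KEY_DOWN     (KEY_EXTENDED | 0x50)
#define KEY_LEFT     (KEY_EXTENDED | 0x4B)
#define KEY_RIGHT    (KEY_EXTENDED | 0x4D)

/* ax is the AX register as returned by INT 16h:
 * high byte = scan code, low byte = ASCII (0 or 0xE0 for extended keys). */
int decode_bios_key(uint16_t ax)
{
    uint8_t ascii = (uint8_t)(ax & 0xFF);
    uint8_t scan  = (uint8_t)(ax >> 8);

    if (ascii == 0x00 || (ascii == 0xE0 && scan != 0))
        return KEY_EXTENDED | scan;
    return ascii;
}
```

The line editor can then switch on plain values such as 0x08 and 0x0D alongside KEY_LEFT or KEY_UP without ambiguity.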

Component 2: Line Editor (line.c)

Handles input editing:

// Line editor state
struct line_editor {
    char buffer[MAX_LINE_LENGTH];
    int length;
    int cursor;
    char history[MAX_HISTORY][MAX_LINE_LENGTH];
    int history_count;
    int history_pos;
};

// Line editor functions
int line_read(struct line_editor *ed, char *output); // Get line from user
void line_handle_key(struct line_editor *ed, int key);
void line_redraw(struct line_editor *ed);

Component 3: Command Parser (parser.c)

Tokenizes and processes commands:

// Parser functions
int parse_command(const char *input, int *argc, char **argv); // Split into tokens
unsigned long parse_number(const char *s, int *error);        // Parse hex/dec
int expand_variables(char *output, const char *input);        // $var expansion
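For parse_number(), a possible shape is shown below: decimal by default, hexadecimal with a 0x prefix, and the error flag set for empty input, stray characters, or overflow. This is a sketch of one reasonable behavior, not a specification; for example, it deliberately rejects negative numbers and octal.

```c
#include <limits.h>

/* Parse "1234" or "0x7C00". On success *error is 0 and the value is
 * returned; on failure *error is 1 and the return value is 0. */
unsigned long parse_number(const char *s, int *error)
{
    unsigned long value = 0;
    unsigned base = 10;
    int digits = 0;

    *error = 1;
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        base = 16;
        s += 2;
    }
    for (; *s; s++) {
        unsigned d;
        if (*s >= '0' && *s <= '9')
            d = (unsigned)(*s - '0');
        else if (base == 16 && *s >= 'a' && *s <= 'f')
            d = (unsigned)(*s - 'a' + 10);
        else if (base == 16 && *s >= 'A' && *s <= 'F')
            d = (unsigned)(*s - 'A' + 10);
        else
            return 0;                       /* invalid character */
        if (value > (ULONG_MAX - d) / base)
            return 0;                       /* would overflow */
        value = value * base + d;
        digits++;
    }
    if (digits == 0)
        return 0;                           /* "" or a bare "0x" */
    *error = 0;
    return value;
}
```

Reporting errors through a separate flag matters here because 0 is a perfectly valid address.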

Component 4: Command Dispatcher (dispatch.c)

Routes commands to handlers:

// Command structure
typedef int (*cmd_handler_t)(int argc, char **argv);

struct command {
    const char *name;
    const char *help;
    const char *usage;
    cmd_handler_t handler;
};

// Dispatcher functions
struct command *find_command(const char *name);
int dispatch_command(int argc, char **argv);
void register_command(struct command *cmd);  // For dynamic registration

Component 5: Environment (env.c)

Manages variables:

// Environment entry
struct env_entry {
    char name[32];
    char value[128];
    int used;
};

// Environment functions
const char *env_get(const char *name);
int env_set(const char *name, const char *value);
int env_unset(const char *name);
void env_print_all(void);
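With no heap, the environment can live in a fixed static table of slots. The sketch below implements env_get, env_set and env_unset over such a table; the helper names (`env_find`, `str_eq`, `str_len`, `str_copy`) are assumptions, and values that do not fit are rejected rather than truncated, which is one of several defensible choices.

```c
#include <stddef.h>

#define MAX_ENV_VARS  32
#define MAX_ENV_NAME  32
#define MAX_ENV_VALUE 128

struct env_entry {
    char name[MAX_ENV_NAME];
    char value[MAX_ENV_VALUE];
    int  used;
};

static struct env_entry env_table[MAX_ENV_VARS];   /* zeroed at startup */

static int str_eq(const char *a, const char *b)
{
    while (*a && *a == *b) { a++; b++; }
    return *a == *b;
}

static size_t str_len(const char *s)
{
    size_t n = 0;
    while (s[n]) n++;
    return n;
}

/* Caller has already checked that src fits in dst. */
static void str_copy(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0') { }
}

static struct env_entry *env_find(const char *name)
{
    for (int i = 0; i < MAX_ENV_VARS; i++)
        if (env_table[i].used && str_eq(env_table[i].name, name))
            return &env_table[i];
    return NULL;
}

const char *env_get(const char *name)
{
    struct env_entry *e = env_find(name);
    return e ? e->value : NULL;
}

/* Returns 0 on success, -1 if the name/value is too long or the table is full. */
int env_set(const char *name, const char *value)
{
    if (name[0] == '\0' || str_len(name) >= MAX_ENV_NAME ||
        str_len(value) >= MAX_ENV_VALUE)
        return -1;

    struct env_entry *e = env_find(name);          /* overwrite if present */
    for (int i = 0; e == NULL && i < MAX_ENV_VARS; i++)
        if (!env_table[i].used)
            e = &env_table[i];                     /* first free slot */
    if (e == NULL)
        return -1;

    str_copy(e->name, name);
    str_copy(e->value, value);
    e->used = 1;
    return 0;
}

int env_unset(const char *name)
{
    struct env_entry *e = env_find(name);
    if (e == NULL)
        return -1;
    e->used = 0;
    return 0;
}
```

At 32 slots a linear scan is cheap, and the whole table costs about 5 KB of static memory.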

4.3 Data Structures

// =============================================================
// Core Data Structures for Bootloader Shell
// =============================================================

#define MAX_LINE_LENGTH   256
#define MAX_HISTORY       16
#define MAX_ARGC          16
#define MAX_ENV_VARS      32
#define MAX_ENV_NAME      32
#define MAX_ENV_VALUE     128

// -------------------------------------------------------------
// Line Editor State
// -------------------------------------------------------------
typedef struct {
    char buffer[MAX_LINE_LENGTH];   // Current input buffer
    uint16_t length;                // Current content length
    uint16_t cursor;                // Cursor position (0 to length)

    // History ring buffer
    char history[MAX_HISTORY][MAX_LINE_LENGTH];
    uint8_t history_count;          // Number of entries in history
    uint8_t history_write;          // Next write position
    int8_t history_browse;          // Current browse position (-1 = current)
} LineEditor;

// -------------------------------------------------------------
// Parsed Command
// -------------------------------------------------------------
typedef struct {
    int argc;
    char *argv[MAX_ARGC];
    char token_buffer[MAX_LINE_LENGTH];  // Backing storage for tokens
} ParsedCommand;

// -------------------------------------------------------------
// Command Table Entry
// -------------------------------------------------------------
typedef int (*CommandHandler)(int argc, char **argv);

typedef struct {
    const char *name;               // Command name (e.g., "md")
    const char *help;               // Short help text
    const char *usage;              // Usage string (e.g., "md <addr> [len]")
    CommandHandler handler;         // Function pointer
    uint8_t min_args;               // Minimum required arguments
    uint8_t max_args;               // Maximum allowed arguments
} Command;

// -------------------------------------------------------------
// Environment Variable Entry
// -------------------------------------------------------------
typedef struct {
    char name[MAX_ENV_NAME];
    char value[MAX_ENV_VALUE];
    uint8_t used;                   // 1 if slot is occupied, 0 if free
} EnvEntry;

// -------------------------------------------------------------
// Global Shell State
// -------------------------------------------------------------
typedef struct {
    LineEditor line;
    EnvEntry env[MAX_ENV_VARS];
    uint32_t load_address;          // Default kernel load address
    uint8_t echo_enabled;           // Echo commands before execution
    uint8_t verbose;                // Verbose output mode
} ShellState;

// -------------------------------------------------------------
// Memory Operation Results
// -------------------------------------------------------------
typedef struct {
    uint32_t address;
    uint32_t length;
    uint8_t *data;
    int error;                      // 0 = success, negative = error code
} MemoryResult;

// -------------------------------------------------------------
// Disk Read Buffer
// -------------------------------------------------------------
#define SECTOR_SIZE 512

typedef struct {
    uint8_t data[SECTOR_SIZE * 16]; // Buffer for up to 16 sectors
    uint32_t start_sector;
    uint8_t sector_count;
    uint8_t valid;                  // 1 if buffer contains valid data
} DiskBuffer;
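The history ring buffer in LineEditor needs only two operations: append, overwriting the oldest entry when full, and look back N entries from the newest. The sketch below isolates that logic in a hypothetical `History` type; the field names differ slightly from the listing above, and browse-position tracking is left to the key handler.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_LINE_LENGTH 256
#define MAX_HISTORY     16

typedef struct {
    char    entries[MAX_HISTORY][MAX_LINE_LENGTH];
    uint8_t count;   /* valid entries, at most MAX_HISTORY */
    uint8_t write;   /* next slot to overwrite */
} History;

/* Append a line, truncating if needed; the oldest entry is
 * overwritten once the ring is full. */
void history_add(History *h, const char *line)
{
    char *dst = h->entries[h->write];
    size_t i = 0;

    while (line[i] != '\0' && i < MAX_LINE_LENGTH - 1) {
        dst[i] = line[i];
        i++;
    }
    dst[i] = '\0';

    h->write = (uint8_t)((h->write + 1) % MAX_HISTORY);
    if (h->count < MAX_HISTORY)
        h->count++;
}

/* back = 0 is the most recent entry, 1 the one before it, and so on.
 * Returns NULL when there is no such entry. */
const char *history_get(const History *h, uint8_t back)
{
    if (back >= h->count)
        return NULL;
    uint8_t idx = (uint8_t)((h->write + MAX_HISTORY - 1 - back) % MAX_HISTORY);
    return h->entries[idx];
}
```

Up-arrow then becomes "increment back and call history_get()", and Down-arrow the reverse, with NULL marking either end.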

4.4 Algorithm Overview

Main Shell Loop

FUNCTION shell_main():
    initialize_console()
    initialize_environment()
    show_banner()

    FOREVER:
        show_prompt()
        line = read_line()

        IF line is empty:
            CONTINUE

        add_to_history(line)
        expand_variables(line, expanded_line)
        parse_command(expanded_line, &argc, argv)

        IF argc == 0:
            CONTINUE

        cmd = find_command(argv[0])

        IF cmd == NULL:
            print_error("Unknown command: ", argv[0])
            CONTINUE

        result = cmd->handler(argc, argv)

        IF result != 0:
            print_error("Command failed with code ", result)
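The expand_variables() step in the loop above could look roughly like this. The sketch differs from the two-argument prototype given earlier in two assumed ways: it takes an explicit output size so it can refuse to overflow, and it takes the lookup as a function pointer so it can be tested on a host without the real environment table. Unknown variables expand to nothing, and names longer than 31 characters are silently truncated before lookup.

```c
#include <stddef.h>

typedef const char *(*env_lookup_t)(const char *name);

static int is_name_char(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_';
}

/* Copy in to out, replacing each $name with lookup(name).
 * Returns 0 on success, -1 if out (out_size bytes) is too small. */
int expand_variables(char *out, size_t out_size, const char *in,
                     env_lookup_t lookup)
{
    size_t o = 0;

    if (out_size == 0)
        return -1;

    while (*in) {
        if (*in == '$' && is_name_char(in[1])) {
            char name[32];
            size_t n = 0;

            in++;                                   /* skip '$' */
            while (is_name_char(*in)) {
                if (n < sizeof(name) - 1)
                    name[n++] = *in;
                in++;
            }
            name[n] = '\0';

            const char *val = lookup(name);
            for (; val && *val; val++) {
                if (o + 1 >= out_size)
                    return -1;
                out[o++] = *val;
            }
        } else {
            if (o + 1 >= out_size)
                return -1;
            out[o++] = *in++;                       /* ordinary character */
        }
    }
    out[o] = '\0';
    return 0;
}

/* Hypothetical lookup used only for demonstration: knows "kernel". */
static const char *demo_lookup(const char *name)
{
    const char *k = "kernel";
    while (*k && *k == *name) { k++; name++; }
    return (*k == *name) ? "kernel.bin" : NULL;
}
```

Expanding before tokenizing, as the loop does, means a variable whose value contains spaces turns into several arguments; expanding after tokenizing would keep it as one.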

Command Parsing Algorithm

FUNCTION parse_command(input, argc, argv):
    *argc = 0
    state = WHITESPACE
    token_start = NULL

    FOR each character c in input:
        SWITCH state:
            CASE WHITESPACE:
                IF c is whitespace:
                    CONTINUE
                ELSE IF c == '"':
                    state = QUOTED
                    token_start = current_position + 1
                ELSE:
                    state = TOKEN
                    token_start = current_position

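            CASE TOKEN:
                IF c is whitespace:
                    IF *argc >= MAX_ARGC:
                        RETURN ERROR_TOO_MANY_ARGS
                    terminate_token()
                    argv[(*argc)++] = token_start
                    state = WHITESPACE
                ELSE IF c == '"':
                    state = QUOTED

            CASE QUOTED:
                // Quote characters themselves must not end up in the token
                IF c == '"':
                    state = TOKEN
                ELSE IF c == '\\':
                    state = ESCAPE

            CASE ESCAPE:
                // Handle \n, \t, \\, \"
                state = QUOTED

    IF state == QUOTED OR state == ESCAPE:
        RETURN ERROR_UNTERMINATED_QUOTE

    IF state == TOKEN:
        IF *argc >= MAX_ARGC:
            RETURN ERROR_TOO_MANY_ARGS
        terminate_token()
        argv[(*argc)++] = token_start

    argv[*argc] = NULL
    RETURN SUCCESS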
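The state machine above can be collapsed into a compact in-place tokenizer. The following is a sketch with some simplifying assumptions: it modifies the input line directly instead of using a separate token buffer, it treats a backslash inside quotes as "keep the next character literally" rather than translating \n or \t, and the name `tokenize` and its return codes are illustrative.

```c
#include <stddef.h>

#define MAX_ARGC 16

/* Split line in place into argv[]. Double quotes group words and are
 * removed from the token. argv must have room for MAX_ARGC + 1 pointers.
 * Returns argc, -1 on an unterminated quote, -2 on too many arguments. */
int tokenize(char *line, char *argv[])
{
    int argc = 0;
    char *r = line;                       /* read position */

    for (;;) {
        while (*r == ' ' || *r == '\t')   /* skip leading whitespace */
            r++;
        if (*r == '\0')
            break;
        if (argc == MAX_ARGC)
            return -2;

        char *w = r;                      /* write position, never ahead of r */
        int in_quote = 0;
        argv[argc++] = w;

        while (*r != '\0' && (in_quote || (*r != ' ' && *r != '\t'))) {
            if (*r == '"') {
                in_quote = !in_quote;     /* drop the quote itself */
                r++;
            } else if (in_quote && *r == '\\' && r[1] != '\0') {
                *w++ = r[1];              /* escaped character, kept as-is */
                r += 2;
            } else {
                *w++ = *r++;
            }
        }
        if (in_quote)
            return -1;
        if (*r != '\0')
            r++;                          /* step past the delimiter first */
        *w = '\0';                        /* then terminate the token */
    }
    argv[argc] = NULL;
    return argc;
}
```

Because the write pointer never passes the read pointer, quote removal needs no second buffer; the cost is that the caller's line is destroyed, so history should be saved before tokenizing.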

5. Implementation Guide

5.1 Development Environment Setup

# Required tools
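brew install nasm qemu i686-elf-gcc i686-elf-binutils  # macOS: Apple's toolchain cannot emit ELF, so use a cross toolchain and point CC/LD/OBJCOPY at the i686-elf- tools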
# or
apt install nasm qemu-system-x86 gcc  # Ubuntu/Debian

# Create project directory
mkdir -p bootloader-shell/{src/commands,include,build,test,docs}
cd bootloader-shell

# Create Makefile
cat > Makefile << 'EOF'
CC = gcc
AS = nasm
LD = ld
OBJCOPY = objcopy

# Freestanding C flags (no standard library)
CFLAGS = -m32 -ffreestanding -fno-pie -fno-stack-protector \
         -nostdlib -nostdinc -O2 -Wall -Wextra \
         -I include
ASFLAGS = -f elf32
LDFLAGS = -m elf_i386 -T linker.ld --oformat binary

# Source files (including src/commands/)
C_SOURCES = $(wildcard src/*.c src/commands/*.c)
ASM_SOURCES = $(wildcard src/*.asm)
OBJECTS = $(C_SOURCES:src/%.c=build/%.o) \
          $(ASM_SOURCES:src/%.asm=build/%.o)

all: build/bootloader.bin

build/bootloader.bin: $(OBJECTS)
	$(LD) $(LDFLAGS) -o $@ $^

build/%.o: src/%.c
	@mkdir -p $(dir $@)
	$(CC) $(CFLAGS) -c $< -o $@

build/%.o: src/%.asm
	@mkdir -p $(dir $@)
	$(AS) $(ASFLAGS) $< -o $@

run: build/bootloader.bin
	qemu-system-i386 -drive format=raw,file=$<

debug: build/bootloader.bin
	qemu-system-i386 -drive format=raw,file=$< -s -S &
	gdb -ex "target remote localhost:1234" \
	    -ex "set architecture i8086"

clean:
	rm -rf build/*

.PHONY: all run debug clean
EOF

5.2 Project Structure

bootloader-shell/
├── Makefile
├── linker.ld                  # Linker script
├── include/
│   ├── types.h                # Basic types (uint8_t, etc.)
│   ├── console.h              # Console I/O functions
│   ├── string.h               # String manipulation
│   ├── memory.h               # Memory operations
│   ├── disk.h                 # Disk access
│   ├── env.h                  # Environment variables
│   ├── command.h              # Command structures
│   └── shell.h                # Main shell interface
├── src/
│   ├── boot.asm               # Stage 1: 512-byte MBR loader
│   ├── stage2.asm             # Stage 2: Assembly startup, call main
│   ├── main.c                 # Shell main loop
│   ├── console.c              # BIOS-based console I/O
│   ├── string.c               # String functions (no stdlib)
│   ├── memory.c               # Memory access functions
│   ├── disk.c                 # Disk read via INT 13h
│   ├── env.c                  # Environment variable storage
│   ├── parser.c               # Command line parsing
│   ├── line.c                 # Line editor with history
│   └── commands/
│       ├── cmd_help.c         # help command
│       ├── cmd_memory.c       # md, mm, mf, mc commands
│       ├── cmd_disk.c         # disk command
│       ├── cmd_env.c          # set, print, unset commands
│       ├── cmd_boot.c         # boot command
│       └── cmd_misc.c         # reset, echo, etc.
├── test/
│   └── test_parser.c          # Host-side unit tests
└── docs/
    └── commands.md            # Command reference

5.3 The Core Question You’re Answering

“How do you build an interactive system with no operating system support?”

This question encompasses:

  • How do you read keyboard input without an OS?
  • How do you display text without printf?
  • How do you manage memory without malloc?
  • How do you build reusable components without libraries?
  • How do you make a system extensible in constrained environments?

5.4 Concepts You Must Understand First

Before implementing, verify you can answer these questions:

  1. BIOS Interrupts: What is INT 16h and how do you use it to read keys?
    • Reference: “The Art of Assembly Language” Chapter 10
  2. Keyboard Scan Codes: What’s the difference between ASCII codes and scan codes?
    • Reference: OSDev Wiki - Keyboard
  3. C Without stdlib: How do you implement strlen(), strcmp(), memcpy()?
    • Reference: “Effective C” Chapter 2
  4. Function Pointers: How do you store and call functions dynamically?
    • Reference: “C Programming: A Modern Approach” Chapter 17
  5. Number Parsing: How do you convert “0x7C00” string to integer?
    • Reference: “The C Programming Language” Section 2.7
  6. Memory Layout: Where is it safe to store variables in real mode?
    • Reference: OSDev Wiki - Memory Map
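
As a warm-up for question 3, here is one possible shape for the three string routines. The `sh_` prefix is an assumption made so the sketch can be compiled on a host next to the real `<string.h>`. In the freestanding build, `size_t` would come from the project's own `types.h` rather than `<stddef.h>`, and the functions would carry their standard names, since GCC may emit calls to `memcpy` and `memset` even with `-ffreestanding`.

```c
#include <stddef.h>

/* Length of a NUL-terminated string, not counting the terminator. */
size_t sh_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Returns <0, 0, or >0, comparing bytes as unsigned char like the C library. */
int sh_strcmp(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}

/* Byte-wise copy; the regions must not overlap. */
void *sh_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dst;
}
```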

5.5 Questions to Guide Your Design

Console Design:

  • Will you use BIOS INT 10h/16h or direct VGA/keyboard access?
  • How will you handle extended keys (arrows, function keys)?
  • Will you support ANSI escape codes for cursor movement?

Line Editor Design:

  • What’s the maximum line length?
  • How many history entries will you store?
  • Will you support command completion?
  • How will you handle terminal resize?

Parser Design:

  • Will you support quoted strings with spaces?
  • Will you support escape sequences in strings?
  • How will you handle variable substitution?
  • What about command chaining (;) or pipes (|)?

Command System Design:

  • Static table or dynamic registration?
  • How will you handle subcommands (e.g., disk read vs disk info)?
  • Will commands return status codes?
  • How will you implement help for each command?

Memory Safety:

  • Will you validate addresses before access?
  • What regions will you protect?
  • How will you report access errors?
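
One way to answer the memory-safety questions is a small table of allowed regions that every memory command consults before touching an address. The sketch below is an assumption-laden example, not a required design: `MemRegion`, `sh_regions`, and `mem_check` are hypothetical names, the region boundaries follow the conventional real-mode memory map, and which regions count as writable is a policy choice you should adapt (for example, to protect the shell's own code and stack).

```c
#include <stdint.h>  /* on the target, use the project's types.h instead */

typedef struct {
    uint32_t start;
    uint32_t end;       /* inclusive */
    int      writable;  /* policy flag, not a hardware property */
} MemRegion;

/* Hypothetical policy table for a real-mode shell. */
static const MemRegion sh_regions[] = {
    { 0x00000000u, 0x000004FFu, 0 },  /* IVT + BIOS data area: read-only here */
    { 0x00000500u, 0x0007FFFFu, 1 },  /* conventional RAM */
    { 0x000A0000u, 0x000BFFFFu, 1 },  /* video memory */
    { 0x000C0000u, 0x000FFFFFu, 0 },  /* option ROMs and BIOS ROM */
};

/*
 * Returns 0 if [addr, addr + len) lies entirely inside one allowed region
 * (and that region is writable when want_write is set), otherwise -1.
 * Limitation: a range that straddles two adjacent regions is rejected.
 */
int mem_check(uint32_t addr, uint32_t len, int want_write)
{
    uint32_t last;
    unsigned i;

    if (len == 0)
        return -1;
    last = addr + (len - 1);
    if (last < addr)
        return -1;  /* address arithmetic wrapped around */

    for (i = 0; i < sizeof sh_regions / sizeof sh_regions[0]; i++) {
        const MemRegion *r = &sh_regions[i];

        if (addr >= r->start && last <= r->end)
            return (want_write && !r->writable) ? -1 : 0;
    }
    return -1;
}
```

A command such as md would call `mem_check(addr, len, 0)` and print an error instead of dereferencing the pointer when the result is -1; mm would pass 1 for `want_write`.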

5.6 Thinking Exercise

Before writing any code, complete this exercise:

Exercise: Trace a Command Execution

Given the command md 0x7C00 32, trace through every step:

  1. Input Phase: How does each character get from keyboard to buffer?
  2. Parsing Phase: How is the string split into tokens?
  3. Number Conversion: How is “0x7C00” converted to 31744?
  4. Dispatch Phase: How is cmd_memdump() found and called?
  5. Execution Phase: How are the 32 bytes read and formatted?
  6. Output Phase: How does the hexdump appear on screen?

Write pseudocode for each phase. This will reveal gaps in your understanding.

5.7 Hints in Layers

Hint 1: Getting Started (Conceptual Direction)

Start with the simplest possible shell:

  1. Print a prompt ("boot> ")
  2. Read characters until Enter
  3. Print what was typed
  4. Repeat

This "echo shell" proves your I/O works. Then add one feature at a time:

  • Backspace handling
  • Command lookup
  • One simple command (help)
  • Argument parsing
  • More commands

Don't try to build everything at once.

Hint 2: Console I/O Implementation (More Specific)

**BIOS-based console (Real Mode):**

```c
// Read character from keyboard (blocking)
int console_getchar(void) {
    int result;
    __asm__ volatile(
        "int $0x16"
        : "=a"(result)
        : "a"(0x0000)   // AH=0x00: Wait for keypress
        : "cc"
    );
    // AL = ASCII code (0 for extended keys)
    // AH = Scan code
    return result & 0xFFFF;   // Only AX is defined by the BIOS call
}

// Write character to screen
void console_putchar(char c) {
    __asm__ volatile(
        "int $0x10"
        :
        : "a"((0x0E << 8) | (unsigned char)c),  // AH=0x0E: Teletype output
          "b"(0x0007)                           // BH=page 0, BL=attribute
        : "cc"
    );
}
```

**Extended key handling:**

```c
#define KEY_UP        0x4800
#define KEY_DOWN      0x5000
#define KEY_LEFT      0x4B00
#define KEY_RIGHT     0x4D00
#define KEY_HOME      0x4700
#define KEY_END       0x4F00
#define KEY_DELETE    0x5300
#define KEY_BACKSPACE 0x08

int get_key(void) {
    int raw = console_getchar();
    unsigned char ascii = raw & 0xFF;
    unsigned char scan = (raw >> 8) & 0xFF;

    if (ascii != 0)
        return ascii;       // Regular ASCII character
    return (scan << 8);     // Extended key (scan code in high byte)
}
```

Hint 3: Line Editor Implementation (Technical Details)

**Basic line editor with backspace:**

```c
int line_read(char *buffer, int max_len) {
    int pos = 0;
    int key;

    while (1) {
        key = get_key();

        if (key == '\r' || key == '\n') {
            console_putchar('\n');
            buffer[pos] = '\0';
            return pos;
        } else if (key == KEY_BACKSPACE) {
            if (pos > 0) {
                pos--;
                // Move cursor back, print space, move back again
                console_puts("\b \b");
            }
        } else if (key >= 0x20 && key < 0x7F) {
            // Printable ASCII
            if (pos < max_len - 1) {
                buffer[pos++] = (char)key;
                console_putchar((char)key);
            }
        }
        // Ignore other keys for now
    }
}
```

**Full cursor movement requires redrawing the line:**

```c
void line_redraw(LineEditor *ed) {
    // Move to start of line
    for (int i = 0; i < ed->cursor; i++) {
        console_putchar('\b');
    }

    // Clear line (print spaces)
    for (int i = 0; i < ed->prev_length; i++) {
        console_putchar(' ');
    }

    // Move back to start
    for (int i = 0; i < ed->prev_length; i++) {
        console_putchar('\b');
    }

    // Print current buffer
    console_puts(ed->buffer);

    // Move cursor to correct position
    for (int i = ed->length; i > ed->cursor; i--) {
        console_putchar('\b');
    }

    ed->prev_length = ed->length;
}
```
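
For the history half of the line editor, a fixed-size ring buffer fits the no-heap constraint. The following is a sketch under stated assumptions: `History`, `hist_add`, and `hist_get` are hypothetical names, the sizes are kept small for illustration, and empty lines are deliberately not recorded.

```c
#include <stddef.h>

#define HIST_ENTRIES 4   /* small for illustration; pick your own maximum */
#define HIST_LINE    64  /* longer lines are truncated when stored */

typedef struct {
    char lines[HIST_ENTRIES][HIST_LINE];
    int  head;   /* slot the next entry will be written to */
    int  count;  /* number of valid entries, at most HIST_ENTRIES */
} History;

/* Append a line, overwriting the oldest entry once the ring is full. */
void hist_add(History *h, const char *line)
{
    char *slot;
    int i = 0;

    if (line[0] == '\0')
        return;  /* do not record empty lines */

    slot = h->lines[h->head];
    while (line[i] != '\0' && i < HIST_LINE - 1) {
        slot[i] = line[i];
        i++;
    }
    slot[i] = '\0';

    h->head = (h->head + 1) % HIST_ENTRIES;
    if (h->count < HIST_ENTRIES)
        h->count++;
}

/*
 * back = 0 is the most recent entry, back = 1 the one before it, and so on.
 * Returns NULL when there is no such entry.
 */
const char *hist_get(const History *h, int back)
{
    int idx;

    if (back < 0 || back >= h->count)
        return NULL;
    idx = (h->head - 1 - back + HIST_ENTRIES) % HIST_ENTRIES;
    return h->lines[idx];
}
```

The up-arrow handler would keep a `back` counter, call `hist_get`, copy the result into the edit buffer, and redraw. A NULL result means the user has reached the oldest entry.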

Hint 4: Command Parsing (Complete Parser)

```c
#define MAX_ARGC 16

// Simple tokenizer (doesn't handle quotes yet).
// argv must have room for MAX_ARGC entries, including the NULL terminator.
int parse_simple(char *input, int *argc, char *argv[]) {
    *argc = 0;

    while (*input && *argc < MAX_ARGC - 1) {
        // Skip whitespace
        while (*input == ' ' || *input == '\t') {
            input++;
        }
        if (*input == '\0') break;

        // Mark start of token
        argv[(*argc)++] = input;

        // Find end of token
        while (*input && *input != ' ' && *input != '\t') {
            input++;
        }

        // Null-terminate token
        if (*input) {
            *input++ = '\0';
        }
    }

    argv[*argc] = NULL;
    return *argc;
}

// Parse number with hex/dec/octal support
unsigned long parse_number(const char *s, int *error) {
    *error = 0;
    unsigned long result = 0;
    int base = 10;

    if (s[0] == '0') {
        if (s[1] == 'x' || s[1] == 'X') {
            base = 16;
            s += 2;
        } else if (s[1] >= '0' && s[1] <= '7') {
            base = 8;
            s++;
        }
    }

    // Reject "" and a bare "0x" with no digits
    if (*s == '\0') {
        *error = 1;
        return 0;
    }

    while (*s) {
        int digit;
        char c = *s++;

        if (c >= '0' && c <= '9') digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else {
            *error = 1;
            return 0;
        }

        if (digit >= base) {
            *error = 1;
            return 0;
        }

        result = result * base + digit;
    }

    return result;
}
```

Hint 5: Command Table and Dispatch (Implementation)

```c
// Command handler signatures
static int cmd_help(int argc, char **argv);
static int cmd_memdump(int argc, char **argv);
static int cmd_memmod(int argc, char **argv);
static int cmd_boot(int argc, char **argv);

// Command table (NULL-terminated)
static const Command commands[] = {
    {"help", "Show available commands", "help [command]",    cmd_help,    0, 1},
    {"md",   "Memory dump",             "md <addr> [len]",   cmd_memdump, 1, 2},
    {"mm",   "Memory modify",           "mm <addr> <value>", cmd_memmod,  2, 2},
    {"boot", "Boot kernel",             "boot [addr]",       cmd_boot,    0, 1},
    {NULL, NULL, NULL, NULL, 0, 0}  // Sentinel
};

// Find command by name
const Command *find_command(const char *name) {
    for (int i = 0; commands[i].name != NULL; i++) {
        if (strcmp(commands[i].name, name) == 0) {
            return &commands[i];
        }
    }
    return NULL;
}

// Execute a command
int execute(int argc, char **argv) {
    if (argc == 0) return 0;

    const Command *cmd = find_command(argv[0]);
    if (cmd == NULL) {
        console_printf("Unknown command: %s\n", argv[0]);
        console_printf("Type 'help' for available commands.\n");
        return -1;
    }

    int arg_count = argc - 1;  // Exclude command name
    if (arg_count < cmd->min_args) {
        console_printf("Too few arguments. Usage: %s\n", cmd->usage);
        return -1;
    }
    if (arg_count > cmd->max_args) {
        console_printf("Too many arguments. Usage: %s\n", cmd->usage);
        return -1;
    }

    return cmd->handler(argc, argv);
}
```

Hint 6: Memory Dump Implementation

```c
// Hexdump style output
// 00007C00: EB 3C 90 4D 53 44 4F 53  35 2E 30 00 02 01 01 00  |.<.MSDOS5.0.....|

static void print_hex_line(uint32_t addr, const uint8_t *data, int len) {
    // Print address
    console_printf("%08X: ", addr);

    // Print hex bytes (first 8)
    for (int i = 0; i < 8; i++) {
        if (i < len) {
            console_printf("%02X ", data[i]);
        } else {
            console_puts("   ");  // Pad missing byte: two hex digits + space
        }
    }
    console_putchar(' ');  // Extra space between groups

    // Print hex bytes (second 8)
    for (int i = 8; i < 16; i++) {
        if (i < len) {
            console_printf("%02X ", data[i]);
        } else {
            console_puts("   ");  // Keep the ASCII column aligned
        }
    }

    console_puts(" |");

    // Print ASCII representation
    for (int i = 0; i < 16 && i < len; i++) {
        char c = data[i];
        if (c >= 0x20 && c < 0x7F) {
            console_putchar(c);
        } else {
            console_putchar('.');
        }
    }

    console_puts("|\n");
}

static int cmd_memdump(int argc, char **argv) {
    int error;

    uint32_t addr = parse_number(argv[1], &error);
    if (error) {
        console_printf("Invalid address: %s\n", argv[1]);
        return -1;
    }

    uint32_t len = 256;  // Default
    if (argc >= 3) {
        len = parse_number(argv[2], &error);
        if (error) {
            console_printf("Invalid length: %s\n", argv[2]);
            return -1;
        }
    }

    // Print hexdump
    const uint8_t *ptr = (const uint8_t *)addr;
    while (len > 0) {
        int line_len = (len > 16) ? 16 : len;
        print_hex_line(addr, ptr, line_len);
        addr += line_len;
        ptr += line_len;
        len -= line_len;
    }

    return 0;
}
```
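
The hints above call `console_printf` with formats such as `%08X` and `%02X`, so the formatter needs at least zero-padded widths on top of the basic specifiers. The sketch below shows one way to structure it. It writes into a caller-supplied buffer so it can be tested on the host; the names `sh_snprintf`, `sh_put`, and `sh_put_uint` are hypothetical, and a real `console_printf` could use the same loop with `console_putchar` as the sink. It assumes `<stdarg.h>` is usable (it is a freestanding header, but `-nostdinc` may require `__builtin_va_list` instead).

```c
#include <stdarg.h>
#include <stddef.h>

/* Store c if it fits (one byte is kept for the terminator); always count it. */
static void sh_put(char *out, size_t cap, size_t *n, char c)
{
    if (*n + 1 < cap)
        out[*n] = c;
    (*n)++;
}

/* Emit v in the given base, left-padded with `pad` up to `width` digits. */
static void sh_put_uint(char *out, size_t cap, size_t *n, unsigned long v,
                        unsigned base, int upper, int width, char pad)
{
    const char *digits = upper ? "0123456789ABCDEF" : "0123456789abcdef";
    char tmp[sizeof(unsigned long) * 8];
    int len = 0;

    do {                                /* digits come out least significant first */
        tmp[len++] = digits[v % base];
        v /= base;
    } while (v != 0);
    while (width-- > len)
        sh_put(out, cap, n, pad);
    while (len > 0)
        sh_put(out, cap, n, tmp[--len]);
}

/* Supports %s %c %d %u %x %X %% with an optional width and 0 flag (e.g. %08X).
 * Returns the length the full output would have had, like snprintf. */
int sh_snprintf(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t n = 0;

    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        char pad = ' ';
        int width = 0;

        if (*fmt != '%') {
            sh_put(out, cap, &n, *fmt);
            continue;
        }
        fmt++;
        if (*fmt == '0') {
            pad = '0';
            fmt++;
        }
        while (*fmt >= '0' && *fmt <= '9')
            width = width * 10 + (*fmt++ - '0');

        switch (*fmt) {
        case 's': {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0')
                sh_put(out, cap, &n, *s++);
            break;
        }
        case 'c':
            sh_put(out, cap, &n, (char)va_arg(ap, int));
            break;
        case 'd': {
            int v = va_arg(ap, int);
            unsigned long u = (unsigned long)v;
            if (v < 0) {                /* width applies to the digits only */
                sh_put(out, cap, &n, '-');
                u = 0UL - u;
            }
            sh_put_uint(out, cap, &n, u, 10, 0, width, pad);
            break;
        }
        case 'u':
            sh_put_uint(out, cap, &n, va_arg(ap, unsigned), 10, 0, width, pad);
            break;
        case 'x':
            sh_put_uint(out, cap, &n, va_arg(ap, unsigned), 16, 0, width, pad);
            break;
        case 'X':
            sh_put_uint(out, cap, &n, va_arg(ap, unsigned), 16, 1, width, pad);
            break;
        case '%':
            sh_put(out, cap, &n, '%');
            break;
        case '\0':                      /* stray '%' at end of format: stop cleanly */
            fmt--;
            break;
        default:                        /* unknown specifier: print it verbatim */
            sh_put(out, cap, &n, '%');
            sh_put(out, cap, &n, *fmt);
            break;
        }
    }
    va_end(ap);

    if (cap > 0)
        out[n < cap ? n : cap - 1] = '\0';
    return (int)n;
}
```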
### 5.8 The Interview Questions They'll Ask

Be prepared to answer these questions in technical interviews:

1. **"How would you implement a command-line parser without using strtok()?"**
   - Explain tokenization, handling whitespace, managing null terminators
2. **"What are the challenges of writing C without a standard library?"**
   - No heap, no printf, no string functions, must implement everything
3. **"How do you handle keyboard input in a bootloader?"**
   - BIOS INT 16h in real mode, or direct keyboard controller access
4. **"What's the difference between scan codes and ASCII codes?"**
   - Scan codes are hardware-level, ASCII is the character representation
5. **"How would you make a command system extensible?"**
   - Function pointer tables, registration mechanism, consistent interfaces
6. **"What memory regions are safe to use in real mode?"**
   - Explain the memory map, IVT, BDA, EBDA, video memory
7. **"How do you implement printf without the standard library?"**
   - Variable arguments (va_list), format parsing, number conversion
8. **"What's the Ring 0/Ring 3 model and why does it matter?"**
   - Privilege levels, kernel vs user space, why bootloaders run at Ring 0

### 5.9 Books That Will Help

| Book | Chapter | Topics | Why It Helps |
|------|---------|--------|--------------|
| "The UNIX Programming Environment" (Kernighan & Pike) | Ch 5: Shell Programming | Command parsing, shell design | The definitive guide to shell architecture |
| "The C Programming Language" (K&R) | Ch 5: Pointers & Arrays, Ch 7: I/O | String handling, varargs | Core C skills needed |
| "Effective C, 2nd Edition" (Seacord) | Ch 6: Memory Management | Safe memory access | Writing robust code |
| "Low-Level Programming" (Zhirkov) | Ch 3: Assembly, Ch 8: OS | Bare-metal programming | Direct bootloader guidance |
| "The Art of Assembly Language" (Hyde) | Ch 10: BIOS | BIOS interrupts | Keyboard and display |
| "Computer Systems: A Programmer's Perspective" (Bryant) | Ch 1: Tour, Ch 7: Linking | System fundamentals | Understanding the environment |
| "Structure and Interpretation of Computer Programs" (Abelson) | Intro, Ch 4: Metalinguistic | REPL design, interpreters | Conceptual understanding of interpreters |

### 5.10 Implementation Phases

#### Phase 1: Minimal Echo Shell (Days 1-3)

- [ ] Set up build system (Makefile, linker script)
- [ ] Implement console_putchar() and console_getchar()
- [ ] Create console_puts() and basic console_printf()
- [ ] Build simple echo shell (read line, print it back)
- [ ] Add backspace handling
- [ ] Test in QEMU

**Milestone**: Type commands, see them echoed with proper editing

#### Phase 2: Command Infrastructure (Days 4-6)

- [ ] Implement string functions (strlen, strcmp, strcpy)
- [ ] Build simple tokenizer
- [ ] Create command table structure
- [ ] Implement command lookup
- [ ] Add "help" command
- [ ] Add error handling for unknown commands

**Milestone**: `help` command works, unknown commands show error

#### Phase 3: Core Commands (Days 7-10)

- [ ] Implement number parsing (hex, decimal, octal)
- [ ] Add `md` (memory dump) command
- [ ] Add `mm` (memory modify) command
- [ ] Add `boot` command (jump to address)
- [ ] Add address validation
- [ ] Test memory access to various regions

**Milestone**: Can inspect MBR at 0x7C00, modify memory, boot kernel

#### Phase 4: Environment System (Days 11-13)

- [ ] Create environment storage structure
- [ ] Implement env_get(), env_set(), env_unset()
- [ ] Add `set`, `print`, `unset` commands
- [ ] Implement variable expansion ($var)
- [ ] Add default variables (loadaddr, etc.)

**Milestone**: Can set variables, use them in commands

#### Phase 5: Advanced Features (Days 14-17)

- [ ] Add command history (up/down arrows)
- [ ] Add cursor movement (left/right)
- [ ] Implement home/end keys
- [ ] Add `disk read` command (requires INT 13h)
- [ ] Add `load` command (requires filesystem from P5)
- [ ] Polish and bug fixes

**Milestone**: Full line editor, disk access working

#### Phase 6: Testing and Documentation (Days 18-21)

- [ ] Write command reference documentation
- [ ] Test edge cases (long commands, invalid input)
- [ ] Test on different QEMU configurations
- [ ] Test with real hardware (if available)
- [ ] Code cleanup and comments
- [ ] Create demo video/screenshots

**Milestone**: Polished, documented, portfolio-ready

### 5.11 Key Implementation Decisions

#### Decision 1: Real Mode vs Protected Mode

**Options:**

1. Stay in Real Mode (use BIOS interrupts)
2. Switch to Protected Mode (more memory, no BIOS)
3. Start in Real Mode, switch when needed

**Recommendation:** Start in Real Mode for simplicity. BIOS INT 10h/13h/16h provide easy console and disk access. You can always add protected mode later.

**Trade-offs:**

- Real Mode: 1MB limit, BIOS services available, simpler
- Protected Mode: 4GB addressing, must write own drivers, more complex

#### Decision 2: Static vs Dynamic Command Registration

**Options:**

1. Static array (compile-time)
2. Dynamic registration (runtime)

**Recommendation:** Start with static array. It's simpler and uses no heap.

```c
// Static (recommended for bootloader)
static const Command commands[] = {
    {"help", ..., cmd_help},
    {"md", ..., cmd_md},
    {NULL, NULL, NULL}
};

// Dynamic (requires memory allocation)
void register_command(Command *cmd) {
    commands[command_count++] = *cmd;
}
```

#### Decision 3: Memory Allocation Strategy

**Options:**

1. No dynamic allocation (all static)
2. Simple bump allocator
3. Free list allocator

**Recommendation:** All static allocation. Define maximum sizes for everything:

```c
#define MAX_LINE 256
#define MAX_HISTORY 16
#define MAX_ENV_VARS 32

static char line_buffer[MAX_LINE];
static char history[MAX_HISTORY][MAX_LINE];
static EnvEntry env_table[MAX_ENV_VARS];
```

#### Decision 4: Printf Implementation

**Options:**

1. Minimal (only %s, %d, %x)
2. Moderate (add %c, %u, %p, width/padding)
3. Full (floating point, all flags)

**Recommendation:** Minimal for bootloader. You need:

- %s - strings
- %d - signed decimal
- %u - unsigned decimal
- %x - hexadecimal
- %c - character
- %p - pointer (same as %x with 0x prefix)

The hexdump in Hint 6 also relies on zero-padded widths (`%08X`, `%02X`), so support width and zero padding at least for the hex specifiers.

---

## 6. Testing Strategy

### Unit Tests (Run on Host)

Extract platform-independent code and test on your development machine:

```c
// test/test_parser.c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include "../src/parser.c"  // Include implementation

void test_parse_simple() {
    char input[] = "md 0x7C00 64";
    int argc;
    char *argv[16];

    parse_simple(input, &argc, argv);

    assert(argc == 3);
    assert(strcmp(argv[0], "md") == 0);
    assert(strcmp(argv[1], "0x7C00") == 0);
    assert(strcmp(argv[2], "64") == 0);
    printf("test_parse_simple: PASSED\n");
}

void test_parse_number_hex() {
    int error;
    unsigned long result = parse_number("0x7C00", &error);

    assert(error == 0);
    assert(result == 0x7C00);
    printf("test_parse_number_hex: PASSED\n");
}

int main() {
    test_parse_simple();
    test_parse_number_hex();
    // ... more tests
    printf("All tests passed!\n");
    return 0;
}
```

Compile and run:

```bash
gcc -o test_parser test/test_parser.c && ./test_parser
```

### Integration Tests (Run in QEMU)

Create test scripts that interact with the shell:

```bash
#!/bin/bash
# test/integration_test.sh
# Start QEMU with the serial port on stdio.
# Assumes the shell also reads commands from the serial port (COM1);
# BIOS INT 16h keyboard input cannot be fed from stdin.
# -display none is used instead of -nographic, which would claim stdio itself.
qemu-system-i386 -drive format=raw,file=build/bootloader.bin \
    -display none -serial stdio <<EOF
help
md 0x7C00 16
set testvar=hello
print testvar
boot
EOF
```

### Manual Testing Checklist

| Test Case | Expected Result | Pass? |
|-----------|-----------------|-------|
| Type "help" | Shows all commands | [ ] |
| Press Backspace | Deletes character | [ ] |
| Type unknown command | Shows error | [ ] |
| `md 0x7C00` | Shows MBR bytes | [ ] |
| `md 0x7C00 16` | Shows 16 bytes | [ ] |
| `md invalid` | Shows parse error | [ ] |
| `mm 0x500 0x41` | Writes and verifies | [ ] |
| `set foo=bar` | Stores variable | [ ] |
| `print foo` | Shows "bar" | [ ] |
| `print` | Shows all variables | [ ] |
| Up arrow | Shows previous command | [ ] |
| Down arrow | Shows next command | [ ] |

---

## 7. Common Pitfalls & Debugging

| Problem | Symptom | Root Cause | Solution |
|---------|---------|------------|----------|
| Keyboard doesn't work | No response to typing | Wrong INT call or port | Verify INT 16h AH=0, check scan code handling |
| Backspace shows ^H | Character not deleted | Missing terminal handling | Print "\b \b" sequence |
| Commands not found | Always "unknown command" | strcmp() bug | Check null termination, case sensitivity |
| Crash on md command | System resets | Invalid memory access | Add address validation |
| Garbled output | Random characters | Wrong video mode or buffer | Check INT 10h setup, verify putchar |
| History corrupts input | Wrong text appears | Buffer overflow | Check history ring buffer bounds |
| Variables don't work | $var not expanded | Expansion not called | Verify variable expansion runs before parse |
| Boot command hangs | System freezes | Wrong jump address or code | Verify target has valid code, check segment |

### Debugging Techniques

**1. Serial Output**

Add serial port output for debugging when screen output fails:

```c
void serial_init(void) {
    outb(0x3F8 + 1, 0x00);  // Disable interrupts
    outb(0x3F8 + 3, 0x80);  // Enable DLAB
    outb(0x3F8 + 0, 0x03);  // 38400 baud (low byte)
    outb(0x3F8 + 1, 0x00);  //            (high byte)
    outb(0x3F8 + 3, 0x03);  // 8 bits, no parity, one stop
    outb(0x3F8 + 2, 0xC7);  // Enable FIFO
}

void serial_putchar(char c) {
    while ((inb(0x3F8 + 5) & 0x20) == 0);
    outb(0x3F8, c);
}
```

Run QEMU with: `qemu-system-i386 -serial stdio ...`

**2. GDB Debugging**

```bash
# Terminal 1: Start QEMU with GDB stub
qemu-system-i386 -drive format=raw,file=bootloader.bin -s -S

# Terminal 2: Connect GDB
gdb
(gdb) target remote localhost:1234
(gdb) set architecture i8086
(gdb) break *0x7C00
(gdb) continue
```

**3. Memory Dump at Boot**

Add early memory dump to verify loading:

```c
void early_debug(void) {
    console_puts("Boot OK. MBR at 0x7C00:\n");
    uint8_t *ptr = (uint8_t *)0x7C00;
    for (int i = 0; i < 32; i++) {
        print_hex_byte(ptr[i]);
        console_putchar(' ');
    }
    console_putchar('\n');
}
```

---

## 8. Extensions & Challenges

After completing the basic shell, try these extensions:

### Extension 1: Tab Completion

Implement command and filename completion:

```
boot> he<TAB>
boot> help
```

### Extension 2: Command Aliases

Allow defining shortcuts:

```
boot> alias ll="md 0x7C00 512"
boot> ll
[memory dump output]
```

### Extension 3: Scripting Support

Run commands from a file:

```
boot> source boot.scr
# Executes: set kernel=myos.bin
#           load $kernel
#           boot
```

### Extension 4: Network Commands

If you completed the PXE project, add:

```
boot> tftp 192.168.1.1 kernel.bin
Downloading kernel.bin... done
boot> boot
```

### Extension 5: Memory Test

Implement a basic memory tester:

```
boot> memtest 0x100000 0x200000
Testing 1MB at 0x100000...
Pattern 0x00: OK
Pattern 0xFF: OK
Pattern 0xAA: OK
Pattern 0x55: OK
Memory test passed!
```

### Extension 6: Protected Mode Shell

Extend the shell to work in protected mode with:

- Paging support
- Larger address space
- Own keyboard driver (no BIOS)

---

## 9. Real-World Connections

### U-Boot Commands You'll Recognize

After completing this project, you'll understand U-Boot commands:

```
U-Boot> help                        # Same as your help
U-Boot> md 0x80000000 100           # Memory dump (same as your md)
U-Boot> mm 0x80000000               # Interactive memory modify
U-Boot> mw 0x80000000 0xdeadbeef    # Memory write word
U-Boot> cp.b src dst len            # Memory copy
U-Boot> setenv foo bar              # Same as your set
U-Boot> printenv                    # Same as your print
U-Boot> boot                        # Boot kernel
U-Boot> reset                       # System reset
```

### GRUB Command Line

The GRUB rescue mode uses similar concepts:

```
grub> ls                  # List devices
grub> set root=(hd0,1)    # Set root device
grub> linux /vmlinuz      # Load kernel
grub> boot                # Boot
```

### UEFI Shell

The UEFI Shell (available in most UEFI firmware) provides:

```
Shell> help
Shell> mem 0x80000000 -b    # Memory dump
Shell> dblk fs0: 0 1        # Dump disk block
Shell> set myvar value      # Environment variable
Shell> fs0:                 # Change to filesystem
Shell> kernel.efi           # Execute application
```

---

## 10. Resources

### Primary References

- [OSDev Wiki - Text Mode Cursor](https://wiki.osdev.org/Text_Mode_Cursor)
- [OSDev Wiki - Keyboard](https://wiki.osdev.org/Keyboard)
- [U-Boot Documentation](https://u-boot.readthedocs.io/)
- [GRUB Manual](https://www.gnu.org/software/grub/manual/)

### Source Code to Study

- [U-Boot cmd/ directory](https://source.denx.de/u-boot/u-boot/-/tree/master/cmd)
- [BareMetalLib (simple bootloader lib)](https://github.com/ReturnInfinity/BareMetal-OS)
- [toaruos bootloader](https://github.com/klange/toaruos/tree/master/boot)

### Related Projects in This Series

- **Project 4**: Two-Stage Bootloader (provides the stage2 foundation)
- **Project 5**: FAT12 Filesystem (enables `load` command)
- **Project 8**: UEFI ELF Loader (UEFI version of shell)
- **Project 14**: Graphics Boot (add visual elements to shell)

---

## 11. Self-Assessment Checklist

Before considering this project complete, verify:

### Knowledge Check

- [ ] I can explain how BIOS INT 16h reads keyboard input
- [ ] I can describe the memory layout available in real mode
- [ ] I understand how function pointers enable extensible command systems
- [ ] I can explain tokenization without strtok()
- [ ] I know why hexdump format is useful for debugging

### Implementation Check

- [ ] My shell correctly handles backspace and enter
- [ ] Commands are dispatched through a table, not if/else chains
- [ ] Memory dump works safely on valid and invalid addresses
- [ ] Environment variables are stored and retrieved correctly
- [ ] The boot command successfully jumps to loaded code

### Code Quality Check

- [ ] Code compiles with -Wall -Wextra without warnings
- [ ] Functions are documented with comments
- [ ] Magic numbers are replaced with named constants
- [ ] Error messages are helpful and specific
- [ ] Code could be extended with new commands easily

### Testing Check

- [ ] Tested in QEMU with various inputs
- [ ] Edge cases handled (empty input, long lines, etc.)
- [ ] Memory access tested on critical regions
- [ ] History works correctly at boundaries

---

## 12. Submission / Completion Criteria

Your bootloader shell project is complete when:

### Minimum Viable Product

1. Shell boots and displays prompt
2. Basic line editing (backspace, enter)
3. At least 5 working commands (help, md, mm, set, boot)
4. Error handling for invalid input
5. Runs in QEMU without crashes

### Full Implementation

1. All functional requirements (FR1-FR14) implemented
2. Command history with up/down arrows
3. Cursor movement with left/right arrows
4. Environment variable substitution
5. Disk read command
6. Clean, documented code

### Portfolio Ready

1. README.md with usage instructions
2. Demo GIF or video
3. Architecture documentation
4. Comments explaining design decisions
5. Git history showing incremental development

### Beyond Expectations

1. Tab completion
2. Works in protected mode
3. Filesystem commands
4. Tested on real hardware
5. Published as open source with CI

---

## Conclusion

Building a bootloader shell transforms you from someone who uses command-line tools to someone who understands how they work at the deepest level. You'll gain:

1. **Systems Mastery**: Understanding how software interacts with bare hardware
2. **Debugging Skills**: Ability to inspect and modify running systems
3. **Software Design**: Experience building extensible, maintainable systems in constraints
4. **Interview Confidence**: Deep answers to systems programming questions

This project is where you stop being intimidated by low-level code and start being the person others ask for help.

---

*"The shell is the first program that runs, the last program that helps you recover, and the interface through which you truly control the machine."*

---

**Next Steps:**

- After completing this project, proceed to **Project 17: Complete Bootloader**
- The shell you build here becomes the interactive interface for your full bootloader
- Consider adding UEFI shell support for modern systems