Project 16: Bootloader with Interactive Shell
Build a U-Boot-style command-line environment that runs before any operating system, providing memory inspection, disk operations, environment variables, and extensible command infrastructure.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | ★★★★☆ Expert |
| Time Estimate | 2-3 weeks |
| Language | C (with assembly startup) |
| Alternative Languages | Rust (no_std), Pure Assembly |
| Prerequisites | Projects 1-4, C string handling, basic terminal I/O |
| Key Topics | Command parsing, line editing, memory safety, REPL design, extensible architecture |
| Portfolio Value | Strong side project demonstrating systems programming depth |
1. Learning Objectives
By completing this project, you will:
- Understand REPL Architecture: Learn how Read-Eval-Print Loops work at the lowest level, without any standard library support
- Master Terminal I/O in Bare Metal: Implement keyboard input and screen output using only BIOS interrupts or direct hardware access
- Build a Command Parser Without stdlib: Parse commands and arguments using only your own string manipulation code
- Implement Safe Memory Inspection: Create tools to safely examine and modify memory without crashing
- Design Extensible Systems: Create a modular command registration system that makes adding new commands trivial
- Handle State Management: Implement environment variables and persistent state in a constrained environment
- Apply Software Engineering in Constraints: Write clean, maintainable code with no heap, no OS, and minimal resources
2. Theoretical Foundation
2.1 Core Concepts
The REPL Pattern
Interactive shells are built around the Read-Eval-Print Loop:
┌─────────────────────────────────────────────────────────────────┐
│ REPL Architecture │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ │
│ │ START │ │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ User types │
│ │ READ │◄────command────┐ │
│ │ │ │ │
│ └────┬────┘ │ │
│ │ │ │
│ │ Parse input │ │
│ ▼ │ │
│ ┌─────────┐ │ │
│ │ EVALUATE│ │ │
│ │ │ │ │
│ └────┬────┘ │ │
│ │ │ │
│ │ Execute command │ │
│ ▼ │ │
│ ┌─────────┐ │ │
│ │ PRINT │─────────────────┘ │
│ │ │ │
│ └─────────┘ Show result, prompt again │
│ │
└─────────────────────────────────────────────────────────────────┘
Line Editor Components
A proper line editor needs multiple components working together:
┌─────────────────────────────────────────────────────────────────┐
│ Line Editor Components │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Input Buffer: │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ m │ d │ │ 0 │ x │ 7 │ C │ 0 │ 0 │ _ │ _ │ _ │ _ │ _ │ _ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ 0 1 2 3 4 5 6 7 8 ▲ │
│ │ │
│ cursor_pos = 9 │
│ │
│ State Variables: │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ buffer[MAX_LINE_LENGTH] - Character storage │ │
│ │ buffer_length - Current content length │ │
│ │ cursor_pos - Current cursor position │ │
│ │ history[MAX_HISTORY] - Previous commands │ │
│ │ history_index - Current history position │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ Key Handlers: │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Backspace (0x08) → Delete char before cursor │ │
│ │ Delete (0x53) → Delete char at cursor │ │
│ │ Left (0x4B) → Move cursor left │ │
│ │ Right (0x4D) → Move cursor right │ │
│ │ Up (0x48) → Previous history entry │ │
│ │ Down (0x50) → Next history entry │ │
│ │ Home (0x47) → Move cursor to start │ │
│ │ End (0x4F) → Move cursor to end │ │
│ │ Enter (0x0D) → Submit command │ │
│ │ Tab (0x09) → Command completion (optional) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
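The buffer and cursor bookkeeping in the diagram can be sketched in a few lines of C. This is a minimal illustration, not the project's required design: the `mini_editor` struct and the `mini_insert` / `mini_backspace` names are hypothetical, and screen redraw is left out entirely.

```c
#include <stddef.h>

#define MAX_LINE_LENGTH 256

/* Hypothetical minimal editor state: buffer, content length, cursor. */
struct mini_editor {
    char   buffer[MAX_LINE_LENGTH];
    size_t length;   /* number of characters currently stored */
    size_t cursor;   /* insertion point, 0..length */
};

/* Insert c at the cursor, shifting the tail right.
 * Returns 0 on success, -1 if the buffer is full. */
int mini_insert(struct mini_editor *ed, char c)
{
    if (ed->length >= MAX_LINE_LENGTH - 1)   /* keep room for the NUL */
        return -1;
    for (size_t i = ed->length; i > ed->cursor; i--)
        ed->buffer[i] = ed->buffer[i - 1];
    ed->buffer[ed->cursor] = c;
    ed->cursor++;
    ed->length++;
    ed->buffer[ed->length] = '\0';
    return 0;
}

/* Delete the character before the cursor, shifting the tail left.
 * Returns 0 on success, -1 if the cursor is already at column 0. */
int mini_backspace(struct mini_editor *ed)
{
    if (ed->cursor == 0)
        return -1;
    for (size_t i = ed->cursor; i < ed->length; i++)
        ed->buffer[i - 1] = ed->buffer[i];
    ed->cursor--;
    ed->length--;
    ed->buffer[ed->length] = '\0';
    return 0;
}
```

Left/Right, Home, and End then only change `cursor`; Delete is a backspace performed one position to the right.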
Command Dispatch Architecture
Professional bootloaders use a table-driven command system:
┌─────────────────────────────────────────────────────────────────┐
│ Command Dispatch System │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Command Table (Static Array): │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ struct command { │ │
│ │ const char *name; // "md" │ │
│ │ const char *help; // "Memory dump" │ │
│ │ const char *usage; // "md <addr> [len]" │ │
│ │ int (*handler)(int argc, char *argv[]); │ │
│ │ }; │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Index │ Name │ Handler │ Help │ │
│ ├───────┼────────┼──────────────────┼────────────────────────┤ │
│ │ 0 │ help │ cmd_help() │ Show available... │ │
│ │ 1 │ md │ cmd_memdump() │ Memory dump │ │
│ │ 2 │ mm │ cmd_memmod() │ Memory modify │ │
│ │ 3 │ disk │ cmd_disk() │ Disk operations │ │
│ │ 4 │ load │ cmd_load() │ Load file to memory │ │
│ │ 5 │ set │ cmd_set() │ Set env variable │ │
│ │ 6 │ print │ cmd_print() │ Print env variables │ │
│ │ 7 │ boot │ cmd_boot() │ Boot kernel │ │
│ │ ... │ ... │ ... │ ... │ │
│ │ N │ NULL │ NULL │ NULL (sentinel) │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ Dispatch Flow: │
│ │
│ Input: "md 0x7C00 64" │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Tokenize │ → argv[0]="md", argv[1]="0x7C00", argv[2]="64"│
│ └──────┬──────┘ argc=3 │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Find in │ → for each cmd: if strcmp(cmd.name, argv[0]) == 0 │
│ │ table │ │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Call handler│ → cmd.handler(argc, argv) │
│ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
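The dispatch flow above could look roughly like this in C. It is a sketch under assumptions: the two handlers are placeholder bodies, `sh_strcmp` stands in for whatever string compare the project provides, and `CMD_NOT_FOUND` is an invented error code.

```c
#include <stddef.h>

typedef int (*cmd_handler_t)(int argc, char *argv[]);

struct command {
    const char   *name;
    const char   *help;
    const char   *usage;
    cmd_handler_t handler;
};

/* Freestanding string compare; returns 0 when equal, like strcmp. */
static int sh_strcmp(const char *a, const char *b)
{
    while (*a && *a == *b) { a++; b++; }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Placeholder handlers (hypothetical bodies, for illustration only). */
static int cmd_help(int argc, char *argv[])    { (void)argc; (void)argv; return 0; }
static int cmd_memdump(int argc, char *argv[]) { (void)argv; return argc >= 2 ? 0 : -1; }

static const struct command commands[] = {
    { "help", "Show available commands", "help",            cmd_help    },
    { "md",   "Memory dump",             "md <addr> [len]", cmd_memdump },
    { NULL,   NULL,                      NULL,              NULL        },  /* sentinel */
};

/* Linear scan of the table; stops at the NULL sentinel. */
const struct command *find_command(const char *name)
{
    for (const struct command *c = commands; c->name != NULL; c++)
        if (sh_strcmp(c->name, name) == 0)
            return c;
    return NULL;
}

#define CMD_NOT_FOUND (-127)   /* assumed error code, not from the project spec */

int dispatch_command(int argc, char *argv[])
{
    if (argc == 0)
        return CMD_NOT_FOUND;
    const struct command *c = find_command(argv[0]);
    return c ? c->handler(argc, argv) : CMD_NOT_FOUND;
}
```

With this layout, adding a command means writing one handler and adding one row above the sentinel.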
Memory Safety in Bootloaders
Memory errors in a bootloader are catastrophic because, unlike in user space, no operating system is present to catch them:
┌─────────────────────────────────────────────────────────────────┐
│ Memory Safety Zones │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Real Mode Memory Map (1MB): │
│ │
│ 0x00000 ┌───────────────────────────────┐ │
│ │ Interrupt Vector Table (IVT) │ ← DO NOT MODIFY │
│ 0x00400 ├───────────────────────────────┤ (unless you know │
│ │ BIOS Data Area (BDA) │ what you're doing)│
│ 0x00500 ├───────────────────────────────┤ │
│ │ Free memory (usable) │ ← SAFE for variables │
│ │ │ │
│ 0x07C00 ├───────────────────────────────┤ │
│ │ YOUR BOOTLOADER CODE │ ← Don't overwrite! │
│ │ (512-byte boot sector) │ │
│ 0x07E00 ├───────────────────────────────┤ │
│ │ Free memory (usable) │ ← SAFE for data │
│ │ │ │
│ 0x80000 ├───────────────────────────────┤ │
│ │ Extended BIOS Data Area │ ← Variable location │
│ 0xA0000 ├───────────────────────────────┤ │
│ │ Video Memory (VGA) │ ← Write = screen │
│ 0xC0000 ├───────────────────────────────┤ │
│ │ Video BIOS ROM │ ← READ ONLY │
│ 0xF0000 ├───────────────────────────────┤ │
│ │ System BIOS ROM │ ← READ ONLY │
│ 0xFFFFF └───────────────────────────────┘ │
│ │
│ Safety Rules for Memory Commands: │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ 1. Validate address range before access │ │
│ │ 2. Warn before writing to known critical regions │ │
│ │ 3. Never dereference NULL (0x0000-0x00FF especially) │ │
│ │ 4. ROM writes silently fail - not an error │ │
│ │ 5. Video memory writes are visible - useful for testing │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
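Safety rules 1 and 2 can be backed by a small lookup that classifies an address against the map above. The sketch below simply encodes the boundaries from the diagram; the enum and function names are hypothetical, and the 0x80000 EBDA boundary is the diagram's conservative estimate, since the real location varies by machine.

```c
#include <stdint.h>

enum mem_region {
    REGION_IVT,         /* 0x00000-0x003FF */
    REGION_BDA,         /* 0x00400-0x004FF */
    REGION_FREE,        /* usable RAM */
    REGION_BOOTLOADER,  /* 0x07C00-0x07DFF */
    REGION_EBDA,        /* 0x80000-0x9FFFF (conservative bound from the diagram) */
    REGION_VIDEO,       /* 0xA0000-0xBFFFF */
    REGION_ROM,         /* 0xC0000-0xFFFFF */
    REGION_INVALID      /* outside the 1 MB real-mode map */
};

/* Map a linear address to the region it falls in. */
enum mem_region classify_address(uint32_t addr)
{
    if (addr < 0x00400)  return REGION_IVT;
    if (addr < 0x00500)  return REGION_BDA;
    if (addr < 0x07C00)  return REGION_FREE;
    if (addr < 0x07E00)  return REGION_BOOTLOADER;
    if (addr < 0x80000)  return REGION_FREE;
    if (addr < 0xA0000)  return REGION_EBDA;
    if (addr < 0xC0000)  return REGION_VIDEO;
    if (addr <= 0xFFFFF) return REGION_ROM;
    return REGION_INVALID;
}

/* Returns 1 if a write command should warn first, 0 if it may proceed. */
int write_needs_warning(uint32_t addr)
{
    switch (classify_address(addr)) {
    case REGION_FREE:
    case REGION_VIDEO:
        return 0;
    default:
        return 1;
    }
}
```

A command like `mm` could call `write_needs_warning()` before touching memory and ask for confirmation when it returns 1.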
2.2 Why This Matters
Real-World Applications:
- U-Boot: The most widely used bootloader for embedded systems (routers, phones, IoT devices) provides exactly this interface. Understanding U-Boot’s shell helps you work with billions of devices.
- GRUB Command Line: When Linux fails to boot, the GRUB shell is your recovery tool. Knowing how it works helps you fix systems.
- Debugging Hardware: Memory inspection commands let you examine hardware registers, debug device drivers, and understand what firmware has configured.
- Firmware Development: Many firmware systems (UEFI Shell, coreboot) provide interactive environments for testing and configuration.
- Bootkit/Rootkit Analysis: Security researchers need to understand bootloader shells to analyze malware that persists at this level.
Career Impact:
- Embedded systems engineers use U-Boot daily
- Kernel developers debug boot issues through bootloader shells
- Security researchers analyze firmware through similar interfaces
- DevOps engineers troubleshoot boot failures using these tools
2.3 Historical Context
The concept of an interactive bootloader shell evolved from:
- Monitor Programs (1960s): Early computers had simple “monitor” programs that let operators examine memory and load programs
- ROM BASIC (1980s): IBM PCs could boot directly into BASIC if no disk was present
- LILO (1992): Early Linux bootloader with command-line options
- GRUB (1995): GNU Grand Unified Bootloader introduced a rich command-line interface
- U-Boot (2000): Universal Bootloader became the standard for embedded systems
- UEFI Shell (2005): Modern firmware provides a DOS-like shell environment
2.4 Common Misconceptions
Misconception 1: “A bootloader is just 512 bytes”
- Reality: The 512-byte limit is only for the MBR. Stage 2 can be any size, and that’s where the shell lives.
Misconception 2: “You need an OS for a command line”
- Reality: A REPL needs only keyboard input, screen output, and basic parsing. No OS required.
Misconception 3: “C requires a standard library”
- Reality: You can write C with no stdlib. You just implement what you need (or nothing at all).
Misconception 4: “Memory access is always safe”
- Reality: In a bootloader, bad memory access = instant reboot or freeze. No segfaults, no error messages.
Misconception 5: “This is obsolete with UEFI”
- Reality: UEFI Shell exists and is widely used. The concepts transfer directly.
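To make Misconception 3 concrete, here is a sketch of three routines a freestanding shell typically ends up writing for itself. The `sh_` prefix is an arbitrary choice to avoid clashing with compiler builtins. `<stddef.h>` is used here for `size_t`; in a build that passes `-nostdinc`, a project-local header would have to supply that type instead.

```c
#include <stddef.h>   /* size_t */

/* Length of a NUL-terminated string. */
size_t sh_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Returns 0 if equal, negative if a sorts before b, positive otherwise. */
int sh_strcmp(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Byte-wise copy; regions must not overlap. Returns dst. */
void *sh_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char       *d = dst;
    const unsigned char *s = src;
    while (n-- > 0)
        *d++ = *s++;
    return dst;
}
```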
3. Project Specification
3.1 What You Will Build
A fully interactive bootloader shell that provides:
- Line Editor: Full-featured input with backspace, cursor movement, and command history
- Command Parser: Tokenizes input into command and arguments, handles quoting
- Memory Commands: md (dump), mm (modify), mf (find), mc (compare)
- Disk Commands: disk read, disk info, sector inspection
- File Commands: load (if filesystem support from Project 5)
- Environment System: set, print, unset, variable substitution
- Boot Command: Load and execute kernel with configurable address
- Extensible Architecture: Adding new commands requires only a table entry
3.2 Functional Requirements
| ID | Requirement | Priority |
|---|---|---|
| FR1 | Shell displays prompt and accepts keyboard input | Must Have |
| FR2 | Backspace deletes previous character correctly | Must Have |
| FR3 | Enter key executes current command | Must Have |
| FR4 | Unknown commands display error message | Must Have |
| FR5 | help command lists all available commands | Must Have |
| FR6 | md <addr> [len] displays hexdump of memory | Must Have |
| FR7 | mm <addr> <val> modifies single byte in memory | Must Have |
| FR8 | set <var>=<value> stores environment variable | Must Have |
| FR9 | print [var] displays environment variables | Must Have |
| FR10 | boot [addr] jumps to specified address | Must Have |
| FR11 | Up/Down arrows navigate command history | Should Have |
| FR12 | Left/Right arrows move cursor within line | Should Have |
| FR13 | disk read <sector> [count] reads disk sectors | Should Have |
| FR14 | Variable substitution in commands ($var) | Should Have |
| FR15 | Tab completion for commands | Nice to Have |
| FR16 | mf <addr> <len> <pattern> finds pattern in memory | Nice to Have |
3.3 Non-Functional Requirements
| ID | Requirement | Target |
|---|---|---|
| NFR1 | Total shell code fits in 32KB | < 32,768 bytes |
| NFR2 | Command execution latency | < 100ms for simple commands |
| NFR3 | Maximum command line length | 256 characters |
| NFR4 | Maximum environment variables | 32 variables |
| NFR5 | Command history depth | 16 entries |
| NFR6 | Works in both Real Mode and Protected Mode | Cross-mode compatible |
3.4 Example Usage / Output
Bootloader Shell v1.0
Type 'help' for available commands
boot> help
Available commands:
help - Display this help message
md <addr> [len] - Memory dump (default len=256)
mm <addr> <val> - Memory modify (byte)
mw <addr> <val> - Memory modify (word)
md.l <addr> [len] - Memory dump (long/32-bit)
mf <a> <l> <pat> - Memory find pattern
mc <a1> <a2> <l> - Memory compare
disk info - Show disk information
disk read <s> [n] - Read sector(s) to buffer
load <filename> - Load file to memory
set <var>=<val> - Set environment variable
print [var] - Print environment variable(s)
unset <var> - Remove environment variable
boot [addr] - Boot kernel at address (default: $loadaddr)
reset - Reset system
boot> md 0x7C00 64
00007C00: EB 3C 90 4D 53 44 4F 53 35 2E 30 00 02 01 01 00 |.<.MSDOS5.0.....|
00007C10: 02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00 |...@............|
00007C20: 00 00 00 00 00 00 29 12 34 56 78 4E 4F 20 4E 41 |......).4VxNO NA|
00007C30: 4D 45 20 20 20 20 46 41 54 31 32 20 20 20 8E D0 |ME    FAT12   ..|
boot> mm 0x500 0x41
Writing 0x41 to 0x00000500
Verify: 0x41 OK
boot> set loadaddr=0x100000
boot> set kernel=kernel.bin
boot> print
Environment:
loadaddr=0x100000
kernel=kernel.bin
boot> load $kernel
Loading kernel.bin to $loadaddr (0x100000)...
Read 131072 bytes (256 sectors)
Done.
boot> boot
Booting from 0x100000...
[Kernel takes over]
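The `md` output format in the session above (8-digit address, 16 hex bytes, ASCII column) can be produced by a formatter along these lines. This is a sketch, not the reference implementation: `format_dump_line` is a hypothetical name, and it writes into a caller-supplied buffer so it can be tested on a host before being wired to the console.

```c
#include <stddef.h>
#include <stdint.h>

static const char HEX_DIGITS[] = "0123456789ABCDEF";

/* Format up to 16 bytes as one dump line into out (at least 77 bytes).
 * Returns the number of characters written, excluding the NUL. */
size_t format_dump_line(char *out, uint32_t addr, const uint8_t *data, size_t count)
{
    size_t p = 0;

    if (count > 16)
        count = 16;

    /* 8-digit address, most significant nibble first */
    for (int shift = 28; shift >= 0; shift -= 4)
        out[p++] = HEX_DIGITS[(addr >> shift) & 0xF];
    out[p++] = ':';
    out[p++] = ' ';

    /* Hex column; short lines are padded so the ASCII column stays aligned */
    for (size_t i = 0; i < 16; i++) {
        if (i < count) {
            out[p++] = HEX_DIGITS[data[i] >> 4];
            out[p++] = HEX_DIGITS[data[i] & 0xF];
        } else {
            out[p++] = ' ';
            out[p++] = ' ';
        }
        out[p++] = ' ';
    }

    /* ASCII column; non-printable bytes are shown as '.' */
    out[p++] = '|';
    for (size_t i = 0; i < count; i++)
        out[p++] = (data[i] >= 0x20 && data[i] <= 0x7E) ? (char)data[i] : '.';
    out[p++] = '|';
    out[p] = '\0';
    return p;
}
```

A `cmd_memdump` handler would call this once per 16-byte row and pass the result to `console_puts`.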
3.5 Real World Outcome
When complete, you will have:
- A Working Bootloader Shell: An interactive environment that boots before any OS
- Debugging Capability: Tools to inspect memory, useful for kernel development
- Understanding of U-Boot: Direct experience with how the most popular embedded bootloader works
- Portable Components: String handling, parsing code usable in other bare-metal projects
- Portfolio Piece: Impressive demonstration of systems programming skills
4. Solution Architecture
4.1 High-Level Design
┌─────────────────────────────────────────────────────────────────────────────┐
│ Bootloader Shell Architecture │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Application Layer │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ cmd_help │ │ cmd_md │ │ cmd_mm │ │ cmd_disk │ │ cmd_boot │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ cmd_set │ │cmd_print │ │ cmd_load │ │cmd_reset │ ... more │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ └────────────────────────────────┬────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Shell Core Layer │ │
│ │ ┌────────────────┐ ┌────────────────┐ ┌────────────────────────┐ │ │
│ │ │ Line Editor │ │ Command Parser │ │ Command Dispatcher │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ - read_line() │ │ - tokenize() │ │ - find_command() │ │ │
│ │ │ - handle_key() │ │ - parse_int() │ │ - dispatch() │ │ │
│ │ │ - history[] │ │ - expand_vars()│ │ - commands[] │ │ │
│ │ └────────────────┘ └────────────────┘ └────────────────────────┘ │ │
│ └────────────────────────────────┬────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Services Layer │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
│ │ │ Environment │ │ Memory │ │ Disk/Filesystem │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ - env[] │ │ - mem_read() │ │ - disk_read_sectors() │ │ │
│ │ │ - get_env() │ │ - mem_write()│ │ - fs_read_file() │ │ │
│ │ │ - set_env() │ │ - mem_dump() │ │ - fs_list_dir() │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────────────┘ │ │
│ └────────────────────────────────┬────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Hardware Abstraction Layer │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
│ │ │ Console │ │ Disk │ │ Timer │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ - getchar() │ │ - int_13h() │ │ - get_ticks() │ │ │
│ │ │ - putchar() │ │ - ata_read() │ │ - delay_ms() │ │ │
│ │ │ - puts() │ │ │ │ │ │ │
│ │ │ - printf() │ │ │ │ │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
4.2 Key Components
Component 1: Console I/O (console.c)
Provides keyboard input and screen output:
// Console functions needed
int console_getchar(void); // Blocking read, returns ASCII or extended key
void console_putchar(char c); // Output single character
void console_puts(const char *s); // Output string
void console_clear(void); // Clear screen
void console_set_cursor(int x, int y); // Position cursor
int console_printf(const char *fmt, ...); // Formatted output
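One way `console_getchar` could return "ASCII or extended key" in a single `int` is to decode the AX value that BIOS INT 16h, AH=00h leaves behind (AH = scan code, AL = ASCII, with AL = 0 for extended keys). The sketch below shows only that decoding step so it can run on a host; the `KEY_*` encoding and the `decode_bios_key` name are assumptions, and the interrupt call itself is not shown.

```c
#include <stdint.h>

/* Assumed encoding: ASCII keys map to 0x00-0xFF,
 * extended keys map to 0x100 | scan code. */
#define KEY_EXTENDED 0x100
#define KEY_UP     (KEY_EXTENDED | 0x48)
#define KEY_DOWN   (KEY_EXTENDED | 0x50)
#define KEY_LEFT   (KEY_EXTENDED | 0x4B)
#define KEY_RIGHT  (KEY_EXTENDED | 0x4D)
#define KEY_HOME   (KEY_EXTENDED | 0x47)
#define KEY_END    (KEY_EXTENDED | 0x4F)
#define KEY_DELETE (KEY_EXTENDED | 0x53)

/* ax is the register value after INT 16h, AH=00h:
 * high byte = scan code, low byte = ASCII (0 for extended keys). */
int decode_bios_key(uint16_t ax)
{
    uint8_t ascii = (uint8_t)(ax & 0xFF);
    uint8_t scan  = (uint8_t)(ax >> 8);

    if (ascii == 0x00)
        return KEY_EXTENDED | scan;   /* arrows, Home, End, Delete, ... */
    return ascii;                     /* ordinary character, Enter, Backspace, Tab */
}
```

The line editor can then `switch` on a single value: `0x08` for Backspace, `0x0D` for Enter, `KEY_LEFT` for the left arrow, and so on.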
Component 2: Line Editor (line.c)
Handles input editing:
// Line editor state
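struct line_editor {
    char buffer[MAX_LINE_LENGTH];               // Current input line
    int length;                                 // Characters in buffer
    int cursor;                                 // Cursor position (0 to length)
    char history[MAX_HISTORY][MAX_LINE_LENGTH]; // Previous commands
    int history_count;                          // Entries stored in history
    int history_pos;                            // Entry currently being browsed
};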
// Line editor functions
int line_read(struct line_editor *ed, char *output); // Get line from user
void line_handle_key(struct line_editor *ed, int key);
void line_redraw(struct line_editor *ed);
Component 3: Command Parser (parser.c)
Tokenizes and processes commands:
// Parser functions
int parse_command(const char *input, int *argc, char **argv); // Split into tokens
unsigned long parse_number(const char *s, int *error); // Parse hex/dec
int expand_variables(char *output, const char *input); // $var expansion
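As an illustration of the `parse_number` declaration above, here is one possible body. The specific rules are assumptions rather than project requirements: a `0x`/`0X` prefix selects hexadecimal, anything else is decimal, and `*error` is set to 1 on an empty string, an invalid digit, or overflow.

```c
/* Parse "0x7C00" (hex) or "64" (decimal).
 * On failure, sets *error to 1 and returns 0. */
unsigned long parse_number(const char *s, int *error)
{
    const unsigned long max = ~0UL;
    unsigned long value = 0;
    unsigned base = 10;

    *error = 0;
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        base = 16;
        s += 2;
    }
    if (*s == '\0') {               /* empty string or bare "0x" */
        *error = 1;
        return 0;
    }
    for (; *s != '\0'; s++) {
        unsigned digit;
        if (*s >= '0' && *s <= '9')
            digit = (unsigned)(*s - '0');
        else if (*s >= 'a' && *s <= 'f')
            digit = (unsigned)(*s - 'a') + 10;
        else if (*s >= 'A' && *s <= 'F')
            digit = (unsigned)(*s - 'A') + 10;
        else
            digit = base;           /* forces the range check below to fail */

        if (digit >= base || value > (max - digit) / base) {
            *error = 1;             /* bad digit or overflow */
            return 0;
        }
        value = value * base + digit;
    }
    return value;
}
```

For example, `"0x7C00"` accumulates 7, 124, 1984, 31744 as each hex digit is folded in.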
Component 4: Command Dispatcher (dispatch.c)
Routes commands to handlers:
// Command structure
typedef int (*cmd_handler_t)(int argc, char **argv);
struct command {
const char *name;
const char *help;
const char *usage;
cmd_handler_t handler;
};
// Dispatcher functions
struct command *find_command(const char *name);
int dispatch_command(int argc, char **argv);
void register_command(struct command *cmd); // For dynamic registration
Component 5: Environment (env.c)
Manages variables:
// Environment entry
struct env_entry {
char name[32];
char value[128];
int used;
};
// Environment functions
const char *env_get(const char *name);
int env_set(const char *name, const char *value);
int env_unset(const char *name);
void env_print_all(void);
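A fixed-slot table is enough to back these declarations without a heap. The sketch below is one way to do it, under assumptions: the return convention (0 on success, -1 on failure) and the `env_find` / `env_copy` helpers are illustrative, and names or values that do not fit their slot are rejected rather than truncated.

```c
#include <stddef.h>

#define MAX_ENV_VARS 32

struct env_entry {
    char name[32];
    char value[128];
    int  used;
};

static struct env_entry env_table[MAX_ENV_VARS];

static int env_streq(const char *a, const char *b)
{
    while (*a && *a == *b) { a++; b++; }
    return *a == *b;
}

/* Copy src into dst only if it fits including the NUL; 0 on success, -1 otherwise. */
static int env_copy(char *dst, size_t cap, const char *src)
{
    size_t n = 0;
    while (src[n] != '\0')
        n++;
    if (n >= cap)
        return -1;
    for (size_t i = 0; i <= n; i++)
        dst[i] = src[i];
    return 0;
}

static struct env_entry *env_find(const char *name)
{
    for (int i = 0; i < MAX_ENV_VARS; i++)
        if (env_table[i].used && env_streq(env_table[i].name, name))
            return &env_table[i];
    return NULL;
}

const char *env_get(const char *name)
{
    struct env_entry *e = env_find(name);
    return e ? e->value : NULL;
}

int env_set(const char *name, const char *value)
{
    struct env_entry *e = env_find(name);
    if (e == NULL) {                            /* new variable: claim a free slot */
        for (int i = 0; i < MAX_ENV_VARS && e == NULL; i++)
            if (!env_table[i].used)
                e = &env_table[i];
        if (e == NULL || env_copy(e->name, sizeof e->name, name) != 0)
            return -1;                          /* table full or name too long */
    }
    if (env_copy(e->value, sizeof e->value, value) != 0)
        return -1;                              /* value too long; slot left as it was */
    e->used = 1;
    return 0;
}

int env_unset(const char *name)
{
    struct env_entry *e = env_find(name);
    if (e == NULL)
        return -1;
    e->used = 0;
    return 0;
}
```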
4.3 Data Structures
// =============================================================
// Core Data Structures for Bootloader Shell
// =============================================================
#include "types.h" // Project-local fixed-width types (uint8_t, uint16_t, uint32_t, int8_t)
#define MAX_LINE_LENGTH 256
#define MAX_HISTORY 16
#define MAX_ARGC 16
#define MAX_ENV_VARS 32
#define MAX_ENV_NAME 32
#define MAX_ENV_VALUE 128
// -------------------------------------------------------------
// Line Editor State
// -------------------------------------------------------------
typedef struct {
char buffer[MAX_LINE_LENGTH]; // Current input buffer
uint16_t length; // Current content length
uint16_t cursor; // Cursor position (0 to length)
// History ring buffer
char history[MAX_HISTORY][MAX_LINE_LENGTH];
uint8_t history_count; // Number of entries in history
uint8_t history_write; // Next write position
int8_t history_browse; // Current browse position (-1 = current)
} LineEditor;
// -------------------------------------------------------------
// Parsed Command
// -------------------------------------------------------------
typedef struct {
int argc;
char *argv[MAX_ARGC];
char token_buffer[MAX_LINE_LENGTH]; // Backing storage for tokens
} ParsedCommand;
// -------------------------------------------------------------
// Command Table Entry
// -------------------------------------------------------------
typedef int (*CommandHandler)(int argc, char **argv);
typedef struct {
const char *name; // Command name (e.g., "md")
const char *help; // Short help text
const char *usage; // Usage string (e.g., "md <addr> [len]")
CommandHandler handler; // Function pointer
uint8_t min_args; // Minimum required arguments
uint8_t max_args; // Maximum allowed arguments
} Command;
// -------------------------------------------------------------
// Environment Variable Entry
// -------------------------------------------------------------
typedef struct {
char name[MAX_ENV_NAME];
char value[MAX_ENV_VALUE];
uint8_t used; // 1 if slot is occupied, 0 if free
} EnvEntry;
// -------------------------------------------------------------
// Global Shell State
// -------------------------------------------------------------
typedef struct {
LineEditor line;
EnvEntry env[MAX_ENV_VARS];
uint32_t load_address; // Default kernel load address
uint8_t echo_enabled; // Echo commands before execution
uint8_t verbose; // Verbose output mode
} ShellState;
// -------------------------------------------------------------
// Memory Operation Results
// -------------------------------------------------------------
typedef struct {
uint32_t address;
uint32_t length;
uint8_t *data;
int error; // 0 = success, negative = error code
} MemoryResult;
// -------------------------------------------------------------
// Disk Read Buffer
// -------------------------------------------------------------
#define SECTOR_SIZE 512
typedef struct {
uint8_t data[SECTOR_SIZE * 16]; // Buffer for up to 16 sectors
uint32_t start_sector;
uint8_t sector_count;
uint8_t valid; // 1 if buffer contains valid data
} DiskBuffer;
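The `history_count` / `history_write` pair in `LineEditor` describes a ring buffer. A sketch of how the two fields could cooperate follows; the standalone `History` struct and the `history_add` / `history_get` names are hypothetical, and over-long lines are truncated here as a simplifying assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LINE_LENGTH 256
#define MAX_HISTORY     16

/* Just the history-related fields of LineEditor, pulled out for illustration. */
typedef struct {
    char    history[MAX_HISTORY][MAX_LINE_LENGTH];
    uint8_t history_count;   /* entries stored, saturates at MAX_HISTORY */
    uint8_t history_write;   /* slot the next entry will be written to */
} History;

/* Store a line, overwriting the oldest entry once the ring is full. */
void history_add(History *h, const char *line)
{
    char *slot = h->history[h->history_write];
    int i = 0;

    while (line[i] != '\0' && i < MAX_LINE_LENGTH - 1) {
        slot[i] = line[i];
        i++;
    }
    slot[i] = '\0';

    h->history_write = (uint8_t)((h->history_write + 1) % MAX_HISTORY);
    if (h->history_count < MAX_HISTORY)
        h->history_count++;
}

/* back = 0 is the most recent entry, 1 the one before it, and so on.
 * Returns NULL when back reaches past the stored entries. */
const char *history_get(const History *h, unsigned back)
{
    if (back >= h->history_count)
        return NULL;
    unsigned idx = (h->history_write + MAX_HISTORY - 1u - back) % MAX_HISTORY;
    return h->history[idx];
}
```

The Up arrow would increment a browse offset and call `history_get`; the Down arrow would decrement it, with -1 meaning "back to the line being typed".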
4.4 Algorithm Overview
Main Shell Loop
FUNCTION shell_main():
    initialize_console()
    initialize_environment()
    show_banner()
    FOREVER:
        show_prompt()
        line = read_line()
        IF line is empty:
            CONTINUE
        add_to_history(line)
        expand_variables(expanded_line, line)   // output first, as declared in parser.c
        parse_command(expanded_line, &argc, argv)
        IF argc == 0:
            CONTINUE
        cmd = find_command(argv[0])
        IF cmd == NULL:
            print_error("Unknown command: ", argv[0])
            CONTINUE
        result = cmd->handler(argc, argv)
        IF result != 0:
            print_error("Command failed with code ", result)
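The variable-expansion step in the loop above might be implemented along these lines. This sketch departs from the declared `expand_variables(output, input)` signature in two labeled ways: it takes an explicit output capacity, and it takes the lookup function as a parameter so it can be tested without the real environment. `demo_lookup` is a hypothetical stand-in for `env_get`, and expanding undefined variables to an empty string is an assumption.

```c
#include <stddef.h>

typedef const char *(*env_lookup_t)(const char *name);

int sketch_streq(const char *a, const char *b)
{
    while (*a && *a == *b) { a++; b++; }
    return *a == *b;
}

/* Hypothetical stand-in for env_get(), holding the example session's variables. */
const char *demo_lookup(const char *name)
{
    if (sketch_streq(name, "kernel"))   return "kernel.bin";
    if (sketch_streq(name, "loadaddr")) return "0x100000";
    return NULL;
}

static int is_name_char(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_';
}

/* Copy in to out, replacing each $name with lookup(name).
 * Returns 0 on success, -1 if out is too small or a name is too long. */
int expand_variables_n(char *out, size_t cap, const char *in, env_lookup_t lookup)
{
    size_t o = 0;

    if (cap == 0)
        return -1;
    while (*in != '\0') {
        if (*in == '$' && is_name_char(in[1])) {
            char   name[32];
            size_t n = 0;

            in++;                                   /* skip '$' */
            while (is_name_char(*in)) {
                if (n >= sizeof name - 1)
                    return -1;
                name[n++] = *in++;
            }
            name[n] = '\0';

            const char *val = lookup(name);
            if (val == NULL)
                val = "";                           /* assumption: undefined -> empty */
            while (*val != '\0') {
                if (o + 1 >= cap)
                    return -1;
                out[o++] = *val++;
            }
        } else {
            if (o + 1 >= cap)
                return -1;
            out[o++] = *in++;                       /* ordinary character, or a lone '$' */
        }
    }
    out[o] = '\0';
    return 0;
}
```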
Command Parsing Algorithm
FUNCTION parse_command(input, argc, argv):
    *argc = 0
    state = WHITESPACE
    out = start of token_buffer    // tokens are copied here so quote characters can be dropped
    FOR each character c in input:
        SWITCH state:
            CASE WHITESPACE:
                IF c is whitespace:
                    CONTINUE
                IF *argc >= MAX_ARGC - 1:
                    RETURN ERROR    // too many arguments
                token_start = out
                IF c == '"':
                    state = QUOTED
                ELSE:
                    append c to out
                    state = TOKEN
            CASE TOKEN:
                IF c is whitespace:
                    terminate_token()    // append '\0' to out
                    argv[(*argc)++] = token_start
                    state = WHITESPACE
                ELSE IF c == '"':
                    state = QUOTED
                ELSE:
                    append c to out
            CASE QUOTED:
                IF c == '"':
                    state = TOKEN
                ELSE IF c == '\\':
                    state = ESCAPE
                ELSE:
                    append c to out
            CASE ESCAPE:
                // Handle \n, \t, \\, \"
                append translated c to out
                state = QUOTED
    IF state == QUOTED OR state == ESCAPE:
        RETURN ERROR    // unterminated quote
    IF state == TOKEN:
        terminate_token()
        argv[(*argc)++] = token_start
    argv[*argc] = NULL
    RETURN SUCCESS
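The state machine above translates fairly directly into C. The following is a sketch rather than a definitive parser: it copies characters into the `token_buffer` of a `ParsedCommand` so that quote characters never appear in the resulting tokens, and it returns -1 for an unterminated quote, too many arguments, or input that would overflow the buffer. The function name `tokenize` and the -1 convention are assumptions.

```c
#include <stddef.h>

#define MAX_ARGC        16
#define MAX_LINE_LENGTH 256

typedef struct {
    int   argc;
    char *argv[MAX_ARGC];
    char  token_buffer[MAX_LINE_LENGTH];   /* backing storage for tokens */
} ParsedCommand;

enum { ST_SPACE, ST_TOKEN, ST_QUOTED, ST_ESCAPE };

int tokenize(const char *input, ParsedCommand *pc)
{
    int    state = ST_SPACE;
    size_t out = 0;                         /* next free byte in token_buffer */

    pc->argc = 0;
    for (const char *p = input; *p != '\0'; p++) {
        char c = *p;
        int  is_space = (c == ' ' || c == '\t');

        if (out + 2 > MAX_LINE_LENGTH)      /* need room for this char plus a NUL */
            return -1;

        switch (state) {
        case ST_SPACE:
            if (is_space)
                break;
            if (pc->argc >= MAX_ARGC - 1)   /* keep one slot for the NULL terminator */
                return -1;
            pc->argv[pc->argc] = &pc->token_buffer[out];
            if (c == '"') {
                state = ST_QUOTED;
            } else {
                pc->token_buffer[out++] = c;
                state = ST_TOKEN;
            }
            break;
        case ST_TOKEN:
            if (is_space) {
                pc->token_buffer[out++] = '\0';
                pc->argc++;
                state = ST_SPACE;
            } else if (c == '"') {
                state = ST_QUOTED;
            } else {
                pc->token_buffer[out++] = c;
            }
            break;
        case ST_QUOTED:
            if (c == '"')
                state = ST_TOKEN;
            else if (c == '\\')
                state = ST_ESCAPE;
            else
                pc->token_buffer[out++] = c;
            break;
        case ST_ESCAPE:                     /* \n and \t are translated; \\ and \" pass through */
            pc->token_buffer[out++] = (c == 'n') ? '\n' : (c == 't') ? '\t' : c;
            state = ST_QUOTED;
            break;
        }
    }
    if (state == ST_QUOTED || state == ST_ESCAPE)
        return -1;                          /* unterminated quote */
    if (state == ST_TOKEN) {
        pc->token_buffer[out] = '\0';
        pc->argc++;
    }
    pc->argv[pc->argc] = NULL;
    return 0;
}
```

Because it depends only on `<stddef.h>`, this function can be compiled and unit-tested on the host (as in test/test_parser.c) before it ever runs under QEMU.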
5. Implementation Guide
5.1 Development Environment Setup
# Required tools
brew install nasm qemu gcc # macOS
# or
apt install nasm qemu-system-x86 gcc # Ubuntu/Debian
# Create project directory
mkdir -p bootloader-shell/{src,include,build,test}
cd bootloader-shell
# Create Makefile
cat > Makefile << 'EOF'
CC = gcc
AS = nasm
LD = ld
OBJCOPY = objcopy
# Freestanding C flags (no standard library)
CFLAGS = -m32 -ffreestanding -fno-pie -fno-stack-protector \
-nostdlib -nostdinc -O2 -Wall -Wextra \
-I include
ASFLAGS = -f elf32
LDFLAGS = -m elf_i386 -T linker.ld --oformat binary
# Source files
C_SOURCES = $(wildcard src/*.c)
ASM_SOURCES = $(wildcard src/*.asm)
OBJECTS = $(C_SOURCES:src/%.c=build/%.o) \
$(ASM_SOURCES:src/%.asm=build/%.o)
all: build/bootloader.bin
# Note: every recipe line below must begin with a tab character, not spaces
build/bootloader.bin: $(OBJECTS)
	$(LD) $(LDFLAGS) -o $@ $^
build/%.o: src/%.c
	$(CC) $(CFLAGS) -c $< -o $@
build/%.o: src/%.asm
	$(AS) $(ASFLAGS) $< -o $@
run: build/bootloader.bin
	qemu-system-i386 -drive format=raw,file=$<
debug: build/bootloader.bin
	qemu-system-i386 -drive format=raw,file=$< -s -S &
	gdb -ex "target remote localhost:1234" \
	    -ex "set architecture i8086"
clean:
	rm -rf build/*
.PHONY: all run debug clean
EOF
5.2 Project Structure
bootloader-shell/
├── Makefile
├── linker.ld # Linker script
├── include/
│ ├── types.h # Basic types (uint8_t, etc.)
│ ├── console.h # Console I/O functions
│ ├── string.h # String manipulation
│ ├── memory.h # Memory operations
│ ├── disk.h # Disk access
│ ├── env.h # Environment variables
│ ├── command.h # Command structures
│ └── shell.h # Main shell interface
├── src/
│ ├── boot.asm # Stage 1: 512-byte MBR loader
│ ├── stage2.asm # Stage 2: Assembly startup, call main
│ ├── main.c # Shell main loop
│ ├── console.c # BIOS-based console I/O
│ ├── string.c # String functions (no stdlib)
│ ├── memory.c # Memory access functions
│ ├── disk.c # Disk read via INT 13h
│ ├── env.c # Environment variable storage
│ ├── parser.c # Command line parsing
│ ├── line.c # Line editor with history
│ └── commands/
│ ├── cmd_help.c # help command
│ ├── cmd_memory.c # md, mm, mf, mc commands
│ ├── cmd_disk.c # disk command
│ ├── cmd_env.c # set, print, unset commands
│ ├── cmd_boot.c # boot command
│ └── cmd_misc.c # reset, echo, etc.
├── test/
│ └── test_parser.c # Host-side unit tests
└── docs/
└── commands.md # Command reference
5.3 The Core Question You’re Answering
“How do you build an interactive system with no operating system support?”
This question encompasses:
- How do you read keyboard input without an OS?
- How do you display text without printf?
- How do you manage memory without malloc?
- How do you build reusable components without libraries?
- How do you make a system extensible in constrained environments?
5.4 Concepts You Must Understand First
Before implementing, verify you can answer these questions:
- BIOS Interrupts: What is INT 16h and how do you use it to read keys?
- Reference: “The Art of Assembly Language” Chapter 10
- Keyboard Scan Codes: What’s the difference between ASCII codes and scan codes?
- Reference: OSDev Wiki - Keyboard
- C Without stdlib: How do you implement strlen(), strcmp(), memcpy()?
- Reference: “Effective C” Chapter 2
- Function Pointers: How do you store and call functions dynamically?
- Reference: “C Programming: A Modern Approach” Chapter 17
- Number Parsing: How do you convert “0x7C00” string to integer?
- Reference: “The C Programming Language” Section 2.7
- Memory Layout: Where is it safe to store variables in real mode?
- Reference: OSDev Wiki - Memory Map
5.5 Questions to Guide Your Design
Console Design:
- Will you use BIOS INT 10h/16h or direct VGA/keyboard access?
- How will you handle extended keys (arrows, function keys)?
- Will you support ANSI escape codes for cursor movement?
Line Editor Design:
- What’s the maximum line length?
- How many history entries will you store?
- Will you support command completion?
- How will you handle terminal resize?
Parser Design:
- Will you support quoted strings with spaces?
- Will you support escape sequences in strings?
- How will you handle variable substitution?
- What about command chaining (;) or pipes (|)?
Command System Design:
- Static table or dynamic registration?
- How will you handle subcommands (e.g., disk read vs disk info)?
- Will commands return status codes?
- How will you implement help for each command?
Memory Safety:
- Will you validate addresses before access?
- What regions will you protect?
- How will you report access errors?
5.6 Thinking Exercise
Before writing any code, complete this exercise:
Exercise: Trace a Command Execution
Given the command md 0x7C00 32, trace through every step:
- Input Phase: How does each character get from keyboard to buffer?
- Parsing Phase: How is the string split into tokens?
- Number Conversion: How is “0x7C00” converted to 31744?
- Dispatch Phase: How is cmd_memdump() found and called?
- Execution Phase: How are the 32 bytes read and formatted?
- Output Phase: How does the hexdump appear on screen?
Write pseudocode for each phase. This will reveal gaps in your understanding.