Project 1: CHIP-8 Interpreter Emulator
Build a complete emulator for the CHIP-8 virtual machine that runs classic games like Pong, Tetris, and Space Invaders, and learn the fetch-decode-execute loop at the heart of CPU emulation in tools such as QEMU.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Beginner |
| Time Estimate | 1 week (20-40 hours) |
| Language | C |
| Alternative Languages | Rust, Go, Zig |
| Prerequisites | Basic C, hexadecimal/binary, willingness to read documentation |
| Key Topics | Fetch-Decode-Execute cycle, Opcode decoding, Register state management, Timer synchronization, Input/Output emulation |
| Main Book | “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron, Chapter 4 |
1. Learning Objectives
After completing this project, you will be able to:
- Explain the fetch-decode-execute cycle that drives all CPUs and emulators
- Implement an instruction decoder that parses binary/hexadecimal opcodes
- Maintain CPU state (registers, program counter, stack) between instruction executions
- Synchronize emulated time with real-world time (60Hz timers)
- Map virtual input devices to host system input
- Render a virtual display using framebuffer concepts
- Debug emulation issues by inspecting CPU state and memory dumps
- Understand why QEMU, VirtualBox, and other VMMs use these same fundamental patterns
2. Theoretical Foundation
2.1 Core Concepts
The Fetch-Decode-Execute Cycle
Every CPU, from the simplest microcontroller to the most complex server processor, operates on the same fundamental principle:
THE FETCH-DECODE-EXECUTE CYCLE:
================================
┌──────────────────────────────────────────────────────────────┐
│ │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ │ │ │ │ │ │
│ ──▶│ FETCH │───▶│ DECODE │───▶│ EXECUTE │──┐ │
│ │ │ │ │ │ │ │ │
│ │ Read bytes │ │ Parse │ │ Perform │ │ │
│ │ from memory│ │ opcode │ │ operation │ │ │
│ │ at PC │ │ fields │ │ │ │ │
│ └────────────┘ └────────────┘ └────────────┘ │ │
│ ▲ │ │
│ │ │ │
│ └──────────────────────────────────────────────┘ │
│ (Update PC, repeat) │
│ │
└──────────────────────────────────────────────────────────────┘
DETAILED BREAKDOWN:
1. FETCH:
- Read instruction bytes from memory[PC]
- CHIP-8: Read 2 bytes (16-bit instruction)
- x86: Read 1-15 bytes (variable length)
- ARM: Read 4 bytes (fixed length)
2. DECODE:
- Parse the instruction format
- Extract opcode, operands, addressing modes
- Determine which operation to perform
3. EXECUTE:
- Perform the operation (arithmetic, memory access, control flow)
- Update registers, memory, or program counter
- Handle side effects (flags, interrupts)
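QEMU has to reproduce the effects of this same cycle when it emulates a guest architecture, although its TCG backend translates blocks of guest instructions into host code rather than interpreting them one at a time. Understanding the cycle deeply is the foundation for everything else in CPU emulation.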
CHIP-8 Architecture Overview
CHIP-8 is a virtual machine specification from the 1970s, designed to make game programming easier on early microcomputers. Its simplicity makes it the perfect learning platform:
CHIP-8 SYSTEM ARCHITECTURE:
============================
┌──────────────────────────────────────────────────────────────────────┐
│ MEMORY MAP (4KB) │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ 0x000 ┌───────────────────────────────────────────────────────────┐ │
│ │ Reserved for Interpreter (Original) │ │
│ │ 0x000-0x04F: Font sprites (80 bytes for 0-F) │ │
│ │ 0x050-0x1FF: Available (often unused) │ │
│ 0x200 ├───────────────────────────────────────────────────────────┤ │
│ │ PROGRAM SPACE │ │
│ │ │ │
│ │ Your CHIP-8 ROM loads here │ │
│ │ Programs start executing at 0x200 │ │
│ │ │ │
│ │ Size: 0x200 to 0xFFF = 3,584 bytes max │ │
│ │ │ │
│ 0xFFF └───────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ CPU REGISTERS │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ General Purpose (8-bit): │
│ ┌────┬────┬────┬────┬────┬────┬────┬────┐ │
│ │ V0 │ V1 │ V2 │ V3 │ V4 │ V5 │ V6 │ V7 │ Registers V0-VF │
│ ├────┼────┼────┼────┼────┼────┼────┼────┤ │
│ │ V8 │ V9 │ VA │ VB │ VC │ VD │ VE │ VF │ VF = flags register │
│ └────┴────┴────┴────┴────┴────┴────┴────┘ │
│ │
│ Special Purpose: │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ I (16-bit) │ Index Register - points to memory locations │ │
│ ├──────────────┼───────────────────────────────────────────────┤ │
│ │ PC (16-bit) │ Program Counter - current instruction address │ │
│ ├──────────────┼───────────────────────────────────────────────┤ │
│ │ SP (8-bit) │ Stack Pointer - current stack level (0-15) │ │
│ ├──────────────┼───────────────────────────────────────────────┤ │
│ │ DT (8-bit) │ Delay Timer - decrements at 60Hz │ │
│ ├──────────────┼───────────────────────────────────────────────┤ │
│ │ ST (8-bit) │ Sound Timer - beeps while non-zero │ │
│ └──────────────┴───────────────────────────────────────────────┘ │
│ │
│ Stack (16 x 16-bit): │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ stack[0..15] │ Call stack for subroutine return addresses │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ DISPLAY (64x32 pixels) │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ 64 pixels wide │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ (0,0) (63,0) │ │ │
│ │ │ │ │ │
│ │ │ 32 pixels tall │ │ │
│ │ │ │ │ │
│ │ │ (0,31) (63,31) │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ - Monochrome (1-bit per pixel: on/off) │
│ - Sprites are XOR'd onto the display │
│ - VF is set if any pixel is erased (collision detection) │
│ │
└──────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ KEYPAD (16 keys) │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ Original CHIP-8 Keypad: Typical PC Mapping: │
│ │
│ ┌───┬───┬───┬───┐ ┌───┬───┬───┬───┐ │
│ │ 1 │ 2 │ 3 │ C │ │ 1 │ 2 │ 3 │ 4 │ │
│ ├───┼───┼───┼───┤ ├───┼───┼───┼───┤ │
│ │ 4 │ 5 │ 6 │ D │ ──▶ │ Q │ W │ E │ R │ │
│ ├───┼───┼───┼───┤ ├───┼───┼───┼───┤ │
│ │ 7 │ 8 │ 9 │ E │ │ A │ S │ D │ F │ │
│ ├───┼───┼───┼───┤ ├───┼───┼───┼───┤ │
│ │ A │ 0 │ B │ F │ │ Z │ X │ C │ V │ │
│ └───┴───┴───┴───┘ └───┴───┴───┴───┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
Instruction Encoding
CHIP-8 uses 35 simple instructions, all 16 bits (2 bytes) in length:
CHIP-8 INSTRUCTION FORMAT:
===========================
All instructions are 2 bytes, stored big-endian (MSB first).
The first nibble (4 bits) determines the instruction class.
Notation:
NNN = 12-bit address (0x000 to 0xFFF)
NN = 8-bit constant (0x00 to 0xFF)
N = 4-bit constant (0x0 to 0xF)
X = 4-bit register identifier (V0-VF)
Y = 4-bit register identifier (V0-VF)
INSTRUCTION ENCODING EXAMPLES:
==============================
0x1NNN - Jump to address NNN
┌───────────────────────────────────┐
│ 0001 │ N │ N │ N │ │
│ ──── │ ── │ ─── │ ─── │ │
│ 0x1 │ address │ │
└───────────────────────────────────┘
Example: 0x1234 = Jump to 0x234
0x6XNN - Set VX = NN
┌───────────────────────────────────┐
│ 0110 │ X │ N │ N │ │
│ ──── │ ── │ ─── │ ─── │ │
│ 0x6 │ reg│ value │ │
└───────────────────────────────────┘
Example: 0x6A42 = Set VA = 0x42
0x8XY4 - Add VY to VX, VF = carry
┌───────────────────────────────────┐
│ 1000 │ X │ Y │ 0100│ │
│ ──── │ ── │ ─── │ ─── │ │
│ 0x8 │ VX │ VY │ 0x4 │ │
└───────────────────────────────────┘
Example: 0x8124 = V1 = V1 + V2, VF = carry
DECODING STRATEGY:
==================
uint16_t opcode = (memory[PC] << 8) | memory[PC + 1];
uint8_t first_nibble = (opcode & 0xF000) >> 12; // Instruction class
uint8_t X = (opcode & 0x0F00) >> 8; // Register X
uint8_t Y = (opcode & 0x00F0) >> 4; // Register Y
uint8_t N = (opcode & 0x000F); // 4-bit constant
uint8_t NN = (opcode & 0x00FF); // 8-bit constant
uint16_t NNN = (opcode & 0x0FFF); // 12-bit address
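The decoding strategy above can be packaged into a small helper so every opcode handler sees the same fields. The following is a minimal sketch, not part of any existing library: the `DecodedOp` type and the `fetch_opcode` and `decode_opcode` names are assumptions introduced here for illustration.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields of one opcode. */
typedef struct {
    uint8_t  op;   /* first nibble: instruction class */
    uint8_t  x;    /* second nibble: register index X */
    uint8_t  y;    /* third nibble: register index Y */
    uint8_t  n;    /* fourth nibble: 4-bit constant */
    uint8_t  nn;   /* low byte: 8-bit constant */
    uint16_t nnn;  /* low 12 bits: address */
} DecodedOp;

/* Assemble a big-endian 16-bit opcode from two consecutive bytes. */
static uint16_t fetch_opcode(const uint8_t *mem, uint16_t pc) {
    return (uint16_t)((mem[pc] << 8) | mem[pc + 1]);
}

/* Split an opcode into every field a handler might need. */
static DecodedOp decode_opcode(uint16_t opcode) {
    DecodedOp d;
    d.op  = (uint8_t)((opcode & 0xF000) >> 12);
    d.x   = (uint8_t)((opcode & 0x0F00) >> 8);
    d.y   = (uint8_t)((opcode & 0x00F0) >> 4);
    d.n   = (uint8_t)(opcode & 0x000F);
    d.nn  = (uint8_t)(opcode & 0x00FF);
    d.nnn = (uint16_t)(opcode & 0x0FFF);
    return d;
}
```

Decoding all fields up front costs a few shifts per instruction but keeps each handler free of masking logic; fields that a given opcode does not use are simply ignored.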
2.2 Why This Matters
Building a CHIP-8 emulator teaches fundamental concepts that apply to all virtualization:
- Hypervisor CPU Virtualization: QEMU’s TCG (Tiny Code Generator) has to fetch and decode guest instructions just as your interpreter does, but it then translates them into host code in blocks instead of executing them one at a time. Your CHIP-8 emulator is the simplest form of the same idea.
- Instruction Set Architecture Understanding: Every CPU has an ISA. Understanding one (CHIP-8’s 35 instructions) makes learning others (x86’s thousands, ARM’s hundreds) approachable.
- State Machine Programming: An emulator is a state machine. This pattern appears everywhere: network protocol handlers, game engines, parser implementations.
- Hardware-Software Interface: Emulating display, keyboard, and timers teaches how hardware and software communicate - essential for device driver development and understanding QEMU’s device emulation.
- Timing and Synchronization: Matching emulated time to real time is fundamental to any virtual machine. If your game runs too fast or too slow, you haven’t solved synchronization.
2.3 Historical Context
1970s - CHIP-8 Origins: Joseph Weisbecker designed CHIP-8 for the COSMAC VIP kit computer (1977). It was a virtual machine that made game programming accessible - you wrote CHIP-8 code, and an interpreter ran it.
Why Virtual Machines Then?: Early computers had tiny memories (2-4KB). A bytecode interpreter saved memory compared to native machine code. Sound familiar? Java, Python, and JavaScript all use similar approaches today.
The Connection to Modern VMs: CHIP-8 was doing in 1977 what we still do today:
- Bytecode interpretation (like JVM, CPython)
- Cross-platform abstraction (like Docker containers)
- Hardware independence (like QEMU)
2.4 Common Misconceptions
Misconception 1: “Emulation is slow and outdated”
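- Reality: Dynamic translators such as QEMU’s TCG and the JIT compilers in Java and JavaScript runtimes are far faster than naive interpretation, and hardware-assisted virtualization runs guest code at close to native speed. Emulation also enables portability, security isolation, and backward compatibility.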
Misconception 2: “I need to understand hardware deeply”
- Reality: You’re emulating a specification, not physical hardware. CHIP-8’s spec is a few pages. You implement behavior, not transistors.
Misconception 3: “The main loop is the hard part”
- Reality: The main loop is simple. The hard parts are: subtle opcode semantics, timing accuracy, and input handling edge cases.
Misconception 4: “I can just interpret each instruction”
- Reality: You need to handle timing (60Hz timers), display updates (potentially at different rates), and input (asynchronously). The emulation loop is more than just executing instructions.
Misconception 5: “My emulator should run as fast as possible”
- Reality: Running too fast makes games unplayable. You must throttle to the original timing (typically 500-1000 instructions per second for CHIP-8).
3. Project Specification
3.1 What You Will Build
A complete CHIP-8 emulator that:
- Loads and executes CHIP-8 ROM files
- Implements all 35 standard CHIP-8 instructions
- Renders the 64x32 pixel display in a window
- Maps keyboard input to CHIP-8’s 16-key hex keypad
- Runs delay and sound timers at 60Hz
- Achieves correct timing so games are playable
3.2 Functional Requirements
- Load ROM files from 0x200 onwards
- Initialize font sprites at 0x000-0x04F
- Implement all 35 opcodes correctly
- Render 64x32 monochrome display
- Handle 16-key input
- Decrement delay/sound timers at 60Hz
- Play a beep when sound timer > 0
- Run at approximately 500-700 instructions per second (configurable)
3.3 Non-Functional Requirements
- Portable C code (Linux, macOS, Windows)
- Uses SDL2 for graphics, input, and audio
- Clean separation between emulator core and platform code
- Well-documented opcode implementations
- Debug mode showing CPU state
3.4 Example Usage / Output
# Basic usage
$ ./chip8 roms/PONG.ch8
# Opens window with Pong game running
# Debug mode
$ ./chip8 --debug roms/PONG.ch8
[0200] 6A02 LD VA, 02 VA=02
[0202] 6B0C LD VB, 0C VB=0C
[0204] 6C3F LD VC, 3F VC=3F
[0206] 6D0C LD VD, 0C VD=0C
[0208] A2EA LD I, 2EA I=02EA
...
Press SPACE to step, R to run
# Verbose timing
$ ./chip8 --verbose roms/TETRIS.ch8
[CHIP8] ROM loaded: 294 bytes
[CHIP8] Running at 540 IPS
[CHIP8] Frame time: 16.67ms (60 FPS)
[CHIP8] Timer decremented: DT=45, ST=0
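The debug trace shown above needs a way to turn an opcode into a mnemonic. The sketch below covers only a handful of opcodes and falls back to raw hex for the rest; the function name `chip8_disasm` and the exact output format are assumptions chosen to resemble the sample trace, not a fixed specification.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write a mnemonic for `opcode` into `out`.
 * Only a few opcodes are covered; everything else prints as raw data. */
static void chip8_disasm(uint16_t opcode, char *out, size_t out_size) {
    unsigned x   = (opcode & 0x0F00) >> 8;
    unsigned nn  = opcode & 0x00FF;
    unsigned nnn = opcode & 0x0FFF;

    switch (opcode & 0xF000) {
    case 0x1000: snprintf(out, out_size, "JP %03X", nnn); break;
    case 0x2000: snprintf(out, out_size, "CALL %03X", nnn); break;
    case 0x6000: snprintf(out, out_size, "LD V%X, %02X", x, nn); break;
    case 0x7000: snprintf(out, out_size, "ADD V%X, %02X", x, nn); break;
    case 0xA000: snprintf(out, out_size, "LD I, %03X", nnn); break;
    default:
        if (opcode == 0x00E0) {
            snprintf(out, out_size, "CLS");
        } else if (opcode == 0x00EE) {
            snprintf(out, out_size, "RET");
        } else {
            snprintf(out, out_size, "DW %04X", (unsigned)opcode);
        }
        break;
    }
}
```

In debug mode, a caller could format each line as the address, the raw opcode, and the string produced here, then append whichever register the instruction changed.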
3.5 Real World Outcome
When complete, your emulator will run classic games:
PONG.ch8 Running in Emulator:
==============================
┌────────────────────────────────────────────────────────────────────┐
│ │
│ │
│ ██ ██ │
│ ██ ██ │
│ ██ ● ██ │
│ ██ ██ │
│ ██ ██ │
│ │
│ │
│ │
│ Score: 3 Score: 2 │
│ │
└────────────────────────────────────────────────────────────────────┘
Controls:
Player 1: 1 (up), Q (down)
Player 2: 4 (up), R (down)
Your emulator is "virtualizing" a complete game system!
4. Solution Architecture
4.1 High-Level Design
CHIP-8 EMULATOR ARCHITECTURE:
==============================
┌──────────────────────────────────────────────────────────────────────┐
│ HOST SYSTEM (Your PC) │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ PLATFORM LAYER (SDL2) │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ DISPLAY │ │ INPUT │ │ AUDIO │ │ │
│ │ │ Renderer │ │ Handler │ │ Beeper │ │ │
│ │ │ (SDL2) │ │ (SDL2) │ │ (SDL2) │ │ │
│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │
│ │ │ │ │ │ │
│ └─────────┼─────────────────┼─────────────────┼───────────────────┘ │
│ │ │ │ │
│ ┌─────────┴─────────────────┴─────────────────┴───────────────────┐ │
│ │ EMULATOR CORE │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ MAIN EMULATION LOOP │ │ │
│ │ │ │ │ │
│ │ │ while (running) { │ │ │
│ │ │ 1. Handle input (update keypad state) │ │ │
│ │ │ 2. Execute N instructions (fetch-decode-execute) │ │ │
│ │ │ 3. Update timers (at 60Hz) │ │ │
│ │ │ 4. Render display (if draw flag set) │ │ │
│ │ │ 5. Synchronize timing (sleep if too fast) │ │ │
│ │ │ } │ │ │
│ │ │ │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ CPU STATE │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────────┐ │ │ │
│ │ │ │ V0-VF │ │ I, PC │ │ SP │ │ Stack[16] │ │ │ │
│ │ │ │ (8-bit) │ │ (16-bit)│ │ (8-bit) │ │ (16-bit each) │ │ │ │
│ │ │ └─────────┘ └─────────┘ └─────────┘ └─────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Memory[4096] (4KB) │ │ │ │
│ │ │ └─────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Display[64][32] (2048 pixels) │ │ │ │
│ │ │ └─────────────────────────────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ ┌─────────┐ ┌─────────┐ ┌─────────────────────────────┐ │ │ │
│ │ │ │ DT (8) │ │ ST (8) │ │ Keypad[16] (pressed state) │ │ │ │
│ │ │ └─────────┘ └─────────┘ └─────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ INSTRUCTION DECODER │ │ │
│ │ │ │ │ │
│ │ │ switch (opcode & 0xF000) { │ │ │
│ │ │ case 0x0000: handle_0x0_instructions(); break; │ │ │
│ │ │ case 0x1000: jump_to_NNN(); break; │ │ │
│ │ │ case 0x2000: call_subroutine_NNN(); break; │ │ │
│ │ │ ... │ │ │
│ │ │ } │ │ │
│ │ │ │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
4.2 Key Components
- CPU State Structure: Contains all registers, memory, display buffer, and timers
- ROM Loader: Reads binary file into memory starting at 0x200
- Font Loader: Initializes built-in hex digit sprites at 0x000-0x04F
- Instruction Decoder: Giant switch statement mapping opcodes to handlers
- Instruction Handlers: 35 functions implementing each opcode
- Timer System: Decrements DT and ST at exactly 60Hz
- Display Renderer: Converts 64x32 pixel buffer to SDL window
- Input Handler: Maps SDL keyboard events to 16-key keypad
- Audio System: Plays tone when sound timer > 0
4.3 Data Structures
/* Core CHIP-8 state - everything the "CPU" needs */
typedef struct {
/* Memory: 4KB total */
uint8_t memory[4096];
/* Display: 64x32 pixels, 1 bit per pixel */
uint8_t display[64 * 32];
/* General purpose registers: V0-VF (VF is flag register) */
uint8_t V[16];
/* Index register: points to memory locations */
uint16_t I;
/* Program counter: current instruction address */
uint16_t PC;
/* Stack: 16 levels of subroutine nesting */
uint16_t stack[16];
uint8_t SP; /* Stack pointer */
/* Timers: decrement at 60Hz */
uint8_t delay_timer;
uint8_t sound_timer;
/* Keypad: 16 keys, 1 = pressed */
uint8_t keypad[16];
/* Flags */
bool draw_flag; /* True when display needs update */
bool sound_flag; /* True when beep should play */
bool waiting_key; /* True during Fx0A (wait for key) */
uint8_t key_reg; /* Register to store key in during wait */
} Chip8;
/* Font sprites: 5 bytes per character, 16 characters (0-F) */
static const uint8_t FONT_SET[80] = {
0xF0, 0x90, 0x90, 0x90, 0xF0, /* 0 */
0x20, 0x60, 0x20, 0x20, 0x70, /* 1 */
0xF0, 0x10, 0xF0, 0x80, 0xF0, /* 2 */
0xF0, 0x10, 0xF0, 0x10, 0xF0, /* 3 */
0x90, 0x90, 0xF0, 0x10, 0x10, /* 4 */
0xF0, 0x80, 0xF0, 0x10, 0xF0, /* 5 */
0xF0, 0x80, 0xF0, 0x90, 0xF0, /* 6 */
0xF0, 0x10, 0x20, 0x40, 0x40, /* 7 */
0xF0, 0x90, 0xF0, 0x90, 0xF0, /* 8 */
0xF0, 0x90, 0xF0, 0x10, 0xF0, /* 9 */
0xF0, 0x90, 0xF0, 0x90, 0x90, /* A */
0xE0, 0x90, 0xE0, 0x90, 0xE0, /* B */
0xF0, 0x80, 0x80, 0x80, 0xF0, /* C */
0xE0, 0x90, 0x90, 0x90, 0xE0, /* D */
0xF0, 0x80, 0xF0, 0x80, 0xF0, /* E */
0xF0, 0x80, 0xF0, 0x80, 0x80 /* F */
};
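Because each glyph is 5 bytes and the set is loaded at 0x000, the FX29 instruction reduces to a multiplication. The sketch below shows that address calculation plus a small helper for eyeballing a sprite row as text; `font_sprite_address`, `sprite_row_to_text`, and the two macros are hypothetical names introduced for this example.

```c
#include <stdint.h>

#define FONT_BASE        0x000  /* where this guide loads FONT_SET */
#define FONT_GLYPH_BYTES 5      /* each hex digit sprite is 5 rows tall */

/* Address of the sprite for the hex digit in the low nibble of `value`. */
static uint16_t font_sprite_address(uint8_t value) {
    return (uint16_t)(FONT_BASE + (value & 0x0F) * FONT_GLYPH_BYTES);
}

/* Render one sprite row as 8 characters ('#' = lit, '.' = dark) plus NUL.
 * Bit 7 of the row byte is the leftmost pixel. */
static void sprite_row_to_text(uint8_t row, char out[9]) {
    for (int col = 0; col < 8; col++) {
        out[col] = ((row >> (7 - col)) & 1) ? '#' : '.';
    }
    out[8] = '\0';
}
```

Printing the five rows of a glyph with `sprite_row_to_text` is a quick way to confirm the font table was typed in correctly: only the left four columns should ever be lit.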
4.4 Algorithm Overview
Main Emulation Loop:
ALGORITHM: Emulation Loop
1. Initialize SDL (window, renderer, audio)
2. Initialize CHIP-8 state (clear memory, load fonts, PC = 0x200)
3. Load ROM into memory at 0x200
4. WHILE running:
4.1. Handle SDL events (input, quit)
- Update keypad[] array based on key states
4.2. IF not waiting for key:
- Execute 10-20 instructions (configurable)
- For each instruction:
a. Fetch: opcode = (memory[PC] << 8) | memory[PC+1]
b. Decode: Extract X, Y, N, NN, NNN from opcode
c. Execute: Switch on opcode, perform operation
d. Increment PC (unless jump/call modified it)
4.3. Update timers (once per frame, 60Hz):
- IF delay_timer > 0: delay_timer--
- IF sound_timer > 0: sound_timer--, set sound_flag
4.4. IF draw_flag:
- Render display[] to SDL window
- draw_flag = false
4.5. IF sound_flag:
- Play beep sound
4.6. Delay to maintain ~60 FPS (16.67ms per frame)
5. Cleanup SDL and exit
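Steps 4.2, 4.3, and 4.6 of the loop above come down to three small calculations: how many instructions to run per frame, what one 60Hz timer tick does, and how many ticks are owed after some amount of real time. The following sketch keeps them independent of SDL so they can be tested in isolation; the `Timers` type and all three function names are assumptions made for this example.

```c
#include <stdint.h>

#define TIMER_HZ 60

/* Hypothetical minimal state: just the two timers. */
typedef struct {
    uint8_t delay_timer;
    uint8_t sound_timer;
} Timers;

/* Instructions to run per 60Hz frame for a target instructions-per-second rate. */
static int instructions_per_frame(int ips) {
    return ips / TIMER_HZ;
}

/* One 60Hz tick. Returns 1 if the beeper should sound during this frame. */
static int timers_tick(Timers *t) {
    int beep = t->sound_timer > 0;
    if (t->delay_timer > 0) t->delay_timer--;
    if (t->sound_timer > 0) t->sound_timer--;
    return beep;
}

/* Add `elapsed_us` microseconds to the accumulator and return how many whole
 * 60Hz ticks are now due. The remainder stays in *acc_us for the next call. */
static int ticks_due(uint32_t *acc_us, uint32_t elapsed_us) {
    const uint32_t tick_us = 1000000u / TIMER_HZ; /* 16666 us */
    int ticks;
    *acc_us += elapsed_us;
    ticks = (int)(*acc_us / tick_us);
    *acc_us %= tick_us;
    return ticks;
}
```

Accumulating elapsed time, rather than assuming every loop iteration takes exactly 16.67ms, keeps the timers at 60Hz even when a frame runs long; the main loop would call `timers_tick` once for each tick that `ticks_due` reports.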
Instruction Decode Logic:
ALGORITHM: Decode and Execute
INPUT: 16-bit opcode
1. Extract fields:
X = (opcode & 0x0F00) >> 8
Y = (opcode & 0x00F0) >> 4
N = opcode & 0x000F
NN = opcode & 0x00FF
NNN = opcode & 0x0FFF
2. Switch on (opcode & 0xF000):
CASE 0x0000:
IF opcode == 0x00E0: Clear display
IF opcode == 0x00EE: Return from subroutine
ELSE: (0x0NNN was call RCA 1802 program - usually ignored)
CASE 0x1000: Jump to NNN
CASE 0x2000: Call subroutine at NNN
CASE 0x3000: Skip if VX == NN
CASE 0x4000: Skip if VX != NN
CASE 0x5000: Skip if VX == VY
CASE 0x6000: VX = NN
CASE 0x7000: VX += NN (no carry flag)
CASE 0x8000: (Arithmetic operations based on N)
0x8XY0: VX = VY
0x8XY1: VX |= VY
0x8XY2: VX &= VY
0x8XY3: VX ^= VY
0x8XY4: VX += VY (VF = carry)
0x8XY5: VX -= VY (VF = NOT borrow)
0x8XY6: VX >>= 1 (VF = LSB before shift)
0x8XY7: VX = VY - VX (VF = NOT borrow)
0x8XYE: VX <<= 1 (VF = MSB before shift)
CASE 0x9000: Skip if VX != VY
CASE 0xA000: I = NNN
CASE 0xB000: Jump to NNN + V0
CASE 0xC000: VX = random() & NN
CASE 0xD000: Draw sprite at (VX, VY), height N
CASE 0xE000:
0xEX9E: Skip if key VX pressed
0xEXA1: Skip if key VX not pressed
CASE 0xF000: (Misc operations based on NN)
0xFX07: VX = delay_timer
0xFX0A: Wait for key, store in VX
0xFX15: delay_timer = VX
0xFX18: sound_timer = VX
0xFX1E: I += VX
0xFX29: I = font sprite address for VX
0xFX33: Store BCD of VX at I, I+1, I+2
0xFX55: Store V0-VX at I (I unchanged in modern)
0xFX65: Load V0-VX from I (I unchanged in modern)
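Three of the 0xF000 entries above touch memory through I and are easy to get subtly wrong. The sketch below shows one way to write FX33, FX55, and FX65 as standalone helpers, following the "modern" behaviour noted in the listing where I is left unchanged. The function names are hypothetical, and masking addresses to 12 bits is a defensive choice made here rather than something the listing specifies.

```c
#include <stdint.h>

/* FX33: write the hundreds, tens, and ones digits of `value`
 * to mem[i], mem[i+1], mem[i+2]. */
static void store_bcd(uint8_t *mem, uint16_t i, uint8_t value) {
    mem[(i + 0) & 0x0FFF] = value / 100;
    mem[(i + 1) & 0x0FFF] = (value / 10) % 10;
    mem[(i + 2) & 0x0FFF] = value % 10;
}

/* FX55: copy V0..VX (inclusive) into memory starting at i. */
static void store_registers(uint8_t *mem, uint16_t i, const uint8_t *v, uint8_t x) {
    for (uint8_t r = 0; r <= (x & 0x0F); r++) {
        mem[(i + r) & 0x0FFF] = v[r];
    }
}

/* FX65: fill V0..VX (inclusive) from memory starting at i. */
static void load_registers(const uint8_t *mem, uint16_t i, uint8_t *v, uint8_t x) {
    for (uint8_t r = 0; r <= (x & 0x0F); r++) {
        v[r] = mem[(i + r) & 0x0FFF];
    }
}
```

Note that the register range is inclusive: FX55 with X = 0 still stores one register. An off-by-one here is a common cause of corrupted score displays.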
5. Implementation Guide
5.1 Development Environment Setup
Required Tools:
# macOS
brew install sdl2
xcode-select --install # For clang
# Ubuntu/Debian
sudo apt update
sudo apt install build-essential libsdl2-dev
# Fedora/RHEL
sudo dnf install gcc SDL2-devel
# Windows (MSYS2/MinGW)
pacman -S mingw-w64-x86_64-gcc mingw-w64-x86_64-SDL2
# Verify installation
sdl2-config --version # Should show 2.x.x
gcc --version # Should show version info
Getting ROMs (Legal Test ROMs):
# Download test ROMs (public domain)
git clone https://github.com/corax89/chip8-test-rom.git
git clone https://github.com/Timendus/chip8-test-suite.git
# Classic games (many are public domain/abandonware)
# Search for: PONG.ch8, TETRIS.ch8, INVADERS.ch8
5.2 Project Structure
chip8-emulator/
├── src/
│ ├── main.c # Entry point, argument parsing
│ ├── chip8.c # Core emulator (CPU state, opcodes)
│ ├── chip8.h # CHIP-8 structure and function declarations
│ ├── platform.c # SDL2 wrapper (display, input, audio)
│ └── platform.h # Platform abstraction interface
├── roms/ # Test ROMs
│ ├── test_opcode.ch8 # Opcode test suite
│ └── PONG.ch8 # Test game
├── Makefile # Build automation
└── README.md # Documentation
Makefile:
CC = gcc
CFLAGS = -Wall -Wextra -std=c11 -O2
LDFLAGS = $(shell sdl2-config --libs) -lm
INCLUDES = $(shell sdl2-config --cflags)
SRCS = src/main.c src/chip8.c src/platform.c
OBJS = $(SRCS:.c=.o)
TARGET = chip8
all: $(TARGET)
$(TARGET): $(OBJS)
	$(CC) $(OBJS) -o $@ $(LDFLAGS)
%.o: %.c
	$(CC) $(CFLAGS) $(INCLUDES) -c $< -o $@
clean:
	rm -f $(OBJS) $(TARGET)
debug: CFLAGS += -g -DDEBUG
debug: clean all
run: all
	./$(TARGET) roms/test_opcode.ch8
.PHONY: all clean debug run
5.3 The Core Question You’re Answering
“How does software simulate hardware, and what does it mean to ‘emulate’ a CPU?”
This project answers the fundamental question of how one computer can pretend to be another. You’ll discover:
- How the fetch-decode-execute cycle works at the implementation level
- Why instruction encoding matters and how to parse binary formats
- How to maintain “virtual” state that mimics real hardware
- Why timing accuracy is crucial for any emulator or virtual machine
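The same model underlies QEMU’s TCG when emulating ARM on x86, and Apple’s Rosetta when running x86 code on Apple Silicon. Those tools translate blocks of guest code into host code instead of interpreting one instruction at a time, but they must reproduce exactly the state changes that this loop performs.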
5.4 Concepts You Must Understand First
Before writing code, ensure you can answer these self-assessment questions:
Hexadecimal and Binary:
- Q: What is 0xA5 in binary?
- A: 1010 0101 (split into nibbles: A=1010, 5=0101)
Bitwise Operations:
- Q: How do you extract the upper nibble of a byte?
- A: (byte & 0xF0) >> 4, or simply byte >> 4
Memory Layout:
- Q: If a 16-bit value 0x1234 is stored big-endian starting at address 0x200, what’s at 0x200 and 0x201?
- A: 0x200 = 0x12, 0x201 = 0x34
XOR Properties:
- Q: What happens when you XOR a value with itself? With 0?
- A: XOR with itself = 0, XOR with 0 = unchanged (this is how CHIP-8 sprite drawing works)
Stack Operations:
- Q: In a stack that grows “up”, what does push do?
- A: stack[SP++] = value (store, then increment)
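The answers above can be checked with a few small helpers. This is a minimal sketch with hypothetical names (`upper_nibble`, `write_be16`, `xor_byte`, `Stack16`, `stack_push`, `stack_pop`); the bounds checks on the stack are an addition for safety and are not implied by the questions.

```c
#include <stdint.h>

/* Upper nibble: for an 8-bit value the shift alone is enough. */
static uint8_t upper_nibble(uint8_t byte) {
    return (uint8_t)(byte >> 4);
}

/* Big-endian store: most significant byte goes at the lower address. */
static void write_be16(uint8_t *mem, uint16_t addr, uint16_t value) {
    mem[addr]     = (uint8_t)(value >> 8);
    mem[addr + 1] = (uint8_t)(value & 0xFF);
}

/* XOR a sprite byte onto a screen byte, as CHIP-8 drawing does. */
static uint8_t xor_byte(uint8_t screen, uint8_t sprite) {
    return (uint8_t)(screen ^ sprite);
}

/* Upward-growing 16-entry stack: push stores then increments,
 * pop decrements then loads. Both return 0 on overflow/underflow. */
typedef struct {
    uint16_t slots[16];
    uint8_t  sp;
} Stack16;

static int stack_push(Stack16 *s, uint16_t value) {
    if (s->sp >= 16) return 0;
    s->slots[s->sp++] = value;
    return 1;
}

static int stack_pop(Stack16 *s, uint16_t *value) {
    if (s->sp == 0) return 0;
    *value = s->slots[--s->sp];
    return 1;
}
```

Applying `xor_byte` twice with the same sprite byte restores the original screen byte, which is why drawing a sprite a second time at the same position erases it.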
5.5 Questions to Guide Your Design
Architecture:
- How will you separate the emulator core from platform-specific code?
- Should the emulator run at a fixed clock speed or variable?
- How many instructions should execute per frame (60Hz)?
Implementation:
- How will you handle the nine 0x8XY_ instruction sub-types?
- How will you implement the BCD (binary-coded decimal) instruction?
- How will you handle the “wait for keypress” instruction?
Testing:
- How will you verify each opcode works correctly?
- How will you test timing accuracy?
5.6 Thinking Exercise
Before writing any code, trace through this simple CHIP-8 program by hand:
Address Opcode Meaning
------ ------ -------
0x200 6A05 LD VA, 0x05 ; VA = 5
0x202 6B03 LD VB, 0x03 ; VB = 3
0x204 8AB4 ADD VA, VB ; VA = VA + VB
0x206 00E0 CLS ; Clear display
0x208 1200 JP 0x200 ; Jump back to start
Trace Table:
| PC | Opcode | Operation | VA | VB | VF | Notes |
|---|---|---|---|---|---|---|
| 0x200 | 0x6A05 | VA = 5 | 5 | ? | ? | Initialize VA |
| 0x202 | 0x6B03 | VB = 3 | 5 | 3 | ? | Initialize VB |
| 0x204 | 0x8AB4 | VA = VA + VB | 8 | 3 | 0 | 5+3=8, no carry |
| 0x206 | 0x00E0 | Clear display | 8 | 3 | 0 | Display cleared |
| 0x208 | 0x1200 | Jump to 0x200 | 8 | 3 | 0 | Loop back |
| 0x200 | 0x6A05 | VA = 5 | 5 | 3 | 0 | Reset VA |
| … | … | (continues) | … | … | … | Infinite loop |
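Question: As written, the instruction at 0x200 resets VA to 5 on every pass, so the register values repeat forever. Suppose the final jump were JP 0x204 instead, so that VA keeps accumulating 3 on each pass. After how many additions does VA first wrap past 255, and what are VA and VF at that point?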
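Once you have traced the program by hand, a throwaway interpreter that understands only the four opcodes it uses is enough to check your table. The sketch below is deliberately incomplete: `MiniChip8`, `mini_init`, and `mini_step` are hypothetical names, there is no display (CLS is a no-op), and any other opcode is reported as unhandled.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  memory[4096];
    uint8_t  V[16];
    uint16_t PC;
} MiniChip8;

/* The five instructions from the exercise, stored big-endian. */
static const uint8_t EXERCISE_PROGRAM[] = {
    0x6A, 0x05,  /* 0x200: LD VA, 0x05 */
    0x6B, 0x03,  /* 0x202: LD VB, 0x03 */
    0x8A, 0xB4,  /* 0x204: ADD VA, VB  */
    0x00, 0xE0,  /* 0x206: CLS         */
    0x12, 0x00   /* 0x208: JP 0x200    */
};

static void mini_init(MiniChip8 *c) {
    memset(c, 0, sizeof *c);
    memcpy(&c->memory[0x200], EXERCISE_PROGRAM, sizeof EXERCISE_PROGRAM);
    c->PC = 0x200;
}

/* Execute one instruction. Returns 1 on success, 0 for an opcode
 * this sketch does not handle. */
static int mini_step(MiniChip8 *c) {
    uint16_t op = (uint16_t)((c->memory[c->PC & 0x0FFF] << 8) |
                             c->memory[(c->PC + 1) & 0x0FFF]);
    uint8_t x = (uint8_t)((op & 0x0F00) >> 8);
    uint8_t y = (uint8_t)((op & 0x00F0) >> 4);

    c->PC += 2;
    switch (op & 0xF000) {
    case 0x0000:                      /* CLS only; nothing to clear here */
        return op == 0x00E0;
    case 0x1000:                      /* JP NNN */
        c->PC = op & 0x0FFF;
        return 1;
    case 0x6000:                      /* LD VX, NN */
        c->V[x] = (uint8_t)(op & 0x00FF);
        return 1;
    case 0x8000:                      /* only 8XY4 (ADD with carry) */
        if ((op & 0x000F) != 0x4) return 0;
        {
            uint16_t sum = (uint16_t)(c->V[x] + c->V[y]);
            c->V[x] = (uint8_t)(sum & 0xFF);
            c->V[0xF] = (uint8_t)(sum > 0xFF);  /* flag written last */
        }
        return 1;
    default:
        return 0;
    }
}
```

Calling `mini_step` in a loop and printing PC, VA, VB, and VF after each call should reproduce the trace table row by row.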
5.7 Hints in Layers
Use these progressive hints only when stuck. Try each level before moving to the next.
Hint 1: Starting Point - Basic Structure
Start with the CHIP-8 state structure and initialization:
/* chip8.h */
#ifndef CHIP8_H
#define CHIP8_H
#include <stdint.h>
#include <stdbool.h>
typedef struct {
uint8_t memory[4096];
uint8_t display[64 * 32];
uint8_t V[16];
uint16_t I;
uint16_t PC;
uint16_t stack[16];
uint8_t SP;
uint8_t delay_timer;
uint8_t sound_timer;
uint8_t keypad[16];
bool draw_flag;
} Chip8;
void chip8_init(Chip8* chip8);
bool chip8_load_rom(Chip8* chip8, const char* filename);
void chip8_cycle(Chip8* chip8);
void chip8_update_timers(Chip8* chip8);
#endif
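/* chip8.c - initialization */
#include <string.h> /* memset, memcpy */
#include "chip8.h"
/* FONT_SET (section 4.3) must be defined in this file, above chip8_init */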
void chip8_init(Chip8* chip8) {
memset(chip8, 0, sizeof(Chip8));
chip8->PC = 0x200; /* Programs start here */
/* Load font set into memory at 0x000 */
memcpy(chip8->memory, FONT_SET, sizeof(FONT_SET));
}
Hint 2: ROM Loading
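/* Requires <stdio.h>. Returns false if the file is missing, empty, too large, or unreadable. */
bool chip8_load_rom(Chip8* chip8, const char* filename) {
    FILE* file = fopen(filename, "rb");
    if (!file) return false;
    /* Get file size */
    if (fseek(file, 0, SEEK_END) != 0) {
        fclose(file);
        return false;
    }
    long size = ftell(file);
    rewind(file);
    /* Reject ftell errors (size < 0), empty files, and ROMs that do not fit above 0x200 */
    if (size <= 0 || size > (4096 - 0x200)) {
        fclose(file);
        return false;
    }
    /* Read ROM into memory starting at 0x200 and confirm every byte arrived */
    size_t bytes_read = fread(&chip8->memory[0x200], 1, (size_t)size, file);
    fclose(file);
    return bytes_read == (size_t)size;
}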
Hint 3: The Fetch-Decode-Execute Core
void chip8_cycle(Chip8* chip8) {
/* FETCH: Get 16-bit opcode (big-endian) */
uint16_t opcode = (chip8->memory[chip8->PC] << 8) |
chip8->memory[chip8->PC + 1];
/* Advance PC before execution (some ops modify it) */
chip8->PC += 2;
/* DECODE: Extract common fields */
uint8_t X = (opcode & 0x0F00) >> 8;
uint8_t Y = (opcode & 0x00F0) >> 4;
uint8_t N = opcode & 0x000F;
uint8_t NN = opcode & 0x00FF;
uint16_t NNN = opcode & 0x0FFF;
/* EXECUTE: Decode and run instruction */
switch (opcode & 0xF000) {
case 0x0000:
switch (opcode) {
case 0x00E0: /* CLS - Clear display */
memset(chip8->display, 0, sizeof(chip8->display));
chip8->draw_flag = true;
break;
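case 0x00EE: /* RET - Return from subroutine */
if (chip8->SP == 0) { /* Guard: RET with an empty stack would index stack[-1] */
printf("Stack underflow at 0x%03X\n", (unsigned)(chip8->PC - 2));
break;
}
chip8->SP--;
chip8->PC = chip8->stack[chip8->SP];
break;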
}
break;
case 0x1000: /* JP NNN - Jump to address */
chip8->PC = NNN;
break;
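case 0x2000: /* CALL NNN - Call subroutine */
if (chip8->SP >= 16) { /* Guard: a 17th nested call would write past stack[15] */
printf("Stack overflow at 0x%03X\n", (unsigned)(chip8->PC - 2));
break;
}
chip8->stack[chip8->SP] = chip8->PC; /* PC already points at the next instruction */
chip8->SP++;
chip8->PC = NNN;
break;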
/* ... more cases ... */
default:
printf("Unknown opcode: 0x%04X\n", opcode);
break;
}
}
Hint 4: The 0x8XY_ Arithmetic Instructions
These are the trickiest opcodes - be careful with carry/borrow flags:
case 0x8000:
switch (N) {
case 0x0: /* LD VX, VY */
chip8->V[X] = chip8->V[Y];
break;
case 0x1: /* OR VX, VY */
chip8->V[X] |= chip8->V[Y];
break;
case 0x2: /* AND VX, VY */
chip8->V[X] &= chip8->V[Y];
break;
case 0x3: /* XOR VX, VY */
chip8->V[X] ^= chip8->V[Y];
break;
case 0x4: /* ADD VX, VY with carry */
{
uint16_t sum = chip8->V[X] + chip8->V[Y];
chip8->V[X] = sum & 0xFF;
chip8->V[0xF] = (sum > 255) ? 1 : 0; /* Write VF last so the flag survives when X is 0xF */
}
break;
case 0x5: /* SUB VX, VY with NOT borrow */
{
uint8_t vx = chip8->V[X];
uint8_t vy = chip8->V[Y];
chip8->V[X] = vx - vy;
chip8->V[0xF] = (vx >= vy) ? 1 : 0; /* NOT borrow, from the original operands */
}
break;
case 0x6: /* SHR VX (shift right) */
{
uint8_t lsb = chip8->V[X] & 0x1; /* Save LSB before shifting */
chip8->V[X] >>= 1;
chip8->V[0xF] = lsb;
}
break;
case 0x7: /* SUBN VX, VY (VX = VY - VX) */
{
uint8_t vx = chip8->V[X];
uint8_t vy = chip8->V[Y];
chip8->V[X] = vy - vx;
chip8->V[0xF] = (vy >= vx) ? 1 : 0; /* NOT borrow, from the original operands */
}
break;
case 0xE: /* SHL VX (shift left) */
{
uint8_t msb = (chip8->V[X] & 0x80) >> 7; /* Save MSB before shifting */
chip8->V[X] <<= 1;
chip8->V[0xF] = msb;
}
break;
}
break;
Hint 5: The DXYN Draw Instruction
This is the most complex instruction - sprites are XOR’d onto display:
case 0xD000: /* DRW VX, VY, N - Draw sprite */
{
uint8_t x = chip8->V[X] % 64; /* Wrap around */
uint8_t y = chip8->V[Y] % 32;
chip8->V[0xF] = 0; /* Collision flag */
for (int row = 0; row < N; row++) {
uint8_t sprite_byte = chip8->memory[(chip8->I + row) & 0x0FFF]; /* Mask keeps the read inside 4KB */
for (int col = 0; col < 8; col++) {
/* Get sprite pixel (from MSB to LSB) */
uint8_t sprite_pixel = (sprite_byte >> (7 - col)) & 1;
/* Calculate screen position with wrapping */
int screen_x = (x + col) % 64;
int screen_y = (y + row) % 32;
int screen_idx = screen_y * 64 + screen_x;
/* XOR pixel onto display */
if (sprite_pixel) {
if (chip8->display[screen_idx]) {
chip8->V[0xF] = 1; /* Collision! */
}
chip8->display[screen_idx] ^= 1;
}
}
}
chip8->draw_flag = true;
}
break;
Hint 6: SDL2 Platform Layer
/* platform.c */
#include <SDL2/SDL.h>
#include "platform.h"
#include "chip8.h"
static SDL_Window* window = NULL;
static SDL_Renderer* renderer = NULL;
static SDL_Texture* texture = NULL;
static uint32_t pixels[64 * 32];
/* Key mapping: CHIP-8 keypad to keyboard */
static const SDL_Scancode keymap[16] = {
SDL_SCANCODE_X, /* 0 */
SDL_SCANCODE_1, /* 1 */
SDL_SCANCODE_2, /* 2 */
SDL_SCANCODE_3, /* 3 */
SDL_SCANCODE_Q, /* 4 */
SDL_SCANCODE_W, /* 5 */
SDL_SCANCODE_E, /* 6 */
SDL_SCANCODE_A, /* 7 */
SDL_SCANCODE_S, /* 8 */
SDL_SCANCODE_D, /* 9 */
SDL_SCANCODE_Z, /* A */
SDL_SCANCODE_C, /* B */
SDL_SCANCODE_4, /* C */
SDL_SCANCODE_R, /* D */
SDL_SCANCODE_F, /* E */
SDL_SCANCODE_V /* F */
};
bool platform_init(int scale) {
if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO) < 0) {
return false;
}
/* Each SDL_Create* call returns NULL on failure, so check before using the result */
window = SDL_CreateWindow("CHIP-8 Emulator",
SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED,
64 * scale, 32 * scale, SDL_WINDOW_SHOWN);
if (!window) return false;
renderer = SDL_CreateRenderer(window, -1,
SDL_RENDERER_ACCELERATED);
if (!renderer) return false;
texture = SDL_CreateTexture(renderer,
SDL_PIXELFORMAT_RGBA8888, SDL_TEXTUREACCESS_STREAMING,
64, 32);
return texture != NULL;
}
void platform_render(Chip8* chip8) {
/* Convert 1-bit display to RGBA */
for (int i = 0; i < 64 * 32; i++) {
pixels[i] = chip8->display[i] ? 0xFFFFFFFF : 0x000000FF;
}
SDL_UpdateTexture(texture, NULL, pixels, 64 * sizeof(uint32_t));
SDL_RenderClear(renderer);
SDL_RenderCopy(renderer, texture, NULL, NULL);
SDL_RenderPresent(renderer);
}
bool platform_handle_input(Chip8* chip8) {
SDL_Event event;
while (SDL_PollEvent(&event)) {
if (event.type == SDL_QUIT) return false;
}
const uint8_t* state = SDL_GetKeyboardState(NULL);
for (int i = 0; i < 16; i++) {
chip8->keypad[i] = state[keymap[i]];
}
return true;
}
5.8 The Interview Questions They’ll Ask
Basic Understanding
- “What is the fetch-decode-execute cycle?”
- Good Answer: The fundamental CPU operation: (1) Fetch instruction bytes from memory at PC, (2) Decode the opcode to determine the operation, (3) Execute the operation and update state, (4) Update PC and repeat. Every CPU and emulator implements this.
- Red Flag: “It’s how programs run” (too vague)
- “Why is CHIP-8 a good first emulator project?”
- Good Answer: Small instruction set (35 opcodes), simple memory model (4KB), fixed-width instructions (16-bit), well-documented specification. It teaches all fundamental emulation concepts without overwhelming complexity.
- Follow-up: “What would be harder about a NES emulator?” (Multiple hardware chips, complex PPU timing, memory mappers)
- “How do you synchronize emulated time with real time?”
- Good Answer: CHIP-8 runs timers at 60Hz. Track elapsed real time, execute a fixed batch of instructions per frame (about 10-20, which works out to roughly 600-1200 per second), and decrement timers once per frame. Use SDL_Delay or similar to throttle if running too fast.
- Red Flag: “Just run as fast as possible” (games would be unplayable)
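The timing answer above can be sketched as one function per 60Hz frame plus a helper that computes how long to sleep. This is a minimal sketch, not the guide's implementation: `TimingState`, `step_instruction`, `run_frame`, and `frame_sleep_ms` are hypothetical names, and `step_instruction` stands in for the real `chip8_cycle()`.

```c
#include <stdint.h>

/* Hypothetical minimal state: only what the timing logic touches. */
typedef struct {
    uint8_t  delay_timer;
    uint8_t  sound_timer;
    uint32_t executed;   /* instructions executed so far (illustration only) */
} TimingState;

/* Stand-in for chip8_cycle(): a real emulator fetches/decodes/executes here. */
static void step_instruction(TimingState *s) {
    s->executed++;
}

/* Run one 60 Hz frame: a fixed batch of instructions, then ONE timer tick. */
static void run_frame(TimingState *s, int instructions_per_frame) {
    for (int i = 0; i < instructions_per_frame; i++) {
        step_instruction(s);
    }
    if (s->delay_timer > 0) s->delay_timer--;
    if (s->sound_timer > 0) s->sound_timer--;
}

/* Milliseconds left to sleep so the frame lasts ~16 ms; 0 if already late. */
static uint32_t frame_sleep_ms(uint32_t frame_start_ms, uint32_t now_ms) {
    const uint32_t frame_ms = 1000 / 60;   /* 16 with integer division */
    uint32_t elapsed = now_ms - frame_start_ms;
    return elapsed < frame_ms ? frame_ms - elapsed : 0;
}
```

In an SDL2 main loop, one way to use this is to record `SDL_GetTicks()` at the top of the frame, call `run_frame()`, render, and then pass `frame_sleep_ms(start, SDL_GetTicks())` to `SDL_Delay()`. Sleeping for the remainder, rather than a fixed 16 ms, keeps the frame rate steady when rendering itself takes a few milliseconds.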
Technical Details
- “Explain how the DXYN (draw sprite) instruction works”
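- Good Answer: Reads N bytes from memory[I]. Each byte represents a row of 8 pixels. XORs each pixel onto the display at (VX, VY). Sets VF=1 if any pixel is erased (collision detection), otherwise VF=0. The starting coordinates wrap (VX mod 64, VY mod 32); whether pixels that run past the edge are clipped or wrapped is an interpreter quirk.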
- Key Insight: XOR means drawing the same sprite twice erases it (useful for animation)
- “What’s the difference between 8XY5 (SUB) and 8XY7 (SUBN)?”
- Good Answer: 8XY5: VX = VX - VY, VF = NOT borrow (1 if VX >= VY). 8XY7: VX = VY - VX, VF = NOT borrow (1 if VY >= VX). The operand order is swapped.
- Red Flag: “One subtracts and one adds” (incorrect)
- “How does the FX0A instruction (wait for keypress) work?”
- Good Answer: Blocks execution until any key is pressed. When pressed, stores key number in VX and continues. Most implementations decrement PC by 2 to re-execute the instruction until a key is pressed.
- Alternative: Set a “waiting” flag and check in the main loop
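The DXYN answer above can be sketched as a standalone function. This is an illustrative sketch under assumptions: the display is stored as one byte per pixel (0 or 1), and `draw_sprite`, `SCREEN_W`, `SCREEN_H`, and the `wrap` parameter are hypothetical names rather than part of the guide's code.

```c
#include <stdint.h>
#include <stdbool.h>

#define SCREEN_W 64
#define SCREEN_H 32

/* XOR an n-row sprite onto a 64x32 display (one byte per pixel, 0 or 1).
 * Returns the value for VF: 1 if any lit pixel was turned off, else 0.
 * `wrap` selects the edge behavior: false clips, true wraps pixels around. */
static uint8_t draw_sprite(uint8_t display[SCREEN_W * SCREEN_H],
                           const uint8_t *sprite, int n,
                           uint8_t vx, uint8_t vy, bool wrap) {
    uint8_t collision = 0;
    int x0 = vx % SCREEN_W;   /* the starting position always wraps */
    int y0 = vy % SCREEN_H;

    for (int row = 0; row < n; row++) {
        int y = y0 + row;
        if (y >= SCREEN_H) {
            if (!wrap) break;          /* clip: remaining rows are off-screen */
            y %= SCREEN_H;
        }
        for (int col = 0; col < 8; col++) {
            int x = x0 + col;
            if (x >= SCREEN_W) {
                if (!wrap) break;      /* clip: remaining columns are off-screen */
                x %= SCREEN_W;
            }
            /* Bit 7 is the leftmost pixel of the row. */
            uint8_t bit = (sprite[row] >> (7 - col)) & 1;
            if (!bit) continue;
            uint8_t *px = &display[y * SCREEN_W + x];
            if (*px) collision = 1;    /* a lit pixel is about to be erased */
            *px ^= 1;
        }
    }
    return collision;
}
```

A caller would pass `&memory[I]` as `sprite` and store the return value in VF. Drawing the same sprite twice at the same position returns 0 the first time and 1 the second, and leaves the display as it started.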
Problem-Solving
- “Your emulator runs Pong but the paddles move too fast. How do you debug?”
- Good Answer:
- Check if timer decrementation is at 60Hz (not per instruction)
- Verify instructions-per-frame count (should be ~10-20 per frame)
- Add delay/sleep to main loop if not throttling
- Compare timing against known-good emulators
- “The test ROM shows arithmetic operations failing. Debugging approach?”
- Good Answer:
- Add debug output showing opcode, registers before/after
- Check VF flag handling order (compute the flag from the original operands, then write VF last so it survives when X=F)
- Verify shift operations use correct source register
- Test each opcode in isolation
- “How would you extend this to support SUPER-CHIP?”
- Good Answer: SUPER-CHIP adds 128x64 resolution, scrolling, larger sprites, and ~10 new instructions. Would need: double display buffer, new opcodes (scroll, high-res mode), larger font sprites.
5.9 Books That Will Help
| Topic | Book | Specific Chapter/Section | Why It Helps |
|---|---|---|---|
| Fetch-Decode-Execute | “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron | Chapter 4: Processor Architecture | Explains Y86-64 ISA in detail, same concepts apply |
| Instruction Encoding | “Write Great Code, Volume 2” by Randall Hyde | Chapter 3: Instruction Encoding | How CPUs encode operations in binary |
| State Machine Design | “Code: The Hidden Language” by Charles Petzold | Chapter 17: Automation | Builds up to CPU from first principles |
| Emulator Design | “Writing Game Emulators” by ZephRay | Entire book | Practical emulator construction |
| C Programming | “C Programming: A Modern Approach” by K.N. King | Chapters 16-20 | Bitwise operations, structures |
| SDL2 Programming | “SDL Game Development” by Shaun Mitchell | Chapters 1-4 | Graphics, input, audio basics |
Online Resources:
- Cowgod’s CHIP-8 Technical Reference - The definitive CHIP-8 specification
- CHIP-8 Test Suite - Verify opcode correctness
- r/EmuDev - Emulator development community
5.10 Implementation Phases
Phase 1: Foundation (Day 1-2)
- Set up project structure with Makefile
- Implement Chip8 structure and initialization
- Implement ROM loading
- Add basic SDL2 window creation
- Milestone: Window opens when running emulator
Phase 2: Core CPU (Day 2-4)
- Implement fetch-decode-execute loop
- Implement simple opcodes: 00E0, 1NNN, 6XNN, 7XNN, ANNN
- Add debug printing to verify execution
- Milestone: Simple program (set register, jump) runs
Phase 3: Display (Day 4-5)
- Implement DXYN draw instruction
- Implement display rendering to SDL
- Load and display font sprites
- Milestone: Can draw sprites on screen
Phase 4: Full Instruction Set (Day 5-6)
- Implement all remaining opcodes
- Pay special attention to 8XY_ arithmetic ops
- Implement FX__ instructions
- Test with opcode test ROM
- Milestone: Test ROM passes all opcodes
Phase 5: Input and Timing (Day 6-7)
- Implement keypad input via SDL
- Implement 60Hz timer decrementation
- Add speed throttling
- Test with Pong
- Milestone: Pong is playable at correct speed
Phase 6: Polish (Day 7+)
- Add sound (beep when ST > 0)
- Add debug mode
- Test multiple ROMs
- Clean up code, add comments
- Milestone: Multiple games work correctly
5.11 Key Implementation Decisions
- Instructions per frame: 10-20 is typical. Too few and games crawl; too many and they run too fast to control. Make it configurable.
- Timer update point: Update timers once per frame (every ~16.67ms), NOT per instruction.
- Shift instruction quirk: Original CHIP-8: VX = VY >> 1. Modern: VX = VX >> 1. Choose one and document it.
- Load/Store quirk: Original: I incremented after FX55/FX65. Modern: I unchanged. Most ROMs expect modern behavior.
- Wrap vs clip for drawing: The starting coordinate always wraps (VX mod 64, VY mod 32). The original COSMAC VIP interpreter then clips sprite pixels that run past the screen edge; some interpreters wrap them to the opposite side instead. Choose one, document it, and ideally make it a toggle.
- VF timing: For arithmetic ops, calculate the result and the flag into temporaries, store the result in VX, THEN write VF. Order matters when X=F or Y=F.
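The VF-ordering and shift-quirk decisions can be sketched as small helpers that take the register file directly. This is a hedged sketch: `op_add`, `op_sub`, `op_subn`, `op_shr`, and the `shift_uses_vy` flag are hypothetical names. Each helper derives the flag from the original operands and writes VF last, so the flag is what remains in VF even when X is 0xF.

```c
#include <stdint.h>
#include <stdbool.h>

/* 8XY4: VX = VX + VY, VF = carry. */
static void op_add(uint8_t V[16], int x, int y) {
    uint16_t sum = (uint16_t)(V[x] + V[y]);   /* computed before any write */
    V[x] = (uint8_t)sum;
    V[0xF] = sum > 0xFF;                      /* written last */
}

/* 8XY5: VX = VX - VY, VF = NOT borrow. */
static void op_sub(uint8_t V[16], int x, int y) {
    uint8_t flag = V[x] >= V[y];              /* from the original operands */
    V[x] = (uint8_t)(V[x] - V[y]);
    V[0xF] = flag;
}

/* 8XY7: VX = VY - VX, VF = NOT borrow. */
static void op_subn(uint8_t V[16], int x, int y) {
    uint8_t flag = V[y] >= V[x];
    V[x] = (uint8_t)(V[y] - V[x]);
    V[0xF] = flag;
}

/* 8XY6: shift right by one, VF = bit shifted out.
 * shift_uses_vy = true follows the original behavior (source is VY),
 * false follows the modern behavior (source is VX). */
static void op_shr(uint8_t V[16], int x, int y, bool shift_uses_vy) {
    uint8_t src = shift_uses_vy ? V[y] : V[x];
    V[x] = src >> 1;
    V[0xF] = src & 1;
}
```

For example, with VF = 200 and V1 = 100, `op_add(V, 0xF, 1)` leaves VF = 1 (the carry) rather than 44 (the truncated sum).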
6. Testing Strategy
6.1 Unit Testing
Since emulators are hard to unit test traditionally, use incremental verification:
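/* test_opcodes.c - Manual opcode testing */
#include <assert.h>
#include <stdio.h>
#include "chip8.h" /* assumed name of the header declaring Chip8, chip8_init, chip8_cycle */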
void test_ld_vx_nn(void) {
Chip8 chip8;
chip8_init(&chip8);
/* Setup: Place 6A42 at 0x200 (LD VA, 0x42) */
chip8.memory[0x200] = 0x6A;
chip8.memory[0x201] = 0x42;
/* Execute one cycle */
chip8_cycle(&chip8);
/* Verify */
assert(chip8.V[0xA] == 0x42);
assert(chip8.PC == 0x202);
printf("test_ld_vx_nn: PASS\n");
}
void test_add_vx_vy_with_carry(void) {
Chip8 chip8;
chip8_init(&chip8);
/* Setup: V0 = 250, V1 = 10, execute 8014 (ADD V0, V1) */
chip8.V[0] = 250;
chip8.V[1] = 10;
chip8.memory[0x200] = 0x80;
chip8.memory[0x201] = 0x14;
chip8_cycle(&chip8);
/* 250 + 10 = 260, wraps to 4, carry = 1 */
assert(chip8.V[0] == 4);
assert(chip8.V[0xF] == 1);
printf("test_add_vx_vy_with_carry: PASS\n");
}
6.2 Integration Testing
Use the CHIP-8 test suite ROM:
# Download test ROM
wget https://github.com/Timendus/chip8-test-suite/releases/download/v4.0/5-quirks.ch8
# Run and verify visually
./chip8 5-quirks.ch8
# Each test shows PASS or FAIL on screen
# Test with known games
./chip8 roms/PONG.ch8 # Should be playable
./chip8 roms/TETRIS.ch8 # Should be playable
./chip8 roms/INVADERS.ch8 # Should be playable
6.3 Debugging Techniques
Debug Mode Implementation:
/* Add to chip8_cycle() */
#ifdef DEBUG
printf("[%04X] %04X ", chip8->PC - 2, opcode);
/* Print disassembly */
switch (opcode & 0xF000) {
case 0x1000:
printf("JP %03X", NNN);
break;
case 0x6000:
printf("LD V%X, %02X", X, NN);
break;
/* ... */
}
printf(" V0=%02X V1=%02X ... VF=%02X I=%04X\n",
chip8->V[0], chip8->V[1], chip8->V[0xF], chip8->I);
#endif
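The disassembly switch above relies on the opcode fields X, Y, N, NN, and NNN. A self-contained version that extracts those fields and formats a few opcodes might look like the following sketch. `disassemble` is a hypothetical helper, only a handful of opcodes are covered, and the `DW` fallback for everything else is an arbitrary choice.

```c
#include <stdint.h>
#include <stdio.h>

/* Format one opcode as text into buf. Opcodes not listed here fall back
 * to a raw "DW xxxx" (data word) line. */
static void disassemble(uint16_t opcode, char *buf, size_t size) {
    unsigned x   = (opcode >> 8) & 0xF;   /* second nibble: register X */
    unsigned y   = (opcode >> 4) & 0xF;   /* third nibble: register Y */
    unsigned n   = opcode & 0xF;          /* low nibble */
    unsigned nn  = opcode & 0xFF;         /* low byte */
    unsigned nnn = opcode & 0xFFF;        /* low 12 bits: address */

    switch (opcode & 0xF000) {
    case 0x0000:
        if (opcode == 0x00E0)      snprintf(buf, size, "CLS");
        else if (opcode == 0x00EE) snprintf(buf, size, "RET");
        else snprintf(buf, size, "DW %04X", (unsigned)opcode);
        break;
    case 0x1000: snprintf(buf, size, "JP %03X", nnn); break;
    case 0x2000: snprintf(buf, size, "CALL %03X", nnn); break;
    case 0x6000: snprintf(buf, size, "LD V%X, %02X", x, nn); break;
    case 0x7000: snprintf(buf, size, "ADD V%X, %02X", x, nn); break;
    case 0xA000: snprintf(buf, size, "LD I, %03X", nnn); break;
    case 0xD000: snprintf(buf, size, "DRW V%X, V%X, %X", x, y, n); break;
    default:     snprintf(buf, size, "DW %04X", (unsigned)opcode); break;
    }
}
```

Writing into a buffer instead of calling `printf` directly lets the same function serve the debug trace, a stepping debugger, and a standalone ROM disassembler.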
Memory Dump:
/* Print len bytes starting at start, 16 per row, clamped to the 4KB address space */
void chip8_dump_memory(const Chip8* chip8, uint16_t start, uint16_t len) {
for (uint32_t i = 0; i < len; i += 16) {
printf("%04X: ", (unsigned)(start + i));
for (uint32_t j = 0; j < 16 && (i + j) < len && (start + i + j) < 4096; j++) {
printf("%02X ", chip8->memory[start + i + j]);
}
printf("\n");
}
}
7. Common Pitfalls & Debugging
Problem 1: Nothing appears on screen
- Root Cause: Draw flag not being set, or renderer not checking it
- Fix: Ensure `chip8->draw_flag = true;` is set after DXYN, and verify `platform_render()` is called when the flag is set
- Quick Test: Force-draw a single pixel at (0,0) in init
Problem 2: Game runs way too fast
- Root Cause: No timing throttle in main loop
- Fix: Add `SDL_Delay(16)` or similar for a ~60 FPS cap
- Quick Test: Print a timestamp each frame, verify they are ~16ms apart
Problem 3: Arithmetic operations give wrong results
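- Root Cause: VF written at the wrong point, so the flag is computed from an already-modified operand or is overwritten by the result when X=F
- Fix: Calculate the result and the flag into temporaries, assign the result to VX, THEN write VF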
- Quick Test: Run opcode test ROM, check arithmetic section
Problem 4: Sprites look garbled
- Root Cause: Bit order wrong when extracting sprite pixels
- Fix: Sprite pixels go MSB to LSB: `(byte >> (7 - col)) & 1`
- Quick Test: Draw font character ‘0’ (at 0x000), verify it looks correct
Problem 5: Keys don’t respond
- Root Cause: Key mapping wrong, or not calling SDL_PollEvent
- Fix: Verify keymap array matches expected layout, ensure event polling in main loop
- Quick Test: Print keypad state each frame, verify keys register
Problem 6: Stack overflow/underflow
- Root Cause: More CALL than RET, or RET without CALL
- Fix: Add bounds checking: `if (chip8->SP >= 16) { error(); }` before pushing in CALL, and `if (chip8->SP == 0) { error(); }` before popping in RET
- Quick Test: Add stack depth printing in debug mode
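The bounds checks for Problem 6 can be sketched as a pair of helpers that refuse to push or pop out of range. This is a sketch under assumptions: `CallState`, `op_call`, and `op_ret` are hypothetical names, SP is treated as the index of the next free slot, and PC is assumed to already point past the CALL instruction when `op_call` runs.

```c
#include <stdint.h>
#include <stdbool.h>

#define STACK_DEPTH 16

/* Hypothetical subset of the emulator state used by CALL and RET. */
typedef struct {
    uint16_t stack[STACK_DEPTH];
    uint8_t  SP;   /* index of the next free slot */
    uint16_t PC;   /* already advanced past the current instruction */
} CallState;

/* 2NNN: push the return address and jump. Returns false on overflow. */
static bool op_call(CallState *s, uint16_t target) {
    if (s->SP >= STACK_DEPTH) return false;   /* more CALLs than the stack holds */
    s->stack[s->SP++] = s->PC;
    s->PC = target;
    return true;
}

/* 00EE: pop the return address. Returns false on underflow. */
static bool op_ret(CallState *s) {
    if (s->SP == 0) return false;             /* RET without a matching CALL */
    s->PC = s->stack[--s->SP];
    return true;
}
```

Returning a status instead of calling an error handler directly lets the main loop decide whether to halt, log the PC, or drop into a debugger.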
Problem 7: Timers decrement too fast
- Root Cause: Decrementing per instruction instead of per frame (60Hz)
- Fix: Only call `chip8_update_timers()` once per frame, not per instruction
- Quick Test: Set delay_timer = 60, verify it takes ~1 second to reach 0
Problem 8: Random number instruction broken
- Root Cause: Forgetting to seed RNG, or not ANDing with NN
- Fix: `srand(time(NULL))` at init, `V[X] = rand() & NN`
- Quick Test: Print random values, verify they stay within expected range
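The two halves of the Problem 8 fix can be written as a short sketch. `rng_seed` and `op_rnd` are hypothetical helper names; the standard `rand()` is used only because the fix above uses it, and any other generator would work equally well.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Call once at startup so each run produces a different sequence. */
static void rng_seed(void) {
    srand((unsigned)time(NULL));
}

/* CXNN: random byte masked with NN. Bits that are clear in NN are always
 * clear in the result, so NN = 0x0F yields values in the range 0..15. */
static uint8_t op_rnd(uint8_t nn) {
    return (uint8_t)(rand() & nn);
}
```

The opcode handler would then reduce to `V[X] = op_rnd(NN);`.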
8. Extensions & Challenges
Extension 1: SUPER-CHIP Support
- Add 128x64 high-resolution mode
- Implement scroll instructions (00CN, 00FB, 00FC)
- Add large 16x16 sprites
- Challenge: Make games auto-detect mode
Extension 2: Debugger Interface
- Add breakpoints at specific PC values
- Single-step execution (SPACE to step)
- Register inspection panel
- Memory viewer/editor
- Disassembly view
Extension 3: Save States
- Serialize entire Chip8 structure to file
- Load/restore state at any time
- Challenge: Implement rewind (circular buffer of states)
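The rewind challenge can be sketched as a fixed-size ring buffer of snapshots. This is an illustrative sketch: `Snapshot` is a hypothetical stand-in for the full Chip8 structure, `REWIND_SLOTS` is deliberately tiny, and plain struct copying is only safe under the assumption that the state contains no pointers.

```c
#include <stdint.h>
#include <stdbool.h>

#define REWIND_SLOTS 4   /* small for illustration; a real buffer would be larger */

/* Hypothetical stand-in for the full Chip8 struct (no pointers inside). */
typedef struct {
    uint8_t  V[16];
    uint16_t PC;
} Snapshot;

typedef struct {
    Snapshot slots[REWIND_SLOTS];
    int head;    /* next slot to write */
    int count;   /* number of valid snapshots */
} RewindBuffer;

/* Save a snapshot, overwriting the oldest one once the buffer is full. */
static void rewind_push(RewindBuffer *rb, const Snapshot *s) {
    rb->slots[rb->head] = *s;
    rb->head = (rb->head + 1) % REWIND_SLOTS;
    if (rb->count < REWIND_SLOTS) rb->count++;
}

/* Restore the most recent snapshot. Returns false when nothing is left. */
static bool rewind_pop(RewindBuffer *rb, Snapshot *out) {
    if (rb->count == 0) return false;
    rb->head = (rb->head + REWIND_SLOTS - 1) % REWIND_SLOTS;
    *out = rb->slots[rb->head];
    rb->count--;
    return true;
}
```

Pushing once per frame and popping while a rewind key is held gives a simple rewind; the slot count sets how far back it can go.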
Extension 4: Configurable Quirks
- Toggle shift instruction behavior (VY vs VX)
- Toggle load/store I increment behavior
- Toggle clip vs wrap for sprites
- Make it pass all quirks tests
Extension 5: Sound Waveform
- Instead of simple beep, generate proper square wave
- Make frequency configurable
- Add volume control
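A square-wave generator for Extension 5 can be sketched without any audio library. The function below only fills a buffer; `fill_square_wave` and its parameters are hypothetical, and signed 16-bit mono samples are an assumption about the audio device format.

```c
#include <stdint.h>
#include <stddef.h>

/* Fill `out` with signed 16-bit square-wave samples.
 * `phase` carries the position within the cycle across calls, so
 * consecutive buffers join without a click. */
static void fill_square_wave(int16_t *out, size_t count,
                             uint32_t sample_rate, uint32_t frequency,
                             int16_t amplitude, uint32_t *phase) {
    if (frequency == 0 || sample_rate == 0) {
        for (size_t i = 0; i < count; i++) out[i] = 0;   /* silence */
        return;
    }
    uint32_t period = sample_rate / frequency;   /* samples per cycle */
    if (period < 2) period = 2;                  /* need a high and a low half */
    *phase %= period;
    for (size_t i = 0; i < count; i++) {
        /* First half of the cycle is high, second half is low. */
        out[i] = (*phase < period / 2) ? amplitude : (int16_t)-amplitude;
        *phase = (*phase + 1) % period;
    }
}
```

With SDL2, one option is to call this from the audio callback while the sound timer is above zero and write zeros otherwise. `frequency` covers the configurable pitch, and `amplitude` doubles as the volume control.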
Extension 6: ROM Disassembler
- Output assembly listing for any ROM
- Identify subroutines and loops
- Generate commented source
9. Real-World Connections
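QEMU’s TCG (Tiny Code Generator): When QEMU emulates ARM on x86, TCG translates blocks of guest instructions into host code instead of interpreting them one at a time, but it still has to fetch and decode every guest instruction and keep guest CPU state consistent, just as your loop does. The complexity scales (thousands of opcodes, multiple hardware devices), but the core idea is the same.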
Java Virtual Machine: The JVM is a stack-based interpreter that does fetch-decode-execute on bytecode. Your CHIP-8 experience directly applies to understanding JIT compilation and runtime systems.
Game Console Emulators: NES, SNES, Game Boy emulators all start here. The patterns you learned (CPU emulation, PPU/display, input mapping) scale to more complex systems.
Apple’s Rosetta 2: When M1 Macs run x86 code, binary translation (an advanced form of what you did) converts instructions. Understanding CHIP-8 emulation helps understand this technology.
Cloud Computing: VMs in AWS/GCP/Azure use hardware virtualization (VT-x/AMD-V), but the software management layers (QEMU, libvirt) use many concepts you’ve now internalized.
Security Research: Emulators are used to analyze malware safely. Understanding emulation internals helps with reverse engineering and security analysis.
10. Resources
Primary References
- Cowgod’s CHIP-8 Technical Reference - The specification
- CHIP-8 Wikipedia - Historical context
- Tobias V. Langhoff’s Guide - Excellent tutorial
Test ROMs
- Timendus CHIP-8 Test Suite - Comprehensive tests
- corax89 Test ROM - Opcode verification
- Chip8 Games Archive - Classic games
Reference Implementations
- Rust CHIP-8 - Clean Rust implementation
- C CHIP-8 - Minimal C implementation
- Go CHIP-8 - Well-documented Go version
Tools
- SDL2 Documentation - Graphics library
- Online CHIP-8 Emulator - For comparison
11. Self-Assessment Checklist
Before moving to Project 2, verify:
Understanding:
- Can you explain the fetch-decode-execute cycle without notes?
- Can you decode a random CHIP-8 opcode by hand?
- Can you explain why XOR is used for sprite drawing?
- Can you explain the difference between delay timer and instruction timing?
- Can you explain why the VF flag must be computed from the original operands and written last?
Implementation:
- All 35 opcodes implemented and tested?
- Display renders correctly with font sprites visible?
- Keyboard input maps to all 16 keys?
- Timers decrement at exactly 60Hz?
- Sound plays when sound timer > 0?
Testing:
- Opcode test ROM passes all tests?
- Pong is playable at reasonable speed?
- At least 3 different games work?
- Timing is correct (games not too fast/slow)?
Growth:
- Can you modify the emulator without looking at hints?
- Can you explain how this relates to real hypervisors?
- Do you understand what you’d need to emulate a NES?
12. Submission / Completion Criteria
Your implementation is complete when:
Minimum Viable Completion:
- Loads and executes CHIP-8 ROMs
- All 35 standard opcodes implemented
- Display renders in SDL window
- Basic keyboard input works
- At least one game (Pong) is playable
Full Completion:
- All opcodes pass test ROM
- Correct 60Hz timer operation
- Sound plays when sound timer > 0
- Multiple games work correctly
- Code is clean and well-commented
- Debug mode available
Excellence (Going Above & Beyond):
- SUPER-CHIP support
- Built-in debugger with breakpoints
- Save state functionality
- Configurable quirks mode
- Disassembler output
Congratulations! By completing this project, you’ve built your first emulator and internalized the fetch-decode-execute cycle that powers every CPU and virtual machine. You understand why timing matters, how to decode binary instruction formats, and how to map virtual hardware to real devices.
This is the foundation for everything in the Hypervisor & Virtualization Deep Dive. Project 2 (RISC-V Emulator) will build on these concepts with a real-world ISA, and by Project 11 you’ll be building an actual VT-x hypervisor.
You’ve taken the first step from “user of VMs” to “builder of VMs.”
This guide was expanded from HYPERVISOR_VIRTUALIZATION_DEEP_DIVE_PROJECTS.md. For the complete learning path, see the README.