CPU, ISA & Computer Architecture - Learning Projects
Goal: Deeply understand how CPUs work, how instruction sets control execution, how memory is accessed, and how the processor orchestrates the entire system.
Core Concepts You’ll Master
| Concept | What It Means | Projects That Teach It |
|---|---|---|
| Binary & Hex | How data is represented in hardware | #1, #2 |
| Fetch-Decode-Execute | The CPU’s heartbeat cycle | #3, #4, #5, #6 |
| Registers | Ultra-fast CPU storage slots | #4, #5, #6, #7 |
| Memory Addressing | How CPU locates data | #5, #6, #8 |
| ALU Operations | Math and logic in silicon | #3, #4, #5 |
| Control Flow | Jumps, branches, conditionals | #4, #5, #6, #9 |
| The Stack | Function calls and local variables | #5, #6, #8, #10 |
| Interrupts & I/O | CPU talking to the world | #7, #11, #12 |
Project 1: Binary & Hex Visualization Tool
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Python, Rust, Go
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 1: Beginner (The Tinkerer)
- Knowledge Area: Number Systems / Data Representation
- Software or Tool: CLI Converter Tool
- Main Book: “Code: The Hidden Language of Computer Hardware and Software” by Charles Petzold
What you’ll build: A command-line tool that converts between decimal, binary, hexadecimal, and shows the actual bit patterns with visual highlighting of sign bits, byte boundaries, and two’s complement representation.
Why it teaches CPU fundamentals: CPUs don’t understand decimal—everything is binary. Before you can understand instructions, you must be fluent in reading hex dumps and binary patterns. This forces you to internalize how numbers are actually stored.
Core challenges you’ll face:
- Implementing two’s complement for negative numbers → maps to how CPUs represent signed integers
- Handling different bit widths (8, 16, 32, 64-bit) → maps to register sizes and data types
- Displaying byte order (little-endian vs big-endian) → maps to memory layout differences between architectures
- Parsing and validating input in multiple bases → maps to how assemblers parse numeric literals
Key Concepts:
- Binary Number System: “Code” Chapter 7-9 - Charles Petzold
- Two’s Complement: “Computer Systems: A Programmer’s Perspective” Chapter 2.2 - Bryant & O’Hallaron
- Endianness: “Computer Systems: A Programmer’s Perspective” Chapter 2.1.3 - Bryant & O’Hallaron
- Bit Manipulation in C: “The C Programming Language” Chapter 2.9 - Kernighan & Ritchie
Difficulty: Beginner. Time estimate: Weekend. Prerequisites: Basic C programming, understanding of decimal numbers.
Real world outcome:
$ ./bitview 255
Decimal: 255
Binary: 00000000 00000000 00000000 11111111
Hex: 0x000000FF
Signed: 255 (positive)
Bit width: 32-bit
$ ./bitview -1
Decimal: -1
Binary: 11111111 11111111 11111111 11111111
Hex: 0xFFFFFFFF
Signed: -1 (two's complement)
$ ./bitview 0xDEADBEEF
Decimal: 3735928559
Binary: 11011110 10101101 10111110 11101111
        ^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^
           DE       AD       BE       EF
Learning milestones:
- Decimal-to-binary conversion works → You understand positional number systems
- Two’s complement displays correctly for negative numbers → You understand how CPUs handle signed arithmetic
- Endianness display toggle works → You understand memory layout fundamentals
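The core conversions behind the tool can be sketched in a few functions. This is a minimal sketch, not a full CLI: the names `twos_complement_bits`, `format_bits`, and `little_endian_bytes` are hypothetical, and argument parsing is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduce a signed value to its two's complement bit pattern at the given
 * width (8, 16, 32, or 64 bits). Hypothetical helper name. */
uint64_t twos_complement_bits(int64_t value, int width)
{
    assert(width == 8 || width == 16 || width == 32 || width == 64);
    uint64_t bits = (uint64_t)value;          /* conversion is modulo 2^64 */
    if (width < 64)
        bits &= (UINT64_C(1) << width) - 1;   /* keep only the low bits */
    return bits;
}

/* Write the pattern as '0'/'1' characters, most significant bit first,
 * with a space at each byte boundary. `out` needs width + width/8 bytes. */
void format_bits(uint64_t bits, int width, char *out)
{
    size_t pos = 0;
    for (int i = width - 1; i >= 0; i--) {
        out[pos++] = ((bits >> i) & 1) ? '1' : '0';
        if (i % 8 == 0 && i != 0)
            out[pos++] = ' ';
    }
    out[pos] = '\0';
}

/* Split a 32-bit value into the byte order a little-endian machine
 * stores in memory (least significant byte at the lowest address). */
void little_endian_bytes(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}
```

Calling `format_bits(twos_complement_bits(-1, 32), 32, buf)` fills `buf` with four groups of eight ones, matching the `./bitview -1` output above.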
Project 2: Logic Gate Simulator
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Python, JavaScript (for visual version)
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 1: Beginner (The Tinkerer)
- Knowledge Area: Digital Logic / Boolean Algebra
- Software or Tool: Logic Simulator
- Main Book: “Code: The Hidden Language of Computer Hardware and Software” by Charles Petzold
What you’ll build: A simulator where you can wire together AND, OR, NOT, XOR, NAND gates to build circuits. You’ll then build a 1-bit adder, then chain them into an 8-bit adder—the core of an ALU.
Why it teaches CPU fundamentals: CPUs are just billions of logic gates. By building an adder from gates, you see that “addition” isn’t magic—it’s just carefully arranged AND/OR/XOR gates. This demystifies the ALU completely.
Core challenges you’ll face:
- Implementing gate propagation (output depends on inputs) → maps to combinational logic
- Building a half-adder, then full-adder → maps to how ALUs perform arithmetic
- Chaining adders with carry propagation → maps to why addition takes time (carry delay)
- Detecting overflow → maps to CPU status flags
Key Concepts:
- Boolean Algebra: “Code” Chapter 10-11 - Charles Petzold
- Logic Gates: “Digital Design and Computer Architecture” Chapter 1 - Harris & Harris
- Building an Adder: “Code” Chapter 12 - Charles Petzold
- Carry Propagation: “Computer Organization and Design” Chapter 3.2 - Patterson & Hennessy
Difficulty: Beginner. Time estimate: Weekend to 1 week. Prerequisites: Basic programming, understanding of AND/OR/NOT from Boolean logic.
Real world outcome:
$ ./gatesim
> CREATE half_adder
> CONNECT input_a -> xor.in1, and.in1
> CONNECT input_b -> xor.in2, and.in2
> CONNECT xor.out -> sum
> CONNECT and.out -> carry
> SET input_a = 1
> SET input_b = 1
> SIMULATE
Result: sum=0, carry=1 (1+1 = 10 in binary!)
> BUILD 8bit_adder FROM full_adder[8]
> ADD 0b00001111 0b00000001
Result: 0b00010000 (15 + 1 = 16), carry=0, overflow=0
Learning milestones:
- Individual gates work correctly → You understand boolean logic
- Half-adder produces correct sum and carry → You see arithmetic emerging from logic
- 8-bit adder handles all cases including overflow → You understand how CPUs really add numbers
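The adder chain described above can be sketched with gates modeled as plain functions. This is one possible shape, assuming a fixed ripple-carry structure rather than the interactive wiring language shown in the session; the type `add8_result` and the function names are hypothetical.

```c
#include <stdint.h>

/* Primitive gates on single bits (values 0 or 1). */
static unsigned gate_and(unsigned a, unsigned b) { return a & b; }
static unsigned gate_or (unsigned a, unsigned b) { return a | b; }
static unsigned gate_xor(unsigned a, unsigned b) { return a ^ b; }

/* Full adder: two half-adders plus an OR gate for the carry. */
static void full_adder(unsigned a, unsigned b, unsigned cin,
                       unsigned *sum, unsigned *cout)
{
    unsigned s1 = gate_xor(a, b);      /* first half-adder sum    */
    unsigned c1 = gate_and(a, b);      /* first half-adder carry  */
    unsigned c2 = gate_and(s1, cin);   /* second half-adder carry */
    *sum  = gate_xor(s1, cin);         /* second half-adder sum   */
    *cout = gate_or(c1, c2);
}

typedef struct {
    uint8_t  sum;
    unsigned carry;     /* carry out of bit 7: unsigned overflow */
    unsigned overflow;  /* carry into bit 7 XOR carry out: signed overflow */
} add8_result;

/* Eight full adders chained so each carry feeds the next bit. */
add8_result ripple_add8(uint8_t a, uint8_t b)
{
    add8_result r = {0, 0, 0};
    unsigned carry = 0, carry_into_msb = 0;
    for (int i = 0; i < 8; i++) {
        unsigned s;
        if (i == 7)
            carry_into_msb = carry;
        full_adder((a >> i) & 1u, (b >> i) & 1u, carry, &s, &carry);
        r.sum |= (uint8_t)(s << i);
    }
    r.carry = carry;
    r.overflow = gate_xor(carry_into_msb, carry);
    return r;
}
```

The loop makes the carry delay visible: bit 7 cannot be computed until the carries from bits 0 through 6 have settled.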
Project 3: Stack Machine Virtual Machine
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go, Zig
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 2: Intermediate (The Developer)
- Knowledge Area: Virtual Machines / Instruction Execution
- Software or Tool: Stack-based VM
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: A simple stack-based virtual machine (like the JVM or Python bytecode interpreter) that executes instructions like PUSH, POP, ADD, SUB, MUL, DIV, and can run simple programs.
Why it teaches CPU fundamentals: This is your first “CPU”! Stack machines are simpler than register machines but teach the same core concept: fetch an instruction, decode it, execute it, repeat. You’ll implement the fetch-decode-execute cycle yourself.
Core challenges you’ll face:
- Implementing the instruction fetch loop → maps to CPU fetch cycle
- Decoding opcodes into actions → maps to instruction decoding
- Managing the stack pointer → maps to SP register behavior
- Handling stack underflow/overflow → maps to hardware exceptions
Key Concepts:
- Fetch-Decode-Execute Cycle: “Computer Organization and Design” Chapter 4.1 - Patterson & Hennessy
- Stack-Based Computation: “Computer Systems: A Programmer’s Perspective” Chapter 3.7 - Bryant & O’Hallaron
- Bytecode Interpretation: “Crafting Interpreters” Chapter 14 - Robert Nystrom (free online)
- Opcode Design: “Language Implementation Patterns” Chapter 10 - Terence Parr
Difficulty: Intermediate. Time estimate: 1 week. Prerequisites: C programming, understanding of stacks as data structures.
Real world outcome:
$ cat factorial.svm
; Calculate 5!
    PUSH 1      ; result = 1
    PUSH 5      ; n = 5 (n stays on top of the stack)
LOOP:
    DUP         ; duplicate n
    JZ END      ; pop the copy; if n == 0, jump to end
    SWAP        ; swap n and result
    OVER        ; copy n to top
    MUL         ; result = result * n
    SWAP        ; swap back so n is on top
    PUSH 1
    SUB         ; n = n - 1
    JMP LOOP
END:
    POP         ; remove n (which is 0)
    PRINT       ; print result
$ ./stackvm factorial.svm
[FETCH] PC=0x00: PUSH 1
[EXEC] Stack: [1]
[FETCH] PC=0x02: PUSH 5
[EXEC] Stack: [1, 5]
...
[EXEC] Stack: [120]
[PRINT] 120
Result: 120 (5! = 120)
Learning milestones:
- PUSH/POP/ADD work correctly → You understand instruction execution
- Loops with JMP/JZ work → You understand control flow at the machine level
- Factorial program runs correctly → You’ve built a working CPU (in software)!
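The fetch-decode-execute loop at the heart of this project can be sketched as a single `switch` inside a `while`. The opcode numbering, the one-word-per-cell encoding, and the name `vm_run` are assumptions for illustration; `OP_HALT` stands in for PRINT by returning the top of the stack.

```c
#include <stdint.h>

enum { OP_PUSH, OP_POP, OP_DUP, OP_SWAP, OP_OVER,
       OP_SUB, OP_MUL, OP_JMP, OP_JZ, OP_HALT };

#define STACK_MAX 64

/* 5! with the stack laid out as [result, n], n on top. */
static const int32_t factorial_program[] = {
    OP_PUSH, 1, OP_PUSH, 5,
    /* LOOP (index 4) */ OP_DUP, OP_JZ, 16,
    OP_SWAP, OP_OVER, OP_MUL, OP_SWAP,
    OP_PUSH, 1, OP_SUB, OP_JMP, 4,
    /* END (index 16) */ OP_POP, OP_HALT
};

/* Runs until OP_HALT. Returns 0 and stores the top of stack in *result,
 * or -1 on stack underflow/overflow, a bad opcode, or a missing operand. */
int vm_run(const int32_t *code, int code_len, int32_t *result)
{
    int32_t stack[STACK_MAX];
    int sp = 0;   /* number of items currently on the stack */
    int pc = 0;

    while (pc >= 0 && pc < code_len) {
        int32_t op = code[pc++];                      /* fetch */
        switch (op) {                                 /* decode and execute */
        case OP_PUSH:
            if (sp >= STACK_MAX || pc >= code_len) return -1;
            stack[sp++] = code[pc++];
            break;
        case OP_POP:
            if (sp < 1) return -1;
            sp--;
            break;
        case OP_DUP:
            if (sp < 1 || sp >= STACK_MAX) return -1;
            stack[sp] = stack[sp - 1]; sp++;
            break;
        case OP_SWAP: {
            if (sp < 2) return -1;
            int32_t t = stack[sp - 1];
            stack[sp - 1] = stack[sp - 2];
            stack[sp - 2] = t;
            break;
        }
        case OP_OVER:
            if (sp < 2 || sp >= STACK_MAX) return -1;
            stack[sp] = stack[sp - 2]; sp++;
            break;
        case OP_SUB:
            if (sp < 2) return -1;
            stack[sp - 2] -= stack[sp - 1]; sp--;
            break;
        case OP_MUL:
            if (sp < 2) return -1;
            stack[sp - 2] *= stack[sp - 1]; sp--;
            break;
        case OP_JMP:
            if (pc >= code_len) return -1;
            pc = code[pc];
            break;
        case OP_JZ:                                   /* pops the tested value */
            if (sp < 1 || pc >= code_len) return -1;
            if (stack[--sp] == 0) pc = code[pc]; else pc++;
            break;
        case OP_HALT:
            if (sp < 1) return -1;
            *result = stack[sp - 1];
            return 0;
        default:
            return -1;
        }
    }
    return -1;   /* ran off the end without OP_HALT */
}
```

Returning an error code on underflow is the software analogue of a hardware exception: the machine refuses to continue rather than reading garbage.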
Project 4: CHIP-8 Emulator
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++, Go
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 2: Intermediate (The Developer)
- Knowledge Area: CPU Emulation / ISA Implementation
- Software or Tool: CHIP-8 Emulator
- Main Book: “Computer Organization and Design” by Patterson & Hennessy
What you’ll build: A complete emulator for the CHIP-8, a simple 1970s virtual machine with 35 instructions, 16 registers, a 64x32 display, and keyboard input. You’ll run actual games like Pong, Tetris, and Space Invaders.
Why it teaches CPU fundamentals: CHIP-8 is the “Hello World” of CPU emulation. It has a real instruction set with opcodes, registers (V0-VF), a program counter, stack, and memory, but it’s small enough to finish in a week or two. You’ll implement every instruction by reading actual documentation.
Core challenges you’ll face:
- Parsing 2-byte opcodes and extracting operands → maps to instruction encoding and decoding
- Implementing 16 general-purpose registers → maps to register file design
- Managing PC (program counter) and subroutine stack → maps to control flow hardware
- Implementing timers that tick at 60Hz → maps to hardware timing and interrupts
- Drawing sprites with XOR → maps to memory-mapped I/O and graphics
Key Concepts:
- Opcode Decoding: “Computer Organization and Design” Chapter 4.3 - Patterson & Hennessy
- Register Files: “Computer Organization and Design” Chapter 4.2 - Patterson & Hennessy
- Program Counter & Branching: “Computer Systems: A Programmer’s Perspective” Chapter 3.6 - Bryant & O’Hallaron
- Memory-Mapped I/O: “Computer Organization and Design” Chapter 5.2 - Patterson & Hennessy
Resources for CHIP-8 specifics:
- “Cowgod’s CHIP-8 Technical Reference” - The definitive CHIP-8 spec (free online)
- “How to Write an Emulator (CHIP-8)” by Laurence Muller - Step-by-step guide
Difficulty: Intermediate. Time estimate: 1-2 weeks. Prerequisites: C programming, basic understanding of hex and binary.
Real world outcome:
$ ./chip8 roms/PONG.ch8
┌────────────────────────────────────────────────────────────────┐
│ █ │
│ █ │
│ █ │
│ █ █ │
│ █ ● █ │
│ █ █ │
│ │
│ │
│ Score: 3 Score: 2 │
└────────────────────────────────────────────────────────────────┘
Controls: W/S = Left paddle, Up/Down = Right paddle, ESC = Quit
You’re playing classic CHIP-8 games on a CPU you built!
Learning milestones:
- Opcodes decode correctly, registers update → You understand instruction encoding
- Jumps and subroutines work (CALL/RET) → You understand how the stack enables function calls
- Graphics display and games are playable → You’ve emulated a complete system!
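The XOR sprite drawing mentioned in the challenges can be sketched as below. It assumes a one-byte-per-pixel framebuffer and the common convention that the starting coordinate wraps while the sprite body is clipped at the screen edge; interpreters differ on that detail, so treat it as an assumption. The name `chip8_draw_sprite` is hypothetical.

```c
#include <stdint.h>

#define CHIP8_W 64
#define CHIP8_H 32

/* XOR an 8-pixel-wide, `rows`-tall sprite onto the framebuffer.
 * Returns 1 if any lit pixel was turned off (the value stored in VF),
 * otherwise 0. */
int chip8_draw_sprite(uint8_t fb[CHIP8_H][CHIP8_W],
                      uint8_t vx, uint8_t vy,
                      const uint8_t *sprite, int rows)
{
    int collision = 0;
    int x0 = vx % CHIP8_W;   /* the starting position wraps */
    int y0 = vy % CHIP8_H;

    for (int row = 0; row < rows; row++) {
        int y = y0 + row;
        if (y >= CHIP8_H)
            break;                                   /* clip at the bottom edge */
        for (int bit = 0; bit < 8; bit++) {
            int x = x0 + bit;
            if (x >= CHIP8_W)
                break;                               /* clip at the right edge */
            uint8_t pixel = (sprite[row] >> (7 - bit)) & 1;   /* MSB is leftmost */
            if (pixel) {
                if (fb[y][x])
                    collision = 1;
                fb[y][x] ^= 1;
            }
        }
    }
    return collision;
}
```

Drawing the same sprite twice at the same position erases it and reports a collision, which is how CHIP-8 games both animate and detect hits.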
Project 5: Simple RISC CPU Emulator (Custom ISA)
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++, Zig
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 3: Advanced (The Engineer)
- Knowledge Area: CPU Architecture / ISA Design
- Software or Tool: RISC CPU Emulator
- Main Book: “Computer Organization and Design RISC-V Edition” by Patterson & Hennessy
What you’ll build: Design and implement your own simple RISC instruction set with ~20 instructions, 8 registers, and a memory system. Write an assembler for it, then write programs in your own assembly language.
Why it teaches CPU fundamentals: Unlike CHIP-8 (where you follow a spec), here YOU design the ISA. You’ll make decisions like: How many bits per instruction? How many registers? What addressing modes? This forces you to understand why ISAs are designed the way they are.
Core challenges you’ll face:
- Designing fixed-width instruction encoding → maps to RISC design philosophy
- Implementing addressing modes (immediate, register, memory) → maps to operand fetch
- Building an assembler (text → binary) → maps to machine code generation
- Implementing load/store for memory access → maps to memory hierarchy
- Adding a status register (zero, negative, carry flags) → maps to condition codes
Key Concepts:
- RISC vs CISC Philosophy: “Computer Organization and Design” Chapter 2.18 - Patterson & Hennessy
- Instruction Encoding: “Computer Organization and Design” Chapter 2.5 - Patterson & Hennessy
- Addressing Modes: “Computer Systems: A Programmer’s Perspective” Chapter 3.4 - Bryant & O’Hallaron
- Assembler Design: “Language Implementation Patterns” Chapter 5 - Terence Parr
Difficulty: Advanced. Time estimate: 2-3 weeks. Prerequisites: CHIP-8 emulator completed, comfort with binary/hex.
Real world outcome:
$ cat fibonacci.asm
; Your custom ISA assembly!
MOV R0, #0 ; fib(n-2) = 0
MOV R1, #1 ; fib(n-1) = 1
MOV R2, #10 ; count = 10
loop:
ADD R3, R0, R1 ; fib(n) = fib(n-2) + fib(n-1)
MOV R0, R1 ; shift values
MOV R1, R3
SUB R2, R2, #1 ; count--
BNZ loop ; if count != 0, continue
HALT
$ ./myasm fibonacci.asm -o fibonacci.bin
Assembled: 9 instructions, 36 bytes
$ ./mycpu fibonacci.bin --trace
[0x00] MOV R0, #0 | R0=0x00000000
[0x04] MOV R1, #1 | R1=0x00000001
[0x08] MOV R2, #10 | R2=0x0000000A
[0x0C] ADD R3, R0, R1 | R3=0x00000001
...
[HALT] R0 = 55 (10th Fibonacci number)
Learning milestones:
- Your ISA executes simple programs → You understand fetch-decode-execute deeply
- Your assembler produces working binaries → You understand the assembly→machine code pipeline
- Programs with loops and conditionals work → You understand control flow at the hardware level
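The text-to-binary step of the assembler can be sketched for a tiny subset of instructions. Everything about the encoding here is an assumption made for illustration, since designing the real layout is the point of the project: an 8-bit opcode in the top byte, 4-bit register fields, and a 16-bit immediate. The name `assemble_line` and the opcode values are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical 32-bit fixed-width layout (an assumption, not a spec):
 *   register form:  bits 31-24 opcode | 23-20 rd | 19-16 rs1 | 15-12 rs2
 *   immediate form: bits 31-24 opcode | 23-20 rd | 15-0 imm16 */
enum { OPC_ADD = 0x01, OPC_MOVI = 0x02, OPC_HALT = 0xFF };
#define NUM_REGS 8

/* Assemble one line of a tiny subset: "ADD Rd, Rs1, Rs2",
 * "MOV Rd, #imm", and "HALT". Text after the operands (such as a
 * ';' comment) is ignored. Returns 1 on success, 0 on error. */
int assemble_line(const char *line, uint32_t *word)
{
    unsigned rd, rs1, rs2;
    int imm;
    char mnemonic[16];

    if (sscanf(line, " ADD R%u , R%u , R%u", &rd, &rs1, &rs2) == 3) {
        if (rd >= NUM_REGS || rs1 >= NUM_REGS || rs2 >= NUM_REGS)
            return 0;
        *word = ((uint32_t)OPC_ADD << 24) | (rd << 20) | (rs1 << 16) | (rs2 << 12);
        return 1;
    }
    if (sscanf(line, " MOV R%u , #%d", &rd, &imm) == 2) {
        if (rd >= NUM_REGS || imm < 0 || imm > 0xFFFF)
            return 0;
        *word = ((uint32_t)OPC_MOVI << 24) | (rd << 20) | (uint32_t)imm;
        return 1;
    }
    if (sscanf(line, " %15s", mnemonic) == 1 && strcmp(mnemonic, "HALT") == 0) {
        *word = (uint32_t)OPC_HALT << 24;
        return 1;
    }
    return 0;   /* unknown mnemonic or malformed operands */
}
```

A full assembler would add a first pass that records label addresses so that branches such as `BNZ loop` can be resolved; this sketch covers only the per-line encoding.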
Project 6: 6502 CPU Emulator (NES/C64 CPU)
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 3: Advanced (The Engineer)
- Knowledge Area: CPU Emulation / 8-bit Architecture
- Software or Tool: 6502 Emulator
- Main Book: “Computer Organization and Design” by Patterson & Hennessy
What you’ll build: A cycle-accurate emulator of the MOS 6502 processor, the CPU whose variants powered the NES, Commodore 64, Apple II, and Atari 2600. You’ll implement all 56 official instructions with their various addressing modes.
Why it teaches CPU fundamentals: The 6502 is a real CPU that powered a generation of computers. Unlike toy examples, it has complex addressing modes (zero-page, indexed, indirect), a real status register, and quirks that teach you how real silicon behaves. Running actual 6502 code (from the 1980s!) is incredibly satisfying.
Core challenges you’ll face:
- Implementing 13 different addressing modes → maps to how CPUs find operands
- Handling page-boundary crossing (extra cycles) → maps to memory access timing
- Implementing the status register (N, V, Z, C flags) → maps to condition flags
- Proper BCD (Binary Coded Decimal) arithmetic → maps to specialized ALU modes
- Cycle-accurate timing → maps to per-instruction cycle counts and bus timing
Key Concepts:
- Addressing Modes: “Computer Systems: A Programmer’s Perspective” Chapter 3.4 - Bryant & O’Hallaron
- Status Flags and Branching: “Computer Organization and Design” Chapter 4.5 - Patterson & Hennessy
- Memory Timing: “Computer Organization and Design” Chapter 5.1 - Patterson & Hennessy
- 6502 Specifics: “Programming the 6502” by Rodnay Zaks (classic reference)
Difficulty: Advanced. Time estimate: 3-4 weeks. Prerequisites: Previous emulator experience (CHIP-8 or custom RISC).
Real world outcome:
$ ./emu6502 test_suite/6502_functional_test.bin
Running Klaus Dormann's 6502 Functional Test Suite...
[OK] LDA/STA/LDX/STX operations
[OK] Arithmetic: ADC, SBC
[OK] Logic: AND, ORA, EOR
[OK] Shifts: ASL, LSR, ROL, ROR
[OK] Branches: BEQ, BNE, BCC, BCS, BMI, BPL, BVC, BVS
[OK] Stack operations: PHA, PLA, PHP, PLP
[OK] Subroutines: JSR, RTS
[OK] Indexed addressing modes
[OK] Indirect addressing modes
ALL TESTS PASSED! Your 6502 is working correctly.
$ ./emu6502 roms/apple1_basic.bin
Apple I BASIC loaded. Type BASIC commands:
> PRINT 2 + 2
4
> 10 FOR I = 1 TO 5
> 20 PRINT I * I
> 30 NEXT I
> RUN
1
4
9
16
25
Learning milestones:
- Functional test suite passes → Your CPU is correct (verified against real hardware behavior)
- All addressing modes work → You deeply understand operand fetch
- Run Apple I BASIC or Commodore 64 programs → You’re running code from the 1970s and 1980s on your emulator!
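Two of the challenges above, the status flags and the page-crossing penalty, can be sketched in isolation. The struct `cpu6502` and the function names are hypothetical, and decimal (BCD) mode is deliberately left out of this sketch.

```c
#include <stdint.h>

/* Bit positions of the flags in the 6502 status register P. */
enum { FLAG_C = 0x01, FLAG_Z = 0x02, FLAG_V = 0x40, FLAG_N = 0x80 };

typedef struct {
    uint8_t a;   /* accumulator */
    uint8_t p;   /* processor status */
} cpu6502;

/* ADC in binary mode: A = A + operand + C, updating N, V, Z, C. */
void adc_binary(cpu6502 *cpu, uint8_t operand)
{
    unsigned carry_in = (cpu->p & FLAG_C) ? 1u : 0u;
    unsigned sum = (unsigned)cpu->a + operand + carry_in;
    uint8_t result = (uint8_t)sum;

    cpu->p &= (uint8_t)~(FLAG_C | FLAG_Z | FLAG_V | FLAG_N);
    if (sum > 0xFF)    cpu->p |= FLAG_C;
    if (result == 0)   cpu->p |= FLAG_Z;
    if (result & 0x80) cpu->p |= FLAG_N;
    /* Signed overflow: both inputs share a sign that the result lacks. */
    if (~(cpu->a ^ operand) & (cpu->a ^ result) & 0x80)
        cpu->p |= FLAG_V;

    cpu->a = result;
}

/* Indexed reads cost one extra cycle when base + index lands in a
 * different 256-byte page than base. */
int page_crossed(uint16_t base, uint8_t index)
{
    uint16_t effective = (uint16_t)(base + index);
    return (effective & 0xFF00) != (base & 0xFF00);
}
```

For example, adding 0x50 to 0x50 yields 0xA0 with V and N set: two positive signed values produced a negative result.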
Project 7: Memory Visualizer & Debugger
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++, Python
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool” (Solo-Preneur Potential)
- Difficulty: Level 3: Advanced (The Engineer)
- Knowledge Area: Memory Layout / Debugging
- Software or Tool: Memory Inspector
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: A tool that attaches to a running process (using ptrace on Linux or mach APIs on macOS) and visualizes its memory layout—showing the stack, heap, code section, and how they change over time.
Why it teaches CPU fundamentals: Understanding memory is essential for understanding CPUs. This project shows you exactly where variables live, how the stack grows, how the heap fragments, and how the CPU’s view of memory relates to actual physical RAM.
Core challenges you’ll face:
- Attaching to a process and reading its memory → maps to virtual memory concepts
- Parsing /proc/[pid]/maps or equivalent → maps to memory segments (text, data, bss, heap, stack)
- Visualizing stack frames and local variables → maps to calling conventions and stack layout
- Tracking allocations over time → maps to heap management
Key Concepts:
- Virtual Memory: “Computer Systems: A Programmer’s Perspective” Chapter 9 - Bryant & O’Hallaron
- Process Memory Layout: “Computer Systems: A Programmer’s Perspective” Chapter 7.9 - Bryant & O’Hallaron
- The Stack: “Computer Systems: A Programmer’s Perspective” Chapter 3.7 - Bryant & O’Hallaron
- ptrace System Call: “The Linux Programming Interface” Chapter 26 - Michael Kerrisk
Difficulty: Advanced. Time estimate: 2-3 weeks. Prerequisites: C, basic understanding of processes, comfort with system calls.
Real world outcome:
$ ./memviz ./my_program
Attaching to PID 12345...
Memory Map:
┌─────────────────────────────────────────────────────────────┐
│ 0x7fff00000000 ─────────────────────── STACK (grows down) │
│ │ main() frame: 64 bytes │
│ │ └─ int x = 42 at [RSP+0x08] │
│ │ └─ char buf[32] at [RSP+0x10] │
│ │ foo() frame: 32 bytes │
│ │ └─ return addr → main+0x1a │
│ ▼ │
│ [unmapped gap: tens of terabytes on x86-64] │
│ ▲ │
│ │ HEAP (grows up) │
│ │ [0x5555557a0000] malloc(1024) - active │
│ │ [0x5555557a0400] malloc(256) - freed │
├─────────────────────────────────────────────────────────────┤
│ 0x555555400000 ─── .text (code) ─── 4KB │
│ 0x555555600000 ─── .data (globals) ─── 1KB │
│ 0x555555601000 ─── .bss (zero-init) ─── 512B │
└─────────────────────────────────────────────────────────────┘
[Press 's' to step, 'c' to continue, 'q' to quit]
Learning milestones:
- Can read another process’s memory → You understand process isolation and debugging
- Stack visualization shows call frames → You see how functions work at the memory level
- Heap tracking works → You understand dynamic memory allocation
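Parsing `/proc/[pid]/maps` is mostly a matter of one `sscanf` per line. The sketch below assumes the usual Linux field order (address range, permissions, offset, device, inode, optional pathname); the struct `map_region` and the function name are hypothetical.

```c
#include <stdio.h>

typedef struct {
    unsigned long long start, end, offset;
    char perms[5];      /* e.g. "r-xp" */
    char path[256];     /* empty for anonymous mappings */
} map_region;

/* Parse one line of /proc/[pid]/maps. Returns 1 on success, 0 otherwise. */
int parse_maps_line(const char *line, map_region *r)
{
    r->path[0] = '\0';
    /* device (major:minor) and inode are matched but not stored */
    int n = sscanf(line, "%llx-%llx %4s %llx %*x:%*x %*u %255[^\n]",
                   &r->start, &r->end, r->perms, &r->offset, r->path);
    return n >= 4;   /* the pathname field is optional */
}
```

Regions whose path is `[stack]` or `[heap]` are the ones the visualizer labels directly; the remaining file-backed regions can be grouped by permissions (`r-xp` for code, `rw-p` for data).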
Project 8: x86-64 Disassembler
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 4: Expert (The Systems Architect)
- Knowledge Area: ISA Encoding / Reverse Engineering
- Software or Tool: Disassembler
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: A disassembler that reads x86-64 machine code bytes and produces human-readable assembly. Handle common instructions (MOV, ADD, SUB, CMP, JMP, CALL, RET, PUSH, POP).
Why it teaches CPU fundamentals: x86-64 is a CISC architecture with variable-length instructions (1-15 bytes!). Building a disassembler forces you to understand instruction encoding at the deepest level: prefixes, opcodes, ModR/M bytes, SIB bytes, and displacement/immediate fields.
Core challenges you’ll face:
- Decoding variable-length instructions → maps to CISC instruction encoding
- Parsing ModR/M and SIB bytes → maps to addressing mode encoding
- Handling prefixes (REX, operand-size, etc.) → maps to instruction modifiers
- Resolving relative addresses for jumps → maps to PC-relative addressing
Key Concepts:
- x86-64 Instruction Format: “Intel 64 and IA-32 Architectures Software Developer’s Manual” Volume 2 - Intel
- CISC vs RISC: “Computer Organization and Design” Chapter 2.18 - Patterson & Hennessy
- ModR/M Encoding: “Computer Systems: A Programmer’s Perspective” Web Aside ASM:IA32 - Bryant & O’Hallaron
- Disassembly Techniques: “Practical Binary Analysis” Chapter 6 - Dennis Andriesse
Difficulty: Expert. Time estimate: 3-4 weeks. Prerequisites: 6502 emulator experience, familiarity with x86-64 assembly.
Real world outcome:
$ echo -e '\x55\x48\x89\xe5\x48\x83\xec\x10\xc7\x45\xfc\x00\x00\x00\x00\x5d\xc3' | ./disasm
0x0000: 55 push rbp
0x0001: 48 89 e5 mov rbp, rsp
0x0004: 48 83 ec 10 sub rsp, 0x10
0x0008: c7 45 fc 00 00 00 00 mov DWORD PTR [rbp-0x4], 0x0
0x000f: 5d pop rbp
0x0010: c3 ret
$ ./disasm /bin/ls | head -20
[Disassembly of /bin/ls entry point]
0x4010: 31 ed xor ebp, ebp
0x4012: 49 89 d1 mov r9, rdx
0x4015: 5e pop rsi
0x4016: 48 89 e2 mov rdx, rsp
...
Learning milestones:
- Simple instructions (push, pop, ret) decode → You understand basic opcode format
- ModR/M byte parsing works → You understand complex instruction encoding
- Can disassemble real binaries (/bin/ls) → Your disassembler handles real-world compiler output
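The bit-level pieces of the decoder are small enough to sketch on their own. The function and type names below are hypothetical; the field layouts are the standard ModR/M and REX encodings.

```c
#include <stdint.h>

typedef struct { uint8_t mod, reg, rm; } modrm_fields;

/* ModR/M layout: mod = bits 7-6, reg = bits 5-3, rm = bits 2-0. */
modrm_fields decode_modrm(uint8_t byte)
{
    modrm_fields f;
    f.mod = byte >> 6;
    f.reg = (byte >> 3) & 7;
    f.rm  = byte & 7;
    return f;
}

/* In 64-bit mode, bytes 0x40-0x4F are REX prefixes; the low nibble
 * holds the W, R, X, and B bits. */
int is_rex(uint8_t byte) { return (byte & 0xF0) == 0x40; }
int rex_w(uint8_t rex)   { return (rex >> 3) & 1; }   /* 1 = 64-bit operand */

/* A relative jump's displacement is measured from the address of the
 * NEXT instruction, so the instruction's own length is added in. */
uint64_t branch_target(uint64_t insn_addr, unsigned insn_len, int32_t rel)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)rel;
}
```

Applied to the sample bytes above: `48` is a REX prefix with W set, and the ModR/M byte `e5` in `48 89 e5` decodes to mod=3, reg=4 (rsp), rm=5 (rbp), which is `mov rbp, rsp`.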
Project 9: Write Programs in Assembly (x86-64 or ARM)
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: Assembly (x86-64 or ARM64)
- Alternative Programming Languages: RISC-V assembly
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 3: Advanced (The Engineer)
- Knowledge Area: Assembly Programming / Low-level Coding
- Software or Tool: Assembler (nasm, gas)
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: Write real assembly programs: a number printer, string reversal, bubble sort, and finally a simple shell or text editor. No C, no libraries—just raw syscalls.
Why it teaches CPU fundamentals: There’s no substitute for writing assembly. You’ll use registers directly, manage the stack manually, and see exactly what the CPU does. After this, you’ll read compiler output fluently and understand exactly what your C code becomes.
Core challenges you’ll face:
- Making system calls directly (no libc) → maps to user/kernel interface
- Managing registers (caller-saved vs callee-saved) → maps to calling conventions
- Implementing loops and conditionals → maps to branch instructions
- Stack-based local variables → maps to activation records
- String manipulation byte-by-byte → maps to memory access patterns
Key Concepts:
- x86-64 Assembly: “Computer Systems: A Programmer’s Perspective” Chapter 3 - Bryant & O’Hallaron
- System Calls: “The Linux Programming Interface” Chapter 3 - Michael Kerrisk
- Calling Conventions: “Computer Systems: A Programmer’s Perspective” Chapter 3.7 - Bryant & O’Hallaron
- Practical Assembly: “The Art of 64-Bit Assembly, Volume 1” - Randall Hyde
Difficulty: Advanced. Time estimate: 2-4 weeks (for multiple programs). Prerequisites: Understanding of registers, stack, basic x86-64 or ARM knowledge.
Real world outcome:
; hello.asm - x86-64 Linux
section .data
    msg: db "Hello from raw assembly!", 10
    len: equ $ - msg
section .text
    global _start
_start:
    mov rax, 1      ; sys_write
    mov rdi, 1      ; stdout
    mov rsi, msg    ; buffer
    mov rdx, len    ; length
    syscall
    mov rax, 60     ; sys_exit
    xor rdi, rdi    ; status = 0
    syscall
$ nasm -f elf64 hello.asm && ld hello.o -o hello && ./hello
Hello from raw assembly!
$ wc -c hello
352 hello # Only 352 bytes! No libc, no bloat.
Learning milestones:
- Hello World works with raw syscalls → You understand the syscall interface
- Bubble sort with loops and array access works → You can write real algorithms
- A simple shell (read command, fork, exec) → You’ve built something useful in pure assembly
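Before writing the number printer in assembly, it can help to pin the algorithm down in C and then translate it line by line. The sketch below is one way to do it (the name `u64_to_decimal` is hypothetical): digits come out least significant first, so the buffer is filled from the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert `value` to decimal ASCII, filling `buf` from the end.
 * Returns a pointer to the first digit and stores the digit count in
 * *len. `buf` must hold 20 bytes (UINT64_MAX has 20 digits); no NUL
 * terminator is written, because write(2) takes an explicit length. */
char *u64_to_decimal(uint64_t value, char buf[20], size_t *len)
{
    char *p = buf + 20;
    do {
        /* On x86-64, a single DIV yields the quotient in RAX and this
         * remainder in RDX. */
        *--p = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    *len = (size_t)(buf + 20 - p);
    return p;
}
```

The returned pointer and length map directly onto the `rsi` and `rdx` arguments of the `sys_write` call shown in `hello.asm`.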
Project 10: Game Boy Emulator (Z80-like CPU)
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 4: Expert (The Systems Architect)
- Knowledge Area: CPU Emulation / Complete System
- Software or Tool: Game Boy Emulator
- Main Book: “Game Boy Coding Adventure” by Maximilien Dagois
What you’ll build: A complete Game Boy emulator: CPU (modified Z80), memory banking, PPU (graphics), timers, and input. Run actual commercial games like Tetris, Pokemon, and Zelda.
Why it teaches CPU fundamentals: The Game Boy is a complete system. The CPU doesn’t exist in isolation—it coordinates with the PPU through memory-mapped registers, handles interrupts, and must run at precise timing. This project shows how all the pieces fit together.
Core challenges you’ll face:
- Implementing the modified Z80 instruction set → maps to full ISA implementation
- Memory banking (switching ROM/RAM banks) → maps to memory management
- Cycle-accurate PPU synchronization → maps to timing and synchronization
- Handling interrupts (VBlank, timer, etc.) → maps to interrupt handling
- Implementing LCD drawing (scanlines, sprites, background) → maps to hardware rendering
Key Concepts:
- Complete CPU Implementation: “Game Boy Coding Adventure” Chapters 3-5 - Maximilien Dagois
- Memory Mapping: “Computer Organization and Design” Chapter 5.2 - Patterson & Hennessy
- Interrupts: “Computer Organization and Design” Chapter 4.9 - Patterson & Hennessy
- Game Boy Specifics: “Pan Docs” (gbdev.io) - The definitive Game Boy technical reference
Difficulty: Expert. Time estimate: 1-2 months. Prerequisites: 6502 emulator experience, graphics programming basics.
Real world outcome:
$ ./gameboy roms/tetris.gb
┌────────────────────────────────────────┐
│ ████████████████ │
│ █ TETRIS █ │
│ ████████████████ │
│ │
│ ██ │
│ ██ │
│ ████ │
│ │
│ ████ │
│ ██ │
│ ██ │
│ ──────────────────── │
│ Score: 1,234 │
└────────────────────────────────────────┘
Controls: Arrow keys to move, Z to rotate, X to drop
You’re playing the actual Tetris ROM from 1989 on an emulator you built!
Learning milestones:
- CPU passes Blargg’s test ROMs → Your CPU is accurate
- Simple games boot and are playable → CPU + PPU + input work together
- Pokemon runs → You’ve handled edge cases and complex memory banking
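Interrupt dispatch on the Game Boy comes down to masking two registers and picking the lowest set bit. This sketch assumes the caller handles the rest of the sequence (pushing PC, clearing IME, spending the cycles); the name `gb_service_interrupt` is hypothetical.

```c
#include <stdint.h>

/* Interrupt bits shared by IE (0xFFFF) and IF (0xFF0F), in priority
 * order: bit 0 is served first. */
enum { INT_VBLANK = 0x01, INT_STAT = 0x02, INT_TIMER = 0x04,
       INT_SERIAL = 0x08, INT_JOYPAD = 0x10 };

/* Returns the handler address of the highest-priority interrupt that is
 * both requested (IF) and enabled (IE), and clears its IF bit.
 * Returns 0 if IME is off or nothing is pending. */
uint16_t gb_service_interrupt(int ime, uint8_t ie, uint8_t *iflag)
{
    if (!ime)
        return 0;
    uint8_t pending = ie & *iflag & 0x1F;
    for (int bit = 0; bit < 5; bit++) {
        if (pending & (1u << bit)) {
            *iflag &= (uint8_t)~(1u << bit);        /* acknowledge */
            return (uint16_t)(0x0040 + 8 * bit);    /* 0x40, 0x48, 0x50, 0x58, 0x60 */
        }
    }
    return 0;
}
```

Calling this once per CPU step, before fetching the next opcode, is a common place to hook it into the main loop.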
Project 11: Bare-Metal Programming (Raspberry Pi or Arduino)
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C with inline Assembly
- Alternative Programming Languages: Rust, Assembly-only
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 4: Expert (The Systems Architect)
- Knowledge Area: Bare-Metal / Operating Systems
- Software or Tool: Bare-metal Raspberry Pi / Arduino
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: A program that runs on a Raspberry Pi with NO operating system. You’ll boot directly into your code, blink an LED, read a button, and output text over UART—all by directly accessing hardware registers.
Why it teaches CPU fundamentals: With no OS, there’s nothing between you and the CPU. You’ll configure the hardware by writing to memory-mapped registers, handle interrupts yourself, and see exactly how the CPU boots. This is the ultimate understanding of how computers really work.
Core challenges you’ll face:
- Writing a bootloader/linker script → maps to how programs are loaded into memory
- Configuring GPIO by writing to registers → maps to memory-mapped I/O
- Implementing busy-wait and timer-based delays → maps to hardware timers
- Setting up UART for serial output → maps to peripheral communication
- Handling interrupts without an OS → maps to interrupt vectors and handlers
Key Concepts:
- Bootloaders and Linker Scripts: “Making Embedded Systems” Chapter 3 - Elecia White
- Memory-Mapped I/O: “Computer Organization and Design” Chapter 5.2 - Patterson & Hennessy
- Bare-Metal Setup: “Bare Metal C” Chapters 1-5 - Steve Oualline
- ARM Specifics: “The Art of ARM Assembly, Volume 1” - Randall Hyde
Difficulty: Expert. Time estimate: 2-4 weeks. Prerequisites: C programming, basic electronics, previous emulator experience.
Real world outcome:
// main.c - No OS, no libraries, just you and the hardware
// (gpio.h, uart.h, and delay_ms() are code you write yourself)
#include "gpio.h"
#include "uart.h"
void kernel_main(void) {
    uart_init();
    uart_puts("Hello from bare metal!\r\n");
    gpio_set_function(47, GPIO_FUNC_OUTPUT); // LED pin
    while (1) {
        gpio_set(47);    // LED on
        delay_ms(500);
        gpio_clear(47);  // LED off
        delay_ms(500);
        uart_puts("Blink!\r\n");
    }
}
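$ arm-none-eabi-gcc -nostdlib -ffreestanding main.c -o kernel.elf
$ arm-none-eabi-objcopy kernel.elf -O binary kernel.img
$ # 32-bit images are named kernel.img or kernel7.img; kernel8.img is the 64-bit name and needs an AArch64 toolchain
$ # Copy to SD card, insert in Raspberry Pi, power on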
[UART output via serial cable]:
Hello from bare metal!
Blink!
Blink!
Blink!
The LED blinks. No Linux. No OS. Just your code running directly on the CPU.
Learning milestones:
- LED blinks → You can control hardware directly
- UART output works → You’ve configured a peripheral from scratch
- Button input with interrupts → You understand interrupt handling without an OS
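The GPIO helpers called from `main.c` could look roughly like this. One deliberate difference from the snippet above: these versions take the register base as a parameter, an assumption made so the logic can be exercised on a host machine against a plain array. The register offsets follow the BCM283x GPIO block; on hardware the base would be something like `(volatile uint32_t *)0x20200000` on a Pi 1/Zero or `0x3F200000` on a Pi 2/3.

```c
#include <stdint.h>

/* Word offsets of BCM283x GPIO registers from the GPIO base address. */
enum { GPFSEL0 = 0x00 / 4, GPSET0 = 0x1C / 4, GPCLR0 = 0x28 / 4 };
enum { GPIO_FUNC_INPUT = 0, GPIO_FUNC_OUTPUT = 1 };

/* Each GPFSELn register holds ten 3-bit function fields. */
void gpio_set_function(volatile uint32_t *gpio, unsigned pin, unsigned func)
{
    volatile uint32_t *reg = &gpio[GPFSEL0 + pin / 10];
    unsigned shift = (pin % 10) * 3;
    uint32_t v = *reg;                    /* read-modify-write: keep other pins */
    v &= ~(UINT32_C(7) << shift);
    v |= (uint32_t)(func & 7) << shift;
    *reg = v;
}

/* GPSETn and GPCLRn are write-one-to-act registers, 32 pins each, so
 * no read-modify-write is needed. */
void gpio_set(volatile uint32_t *gpio, unsigned pin)
{
    gpio[GPSET0 + pin / 32] = UINT32_C(1) << (pin % 32);
}

void gpio_clear(volatile uint32_t *gpio, unsigned pin)
{
    gpio[GPCLR0 + pin / 32] = UINT32_C(1) << (pin % 32);
}
```

The `volatile` qualifier matters on real hardware: without it the compiler may merge or drop writes that look redundant but actually toggle a pin.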
Project 12: RISC-V CPU on FPGA
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: Verilog or VHDL
- Alternative Programming Languages: Chisel
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 5: Master (The First-Principles Wizard)
- Knowledge Area: CPU Design / Digital Logic
- Software or Tool: FPGA (Lattice iCE40, Xilinx Artix)
- Main Book: “Digital Design and Computer Architecture” by Harris & Harris
What you’ll build: A real CPU implemented in hardware (Verilog/VHDL) running on an FPGA. Implement the RISC-V RV32I instruction set, write programs in RISC-V assembly, and watch them execute on your CPU.
Why it teaches CPU fundamentals: This is the ultimate project. You’re not emulating a CPU in software; you’re building one in actual hardware. You’ll implement registers as flip-flops, the ALU as combinational logic, and the control unit as a state machine. After this, you understand CPUs at the gate and register-transfer level.
Core challenges you’ll face:
- Implementing a register file in Verilog → maps to storage elements
- Building an ALU with carry chains → maps to arithmetic circuits
- Designing the control unit (FSM or hardwired) → maps to control signals
- Memory interface (BRAM, load/store) → maps to memory hierarchy
- Pipelining (optional, advanced) → maps to CPU performance optimization
Key Concepts:
- Digital Logic in Verilog: “Digital Design and Computer Architecture” Chapters 1-4 - Harris & Harris
- CPU Datapath: “Computer Organization and Design RISC-V Edition” Chapter 4.3 - Patterson & Hennessy
- Control Unit Design: “Computer Organization and Design RISC-V Edition” Chapter 4.4 - Patterson & Hennessy
- FPGA Workflow: “Getting Started with FPGAs” - Russell Merrick
Difficulty: Master. Time estimate: 2-3 months. Prerequisites: Digital logic, previous emulator experience, basic Verilog/VHDL.
Real world outcome:
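// A snippet of your CPU in Verilog
module cpu (
    input wire clk,
    input wire reset,
    output wire [31:0] pc_out,
    output wire [31:0] instr_out
);
    reg [31:0] pc;              // program counter
    reg [31:0] regs [0:31];     // register file (x0..x31)
    reg [31:0] imem [0:255];    // instruction memory, 256 words (maps to BRAM)
    wire [31:0] instruction;
    // Fetch: the PC is a byte address, so drop the low 2 bits for a word index
    assign instruction = imem[pc[9:2]];
    assign pc_out = pc;
    assign instr_out = instruction;
    // Decode
    wire [6:0] opcode = instruction[6:0];
    wire [4:0] rd = instruction[11:7];
    wire [4:0] rs1 = instruction[19:15];
    wire [4:0] rs2 = instruction[24:20];
    // Execute (ALU)
    // ... your implementation here
endmodule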
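$ yosys -p "synth_ice40 -top cpu -json cpu.json" cpu.v   # Synthesize to a JSON netlist
$ nextpnr-ice40 --hx8k --package ct256 --json cpu.json --pcf cpu.pcf --asc cpu.asc   # Place & Route (package and .pcf pin file depend on your board)
$ icepack cpu.asc cpu.bin   # Generate bitstream
$ iceprog cpu.bin   # Program FPGA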
[LEDs on FPGA board show program counter advancing]
[UART output]: Running Fibonacci on YOUR CPU!
F(10) = 55
You’ve built a real CPU. In hardware. That runs programs.
Learning milestones:
- Basic instructions execute in simulation → Your Verilog describes a working CPU
- Runs on actual FPGA hardware → Your design maps to real logic gates
- Runs programs you wrote in RISC-V assembly → You have a complete, working CPU you designed
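While bringing up the decoder in simulation, a small software reference model can help you check the fields your Verilog extracts. The following is a minimal sketch in C, assuming the standard RV32I R-type and I-type field positions; the names `rv_fields` and `rv_decode` are illustrative and not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the RV32I R-type and I-type formats (names are illustrative). */
typedef struct {
    uint32_t opcode;  /* bits  6:0  */
    uint32_t rd;      /* bits 11:7  */
    uint32_t funct3;  /* bits 14:12 */
    uint32_t rs1;     /* bits 19:15 */
    uint32_t rs2;     /* bits 24:20 (R-type only) */
    uint32_t funct7;  /* bits 31:25 (R-type only) */
    int32_t  imm_i;   /* bits 31:20, sign-extended (I-type only) */
} rv_fields;

static rv_fields rv_decode(uint32_t instr)
{
    rv_fields f;
    f.opcode = instr & 0x7Fu;
    f.rd     = (instr >> 7)  & 0x1Fu;
    f.funct3 = (instr >> 12) & 0x07u;
    f.rs1    = (instr >> 15) & 0x1Fu;
    f.rs2    = (instr >> 20) & 0x1Fu;
    f.funct7 = (instr >> 25) & 0x7Fu;

    /* Sign-extend the 12-bit immediate without relying on signed shifts. */
    uint32_t raw = instr >> 20;
    f.imm_i = (raw & 0x800u) ? (int32_t)raw - 0x1000 : (int32_t)raw;

    /* Sanity check: 5-bit fields can only name x0..x31. */
    assert(f.rd < 32 && f.rs1 < 32 && f.rs2 < 32);
    return f;
}
```

For example, `rv_decode(0x00A00093)` (the encoding of `addi x1, x0, 10`) yields opcode `0x13`, `rd = 1`, `rs1 = 0`, and `imm_i = 10`. The same values should appear on the corresponding wires in your waveform viewer.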
Project 13: CPU Pipeline Simulator
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Python, C++
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 3: Advanced (The Engineer)
- Knowledge Area: CPU Performance / Pipelining
- Software or Tool: Pipeline Visualizer
- Main Book: “Computer Organization and Design” by Patterson & Hennessy
What you’ll build: A simulator that shows how a pipelined CPU executes instructions. Visualize the 5-stage pipeline (IF, ID, EX, MEM, WB), show hazards, forwarding, stalls, and branch prediction.
Why it teaches CPU fundamentals: Modern CPUs don’t execute one instruction at a time—they overlap many instructions in a pipeline. Understanding hazards (data, control, structural) and how CPUs solve them (forwarding, stalling, speculation) is essential for writing fast code.
Core challenges you’ll face:
- Implementing 5 pipeline stages → maps to pipelining basics
- Detecting data hazards (RAW in the in-order 5-stage pipeline; WAR and WAW if you extend it to out-of-order issue) → maps to dependency analysis
- Implementing forwarding/bypassing → maps to hazard mitigation
- Branch prediction (static, dynamic) → maps to control hazards
- Calculating CPI (Cycles Per Instruction) → maps to performance metrics
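The hazard and forwarding challenges above reduce to a few register-number comparisons between pipeline stages. Here is a minimal sketch in C, assuming a classic in-order 5-stage pipeline where register 0 is hardwired to zero (as in RISC-V and MIPS); `stage_info`, `select_forward`, and `load_use_stall` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Where an EX-stage operand should come from. */
typedef enum { FWD_NONE, FWD_FROM_EX_MEM, FWD_FROM_MEM_WB } fwd_src;

/* Minimal view of an instruction held in a pipeline register (illustrative). */
typedef struct {
    bool writes_reg;  /* instruction writes a destination register */
    bool is_load;     /* instruction is a load (data ready only after MEM) */
    int  rd;          /* destination register; 0 is assumed hardwired to zero */
} stage_info;

/* Pick the forwarding source for one source register of the instruction in EX. */
static fwd_src select_forward(int rs, stage_info ex_mem, stage_info mem_wb)
{
    assert(rs >= 0 && rs < 32);
    if (ex_mem.writes_reg && ex_mem.rd != 0 && ex_mem.rd == rs)
        return FWD_FROM_EX_MEM;   /* the most recent producer wins */
    if (mem_wb.writes_reg && mem_wb.rd != 0 && mem_wb.rd == rs)
        return FWD_FROM_MEM_WB;
    return FWD_NONE;
}

/* A load in EX followed by a dependent instruction in ID needs one stall cycle. */
static bool load_use_stall(stage_info id_ex, int id_rs1, int id_rs2)
{
    return id_ex.is_load && id_ex.rd != 0 &&
           (id_ex.rd == id_rs1 || id_ex.rd == id_rs2);
}
```

Checking EX/MEM before MEM/WB matters: if two older instructions both write the same register, the younger one holds the value the program expects. The load-use check runs one stage earlier (in ID) because a load's data cannot be forwarded until it has passed through MEM.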
Key Concepts:
- Pipelining: “Computer Organization and Design” Chapter 4.5-4.8 - Patterson & Hennessy
- Hazards: “Computer Organization and Design” Chapter 4.7 - Patterson & Hennessy
- Branch Prediction: “Computer Architecture” Chapter 3.3 - Hennessy & Patterson
- Performance: “Computer Organization and Design” Chapter 1.6 - Patterson & Hennessy
Difficulty: Advanced. Time estimate: 2-3 weeks. Prerequisites: Understanding of basic CPU operation, emulator experience.
Real world outcome:
$ ./pipelinesim program.asm
Clock Cycle: 10
┌───────┬───────┬───────┬───────┬───────┐
│  IF   │  ID   │  EX   │  MEM  │  WB   │
├───────┼───────┼───────┼───────┼───────┤
│ ADD   │ SUB   │ LW    │ MUL   │ ADD   │
│ R5,R6 │ R4,R1 │ R3,0( │ R2,R1 │ R1,R2 │
│ ,R7   │ ,R3   │ R2)   │ ,R3   │ ,R3   │
└───────┴───────┴───────┴───────┴───────┘
Hazard Detected: RAW on R3 between LW (EX) and SUB (ID)
Action: Stall SUB for 1 cycle, then forward from MEM/WB to EX stage
Stats:
- Instructions: 100
- Cycles: 134
- CPI: 1.34
- Stalls due to load-use: 12
- Branch mispredictions: 5
Learning milestones:
- Instructions flow through 5 stages → You understand basic pipelining
- Hazards are detected and displayed → You understand dependencies
- Performance metrics match expected values → You can analyze CPU efficiency
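To sanity-check the statistics your simulator reports, you can compare them against a simple additive cycle model. The sketch below is in C and assumes every stall and every misprediction adds whole bubble cycles on top of one cycle per instruction; the misprediction penalty is a parameter you choose for your pipeline, and the function names are illustrative. The numbers in the sample output above are illustrative, so plug in your own counts.

```c
#include <assert.h>

/* Total cycles for an in-order pipeline under a simple additive model:
   fill time + one cycle per instruction + stall bubbles + flush bubbles. */
static long pipeline_cycles(long instructions, int stages,
                            long stall_cycles, long mispredicts,
                            int mispredict_penalty)
{
    assert(instructions > 0 && stages > 0);
    return (stages - 1) + instructions + stall_cycles
         + mispredicts * mispredict_penalty;
}

/* Cycles per instruction. */
static double cpi(long cycles, long instructions)
{
    assert(instructions > 0);
    return (double)cycles / (double)instructions;
}
```

For instance, 100 instructions on 5 stages with 12 stall cycles and 5 mispredictions at an assumed 2-cycle penalty give `pipeline_cycles(100, 5, 12, 5, 2) == 126`, a CPI of 1.26. If your simulator reports a different total, the gap points at a stall source the model does not cover.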
Project 14: Cache Simulator
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++, Python
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 3: Advanced (The Engineer)
- Knowledge Area: Memory Hierarchy / Caching
- Software or Tool: Cache Simulator
- Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron
What you’ll build: A cache simulator that models L1, L2, and L3 caches. Feed it memory access traces and see hit rates, eviction patterns, and how different configurations affect performance.
Why it teaches CPU fundamentals: Memory is the bottleneck. On memory-bound workloads, a CPU can spend more cycles waiting for memory than computing. Understanding how caches work, why they help, and how to write cache-friendly code is essential for performance.
Core challenges you’ll face:
- Implementing direct-mapped, set-associative, fully-associative caches → maps to cache organization
- Address parsing (tag, index, offset) → maps to how addresses map to cache lines
- Replacement policies (LRU, FIFO, random) → maps to eviction strategies
- Multi-level cache hierarchy → maps to modern memory systems
- Write policies (write-through, write-back) → maps to keeping cache and memory consistent
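The address-parsing step can be sketched as follows. This is a minimal C example, assuming power-of-two cache size, associativity, and line size; `cache_addr`, `log2_pow2`, and `split_address` are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t tag;     /* identifies which memory block occupies the line */
    uint64_t index;   /* selects the set */
    uint64_t offset;  /* byte position within the line */
} cache_addr;

/* Number of bits needed to index n items; n must be a power of two. */
static unsigned log2_pow2(uint64_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    unsigned bits = 0;
    while (n > 1) { n >>= 1; bits++; }
    return bits;
}

/* Split an address for a cache of cache_bytes total, `ways` lines per set. */
static cache_addr split_address(uint64_t addr, uint64_t cache_bytes,
                                uint64_t ways, uint64_t line_bytes)
{
    uint64_t sets = cache_bytes / (ways * line_bytes);
    unsigned offset_bits = log2_pow2(line_bytes);
    unsigned index_bits  = log2_pow2(sets);

    cache_addr a;
    a.offset = addr & (line_bytes - 1);
    a.index  = (addr >> offset_bits) & (sets - 1);
    a.tag    = addr >> (offset_bits + index_bits);
    return a;
}
```

With a 32KB, 8-way cache and 64-byte lines there are 64 sets, so the low 6 bits are the offset and the next 6 bits are the index. The address `0x7fff5a00` then splits into tag `0x7fff5`, index 40, offset 0. A fully-associative cache is the special case of one set, where the index is always 0.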
Key Concepts:
- Cache Organization: “Computer Systems: A Programmer’s Perspective” Chapter 6.3-6.4 - Bryant & O’Hallaron
- Locality: “Computer Systems: A Programmer’s Perspective” Chapter 6.2 - Bryant & O’Hallaron
- Multi-level Caches: “Computer Organization and Design” Chapter 5.8 - Patterson & Hennessy
- Cache Performance: “Computer Systems: A Programmer’s Perspective” Chapter 6.6 - Bryant & O’Hallaron
Difficulty: Advanced. Time estimate: 2 weeks. Prerequisites: Understanding of memory addressing, basic C.
Real world outcome:
$ ./cachesim --l1-size=32K --l1-assoc=8 --line=64 trace.txt
Simulating cache access for 1,000,000 memory references...
Cache Configuration:
L1: 32KB, 8-way set-associative, 64-byte lines
L2: 256KB, 16-way set-associative, 64-byte lines
Results:
┌─────────┬───────────┬───────────┬───────────┐
│ Cache   │ Hits      │ Misses    │ Hit Rate  │
├─────────┼───────────┼───────────┼───────────┤
│ L1      │ 920,145   │ 79,855    │ 92.0%     │
│ L2      │ 71,234    │ 8,621     │ 89.2%     │
│ Memory  │ 8,621     │ -         │ -         │
└─────────┴───────────┴───────────┴───────────┘
Average Memory Access Time: 4.3 cycles
(vs. ~100 cycles if no cache!)
Hottest cache lines:
0x7fff5a00: 15,234 hits (loop counter)
0x5555b040: 8,456 hits (frequently accessed array)
Learning milestones:
- Direct-mapped cache works → You understand basic cache mechanics
- Set-associative with LRU works → You understand realistic cache design
- Multi-level hierarchy shows realistic behavior → You understand modern memory systems
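The average memory access time (AMAT) reported by the simulator follows from the hit and miss counts once you fix a latency for each level. Below is a minimal C sketch for a two-level hierarchy; the latencies are parameters because they depend on the machine being modeled, and the function name is illustrative.

```c
#include <assert.h>

/* Average memory access time for a two-level hierarchy, in cycles.
   AMAT = L1 time + L1 miss rate * (L2 time + L2 local miss rate * memory time) */
static double amat_two_level(long l1_hits, long l1_misses,
                             long l2_hits, long l2_misses,
                             double l1_cycles, double l2_cycles,
                             double mem_cycles)
{
    long accesses = l1_hits + l1_misses;
    assert(accesses > 0);
    assert(l2_hits + l2_misses == l1_misses);  /* every L1 miss goes to L2 */

    double l1_miss_rate = (double)l1_misses / (double)accesses;
    double l2_miss_rate = l1_misses ? (double)l2_misses / (double)l1_misses : 0.0;

    return l1_cycles + l1_miss_rate * (l2_cycles + l2_miss_rate * mem_cycles);
}
```

With the counts from the sample output and assumed latencies of 2, 18, and 100 cycles for L1, L2, and memory (values chosen for illustration, not taken from the sample run), `amat_two_level(920145, 79855, 71234, 8621, 2.0, 18.0, 100.0)` comes out at about 4.3 cycles. Note that the L2 miss rate here is local (misses divided by L2 accesses), not global.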
Project 15: Branch Predictor Simulator
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, C++, Python
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 3: Advanced (The Engineer)
- Knowledge Area: CPU Performance / Branch Prediction
- Software or Tool: Branch Predictor Simulator
- Main Book: “Computer Architecture” by Hennessy & Patterson
What you’ll build: A simulator that implements various branch prediction algorithms: static (always taken/not-taken), 1-bit, 2-bit saturating counters, and more advanced schemes like gshare and tournament predictors.
Why it teaches CPU fundamentals: Branch mispredictions are expensive—they flush the pipeline and waste cycles. Modern CPUs have sophisticated predictors with 95%+ accuracy. Understanding prediction helps you write code that CPUs can predict well.
Core challenges you’ll face:
- Implementing 2-bit saturating counters → maps to local branch history
- Implementing gshare (global history XOR) → maps to global branch prediction
- Tournament predictor (choosing between predictors) → maps to hybrid schemes
- Analyzing misprediction patterns → maps to understanding CPU performance
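The first two challenges can be combined in one small data structure: a table of 2-bit saturating counters indexed by the branch address XORed with a global history register. The following C sketch is one way to do it, assuming 4-byte instructions, a 12-bit history, and counters initialized to "weakly not taken"; all names and the initial state are illustrative choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HISTORY_BITS 12                  /* assumed history length */
#define TABLE_SIZE   (1u << HISTORY_BITS)

typedef struct {
    uint8_t  counters[TABLE_SIZE];  /* 2-bit saturating counters, values 0..3 */
    uint32_t history;               /* global history, newest outcome in bit 0 */
} gshare;

static void gshare_init(gshare *g)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++)
        g->counters[i] = 1;         /* start at "weakly not taken" */
    g->history = 0;
}

static uint32_t gshare_index(const gshare *g, uint32_t pc)
{
    /* Drop the low 2 bits of the PC (assumes 4-byte instructions), then XOR. */
    return ((pc >> 2) ^ g->history) & (TABLE_SIZE - 1);
}

static bool gshare_predict(const gshare *g, uint32_t pc)
{
    return g->counters[gshare_index(g, pc)] >= 2;   /* 2 or 3 = predict taken */
}

static void gshare_update(gshare *g, uint32_t pc, bool taken)
{
    uint8_t *c = &g->counters[gshare_index(g, pc)];
    assert(*c <= 3);
    if (taken  && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
    /* Shift the actual outcome into the global history. */
    g->history = ((g->history << 1) | (taken ? 1u : 0u)) & (TABLE_SIZE - 1);
}
```

In a trace-driven simulator you would call `gshare_predict` for each branch record, compare the result with the recorded outcome, and then call `gshare_update`. Removing the XOR with `history` from `gshare_index` turns the same code into a plain per-address 2-bit predictor, which makes the two easy to compare.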
Key Concepts:
- Branch Prediction Basics: “Computer Organization and Design” Chapter 4.8 - Patterson & Hennessy
- Dynamic Prediction: “Computer Architecture” Chapter 3.3 - Hennessy & Patterson
- Tournament Predictors: “Computer Architecture” Chapter 3.3 - Hennessy & Patterson
- Real-World Predictors: Academic papers on TAGE, neural branch prediction
Difficulty: Advanced. Time estimate: 1-2 weeks. Prerequisites: Understanding of pipelining, control hazards.
Real world outcome:
$ ./branchsim --predictor=gshare --history=12 trace.txt
Branch Predictor Comparison (1M branches):
┌──────────────────────┬───────────────┬──────────────┐
│ Predictor            │ Accuracy (%)  │ Mispredicts  │
├──────────────────────┼───────────────┼──────────────┤
│ Always Taken         │ 62.3%         │ 377,000      │
│ 1-bit Local          │ 85.1%         │ 149,000      │
│ 2-bit Saturating     │ 91.2%         │ 88,000       │
│ Gshare (12-bit)      │ 95.4%         │ 46,000       │
│ Tournament           │ 96.1%         │ 39,000       │
└──────────────────────┴───────────────┴──────────────┘
If pipeline is 15 stages deep, misprediction cost = ~15 cycles
Cycles saved by 2-bit vs Always-Taken: 4,335,000 cycles!
Branch patterns detected:
- Loop branches: 98.2% predictable (taken N-1 times, then not taken)
- Data-dependent branches: 73.4% predictable (biased toward taken)
Learning milestones:
- 2-bit predictor beats 1-bit → You understand why state machines help
- Gshare beats local predictors → You understand global correlation
- You can analyze code for predictability → You write CPU-friendly code
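The first milestone is easy to see on a single loop branch. The C sketch below counts mispredictions for one loop-closing branch that is taken on every iteration except the last, with the loop entered repeatedly; it assumes the predictor starts in its strongest "taken" state, and the function name is illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Count mispredictions for a single loop-closing branch that is taken
   (iterations - 1) times and then falls through, repeated `runs` times.
   counter_max = 1 models a 1-bit predictor, 3 models a 2-bit counter. */
static long loop_mispredicts(int counter_max, int iterations, int runs)
{
    assert(counter_max == 1 || counter_max == 3);
    int threshold = (counter_max + 1) / 2;   /* predict taken at or above this */
    int counter = counter_max;               /* start fully "taken" */
    long wrong = 0;

    for (int r = 0; r < runs; r++) {
        for (int i = 0; i < iterations; i++) {
            bool taken = (i < iterations - 1);
            bool predicted = (counter >= threshold);
            if (predicted != taken) wrong++;
            if (taken  && counter < counter_max) counter++;
            if (!taken && counter > 0)           counter--;
        }
    }
    return wrong;
}
```

For a 10-iteration loop entered 100 times (1,000 branches), `loop_mispredicts(1, 10, 100)` returns 199 and `loop_mispredicts(3, 10, 100)` returns 100. The 1-bit predictor is wrong twice per loop, once at the exit and again on re-entry, while the 2-bit counter only drops to "weakly taken" at the exit and so still predicts the re-entry correctly.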
Project Comparison Table
| # | Project | Difficulty | Time | Depth of Understanding | Coolness |
|---|---|---|---|---|---|
| 1 | Binary/Hex Visualizer | Beginner | Weekend | ⭐⭐ | ⭐⭐ |
| 2 | Logic Gate Simulator | Beginner | 1 week | ⭐⭐⭐ | ⭐⭐⭐ |
| 3 | Stack Machine VM | Intermediate | 1 week | ⭐⭐⭐ | ⭐⭐⭐ |
| 4 | CHIP-8 Emulator | Intermediate | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 5 | Custom RISC CPU | Advanced | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 6 | 6502 Emulator | Advanced | 3-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 7 | Memory Visualizer | Advanced | 2-3 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 8 | x86-64 Disassembler | Expert | 3-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 9 | Assembly Programming | Advanced | 2-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 10 | Game Boy Emulator | Expert | 1-2 months | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 11 | Bare-Metal RPi | Expert | 2-4 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 12 | RISC-V on FPGA | Master | 2-3 months | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 13 | Pipeline Simulator | Advanced | 2-3 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 14 | Cache Simulator | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 15 | Branch Predictor | Advanced | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
Recommended Learning Path
Since you’re starting from zero, here’s the optimal progression:
Phase 1: Foundations (2-3 weeks)
- Binary/Hex Visualizer - Get fluent with number systems
- Logic Gate Simulator - See how hardware computes
Phase 2: Your First CPU (3-4 weeks)
- Stack Machine VM - Implement fetch-decode-execute
- CHIP-8 Emulator - Emulate a real (simple) instruction set
Phase 3: Deep Dive into ISA (4-6 weeks)
- Custom RISC CPU - Design your own instruction set
- 6502 Emulator - Handle real-world complexity
Phase 4: Real Hardware & Performance (4-6 weeks)
- Assembly Programming - Write code at the CPU level
- Pipeline/Cache/Branch Simulators - Understand performance
Phase 5: The Ultimate Projects (2-3 months)
- Game Boy Emulator - Complete system integration
- Bare-Metal Raspberry Pi - No OS, just you and the CPU
- RISC-V on FPGA - Build a CPU in actual hardware
Final Capstone Project: Full System-on-Chip
After completing the learning path, tackle this ultimate project:
Project: RISC-V System-on-Chip with Peripherals
- File: CPU_ISA_ARCHITECTURE_PROJECTS.md
- Main Programming Language: Verilog + C (for firmware)
- Alternative Programming Languages: VHDL + Rust, Chisel + C
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 5: Master (The First-Principles Wizard)
- Knowledge Area: CPU Design / System Integration
- Software or Tool: FPGA + Peripherals
- Main Book: “Computer Organization and Design RISC-V Edition” by Patterson & Hennessy
What you’ll build: A complete RISC-V CPU on an FPGA with: UART for serial communication, GPIO for LED/button control, a timer peripheral, interrupt controller, and SRAM interface. Write a bootloader and simple OS kernel that runs on YOUR CPU.
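Why this is the capstone: This combines everything: you design the CPU, the memory interface, the peripherals, the interrupt handling, AND the software. When your own kernel boots on a CPU you designed, you’ve reached the summit. (Booting Linux would be a further step: it needs atomics, supervisor mode, an MMU, and far more than 64KB of RAM.)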
Core challenges you’ll face:
- Multi-stage pipeline with forwarding → maps to everything from previous projects
- Memory-mapped peripheral bus (Wishbone or AXI-Lite) → maps to bus protocols
- Interrupt controller (PLIC) → maps to interrupt prioritization
- Writing a bootloader in assembly → maps to system initialization
- Device drivers for your peripherals → maps to hardware/software interface
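On the firmware side, a driver for a memory-mapped peripheral is mostly reads and writes through `volatile` pointers. Here is a minimal C sketch of a polled UART transmit routine. The register layout, the `UART_STATUS_TX_READY` bit, and the example base address are hypothetical; the real ones come from the memory map you define in your own RTL.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical UART register block; the real layout comes from your own RTL. */
typedef struct {
    volatile uint32_t tx_data;   /* write a byte here to transmit it */
    volatile uint32_t status;    /* bit 0 set when the transmitter can accept a byte */
} uart_regs;

#define UART_STATUS_TX_READY 0x1u

/* On hardware the pointer would be a fixed address from your SoC's memory map,
   for example (uart_regs *)0x10000000 (an assumed address). It is a parameter
   here so the same code can run against a plain struct in a host-side test. */
static void uart_putc(uart_regs *uart, char c)
{
    assert(uart != 0);   /* host-side sanity check; drop in freestanding builds */
    while ((uart->status & UART_STATUS_TX_READY) == 0) {
        /* busy-wait until the transmitter is ready */
    }
    uart->tx_data = (uint8_t)c;
}

static void uart_puts(uart_regs *uart, const char *s)
{
    while (*s)
        uart_putc(uart, *s++);
}
```

The `volatile` qualifier is what keeps the compiler from caching `status` in a register or dropping repeated writes to `tx_data`. Passing the base pointer in, rather than hardcoding it, lets you exercise the driver on your development machine before the bus and the peripheral exist in hardware.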
Key Concepts:
- SoC Design: “Digital Design and Computer Architecture” Chapter 8 - Harris & Harris
- Bus Protocols: “Computer Organization and Design” Chapter 5 - Patterson & Hennessy
- Interrupt Handling: “Computer Organization and Design” Chapter 4.9 - Patterson & Hennessy
- RISC-V Privileged Spec: Official RISC-V Privileged Architecture specification
Difficulty: Master. Time estimate: 3-4 months. Prerequisites: All previous projects, especially RISC-V on FPGA and bare-metal programming.
Real world outcome:
[FPGA UART output]
=====================================
MY RISC-V SoC v1.0
CPU: RV32I @ 50MHz
RAM: 64KB SRAM
Peripherals: UART, GPIO, Timer
=====================================
Bootloader loaded. Starting kernel...
[MyOS]> gpio led on
LED is now ON
[MyOS]> timer 1000
Setting timer for 1000ms...
[IRQ] Timer interrupt! 1000ms elapsed.
[MyOS]> cat hello.txt
Hello from my CPU!
[MyOS]> run fibonacci
F(20) = 6765 (computed in 234 cycles)
You’ve built a complete computer. From logic gates to running programs. This is what understanding CPUs looks like.
Essential Resources
Primary Books (in order of importance)
- “Code: The Hidden Language of Computer Hardware and Software” by Charles Petzold - Start here. No prerequisites.
- “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron - The programmer’s view.
- “Computer Organization and Design RISC-V Edition” by Patterson & Hennessy - The hardware view.
- “Digital Design and Computer Architecture” by Harris & Harris - For FPGA work.
Online References
- Nand2Tetris (nand2tetris.org) - Build a computer from logic gates to Tetris
- Ben Eater’s YouTube - 8-bit breadboard computer, visual explanations
- RISC-V Specifications (riscv.org) - Official ISA documentation
- Pan Docs (gbdev.io) - Game Boy technical reference
Tools You’ll Need
- C Compiler: GCC or Clang
- Assembler: NASM (x86-64), GNU as, or RISC-V toolchain
- Emulator Testing: Existing emulators to compare against
- FPGA: iCE40 boards (~$50) for hardware projects
- Raspberry Pi: For bare-metal projects
Start with Project 1. The journey of understanding CPUs begins with understanding that everything is just bits.