Understanding Bitcoin, Blockchain & Ethereum Through Building

Goal: Deeply understand how cryptocurrency systems work at every level—from the cryptographic primitives that secure transactions, to the consensus mechanisms that keep networks honest, to the virtual machines that execute smart contracts. By building these systems yourself, you’ll gain the knowledge to read any blockchain’s source code, audit smart contracts, and architect new decentralized systems.

Why Blockchain Technology Matters

In 2008, a pseudonymous author using the name Satoshi Nakamoto published a 9-page whitepaper that solved a problem cryptographers had struggled with for decades: how to create digital money that can’t be double-spent, without trusting a central authority.

That solution—Bitcoin—launched an entirely new field of computer science. Today:

Bitcoin processes roughly 427,000 transactions per day as of December 2024, with more than 1.28 billion cumulative transactions since inception, secured by a network hash rate far beyond what the world’s top 500 supercomputers could produce together
Ethereum hosts over $166 billion in DeFi total value locked (TVL) plus $45 billion in Layer 2 TVL, running unstoppable applications with no central server
Layer 2 rollups (Arbitrum, Optimism, Base, zkSync) batch transactions off-chain to raise throughput well beyond Ethereum’s base layer while inheriting much of its security
Many major banks and tech companies now have blockchain research teams, and some projections suggest Ethereum’s TVL could grow 10x by 2026

Understanding blockchain isn’t just about cryptocurrency—it’s about understanding a new paradigm for building trustless, decentralized systems. The concepts you’ll learn (cryptographic commitments, distributed consensus, state machines, game-theoretic security) apply far beyond crypto.

The Mental Model: What Makes Blockchain Work

Before diving into projects, you need to understand why blockchains work. Here’s the core insight:

Traditional Database                    Blockchain
┌─────────────────────┐                ┌─────────────────────────────────────┐
│                     │                │  ┌──────┐  ┌──────┐  ┌──────┐       │
│   Central Server    │                │  │Node 1│  │Node 2│  │Node 3│ ...   │
│   (Single source    │                │  └──┬───┘  └──┬───┘  └──┬───┘       │
│    of truth)        │                │     │         │         │           │
│                     │                │     └─────────┴─────────┘           │
└─────────┬───────────┘                │           Consensus                 │
          │                            │     (All nodes agree on             │
    "Trust me"                         │      the same state)                │
          │                            └─────────────────────────────────────┘
          ▼                                         │
  Users must trust                                  ▼
  the operator                            "Trust the math"
                                         (Cryptographic proofs make
                                          cheating impossible/expensive)

The key innovation is replacing trust in institutions with trust in mathematics:

Cryptographic hashes make tampering detectable (change one bit → completely different hash)
Digital signatures prove ownership without revealing secrets
Proof of Work/Stake makes attacks economically irrational
Merkle trees enable efficient verification without downloading everything
Consensus protocols ensure all honest nodes see the same history

Cryptographic Hash Functions: The Foundation of Blockchain

Every blockchain relies on cryptographic hash functions. A hash function takes any input and produces a fixed-size output (the “hash” or “digest”):

Input: "Hello, World!"
        ↓ SHA-256
Output: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f

Input: "Hello, World." (just one character different!)
        ↓ SHA-256
Output: f8c3bf62a9aa3e6fc1619c250e48ade01a8e0a892e2e69e9a5e3f8a2f5e21c8a
                        (completely different!)

Properties That Make Hash Functions Useful

┌─────────────────────────────────────────────────────────────────────────────┐
│                    HASH FUNCTION PROPERTIES                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  1. DETERMINISTIC                                                           │
│     Same input → Always same output                                         │
│     "hello" → abc123... (every single time)                                 │
│                                                                             │
│  2. ONE-WAY (Preimage Resistance)                                           │
│     Input → Hash  ✓ EASY (microseconds)                                     │
│     Hash → Input  ✗ INFEASIBLE (longer than the age of the universe)        │
│                                                                             │
│  3. COLLISION RESISTANT                                                     │
│     Finding two different inputs with same hash is infeasible               │
│     Best generic attack needs ~2^128 hashes for SHA-256                     │
│                                                                             │
│  4. AVALANCHE EFFECT                                                        │
│     Tiny change in input → Completely different output                      │
│     "hello" → abc123...                                                     │
│     "hellp" → xyz789... (no similarity!)                                    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
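The properties in the box can be checked directly with Python's standard `hashlib` module. This is a minimal sketch: the helper names `sha256_hex` and `differing_bits` are illustrative, not part of any library. It also lets you recompute the digests of the two "Hello, World" strings shown earlier rather than taking them on faith.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


if __name__ == "__main__":
    first = sha256_hex("Hello, World!")
    second = sha256_hex("Hello, World.")
    print(first)
    print(second)
    # Deterministic: hashing the same input again gives the same digest.
    print(first == sha256_hex("Hello, World!"))
    # Avalanche effect: roughly half of the 256 output bits should differ.
    print(differing_bits(first, second), "of 256 bits differ")
```

One-wayness and collision resistance cannot be demonstrated by a short script; they are claims about the absence of any efficient attack.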

How Blocks Chain Together via Hashes

                         THE BLOCKCHAIN DATA STRUCTURE

Block 0 (Genesis)           Block 1                      Block 2
┌──────────────────┐        ┌──────────────────┐        ┌──────────────────┐
│ Prev Hash: 0000  │   ┌───▶│ Prev Hash: a7f3 │   ┌───▶│ Prev Hash: 8d2e │
├──────────────────┤   │    ├──────────────────┤   │    ├──────────────────┤
│ Timestamp        │   │    │ Timestamp        │   │    │ Timestamp        │
│ Nonce            │   │    │ Nonce            │   │    │ Nonce            │
│ Merkle Root      │   │    │ Merkle Root      │   │    │ Merkle Root      │
├──────────────────┤   │    ├──────────────────┤   │    ├──────────────────┤
│ Transactions     │   │    │ Transactions     │   │    │ Transactions     │
│ - Tx0            │   │    │ - Tx0            │   │    │ - Tx0            │
│ - Tx1            │   │    │ - Tx1            │   │    │ - Tx1            │
│ - ...            │   │    │ - ...            │   │    │ - ...            │
└────────┬─────────┘   │    └────────┬─────────┘   │    └──────────────────┘
         │             │             │             │
         └─ Hash ──────┘             └─ Hash ──────┘
          = a7f3...                   = 8d2e...

WHY THIS IS TAMPER-EVIDENT:
───────────────────────────
If you try to change a transaction in Block 1:
  1. Block 1's hash changes (avalanche effect)
  2. Block 2's "Prev Hash" no longer matches
  3. Block 2's hash changes
  4. ... all subsequent blocks become invalid

To tamper, you'd need to recalculate ALL subsequent blocks
faster than the honest network adds new ones.
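The cascade described above can be reproduced in a few lines. The sketch below assumes a deliberately simplified block (a Python dict serialized with `json`), not Bitcoin's real 80-byte header format; `build_chain` and `first_broken_link` are illustrative names.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents (including its prev_hash link) with SHA-256."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(tx_batches: list) -> list:
    """Create a list of blocks, each storing the hash of its predecessor."""
    chain = []
    prev = "0" * 64  # the genesis block has no predecessor
    for height, txs in enumerate(tx_batches):
        block = {"height": height, "prev_hash": prev, "transactions": txs}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain: list):
    """Return the height of the first block whose prev_hash is wrong, or None."""
    for prev, current in zip(chain, chain[1:]):
        if current["prev_hash"] != block_hash(prev):
            return current["height"]
    return None


if __name__ == "__main__":
    chain = build_chain([["coinbase"], ["alice->bob:5"], ["bob->carol:2"]])
    print(first_broken_link(chain))   # None: every link matches
    chain[1]["transactions"] = ["alice->mallory:5"]
    print(first_broken_link(chain))   # 2: block 2 no longer points at block 1
```

Without Proof of Work, an attacker could simply recompute every later `prev_hash`; the hash links only make tampering visible, and the mining cost is what makes repairing the chain expensive.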

Merkle Trees: Efficient Transaction Verification

A Merkle tree allows you to prove a transaction is in a block without downloading the entire block:

                            MERKLE TREE STRUCTURE

                            ┌─────────────────┐
                            │   Merkle Root   │ ← Stored in block header
                            │   H(AB + CD)    │   (just 32 bytes!)
                            └────────┬────────┘
                                     │
                   ┌─────────────────┴─────────────────┐
                   │                                   │
             ┌─────┴─────┐                       ┌─────┴─────┐
             │  H(A+B)   │                       │  H(C+D)   │
             │ Internal  │                       │ Internal  │
             └─────┬─────┘                       └─────┬─────┘
                   │                                   │
           ┌───────┴───────┐                   ┌───────┴───────┐
           │               │                   │               │
      ┌────┴────┐     ┌────┴────┐         ┌────┴────┐     ┌────┴────┐
      │  H(A)   │     │  H(B)   │         │  H(C)   │     │  H(D)   │
      │ Leaf A  │     │ Leaf B  │         │ Leaf C  │     │ Leaf D  │
      └────┬────┘     └────┬────┘         └────┬────┘     └────┬────┘
           │               │                   │               │
      ┌────┴────┐     ┌────┴────┐         ┌────┴────┐     ┌────┴────┐
      │  Tx A   │     │  Tx B   │         │  Tx C   │     │  Tx D   │
      └─────────┘     └─────────┘         └─────────┘     └─────────┘


MERKLE PROOF EXAMPLE: Prove Tx B is in the tree
─────────────────────────────────────────────────

You need only 2 hashes (marked with ★):
  1. H(A) ★  (to compute H(A+B))
  2. H(C+D) ★ (to compute the root)

       ┌─────────────────┐
       │   Merkle Root   │ ← Compute and compare to block header
       │   H(AB + CD)    │
       └────────┬────────┘
                │
       ┌────────┴────────┐
       │                 │
   ┌───┴───┐         ┌───┴───┐
   │H(A+B) │         │H(C+D) │ ★ Given
   └───┬───┘         └───────┘
       │
   ┌───┴───┐
   │       │
   ★       ● ← Your transaction (Tx B)
  H(A)    H(B)  You compute H(B) yourself

Verification: O(log n) hashes instead of O(n)
  - 1 million transactions: only ~20 hashes needed!
  - Light clients can verify without full blockchain
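A compact sketch of building the tree, extracting a proof, and verifying it follows. It makes two simplifying assumptions: it uses a single SHA-256 per node (Bitcoin applies SHA-256 twice), and it duplicates the last hash on odd-sized levels, as Bitcoin does. The function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Single SHA-256 (a simplification; Bitcoin hashes twice)."""
    return hashlib.sha256(data).digest()


def merkle_levels(leaves: list) -> list:
    """Return every level of the tree, from leaf hashes up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # duplicate the last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels: list, index: int) -> list:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the root from one leaf plus its sibling hashes."""
    current = h(leaf)
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root


if __name__ == "__main__":
    txs = [b"Tx A", b"Tx B", b"Tx C", b"Tx D"]
    levels = merkle_levels(txs)
    root = levels[-1][0]
    proof = merkle_proof(levels, 1)            # prove Tx B
    print(len(proof))                          # 2 sibling hashes: H(A) and H(C+D)
    print(verify_proof(b"Tx B", proof, root))  # True
    print(verify_proof(b"Tx X", proof, root))  # False
```

The verifier never sees Tx A, Tx C, or Tx D, only two 32-byte hashes, which is what keeps light clients cheap.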

Elliptic Curve Cryptography: How Digital Signatures Work

Bitcoin and Ethereum use the secp256k1 elliptic curve for digital signatures. Here’s why this matters:

                    THE SECP256K1 CURVE: y² = x³ + 7

                        │
                        │           ....
                        │       ...      ...
                        │     ..            ..
                        │    .                .
                        │   .       Point      .
                ────────┼───●──────────────────────────
                        │   .   G (Generator)  .
                        │    .                .
                        │     ..            ..
                        │       ...      ...
                        │           ....
                        │

  HOW KEY GENERATION WORKS:
  ─────────────────────────

  1. Pick a random number d with 1 ≤ d < n (n ≈ 2^256 is the curve order):
     this is your PRIVATE KEY
     d = 0x1234567890abcdef... (keep this SECRET!)

  2. Multiply Generator Point G by your private key:
     PUBLIC KEY = d × G = P (a point on the curve)

  3. The magic: Computing d × G is easy (milliseconds)
                Reversing P → d is INFEASIBLE (no known efficient algorithm)

     This is the "Elliptic Curve Discrete Logarithm Problem" (ECDLP)
     Security: best known attacks need ~2^128 operations
     (d is the private_key in the signing steps; the k used there is a separate nonce)


  DIGITAL SIGNATURE (ECDSA):
  ──────────────────────────

  Signing a Transaction:
  ┌─────────────────────────────────────────────────────────┐
  │                                                         │
  │  1. Hash the message: z = SHA256(transaction_data)      │
  │                                                         │
  │  2. Pick random nonce: k (MUST be unique per signature!)│
  │                                                         │
  │  3. Calculate R = k × G (a curve point)                 │
  │     r = R.x mod n (x-coordinate of R)                   │
  │                                                         │
  │  4. Calculate s = k⁻¹(z + r × private_key) mod n        │
  │                                                         │
  │  5. Signature = (r, s)                                  │
  │                                                         │
  └─────────────────────────────────────────────────────────┘

  Verifying a Signature (anyone can do this!):
  ┌─────────────────────────────────────────────────────────┐
  │                                                         │
  │  Given: message, signature (r, s), public key P         │
  │                                                         │
  │  1. Hash the message: z = SHA256(message)               │
  │                                                         │
  │  2. Calculate: u₁ = z × s⁻¹ mod n                       │
  │                u₂ = r × s⁻¹ mod n                       │
  │                                                         │
  │  3. Calculate point: R' = u₁ × G + u₂ × P               │
  │                                                         │
  │  4. Signature valid if: R'.x mod n == r                 │
  │                                                         │
  └─────────────────────────────────────────────────────────┘

  WHY THIS IS SECURE:
  ───────────────────
  - Only the private key holder can create valid signatures
  - Anyone can verify with just the public key
  - The signature is bound to that exact message
  - Change one bit of the message → signature becomes invalid
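The key-generation, signing, and verification steps can be traced in pure Python. This is an educational sketch only: it is not constant-time, it hashes the message with a single SHA-256 as in the boxes above, and it should never protect real funds. The constants are the published secp256k1 parameters; in the code `d` is the private key, `k` is the per-signature nonce, and the function names are illustrative.

```python
import hashlib
import secrets

# secp256k1 domain parameters (curve: y^2 = x^3 + 7 over the prime field F_p)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a) = infinity
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P       # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P      # chord line
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result = None
    while k > 0:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def sign(private_key, message: bytes):
    """Return an ECDSA signature (r, s) over SHA-256(message)."""
    z = int.from_bytes(hashlib.sha256(message).digest(), "big")
    while True:
        k = secrets.randbelow(N - 1) + 1        # fresh secret nonce per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r != 0 and s != 0:
            return r, s


def verify(public_key, message: bytes, signature) -> bool:
    """Check that (r, s) is a valid signature for message under public_key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = int.from_bytes(hashlib.sha256(message).digest(), "big")
    s_inv = pow(s, -1, N)
    point = point_add(scalar_mult(z * s_inv % N, G),
                      scalar_mult(r * s_inv % N, public_key))
    return point is not None and point[0] % N == r


if __name__ == "__main__":
    d = secrets.randbelow(N - 1) + 1            # private key in [1, N-1]
    Q = scalar_mult(d, G)                       # public key point
    sig = sign(d, b"send 0.6 BTC to Alice")
    print(verify(Q, b"send 0.6 BTC to Alice", sig))   # True
    print(verify(Q, b"send 6.0 BTC to Alice", sig))   # False
```

Reusing `k` for two different messages lets anyone solve the two `s` equations for the private key, which is why the nonce must be fresh for every signature.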

Bitcoin’s UTXO Model vs Ethereum’s Account Model

These are the two fundamental ways blockchains track “who owns what”:

                    BITCOIN: UTXO MODEL
                    ════════════════════

Think of it like CASH - physical bills you receive and spend:

You have these UTXOs (Unspent Transaction Outputs):
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│ UTXO #1         │  │ UTXO #2         │  │ UTXO #3         │
│ 0.5 BTC         │  │ 0.3 BTC         │  │ 0.2 BTC         │
│ From: tx_abc    │  │ From: tx_def    │  │ From: tx_ghi    │
└─────────────────┘  └─────────────────┘  └─────────────────┘
Total: 1.0 BTC (but stored as 3 separate "bills")


To send 0.6 BTC to Alice:
─────────────────────────
    INPUTS (consumed)              OUTPUTS (created)
┌─────────────────────────┐    ┌─────────────────────────┐
│ UTXO #1: 0.5 BTC ───────┼───▶│ Alice:   0.6 BTC (new) │
│ UTXO #2: 0.3 BTC ───────┼───▶│ Change:  0.199 BTC (new)│
└─────────────────────────┘    │ Fee:     0.001 BTC     │
   0.8 BTC consumed            └─────────────────────────┘
                                  0.8 BTC distributed

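After: You own 2 UTXOs (the new 0.199 BTC change output + the untouched UTXO #3)
Note: the fee is not a real output. It is the difference between inputs and
outputs (0.8 - 0.799 BTC), which the miner claims.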
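The same spend can be modelled with a dictionary acting as the UTXO set. This is a sketch under simplifying assumptions: ownership is a plain string rather than a locking script, signatures are not checked, and the names `UTXO` and `spend` are illustrative. Amounts are integers in satoshis to avoid floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UTXO:
    txid: str    # transaction that created this output
    index: int   # position of the output inside that transaction
    owner: str
    sats: int    # amount in satoshis (1 BTC = 100,000,000 sats)


def spend(utxo_set: dict, inputs: list, payments: dict,
          change_to: str, fee: int, txid: str) -> None:
    """Consume `inputs` and create outputs for `payments` plus change."""
    if len(set(inputs)) != len(inputs):
        raise ValueError("the same UTXO is listed twice")
    for key in inputs:
        if key not in utxo_set:
            raise ValueError(f"{key} is missing or already spent")  # double-spend check
    total_in = sum(utxo_set[key].sats for key in inputs)
    change = total_in - sum(payments.values()) - fee
    if change < 0:
        raise ValueError("inputs do not cover payments plus fee")
    for key in inputs:
        del utxo_set[key]                                   # inputs are destroyed
    outputs = list(payments.items()) + ([(change_to, change)] if change else [])
    for i, (owner, sats) in enumerate(outputs):
        utxo_set[(txid, i)] = UTXO(txid, i, owner, sats)    # new outputs are created


if __name__ == "__main__":
    utxos = {
        ("tx_abc", 0): UTXO("tx_abc", 0, "you", 50_000_000),
        ("tx_def", 0): UTXO("tx_def", 0, "you", 30_000_000),
        ("tx_ghi", 0): UTXO("tx_ghi", 0, "you", 20_000_000),
    }
    spend(utxos, [("tx_abc", 0), ("tx_def", 0)], {"alice": 60_000_000},
          change_to="you", fee=100_000, txid="tx_new")
    print(sorted(u.sats for u in utxos.values() if u.owner == "you"))
    # [19900000, 20000000]
```

The fee never appears as an output; it is simply the amount by which the inputs exceed the outputs.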


═══════════════════════════════════════════════════════════════


                    ETHEREUM: ACCOUNT MODEL
                    ════════════════════════

Think of it like a BANK ACCOUNT - one balance that gets updated:

Global State (simplified):
┌───────────────────────────────────────────────────────────┐
│  Account                    │ Balance  │ Nonce │ Storage │
├─────────────────────────────┼──────────┼───────┼─────────┤
│ 0xAlice...                  │ 5.0 ETH  │  12   │   -     │
│ 0xBob...                    │ 3.2 ETH  │  45   │   -     │
│ 0xUniswap... (contract)     │ 1000 ETH │   1   │  {...}  │
└───────────────────────────────────────────────────────────┘


To send 2.0 ETH from Alice to Bob:
──────────────────────────────────
Before:  Alice = 5.0 ETH,  Bob = 3.2 ETH,  Alice.nonce = 12
After:   Alice = 2.9 ETH,  Bob = 5.2 ETH,  Alice.nonce = 13
                  ▲                                      ▲
                  │                                      │
           (2.0 sent + 0.1 fee)              (prevents replay attacks)
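A sketch of the same state transition in code, assuming a flat fee (real Ethereum fees are gas used times gas price) and no signature checking. `Account` and `apply_transfer` are illustrative names; balances are integers in wei.

```python
from dataclasses import dataclass

WEI_PER_ETH = 10**18


@dataclass
class Account:
    balance: int = 0   # in wei
    nonce: int = 0     # number of transactions this account has sent


def apply_transfer(state: dict, sender: str, recipient: str,
                   value: int, fee: int, tx_nonce: int) -> None:
    """Apply one value transfer to the global state, or raise if it is invalid."""
    account = state[sender]
    if tx_nonce != account.nonce:
        raise ValueError("wrong nonce: transaction is a replay or out of order")
    if account.balance < value + fee:
        raise ValueError("insufficient balance")
    account.balance -= value + fee
    account.nonce += 1
    state.setdefault(recipient, Account()).balance += value


if __name__ == "__main__":
    state = {
        "alice": Account(balance=5 * WEI_PER_ETH, nonce=12),
        "bob": Account(balance=32 * WEI_PER_ETH // 10, nonce=45),
    }
    tx = dict(sender="alice", recipient="bob",
              value=2 * WEI_PER_ETH, fee=WEI_PER_ETH // 10, tx_nonce=12)
    apply_transfer(state, **tx)
    print(state["alice"], state["bob"])
    try:
        apply_transfer(state, **tx)   # broadcasting the same signed tx again
    except ValueError as err:
        print("rejected:", err)
```

Because the account nonce has moved to 13, the second copy of the nonce-12 transaction is rejected, which is the replay protection the diagram points at.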


═══════════════════════════════════════════════════════════════


                        COMPARISON
                        ══════════

┌─────────────────────┬──────────────────────┬────────────────────────┐
│ Property            │ UTXO (Bitcoin)       │ Account (Ethereum)     │
├─────────────────────┼──────────────────────┼────────────────────────┤
│ Privacy             │ Better (new address  │ Worse (single address  │
│                     │  for each UTXO)      │  easy to track)        │
├─────────────────────┼──────────────────────┼────────────────────────┤
│ Parallelism         │ Excellent (UTXOs are │ Limited (account state │
│                     │  independent)        │  is sequential)        │
├─────────────────────┼──────────────────────┼────────────────────────┤
│ Smart Contracts     │ Limited (Script is   │ Excellent (Turing-     │
│                     │  not Turing-complete)│  complete EVM)         │
├─────────────────────┼──────────────────────┼────────────────────────┤
│ Complexity          │ Higher (manage many  │ Lower (simple balance) │
│                     │  UTXOs)              │                        │
├─────────────────────┼──────────────────────┼────────────────────────┤
│ Double-Spend Check  │ Check if UTXO exists │ Check nonce sequence   │
│                     │  and unspent         │                        │
└─────────────────────┴──────────────────────┴────────────────────────┘

Proof of Work: Making Cheating Expensive

Proof of Work is the original consensus mechanism that made Bitcoin possible:

                    HOW PROOF OF WORK OPERATES

The Mining Puzzle:
──────────────────
Find a nonce such that:
  SHA256(SHA256(block_header)) < TARGET    (the nonce is one field of the header)

Where TARGET is adjusted so this takes ~10 minutes on average
for the entire network combined.


Example:
────────
Block header data: "prev_hash=abc123, merkle_root=def456, timestamp=..."

Miner tries:
  nonce=0:  SHA256(...) = f8a3b2c1...  ❌ Too high!
  nonce=1:  SHA256(...) = e9d4c5a6...  ❌ Too high!
  nonce=2:  SHA256(...) = d7e8f901...  ❌ Too high!
  ... millions of attempts ...
  nonce=8372631: SHA256(...) = 0000000000abc...  ✓ Below target!

This took MILLIONS of hash operations (expensive!)
But verifying is just ONE hash (cheap!)
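The search loop is short enough to run on a laptop if the target is made very easy. The sketch below assumes a toy header (an arbitrary byte string with a 4-byte nonce appended) and compares the digest as a big-endian integer for readability; Bitcoin uses an 80-byte header and interprets the double-SHA-256 digest as little-endian. `mine` and `is_valid` are illustrative names.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Bitcoin-style hash: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces until the hash, read as an integer, falls below target."""
    for nonce in range(max_nonce):
        digest = double_sha256(header + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # nonce space exhausted; a real miner would alter the header


def is_valid(header: bytes, nonce: int, target: int) -> bool:
    """Verification costs one double hash, however long mining took."""
    digest = double_sha256(header + nonce.to_bytes(4, "little"))
    return int.from_bytes(digest, "big") < target


if __name__ == "__main__":
    header = b"prev_hash=abc123, merkle_root=def456"
    target = 2**240   # 16 leading zero bits: about 65,000 attempts on average
    nonce, digest_hex = mine(header, target)
    print(nonce, digest_hex)
    print(is_valid(header, nonce, target))   # True
```

Each additional leading zero bit in the target doubles the expected number of attempts, while `is_valid` stays a single call.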


The Difficulty Target:
──────────────────────
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│  TARGET (determines how many leading zeros required)            │
│                                                                 │
│  Difficulty 1:   00000000ffff0000...  (easiest)                │
│  Difficulty 16:  000000000ffff000...                           │
│  Difficulty 256: 0000000000ffff00...                           │
│  Bitcoin 2024:   00000000000000000000...  (very hard!)         │
│                                                                 │
│  Adjustment: Every 2016 blocks (~2 weeks)                      │
│    - If blocks came too fast → increase difficulty             │
│    - If blocks came too slow → decrease difficulty             │
│    - Goal: maintain ~10 minute average block time              │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
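The adjustment rule can be written as one proportional update. This sketch follows Bitcoin's scheme of scaling the target by actual time over expected time, with the change clamped to a factor of 4 in either direction; it ignores the compact "bits" encoding that real nodes store in the header. The function names are illustrative.

```python
BLOCKS_PER_PERIOD = 2016
TARGET_SPACING = 600                                     # seconds (10 minutes)
EXPECTED_TIMESPAN = BLOCKS_PER_PERIOD * TARGET_SPACING   # two weeks in seconds
MAX_TARGET = 0xFFFF * 2**208                             # the difficulty-1 target


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last period actually took."""
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4),
                  EXPECTED_TIMESPAN * 4)
    new_target = old_target * clamped // EXPECTED_TIMESPAN
    return min(new_target, MAX_TARGET)   # never easier than difficulty 1


def difficulty(target: int) -> float:
    """Difficulty is the ratio of the easiest target to the current one."""
    return MAX_TARGET / target


if __name__ == "__main__":
    old = MAX_TARGET // 1000
    one_week = EXPECTED_TIMESPAN // 2     # blocks arrived twice as fast as intended
    new = retarget(old, one_week)
    print(difficulty(old), "->", difficulty(new))   # difficulty roughly doubles
```

A lower target means fewer acceptable hashes, so halving the target doubles the expected work per block and pulls the average block time back toward 10 minutes.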


Why This Creates Consensus:
───────────────────────────
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│  1. Miners compete to find valid blocks                         │
│                                                                 │
│  2. First valid block gets broadcast to network                 │
│                                                                 │
│  3. Other miners verify (cheap!) and accept                     │
│                                                                 │
│  4. Miners start building on the new longest chain              │
│                                                                 │
│  5. Block reward (currently 3.125 BTC) incentivizes honesty    │
│                                                                 │
│                                                                 │
│  Fork Resolution: LONGEST (MOST-WORK) CHAIN WINS                │
│                                                                 │
│       ┌────┐     ┌────┐     ┌────┐     ┌────┐                   │
│       │ B1 │────▶│ B2 │──┬─▶│ B3 │────▶│ B4 │  ✓ This chain wins│
│       └────┘     └────┘  │  └────┘     └────┘    (more work)    │
│                          │  ┌────┐                              │
│                          └─▶│ B3'│  ✗ Orphaned                  │
│                             └────┘    (less work)               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘


51% Attack Economics:
─────────────────────
To rewrite history, an attacker needs >50% of network hash rate
(the figures below are rough, order-of-magnitude estimates):
  - Current Bitcoin network: ~500 EH/s (500 quintillion hashes/sec)
  - Cost of equipment: ~$10 billion
  - Electricity: ~$20 million per day
  - And you'd destroy the value of what you're trying to steal!

The attack is possible but economically irrational.

The Ethereum Virtual Machine: A World Computer

Ethereum extends Bitcoin’s vision by adding a Turing-complete virtual machine:

                    EVM ARCHITECTURE

┌─────────────────────────────────────────────────────────────────┐
│                    ETHEREUM VIRTUAL MACHINE                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                        STACK                             │   │
│  │  ┌────┬────┬────┬────┬────┬────┬─────────────────────┐  │   │
│  │  │ 32 │ 31 │ 30 │ 29 │ 28 │ .. │         0           │  │   │
│  │  │byte│byte│byte│byte│byte│    │      (top)          │  │   │
│  │  └────┴────┴────┴────┴────┴────┴─────────────────────┘  │   │
│  │  Max depth: 1024 items, each 256 bits (32 bytes)        │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                       MEMORY                             │   │
│  │  ┌────┬────┬────┬────┬────┬────┬────┬────┬────────────┐ │   │
│  │  │ 0  │ 1  │ 2  │ 3  │ 4  │ 5  │ .. │ n  │   ∞        │ │   │
│  │  └────┴────┴────┴────┴────┴────┴────┴────┴────────────┘ │   │
│  │  Linear byte array, volatile (cleared after execution)  │   │
│  │  Cost: grows quadratically (more memory = more gas)     │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                       STORAGE                            │   │
│  │  ┌──────────────────────┬───────────────────────────┐   │   │
│  │  │      Key (256-bit)   │     Value (256-bit)       │   │   │
│  │  ├──────────────────────┼───────────────────────────┤   │   │
│  │  │  0x0000...0000       │  contract_owner_address   │   │   │
│  │  │  0x0000...0001       │  total_supply             │   │   │
│  │  │  keccak(user, slot)  │  user_balance             │   │   │
│  │  └──────────────────────┴───────────────────────────┘   │   │
│  │  Key-value store, PERSISTENT across transactions        │   │
│  │  Most expensive operation! (20,000 gas to write)        │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
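The "grows quadratically" note on memory can be made concrete. The sketch below uses the memory cost formula from the Ethereum Yellow Paper, 3 gas per 32-byte word plus words squared divided by 512; the function names are illustrative, and only the expansion beyond the current size is charged.

```python
def memory_cost(size_in_bytes: int) -> int:
    """Total gas attributed to a memory of the given size."""
    words = (size_in_bytes + 31) // 32          # round up to whole 32-byte words
    return 3 * words + words * words // 512     # linear term + quadratic term


def expansion_cost(old_size: int, new_size: int) -> int:
    """Gas charged when an operation grows memory from old_size to new_size."""
    return max(0, memory_cost(new_size) - memory_cost(old_size))


if __name__ == "__main__":
    for size in (32, 1024, 32 * 1024, 1024 * 1024):
        print(f"{size:>8} bytes -> {memory_cost(size):>8} gas")
```

For small sizes the linear term dominates (1 KiB costs 98 gas), but at 1 MiB the quadratic term pushes the total above 2 million gas, which is what keeps contracts from allocating large buffers.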


EVM EXECUTION MODEL:
────────────────────

Program Counter ──▶ ┌──────────────────────────────────────────┐
                    │ 0x60  0x80  0x60  0x40  0x52  0x34  ...  │
                    │ PUSH1 0x80  PUSH1 0x40  MSTORE CALLVALUE │
                    └──────────────────────────────────────────┘
                         ▲
                         │
                    Each byte is an OPCODE or data


COMMON OPCODES:
───────────────
┌───────────┬────────┬────────────────────────────────────────────┐
│ Opcode    │  Gas   │  Description                               │
├───────────┼────────┼────────────────────────────────────────────┤
│ ADD       │   3    │  Pop 2 values, push their sum              │
│ MUL       │   5    │  Pop 2 values, push their product          │
│ SUB       │   3    │  Pop 2 values, push their difference       │
│ DIV       │   5    │  Pop 2 values, push their quotient         │
├───────────┼────────┼────────────────────────────────────────────┤
│ PUSH1     │   3    │  Push 1 byte onto stack                    │
│ PUSH32    │   3    │  Push 32 bytes onto stack                  │
│ POP       │   2    │  Remove top stack item                     │
│ DUP1      │   3    │  Duplicate top stack item                  │
│ SWAP1     │   3    │  Swap top 2 stack items                    │
├───────────┼────────┼────────────────────────────────────────────┤
│ MLOAD     │   3    │  Load word from memory                     │
│ MSTORE    │   3    │  Store word in memory                      │
│ SLOAD     │  100   │  Load word from storage (cold access 2100) │
│ SSTORE    │ 20000  │  Store word in storage (most expensive!)   │
├───────────┼────────┼────────────────────────────────────────────┤
│ JUMP      │   8    │  Jump to code location                     │
│ JUMPI     │  10    │  Conditional jump                          │
│ CALL      │  100+  │  Call another contract                     │
│ RETURN    │   0    │  End execution, return data                │
└───────────┴────────┴────────────────────────────────────────────┘


EXAMPLE: Simple Addition in Bytecode
────────────────────────────────────

Solidity:  function add(uint a, uint b) public pure returns (uint) { return a + b; }

The core of the compiled code looks something like this
(simplified: real output also checks the 4-byte function selector,
which shifts the arguments to offsets 4 and 36, and adds overflow checks):
  PUSH1 0x00     ; Push calldata offset of 'a'
  CALLDATALOAD   ; Load 'a' from calldata
  PUSH1 0x20     ; Push 32 (offset for 'b')
  CALLDATALOAD   ; Load 'b' from calldata
  ADD            ; a + b
  PUSH1 0x00     ; Push memory offset for the result
  MSTORE         ; Store result in memory
  PUSH1 0x20     ; Push return size (32 bytes)
  PUSH1 0x00     ; Push return offset
  RETURN         ; Return the result

Stack trace:
  []                          ; Start empty
  [0x00]                      ; PUSH1 0x00
  [a]                         ; CALLDATALOAD (replaces 0x00 with value at offset 0)
  [a, 0x20]                   ; PUSH1 0x20
  [a, b]                      ; CALLDATALOAD (loads value at offset 32)
  [a+b]                       ; ADD
  ...
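To watch the stack evolve for real, the listing can be run through a toy interpreter. This sketch implements only the five opcodes used above (PUSH1, CALLDATALOAD, ADD, MSTORE, RETURN), charges no gas, and assumes calldata holds just the two 32-byte arguments with no function selector, as in the simplified listing. The `run` function and the hand-assembled `ADD_CODE` bytes are illustrative.

```python
WORD = 2**256

# Hand-assembled bytes for the listing:
# PUSH1 00, CALLDATALOAD, PUSH1 20, CALLDATALOAD, ADD,
# PUSH1 00, MSTORE, PUSH1 20, PUSH1 00, RETURN
ADD_CODE = bytes.fromhex("6000356020350160005260206000f3")


def run(code: bytes, calldata: bytes) -> bytes:
    """Execute a tiny subset of EVM bytecode and return the RETURN data."""
    stack, memory, pc = [], bytearray(), 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == 0x60:                                   # PUSH1: next byte is data
            stack.append(code[pc])
            pc += 1
        elif op == 0x01:                                 # ADD (wraps modulo 2^256)
            stack.append((stack.pop() + stack.pop()) % WORD)
        elif op == 0x35:                                 # CALLDATALOAD
            offset = stack.pop()
            chunk = calldata[offset:offset + 32].ljust(32, b"\x00")
            stack.append(int.from_bytes(chunk, "big"))
        elif op == 0x52:                                 # MSTORE: offset on top, then value
            offset, value = stack.pop(), stack.pop()
            if len(memory) < offset + 32:
                memory.extend(bytes(offset + 32 - len(memory)))
            memory[offset:offset + 32] = value.to_bytes(32, "big")
        elif op == 0xF3:                                 # RETURN: offset on top, then size
            offset, size = stack.pop(), stack.pop()
            return bytes(memory[offset:offset + size]).ljust(size, b"\x00")
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
        if len(stack) > 1024:
            raise OverflowError("stack limit exceeded")
    return b""


if __name__ == "__main__":
    calldata = (2).to_bytes(32, "big") + (3).to_bytes(32, "big")
    result = run(ADD_CODE, calldata)
    print(int.from_bytes(result, "big"))   # 5
```

Adding a `print(stack)` at the end of each loop iteration reproduces the stack trace above line by line.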

Concept Summary Table

Concept Cluster	What You Need to Internalize
Hash functions	SHA-256 is deterministic, one-way, collision-resistant. Blocks chain via hashes. Changing one bit invalidates everything after.
Merkle trees	Binary tree of hashes. Proves inclusion in O(log n). Enables light clients and efficient verification.
Elliptic curves (secp256k1)	Private key × Generator Point = Public key. Easy to compute, computationally infeasible to reverse. Foundation of digital signatures.
ECDSA signatures	Prove ownership without revealing private key. Each signature is bound to one message. Anyone can verify.
UTXO vs Account model	Bitcoin tracks unspent outputs (like cash). Ethereum tracks balances (like bank accounts). Trade-offs in privacy, parallelism, and expressiveness.
Proof of Work	Find nonce where hash < target. Expensive to create, cheap to verify. Longest chain wins. 51% attack is economically irrational.
The EVM	Stack-based VM with 256-bit words. Stack (1024 deep), Memory (volatile), Storage (persistent). Gas measures computation. Opcodes are single bytes.
Consensus	Agreement without central authority. PoW uses energy, PoS uses stake. Both make attacks expensive.

Deep Dive Reading by Concept

This section maps each concept to specific book chapters for deeper understanding. Read these before or alongside the projects to build strong mental models.

Cryptographic Foundations

Concept	Book & Chapter
Hash function internals	Serious Cryptography, 2nd Edition by Jean-Philippe Aumasson — Ch. 6: “Hash Functions”
Merkle trees and proofs	Mastering Bitcoin, 3rd Edition by Andreas Antonopoulos — Ch. 11: “The Blockchain”
Elliptic curve math	Programming Bitcoin by Jimmy Song — Ch. 2-3: “Elliptic Curves” and “Elliptic Curve Cryptography”
ECDSA signatures	Programming Bitcoin by Jimmy Song, Ch. 3: “Elliptic Curve Cryptography” (signing and verification) and Ch. 4: “Serialization” (DER signature format)
Cryptographic primitives overview	Practical Cryptography for Developers (online) by Svetlin Nakov

Bitcoin Internals

Concept	Book & Chapter
Transaction structure	Mastering Bitcoin, 3rd Edition — Ch. 6: “Transactions”
UTXO model deep dive	Programming Bitcoin by Jimmy Song — Ch. 5-7: Transactions and Script
Bitcoin Script opcodes	Programming Bitcoin by Jimmy Song — Ch. 6: “Script”
Block structure	Mastering Bitcoin, 3rd Edition — Ch. 11: “The Blockchain”
Proof of Work mining	Programming Bitcoin by Jimmy Song — Ch. 9: “Blocks”

Ethereum & Smart Contracts

Concept	Book & Chapter
EVM architecture	Mastering Ethereum by Antonopoulos & Wood — Ch. 13: “The Ethereum Virtual Machine”
Smart contract basics	Mastering Ethereum — Ch. 7: “Smart Contracts and Solidity”
Gas and execution model	Mastering Ethereum — Ch. 13 (Gas section)
Account model	Mastering Ethereum — Ch. 4: “Cryptography” and Ch. 5: “Wallets”

Distributed Systems & Consensus

Concept	Book & Chapter
Byzantine Fault Tolerance	Designing Data-Intensive Applications by Martin Kleppmann, Ch. 8-9: “The Trouble with Distributed Systems” and “Consistency and Consensus”
P2P networking	Computer Networks by Tanenbaum & Wetherall — Ch. 5: “Network Layer”
Consensus algorithms	Designing Data-Intensive Applications — Ch. 9: “Consistency and Consensus”
Proof of Stake	Ethereum Casper papers and Vitalik’s blog posts

Building Virtual Machines

Concept	Book & Chapter
Stack machine architecture	Crafting Interpreters by Robert Nystrom — Part III: “A Bytecode Virtual Machine”
Bytecode design	Crafting Interpreters — Ch. 14-15: “Chunks of Bytecode” and “A Virtual Machine”
Compiler construction	Writing a C Compiler by Nora Sandler — Full book

Essential Reading Order

For maximum comprehension, read in this order:

Foundation (Week 1):
- Programming Bitcoin Ch. 1-4 (field math, curves, serialization)
- Mastering Bitcoin Ch. 6 (transactions)
Bitcoin Deep Dive (Week 2):
- Programming Bitcoin Ch. 5-9 (transactions, script, blocks)
- Mastering Bitcoin Ch. 11 (blockchain)
Ethereum (Week 3):
- Mastering Ethereum Ch. 4-7 (crypto, wallets, contracts)
- Mastering Ethereum Ch. 13 (EVM)
Distributed Systems (Week 4):
- Designing Data-Intensive Applications Ch. 8-9

Prerequisites & Background Knowledge

Essential Prerequisites (Must Have)

Before starting these projects, you should have:

Programming Skills:

Proficiency in at least one programming language (Python, C, JavaScript, or Rust preferred)
Understanding of data structures (arrays, hash tables, trees)
Basic algorithm analysis (Big O notation)
Experience with command-line tools and Git

Cryptography Fundamentals:

What is a cryptographic hash function? (Can you explain SHA-256?)
What makes a hash function “cryptographically secure”?
What is public-key cryptography? (Asymmetric encryption)
What is a digital signature and why can’t it be forged?

Computer Science Fundamentals:

How does TCP/IP networking work? (Client-server model, sockets)
What is a state machine?
What is serialization/deserialization?
How do distributed systems differ from single-machine programs?

Helpful But Not Required

You’ll learn these during the projects, but having them helps:

Number theory (modular arithmetic, prime fields) - Project 1 teaches this
Distributed systems theory (CAP theorem, Byzantine Generals) - Projects 2 & 4 cover this
Compiler theory (parsing, ASTs, bytecode) - Projects 3 & 5 teach this
Game theory (incentives, Nash equilibrium) - Emerges naturally in consensus projects

Self-Assessment Questions

Can you answer these? If yes, you’re ready:

Cryptography:
- Why can’t you reverse a SHA-256 hash?
- How does signing with a private key let others verify with the public key?
- What happens if two different inputs produce the same hash?
Programming:
- How would you represent a graph in memory?
- What’s the difference between passing by value vs. by reference?
- How do you debug a segfault or null pointer exception?
Networking:
- What’s the difference between TCP and UDP?
- How do two programs on different computers communicate?
- What does “peer-to-peer” mean?

If you struggled with any of these, spend a day reviewing:

Cryptography basics: “Serious Cryptography, 2nd Edition” Ch. 1-3
Networking: “Computer Networks” Ch. 1
Data structures: “Algorithms, Fourth Edition” Ch. 1-3

Development Environment Setup

Required Tools:

Python 3.9+ (for Projects 1, 2, 5) - or your preferred language
C compiler (gcc or clang for Project 2 if using C)
Git for version control
Text editor or IDE (VS Code, Vim, PyCharm - whatever you’re comfortable with)

Recommended Tools:

Bitcoin Core (for testing against real Bitcoin network)

# Start Bitcoin Core in regtest mode for safe, offline testing
bitcoind -regtest -daemon

Geth (Ethereum client for testing EVM projects)

# Run a local Ethereum testnet
geth --dev --http

Block explorer access (blockchain.com, etherscan.io) to inspect real transactions
Wireshark or tcpdump for inspecting network packets (Project 2)
Hexdump tools (xxd, hexyl) for debugging binary formats

Optional but Useful:

Docker to run multiple blockchain nodes easily
Postman or curl for testing APIs
Jupyter notebooks for experimenting with crypto math (Project 1)

Time Investment Expectations

Realistic time estimates per project:

Project	Beginner	Intermediate	Advanced
Project 1: Bitcoin from Scratch	6-8 weeks	3-4 weeks	2-3 weeks
Project 2: Minimal Blockchain	2 weeks	1 week	2-3 days
Project 3: EVM from Scratch	4-6 weeks	2-3 weeks	1-2 weeks
Project 4: Proof-of-Stake	3-4 weeks	2 weeks	1 week
Project 5: Smart Contract Compiler	3-4 weeks	2 weeks	1 week
Project 6: Layer-2 Rollup	6-8 weeks	4 weeks	2-3 weeks

Total learning journey: 6-12 months if doing all projects (working 10-15 hours/week)

Important: These are NOT tutorial projects. You will get stuck. You will need to read documentation, whitepapers, and books. You will debug for hours. This is how deep learning happens.

Important Reality Check

What These Projects Are NOT:

❌ Copy-paste tutorials with step-by-step instructions
❌ “Build a blockchain in 100 lines” toy examples
❌ Get-rich-quick crypto trading guides
❌ Production-ready code you should deploy with real money

What These Projects ARE:

✅ Deep dives into how blockchain systems actually work
✅ Implementations from first principles with working code
✅ Educational exercises that force you to confront hard problems
✅ Skills that transfer to auditing smart contracts, building dApps, or understanding any blockchain

Warning: Do NOT use code from these projects with real cryptocurrency. These are educational implementations. Production systems require security audits, extensive testing, and expert review.

Quick Start: Your First 48 Hours

Feeling overwhelmed? Start here.

If you’re new to blockchain and don’t know where to begin, follow this 48-hour crash course:

Hour 0-4: Understand the Core Insight

Read these (in order):

Bitcoin Whitepaper - 9 pages, focus on sections 1-5
“Why Blockchain Technology Matters” section above (reread it slowly)
Visualize: Draw the “Traditional Database vs Blockchain” diagram on paper

Experiment:

# Install Python and try this:
import hashlib

# See how hash functions work
data = "Hello, Bitcoin!"
hash1 = hashlib.sha256(data.encode()).hexdigest()
print(f"SHA-256({data}) = {hash1}")

# Change ONE character
data2 = "Hello, Bitcoin?"
hash2 = hashlib.sha256(data2.encode()).hexdigest()
print(f"SHA-256({data2}) = {hash2}")

# Notice: Hashes are completely different!

Goal: Understand why cryptographic hashes make blockchains tamper-evident: changing even one character of the data produces a completely different hash.

Hour 5-12: Build Your First Tiny Blockchain

Do this mini-project:

Create a simple Python script with 3 blocks that chain together:

# mini_blockchain.py
import hashlib
import json
import time

class Block:
    def __init__(self, index, data, previous_hash):
        self.index = index
        self.timestamp = time.time()
        self.data = data
        self.previous_hash = previous_hash
        self.hash = self.calculate_hash()

    def calculate_hash(self):
        block_string = json.dumps({
            "index": self.index,
            "timestamp": self.timestamp,
            "data": self.data,
            "previous_hash": self.previous_hash
        }, sort_keys=True)
        return hashlib.sha256(block_string.encode()).hexdigest()

# Create genesis block
genesis = Block(0, "Genesis Block", "0")
block1 = Block(1, "Alice sends Bob 10 BTC", genesis.hash)
block2 = Block(2, "Bob sends Charlie 5 BTC", block1.hash)

print(f"Block 0: {genesis.hash}")
print(f"Block 1: {block1.hash} (prev: {block1.previous_hash})")
print(f"Block 2: {block2.hash} (prev: {block2.previous_hash})")

# Try tampering!
block1.data = "Alice sends Bob 1000 BTC"  # Fraud attempt!
block1.hash = block1.calculate_hash()     # Recompute: the stored hash is stale after the edit
print(f"\n❌ After tampering Block 1:")
print(f"Block 1 hash: {block1.hash}")
print(f"Block 2 expects previous: {block2.previous_hash}")
print(f"Does Block 2 still validate? {block1.hash == block2.previous_hash}")

Run it and experiment:

What happens when you change block 1’s data?
Why does block 2 break when you tamper with block 1?
This is the “chain” in blockchain!
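
The tampering experiment compares one pair of blocks by hand. A minimal sketch of a whole-chain check might look like the following; the dict layout and the names `block_hash`, `make_block`, and `is_chain_valid` are illustrative assumptions, not taken from any real client.

```python
import hashlib
import json


def block_hash(index, data, previous_hash):
    """Hash a block's contents (hypothetical layout: index, data, previous hash)."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def make_block(index, data, previous_hash):
    return {
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    }


def is_chain_valid(chain):
    """Recompute every hash and check every link; stop at the first mismatch."""
    for i, block in enumerate(chain):
        recomputed = block_hash(block["index"], block["data"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False  # stored hash no longer matches the block's contents
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False  # link to the previous block is broken
    return True


genesis = make_block(0, "Genesis Block", "0")
block1 = make_block(1, "Alice sends Bob 10 BTC", genesis["hash"])
block2 = make_block(2, "Bob sends Charlie 5 BTC", block1["hash"])
chain = [genesis, block1, block2]

print(is_chain_valid(chain))                 # True
block1["data"] = "Alice sends Bob 1000 BTC"  # tamper with block 1
print(is_chain_valid(chain))                 # False
```

Even if the attacker recomputes block 1's hash, block 2's `previous_hash` no longer matches it, so every later block would have to be rewritten as well.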

Hour 13-24: Understand Proof of Work

Add mining to your mini blockchain:

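Add a nonce to your Block class and a mine_block method that keeps hashing until the hash starts with 0000. The nonce must be part of the hashed data: add "nonce": self.nonce to the dictionary in calculate_hash, and set self.nonce = 0 in __init__ before the first calculate_hash call. Otherwise the hash never changes and the loop never ends.

def mine_block(self, difficulty=4):
    target = "0" * difficulty
    self.nonce = 0
    self.hash = self.calculate_hash()
    while not self.hash.startswith(target):
        self.nonce += 1
        self.hash = self.calculate_hash()  # changes because the nonce is hashed too
    print(f"Block mined! Nonce: {self.nonce}, Hash: {self.hash}")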

Run it and observe:

How long does it take to mine a block?
What happens if you increase difficulty to 5 zeros? 6 zeros?
This is proof-of-work! It makes tampering expensive.
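
To get a feel for how the cost grows, here is a small standalone sketch that counts attempts at several difficulties. The `mine` helper and the `data:nonce` input format are assumptions made for this illustration only; real block headers are binary structures.

```python
import hashlib


def mine(data, difficulty):
    """Try nonces 0, 1, 2, ... until the hex digest has `difficulty` leading zeros."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}:{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1


for difficulty in range(1, 5):
    nonce, digest = mine("block payload", difficulty)
    expected = 16 ** difficulty  # each hex digit is zero with probability 1/16
    print(f"difficulty={difficulty} attempts={nonce + 1} "
          f"expected~{expected} hash={digest[:12]}...")
```

Each extra leading zero hex digit multiplies the expected work by 16, which is why going from 4 to 6 zeros feels so much slower.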

Hour 25-36: Learn Cryptographic Signatures

Read and implement:

Read “Programming Bitcoin” Chapter 3 (Elliptic Curve Cryptography) - first 20 pages
Use Python’s ecdsa library to sign and verify messages:

from ecdsa import SigningKey, SECP256k1, BadSignatureError

# Alice generates a keypair
private_key = SigningKey.generate(curve=SECP256k1)
public_key = private_key.verifying_key

# Alice signs a message
message = b"I, Alice, send Bob 10 BTC"
signature = private_key.sign(message)

# Anyone can verify with Alice's public key
try:
    public_key.verify(signature, message)
    print("✓ Signature valid!")
except BadSignatureError:
    print("❌ Invalid signature")

# Try to reuse the signature on a different message (will fail!)
fake_message = b"I, Alice, send Bob 1000 BTC"
try:
    public_key.verify(signature, fake_message)
except BadSignatureError:
    print("❌ Cannot forge signature!")

Goal: Understand how digital signatures prove ownership.

Hour 37-48: Read Real Bitcoin Code

Don’t implement, just READ:

Go to Bitcoin Core source code
Read these files (skim them; you don’t need to understand everything yet):
- src/primitives/block.h - See how blocks are structured
- src/primitives/transaction.h - See transaction format
- src/hash.h - See how hashing is implemented

Then look at a real transaction:

Visit blockchain.com
Click any recent block
Click a transaction
Look at the “raw transaction” hex

Ask yourself:

Can I see the inputs and outputs?
Where are the signatures?
How much data is actually in a transaction?

After 48 Hours

You should now understand:

✅ Why blockchains are tamper-evident (hash chains)
✅ Why mining makes attacks expensive (proof-of-work)
✅ How ownership is proven (digital signatures)
✅ What a real blockchain looks like (Bitcoin exploration)

You’re ready to start Project 2: “Build a Minimal Blockchain in a Weekend”

Recommended Learning Paths

Different backgrounds benefit from different project orders. Choose your path:

Path 1: “The Academic” (Theory-First Approach)

Best for: Computer science students, math background, theory lovers

Order:

Start: Read all concept sections above thoroughly
Project 2: Build a Minimal Blockchain (get the mental model)
Project 4: Implement Proof-of-Stake (understand consensus theory)
Project 1: Build Bitcoin from Scratch (apply all the theory)
Project 3: Build the EVM (understand state machines)
Project 5: Smart Contract Compiler (compilers & languages)
Project 6: Layer-2 Rollup (advanced cryptography)

Why this order: You build theoretical understanding before diving into Bitcoin’s complexity.

Path 2: “The Practitioner” (Build-First Approach)

Best for: Professional developers, learn-by-doing types

Order:

Start: Quick Start guide (48 hours)
Project 2: Build a Minimal Blockchain (see it work immediately)
Project 1: Build Bitcoin from Scratch (the real deal)
Project 3: Build the EVM (different paradigm)
Project 5: Smart Contract Compiler (compiler theory)
Project 4: Proof-of-Stake (modern consensus)
Project 6: Layer-2 Rollup (cutting edge)

Why this order: You get working code fast, then deepen understanding.

Path 3: “The Bitcoin Maximalist”

Best for: Those specifically interested in Bitcoin, security-focused

Order:

Start: Read Bitcoin Whitepaper 3 times
Project 1: Build Bitcoin from Scratch (the only true blockchain!)
Project 2: Build a Minimal Blockchain (understand Bitcoin’s innovations)
Project 4: Proof-of-Stake (to understand why Bitcoin doesn’t use it)
Project 6: Layer-2 Rollup (Bitcoin’s Lightning Network uses similar ideas)
Skip Projects 3 & 5 (or do them to understand “the competition”)

Why this order: Deep Bitcoin focus, with understanding of alternatives.

Path 4: “The Ethereum Developer”

Best for: dApp developers, smart contract auditors

Order:

Start: Quick Start guide + read Ethereum sections above
Project 3: Build the EVM (understand what your Solidity code runs on)
Project 5: Smart Contract Compiler (understand how code becomes bytecode)
Project 1: Build Bitcoin (understand why Ethereum differs)
Project 4: Proof-of-Stake (Ethereum’s consensus mechanism since the Merge)
Project 6: Layer-2 Rollup (Arbitrum, Optimism, zkSync)
Optional: Project 2 (minimal blockchain for comparison)

Why this order: Focused on understanding Ethereum’s entire stack.

Path 5: “The Researcher” (Consensus & Distributed Systems Focus)

Best for: Those interested in distributed systems, consensus algorithms

Order:

Start: Read “Designing Data-Intensive Applications” Ch. 8-9 first
Project 4: Proof-of-Stake (Byzantine Fault Tolerance)
Project 2: Minimal Blockchain (distributed state machines)
Project 1: Bitcoin (Nakamoto Consensus)
Project 6: Layer-2 Rollup (optimistic vs zero-knowledge proofs)
Projects 3 & 5: If interested in execution environments

Why this order: Focuses on the distributed systems theory that makes blockchains work.

Path 6: “The Time-Constrained” (Fastest Path to Understanding)

Best for: Busy professionals, want core insights fast

Order:

Week 1: Quick Start (48 hours) + Project 2 (weekend)
Week 2-4: Project 1 (Bitcoin, focus on Ch. 1-5 of “Programming Bitcoin”)
Week 5-6: Project 3 (EVM, just get it working, don’t optimize)
Stop here - you understand 80% of blockchain concepts

Why this order: Maximum learning per hour invested.

Project 1: Build Bitcoin From Scratch

File: BLOCKCHAIN_BITCOIN_ETHEREUM_LEARNING_PROJECTS.md
Programming Language: Python
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: 1. The “Resume Gold”
Difficulty: Level 5: Master
Knowledge Area: Blockchain / Cryptography
Software or Tool: Bitcoin
Main Book: “Programming Bitcoin” by Jimmy Song

What you’ll build: A complete Bitcoin library implementing elliptic curve cryptography, transactions, blocks, script parsing, and network communication—all from first principles in Python.

Why it teaches blockchain: This forces you to understand why Bitcoin works, not just that it works. You can’t fake your way through implementing ECDSA signatures or parsing transaction scripts. Every line of code confronts you with a design decision Satoshi made.

Core challenges you’ll face:

Finite field arithmetic → Maps to understanding why Bitcoin uses the secp256k1 curve
Elliptic curve point operations → Maps to how public keys derive from private keys
Transaction serialization → Maps to how data is encoded on-chain
Script interpreter → Maps to Bitcoin’s programmability model
Merkle tree construction → Maps to how blocks efficiently prove transaction inclusion
Block header hashing → Maps to proof-of-work mining

Resources for key challenges:

“Programming Bitcoin” by Jimmy Song - THE definitive hands-on guide; each chapter builds the library piece by piece with exercises
Bitcoin Whitepaper - Read alongside implementation to see theory meet practice

Key Concepts:

Elliptic Curve Cryptography: “Programming Bitcoin” Ch. 2-3 - Jimmy Song
Transaction Structure: “Mastering Bitcoin” Ch. 6 - Andreas Antonopoulos
Script Opcodes: “Programming Bitcoin” Ch. 6 - Jimmy Song
Merkle Trees: “Mastering Bitcoin” Ch. 11 - Andreas Antonopoulos
Proof of Work: “Programming Bitcoin” Ch. 9 - Jimmy Song

Difficulty: Advanced
Time estimate: 1 month+
Prerequisites: Python proficiency, basic number theory helps

Learning milestones:

Generate valid Bitcoin addresses - You understand public key cryptography
Parse and create transactions - You understand the UTXO model
Validate a real block - You understand proof-of-work and Merkle proofs
Interpret Script opcodes - You understand Bitcoin’s programmability

Real World Outcome

When you complete this project, you’ll have a fully functional Bitcoin library that you wrote from scratch. Here’s exactly what you’ll be able to do:

1. Generate Real Bitcoin Addresses

$ python bitcoin_cli.py generate-wallet

Private Key (WIF): 5HueCGU8rMjxEXxiPuD5BDku4MkFqeZyd4dZ1jvhTVqvbTLvyTJ
Public Key (compressed): 02d0de0aaeaefad02b8bdc8a01a1b8b11c696bd3d66a2c5f10780d95b7df42645c
Bitcoin Address (P2PKH): 1GAehh7TsJAHuUAeKZcXf5CnwuGuGgyX2S

This is a validly encoded Bitcoin mainnet address!
Never fund an address whose private key has appeared on screen or in a document; anyone who has seen the key can spend the coins.

2. Parse and Decode Real Transactions from the Blockchain

$ python bitcoin_cli.py decode-tx 0100000001c997a5e56e104102fa209c6a852dd90660a20b2d9c352423edce25857fcd3704000000...

TRANSACTION DECODED:
════════════════════════════════════════════════════════
Version: 1
Locktime: 0

INPUTS (1):
  [0] Previous TX: 0437cd7f8525cede2e4a0b40b1f6e3c7...
      Output Index: 0
      ScriptSig: 47304402204e45e16932b8af514961...
      Sequence: 0xffffffff

OUTPUTS (2):
  [0] Value: 0.10000000 BTC (10,000,000 satoshis)
      ScriptPubKey: OP_DUP OP_HASH160 <pubkeyhash> OP_EQUALVERIFY OP_CHECKSIG
      Type: P2PKH (Pay to Public Key Hash)
      Address: 1runeksijzfVxyrpiyCY2LCBvYsSi1Ai6

  [1] Value: 0.08950000 BTC (Change)
      ScriptPubKey: OP_DUP OP_HASH160 <pubkeyhash> OP_EQUALVERIFY OP_CHECKSIG
      Address: 1QJtPTVJjkqVALLzCLF9kCLJYN4C4GzK2c

Transaction ID: e4c226432e...
Size: 225 bytes
════════════════════════════════════════════════════════

3. Create and Sign Your Own Transactions

$ python bitcoin_cli.py create-tx \
    --from-utxo "tx_id:0" \
    --to "1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2:0.001" \
    --change "1MyAddress..." \
    --private-key "your_wif_key"

SIGNED TRANSACTION:
Raw Hex: 0100000001eccf7e3034189b851985d871f91384b8ee357cd47c3024736e...

Breakdown:
  - Input signed with ECDSA (your implementation!)
  - Signature verified: ✓ VALID
  - Ready to broadcast to network

You can paste this hex into any block explorer's "broadcast" feature

4. Validate Real Blocks from the Bitcoin Network

$ python bitcoin_cli.py validate-block 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f

BLOCK 0 (Genesis Block) VALIDATION:
════════════════════════════════════════════════════════
Header Hash: 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
             ^^^^^^^^ (Leading zeros = Proof of Work!)

Version: 1
Previous Block: 0000000000000000000000000000000000000000000000000000000000000000
Merkle Root: 4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b
Timestamp: 2009-01-03 18:15:05 UTC
Difficulty Bits: 0x1d00ffff
Nonce: 2083236893

VALIDATION RESULTS:
  ✓ Block hash below target (PoW valid)
  ✓ Merkle root matches transactions
  ✓ Coinbase transaction present
  ✓ Block structure valid

"The Times 03/Jan/2009 Chancellor on brink of second bailout for banks"
  - Satoshi's message embedded in the coinbase!
════════════════════════════════════════════════════════

5. Execute Bitcoin Script Programs

$ python bitcoin_cli.py run-script "OP_2 OP_3 OP_ADD OP_5 OP_EQUAL"

SCRIPT EXECUTION TRACE:
════════════════════════════════════════════════════════
Step 1: OP_2      Stack: [2]
Step 2: OP_3      Stack: [2, 3]
Step 3: OP_ADD    Stack: [5]           (popped 2 and 3, pushed 5)
Step 4: OP_5      Stack: [5, 5]
Step 5: OP_EQUAL  Stack: [1]           (1 = TRUE)

RESULT: ✓ Script executed successfully (stack top is truthy)
════════════════════════════════════════════════════════
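
A trace like this can be produced by a very small stack machine. The sketch below is a simplification under stated assumptions: it handles only OP_1 through OP_16, OP_ADD, OP_EQUAL, and OP_DUP, and it stores plain Python integers, whereas real Script operates on byte strings. The `run_script` name is hypothetical.

```python
def run_script(script):
    """Execute a space-separated script; supports only a few simple opcodes."""
    stack = []
    for step, op in enumerate(script.split(), start=1):
        suffix = op[3:] if op.startswith("OP_") else ""
        if suffix.isdigit() and 1 <= int(suffix) <= 16:
            stack.append(int(suffix))           # OP_1 .. OP_16 push small integers
        elif op == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        elif op == "OP_DUP":
            stack.append(stack[-1])
        else:
            raise ValueError(f"unsupported opcode: {op}")
        print(f"Step {step}: {op.ljust(9)} Stack: {stack}")
    # A script succeeds when it finishes with a truthy value on top of the stack.
    return bool(stack) and stack[-1] != 0


print(run_script("OP_2 OP_3 OP_ADD OP_5 OP_EQUAL"))  # True
```

Printing the stack after every opcode is the same idea as the verbose trace shown in the sample output, and it makes failed scripts much easier to diagnose.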

The Core Question You’re Answering

“How does Bitcoin actually work at the byte level? How does a private key become an address? How does a transaction prove ownership without revealing the private key?”

Before you write any code, sit with these questions. Most developers know “Bitcoin uses cryptography” but can’t explain how a 256-bit random number becomes a valid Bitcoin address, or why you can prove you own coins without ever revealing your private key.

The answer involves:

Finite field arithmetic (modular math in a prime field)
Elliptic curve point multiplication (how a number becomes a curve point)
Cryptographic hash chains (how addresses are derived)
Digital signature math (how ECDSA proves knowledge without revelation)

Concepts You Must Understand First

Stop and research these before coding:

Finite Fields (Modular Arithmetic)
- What does “mod p” mean and why is it useful for cryptography?
- What is a “field” in abstract algebra? Why does Bitcoin use prime fields?
- How do you compute modular inverses? (Extended Euclidean Algorithm)
- Why does Fermat’s Little Theorem give us a^(p-1) ≡ 1 (mod p)?
- Book Reference: “Programming Bitcoin” Ch. 1 - Jimmy Song
Elliptic Curves over Finite Fields
- What is an elliptic curve equation? (y² = x³ + ax + b)
- What does “point addition” mean geometrically?
- What is the “point at infinity” and why do we need it?
- What makes secp256k1 special? (y² = x³ + 7)
- Book Reference: “Programming Bitcoin” Ch. 2-3 - Jimmy Song
Digital Signatures (ECDSA)
- Why can signing prove ownership without revealing the private key?
- What is a “nonce” and why MUST it never be reused? (Sony PlayStation 3 hack!)
- What do r and s in a signature represent?
- How does verification work using only the public key?
- Book Reference: “Programming Bitcoin” Ch. 3 (signing and verification) and Ch. 4 (DER serialization) - Jimmy Song
Transaction Structure (UTXOs)
- What is a UTXO (Unspent Transaction Output)?
- Why does Bitcoin use inputs/outputs rather than account balances?
- What is a “locking script” (scriptPubKey) vs “unlocking script” (scriptSig)?
- How does OP_CHECKSIG actually verify a signature?
- Book Reference: “Mastering Bitcoin” Ch. 6 - Andreas Antonopoulos
Bitcoin Script
- What is a stack-based language?
- Why did Satoshi make Script intentionally NOT Turing-complete?
- What are the most important opcodes: OP_DUP, OP_HASH160, OP_EQUALVERIFY, OP_CHECKSIG?
- How does P2PKH (Pay-to-Public-Key-Hash) work?
- Book Reference: “Programming Bitcoin” Ch. 6 - Jimmy Song
Block Structure and Merkle Trees
- What is in a block header? (version, prev_hash, merkle_root, timestamp, bits, nonce)
- How does the Merkle root commit to all transactions?
- What is the “difficulty target” and how is it encoded in 4 bytes?
- Why does finding a valid nonce take billions of attempts?
- Book Reference: “Programming Bitcoin” Ch. 9 - Jimmy Song; “Mastering Bitcoin” Ch. 11 - Antonopoulos

Questions to Guide Your Design

Before implementing, think through these:

Finite Field Class
- How will you represent a finite field element?
- How do you ensure all operations stay within the field (mod p)?
- What’s the most efficient way to compute modular exponentiation?
- How will you handle division (modular inverse)?
Elliptic Curve Point Class
- How do you represent the “point at infinity”?
- How do you handle the special case where two points have the same x-coordinate?
- How do you efficiently compute k × G for large k? (Hint: double-and-add)
- How will you distinguish between compressed and uncompressed public keys?
Transaction Serialization
- Why does Bitcoin use little-endian for some fields and big-endian for others?
- How do you parse variable-length integers (varints)?
- What is the “signature hash” and why is it different from the transaction hash?
- How do you handle witness data (SegWit)?
Script Interpreter
- How will you represent opcodes?
- How do you handle conditional operators (OP_IF, OP_ELSE)?
- What should happen when a script fails?
- How do you handle OP_CHECKMULTISIG’s famous off-by-one bug?
Block Validation
- How do you check if a block hash meets the difficulty target?
- How do you verify the Merkle root matches the transactions?
- What order should you validate things in for efficiency?
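
For the varint question under Transaction Serialization, here is a minimal sketch of Bitcoin's variable-length integer encoding (values below 0xfd fit in one byte; the prefixes 0xfd, 0xfe, and 0xff announce 2, 4, and 8 little-endian bytes). The function names `read_varint` and `encode_varint` are chosen for this example.

```python
import io


def read_varint(stream):
    """Read a Bitcoin variable-length integer from a binary stream."""
    prefix = stream.read(1)[0]
    if prefix == 0xFD:
        return int.from_bytes(stream.read(2), "little")
    if prefix == 0xFE:
        return int.from_bytes(stream.read(4), "little")
    if prefix == 0xFF:
        return int.from_bytes(stream.read(8), "little")
    return prefix  # values below 0xfd are stored in the prefix byte itself


def encode_varint(n):
    """Encode an integer using the shortest varint form."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    if n <= 0xFFFFFFFFFFFFFFFF:
        return b"\xff" + n.to_bytes(8, "little")
    raise ValueError("integer too large for a varint")


for value in (0, 252, 253, 65535, 65536, 2**32):
    encoded = encode_varint(value)
    assert read_varint(io.BytesIO(encoded)) == value
    print(value, encoded.hex())
```

Reading from a stream rather than a byte string lets the same function be reused while parsing input counts, output counts, and script lengths one after another.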

Thinking Exercise

Before coding, work through this on paper:

Exercise 1: Trace Key Derivation

Private Key (random 256-bit number):
  k = 0x1 (simplest example)

Generator Point G on secp256k1:
  G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
       0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

Public Key P = k × G = 1 × G = G (when k=1, P equals G)

Now trace what happens:
1. Serialize P as compressed (33 bytes) or uncompressed (65 bytes)
2. SHA256(serialized_P) → 32 bytes
3. RIPEMD160(sha256_result) → 20 bytes (the "hash160")
4. Prepend version byte (0x00 for mainnet)
5. SHA256(SHA256(version + hash160)) → take first 4 bytes as checksum
6. Base58Check encode (version + hash160 + checksum)
7. Result: Bitcoin address!
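
Steps 4 to 6 of Exercise 1 can be sketched with the standard library alone. Two assumptions to note: the hash160 value below is the widely published result for the compressed public key of k = 1, hard-coded because RIPEMD160 is not available in every `hashlib` build; and `base58check_encode` is a name chosen for this example.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def hash256(data):
    """Double SHA-256, as used for Bitcoin checksums."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58check_encode(payload):
    """Append a 4-byte checksum, then encode the result in Base58."""
    raw = payload + hash256(payload)[:4]
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, remainder = divmod(num, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# Assumed input: hash160 of the compressed public key for k = 1.
hash160 = bytes.fromhex("751e76e8199196d454941c45d1b3a323f1433bd6")
address = base58check_encode(b"\x00" + hash160)  # 0x00 = mainnet P2PKH version byte
print(address)  # 1BgGZ9tcN4rm9KBzDn7KprQz87SZ26SAMH
```

The leading "1" of every mainnet P2PKH address comes directly from the 0x00 version byte.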

Exercise 2: Trace a Transaction

Given these UTXOs you control:
  - UTXO 1: 0.5 BTC from tx_aaa, output 0
  - UTXO 2: 0.3 BTC from tx_bbb, output 1

You want to send 0.6 BTC to address 1Bob...
Fee: 0.001 BTC

Work out:
1. Which UTXOs do you need to spend?
2. What is the change amount?
3. What does the raw transaction look like (draw the structure)?
4. What exactly gets signed? (The signature hash)
5. Where does the signature go in the final transaction?
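
Questions 1 and 2 of Exercise 2 come down to integer arithmetic in satoshis. The sketch below uses a simple largest-first selection strategy, which is one assumption among many possible coin-selection policies; `select_utxos` is a hypothetical helper.

```python
SATOSHIS_PER_BTC = 100_000_000

utxos = [
    {"txid": "tx_aaa", "index": 0, "value": 50_000_000},  # 0.5 BTC
    {"txid": "tx_bbb", "index": 1, "value": 30_000_000},  # 0.3 BTC
]
send_amount = 60_000_000  # 0.6 BTC to 1Bob...
fee = 100_000             # 0.001 BTC


def select_utxos(utxos, needed):
    """Greedy selection: take the largest UTXOs until the total covers `needed`."""
    selected, total = [], 0
    for utxo in sorted(utxos, key=lambda u: u["value"], reverse=True):
        selected.append(utxo)
        total += utxo["value"]
        if total >= needed:
            return selected, total
    raise ValueError("insufficient funds")


inputs, total_in = select_utxos(utxos, send_amount + fee)
change = total_in - send_amount - fee

print(f"inputs: {[u['txid'] for u in inputs]}")
print(f"change: {change} satoshis = {change / SATOSHIS_PER_BTC:.8f} BTC")
```

Neither UTXO covers 0.601 BTC alone, so both are spent and 0.199 BTC returns as change. Working in integer satoshis avoids floating-point rounding errors, and the fee is never written into the transaction: it is simply the difference between inputs and outputs.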

The Interview Questions They’ll Ask

Prepare to answer these:

“Walk me through what happens when you send a Bitcoin transaction.”
“How does ECDSA prove you own a private key without revealing it?”
“What is a UTXO and why did Satoshi choose this model over account balances?”
“What makes Bitcoin Script different from a Turing-complete language?”
“How does proof-of-work prevent double-spending?”
“What would happen if someone reused a nonce in two different signatures?”
“How does a Merkle tree allow light clients to verify transactions?”
“What’s the difference between a transaction ID and the data that gets signed?”
“Why does Bitcoin have both P2PKH and P2SH? What problem does P2SH solve?”
“How is the difficulty target encoded in 4 bytes?”

Hints in Layers

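Hint 1: Start with Finite Fields
Build and test your finite field arithmetic first:

class FieldElement:
    def __init__(self, num, prime):
        self.num = num % prime
        self.prime = prime

    def __eq__(self, other):
        return self.num == other.num and self.prime == other.prime

    def __add__(self, other):
        return FieldElement((self.num + other.num) % self.prime, self.prime)

    def __mul__(self, other):
        return FieldElement((self.num * other.num) % self.prime, self.prime)

    def __pow__(self, exponent):
        # Reduce the exponent mod (p - 1) so negative exponents work:
        # by Fermat's Little Theorem a^(p-1) = 1, hence a^(-1) = a^(p-2)
        n = exponent % (self.prime - 1)
        return FieldElement(pow(self.num, n, self.prime), self.prime)

Test: Verify that a * a**-1 == FieldElement(1, prime) for various nonzero values of a.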

Hint 2: Point Addition Has Edge Cases
The formula for adding two points depends on whether:

Points are the same (point doubling)
Points have the same x-coordinate (result is infinity)
One point is the point at infinity

Draw the geometric picture before coding!
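
The three edge cases can be made concrete on a small curve before moving to 256-bit numbers. This sketch uses y² = x³ + 7 over the prime field F_223 (an assumption chosen so results can be checked by hand) and represents the point at infinity as `None`.

```python
P_MOD = 223      # small prime for hand-checkable examples; secp256k1 uses a 256-bit prime
A, B = 0, 7      # curve y^2 = x^3 + 7, the same equation shape as secp256k1
INFINITY = None  # the point at infinity (identity element)


def on_curve(point):
    if point is INFINITY:
        return True
    x, y = point
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0


def add(p1, p2):
    """Add two curve points, handling each edge case explicitly."""
    if p1 is INFINITY:
        return p2                    # O + Q = Q
    if p2 is INFINITY:
        return p1                    # P + O = P
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INFINITY              # P + (-P) = O: the line through them is vertical
    if p1 == p2:
        # Point doubling: slope of the tangent line at P.
        s = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        # Distinct points: slope of the line through both.
        s = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


P, Q = (192, 105), (17, 56)
print(add(P, Q))                    # (170, 142)
print(add(P, INFINITY) == P)        # True
print(add(P, (192, P_MOD - 105)))   # None
print(on_curve(add(P, P)))          # True
```

`pow(x, -1, p)` computes the modular inverse directly and requires Python 3.8 or later; the same code works for secp256k1 once `P_MOD` is replaced by its field prime.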

Hint 3: Signature Hash Is Not Transaction Hash
When signing, you don’t sign the transaction itself; you sign a modified copy in which the input’s scriptSig is replaced with the previous output’s scriptPubKey. This tripped up many early implementers.

Hint 4: Test Against Real Data
The book “Programming Bitcoin” includes test vectors from the real Bitcoin network. Use them! If your transaction parser can’t decode real transactions, something is wrong.

Hint 5: Use the Debug Flag
Add verbose logging to your Script interpreter:

OP_DUP    Stack: [pubkey] → [pubkey, pubkey]
OP_HASH160  Stack: [pubkey, pubkey] → [pubkey, hash160(pubkey)]

This makes debugging script execution much easier.

Books That Will Help

Topic	Book	Chapter
Complete Bitcoin implementation	Programming Bitcoin by Jimmy Song	Full book (follow along!)
Bitcoin transaction concepts	Mastering Bitcoin, 3rd Edition by Andreas Antonopoulos	Ch. 6: “Transactions”
Block structure and mining	Mastering Bitcoin, 3rd Edition	Ch. 11-12: “The Blockchain” and “Mining and Consensus”
Elliptic curve cryptography math	Serious Cryptography, 2nd Edition by Jean-Philippe Aumasson	Ch. 11-12: “Public-Key Cryptography”
Hash function properties	Serious Cryptography, 2nd Edition	Ch. 6: “Hash Functions”
Bitcoin whitepaper context	The Book of Satoshi by Phil Champagne	Historical context and design decisions
Number theory foundations	An Introduction to Mathematical Cryptography by Hoffstein, Pipher, Silverman	Ch. 1-2: Modular arithmetic

Common Pitfalls & Debugging

Building Bitcoin from scratch is complex. Here are the most common issues and how to solve them:

Problem 1: “My elliptic curve point addition doesn’t match reference implementations”

Why: You’re likely not handling the “point at infinity” correctly, or missing modular inverse calculations
Fix: The point at infinity (O) is the identity element. A + O = A. Also ensure you’re using Fermat’s Little Theorem for modular inverses: a^(-1) ≡ a^(p-2) (mod p)

Quick test:

# Point addition should be commutative
assert P + Q == Q + P
# Identity element test
assert P + O == P

Problem 2: “My signatures verify correctly but Bitcoin Core rejects them”

Why: Bitcoin signatures are DER-encoded, and nodes apply a strict low-S rule (proposed in BIP 62 and enforced today as a relay policy)
Fix: Ensure the s-value satisfies s ≤ curve_order / 2. If not, replace it with curve_order - s
Quick test: Verify against test vectors from Bitcoin test suite
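
The low-S fix is a one-line transformation. A minimal sketch, where `normalize_s` is a name chosen for this example and `N` is the published order of the secp256k1 group:

```python
# Order of the secp256k1 group (a published curve parameter).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def normalize_s(s):
    """Return the low-S form of a signature's s value."""
    if not 0 < s < N:
        raise ValueError("s must be in the range 1..N-1")
    return N - s if s > N // 2 else s


print(normalize_s(N - 5))  # 5
print(normalize_s(5))      # 5
```

This works because (r, s) and (r, N - s) both verify for the same message and key; requiring the smaller of the two removes that ambiguity.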

Problem 3: “Transaction hashing gives wrong hash”

Why: Bitcoin uses double SHA-256 and specific serialization order
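Fix: Hash = SHA256(SHA256(data)). Ensure fields are serialized in exact order: version, inputs, outputs, locktime. Block explorers display the resulting hash in reversed byte order, so reverse your digest before comparing.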

Debug command:

# Compare your serialization with real tx
bitcoin-cli getrawtransaction <txid> | xxd -r -p | xxd

Problem 4: “Script execution fails on OP_CHECKSIG”

Why: The signature hash (sighash) calculation is intricate—you must remove scriptSig, replace with scriptPubKey, and append sighash type
Fix: Follow the exact sighash algorithm from Bitcoin Wiki. The devil is in serialization details.

Quick test:

# Test with a known good transaction
# from block 170 (first person-to-person Bitcoin transaction)
test_tx_hash = "f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16"

Problem 5: “Merkle root doesn’t match block’s merkle_root field”

Why: Merkle tree construction requires double SHA-256 at each level, and if odd number of nodes, duplicate the last one

Fix: Implementation:

while len(hashes) > 1:
    if len(hashes) % 2 == 1:
        hashes.append(hashes[-1])  # Duplicate last hash
    hashes = [hash256(h1 + h2) for h1, h2 in zip(hashes[::2], hashes[1::2])]

Quick test: Validate against genesis block (known merkle root)
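
For a self-contained version of that quick test, the sketch below wraps the loop in a function and handles the byte-order reversal between displayed txids and internal hashes. `merkle_root` is a name chosen here; the genesis check relies on the fact that a block with a single transaction has a Merkle root equal to that transaction's ID.

```python
import hashlib


def hash256(data):
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids_hex):
    """Compute a Merkle root from txids given in display (reversed) hex."""
    if not txids_hex:
        raise ValueError("a block needs at least one transaction")
    # Internal byte order is the reverse of the hex shown by block explorers.
    level = [bytes.fromhex(txid)[::-1] for txid in txids_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [hash256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0][::-1].hex()


# The genesis block contains only its coinbase transaction.
genesis_txid = "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
print(merkle_root([genesis_txid]) == genesis_txid)  # True
```

A stronger test is a block with an odd number of transactions greater than one, since that exercises the duplication rule.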

Problem 6: “Block validation passes but hash doesn’t have enough leading zeros”

Why: Confusing bits/target with difficulty. Bitcoin uses compact “bits” representation

Fix: Convert bits to target:

def bits_to_target(bits):
    exponent = bits >> 24
    coefficient = bits & 0xffffff
    return coefficient * (256 ** (exponent - 3))

Quick test: Genesis block has bits=0x1d00ffff, which should give a 32-byte target with 4 leading zero bytes (8 leading zero hex digits): 0x00000000ffff followed by 52 zero hex digits
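
One way to check a bits-to-target conversion end to end is to rebuild the genesis block header from its published fields, hash it, and compare against the target. This is a sketch using only the standard library; the variable names are chosen for the example.

```python
import hashlib


def hash256(data):
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits):
    exponent = bits >> 24
    coefficient = bits & 0xFFFFFF
    return coefficient * 256 ** (exponent - 3)


# Genesis block header fields.
version = 1
prev_block = bytes(32)
merkle_root = bytes.fromhex(
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
)[::-1]                  # hashes are stored in reversed byte order in the header
timestamp = 1231006505   # 2009-01-03 18:15:05 UTC
bits = 0x1D00FFFF
nonce = 2083236893

# 80-byte header: all integer fields are 4 bytes, little-endian.
header = (
    version.to_bytes(4, "little")
    + prev_block
    + merkle_root
    + timestamp.to_bytes(4, "little")
    + bits.to_bytes(4, "little")
    + nonce.to_bytes(4, "little")
)
assert len(header) == 80

block_hash = hash256(header)[::-1].hex()
target = bits_to_target(bits)

print(block_hash)
print(f"{target:064x}")
print(int(block_hash, 16) < target)  # True: proof of work is valid
```

The comparison is numeric, not a count of zeros: a hash is valid when, read as a 256-bit integer, it is below the target.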

Problem 7: “Getting ‘invalid signature’ errors but math seems correct”

Why: The nonce (k value) in ECDSA must be unpredictable and unique for every signature. Never reuse it!
Fix: Use RFC 6979 deterministic k-generation (based on message hash + private key)
Security warning: Reusing k reveals your private key! (This is how the Sony PS3 signing key was recovered)
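
The algebra behind that warning fits in a few lines. The sketch below is a toy under stated assumptions: in real ECDSA, r is the x-coordinate of k × G, but here r is an arbitrary stand-in value, because the recovery only needs r to be identical in both signatures (which it is whenever k is reused). The keys and messages are made-up demo values.

```python
import hashlib

# Order of the secp256k1 group (a published curve parameter).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def msg_hash(message):
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def toy_sign(d, k, r, z):
    """ECDSA's s = k^-1 * (z + r*d) mod N, with r supplied by the caller."""
    return pow(k, -1, N) * (z + r * d) % N


d = 0xC0FFEE    # victim's private key (demo value)
k = 0x1234567   # the nonce, wrongly reused for two messages
r = 0xABCDEF    # stand-in for x(k*G); identical because k is identical

z1, z2 = msg_hash(b"pay Bob 1 BTC"), msg_hash(b"pay Carol 2 BTC")
s1, s2 = toy_sign(d, k, r, z1), toy_sign(d, k, r, z2)

# The attacker sees (r, s1, z1) and (r, s2, z2), then solves for k and d.
k_recovered = (z1 - z2) * pow(s1 - s2, -1, N) % N
d_recovered = (s1 * k_recovered - z1) * pow(r, -1, N) % N

print(k_recovered == k, d_recovered == d)  # True True
```

Subtracting the two signing equations cancels the private key and leaves k as the only unknown; once k is known, either equation yields d.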

Problem 8: “Python is too slow for mining”

Why: Mining requires millions of hash operations. Pure Python is ~100x slower than C
Fix: Either (1) lower difficulty for testing, or (2) Use hashlib (C implementation), or (3) Rewrite mining in C/Rust
Alternative: Focus on validation, not mining. Use testnet blocks for validation tests.

Debugging Strategy:

Start with test vectors: Bitcoin has extensive test data. Validate each component against known inputs/outputs
Compare byte-by-byte: When your output differs, hexdump both and find first differing byte
Use Bitcoin Core as oracle: You can use bitcoin-cli to verify your results
Build incrementally: Don’t try to validate a full block on day one. Start with: hash → signature → transaction → block

Recommended debugging tools:

bitcoin-cli - Query real blockchain data
xxd or hexyl - Hex dump to compare binary data
python -m pdb - Step through your code
bitcoin.stackexchange.com - Ask specific technical questions

Project 2: Build a Minimal Blockchain in a Weekend

File: BLOCKCHAIN_BITCOIN_ETHEREUM_LEARNING_PROJECTS.md
Programming Language: C
Coolness Level: Level 3: Genuinely Clever
Business Potential: 1. The “Resume Gold”
Difficulty: Level 2: Intermediate
Knowledge Area: Blockchain / Consensus
Software or Tool: Blockchain concepts
Main Book: “Mastering Bitcoin” by Andreas Antonopoulos

What you’ll build: A simple blockchain with proof-of-work consensus, transaction pool, and P2P gossip—the essential skeleton that makes all blockchains tick.

Why it teaches blockchain: Before diving into Bitcoin/Ethereum complexity, you need the core mental model: blocks chain together via hashes, nodes gossip transactions, and PoW makes forgery expensive. This project isolates those fundamentals.

Core challenges you’ll face:

Chain integrity via hashes → Maps to why tampering is detectable
Difficulty adjustment → Maps to why block times stay consistent
Fork resolution → Maps to why “longest chain wins” (more precisely, the chain with the most cumulative work)
Transaction validation → Maps to preventing double-spends
Gossip protocol → Maps to how decentralization works
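
For the difficulty adjustment challenge, one possible approach is to scale the target by the ratio of actual to expected time over a window of blocks, limiting each step to a factor of 4 as Bitcoin does. The window size, the `retarget` name, and the parameter names below are assumptions for illustration.

```python
MAX_TARGET = 2 ** 256 - 1


def retarget(old_target, actual_seconds, expected_seconds, max_factor=4):
    """Scale the target by how long the last window took versus how long it should have."""
    # Clamp the measured time so one window changes difficulty by at most max_factor.
    clamped = max(expected_seconds // max_factor,
                  min(actual_seconds, expected_seconds * max_factor))
    new_target = old_target * clamped // expected_seconds
    return min(new_target, MAX_TARGET)


old_target = 2 ** 240
expected = 10 * 60  # hypothetical window: 10 blocks at 60 seconds each

print(retarget(old_target, 300, expected) == old_target // 2)   # True: blocks too fast
print(retarget(old_target, 1200, expected) == old_target * 2)   # True: blocks too slow
print(retarget(old_target, 10, expected) == old_target // 4)    # True: clamped to 4x
```

A lower target means higher difficulty, so blocks that arrive twice as fast as intended halve the target for the next window.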

Key Concepts:

Hash Functions: “Serious Cryptography, 2nd Edition” Ch. 6 - Jean-Philippe Aumasson
Distributed Consensus: “Designing Data-Intensive Applications” Ch. 8-9 - Martin Kleppmann
P2P Networking: “Computer Networks” Ch. 5 - Tanenbaum & Wetherall

Difficulty: Intermediate
Time estimate: Weekend to 1 week
Prerequisites: Any programming language, basic networking concepts

Learning milestones:

Single node mines blocks - You understand PoW mechanics
Two nodes sync chains - You understand gossip and fork resolution
Transactions propagate and confirm - You understand the mempool and block inclusion

Real World Outcome

When you complete this project, you’ll have a working distributed blockchain running across multiple terminals (or machines). Here’s exactly what you’ll see:

1. Start Your First Node (Terminal 1)

$ ./blockchain-node --port 3000 --mine

╔═══════════════════════════════════════════════════════════════════╗
║               MINIMAL BLOCKCHAIN NODE v1.0                        ║
║               Listening on port 3000                              ║
╚═══════════════════════════════════════════════════════════════════╝

[2024-12-22 14:30:01] Genesis block created
                      Hash: 0000a1b2c3d4e5f6...
                      Difficulty: 4 (4 leading zeros required)

[2024-12-22 14:30:01] Starting mining thread...
[2024-12-22 14:30:01] Mining block 1...
[2024-12-22 14:30:03] Nonce attempt: 1000000
[2024-12-22 14:30:05] Nonce attempt: 2000000
[2024-12-22 14:30:08] ✓ BLOCK MINED!
                      Block #1
                      Hash: 0000f8e7d6c5b4a3...
                      Nonce: 2847291
                      Transactions: 1 (coinbase only)
                      Mining took: 7.2 seconds

[2024-12-22 14:30:08] Mining block 2...

2. Start a Second Node and Watch Synchronization (Terminal 2)

$ ./blockchain-node --port 3001 --peer localhost:3000

╔═══════════════════════════════════════════════════════════════════╗
║               MINIMAL BLOCKCHAIN NODE v1.0                        ║
║               Listening on port 3001                              ║
╚═══════════════════════════════════════════════════════════════════╝

[2024-12-22 14:31:00] Connecting to peer: localhost:3000
[2024-12-22 14:31:00] → Sending: CHAIN_REQUEST
[2024-12-22 14:31:00] ← Received: CHAIN_RESPONSE (3 blocks)
[2024-12-22 14:31:00] ✓ Synchronized with peer
                      Local chain: 3 blocks
                      Peer chain:  3 blocks

[2024-12-22 14:31:00] CURRENT CHAIN STATE:
┌─────────────────────────────────────────────────────────────────┐
│ Block 0 (Genesis)                                               │
│   Hash: 0000a1b2c3d4e5f6789...                                  │
│   Prev: 0000000000000000000...                                  │
│   Txns: 0                                                       │
├─────────────────────────────────────────────────────────────────┤
│ Block 1                                                         │
│   Hash: 0000f8e7d6c5b4a3210...                                  │
│   Prev: 0000a1b2c3d4e5f6789...                                  │
│   Txns: 1                                                       │
├─────────────────────────────────────────────────────────────────┤
│ Block 2                                                         │
│   Hash: 00003c4d5e6f7a8b9c0...                                  │
│   Prev: 0000f8e7d6c5b4a3210...                                  │
│   Txns: 1                                                       │
└─────────────────────────────────────────────────────────────────┘

3. Submit a Transaction and Watch It Propagate

$ ./blockchain-cli --node localhost:3001 send --from alice --to bob --amount 50

TRANSACTION SUBMITTED:
════════════════════════════════════════════════════════════════════
TX ID: tx_7f8a9b0c1d2e3f4a5b6c...
From:  alice
To:    bob
Amount: 50 coins

[2024-12-22 14:32:00] → Broadcasting to 2 connected peers...
[2024-12-22 14:32:00] ✓ Peer localhost:3000 acknowledged
[2024-12-22 14:32:00] ✓ Peer localhost:3002 acknowledged

Status: IN MEMPOOL (waiting for block inclusion)
════════════════════════════════════════════════════════════════════

# On Node 1 (Terminal 1), you'll see:
[2024-12-22 14:32:00] ← Received TX: tx_7f8a9b0c... (alice → bob: 50)
[2024-12-22 14:32:00] ✓ TX validated and added to mempool
[2024-12-22 14:32:00] Mempool size: 1 transaction

# When the block is mined:
[2024-12-22 14:32:15] ✓ BLOCK MINED!
                      Block #4
                      Hash: 0000abc123def456...
                      Transactions: 2 (1 coinbase + 1 user tx)
                      Including: tx_7f8a9b0c... (alice → bob: 50)

# On all nodes:
[2024-12-22 14:32:15] ← Received BLOCK: #4 from peer
[2024-12-22 14:32:15] ✓ Block validated and added to chain
[2024-12-22 14:32:15] TX tx_7f8a9b0c... now has 1 confirmation

4. Watch a Fork Happen and Resolve

# Start two miners, disconnect them, let each mine on its own for a while,
# then reconnect and watch the chain with less work get abandoned:

[2024-12-22 14:35:00] ⚠ FORK DETECTED!
                      Local chain:  blocks 4a → 5a (total work: 12847291)
                      Remote chain: blocks 4b → 5b → 6b (total work: 19283746)

[2024-12-22 14:35:00] Remote chain has more work. REORGANIZING...

[2024-12-22 14:35:00] ✗ Orphaning block 5a (hash: 0000xyz...)
[2024-12-22 14:35:00] ✗ Orphaning block 4a (hash: 0000uvw...)
[2024-12-22 14:35:00] ✓ Adopting block 4b (hash: 0000rst...)
[2024-12-22 14:35:00] ✓ Adopting block 5b (hash: 0000opq...)
[2024-12-22 14:35:00] ✓ Adopting block 6b (hash: 0000lmn...)

[2024-12-22 14:35:00] Reorganization complete. Chain tip is now block 6b.
[2024-12-22 14:35:00] Returning 3 transactions from orphaned blocks to mempool.

5. Query Blockchain State

$ ./blockchain-cli --node localhost:3000 status

BLOCKCHAIN STATUS
════════════════════════════════════════════════════════════════════
Chain height:     12 blocks
Total difficulty: 48 (4 zeros × 12 blocks)
Chain work:       142,847,291 total hash attempts

Connected peers: 3
  - localhost:3001 (height: 12)
  - localhost:3002 (height: 12)
  - 192.168.1.50:3000 (height: 12)

Mempool: 2 pending transactions
  - tx_abc123... (alice → carol: 25)
  - tx_def456... (bob → dave: 10)

ACCOUNT BALANCES:
  alice:  425 coins (from mining + transfers)
  bob:    50 coins
  carol:  0 coins (25 pending)
  miner1: 600 coins (block rewards)
════════════════════════════════════════════════════════════════════

The Core Question You’re Answering

“How do distributed nodes agree on a single version of truth without a central authority? Why can’t someone just create a fake blockchain?”

This is the fundamental question that Bitcoin solved. Before you write any code, understand:

Why does hashing create “chains”? (Each block commits to all previous blocks)
Why does Proof of Work create “irreversibility”? (Rewriting requires redoing all work)
Why does “longest chain wins” create consensus? (Honest majority outpaces attackers)
Why does gossip create “decentralization”? (No single point of failure)
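
The first point (each block commits to everything before it) can be sketched in a few lines. This is a minimal Rust illustration, not a reference design: it uses 64-bit FNV-1a as a stand-in for SHA-256 so that it needs no external crate, and the `Block` fields and function names are made up for the example.

```rust
// Toy stand-in for SHA-256: 64-bit FNV-1a. NOT cryptographically secure;
// it only keeps this sketch free of external crates.
fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    h
}

struct Block {
    index: u32,
    prev_hash: u64,
    data: String,
}

// A block's hash covers its own fields, including the previous block's hash.
fn block_hash(b: &Block) -> u64 {
    let mut bytes = Vec::new();
    bytes.extend_from_slice(&b.index.to_be_bytes());
    bytes.extend_from_slice(&b.prev_hash.to_be_bytes());
    bytes.extend_from_slice(b.data.as_bytes());
    toy_hash(&bytes)
}

// Builds a chain where every block stores the hash of its predecessor.
fn build_chain(payloads: &[&str]) -> Vec<Block> {
    let mut chain: Vec<Block> = Vec::new();
    for (i, p) in payloads.iter().enumerate() {
        let prev_hash = chain.last().map(block_hash).unwrap_or(0);
        chain.push(Block { index: i as u32, prev_hash, data: p.to_string() });
    }
    chain
}

// Returns the index of the first block whose prev_hash no longer matches the
// recomputed hash of its predecessor, or None if every link holds.
fn first_broken_link(chain: &[Block]) -> Option<usize> {
    (1..chain.len()).find(|&i| chain[i].prev_hash != block_hash(&chain[i - 1]))
}
```

Editing the data of block 1 in a three-block chain makes `first_broken_link` report block 2: the tampered block still "looks" fine on its own, but its successor now points at a hash that no longer exists. Proof of Work is what makes repairing that link expensive.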

Concepts You Must Understand First

Stop and research these before coding:

Cryptographic Hash Functions
- What makes SHA-256 “secure”? (Preimage resistance, collision resistance)
- Why does changing one bit change the entire hash? (Avalanche effect)
- How do you verify data integrity with a hash?
- Book Reference: “Serious Cryptography, 2nd Edition” Ch. 6 - Jean-Philippe Aumasson
Linked Data Structures via Hashes
- How does including the previous hash in each block create a “chain”?
- Why can’t you modify a block in the middle without invalidating everything after?
- What is a “hash pointer” and how is it different from a regular pointer?
- Book Reference: “Mastering Bitcoin” Ch. 11 - Andreas Antonopoulos
Proof of Work
- What is a “target” and what does it mean for a hash to be “below” it?
- Why is finding a valid nonce hard but verifying it easy?
- How does difficulty adjustment maintain consistent block times?
- What is a “nonce” and why does incrementing it change the hash?
- Book Reference: “Programming Bitcoin” Ch. 9 - Jimmy Song
Consensus and Fork Resolution
- What happens when two miners find valid blocks at the same time?
- Why does “longest chain” (or “most work”) win?
- What is “reorganization” and when does it happen?
- What is the difference between a soft fork and a hard fork?
- Book Reference: “Designing Data-Intensive Applications” Ch. 9 - Martin Kleppmann
P2P Networking and Gossip
- How do nodes discover each other without a central server?
- What is a “gossip protocol” and why is it resilient?
- How do you prevent message loops in a mesh network?
- What is the difference between push and pull gossip?
- Book Reference: “Computer Networks” Ch. 5 - Tanenbaum & Wetherall
Transaction Pools (Mempools)
- What is a mempool and why is it needed?
- How do you validate a transaction before including it?
- How do miners choose which transactions to include?
- What happens to transactions in orphaned blocks?
- Book Reference: “Mastering Bitcoin” Ch. 6 and 10 - Antonopoulos

Questions to Guide Your Design

Before implementing, think through these:

Block Structure
- What fields must a block have? (index, timestamp, transactions, prev_hash, nonce, hash)
- How will you serialize a block for hashing?
- How will you store blocks? (In-memory array? File? Database?)
- How will you handle the genesis block (no previous hash)?
Proof of Work Mining
- How will you represent the difficulty target?
- How will you increment the nonce?
- Should mining run in a separate thread?
- How do you stop mining when you receive a valid block from a peer?
Networking
- What message types do you need? (NEW_BLOCK, NEW_TX, GET_CHAIN, CHAIN_RESPONSE, etc.)
- How will you serialize messages? (JSON? Binary? Custom?)
- How will you handle partial reads from sockets?
- How will you prevent infinite message forwarding?
Transaction Validation
- How will you track account balances? (UTXO vs account model)
- How will you prevent double-spending?
- What happens if a transaction references a balance from an unconfirmed transaction?
- How will you handle coinbase (block reward) transactions?
Fork Resolution
- How will you compare two chains to decide which is “better”?
- What do you do when you receive a block that doesn’t extend your chain?
- How will you request missing blocks from peers?
- How will you return orphaned transactions to the mempool?
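
For the fork-resolution questions, one possible shape for the reorganization step is sketched below in Rust. It assumes the caller has already validated the remote chain and decided it carries more work; the `Block` struct, the `block` helper, and the string transaction IDs are simplifications invented for this example.

```rust
use std::collections::HashSet;

#[derive(Clone, Debug, PartialEq)]
struct Block {
    hash: String,
    txs: Vec<String>,
}

// Small constructor used only to keep examples short.
fn block(hash: &str, txs: &[&str]) -> Block {
    Block {
        hash: hash.to_string(),
        txs: txs.iter().map(|t| t.to_string()).collect(),
    }
}

// Number of leading blocks the two chains have in common.
fn fork_point(local: &[Block], remote: &[Block]) -> usize {
    local
        .iter()
        .zip(remote.iter())
        .take_while(|(a, b)| a.hash == b.hash)
        .count()
}

// Replaces the local tail with the remote tail and returns the transactions
// that must go back to the mempool: those in orphaned blocks that the
// adopted blocks did not also confirm.
fn reorganize(local: &mut Vec<Block>, remote: &[Block]) -> Vec<String> {
    let split = fork_point(local.as_slice(), remote);
    let orphaned: Vec<Block> = local.drain(split..).collect();
    local.extend_from_slice(&remote[split..]);
    let confirmed: HashSet<&String> =
        local[split..].iter().flat_map(|b| b.txs.iter()).collect();
    orphaned
        .iter()
        .flat_map(|b| b.txs.iter())
        .filter(|tx| !confirmed.contains(tx))
        .cloned()
        .collect()
}
```

With a local chain `b3 → 4a → 5a` and a remote chain `b3 → 4b → 5b → 6b`, blocks 4a and 5a are orphaned and any of their transactions not present in 4b, 5b, or 6b are returned for re-mining. A transaction that double-spends against the adopted chain would still need to be rejected when it re-enters the mempool.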

Thinking Exercise

Before coding, trace this scenario on paper:

Scenario: Three nodes, one malicious

Time T0: All nodes have chain: [Block 0] → [Block 1] → [Block 2]
         Block 2 contains: alice → bob: 50 coins

Time T1: Node A (malicious) disconnects from network
         Node A starts mining an alternate Block 2':
           Block 2': alice → mallory: 50 coins (double-spend attempt!)

Time T2: Node A mines Block 2' and Block 3' (2 blocks deep)
         Meanwhile, honest network mines only Block 3

Time T3: Node A reconnects with chain: [B0] → [B1] → [B2'] → [B3']
         Honest nodes have: [B0] → [B1] → [B2] → [B3]

Questions:
1. Which chain "wins"? Why?
2. What happens to the transaction "alice → bob: 50"?
3. What would Mallory need to accomplish this attack?
4. How does this relate to "6 confirmations" recommendation?

Draw the fork diagram and trace the resolution!

The Interview Questions They’ll Ask

Prepare to answer these:

“What prevents someone from rewriting blockchain history?”
“Why does Proof of Work use so much energy? Is there an alternative?”
“What happens when two miners find a block at the same time?”
“How does a new node synchronize with the network?”
“What is a 51% attack and is it actually feasible?”
“Why do Bitcoin transactions need multiple confirmations?”
“How does gossip protocol handle network partitions?”
“What’s the difference between your minimal blockchain and Bitcoin?”
“How would you add transaction fees to your implementation?”
“What are the trade-offs between short and long block times?”

Hints in Layers

Hint 1: Start with a Single-Node Chain

Get this working first before any networking:

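#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct Block {
    uint32_t index;
    uint32_t timestamp;
    char prev_hash[65];  // SHA-256 hex string (64 chars + NUL)
    char hash[65];
    uint32_t nonce;
    char data[1024];     // Simplified: just a string
} Block;

// Placeholder you supply (e.g. a wrapper around a SHA-256 library). Assumed to
// return a NUL-terminated 64-char hex digest that stays valid until the next call.
char* sha256(const char* input);

char* calculate_hash(Block* block) {
    char buffer[2048];
    // '|' separators keep adjacent fields from running together
    // (index=1, timestamp=23 must not hash like index=12, timestamp=3)
    snprintf(buffer, sizeof(buffer),
             "%" PRIu32 "|%" PRIu32 "|%s|%" PRIu32 "|%s",
             block->index, block->timestamp,
             block->prev_hash, block->nonce, block->data);
    return sha256(buffer);
}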

bool is_valid_hash(char* hash, int difficulty) {
    for (int i = 0; i < difficulty; i++) {
        if (hash[i] != '0') return false;
    }
    return true;
}

Hint 2: Mining Loop Pattern

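void mine_block(Block* block, int difficulty) {
    block->nonce = 0;
    while (true) {
        char* hash = calculate_hash(block);
        if (is_valid_hash(hash, difficulty)) {
            // Bounded copy: hash is a 64-char hex string plus NUL
            snprintf(block->hash, sizeof(block->hash), "%s", hash);
            return;
        }
        block->nonce++;
        if (block->nonce == 0) {
            // Nonce wrapped around after 2^32 attempts: bump the timestamp
            // so the next pass searches a fresh set of hashes
            block->timestamp++;
        }
        if (block->nonce % 100000 == 0) {
            printf("Nonce: %u, Hash: %.16s...\n", (unsigned)block->nonce, hash);
        }
    }
}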

Hint 3: Simple TCP Message Protocol

Message format:
[4 bytes: message type] [4 bytes: payload length] [N bytes: payload]

Message types:
0x01: NEW_BLOCK
0x02: NEW_TRANSACTION
0x03: REQUEST_CHAIN
0x04: CHAIN_RESPONSE
0x05: PEER_ANNOUNCE
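
TCP delivers a byte stream, not messages, so a reader has to cope with frames that arrive in pieces. The sketch below (in Rust, though the logic carries over to C unchanged) encodes and decodes the header layout above. It assumes big-endian (network byte order) header fields and a 1 MiB payload cap; both are choices made for this example, as are the function names.

```rust
const HEADER_LEN: usize = 8; // 4 bytes type + 4 bytes payload length
const MAX_PAYLOAD: usize = 1 << 20; // assumed cap: reject anything above 1 MiB

// Serializes one message as [type][length][payload], header in big-endian.
fn encode_frame(msg_type: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&msg_type.to_be_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

// Tries to pull one complete frame off the front of `buf`.
// Ok(None)  => not enough bytes yet; keep reading from the socket.
// Ok(Some)  => one frame decoded and removed from `buf`.
// Err       => the peer announced an oversized payload; drop the connection.
fn try_decode_frame(buf: &mut Vec<u8>) -> Result<Option<(u32, Vec<u8>)>, &'static str> {
    if buf.len() < HEADER_LEN {
        return Ok(None);
    }
    let msg_type = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let len = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]) as usize;
    if len > MAX_PAYLOAD {
        return Err("payload too large");
    }
    if buf.len() < HEADER_LEN + len {
        return Ok(None);
    }
    let payload = buf[HEADER_LEN..HEADER_LEN + len].to_vec();
    buf.drain(..HEADER_LEN + len);
    Ok(Some((msg_type, payload)))
}
```

The usual pattern is one growable buffer per peer: append whatever `recv` returns, then call `try_decode_frame` in a loop until it yields `Ok(None)`. Checking the announced length before allocating is what keeps a malicious peer from asking for a multi-gigabyte buffer.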

Hint 4: Chain Comparison

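// Returns 1 if chain_a has more work, -1 if chain_b does, 0 if equal.
// Assumes Block gains a `uint32_t difficulty` field (leading hex zeros required
// when the block was mined) and that difficulty stays below 16.
int compare_chains(Block* chain_a, int len_a, Block* chain_b, int len_b) {
    // A nonce is not a measure of work: a miner can start counting anywhere.
    // Expected attempts for d leading hex zeros is 16^d, i.e. 1 << (4 * d).
    uint64_t work_a = 0, work_b = 0;
    for (int i = 0; i < len_a; i++) work_a += 1ULL << (4 * chain_a[i].difficulty);
    for (int i = 0; i < len_b; i++) work_b += 1ULL << (4 * chain_b[i].difficulty);
    return (work_a > work_b) ? 1 : (work_a < work_b) ? -1 : 0;
}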

Hint 5: Use select() for Multi-Connection Handling

fd_set read_fds;
FD_ZERO(&read_fds);
FD_SET(server_socket, &read_fds);
for (int i = 0; i < num_peers; i++) {
    FD_SET(peer_sockets[i], &read_fds);
}
select(max_fd + 1, &read_fds, NULL, NULL, &timeout);

Books That Will Help

Topic	Book	Chapter
Hash function fundamentals	Serious Cryptography, 2nd Edition by Jean-Philippe Aumasson	Ch. 6: “Hash Functions”
Blockchain structure	Mastering Bitcoin, 3rd Edition by Andreas Antonopoulos	Ch. 11: “The Blockchain”
Proof of Work mining	Programming Bitcoin by Jimmy Song	Ch. 9: “Blocks”
Distributed consensus theory	Designing Data-Intensive Applications by Martin Kleppmann	Ch. 8-9: “The Trouble with Distributed Systems” and “Consistency and Consensus”
Network programming in C	The Linux Programming Interface by Michael Kerrisk	Ch. 56-61: “Sockets”
P2P protocols	Computer Networks by Tanenbaum & Wetherall	Ch. 5: “Network Layer”
Bitcoin P2P specifics	Mastering Bitcoin, 3rd Edition	Ch. 10: “The Bitcoin Network”

Common Pitfalls & Debugging

Problem 1: “Nodes connect but chains don’t sync”

Why: You’re likely sending chain data but not validating before accepting
Fix: Before adding received blocks: (1) Verify hash matches target, (2) Check previous_hash exists in your chain, (3) Validate all transactions

Quick test:

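// Validation checklist (needs <string.h> for strcmp)
bool validate_block(Block *block) {
    // Recompute the hash from the block's contents; never trust the hash a peer sent
    if (strcmp(calculate_hash(block), block->hash) != 0) return false;
    if (!hash_meets_target(block->hash, current_difficulty)) return false;
    if (!find_block_by_hash(block->prev_hash)) return false;
    // Also validate every transaction in the block (omitted in this sketch)
    return true;
}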

Problem 2: “Fork resolution chooses wrong chain”

Why: “Longest chain wins” means chain with most cumulative work, not most blocks
Fix: Track cumulative difficulty, not just block count

Quick test:

// Wrong: return longest_chain_by_count();
// Right: return chain_with_most_work();  // Sum of all difficulties
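
To make the difference concrete, here is a small Rust sketch in which a shorter chain wins because its blocks were harder to mine. It assumes difficulty is expressed as a count of leading hex zeros (as in the sample output earlier) and stays at 31 or below so the arithmetic fits in a `u128`; the function names are illustrative.

```rust
// Expected number of hash attempts to find a block with `zero_hex_digits`
// leading hex zeros: each extra zero multiplies the search space by 16.
// Assumes zero_hex_digits <= 31 so the result fits in a u128.
fn block_work(zero_hex_digits: u32) -> u128 {
    16u128.pow(zero_hex_digits)
}

// Cumulative work of a chain, given the difficulty of each block.
fn chain_work(difficulties: &[u32]) -> u128 {
    difficulties.iter().map(|&d| block_work(d)).sum()
}

// True if the candidate chain should replace the current one.
// Ties keep the current chain (first seen wins).
fn should_reorganize(current: &[u32], candidate: &[u32]) -> bool {
    chain_work(candidate) > chain_work(current)
}
```

Five blocks at difficulty 4 add up to 327,680 expected attempts, while four blocks at difficulties 4, 4, 5, 5 add up to 2,228,224, so the four-block chain is the one to keep.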

Problem 3: “Mining never finds a valid hash”

Why: Off-by-one error in difficulty check, or endianness issues
Fix: Target is a large number. Block hash must be numerically less than target

Debug:

printf("Block hash: %s\n", hash_to_hex(block.hash));
printf("Target:     %s\n", target_to_hex(current_target));
// Hash should be smaller (more leading zeros)
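
One way to sidestep both the off-by-one and the endianness traps is to keep the hash and the target as 32-byte big-endian arrays and compare them directly. The Rust sketch below does that; the helper names and the "leading zero bits" way of building a target are assumptions made for the example, not part of any library.

```rust
// Builds a 32-byte big-endian target that requires at least `zero_bits`
// leading zero bits: those bits are 0 and every remaining bit is 1.
fn target_from_leading_zero_bits(zero_bits: u32) -> [u8; 32] {
    let mut target = [0xffu8; 32];
    let full_bytes = (zero_bits / 8) as usize;
    let rem_bits = zero_bits % 8;
    for byte in target.iter_mut().take(full_bytes.min(32)) {
        *byte = 0;
    }
    if full_bytes < 32 {
        target[full_bytes] = 0xffu8 >> rem_bits;
    }
    target
}

// For big-endian byte arrays of equal length, lexicographic order is the same
// as numeric order, so the built-in array comparison does the right thing.
fn meets_target(hash: &[u8; 32], target: &[u8; 32]) -> bool {
    hash <= target
}
```

If your SHA-256 library hands back the digest in a different byte order, or you compare hex strings of different lengths, this comparison silently breaks, which is one common cause of "mining never finds a valid hash."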

Problem 4: “Nodes see different transaction states”

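Why: Race condition: a transaction is dropped from the mempool as soon as it lands in a block, and is lost if that block is later orphaned before every node has seen it
Fix: Keep a record of recently included transactions until they have N confirmations (6 is the usual Bitcoin rule of thumb), and return them to the mempool if their block is orphaned
Quick test: Submit the same transaction to 2 nodes simultaneously and ensure every node ends up with exactly one copy of it, in the mempool or in a block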

Problem 5: “P2P gossip creates message storms”

Why: Forwarding messages to sender, or no duplicate detection

Fix:

// Track seen messages
HashSet *seen_messages = hashset_new();

void on_message(Message *msg, Peer *from_peer) {
    if (hashset_contains(seen_messages, msg->id)) return; // Already seen
    hashset_add(seen_messages, msg->id);
    broadcast_to_peers_except(msg, from_peer); // Don't send back to sender
}

Problem 6: “Memory leak when nodes disconnect/reconnect”

Why: Not freeing peer connection structures

Fix: Use valgrind to find leaks:

valgrind --leak-check=full ./blockchain-node
# Fix all "definitely lost" blocks before continuing

Problem 7: “select() or epoll() returns but no data to read”

Why: Peer disconnected (EOF), or spurious wakeup

Fix:

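// Needs <errno.h>, <stdio.h>, and <sys/socket.h>
ssize_t bytes = recv(socket, buffer, sizeof(buffer), 0);
if (bytes == 0) {
    // Connection closed gracefully
    remove_peer(socket);
} else if (bytes < 0) {
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) {
        // Spurious wakeup or interrupted call: nothing to read yet, keep the peer
    } else {
        // Real error
        perror("recv");
        remove_peer(socket);
    }
}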

Problem 8: “Difficulty adjustment causes sudden jumps”

Why: Adjusting too frequently or using wrong formula
Fix: Bitcoin adjusts every 2016 blocks. New target = old target × (actual time / expected time)
Quick test: If blocks came too fast, difficulty should increase (target decreases)
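
The retargeting formula can be sketched as follows. This Rust example uses a `u64` as a toy target (a real target is 256 bits) and clamps the adjustment to a factor of 4 in either direction, as Bitcoin does, so a single odd period cannot cause a sudden jump. The function name is hypothetical and `expected_secs` is assumed to be non-zero.

```rust
// new_target = old_target * actual / expected, with the measured time clamped
// to [expected / 4, expected * 4] so one period can move the target by at
// most a factor of 4 either way. Assumes expected_secs > 0.
fn retarget(old_target: u64, actual_secs: u64, expected_secs: u64) -> u64 {
    let clamped = actual_secs.clamp(expected_secs / 4, expected_secs * 4);
    // Widen to u128 so the multiplication cannot overflow.
    let new_target = (old_target as u128) * (clamped as u128) / (expected_secs as u128);
    new_target.min(u64::MAX as u128) as u64
}
```

With Bitcoin's parameters the expected time for 2016 blocks at 10 minutes each is 1,209,600 seconds. If the period took only half that, `retarget` halves the target (doubling the difficulty); if it took 100 seconds, the clamp limits the change to a quarter of the old target rather than letting it collapse.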

Debugging Strategy:

Single node first: Get mining working before adding P2P
Two nodes: Test sync, then fork resolution
Network partition test: Disconnect nodes, mine on both, reconnect, verify the chain with the most cumulative work wins
Print everything: In development, log every message sent/received
Wireshark: Capture and inspect actual P2P traffic

Common gotchas:

Endianness: Network byte order (big-endian) vs. host order
Buffer overflows: Always check message size before parsing
Race conditions: Use locks when multiple threads access shared state

Project 3: Build the Ethereum Virtual Machine (EVM) From Scratch

File: BLOCKCHAIN_BITCOIN_ETHEREUM_LEARNING_PROJECTS.md
Main Programming Language: Rust
Alternative Programming Languages: Go, TypeScript, Python
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: Level 1: The “Resume Gold”
Difficulty: Level 4: Expert (The Systems Architect)
Knowledge Area: Virtual Machines, Blockchain
Software or Tool: EVM, Ethereum, Bytecode
Main Book: “Crafting Interpreters” by Robert Nystrom

What you’ll build: A stack-based virtual machine that executes Ethereum bytecode—implementing opcodes like PUSH, ADD, SSTORE, CALL, and the gas metering system.

Why it teaches Ethereum: The EVM is Ethereum’s brain. Smart contracts compile to bytecode that the EVM executes. Building it yourself reveals why gas exists, how storage works, why reentrancy attacks happen, and what makes smart contracts “smart.”

Core challenges you’ll face:

Stack machine architecture → Maps to how computation is expressed
256-bit word operations → Maps to why Solidity uses uint256
Gas accounting → Maps to preventing infinite loops and spam
Storage (SLOAD/SSTORE) → Maps to contract state persistence
CALL semantics → Maps to contract-to-contract interaction
Memory vs storage vs stack → Maps to Solidity optimization

Resources for key challenges:

EVM From Scratch Course by W1nt3r.eth - 116 progressive tests to pass
EVM From Scratch Book - Jupyter notebooks building step-by-step
“Mastering Ethereum” Ch. 13 - Authoritative EVM reference

Key Concepts:

Stack Machines: “Computer Systems: A Programmer’s Perspective” Ch. 3 - Bryant & O’Hallaron
Virtual Machine Design: “Crafting Interpreters” - Robert Nystrom (free online)
EVM Opcodes: Ethereum Yellow Paper (formal specification)

Difficulty: Advanced
Time estimate: 2-4 weeks
Prerequisites: Understanding of stack-based computation, any systems language

Learning milestones:

Arithmetic opcodes work - You understand the stack model
Control flow (JUMP/JUMPI) works - You understand how loops and conditionals compile
Storage operations work - You understand contract state
CALL works - You understand contract interaction and reentrancy

Real World Outcome

When you complete this project, you’ll have a working EVM that can execute real Solidity bytecode. Here’s exactly what you’ll be able to do:

1. Execute and Trace Simple Bytecode

$ ./evm-cli run --bytecode "6005600401" --trace

EVM EXECUTION TRACE
════════════════════════════════════════════════════════════════════
Bytecode: 60 05 60 04 01
          │  │  │  │  │
          │  │  │  │  └─ ADD (0x01)
          │  │  │  └──── 0x04 (data for PUSH1)
          │  │  └─────── PUSH1 (0x60)
          │  └────────── 0x05 (data for PUSH1)
          └───────────── PUSH1 (0x60)

Step 1: PUSH1 0x05
        PC: 0 → 2
        Gas: 3 consumed (2999997 remaining)
        Stack: [] → [5]

Step 2: PUSH1 0x04
        PC: 2 → 4
        Gas: 3 consumed (2999994 remaining)
        Stack: [5] → [5, 4]

Step 3: ADD
        PC: 4 → 5
        Gas: 3 consumed (2999991 remaining)
        Stack: [5, 4] → [9]

EXECUTION COMPLETE
════════════════════════════════════════════════════════════════════
Result:  0x09 (9 in decimal)
Gas used: 9
Status:  SUCCESS
════════════════════════════════════════════════════════════════════

2. Execute Real Compiled Solidity Contracts

# First, compile a simple Solidity contract:
# contract Counter {
#     uint256 count;
#     function increment() public { count += 1; }
#     function get() public view returns (uint256) { return count; }
# }

$ ./evm-cli deploy --bytecode "608060405234801561001057600080fd5b50610..."

CONTRACT DEPLOYED
════════════════════════════════════════════════════════════════════
Contract Address: 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
Bytecode Size:    245 bytes
Gas Used:         127,543
════════════════════════════════════════════════════════════════════

# 0xd09de08a is the increment() function selector
$ ./evm-cli call --to 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4 \
                 --data "0xd09de08a" \
                 --trace

FUNCTION CALL: increment()
════════════════════════════════════════════════════════════════════
Step 1:  PUSH1 0x80           Stack: [128]
Step 2:  PUSH1 0x40           Stack: [128, 64]
Step 3:  MSTORE               Stack: []                Memory[64] = 128
...
Step 45: SLOAD                Stack: [0]               (load count from slot 0)
Step 46: PUSH1 0x01           Stack: [0, 1]
Step 47: ADD                  Stack: [1]
Step 48: PUSH1 0x00           Stack: [1, 0]
Step 49: SSTORE               Stack: []               Storage[0] = 1
...
Step 62: RETURN

EXECUTION COMPLETE
════════════════════════════════════════════════════════════════════
Storage Changes:
  Slot 0x00: 0x00 → 0x01  (count incremented!)
Gas Used: 43,291
Status: SUCCESS
════════════════════════════════════════════════════════════════════

$ ./evm-cli call --to 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4 \
                 --data "0x6d4ce63c"  # get() function selector

Return Value: 0x0000000000000000000000000000000000000000000000000000000000000001
Decoded:      uint256 = 1
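
The `--data` values in these calls are function selectors: the first 4 bytes of the Keccak-256 hash of the function signature. Compiled contracts begin with a dispatcher that reads those 4 bytes from calldata and jumps to the matching function. A rough Rust sketch of that dispatch step, using the two selectors shown above (the `Call` enum and function names are invented for the example):

```rust
#[derive(Debug, PartialEq)]
enum Call {
    Increment,
    Get,
    Unknown(u32),
}

// Reads the first 4 bytes of calldata as a big-endian selector.
// Returns None if calldata is shorter than 4 bytes.
fn selector(calldata: &[u8]) -> Option<u32> {
    let b = calldata.get(..4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

// Maps a selector to the Counter contract's functions.
fn dispatch(calldata: &[u8]) -> Option<Call> {
    Some(match selector(calldata)? {
        0xd09de08a => Call::Increment, // increment()
        0x6d4ce63c => Call::Get,       // get()
        other => Call::Unknown(other),
    })
}
```

In real bytecode the `Unknown` and too-short cases typically end in a revert unless the contract defines a fallback function; watching for this PUSH4/EQ/JUMPI pattern in your trace output is a good sanity check that your EVM handles compiled Solidity correctly.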

3. Debug Reentrancy Vulnerabilities

# Deploy vulnerable contract and attacker contract, then trace the attack:

$ ./evm-cli call --to 0xVulnerable... --data "0x..." --trace

REENTRANCY ATTACK DETECTED
════════════════════════════════════════════════════════════════════
Call Depth: 0 → Vulnerable.withdraw()
  Step 15: SLOAD      balance[attacker] = 1 ETH
  Step 20: CALL       → Attacker.fallback() with 1 ETH

  Call Depth: 1 → Attacker.fallback()
    Step 5: CALL      → Vulnerable.withdraw() [REENTRANT!]

    Call Depth: 2 → Vulnerable.withdraw()
      Step 15: SLOAD   balance[attacker] = 1 ETH (NOT YET UPDATED!)
      Step 20: CALL    → Attacker.fallback() with 1 ETH

      Call Depth: 3 → Attacker.fallback()
        ... (continues until gas exhausted or balance drained)

⚠ VULNERABILITY: State update (SSTORE) happens AFTER external call (CALL)
⚠ FIX: Use checks-effects-interactions pattern
════════════════════════════════════════════════════════════════════
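
The ordering problem in that trace can be reproduced without an EVM at all. The Rust sketch below is a toy model, not contract code: `Vault` stands in for the vulnerable contract, and recursion stands in for the attacker's fallback calling `withdraw()` again. All names are invented for the example.

```rust
struct Vault {
    pool: u64,     // total ETH held by the contract
    credited: u64, // the attacker's recorded balance (the storage slot)
}

// Vulnerable ordering: the balance is zeroed AFTER the external call.
fn withdraw_vulnerable(v: &mut Vault, reentries_left: u32) -> u64 {
    let amount = v.credited; // SLOAD
    if amount == 0 || v.pool < amount {
        return 0;
    }
    v.pool -= amount; // CALL transfers the value...
    let mut paid = amount;
    if reentries_left > 0 {
        // ...and the attacker's fallback re-enters before the balance changes
        paid += withdraw_vulnerable(v, reentries_left - 1);
    }
    v.credited = 0; // SSTORE, too late
    paid
}

// Checks-effects-interactions: update state first, then make the call.
fn withdraw_safe(v: &mut Vault, reentries_left: u32) -> u64 {
    let amount = v.credited;
    if amount == 0 || v.pool < amount {
        return 0;
    }
    v.credited = 0; // effect first
    v.pool -= amount; // interaction last
    let mut paid = amount;
    if reentries_left > 0 {
        paid += withdraw_safe(v, reentries_left - 1); // re-entry now sees 0
    }
    paid
}
```

Starting from a pool of 10 and a credited balance of 1, four re-entries drain 5 from the vulnerable version but only 1 from the safe one, because the re-entrant call reads a balance that has already been cleared.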

4. Inspect Memory and Storage State

$ ./evm-cli debug --to 0x... --data "0x..."

EVM DEBUGGER
════════════════════════════════════════════════════════════════════
(evm) step
PC: 0x0A | Opcode: MSTORE | Gas: 2999991

(evm) stack
Stack (3 items):
  [0] 0x0000...0080 (128)
  [1] 0x0000...0040 (64)
  [2] 0x0000...0001 (1)

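(evm) memory 0 128
Memory (128 bytes):
0x00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80
                                                    ^^
                  Free memory pointer (32-byte word at 0x40-0x5F)
0x60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00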

(evm) storage
Storage (2 slots):
  Slot 0x00: 0x0000...0005 (count = 5)
  Slot 0x01: 0x0000...owner_address... (owner)

(evm) continue
════════════════════════════════════════════════════════════════════

5. Run the EVM From Scratch Test Suite

$ cargo test

running 116 tests
test evm::tests::test_stop ... ok
test evm::tests::test_add ... ok
test evm::tests::test_mul ... ok
test evm::tests::test_sub ... ok
test evm::tests::test_div ... ok
test evm::tests::test_sdiv ... ok
test evm::tests::test_mod ... ok
...
test evm::tests::test_push1 ... ok
test evm::tests::test_push32 ... ok
test evm::tests::test_dup1 ... ok
...
test evm::tests::test_mstore ... ok
test evm::tests::test_mload ... ok
test evm::tests::test_sstore ... ok
test evm::tests::test_sload ... ok
...
test evm::tests::test_jump ... ok
test evm::tests::test_jumpi ... ok
test evm::tests::test_call ... ok
test evm::tests::test_delegatecall ... ok
test evm::tests::test_create ... ok
test evm::tests::test_selfdestruct ... ok

test result: ok. 116 passed; 0 failed

The Core Question You’re Answering

“How does a blockchain execute code? What actually happens when you call a smart contract function, and why do some operations cost more gas than others?”

Before writing code, understand that the EVM is fundamentally just a stack machine with three areas of data access:

Stack: Fast, temporary, cheap (1024 elements max, 256-bit words)
Memory: Volatile byte array, grows during execution, quadratic cost
Storage: Persistent key-value store, survives transactions, very expensive

Most well-known smart contract vulnerabilities (reentrancy, integer overflow, access control mistakes) trace back to how the EVM executes bytecode. Understanding the EVM means understanding why contracts behave (and misbehave) the way they do.

Concepts You Must Understand First

Stop and research these before coding:

Stack Machine Architecture
- What is a stack and why is LIFO important?
- How do you express a + b * c using only stack operations?
- What is “reverse Polish notation”?
- How is a stack machine different from a register machine?
- Book Reference: “Crafting Interpreters” Ch. 15 - Robert Nystrom
256-bit Integer Arithmetic
- Why did Ethereum choose 256-bit words? (Matches cryptographic primitives)
- How do you represent negative numbers? (Two’s complement)
- What is “signed” vs “unsigned” division in the EVM?
- How do you handle overflow? (EVM wraps around!)
- Book Reference: “Computer Systems: A Programmer’s Perspective” Ch. 2 - Bryant & O’Hallaron
Bytecode and Opcodes
- What is an “opcode” and why is it one byte?
- How does the program counter (PC) advance?
- What is the difference between PUSH1 and PUSH32?
- Why are some opcodes followed by immediate data?
- Reference: evm.codes - Interactive opcode reference
EVM Memory Model
- What is the difference between stack, memory, and storage?
- Why does memory cost grow quadratically?
- What is the “free memory pointer” at address 0x40?
- How are dynamic arrays stored in memory vs storage?
- Book Reference: “Mastering Ethereum” Ch. 13 - Antonopoulos & Wood
Gas and Execution Limits
- Why does each opcode have a gas cost?
- What is “gas limit” vs “gas price”?
- How does gas prevent infinite loops?
- Why does SSTORE cost so much more than ADD?
- Reference: Ethereum Yellow Paper Appendix G
Call Semantics (CALL, DELEGATECALL, STATICCALL)
- What is a “message call” and how does it create a new execution context?
- How does DELEGATECALL preserve msg.sender and storage context?
- What happens to gas during a call?
- What is the “call depth limit” and why does it exist?
- Book Reference: “Mastering Ethereum” Ch. 13 - Antonopoulos & Wood
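
As a warm-up for the stack-machine questions, here is how `a + b * c` looks once it is flattened into stack operations. This is a deliberately tiny Rust sketch with `u64` values and a made-up `Op` enum, not EVM bytecode.

```rust
enum Op {
    Push(u64),
    Add,
    Mul,
}

// Pops the two topmost values; the first one returned was on top.
fn pop_two(stack: &mut Vec<u64>) -> Result<(u64, u64), &'static str> {
    let top = stack.pop().ok_or("stack underflow")?;
    let next = stack.pop().ok_or("stack underflow")?;
    Ok((top, next))
}

// Runs a program and returns whatever is left on top of the stack.
fn run(program: &[Op]) -> Result<u64, &'static str> {
    let mut stack: Vec<u64> = Vec::new();
    for op in program {
        match op {
            Op::Push(v) => stack.push(*v),
            Op::Add => {
                let (x, y) = pop_two(&mut stack)?;
                stack.push(x.wrapping_add(y));
            }
            Op::Mul => {
                let (x, y) = pop_two(&mut stack)?;
                stack.push(x.wrapping_mul(y));
            }
        }
    }
    stack.pop().ok_or("empty stack")
}

// a + b * c in reverse Polish notation is: a b c * +
// The multiplication happens first because its operands are on top.
fn a_plus_b_times_c(a: u64, b: u64, c: u64) -> Vec<Op> {
    vec![Op::Push(a), Op::Push(b), Op::Push(c), Op::Mul, Op::Add]
}
```

Operator precedence disappears entirely: the order of instructions encodes it. A register machine would instead name its operands explicitly, for example `mul r1, b, c` followed by `add r0, a, r1`.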

Questions to Guide Your Design

Before implementing, think through these:

Data Representation
- How will you represent 256-bit integers? (BigInt library? Fixed array?)
- How will you handle the stack? (Vec/Array with push/pop?)
- How will you store memory? (Byte array that grows as needed?)
- How will you store storage? (HashMap of slot → value?)
Opcode Dispatch
- How will you map opcode bytes to handler functions?
- How will you handle PUSHn opcodes (which read n bytes of immediate data)?
- How will you implement DUPn and SWAPn (parameterized by n)?
- Should you use a switch statement, function table, or trait objects?
Execution Context
- What state do you need to track? (PC, stack, memory, storage, gas, etc.)
- How will you handle nested calls? (New context per call?)
- How will you pass msg.sender, msg.value, calldata?
- How will you handle return data from calls?
Gas Accounting
- Where in your code will you deduct gas?
- How will you handle “out of gas” mid-execution?
- How will you calculate dynamic gas costs (memory expansion)?
- How will you handle gas refunds (SSTORE clearing)?
Control Flow
- How will you validate JUMP destinations (must be JUMPDEST)?
- How will you handle STOP, RETURN, REVERT, INVALID?
- How will you implement conditionals (JUMPI)?
- How will you detect infinite loops (gas exhaustion)?

Thinking Exercise

Before coding, trace this bytecode by hand:

Bytecode: 60 03 60 05 01 60 02 02 60 00 52 60 20 60 00 f3

Disassembly:
00: PUSH1 0x03      Push 3 onto stack
02: PUSH1 0x05      Push 5 onto stack
04: ADD             Pop 2, push sum (3+5=8)
05: PUSH1 0x02      Push 2 onto stack
07: MUL             Pop 2, push product (8*2=16)
08: PUSH1 0x00      Push 0 onto stack
0A: MSTORE          Store 16 at memory[0:32]
0B: PUSH1 0x20      Push 32 onto stack
0D: PUSH1 0x00      Push 0 onto stack
0F: RETURN          Return memory[0:32]

Trace:
Step 1: PUSH1 0x03    Stack: [3]                Memory: empty
Step 2: PUSH1 0x05    Stack: [3, 5]             Memory: empty
Step 3: ADD           Stack: [8]                Memory: empty
Step 4: PUSH1 0x02    Stack: [8, 2]             Memory: empty
Step 5: MUL           Stack: [16]               Memory: empty
Step 6: PUSH1 0x00    Stack: [16, 0]            Memory: empty
Step 7: MSTORE        Stack: []                 Memory[0:32] = 0x...0010 (16)
Step 8: PUSH1 0x20    Stack: [32]               Memory[0:32] = 0x...0010
Step 9: PUSH1 0x00    Stack: [32, 0]            Memory[0:32] = 0x...0010
Step 10: RETURN       Return 32 bytes from memory offset 0

Result: 0x0000000000000000000000000000000000000000000000000000000000000010
        = 16 in decimal

Question: What is the total gas cost of this execution?

PUSH1: 3 gas × 6 = 18 gas
ADD: 3 gas
MUL: 5 gas
MSTORE: 3 gas + (memory expansion cost)
RETURN: 0 gas
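
To check a hand trace, a throwaway interpreter that handles only the five opcodes in this exercise is enough. The Rust sketch below is exactly that: it uses `u128` as a stand-in for 256-bit words, assumes small memory offsets, ignores the 21,000 intrinsic transaction gas, and uses the per-opcode costs listed above plus the memory formula `3 * words + words² / 512`. Everything about its structure is an assumption made for the example.

```rust
struct Outcome {
    gas_used: u64,
    return_data: Vec<u8>,
}

// Total cost of having `words` 32-byte words of memory allocated.
fn memory_cost(words: u64) -> u64 {
    3 * words + words * words / 512
}

// Grows memory (in whole words) to cover [offset, offset + size) and
// returns the extra gas owed for the growth.
fn expand(memory: &mut Vec<u8>, offset: usize, size: usize) -> u64 {
    if size == 0 {
        return 0;
    }
    let old_words = (memory.len() / 32) as u64;
    let new_words = ((offset + size + 31) / 32) as u64;
    if new_words <= old_words {
        return 0;
    }
    memory.resize(new_words as usize * 32, 0);
    memory_cost(new_words) - memory_cost(old_words)
}

fn pop(stack: &mut Vec<u128>) -> Result<u128, &'static str> {
    stack.pop().ok_or("stack underflow")
}

fn run(code: &[u8]) -> Result<Outcome, &'static str> {
    let mut stack: Vec<u128> = Vec::new();
    let mut memory: Vec<u8> = Vec::new();
    let mut gas_used = 0u64;
    let mut pc = 0usize;
    while pc < code.len() {
        match code[pc] {
            0x60 => {
                // PUSH1: a missing immediate byte reads as zero
                stack.push(*code.get(pc + 1).unwrap_or(&0) as u128);
                gas_used += 3;
                pc += 2;
            }
            0x01 => {
                // ADD
                let (a, b) = (pop(&mut stack)?, pop(&mut stack)?);
                stack.push(a.wrapping_add(b));
                gas_used += 3;
                pc += 1;
            }
            0x02 => {
                // MUL
                let (a, b) = (pop(&mut stack)?, pop(&mut stack)?);
                stack.push(a.wrapping_mul(b));
                gas_used += 5;
                pc += 1;
            }
            0x52 => {
                // MSTORE: pops offset, then value; writes a 32-byte word
                let offset = pop(&mut stack)? as usize;
                let value = pop(&mut stack)?;
                gas_used += 3 + expand(&mut memory, offset, 32);
                memory[offset..offset + 16].fill(0); // high half of the word
                memory[offset + 16..offset + 32].copy_from_slice(&value.to_be_bytes());
                pc += 1;
            }
            0xf3 => {
                // RETURN: pops offset, then size
                let offset = pop(&mut stack)? as usize;
                let size = pop(&mut stack)? as usize;
                gas_used += expand(&mut memory, offset, size);
                let return_data = if size == 0 {
                    Vec::new()
                } else {
                    memory[offset..offset + size].to_vec()
                };
                return Ok(Outcome { gas_used, return_data });
            }
            _ => return Err("opcode not covered by this sketch"),
        }
    }
    Ok(Outcome { gas_used, return_data: Vec::new() })
}
```

Running it on the exercise bytecode gives 32 gas: six PUSH1 (18), ADD (3), MUL (5), MSTORE (3 plus 3 for expanding memory from zero to one word), and RETURN (0, since the returned range is already allocated). Try working it out by hand before comparing.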

The Interview Questions They’ll Ask

Prepare to answer these:

“What is the EVM and why is it stack-based?”
“Explain the difference between memory, storage, and the stack.”
“Why does SSTORE cost 20,000 gas while ADD costs only 3?”
“What is a reentrancy attack and how does it exploit CALL semantics?”
“How does DELEGATECALL differ from CALL? When would you use each?”
“What happens when a contract runs out of gas mid-execution?”
“How does the EVM prevent infinite loops?”
“What is a JUMPDEST and why is it required?”
“How are function selectors (4-byte signatures) used in the EVM?”
“What is the ‘free memory pointer’ and where is it stored?”
“Why do we need STATICCALL? What security guarantee does it provide?”
“How would you implement your own ERC-20 token at the bytecode level?”

Hints in Layers

Hint 1: Start with Stack Operations

Get PUSH, POP, DUP, SWAP working first:

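use primitive_types::U256;  // Assumed crate; any 256-bit integer library works

struct EVM {
    stack: Vec<U256>,
    pc: usize,
    code: Vec<u8>,
    gas: u64,
}

impl EVM {
    fn execute(&mut self) -> Result<Vec<u8>, &'static str> {
        while self.pc < self.code.len() {
            let opcode = self.code[self.pc];
            match opcode {
                0x00 => break,  // STOP
                0x01 => self.op_add()?,
                0x60 => self.op_push(1)?,  // PUSH1
                0x61 => self.op_push(2)?,  // PUSH2
                // ...
                _ => return Err("Invalid opcode"),
            }
        }
        Ok(vec![])
    }

    // Deduct gas up front so running out is an error, not an integer underflow
    fn charge(&mut self, cost: u64) -> Result<(), &'static str> {
        self.gas = self.gas.checked_sub(cost).ok_or("Out of gas")?;
        Ok(())
    }

    fn op_add(&mut self) -> Result<(), &'static str> {
        self.charge(3)?;
        let a = self.stack.pop().ok_or("Stack underflow")?;
        let b = self.stack.pop().ok_or("Stack underflow")?;
        self.stack.push(a.overflowing_add(b).0);  // EVM wraps on overflow!
        self.pc += 1;
        Ok(())
    }

    fn op_push(&mut self, n: usize) -> Result<(), &'static str> {
        self.charge(3)?;
        if self.stack.len() >= 1024 {
            return Err("Stack overflow");
        }
        // Immediate bytes that run past the end of the code read as zero
        let mut bytes = vec![0u8; n];
        let start = (self.pc + 1).min(self.code.len());
        let end = (start + n).min(self.code.len());
        bytes[..end - start].copy_from_slice(&self.code[start..end]);
        self.stack.push(U256::from_big_endian(&bytes));
        self.pc += 1 + n;
        Ok(())
    }
}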

Hint 2: Memory is a Growing Byte Array

struct Memory {
    data: Vec<u8>,
}

impl Memory {
    // Grows memory to cover [offset, offset + size) and returns the expansion gas
    fn expand_to(&mut self, offset: usize, size: usize) -> u64 {
        if size == 0 {
            return 0;  // Zero-length accesses never expand memory
        }
        // Memory always grows in whole 32-byte words
        let new_words = (offset + size + 31) / 32;
        let old_words = self.data.len() / 32;
        if new_words > old_words {
            self.data.resize(new_words * 32, 0);
            // Memory expansion cost is quadratic!
            let old_cost = old_words * 3 + (old_words * old_words) / 512;
            let new_cost = new_words * 3 + (new_words * new_words) / 512;
            return (new_cost - old_cost) as u64;
        }
        0
    }
}

Hint 3: Storage is Just a HashMap

use std::collections::HashMap;

struct Storage {
    slots: HashMap<U256, U256>,
}

impl Storage {
    fn sload(&self, slot: &U256) -> U256 {
        self.slots.get(slot).cloned().unwrap_or(U256::zero())
    }

    // Simplified legacy gas schedule; recent forks also price cold vs. warm slots and no-op writes
    fn sstore(&mut self, slot: U256, value: U256) -> u64 {
        let old = self.sload(&slot);
        let gas = if old.is_zero() && !value.is_zero() {
            20000  // Setting non-zero from zero
        } else if !old.is_zero() && value.is_zero() {
            5000   // Clearing (plus refund)
        } else {
            5000   // Modifying
        };
        self.slots.insert(slot, value);
        gas
    }
}

Hint 4: Use the evm-from-scratch Test Suite Clone https://github.com/w1nt3r-eth/evm-from-scratch and run tests as you go. Each test is one opcode—perfect for incremental development.

Hint 5: Handle CALL Last CALL is the most complex opcode because it creates a new execution context. Get everything else working first, then tackle CALL, DELEGATECALL, and STATICCALL.

Books That Will Help

Topic	Book	Chapter
Stack machine fundamentals	Crafting Interpreters by Robert Nystrom	Ch. 14-15: “Chunks of Bytecode” and “A Virtual Machine”
EVM specification	Mastering Ethereum by Antonopoulos & Wood	Ch. 13: “The Ethereum Virtual Machine”
Binary representation	Computer Systems: A Programmer’s Perspective by Bryant & O’Hallaron	Ch. 2: “Representing Information”
VM dispatch techniques	Virtual Machines: Versatile Platforms for Systems and Processes by James E. Smith & Ravi Nair	Ch. 2-3: “Interpreters”
Smart contract security	Mastering Ethereum	Ch. 9: “Smart Contract Security”
Solidity internals	Ethereum Smart Contract Development by Mayukh Mukhopadhyay	Ch. 6-7: “Smart Contract Internals”
Formal EVM specification	Ethereum Yellow Paper	Appendix H: “Virtual Machine Specification”

Common Pitfalls & Debugging

Problem 1: “Stack underflow on seemingly valid bytecode”

Why: You’re not handling stack depth requirements correctly. Some opcodes pop more items than they push
Fix: Before each opcode, validate: stack.len() >= required_depth

Quick test:

// SWAP2 requires at least 3 items
let opcode = 0x91; // SWAP2
if stack.len() < 3 { return Err("Stack underflow"); }
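The check above can be generalized into a lookup so that every opcode is validated the same way. The following is a minimal sketch covering only a handful of opcodes; the function names `required_stack_depth` and `check_stack` are illustrative, not part of any EVM library.

```rust
/// Minimum number of stack items an opcode needs before it may execute.
/// Only a few opcodes are covered; extend the table as you implement more.
fn required_stack_depth(opcode: u8) -> Option<usize> {
    match opcode {
        0x00 => Some(0),                                   // STOP
        0x01..=0x07 => Some(2),                            // ADD, MUL, SUB, DIV, SDIV, MOD, SMOD
        0x08 | 0x09 => Some(3),                            // ADDMOD, MULMOD
        0x50 => Some(1),                                   // POP
        0x60..=0x7f => Some(0),                            // PUSH1..PUSH32
        0x80..=0x8f => Some((opcode - 0x80) as usize + 1), // DUP1..DUP16
        0x90..=0x9f => Some((opcode - 0x90) as usize + 2), // SWAP1..SWAP16
        _ => None,                                         // not covered by this sketch
    }
}

/// Rejects an opcode when the stack is too shallow for it.
fn check_stack(opcode: u8, stack_len: usize) -> Result<(), &'static str> {
    match required_stack_depth(opcode) {
        Some(need) if stack_len < need => Err("Stack underflow"),
        Some(_) => Ok(()),
        None => Err("Opcode not covered by this sketch"),
    }
}
```

Calling `check_stack` once at the top of the dispatch loop keeps the individual opcode handlers free of repeated depth checks.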

Problem 2: “Gas calculation doesn’t match real EVM”

Why: Some opcodes have dynamic gas costs (memory expansion, SSTORE refunds)
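Fix: Memory expansion gas: new_mem_cost = (new_size^2 / 512) + (3 * new_size), where new_size is measured in 32-byte words. Charge only the difference between the new and the old cost
Quick test: Run the same bytecode in the evm.codes playground and compare gas used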

Problem 3: “SSTORE/SLOAD work but values don’t persist between calls”

Why: Storage must be external to the EVM execution context

Fix:

struct Account {
    storage: HashMap<U256, U256>,  // Persistent key-value store
    balance: U256,
    nonce: u64,
}

// EVM only holds a *reference* to storage
struct EVM<'a> {
    storage: &'a mut HashMap<U256, U256>,  // Reference to account storage
    // ...
}

Problem 4: “256-bit arithmetic overflows or wraps incorrectly”

Why: EVM uses wrapping arithmetic (modulo 2^256), not overflow panics
Fix: Use .wrapping_add(), .wrapping_mul(), etc.

Quick test:

// MAX_U256 + 1 should wrap to 0
assert_eq!(U256::MAX.wrapping_add(U256::from(1)), U256::zero());
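To see what "wrapping modulo 2^256" means without a library, here is a minimal sketch that represents a 256-bit word as two 128-bit halves. The `Word256` type is hypothetical and exists only for illustration; in your EVM you would normally rely on your 256-bit integer library instead.

```rust
/// A 256-bit unsigned integer as two 128-bit halves (illustrative type only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Word256 {
    hi: u128,
    lo: u128,
}

impl Word256 {
    const ZERO: Word256 = Word256 { hi: 0, lo: 0 };
    const MAX: Word256 = Word256 { hi: u128::MAX, lo: u128::MAX };

    /// Addition modulo 2^256: carry from the low half into the high half,
    /// and silently drop any carry out of the high half.
    fn wrapping_add(self, other: Word256) -> Word256 {
        let (lo, carry) = self.lo.overflowing_add(other.lo);
        let hi = self.hi.wrapping_add(other.hi).wrapping_add(carry as u128);
        Word256 { hi, lo }
    }
}
```

The dropped carry out of the high half is exactly the behavior the EVM's ADD opcode has: no error, no flag, just the low 256 bits of the result.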

Problem 5: “CALL/DELEGATECALL creates infinite recursion”

Why: No call depth limit or not passing gas correctly
Fix: EVM limits call depth to 1024. Also: callee gets at most 63/64 of remaining gas

Quick test:

const MAX_CALL_DEPTH: usize = 1024;

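fn call(&mut self, call_depth: usize) -> Result<(), &'static str> {
    // In the real EVM the CALL itself fails (pushes 0) and the caller keeps running
    if call_depth >= MAX_CALL_DEPTH {
        return Err("Call depth exceeded");
    }
    // EIP-150: forward at most "all but one 64th" of the remaining gas
    let callee_gas = self.gas - self.gas / 64;
    // Execute with reduced gas...
    Ok(())
}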
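The 63/64 rule compounds with every nested call, which is worth seeing in numbers. This is a small sketch under the assumption that nothing else consumes gas between calls; `max_forwardable_gas` and `gas_at_depth` are illustrative names.

```rust
/// EIP-150 "all but one 64th": the most gas a caller can forward to a callee.
fn max_forwardable_gas(gas_remaining: u64) -> u64 {
    gas_remaining - gas_remaining / 64
}

/// Gas still forwardable after `depth` nested calls, ignoring every other cost.
fn gas_at_depth(mut gas: u64, depth: u32) -> u64 {
    for _ in 0..depth {
        gas = max_forwardable_gas(gas);
    }
    gas
}
```

Starting from 30,000,000 gas, almost nothing is left long before depth 1024, so in practice the gas rule tends to stop runaway recursion before the depth limit does.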

Problem 6: “CREATE opcode fails with ‘out of gas’ but plenty remains”

Why: CREATE charges extra gas for code deployment (200 gas per byte)
Fix: Total cost = 32000 (CREATE base cost) + init_code_gas + (deployed_code.len() * 200)
Quick test: Deploy a 100-byte contract, verify that 32000 + 20000 gas is charged on top of the init code’s own execution cost
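The arithmetic behind that quick test can be written down directly. This sketch only models the three components named above and ignores other charges such as memory expansion; the constant and function names are made up for illustration.

```rust
const CREATE_BASE_GAS: u64 = 32_000;        // flat cost of the CREATE opcode
const CODE_DEPOSIT_GAS_PER_BYTE: u64 = 200; // charged per byte of returned runtime code

/// Total gas for a CREATE: base cost + running the init code + storing the result.
fn create_total_gas(init_code_exec_gas: u64, deployed_code_len: u64) -> u64 {
    CREATE_BASE_GAS + init_code_exec_gas + CODE_DEPOSIT_GAS_PER_BYTE * deployed_code_len
}
```

If a deployment reports "out of gas" with gas apparently left over, check whether the remaining gas after running the init code still covers the code deposit term.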

Problem 7: “JUMPI doesn’t jump even when condition is true”

Why: EVM treats ANY non-zero value as true, but destination must be a JUMPDEST (0x5B)

Fix:

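fn op_jumpi(&mut self) -> Result<(), &'static str> {
    let dest = self.stack.pop().ok_or("Stack underflow")?;
    let condition = self.stack.pop().ok_or("Stack underflow")?;

    if condition != U256::zero() {
        // Destination must fit in usize (assumes your U256 type implements
        // TryFrom for usize), be in range, and be a JUMPDEST
        let dest = usize::try_from(dest).map_err(|_| "Invalid jump destination")?;
        if self.code.get(dest) != Some(&0x5B) {
            return Err("Invalid jump destination");
        }
        // Note: a 0x5B byte inside PUSH data is NOT a valid destination
        self.pc = dest;
    } else {
        self.pc += 1;
    }
    Ok(())
}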
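Checking for the byte 0x5B alone is not quite enough, because the same byte can appear as data inside a PUSH instruction. One common approach, sketched here with a hypothetical helper name, is to scan the code once up front and record which positions are real JUMPDEST opcodes.

```rust
/// Returns a table where `valid[i]` is true only if byte `i` is a JUMPDEST
/// opcode that is not part of a PUSH instruction's immediate data.
fn valid_jump_destinations(code: &[u8]) -> Vec<bool> {
    let mut valid = vec![false; code.len()];
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        if op == 0x5b {
            valid[pc] = true;
        }
        // PUSH1..PUSH32 (0x60..=0x7f) carry 1..32 bytes of data: skip over them.
        if (0x60..=0x7f).contains(&op) {
            pc += (op - 0x5f) as usize;
        }
        pc += 1;
    }
    valid
}
```

With this table, JUMP and JUMPI only need a bounds check followed by a lookup of `valid[dest]`.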

Problem 8: “Memory expansion costs explode unexpectedly”

Why: Memory cost is quadratic, not linear

Fix: Cost for expanding from old_size to new_size is:

fn memory_cost(size_in_words: u64) -> u64 {
    (size_in_words * size_in_words) / 512 + (3 * size_in_words)
}

let expansion_cost = memory_cost(new_size) - memory_cost(old_size);

Why quadratic? Prevents spam attacks. Accessing 1MB of memory should be expensive!
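To put numbers on that claim, the sketch below applies the same formula to sizes given in bytes. The helper names are illustrative.

```rust
/// Round a byte count up to whole 32-byte words.
fn words_for(bytes: u64) -> u64 {
    (bytes + 31) / 32
}

/// Total gas for a memory of `bytes` bytes, using words^2 / 512 + 3 * words.
fn memory_cost_for_bytes(bytes: u64) -> u64 {
    let words = words_for(bytes);
    words * words / 512 + 3 * words
}
```

For 1 KiB (32 words) the total is 98 gas, and almost all of it is the linear term. For 1 MiB (32,768 words) the total is 2,195,456 gas, of which 2,097,152 comes from the quadratic term.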

Debugging Strategy:

Use evm.codes as reference: Every opcode has gas cost, stack changes, and examples
Test one opcode at a time: The evm-from-scratch repo has isolated tests per opcode
Compare traces: Run your EVM and geth in debug mode, compare execution traces
Fuzz testing: Generate random valid bytecode and compare results with reference EVM

Essential debugging tools:

evm.codes - Interactive opcode playground
Remix IDE - Compile Solidity and inspect bytecode
etherscan.io - View real smart contract bytecode
Foundry’s forge debug - Step through transactions

Test suite progression:

Stack operations (PUSH, POP, DUP, SWAP)
Arithmetic (ADD, MUL, DIV, MOD, SDIV, SMOD, ADDMOD, MULMOD, EXP)
Comparison & bitwise (LT, GT, EQ, ISZERO, AND, OR, XOR, NOT, SHL, SHR)
Memory (MLOAD, MSTORE, MSTORE8, MSIZE)
Storage (SLOAD, SSTORE)
Flow control (JUMP, JUMPI, JUMPDEST, PC, GAS)
Block info (BLOCKHASH, COINBASE, TIMESTAMP, NUMBER, DIFFICULTY (PREVRANDAO since the Merge), GASLIMIT)
Account (BALANCE, CALLER, CALLVALUE)
Call operations (CALL, DELEGATECALL, STATICCALL, CREATE, CREATE2)
Logging (LOG0-LOG4)

Project 4: Implement a Proof-of-Stake Consensus

File: BLOCKCHAIN_BITCOIN_ETHEREUM_LEARNING_PROJECTS.md
Programming Language: C
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: Level 5: The “Industry Disruptor”
Difficulty: Level 5: Master
Knowledge Area: Distributed Consensus / Game Theory
Software or Tool: Consensus Algorithms
Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build: A consensus mechanism where validators stake tokens and are selected to propose blocks based on stake weight, with slashing for misbehavior.

Why it teaches consensus: Proof-of-Stake is how modern chains (Ethereum 2.0, Solana, Cardano) work. Building it teaches you the game theory: why validators behave honestly, what happens during forks, and how finality differs from PoW.

Core challenges you’ll face:

Validator selection → Maps to randomness and stake weighting
Block proposal and attestation → Maps to committee-based consensus
Slashing conditions → Maps to punishing equivocation
Finality gadgets → Maps to when transactions become irreversible
Nothing-at-stake problem → Maps to PoS vs PoW tradeoffs

Key Concepts:

Byzantine Fault Tolerance: “Designing Data-Intensive Applications” Ch. 8 - Martin Kleppmann
Consensus Algorithms: Ethereum’s Casper FFG paper
Game Theory: “Mastering Ethereum” Ch. 14 - Antonopoulos & Wood

Difficulty: Advanced
Time estimate: 2-4 weeks
Prerequisites: Understanding of distributed systems, cryptography basics

Learning milestones:

Validators stake and get selected - You understand stake-weighted randomness
Blocks reach finality - You understand supermajority attestation
Slashing works - You understand the economic security model

Real World Outcome

When you complete this project, you’ll have a functioning Proof-of-Stake consensus network that you can run locally. Here’s exactly what you’ll see:

1. Validators Stake and Join the Network

$ ./pos_node validator --stake 32000

VALIDATOR NODE STARTING...
════════════════════════════════════════════════════════════
Validator Address: 0x742d35Cc6634C0532925a3b844Bc9e7595f0bEb7
Staked Amount: 32,000 tokens
Validator Index: 67
Status: ACTIVE
Effective Balance: 32,000 tokens (max)

Waiting for block proposal assignment...
════════════════════════════════════════════════════════════

2. Block Proposal and Attestation

$ ./pos_node view-epoch

EPOCH 42 STATUS:
════════════════════════════════════════════════════════════
Total Validators: 128
Total Stake: 4,096,000 tokens
Participation Rate: 97.3%

SLOT 340 (Current):
  Proposer: Validator #23 (0x8ab...)
  Block Hash: 0x9f2e3d...
  Attestations: 124/128 (96.9%)
  Status: ✓ FINALIZED (>66.67% attestations)

SLOT 341 (Next):
  Assigned Proposer: Validator #67 (YOU!)
  Expected time: 8 seconds
════════════════════════════════════════════════════════════

[12:34:56] YOUR TURN! Proposing block for slot 341...
[12:34:56] Including 47 pending transactions
[12:34:56] Block 0x7a3c... proposed successfully
[12:34:57] Attestation from Validator #5: ✓
[12:34:57] Attestation from Validator #12: ✓
[12:34:58] Attestation from Validator #89: ✓
[12:35:02] Attestations received: 122/128 (95.3%)
[12:35:02] Block FINALIZED ✓
[12:35:02] Reward earned: +0.025 tokens

3. Slashing Detection and Execution

$ ./pos_node simulate-attack double-vote --validator 42

ATTACK SIMULATION: Double Vote by Validator #42
════════════════════════════════════════════════════════════
[Network] Validator #42 broadcasting CONFLICTING votes:
  Vote 1: Slot 450 → Block 0xabc123...
  Vote 2: Slot 450 → Block 0xdef456... (DIFFERENT!)

[Detector] Slashing condition detected!
  Offense: DOUBLE_VOTE
  Evidence:
    - Both votes from same validator (pubkey match)
    - Same slot number (450)
    - Different block hashes
    - Both signatures valid

[Consensus] Slashing proposal submitted by Validator #17
[Consensus] 87/128 validators confirmed evidence
[Execution] SLASHING VALIDATOR #42
  - Stake burned: 1,000 tokens (3.125% of the validator’s stake)
  - Remaining stake: 31,000 tokens
  - Status: EJECTED from validator set
  - Whistleblower reward (Validator #17): +10 tokens

[Network] Validator #42 removed from active set
════════════════════════════════════════════════════════════

4. Fork Choice and Reorganization

$ ./pos_node view-fork-choice

FORK CHOICE RULE (LMD GHOST):
════════════════════════════════════════════════════════════
                    ┌─ Block 453a (20 votes)
       Block 452 ──┤
                    └─ Block 453b (108 votes) ← CANONICAL
                           │
                           └─ Block 454 (122 votes)

HEAD: Block 454 (0x8f3e...)
Justification: Epoch 92 checkpoint (>66.67% attested)
Finalization: Epoch 91 checkpoint (IRREVERSIBLE)

Orphaned blocks: 1 (Block 453a - insufficient attestations)
════════════════════════════════════════════════════════════
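The head selection shown above can be sketched as a walk down the block tree that always follows the heaviest subtree. This is a simplified model, not the full LMD GHOST rule: every vote counts as one unit rather than being stake-weighted, and the `Block` layout and the split of votes (20, 8, 100) are assumptions chosen so that the 453b subtree weighs 108 as in the output.

```rust
/// One block in a toy block tree.
struct Block {
    name: &'static str,
    parent: Option<usize>, // index of the parent block, None for the root
    votes: u64,            // validators whose latest attestation names this block
}

/// Weight of a block = its own votes plus the votes of all its descendants.
fn subtree_weight(blocks: &[Block], idx: usize) -> u64 {
    let mut weight = blocks[idx].votes;
    for (child, b) in blocks.iter().enumerate() {
        if b.parent == Some(idx) {
            weight += subtree_weight(blocks, child);
        }
    }
    weight
}

/// Starting at `root`, repeatedly step into the heaviest child until a leaf is reached.
/// Ties are broken by the higher index, which is an arbitrary choice for this sketch.
fn ghost_head(blocks: &[Block], root: usize) -> usize {
    let mut head = root;
    loop {
        let best_child = blocks
            .iter()
            .enumerate()
            .filter(|(_, b)| b.parent == Some(head))
            .map(|(i, _)| (subtree_weight(blocks, i), i))
            .max();
        match best_child {
            Some((_, child)) => head = child,
            None => return head,
        }
    }
}

/// The fork from the example output: 452 -> {453a, 453b}, 453b -> 454.
fn example_tree() -> Vec<Block> {
    vec![
        Block { name: "452", parent: None, votes: 0 },
        Block { name: "453a", parent: Some(0), votes: 20 },
        Block { name: "453b", parent: Some(0), votes: 8 },
        Block { name: "454", parent: Some(2), votes: 100 },
    ]
}
```

Note that 453b wins the fork because of the votes on its descendant 454, not because of its own 8 direct votes; that is the "heaviest subtree" idea in GHOST.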

5. Economic Security Metrics

$ ./pos_node security-analysis

NETWORK SECURITY ANALYSIS:
════════════════════════════════════════════════════════════
Total Staked: 4,096,000 tokens
Network Value: $204,800,000 (at $50/token)

ATTACK COST ANALYSIS:
─────────────────────
To attack (33.4% stake needed): 1,368,064 tokens
  Cost to acquire: ~$68,403,200
  Slashing penalty if caught: -$68,403,200
  Expected outcome: ECONOMIC LOSS (protocol defends)

To finalize invalid block (66.7% needed): 2,732,032 tokens
  Cost: ~$136,601,600
  Slashing penalty: -$136,601,600
  Conclusion: Attack is economically irrational

Current security level: ✓ STRONG
Validator decentralization: 128 unique validators
Largest validator: 0.78% of stake (low centralization risk)
════════════════════════════════════════════════════════════
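The figures in this report follow from two multiplications. A minimal sketch, using basis points (1/100 of a percent) to stay in integer arithmetic; the function names are illustrative.

```rust
/// Tokens needed to control `basis_points` / 10_000 of the total stake, rounded up.
fn tokens_for_share(total_stake: u64, basis_points: u64) -> u64 {
    (total_stake * basis_points + 9_999) / 10_000
}

/// Cost in whole dollars of buying `tokens` at a fixed price (no market impact modeled).
fn cost_usd(tokens: u64, price_usd: u64) -> u64 {
    tokens * price_usd
}
```

With 4,096,000 tokens staked at $50, a 33.4% share is 1,368,064 tokens ($68,403,200) and a 66.7% share is 2,732,032 tokens ($136,601,600). A real attacker would also move the price while buying, which this sketch ignores.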

The Core Question You’re Answering

“Why should validators behave honestly in Proof-of-Stake? What prevents them from validating multiple conflicting chains (the ‘nothing-at-stake’ problem)?”

Before you write any code, sit with this question. In Proof-of-Work, miners can’t work on multiple chains simultaneously (they must choose where to spend their hash power). But in Proof-of-Stake, validators can trivially sign multiple conflicting blocks at zero cost.

The answer involves:

Economic security (slashing makes misbehavior expensive)
Game-theoretic incentives (rewards for honesty > costs of attacking)
Verifiable evidence (cryptographic proofs of equivocation)
Social consensus (weak subjectivity checkpoints)

Concepts You Must Understand First

Stop and research these before coding:

Byzantine Fault Tolerance (BFT)
- What does “Byzantine” mean in distributed systems?
- Why is the 2/3 threshold (66.67%) important in BFT consensus?
- What’s the difference between safety (never finalize conflicting blocks) and liveness (eventually finalize blocks)?
- How does BFT differ from Nakamoto consensus (longest chain)?
- Book Reference: “Designing Data-Intensive Applications” Ch. 8-9 - Martin Kleppmann
Stake-Weighted Selection
- How do you fairly select a validator when they have different stake amounts?
- What is a Verifiable Random Function (VRF)?
- Why can’t you use simple rand() % num_validators?
- How does Ethereum 2.0’s RANDAO provide randomness?
- Book Reference: Ethereum 2.0 Spec (Beacon Chain documentation)
Slashing Conditions
- What constitutes “equivocation” (provably malicious behavior)?
- Why are double votes and surround votes slashable?
- How much should validators be slashed? (Too little = ineffective, too much = discourages participation)
- How do you prove a validator misbehaved without trusting a single reporter?
- Book Reference: Ethereum’s Casper FFG paper
Finality Gadgets
- What does “finalized” mean? How is it different from “confirmed”?
- What is a “checkpoint” and why do we finalize in epochs, not per block?
- How does Casper FFG achieve finality on top of the LMD GHOST fork choice?
- What happens during a finality reversion (catastrophic but possible)?
- Book Reference: “Mastering Ethereum” Ch. 14 - Antonopoulos & Wood
Nothing-at-Stake Problem
- Why is it “free” to vote on multiple chains in naive PoS?
- How does slashing solve this?
- What are “weak subjectivity” checkpoints?
- Why can’t you trustlessly sync from genesis in PoS (unlike PoW)?
- Book Reference: Vitalik Buterin’s “A Proof of Stake Design Philosophy”
Long-Range Attacks
- What is a long-range attack (rewriting ancient history)?
- Why can’t this happen in PoW (too much computational cost)?
- How do checkpoints prevent long-range attacks?
- What is “social consensus” and why is it needed for very old reorgs?
- Book Reference: “Designing Data-Intensive Applications” Ch. 9 - Martin Kleppmann

Questions to Guide Your Design

Before implementing, think through these:

Validator Registration
- How do validators join? Do they lock tokens in a smart contract?
- What’s the minimum stake required? (Ethereum uses 32 ETH)
- How long does it take to activate? (Prevents griefing by rapid join/leave)
- How do validators exit? (Immediate exit enables long-range attacks!)
Block Proposal Selection
- How often should each validator propose? (Every N slots based on stake?)
- Should selection be deterministic or random?
- How far in advance do validators know they’re assigned?
- What if the selected validator is offline?
Attestation Aggregation
- Do you collect attestations one-by-one or batch them?
- How do you efficiently verify 100+ BLS signatures?
- What’s the deadline for attestations? (Too short = missed votes, too long = slow finality)
Fork Choice Rule
- When there are competing chains, which is canonical?
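- LMD GHOST: Starting from the last justified checkpoint, repeatedly follow the child block whose subtree carries the most stake, counting only each validator’s latest attestation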
- How do you handle ties?
Slashing Implementation
- Who detects slashable offenses? (Any node can!)
- How do you reward the whistleblower?
- Should slashing be gradual (correlated failures) or fixed?
- Can you slash multiple times for the same offense?
Epoch Boundaries
- How long is an epoch? (Ethereum: 32 slots = 6.4 minutes)
- Why checkpoint finality per epoch instead of per block?
- What happens to validator set changes during an epoch?
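The slot and epoch bookkeeping behind these questions is plain integer arithmetic. This sketch assumes Ethereum's parameters (32 slots per epoch, 12-second slots); your own network can pick different constants.

```rust
const SLOTS_PER_EPOCH: u64 = 32;
const SECONDS_PER_SLOT: u64 = 12;

/// Epoch that a slot belongs to.
fn epoch_of(slot: u64) -> u64 {
    slot / SLOTS_PER_EPOCH
}

/// First slot of an epoch; its block is the epoch's checkpoint candidate.
fn first_slot_of(epoch: u64) -> u64 {
    epoch * SLOTS_PER_EPOCH
}

/// True for slots where epoch processing (justification, validator set changes) runs.
fn is_epoch_boundary(slot: u64) -> bool {
    slot % SLOTS_PER_EPOCH == 0
}

/// Wall-clock length of one epoch in seconds.
fn epoch_duration_secs() -> u64 {
    SLOTS_PER_EPOCH * SECONDS_PER_SLOT
}
```

An epoch of 384 seconds is the 6.4 minutes quoted above.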

Thinking Exercise

Before coding, work through this scenario on paper:

Exercise 1: Trace Validator Selection

Network state:
  Validator A: 32,000 tokens staked (31.25% of total)
  Validator B: 64,000 tokens staked (62.5%)
  Validator C: 6,400 tokens staked (6.25%)
  Total: 102,400 tokens

Epoch 10 random seed: 0x8f3a... (from RANDAO)

For slot 320:
1. How do you select the proposer fairly?
   (Hint: Hash(seed + slot_number) mod total_stake, then find which validator's range it falls in)

2. What is the probability each validator is selected?
   (Should match their stake weight!)

3. If Validator B is selected, what attestations must they collect?
   (All other validators attest to their proposed block)

4. What percentage of stake must attest for finality?
   (>66.67%, so at least 68,267 tokens worth of attestations)
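Once you have worked the exercise on paper, you can check your answers with a small sketch of the cumulative-range lookup from the hint. It takes the already-reduced random number (the result of the hash mod total_stake) as input, so no hash function is needed; `pick` and `hits` are illustrative names.

```rust
/// (name, stake) pairs from Exercise 1.
const VALIDATORS: [(&str, u64); 3] = [("A", 32_000), ("B", 64_000), ("C", 6_400)];

/// Maps a number in 0..total_stake to the validator whose cumulative stake range contains it.
fn pick(validators: &[(&'static str, u64)], point: u64) -> Option<&'static str> {
    let mut cumulative = 0;
    for &(name, stake) in validators {
        cumulative += stake;
        if point < cumulative {
            return Some(name);
        }
    }
    None // point >= total stake
}

/// Counts how many of the possible points select `name`.
fn hits(validators: &[(&'static str, u64)], name: &str) -> u64 {
    let total: u64 = validators.iter().map(|&(_, s)| s).sum();
    (0..total)
        .filter(|&p| pick(validators, p).map_or(false, |n| n == name))
        .count() as u64
}
```

Because every point in 0..102,400 is equally likely, the number of points that land on a validator equals its stake, so the selection probability matches the stake weight exactly.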

Exercise 2: Detect a Slashable Offense

Validator D broadcasts two attestations:

Attestation 1:
  Source checkpoint: Epoch 15
  Target checkpoint: Epoch 20
  Block hash: 0xabc123...
  Signature: valid

Attestation 2:
  Source checkpoint: Epoch 15
  Target checkpoint: Epoch 20
  Block hash: 0xdef456... (DIFFERENT!)
  Signature: valid

Questions:
1. Is this slashable? Why?
2. What evidence do you need to prove it?
3. How much should Validator D be slashed?
4. Can Validator D claim "my node was hacked"? (Doesn't matter - strict liability!)

Exercise 3: Trace Finality

Epoch 25 ends with these attestations:
  Block 800: 65,000 tokens attested (63.5%) ← Not finalized
  Block 801: 70,000 tokens attested (68.4%) ← Finalized!
  Block 802: 45,000 tokens attested (44%)

Epoch 26:
  Block 803 builds on block 801
  Block 804 builds on block 801
  Block 805 builds on block 804

What is the finalized chain at the end of Epoch 26?
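Can block 800 ever become part of the canonical chain if it sits on a fork that conflicts with block 801? (No: once block 801 is finalized, conflicting blocks can never be canonical. If block 800 is an ancestor of 801, it is finalized along with it.)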
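The percentages in this exercise can be checked without floating point. A minimal sketch of the strict two-thirds test, with a hypothetical function name:

```rust
/// True when `attesting` is strictly more than 2/3 of `total`.
/// Widening to u128 avoids overflow when stakes are close to u64::MAX.
fn is_supermajority(attesting: u64, total: u64) -> bool {
    (attesting as u128) * 3 > (total as u128) * 2
}
```

With 102,400 tokens staked, 65,000 and 45,000 fail the test and 70,000 passes. The boundary sits between 68,266 (fails) and 68,267 (passes), which is where the 68,267 figure in Exercise 1 comes from.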

The Interview Questions They’ll Ask

Prepare to answer these:

“Explain the nothing-at-stake problem in Proof-of-Stake. How does Ethereum solve it?”
“What’s the difference between justification and finalization in Casper FFG?”
“Why does Proof-of-Stake use a 2/3 threshold instead of simple majority (50%+1)?”
“Walk me through what happens when a validator double-votes.”
“How does weak subjectivity differ from objective finality in Proof-of-Work?”
“What is a long-range attack and why can’t it happen in Proof-of-Work?”
“How does validator selection work without being predictable or gameable?”
“What happens during a finality reversion (both justified checkpoints conflict)?”
“Why is Proof-of-Stake considered more energy-efficient than Proof-of-Work?”
“How does slashing rate scale with the number of validators misbehaving simultaneously?”

Hints in Layers

Hint 1: Start with a Simple Stake Registry Before consensus, build the staking mechanism:

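#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_VALIDATORS 1024    // Capacity of the registry (pick your own limit)

typedef struct {
    uint8_t pubkey[48];        // BLS public key
    uint64_t stake;            // Amount staked (in tokens)
    uint64_t activation_epoch; // When validator activates
    bool slashed;              // Has been slashed?
} Validator;

Validator validators[MAX_VALIDATORS];
int validator_count = 0;
uint64_t total_stake = 0;
uint64_t current_epoch = 0;    // Advanced by your epoch-processing code

// Returns the new validator's index, or -1 if the registry is full
int register_validator(const uint8_t *pubkey, uint64_t stake) {
    if (validator_count >= MAX_VALIDATORS) return -1;
    Validator *v = &validators[validator_count];
    memcpy(v->pubkey, pubkey, 48);
    v->stake = stake;
    v->activation_epoch = current_epoch + 2;  // 2-epoch delay
    v->slashed = false;
    total_stake += stake;
    return validator_count++;
}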

Test: Ensure stake accounting is always correct (sum of individual stakes == total).

Hint 2: Implement Stake-Weighted Random Selection Use the “weighted sampling” technique:

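// Returns the selected validator's index, or -1 if nobody has stake.
// sha256() is assumed to come from your crypto library.
int select_proposer(uint64_t slot, const uint8_t *random_seed) {
    if (total_stake == 0) return -1;

    // Deterministic given the seed, but unpredictable before the seed is known
    uint8_t hash_input[40];
    memcpy(hash_input, random_seed, 32);
    memcpy(hash_input + 32, &slot, 8);  // Host byte order; fix an endianness for a real network

    uint8_t hash[32];
    sha256(hash_input, 40, hash);

    // memcpy avoids the unaligned, aliasing-unsafe cast *(uint64_t*)hash
    uint64_t random_value;
    memcpy(&random_value, hash, sizeof random_value);
    random_value %= total_stake;  // Slight modulo bias; acceptable for a learning project

    // Find which validator's range this falls into
    uint64_t cumulative = 0;
    for (int i = 0; i < validator_count; i++) {
        cumulative += validators[i].stake;
        if (random_value < cumulative) {
            return i;  // Validator i is selected!
        }
    }
    return -1;  // Unreachable while total_stake equals the sum of all stakes
}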

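Hint 3: Slashing Detection Compares Signed Votes Two votes conflict if they come from the same validator, target the same epoch, and name different blocks (verify both signatures before treating them as evidence):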

bool is_slashable_double_vote(Attestation *att1, Attestation *att2) {
    // Same validator?
    if (memcmp(att1->pubkey, att2->pubkey, 48) != 0) return false;

    // Same target epoch but different blocks?
    if (att1->target_epoch == att2->target_epoch &&
        memcmp(att1->block_hash, att2->block_hash, 32) != 0) {
        return true;  // SLASHABLE!
    }

    return false;
}

Hint 4: Finality Requires Checkpointing Don’t try to finalize every block. Use epoch boundaries:

typedef struct {
    uint64_t epoch;
    uint8_t block_hash[32];
    uint64_t total_attesting_stake;
    bool justified;   // >2/3 of stake voted for this checkpoint
    bool finalized;   // Justified, and the next epoch's checkpoint is justified too
} Checkpoint;

// `previous` is the checkpoint of the epoch immediately before `checkpoint`
void check_finality(Checkpoint *checkpoint, Checkpoint *previous) {
    if (checkpoint->total_attesting_stake * 3 > total_stake * 2) {
        checkpoint->justified = true;

        // If previous checkpoint was justified, this one finalizes it
        if (previous != NULL && previous->justified) {
            previous->finalized = true;
        }
    }
}
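The two-step rule (justify first, finalize one epoch later) is easier to follow when traced over several epochs. The sketch below is a simplified model that only looks at how much stake voted for each epoch's checkpoint and ignores the source and target links that real Casper FFG votes carry; the `Status` enum and function name are illustrative.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Pending,
    Justified,
    Finalized,
}

/// `attesting[i]` is the stake that voted for the checkpoint of epoch i.
fn checkpoint_statuses(attesting: &[u64], total_stake: u64) -> Vec<Status> {
    let mut status = vec![Status::Pending; attesting.len()];
    for i in 0..attesting.len() {
        // Strictly more than 2/3 of stake justifies the checkpoint.
        if (attesting[i] as u128) * 3 > (total_stake as u128) * 2 {
            status[i] = Status::Justified;
            // A justified checkpoint finalizes its justified direct predecessor.
            if i > 0 && status[i - 1] == Status::Justified {
                status[i - 1] = Status::Finalized;
            }
        }
    }
    status
}
```

A missed supermajority in one epoch breaks the chain of consecutive justified checkpoints, so the epoch before the gap stays merely justified until a later pair of consecutive justified checkpoints appears.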

Hint 5: Test Byzantine Scenarios Your protocol must handle malicious validators:

// Simulate a Byzantine validator
void simulate_byzantine_validator(int validator_id) {
    // Randomly vote on wrong blocks
    if (rand() % 2 == 0) {
        Attestation att = create_fake_attestation(validator_id);
        broadcast_attestation(&att);
    }
}

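With just under one third of stake Byzantine (less than 33.3%), the protocol should still preserve safety and keep making progress (liveness). At one third or more, Byzantine validators can withhold votes and stall finality.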
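For test planning it helps to know exactly how many faulty validators a given set size can tolerate. A minimal sketch, assuming all validators hold equal stake so that counting heads is the same as counting stake:

```rust
/// Largest number of Byzantine validators f such that n >= 3f + 1.
fn max_byzantine(n: u64) -> u64 {
    n.saturating_sub(1) / 3
}
```

For the 128-validator network used in the examples this gives 42, so a test with 42 misbehaving validators should still finalize, while 43 is already enough to block a two-thirds supermajority.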

Books That Will Help

Topic	Book	Chapter
Byzantine Fault Tolerance fundamentals	Designing Data-Intensive Applications by Martin Kleppmann	Ch. 8: “The Trouble with Distributed Systems”; Ch. 9: “Consistency and Consensus”
Proof-of-Stake mechanisms	Mastering Ethereum by Andreas Antonopoulos & Gavin Wood	Ch. 14: “Consensus” (covers Casper FFG/CBC)
Game theory and economic security	Mastering Ethereum	Ch. 11: “Oracles” and Ch. 14: “Consensus”
Distributed consensus algorithms	Designing Data-Intensive Applications	Ch. 9: “Consistency and Consensus” (Paxos, Raft, Byzantine consensus)
Cryptographic primitives (BLS signatures)	Serious Cryptography, 2nd Edition by Jean-Philippe Aumasson	Ch. 11-12: “Public-Key Cryptography”
Ethereum 2.0 Beacon Chain	Upgrading Ethereum by Ben Edgington	Full book (covers Casper FFG, LMD GHOST, slashing)

Essential Papers:

Casper the Friendly Finality Gadget by Vitalik Buterin & Virgil Griffith (2017)
Combining GHOST and Casper by Vitalik Buterin et al. (2020), the “Gasper” paper describing Ethereum’s beacon chain consensus
A Proof of Stake Design Philosophy by Vitalik Buterin

Common Pitfalls & Debugging

Problem 1: “Validator selection is predictable/biased”

Why: Using weak randomness or not properly weighting by stake
Fix: Use VRF (Verifiable Random Function) for unpredictable but verifiable selection. Weight probability by stake: P(validator) = stake / total_stake
Quick test: Run 1000 selections, verify distribution matches stake weights

Problem 2: “Nothing-at-stake: validators vote on multiple forks”

Why: No penalty for voting on conflicting blocks

Fix: Implement slashing conditions:

// Slashable offense 1: Double voting (two votes for same height)
if (vote1.height == vote2.height && vote1.hash != vote2.hash) { slash(validator); }

// Slashable offense 2: Surround voting (one vote's source-to-target span strictly contains the other's)
if ((vote1.source < vote2.source && vote1.target > vote2.target) ||
    (vote2.source < vote1.source && vote2.target > vote1.target)) { slash(validator); }
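Both conditions can be expressed as small predicates over a pair of votes. This sketch assumes the two votes were already shown to come from the same validator and to carry valid signatures; the `Vote` struct is a hypothetical, stripped-down attestation.

```rust
#[derive(Debug, Clone, Copy)]
struct Vote {
    source_epoch: u64,
    target_epoch: u64,
    block_hash: [u8; 32],
}

/// Two different blocks voted for at the same target epoch.
fn is_double_vote(a: &Vote, b: &Vote) -> bool {
    a.target_epoch == b.target_epoch && a.block_hash != b.block_hash
}

/// `outer` starts strictly before and ends strictly after `inner`.
fn surrounds(outer: &Vote, inner: &Vote) -> bool {
    outer.source_epoch < inner.source_epoch && inner.target_epoch < outer.target_epoch
}

/// A pair of votes is slashable if either condition holds, in either order.
fn is_slashable(a: &Vote, b: &Vote) -> bool {
    is_double_vote(a, b) || surrounds(a, b) || surrounds(b, a)
}
```

Checking `surrounds` in both directions matters because evidence can arrive in any order.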

Problem 3: “Finality never achieved”

Why: Not tracking supermajority correctly. Finality requires >2/3 of stake to attest

Fix:

uint64_t total_attesting_stake = sum_attestations(block);
if (total_attesting_stake * 3 > total_staked * 2) {
    mark_finalized(block);  // >66.67% voted
}

Problem 4: “Long-range attack: attacker rewrites ancient history”

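Why: Validators who have already exited and withdrawn their stake can no longer be slashed, so their old keys can sign an alternate history from far back at no cost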
Fix: Implement weak subjectivity checkpoints. Nodes won’t reorg past N blocks (~1 day worth). Require social consensus for deep reorgs.

Problem 5: “Validators join/leave causing stake accounting bugs”

Why: Not handling validator set changes atomically
Fix: Use epochs. Changes take effect only at epoch boundaries, never mid-epoch.
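One way to make that rule hard to violate is to queue changes and apply them in a single place. The sketch below handles joins only and uses a hypothetical `ValidatorSet` type; exits would follow the same pattern with a second queue.

```rust
/// Validator registry where joins are queued and applied only at epoch boundaries.
struct ValidatorSet {
    active_stakes: Vec<u64>,
    total_stake: u64,
    pending_joins: Vec<u64>,
}

impl ValidatorSet {
    fn new() -> Self {
        ValidatorSet { active_stakes: Vec::new(), total_stake: 0, pending_joins: Vec::new() }
    }

    /// Records a join request; the active set and total stake are not touched yet.
    fn request_join(&mut self, stake: u64) {
        self.pending_joins.push(stake);
    }

    /// Applies all queued joins at once. Call this only when an epoch ends.
    fn on_epoch_boundary(&mut self) {
        for stake in self.pending_joins.drain(..) {
            self.active_stakes.push(stake);
            self.total_stake += stake;
        }
    }

    /// Invariant to assert in tests: the cached total equals the sum of active stakes.
    fn accounting_is_consistent(&self) -> bool {
        self.active_stakes.iter().sum::<u64>() == self.total_stake
    }
}
```

Because `total_stake` changes only inside `on_epoch_boundary`, supermajority checks within an epoch always see the same denominator.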

Debugging tips:

Simulate Byzantine validators (randomly vote incorrectly) - protocol should still finalize
Test with just under 1/3 malicious stake (the most the protocol can tolerate)
Verify slashing removes stake and prevents future participation

Project 5: Build a Simple Smart Contract Compiler

File: BLOCKCHAIN_BITCOIN_ETHEREUM_LEARNING_PROJECTS.md
Main Programming Language: Rust
Alternative Programming Languages: Go, TypeScript, C
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: Level 1: The “Resume Gold”
Difficulty: Level 5: Master (The First-Principles Wizard)
Knowledge Area: Compilers, Blockchain
Software or Tool: Solidity, EVM, LLVM
Main Book: “Writing a C Compiler” by Nora Sandler

What you’ll build: A compiler that takes a tiny Solidity-like language and outputs EVM bytecode—covering parsing, type checking, and code generation.

Why it teaches smart contracts: You’ll understand why Solidity has its quirks, how high-level code becomes opcodes, what the ABI actually is, and why certain patterns are gas-expensive.

Core challenges you’ll face:

Lexing and parsing → Maps to contract syntax
Type system → Maps to Solidity’s types (address, uint256, etc.)
Storage layout → Maps to how state variables map to slots
Function dispatch → Maps to the 4-byte function selector
ABI encoding → Maps to how calldata is structured

Resources for key challenges:

“Writing a C Compiler” by Nora Sandler - Apply patterns to a different target
“Crafting Interpreters” by Robert Nystrom - Parsing and bytecode fundamentals

Key Concepts:

Compiler Construction: “Writing a C Compiler” - Nora Sandler
ABI Specification: Solidity documentation
Storage Layout: “Mastering Ethereum” Ch. 13 - Antonopoulos

Difficulty: Advanced
Time estimate: 1 month+
Prerequisites: Built an interpreter before, understand EVM basics

Learning milestones:

Compile arithmetic expressions - You understand stack-based code generation
Compile storage variables - You understand SLOAD/SSTORE targeting
Compile functions with ABI - You understand Ethereum’s calling convention

Real World Outcome

When you complete this project, you’ll have written your own Solidity-like compiler that generates real EVM bytecode:

1. Compile a Simple Contract

$ cat SimpleStorage.mini
contract SimpleStorage {
    uint256 value;

    function set(uint256 _value) public {
        value = _value;
    }

    function get() public view returns (uint256) {
        return value;
    }
}

$ ./minisolc compile SimpleStorage.mini

COMPILATION SUCCESSFUL
════════════════════════════════════════════════════════════
Contract: SimpleStorage
Functions: 2 (set, get)
State variables: 1 (value at slot 0)

Bytecode (runtime): 0x608060405234801561001057600080fd5b50600436106100365760003560e01c806360fe47b11461003b5780636d4ce63c14610057575b600080fd5b61005560048036038101906100509190...

Function selectors:
  - set(uint256): 0x60fe47b1
  - get(): 0x6d4ce63c

Gas estimates:
  - Deployment: 127,345 gas
  - set(uint256): 43,324 gas
  - get(): 2,429 gas (view, free externally)
════════════════════════════════════════════════════════════

2. Deploy and Call Your Compiled Contract

$ ./minisolc deploy SimpleStorage.mini --network local

Deploying to local EVM (from Project 3)...
Contract deployed at: 0x5FbDB2315678afecb367f032d93F642f64180aa3
Deployment cost: 127,345 gas

$ ./minisolc call 0x5FbDB...0aa3 "set(uint256)" 42

Transaction sent: 0x7a3c...
Gas used: 43,324
Storage updated: slot 0 = 0x000000000000000000000000000000000000000000000000000000000000002a

$ ./minisolc call 0x5FbDB...0aa3 "get()"

Return value: 42 (uint256)
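The calldata behind the `set(uint256)` call above is the 4-byte selector followed by the argument padded to 32 bytes, big-endian. A minimal sketch of that encoding for a single uint256 argument; the function name is illustrative, the argument is limited to `u128` to stay within the standard library, and the selector is passed in rather than computed because keccak256 is not part of it.

```rust
/// Builds calldata as a hex string: 4-byte selector followed by one uint256
/// argument, left-padded with zeros to 32 bytes (big-endian).
fn encode_call_u256(selector: [u8; 4], arg: u128) -> String {
    let mut out = String::from("0x");
    for byte in selector.iter() {
        out.push_str(&format!("{:02x}", byte));
    }
    out.push_str(&format!("{:064x}", arg));
    out
}
```

For `set(42)` this yields `0x60fe47b1` followed by 62 zeros and `2a`, the same 32-byte word that appears in storage slot 0 after the call.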

3. View Generated Assembly

$ ./minisolc compile SimpleStorage.mini --output asm

FUNCTION: set(uint256)
════════════════════════════════════════════════════════════
  JUMPDEST                    ; Function entry point
  PUSH1 0x04                  ; Calldata offset
  CALLDATALOAD                ; Load first argument
  PUSH1 0x00                  ; Storage slot 0
  SSTORE                      ; Store to state
  STOP

FUNCTION: get()
════════════════════════════════════════════════════════════
  JUMPDEST                    ; Function entry point
  PUSH1 0x00                  ; Storage slot 0
  SLOAD                       ; Load from state
  PUSH1 0x00                  ; Memory offset
  MSTORE                      ; Store to memory
  PUSH1 0x20                  ; Return 32 bytes
  PUSH1 0x00                  ; From offset 0
  RETURN                      ; Return value
════════════════════════════════════════════════════════════

Your high-level `value = _value` became just 4 opcodes!

The Core Question You’re Answering

“How does high-level code like value = _value; become low-level EVM opcodes? Why does Solidity have gas costs and weird limitations?”

Understanding a compiler forces you to see that every line of Solidity has computational cost. value = _value is an SSTORE (20,000+ gas when the slot goes from zero to non-zero). Loops are JUMPs. Internal function calls are JUMPs too, while calls into other contracts become CALL, STATICCALL, or DELEGATECALL. Nothing is magic; it all compiles down to stack manipulations and state reads/writes.

Concepts You Must Understand First

Stop and research these before coding:

Compiler Pipeline Stages
- What’s the difference between lexing, parsing, semantic analysis, and code generation?
- Why separate concerns? (Modularity, testing, optimization opportunities)
- Book Reference: “Writing a C Compiler” by Nora Sandler - Full book
Stack-Based Code Generation
- How does an expression tree become a sequence of stack operations?
- Why does a + b * c compile to PUSH a, PUSH b, PUSH c, MUL, ADD?
- Book Reference: “Crafting Interpreters” Ch. 15-17 - Robert Nystrom
EVM Storage Model
- What’s the difference between storage (persistent), memory (temporary), and stack (working)?
- How are storage slots allocated for state variables?
- Book Reference: “Mastering Ethereum” Ch. 13 - Antonopoulos & Wood
ABI Encoding
- How are function calls encoded in calldata?
- What’s the function selector? (First 4 bytes of keccak256(signature))
- Book Reference: Ethereum ABI Specification (official docs)
Type Systems
- Why does Solidity have uint8 through uint256?
- What’s the difference between value types (uint, address) and reference types (arrays, structs)?
- Book Reference: “Writing a C Compiler” Ch. 11 - Nora Sandler

Questions to Guide Your Design

Language Subset: Which features do you support? (Start: arithmetic, storage variables, functions. Skip: inheritance, modifiers, events)
Type Safety: How do you enforce that uint8 + uint256 returns uint256?
Storage Layout: How do you assign slots to state variables deterministically?
Function Dispatch: How does calldata route to the right function?
Optimization: Do you implement constant folding? Dead code elimination?
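For the storage layout question, the simplest deterministic scheme is declaration order. The sketch below assigns one full slot per state variable and deliberately skips the packing of small types into shared slots that Solidity performs; `assign_slots` is a hypothetical helper.

```rust
/// Assigns one storage slot per state variable, in declaration order.
/// Simplification: no packing of small types into a shared slot.
fn assign_slots(state_vars: &[&str]) -> Vec<(String, u64)> {
    state_vars
        .iter()
        .enumerate()
        .map(|(i, name)| ((*name).to_string(), i as u64))
        .collect()
}

/// Looks up the slot of a variable by name.
fn slot_of(layout: &[(String, u64)], name: &str) -> Option<u64> {
    layout.iter().find(|(n, _)| n == name).map(|(_, slot)| *slot)
}
```

Because the layout depends only on declaration order, recompiling the same source always produces the same slots, which is what makes upgrades and external tooling predictable.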

The Interview Questions They’ll Ask

“Walk me through how Solidity compiles a function call.”
“Why does SSTORE cost 20,000 gas but SLOAD only 2,100?”
“What is the ABI and why does Ethereum need it?”
“How does the EVM know which function to call in a contract?”
“Explain the difference between memory and storage in Solidity.”
“Why can’t you return a dynamically-sized array from a function in older Solidity versions?”
“What optimizations does the Solidity compiler perform?”

Hints in Layers

Hint 1: Parse to an AST First Don’t generate bytecode directly from text. Build an Abstract Syntax Tree:

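// U256 is assumed to come from a big-integer crate (e.g. primitive_types::U256);
// Rust has no built-in 256-bit integer type.
enum BinOp { Add, Mul }

enum Expr {
    Literal(U256),
    Variable(String),
    BinaryOp { op: BinOp, left: Box<Expr>, right: Box<Expr> },
}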

Hint 2: Stack-Based Codegen Uses Post-Order Traversal To compile a + b:

fn compile_expr(expr: &Expr) -> Vec<Opcode> {
    match expr {
        Expr::Literal(n) => vec![PUSH32(*n)],
        Expr::Variable(name) => {
            let slot = get_slot(name);
            vec![PUSH1(slot), SLOAD]
        },
        Expr::BinaryOp { op, left, right } => {
            let mut code = compile_expr(left);  // Push left
            code.extend(compile_expr(right));   // Push right
            code.push(match op {
                BinOp::Add => ADD,
                BinOp::Mul => MUL,
            });
            code
        }
    }
}

Hint 3: Function Dispatch Table Generate a dispatcher that routes based on function selector:

PUSH1 0x00          ; Get calldata
CALLDATALOAD
PUSH1 0xE0
SHR                 ; First 4 bytes (selector)

DUP1
PUSH4 0x60fe47b1    ; set(uint256) selector
EQ
PUSH2 set_function
JUMPI               ; If match, jump to set_function

DUP1
PUSH4 0x6d4ce63c    ; get() selector
EQ
PUSH2 get_function
JUMPI

PUSH1 0x00
DUP1
REVERT              ; Unknown function: REVERT needs offset and size on the stack
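
The selector extraction in the dispatcher above can be checked off-chain. The sketch below mimics CALLDATALOAD followed by a 0xE0-bit right shift in Python, reusing the two selectors from the listing. The calldataload and dispatch helper names are hypothetical.

```python
SET_SELECTOR = 0x60FE47B1   # set(uint256), as in the dispatcher listing
GET_SELECTOR = 0x6D4CE63C   # get()

def calldataload(calldata: bytes, offset: int) -> int:
    """Mimic CALLDATALOAD: read 32 bytes, zero-padding past the end."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big")

def dispatch(calldata: bytes) -> str:
    """Return the label the dispatcher would jump to."""
    selector = calldataload(calldata, 0) >> 0xE0   # keep only the top 4 bytes
    if selector == SET_SELECTOR:
        return "set_function"
    if selector == GET_SELECTOR:
        return "get_function"
    return "revert"

# set(42): 4-byte selector followed by one 32-byte argument
set_call = bytes.fromhex("60fe47b1") + (42).to_bytes(32, "big")
print(dispatch(set_call))                    # set_function
print(dispatch(bytes.fromhex("6d4ce63c")))   # get_function
print(dispatch(b"\xde\xad\xbe\xef"))         # revert
```

Shifting right by 0xE0 (224) bits works because the selector occupies the most significant 4 bytes of the 32-byte word loaded from offset 0.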

Books That Will Help

Topic	Book	Chapter
Complete compiler implementation	Writing a C Compiler by Nora Sandler	Full book (compilers from scratch)
Bytecode generation	Crafting Interpreters by Robert Nystrom	Ch. 14-24: “Bytecode VM”
EVM storage and memory	Mastering Ethereum by Antonopoulos & Wood	Ch. 13: “The EVM”
ABI specification	Ethereum documentation	ABI Spec
Type systems and checking	Writing a C Compiler	Ch. 11: “Type Checking”

Common Pitfalls & Debugging

Problem 1: “Parser accepts invalid syntax”

Why: Grammar is too permissive or doesn’t enforce precedence
Fix: Use a parser generator (for example, an LALR(1) generator) or hand-write a recursive descent parser with explicit precedence handling

Quick test:

// Should fail:
function () { return; }  // Missing function name
uint x = ;               // Missing expression

Problem 2: “Type checker mishandles uint8 + uint256”

Why: Implicit conversions between integer widths are not implemented, so mixed-width operands are rejected or given the wrong result type

Fix: Solidity implicitly converts the smaller unsigned type to the larger one. Implement type widening:

use std::cmp::max;

fn check_binary_op(left_type: Type, right_type: Type) -> Type {
    match (left_type, right_type) {
        // Both unsigned: widen to the larger bit width (uint8 + uint256 -> uint256)
        (Type::Uint(a), Type::Uint(b)) => Type::Uint(max(a, b)),
        _ => panic!("Type mismatch"),
    }
}

Problem 3: “Storage variables overwrite each other”

Why: Not assigning unique storage slots
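Fix: Sequential allocation. First variable at slot 0, second at slot 1, etc. For structs, allocate contiguous slots. (Solidity also packs adjacent values smaller than 32 bytes into a single slot; skipping packing is fine for a first version.)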
Debug: Print storage layout during compilation
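
A minimal sketch of sequential slot allocation, assuming each variable is described by a hypothetical (name, slots_needed) pair and that no packing is performed:

```python
def assign_slots(variables):
    """Assign each state variable its first storage slot, in declaration order.

    variables: list of (name, slots_needed); a struct with 3 fields needs 3 slots.
    """
    layout = {}
    next_slot = 0
    for name, slots_needed in variables:
        if name in layout:
            raise ValueError(f"duplicate state variable: {name}")
        layout[name] = next_slot
        next_slot += slots_needed      # structs take a contiguous run of slots
    return layout

# Hypothetical contract: two scalars, a 3-field struct, then another scalar
layout = assign_slots([("owner", 1), ("totalSupply", 1), ("config", 3), ("paused", 1)])
for name, slot in layout.items():
    print(f"{name}: slot {slot}")
```

Because the layout depends only on declaration order, recompiling the same source always produces the same slots, which is what makes the allocation deterministic.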

Problem 4: “Generated bytecode is huge (way more gas than solc)”

Why: Not optimizing. Naive codegen generates redundant PUSHes and DUPs

Fix: Implement peephole optimizations:

PUSH1 0x00
PUSH1 0x00
ADD         ← Optimize to just PUSH1 0x00

DUP1
POP         ← Remove entirely (no-op)
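
The two rewrites above can be expressed as a small peephole pass. This is a sketch that assumes instructions are represented as tuples such as ("PUSH", 0) or ("ADD",), which is an illustrative format rather than real bytecode.

```python
def peephole(code):
    """Apply two local rewrites until no more match at the end of the output.

    Rule 1: PUSH x, PUSH y, ADD  ->  PUSH (x + y) mod 2**256
    Rule 2: DUP1, POP            ->  (nothing)
    """
    out = []
    for ins in code:
        out.append(ins)
        changed = True
        while changed:
            changed = False
            if (len(out) >= 3 and out[-1] == ("ADD",)
                    and out[-2][0] == "PUSH" and out[-3][0] == "PUSH"):
                folded = (out[-3][1] + out[-2][1]) % 2**256
                out[-3:] = [("PUSH", folded)]
                changed = True
            elif len(out) >= 2 and out[-2:] == [("DUP1",), ("POP",)]:
                del out[-2:]
                changed = True
    return out

naive = [("PUSH", 0), ("PUSH", 0), ("ADD",), ("DUP1",), ("POP",)]
print(peephole(naive))   # [('PUSH', 0)]
```

One caveat for a real compiler: removing instructions shifts byte offsets, so run this pass before jump destinations are resolved to concrete addresses.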

Problem 5: “Function calls fail with ‘invalid function selector’”

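Why: Function selector is keccak256(signature)[:4] and must match exactly. The signature must be canonical: no spaces, no parameter names, and full type names (uint256, not uint)

Fix:

// For function "transfer(address,uint256)"
// keccak256 is assumed to be a helper returning [u8; 32] (e.g. built on a Keccak crate)
let signature = "transfer(address,uint256)";
let hash = keccak256(signature.as_bytes());
let selector: [u8; 4] = [hash[0], hash[1], hash[2], hash[3]];  // 0xa9059cbb
// Bytecode: check if calldata[0:4] == selector, then jump to function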

Problem 6: “ABI encoding/decoding doesn’t match Solidity”

Why: Padding and offset rules are intricate
Fix: Follow ABI spec exactly:
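- Static types (uint, address): left-padded with zeros to 32 bytes (fixed-size bytesN values are right-padded)
- Dynamic types (string, bytes): offset pointer in the head, then length + data (right-padded to a multiple of 32 bytes) in the tail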
Test: Compare your encoding with abi.encode() output from Solidity
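
As a rough illustration of the head/tail layout, the sketch below encodes a (uint256, bytes) pair by hand. The helper names are hypothetical and only these two types are covered; treat it as a way to build intuition, not as a full encoder.

```python
def encode_uint256(value: int) -> bytes:
    """Static type: big-endian, zero-padded on the left to 32 bytes."""
    return value.to_bytes(32, "big")

def encode_address(addr_hex: str) -> bytes:
    """Static type: 20 address bytes, zero-padded on the left to 32 bytes."""
    raw = bytes.fromhex(addr_hex[2:] if addr_hex.startswith("0x") else addr_hex)
    if len(raw) != 20:
        raise ValueError("address must be 20 bytes")
    return raw.rjust(32, b"\x00")

def encode_bytes_tail(data: bytes) -> bytes:
    """Dynamic type: 32-byte length, then data zero-padded on the right."""
    padded_len = (len(data) + 31) // 32 * 32
    return encode_uint256(len(data)) + data.ljust(padded_len, b"\x00")

def encode_uint_and_bytes(n: int, data: bytes) -> bytes:
    """Layout for (uint256 n, bytes data): head is [n, offset], tail is the bytes."""
    head_size = 2 * 32                     # two 32-byte head slots
    return encode_uint256(n) + encode_uint256(head_size) + encode_bytes_tail(data)

encoded = encode_uint_and_bytes(1, b"abc")
for i in range(0, len(encoded), 32):
    print(encoded[i:i + 32].hex())         # one 32-byte word per line
```

The four printed words are the value 1, the offset 0x40 to the tail, the length 3, and "abc" followed by zero padding.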

Problem 7: “Constructor doesn’t run when deploying contract”

Why: Constructor code must be part of the init bytecode, NOT runtime bytecode
Fix: Compiler outputs two bytecode blobs:
1. Init code: Runs constructor, returns runtime code
2. Runtime code: The actual contract code
```
Init bytecode structure:
[constructor logic] [CODECOPY runtime_code] [RETURN]
```
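
A minimal sketch of the init-code wrapper, assuming there is no constructor logic and the runtime code is shorter than 256 bytes (so PUSH1 is enough). The make_init_code name and the sample runtime bytes are illustrative.

```python
PUSH1, DUP1, CODECOPY, RETURN = 0x60, 0x80, 0x39, 0xF3

def make_init_code(runtime: bytes) -> bytes:
    """Prefix the runtime code with a loader that copies it to memory and returns it."""
    if len(runtime) > 0xFF:
        raise ValueError("this sketch only uses PUSH1, so runtime must be < 256 bytes")
    prefix_len = 11                       # byte length of the loader below
    prefix = bytes([
        PUSH1, len(runtime),              # size of runtime code
        DUP1,                             # keep a copy of size for RETURN
        PUSH1, prefix_len,                # code offset where the runtime starts
        PUSH1, 0x00,                      # memory destination
        CODECOPY,                         # memory[0:size] = code[offset:offset+size]
        PUSH1, 0x00,                      # memory offset to return from
        RETURN,                           # return memory[0:size] as the runtime code
    ])
    assert len(prefix) == prefix_len
    return prefix + runtime

# Sample runtime: PUSH1 42, PUSH1 0, MSTORE, PUSH1 32, PUSH1 0, RETURN
runtime = bytes.fromhex("602a60005260206000f3")
init = make_init_code(runtime)
print(init.hex())
```

A constructor with real logic would place its code before the loader and adjust the code offset accordingly.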

Debugging strategy:

Compare your compiler output with solc --asm output
Test each language feature in isolation (arithmetic, then storage, then functions)
Use your own EVM (Project 3) to trace execution and verify correctness

Project Comparison Table

Project	Difficulty	Time	Depth of Understanding	Fun Factor
Bitcoin From Scratch	Advanced	1 month+	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
Minimal Blockchain	Intermediate	Weekend	⭐⭐⭐	⭐⭐⭐⭐⭐
EVM From Scratch	Advanced	2-4 weeks	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
Proof-of-Stake	Advanced	2-4 weeks	⭐⭐⭐⭐	⭐⭐⭐
Smart Contract Compiler	Advanced	1 month+	⭐⭐⭐⭐	⭐⭐⭐⭐

Recommendation

Start with Project 2 (Minimal Blockchain in a Weekend), then branch based on your interest:

                    ┌─────────────────────────────────────┐
                    │  Project 2: Minimal Blockchain      │
                    │  (Start here - core mental model)   │
                    └──────────────┬──────────────────────┘
                                   │
              ┌────────────────────┼────────────────────┐
              ▼                    ▼                    ▼
    ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
    │ Bitcoin Path    │  │ Ethereum Path   │  │ Consensus Path  │
    │ Project 1       │  │ Project 3 → 5   │  │ Project 4       │
    │ (Cryptography   │  │ (Smart contract │  │ (Distributed    │
    │  deep dive)     │  │  execution)     │  │  systems)       │
    └─────────────────┘  └─────────────────┘  └─────────────────┘

If you want to understand cryptography and Bitcoin specifically → Project 1 with Jimmy Song’s book
If you want to understand smart contracts and Ethereum → Project 3 (EVM) then Project 5 (Compiler)
If you want to understand distributed consensus → Project 4 (PoS)

Project 6: Build a Full Layer-2 Rollup

What you’ll build: A complete Layer-2 scaling solution: a rollup that batches transactions off-chain, posts compressed data to a simulated L1, and secures user withdrawals with fraud proofs or validity proofs.

Why it teaches everything: This project synthesizes all the concepts: you need cryptography (for signatures and commitments), the EVM (to execute rollup transactions), consensus (for sequencer selection), and compilers (to generate proof circuits). It’s how Optimism, Arbitrum, and zkSync actually work.

Core challenges you’ll face:

Transaction batching → Compress and post to L1
State commitments → Merkle roots of rollup state
Fraud proofs (Optimistic) → Challenge invalid state transitions
Validity proofs (ZK) → Prove execution correctness cryptographically
Bridge contracts → Deposit/withdraw between L1 and L2
Sequencer economics → Who orders transactions and why

Resources for key challenges:

Optimism’s Cannon and fault proof specs
Vitalik’s Incomplete Guide to Rollups
“Mastering Ethereum” for bridge contract patterns

Key Concepts:

Rollup Architecture: Vitalik’s rollup blog posts
Fraud Proofs: Optimism documentation
ZK Proofs: “Proofs, Arguments, and Zero-Knowledge” by Justin Thaler
Bridge Security: “Mastering Ethereum” - Antonopoulos

Difficulty: Expert
Time estimate: 2-3 months
Prerequisites: Completed Projects 1-3, understand both Bitcoin and Ethereum

Learning milestones:

Batch transactions and post to L1 - You understand data availability
Execute and commit state - You understand rollup state machines
Fraud/validity proofs work - You understand the security model
Bridge deposits and withdrawals - You understand L1↔L2 interop

Real World Outcome

When you complete this project, you’ll have built a complete Layer-2 rollup system—this is how Optimism, Arbitrum, and zkSync scale Ethereum:

1. Deposit from L1 to L2

$ ./rollup-cli deposit --amount 10 --token ETH --to 0xYourL2Address

L1 DEPOSIT TRANSACTION:
════════════════════════════════════════════════════════════
L1 Bridge Contract: 0x1234...abcd
Tokens Locked: 10 ETH
Destination (L2): 0xYourL2Address
L1 Block: 18,245,672
L1 Tx Hash: 0x7f3a...

Waiting for L2 sequencer to process deposit...
[3 seconds later]

L2 DEPOSIT CONFIRMED:
L2 Balance updated: 0xYourL2Address = 10 ETH
L2 Block: 1,824,332
════════════════════════════════════════════════════════════

2. Execute Transactions on L2 (Cheap!)

$ ./rollup-cli transfer --to 0xAlice --amount 1 --fee 0.0001

L2 TRANSACTION:
════════════════════════════════════════════════════════════
From: 0xYour...
To: 0xAlice...
Amount: 1 ETH
L2 Gas Price: 0.0001 gwei (a small fraction of typical L1 gas prices)
L2 Gas Used: 21,000
Total Cost: 2.1 gwei, well under a cent (vs ~$2 on L1)

Status: ✓ CONFIRMED in L2 block 1,824,335
L2 Tx Hash: 0x9a2c...
════════════════════════════════════════════════════════════

$ ./rollup-cli balance 0xAlice
L2 Balance: 1.0 ETH

3. Sequencer Batches and Posts to L1

$ ./rollup-sequencer view-batch 1824

BATCH #1824 (L2 Blocks 1,824,300 - 1,824,400)
════════════════════════════════════════════════════════════
L2 Transactions: 4,872 txs
Compressed Data Size: 48 KB (from 1.2 MB uncompressed)
Compression Ratio: 25x

State Root (old): 0x7a3f...
State Root (new): 0x9c2e...

POSTING TO L1...
L1 Gas Cost: 1,200,000 gas (~$40)
Cost per L2 tx: $0.008 (shared across 4,872 txs!)

L1 Batch Transaction: 0xb3f7...
Challenge Period: 7 days (until block 18,296,000)
════════════════════════════════════════════════════════════

4. Fraud Proof Challenge (Optimistic Rollup)

$ ./rollup-verifier detect-fraud --batch 1824

FRAUD DETECTED IN BATCH #1824!
════════════════════════════════════════════════════════════
Claimed State Root: 0x9c2e...
Actual State Root: 0x8d1f... (MISMATCH!)

Transaction causing fraud: Tx #2,341 in batch
  Claimed: Transfer 5 ETH from 0xBob to 0xEve
  Problem: 0xBob only has 2 ETH (insufficient balance!)

SUBMITTING FRAUD PROOF TO L1...
════════════════════════════════════════════════════════════

L1 FRAUD PROOF VERIFICATION:
1. Loading batch from L1 calldata ✓
2. Re-executing Tx #2,341 on-chain ✓
3. Computing correct state root ✓
4. Comparing: 0x8d1f... ≠ 0x9c2e... ✓ FRAUD CONFIRMED

SLASHING SEQUENCER:
  - Sequencer bond: 1000 ETH
  - Slashed: 100 ETH (10%)
  - Challenger reward: 10 ETH
  - Batch REVERTED
  - New challenge period started

All withdrawals from this batch are now invalid!
════════════════════════════════════════════════════════════

5. Withdraw from L2 Back to L1

$ ./rollup-cli withdraw --amount 5 --to 0xYourL1Address

L2 WITHDRAWAL INITIATED:
════════════════════════════════════════════════════════════
Amount: 5 ETH
L2 Balance after: 4 ETH
L2 Block: 1,825,100

Merkle Proof Generated:
  Proof Hash: 0x3f7a...
  State Root: 0x5c2e...

Waiting for batch to be posted to L1...
[10 minutes later]

BATCH POSTED TO L1 (Batch #1825)
Challenge period: 7 days (Optimistic Rollup)

You can finalize withdrawal after: 2025-01-05 14:30:00 UTC
════════════════════════════════════════════════════════════

[7 days later]

$ ./rollup-cli finalize-withdrawal --proof-id 0x3f7a...

FINALIZING WITHDRAWAL ON L1:
════════════════════════════════════════════════════════════
Verifying Merkle proof against L1 state root... ✓
Checking no fraud proofs were submitted... ✓
Checking withdrawal not already processed... ✓

L1 Transaction: Transferring 5 ETH to 0xYourL1Address
L1 Gas Cost: 150,000 gas (~$5)

✓ WITHDRAWAL COMPLETE
L1 Balance: 5 ETH received
════════════════════════════════════════════════════════════

The Core Question You’re Answering

“How can Ethereum scale to thousands of transactions per second while maintaining security? What’s the trade-off between Optimistic and ZK rollups?”

Rollups move computation off-chain but keep data on-chain. This is the key insight: you don’t need L1 to execute every transaction, just to store the data so anyone can verify. Optimistic rollups assume posted state roots are valid unless someone challenges them within the dispute window, so they need at least one honest verifier watching. ZK rollups prove correctness cryptographically, so a batch is final as soon as its proof is verified on L1 (no challenge period, no trust in the sequencer).

Concepts You Must Understand First

Stop and research these before coding:

Data Availability
- Why must transaction data be posted to L1 even if execution is off-chain?
- What happens if sequencer withholds data?
- Book Reference: Vitalik’s “Incomplete Guide to Rollups”
State Commitments (Merkle Roots)
- How does a 32-byte hash commit to the entire L2 state?
- What’s in a Merkle proof and why is it logarithmic in size?
- Book Reference: “Mastering Ethereum” Ch. 11 - Antonopoulos
Fraud Proofs (Optimistic)
- How do you prove a state transition was invalid?
- Why do you need to re-execute transactions on L1?
- Why a 7-day challenge period?
- Book Reference: Optimism documentation; Arbitrum Nitro specs
Validity Proofs (ZK-SNARKs)
- How does a ZK proof prove “I executed 1000 transactions correctly” without the verifier re-executing the transactions?
- What are circuits and why are they hard to write?
- Book Reference: “Proofs, Arguments, and Zero-Knowledge” by Justin Thaler
Bridge Security
- How do deposits from L1→L2 work?
- How do withdrawals ensure you can only take what you own?
- What’s a “forced transaction” for censorship resistance?
- Book Reference: “Mastering Ethereum” bridge patterns
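
To see why a 32-byte root can commit to the whole state and why proofs are logarithmic, here is a small Merkle tree sketch. It uses SHA-256 from the standard library as a stand-in for keccak256, requires a non-empty leaf list, and duplicates the last node on odd levels; all three are simplifying assumptions.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()   # stand-in for keccak256

def _next_level(level):
    if len(level) % 2:
        level = level + [level[-1]]        # duplicate last node on odd levels
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def merkle_proof(leaves, index):
    """Collect the sibling hash at every level on the path from leaf to root."""
    proof, level = [], [h(leaf) for leaf in leaves]
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        proof.append(padded[index ^ 1])
        level = _next_level(level)
        index //= 2
    return proof

def verify(root, leaf, index, proof):
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

leaves = [f"account{i}:{i}".encode() for i in range(1024)]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 5)
print(len(proof))                                  # 10 siblings for 1024 leaves
print(verify(root, leaves[5], 5, proof))           # True
print(verify(root, b"account5:999", 5, proof))     # False
```

Doubling the number of accounts adds only one more hash to the proof, which is what keeps withdrawal proofs cheap to verify on L1.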

The Interview Questions They’ll Ask

“Explain the difference between Optimistic and ZK rollups. What are the trade-offs?”
“Why do Optimistic rollups have a 7-day withdrawal delay?”
“What is data availability and why is it critical for rollup security?”
“How does a fraud proof work? Walk me through the on-chain verification.”
“What happens if a rollup sequencer goes offline or becomes malicious?”
“How do rollups achieve 10-100x lower fees than L1?”
“What is a validity proof and how does it differ from a fraud proof?”
“Explain sequencer centralization risks and mitigation strategies.”

Hints in Layers

Hint 1: Start with L1 Bridge Contract

contract L1Bridge {
    mapping(address => uint256) public deposits;
    bytes32 public latestStateRoot;
    uint256 public challengePeriodEnd;

    // Declared so the emit below compiles
    event Deposit(address indexed from, address indexed l2Recipient, uint256 amount);

    function deposit(address l2Recipient) external payable {
        deposits[msg.sender] += msg.value;
        emit Deposit(msg.sender, l2Recipient, msg.value);
        // L2 sequencer watches for Deposit events
    }
}

Hint 2: Batch Compression Matters Instead of storing full transactions, store deltas:

Full tx: [from, to, value, signature, nonce] = 200+ bytes
Compressed: [from_idx, to_idx, value_delta] = 12 bytes
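
A sketch of the 12-byte compressed form, assuming three 4-byte big-endian fields: account indices into a registry kept in rollup state, and a value expressed in some agreed unit. The field widths and helper names are assumptions chosen to match the 12-byte figure.

```python
import struct

COMPRESSED_FORMAT = ">III"   # from_idx, to_idx, value: 3 x 4 bytes, big-endian

def compress_transfer(from_idx: int, to_idx: int, value_units: int) -> bytes:
    """Pack a transfer into 12 bytes using account indices instead of addresses."""
    return struct.pack(COMPRESSED_FORMAT, from_idx, to_idx, value_units)

def decompress_transfer(blob: bytes):
    """Recover (from_idx, to_idx, value_units) from the 12-byte form."""
    return struct.unpack(COMPRESSED_FORMAT, blob)

blob = compress_transfer(7, 42, 1_500_000)
print(len(blob))                   # 12
print(decompress_transfer(blob))   # (7, 42, 1500000)
```

The saving comes from replacing 20-byte addresses with small indices and a 32-byte value with a compact amount, so the L2 node needs the index-to-address table to expand a batch again.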

Hint 3: Fraud Proof Requires On-Chain Execution L1 contract must be able to execute a single L2 transaction:

function verifyFraudProof(
    bytes32 preStateRoot,
    bytes calldata txData,
    bytes32 claimedPostStateRoot
) external {
    bytes32 actualRoot = executeTransaction(preStateRoot, txData);
    require(actualRoot != claimedPostStateRoot, "No fraud");
    slashSequencer();
}
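
Before writing the on-chain version, it can help to model the check off-chain. The Python sketch below replays the Bob-to-Eve transfer from the fraud detection output earlier. The toy state_root (a hash over sorted balances, using SHA-256) stands in for a real Merkle commitment, and all helper names are hypothetical.

```python
import hashlib

def state_root(balances: dict) -> str:
    """Toy commitment: hash of the sorted balances (a real rollup uses a Merkle tree)."""
    encoded = ",".join(f"{addr}:{bal}" for addr, bal in sorted(balances.items()))
    return hashlib.sha256(encoded.encode()).hexdigest()

def execute_transfer(balances: dict, sender: str, recipient: str, amount: int) -> dict:
    """Return the post-state; a transfer the sender cannot afford changes nothing."""
    if balances.get(sender, 0) < amount:
        return dict(balances)
    post = dict(balances)
    post[sender] -= amount
    post[recipient] = post.get(recipient, 0) + amount
    return post

def is_fraud(pre_state: dict, tx: tuple, claimed_post_root: str) -> bool:
    """Re-execute one transaction and compare against the sequencer's claim."""
    return state_root(execute_transfer(pre_state, *tx)) != claimed_post_root

pre = {"0xBob": 2, "0xEve": 0}
tx = ("0xBob", "0xEve", 5)                             # Bob only has 2
dishonest_root = state_root({"0xBob": 0, "0xEve": 5})  # sequencer's false claim
honest_root = state_root(execute_transfer(pre, *tx))

print(is_fraud(pre, tx, dishonest_root))   # True
print(is_fraud(pre, tx, honest_root))      # False
```

The second call matters as much as the first: a challenge against a correct root must fail, otherwise anyone could slash an honest sequencer.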

Books That Will Help

Topic	Book/Resource	Chapter/Section
Rollup fundamentals	Vitalik’s “Incomplete Guide to Rollups”	Full post
Merkle proofs and commitments	Mastering Ethereum by Antonopoulos & Wood	Ch. 11
Optimistic rollup design	Optimism Documentation	Fault Proofs spec
ZK-SNARK theory	Proofs, Arguments, and Zero-Knowledge by Justin Thaler	Ch. 1-3, 10-12
Bridge security patterns	Mastering Ethereum	Ch. 7 (Smart Contracts)
Data availability	Ethereum Research posts	ethereum.org/roadmap/scaling

Common Pitfalls & Debugging

Problem 1: “Bridge allows withdrawing more than deposited”

Why: Not tracking L2 balances correctly, or missing replay protection
Fix: Bridge contract must:
1. Lock tokens on L1 when depositing
2. Verify Merkle proof of L2 balance before allowing withdrawal
3. Mark withdrawal as processed to prevent replay
Test: Try withdrawing same amount twice—second should fail
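
The three bridge rules can be modeled in a few lines before moving to Solidity. This is a toy sketch: ToyBridge, the withdrawal_id key, and the boolean proof_ok flag (standing in for real Merkle proof verification) are all assumptions.

```python
class ToyBridge:
    """Off-chain model of the L1 bridge's bookkeeping."""

    def __init__(self):
        self.locked = 0            # total funds held by the bridge
        self.processed = set()     # withdrawal ids that were already paid out

    def deposit(self, amount: int) -> None:
        self.locked += amount                      # rule 1: lock on deposit

    def withdraw(self, withdrawal_id: str, amount: int, proof_ok: bool) -> int:
        if not proof_ok:                           # rule 2: proof must verify
            raise ValueError("invalid Merkle proof")
        if withdrawal_id in self.processed:        # rule 3: no replays
            raise ValueError("withdrawal already processed")
        if amount > self.locked:
            raise ValueError("bridge does not hold enough funds")
        self.processed.add(withdrawal_id)
        self.locked -= amount
        return amount

bridge = ToyBridge()
bridge.deposit(10)
print(bridge.withdraw("w1", 5, proof_ok=True))     # 5
try:
    bridge.withdraw("w1", 5, proof_ok=True)        # same withdrawal again
except ValueError as err:
    print(err)                                     # withdrawal already processed
```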

Problem 2: “Fraud proof window expires too quickly”

Why: Challenge period too short for verifiers to check
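Fix: Optimistic rollups typically use a challenge period of about 7 days (Optimism and Arbitrum are both in this range). This gives anyone time to submit a fraud proof if the sequencer cheats, even if L1 is congested or censored for a while
Security: Shorter window = weaker security (an attacker only has to delay or censor challengers for a shorter time)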

Problem 3: “Data availability attack: sequencer withholds batch data”

Why: Posting state root without posting transaction data
Fix: MUST post the full transaction data to L1 (as calldata or blobs). A hash of the data alone is not enough, because users can’t reconstruct state or exit without the data itself
Rule: Data availability is more important than computation verification!

Problem 4: “Fraud proof is invalid but gets accepted”

Why: Not verifying the proof correctly on-chain

Fix: L1 contract must:

function challengeStateRoot(
    bytes32 oldRoot,
    bytes32 newRoot,
    bytes calldata txData,
    bytes32[] calldata merkleProof
) external {
    // 1. Verify old state via Merkle proof
    require(verifyMerkleProof(oldRoot, merkleProof), "Invalid old state");

    // 2. Re-execute transaction on-chain
    bytes32 computedNewRoot = executeTx(oldRoot, txData);

    // 3. Compare with claimed new root
    require(computedNewRoot != newRoot, "State root is valid");

    // 4. Slash sequencer, reward challenger
    slashSequencer();
    rewardChallenger(msg.sender);
}

Problem 5: “ZK proof generation takes forever”

Why: ZK proofs are computationally intensive (proving ~1000 EVM opcodes can take minutes)
Fix: Use proof recursion/aggregation. Prove 1000 txs → combine 10 proofs → combine 10 meta-proofs. Final proof is constant size.
Alternative: Start with optimistic rollup (simpler), add ZK later

Problem 6: “Sequencer censorship: can’t get transactions included”

Why: Centralized sequencer ignores certain users
Fix: Implement forced inclusion: users can submit tx directly to L1 contract, sequencer MUST include it within N blocks or get slashed
Decentralization: Use sequencer rotation or shared sequencing (multiple sequencers)
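
A toy model of the forced-inclusion deadline, assuming the L1 contract records the block by which each forced transaction must appear in a batch. The class and method names are hypothetical.

```python
class ForcedInclusionQueue:
    """Tracks transactions users submitted directly to L1 and their deadlines."""

    def __init__(self, deadline_blocks: int):
        self.deadline_blocks = deadline_blocks
        self.pending = {}   # tx_id -> last L1 block at which inclusion is on time

    def force(self, tx_id: str, current_block: int) -> None:
        self.pending[tx_id] = current_block + self.deadline_blocks

    def mark_included(self, tx_id: str) -> None:
        self.pending.pop(tx_id, None)

    def overdue(self, current_block: int) -> list:
        """Transactions the sequencer failed to include in time (slashable)."""
        return sorted(tx for tx, deadline in self.pending.items()
                      if current_block > deadline)

queue = ForcedInclusionQueue(deadline_blocks=100)
queue.force("tx-a", current_block=1_000)
queue.force("tx-b", current_block=1_050)
queue.mark_included("tx-a")            # sequencer complied for tx-a
print(queue.overdue(1_100))            # []
print(queue.overdue(1_151))            # ['tx-b']
```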

Problem 7: “Gas costs explode when posting batches to L1”

Why: Not compressing transaction data
Fix: Use calldata compression:
- Omit default or recoverable fields (for example, drop the sender address, since it can be recovered from the v, r, s signature)
- Use custom encoding (not full RLP)
- Batch similar transactions together
Benchmark: Optimism achieves ~10x compression

Problem 8: “Exit from L2 to L1 fails during network congestion”

Why: Relying on sequencer to process exit
Fix: Implement emergency escape hatch: users can always exit directly via L1 by providing Merkle proof of their L2 balance

Code:

function emergencyWithdraw(
    uint256 amount,
    bytes32[] calldata proof
) external {
    bytes32 leaf = keccak256(abi.encodePacked(msg.sender, amount));
    require(verifyMerkleProof(latestStateRoot, leaf, proof), "Invalid proof");
    require(!isWithdrawn[leaf], "Already withdrawn");

    isWithdrawn[leaf] = true;
    token.transfer(msg.sender, amount);
}

Architecture decision tree:

Optimistic vs ZK Rollup?

Optimistic: Easier to build, EVM-compatible, 7-day withdrawal delay
ZK: Harder to build, requires circuits, withdrawals finalize as soon as the validity proof is verified on L1

Start with: Optimistic (get it working), then explore ZK proofs

Debugging strategy:

Test L1-L2 deposit flow first (simpler)
Test L2 execution in isolation (use your EVM from Project 3)
Test state commitment posting
Test withdrawal flow (hardest - involves proofs)
Test fraud/validity proof verification last

Essential tools:

Hardhat/Foundry for L1 contract development
Your EVM (Project 3) for L2 execution
Circom/ZoKrates for ZK circuits (if doing ZK rollup)

Summary

This comprehensive blockchain learning path covers the complete stack—from cryptographic primitives to distributed consensus to smart contract execution. Here’s the complete list of all 6 projects:

#	Project Name	Main Language	Difficulty	Time Estimate	Key Focus
1	Build Bitcoin From Scratch	Python	Master	1 month+	Cryptography, UTXO model, Proof-of-Work
2	Build a Minimal Blockchain in a Weekend	Python/Rust	Intermediate	Weekend	Core blockchain data structure
3	Build the Ethereum Virtual Machine (EVM) From Scratch	Rust	Master	2-4 weeks	Stack machine, opcodes, gas metering
4	Implement a Proof-of-Stake Consensus	C	Master	2-4 weeks	BFT consensus, game theory, slashing
5	Build a Simple Smart Contract Compiler	Rust	Master	1 month+	Compilers, code generation, ABI
6	Build a Full Layer-2 Rollup	Multiple	Expert	2-3 months	Scaling, fraud proofs, bridges

Recommended Learning Paths

Choose your path based on what interests you most:

Path 1: Bitcoin & Cryptography Deep Dive

For: Those fascinated by cryptographic systems and decentralized money
Sequence:

Project 2 (Weekend) - Get the core blockchain mental model
Project 1 (1 month) - Deep dive into Bitcoin’s cryptography
Project 4 (2-4 weeks) - Understand modern consensus (PoS)

Expected outcome: You’ll understand Bitcoin at the implementation level, know elliptic curve cryptography intimately, and grasp why Proof-of-Stake is different.

Path 2: Ethereum & Smart Contracts

For: Those building decentralized applications or auditing smart contracts
Sequence:

Project 2 (Weekend) - Understand blockchain basics
Project 3 (2-4 weeks) - Build the EVM to understand execution
Project 5 (1 month) - Build a compiler to understand gas costs
Project 6 (2-3 months) - Understand scaling with rollups

Expected outcome: You’ll understand every opcode in the EVM, why Solidity has its quirks, how gas is metered, and how Layer-2 scaling works.

Path 3: Distributed Systems & Consensus

For: Those interested in distributed algorithms and system design
Sequence:

Project 2 (Weekend) - Blockchain as a distributed data structure
Project 4 (2-4 weeks) - Byzantine Fault Tolerant consensus
Project 6 (2-3 months) - Rollups as distributed systems

Expected outcome: You’ll understand Byzantine Fault Tolerance, economic security models, and how to build systems that work despite malicious actors.

Path 4: Full-Stack Blockchain Engineer (Complete Path)

For: Those who want comprehensive understanding of all blockchain layers
Sequence:

Project 2 (Weekend) - Foundation
Project 1 (1 month) - Cryptography layer
Project 3 (2-4 weeks) - Execution layer
Project 4 (2-4 weeks) - Consensus layer
Project 5 (1 month) - Developer tools layer
Project 6 (2-3 months) - Scaling layer

Total time: 5-7 months of dedicated learning
Expected outcome: You’ll understand blockchain systems from first principles and be able to read any blockchain’s source code, audit smart contracts, design consensus mechanisms, and architect scaling solutions.

Expected Outcomes After Completing These Projects

After working through all 6 projects, you will be able to:

Read and understand any blockchain’s source code
- Bitcoin Core, Geth (Ethereum), Solana runtime, etc.
- Trace how a transaction flows from submission to finality
Audit smart contracts for security vulnerabilities
- Understand gas optimization techniques
- Identify reentrancy, integer overflow, and other common bugs
- Read EVM bytecode and assembly
Design consensus mechanisms
- Understand the trade-offs between PoW, PoS, and BFT consensus
- Design economic incentives to align validators
- Implement slashing conditions
Build blockchain infrastructure
- Write indexers, block explorers, or analytics tools
- Implement custom opcodes or precompiles
- Build development tools (debuggers, profilers)
Architect scaling solutions
- Design Layer-2 rollups or sidechains
- Understand data availability sampling
- Implement bridge contracts securely
Answer technical interview questions confidently
- Explain cryptographic primitives (ECC, hash functions, Merkle trees)
- Discuss consensus trade-offs
- Compare blockchain architectures (UTXO vs Account model)
Contribute to open-source blockchain projects
- Ethereum clients (Geth, Reth, Nethermind)
- Layer-2 solutions (Optimism, Arbitrum, zkSync)
- Bitcoin Core or Lightning Network

Difficulty Progression

The projects are designed with increasing complexity:

Difficulty Curve:

Easy        Project 2 (Minimal Blockchain)
  │              │
  │              ▼
  │         [Core mental model established]
  │              │
Medium     ──────┘
  │
  │         Project 3 (EVM)          Project 4 (PoS)
  │              │                        │
  │              ▼                        ▼
  │         [Execution layer]      [Consensus layer]
  │              │                        │
Advanced   ─────┴────────────────────────┘
  │                                       │
  │         Project 1 (Bitcoin)      Project 5 (Compiler)
  │              │                        │
  │              ▼                        ▼
  │         [Cryptography]           [Developer tools]
  │              │                        │
Expert     ─────┴────────────────────────┘
                                          │
                                          ▼
                                    Project 6 (Rollup)
                                    [Full synthesis]

Start with Project 2 to build intuition, then choose your path based on interests.

Time Investment Guide

If you have 1 weekend:

Complete Project 2 (Minimal Blockchain)
Outcome: Core mental model of how blockchains work

If you have 1 month:

Weekend: Project 2
Week 1-2: Project 3 (EVM) or Project 1 (Bitcoin)
Week 3-4: Deepen chosen project or start second project
Outcome: Deep understanding of either execution (Ethereum) or cryptography (Bitcoin)

If you have 3 months:

Month 1: Projects 2 + 3 (Blockchain basics + EVM)
Month 2: Project 1 (Bitcoin) or Project 4 (PoS)
Month 3: Project 5 (Compiler) or Project 6 (Rollup)
Outcome: Comprehensive blockchain developer skills

If you have 6+ months:

Complete all 6 projects in sequence
Contribute to open-source projects between projects
Build your own blockchain or dApp as a capstone
Outcome: Expert-level blockchain engineering skills

Interview Preparation Map

Each project maps directly to common interview topics:

Interview Topic	Covered in Project	Key Questions
Cryptography	Project 1	ECDSA, hash functions, Merkle trees
Consensus Algorithms	Projects 1, 4	PoW vs PoS, Byzantine Fault Tolerance, finality
Smart Contracts	Projects 3, 5	EVM execution, gas optimization, security
Scaling Solutions	Project 6	Layer-2 rollups, data availability, bridges
Blockchain Architecture	Project 2	UTXO vs Account model, immutability, forks
Distributed Systems	Projects 4, 6	Fault tolerance, consensus, state machines

After completing these projects, you’ll confidently answer questions like:

“Walk me through what happens when you send a Bitcoin transaction.”
“Explain how the EVM executes a smart contract.”
“What’s the difference between Optimistic and ZK rollups?”
“How does proof-of-stake achieve finality?”
“Why does Solidity have gas costs?”

Next Steps After Completion

Once you’ve finished these projects:

Contribute to Open Source
- Ethereum clients: Geth, Reth, Nethermind
- Bitcoin: Bitcoin Core, Lightning Network
- Layer-2: Optimism, Arbitrum, zkSync
Build Your Own Project
- Novel consensus mechanism
- Domain-specific blockchain
- DeFi protocol or DAO
- Developer tooling
Specialize Further
- ZK cryptography (SNARKs, STARKs)
- MEV (Maximal Extractable Value)
- Blockchain security auditing
- Protocol research
Apply Your Skills
- Blockchain engineer at Web3 company
- Smart contract auditor
- Protocol researcher
- Developer relations / education

Additional Resources

Beyond the books referenced in each project, explore:

Blogs & Research:

Vitalik Buterin’s blog (vitalik.eth.limo)
A16z Crypto Research
Trail of Bits blockchain security research

Communities:

Ethereum Research Forum (ethresear.ch)
Bitcoin Stack Exchange
/r/cryptography, /r/ethereum, /r/bitcoin

Courses (After Projects):

Stanford CS 251: Cryptocurrencies and Blockchain Technologies
Berkeley CS 294-144: Blockchain and Cryptocurrencies
MIT 15.S12: Blockchain and Money

Practice:

Ethernaut (smart contract CTF)
Capture the Ether
Damn Vulnerable DeFi

Sources

GitHub: Blockchain Development Resources - Comprehensive resource collection
Bitcoin Whitepaper - Original Satoshi paper
Programming Bitcoin by Jimmy Song - O’Reilly
Mastering Bitcoin GitHub - Free 3rd edition
EVM From Scratch - W1nt3r.eth’s course
EVM From Scratch Book - Jupyter notebooks
Mastering Ethereum Ch. 13 - EVM - GitBook
Ethereum.org EVM Docs - Official documentation