P06: HD Wallet (BIP-32/BIP-39) Implementation
Project Overview
| Attribute | Value |
|---|---|
| Main Language | Rust |
| Alternative Languages | Go, Python, TypeScript |
| Difficulty | Expert |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | Resume Gold (Educational/Personal Brand) |
| Knowledge Area | Cryptography / Key Management |
| Main Book | “Mastering Bitcoin” by Andreas M. Antonopoulos |
Learning Objectives
By completing this project, you will:
- Master mnemonic seed phrase generation, understanding how entropy is encoded into memorable words (BIP-39)
- Implement hierarchical key derivation, using HMAC-SHA512 to derive a practically unlimited tree of child keys (up to 2^32 children per node) from a single master seed (BIP-32)
- Understand the security tradeoffs between hardened and normal derivation paths
- Build a multi-currency wallet supporting Bitcoin, Ethereum, and other chains from a single recovery phrase
- Parse and validate derivation paths, implementing the m/44'/60'/0'/0/0 notation used by modern wallets
Deep Theoretical Foundation
The Problem: Key Management at Scale
Before hierarchical deterministic (HD) wallets, cryptocurrency users faced a nightmare:
- Random key generation: Each new address required a new random private key
- Backup complexity: Users needed to back up every single key separately
- No organization: Keys were just a bag of random numbers with no structure
- Recovery nightmare: Lose one backup file, lose those coins forever
Imagine managing 100 different addresses, each with its own private key, each needing secure backup. This was the reality of Bitcoin in its early days.
The Solution: Deterministic Hierarchy
HD wallets solve this elegantly:
- Single seed: All keys derive from one master secret
- Deterministic: The same seed always produces the same keys in the same order
- Hierarchical: Keys are organized in a tree structure with meaningful paths
- Human-readable backup: The seed is encoded as 12-24 English words
With an HD wallet, you back up 24 words once, and you can recover every address you’ll ever create.
BIP-39: Mnemonic Seed Phrases
BIP-39 defines how to convert entropy into memorable words and back into a binary seed.
The Word List
BIP-39 specifies a list of 2048 words (2^11 = 2048, so each word encodes 11 bits). The English word list includes words like:
abandon, ability, able, about, above, absent, absorb, abstract, absurd, abuse...
Why 2048 words?
- Powers of 2 enable clean bit-to-word mapping
- 2048 words is enough for variety while remaining memorable
- Each word can be identified by its first 4 letters (no duplicates)
From Entropy to Mnemonic
The process:
1. Generate entropy (128-256 bits of random data)
2. Calculate checksum: SHA256(entropy)[first N bits]
where N = entropy_bits / 32
3. Append checksum to entropy
4. Split into 11-bit groups
5. Each group indexes into the word list
Example for 128-bit entropy:
Entropy: 128 bits (16 bytes)
Checksum: 128/32 = 4 bits (first 4 bits of SHA256)
Total: 132 bits
Groups: 132/11 = 12 words
For 256-bit entropy:
Entropy: 256 bits (32 bytes)
Checksum: 256/32 = 8 bits
Total: 264 bits
Groups: 264/11 = 24 words
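The arithmetic above can be sketched in a few lines of Rust. This is a minimal illustration, not part of any BIP-39 library: the function name `mnemonic_layout` and its tuple return type are assumptions made for this example.

```rust
/// Returns (checksum_bits, total_bits, word_count) for a BIP-39 entropy size,
/// or None if the size is not one of the five lengths BIP-39 allows.
/// Hypothetical helper for illustration only.
pub fn mnemonic_layout(entropy_bits: u32) -> Option<(u32, u32, u32)> {
    // BIP-39 allows 128 to 256 bits of entropy, in steps of 32 bits.
    if !(128..=256).contains(&entropy_bits) || entropy_bits % 32 != 0 {
        return None;
    }
    // One checksum bit per 32 bits of entropy.
    let checksum_bits = entropy_bits / 32;
    let total_bits = entropy_bits + checksum_bits;
    // Every word encodes 11 bits; total_bits is always a multiple of 11 here.
    Some((checksum_bits, total_bits, total_bits / 11))
}
```

Calling `mnemonic_layout(128)` gives `Some((4, 132, 12))`, matching the 12-word example, and `mnemonic_layout(256)` gives `Some((8, 264, 24))`.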
Visual Representation
Entropy Generation
|
+------------+------------+
| 128-256 bits |
| (random bytes) |
+------------+------------+
|
v
+------------------------+
| SHA256(entropy) |
| Take first N bits |
| (checksum) |
+------------------------+
|
+------------+------------+
| entropy || checksum |
| 132-264 bits total |
+------------+------------+
|
+---------+---------+---------+---------+
| 11 bits | 11 bits | 11 bits | ... |
+----+----+----+----+----+----+---------+
| | |
v v v
+--------+ +--------+ +--------+
| word 1 | | word 2 | | word 3 | ...
+--------+ +--------+ +--------+
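The pipeline in the diagram can be sketched as follows. To stay dependency-free, this sketch assumes the caller has already computed SHA-256 of the entropy and passes in only the first byte of the digest; the function name `word_indices` and that parameter are assumptions for this example, and mapping indices to words is left to a word list lookup.

```rust
/// Splits entropy || checksum into 11-bit word indices.
/// `sha256_first_byte` is assumed to be byte 0 of SHA-256(entropy),
/// computed elsewhere. `entropy` is assumed to be 16, 20, 24, 28, or 32 bytes.
pub fn word_indices(entropy: &[u8], sha256_first_byte: u8) -> Vec<u16> {
    let entropy_bits = entropy.len() * 8;
    let checksum_bits = entropy_bits / 32; // 4 to 8 bits
    let total_bits = entropy_bits + checksum_bits;

    // Append the digest byte; only its top `checksum_bits` bits are ever read.
    let mut data = entropy.to_vec();
    data.push(sha256_first_byte);

    (0..total_bits / 11)
        .map(|word| {
            let mut index: u16 = 0;
            // Read 11 bits, most significant bit first.
            for bit in word * 11..(word + 1) * 11 {
                let byte = data[bit / 8];
                let bit_value = (byte >> (7 - bit % 8)) & 1;
                index = (index << 1) | bit_value as u16;
            }
            index
        })
        .collect()
}
```

For 16 zero bytes the SHA-256 digest starts with 0x37, so the four checksum bits are 0011 and the indices come out as eleven 0s followed by 3, which is "abandon" eleven times and then "about".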
From Mnemonic to Seed
The mnemonic is not the seed directly. It’s converted using PBKDF2:
seed = PBKDF2(
password = mnemonic_words (space-separated),
salt = "mnemonic" + optional_passphrase,
iterations = 2048,
key_length = 64 bytes (512 bits),
hash_function = HMAC-SHA512
)
Why PBKDF2?
- Adds computational cost to brute-force attacks
- Allows an optional passphrase (the “25th word”)
- Produces a 512-bit seed for BIP-32
The optional passphrase:
- Acts as a second factor for wallet recovery
- Different passphrases produce completely different wallets
- Plausible deniability: same mnemonic with different passphrases = different wallets
BIP-32: Hierarchical Deterministic Wallets
BIP-32 defines how to derive a tree of keys from a single seed.
Master Key Generation
1. Take the 512-bit seed from BIP-39
2. Calculate: I = HMAC-SHA512(key="Bitcoin seed", data=seed)
3. Split I into two 256-bit halves:
- IL (left 32 bytes) = master private key
- IR (right 32 bytes) = master chain code
The chain code is crucial: it serves as the HMAC key in every child derivation, so a private or public key on its own, without the matching chain code, is not enough to derive child keys.
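The split and validity check can be sketched without any big-integer library, because two equal-length big-endian byte arrays compare numerically when compared lexicographically. The function name `split_master` is an assumption for this example, and the 64-byte input is assumed to be the HMAC-SHA512 output computed elsewhere.

```rust
/// Order n of the secp256k1 group, big-endian.
pub const SECP256K1_ORDER: [u8; 32] = [
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE,
    0xBA, 0xAE, 0xDC, 0xE6, 0xAF, 0x48, 0xA0, 0x3B,
    0xBF, 0xD2, 0x5E, 0x8C, 0xD0, 0x36, 0x41, 0x41,
];

/// Splits I into (IL, IR) and rejects an IL that is not a valid private key.
/// Hypothetical helper: `i` is assumed to be HMAC-SHA512("Bitcoin seed", seed).
pub fn split_master(i: &[u8; 64]) -> Option<([u8; 32], [u8; 32])> {
    let mut il = [0u8; 32];
    let mut ir = [0u8; 32];
    il.copy_from_slice(&i[..32]);
    ir.copy_from_slice(&i[32..]);
    // Array comparison is lexicographic, which equals numeric comparison
    // for big-endian integers of the same length.
    if il == [0u8; 32] || il >= SECP256K1_ORDER {
        return None; // master key is invalid
    }
    Some((il, ir))
}
```

A `None` result is astronomically unlikely with real HMAC output, but the check is cheap and the specification requires it.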
Extended Keys
An “extended key” combines:
- The key itself (32 bytes private or 33 bytes public)
- The chain code (32 bytes)
- Metadata (depth, parent fingerprint, child index)
This is what you serialize and share as xprv... or xpub... strings.
Extended Private Key (78 bytes):
+--------+----------+--------+----------+-------+-----------+
| Version| Depth | Parent | Child | Chain | Key |
| 4 bytes| 1 byte | 4 bytes| Index | Code | Data |
| | | finger | 4 bytes | 32b | 33 bytes |
+--------+----------+--------+----------+-------+-----------+
Version codes:
0x0488ADE4 = xprv (mainnet private)
0x0488B21E = xpub (mainnet public)
0x04358394 = tprv (testnet private)
0x043587CF = tpub (testnet public)
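The 78-byte layout can be sketched as a plain byte-packing function. The name `serialize_payload` and its parameter list are assumptions for this example; Base58Check encoding of the result is a separate step and is not shown here.

```rust
/// Version bytes for a mainnet extended private key (xprv).
pub const XPRV_VERSION: [u8; 4] = [0x04, 0x88, 0xAD, 0xE4];

/// Packs the extended-key fields into the 78-byte payload that is later
/// Base58Check-encoded. Hypothetical helper for illustration.
pub fn serialize_payload(
    version: [u8; 4],
    depth: u8,
    parent_fingerprint: [u8; 4],
    child_index: u32,
    chain_code: &[u8; 32],
    key_data: &[u8; 33], // 0x00 || private key, or compressed public key
) -> [u8; 78] {
    let mut out = [0u8; 78];
    out[0..4].copy_from_slice(&version);
    out[4] = depth;
    out[5..9].copy_from_slice(&parent_fingerprint);
    // The child index is serialized big-endian.
    out[9..13].copy_from_slice(&child_index.to_be_bytes());
    out[13..45].copy_from_slice(chain_code);
    out[45..78].copy_from_slice(key_data);
    out
}
```

The field sizes add up to 4 + 1 + 4 + 4 + 32 + 33 = 78 bytes, which is why the key data for a private key carries a leading 0x00 pad byte.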
Child Key Derivation Function (CKD)
This is the heart of BIP-32. Given a parent key and an index, derive a child key.
For private parent to private child:
Input: parent_private_key (k), parent_chain_code (c), index (i)
If i >= 2^31 (hardened):
I = HMAC-SHA512(key=c, data=0x00 || k || i)
Else (normal):
I = HMAC-SHA512(key=c, data=point(k) || i)
IL, IR = split I into 256-bit halves
child_key = (IL + k) mod n // n = secp256k1 order
child_chain_code = IR
Output: (child_key, child_chain_code)
For public parent to public child (normal derivation only):
Input: parent_public_key (K), parent_chain_code (c), index (i)
If i >= 2^31:
FAIL // Cannot derive hardened from public
Else:
I = HMAC-SHA512(key=c, data=K || i)
IL, IR = split I into 256-bit halves
child_key = point(IL) + K // Point addition
child_chain_code = IR
Output: (child_key, child_chain_code)
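The only difference between the hardened and normal branches of CKDpriv is the 37-byte message fed to HMAC-SHA512. A minimal sketch of that step follows; the function name `ckd_hmac_data` is an assumption for this example, and the HMAC call and the modular addition are left out.

```rust
pub const HARDENED_OFFSET: u32 = 0x8000_0000; // 2^31

/// Builds the HMAC-SHA512 message for private-parent derivation.
/// Hypothetical helper: `parent_public_key` is assumed to be the
/// 33-byte compressed encoding of point(k).
pub fn ckd_hmac_data(
    parent_private_key: &[u8; 32],
    parent_public_key: &[u8; 33],
    index: u32,
) -> [u8; 37] {
    let mut data = [0u8; 37];
    if index >= HARDENED_OFFSET {
        // Hardened: 0x00 || k || i (data[0] is already 0x00)
        data[1..33].copy_from_slice(parent_private_key);
    } else {
        // Normal: compressed point(k) || i
        data[..33].copy_from_slice(parent_public_key);
    }
    // The index is always appended as 4 big-endian bytes.
    data[33..].copy_from_slice(&index.to_be_bytes());
    data
}
```

Both branches produce exactly 37 bytes: the 0x00 pad in the hardened case keeps the length equal to that of a compressed public key.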
Hardened vs Normal Derivation
This is one of the most important security concepts in HD wallets.
Normal derivation (index 0 to 2^31-1):
- Uses compressed public key in HMAC input
- Public key can derive public children
- If a child private key leaks along with parent public key + chain code,
all sibling private keys can be computed!
Hardened derivation (index 2^31 to 2^32-1, shown as i’):
- Uses private key in HMAC input
- Public key CANNOT derive children
- Even if a child private key leaks, siblings are protected
Visual comparison:
Normal Derivation (i < 2^31):
Parent Private ──────────────────▶ Child Private
│ │
│ point() │ point()
▼ ▼
Parent Public ───────────────────▶ Child Public
▲ ▲
│ │
(can derive children from xpub) (anyone can verify)
Hardened Derivation (i >= 2^31):
Parent Private ──────────────────▶ Child Private
│ │
│ point() │ point()
▼ ▼
Parent Public X Child Public
▲ │ ▲
│ (blocked) │
(cannot derive hardened children) (anyone can verify)
Why does this matter?
Scenario: You run an e-commerce site and want to generate fresh Bitcoin addresses for each customer. You share your xpub with your server so it can derive new addresses without ever knowing private keys.
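- If you use normal derivation and an attacker steals both (1) your xpub and (2) any one child private key derived from it, they can compute the private key behind that xpub and, from it, every other key beneath it.
- If the levels above the shared xpub are hardened (purpose, coin type, account), the damage stops there: the attacker cannot climb from the compromised account key to the master key or to sibling accounts.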
The standard practice: hardened derivation for purpose, coin type, and account; normal derivation for change and address index (so you can share account xpubs).
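The reason a leaked child key is so dangerous under normal derivation is plain modular arithmetic: child = (IL + parent) mod n, and IL can be recomputed by anyone holding the xpub. The sketch below shows the subtraction with a toy modulus; `TOY_ORDER`, `derive_child`, and `recover_parent` are made-up names, and the small prime merely stands in for the 256-bit secp256k1 order so the numbers fit in a u64.

```rust
/// Toy stand-in for the secp256k1 order n (assumption: the real n is a
/// 256-bit prime; a small prime keeps this example in u64 arithmetic).
pub const TOY_ORDER: u64 = 1_000_000_007;

/// child = (IL + parent) mod n, as in CKDpriv.
pub fn derive_child(parent_key: u64, il: u64) -> u64 {
    (il % TOY_ORDER + parent_key % TOY_ORDER) % TOY_ORDER
}

/// What an attacker does with a leaked child key and an xpub:
/// IL is computable from the xpub, so parent = (child - IL) mod n.
pub fn recover_parent(child_key: u64, il: u64) -> u64 {
    (child_key % TOY_ORDER + TOY_ORDER - il % TOY_ORDER) % TOY_ORDER
}
```

With hardened derivation the same subtraction would work in principle, but the attacker cannot compute IL, because the HMAC input contains the parent private key rather than the public key.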
BIP-44: Multi-Account Hierarchy
BIP-44 standardizes the derivation path structure:
m / purpose' / coin_type' / account' / change / address_index
Each level:
| Level | Hardened? | Purpose |
|---|---|---|
| purpose’ | Yes | Always 44’ for BIP-44 |
| coin_type’ | Yes | Cryptocurrency identifier (0’=BTC, 60’=ETH, 2’=LTC) |
| account’ | Yes | User’s separate accounts (0’, 1’, 2’…) |
| change | No | 0=external (receiving), 1=internal (change) |
| address_index | No | Sequential address number (0, 1, 2…) |
Common paths:
Bitcoin mainnet: m/44'/0'/0'/0/0
Ethereum mainnet: m/44'/60'/0'/0/0
Bitcoin testnet: m/44'/1'/0'/0/0
Bitcoin first change: m/44'/0'/0'/1/0
Second account ETH: m/44'/60'/1'/0/0
Path notation:
- m = master key (from seed)
- / = derivation step
- ' = hardened derivation (index + 2^31)
- Number = index
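The five BIP-44 levels and the path notation can be sketched as a pair of small functions. The names `bip44_indices` and `format_path` are assumptions for this example; the first returns the raw 32-bit indices that would be fed to child derivation, the second renders them back into path notation.

```rust
pub const HARDENED_OFFSET: u32 = 0x8000_0000; // 2^31

/// Raw child indices for m/44'/coin_type'/account'/change/address_index.
/// Returns None if a value does not fit below 2^31 or change is not 0 or 1.
/// Hypothetical helper for illustration.
pub fn bip44_indices(
    coin_type: u32,
    account: u32,
    change: u32,
    address_index: u32,
) -> Option<[u32; 5]> {
    let in_range = [coin_type, account, address_index]
        .iter()
        .all(|&i| i < HARDENED_OFFSET);
    if !in_range || change > 1 {
        return None;
    }
    Some([
        44 + HARDENED_OFFSET,        // purpose'
        coin_type + HARDENED_OFFSET, // coin_type'
        account + HARDENED_OFFSET,   // account'
        change,                      // 0 = external, 1 = internal
        address_index,
    ])
}

/// Renders raw indices in path notation, marking hardened levels with '.
pub fn format_path(indices: &[u32]) -> String {
    let mut path = String::from("m");
    for &i in indices {
        if i >= HARDENED_OFFSET {
            path.push_str(&format!("/{}'", i - HARDENED_OFFSET));
        } else {
            path.push_str(&format!("/{}", i));
        }
    }
    path
}
```

For example, `format_path(&bip44_indices(60, 0, 0, 0).unwrap())` yields `m/44'/60'/0'/0/0`, the first Ethereum path from the list above.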
Key Recovery: The Full Circle
When you enter your 24 words into a new wallet:
1. Validate words exist in BIP-39 word list
2. Convert words back to entropy + checksum bits
3. Verify checksum matches
4. Apply PBKDF2 with mnemonic and passphrase
5. Use BIP-32 to derive master key
6. Apply BIP-44 paths to regenerate all addresses
7. Scan blockchain for transactions at those addresses
This is why the order of words matters, why typos fail validation, and why the same 24 words always recover the same wallet.
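Step 2 of the recovery flow, turning word indices back into entropy and checksum bits, can be sketched as below. The function name `split_entropy_and_checksum` is an assumption for this example; looking words up in the word list (step 1) and comparing the returned checksum against SHA-256 of the entropy (step 3) are assumed to happen elsewhere.

```rust
/// Reassembles the 11-bit word indices into (entropy bytes, checksum bits).
/// Returns None for an invalid word count or an index above 2047.
/// Hypothetical helper for illustration.
pub fn split_entropy_and_checksum(indices: &[u16]) -> Option<(Vec<u8>, u8)> {
    // BIP-39 mnemonics have 12, 15, 18, 21, or 24 words.
    if !(12..=24).contains(&indices.len()) || indices.len() % 3 != 0 {
        return None;
    }
    if indices.iter().any(|&i| i > 2047) {
        return None;
    }
    let total_bits = indices.len() * 11;
    let checksum_bits = total_bits / 33; // 1 checksum bit per 32 entropy bits
    let entropy_bits = total_bits - checksum_bits;

    // Bit `pos` of the concatenated stream, most significant bit first.
    let bit_at = |pos: usize| -> u8 { ((indices[pos / 11] >> (10 - pos % 11)) & 1) as u8 };

    let mut entropy = vec![0u8; entropy_bits / 8];
    for pos in 0..entropy_bits {
        entropy[pos / 8] |= bit_at(pos) << (7 - pos % 8);
    }
    let mut checksum = 0u8;
    for pos in entropy_bits..total_bits {
        checksum = (checksum << 1) | bit_at(pos);
    }
    Some((entropy, checksum))
}
```

Because every word contributes bits at a fixed position, swapping two words changes the entropy, and a mistyped word almost always breaks the checksum comparison in step 3.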
Security Considerations
Entropy Quality
The security of your entire wallet depends on entropy quality:
- Use a cryptographically secure random number generator (CSPRNG)
- Never use predictable sources (dates, names, keyboard patterns)
- 128 bits = ~2^128 attempts to brute force (sufficient)
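- 256 bits = larger safety margin: even Grover's algorithm on a quantum computer would need roughly 2^128 operations to search the entropy (this does not make the derived secp256k1 keys themselves quantum-resistant)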
Passphrase Considerations
The optional passphrase:
- Advantage: Second factor, plausible deniability
- Risk: Forget it = lose everything (not recoverable from mnemonic alone)
- Risk: Weak passphrase = brute-forceable with known mnemonic
- Best practice: Either use a strong passphrase or none at all
Extended Public Key Exposure
Sharing an xpub:
- Reveals all past and future addresses derived from it
- Privacy concern: anyone with xpub can track your balance
- Combined with a leaked child private key = compromise of the private key behind that xpub and every key beneath it (normal derivation)
- Safe for: Watch-only wallets, address generation servers (with hardened account)
Complete Project Specification
Functional Requirements
- Mnemonic Generation (BIP-39)
- Generate cryptographically random entropy (128, 160, 192, 224, or 256 bits)
- Calculate and append checksum
- Convert to mnemonic word sequence
- Support English word list (optional: other languages)
- Seed Derivation (BIP-39)
- Implement PBKDF2-HMAC-SHA512
- Support optional passphrase
- Produce 512-bit seed
- Master Key Generation (BIP-32)
- Derive master private key and chain code from seed
- Validate key is within secp256k1 group order
- Child Key Derivation (BIP-32)
- Implement CKDpriv (private parent to private child)
- Implement CKDpub (public parent to public child)
- Support both normal and hardened derivation
- Handle edge cases (IL >= n, result = 0)
- Path Parsing (BIP-44)
- Parse derivation paths like m/44'/60'/0'/0/0
- Apply sequential derivations
- Validate path format
- Address Generation
- Derive public key from private key (secp256k1)
- Generate Bitcoin addresses (P2PKH, P2WPKH)
- Generate Ethereum addresses (keccak256)
- Serialization
- Encode extended keys as Base58Check (xprv, xpub)
- Decode and validate extended key strings
Command-Line Interface
# Generate new mnemonic
$ hdwallet generate --words 12
Mnemonic: abandon abandon abandon abandon abandon abandon abandon abandon
abandon abandon abandon about
Seed: 5eb00bbddcf069084889a8ab9155568165f5c453ccb85e70811aaed6f6da5fc19a5ac40b389cd370d086206dec8aa6c43daea6690f20ad3d8d48b2d2ce9e38e4
# Derive keys from mnemonic
$ hdwallet derive --mnemonic "abandon..." --path "m/44'/60'/0'/0/0"
Path: m/44'/60'/0'/0/0
Private Key: 0x...
Public Key: 0x...
Ethereum Address: 0x...
# Generate Bitcoin addresses
$ hdwallet derive --mnemonic "abandon..." --path "m/44'/0'/0'/0" --count 5
Address 0: 1...
Address 1: 1...
Address 2: 1...
Address 3: 1...
Address 4: 1...
# Export extended public key
$ hdwallet xpub --mnemonic "abandon..." --path "m/44'/0'/0'"
xpub: xpub6BosfCnifzxcFwrSzQiqu2DBVTshkCXacvNsWGYJVVhhawA7d4R5WSWGFNbi8Aw6ZRc1brxMyWMzG3DSSSSoekkudhUd9yLb6qx39T9nMdj
# Recover wallet
$ hdwallet recover --mnemonic "abandon..." --passphrase "optional"
Master Private Key: xprv...
Master Public Key: xpub...
Solution Architecture
Module Structure
src/
├── main.rs # CLI entry point
├── lib.rs # Public API
├── mnemonic/
│ ├── mod.rs # Mnemonic coordination
│ ├── wordlist.rs # BIP-39 word list (English)
│ ├── entropy.rs # Entropy generation and checksum
│ └── encoding.rs # Words <-> bits conversion
├── seed/
│ ├── mod.rs # Seed derivation
│ └── pbkdf2.rs # PBKDF2-HMAC-SHA512 implementation
├── bip32/
│ ├── mod.rs # BIP-32 coordination
│ ├── master.rs # Master key generation
│ ├── derivation.rs # Child key derivation (CKD)
│ ├── extended_key.rs # Extended key structure
│ └── serialization.rs # Base58Check encoding
├── bip44/
│ ├── mod.rs # BIP-44 path parsing
│ └── path.rs # Path structure and validation
├── address/
│ ├── mod.rs # Address generation
│ ├── bitcoin.rs # Bitcoin address formats
│ └── ethereum.rs # Ethereum address format
├── crypto/
│ ├── mod.rs # Cryptographic primitives
│ ├── hmac.rs # HMAC-SHA512
│ ├── sha256.rs # SHA-256 (from P01)
│ ├── sha512.rs # SHA-512
│ └── secp256k1.rs # Elliptic curve operations (from P02)
└── tests/
├── bip39_vectors.rs # Official BIP-39 test vectors
├── bip32_vectors.rs # Official BIP-32 test vectors
└── bip44_tests.rs # Path derivation tests
Core Data Structures
/// A BIP-39 mnemonic phrase
pub struct Mnemonic {
words: Vec<String>,
language: Language,
}
/// The derived seed from a mnemonic
pub struct Seed {
bytes: [u8; 64], // 512 bits
}
/// An extended key (private or public)
pub struct ExtendedKey {
/// Network version (mainnet/testnet, private/public)
version: [u8; 4],
/// How many derivations from master (0 for master)
depth: u8,
/// First 4 bytes of parent's key identifier
parent_fingerprint: [u8; 4],
/// Which child this is (0 for master)
child_index: u32,
/// Chain code for child derivation
chain_code: [u8; 32],
/// The key data (33 bytes: 0x00 + privkey OR compressed pubkey)
key_data: [u8; 33],
}
/// A derivation path component
pub enum PathComponent {
/// Normal derivation (0 to 2^31 - 1)
Normal(u32),
/// Hardened derivation (shown as i' or iH)
Hardened(u32),
}
/// A complete derivation path
pub struct DerivationPath {
/// Path components after 'm'
components: Vec<PathComponent>,
}
/// Network configuration
pub enum Network {
Bitcoin(BitcoinNetwork),
Ethereum,
Litecoin(LitecoinNetwork),
}
pub enum BitcoinNetwork {
Mainnet,
Testnet,
}
Key Algorithms
Mnemonic to Seed (PBKDF2)
function mnemonic_to_seed(mnemonic: string, passphrase: string) -> [u8; 64]:
password = normalize_nfkd(mnemonic) // Unicode normalization
salt = "mnemonic" + normalize_nfkd(passphrase)
// PBKDF2 with HMAC-SHA512
derived_key = empty
block_count = 1 // For 64 bytes output with SHA512
for block in 1..block_count+1:
u = hmac_sha512(password, salt || big_endian_u32(block))
result = u
for iteration in 2..2048+1:
u = hmac_sha512(password, u)
result = xor(result, u)
derived_key.append(result)
return derived_key[0..64]
Master Key Derivation
function master_key_from_seed(seed: [u8; 64]) -> ExtendedKey:
// Use "Bitcoin seed" regardless of target cryptocurrency
I = hmac_sha512(key="Bitcoin seed", data=seed)
IL = I[0..32] // Master secret key
IR = I[32..64] // Master chain code
// Validate: key must be valid secp256k1 scalar
if IL >= SECP256K1_ORDER or IL == 0:
raise "Invalid master key"
return ExtendedKey {
version: MAINNET_PRIVATE, // 0x0488ADE4
depth: 0,
parent_fingerprint: [0; 4],
child_index: 0,
chain_code: IR,
key_data: [0x00] + IL, // 0x00 prefix for private keys
}
Child Key Derivation (Private)
function derive_child_private(
parent: ExtendedKey,
index: u32,
) -> ExtendedKey:
assert(parent.is_private())
if index >= HARDENED_OFFSET: // 0x80000000
// Hardened: use private key in HMAC input
data = [0x00] + parent.private_key() + big_endian_u32(index)
else:
// Normal: use public key in HMAC input
data = parent.public_key() + big_endian_u32(index)
I = hmac_sha512(key=parent.chain_code, data=data)
IL = I[0..32]
IR = I[32..64]
// Child key = (IL + parent_key) mod n
child_key = (parse_256(IL) + parent.private_key_scalar()) mod SECP256K1_ORDER
if IL >= SECP256K1_ORDER or child_key == 0:
// Extremely rare: try next index
return derive_child_private(parent, index + 1)
return ExtendedKey {
version: parent.version,
depth: parent.depth + 1,
parent_fingerprint: parent.fingerprint(),
child_index: index,
chain_code: IR,
key_data: [0x00] + child_key.to_bytes(),
}
Child Key Derivation (Public)
function derive_child_public(
parent: ExtendedKey,
index: u32,
) -> ExtendedKey:
assert(parent.is_public())
assert(index < HARDENED_OFFSET) // Cannot derive hardened from public
data = parent.public_key() + big_endian_u32(index)
I = hmac_sha512(key=parent.chain_code, data=data)
IL = I[0..32]
IR = I[32..64]
// Child pubkey = point(IL) + parent_pubkey
point_IL = scalar_mult(IL, G) // IL * generator
child_pubkey = point_add(point_IL, parent.public_key_point())
if IL >= SECP256K1_ORDER or child_pubkey.is_infinity():
return derive_child_public(parent, index + 1)
return ExtendedKey {
version: parent.version,
depth: parent.depth + 1,
parent_fingerprint: parent.fingerprint(),
child_index: index,
chain_code: IR,
key_data: child_pubkey.compress(), // 33 bytes
}
Phased Implementation Guide
Phase 1: BIP-39 Word List and Entropy
Goal: Generate valid mnemonic phrases.
Tasks:
- Include the BIP-39 English word list (2048 words)
- Generate cryptographically random entropy (16-32 bytes)
- Calculate SHA-256 checksum
- Append checksum bits to entropy
- Split into 11-bit groups and map to words
Validation:
// Known test vector from BIP-39
let entropy = hex!("00000000000000000000000000000000");
let mnemonic = generate_mnemonic(&entropy);
assert_eq!(
mnemonic.words.join(" "),
"abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"
);
Hints if stuck:
- Checksum is the first entropy_bits / 32 bits of SHA256(entropy)
- Use bit manipulation to extract 11-bit groups
- Word index = 11-bit value as unsigned integer
Phase 2: PBKDF2-HMAC-SHA512
Goal: Convert mnemonic to binary seed.
Tasks:
- Implement HMAC-SHA512 (if not using library)
- Implement PBKDF2 with 2048 iterations
- Apply Unicode NFKD normalization to inputs
- Generate 64-byte seed from mnemonic + passphrase
Validation:
let mnemonic = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";
let passphrase = "";
let seed = mnemonic_to_seed(mnemonic, passphrase);
assert_eq!(
hex::encode(&seed),
"5eb00bbddcf069084889a8ab9155568165f5c453ccb85e70811aaed6f6da5fc19a5ac40b389cd370d086206dec8aa6c43daea6690f20ad3d8d48b2d2ce9e38e4"
);
Hints if stuck:
- PBKDF2: F(Password, Salt, c, i) = U_1 XOR U_2 XOR ... XOR U_c
- Where U_1 = PRF(Password, Salt || INT(i)) and U_j = PRF(Password, U_{j-1})
- Salt is literally "mnemonic" + passphrase string
- Consider using the unicode-normalization crate for NFKD
Phase 3: Master Key Derivation
Goal: Generate master extended key from seed.
Tasks:
- Apply HMAC-SHA512 with key “Bitcoin seed”
- Split result into private key (IL) and chain code (IR)
- Validate private key is valid secp256k1 scalar
- Create ExtendedKey structure
Validation:
// BIP-32 Test Vector 1 seed (16 bytes; BIP-32 accepts seeds of 16 to 64 bytes,
// so master_key_from_seed should take a byte slice rather than a fixed [u8; 64])
let seed = hex!("000102030405060708090a0b0c0d0e0f");
let master = master_key_from_seed(&seed);
assert_eq!(
hex::encode(master.private_key()),
"e8f32e723decf4051aefac8e2c93c9c5b214313817cdb01a1494b917c8436b35"
);
assert_eq!(
hex::encode(&master.chain_code),
"873dff81c02f525623fd1fe5167eac3a55a049de3d314bb42ee227ffed37d508"
);
Hints if stuck:
- “Bitcoin seed” is a UTF-8 string used as HMAC key
- Private key must be < secp256k1 order (n)
- If invalid, specification says “master key is invalid” (regenerate entropy)
Phase 4: Child Key Derivation (Normal)
Goal: Derive child keys using normal derivation.
Tasks:
- Implement CKDpriv for private parent to private child
- Serialize parent public key (compressed, 33 bytes)
- Construct HMAC input: pubkey || index (37 bytes)
- Calculate child key as (IL + parent) mod n
- Handle edge cases (IL >= n, result = 0)
Validation:
let parent = master_key;
let child_0 = derive_child(&parent, 0);
// Test against BIP-32 test vectors
Hints if stuck:
- Index is 4-byte big-endian
- Use secp256k1 point multiplication from Project 2
- Child private key = (IL + parent private key) mod curve order
Phase 5: Child Key Derivation (Hardened)
Goal: Implement hardened derivation.
Tasks:
- Detect hardened index (>= 0x80000000)
- For hardened: HMAC input is 0x00 || private_key || index (37 bytes)
- Same arithmetic as normal derivation
- Update fingerprint calculation
Validation:
// Derive m/0' (first hardened child)
let child_0h = derive_child(&master, 0x80000000);
// Compare against test vectors
Hints if stuck:
- Hardened derivation REQUIRES parent private key
- The 0x00 byte prefix distinguishes from public key (which starts 0x02 or 0x03)
- Cannot derive hardened children from extended public key
Phase 6: Path Parsing and Multi-Level Derivation
Goal: Parse and apply derivation paths.
Tasks:
- Parse path strings like "m/44'/60'/0'/0/0"
- Handle both ' and H notation for hardened
- Apply sequential derivations
- Validate path format and indices
Validation:
let path = DerivationPath::parse("m/44'/60'/0'/0/0")?;
let eth_address_0 = derive_path(&master, &path);
// Should match known Ethereum address for test mnemonic
Hints if stuck:
- Split on /, skip first element if it’s “m”
- Strip trailing ' or H to detect hardened
- Index for hardened = parsed_number + 0x80000000
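Putting these hints together, a path parser might look like the sketch below. It is one possible shape, not the reference design: the name `parse_path`, the `Vec<u32>` of raw indices, and the `String` error type are assumptions for this example, and accepting lowercase `h` as a hardened marker is an extra convenience.

```rust
pub const HARDENED_OFFSET: u32 = 0x8000_0000; // 2^31

/// Parses "m/44'/60'/0'/0/0" into raw child indices (hardened ones have
/// 2^31 added). Hypothetical helper for illustration.
pub fn parse_path(path: &str) -> Result<Vec<u32>, String> {
    let mut parts = path.split('/');
    if parts.next() != Some("m") {
        return Err(format!("path must start with m: {:?}", path));
    }
    parts
        .map(|part| {
            // Strip exactly one trailing hardened marker, if present.
            let is_marker = |c: char| c == '\'' || c == 'H' || c == 'h';
            let (digits, hardened) = match part.strip_suffix(is_marker) {
                Some(rest) => (rest, true),
                None => (part, false),
            };
            // Only plain decimal digits are allowed (no sign, no spaces).
            if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
                return Err(format!("invalid path component: {:?}", part));
            }
            let index: u32 = digits
                .parse()
                .map_err(|_| format!("index too large: {:?}", part))?;
            // The written number must stay below 2^31, hardened or not.
            if index >= HARDENED_OFFSET {
                return Err(format!("index must be below 2^31: {:?}", part));
            }
            Ok(if hardened { index + HARDENED_OFFSET } else { index })
        })
        .collect()
}
```

Checking the range before adding the offset matters: without it, an input such as `m/2147483648'` would overflow the u32 addition.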
Phase 7: Extended Key Serialization
Goal: Encode/decode xprv and xpub strings.
Tasks:
- Construct 78-byte payload (version, depth, fingerprint, index, chain code, key)
- Apply Base58Check encoding (double SHA256 checksum)
- Implement decoding with checksum validation
- Support different version bytes (mainnet/testnet, private/public)
Validation:
let xprv_string = master.to_string();
assert!(xprv_string.starts_with("xprv"));
let decoded = ExtendedKey::from_string(&xprv_string)?;
assert_eq!(decoded, master);
Hints if stuck:
- Base58 alphabet: “123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz”
- Checksum: first 4 bytes of SHA256(SHA256(payload))
- Version bytes for mainnet private: 0x0488ADE4
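The Base58 step itself needs nothing beyond the standard library. The sketch below covers only the raw Base58 encoding; appending the 4-byte double-SHA256 checksum before encoding is assumed to happen elsewhere, and the function name `base58_encode` is an assumption for this example.

```rust
const ALPHABET: &[u8; 58] = b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/// Plain Base58 encoding (no checksum). Hypothetical helper for illustration.
pub fn base58_encode(input: &[u8]) -> String {
    // Each leading zero byte is encoded as a literal '1'.
    let zeros = input.iter().take_while(|&&b| b == 0).count();

    // Base-58 digits of the big-endian number, least significant first.
    let mut digits: Vec<u8> = Vec::new();
    for &byte in &input[zeros..] {
        // Multiply the number so far by 256 and add the new byte.
        let mut carry = byte as u32;
        for d in digits.iter_mut() {
            carry += (*d as u32) << 8;
            *d = (carry % 58) as u8;
            carry /= 58;
        }
        while carry > 0 {
            digits.push((carry % 58) as u8);
            carry /= 58;
        }
    }

    let mut out = String::with_capacity(zeros + digits.len());
    out.extend(std::iter::repeat('1').take(zeros));
    out.extend(digits.iter().rev().map(|&d| ALPHABET[d as usize] as char));
    out
}
```

For instance, the bytes `00 00 28 7f b4 cd` encode to `11233QC4`: two `1` characters for the two leading zero bytes, then the base-58 digits of 0x287fb4cd.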
Phase 8: Address Generation
Goal: Generate cryptocurrency addresses from keys.
Tasks:
- Derive public key from private key (secp256k1)
- Bitcoin P2PKH: Base58Check(version byte || RIPEMD160(SHA256(pubkey))), where Base58Check appends the 4-byte checksum
- Bitcoin P2WPKH: Bech32 encoding of witness program
- Ethereum: Last 20 bytes of Keccak256(uncompressed pubkey without 0x04 prefix)
Validation:
// Using test mnemonic "abandon abandon ... about"
let btc_path = DerivationPath::parse("m/44'/0'/0'/0/0")?;
let btc_key = derive_path(&master, &btc_path);
let btc_address = bitcoin_address(&btc_key.public_key());
// Check against known address
let eth_path = DerivationPath::parse("m/44'/60'/0'/0/0")?;
let eth_key = derive_path(&master, &eth_path);
let eth_address = ethereum_address(&eth_key.public_key());
assert_eq!(eth_address, "0x9858EfFD232B4033E47d90003D41EC34EcaEda94");
Hints if stuck:
- Bitcoin uses compressed public keys (33 bytes)
- Ethereum uses uncompressed without the 0x04 prefix (64 bytes)
- Keccak256 is NOT SHA3-256 (different padding)
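The byte handling around the Ethereum hash is easy to get wrong by one byte, so here is a sketch of just those two steps. Both function names are assumptions for this example, and the Keccak-256 computation itself is assumed to come from a separate implementation or crate.

```rust
/// Returns the 64 bytes (X || Y) that are fed to Keccak-256, or None if the
/// key is not in uncompressed form (0x04 prefix). Hypothetical helper.
pub fn keccak_input(uncompressed_pubkey: &[u8; 65]) -> Option<&[u8]> {
    if uncompressed_pubkey[0] != 0x04 {
        return None;
    }
    Some(&uncompressed_pubkey[1..])
}

/// Formats the last 20 bytes of a Keccak-256 digest as a lowercase
/// 0x-prefixed address (no EIP-55 mixed-case checksum applied).
pub fn eth_address_from_hash(keccak_hash: &[u8; 32]) -> String {
    let mut address = String::from("0x");
    for byte in &keccak_hash[12..] {
        address.push_str(&format!("{:02x}", byte));
    }
    address
}
```

The result is all lowercase; producing the mixed-case form seen in the validation example above requires a further Keccak-256 pass over the hex string (EIP-55), which this sketch does not attempt.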
Testing Strategy
Unit Tests
#[test]
fn test_checksum_calculation() {
let entropy = [0u8; 16]; // All zeros
let checksum = calculate_checksum(&entropy);
// SHA-256 of 16 zero bytes starts with 0x37, so the 4 checksum bits are 0b0011
// (index 3 in the last word, which is why the mnemonic ends in "about")
assert_eq!(checksum, 0x3);
}
#[test]
fn test_11bit_extraction() {
let bits = [0b11111111, 0b11111111];
let word_index = extract_11bits(&bits, 0);
assert_eq!(word_index, 0b11111111111); // 2047
}
#[test]
fn test_hardened_index() {
assert!(!is_hardened(0));
assert!(!is_hardened(0x7FFFFFFF));
assert!(is_hardened(0x80000000));
assert!(is_hardened(0xFFFFFFFF));
}
BIP-39 Official Test Vectors
#[test]
fn test_bip39_vector_1() {
// Test vector from BIP-39 specification
let entropy = hex!("00000000000000000000000000000000");
let mnemonic = generate_mnemonic(&entropy);
let expected = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";
assert_eq!(mnemonic.to_string(), expected);
let seed = mnemonic_to_seed(&mnemonic, "TREZOR");
let expected_seed = hex!("c55257c360c07c72029aebc1b53c05ed0362ada38ead3e3e9efa3708e53495531f09a6987599d18264c1e1c92f2cf141630c7a3c4ab7c81b2f001698e7463b04");
assert_eq!(seed, expected_seed);
}
#[test]
fn test_bip39_vector_japanese() {
// Japanese word list test
let entropy = hex!("00000000000000000000000000000000");
let mnemonic = generate_mnemonic_language(&entropy, Language::Japanese);
// ... validate Japanese words
}
BIP-32 Official Test Vectors
#[test]
fn test_bip32_vector_1() {
// Test Vector 1 from BIP-32
let seed = hex!("000102030405060708090a0b0c0d0e0f");
// Chain m
let m = master_key_from_seed(&seed);
assert_eq!(m.to_xpub_string(), "xpub661MyMwAqRbcFtXgS5sYJABqqG9YLmC4Q1Rdap9gSE8NqtwybGhePY2gZ29ESFjqJoCu1Rupje8YtGqsefD265TMg7usUDFdp6W1EGMcet8");
assert_eq!(m.to_xprv_string(), "xprv9s21ZrQH143K3QTDL4LXw2F7HEK3wJUD2nW2nRk4stbPy6cq3jPPqjiChkVvvNKmPGJxWUtg6LnF5kejMRNNU3TGtRBeJgk33yuGBxrMPHi");
// Chain m/0'
let m_0h = derive_path(&m, &parse_path("m/0'").unwrap());
assert_eq!(m_0h.to_xpub_string(), "xpub68Gmy5EdvgibQVfPdqkBBCHxA5htiqg55crXYuXoQRKfDBFA1WEjWgP6LHhwBZeNK1VTsfTFUHCdrfp1bgwQ9xv5ski8PX9rL2dZXvgGDnw");
// Chain m/0'/1
let m_0h_1 = derive_path(&m, &parse_path("m/0'/1").unwrap());
// ... continue for full test vector
}
Address Generation Tests
#[test]
fn test_bitcoin_address_generation() {
let mnemonic = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";
// unwrap() instead of `?`: this test function does not return a Result
let master = master_key_from_mnemonic(mnemonic, "").unwrap();
// First Bitcoin address
let key = derive_path(&master, &parse_path("m/44'/0'/0'/0/0").unwrap());
let address = bitcoin_p2pkh_address(&key.public_key(), BitcoinNetwork::Mainnet);
// Known address for this test vector
assert_eq!(address, "1LqBGSKuX5yYUonjxT5qGfpUsXKYYWeabA");
}
#[test]
fn test_ethereum_address_generation() {
let mnemonic = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";
// unwrap() instead of `?`: this test function does not return a Result
let master = master_key_from_mnemonic(mnemonic, "").unwrap();
let key = derive_path(&master, &parse_path("m/44'/60'/0'/0/0").unwrap());
let address = ethereum_address(&key.public_key());
assert_eq!(address.to_lowercase(), "0x9858effd232b4033e47d90003d41ec34ecaeda94");
}
Cross-Implementation Testing
#[test]
fn test_compatibility_with_trezor() {
// Generate addresses and compare with Trezor's expected outputs
// These are published test vectors
}
#[test]
fn test_compatibility_with_ledger() {
// Same addresses should be generated as Ledger hardware wallets
}
#[test]
fn test_metamask_compatibility() {
// MetaMask uses specific derivation paths for Ethereum
// Verify our addresses match
}
Common Pitfalls & Debugging
Pitfall 1: Bit Manipulation Errors in Mnemonic Generation
Problem: Incorrectly extracting 11-bit groups from the entropy+checksum.
Symptom: Wrong words generated, checksum validation fails.
Solution:
fn extract_11bits(bytes: &[u8], bit_offset: usize) -> u16 {
// Handle bits spanning multiple bytes
let byte_offset = bit_offset / 8;
let bit_in_byte = bit_offset % 8;
// Read up to 24 bits (3 bytes) and extract the 11 we need
let mut value: u32 = 0;
for i in 0..3 {
if byte_offset + i < bytes.len() {
value |= (bytes[byte_offset + i] as u32) << (16 - 8 * i);
}
}
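// The group occupies bits (23 - bit_in_byte) down to (13 - bit_in_byte)
// of the 24-bit window, so shift right by 13 - bit_in_byte and mask
((value >> (13 - bit_in_byte)) & 0x7FF) as u16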
}
Pitfall 2: Unicode Normalization
Problem: Mnemonic words must be NFKD normalized before PBKDF2.
Symptom: Seed differs from test vectors, especially with non-ASCII passphrases.
Solution:
use unicode_normalization::UnicodeNormalization;
fn mnemonic_to_seed(mnemonic: &str, passphrase: &str) -> [u8; 64] {
let normalized_mnemonic = mnemonic.nfkd().collect::<String>();
let normalized_passphrase = passphrase.nfkd().collect::<String>();
let salt = format!("mnemonic{}", normalized_passphrase);
pbkdf2_hmac_sha512(&normalized_mnemonic, &salt, 2048)
}
Pitfall 3: Big-Endian vs Little-Endian
Problem: Indices and version bytes must be big-endian.
Symptom: Wrong child keys, invalid serialization.
Solution:
// Child index to bytes
fn index_to_bytes(index: u32) -> [u8; 4] {
index.to_be_bytes() // Big-endian!
}
// Reading version from serialized xpub/xprv
fn read_version(bytes: &[u8]) -> u32 {
u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}
Pitfall 4: Hardened Index Representation
Problem: Confusing display notation (44’) with actual index value.
Symptom: Wrong derivation paths, incompatible with other wallets.
Solution:
const HARDENED_OFFSET: u32 = 0x80000000;
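fn parse_path_component(s: &str) -> Result<u32, String> {
    // Strip exactly one hardened marker (', H, or h) if present
    let is_marker = |c: char| c == '\'' || c == 'H' || c == 'h';
    let (num_str, is_hardened) = match s.strip_suffix(is_marker) {
        Some(rest) => (rest, true),
        None => (s, false),
    };
    let index: u32 = num_str
        .parse()
        .map_err(|_| format!("invalid index: {:?}", s))?;
    // The written number must be below 2^31, otherwise adding the offset overflows
    if index >= HARDENED_OFFSET {
        return Err(format!("index out of range: {:?}", s));
    }
    Ok(if is_hardened { index + HARDENED_OFFSET } else { index })
}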
Pitfall 5: Key Validity Edge Cases
Problem: Derived IL might be >= curve order or result in zero key.
Symptom: Invalid keys, crashes, or security vulnerabilities.
Solution:
fn derive_child(parent: &ExtendedKey, index: u32) -> Result<ExtendedKey> {
let i = hmac_sha512(/* ... */);
let il = &i[..32];
let ir = &i[32..];
let il_scalar = Scalar::from_bytes(il)?;
// Check: IL must be < n
if il_scalar >= SECP256K1_ORDER {
// Specification says to proceed with next index
return derive_child(parent, index + 1);
}
let child_scalar = (il_scalar + parent.key_scalar()) % SECP256K1_ORDER;
// Check: result must not be zero
if child_scalar.is_zero() {
return derive_child(parent, index + 1);
}
Ok(/* construct child */)
}
Pitfall 6: Fingerprint Calculation
Problem: Parent fingerprint is first 4 bytes of HASH160(public key).
Symptom: Extended key strings don’t match test vectors.
Solution:
fn fingerprint(public_key: &PublicKey) -> [u8; 4] {
let compressed = public_key.to_compressed_bytes(); // 33 bytes
let hash160 = ripemd160(&sha256(&compressed));
[hash160[0], hash160[1], hash160[2], hash160[3]]
}
Extensions and Challenges
Challenge 1: Implement BIP-85
BIP-85 defines “Deterministic Entropy From BIP32 Keychains” - deriving child entropies that can generate entirely new mnemonics.
Application: Create child wallets for different purposes, each with its own
12/24 word backup, all derivable from a single master mnemonic.
Challenge 2: Multi-Signature Path Support (BIP-48)
Implement BIP-48 derivation paths for multi-signature wallets:
m/48'/coin_type'/account'/script_type'/change/address_index
script_type: 1' = P2SH-P2WSH, 2' = P2WSH
Challenge 3: Implement BIP-84 (Native SegWit)
Add support for native SegWit addresses (bc1…) using:
Path: m/84'/coin_type'/account'/change/address_index
Address: Bech32 encoding of witness program
Challenge 4: Shamir’s Secret Sharing (SLIP-39)
Implement SLIP-39, which splits a seed into multiple shares where k-of-n shares are required to recover:
5 shares, any 3 required = "Shamir Backup" (SLIP-39 allows at most 16 shares per group)
Protects against both loss AND theft of individual shares
Challenge 5: Air-Gapped Signing Workflow
Build a complete offline signing workflow:
- Watch-only wallet (xpub) generates unsigned transactions
- QR code transfer to air-gapped device
- Offline device signs with private keys
- QR code transfer of signed transaction back
- Online device broadcasts
Challenge 6: Hardware Wallet Simulation
Implement the core of a hardware wallet:
- Secure key storage (simulated secure element)
- Transaction parsing and display
- User confirmation before signing
- PIN protection
- Plausible deniability passphrases
Real-World Connections
Hardware Wallets (Ledger, Trezor)
Most hardware wallets implement BIP-32/39/44:
- The device generates a 12- to 24-word mnemonic during setup (or you enter one when restoring)
- Device derives master seed and stores it in secure element
- Each cryptocurrency uses its BIP-44 coin type
- Device shows addresses and signs transactions
- Your recovery words can restore the same keys in any compatible wallet
MetaMask and Browser Wallets
MetaMask uses:
- BIP-39 for the 12-word seed phrase
- BIP-44 path m/44'/60'/0'/0/0 for the first Ethereum account
- Additional accounts are derived by incrementing the final index (m/44'/60'/0'/0/1, m/44'/60'/0'/0/2, and so on)
Exchange Hot/Cold Wallet Architecture
Exchanges use HD wallets for security:
- Hot wallet: xpub only, generates deposit addresses
- Cold storage: xprv stored offline in HSM or multi-sig
- Customer deposits go to unique addresses (derived from xpub)
- Withdrawals require cold storage approval
Recovery Services
When you “restore” a wallet:
- Words are validated against the word list
- Checksum is verified
- PBKDF2 generates the seed
- Standard paths are scanned for balances
- All your addresses are recovered
This is why:
- Word order matters
- Wrong words fail checksum
- Same words always recover same addresses
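The checksum behaviour follows from BIP-39's bit layout: the checksum is the first ENT/32 bits of SHA-256(entropy), appended to the entropy, and the combined bits are cut into 11-bit word indices. The arithmetic can be sketched as below; the function name `mnemonic_layout` is an illustrative assumption, and the SHA-256 step itself is left out to keep the example std-only.

```rust
/// For a given entropy size in bits, return (checksum_bits, word_count)
/// according to BIP-39, or None if the size is not allowed.
/// Hypothetical helper used only to illustrate the arithmetic.
fn mnemonic_layout(entropy_bits: u32) -> Option<(u32, u32)> {
    // BIP-39 allows 128 to 256 bits of entropy in multiples of 32.
    if !(128..=256).contains(&entropy_bits) || entropy_bits % 32 != 0 {
        return None;
    }
    let checksum_bits = entropy_bits / 32;
    // Each word encodes 11 bits (2048-word list).
    let words = (entropy_bits + checksum_bits) / 11;
    Some((checksum_bits, words))
}

/// Number of entropy bits carried by the final word; the rest is checksum.
fn last_word_entropy_bits(entropy_bits: u32) -> Option<u32> {
    mnemonic_layout(entropy_bits).map(|(checksum_bits, _)| 11 - checksum_bits)
}
```

For 128 bits of entropy this gives a 4-bit checksum and 12 words; for 256 bits it gives an 8-bit checksum and 24 words, so the last word of a 24-word phrase carries 3 bits of entropy and 8 bits of checksum. A randomly substituted final word therefore passes the checksum only about 1 time in 256.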
The 2014 Bitstamp Hack
In late 2014, attackers used a phishing campaign to compromise Bitstamp’s hot wallet keys, and in January 2015 they drained just under 19,000 BTC. Whoever held those keys could spend the hot wallet’s funds immediately.
With an HD wallet architecture:
- Attackers would only get one private key
- Other addresses would remain secure (with hardened derivation)
- Limited blast radius of any single key compromise
Resources
Primary References
- BIP-32: Hierarchical Deterministic Wallets - Official specification
- BIP-39: Mnemonic code for generating deterministic keys - Mnemonic standard
- BIP-44: Multi-Account Hierarchy - Path conventions
- “Mastering Bitcoin” Chapter 5 - HD Wallets explained accessibly
Code References
- bitcoinjs-lib: GitHub - JavaScript reference
- python-mnemonic: GitHub - Trezor’s Python implementation
- rust-bip39: GitHub - Rust implementation
Test Vectors
- BIP-39 Test Vectors: Official
- BIP-32 Test Vectors: In BIP-32
- Ian Coleman’s BIP39 Tool: bip39 Tool - Interactive testing
Supplementary Reading
- SLIP-0010: Generalization of BIP-32 to other curves such as Ed25519 and NIST P-256
- BIP-84: Native SegWit derivation paths
- SLIP-39: Shamir’s Secret Sharing for recovery
Self-Assessment Checklist
Before moving to the next project, verify:
- I can explain why HD wallets only need one backup for infinite addresses
- I understand the entropy -> checksum -> words conversion
- I can implement PBKDF2 and explain why BIP-39 specifies 2048 iterations
- I understand the security difference between hardened and normal derivation
- I can parse and apply a BIP-44 derivation path
- My implementation matches official test vectors for BIP-32 and BIP-39
- I can explain why leaking one child private key can compromise siblings (normal derivation)
- I understand why “Bitcoin seed” is used even for Ethereum keys
- I can generate valid Bitcoin and Ethereum addresses from a mnemonic
Conceptual Questions
- Why can’t you derive hardened children from an extended public key?
- What happens if you use the wrong passphrase with the correct mnemonic?
- Why is the chain code necessary? What would happen without it?
- How does the 24th word encode the checksum?
- Why do exchanges share xpubs with their hot wallet servers?
What’s Next?
With the HD wallet implementation complete, you now understand the key management system used by most modern cryptocurrency wallets. You can derive a practically unlimited number of addresses from a single backup, reason about the security implications of different derivation paths, and appreciate why hardware wallets are designed the way they are.
In Project 7: Bitcoin Transaction Parser, you’ll learn how these keys are actually used - decoding raw Bitcoin transactions to understand inputs, outputs, scripts, and the SegWit data structure. You’ll see exactly what gets signed when you “send” Bitcoin.