Project 10: SQL Query Engine
A SQL query engine that parses SQL, builds query plans, optimizes them, and executes against in-memory tables—supporting SELECT, WHERE, JOIN, GROUP BY, and ORDER BY.
Quick Reference
| Attribute | Value |
|---|---|
| Primary Language | Go |
| Alternative Languages | Rust, C++, Java |
| Difficulty | Level 5: Master |
| Time Estimate | 2-3 months |
| Knowledge Area | Databases, Parsing, Query Optimization |
| Tooling | None (from scratch) |
| Prerequisites | All previous projects. Strong algorithms background. Understanding of relational algebra helpful. |
What You Will Build
A SQL query engine that parses SQL, builds query plans, optimizes them, and executes against in-memory tables—supporting SELECT, WHERE, JOIN, GROUP BY, and ORDER BY.
Why It Matters
This project builds core skills that appear repeatedly in real-world systems and tooling.
Core Challenges
- SQL parsing → maps to recursive descent parsing, AST building
- Query planning → maps to tree transformations, relational algebra
- Join algorithms → maps to hash join, nested loop join
- Optimization → maps to cost estimation, plan selection
Key Concepts
- SQL parsing: “Crafting Interpreters” by Bob Nystrom (parsing techniques)
- Query execution: “Database Internals” Ch. 7-9 - Alex Petrov
- Join algorithms: “Designing Data-Intensive Applications” Ch. 3 - Kleppmann
- B-trees: “Database Internals” Ch. 2-3 - Alex Petrov
Real-World Outcome
$ ./minisql
MiniSQL v1.0
Type "help" for commands, "exit" to quit.
> CREATE TABLE users (id INT, name TEXT, age INT);
Table 'users' created.
> INSERT INTO users VALUES (1, 'Alice', 30);
> INSERT INTO users VALUES (2, 'Bob', 25);
> INSERT INTO users VALUES (3, 'Charlie', 35);
Inserted 3 rows.
> SELECT name, age FROM users WHERE age > 27;
┌─────────┬─────┐
│ name │ age │
├─────────┼─────┤
│ Alice │ 30 │
│ Charlie │ 35 │
└─────────┴─────┘
2 rows returned
> CREATE TABLE orders (id INT, user_id INT, amount FLOAT);
> INSERT INTO orders VALUES (1, 1, 99.99);
> INSERT INTO orders VALUES (2, 1, 49.99);
> INSERT INTO orders VALUES (3, 2, 199.99);
> SELECT u.name, SUM(o.amount) as total
FROM users u
JOIN orders o ON u.id = o.user_id
GROUP BY u.name
ORDER BY total DESC;
┌───────┬────────┐
│ name │ total │
├───────┼────────┤
│ Bob │ 199.99 │
│ Alice │ 149.98 │
└───────┴────────┘
> EXPLAIN SELECT * FROM users WHERE age > 25;
Query Plan:
└─ Filter: age > 25
└─ TableScan: users
Estimated cost: 3 rows scanned
Implementation Guide
- Reproduce the simplest happy-path scenario.
- Build the smallest working version of the core feature.
- Add input validation and error handling.
- Add instrumentation/logging to confirm behavior.
- Refactor into clean modules with tests.
Milestones
- Milestone 1: Minimal working program that runs end-to-end.
- Milestone 2: Correct outputs for typical inputs.
- Milestone 3: Robust handling of edge cases.
- Milestone 4: Clean structure and documented usage.
Validation Checklist
- Output matches the real-world outcome example
- Handles invalid inputs safely
- Provides clear errors and exit codes
- Repeatable results across runs
References
- Main guide:
LEARN_GO_DEEP_DIVE.md - “Database Internals” by Alex Petrov