Project 3: Stack Machine
Build a stack-based calculator/VM that parses expressions and executes bytecode instructions.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1 week |
| Language | C (Alternatives: Rust, Python) |
| Prerequisites | Projects 1-2, basic parsing |
| Key Topics | LIFO semantics, expression evaluation, VM design |
1. Learning Objectives
By completing this project, you will:
- Implement a stack data structure with push/pop/peek.
- Parse infix expressions and convert to postfix (RPN).
- Build a simple bytecode VM that executes stack ops.
- Explain how stacks support function calls and recursion.
2. Theoretical Foundation
2.1 Core Concepts
- LIFO discipline: Last-in, first-out order models nested computation.
- RPN evaluation: Postfix notation eliminates precedence rules during execution.
- Call stacks: Function calls push frames; returns pop them.
- Virtual machines: Instruction loops interpret bytecode.
2.2 Why This Matters
Stacks are everywhere: compilers, interpreters, parsing, and backtracking. Understanding a stack machine explains how many languages actually execute code.
2.3 Historical Context / Background
Stack machines power early calculators (HP RPN) and virtual machines like the JVM and WebAssembly (stack-based execution models).
2.4 Common Misconceptions
- “Stack = call stack only”: Many systems use explicit stacks for algorithms.
- “RPN is obscure”: It is an execution-friendly form that simplifies evaluation.
3. Project Specification
3.1 What You Will Build
- A stack library (array-based or linked).
- A parser that converts infix to postfix.
- A bytecode VM that executes stack instructions.
3.2 Functional Requirements
- Parse arithmetic expressions with +, -, *, /, parentheses.
- Convert to postfix using the shunting-yard algorithm.
- Execute postfix or bytecode with a stack.
- Provide a REPL that accepts expressions.
3.3 Non-Functional Requirements
- Performance: Expressions of 1,000 tokens evaluate fast.
- Reliability: Detect division by zero and malformed input.
- Usability: Errors include token index and reason.
3.4 Example Usage / Output
$ ./stack_vm
> (2 + 3) * 4
Postfix: 2 3 + 4 *
Result: 20
3.5 Real World Outcome
The console shows your expression, prints its postfix form, then steps through stack operations and returns a final result. With a “trace” flag, it prints each instruction and the current stack snapshot.
4. Solution Architecture
4.1 High-Level Design
┌──────────────┐ ┌────────────────┐ ┌──────────────┐
│ Parser │────▶│ Postfix/Bytecode│────▶│ Stack VM │
└──────────────┘ └────────────────┘ └──────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Tokenizer | Break input into tokens | Handle numbers and operators |
| Parser | Shunting-yard to postfix | Operator precedence table |
| VM | Executes stack ops | Iterative loop vs recursion |
4.3 Data Structures
typedef struct {
int* data;
size_t size;
size_t capacity;
} Stack;
typedef enum {
OP_PUSH, OP_ADD, OP_SUB, OP_MUL, OP_DIV
} OpCode;
4.4 Algorithm Overview
Key Algorithm: Shunting-Yard
- Read tokens left to right.
- Push operators to a stack based on precedence.
- Emit operands immediately.
- Pop remaining operators at the end.
Complexity Analysis:
- Time: O(n) tokens
- Space: O(n) operators
5. Implementation Guide
5.1 Development Environment Setup
clang -Wall -Wextra -g -o stack_vm src/*.c
5.2 Project Structure
project-root/
├── src/
│ ├── stack.c
│ ├── parser.c
│ ├── vm.c
│ └── main.c
├── tests/
│ └── test_parser.c
└── Makefile
5.3 The Core Question You’re Answering
“How does a simple stack turn text expressions into actual computation?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Operator precedence and associativity
- Tokenization
- Stack API invariants
5.5 Questions to Guide Your Design
- How will you represent numbers in the bytecode?
- Will you support unary minus?
- How will you report parse errors with location?
5.6 Thinking Exercise
Manually convert 3 + 4 * 2 / (1 - 5) to postfix and evaluate it step by step.
5.7 The Interview Questions They’ll Ask
- “How does the shunting-yard algorithm work?”
- “Why is postfix easier to evaluate than infix?”
- “How does a call stack relate to a stack data structure?”
5.8 Hints in Layers
Hint 1: Start with postfix evaluation only Once postfix works, add infix parsing.
Hint 2: Add tracing Print the stack after each opcode for debugging.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Stacks | CLRS | Ch. 10 |
| Parsing | Crafting Interpreters | Ch. 5-7 |
5.10 Implementation Phases
Phase 1: Foundation (2 days)
Goals:
- Stack implementation
- Postfix evaluator
Tasks:
- Implement push/pop/peek.
- Evaluate postfix arrays.
Checkpoint: 2 3 + returns 5.
Phase 2: Core Functionality (3 days)
Goals:
- Infix parser
- Operator precedence
Tasks:
- Implement tokenizer.
- Implement shunting-yard.
Checkpoint: (2+3)*4 yields postfix 2 3 + 4 *.
Phase 3: Polish & Edge Cases (2 days)
Goals:
- Error handling and tracing
- REPL
Tasks:
- Add error messages with token index.
- Add
--traceoutput.
Checkpoint: Invalid expressions show clear errors.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Stack backing | Array, linked list | Array | Cache-friendly, simple |
| VM design | Direct eval, bytecode | Bytecode | Extensible for new ops |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Stack and tokenizer | push/pop, tokens |
| Integration Tests | Parser + VM | expression cases |
| Edge Case Tests | Malformed input | missing paren |
6.2 Critical Test Cases
- Operator precedence:
2 + 3 * 4-> 14. - Parentheses:
(2 + 3) * 4-> 20. - Divide by zero detection.
6.3 Test Data
Expressions: 1+2, 3*(4-5), 10/2+7
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Precedence errors | Wrong result | Check operator table |
| Stack underflow | Crash during eval | Validate before pop |
| Tokenization bugs | Missing numbers | Properly parse multi-digit |
7.2 Debugging Strategies
- Print token stream and postfix output.
- Run with
--traceto see stack changes.
7.3 Performance Traps
- Rebuilding tokens per character can be slow; use a single pass.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add exponentiation operator.
- Add variables with a simple map.
8.2 Intermediate Extensions
- Implement function calls like
sin(3). - Add bytecode disassembler.
8.3 Advanced Extensions
- Add control flow (if, while) to the VM.
- Implement a stack-frame model for functions.
9. Real-World Connections
9.1 Industry Applications
- Compilers: parse expressions into stacks.
- Scripting engines: bytecode VMs.
9.2 Related Open Source Projects
- Lua: stack-based VM execution.
- Wasm: stack-machine semantics.
9.3 Interview Relevance
- Expression evaluation problems are common in interviews.
10. Resources
10.1 Essential Reading
- Crafting Interpreters by Bob Nystrom - parsing basics
- CLRS - stacks and queues
10.2 Video Resources
- “Shunting-Yard Algorithm” - compilers lectures
10.3 Tools & Documentation
- UBSan: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
10.4 Related Projects in This Series
- Previous Project: Linked List Toolkit
- Next Project: Queue Systems
11. Self-Assessment Checklist
11.1 Understanding
- I can convert infix to postfix.
- I can explain how stack evaluation works.
- I can describe call stack similarities.
11.2 Implementation
- REPL works with clear errors.
- Tests pass for precedence and parentheses.
11.3 Growth
- I can add a new opcode without breaking the VM.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Stack implementation and postfix evaluator.
- Infix parser with precedence.
Full Completion:
- Bytecode VM and REPL.
Excellence (Going Above & Beyond):
- Variables, functions, and control flow.
This guide was generated from DATA_STRUCTURES_FROM_FIRST_PRINCIPLES.md. For the complete learning path, see the parent directory README.