Project 7: Heap & Priority Queue
Build a binary heap and use it to implement a priority queue and top-k operations.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1 week |
| Language | C (Alternatives: Rust, C++) |
| Prerequisites | Projects 1-6 |
| Key Topics | Heap invariant, sift operations, priority scheduling |
1. Learning Objectives
By completing this project, you will:
- Implement a binary heap with array backing.
- Support insert, peek, and extract-min/max.
- Use a heap to compute top-k elements.
- Build a priority queue scheduler.
2. Theoretical Foundation
2.1 Core Concepts
- Heap invariant: Parent is <= children (min-heap) or >= children (max-heap).
- Sift up/down: Restores heap property after insert/remove.
- Partial ordering: Only root is guaranteed to be min/max.
2.2 Why This Matters
Heaps power priority queues, Dijkstra’s algorithm, schedulers, and streaming top-k queries. They provide efficient access to extremes without full sorting.
2.3 Historical Context / Background
Heaps were popularized by Heapsort and are a foundational structure in algorithm design.
2.4 Common Misconceptions
- “Heap = sorted”: Only the root is guaranteed min/max.
- “Heap is a tree in memory”: It is typically stored in an array.
3. Project Specification
3.1 What You Will Build
- A binary heap implementation.
- A priority queue wrapper with task scheduling.
- A top-k demo using streaming input.
3.2 Functional Requirements
- Implement
insert,peek,extract,heapify. - Support both min-heap and max-heap modes.
- Provide a top-k function that returns highest k values.
- Build a scheduler that picks next task by priority.
3.3 Non-Functional Requirements
- Performance: O(log n) insert/extract.
- Reliability: Handle empty heap gracefully.
- Usability: Print heap levels for debugging.
3.4 Example Usage / Output
$ ./heap_demo
Insert: 5, 3, 9, 1
Min: 1
Extract: 1
New Min: 3
Top-3: [9, 7, 5]
3.5 Real World Outcome
The demo prints the heap array after each insert and shows how sifting restores order. The scheduler example shows tasks with priorities and prints the dispatch order by priority.
4. Solution Architecture
4.1 High-Level Design
┌──────────────┐ ┌──────────────────┐ ┌────────────────┐
│ Heap Core │────▶│ Priority Queue │────▶│ Top-K Utility │
└──────────────┘ └──────────────────┘ └────────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Heap array | Store values | 0-based indexing |
| Sift ops | Restore invariant | Compare parent/child |
| PQ wrapper | Task ordering | Min vs max mode |
4.3 Data Structures
typedef struct {
int* data;
size_t size;
size_t capacity;
int is_min;
} Heap;
4.4 Algorithm Overview
Key Algorithm: Insert
- Append value to end.
- Sift up until heap property holds.
Complexity Analysis:
- Time: O(log n) insert/extract
- Space: O(n)
5. Implementation Guide
5.1 Development Environment Setup
clang -Wall -Wextra -g -o heap_demo src/*.c
5.2 Project Structure
project-root/
├── src/
│ ├── heap.c
│ ├── pq.c
│ └── main.c
├── tests/
│ └── test_heap.c
└── Makefile
5.3 The Core Question You’re Answering
“How can I always find the minimum (or maximum) element without fully sorting?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Array index math for heaps
- parent = (i-1)/2
- left = 2i+1, right = 2i+2
- Heapify vs insert
5.5 Questions to Guide Your Design
- Will you support min and max heaps with a flag?
- How will you handle resizing?
- How will you print heap levels clearly?
5.6 Thinking Exercise
Take array [5, 3, 8, 2, 4] and build a min-heap by hand using insert. Show the array after each step.
5.7 The Interview Questions They’ll Ask
- “Why is heap insert O(log n)?”
- “How does heapify run in O(n)?”
- “When would you use a heap over sorting?”
5.8 Hints in Layers
Hint 1: Implement sift up/down first Everything else uses them.
Hint 2: Add a validator Check heap invariant for debugging.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Heaps | CLRS | Ch. 6 |
| Priority queues | Algorithms, 4th Edition | PQ chapter |
5.10 Implementation Phases
Phase 1: Foundation (2 days)
Goals:
- Heap insert/extract
Tasks:
- Implement array-based heap.
- Add insert and extract.
Checkpoint: Min-heap returns correct min after each insert.
Phase 2: Core Functionality (3 days)
Goals:
- Heapify and top-k
Tasks:
- Implement
heapifyfrom array. - Add top-k using min-heap of size k.
Checkpoint: Top-k output matches sorted list.
Phase 3: Polish & Edge Cases (2 days)
Goals:
- Priority queue demo
- Visualization
Tasks:
- Add task scheduling by priority.
- Print heap levels.
Checkpoint: Scheduler dispatches tasks by priority.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Indexing | 0-based, 1-based | 0-based | C array natural |
| Heapify | Bottom-up, repeated insert | Bottom-up | O(n) build |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Heap ops | insert/extract |
| Integration Tests | PQ demo | task ordering |
| Edge Case Tests | Empty heap | extract on empty |
6.2 Critical Test Cases
- Insert 1..10 and extract in order.
- Build heap from random array and validate invariant.
- Top-k returns correct list on duplicates.
6.3 Test Data
Values: 9, 1, 4, 7, 3, 8
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong parent/child math | Invalid heap | Verify index formulas |
| Not swapping properly | Heap invariant broken | Use helper swap |
| Forgetting to decrease size | Extract repeats | Update size before sift |
7.2 Debugging Strategies
- Print heap array after each operation.
- Add a validation function to check invariant.
7.3 Performance Traps
- Rebuilding heap from scratch on every insert is O(n log n).
8. Extensions & Challenges
8.1 Beginner Extensions
- Implement a max-heap.
- Add decrease-key operation.
8.2 Intermediate Extensions
- Implement a d-ary heap.
- Add heap sort.
8.3 Advanced Extensions
- Build a Fibonacci heap interface (theory-only).
- Use heap in Dijkstra’s algorithm.
9. Real-World Connections
9.1 Industry Applications
- Schedulers: pick highest priority tasks.
- Streaming analytics: top-k queries.
9.2 Related Open Source Projects
- Linux kernel: uses heaps in subsystems.
- Lucene: priority queues for search.
9.3 Interview Relevance
- Heap and priority queue implementations are common interview topics.
10. Resources
10.1 Essential Reading
- CLRS - Ch. 6
- Algorithms, 4th Edition - priority queues
10.2 Video Resources
- “Heaps and Priority Queues” - data structures lectures
10.3 Tools & Documentation
- gdb: https://www.gnu.org/software/gdb/
10.4 Related Projects in This Series
- Previous Project: Binary Search Tree & Balancing
- Next Project: Graph Algorithms Visualizer
11. Self-Assessment Checklist
11.1 Understanding
- I can explain heap property and sifting.
- I can derive parent/child indices.
- I can explain O(log n) behavior.
11.2 Implementation
- Heap operations pass tests.
- Top-k demo works correctly.
11.3 Growth
- I can extend to decrease-key.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Min-heap with insert/extract.
- Heap invariant validation.
Full Completion:
- Priority queue demo and top-k.
Excellence (Going Above & Beyond):
- D-ary heap or heap sort.
This guide was generated from DATA_STRUCTURES_FROM_FIRST_PRINCIPLES.md. For the complete learning path, see the parent directory README.