Project 7: Heap & Priority Queue

Build a binary heap and use it to implement a priority queue and top-k operations.

Quick Reference

Attribute Value
Difficulty Intermediate
Time Estimate 1 week
Language C (Alternatives: Rust, C++)
Prerequisites Projects 1-6
Key Topics Heap invariant, sift operations, priority scheduling

1. Learning Objectives

By completing this project, you will:

  1. Implement a binary heap with array backing.
  2. Support insert, peek, and extract-min/max.
  3. Use a heap to compute top-k elements.
  4. Build a priority queue scheduler.

2. Theoretical Foundation

2.1 Core Concepts

  • Heap invariant: Parent is <= children (min-heap) or >= children (max-heap).
  • Sift up/down: Restores heap property after insert/remove.
  • Partial ordering: Only root is guaranteed to be min/max.

2.2 Why This Matters

Heaps power priority queues, Dijkstra’s algorithm, schedulers, and streaming top-k queries. They provide efficient access to extremes without full sorting.

2.3 Historical Context / Background

Heaps were popularized by Heapsort and are a foundational structure in algorithm design.

2.4 Common Misconceptions

  • “Heap = sorted”: Only the root is guaranteed min/max.
  • “Heap is a tree in memory”: It is typically stored in an array.

3. Project Specification

3.1 What You Will Build

  • A binary heap implementation.
  • A priority queue wrapper with task scheduling.
  • A top-k demo using streaming input.

3.2 Functional Requirements

  1. Implement insert, peek, extract, heapify.
  2. Support both min-heap and max-heap modes.
  3. Provide a top-k function that returns highest k values.
  4. Build a scheduler that picks next task by priority.

3.3 Non-Functional Requirements

  • Performance: O(log n) insert/extract.
  • Reliability: Handle empty heap gracefully.
  • Usability: Print heap levels for debugging.

3.4 Example Usage / Output

$ ./heap_demo
Insert: 5, 3, 9, 1
Min: 1
Extract: 1
New Min: 3
Top-3: [9, 7, 5]

3.5 Real World Outcome

The demo prints the heap array after each insert and shows how sifting restores order. The scheduler example shows tasks with priorities and prints the dispatch order by priority.


4. Solution Architecture

4.1 High-Level Design

┌──────────────┐     ┌──────────────────┐     ┌────────────────┐
│ Heap Core    │────▶│ Priority Queue   │────▶│ Top-K Utility  │
└──────────────┘     └──────────────────┘     └────────────────┘

4.2 Key Components

Component Responsibility Key Decisions
Heap array Store values 0-based indexing
Sift ops Restore invariant Compare parent/child
PQ wrapper Task ordering Min vs max mode

4.3 Data Structures

typedef struct {
    int* data;
    size_t size;
    size_t capacity;
    int is_min;
} Heap;

4.4 Algorithm Overview

Key Algorithm: Insert

  1. Append value to end.
  2. Sift up until heap property holds.

Complexity Analysis:

  • Time: O(log n) insert/extract
  • Space: O(n)

5. Implementation Guide

5.1 Development Environment Setup

clang -Wall -Wextra -g -o heap_demo src/*.c

5.2 Project Structure

project-root/
├── src/
│   ├── heap.c
│   ├── pq.c
│   └── main.c
├── tests/
│   └── test_heap.c
└── Makefile

5.3 The Core Question You’re Answering

“How can I always find the minimum (or maximum) element without fully sorting?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Array index math for heaps
    • parent = (i-1)/2
    • left = 2i+1, right = 2i+2
  2. Heapify vs insert

5.5 Questions to Guide Your Design

  1. Will you support min and max heaps with a flag?
  2. How will you handle resizing?
  3. How will you print heap levels clearly?

5.6 Thinking Exercise

Take array [5, 3, 8, 2, 4] and build a min-heap by hand using insert. Show the array after each step.

5.7 The Interview Questions They’ll Ask

  1. “Why is heap insert O(log n)?”
  2. “How does heapify run in O(n)?”
  3. “When would you use a heap over sorting?”

5.8 Hints in Layers

Hint 1: Implement sift up/down first Everything else uses them.

Hint 2: Add a validator Check heap invariant for debugging.

5.9 Books That Will Help

Topic Book Chapter
Heaps CLRS Ch. 6
Priority queues Algorithms, 4th Edition PQ chapter

5.10 Implementation Phases

Phase 1: Foundation (2 days)

Goals:

  • Heap insert/extract

Tasks:

  1. Implement array-based heap.
  2. Add insert and extract.

Checkpoint: Min-heap returns correct min after each insert.

Phase 2: Core Functionality (3 days)

Goals:

  • Heapify and top-k

Tasks:

  1. Implement heapify from array.
  2. Add top-k using min-heap of size k.

Checkpoint: Top-k output matches sorted list.

Phase 3: Polish & Edge Cases (2 days)

Goals:

  • Priority queue demo
  • Visualization

Tasks:

  1. Add task scheduling by priority.
  2. Print heap levels.

Checkpoint: Scheduler dispatches tasks by priority.

5.11 Key Implementation Decisions

Decision Options Recommendation Rationale
Indexing 0-based, 1-based 0-based C array natural
Heapify Bottom-up, repeated insert Bottom-up O(n) build

6. Testing Strategy

6.1 Test Categories

Category Purpose Examples
Unit Tests Heap ops insert/extract
Integration Tests PQ demo task ordering
Edge Case Tests Empty heap extract on empty

6.2 Critical Test Cases

  1. Insert 1..10 and extract in order.
  2. Build heap from random array and validate invariant.
  3. Top-k returns correct list on duplicates.

6.3 Test Data

Values: 9, 1, 4, 7, 3, 8

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall Symptom Solution
Wrong parent/child math Invalid heap Verify index formulas
Not swapping properly Heap invariant broken Use helper swap
Forgetting to decrease size Extract repeats Update size before sift

7.2 Debugging Strategies

  • Print heap array after each operation.
  • Add a validation function to check invariant.

7.3 Performance Traps

  • Rebuilding heap from scratch on every insert is O(n log n).

8. Extensions & Challenges

8.1 Beginner Extensions

  • Implement a max-heap.
  • Add decrease-key operation.

8.2 Intermediate Extensions

  • Implement a d-ary heap.
  • Add heap sort.

8.3 Advanced Extensions

  • Build a Fibonacci heap interface (theory-only).
  • Use heap in Dijkstra’s algorithm.

9. Real-World Connections

9.1 Industry Applications

  • Schedulers: pick highest priority tasks.
  • Streaming analytics: top-k queries.
  • Linux kernel: uses heaps in subsystems.
  • Lucene: priority queues for search.

9.3 Interview Relevance

  • Heap and priority queue implementations are common interview topics.

10. Resources

10.1 Essential Reading

  • CLRS - Ch. 6
  • Algorithms, 4th Edition - priority queues

10.2 Video Resources

  • “Heaps and Priority Queues” - data structures lectures

10.3 Tools & Documentation

  • gdb: https://www.gnu.org/software/gdb/

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain heap property and sifting.
  • I can derive parent/child indices.
  • I can explain O(log n) behavior.

11.2 Implementation

  • Heap operations pass tests.
  • Top-k demo works correctly.

11.3 Growth

  • I can extend to decrease-key.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Min-heap with insert/extract.
  • Heap invariant validation.

Full Completion:

  • Priority queue demo and top-k.

Excellence (Going Above & Beyond):

  • D-ary heap or heap sort.

This guide was generated from DATA_STRUCTURES_FROM_FIRST_PRINCIPLES.md. For the complete learning path, see the parent directory README.