Project 3: Binary Search Everything

Build a toolkit of binary search variants for arrays and answer spaces.

Quick Reference

Attribute Value
Difficulty Level 1: Beginner
Time Estimate Weekend
Language Python (Alternatives: C, Java, Go)
Prerequisites Sorted arrays, loops, basic functions
Key Topics Binary search, invariants, off-by-one errors

1. Learning Objectives

By completing this project, you will:

  1. Implement multiple binary search variants correctly.
  2. Use loop invariants to reason about correctness.
  3. Apply binary search to answer spaces (parametric search).
  4. Identify and fix off-by-one errors.

2. Theoretical Foundation

2.1 Core Concepts

  • Sorted order requirement: Binary search works only when monotonicity exists.
  • Loop invariants: The target must remain in the search interval after each iteration.
  • Lower and upper bounds: First occurrence, last occurrence, and insertion position.
  • Answer-space search: Searching over possible answers when a predicate is monotonic.

2.2 Why This Matters

Binary search is a building block for efficient algorithms and appears frequently in interviews. It also teaches rigorous reasoning about correctness.

2.3 Historical Context / Background

The classic binary search bug (Bentley, 1980s) shows that even simple algorithms are easy to implement incorrectly without invariants.

2.4 Common Misconceptions

  • Misconception: Binary search only works on arrays. Reality: It works on any monotonic predicate.
  • Misconception: mid = (l + r) / 2 is safe. Reality: It can overflow in some languages.

3. Project Specification

3.1 What You Will Build

A CLI tool and small library implementing binary search variants: exact match, first/last occurrence, insertion point, and parametric search (for example, minimum feasible capacity).

3.2 Functional Requirements

  1. Exact search: Return index or -1.
  2. Bounds: Implement lower_bound and upper_bound.
  3. Insertion point: Return where a value should be inserted.
  4. Answer-space search: Solve at least two problems with monotonic predicates.

3.3 Non-Functional Requirements

  • Reliability: All variants must be tested against edge cases.
  • Usability: Clear function names and CLI examples.
  • Correctness: Use invariants in documentation.

3.4 Example Usage / Output

$ ./binary-search-tool --array 1,2,2,2,3 --find 2
first=1 last=3
$ ./binary-search-tool --capacity --weights 1,2,3,4 --days 3
min_capacity=4

3.5 Real World Outcome

You will have a reusable library of binary search functions and a CLI that demonstrates each variant. You will see how a monotonic predicate turns many optimization problems into log-time search.


4. Solution Architecture

4.1 High-Level Design

┌─────────────┐     ┌──────────────┐     ┌───────────────┐
│ CLI Parser  │────▶│ Search Core  │────▶│ Results Output │
└─────────────┘     └──────┬───────┘     └───────────────┘
                            │
                      ┌─────▼─────┐
                      │ Predicates│
                      └───────────┘

4.2 Key Components

Component Responsibility Key Decisions
Search Core Binary search variants Shared loop invariant template
Predicates Monotonic checks Separate from core search
CLI Input parsing Keep examples simple

4.3 Data Structures

# Basic container for parameters
params = {
    "array": [1, 2, 2, 2, 3],
    "target": 2,
}

4.4 Algorithm Overview

Key Algorithm: Lower Bound

  1. Maintain [l, r) interval.
  2. If a[mid] < target, move l right.
  3. Else move r left.

Complexity Analysis:

  • Time: O(log n)
  • Space: O(1)

5. Implementation Guide

5.1 Development Environment Setup

python3 -m venv .venv
. .venv/bin/activate

5.2 Project Structure

binary-search-tool/
├── src/
│   ├── core.py
│   ├── predicates.py
│   └── cli.py
├── tests/
└── README.md

5.3 The Core Question You’re Answering

“How can one loop template solve many different search problems safely?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Monotonic predicates
  2. Loop invariants
  3. Overflow-safe mid calculation

5.5 Questions to Guide Your Design

  1. Will you use [l, r] or [l, r) intervals?
  2. How will you express invariants in comments?
  3. Which real problems can be solved via answer-space search?

5.6 Thinking Exercise

Given [1, 2, 2, 2, 3], trace lower_bound for target 2 and write l, r, mid at each step.

5.7 The Interview Questions They’ll Ask

  1. Implement lower_bound and upper_bound.
  2. Explain how to avoid infinite loops in binary search.
  3. Why is binary search useful beyond arrays?

5.8 Hints in Layers

Hint 1: Use half-open intervals [l, r) avoids ambiguous mid updates.

Hint 2: Predicate helper Wrap problem-specific checks into a function ok(x).

Hint 3: Overflow-safe mid Use mid = l + (r - l) // 2.

5.9 Books That Will Help

Topic Book Chapter
Binary search “Algorithmic Thinking” Ch. 6
Off-by-one errors “Programming Pearls” Column 4

5.10 Implementation Phases

Phase 1: Foundation (2-3 hours)

Goals:

  • Implement exact search
  • Add simple tests

Tasks:

  1. Write iterative binary search.
  2. Test with small arrays.

Checkpoint: Exact search passes all tests.

Phase 2: Core Functionality (3-4 hours)

Goals:

  • Add lower/upper bound variants
  • Add insertion point

Tasks:

  1. Implement bounds using half-open intervals.
  2. Add docs for invariants.

Checkpoint: Bound functions match Python bisect output.

Phase 3: Answer-Space Search (4-6 hours)

Goals:

  • Implement two predicate-based searches
  • Add CLI commands

Tasks:

  1. Add “ship packages” capacity search.
  2. Add “min days” feasibility search.

Checkpoint: CLI demonstrates correct results.

5.11 Key Implementation Decisions

Decision Options Recommendation Rationale
Interval style [l, r], [l, r) [l, r) Fewer off-by-one errors
Recursion vs iteration recursive, iterative iterative Clearer invariants
Predicates inline, helper helper Reusable logic

6. Testing Strategy

6.1 Test Categories

Category Purpose Examples
Unit Tests Validate search variants Exact, lower, upper
Integration Tests CLI parsing Argument combinations
Edge Case Tests Empty arrays, single element Target missing

6.2 Critical Test Cases

  1. Empty array returns -1 or insertion point 0.
  2. Array with duplicates returns first/last correctly.
  3. Target smaller/larger than all elements.

6.3 Test Data

[]
[5]
[1,2,2,2,3]

7. Common Pitfalls and Debugging

7.1 Frequent Mistakes

Pitfall Symptom Solution
Using while l <= r with wrong updates Infinite loop Use half-open interval
Using mid = (l + r) // 2 in C Overflow Use l + (r - l)/2
Mixing exact and lower bound logic Wrong indices Separate functions

7.2 Debugging Strategies

  • Print l, r, mid on each iteration for a small input.
  • Compare against known bisect outputs.

7.3 Performance Traps

None. Binary search is already O(log n) and fast for all inputs.


8. Extensions and Challenges

8.1 Beginner Extensions

  • Add a visualization of mid movement.
  • Add random test case generator.

8.2 Intermediate Extensions

  • Implement binary search on a rotated array.
  • Add floating-point answer-space search.

8.3 Advanced Extensions

  • Prove invariants formally in documentation.
  • Use binary search to optimize a scheduling objective.

9. Real-World Connections

9.1 Industry Applications

  • Searching in sorted indexes in databases.
  • Parametric search in optimization and planning systems.
  • libc bsearch: C standard library binary search.
  • bisect: Python standard library implementation.

9.3 Interview Relevance

Binary search variants are a classic interview topic and a frequent source of tricky bugs.


10. Resources

10.1 Essential Reading

  • “Algorithmic Thinking” by Daniel Zingaro - Ch. 6
  • “Programming Pearls” by Jon Bentley - Column 4

10.2 Video Resources

  • Binary search tutorials (search: “binary search invariants”)

10.3 Tools and Documentation

  • Python bisect module docs

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the invariant for lower_bound.
  • I can show why binary search is O(log n).
  • I can apply binary search to a monotonic predicate.

11.2 Implementation

  • All variants pass unit tests.
  • CLI outputs are correct for edge cases.

11.3 Growth

  • I can translate an optimization problem into answer-space search.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Exact search plus lower/upper bound
  • Tests for duplicates and missing targets

Full Completion:

  • Two answer-space problems solved
  • Clear invariants documented

Excellence (Going Above and Beyond):

  • Visualization of search process
  • Formal correctness explanation

This guide was generated from LEARN_ALGORITHMS_DEEP_DIVE.md. For the complete learning path, see the parent directory.