Project 13: Fuzzy String Matcher (fzf-style)
A fuzzy matcher that finds items matching a pattern with characters in order but not necessarily contiguous. Score matches by quality (gaps, position of matches, word boundaries).
Quick Reference
| Attribute | Value |
|---|---|
| Primary Language | Go |
| Alternative Languages | Rust, C++, Python |
| Difficulty | Level 3: Advanced |
| Time Estimate | 2 weeks |
| Knowledge Area | Algorithms / String Matching |
| Tooling | Custom Implementation |
| Prerequisites | Dynamic programming, string algorithms |
What You Will Build
A fuzzy matcher that finds items matching a pattern with characters in order but not necessarily contiguous. Score matches by quality (gaps, position of matches, word boundaries).
Why It Matters
This project builds core skills that appear repeatedly in real-world systems and tooling.
Core Challenges
- Subsequence matching → maps to dynamic programming
- Scoring function design → maps to heuristics for “good” matches
- Word boundary bonuses → maps to camelCase and snake_case awareness
- Performance optimization → maps to early termination, SIMD for candidate filtering
Key Concepts
- Longest Common Subsequence: Base algorithm for fuzzy matching
- Scoring Heuristics: Position bonuses, gap penalties, boundary bonuses
- Smith-Waterman: Scoring inspiration from bioinformatics
- fzf algo.go: The actual implementation to study
Real-World Outcome
$ ./fuzzy_matcher
Input: fzf
Candidates: [fuzzy_finder, foozbaz, fizzbuzz, FuzzyZoneFinder]
Matches (sorted by score):
1. FuzzyZoneFinder score=95 (f-z-f at word boundaries!)
2. fuzzy_finder score=78 (f-z-f with small gaps)
3. fizzbuzz score=45 (f-z at start, z far)
4. foozbaz score=32 (f-z far apart, z before a)
Score breakdown for "FuzzyZoneFinder":
'F' at position 0: +20 (start bonus)
'z' at position 3: +15 (after capital, word boundary)
'f' at position 10: +18 (camelCase boundary)
Gap penalty: -3 (3 chars between z and f)
Total: 95
Performance:
10,000 candidates: 2ms
100,000 candidates: 18ms
1,000,000 candidates: 150ms
Implementation Guide
- Reproduce the simplest happy-path scenario.
- Build the smallest working version of the core feature.
- Add input validation and error handling.
- Add instrumentation/logging to confirm behavior.
- Refactor into clean modules with tests.
Milestones
- Milestone 1: Minimal working program that runs end-to-end.
- Milestone 2: Correct outputs for typical inputs.
- Milestone 3: Robust handling of edge cases.
- Milestone 4: Clean structure and documented usage.
Validation Checklist
- Output matches the real-world outcome example
- Handles invalid inputs safely
- Provides clear errors and exit codes
- Repeatable results across runs
References
- Main guide:
TEXT_SEARCH_TOOLS_DEEP_DIVE.md - fzf source code (algo.go)