Project 3: Concordance Generator
Quick Reference
| Attribute | Value |
|---|---|
| Primary Language | C++ |
| Alternative Languages | Python |
| Difficulty | Level 2: Intermediate |
| Time Estimate | Weekend |
| Knowledge Area | Text Processing / Compound Data Structures |
| Tooling | N/A |
| Prerequisites | Project 1 |
What You Will Build
A tool that reads a text file and generates a concordance: an alphabetical list of every unique word and the line numbers on which it appears, with no duplicate line numbers per word.
Why It Matters
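This project practices a pattern that recurs in real-world tooling: tokenizing and normalizing text, then indexing it in nested associative containers. A concordance is essentially a small inverted index, the structure search tools use to map each term to the places where it occurs.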
Core Challenges
- Splitting lines into words → maps to string manipulation, `stringstream`
- Normalizing words → maps to using `std::transform` with a lambda to convert to lowercase, and the erase-remove idiom to strip punctuation (`std::transform` alone cannot drop characters)
- Storing unique, sorted line numbers for each word → maps to leveraging `std::set`'s automatic sorting and uniqueness
- Storing words in alphabetical order → maps to leveraging `std::map`'s automatic key sorting
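The first two challenges can be sketched as follows. This is a minimal illustration, not a prescribed solution: the helper names `normalize` and `split_words` are hypothetical, and the code assumes ASCII input where "removing punctuation" means dropping every character for which `std::ispunct` is true.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: lowercase a token and drop its punctuation.
std::string normalize(std::string word) {
    // Erase-remove idiom: std::remove_if moves the kept characters to the
    // front, and erase() trims the leftover tail.
    word.erase(std::remove_if(word.begin(), word.end(),
                              [](unsigned char c) { return std::ispunct(c) != 0; }),
               word.end());
    // std::transform rewrites each remaining character in place.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return word;
}

// Hypothetical helper: split a line on whitespace and normalize each token.
std::vector<std::string> split_words(const std::string& line) {
    std::vector<std::string> words;
    std::istringstream stream(line);
    std::string token;
    while (stream >> token) {         // operator>> skips runs of whitespace
        std::string word = normalize(token);
        if (!word.empty()) {          // skip tokens that were pure punctuation
            words.push_back(word);
        }
    }
    return words;
}
```

With these assumptions, `split_words("It was the best of times.")` yields `it`, `was`, `the`, `best`, `of`, `times`. One consequence of the design is that an apostrophe counts as punctuation, so `don't` becomes `dont`; a different rule may suit your input better.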
Key Concepts
- `std::set`: A container for unique, sorted elements. (cppreference.com)
- Nested STL Containers: e.g., `map<string, set<int>>`.
- `std::transform`: Applying an operation to every element in a range. (cppreference.com)
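To make the nested-container idea concrete, here is a small sketch of how a `map<string, set<int>>` might be updated. The alias `Concordance` and the function `record` are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical alias: each word maps to the set of lines it occurs on.
using Concordance = std::map<std::string, std::set<int>>;

// Hypothetical helper: record that `word` occurs on `line`.
void record(Concordance& index, const std::string& word, int line) {
    // operator[] creates an empty set the first time a word is seen;
    // set::insert ignores a line number that is already present.
    index[word].insert(line);
}
```

Because both containers keep their contents ordered, no explicit sort or duplicate check is needed: iterating the map visits words alphabetically, and iterating each set visits line numbers in ascending order.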
Real-World Outcome
It was the best of times.
It was the worst of times.
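Assuming the two lines above are the contents of the input file, the sketch below shows one way the concordance for them could be built and printed. The function names and the `word: n, n` output format are assumptions made for illustration; the original output listing is not shown here.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <map>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

using Concordance = std::map<std::string, std::set<int>>;  // hypothetical alias

// Hypothetical function: read `in` line by line and index every word.
Concordance build_concordance(std::istream& in) {
    Concordance index;
    std::string line;
    int line_number = 0;
    while (std::getline(in, line)) {
        ++line_number;  // lines are numbered from 1
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            // Strip punctuation, then lowercase (ASCII only).
            word.erase(std::remove_if(word.begin(), word.end(),
                                      [](unsigned char c) { return std::ispunct(c) != 0; }),
                       word.end());
            std::transform(word.begin(), word.end(), word.begin(),
                           [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
            if (!word.empty()) {
                index[word].insert(line_number);
            }
        }
    }
    return index;
}

// Hypothetical function: print one "word: line, line" row per entry.
void print_concordance(const Concordance& index, std::ostream& out) {
    for (const auto& entry : index) {        // map iterates in key order
        out << entry.first << ":";
        const char* separator = " ";
        for (int n : entry.second) {         // set iterates in ascending order
            out << separator << n;
            separator = ", ";
        }
        out << '\n';
    }
}

// For the two sample lines, print_concordance writes:
//   best: 1
//   it: 1, 2
//   of: 1, 2
//   the: 1, 2
//   times: 1, 2
//   was: 1, 2
//   worst: 2
```

Taking `std::istream&` and `std::ostream&` rather than file names keeps both functions easy to test with string streams and lets a caller pass `std::cin`, `std::cout`, or file streams.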
Implementation Guide
- Reproduce the simplest happy-path scenario.
- Build the smallest working version of the core feature.
- Add input validation and error handling.
- Add instrumentation/logging to confirm behavior.
- Refactor into clean modules with tests.
Milestones
- Milestone 1: Minimal working program that runs end-to-end.
- Milestone 2: Correct outputs for typical inputs.
- Milestone 3: Robust handling of edge cases.
- Milestone 4: Clean structure and documented usage.
Validation Checklist
- Output matches the real-world outcome example
- Handles invalid inputs safely
- Provides clear errors and exit codes
- Repeatable results across runs
References
- Main guide: `LEARN_CPP_STL_DEEP_DIVE.md`
- “Effective STL” by Scott Meyers