Project 7: Configuration Parser
Build a parser for an INI-like configuration format and expose a clean query API.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1 week |
| Language | C |
| Prerequisites | Projects 2-3, string handling |
| Key Topics | Parsing, state machines, I/O |
1. Learning Objectives
By completing this project, you will:
- Parse structured text with a simple grammar.
- Handle whitespace, comments, and quoting reliably.
- Store key-value pairs efficiently.
- Design a safe API to query config values.
2. Theoretical Foundation
2.1 Core Concepts
- Tokenization: Split input into meaningful tokens (section, key, value).
- State machines: Track whether you are parsing section headers or key-value lines.
- Error reporting: Line numbers and messages matter for usability.
2.2 Why This Matters
Configuration parsing is a common systems problem. The ability to build a robust parser translates to compilers, file formats, and protocol handlers.
2.3 Historical Context / Background
INI-style configs are widely used for their simplicity and human readability. Parsing them correctly still requires careful handling of edge cases.
2.4 Common Misconceptions
- “Whitespace doesn’t matter”: It matters around keys and values.
- “Comments are easy”: They must be ignored without breaking parsing.
3. Project Specification
3.1 What You Will Build
A config parser for this format:
# comment
[server]
port = 8080
host = 127.0.0.1
[logging]
level = info
Provide functions to load a file and query section.key values.
3.2 Functional Requirements
- Support sections and key-value pairs.
- Ignore blank lines and comments (# or ;).
- Trim whitespace around keys and values.
- Return errors with line numbers on malformed input.
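The skip rule for blank lines and comments can be sketched as one small predicate. The name line_is_ignorable is hypothetical, and the sketch assumes a comment marker only counts when it is the first non-whitespace character on the line.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: true if the line is blank or is a comment that
   starts with '#' or ';' after optional leading whitespace. */
bool line_is_ignorable(const char *line)
{
    /* isspace() needs an unsigned char value to stay well defined. */
    while (*line != '\0' && isspace((unsigned char)*line)) {
        line++;
    }
    return *line == '\0' || *line == '#' || *line == ';';
}
```

Calling this first keeps the rest of the parser free of comment handling: anything that gets past it must be a section header, a key-value pair, or an error.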
3.3 Non-Functional Requirements
- Reliability: No crashes on malformed input.
- Usability: Helpful error messages.
- Performance: Linear time parse.
3.4 Example Usage / Output
Config cfg;
config_load(&cfg, "app.ini");
const char *port = config_get(&cfg, "server", "port");
// strcmp(port, "8080") == 0
3.5 Real World Outcome
You can parse app settings into a queryable structure and confidently handle errors in config files for real applications.
4. Solution Architecture
4.1 High-Level Design
read lines -> parse -> store in map -> query API
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Line reader | Read file line-by-line | getline() |
| Parser | Interpret sections and key/value | Simple state machine |
| Store | Map section+key to value | Hash table or list |
4.3 Data Structures
typedef struct {
char *section;
char *key;
char *value;
} ConfigEntry;
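One possible way to hold these entries is a growable array that owns copies of every string. This is a minimal sketch, not the required design: ConfigStore, store_add, store_get, and store_free are hypothetical names, and the choice that the last duplicate wins is an assumption you may want to change.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *section;
    char *key;
    char *value;
} ConfigEntry;

/* Hypothetical container: a growable array of entries. */
typedef struct {
    ConfigEntry *items;
    size_t count;
    size_t capacity;
} ConfigStore;

/* Standard C replacement for strdup(); returns NULL on allocation failure. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) memcpy(p, s, n);
    return p;
}

/* Appends copies of the three strings. Returns 0 on success, -1 on failure. */
int store_add(ConfigStore *st, const char *section, const char *key,
              const char *value)
{
    if (st->count == st->capacity) {
        size_t cap = st->capacity ? st->capacity * 2 : 8;
        ConfigEntry *grown = realloc(st->items, cap * sizeof *grown);
        if (grown == NULL) return -1;
        st->items = grown;
        st->capacity = cap;
    }
    ConfigEntry e = { copy_string(section), copy_string(key), copy_string(value) };
    if (e.section == NULL || e.key == NULL || e.value == NULL) {
        free(e.section);
        free(e.key);
        free(e.value);
        return -1;
    }
    st->items[st->count++] = e;
    return 0;
}

/* Linear search from the end, so a later duplicate overrides an earlier one
   (an assumed policy). Returns NULL if the key is absent. */
const char *store_get(const ConfigStore *st, const char *section, const char *key)
{
    for (size_t i = st->count; i > 0; i--) {
        const ConfigEntry *e = &st->items[i - 1];
        if (strcmp(e->section, section) == 0 && strcmp(e->key, key) == 0) {
            return e->value;
        }
    }
    return NULL;
}

/* Releases every owned string and resets the store to empty. */
void store_free(ConfigStore *st)
{
    for (size_t i = 0; i < st->count; i++) {
        free(st->items[i].section);
        free(st->items[i].key);
        free(st->items[i].value);
    }
    free(st->items);
    st->items = NULL;
    st->count = st->capacity = 0;
}
```

Start from a zero-initialized store (ConfigStore st = {0};). Because the store copies its inputs, the parser can reuse its line buffer without leaving dangling pointers behind.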
4.4 Algorithm Overview
Key Algorithm: Line parse
- Trim whitespace.
- If line starts with [, parse section.
- Else parse key = value.
- Store entry in map.
Complexity Analysis:
- Time: O(n) in total input length
- Space: O(n) for stored strings
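The line-parse steps can be sketched as a function that classifies one line and hands back pointers into the same buffer. LineKind and parse_line are hypothetical names. The sketch assumes the line may be modified in place, that the first = separates key from value, and that nothing may follow the closing ] of a section header.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical classification of a single input line. */
typedef enum { LINE_BLANK, LINE_SECTION, LINE_PAIR, LINE_ERROR } LineKind;

/* Trims both ends in place and returns the new start of the text. */
static char *trim(char *s)
{
    while (isspace((unsigned char)*s)) s++;
    char *end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1])) end--;
    *end = '\0';
    return s;
}

/* Parses one line, modifying the buffer.
   LINE_SECTION: *a is the section name.
   LINE_PAIR:    *a is the key and *b is the value (possibly empty).
   Other kinds:  *a and *b should not be used. */
LineKind parse_line(char *line, char **a, char **b)
{
    char *s = trim(line);
    *a = *b = NULL;

    if (*s == '\0' || *s == '#' || *s == ';') return LINE_BLANK;

    if (*s == '[') {
        char *close = strchr(s, ']');
        if (close == NULL || close[1] != '\0') return LINE_ERROR;
        *close = '\0';
        *a = trim(s + 1);
        return **a != '\0' ? LINE_SECTION : LINE_ERROR;
    }

    char *eq = strchr(s, '=');   /* first '=' splits key from value */
    if (eq == NULL) return LINE_ERROR;
    *eq = '\0';
    *a = trim(s);
    *b = trim(eq + 1);
    return **a != '\0' ? LINE_PAIR : LINE_ERROR;
}
```

The caller keeps the "current section" state: on LINE_SECTION it copies the name, and on LINE_PAIR it stores the pair under that name. Each character is visited a bounded number of times, which matches the linear time goal.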
5. Implementation Guide
5.1 Development Environment Setup
cc -Wall -Wextra -O2 -g -Isrc -o test_config tests/test_config.c src/config.c
5.2 Project Structure
config/
├── src/
│ ├── config.c
│ └── config.h
├── tests/
│ └── test_config.c
└── README.md
5.3 The Core Question You’re Answering
“How do I turn human-written text into structured data reliably?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Line reading
  - How does getline() allocate buffers?
- Whitespace trimming
  - Safe trimming without losing data.
- Error handling
  - How to return line number and error code.
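To make the getline() question concrete, here is a minimal sketch of a read loop that tracks line numbers. scan_lines is a hypothetical helper, not part of the required API. It assumes a POSIX system, since getline() is not in standard C.

```c
#define _POSIX_C_SOURCE 200809L  /* must precede all includes; exposes getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical helper: reads fp line by line, returns the number of lines,
   and stores the longest line length (newline excluded) in *longest. */
long scan_lines(FILE *fp, size_t *longest)
{
    char *buf = NULL;   /* NULL + 0 tells getline() to allocate for us */
    size_t cap = 0;     /* capacity of buf, updated by getline() as it grows */
    ssize_t len;
    long lineno = 0;
    size_t max = 0;

    assert(fp != NULL);
    while ((len = getline(&buf, &cap, fp)) != -1) {
        lineno++;       /* 1-based line number for error messages */
        if (len > 0 && buf[len - 1] == '\n') {
            buf[--len] = '\0';   /* drop the newline getline() keeps */
        }
        if ((size_t)len > max) max = (size_t)len;
        /* buf is reused on the next call: copy anything you want to keep. */
    }
    free(buf);          /* one free for the whole loop */
    if (longest != NULL) *longest = max;
    return lineno;
}
```

The same buffer is reused and resized across iterations, so it is freed once after the loop rather than once per line. The last line of a file may lack a trailing newline, which is why the strip is conditional.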
5.5 Questions to Guide Your Design
Before implementing, think through these:
- Will you allow duplicate keys?
- What happens if key = has an empty value?
- How will you escape # in values?
5.6 Thinking Exercise
Edge Case Parsing
How should you interpret a line like path = /tmp#cache? Does # start a comment, or is it part of the value?
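One possible answer, shown only as a sketch: treat # or ; as a comment marker when it is the first character of the value or is preceded by whitespace. Under that assumed rule /tmp#cache is kept whole, while /tmp #cache is cut after /tmp. strip_inline_comment is a hypothetical name, and other INI dialects choose differently.

```c
#include <ctype.h>

/* Hypothetical rule: '#' or ';' starts a comment only at the start of the
   value or right after whitespace. The value is truncated in place. */
void strip_inline_comment(char *value)
{
    for (char *p = value; *p != '\0'; p++) {
        if ((*p == '#' || *p == ';') &&
            (p == value || isspace((unsigned char)p[-1]))) {
            *p = '\0';
            return;
        }
    }
}
```

Run this before trimming the value so the whitespace left in front of the removed comment is cleaned up by the normal trim step.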
5.7 The Interview Questions They’ll Ask
Prepare to answer these:
- “How do you parse structured text in C?”
- “What is a state machine and why use it here?”
- “How do you report parse errors with context?”
5.8 Hints in Layers
Hint 1: Parse sections first
Handle [section] lines separately.
Hint 2: Implement trimming helpers
Write trim_left and trim_right.
Hint 3: Build a simple key store
Use a list or map for entries.
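For Hint 2, a minimal sketch of the two helpers might look like this. The signatures are an assumption: both work in place on a writable buffer, and trim_left returns a pointer into that buffer instead of moving bytes.

```c
#include <ctype.h>
#include <string.h>

/* Returns a pointer to the first non-whitespace character of s.
   The result points inside s, so never pass it to free(). */
char *trim_left(char *s)
{
    while (isspace((unsigned char)*s)) s++;
    return s;
}

/* Cuts trailing whitespace by writing a terminator, then returns s. */
char *trim_right(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1])) len--;
    s[len] = '\0';
    return s;
}
```

Composed as trim_right(trim_left(buf)), they trim both ends. The cast to unsigned char matters because passing a negative char value to isspace() is undefined behavior.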
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Parsing basics | “Crafting Interpreters” | Ch. 4 |
| C strings | “The C Programming Language” | Ch. 5 |
5.10 Implementation Phases
Phase 1: Foundation (2-3 days)
Goals:
- Read lines and parse sections
Tasks:
- Implement line reader.
- Parse section headers.
Checkpoint: Sections recognized with line numbers.
Phase 2: Core Functionality (2-3 days)
Goals:
- Parse key/value pairs
Tasks:
- Split on =.
- Trim key and value.
Checkpoint: Values stored correctly.
Phase 3: Polish & Edge Cases (1-2 days)
Goals:
- Error handling and comments
Tasks:
- Skip comments and blank lines.
- Report malformed lines.
Checkpoint: Clear errors for bad configs.
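One way to carry both an error code and a line number out of the parser is a small error struct that the caller passes in. Everything here is hypothetical (ConfigStatus, ConfigError, config_error_set, and the specific codes); it is a sketch of the idea, not a prescribed interface.

```c
#include <stdio.h>

/* Hypothetical status codes for malformed input. */
typedef enum {
    CFG_OK = 0,
    CFG_ERR_SECTION,         /* e.g. "[bad" with no closing bracket */
    CFG_ERR_MISSING_EQUALS,  /* a non-comment line without '=' */
    CFG_ERR_EMPTY_KEY        /* e.g. "= value" */
} ConfigStatus;

/* Hypothetical error record filled in by the parser. */
typedef struct {
    ConfigStatus status;
    long line;               /* 1-based line number */
    char message[128];
} ConfigError;

/* Records the failure and returns status so callers can write
   "return config_error_set(...);". err may be NULL if nobody is listening. */
ConfigStatus config_error_set(ConfigError *err, ConfigStatus status,
                              long line, const char *detail)
{
    if (err != NULL) {
        err->status = status;
        err->line = line;
        /* snprintf truncates instead of overflowing the fixed buffer. */
        snprintf(err->message, sizeof err->message, "line %ld: %s", line, detail);
    }
    return status;
}
```

A fixed-size message buffer avoids allocation on the error path, so reporting a failure cannot itself fail for lack of memory.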
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Storage | List vs hash map | Hash map | Fast lookups |
| Comment syntax | # only vs # and ; | Both | Common INI formats |
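If you follow the hash map recommendation, the section and key must be hashed together. The sketch below uses 32-bit FNV-1a, which is an assumption chosen for brevity and not something the project mandates; fnv1a_step and config_bucket are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Feeds the bytes of s into a running 32-bit FNV-1a hash. */
static uint32_t fnv1a_step(uint32_t h, const char *s)
{
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;              /* FNV prime */
    }
    return h;
}

/* Maps (section, key) to a bucket index in [0, nbuckets).
   nbuckets must be greater than zero. */
size_t config_bucket(const char *section, const char *key, size_t nbuckets)
{
    uint32_t h = 2166136261u;        /* FNV offset basis */
    h = fnv1a_step(h, section);
    h = fnv1a_step(h, "\x1f");       /* separator: ("ab","c") vs ("a","bc") */
    h = fnv1a_step(h, key);
    return (size_t)(h % nbuckets);
}
```

Hashing the two strings with a separator avoids allocating a joined "section.key" string for every lookup. Collisions still need handling, for example by chaining entries within each bucket.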
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Parsing helpers | Trim functions |
| Integration Tests | Parse full files | Sample configs |
| Error Tests | Malformed lines | Missing = |
6.2 Critical Test Cases
- Empty file: No entries.
- Duplicate keys: Defined behavior.
- Malformed section: Error with line number.
6.3 Test Data
[app]
name = demo
[bad
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Not trimming whitespace | Keys with spaces | Trim both sides |
| Not copying strings | Use-after-free | Duplicate strings |
| Poor error messages | Hard to debug | Include line numbers |
7.2 Debugging Strategies
- Print parsed entries as they are added.
- Add tests for each edge case.
7.3 Performance Traps
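Repeated string scans can be O(n^2): for example, calling strlen() in a loop condition, or shifting the whole buffer once per trimmed character. Use single-pass trimming.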
8. Extensions & Challenges
8.1 Beginner Extensions
- Add typed getters (get_int, get_bool).
- Add default values.
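The typed getters can be built on small conversion helpers that work on the raw string value. This is a sketch under assumptions: parse_int, parse_bool, and int_or_default are hypothetical names, and the accepted boolean spellings are one possible choice. A get_int wrapper would pass the result of the string lookup to these.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Parses text as a base-10 int. Returns true and sets *out on success. */
bool parse_int(const char *text, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (end == text || *end != '\0') return false;   /* no digits or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return false;
    *out = (int)v;
    return true;
}

/* Parses an assumed set of boolean spellings (case-sensitive). */
bool parse_bool(const char *text, bool *out)
{
    static const char *const yes[] = { "true", "yes", "on", "1" };
    static const char *const no[]  = { "false", "no", "off", "0" };
    for (size_t i = 0; i < sizeof yes / sizeof yes[0]; i++) {
        if (strcmp(text, yes[i]) == 0) { *out = true;  return true; }
        if (strcmp(text, no[i])  == 0) { *out = false; return true; }
    }
    return false;
}

/* Default-value variant: text may be NULL when the key is missing. */
int int_or_default(const char *text, int fallback)
{
    int v;
    return (text != NULL && parse_int(text, &v)) ? v : fallback;
}
```

Checking the end pointer from strtol() is what rejects values such as 80x, which atoi() would silently accept as 80.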
8.2 Intermediate Extensions
- Support quoted values with spaces.
- Support multi-line values.
8.3 Advanced Extensions
- Add JSON or YAML support.
- Provide schema validation.
9. Real-World Connections
9.1 Industry Applications
- Services: Parse config files at startup.
- Embedded: Lightweight config parsing.
9.2 Related Open Source Projects
- inih: Minimal INI parser in C.
9.3 Interview Relevance
Parsing and robust error handling are common systems tasks.
10. Resources
10.1 Essential Reading
- “The C Programming Language” - Ch. 5
- “Crafting Interpreters” - Parsing intro
10.2 Video Resources
- Parsing and tokenization lectures
10.3 Tools & Documentation
- man 3 getline: Line input semantics
10.4 Related Projects in This Series
- String Library: Needed for parsing.
- Hash Table: Used for storage.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain parser state machines.
- I can handle whitespace reliably.
- I can design clear error messages.
11.2 Implementation
- Parser reads valid configs correctly.
- Malformed files produce errors.
- Memory is managed correctly.
11.3 Growth
- I can add typed getters.
- I can explain this project in an interview.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Sections and key/value pairs parsed.
Full Completion:
- Comments, whitespace, and errors handled.
Excellence (Going Above & Beyond):
- Typed getters and schema validation.
This guide was generated from C_PROGRAMMING_COMPLETE_MASTERY.md. For the complete learning path, see the parent directory.