Project 7: Configuration Parser

Build a parser for an INI-like configuration format and expose a clean query API.

Quick Reference

Attribute       Value
Difficulty      Intermediate
Time Estimate   1 week
Language        C
Prerequisites   Projects 2-3, string handling
Key Topics      Parsing, state machines, I/O

1. Learning Objectives

By completing this project, you will:

  1. Parse structured text with a simple grammar.
  2. Handle whitespace, comments, and quoting reliably.
  3. Store key-value pairs efficiently.
  4. Design a safe API to query config values.

2. Theoretical Foundation

2.1 Core Concepts

  • Tokenization: Split input into meaningful tokens (section, key, value).
  • State machines: Track whether you are parsing section headers or key-value lines.
  • Error reporting: Line numbers and messages matter for usability.
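
As a rough illustration of the tokenization idea, the sketch below classifies a single line before any real parsing happens. The LineKind enum and classify_line name are hypothetical, chosen only for this example, and the rules (a section needs a closing bracket, a key-value line needs a non-empty key before =) are one reasonable reading of the format rather than a fixed specification.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical line kinds; the names are illustrative only. */
typedef enum {
    LINE_BLANK,
    LINE_COMMENT,
    LINE_SECTION,
    LINE_KEY_VALUE,
    LINE_MALFORMED
} LineKind;

/* Classify one line without modifying it. */
LineKind classify_line(const char *line)
{
    /* Skip leading whitespace. */
    while (*line != '\0' && isspace((unsigned char)*line))
        line++;

    if (*line == '\0')
        return LINE_BLANK;
    if (*line == '#' || *line == ';')
        return LINE_COMMENT;
    if (*line == '[') {
        /* Needs a closing bracket with at least one character before it. */
        const char *close = strchr(line, ']');
        return (close != NULL && close > line + 1) ? LINE_SECTION
                                                   : LINE_MALFORMED;
    }
    /* A key-value line needs '=' with at least one character before it. */
    const char *eq = strchr(line, '=');
    if (eq != NULL && eq > line)
        return LINE_KEY_VALUE;
    return LINE_MALFORMED;
}
```

A parser loop can switch on the returned kind, which keeps the "what is this line?" decision separate from the code that extracts names and values.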

2.2 Why This Matters

Configuration parsing is a common systems problem. The ability to build a robust parser translates to compilers, file formats, and protocol handlers.

2.3 Historical Context / Background

INI-style configs are widely used for their simplicity and human readability. Parsing them correctly still requires careful handling of edge cases.

2.4 Common Misconceptions

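  • “Whitespace doesn’t matter”: Whitespace around keys and values must be trimmed, but whitespace inside a value (for example, name = my app) is data and must be preserved.
  • “Comments are easy”: Comment lines must be skipped without disturbing line numbering, and a # or ; that appears inside a value needs a defined rule.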

3. Project Specification

3.1 What You Will Build

A config parser for this format:

# comment
[server]
port = 8080
host = 127.0.0.1

[logging]
level = info

Provide functions to load a file and query section.key values.

3.2 Functional Requirements

  1. Support sections and key-value pairs.
  2. Ignore blank lines and comments (# or ;).
  3. Trim whitespace around keys and values.
  4. Return errors with line numbers on malformed input.

3.3 Non-Functional Requirements

  • Reliability: No crashes on malformed input.
  • Usability: Helpful error messages.
  • Performance: Linear time parse.

3.4 Example Usage / Output

Config cfg;
config_load(&cfg, "app.ini");
const char *port = config_get(&cfg, "server", "port");
// port now points to the string "8080" (compare with strcmp(), not ==)

3.5 Real World Outcome

You can parse app settings into a queryable structure and confidently handle errors in config files for real applications.


4. Solution Architecture

4.1 High-Level Design

read lines -> parse -> store in map -> query API

4.2 Key Components

Component     Responsibility                     Key Decisions
Line reader   Read file line-by-line             getline()
Parser        Interpret sections and key/value   Simple state machine
Store         Map section+key to value           Hash table or list

4.3 Data Structures

typedef struct {
    char *section;
    char *key;
    char *value;
} ConfigEntry;
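
One simple way to hold these entries is a growable array with a linear lookup, sketched below. ConfigStore, store_add, store_get, and store_free are hypothetical names for this example, and the "last duplicate wins" rule is an assumption, not something the project requires. Every string is copied, so the store never points into the line buffer.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *section;
    char *key;
    char *value;
} ConfigEntry;

/* Hypothetical container: a growable array of entries. */
typedef struct {
    ConfigEntry *entries;
    size_t count;
    size_t capacity;
} ConfigStore;

/* Copy a string into fresh heap memory (portable stand-in for strdup). */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Append copies of the three strings. Returns 0 on success, -1 on
   allocation failure. */
int store_add(ConfigStore *st, const char *section,
              const char *key, const char *value)
{
    if (st->count == st->capacity) {
        size_t cap = st->capacity ? st->capacity * 2 : 8;
        ConfigEntry *grown = realloc(st->entries, cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        st->entries = grown;
        st->capacity = cap;
    }
    ConfigEntry e = { copy_string(section), copy_string(key),
                      copy_string(value) };
    if (e.section == NULL || e.key == NULL || e.value == NULL) {
        free(e.section);
        free(e.key);
        free(e.value);
        return -1;
    }
    st->entries[st->count++] = e;
    return 0;
}

/* Linear lookup; scans from the end so the last duplicate wins. */
const char *store_get(const ConfigStore *st, const char *section,
                      const char *key)
{
    for (size_t i = st->count; i > 0; i--) {
        const ConfigEntry *e = &st->entries[i - 1];
        if (strcmp(e->section, section) == 0 && strcmp(e->key, key) == 0)
            return e->value;
    }
    return NULL;
}

/* Release every string and the array itself. */
void store_free(ConfigStore *st)
{
    for (size_t i = 0; i < st->count; i++) {
        free(st->entries[i].section);
        free(st->entries[i].key);
        free(st->entries[i].value);
    }
    free(st->entries);
    st->entries = NULL;
    st->count = st->capacity = 0;
}
```

A zero-initialized ConfigStore is a valid empty store. The linear scan is fine for small files; swapping in a hash table later only changes store_add and store_get.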

4.4 Algorithm Overview

Key Algorithm: Line parse

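  1. Trim whitespace.
  2. If the line is empty or starts with # or ;, skip it.
  3. If the line starts with [, parse a section header and make it the current section.
  4. Otherwise, split on the first = and parse key = value.
  5. Store the entry in the map under the current section.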

Complexity Analysis:

  • Time: O(n) in total input length
  • Space: O(n) for stored strings
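
The line-parse steps above could look roughly like the following in-place parser. ParseResult and parse_line are hypothetical names, and two policies here are assumptions: text after the closing ] is rejected, and an empty value after = is accepted. The function touches each character a bounded number of times, so a full parse stays linear.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical result codes for one parsed line. */
typedef enum { PARSE_SKIP, PARSE_SECTION, PARSE_ENTRY, PARSE_ERROR } ParseResult;

/* Trim whitespace on both sides in place; returns the new start of s. */
static char *trim(char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    char *end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/*
 * Parse one line in place.
 * On PARSE_SECTION, the name is copied into section (capacity cap).
 * On PARSE_ENTRY, *key and *value point into line.
 */
ParseResult parse_line(char *line, char *section, size_t cap,
                       char **key, char **value)
{
    char *s = trim(line);

    if (*s == '\0' || *s == '#' || *s == ';')
        return PARSE_SKIP;               /* blank line or comment */

    if (*s == '[') {
        char *close = strchr(s, ']');
        if (close == NULL || close[1] != '\0')
            return PARSE_ERROR;          /* missing ']' or trailing text */
        *close = '\0';
        char *name = trim(s + 1);
        if (*name == '\0' || strlen(name) >= cap)
            return PARSE_ERROR;          /* empty or oversized name */
        strcpy(section, name);           /* safe: length checked above */
        return PARSE_SECTION;
    }

    char *eq = strchr(s, '=');           /* split on the first '=' */
    if (eq == NULL)
        return PARSE_ERROR;              /* no '=' on the line */
    *eq = '\0';
    *key = trim(s);
    *value = trim(eq + 1);
    if (**key == '\0')
        return PARSE_ERROR;              /* empty key */
    return PARSE_ENTRY;
}
```

The caller keeps the section buffer between calls, which is the only state this small state machine needs: each PARSE_ENTRY result belongs to whatever section was seen most recently.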

5. Implementation Guide

5.1 Development Environment Setup

cc -Wall -Wextra -O2 -g -Isrc -o test_config tests/test_config.c src/config.c

5.2 Project Structure

config/
├── src/
│   ├── config.c
│   └── config.h
├── tests/
│   └── test_config.c
└── README.md

5.3 The Core Question You’re Answering

“How do I turn human-written text into structured data reliably?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Line reading
    • How does getline() allocate buffers?
  2. Whitespace trimming
    • Safe trimming without losing data.
  3. Error handling
    • How to return line number and error code.
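
To make the first and third topics concrete, here is a minimal sketch that reads a stream with getline() while counting lines and fills in an error record on the first unterminated section header. ConfigError, the CONFIG_* codes, and config_check_sections are hypothetical names for illustration. getline() is POSIX rather than ISO C, so the feature-test macro is defined before any header is included.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() in <stdio.h> */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical error codes and error record. */
enum { CONFIG_OK = 0, CONFIG_ERR_SECTION = 1 };

typedef struct {
    int code;           /* CONFIG_OK or a CONFIG_ERR_* value */
    size_t line;        /* 1-based line number of the failure */
    char message[96];   /* human-readable description */
} ConfigError;

/*
 * Scan a stream line by line and report the first unterminated section
 * header. Returns 0 if none is found, -1 otherwise (details in *err).
 */
int config_check_sections(FILE *fp, ConfigError *err)
{
    char *line = NULL;  /* getline() allocates and grows this buffer */
    size_t cap = 0;     /* buffer capacity, maintained by getline() */
    size_t lineno = 0;
    int rc = 0;

    err->code = CONFIG_OK;
    err->line = 0;
    err->message[0] = '\0';

    while (getline(&line, &cap, fp) != -1) {
        lineno++;
        const char *s = line + strspn(line, " \t");  /* skip indentation */
        if (*s == '[' && strchr(s, ']') == NULL) {
            err->code = CONFIG_ERR_SECTION;
            err->line = lineno;
            snprintf(err->message, sizeof err->message,
                     "line %zu: missing ']' in section header", lineno);
            rc = -1;
            break;
        }
    }
    free(line);         /* one free for the whole loop, even on failure */
    return rc;
}
```

Starting with a NULL pointer and zero capacity lets getline() allocate the buffer and reuse it across iterations, so the caller frees it exactly once after the loop.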

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. Will you allow duplicate keys?
  2. What happens if a line such as key = has an empty value?
  3. How will you escape # in values?

5.6 Thinking Exercise

Edge Case Parsing

How should you interpret a line like path = /tmp#cache? Is the # the start of a comment or part of the value?
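
One possible answer, shown only as a sketch: treat # or ; as a comment marker when it appears at the start of the value or directly after whitespace, and as ordinary data otherwise. Under that assumed policy, /tmp#cache is kept whole while /tmp #cache is cut to /tmp. The function name strip_inline_comment is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Cut an inline comment off a value in place. Under this (assumed) policy,
 * '#' or ';' begins a comment only at the start of the string or directly
 * after whitespace.
 */
void strip_inline_comment(char *value)
{
    for (size_t i = 0; value[i] != '\0'; i++) {
        int marker = (value[i] == '#' || value[i] == ';');
        if (marker && (i == 0 || isspace((unsigned char)value[i - 1]))) {
            value[i] = '\0';
            /* Also drop the whitespace that preceded the marker. */
            while (i > 0 && isspace((unsigned char)value[i - 1]))
                value[--i] = '\0';
            return;
        }
    }
}
```

Other policies are equally defensible, such as allowing comments only on their own lines or requiring a backslash escape. Whichever rule you pick, document it and test both sides of it.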

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “How do you parse structured text in C?”
  2. “What is a state machine and why use it here?”
  3. “How do you report parse errors with context?”

5.8 Hints in Layers

Hint 1: Parse sections first. Handle [section] lines separately.

Hint 2: Implement trimming helpers. Write trim_left and trim_right.

Hint 3: Build a simple key store. Use a list or map for entries.

5.9 Books That Will Help

Topic            Book                           Chapter
Parsing basics   “Crafting Interpreters”        Ch. 4
C strings        “The C Programming Language”   Ch. 5

5.10 Implementation Phases

Phase 1: Foundation (2-3 days)

Goals:

  • Read lines and parse sections

Tasks:

  1. Implement line reader.
  2. Parse section headers.

Checkpoint: Sections recognized with line numbers.

Phase 2: Core Functionality (2-3 days)

Goals:

  • Parse key/value pairs

Tasks:

  1. Split on =.
  2. Trim key and value.

Checkpoint: Values stored correctly.

Phase 3: Polish & Edge Cases (1-2 days)

Goals:

  • Error handling and comments

Tasks:

  1. Skip comments and blank lines.
  2. Report malformed lines.

Checkpoint: Clear errors for bad configs.

5.11 Key Implementation Decisions

Decision         Options             Recommendation   Rationale
Storage          List vs hash map    Hash map         Fast lookups
Comment syntax   # only vs # and ;   Both             Common INI formats
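
If you follow the hash map recommendation, the table needs a hash over the (section, key) pair. The sketch below uses 32-bit FNV-1a, which is a swapped-in choice for illustration and not something the project prescribes. config_hash and config_bucket are hypothetical names. A zero byte is mixed in between the two strings so that pairs such as ("ab", "c") and ("a", "bc") are fed to the hash as different byte sequences.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a constants from the published algorithm. */
#define FNV_OFFSET 2166136261u
#define FNV_PRIME  16777619u

/* Fold the bytes of s into hash state h, one FNV-1a step per byte. */
static uint32_t fnv1a_step(uint32_t h, const char *s)
{
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= FNV_PRIME;
    }
    return h;
}

/* Hash section, then a separator byte, then key. */
uint32_t config_hash(const char *section, const char *key)
{
    uint32_t h = fnv1a_step(FNV_OFFSET, section);
    h ^= 0u;            /* separator byte '\0' (XOR with 0 changes nothing) */
    h *= FNV_PRIME;     /* but the multiply still advances the state */
    return fnv1a_step(h, key);
}

/* Map a (section, key) pair to a bucket index; nbuckets must be nonzero. */
size_t config_bucket(const char *section, const char *key, size_t nbuckets)
{
    return config_hash(section, key) % nbuckets;
}
```

Each bucket can then hold a short chain of ConfigEntry records. Lookups still compare section and key with strcmp(), because different pairs can land in the same bucket.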

6. Testing Strategy

6.1 Test Categories

Category            Purpose            Examples
Unit Tests          Parsing helpers    Trim functions
Integration Tests   Parse full files   Sample configs
Error Tests         Malformed lines    Missing =

6.2 Critical Test Cases

  1. Empty file: No entries.
  2. Duplicate keys: Behavior is defined and documented (for example, the last value wins or an error is reported).
  3. Malformed section: Error with line number.

6.3 Test Data

[app]
name = demo

[bad

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall                   Symptom            Solution
Not trimming whitespace   Keys with spaces   Trim both sides
Not copying strings       Use-after-free     Duplicate strings
Poor error messages       Hard to debug      Include line numbers

7.2 Debugging Strategies

  • Print parsed entries as they are added.
  • Add tests for each edge case.

7.3 Performance Traps

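Repeated string scans can make parsing O(n^2), for example by calling strlen() on every loop iteration or by shifting the buffer once per trimmed character. Trim in a single pass: advance a start pointer past leading whitespace and walk an end pointer back over trailing whitespace.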


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add typed getters (get_int, get_bool).
  • Add default values.
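
As a starting point for typed getters, the conversion step might look like the sketch below. config_parse_int and config_parse_bool are hypothetical helpers that a get_int or get_bool could call on the string returned by config_get. The accepted boolean words are an assumption, not part of the format.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Convert text to int. Returns 0 on success, -1 if text is NULL, empty,
 * has trailing non-digit characters, or is out of range for int.
 * Note: strtol() itself accepts leading whitespace and a sign.
 */
int config_parse_int(const char *text, int *out)
{
    if (text == NULL || *text == '\0')
        return -1;
    char *end = NULL;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;                      /* no digits, or junk after them */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                      /* does not fit in an int */
    *out = (int)v;
    return 0;
}

/* Case-insensitive equality for ASCII strings. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;                    /* both must end together */
}

/* Accepts true/false, yes/no, on/off, 1/0 (assumed vocabulary).
   Returns 0 and stores 1 or 0 in *out, or returns -1 if unrecognized. */
int config_parse_bool(const char *text, int *out)
{
    static const char *const yes[] = { "true", "yes", "on", "1" };
    static const char *const no[]  = { "false", "no", "off", "0" };
    if (text == NULL)
        return -1;
    for (size_t i = 0; i < 4; i++) {
        if (equal_ignore_case(text, yes[i])) { *out = 1; return 0; }
        if (equal_ignore_case(text, no[i]))  { *out = 0; return 0; }
    }
    return -1;
}
```

Returning a status code and writing the result through a pointer keeps "key missing", "value malformed", and "value is 0" distinguishable, which also makes default values straightforward to layer on top.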

8.2 Intermediate Extensions

  • Support quoted values with spaces.
  • Support multi-line values.

8.3 Advanced Extensions

  • Add JSON or YAML support.
  • Provide schema validation.

9. Real-World Connections

9.1 Industry Applications

  • Services: Parse config files at startup.
  • Embedded: Lightweight config parsing.

9.2 Related Open Source Projects

  • inih: Minimal INI parser in C.

9.3 Interview Relevance

Parsing and robust error handling are common systems tasks, and interviewers often use them to probe how you handle malformed input and edge cases.


10. Resources

10.1 Essential Reading

  • “The C Programming Language” - Ch. 5
  • “Crafting Interpreters” - Parsing intro

10.2 Video Resources

  • Parsing and tokenization lectures

10.3 Tools & Documentation

  • man 3 getline: Line input semantics
  • String Library: Needed for parsing.
  • Hash Table: Used for storage.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain parser state machines.
  • I can handle whitespace reliably.
  • I can design clear error messages.

11.2 Implementation

  • Parser reads valid configs correctly.
  • Malformed files produce errors.
  • Memory is managed correctly.

11.3 Growth

  • I can add typed getters.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Sections and key/value pairs parsed.

Full Completion:

  • Comments, whitespace, and errors handled.

Excellence (Going Above & Beyond):

  • Typed getters and schema validation.

This guide was generated from C_PROGRAMMING_COMPLETE_MASTERY.md. For the complete learning path, see the parent directory.