Project 10: Globbing Engine

A filename expansion system that transforms *.c into a list of matching files, supporting *, ?, [abc], [a-z], [!abc], and extended globs like ** for recursive matching.

Quick Reference

Attribute              Value
Primary Language       C
Alternative Languages  Rust, Go, Python
Difficulty             Level 2: Intermediate (The Developer)
Time Estimate          1 week
Knowledge Area         Pattern Matching / Filesystems
Tooling                Unix Shell
Prerequisites          Basic C, understanding of recursion

What You Will Build

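A filename expansion system that takes a pattern such as *.c and replaces it with the list of matching filenames. It supports the POSIX wildcards * and ?, bracket expressions ([abc], [a-z], [!abc]), and, as an extension beyond POSIX, ** for recursive matching across directories.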
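
One way to think about ** is as a path component that stands for zero or more directories. The sketch below assumes relative, slash-separated paths, treats ** as special only when it is a whole component, and skips hidden components in the same way the default dot-file rule does. The name glob_path_match and the SEG_MAX limit are hypothetical; each ordinary component is handed to POSIX fnmatch.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <string.h>

#define SEG_MAX 256   /* hypothetical per-component limit for this sketch */

/* Match a relative path against a pattern, one '/'-separated component
 * at a time. A component that is exactly "**" matches zero or more
 * directories; every other component is matched with fnmatch(). */
bool glob_path_match(const char *pat, const char *path)
{
    char pseg[SEG_MAX], nseg[SEG_MAX];
    size_t plen = strcspn(pat, "/");
    size_t nlen = strcspn(path, "/");

    /* Empty or oversized components never match in this sketch. */
    if (nlen == 0 || plen >= SEG_MAX || nlen >= SEG_MAX)
        return false;

    memcpy(pseg, pat, plen);
    pseg[plen] = '\0';
    memcpy(nseg, path, nlen);
    nseg[nlen] = '\0';

    /* Remainder after the separating '/', or NULL on the last component. */
    const char *prest = pat[plen] == '/' ? pat + plen + 1 : NULL;
    const char *nrest = path[nlen] == '/' ? path + nlen + 1 : NULL;

    if (strcmp(pseg, "**") == 0) {
        /* Case 1: "**" stands for zero directories. */
        if (prest != NULL && glob_path_match(prest, path))
            return true;
        /* Case 2: "**" consumes one component and stays in the pattern.
         * Assumption: hidden components are never consumed. */
        if (nseg[0] == '.')
            return false;
        if (nrest == NULL)
            return prest == NULL;   /* trailing "**" accepts the last name */
        return glob_path_match(pat, nrest);
    }

    /* Ordinary component: '*' and '?' cannot cross a '/'. */
    if (fnmatch(pseg, nseg, FNM_PERIOD) != 0)
        return false;
    if (prest == NULL || nrest == NULL)
        return prest == NULL && nrest == NULL;
    return glob_path_match(prest, nrest);
}
```

With this sketch, glob_path_match("**/*.c", "src/lib/util.c") succeeds while glob_path_match("*.c", "src/main.c") does not, because a single * never crosses a directory separator. A real engine would combine this with directory traversal rather than testing complete paths.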

Why It Matters

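A POSIX shell expands patterns such as *.c before the command runs, so the command only ever sees the resulting filenames. Building that expansion yourself exercises pattern matching, directory traversal, and careful edge-case handling, and the same glob-style matching reappears in other tools, for example in .gitignore rules.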

Core Challenges

  • Pattern matching (* matches any sequence of characters, including none; ? matches exactly one character) → maps to pattern algorithms
  • Directory traversal (reading directory entries) → maps to filesystem interaction
  • Bracket expressions ([a-z], [!0-9]) → maps to character classes
  • Dot files (a leading . must be matched explicitly; *, ?, and bracket expressions do not match it by default) → maps to shell conventions
  • No-match behavior (POSIX: leave the pattern unchanged; bash nullglob: expand to nothing) → maps to shell options
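
The matching rules in this list (wildcards, bracket expressions, and the leading-dot rule) can be sketched as one small recursive matcher. This is a minimal illustration, not a full fnmatch replacement: it assumes single-byte characters, compares ranges by byte value, and ignores backslash escapes and named classes such as [:alpha:]. The names glob_match, match_here, and match_bracket are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Evaluate the bracket expression starting at p (p points at '[').
 * On success, returns a pointer just past the closing ']' and stores in
 * *matched whether c belongs to the set. Returns NULL if the bracket is
 * unterminated, so the caller can treat '[' as an ordinary character. */
static const char *match_bracket(const char *p, char c, bool *matched)
{
    const char *q = p + 1;
    unsigned char ch = (unsigned char)c;
    bool negate = false, found = false, first = true;

    if (*q == '!' || *q == '^') {   /* '!' is POSIX; '^' is a common extension */
        negate = true;
        q++;
    }
    /* A ']' in first position is a set member, not the terminator. */
    while (*q != '\0' && (first || *q != ']')) {
        unsigned char lo = (unsigned char)*q;
        if (q[1] == '-' && q[2] != '\0' && q[2] != ']') {
            unsigned char hi = (unsigned char)q[2];   /* range such as a-z */
            if (lo <= ch && ch <= hi)
                found = true;
            q += 3;
        } else {
            if (lo == ch)
                found = true;
            q++;
        }
        first = false;
    }
    if (*q != ']')
        return NULL;
    *matched = (found != negate);
    return q + 1;
}

/* Match the whole of s against pat, with no special case for dots. */
static bool match_here(const char *pat, const char *s)
{
    while (*pat != '\0') {
        if (*pat == '*') {
            while (*pat == '*')          /* collapse runs of '*' */
                pat++;
            if (*pat == '\0')
                return true;             /* trailing '*' matches the rest */
            for (; *s != '\0'; s++)      /* try every possible split point */
                if (match_here(pat, s))
                    return true;
            return false;
        }
        if (*s == '\0')
            return false;
        if (*pat == '?') {
            pat++;
            s++;
            continue;
        }
        if (*pat == '[') {
            bool in_set;
            const char *next = match_bracket(pat, *s, &in_set);
            if (next != NULL) {
                if (!in_set)
                    return false;
                pat = next;
                s++;
                continue;
            }
            /* Unterminated bracket: fall through and compare '[' literally. */
        }
        if (*pat != *s)
            return false;
        pat++;
        s++;
    }
    return *s == '\0';
}

/* Entry point: a name starting with '.' matches only if the pattern
 * also starts with a literal '.'. */
bool glob_match(const char *pat, const char *name)
{
    if (name[0] == '.' && pat[0] != '.')
        return false;
    return match_here(pat, name);
}
```

For example, glob_match("[!f]*", "Makefile") is true and glob_match("*", ".hidden") is false. Plain recursion on * can backtrack heavily on patterns with many stars, so an iterative matcher that remembers only the most recent * is a common refinement once this version works.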

Key Concepts

  • Glob pattern matching: “Mastering Regular Expressions” Chapter 1 - Friedl (for pattern intuition)
  • fnmatch function: POSIX specification - The Open Group
  • Shell globbing: “Bash Reference Manual” Section 3.5.8 - GNU
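
Because POSIX already specifies fnmatch, it can serve as a reference when testing a hand-written matcher. The sketch below is one possible way to do that: it runs a candidate matcher over a small table of cases and counts where it disagrees with fnmatch called with FNM_PERIOD (which enforces the leading-dot rule). The names matcher_fn, fnmatch_reference, and count_disagreements, and the test table itself, are hypothetical.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Signature a hand-written matcher is assumed to have in this sketch. */
typedef bool (*matcher_fn)(const char *pattern, const char *name);

/* Reference answer from the C library. FNM_PERIOD makes a leading '.'
 * in the name require a literal '.' in the pattern. */
bool fnmatch_reference(const char *pattern, const char *name)
{
    return fnmatch(pattern, name, FNM_PERIOD) == 0;
}

/* Run the candidate over a fixed table and report every case where it
 * disagrees with fnmatch(). Returns the number of disagreements. */
size_t count_disagreements(matcher_fn candidate)
{
    static const struct {
        const char *pattern;
        const char *name;
    } cases[] = {
        { "*.c",     "file1.c"  },
        { "*.c",     "header.h" },
        { "file?.c", "file2.c"  },
        { "[fh]*",   "header.h" },
        { "[!f]*",   "Makefile" },
        { "[!f]*",   "file1.c"  },
        { "[a-z]*",  "src"      },
        { "*",       ".hidden"  },
        { ".*",      ".hidden"  },
    };
    size_t bad = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        bool want = fnmatch_reference(cases[i].pattern, cases[i].name);
        bool got = candidate(cases[i].pattern, cases[i].name);
        if (want != got) {
            fprintf(stderr, "mismatch: pattern '%s', name '%s', fnmatch says %s\n",
                    cases[i].pattern, cases[i].name,
                    want ? "match" : "no match");
            bad++;
        }
    }
    return bad;
}
```

Passing your own matcher to count_disagreements and expecting 0 gives a quick regression check. Range expressions such as [a-z] can depend on the locale in fnmatch, so a byte-based matcher may legitimately differ outside the C locale.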

Real-World Outcome

$ ls -Ap
file1.c  file2.c  header.h  .hidden  Makefile  src/
$ ./mysh
mysh> echo *.c
file1.c file2.c
mysh> echo file?.c
file1.c file2.c
mysh> echo [fh]*
file1.c file2.c header.h
mysh> echo [!f]*
header.h Makefile src
mysh> echo *.nonexistent
*.nonexistent               # No match, pattern preserved (POSIX)
mysh> shopt -s nullglob
mysh> echo *.nonexistent
                            # No match, empty (nullglob)
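
The session above can be approximated by a single-directory expansion routine. The sketch below is one possible shape, under these assumptions: matching is delegated to POSIX fnmatch, results are sorted with strcoll, an unreadable directory is treated like a directory with no matches, and the nullglob choice is passed in as a flag. The names expand_pattern and free_names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

static int compare_names(const void *a, const void *b)
{
    return strcoll(*(char *const *)a, *(char *const *)b);
}

/* Release a list returned by expand_pattern(). */
void free_names(char **names, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(names[i]);
    free(names);
}

/* Expand `pattern` against the entries of `dir`. Returns a malloc'd
 * array of malloc'd strings and stores its length in *count. Returns
 * NULL with *count == 0 when the expansion is empty or memory runs out. */
char **expand_pattern(const char *dir, const char *pattern,
                      bool nullglob, size_t *count)
{
    char **names = NULL;
    size_t n = 0, cap = 0;

    *count = 0;
    DIR *d = opendir(dir);
    if (d != NULL) {
        struct dirent *e;
        while ((e = readdir(d)) != NULL) {
            /* FNM_PERIOD hides dot files unless the pattern starts with '.' */
            if (fnmatch(pattern, e->d_name, FNM_PERIOD) != 0)
                continue;
            if (n == cap) {
                size_t new_cap = cap ? cap * 2 : 8;
                char **grown = realloc(names, new_cap * sizeof *grown);
                if (grown == NULL)
                    break;              /* sketch: stop early on failure */
                names = grown;
                cap = new_cap;
            }
            names[n] = strdup(e->d_name);
            if (names[n] == NULL)
                break;
            n++;
        }
        closedir(d);
    }

    if (n == 0) {
        free(names);
        if (nullglob)
            return NULL;                /* nullglob: expand to nothing */
        names = malloc(sizeof *names);  /* POSIX: pattern stands for itself */
        if (names == NULL)
            return NULL;
        names[0] = strdup(pattern);
        if (names[0] == NULL) {
            free(names);
            return NULL;
        }
        *count = 1;
        return names;
    }

    qsort(names, n, sizeof *names, compare_names);
    *count = n;
    return names;
}
```

Called as expand_pattern(".", "*.nonexistent", false, &n), this returns a one-element list holding the pattern itself; with the flag set to true it returns NULL and sets n to 0. strcoll behaves like strcmp until the program calls setlocale, so the ordering of mixed-case names depends on that choice.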

Implementation Guide

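  1. Reproduce the simplest happy-path scenario: match a literal name and a single * against the entries of the current directory.
  2. Build the smallest working version of the core feature: add ?, then bracket expressions with ranges and negation.
  3. Add input validation and error handling: unterminated brackets, unreadable directories, and the no-match case.
  4. Add instrumentation/logging to confirm which directory entries each pattern accepts and rejects.
  5. Refactor into clean modules (matcher, directory traversal, expansion) with tests.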

Milestones

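  • Milestone 1: Minimal working program that runs end-to-end (echo *.c prints the matching files).
  • Milestone 2: Correct outputs for typical inputs (*, ?, and bracket expressions).
  • Milestone 3: Robust handling of edge cases (dot files, patterns with no match, unterminated brackets).
  • Milestone 4: Clean structure and documented usage.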

Validation Checklist

  • Output matches the real-world outcome example
  • Handles invalid inputs safely
  • Provides clear errors and exit codes
  • Repeatable results across runs

References

  • Main guide: SHELL_INTERNALS_DEEP_DIVE_PROJECTS.md
  • “Shell Scripting: Expert Recipes” by Steve Parker