Project 15: POSIX-Compliant Shell

Build a POSIX-compliant shell that passes a conformance test suite.

Quick Reference

Attribute Value
Difficulty Level 5: Master (The First-Principles Wizard)
Time Estimate 2-3 months
Main Programming Language C
Alternative Programming Languages Rust
Coolness Level Level 5: Pure Magic (Super Cool)
Business Potential 1. The “Resume Gold” (Educational/Personal Brand)
Prerequisites POSIX shell spec, full parsing/expansion, testing harness
Key Topics expansion order, special built-ins, spec compliance

1. Learning Objectives

By completing this project, you will:

  1. Explain and implement expansion order in the context of a shell.
  2. Build a working POSIX-compliant shell that matches the project specification.
  3. Design tests that validate correctness and edge cases.
  4. Document design decisions, trade-offs, and limitations.

2. All Theory Needed (Per-Concept Breakdown)

POSIX Compliance and Conformance Testing

Fundamentals POSIX defines a standard shell language with specific rules for parsing, expansion, redirection, built-ins, and exit status. A POSIX-compliant shell must follow these rules precisely and pass conformance tests. This is not just about syntax; it is about detailed semantics like expansion order, special built-ins, and error handling. Implementing POSIX behavior forces you to resolve ambiguities and document deviations.

Deep Dive into the concept The POSIX Shell Command Language specification is precise and often surprising. For example, expansion occurs in a defined order: tilde expansion, parameter expansion, command substitution, arithmetic expansion, field splitting, pathname expansion, and quote removal. If you change this order, scripts that rely on it will behave differently. POSIX also defines how special built-ins behave in scripts: if a special built-in fails, the shell may exit. This affects error handling and makes the built-in table more than just a list of commands.

POSIX grammar is extensive. It includes simple commands, pipelines, &&/||, lists, and compound commands like if, case, for, while, until, and subshells. The grammar allows redirections in multiple positions and imposes rules for reserved words. Implementing this faithfully requires a parser that distinguishes keywords by context, handles line continuation, and recovers from errors in a non-interactive context by exiting with status 2 for syntax errors.

Testing for compliance is a project in itself. There are public test suites (like sh tests from POSIX or those used by dash) that run thousands of cases. A good strategy is to build a test harness that runs these tests and captures failing cases, then reproduce failures in isolation. Many failures are subtle edge cases involving quoting, expansion, or error handling. Keeping a behavior table that compares your shell to a reference implementation (like dash) helps you decide whether to match POSIX or intentionally diverge.

Performance and correctness must both be considered. A naive implementation might be correct but slow, especially for expansion-heavy scripts. Conversely, optimizing prematurely can introduce bugs. The right approach is to implement correctness first, then profile. POSIX compliance is a long-term effort; you will likely iterate, fix, and re-test many times.

How this fits into the project POSIX compliance is the integration point for every shell subsystem. It validates that your parsing, expansion, and execution rules work together correctly.

Definitions & key terms

  • POSIX: Portable Operating System Interface standard.
  • Special built-in: Built-ins with special error semantics.
  • Expansion order: Defined sequence for applying expansions.
  • Conformance test: Test suite validating specification compliance.

Mental model diagram

Spec -> Implementation -> Test Suite -> Failures -> Fixes

How it works (step-by-step)

  1. Implement POSIX grammar and expansion order.
  2. Build a test harness to run conformance suites.
  3. Compare output and exit status against reference shells.
  4. Fix failing cases and document differences.
  5. Repeat until tests pass.

Minimal concrete example

$ var="a b"; echo $var
# POSIX: expands, then field splits -> two words

Common misconceptions

  • “POSIX is just syntax” -> it defines detailed semantics.
  • “Bash behavior is POSIX” -> bash includes many extensions.
  • “Passing a few tests is enough” -> compliance requires broad coverage.

Check-your-understanding questions

  1. Why does expansion order matter?
  2. What is a “special built-in”?
  3. Why use dash as a reference?

Check-your-understanding answers

  1. It changes how words and globbing are produced.
  2. A built-in with special error rules defined by POSIX.
  3. dash is a minimal, close-to-POSIX shell.

Real-world applications

  • System scripts that rely on POSIX sh behavior.
  • Build tools that invoke /bin/sh.

Where you’ll apply it

References

  • POSIX Shell Command Language (The Open Group).
  • dash source code and test suites.

Key insights POSIX compliance is about semantic fidelity, not just syntax matching.

Summary Conformance requires careful implementation, detailed testing, and a willingness to chase edge cases until behavior matches the standard.

Homework/Exercises to practice the concept

  1. Read the POSIX section on expansion order and summarize it.
  2. Run a POSIX shell test suite against your shell.
  3. Document three differences between your shell and dash.

Solutions to the homework/exercises

  1. List the order: tilde, parameter, command, arithmetic, split, glob, quote removal.
  2. Build a script harness that runs tests and logs failures.
  3. Compare outputs and exit statuses on selected edge cases.

Parsing, Precedence, and AST Construction

Fundamentals Parsing transforms a linear stream of tokens into a structured representation that reflects operator precedence and command grouping. Shell syntax is deceptively complex: pipelines bind tighter than logical operators, and lists separated by ; or & are lower precedence than && and ||. The parser must also handle compound commands like if, while, and subshells, and integrate redirections that can appear anywhere in a simple command. A well-designed Abstract Syntax Tree (AST) makes execution straightforward because each node type encodes a specific execution strategy.

Deep Dive into the concept Shell parsing is closer to a compiler front-end than to a simple expression parser. A robust approach is recursive descent with functions that mirror the grammar: parse_list handles sequences, parse_and_or handles && and ||, parse_pipeline handles |, and parse_command handles simple commands and compound constructs. This structure enforces precedence naturally: parse_pipeline is called inside parse_and_or, so pipelines bind tighter than logical operators. Associativity matters too; pipelines and logical operators are typically left-associative, meaning a | b | c becomes a pipeline node with three children rather than nested pipelines that reverse order.

Simple commands are particularly tricky because redirections can be interleaved with words: cmd arg1 > out arg2 < in is legal and should yield a command node with arguments [cmd, arg1, arg2] plus a list of redirections. The parser must therefore accept redirection tokens at any point in a simple command, collect them, and keep the argument order intact. If your parser instead treats redirection as terminating a command, you will mis-handle valid inputs.

Compound commands add complexity. if, while, for, and case introduce nested lists of commands, often with keywords that also appear as regular words in other contexts. The parser must decide when a token like do or then is a keyword versus just a word. Most shells restrict keyword recognition to specific grammar positions, which means the parser’s current nonterminal determines whether a token is treated as a keyword. This is a subtle but essential rule.

Error handling is a first-class concern. An interactive shell should distinguish incomplete input (missing fi, unmatched )) from invalid input. Incomplete input should trigger a continuation prompt; invalid input should report a syntax error and recover to the next command boundary. This requires the parser to track expected tokens and provide meaningful diagnostics. A good parser reports the unexpected token and the context (e.g., “unexpected | after &&”). This is not just UX; it helps you debug your own grammar.

The AST is more than a tree; it is the interface between parsing and execution. Each node type (SimpleCommand, Pipeline, AndOr, List, Subshell, If, While) should hold only what execution needs: arguments, redirections, children, and control flags. If the parser builds a clean AST, the executor can be a straightforward tree walk with predictable behavior. If the AST is messy or ambiguous, execution becomes a tangle of special cases.

How this fits into the project Parsing underpins all control flow and pipeline semantics. Without a correct AST, execution will be wrong even if your runtime is perfect.

Definitions & key terms

  • AST: Abstract Syntax Tree representing command structure.
  • Precedence: Rules that determine grouping of operators.
  • Associativity: Direction of grouping for operators of equal precedence.
  • Nonterminal: A grammar production like pipeline or and_or.

Mental model diagram

Tokens --> Parser --> AST

a | b && c
     |
     v
    AndOr
    /    \
Pipeline   c
  /   \
 a     b

How it works (step-by-step)

  1. Consume tokens with a recursive descent parser.
  2. Build nodes for pipelines, logical operators, and lists.
  3. Attach redirections to simple commands as a list.
  4. Produce AST root for execution phase.
  5. On error, report unexpected token and recover if interactive.

Minimal concrete example

Input:  a | b && c
AST:    AndOr(Pipeline(a,b), c)

Common misconceptions

  • “Redirections only come at the end” -> they can appear anywhere.
  • “| and && have the same precedence” -> pipeline binds tighter.
  • “Keywords are always keywords” -> context decides keyword meaning.

Check-your-understanding questions

  1. Why must a | b && c parse as (a | b) && c?
  2. Where should redirections be stored in the AST?
  3. How should the parser react to an unfinished if block?

Check-your-understanding answers

  1. Because pipeline has higher precedence than &&.
  2. Attached to the simple command node as a list.
  3. Mark input incomplete and request continuation.

Real-world applications

  • Shells and REPLs.
  • Configuration languages with command syntax.
  • DSLs that combine operators and commands.

Where you’ll apply it

References

  • POSIX Shell Command Language (grammar).
  • “Language Implementation Patterns” (recursive descent).

Key insights The parser shapes execution; a correct AST makes execution trivial.

Summary Parsing is the structural heart of a shell: it enforces precedence, groups commands, and defines control flow for the executor.

Homework/Exercises to practice the concept

  1. Write a tiny parser for pipelines and &&/|| and dump the AST.
  2. Add support for redirections in simple commands.
  3. Build a syntax error reporter that shows the unexpected token.

Solutions to the homework/exercises

  1. Use recursive descent with parse_pipeline inside parse_and_or.
  2. Allow redirection tokens anywhere in a simple command loop.
  3. Track the last token and expected set, print a clear message.

File Descriptors and Redirection Semantics

Fundamentals Redirection is how a shell reconfigures where a command reads input and writes output. It works by manipulating file descriptors, usually 0 (stdin), 1 (stdout), and 2 (stderr). Redirections like >, >>, <, and 2> open files with specific flags and then use dup2() to rewire the command’s standard streams. The order of redirections matters, and redirections can appear anywhere in the command line. A correct redirection engine is essential for scripts, pipelines, and error handling.

Deep Dive into the concept At the kernel level, every process has a file descriptor table. Redirection changes entries in this table before the command runs. For > the shell opens (or creates) the target file with flags like O_WRONLY | O_CREAT | O_TRUNC and permissions derived from umask, then duplicates that descriptor onto STDOUT_FILENO. For >>, O_APPEND is used so writes append rather than truncate. For <, the file is opened with O_RDONLY and duplicated to STDIN_FILENO. For 2>, the target file is opened for writing and its descriptor is duplicated to STDERR_FILENO. The shell can also duplicate descriptors directly using syntax like 2>&1, which means “make fd 2 a duplicate of fd 1.” This is not the same as 2>file and must be handled by the parser as a special redirection form.

Order is critical. In cmd >out 2>&1, stderr is redirected to the new stdout (the file), because stdout was already redirected when 2>&1 is evaluated. In cmd 2>&1 >out, stderr is duplicated to the old stdout (the terminal), and then stdout is redirected to the file, so stderr remains on the terminal. The shell must apply redirections left-to-right to match this behavior. This ordering rule is a common source of confusion and bugs in novice shells.

Redirections can also target existing descriptors. The >& and <& forms allow the user to duplicate or close file descriptors. For example, exec 3>log.txt opens a file and assigns it to fd 3, while exec 3>&- closes fd 3. Implementing these forms requires careful handling of fd lifetimes and close-on-exec flags. If your shell leaks file descriptors, child processes may inherit unexpected open files, leading to security or correctness issues.

Here-documents (<<) add another layer. They provide inline input text to a command, typically by writing the content into a pipe or temporary file and then redirecting stdin to that file. Here-docs can be quoted or unquoted, and quoting affects whether parameter expansion is performed inside the heredoc body. Even if you postpone heredoc support, your redirection engine should be designed to accommodate it later.

How this fits into the project Redirection is foundational for pipelines, scripting, and error handling. Every shell feature that touches I/O depends on correct fd manipulation.

Definitions & key terms

  • File descriptor (fd): Integer handle for open files or pipes.
  • dup2(): Duplicate one fd onto another, closing the target first.
  • umask: Mask that restricts permissions on newly created files.
  • Here-doc: Inline input redirected into stdin.

Mental model diagram

stdout (fd 1) -> open("out.txt") -> dup2(fd_out, 1)

How it works (step-by-step)

  1. Parse redirection tokens and targets.
  2. For each redirection, open or select the target fd.
  3. Apply redirections in left-to-right order.
  4. Use dup2() to rewire standard fds.
  5. Close unused descriptors to avoid leaks.

Minimal concrete example

#include <fcntl.h>
#include <unistd.h>

int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
dup2(fd, STDOUT_FILENO);  /* rewire stdout onto the file */
close(fd);                /* fd 1 keeps the file open */

Common misconceptions

  • “Order does not matter” -> it does; redirections are sequential.
  • “cmd >file 2>&1 is the same as cmd 2>&1 >file” -> order changes meaning.
  • “Redirection only applies to external commands” -> built-ins can be redirected too.

Check-your-understanding questions

  1. Why does 2>&1 >out leave stderr on the terminal?
  2. What does exec 3>&- do?
  3. Why must redirections be applied before exec()?

Check-your-understanding answers

  1. Because stderr is duplicated before stdout is redirected.
  2. It closes file descriptor 3 in the shell process.
  3. The child’s fd table must be ready before program start.

Real-world applications

  • Shell scripting and logging.
  • Daemon output redirection.
  • Build systems capturing errors.

Where you’ll apply it

References

  • POSIX Shell Command Language (redirection rules).
  • “The Linux Programming Interface” (file descriptors).

Key insights Redirection is fd table surgery; order and duplication rules are everything.

Summary A correct redirection engine uses open and dup2 in a precise order to rewire stdin/stdout/stderr and support scripts reliably.

Homework/Exercises to practice the concept

  1. Implement > and >> redirections and test with echo.
  2. Add 2>&1 and demonstrate the difference between the two orders.
  3. Track open fds and ensure none leak into child processes.

Solutions to the homework/exercises

  1. Use open with O_TRUNC and O_APPEND, then dup2.
  2. Apply redirections left-to-right and compare outputs.
  3. Close all non-needed descriptors after dup2.

3. Project Specification

3.1 What You Will Build

A shell that matches POSIX shell language semantics and passes tests.

Included:

  • Core feature set described above
  • Deterministic CLI behavior and exit codes

Excluded:

  • Bash/zsh extensions; the focus is strict POSIX sh.

3.2 Functional Requirements

  1. Implement POSIX grammar constructs.
  2. Match POSIX expansion order and quote removal.
  3. Support special built-ins with correct error behavior.
  4. Run conformance tests and report failures.
  5. Document deviations from POSIX.

3.3 Non-Functional Requirements

  • Performance: Interactive latency under 50ms for typical inputs; pipeline setup should scale linearly.
  • Reliability: No crashes on malformed input; errors reported clearly with non-zero status.
  • Usability: Clear prompts, deterministic behavior, and predictable error messages.

3.4 Example Usage / Output

$ ./mysh
mysh> result=$(echo hello | tr a-z A-Z)
mysh> echo $result
HELLO
mysh> (cd /tmp; pwd); pwd
/tmp
/home/user
mysh> ./run_posix_tests.sh ./mysh
PASS: 1247/1250 tests passed

3.5 Data Formats / Schemas / Protocols

  • Conformance test reports and behavior tables.

3.6 Edge Cases

  • Quoting edge cases
  • Subshell environment
  • Pipeline status

3.7 Real World Outcome

This is the exact behavior you should be able to demonstrate.

3.7.1 How to Run (Copy/Paste)

  • make
  • ./run_posix_tests.sh ./mysh

3.7.2 Golden Path Demo (Deterministic)

$ ./mysh
mysh> result=$(echo hello | tr a-z A-Z)
mysh> echo $result
HELLO
mysh> (cd /tmp; pwd); pwd
/tmp
/home/user
mysh> ./run_posix_tests.sh ./mysh
PASS: 1247/1250 tests passed

3.7.3 Failure Demo (Deterministic)

$ ./mysh
mysh> not_a_command
mysh: not_a_command: command not found
mysh> echo $?
127

4. Solution Architecture

4.1 High-Level Design

[Input] -> [Parser/Lexer] -> [Core Engine] -> [Executor/Output]

4.2 Key Components

Component Responsibility Key Decisions
POSIX Parser Complete grammar support Strict precedence and keywords.
Expansion Engine Correct order and quoting rules Matches spec.
Test Harness Runs suites and logs results Reproducible failures.

4.3 Data Structures (No Full Code)

struct TestResult { char *name; int status; char *diff; };

4.4 Algorithm Overview

Key Algorithm: Expansion Pipeline

  1. Apply expansions in the POSIX order (tilde, parameter, command substitution, arithmetic).
  2. Split the result into fields on IFS.
  3. Expand pathnames (glob), then remove quotes.

Complexity Analysis:

  • Time: O(n) per word
  • Space: O(n) per word

5. Implementation Guide

5.1 Development Environment Setup

# install dependencies (if any)
# build
make

5.2 Project Structure

project-root/
├── src/
│   ├── main.c
│   ├── lexer.c
│   └── executor.c
├── tests/
│   └── test_basic.sh
├── Makefile
└── README.md

5.3 The Core Question You’re Answering

Can I build a shell that behaves exactly like POSIX requires?

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. POSIX grammar
  2. Special built-ins
  3. Conformance tests

5.5 Questions to Guide Your Design

5.6 Thinking Exercise

The “Weird Expansion” Problem

Why does POSIX require quote removal after globbing, not before?

5.7 The Interview Questions They’ll Ask

5.8 Hints in Layers

Hint 1: Read the spec carefully POSIX wording is precise and often surprising.

Hint 2: Use dash as a reference It is a minimal POSIX shell implementation.

Hint 3: Build a test harness Run tests and isolate failing cases.

Hint 4: Keep a behavior table Document differences between your shell and POSIX.

5.9 Books That Will Help

Topic Book Chapter
POSIX shell “POSIX Shell Command Language” All core sections
Process control “Advanced Programming in the UNIX Environment” Ch. 8-10
Expansion semantics Bash Reference Manual Expansions (for comparison)

5.10 Implementation Phases

Phase 1: Foundation (2-3 days)

Goals:

  • Define data structures and interfaces
  • Build a minimal end-to-end demo

Tasks:

  1. Implement the core data structures
  2. Build a tiny CLI or harness for manual tests

Checkpoint: A demo command runs end-to-end with clear logging.

Phase 2: Core Functionality (1 week)

Goals:

  • Implement full feature set
  • Validate with unit tests

Tasks:

  1. Implement core requirements
  2. Add error handling and edge cases

Checkpoint: All functional requirements pass basic tests.

Phase 3: Polish & Edge Cases (2-4 days)

Goals:

  • Harden for weird inputs
  • Improve UX and documentation

Tasks:

  1. Add edge-case tests
  2. Document design decisions

Checkpoint: Deterministic golden demo and clean error output.

5.11 Key Implementation Decisions

Decision Options Recommendation Rationale
Parsing depth Minimal vs full Incremental Start small, expand safely
Error policy Silent vs verbose Verbose Debuggability for learners

6. Testing Strategy

6.1 Test Categories

Category Purpose Examples
Unit Tests Test individual components Tokenizer, matcher, env builder
Integration Tests Test component interactions Full command lines
Edge Case Tests Handle boundary conditions Empty input, bad args

6.2 Critical Test Cases

  1. Golden Path: Run the canonical demo and verify output.
  2. Failure Path: Provide invalid input and confirm error status.
  3. Stress Path: Run repeated commands to detect leaks or state corruption.

6.3 Test Data

input: echo hello
output: hello

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall Symptom Solution
Misordered redirection Output goes to wrong place Apply redirections left-to-right
Leaked file descriptors Commands hang waiting for EOF Close unused fds in parent/child
Incorrect exit status &&/|| behave wrong Use waitpid macros correctly

7.2 Debugging Strategies

  • Trace syscalls: Use strace/dtruss to verify fork/exec/dup2 order.
  • Log state transitions: Print parser states and job table changes in debug mode.
  • Compare with dash: Run the same input in a reference shell.

7.3 Performance Traps

  • Avoid O(n^2) behavior in hot paths like line editing.
  • Minimize allocations inside the REPL loop.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a help built-in with usage docs.
  • Add colored prompt themes.

8.2 Intermediate Extensions

  • Add a simple profiling mode for command timing.
  • Implement a which built-in using PATH lookup.

8.3 Advanced Extensions

  • Add programmable completion or plugin system.
  • Add a scriptable test harness with golden outputs.

9. Real-World Connections

9.1 Industry Applications

  • Build systems: shells orchestrate compilation and test pipelines.
  • DevOps automation: scripts manage deployments and infrastructure.

9.2 Reference Implementations

  • bash: The most common interactive shell.
  • dash: Minimal POSIX shell often used as /bin/sh.
  • zsh: Feature-rich interactive shell.

9.3 Interview Relevance

  • Process creation and lifecycle questions.
  • Parsing and system programming design trade-offs.

10. Resources

10.1 Essential Reading

  • “POSIX Shell Command Language” by The Open Group - focus on the chapters relevant to this project.
  • “Advanced Programming in the UNIX Environment” - process control and pipes.

10.2 Video Resources

  • Unix process model lectures (any OS course).
  • Compiler front-end videos for lexing/parsing projects.

10.3 Tools & Documentation

  • strace/dtruss: inspect syscalls.
  • man pages: fork, execve, waitpid, pipe, dup2.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the core concept without notes.
  • I can trace a command through my subsystem.
  • I understand at least one key design trade-off.

11.2 Implementation

  • All functional requirements are met.
  • All critical tests pass.
  • Edge cases are handled cleanly.

11.3 Growth

  • I documented lessons learned.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Core feature works for the golden demo.
  • Errors are handled with non-zero status.
  • Code is readable and buildable.

Full Completion:

  • All functional requirements met.
  • Tests cover edge cases and failures.

Excellence (Going Above & Beyond):

  • Performance benchmarks and clear documentation.
  • Behavior compared against a reference shell.