Project 9: Unix Shell

Build a shell that executes commands and supports pipes, redirection, and built-ins.

Quick Reference

Attribute       Value
Difficulty      Expert
Time Estimate   3-4 weeks
Language        C
Prerequisites   fork/exec/wait, file descriptors
Key Topics      Process control, pipes, parsing

1. Learning Objectives

By completing this project, you will:

  1. Parse command lines into tokens and pipelines.
  2. Execute commands with fork and exec.
  3. Implement pipes and I/O redirection.
  4. Handle built-ins like cd and exit.

2. Theoretical Foundation

2.1 Core Concepts

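  • Process creation: fork() duplicates the calling process; the exec() family replaces the program image of the current process while keeping the same PID; wait() lets the parent collect the child's exit status.
  • Pipes: Connect stdout of one process to stdin of another.
  • Redirection: open() the target file, then use dup2() to install it as stdin or stdout.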
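
The fork/exec/wait cycle described above can be sketched as follows. This is a minimal illustration, not a prescribed design: the helper name run_command and the choice of 127 as the "exec failed" exit code (a common shell convention) are assumptions made for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its arguments and return the
 * program's exit status, or -1 if fork/wait failed or the child was
 * terminated by a signal. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);
        /* execvp() returns only on failure. */
        perror(argv[0]);
        _exit(127);
    }
    /* Parent: block until this child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit() rather than exit() after a failed exec so that it does not flush stdio buffers inherited from the parent.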

2.2 Why This Matters

The shell is the heart of Unix. Building one is a complete exercise in process control and I/O wiring.

2.3 Historical Context / Background

Shells introduced pipelines and composability. The core model has remained stable for decades.

2.4 Common Misconceptions

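  • “Built-ins run like normal commands”: cd and exit must change the state of the shell process itself, so they cannot run in a forked child.
  • “Pipes are special”: A pipe is a kernel buffer exposed as two ordinary file descriptors, used with the same read(), write(), and close() calls as files.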
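
To make the second point concrete, here is a small sketch that sends a message through a pipe inside a single process using only read(), write(), and close(). The helper name pipe_roundtrip is invented for illustration, and the example assumes the message is smaller than the pipe buffer so that write() does not block.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: push msg through a pipe within one process.
 * Returns the number of bytes read back, or -1 on error.
 * Assumes msg fits in the pipe buffer, so write() cannot block here. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsz)
{
    int fds[2];                 /* fds[0] = read end, fds[1] = write end */
    if (outsz == 0 || pipe(fds) < 0)
        return -1;

    ssize_t w = write(fds[1], msg, strlen(msg));  /* same call as for files */
    close(fds[1]);              /* reader sees EOF once all write ends close */

    ssize_t n = (w < 0) ? -1 : read(fds[0], out, outsz - 1);
    close(fds[0]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```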

3. Project Specification

3.1 What You Will Build

A shell that supports:

  • Running external programs
  • | pipelines (at least one pipe)
  • < and > redirection
  • Built-ins: cd, exit

3.2 Functional Requirements

  1. Execute a single command with args.
  2. Support a single pipe between two commands.
  3. Support input/output redirection.
  4. Provide a prompt and handle empty input.

3.3 Non-Functional Requirements

  • Correctness: Close unused pipe fds.
  • Usability: Errors are clear and non-fatal.
  • Reliability: Shell continues after errors.

3.4 Example Usage / Output

$ ./my_shell
> ls -l | grep .c > output.txt
> cat output.txt
-rw-r--r-- 1 user user 5000 my_wc.c
> exit

3.5 Real World Outcome

You can run commands, redirect output, and chain programs via pipelines. The shell behaves like a minimal bash.


4. Solution Architecture

4.1 High-Level Design

read line -> tokenize -> parse -> fork/exec -> wait

4.2 Key Components

Component   Responsibility               Key Decisions
Tokenizer   Split input                  Simple whitespace rules
Parser      Identify pipes/redirection   Support one pipe
Executor    Fork and exec                Use execvp
Built-ins   cd, exit                     Run in parent

4.3 Data Structures

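typedef struct {
    char *argv[64];     /* NULL-terminated argument vector passed to execvp() */
    char *input_file;   /* target of "<", or NULL if stdin is not redirected */
    char *output_file;  /* target of ">", or NULL if stdout is not redirected */
} Command;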
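
One possible way to fill this structure from a token array is sketched below. The function name parse_command, the MAX_ARGS constant, and the error convention (returning -1) are assumptions for this example; the struct is repeated so the block compiles on its own.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 64

typedef struct {
    char *argv[MAX_ARGS];
    char *input_file;
    char *output_file;
} Command;

/* Hypothetical helper: build cmd from ntok whitespace-separated tokens.
 * Returns 0 on success, -1 on a syntax error (operator without a filename,
 * too many arguments, or no command word at all). */
int parse_command(char **tokens, int ntok, Command *cmd)
{
    int argc = 0;
    cmd->input_file = NULL;
    cmd->output_file = NULL;

    for (int i = 0; i < ntok; i++) {
        int is_in = strcmp(tokens[i], "<") == 0;
        int is_out = strcmp(tokens[i], ">") == 0;
        if (is_in || is_out) {
            if (i + 1 >= ntok)
                return -1;                 /* "<" or ">" with no filename */
            if (is_in)
                cmd->input_file = tokens[i + 1];
            else
                cmd->output_file = tokens[i + 1];
            i++;                           /* skip the filename token */
        } else {
            if (argc >= MAX_ARGS - 1)
                return -1;                 /* keep one slot for the NULL */
            cmd->argv[argc++] = tokens[i];
        }
    }
    cmd->argv[argc] = NULL;                /* execvp() needs the terminator */
    return argc > 0 ? 0 : -1;
}
```

The Command stores pointers into the token array rather than copies, so the tokenized line must stay alive until the command has been executed.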

4.4 Algorithm Overview

Key Algorithm: Pipe execution

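  1. Create the pipe with pipe().
  2. Fork child A; in it, dup2() the write end onto stdout, close both pipe fds, and exec the left command.
  3. Fork child B; in it, dup2() the read end onto stdin, close both pipe fds, and exec the right command.
  4. Close both pipe ends in the parent, then wait for both children.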
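
The four steps above can be sketched as a single function. The name run_pipeline and the decision to report the right-hand command's exit status are assumptions for this example, not requirements of the project.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child-side helper: install fd as target, drop both pipe fds, then exec. */
static void exec_stage(int fd, int target, int fds[2], char *const argv[])
{
    if (dup2(fd, target) < 0)
        _exit(126);
    close(fds[0]);
    close(fds[1]);
    execvp(argv[0], argv);
    perror(argv[0]);            /* reached only if exec failed */
    _exit(127);
}

/* Hypothetical helper: run "left | right".
 * Returns right's exit status, or -1 on failure. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return -1; }      /* step 1 */

    pid_t a = fork();                                      /* step 2 */
    if (a == 0)
        exec_stage(fds[1], STDOUT_FILENO, fds, left);

    pid_t b = (a < 0) ? -1 : fork();                       /* step 3 */
    if (b == 0)
        exec_stage(fds[0], STDIN_FILENO, fds, right);

    /* Step 4: the parent must close both ends, or the reader never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    if (a > 0)
        waitpid(a, NULL, 0);
    if (b < 0 || waitpid(b, &status, 0) < 0) {
        perror("run_pipeline");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```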

Complexity Analysis:

  • Parsing: O(n) in the length of the input line
  • Execution: O(1) processes per command

5. Implementation Guide

5.1 Development Environment Setup

cc -Wall -Wextra -O2 -g -o my_shell src/my_shell.c

5.2 Project Structure

my_shell/
├── src/
│   └── my_shell.c
├── tests/
│   └── test_shell.sh
└── README.md

5.3 The Core Question You’re Answering

“How does a shell connect independent programs into a single workflow?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. fork/exec/wait
    • How does exec replace the process?
  2. dup2
    • How does redirection replace stdin/stdout?
  3. Tokenization
    • How do you split while preserving arguments?
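
For the tokenization question, a minimal whitespace-only splitter might look like the sketch below. The name tokenize is hypothetical, and the approach assumes the simple rules recommended in this guide: no quoting or escapes, and operators such as | and > separated from words by whitespace.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r() */
#include <string.h>

/* Hypothetical helper: split line in place on spaces, tabs, and newlines.
 * Stores at most max_tokens pointers into line and returns the count.
 * The line buffer is modified: separators are overwritten with '\0'. */
int tokenize(char *line, char **tokens, int max_tokens)
{
    static const char *seps = " \t\r\n";
    char *save = NULL;
    int n = 0;

    for (char *tok = strtok_r(line, seps, &save);
         tok != NULL && n < max_tokens;
         tok = strtok_r(NULL, seps, &save)) {
        tokens[n++] = tok;
    }
    return n;
}
```

Because the tokens point into the original buffer, each argument is preserved without copying, but an input such as a|b stays a single token under these rules.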

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. How will you represent pipelines internally?
  2. How will you handle quotes and escapes later?
  3. Where do you handle built-ins?

5.6 Thinking Exercise

Pipe Diagram

Draw the file descriptor table for two commands in a pipeline after dup2().

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why must cd be a built-in?”
  2. “How does a pipe work?”
  3. “What happens if you forget to close pipe ends?”
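
As background for the first question, the sketch below handles cd and exit inside the shell process, because chdir() affects only the process that calls it. The name try_builtin and the fallback to $HOME for a bare cd are assumptions modeled on common shell behavior.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if argv named a built-in and it was handled
 * in this process, 0 if the caller should fork and exec instead. */
int try_builtin(char *const argv[])
{
    if (argv[0] == NULL)
        return 1;                       /* empty command: nothing to do */

    if (strcmp(argv[0], "exit") == 0)
        exit(0);                        /* terminates the shell itself */

    if (strcmp(argv[0], "cd") == 0) {
        /* Assumption: bare "cd" goes to $HOME, as in common shells. */
        const char *dir = (argv[1] != NULL) ? argv[1] : getenv("HOME");
        if (dir == NULL)
            fprintf(stderr, "cd: HOME not set\n");
        else if (chdir(dir) != 0)
            perror("cd");               /* report and keep the shell running */
        return 1;
    }
    return 0;
}
```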

5.8 Hints in Layers

Hint 1: Start with single commands. Ignore pipes until exec works.

Hint 2: Add redirection. Implement > and < first.

Hint 3: Add pipes. Support exactly one | to start.

5.9 Books That Will Help

Topic             Book                                Chapter
Process control   “The Linux Programming Interface”   Ch. 24-27
Pipes             “The Linux Programming Interface”   Ch. 44

5.10 Implementation Phases

Phase 1: Foundation (5-7 days)

Goals:

  • Execute simple commands

Tasks:

  1. Read line and tokenize.
  2. Fork and exec.

Checkpoint: ls runs correctly.
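
For the line-reading half of this phase, one possible sketch uses POSIX getline(). The helper names read_command_line and is_blank, and the "> " prompt string, are illustrative assumptions.

```c
#define _POSIX_C_SOURCE 200809L   /* for getline() */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: print a prompt (if out is non-NULL), read one line
 * from in, and strip the trailing newline.
 * Returns 1 if a line was read (possibly empty), 0 on EOF or read error.
 * *line and *cap follow getline() conventions; the caller frees *line. */
int read_command_line(FILE *in, FILE *out, char **line, size_t *cap)
{
    if (out != NULL) {
        fputs("> ", out);
        fflush(out);            /* the prompt has no newline, so flush it */
    }
    ssize_t len = getline(line, cap, in);
    if (len < 0)
        return 0;
    if (len > 0 && (*line)[len - 1] == '\n')
        (*line)[len - 1] = '\0';
    return 1;
}

/* Returns 1 if line contains nothing but spaces, tabs, or carriage returns. */
int is_blank(const char *line)
{
    return line[strspn(line, " \t\r")] == '\0';
}
```

A main loop would call read_command_line(stdin, stdout, &line, &cap) until it returns 0, skip any line for which is_blank() is true, and free the buffer once on exit.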

Phase 2: Core Functionality (7-10 days)

Goals:

  • Add redirection and built-ins

Tasks:

  1. Implement < and >.
  2. Add cd and exit.

Checkpoint: Redirection works.
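
A sketch of how < and > might be wired up in the child before exec is shown below. The names redirect_fd and run_redirected, the 0644 file mode, and truncating the output file on > are assumptions chosen to mirror typical shell behavior.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: make target_fd (e.g. STDOUT_FILENO) refer to path.
 * Returns 0 on success, -1 on failure. Meant to run in the child only. */
static int redirect_fd(const char *path, int flags, int target_fd)
{
    int fd = open(path, flags, 0644);
    if (fd < 0) { perror(path); return -1; }
    if (fd != target_fd) {
        if (dup2(fd, target_fd) < 0) { perror("dup2"); close(fd); return -1; }
        close(fd);              /* the duplicate on target_fd stays open */
    }
    return 0;
}

/* Run argv with optional "<" and ">" files (NULL means no redirection).
 * Returns the program's exit status, or -1 on fork/wait failure. */
int run_redirected(char *const argv[], const char *in, const char *out)
{
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }
    if (pid == 0) {
        if (in != NULL && redirect_fd(in, O_RDONLY, STDIN_FILENO) < 0)
            _exit(1);
        if (out != NULL &&
            redirect_fd(out, O_WRONLY | O_CREAT | O_TRUNC, STDOUT_FILENO) < 0)
            _exit(1);
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); return -1; }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Doing the redirection after fork() means the shell's own stdin and stdout are never touched, so nothing has to be restored afterwards.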

Phase 3: Pipes and Polish (6-8 days)

Goals:

  • Add a pipeline

Tasks:

  1. Parse A | B.
  2. Wire with pipe().

Checkpoint: ls | grep .c works.

5.11 Key Implementation Decisions

Decision       Options                  Recommendation   Rationale
Parsing        Simple vs full quoting   Simple           Keep scope contained
Pipe support   One vs many              One              Core concept first

6. Testing Strategy

6.1 Test Categories

Category            Purpose        Examples
Unit Tests          Tokenization   Input strings
Integration Tests   Pipelines      echo hi | tr a-z A-Z
Edge Case Tests     Empty input    No crash

6.2 Critical Test Cases

  1. Invalid command: Error then prompt again.
  2. Redirection: Output file contains expected data.
  3. Pipeline: Two commands connected.

6.3 Test Data

ls
pwd
ls | grep .md

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall                 Symptom               Solution
Not closing pipe ends   Hang                  Close unused fds
Running cd in child     No directory change   Built-in in parent
Bad tokenization        Wrong args            Debug tokens

7.2 Debugging Strategies

  • Print tokens and parsed commands.
  • Use strace (Linux) or dtruss (macOS) to inspect syscalls.

7.3 Performance Traps

None major; correctness first.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add pwd built-in.
  • Improve whitespace handling.

8.2 Intermediate Extensions

  • Support multiple pipes.
  • Add background execution (&).
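
For the multiple-pipes extension, one possible generalization is to loop over the stages and carry the read end of the previous pipe forward. The name run_pipeline_n and the choice to return the last stage's exit status are assumptions for this sketch, which also uses wait() and therefore assumes the shell has no other children outstanding.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run cmds[0] | cmds[1] | ... | cmds[n-1], where each
 * cmds[i] is a NULL-terminated argv. Returns the last stage's exit status,
 * or -1 if the pipeline could not be fully started. */
int run_pipeline_n(char **cmds[], int n)
{
    int prev_read = -1;         /* read end of the previous pipe, if any */
    pid_t last = -1;
    int started = 0;

    for (int i = 0; i < n; i++) {
        int fds[2] = {-1, -1};
        if (i < n - 1 && pipe(fds) < 0) { perror("pipe"); break; }

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            if (fds[0] >= 0) { close(fds[0]); close(fds[1]); }
            break;
        }
        if (pid == 0) {
            if (prev_read >= 0) {           /* not the first stage */
                dup2(prev_read, STDIN_FILENO);
                close(prev_read);
            }
            if (fds[1] >= 0) {              /* not the last stage */
                dup2(fds[1], STDOUT_FILENO);
                close(fds[0]);
                close(fds[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);
            _exit(127);
        }
        started++;
        last = pid;
        /* Parent: drop the ends it no longer needs, keep the new read end. */
        if (prev_read >= 0) close(prev_read);
        if (fds[1] >= 0) close(fds[1]);
        prev_read = fds[0];
    }
    if (prev_read >= 0) close(prev_read);

    int result = -1;
    for (int i = 0; i < started; i++) {     /* reap every child we started */
        int status;
        pid_t done = wait(&status);
        if (done == last && started == n && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }
    return result;
}
```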

8.3 Advanced Extensions

  • Job control (fg/bg).
  • Command history and line editing.

9. Real-World Connections

9.1 Industry Applications

  • DevOps: Automating workflows with pipelines.
  • Systems tooling: Shells are fundamental.
  • bash, dash: Full-featured shells.

9.3 Interview Relevance

Shell internals test deep understanding of processes and file descriptors.


10. Resources

10.1 Essential Reading

  • “The Linux Programming Interface” - Ch. 24-27, 44

10.2 Video Resources

  • OS process control lectures

10.3 Tools & Documentation

  • man 2 fork, man 2 execve, man 2 pipe, man 2 dup2, man 2 waitpid
  • Build System: Executes commands similarly.
  • Debugger: Inspects processes.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain how pipes work.
  • I can describe why built-ins run in parent.
  • I understand redirection with dup2.

11.2 Implementation

  • Commands run reliably.
  • Redirection and pipes work.
  • Errors are handled gracefully.

11.3 Growth

  • I can extend the shell to pipelines with multiple pipes.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Runs single commands and built-ins.

Full Completion:

  • Supports redirection and one pipe.

Excellence (Going Above & Beyond):

  • Multiple pipes and job control.

This guide was generated from C_PROGRAMMING_COMPLETE_MASTERY.md. For the complete learning path, see the parent directory.