Project 2: Build Your Own Shell (BYOS)

Implement a simple shell with fork/exec, redirection, and pipes.

Quick Reference

Attribute Value
Difficulty Intermediate
Time Estimate 1 week
Language C (Alt: Rust, Go)
Prerequisites C basics, pointers
Key Topics fork/exec, dup2, pipes

1. Learning Objectives

By completing this project, you will:

  1. Parse input into argv arrays.
  2. Use fork/exec to run programs.
  3. Implement redirection with dup2.
  4. Connect processes with pipes.

2. Theoretical Foundation

2.1 Core Concepts

  • fork/exec: Fork duplicates the process; exec replaces it with a new program.
  • File descriptors: Standard input/output/error are 0/1/2.
  • Pipes: Kernel buffers that connect stdout of one process to stdin of another.

2.2 Why This Matters

The shell is the canonical interface between users and the Unix process model.

2.3 Historical Context / Background

Unix shells established process control conventions still used in modern systems.

2.4 Common Misconceptions

  • “cd is a normal command”: It must be a built-in to change the shell’s state.

3. Project Specification

3.1 What You Will Build

A minimal shell that supports built-ins (cd, exit), command execution, > redirection, and | pipelines.

3.2 Functional Requirements

  1. Show a prompt and read lines.
  2. Support cd, exit, and pwd.
  3. Execute external commands with arguments.
  4. Support output redirection and pipes.

3.3 Non-Functional Requirements

  • Reliability: No zombies; always wait for children.
  • Usability: Handle empty input gracefully.

3.4 Example Usage / Output

$ ./myshell
myshell> ls -l > out.txt
myshell> cat out.txt

3.5 Real World Outcome

You can interactively run commands and connect them with pipes:

$ ./myshell
myshell> ls | grep c
main.c
myshell> exit

4. Solution Architecture

4.1 High-Level Design

read line -> parse -> handle built-ins -> fork/exec -> wait

4.2 Key Components

Component Responsibility Key Decisions
Parser Tokenize input Simple split first
Executor Run commands execvp
Redirection Update fds dup2
Pipe handler Connect commands pipe + fork

4.3 Data Structures

char *argv[MAX_ARGS];

4.4 Algorithm Overview

Key Algorithm: Pipeline

  1. Create pipe.
  2. Fork child A for left command.
  3. Fork child B for right command.
  4. Wire stdout/stdin with dup2.

Complexity Analysis:

  • Time: O(n) tokens
  • Space: O(n) args

5. Implementation Guide

5.1 Development Environment Setup

gcc --version

5.2 Project Structure

project-root/
├── shell.c
└── README.md

5.3 The Core Question You’re Answering

“What actually happens when I type a command and press Enter?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. fork vs exec
  2. dup2 and redirection
  3. waitpid and zombie cleanup

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. How will you handle quoted strings?
  2. How will you represent multiple pipeline stages?
  3. What happens if a command fails to exec?

5.6 Thinking Exercise

Trace ls | grep c

Write the sequence of syscalls and fd changes for a 2-stage pipeline.

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why does cd have to be a built-in?”
  2. “What is a zombie process?”
  3. “How does a pipe connect two processes?”

5.8 Hints in Layers

Hint 1: Minimal loop Read line -> parse -> execute -> repeat.

Hint 2: Redirection Open file, dup2 to stdout, close original fd.

Hint 3: Pipes The parent should close both pipe ends after forking.

5.9 Books That Will Help

Topic Book Chapter
Process control “TLPI” Ch. 24-27
Signals “TLPI” Ch. 20-22

5.10 Implementation Phases

Phase 1: Foundation (2-3 days)

Goals:

  • Fork/exec basic commands.

Tasks:

  1. Tokenize input.
  2. execvp with fork.

Checkpoint: ls runs correctly.

Phase 2: Core Functionality (2-3 days)

Goals:

  • Built-ins and redirection.

Tasks:

  1. Implement cd and exit.
  2. Add > redirection.

Checkpoint: ls > out.txt works.

Phase 3: Polish & Edge Cases (1-2 days)

Goals:

  • Add pipes and error handling.

Tasks:

  1. Implement pipe logic.
  2. Handle failed exec gracefully.

Checkpoint: ls | grep c works.

5.11 Key Implementation Decisions

Decision Options Recommendation Rationale
Parsing strtok vs manual strtok first Simpler
Pipes 2-stage vs N-stage 2-stage first Build incrementally

6. Testing Strategy

6.1 Test Categories

Category Purpose Examples
Execution Run commands ls, pwd
Built-ins State changes cd /tmp
Pipes Data flow ls | wc -l

6.2 Critical Test Cases

  1. Invalid command prints error but shell continues.
  2. cd changes directory in parent.
  3. ls | grep produces expected output.

6.3 Test Data

Commands: ls, pwd, echo

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall Symptom Solution
Not waiting Zombies waitpid for children
Redirection in parent Broken shell Only in child
Forgetting close Pipe hangs Close unused ends

7.2 Debugging Strategies

  • Use strace to confirm fork/exec flow.
  • Print argv arrays before exec.

7.3 Performance Traps

None for this project.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add history.
  • Add >> append redirection.

8.2 Intermediate Extensions

  • Support multiple pipeline stages.
  • Add signal handling for Ctrl+C.

8.3 Advanced Extensions

  • Add job control (bg/fg).
  • Implement simple globbing.

9. Real-World Connections

9.1 Industry Applications

  • Understanding shell internals helps with debugging scripts and process issues.
  • bash: https://www.gnu.org/software/bash/
  • zsh: https://www.zsh.org

9.3 Interview Relevance

  • fork/exec and pipes are classic Unix interview topics.

10. Resources

10.1 Essential Reading

  • fork(2) and execvp(3)

10.2 Video Resources

  • Shell implementation tutorials (search “build a shell in C”)

10.3 Tools & Documentation

  • dup2(2) and pipe(2)

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain fork/exec.
  • I can explain redirection with dup2.
  • I can describe pipes and fd wiring.

11.2 Implementation

  • Built-ins work.
  • Pipes work.
  • No zombies remain.

11.3 Growth

  • I can extend to job control.
  • I can explain the shell lifecycle.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Shell runs commands with fork/exec.

Full Completion:

  • Add redirection and pipes.

Excellence (Going Above & Beyond):

  • Job control and history support.

This guide was generated from LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md. For the complete learning path, see the parent directory.