Project 12: Unix Shell with Job Control
Project 12: Unix Shell with Job Control
Build an interactive shell that supports foreground/background jobs, built-in commands, and correct signal handling under stress.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 2-3 weeks |
| Language | C (Alternatives: Rust, Zig, C++) |
| Prerequisites | Project 11 (Signals + Processes Sandbox) recommended |
| Key Topics | Job control, process groups, signals, race conditions, terminal I/O |
| CS:APP Chapters | 8, 12 |
Table of Contents
- Learning Objectives
- Deep Theoretical Foundation
- Project Specification
- Solution Architecture
- Implementation Guide
- Testing Strategy
- Common Pitfalls
- Extensions
- Real-World Connections
- Resources
- Self-Assessment Checklist
1. Learning Objectives
By completing this project, you will:
- Master job control semantics: Understand how shells manage foreground and background processes, and how job state transitions work
- Implement correct signal handling: Write async-signal-safe handlers for SIGCHLD, SIGINT, and SIGTSTP without races
- Understand process groups and sessions: Know how the terminal, session leader, and process groups interact
- Avoid race conditions: Use sigprocmask correctly to prevent races between parent and child processes
- Build a real interactive tool: Create something you can actually use as a working shell
- Debug concurrent systems: Develop intuition for diagnosing and fixing race conditions and zombie processes
2. Deep Theoretical Foundation
2.1 Shell Architecture Overview
A shell is an interactive command interpreter that serves as the interface between the user and the operating system. At its core, a shell performs a simple loop:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ SHELL MAIN LOOP โ
โ โ
โ โโโโโโโโโโโ โโโโโโโโโโโ โโโโโโโโโโโ โโโโโโโโโโโ โโโโโโโโโโโ โ
โ โ Read โโโโโถโ Parse โโโโโถโ Evaluateโโโโโถโ Execute โโโโโถโ Wait โ โ
โ โ Command โ โ Command โ โ Built-inโ โ Externalโ โ(if fg) โ โ
โ โโโโโโโโโโโ โโโโโโโโโโโ โ or Ext? โ โ Command โ โโโโโโฌโโโโโ โ
โ โฒ โโโโโโโโโโโ โโโโโโโโโโโ โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Key architectural components:
- Command Parser: Tokenizes input, handles quoting, identifies operators
- Built-in Command Handler: Executes commands without forking (cd, jobs, fg, bg, exit)
- External Command Executor: Forks and execs external programs
- Job Table: Tracks all background and stopped jobs
- Signal Handlers: Responds to SIGCHLD, SIGINT, SIGTSTP
2.2 Process Groups and Sessions
Understanding process groups is fundamental to job control. Every process belongs to exactly one process group, and every process group belongs to exactly one session.
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ SESSION (SID = 1000) โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Session Leader: bash (PID 1000) โ โ
โ โ (Controlling process of terminal) โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ โ
โ โ FOREGROUND PGRP โ โ BACKGROUND PGRP โ โ BACKGROUND PGRP โ โ
โ โ PGID = 2001 โ โ PGID = 2002 โ โ PGID = 2003 โ โ
โ โ โ โ โ โ โ โ
โ โ โโโโโโโ โโโโโโโ โ โ โโโโโโโ โโโโโโโ โ โ โโโโโโโ โ โ
โ โ โvim โ โgrep โ โ โ โmake โ โ gcc โ โ โ โsleepโ โ โ
โ โ โ2001 โ โ2005 โ โ โ โ2002 โ โ2006 โ โ โ โ2003 โ โ โ
โ โ โโโโโโโ โโโโโโโ โ โ โโโโโโโ โโโโโโโ โ โ โโโโโโโ โ โ
โ โ โ โ โ โ โ โ
โ โ (Receives terminal โ โ (No terminal input) โ โ (No terminal โ โ
โ โ signals: ^C, ^Z) โ โ โ โ input) โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ Terminal (/dev/pts/0) โโโโโ Foreground process group = 2001 โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Key concepts:
| Concept | Description |
|---|---|
| Session | Collection of process groups; created by setsid(); has at most one controlling terminal |
| Session Leader | Process that created the session; usually the login shell |
| Process Group | Collection of related processes; identified by PGID (usually the first processโs PID) |
| Foreground Process Group | The group that can read from/write to the terminal and receives terminal signals |
| Controlling Terminal | Terminal device associated with a session; source of SIGINT, SIGTSTP, etc. |
2.3 The Controlling Terminal
The controlling terminal is the key to understanding job control. When you press Ctrl+C or Ctrl+Z:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ TERMINAL SIGNAL FLOW โ
โ โ
โ User Types ^C โ
โ โ โ
โ โผ โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Terminal Driver โ โ
โ โ (/dev/pts/0) โ โ
โ โโโโโโโโโโโโฌโโโโโโโโโโโ โ
โ โ โ
โ โ Looks up: tcgetpgrp(fd) โ foreground PGID โ
โ โ โ
โ โผ โ
โ โโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Send SIGINT to all โ โ
โ โ processes in the โ โ
โ โ foreground group โ โ
โ โโโโโโโโโโโโฌโโโโโโโโโโโ โ
โ โ โ
โ โโโโโโโโโโผโโโโโโโโโ โ
โ โผ โผ โผ โ
โ โโโโโโโ โโโโโโโ โโโโโโโ โ
โ โ P1 โ โ P2 โ โ P3 โ All receive SIGINT โ
โ โโโโโโโ โโโโโโโ โโโโโโโ โ
โ โ
โ Background process groups do NOT receive the signal โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Terminal ownership functions:
// Get foreground process group of terminal
pid_t fg_pgid = tcgetpgrp(STDIN_FILENO);
// Set foreground process group of terminal (must be in same session)
tcsetpgrp(STDIN_FILENO, new_fg_pgid);
// Only the controlling process (shell) should call tcsetpgrp
2.4 Job Control Overview
Job control allows users to manage multiple processes from a single terminal:
| Action | Key/Command | Signal | Effect |
|---|---|---|---|
| Interrupt | Ctrl+C | SIGINT | Terminate foreground job |
| Suspend | Ctrl+Z | SIGTSTP | Stop foreground job |
| Resume in FG | fg %n |
SIGCONT | Continue stopped job in foreground |
| Resume in BG | bg %n |
SIGCONT | Continue stopped job in background |
| List jobs | jobs |
- | Display all jobs |
| Run in BG | cmd & |
- | Start job in background |
Job State Machine:
fork() fg
โโโโโโโโโโโโโโโโโถ RUNNING โโโโโโโโโโโโโโ
โ (foreground) โ
โ โ โ
โ โ ^Z (SIGTSTP) โ
โ โผ โ
โโโโโโโดโโโโโโ STOPPED โโโโโโโโโโโโโโโค
โ shell โ โ โ
โ (parent) โ โ bg โ
โโโโโโโโโโโโโ โผ โ
โ RUNNING โโโโโโโโโโโโโโโโโ
โ (background) fg
โ โ
โ โ exit/signal
โ โผ
โโโโโโโโโโโโโโโโถ TERMINATED
wait() โ
โ reap
โผ
REAPED
(removed from
job table)
2.5 Signal Handling in Shells
The shell must handle three critical signals:
SIGCHLD: Sent to parent when a child changes state (terminates, stops, continues)
void sigchld_handler(int sig) {
int olderrno = errno; // Save errno
pid_t pid;
int status;
// Reap ALL available children (signals can coalesce!)
while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED | WCONTINUED)) > 0) {
if (WIFEXITED(status)) {
// Child exited normally
// Update job state, possibly print message
} else if (WIFSIGNALED(status)) {
// Child terminated by signal
// Print message, remove from job table
} else if (WIFSTOPPED(status)) {
// Child was stopped
// Update job state to STOPPED
} else if (WIFCONTINUED(status)) {
// Child was continued
// Update job state to RUNNING
}
}
errno = olderrno; // Restore errno
}
SIGINT (Ctrl+C): The shell should NOT terminate; it should let the foreground job handle it.
SIGTSTP (Ctrl+Z): The shell should NOT stop; it should let the foreground job handle it.
// Shell's signal disposition for SIGINT/SIGTSTP
// Option 1: Ignore in parent, let children inherit default
signal(SIGINT, SIG_IGN);
signal(SIGTSTP, SIG_IGN);
// Child must restore default handlers after fork:
signal(SIGINT, SIG_DFL);
signal(SIGTSTP, SIG_DFL);
2.6 Built-in vs External Commands
Built-in commands must execute in the shell process itself because they affect the shellโs state:
| Command | Why Built-in? |
|---|---|
cd |
Changes shellโs working directory |
exit |
Terminates the shell |
jobs |
Reads shellโs job table |
fg |
Modifies foreground group, waits |
bg |
Sends SIGCONT to job |
export |
Modifies shellโs environment |
External commands run in child processes:
if (is_builtin(argv[0])) {
execute_builtin(argv); // Run in shell process
} else {
pid = fork();
if (pid == 0) {
// Child: exec external command
execvp(argv[0], argv);
exit(1); // If exec fails
}
// Parent: wait or add to background jobs
}
2.7 I/O Redirection Fundamentals
While not required for the basic shell, understanding I/O redirection is essential for extensions:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ I/O REDIRECTION โ
โ โ
โ Before: After "cmd > output.txt": โ
โ โ
โ โโโโโโโโโโโ โโโโโโโโโโโ โ
โ โ Process โ โ Process โ โ
โ โโโโโโโโโโโค โโโโโโโโโโโค โ
โ โ fd 0 โโโโโโโถ terminal โ fd 0 โโโโโโโถ terminal โ
โ โ fd 1 โโโโโโโถ terminal โ fd 1 โโโโโโโถ output.txt (file) โ
โ โ fd 2 โโโโโโโถ terminal โ fd 2 โโโโโโโถ terminal โ
โ โโโโโโโโโโโ โโโโโโโโโโโ โ
โ โ
โ Implementation: โ
โ 1. Open output.txt, get fd (e.g., 3) โ
โ 2. dup2(3, STDOUT_FILENO) // fd 1 now points to file โ
โ 3. close(3) // Clean up extra fd โ
โ 4. exec() // New program inherits redirected fds โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.8 Pipeline Mechanics
Pipelines connect the stdout of one process to the stdin of another:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ PIPELINE: ls | grep foo | wc โ
โ โ
โ โโโโโโโโโ โโโโโโโโโ โโโโโโโโโ โ
โ โ ls โ โ grep โ โ wc โ โ
โ โ โ โ โ โ โ โ
โ โ fd1 โโโผโโโpipeโโโโผโถ fd0 โ โ โ โ
โ โ โ [0] โ fd1 โโโผโโโpipeโโโโผโถ fd0 โ โ
โ โ โ โ โ [1] โ fd1 โโโผโโโถ terminal โ
โ โโโโโโโโโ โโโโโโโโโ โโโโโโโโโ โ
โ โ
โ Key points: โ
โ - All processes in pipeline are in the SAME process group โ
โ - Shell waits for ALL processes to complete โ
โ - Pipe fds must be closed in parent after fork โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.9 Race Conditions in Shell Implementation
The classic shell race condition occurs between adding a job and reaping it:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ THE SHELL RACE CONDITION โ
โ โ
โ Timeline (INCORRECT): โ
โ โ
โ Parent (shell) Child โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ fork() returns child_pid fork() returns 0 โ
โ โ โ โ
โ โ โผ โ
โ โ setpgid(0, 0) โ
โ โ exec("fast_cmd") โ
โ โ // runs and exits quickly โ
โ โ โ โ
โ โ โโโโโโโโดโโโโโโโ โ
โ โ โ SIGCHLD to โ โ
โ โ โ parent! โ โ
โ โผ โโโโโโโโฌโโโโโโโ โ
โ // About to add job โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ โ
โ sigchld_handler() runs! โ Handler can't find job (not added yet!) โ
โ // Tries to update job โ
โ // that doesn't exist โ
โ โ โ
โ โผ โ
โ addjob(child_pid, ...) โ ZOMBIE! Job added but child already dead โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.10 Proper Use of sigprocmask
The solution is to block SIGCHLD before fork and unblock after adding the job:
sigset_t mask, prev_mask;
// 1. Block SIGCHLD before fork
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, &prev_mask);
pid = fork();
if (pid == 0) {
// Child: restore signal mask before exec
sigprocmask(SIG_SETMASK, &prev_mask, NULL);
// Set up process group
setpgid(0, 0);
// Restore default signal dispositions
signal(SIGINT, SIG_DFL);
signal(SIGTSTP, SIG_DFL);
// Execute command
execvp(argv[0], argv);
exit(1);
}
// 2. Parent: Add job while SIGCHLD is still blocked
addjob(pid, ...);
// 3. Unblock SIGCHLD - now handler can safely run
sigprocmask(SIG_SETMASK, &prev_mask, NULL);
Signal blocking diagram:
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ CORRECT SIGNAL BLOCKING PATTERN โ
โ โ
โ Parent (shell) Child โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ sigprocmask(BLOCK SIGCHLD) โ
โ โ โ
โ โผ โ
โ fork() โโโโโโโโโโโโโโโโโโโโโโโโโถ fork() โ
โ โ โ โ
โ โ โผ โ
โ โ sigprocmask(RESTORE) โ
โ โ setpgid(0, 0) โ
โ โ exec(cmd) โ
โ โ // maybe exits quickly โ
โ โผ โ โ
โ addjob(pid) โ SIGCHLD generated โ
โ โ โ โ
โ โผ โ โ
โ sigprocmask(UNBLOCK) โโโโโโโโโโโโโโโโโโโ SIGCHLD now delivered โ
โ โ โ
โ โผ โ
โ sigchld_handler() โ Job exists in table, handler works! โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
2.11 waitpid() Options and Patterns
The waitpid() function is crucial for correct shell behavior:
pid_t waitpid(pid_t pid, int *status, int options);
PID argument:
| Value | Meaning |
|---|---|
-1 |
Wait for any child |
> 0 |
Wait for specific child with that PID |
0 |
Wait for any child in same process group as caller |
< -1 |
Wait for any child in process group abs(pid) |
Options:
| Option | Effect |
|---|---|
WNOHANG |
Return immediately if no child has exited |
WUNTRACED |
Also return if child has stopped (not just terminated) |
WCONTINUED |
Also return if stopped child has been continued |
Status examination macros:
| Macro | Returns true ifโฆ |
|---|---|
WIFEXITED(status) |
Child terminated normally (exit or return from main) |
WIFSIGNALED(status) |
Child terminated by a signal |
WIFSTOPPED(status) |
Child is currently stopped |
WIFCONTINUED(status) |
Child was continued |
WEXITSTATUS(status) |
Exit status (only valid if WIFEXITED) |
WTERMSIG(status) |
Signal that terminated child (only valid if WIFSIGNALED) |
WSTOPSIG(status) |
Signal that stopped child (only valid if WIFSTOPPED) |
Pattern for SIGCHLD handler:
// MUST use loop because signals can coalesce
while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED | WCONTINUED)) > 0) {
// Process each child status change
}
2.12 Async-Signal-Safe Functions
Signal handlers can interrupt the main program at any point. Only async-signal-safe functions can be called from handlers:
Safe to call from signal handlers:
_exit(),write(),waitpid(),signal(),sigaction()sigprocmask(),sigemptyset(),sigaddset(),sigfillset()getpid(),getppid(),fork(),execve()open(),close(),read(),lseek(),stat()
NOT safe to call from signal handlers:
printf(),fprintf(),sprintf()- uses global buffersmalloc(),free()- uses global heap structuresexit()- calls atexit handlers
Safe output pattern:
// Use write() for output in signal handlers
void safe_print(const char *msg) {
write(STDOUT_FILENO, msg, strlen(msg));
}
// For integer output, convert to string first
void safe_print_int(int n) {
char buf[32];
int len = 0;
// ... convert n to string in buf ...
write(STDOUT_FILENO, buf, len);
}
3. Project Specification
3.1 What You Will Build
An interactive shell named tsh (tiny shell) that supports:
- Running external commands (foreground and background)
- Job control with fg, bg, jobs commands
- Proper signal handling for SIGINT, SIGTSTP, SIGCHLD
- Built-in commands: quit, jobs, bg, fg
3.2 Functional Requirements
- Command Execution:
- Execute external programs with arguments
- Support foreground execution (wait for completion)
- Support background execution with
&suffix
- Built-in Commands:
quit: Exit the shelljobs: List all background and stopped jobsbg <job>: Resume a stopped job in the backgroundfg <job>: Resume a stopped or background job in the foreground
- Signal Handling:
- Ctrl+C (SIGINT): Send to foreground job only, not the shell
- Ctrl+Z (SIGTSTP): Stop foreground job only, not the shell
- SIGCHLD: Reap terminated/stopped children
- Job Control:
- Maintain a job list with job IDs and process groups
- Track job state: Running, Stopped
- Report job state changes
3.3 Non-Functional Requirements
- No zombies: All terminated children must be reaped
- No races: Signal blocking must prevent race conditions
- Clean output: Job state changes must be reported clearly
- Robust parsing: Handle malformed input gracefully
3.4 Example Shell Session
$ ./tsh
tsh> /bin/ls -l
total 64
-rwxr-xr-x 1 user user 16432 Dec 26 10:00 tsh
-rw-r--r-- 1 user user 8234 Dec 26 09:55 tsh.c
tsh> /bin/sleep 10 &
[1] (12345) /bin/sleep 10 &
tsh> /bin/sleep 20 &
[2] (12346) /bin/sleep 20 &
tsh> jobs
[1] (12345) Running /bin/sleep 10 &
[2] (12346) Running /bin/sleep 20 &
tsh> fg %1
Job [1] (12345) stopped by signal 20 <-- After pressing Ctrl+Z
tsh> jobs
[1] (12345) Stopped /bin/sleep 10 &
[2] (12346) Running /bin/sleep 20 &
tsh> bg %1
[1] (12345) /bin/sleep 10 &
tsh> jobs
[1] (12345) Running /bin/sleep 10 &
[2] (12346) Running /bin/sleep 20 &
tsh> /bin/echo hello
hello
tsh> quit
$
3.5 Command Line Syntax
command [arg1 arg2 ...] [< infile] [> outfile] [&]
For the basic implementation:
- Parse space-separated tokens
- Identify
&at end for background execution - Handle quoted strings as single arguments (extension)
4. Solution Architecture
4.1 High-Level Design
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ TSH ARCHITECTURE โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ MAIN LOOP โ โ
โ โ โ โ
โ โ while (1) { โ โ
โ โ print_prompt(); โ โ
โ โ cmdline = read_line(); โ โ
โ โ if (feof(stdin)) break; โ โ
โ โ eval(cmdline); โ โ
โ โ } โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ โ
โ โผ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ EVAL FUNCTION โ โ
โ โ โ โ
โ โ โโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โ Parse โโโโโโถโ Built-in? โโโโโโถโ Execute Built-in โ โ โ
โ โ โ Cmdline โ โโโโโโโโฌโโโโโโโ โ (do_builtin_cmd) โ โ โ
โ โ โโโโโโโโโโโโโโ โ No โโโโโโโโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โผ โ โ
โ โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โ FORK/EXEC CHILD โ โ โ
โ โ โ โ โ โ
โ โ โ 1. Block SIGCHLD โ โ โ
โ โ โ 2. Fork โ โ โ
โ โ โ 3. Child: setpgid, restore signals, exec โ โ โ
โ โ โ 4. Parent: add to job table โ โ โ
โ โ โ 5. Unblock SIGCHLD โ โ โ
โ โ โ 6. If FG: wait for job to complete โ โ โ
โ โ โ โ โ โ
โ โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ SIGNAL HANDLERS โ โ
โ โ โ โ
โ โ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โ sigchld_handler โ โ sigint_handler โ โ sigtstp_handler โ โ โ
โ โ โ โ โ โ โ โ โ โ
โ โ โ Reap children, โ โ Forward SIGINT โ โ Forward SIGTSTP โ โ โ
โ โ โ update job โ โ to foreground โ โ to foreground โ โ โ
โ โ โ table โ โ process group โ โ process group โ โ โ
โ โ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโ โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ JOB TABLE โ โ
โ โ โ โ
โ โ struct job_t jobs[MAXJOBS]; โ โ
โ โ โ โ
โ โ Each job contains: โ โ
โ โ - pid: Process ID of job leader โ โ
โ โ - jid: Job ID (1, 2, 3, ...) โ โ
โ โ - state: UNDEF, BG, FG, ST โ โ
โ โ - cmdline: Original command line โ โ
โ โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
4.2 Job Table Design
/* Job states */
#define UNDEF 0 /* undefined */
#define FG 1 /* running in foreground */
#define BG 2 /* running in background */
#define ST 3 /* stopped */
/* Maximum jobs */
#define MAXJOBS 16
#define MAXLINE 1024
#define MAXARGS 128
struct job_t {
pid_t pid; /* job PID */
int jid; /* job ID [1, 2, ...] */
int state; /* UNDEF, BG, FG, or ST */
char cmdline[MAXLINE]; /* command line */
};
struct job_t jobs[MAXJOBS]; /* The job list */
Job table operations:
| Function | Purpose |
|---|---|
clearjob(job) |
Clear a job entry |
initjobs(jobs) |
Initialize the job list |
maxjid(jobs) |
Return largest allocated JID |
addjob(jobs, pid, state, cmdline) |
Add a job to the list |
deletejob(jobs, pid) |
Delete a job from the list |
fgpid(jobs) |
Return PID of current foreground job |
getjobpid(jobs, pid) |
Find job by PID |
getjobjid(jobs, jid) |
Find job by JID |
pid2jid(jobs, pid) |
Map PID to JID |
listjobs(jobs) |
Print the job list |
4.3 Command Parsing
/* Parse the command line and build the argv array */
int parseline(const char *cmdline, char **argv) {
static char array[MAXLINE]; /* holds local copy of command line */
char *buf = array; /* ptr that traverses command line */
char *delim; /* points to first space delimiter */
int argc; /* number of args */
int bg; /* background job? */
strcpy(buf, cmdline);
buf[strlen(buf)-1] = ' '; /* replace trailing '\n' with space */
while (*buf && (*buf == ' ')) /* ignore leading spaces */
buf++;
/* Build the argv list */
argc = 0;
while ((delim = strchr(buf, ' '))) {
argv[argc++] = buf;
*delim = '\0';
buf = delim + 1;
while (*buf && (*buf == ' ')) /* ignore spaces */
buf++;
}
argv[argc] = NULL;
if (argc == 0) /* ignore blank line */
return 1;
/* Should the job run in the background? */
if ((bg = (*argv[argc-1] == '&')) != 0) {
argv[--argc] = NULL;
}
return bg;
}
4.4 Algorithm: Waiting for Foreground Jobs
The shell must wait for a foreground job to complete (terminate or stop):
void waitfg(pid_t pid) {
sigset_t mask;
sigemptyset(&mask);
while (fgpid(jobs) == pid) {
sigsuspend(&mask); // Sleep until signal arrives
}
}
Why sigsuspend instead of pause?
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ RACE WITH pause() vs sigsuspend() โ
โ โ
โ INCORRECT (with pause()): CORRECT (with sigsuspend()): โ
โ โ
โ while (fgpid == pid) { sigset_t mask; โ
โ // <<< SIGCHLD arrives here! sigemptyset(&mask); โ
โ pause(); // Will sleep forever while (fgpid == pid) { โ
โ } // because signal was sigsuspend(&mask); โ
โ // already handled } โ
โ โ
โ sigsuspend atomically: โ
โ 1. Replaces signal mask โ
โ 2. Suspends until signal โ
โ 3. Restores original mask โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
5. Implementation Guide
5.1 Development Environment Setup
# Required tools
sudo apt-get install gcc gdb make
# Create project structure
mkdir -p tsh/{src,tests}
cd tsh
# Create Makefile
cat > Makefile << 'EOF'
CC = gcc
CFLAGS = -Wall -Wextra -g -O0
TARGET = tsh
$(TARGET): tsh.c
$(CC) $(CFLAGS) -o $@ $<
clean:
rm -f $(TARGET) core.*
test: $(TARGET)
./sdriver.pl -t trace01.txt -s ./$(TARGET) -a "-p"
.PHONY: clean test
EOF
5.2 Project Structure
tsh/
โโโ src/
โ โโโ tsh.c # Main shell implementation
โโโ tests/
โ โโโ trace01.txt # Test: proper termination
โ โโโ trace02.txt # Test: builtin quit
โ โโโ trace03.txt # Test: foreground jobs
โ โโโ trace04.txt # Test: background jobs
โ โโโ trace05.txt # Test: jobs command
โ โโโ ...
โโโ sdriver.pl # Test driver script
โโโ Makefile
โโโ README.md
5.3 Implementation Phases
Phase 1: Foundation (Days 1-3)
Goals:
- Set up project skeleton with main loop
- Implement basic command parsing
- Execute simple foreground commands
Tasks:
- Create tsh.c with main loop (prompt, read, eval)
- Implement parseline() to tokenize input
- Implement basic eval() that forks and execs
- Test with simple commands:
/bin/ls,/bin/echo hello
Checkpoint: Can run /bin/ls and see output.
// Phase 1 skeleton
int main() {
char cmdline[MAXLINE];
// Install signal handlers (empty for now)
while (1) {
printf("tsh> ");
if (fgets(cmdline, MAXLINE, stdin) == NULL) {
printf("\n");
exit(0);
}
eval(cmdline);
}
return 0;
}
void eval(char *cmdline) {
char *argv[MAXARGS];
int bg = parseline(cmdline, argv);
if (argv[0] == NULL) return; // Empty line
pid_t pid = fork();
if (pid == 0) {
// Child
execvp(argv[0], argv);
printf("%s: Command not found\n", argv[0]);
exit(1);
}
// Parent
if (!bg) {
int status;
waitpid(pid, &status, 0);
}
}
Phase 2: Job Table (Days 4-6)
Goals:
- Implement job table data structure
- Add jobs for background processes
- Implement jobs built-in command
Tasks:
- Define job structure and global job array
- Implement clearjob(), initjobs(), addjob(), deletejob()
- Implement getjobpid(), getjobjid(), pid2jid()
- Implement listjobs() for jobs command
- Add background job tracking with
&
Checkpoint: Can run /bin/sleep 10 & and see it in jobs output.
// Phase 2 additions
void initjobs(struct job_t *jobs) {
for (int i = 0; i < MAXJOBS; i++)
clearjob(&jobs[i]);
}
int addjob(struct job_t *jobs, pid_t pid, int state, char *cmdline) {
if (pid < 1) return 0;
for (int i = 0; i < MAXJOBS; i++) {
if (jobs[i].pid == 0) {
jobs[i].pid = pid;
jobs[i].state = state;
jobs[i].jid = nextjid++;
strcpy(jobs[i].cmdline, cmdline);
return 1;
}
}
return 0; // No room
}
Phase 3: Signal Handlers (Days 7-10)
Goals:
- Implement SIGCHLD handler to reap children
- Implement SIGINT handler to forward to foreground
- Implement SIGTSTP handler to stop foreground
Tasks:
- Write sigchld_handler() with waitpid loop
- Write sigint_handler() to send SIGINT to foreground PGID
- Write sigtstp_handler() to send SIGTSTP to foreground PGID
- Install handlers in main() using Signal() wrapper
- Handle job state transitions in sigchld_handler()
Checkpoint: Ctrl+C terminates foreground job but not shell.
// Phase 3: Signal handlers
void sigchld_handler(int sig) {
(void)sig;
int olderrno = errno;
pid_t pid;
int status;
while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
struct job_t *job = getjobpid(jobs, pid);
if (job == NULL) continue;
if (WIFEXITED(status)) {
deletejob(jobs, pid);
} else if (WIFSIGNALED(status)) {
// Print message about termination
safe_printf("Job [%d] (%d) terminated by signal %d\n",
job->jid, pid, WTERMSIG(status));
deletejob(jobs, pid);
} else if (WIFSTOPPED(status)) {
job->state = ST;
safe_printf("Job [%d] (%d) stopped by signal %d\n",
job->jid, pid, WSTOPSIG(status));
}
}
errno = olderrno;
}
void sigint_handler(int sig) {
(void)sig;
pid_t fg_pid = fgpid(jobs);
if (fg_pid > 0) {
kill(-fg_pid, SIGINT); // Send to entire process group
}
}
Phase 4: Race Avoidance (Days 11-13)
Goals:
- Add sigprocmask to prevent races
- Implement proper process group setup
- Ensure child inherits correct signal masks
Tasks:
- Block SIGCHLD before fork
- Child: restore mask, set process group, restore signal defaults
- Parent: add job, then unblock SIGCHLD
- Test with rapidly-exiting commands
Checkpoint: No zombies even with rapid /bin/true & commands.
// Phase 4: Race-free eval
void eval(char *cmdline) {
char *argv[MAXARGS];
int bg = parseline(cmdline, argv);
sigset_t mask, prev_mask;
if (argv[0] == NULL) return;
if (builtin_cmd(argv)) return;
// Block SIGCHLD
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, &prev_mask);
pid_t pid = fork();
if (pid == 0) {
// Child: restore mask FIRST
sigprocmask(SIG_SETMASK, &prev_mask, NULL);
// Create new process group
setpgid(0, 0);
// Restore default signal handlers
signal(SIGINT, SIG_DFL);
signal(SIGTSTP, SIG_DFL);
execvp(argv[0], argv);
printf("%s: Command not found\n", argv[0]);
exit(1);
}
// Parent: add job while SIGCHLD blocked
addjob(jobs, pid, bg ? BG : FG, cmdline);
// Unblock SIGCHLD
sigprocmask(SIG_SETMASK, &prev_mask, NULL);
if (!bg) {
waitfg(pid);
} else {
printf("[%d] (%d) %s", pid2jid(pid), pid, cmdline);
}
}
Phase 5: fg/bg Commands (Days 14-17)
Goals:
- Implement fg built-in command
- Implement bg built-in command
- Handle job ID parsing (%1, %2)
Tasks:
- Parse job argument (JID or PID)
- Send SIGCONT to process group
- For fg: change state and wait
- For bg: change state and return
Checkpoint: Can stop job with Ctrl+Z, resume with bg, bring to fg.
// Phase 5: fg and bg commands
void do_bgfg(char **argv) {
struct job_t *job;
char *id = argv[1];
if (id == NULL) {
printf("%s command requires PID or %%jobid argument\n", argv[0]);
return;
}
// Parse job/process ID
if (id[0] == '%') {
int jid = atoi(&id[1]);
job = getjobjid(jobs, jid);
if (job == NULL) {
printf("%%%d: No such job\n", jid);
return;
}
} else {
pid_t pid = atoi(id);
job = getjobpid(jobs, pid);
if (job == NULL) {
printf("(%d): No such process\n", pid);
return;
}
}
// Send SIGCONT to process group
kill(-(job->pid), SIGCONT);
if (!strcmp(argv[0], "fg")) {
job->state = FG;
waitfg(job->pid);
} else {
job->state = BG;
printf("[%d] (%d) %s", job->jid, job->pid, job->cmdline);
}
}
Phase 6: Polish & Testing (Days 18-21)
Goals:
- Comprehensive testing
- Fix edge cases
- Code cleanup
Tasks:
- Run all provided trace tests
- Stress test with rapid commands
- Test signal sequences (multiple Ctrl+C, Ctrl+Z)
- Fix any remaining issues
- Add comments and documentation
Checkpoint: All trace tests pass; no zombies under stress.
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Test individual functions | parseline, addjob, getjobpid |
| Trace Tests | Test shell behavior sequences | Execute commands, check output |
| Stress Tests | Find race conditions | Rapid job creation/destruction |
| Signal Tests | Verify signal handling | Interrupt sequences |
6.2 Trace Test Examples
trace01.txt - Proper termination:
# Trace file 1: Properly terminate on EOF
CLOSE
WAIT
trace02.txt - Built-in quit command:
# Trace file 2: Process builtin quit command
quit
WAIT
trace03.txt - Foreground jobs:
# Trace file 3: Run a foreground job
/bin/echo tsh> quit
WAIT
trace04.txt - Background jobs:
# Trace file 4: Run a background job
/bin/echo -e tsh> ./myspin 1 \046
WAIT
trace05.txt - Jobs command:
# Trace file 5: Process jobs builtin command
/bin/echo -e tsh> ./myspin 2 \046
/bin/echo -e tsh> ./myspin 3 \046
/bin/echo tsh> jobs
WAIT
6.3 Stress Test: Zombie Hunter
#!/bin/bash
# Run many background jobs and check for zombies
for i in {1..100}; do
echo "/bin/true &" | ./tsh
done
# Check for zombie processes
ps aux | grep tsh | grep -v grep
6.4 Signal Stress Test
#!/bin/bash
# Rapidly send signals to test handler robustness
./tsh &
TSH_PID=$!
for i in {1..50}; do
echo "/bin/sleep 1 &" > /dev/pts/X # Send to tsh's terminal
kill -SIGINT $TSH_PID
kill -SIGTSTP $TSH_PID
done
wait $TSH_PID
6.5 Testing Process Group Handling
/* Test program: myspin.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char **argv) {
int secs = 1;
if (argc > 1) secs = atoi(argv[1]);
printf("myspin running (pid=%d, pgid=%d)\n", getpid(), getpgrp());
sleep(secs);
return 0;
}
7. Common Pitfalls
7.1 Race Conditions
| Problem | Symptom | Solution |
|---|---|---|
| Reap before add | Zombie processes | Block SIGCHLD before fork, unblock after addjob |
| Delete during iteration | Corrupted job table | Use separate deletion flag or copy data |
| Signal coalescing | Missed child deaths | Use while loop in sigchld_handler |
| waitfg race | Infinite hang | Use sigsuspend instead of pause |
7.2 Zombie Leaks
Causes:
- Forgetting to reap children
- Only calling waitpid once (signals coalesce!)
- Not using WNOHANG option
- Race condition: child exits before addjob
Detection:
# Watch for zombies
watch -n 1 'ps aux | grep defunct'
Prevention:
// CORRECT: Loop until no more children
while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
// Process each child
}
// INCORRECT: Only reap one child
if ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
// May miss other children!
}
7.3 Signal Handler Mistakes
| Mistake | Consequence | Fix |
|---|---|---|
| Calling printf | Undefined behavior, possible deadlock | Use write() |
| Calling malloc | Heap corruption | Use static buffers |
| Not saving errno | Mysterious failures | Save/restore errno |
| Not using volatile | Compiler optimization issues | Declare shared variables volatile |
7.4 Process Group Issues
// WRONG: Child in same group as shell
pid = fork();
if (pid == 0) {
execvp(argv[0], argv); // Still in shell's group!
}
// RIGHT: Child in its own group
pid = fork();
if (pid == 0) {
setpgid(0, 0); // Create new process group
execvp(argv[0], argv);
}
7.5 Terminal Ownership Mistakes
// WRONG: Shell gives away terminal without taking it back
// After fg job exits, shell can't read input!
// RIGHT: Shell reclaims terminal after foreground job
void waitfg(pid_t pid) {
// Wait for job to finish
while (fgpid(jobs) == pid) {
sigsuspend(&mask);
}
// Reclaim terminal control if needed
tcsetpgrp(STDIN_FILENO, getpgrp());
}
8. Extensions
8.1 Beginner Extensions
- Command history: Use up/down arrows to recall commands
- Tab completion: Complete partial command names
- ~/ expansion: Expand ~ to home directory
- Colorized output: Color prompt and error messages
8.2 Intermediate Extensions
- I/O redirection: Support
<,>,>> - Pipelines: Support
cmd1 | cmd2 | cmd3 - Environment variables: Implement export, support $VAR
- Wildcard expansion: Expand
*.cto matching files - Aliases: Support alias/unalias commands
8.3 Advanced Extensions
- Here documents: Support
<<EOF - Command substitution: Support
$(cmd)or`cmd` - Scripting: Execute shell script files
- Signal handling configuration: Support trap command
- Job control for pipelines: Treat pipeline as single job
- Subshells: Support
(cmd1; cmd2)
8.4 Implementation Hints for Extensions
I/O Redirection:
// In child, before exec:
if (input_file) {
int fd = open(input_file, O_RDONLY);
dup2(fd, STDIN_FILENO);
close(fd);
}
if (output_file) {
int fd = open(output_file, O_WRONLY | O_CREAT | O_TRUNC, 0644);
dup2(fd, STDOUT_FILENO);
close(fd);
}
Pipelines:
// For: cmd1 | cmd2
int pipefd[2];
pipe(pipefd);
pid1 = fork();
if (pid1 == 0) {
dup2(pipefd[1], STDOUT_FILENO);
close(pipefd[0]);
close(pipefd[1]);
exec(cmd1);
}
pid2 = fork();
if (pid2 == 0) {
dup2(pipefd[0], STDIN_FILENO);
close(pipefd[0]);
close(pipefd[1]);
exec(cmd2);
}
close(pipefd[0]);
close(pipefd[1]);
// Wait for both children
9. Real-World Connections
9.1 Industry Applications
- Container runtimes (Docker, containerd): Use same process/signal primitives
- Process supervisors (systemd, supervisord): Job control at scale
- Terminal emulators (iTerm, gnome-terminal): Understand session/terminal relationships
- CI/CD systems (Jenkins, GitHub Actions): Process management for build jobs
- Debugging tools (gdb, strace): Understand how debuggers attach to processes
9.2 Related Open Source Projects
| Project | What to Learn |
|---|---|
| bash | Production-quality shell with all features |
| fish | Modern shell with different design philosophy |
| zsh | Extensible shell with rich scripting |
| dash | Minimal POSIX shell, great for studying |
| busybox ash | Embedded shell, resource-conscious |
9.3 Interview Relevance
This project prepares you for questions like:
- โExplain how job control works in Unixโ
- โWhat happens when you press Ctrl+C?โ
- โHow would you prevent race conditions in a signal handler?โ
- โWhatโs the difference between a process group and a session?โ
- โHow does a shell know when a child has terminated?โ
- โWhy canโt you call printf from a signal handler?โ
- โExplain the relationship between fork, exec, and waitโ
10. Resources
10.1 Essential Reading
- CS:APP Chapter 8: โExceptional Control Flowโ - Signals, process control
- CS:APP Chapter 12: โConcurrent Programmingโ - Race conditions, synchronization
- Advanced Programming in the Unix Environment by Stevens: Chapter 9 (Process Relationships)
- The Linux Programming Interface by Kerrisk: Chapters 34-37
10.2 Man Pages
man 2 fork
man 2 execve
man 2 waitpid
man 2 signal
man 2 sigaction
man 2 sigprocmask
man 2 sigsuspend
man 2 kill
man 2 setpgid
man 3 tcgetpgrp
man 7 signal
man 7 credentials
10.3 Video Resources
- MIT 6.033 lectures on process management
- โWriting a Unix Shellโ series on YouTube
- โSignals in Cโ tutorials
10.4 Tools for Debugging
# Watch process groups
ps -o pid,ppid,pgid,sid,comm
# Trace system calls
strace -f ./tsh
# Trace signals
strace -e trace=signal ./tsh
# Check for zombies
ps aux | grep defunct
10.5 Related Projects in This Series
- Previous: P11 (Signals + Processes Sandbox) - Foundation for this project
- Next: P15 (Robust Unix I/O Toolkit) - Builds on process I/O concepts
- See also: P16 (Concurrency Workbench) - Extends race avoidance patterns
11. Self-Assessment Checklist
Before considering this project complete, verify:
Understanding
- I can explain what a process group is and why shells use them
- I understand why SIGCHLD handlers must use a while loop
- I can explain the race condition between fork() and addjob()
- I know which functions are async-signal-safe and why
- I can draw the signal flow diagram for Ctrl+C
- I understand the difference between SIG_IGN and SIG_DFL
Implementation
- Shell runs foreground commands and waits for completion
- Background jobs work with
&and appear in jobs output - Ctrl+C terminates foreground job but not shell
- Ctrl+Z stops foreground job but not shell
- fg/bg commands work with %JID syntax
- No zombie processes accumulate
- No race conditions under stress testing
Testing
- All provided trace tests pass
- Stress test with rapid job creation shows no zombies
- Signal sequences (Ctrl+C, Ctrl+Z, fg, bg) work correctly
- Error cases handled gracefully (bad commands, bad job IDs)
Code Quality
- Signal handlers are async-signal-safe
- errno is saved and restored in handlers
- sigprocmask used correctly to prevent races
- Code is well-commented and readable
- No memory leaks
12. Completion Criteria
Minimum Viable Completion:
- Execute external commands in foreground
- Basic job table with background jobs
- jobs built-in command works
- SIGCHLD handler reaps children
Full Completion:
- All built-in commands work (quit, jobs, bg, fg)
- All signal handlers work (SIGCHLD, SIGINT, SIGTSTP)
- No race conditions (verified by stress testing)
- No zombies under any circumstances
- All provided trace tests pass
Excellence (Going Above & Beyond):
- I/O redirection support
- Pipeline support
- Environment variable handling
- Command history with readline
- Shell scripting support
13. Real World Outcome
When you complete this project, you will have a fully functional interactive shell with job control. Here is exactly what running your shell will look like:
Basic Shell Session
$ ./tsh
tsh> /bin/echo hello world
hello world
tsh> /bin/ls -la
total 64
drwxr-xr-x 4 user user 4096 Dec 26 10:00 .
drwxr-xr-x 12 user user 4096 Dec 26 09:55 ..
-rwxr-xr-x 1 user user 16432 Dec 26 10:00 tsh
-rw-r--r-- 1 user user 8234 Dec 26 09:55 tsh.c
-rw-r--r-- 1 user user 2048 Dec 26 09:50 Makefile
tsh> quit
$
Background Jobs and Job Listing
tsh> /bin/sleep 30 &
[1] (45231) /bin/sleep 30 &
tsh> /bin/sleep 60 &
[2] (45232) /bin/sleep 60 &
tsh> /bin/sleep 90 &
[3] (45233) /bin/sleep 90 &
tsh> jobs
[1] (45231) Running /bin/sleep 30 &
[2] (45232) Running /bin/sleep 60 &
[3] (45233) Running /bin/sleep 90 &
tsh>
Signal Handling (Ctrl+C and Ctrl+Z)
tsh> /bin/sleep 100
^C
Job [1] (45234) terminated by signal 2
tsh> /bin/sleep 100
^Z
Job [2] (45235) stopped by signal 20
tsh> jobs
[2] (45235) Stopped /bin/sleep 100
tsh>
Job Control with fg and bg
tsh> /bin/sleep 100 &
[1] (45240) /bin/sleep 100 &
tsh> /bin/sleep 200 &
[2] (45241) /bin/sleep 200 &
tsh> /bin/cat
^Z
Job [3] (45242) stopped by signal 20
tsh> jobs
[1] (45240) Running /bin/sleep 100 &
[2] (45241) Running /bin/sleep 200 &
[3] (45242) Stopped /bin/cat
tsh> bg %3
[3] (45242) /bin/cat
tsh> jobs
[1] (45240) Running /bin/sleep 100 &
[2] (45241) Running /bin/sleep 200 &
[3] (45242) Running /bin/cat &
tsh> fg %1
(shell waits for sleep 100 to complete...)
^C
Job [1] (45240) terminated by signal 2
tsh>
Verbose Mode for Debugging
$ ./tsh -v
tsh> /bin/echo test
[VERBOSE] Parsed: argc=2, argv[0]=/bin/echo, argv[1]=test, bg=0
[VERBOSE] Not a builtin, forking...
[VERBOSE] Child (45250): setpgid(0, 0), restoring signals, execvp
[VERBOSE] Parent: blocking SIGCHLD
[VERBOSE] Parent: addjob(45250, FG, "/bin/echo test")
[VERBOSE] Parent: unblocking SIGCHLD
[VERBOSE] Parent: waitfg(45250) - waiting for foreground job
test
[VERBOSE] sigchld_handler: received SIGCHLD
[VERBOSE] sigchld_handler: waitpid returned 45250, WIFEXITED, status=0
[VERBOSE] sigchld_handler: deletejob(45250)
[VERBOSE] waitfg: foreground job 45250 no longer running
tsh>
Process Group Demonstration
tsh> /bin/ps -o pid,pgid,ppid,comm &
[1] (45260) /bin/ps -o pid,pgid,ppid,comm &
PID PGID PPID COMMAND
45260 45260 45200 ps
45200 45200 45199 tsh
tsh>
Notice: The child (ps) has PGID=45260, which equals its own PID. This is because the shell called setpgid(0, 0) to put the child in its own process group, separate from the shell.
Stress Testing Output
$ ./stress_test.sh
Running stress test: rapid background job creation/reaping...
Creating 100 rapid background jobs...
[โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ] 100%
Checking for zombies: ps aux | grep defunct
(no output - no zombies detected)
Signal bombardment test (50 SIGINT/SIGTSTP pairs)...
[โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ] 100%
Shell survived signal bombardment.
Race condition test (1000 fork-exit-reap cycles)...
[โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ] 100%
No zombies detected after race condition test.
STRESS TEST PASSED
14. The Core Question Youโre Answering
โHow does a Unix shell orchestrate multiple processes, handle signals correctly, and provide job control without race conditions or resource leaks?โ
The shell is the userโs interface to the operating system. Building one forces you to understand process groups, sessions, terminal control, signal handling, and race condition prevention - all concepts that are fundamental to systems programming. Every systems programmer should build a shell at least once to truly understand Unix process control.
15. Concepts You Must Understand First
Before starting this project, ensure you have a solid grasp of these foundational concepts:
| Concept | Where to Learn | Why It Matters |
|---|---|---|
| Process creation (fork/exec) | CS:APP 8.2-8.4 | Every command execution requires fork+exec |
| Process termination and reaping | CS:APP 8.4 | Must reap children to avoid zombies |
| Signal handling (sigaction) | CS:APP 8.5 | Handling SIGCHLD, SIGINT, SIGTSTP |
| Signal blocking (sigprocmask) | CS:APP 8.5.4 | Preventing race conditions |
| Async-signal-safe functions | CS:APP 8.5.5 | What you can call in handlers |
| Process groups (setpgid, getpgrp) | APUE Ch. 9 | Job control requires process groups |
| Terminal control (tcsetpgrp) | APUE Ch. 9 | Foreground process group management |
| waitpid options (WNOHANG, WUNTRACED) | man waitpid | Proper child state detection |
16. Questions to Guide Your Design
Work through these questions before writing any code:
-
Foreground Waiting: How will the shell wait for a foreground job to complete? Why is
while(fgpid(jobs) == pid) sleep(1);a bad approach? What should you use instead? -
Process Group Setup: Why does the child need to call setpgid(0, 0) to create its own process group? What would happen if it stayed in the shellโs process group?
-
Terminal Signals: When the user presses Ctrl+C, who receives SIGINT? How do you ensure only the foreground job receives it, not the shell itself?
-
Race Prevention: Draw a timeline showing how a fast-exiting child could cause a race between addjob() and the SIGCHLD handler. How does sigprocmask prevent this?
-
Handler Design: Should the SIGCHLD handler just reap children, or should it also print messages? What constraints does async-signal-safety impose?
-
Job ID Management: How will you assign job IDs? What happens when job 1 finishes but jobs 2 and 3 are still running? Does job 4 get ID 1 or ID 4?
17. Thinking Exercise
Before coding, trace through this scenario by hand:
The user runs the following commands in your shell:
tsh> /bin/sleep 100 &
tsh> /bin/sleep 200 &
tsh> fg %1
(user presses Ctrl+Z after 5 seconds)
tsh> bg %1
tsh> jobs
For each step, write down:
- What PIDs/PGIDs are created
- What the job table looks like
- Which process group is foreground
- What signals are sent/received
Solution (click to expand)
Step 1: /bin/sleep 100 &
- Shell forks, child PID = 5001
- Child: setpgid(0, 0), so PGID = 5001
- Parent: addjob(5001, BG, โsleep 100โ)
- Job table: [{jid=1, pid=5001, state=BG, cmd=โsleep 100โ}]
- Foreground PGID: shellโs own (say, 5000)
- Output:
[1] (5001) /bin/sleep 100 &
Step 2: /bin/sleep 200 &
- Shell forks, child PID = 5002
- Child: setpgid(0, 0), so PGID = 5002
- Parent: addjob(5002, BG, โsleep 200โ)
- Job table: [{1, 5001, BG}, {2, 5002, BG}]
- Output:
[2] (5002) /bin/sleep 200 &
Step 3: fg %1
- Shell looks up job 1: PID=5001
- Shell changes job state: state = FG
- Shell calls tcsetpgrp(STDIN, 5001) - job 1 is now foreground
- Shell sends SIGCONT to -5001 (though already running)
- Shell calls waitfg(5001) - waits for job to finish/stop
- User presses Ctrl+Z after 5 seconds
- Terminal sends SIGTSTP to PGID 5001
- Process 5001 stops
- Shellโs SIGCHLD handler runs:
- waitpid returns 5001 with WIFSTOPPED
- Updates job state to ST
- Prints:
Job [1] (5001) stopped by signal 20
- waitfg returns (fgpid is no longer 5001)
- Shell reclaims terminal: tcsetpgrp(STDIN, 5000)
Step 4: bg %1
- Shell looks up job 1: PID=5001, state=ST
- Shell sends SIGCONT to -5001
- Shell changes job state: state = BG
- Output:
[1] (5001) /bin/sleep 100 & - Shellโs SIGCHLD handler might run with WIFCONTINUED
Step 5: jobs
- Job table: [{1, 5001, BG}, {2, 5002, BG}]
- Output:
[1] (5001) Running /bin/sleep 100 & [2] (5002) Running /bin/sleep 200 &
18. The Interview Questions Theyโll Ask
After completing this project, you should be able to confidently answer these questions:
- โExplain exactly what happens when the user presses Ctrl+C in a shell.โ
- Terminal driver sends SIGINT to foreground process group
- Shell must be in different group to survive
- Shell may print message about terminated job
- โWhat is a process group and why do shells use them?โ
- Collection of processes for signal delivery
- Shells put each job in its own group
- Allows Ctrl+C to kill entire pipeline, not just one process
- โHow do you prevent the race condition between fork() and adding a job to the job table?โ
- Block SIGCHLD before fork
- Add job to table
- Unblock SIGCHLD
- Handler can now safely find the job
- โWhy canโt you just use pause() or sleep() to wait for a foreground job?โ
- Race: signal might arrive before pause/sleep
- Use sigsuspend() which atomically unblocks and waits
- โWhatโs the difference between SIGINT and SIGTSTP? How should a shell handle each?โ
- SIGINT: terminate (default); shell forwards to fg job
- SIGTSTP: stop (suspend); shell forwards to fg job
- Shell itself ignores both to survive
- โWhy must the child call setpgid(0, 0) before exec?โ
- To create its own process group
- So Ctrl+C only affects the job, not the shell
- Must happen before exec to avoid race with signal delivery
19. Hints in Layers
Use these hints progressively if you get stuck.
Hint Layer 1: Getting Started
- Start without signal handlers. Just fork, exec, waitpid (blocking) for foreground jobs.
- Add background jobs next: skip the waitpid, print โ[n] (pid) cmdโ
- Test with
/bin/echo,/bin/ls,/bin/sleep 5 &
Hint Layer 2: Signal Handling
// Safe signal handler template
void sigchld_handler(int sig) {
int olderrno = errno;
pid_t pid;
int status;
while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
// Handle status, update job table
}
errno = olderrno;
}
- For SIGINT/SIGTSTP handlers: find foreground job, send signal to its process group
Hint Layer 3: Race Prevention
// Safe fork pattern
sigset_t mask, prev;
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, &prev); // Block
pid = fork();
if (pid == 0) {
sigprocmask(SIG_SETMASK, &prev, NULL); // Restore in child
setpgid(0, 0);
// ... exec ...
}
addjob(pid, ...); // Safe: SIGCHLD blocked
sigprocmask(SIG_SETMASK, &prev, NULL); // Unblock
Hint Layer 4: Common Bugs
- Forgetting to restore default signal handlers in child (SIGINT, SIGTSTP)
- Using pause() instead of sigsuspend() in waitfg() - causes hangs
- Not using
-pid(negative) with kill() to send to process group - Printing job messages from signal handler (use write(), not printf())
- Not handling WIFCONTINUED in SIGCHLD handler (for bg command)
20. Books That Will Help
| Topic | Book | Specific Chapters |
|---|---|---|
| Process control fundamentals | CS:APP (3rd ed.) | Chapter 8 (entire chapter) |
| Signals and handlers | CS:APP (3rd ed.) | Chapter 8.5 |
| Job control architecture | Advanced Programming in the Unix Environment (Stevens) | Chapter 9 |
| Terminal I/O | Advanced Programming in the Unix Environment (Stevens) | Chapter 18 |
| Signal handling best practices | The Linux Programming Interface (Kerrisk) | Chapters 20-22 |
| Process groups and sessions | The Linux Programming Interface (Kerrisk) | Chapter 34 |
| Shell implementation | Operating Systems: Three Easy Pieces (OSTEP) | Chapters on processes |
| Practical shell building | Your Unix: The Ultimate Guide (Sumitabha Das) | Chapters on shell internals |
This guide was expanded from CSAPP_3E_DEEP_LEARNING_PROJECTS.md. For the complete learning path, see the project index.