Project 9: Job Control System

Create job control with foreground/background jobs and a job table.

Quick Reference

Attribute Value
Difficulty Level 4: Expert (The Systems Architect)
Time Estimate 2 weeks
Main Programming Language C
Alternative Programming Languages Rust, Go
Coolness Level Level 5: Pure Magic (Super Cool)
Business Potential 1. The “Resume Gold” (Educational/Personal Brand)
Prerequisites process groups, signals, pipelines
Key Topics tcsetpgrp, job table, fg/bg

1. Learning Objectives

By completing this project, you will:

  1. Explain and implement tcsetpgrp in the context of a shell.
  2. Build a working job control system that matches the project specification.
  3. Design tests that validate correctness and edge cases.
  4. Document design decisions, trade-offs, and limitations.

2. All Theory Needed (Per-Concept Breakdown)

Job Control, Process Groups, and Terminal Control

Fundamentals Job control lets a shell manage foreground and background jobs, stop and resume processes, and deliver signals to groups of processes. It is built on process groups and terminal control. A job is typically a pipeline; all processes in that pipeline share a process group ID (PGID). The terminal sends signals like SIGINT and SIGTSTP to the foreground process group. The shell must therefore move jobs between foreground and background and update the terminal’s foreground group with tcsetpgrp().

Deep Dive into the concept When you run a pipeline in the foreground, the shell should place all pipeline processes into a new process group and then set that group as the foreground process group for the terminal. This gives the job ownership of the terminal: Ctrl+C and Ctrl+Z are delivered to the job, not to the shell. When the job completes or stops, the shell regains the terminal by setting its own process group as the foreground group again. If the shell fails to reclaim the terminal, it can no longer read input correctly.

Background jobs are similar, but the shell does not give them terminal control. Instead, the job runs with its own process group in the background. If the job attempts to read from the terminal, it may receive SIGTTIN, which typically stops it. Proper job control must handle this by either preventing background jobs from reading or by catching the stop and reporting it to the user. The shell keeps a job table mapping job IDs to process groups, PIDs, command lines, and state (running, stopped, done). Built-ins like jobs, fg, and bg inspect and manipulate this table.

Stopping and resuming jobs requires more signal coordination. When a user presses Ctrl+Z, the terminal sends SIGTSTP to the foreground process group. The shell should detect that the job stopped (via waitpid with WUNTRACED), mark it as stopped, and return control to the user. The fg command then sends SIGCONT to the job’s process group and moves it into the foreground, updating the terminal’s foreground PGID. This interplay between signals, waitpid flags, and terminal control is the essence of job control.
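
The fg interplay described above can be sketched as a small helper. The name fg_job and the tty_fd parameter are illustrative assumptions, and error handling is omitted:

```c
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>

/* Sketch of the fg sequence for a stopped job whose process group is
 * `pgid`. fg_job is a hypothetical helper; error handling is omitted. */
int fg_job(pid_t pgid, int tty_fd)
{
    int status = 0;
    signal(SIGTTOU, SIG_IGN);            /* shell must not stop while doing this */
    if (isatty(tty_fd))
        tcsetpgrp(tty_fd, pgid);         /* 1. hand the terminal to the job   */
    kill(-pgid, SIGCONT);                /* 2. resume every process in it     */
    waitpid(-pgid, &status, WUNTRACED);  /* 3. block until it exits or stops  */
    if (isatty(tty_fd))
        tcsetpgrp(tty_fd, getpgrp());    /* 4. take the terminal back         */
    return status;
}
```

Note the negative PID arguments: kill(-pgid, ...) and waitpid(-pgid, ...) address the whole process group, not a single process.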

A critical detail is that process groups must be set in both the parent and the child to avoid race conditions. The typical pattern is: after fork(), in the child call setpgid(0, pgid); in the parent, call setpgid(child_pid, pgid) as well. This ensures that even if the child execs quickly, it ends up in the correct process group. Only then should the shell call tcsetpgrp() for foreground jobs. The shell should also ignore SIGTTOU when changing terminal control to avoid being stopped while attempting to manipulate the terminal.
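
A minimal sketch of that launch pattern follows; launch_foreground is a hypothetical helper name, and error handling is trimmed for clarity:

```c
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>

/* Sketch of the race-free launch pattern: setpgid in BOTH parent and
 * child, then hand over the terminal. Error handling trimmed. */
pid_t launch_foreground(char *const argv[], int tty_fd)
{
    signal(SIGTTOU, SIG_IGN);             /* don't get stopped while moving the tty */
    pid_t pid = fork();
    if (pid == 0) {                       /* child */
        setpgid(0, 0);                    /* child joins its own new group...  */
        if (isatty(tty_fd))
            tcsetpgrp(tty_fd, getpgrp()); /* ...and takes the terminal         */
        signal(SIGINT,  SIG_DFL);         /* restore default job-control signals */
        signal(SIGTSTP, SIG_DFL);
        signal(SIGTTOU, SIG_DFL);
        execvp(argv[0], argv);
        _exit(127);
    }
    setpgid(pid, pid);                    /* parent sets it too: no race window */
    if (isatty(tty_fd))
        tcsetpgrp(tty_fd, pid);           /* foreground job owns the terminal  */
    return pid;
}
```

After waitpid() reports that the job exited or stopped, the shell must call tcsetpgrp(tty_fd, getpgrp()) to take the terminal back.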

Job control also intersects with pipelines. A pipeline is a single job, but it includes multiple PIDs. The shell must track all of them to know when the job is done. Some shells consider the job complete when the last process exits; others require all processes to finish. Using waitpid in a loop with WNOHANG and WUNTRACED allows you to update job state incrementally without blocking the shell.

How this fits into the project Job control is the defining feature of interactive shells and depends on correct pipeline and signal handling.

Definitions & key terms

  • Process group (PGID): A set of related processes.
  • Foreground job: Process group that controls the terminal.
  • Background job: Process group not controlling the terminal.
  • tcsetpgrp(): Set terminal foreground process group.

Mental model diagram

Shell PGID (foreground) --tcsetpgrp--> Job PGID
Ctrl+C -> terminal -> SIGINT -> Job PGID

How it works (step-by-step)

  1. Create a new process group for a pipeline.
  2. Set the job PGID in both parent and children.
  3. If foreground, call tcsetpgrp to hand over terminal.
  4. Wait for job completion or stop; update job table.
  5. Restore shell as foreground group when job ends.

Minimal concrete example

setpgid(child_pid, child_pid);   /* child leads its own new process group */
tcsetpgrp(shell_tty, child_pid); /* hand the terminal to the job */

Common misconceptions

  • “Background jobs can read the terminal” -> they are often stopped by SIGTTIN.
  • “Job == PID” -> jobs are process groups, often multiple PIDs.
  • “Only parent sets PGID” -> parent and child should both set it.

Check-your-understanding questions

  1. Why must the shell regain terminal control after a job finishes?
  2. What signal stops a background job that reads stdin?
  3. Why set process group IDs in both parent and child?

Check-your-understanding answers

  1. Otherwise the shell cannot read input from the terminal.
  2. SIGTTIN.
  3. To avoid races if the child execs before parent sets PGID.

Real-world applications

  • Interactive shells and terminal job management.
  • Terminal multiplexers like tmux.

Where you’ll apply it

  • In this project: see §4.1 High-Level Design and §5.10 Phase 2.
  • Also used in: None

References

  • “Advanced Programming in the UNIX Environment” (job control).
  • POSIX Terminal Interfaces.

Key insights Job control is process group control plus careful terminal ownership.

Summary A shell with job control can stop, resume, and manage multiple running programs without losing the terminal.

Homework/Exercises to practice the concept

  1. Create a pipeline and set a common process group for all children.
  2. Implement fg to resume a stopped job.
  3. Print job table updates when jobs stop or finish.

Solutions to the homework/exercises

  1. Use setpgid in both parent and child for each process.
  2. Send SIGCONT and call tcsetpgrp before waiting.
  3. Track job states and update on waitpid events.

Signals, Handlers, and Asynchronous Events

Fundamentals Signals are asynchronous notifications delivered to processes to indicate events like interrupts (Ctrl+C), terminal stop (Ctrl+Z), or child termination (SIGCHLD). A shell must carefully manage signal handling so that interactive control works as expected. Typically, the parent shell ignores certain signals (SIGINT, SIGTSTP) so it doesn’t die when the user presses Ctrl+C, while child processes restore default handlers so they can be interrupted. Correct signal behavior is essential for a usable shell.

Deep Dive into the concept Signals are delivered by the kernel and can interrupt normal control flow. In a shell, this is both powerful and dangerous. For interactive use, the shell should ignore SIGINT and SIGQUIT while idle so that Ctrl+C does not kill the shell itself. However, when running a foreground job, the terminal driver delivers SIGINT to the foreground process group, which should be the job, not the shell. This is why the shell must set process groups and manage terminal control. The shell should also handle SIGCHLD to detect when background jobs exit or stop. A SIGCHLD handler can reap children with waitpid(-1, WNOHANG) and update job state; however, signal handlers must be async-signal-safe, so any complex work should be deferred to the main loop using a self-pipe or flag.

Signal dispositions are inherited across fork() and exec(). If the parent ignores SIGINT, the child would also ignore it unless it resets to default. The correct pattern is: in the parent, ignore SIGINT/SIGTSTP; after fork(), in the child, restore default handlers before execve(). This ensures that interactive interrupts apply to the child program. Similarly, if the shell installs a SIGCHLD handler, the child should either reset it or ensure it does not interfere with program behavior.
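
A sketch of that reset pattern, with hypothetical helper names:

```c
#include <signal.h>

/* Sketch: dispositions survive fork() and exec(), so the child must undo
 * the shell's "ignore" settings before exec. Helper names are illustrative. */
void shell_signals(void)                 /* parent shell, at startup */
{
    signal(SIGINT,  SIG_IGN);
    signal(SIGQUIT, SIG_IGN);
    signal(SIGTSTP, SIG_IGN);
}

void child_signals(void)                 /* child, just before execve() */
{
    signal(SIGINT,  SIG_DFL);
    signal(SIGQUIT, SIG_DFL);
    signal(SIGTSTP, SIG_DFL);
}
```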

Another subtlety is the signal mask. Signals can be blocked temporarily using sigprocmask. Shells often block SIGCHLD during critical sections (like updating the job table) to avoid race conditions. Without this, you can miss a child exit and leave a zombie or display a stale job status. The right approach is to block SIGCHLD, update shared data structures, then unblock and handle pending signals. This is a common pattern in robust shells.

Signal handling must also consider system calls that are interrupted. read() on stdin may return EINTR when a signal arrives. An interactive shell should treat this as a normal condition, maybe redisplay the prompt or re-read input. Using sigaction with SA_RESTART can reduce interruptions for some syscalls, but you must understand which calls will restart and which will not.
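
A common retry wrapper for interrupted reads, sketched here under the name read_retry (an assumption, not a standard API):

```c
#include <errno.h>
#include <unistd.h>

/* Sketch: retry a read() that was interrupted by a signal. */
ssize_t read_retry(int fd, void *buf, size_t n)
{
    ssize_t r;
    do {
        r = read(fd, buf, n);
    } while (r == -1 && errno == EINTR); /* interrupted, not an error: retry */
    return r;
}
```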

Finally, signal behavior influences scripts. In a script, a SIGINT should generally terminate the script unless handled. Some shells provide trap to install user handlers. If you plan to implement trap, you need to record handlers in shell state and run them when signals are received. Even without full trap support, you should at least ensure that signals terminate child processes as expected.

How this fits into the project Signals interact with job control, pipelines, and the line editor. They are central to responsive interactive behavior.

Definitions & key terms

  • Signal: Asynchronous notification delivered to a process.
  • SIGINT: Interrupt (Ctrl+C).
  • SIGTSTP: Terminal stop (Ctrl+Z).
  • SIGCHLD: Child process state change.
  • Signal disposition: How a process handles a signal (default, ignore, handler).

Mental model diagram

Ctrl+C -> terminal -> SIGINT -> foreground process group

How it works (step-by-step)

  1. Parent shell sets signal dispositions (ignore SIGINT/SIGTSTP).
  2. Before exec, child restores default handlers.
  3. Shell installs SIGCHLD handler to reap background jobs.
  4. Use sigprocmask to block SIGCHLD while updating jobs.
  5. Handle EINTR or use SA_RESTART as appropriate.

Minimal concrete example

struct sigaction sa = {0};
sa.sa_handler = SIG_IGN;   /* shell ignores Ctrl+C while idle */
sigemptyset(&sa.sa_mask);
sigaction(SIGINT, &sa, NULL);

Common misconceptions

  • “Signals are synchronous” -> they interrupt at arbitrary times.
  • “Ignore in parent means ignore everywhere” -> children inherit unless reset.
  • “SIGCHLD handler can do anything” -> only async-signal-safe actions are allowed.

Check-your-understanding questions

  1. Why must children reset SIGINT to default before exec?
  2. What is the purpose of SIGCHLD in a shell?
  3. Why block SIGCHLD while updating the job table?

Check-your-understanding answers

  1. So user interrupts affect child programs rather than the shell.
  2. To detect child exits and avoid zombies.
  3. To prevent race conditions between handler and main loop.

Real-world applications

  • Interactive shells and REPLs.
  • Process supervisors reacting to child exits.
  • Terminal-based applications handling Ctrl+C.

References

  • “Advanced Programming in the UNIX Environment” (signals).
  • POSIX signal semantics.

Key insights Signals are asynchronous; robust shells tame them with careful masking and reset rules.

Summary Correct signal handling separates a responsive interactive shell from an unusable one and prevents zombies or unkillable processes.

Homework/Exercises to practice the concept

  1. Install a SIGINT handler that prints a message in the parent.
  2. Fork a child and ensure Ctrl+C terminates the child, not the parent.
  3. Implement a SIGCHLD handler that reaps children.

Solutions to the homework/exercises

  1. Use sigaction with SIG_IGN or a custom handler.
  2. Reset the handler to default in the child before exec.
  3. Call waitpid(-1, &status, WNOHANG) in the handler.

Pipes, Pipelines, and Dataflow

Fundamentals Pipes connect the stdout of one process to the stdin of another, allowing command composition. A pipeline like ls | grep foo | wc -l is not a single process but a set of processes connected by pipe file descriptors. The shell is responsible for creating the pipes, forking each process, wiring their file descriptors with dup2(), and closing unused pipe ends so that EOF is delivered correctly. Pipelines are the heart of Unix composition, so their correctness affects nearly every interactive session and script.

Deep Dive into the concept A pipeline of N commands requires N-1 pipes. Each pipe is a pair of file descriptors: a read end and a write end. The shell typically creates all pipes before forking, or it creates one pipe at a time in a loop. For each command in the pipeline, the shell forks a child and then duplicates the appropriate pipe ends to STDIN_FILENO or STDOUT_FILENO. The first command gets its stdout connected to the write end of pipe 0, the last command gets its stdin from the read end of the last pipe, and middle commands connect both stdin and stdout to adjacent pipes. After dup2(), the child must close all unused pipe ends; otherwise, pipes will stay open and readers will never see EOF.

Pipeline execution also involves process groups and job control. In an interactive shell, all processes in a pipeline should belong to the same process group so that terminal signals (like Ctrl+C) affect the entire pipeline. This means the shell must set a process group ID (PGID) for the pipeline, typically using the PID of the first process. If the pipeline runs in the foreground, the shell should hand terminal control to that process group using tcsetpgrp(), then wait for the pipeline to finish. If the pipeline runs in the background, the shell should not give terminal control and should immediately return to the prompt.

There are subtle ordering constraints. The shell should fork children in left-to-right order to preserve the expected behavior of commands that immediately read from stdin. If a later command starts before the earlier one has connected its output, you can get unexpected hangs. Similarly, if the parent closes pipe ends too early, children may inherit invalid descriptors. A robust implementation closes pipe ends in the parent after forking each child; the parent only needs to keep track of PIDs and perhaps the read end for the next iteration.
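
Putting the wiring and ordering rules together, a pipeline launcher might look like this sketch. run_pipeline is a hypothetical helper; error checks and the setpgid calls for job control are omitted for brevity:

```c
#include <unistd.h>
#include <sys/wait.h>

/* Sketch: run an N-command pipeline left to right. cmds[i] is a
 * NULL-terminated argv array; PIDs are returned through `pids`. */
void run_pipeline(char **cmds[], int n, pid_t *pids)
{
    int prev_read = -1;                        /* read end of the previous pipe */
    for (int i = 0; i < n; i++) {
        int p[2] = {-1, -1};
        if (i < n - 1)
            pipe(p);                           /* one pipe per adjacent pair */
        pids[i] = fork();
        if (pids[i] == 0) {                    /* child: wire stdin/stdout */
            if (prev_read != -1) { dup2(prev_read, 0); close(prev_read); }
            if (p[1] != -1)      { dup2(p[1], 1); close(p[0]); close(p[1]); }
            execvp(cmds[i][0], cmds[i]);
            _exit(127);
        }
        if (prev_read != -1) close(prev_read); /* parent drops its copies */
        if (p[1] != -1) close(p[1]);
        prev_read = p[0];                      /* becomes next child's stdin */
    }
}
```

Closing the parent's copies inside the loop is what makes EOF propagate once each writer exits.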

Error propagation in pipelines is nuanced. If a command in the middle fails to execute, the pipeline may still produce output or may break. Some shells choose to terminate the entire pipeline when a child fails to exec, while others allow remaining children to run and report their own errors. You should document and test your chosen behavior. For exit status, most shells report the status of the last command in the pipeline, while “pipefail” mode reports a non-zero status if any command fails. Even if you do not implement pipefail, your implementation should be structured so it can be added later.
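
A sketch of the status computation, including an optional pipefail mode; pipeline_status is an illustrative helper, and the pipefail flag mimics bash's `set -o pipefail`:

```c
#include <sys/wait.h>

/* Map a raw wait status to a shell-style exit code (128+signal on death). */
static int exit_code(int st)
{
    return WIFEXITED(st) ? WEXITSTATUS(st) : 128 + WTERMSIG(st);
}

/* Sketch: derive a pipeline's exit status from the collected wait statuses. */
int pipeline_status(const int *st, int n, int pipefail)
{
    int result = exit_code(st[n - 1]);     /* default: last command's status */
    if (pipefail)
        for (int i = 0; i < n; i++)
            if (exit_code(st[i]) != 0)
                result = exit_code(st[i]); /* rightmost failure wins */
    return result;
}
```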

Pipelines also create a performance consideration: each command becomes a separate process with its own address space. The shell must avoid heavy work in the pipeline setup path, as this is in the interactive hot loop. Efficient pipeline setup uses simple loops, avoids unnecessary heap allocations, and ensures pipes are created only as needed.

How this fits into the project Pipelines are a core shell feature and interact with redirection, job control, and signal handling.

Definitions & key terms

  • Pipe: Kernel buffer connecting a write end to a read end.
  • Pipeline: A sequence of commands connected by pipes.
  • Process group: A set of processes treated as a unit for signals.
  • EOF: End-of-file signaled when all write ends are closed.

Mental model diagram

cmd1 stdout -> [pipe] -> cmd2 stdin; cmd2 stdout -> [pipe] -> cmd3 stdin

How it works (step-by-step)

  1. Parse pipeline into an ordered list of commands.
  2. Create a pipe for each adjacent pair.
  3. Fork each command; in child, wire stdin/stdout with dup2.
  4. Close unused pipe ends in both child and parent.
  5. Set process group for the pipeline if job control is enabled.
  6. Wait for foreground pipeline or record background job.

Minimal concrete example

int p[2]; pipe(p);
if (fork()==0) { dup2(p[1], 1); close(p[0]); close(p[1]); execvp("ls", argv1); }
if (fork()==0) { dup2(p[0], 0); close(p[0]); close(p[1]); execvp("wc", argv2); }
close(p[0]); close(p[1]); /* parent must close both ends or wc never sees EOF */

Common misconceptions

  • “Pipelines run in one process” -> each command is its own process.
  • “EOF happens when a command exits” -> only when all writers close.
  • “Pipelines are just redirections” -> they require process orchestration.

Check-your-understanding questions

  1. Why must unused pipe ends be closed in every child?
  2. Why should pipeline processes share a process group?
  3. What happens if the parent keeps a write end open?

Check-your-understanding answers

  1. Otherwise readers never see EOF and may hang.
  2. So signals like Ctrl+C affect the entire pipeline.
  3. The reader will block forever because the pipe never closes.

Real-world applications

  • Unix shell pipelines (ps aux | grep, etc.).
  • Data processing in build systems.
  • Streaming log processing tools.

References

  • “Advanced Programming in the UNIX Environment” (pipes and process groups).
  • POSIX Shell Command Language (pipeline semantics).

Key insights A pipeline is process orchestration plus careful file descriptor wiring.

Summary Pipelines turn individual commands into composable dataflow networks and require precise pipe management to avoid deadlocks.

Homework/Exercises to practice the concept

  1. Build a two-command pipeline and print PIDs.
  2. Add a third command and verify EOF behavior.
  3. Experiment with leaving a pipe open to observe hanging behavior.

Solutions to the homework/exercises

  1. Use pipe, fork, dup2, and exec in a loop.
  2. Add an extra pipe and wire it between the second and third commands.
  3. Skip closing a write end and observe the reader blocking.

3. Project Specification

3.1 What You Will Build

A job control system that can stop, resume, and list jobs with fg and bg.

Included:

  • Core feature set described above
  • Deterministic CLI behavior and exit codes

Excluded:

  • Advanced job notifications (left as an optional extension).

3.2 Functional Requirements

  1. Requirement 1: Create a process group for each pipeline job.
  2. Requirement 2: Manage foreground/background transitions with tcsetpgrp.
  3. Requirement 3: Implement jobs, fg, and bg built-ins.
  4. Requirement 4: Track job state (running, stopped, done).
  5. Requirement 5: Handle SIGTSTP and SIGCONT correctly.

3.3 Non-Functional Requirements

  • Performance: Interactive latency under 50ms for typical inputs; pipeline setup should scale linearly.
  • Reliability: No crashes on malformed input; errors reported clearly with non-zero status.
  • Usability: Clear prompts, deterministic behavior, and predictable error messages.

3.4 Example Usage / Output

$ ./mysh
mysh> sleep 30 &
[1] 12345
mysh> jobs
[1] Running sleep 30
mysh> fg %1
sleep 30
^Z
[1] Stopped sleep 30
mysh> bg %1
[1] sleep 30 &

3.5 Data Formats / Schemas / Protocols

  • Job table entries: job id, PGID, command, state, pids.

3.6 Edge Cases

  • Stopped jobs with multiple PIDs
  • Background job reading stdin
  • Race between SIGCHLD and wait

3.7 Real World Outcome

This is the exact behavior you should be able to demonstrate.

3.7.1 How to Run (Copy/Paste)

  • make
  • ./mysh

3.7.2 Golden Path Demo (Deterministic)

$ ./mysh
mysh> sleep 30 &
[1] 12345
mysh> jobs
[1] Running sleep 30
mysh> fg %1
sleep 30
^Z
[1] Stopped sleep 30
mysh> bg %1
[1] sleep 30 &

3.7.3 Failure Demo (Deterministic)

$ ./mysh
mysh> not_a_command
mysh> echo $?
127

4. Solution Architecture

4.1 High-Level Design

[Input] -> [Parser/Lexer] -> [Core Engine] -> [Executor/Output]

4.2 Key Components

Component Responsibility Key Decisions
Job Table Track job metadata and state Central list with IDs.
Terminal Control tcsetpgrp for fg jobs Ensures shell regains terminal.
Job Builtins jobs/fg/bg handlers Operate on job table.

4.3 Data Structures (No Full Code)

struct Job { int id; pid_t pgid; int state; char *cmdline; };

4.4 Algorithm Overview

Key Algorithm: State Update

  1. waitpid with WUNTRACED
  2. update job state

Complexity Analysis:

  • Time: O(jobs) per update
  • Space: O(jobs) for the job table

5. Implementation Guide

5.1 Development Environment Setup

# install dependencies (if any)
# build
make

5.2 Project Structure

project-root/
├── src/
│   ├── main.c
│   ├── lexer.c
│   └── executor.c
├── tests/
│   └── test_basic.sh
├── Makefile
└── README.md

5.3 The Core Question You’re Answering

How does a shell pause, resume, and manage multiple concurrent jobs?

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Process groups and sessions
  2. Terminal foreground group
  3. SIGCHLD and job status

5.5 Questions to Guide Your Design

5.6 Thinking Exercise

The “Foreground Swap” Problem

Describe the exact sequence of syscalls when running fg on a stopped job.

5.7 The Interview Questions They’ll Ask

5.8 Hints in Layers

Hint 1: Make the shell its own process group Call setpgid(0, 0) at startup.

Hint 2: Create a new group for each pipeline All processes in a pipeline share a PGID.

Hint 3: Use tcsetpgrp to hand over terminal Set terminal fg group to the job PGID.

Hint 4: Track job status in SIGCHLD Use waitpid with WUNTRACED.

5.9 Books That Will Help

Topic Book Chapter
Job control “Advanced Programming in the UNIX Environment” Ch. 9
Signals “The Linux Programming Interface” Ch. 22
Terminal control POSIX tcsetpgrp Spec

5.10 Implementation Phases

Phase 1: Foundation (2-3 days)

Goals:

  • Define data structures and interfaces
  • Build a minimal end-to-end demo

Tasks:

  1. Implement the core data structures
  2. Build a tiny CLI or harness for manual tests

Checkpoint: A demo command runs end-to-end with clear logging.

Phase 2: Core Functionality (1 week)

Goals:

  • Implement full feature set
  • Validate with unit tests

Tasks:

  1. Implement core requirements
  2. Add error handling and edge cases

Checkpoint: All functional requirements pass basic tests.

Phase 3: Polish & Edge Cases (2-4 days)

Goals:

  • Harden for weird inputs
  • Improve UX and documentation

Tasks:

  1. Add edge-case tests
  2. Document design decisions

Checkpoint: Deterministic golden demo and clean error output.

5.11 Key Implementation Decisions

Decision Options Recommendation Rationale
Parsing depth Minimal vs full Incremental Start small, expand safely
Error policy Silent vs verbose Verbose Debuggability for learners

6. Testing Strategy

6.1 Test Categories

Category Purpose Examples
Unit Tests Test individual components Tokenizer, matcher, env builder
Integration Tests Test component interactions Full command lines
Edge Case Tests Handle boundary conditions Empty input, bad args

6.2 Critical Test Cases

  1. Golden Path: Run the canonical demo and verify output.
  2. Failure Path: Provide invalid input and confirm error status.
  3. Stress Path: Run repeated commands to detect leaks or state corruption.

6.3 Test Data

input: echo hello
output: hello

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall Symptom Solution
Misordered redirection Output goes to wrong place Apply redirections left-to-right
Leaked file descriptors Commands hang waiting for EOF Close unused fds in parent/child
Incorrect exit status &&/|| behave wrong Use waitpid macros correctly

7.2 Debugging Strategies

  • Trace syscalls: Use strace/dtruss to verify fork/exec/dup2 order.
  • Log state transitions: Print parser states and job table changes in debug mode.
  • Compare with dash: Run the same input in a reference shell.

7.3 Performance Traps

  • Avoid O(n^2) behavior in hot paths like line editing.
  • Minimize allocations inside the REPL loop.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a help built-in with usage docs.
  • Add colored prompt themes.

8.2 Intermediate Extensions

  • Add a simple profiling mode for command timing.
  • Implement a which built-in using PATH lookup.

8.3 Advanced Extensions

  • Add programmable completion or plugin system.
  • Add a scriptable test harness with golden outputs.

9. Real-World Connections

9.1 Industry Applications

  • Build systems: shells orchestrate compilation and test pipelines.
  • DevOps automation: scripts manage deployments and infrastructure.

9.2 Reference Shells

  • bash: The most common interactive shell.
  • dash: Minimal POSIX shell often used as /bin/sh.
  • zsh: Feature-rich interactive shell.

9.3 Interview Relevance

  • Process creation and lifecycle questions.
  • Parsing and system programming design trade-offs.

10. Resources

10.1 Essential Reading

  • “Advanced Programming in the UNIX Environment” by W. Richard Stevens - the chapters on process control, pipes, and job control.

10.2 Video Resources

  • Unix process model lectures (any OS course).
  • Compiler front-end videos for lexing/parsing projects.

10.3 Tools & Documentation

  • strace/dtruss: inspect syscalls.
  • man pages: fork, execve, waitpid, pipe, dup2.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the core concept without notes.
  • I can trace a command through my subsystem.
  • I understand at least one key design trade-off.

11.2 Implementation

  • All functional requirements are met.
  • All critical tests pass.
  • Edge cases are handled cleanly.

11.3 Growth

  • I documented lessons learned.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Core feature works for the golden demo.
  • Errors are handled with non-zero status.
  • Code is readable and buildable.

Full Completion:

  • All functional requirements met.
  • Tests cover edge cases and failures.

Excellence (Going Above & Beyond):

  • Performance benchmarks and clear documentation.
  • Behavior compared against a reference shell.