Project 8: Signal Handler

Implement signal handling for interactive shells and child process management.

Quick Reference

Attribute	Value
Difficulty	Level 3: Advanced (The Engineer)
Time Estimate	1 week
Main Programming Language	C
Alternative Programming Languages	Rust, Go, Zig
Coolness Level	Level 4: Hardcore Tech Flex
Business Potential	1. The “Resume Gold” (Educational/Personal Brand)
Prerequisites	signals, sigaction, fork/exec
Key Topics	signal dispositions, SIGCHLD reaping, EINTR handling

1. Learning Objectives

By completing this project, you will:

Explain and implement signal dispositions in the context of a shell.
Build a working signal handler that matches the project specification.
Design tests that validate correctness and edge cases.
Document design decisions, trade-offs, and limitations.

2. All Theory Needed (Per-Concept Breakdown)

Signals, Handlers, and Asynchronous Events

Fundamentals Signals are asynchronous notifications delivered to processes to indicate events like interrupts (Ctrl+C), terminal stop (Ctrl+Z), or child termination (SIGCHLD). A shell must carefully manage signal handling so that interactive control works as expected. Typically, the parent shell ignores certain signals (SIGINT, SIGTSTP) so it doesn’t die when the user presses Ctrl+C, while child processes restore default handlers so they can be interrupted. Correct signal behavior is essential for a usable shell.

Deep Dive into the concept Signals are delivered by the kernel and can interrupt normal control flow. In a shell, this is both powerful and dangerous. For interactive use, the shell should ignore SIGINT and SIGQUIT while idle so that Ctrl+C does not kill the shell itself. However, when running a foreground job, the terminal driver delivers SIGINT to the foreground process group, which should be the job, not the shell. This is why the shell must set process groups and manage terminal control. The shell should also handle SIGCHLD to detect when background jobs exit or stop. A SIGCHLD handler can reap children by calling waitpid(-1, &status, WNOHANG) in a loop and updating job state; however, a handler may only call async-signal-safe functions, so any complex work should be deferred to the main loop using a self-pipe or a flag.

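Signal dispositions are inherited across fork(). Across exec(), ignored signals stay ignored, while signals that have a handler installed are reset to the default action, because the handler code no longer exists in the new program image. If the parent ignores SIGINT, the child would also ignore it, even after exec, unless it resets the disposition to default. The correct pattern is: in the parent, ignore SIGINT/SIGTSTP; after fork(), in the child, restore default dispositions before execve(). This ensures that interactive interrupts apply to the child program. A SIGCHLD handler installed by the shell is reset by exec automatically, but the signal mask is preserved across exec, so a child forked while SIGCHLD was blocked should restore the mask before executing the program.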

Another subtlety is the signal mask. Signals can be blocked temporarily using sigprocmask. Shells often block SIGCHLD during critical sections (like updating the job table) to avoid race conditions. Without this, you can miss a child exit and leave a zombie or display a stale job status. The right approach is to block SIGCHLD, update shared data structures, then unblock and handle pending signals. This is a common pattern in robust shells.

Signal handling must also consider system calls that are interrupted. read() on stdin may return EINTR when a signal arrives. An interactive shell should treat this as a normal condition, maybe redisplay the prompt or re-read input. Using sigaction with SA_RESTART can reduce interruptions for some syscalls, but you must understand which calls will restart and which will not.
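As a small illustration of the EINTR case, the helper below retries read() when a signal interrupts it. The name read_retry is hypothetical, and a shell that wants to redraw its prompt on interruption would return to the caller instead of looping.

```c
#include <errno.h>
#include <unistd.h>

/* read() that retries when a signal arrives before any data was read.
   Returns what read() returns: bytes read, 0 at end of input, -1 on error. */
ssize_t read_retry(int fd, void *buf, size_t len) {
    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}
```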

Finally, signal behavior influences scripts. In a script, a SIGINT should generally terminate the script unless handled. Some shells provide trap to install user handlers. If you plan to implement trap, you need to record handlers in shell state and run them when signals are received. Even without full trap support, you should at least ensure that signals terminate child processes as expected.

How this fits into the projects Signals interact with job control, pipelines, and the line editor. They are central to responsive interactive behavior.

Definitions & key terms

Signal: Asynchronous notification delivered to a process.
SIGINT: Interrupt (Ctrl+C).
SIGTSTP: Terminal stop (Ctrl+Z).
SIGCHLD: Child process state change.
Signal disposition: How a process handles a signal (default, ignore, handler).

Mental model diagram

Ctrl+C -> terminal -> SIGINT -> foreground process group

How it works (step-by-step)

Parent shell sets signal dispositions (ignore SIGINT/SIGTSTP).
Before exec, child restores default handlers.
Shell installs SIGCHLD handler to reap background jobs.
Use sigprocmask to block SIGCHLD while updating jobs.
Handle EINTR or use SA_RESTART as appropriate.

Minimal concrete example

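struct sigaction sa = {0};
sigemptyset(&sa.sa_mask);      /* block no extra signals */
sa.sa_handler = SIG_IGN;       /* the parent shell survives Ctrl+C */
sigaction(SIGINT, &sa, NULL);  /* returns -1 and sets errno on failure */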

Common misconceptions

“Signals are synchronous” -> they interrupt at arbitrary times.
“Ignore in parent means ignore everywhere” -> children inherit unless reset.
“SIGCHLD handler can do anything” -> only async-signal-safe actions are allowed.

Check-your-understanding questions

Why must children reset SIGINT to default before exec?
What is the purpose of SIGCHLD in a shell?
Why block SIGCHLD while updating the job table?

Check-your-understanding answers

Because an ignored disposition survives exec: without the reset, the child program would ignore Ctrl+C just as the shell does.
To detect child exits and avoid zombies.
To prevent race conditions between handler and main loop.

Real-world applications

Interactive shells and REPLs.
Process supervisors reacting to child exits.
Terminal-based applications handling Ctrl+C.

Where you’ll apply it

In this project: see §4.1 High-Level Design and §6 Testing Strategy.
Also used in: P09 Job Control System, P11 Line Editor (Mini-Readline)

References

“Advanced Programming in the UNIX Environment” (signals).
POSIX signal semantics.

Key insights Signals are asynchronous; robust shells tame them with careful masking and reset rules.

Summary Correct signal handling separates a responsive interactive shell from an unusable one and prevents zombies or unkillable processes.

Homework/Exercises to practice the concept

Install a SIGINT handler that prints a message in the parent.
Fork a child and ensure Ctrl+C terminates the child, not the parent.
Implement a SIGCHLD handler that reaps children.

Solutions to the homework/exercises

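Use sigaction with a custom handler that prints via write(), which is async-signal-safe (printf is not).
Reset the handler to default in the child before exec.
Call waitpid(-1, &status, WNOHANG) in a loop in the handler until it returns 0 or -1; one SIGCHLD can stand for several exited children.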

Process Creation and Exec Lifecycle

Fundamentals A Unix shell is a long-running parent process that repeatedly creates child processes to run external commands. The split between fork() and execve() is the foundation: fork() clones the current process so the child inherits memory, file descriptors, environment, and current working directory, while execve() replaces the child image with a new program. This separation is why a shell can set up redirections, pipelines, and signal dispositions before launching a program. It is also why built-ins must run in the parent: only the parent can change the shell’s own state (like its directory or variables). If you understand when the parent waits, when it does not, and what the child inherits, you can predict how any shell command behaves. This concept is the root of process orchestration and almost every other shell feature.

Deep Dive into the concept The process lifecycle in a shell is a choreography between parent and child processes that must be deterministic, observable, and correct under failure. When the shell reads a command, it first decides whether the command is a built-in or an external program. Built-ins execute in the parent and therefore can mutate shell state. For external commands, the shell calls fork(). Internally, fork() creates a new task by duplicating the parent’s address space, file descriptor table, signal dispositions, and working directory. Modern kernels implement this with copy-on-write, so the child’s memory is not physically copied until it changes. From the shell’s perspective, fork() returns twice: once in the parent with the child PID, and once in the child with return value 0. This dual return is what allows the same code path to branch into parent logic versus child logic.

Once in the child, the shell must prepare the execution environment. This is where file descriptor wiring happens: dup2() to connect pipes or redirected files onto STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO; close() to remove unused descriptors; and chdir() if the command is a subshell with a different working directory. The child must also reset signal handlers to default for signals like SIGINT and SIGTSTP if the parent shell ignores them. Failure to do this leaves child programs “immune” to Ctrl+C because they inherit the shell’s ignored handlers. The child may also join or create a process group when pipelines or job control are involved, which matters for terminal control and signal delivery. Only after the environment is correct does the child call execve() (or execvp() for PATH lookup). At that point, the program image is replaced; the child’s memory, stack, and code become the new program, but the file descriptor table and environment remain as you configured them.
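The descriptor wiring step can be sketched as a small helper that the child would call before exec. The name redirect_fd_to_file is hypothetical, and the sketch covers only the file-redirection case, not pipes.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open path and make it appear at target_fd (for example STDOUT_FILENO).
   Returns 0 on success, -1 on error with errno set by open() or dup2(). */
int redirect_fd_to_file(int target_fd, const char *path, int flags, mode_t mode) {
    int fd = open(path, flags, mode);
    if (fd == -1) return -1;
    if (fd != target_fd) {
        if (dup2(fd, target_fd) == -1) {
            close(fd);
            return -1;
        }
        close(fd);   /* only target_fd should stay open in the child */
    }
    return 0;
}
```

Closing the temporary descriptor after dup2() matters: a stray copy left open in the child is inherited by the exec'd program.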

The parent does not disappear. It either waits for the child (foreground execution) or returns immediately (background execution). Waiting is done with waitpid(), which reports how the child finished: a normal exit (WIFEXITED) with an exit code, or a signal termination (WIFSIGNALED) with the terminating signal. Shells interpret these status codes to update $? and to print diagnostic messages like “Terminated by signal 9”. A robust shell handles interrupted waits (EINTR) and reaps all children to avoid zombies. In interactive shells, a SIGCHLD handler often records child state changes and wakes the main loop so that completed background jobs are announced promptly.
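A sketch of how the wait status might be turned into the value stored in $?. The 128 + signal number mapping is a convention used by common shells and is an assumption here, not something this project mandates; both function names are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Map a waitpid() status to a shell-style exit code.
   Assumption: signals are reported as 128 + signal number. */
int status_to_exit_code(int status) {
    if (WIFEXITED(status))   return WEXITSTATUS(status);
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);
    if (WIFSTOPPED(status))  return 128 + WSTOPSIG(status);
    return -1;
}

/* Wait for one foreground child, retrying when a signal interrupts the wait.
   WUNTRACED also reports a child stopped by Ctrl+Z.
   Returns the shell-style exit code, or -1 if waitpid() fails. */
int wait_foreground(pid_t pid) {
    int status;
    while (waitpid(pid, &status, WUNTRACED) == -1) {
        if (errno != EINTR) return -1;
    }
    return status_to_exit_code(status);
}
```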

Failure handling is a central part of the lifecycle. If fork() fails (out of memory or process limit), the shell must report an error and continue running. If execve() fails, the child must print an error and exit with a defined status (commonly 127 for “command not found” and 126 for “found but not executable”). This behavior is relied upon by scripts, so the shell must be consistent. The parent should not attempt to recover from a failed exec by continuing in the child; the child must exit to avoid running shell code in an unexpected state.
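The exec failure path could be written like this sketch. Treating only ENOENT as "command not found" is a simplifying assumption, the helper names are hypothetical, and the mysh prefix is taken from the example prompt in this document.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Map the errno left by a failed exec to the conventional exit status:
   127 when the command was not found, 126 for any other failure. */
int exec_failure_status(int err) {
    return (err == ENOENT) ? 127 : 126;
}

/* Call in the child after fork(). Never returns: either the new program
   takes over, or the child reports the error and exits. */
void exec_or_die(char *const argv[]) {
    execvp(argv[0], argv);
    int err = errno;   /* save before another call can change it */
    fprintf(stderr, "mysh: %s: %s\n", argv[0], strerror(err));
    _exit(exec_failure_status(err));
}
```

The child uses _exit() rather than exit() so that it does not flush stdio buffers or run atexit handlers inherited from the parent shell.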

Finally, remember that the execution environment is more than variables: it includes umask, current directory, signal mask, resource limits, and open file descriptors. A shell that incorrectly preserves or resets any of these will behave differently from the system shells you are comparing against. For example, if you forget to set close-on-exec on internal file descriptors, a child process might inherit unexpected descriptors, causing hangs (pipes never closing) or security leaks (files exposed). These subtle lifecycle details distinguish toy shells from robust ones.
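Marking an internal descriptor close-on-exec takes two fcntl() calls, as in this minimal sketch (the name set_cloexec is hypothetical):

```c
#include <fcntl.h>

/* Mark fd close-on-exec so programs started with exec do not inherit it.
   Existing descriptor flags are preserved. Returns 0 on success, -1 on error. */
int set_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```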

How this fits into the projects This concept is central to command execution, pipelines, redirection, and job control, so it appears in almost every project that launches external programs.

Definitions & key terms

fork(): Clone the current process into a child process.
execve(): Replace the current process image with a new program.
waitpid(): Wait for a child process to change state.
Zombie: A terminated child that has not been reaped.
Copy-on-write: Memory optimization where pages are copied only when written.

Mental model diagram

Parent Shell
   |
   | fork()
   v
Child Shell -- set fds/signals -- execve("/bin/ls")
   |
   | exit(status)
   v
Parent waits -> collects status -> updates $?

How it works (step-by-step)

Parse the command into a simple command node.
Classify: built-in/function vs external.
Fork a child if external. Invariant: parent must not block unless foreground.
Child setup: apply redirections, reset signals, set process group if needed.
Exec the program image. Failure mode: execve returns with errno.
Parent waits for foreground child or records job for background.
Update $? and job table; reap zombies. Failure mode: missed waitpid().

Minimal concrete example

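#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == -1) { perror("fork"); return 1; }
    if (pid == 0) {
        // Child: replace image
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("exec failed");   // reached only if exec fails
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) == -1) { perror("waitpid"); return 1; }
    if (WIFEXITED(status))       // WEXITSTATUS is valid only for normal exits
        printf("status=%d\n", WEXITSTATUS(status));
    return 0;
}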

Common misconceptions

“fork runs the program” -> fork() only clones; exec() runs the program.
“exit status is boolean” -> only 0 is success; non-zero encodes errors.
“child changes affect parent” -> changes are isolated after fork().

Check-your-understanding questions

Why must a shell use fork() before execve() for external commands?
What happens if a parent never calls waitpid() for a child?
Why do shells reset signal handlers in the child before exec()?

Check-your-understanding answers

The shell must keep running; execve() replaces the current process.
The child becomes a zombie until it is reaped.
Otherwise the child inherits ignored signals and cannot be controlled.

Real-world applications

Interactive shells (bash, dash, zsh).
Process supervisors and daemons that spawn workers.
Build systems that run many external commands.

Where you’ll apply it

In this project: see §3.2 Functional Requirements and §5.10 Phase 2.
Also used in: P01 Minimal Command Executor, P05 Pipeline System, P06 I/O Redirection Engine, P17 Capstone - Your Own Shell

References

“Advanced Programming in the UNIX Environment” (Process Control).
“The Linux Programming Interface” (Process and exec chapters).
POSIX Shell Command Language (execution environment).

Key insights A shell is primarily a process orchestrator, not a program runner.

Summary Understanding the fork/exec lifecycle gives you the ability to predict how shell commands behave and why the shell can keep control while running external programs.

Homework/Exercises to practice the concept

Write a launcher that runs a command and prints the exit status.
Add a flag to run the command in the background without waiting.
Use strace -f or dtruss to observe fork/exec/wait.

Solutions to the homework/exercises

Use fork(), execvp(), waitpid(), and WEXITSTATUS.
Skip waitpid() for background and add a SIGCHLD reaper.
Trace system calls and confirm the sequence matches your mental model.

3. Project Specification

3.1 What You Will Build

A signal subsystem that handles SIGINT/SIGTSTP in the parent and resets handlers in children.

Included:

Core feature set described above
Deterministic CLI behavior and exit codes

Excluded:

Full trap support (an optional extension).

3.2 Functional Requirements

Requirement 1: Ignore or handle SIGINT/SIGTSTP in the parent shell.
Requirement 2: Reset signal handlers to default in child processes.
Requirement 3: Implement SIGCHLD handler to reap background children.
Requirement 4: Handle EINTR in input reading and waiting.
Requirement 5: Provide a debug mode to log signal events.
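For Requirement 5, one possible shape of a debug logger that is safe to call from a signal handler is sketched below. It avoids stdio because printf is not async-signal-safe. The names debug_signals, format_signal_line, and log_signal_event, and the "[signal N]" line format, are all hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical debug switch; set once at startup, e.g. from a command-line flag. */
static volatile sig_atomic_t debug_signals = 0;

/* Write "[signal N]\n" into buf (at least 24 bytes) without stdio.
   Returns the number of bytes written. */
size_t format_signal_line(int signo, char *buf) {
    static const char prefix[] = "[signal ";
    char digits[12];
    size_t n = 0, d = 0;
    unsigned v = (signo < 0) ? 0u : (unsigned)signo;

    for (size_t i = 0; i < sizeof prefix - 1; i++) buf[n++] = prefix[i];
    do { digits[d++] = (char)('0' + v % 10); v /= 10; } while (v != 0);
    while (d > 0) buf[n++] = digits[--d];   /* digits were produced in reverse */
    buf[n++] = ']';
    buf[n++] = '\n';
    return n;
}

/* Safe inside a handler: uses only write() and preserves errno. */
void log_signal_event(int signo) {
    char line[24];
    int saved_errno = errno;
    if (!debug_signals) return;
    size_t len = format_signal_line(signo, line);
    if (write(STDERR_FILENO, line, len) == -1) {
        /* Logging is best effort; ignore failures. */
    }
    errno = saved_errno;
}
```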

3.3 Non-Functional Requirements

Performance: Interactive latency under 50ms for typical inputs; pipeline setup should scale linearly.
Reliability: No crashes on malformed input; errors reported clearly with non-zero status.
Usability: Clear prompts, deterministic behavior, and predictable error messages.

3.4 Example Usage / Output

$ ./mysh
mysh> sleep 10
^C
mysh> sleep 10
^Z
[1] Stopped  sleep 10
mysh> jobs
[1] Stopped  sleep 10

3.5 Data Formats / Schemas / Protocols

Signal configuration stored in shell state.

3.6 Edge Cases

SIGCHLD storms
Interrupted read
Child inherits ignored SIGINT

3.7 Real World Outcome

This is the exact behavior you should be able to demonstrate.

3.7.1 How to Run (Copy/Paste)

make
./mysh

3.7.2 Golden Path Demo (Deterministic)

$ ./mysh
mysh> sleep 10
^C
mysh> sleep 10
^Z
[1] Stopped  sleep 10
mysh> jobs
[1] Stopped  sleep 10

3.7.3 Failure Demo (Deterministic)

$ ./mysh
mysh> not_a_command
mysh> echo $?
127

4. Solution Architecture

4.1 High-Level Design

[Input] -> [Parser/Lexer] -> [Core Engine] -> [Executor/Output]

4.2 Key Components

Component	Responsibility	Key Decisions
Signal Setup	Configures dispositions in parent	Uses sigaction.
Child Reset	Restores defaults before exec	Prevents inherited ignores.
SIGCHLD Reaper	Non-blocking wait in handler	Avoid zombies.

4.3 Data Structures (No Full Code)

struct SigState { int interactive; };

4.4 Algorithm Overview

Key Algorithm: Reap Loop

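loop: pid = waitpid(-1, &status, WNOHANG)
  stop when pid is 0 (no child ready) or -1 (no children left)
  otherwise update the job table entry for pid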
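The reap loop can be sketched in C as follows. The job table, its size, and the state names are hypothetical stand-ins; the sketch also passes WUNTRACED so that stopped children are reported, which is an assumption beyond the two-line outline above.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 32   /* hypothetical fixed-size table */

enum job_state { JOB_FREE = 0, JOB_RUNNING, JOB_STOPPED, JOB_DONE };
struct job { pid_t pid; enum job_state state; int status; };
static struct job jobs[MAX_JOBS];

/* Record a freshly forked child; returns its slot, or -1 if the table is full. */
int job_add(pid_t pid) {
    for (int i = 0; i < MAX_JOBS; i++) {
        if (jobs[i].state == JOB_FREE) {
            jobs[i].pid = pid;
            jobs[i].state = JOB_RUNNING;
            jobs[i].status = 0;
            return i;
        }
    }
    return -1;
}

static struct job *job_find(pid_t pid) {
    for (int i = 0; i < MAX_JOBS; i++) {
        if (jobs[i].state != JOB_FREE && jobs[i].pid == pid) return &jobs[i];
    }
    return NULL;
}

/* Collect every pending child state change without blocking.
   Returns the number of job table entries that were updated. */
int reap_children(void) {
    int changed = 0;
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG | WUNTRACED);
        if (pid == 0) break;              /* children exist, none changed state */
        if (pid == -1) {
            if (errno == EINTR) continue;
            break;                        /* ECHILD: no children at all */
        }
        struct job *j = job_find(pid);
        if (j == NULL) continue;          /* not a tracked job */
        j->status = status;
        j->state = WIFSTOPPED(status) ? JOB_STOPPED : JOB_DONE;
        changed++;
    }
    return changed;
}
```

Looping until waitpid() reports nothing left is the important part: several children can exit while a single SIGCHLD is pending.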

Complexity Analysis:

Time: O(k) waitpid calls per signal, where k is the number of children that changed state
Space: O(1) beyond the job table

5. Implementation Guide

5.1 Development Environment Setup

# install dependencies (if any)
# build
make

5.2 Project Structure

project-root/
├── src/
│   ├── main.c
│   ├── lexer.c
│   └── executor.c
├── tests/
│   └── test_basic.sh
├── Makefile
└── README.md

5.3 The Core Question You’re Answering

How does a shell remain alive while signals terminate or stop child processes?

5.4 Concepts You Must Understand First

Stop and research these before coding:

Signal handling
SIGCHLD
Terminal signals

5.5 Questions to Guide Your Design

5.6 Thinking Exercise

The “Stuck SIGCHLD” Problem

What happens if you forget to reap children after they exit?

5.7 The Interview Questions They’ll Ask

5.8 Hints in Layers

Hint 1: Use sigaction It provides reliable semantics and options.

Hint 2: Ignore SIGINT/SIGTSTP in parent But reset to defaults in children.

Hint 3: Reap with waitpid(-1, ...) Use WNOHANG inside SIGCHLD handler.

Hint 4: Restore terminal state After an interrupt, redraw the prompt.

5.9 Books That Will Help

Topic	Book	Chapter
Signals	“Advanced Programming in the UNIX Environment”	Ch. 10
SIGCHLD	“The Linux Programming Interface”	Ch. 26
Job control	POSIX Shell Command Language	Job control

5.10 Implementation Phases

Phase 1: Foundation (2-3 days)

Goals:

Define data structures and interfaces
Build a minimal end-to-end demo

Tasks:

Implement the core data structures
Build a tiny CLI or harness for manual tests

Checkpoint: A demo command runs end-to-end with clear logging.

Phase 2: Core Functionality (1 week)

Goals:

Implement full feature set
Validate with unit tests

Tasks:

Implement core requirements
Add error handling and edge cases

Checkpoint: All functional requirements pass basic tests.

Phase 3: Polish & Edge Cases (2-4 days)

Goals:

Harden for weird inputs
Improve UX and documentation

Tasks:

Add edge-case tests
Document design decisions

Checkpoint: Deterministic golden demo and clean error output.

5.11 Key Implementation Decisions

Decision	Options	Recommendation	Rationale
Parsing depth	Minimal vs full	Incremental	Start small, expand safely
Error policy	Silent vs verbose	Verbose	Debuggability for learners

6. Testing Strategy

6.1 Test Categories

Category	Purpose	Examples
Unit Tests	Test individual components	Tokenizer, matcher, env builder
Integration Tests	Test component interactions	Full command lines
Edge Case Tests	Handle boundary conditions	Empty input, bad args

6.2 Critical Test Cases

Golden Path: Run the canonical demo and verify output.
Failure Path: Provide invalid input and confirm error status.
Stress Path: Run repeated commands to detect leaks or state corruption.

6.3 Test Data

input: echo hello
output: hello

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall	Symptom	Solution
Misordered redirection	Output goes to wrong place	Apply redirections left-to-right
Leaked file descriptors	Commands hang waiting for EOF	Close unused fds in parent/child
Incorrect exit status	&& and || behave wrong	Use waitpid macros correctly

7.2 Debugging Strategies

Trace syscalls: Use strace/dtruss to verify fork/exec/dup2 order.
Log state transitions: Print parser states and job table changes in debug mode.
Compare with dash: Run the same input in a reference shell.

7.3 Performance Traps

Avoid O(n^2) behavior in hot paths like line editing.
Minimize allocations inside the REPL loop.

8. Extensions & Challenges

8.1 Beginner Extensions

Add a help built-in with usage docs.
Add colored prompt themes.

8.2 Intermediate Extensions

Add a simple profiling mode for command timing.
Implement a which built-in using PATH lookup.

8.3 Advanced Extensions

Add programmable completion or plugin system.
Add a scriptable test harness with golden outputs.

9. Real-World Connections

9.1 Industry Applications

Build systems: shells orchestrate compilation and test pipelines.
DevOps automation: scripts manage deployments and infrastructure.

9.2 Related Open Source Projects

bash: The most common interactive shell.
dash: Minimal POSIX shell often used as /bin/sh.
zsh: Feature-rich interactive shell.

9.3 Interview Relevance

Process creation and lifecycle questions.
Parsing and system programming design trade-offs.

10. Resources

10.1 Essential Reading

“Advanced Programming in the UNIX Environment” by W. Richard Stevens - signals (Ch. 10), process control, and pipes.

10.2 Video Resources

Unix process model lectures (any OS course).
Compiler front-end videos for lexing/parsing projects.

10.3 Tools & Documentation

strace/dtruss: inspect syscalls.
man pages: fork, execve, waitpid, pipe, dup2.

11. Self-Assessment Checklist

11.1 Understanding

I can explain the core concept without notes.
I can trace a command through my subsystem.
I understand at least one key design trade-off.

11.2 Implementation

All functional requirements are met.
All critical tests pass.
Edge cases are handled cleanly.

11.3 Growth

I documented lessons learned.
I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

Core feature works for the golden demo.
Errors are handled with non-zero status.
Code is readable and buildable.

Full Completion:

All functional requirements met.
Tests cover edge cases and failures.

Excellence (Going Above & Beyond):

Performance benchmarks and clear documentation.
Behavior compared against a reference shell.
