Project 5: Process Psychic (procfs Inspector)

Build a ps-like tool that reads /proc and reports process state, memory, and CPU time.

Quick Reference

Attribute                          Value
Difficulty                         Intermediate
Time Estimate                      Weekend
Main Programming Language          C or Python
Alternative Programming Languages  Go, Rust
Coolness Level                     See REFERENCE.md (Level 3)
Business Potential                 See REFERENCE.md (Level 3)
Prerequisites                      File parsing, process model, basic stats
Key Topics                         procfs, process states, CPU ticks

1. Learning Objectives

By completing this project, you will:

  1. Explain how procfs exposes kernel data structures.
  2. Parse process state and resource usage reliably.
  3. Convert kernel time units into human-readable values.
  4. Build a deterministic snapshot of running processes.

2. All Theory Needed (Per-Concept Breakdown)

procfs and Process State Introspection

Fundamentals

procfs is a virtual filesystem that exposes kernel state as files. Each running process has a directory under /proc containing files like stat, status, and cmdline. These files are not stored on disk; they are generated by the kernel when read. This makes procfs an essential introspection tool: by reading text files, you can observe process state, memory usage, CPU time, and command-line arguments. Understanding procfs is critical for building tools like ps, top, and custom monitoring agents.

Deep Dive

procfs is a deliberately simple interface: it maps kernel state into a file hierarchy. The kernel constructs file contents on demand, which means the data reflects a momentary snapshot. This is powerful but also fragile: a process can exit while you are reading its procfs files, so your code must handle missing files gracefully. Most procfs files are formatted as human-readable text, but their formats are precise and sometimes tricky to parse. For example, /proc/[pid]/stat is a single line with many fields; the second field (the command name) is enclosed in parentheses and can contain spaces, so naive space splitting fails. A correct parser must locate the matching closing parenthesis before splitting the remaining fields.
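The parenthesis rule above can be sketched as follows. This is a minimal illustration, not a full stat parser: the function name parse_stat_line and the returned dictionary keys are hypothetical, and only a handful of fields are extracted. It assumes the field order documented in proc(5), where utime and stime are fields 14 and 15.

```python
def parse_stat_line(line: str) -> dict:
    """Parse one /proc/[pid]/stat line into a few named fields (sketch)."""
    # comm may itself contain spaces and parentheses, so anchor on the
    # first '(' and the LAST ')' instead of splitting on whitespace.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    # Everything after the closing parenthesis is space-separated.
    # rest[0] is field 3 (state), so field n lives at rest[n - 3].
    rest = line[close_idx + 1:].split()
    return {
        "pid": pid,
        "comm": comm,
        "state": rest[0],
        "utime": int(rest[11]),  # field 14: user-mode clock ticks
        "stime": int(rest[12]),  # field 15: kernel-mode clock ticks
    }


# Hand-written sample line with an awkward command name (not real output).
sample = "1042 (my (odd) name) R 1 1042 1042 0 -1 4194304 10 0 0 0 120 40 0 0 20 0 1 0 100 0 0"
print(parse_stat_line(sample))
```

Searching from the right for the closing parenthesis is what keeps a command name such as "my (odd) name" from shifting every later field.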

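Process states are represented by single-letter codes, such as R (running or runnable), S (interruptible sleep), D (uninterruptible sleep), Z (zombie), and T (stopped). The kernel defines these states and exposes them through procfs. Apart from R, the code tells you why a process is not on a CPU: it might be blocked in an uninterruptible wait, usually on I/O (D), waiting for an event such as input or a timer (S), or already exited but not yet reaped by its parent (Z). Understanding these states allows you to explain why a process appears stuck and whether it will respond to signals: a process in D does not act on signals until the wait completes, and a zombie cannot be killed because it has already exited.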
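A small lookup table is one way to turn these codes into readable labels. The sketch below covers only the five codes named above; the names STATE_NAMES and describe_state are hypothetical, and any other code the kernel reports is passed through as unknown rather than guessed at.

```python
# Labels for the five state codes discussed in this section.
STATE_NAMES = {
    "R": "running",
    "S": "sleeping",
    "D": "uninterruptible sleep",
    "Z": "zombie",
    "T": "stopped",
}


def describe_state(code: str) -> str:
    """Return a readable label for a state code, or mark it unknown."""
    return STATE_NAMES.get(code, f"unknown ({code})")


print(describe_state("D"))
print(describe_state("?"))
```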

CPU time accounting is another critical detail. The utime and stime fields of /proc/[pid]/stat (fields 14 and 15) count clock ticks, not seconds. To convert them you need the tick rate, which you can query with sysconf(_SC_CLK_TCK) in C or os.sysconf("SC_CLK_TCK") in Python; dividing a tick count by that rate gives seconds. A monitoring tool typically reads /proc/stat for system-wide CPU time and /proc/[pid]/stat for per-process CPU time, then calculates percentages over a time interval. For this project, a snapshot is sufficient, but you should still understand the unit conversion.
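As a worked example of the unit conversion, the sketch below turns a tick count into milliseconds. The helper names are hypothetical. The 100 ticks-per-second figure used in the example call is only an assumption for illustration; a real tool should query the rate at run time as system_tick_rate does.

```python
import os


def system_tick_rate() -> int:
    """Ask the OS how many clock ticks make up one second."""
    return os.sysconf("SC_CLK_TCK")


def ticks_to_ms(ticks: int, hz: int) -> int:
    """Convert clock ticks to whole milliseconds using integer math."""
    return ticks * 1000 // hz


# utime=120 and stime=40 ticks at an ASSUMED rate of 100 ticks per second:
# (120 + 40) * 1000 // 100 = 1600 ms of CPU time.
print(ticks_to_ms(120 + 40, 100))
```

Multiplying before dividing keeps the integer arithmetic from discarding sub-second precision.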

Memory usage in procfs is also nuanced. The status file lists several memory fields, including VmRSS (resident set size) and VmSize (virtual memory size), both reported in kB. RSS indicates how much physical memory the process currently holds, while virtual size covers the whole mapped address space, including pages that are not resident. For a basic tool, RSS is the most informative. Understanding the difference prevents you from misinterpreting memory usage.
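Because status is a list of "Key: value" lines, a small key-value parser is enough to pull out the memory fields. The sketch below is one possible shape; parse_status and kb_field are hypothetical names, and the sample text is hand-written rather than captured from a real system. Kernel threads have no Vm* lines, so the lookup returns None instead of raising.

```python
from typing import Optional


def parse_status(text: str) -> dict:
    """Split /proc/[pid]/status text into a {key: value-string} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip anything that is not a "Key: value" line
            fields[key.strip()] = value.strip()
    return fields


def kb_field(fields: dict, key: str) -> Optional[int]:
    """Return a '<number> kB' field as an int, or None if it is absent."""
    value = fields.get(key)
    if value is None:
        return None
    return int(value.split()[0])


sample = "Name:\tbash\nState:\tS (sleeping)\nVmSize:\t   10240 kB\nVmRSS:\t    2048 kB\n"
fields = parse_status(sample)
print(kb_field(fields, "VmRSS"), kb_field(fields, "VmSize"))
```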

Finally, procfs is a virtual filesystem, which means its contents can be inconsistent across reads. A robust tool anticipates races: a process may die between reading its directory name and reading its stat file; permissions may deny access for some processes; and the data can change between fields. Your tool should handle these gracefully by skipping or marking entries rather than crashing. This is what makes monitoring tools reliable in real systems.
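One way to absorb these races is to funnel every procfs read through a helper that treats "gone" and "forbidden" as ordinary outcomes. This is a sketch under assumptions: read_proc_file is a hypothetical name, and returning None for all three error cases is a design choice, not a requirement of the project. A tool that wants to mark permission failures differently from vanished processes would need to separate them.

```python
from typing import Optional


def read_proc_file(path: str) -> Optional[str]:
    """Read a procfs file, returning None if the process vanished or
    access was denied instead of letting the exception propagate."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            return f.read()
    except FileNotFoundError:
        return None  # ENOENT: the PID directory is already gone
    except ProcessLookupError:
        return None  # ESRCH: the process exited between open and read
    except PermissionError:
        return None  # EACCES/EPERM: not allowed to inspect this process


# A path that does not exist is skipped quietly rather than crashing.
print(read_proc_file("/proc/not-a-real-pid/stat"))
```

The caller can then skip any PID for which a required file came back as None.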

How this fits in the project

You will apply this concept in §3.1 to define required fields, in §4.2 to design parsers, and in §6.2 for test cases. It also supports P02-syscall-tracer.md by exposing process state transitions.

Definitions & key terms

  • procfs: Virtual filesystem exposing kernel state.
  • RSS: Resident Set Size, physical memory used by a process.
  • Jiffy: Kernel time unit used for CPU accounting.
  • Zombie: Defunct process awaiting parent reaping.
  • State code: Single-letter process state indicator.

Mental model diagram

/proc
  /1234
    stat   -> process state, CPU time
    status -> memory, credentials
    cmdline -> command arguments

How it works

  1. List numeric directories under /proc.
  2. For each PID, read stat and status files.
  3. Parse fields into a structured record.
  4. Print a deterministic snapshot.
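Step 1 above can be sketched as a filter over directory names. The names pids_from_names and list_pids are hypothetical. Sorting by PID is an assumption made here so that two scans of the same process set print in the same order, which helps with the deterministic snapshot in step 4.

```python
import os


def pids_from_names(names):
    """Keep only all-digit names and return them as sorted integers."""
    return sorted(int(n) for n in names if n.isascii() and n.isdigit())


def list_pids(proc_root: str = "/proc"):
    """List the PIDs that currently have a directory under proc_root."""
    return pids_from_names(os.listdir(proc_root))


# Non-numeric entries such as "self" and "cpuinfo" are ignored.
print(pids_from_names(["self", "42", "1", "sys", "1000", "cpuinfo"]))
```

Keeping the filter separate from the os.listdir call makes it testable without a real /proc.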

Minimal concrete example

proc snapshot (conceptual):
PID 1042: state=R, utime=120, stime=40, rss=2048 KB

Common misconceptions

  • “procfs files are stored on disk.” They are generated on demand.
  • “stat is easy to parse with split.” The comm field breaks naive parsing.
  • “RSS equals virtual memory size.” They are different measures.

Check-your-understanding questions

  1. Why can a procfs read fail even if the PID was just listed?
  2. What is the difference between RSS and virtual size?
  3. Why does /proc/[pid]/stat require special parsing?
  4. What does process state D mean?

Check-your-understanding answers

  1. The process may exit between listing and reading.
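  2. RSS is the memory currently resident in physical RAM; virtual size covers the whole mapped address space, including pages that are not resident (never touched or swapped out).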
  3. The command field is in parentheses and can contain spaces.
  4. It is uninterruptible sleep, typically waiting for I/O.

Real-world applications

  • Building monitoring agents and diagnostic tools.
  • Investigating performance issues in production.
  • Explaining why processes appear stuck.

Where you’ll apply it

  • This project: §3.1 (required fields), §4.2 (parser design), and §6.2 (test cases).

References

  • procfs documentation: https://docs.kernel.org/filesystems/proc.html
  • “The Linux Programming Interface” - procfs sections

Key insights

procfs is the kernel’s structured narrative about process state.

Summary

If you can parse procfs, you can explain what processes are doing right now.

Homework/Exercises to practice the concept

  1. Inspect /proc/self/stat and identify the state field.
  2. Compare RSS and virtual size for a running process.

Solutions to the homework/exercises

  1. The state is the third field of stat, immediately after the parenthesized command name.
  2. RSS is usually smaller than virtual size.

3. Project Specification

3.1 What You Will Build

A command-line tool myps that scans /proc, reads process state and memory usage, and prints a table of PID, user, state, CPU time, RSS, and command.

3.2 Functional Requirements

  1. Process listing: Enumerate numeric /proc entries.
  2. State parsing: Extract state, CPU time, and memory usage.
  3. Command display: Read command line or name reliably.

3.3 Non-Functional Requirements

  • Performance: Must handle hundreds of processes quickly.
  • Reliability: Must tolerate processes exiting during scan.
  • Usability: Output aligned and easy to read.

3.4 Example Usage / Output

$ ./myps
PID  USER   STATE  CPU(ms)  RSS(KB)  CMD
1    root   S      12345    4567     /sbin/init

3.5 Data Formats / Schemas / Protocols

  • /proc/[pid]/stat fields.
  • /proc/[pid]/status key-value pairs.
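Among the status key-value pairs, the Uid line carries four numeric IDs (real, effective, saved set, and filesystem), which is where the USER column can come from. The sketch below maps that line to a user name. The function names are hypothetical, choosing the effective UID is an assumption rather than something this specification fixes, and the lookup parameter exists only so the mapping can be exercised without touching the real user database.

```python
import pwd


def effective_uid(uid_value: str) -> int:
    """Pick the effective UID (second number) from a status 'Uid' value."""
    return int(uid_value.split()[1])


def username_for_uid(uid: int, lookup=pwd.getpwuid) -> str:
    """Resolve a UID to a user name, falling back to the number itself."""
    try:
        return lookup(uid).pw_name
    except KeyError:
        return str(uid)  # no passwd entry for this UID


# Value part of a hand-written "Uid:" line: real, effective, saved, fs.
print(effective_uid("1000\t1001\t1000\t1000"))
```

Falling back to the numeric UID keeps the table complete on systems where a process runs under an ID with no passwd entry.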

3.6 Edge Cases

  • PID disappears while reading.
  • Permission denied for a process owned by another user.
  • Command line is empty (kernel threads).
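The empty-cmdline case can be handled when the CMD column is built. /proc/[pid]/cmdline holds the arguments separated by NUL bytes, and it is empty for kernel threads. In the sketch below, display_command is a hypothetical helper, and wrapping the stat command name in square brackets as a fallback is an assumption modeled on how ps displays such entries.

```python
def display_command(cmdline_bytes: bytes, comm: str) -> str:
    """Build the CMD column from raw cmdline bytes, falling back to comm."""
    if not cmdline_bytes:
        return f"[{comm}]"  # empty cmdline, e.g. a kernel thread
    parts = cmdline_bytes.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after the trailing NUL
    return " ".join(p.decode("utf-8", errors="replace") for p in parts)


print(display_command(b"/sbin/init\0splash\0", "init"))
print(display_command(b"", "kthreadd"))
```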

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

  • Build in project-root.
  • Run ./myps in a terminal.

3.7.2 Golden Path Demo (Deterministic)

Run on a quiet system with a fixed set of test processes.

3.7.3 If CLI: Exact terminal transcript

$ ./myps
PID  USER  STATE  CPU(ms)  RSS(KB)  CMD
1    root  S      12000    4500     /sbin/init
200  user  R      150      2100     ./myps
# exit code: 0

Failure demo (deterministic):

$ ./myps --pid not-a-number
error: invalid PID
# exit code: 2

Exit codes:

  • 0 success
  • 2 invalid input
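The exit-code contract above could be wired up roughly as follows. This is a sketch under assumptions: the constant and function names are hypothetical, only the --pid case from the failure demo is handled, and sending the error message to stderr is a choice the transcript does not specify.

```python
import sys

EXIT_OK = 0
EXIT_INVALID_INPUT = 2


def parse_pid_argument(text: str):
    """Return the PID as a positive int, or None if text is not valid."""
    if text.isascii() and text.isdigit() and int(text) > 0:
        return int(text)
    return None


def main(argv) -> int:
    """Validate arguments and return the process exit code."""
    if len(argv) == 2 and argv[0] == "--pid":
        if parse_pid_argument(argv[1]) is None:
            print("error: invalid PID", file=sys.stderr)
            return EXIT_INVALID_INPUT
    # The scan and table output would happen here in the real tool.
    return EXIT_OK


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Returning the code from main instead of calling sys.exit inside it keeps the failure path easy to test.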

4. Solution Architecture

4.1 High-Level Design

/proc scan -> parse stat/status -> format -> print table

4.2 Key Components

Component  Responsibility      Key Decisions
Scanner    Find numeric PIDs   Ignore non-numeric dirs
Parser     Parse stat/status   Handle comm field safely
Formatter  Align table output  Fixed column widths

4.3 Data Structures (No Full Code)

  • Process record: pid, user, state, cpu_ms, rss_kb, cmd.
  • Parser state: raw line + parsed fields.
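For a Python implementation, the process record and a fixed-width formatter might look like the sketch below. It is an outline rather than full code: the class name, the ROW_FORMAT string, and the column widths are all assumptions chosen for illustration, and a real tool may want to size columns from the data.

```python
from dataclasses import dataclass


@dataclass
class ProcessRecord:
    pid: int
    user: str
    state: str
    cpu_ms: int
    rss_kb: int
    cmd: str


# Left-aligned fixed-width columns; CMD is last so it can run long.
ROW_FORMAT = "{:<7}{:<10}{:<7}{:<10}{:<10}{}"


def format_header() -> str:
    return ROW_FORMAT.format("PID", "USER", "STATE", "CPU(ms)", "RSS(KB)", "CMD")


def format_row(rec: ProcessRecord) -> str:
    return ROW_FORMAT.format(rec.pid, rec.user, rec.state,
                             rec.cpu_ms, rec.rss_kb, rec.cmd)


print(format_header())
print(format_row(ProcessRecord(1, "root", "S", 12345, 4567, "/sbin/init")))
```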

4.4 Algorithm Overview

Key Algorithm: proc scan

  1. List /proc directories.
  2. For each numeric PID, read stat/status.
  3. Extract fields and print.

Complexity Analysis:

  • Time: O(p) for p processes.
  • Space: O(p) for stored records.

5. Implementation Guide

5.1 Development Environment Setup

# Install a C or Python runtime

5.2 Project Structure

project-root/
├── src/
│   ├── myps.c
│   └── parse.c
├── tests/
│   └── ps_tests.sh
└── README.md

5.3 The Core Question You’re Answering

“How does ps know what is running without special syscalls?”

5.4 Concepts You Must Understand First

  1. procfs layout
    • What files exist under /proc/[pid]?
    • Book Reference: “The Linux Programming Interface” - procfs sections
  2. Process state codes
    • What do R, S, D, and Z mean?
    • Book Reference: “Operating Systems: Three Easy Pieces” - CPU chapters

5.5 Questions to Guide Your Design

  1. How will you parse the command name in /proc/[pid]/stat?
  2. How will you handle kernel threads with empty cmdline?

5.6 Thinking Exercise

Manual Inspection

Read /proc/self/stat and map the state field to the state letter.

5.7 The Interview Questions They’ll Ask

  1. “What is procfs?”
  2. “Why does ps not need a special syscall?”
  3. “What does state D mean?”
  4. “How do you compute CPU time from ticks?”

5.8 Hints in Layers

Hint 1: Start with status. Use /proc/[pid]/status for readable fields.

Hint 2: Parse stat carefully. The comm field is wrapped in parentheses.

Hint 3: Convert ticks. Use the system tick rate to convert to ms.

Hint 4: Debugging. Compare output with ps -o for the same PID.

5.9 Books That Will Help

Topic       Book                                    Chapter
procfs      “The Linux Programming Interface”       procfs sections
Scheduling  “Operating Systems: Three Easy Pieces”  CPU chapters

5.10 Implementation Phases

Phase 1: Foundation (1 day)

Goals:

  • Scan /proc and list PIDs.

Tasks:

  1. Filter numeric directories.
  2. Read status for each PID.

Checkpoint: PIDs listed with state.

Phase 2: Core Functionality (1 day)

Goals:

  • Parse stat and compute CPU time.

Tasks:

  1. Extract state and time fields.
  2. Convert ticks to ms.

Checkpoint: CPU times match ps for sample PID.

Phase 3: Polish & Edge Cases (1 day)

Goals:

  • Handle races and errors.

Tasks:

  1. Skip missing files gracefully.
  2. Normalize output formatting.

Checkpoint: No crashes under rapid process churn.

5.11 Key Implementation Decisions

Decision       Options         Recommendation  Rationale
Source file    status vs stat  stat + status   balance precision and readability
Output format  table vs JSON   table           matches ps behavior

6. Testing Strategy

6.1 Test Categories

Category           Purpose             Examples
Unit Tests         Parser correctness  comm field parsing
Integration Tests  Full scan           compare with ps
Edge Case Tests    PID exit race       simulate short-lived process

6.2 Critical Test Cases

  1. Running process: state R for current process.
  2. Sleeping process: state S for idle shell.
  3. Zombie: detect Z for a forked child that has exited while its parent has not yet called wait.
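The zombie case can be reproduced on demand. The sketch below forks a child that exits immediately, polls the child's stat file until the state reads Z, and then reaps it. It assumes a Linux system with /proc mounted; the function names and the two-second timeout are hypothetical choices for illustration.

```python
import os
import time


def stat_state(pid: int) -> str:
    """Return the state letter (field 3) from /proc/[pid]/stat."""
    with open(f"/proc/{pid}/stat") as f:
        line = f.read()
    # Skip past the parenthesized comm field before splitting.
    return line[line.rindex(")") + 1:].split()[0]


def make_zombie_and_observe(timeout_s: float = 2.0) -> str:
    """Fork a child that exits at once; report its state before reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without cleanup handlers
    try:
        deadline = time.monotonic() + timeout_s
        state = stat_state(pid)
        # The child may need a moment to finish exiting.
        while state != "Z" and time.monotonic() < deadline:
            time.sleep(0.01)
            state = stat_state(pid)
        return state
    finally:
        os.waitpid(pid, 0)  # reap the child so no zombie is left behind
```

Because the parent does not call waitpid until the finally block, the child's /proc entry stays readable for the whole polling loop.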

6.3 Test Data

Sample PIDs and expected states

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall          Symptom        Solution
Bad parsing      wrong fields   handle comm field parentheses
Race conditions  missing files  catch ENOENT and skip
Wrong CPU units  huge numbers   convert ticks correctly

7.2 Debugging Strategies

  • Compare with ps: validate a single PID.
  • Log raw lines: inspect stat/status lines for parsing issues.

7.3 Performance Traps

Full scans on very large systems can be slow; avoid repeated scans in a tight loop.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Filter by user.
  • Sort by CPU time.

8.2 Intermediate Extensions

  • Add CPU usage percentages over time.
  • Add memory usage charts.
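For the CPU percentage extension, the core arithmetic is a difference of two tick samples divided by the time between them. The sketch below uses wall-clock time as the denominator, which is a simplification chosen here; a tool could instead use the system-wide totals from /proc/stat. The function name and parameters are hypothetical, and the rate of 100 ticks per second in the example call is an assumed value.

```python
def cpu_percent(ticks_before: int, ticks_after: int,
                elapsed_s: float, hz: int) -> float:
    """Share of one CPU used between two samples, as a percentage."""
    if elapsed_s <= 0:
        return 0.0  # avoid dividing by zero on back-to-back samples
    cpu_seconds = (ticks_after - ticks_before) / hz
    return 100.0 * cpu_seconds / elapsed_s


# 50 ticks consumed over 1 second at an ASSUMED 100 ticks per second
# is 0.5 CPU-seconds, i.e. 50% of one CPU.
print(cpu_percent(100, 150, 1.0, 100))
```

On a multi-core machine a multithreaded process can exceed 100% under this definition.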

8.3 Advanced Extensions

  • Implement a live updating TUI.
  • Export data in JSON for dashboards.

9. Real-World Connections

9.1 Industry Applications

  • Monitoring agents: process-level metrics.
  • Incident response: diagnosing stuck processes.

9.2 Related Open Source Projects

  • procps: https://gitlab.com/procps-ng/procps - ps/top tools.
  • htop: https://htop.dev/ - interactive process viewer.

9.3 Interview Relevance

Process states and procfs are common system interview topics.


10. Resources

10.1 Essential Reading

  • procfs documentation - docs.kernel.org
  • “The Linux Programming Interface” - procfs sections

10.2 Video Resources

  • “Understanding /proc” - lectures (search title)

10.3 Tools & Documentation

  • procfs docs: https://docs.kernel.org/filesystems/proc.html
  • man7: https://man7.org/linux/man-pages/man5/proc.5.html

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain how procfs is generated
  • I can parse stat fields correctly
  • I understand process state codes

11.2 Implementation

  • All functional requirements are met
  • Races are handled gracefully
  • Output is deterministic

11.3 Growth

  • I can explain this tool in an interview
  • I documented lessons learned
  • I can propose an extension

12. Submission / Completion Criteria

Minimum Viable Completion:

  • List processes with state and command
  • Parse stat safely
  • Handle missing PIDs

Full Completion:

  • All minimum criteria plus:
  • Show CPU time and RSS
  • Provide failure demo with exit code

Excellence (Going Above & Beyond):

  • Live updating interface
  • Export metrics for monitoring systems