Project 5: Process Psychic (procfs Inspector)
Build a ps-like tool that reads /proc and reports process state, memory, and CPU time.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | Weekend |
| Main Programming Language | C or Python |
| Alternative Programming Languages | Go, Rust |
| Coolness Level | See REFERENCE.md (Level 3) |
| Business Potential | See REFERENCE.md (Level 3) |
| Prerequisites | File parsing, process model, basic stats |
| Key Topics | procfs, process states, CPU ticks |
1. Learning Objectives
By completing this project, you will:
- Explain how procfs exposes kernel data structures.
- Parse process state and resource usage reliably.
- Convert kernel time units into human-readable values.
- Build a deterministic snapshot of running processes.
2. All Theory Needed (Per-Concept Breakdown)
procfs and Process State Introspection
Fundamentals
procfs is a virtual filesystem that exposes kernel state as files. Each running process has a directory under /proc containing files like stat, status, and cmdline. These files are not stored on disk; they are generated by the kernel when read. This makes procfs an essential introspection tool: by reading text files, you can observe process state, memory usage, CPU time, and command-line arguments. Understanding procfs is critical for building tools like ps, top, and custom monitoring agents.
Deep Dive
procfs is a deliberately simple interface: it maps kernel state into a file hierarchy. The kernel constructs file contents on demand, which means the data reflects a momentary snapshot. This is powerful but also fragile: a process can exit while you are reading its procfs files, so your code must handle missing files gracefully. Most procfs files are formatted as human-readable text, but their formats are precise and sometimes tricky to parse. For example, /proc/[pid]/stat is a single line with many fields; the second field (the command name) is enclosed in parentheses and can contain spaces, so naive space splitting fails. A correct parser must locate the matching closing parenthesis before splitting the remaining fields.
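A minimal sketch of the parenthesis-aware parse described above. The function name `parse_stat_line` and the dictionary keys are illustrative choices, not part of the project spec; the field positions follow proc(5). The sample line is made up for the example.

```python
def parse_stat_line(line: str) -> dict:
    """Parse a /proc/[pid]/stat line into a few named fields."""
    # comm may contain spaces and even ')' characters, so anchor on the
    # first '(' and the LAST ')' instead of splitting on whitespace.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    # Everything after the closing parenthesis is plain space-separated.
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    rest = line[close_idx + 1:].split()
    return {
        "pid": pid,
        "comm": comm,
        "state": rest[0],
        "ppid": int(rest[1]),
        "utime_ticks": int(rest[11]),  # field 14
        "stime_ticks": int(rest[12]),  # field 15
    }


# Hypothetical sample line; note the space inside the command name.
SAMPLE = ("1042 (tmux: server) R 1 1042 1042 0 -1 4194304 500 0 0 0 "
          "120 40 0 0 20 0 1 0 12345 10000000 512")
```

Calling `parse_stat_line(SAMPLE)` yields `comm == "tmux: server"` and `state == "R"`, where a plain `split()` would have reported `server)` as the state.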
Process states are represented by single-letter codes, such as R (running or runnable), S (interruptible sleep), D (uninterruptible sleep), Z (zombie), and T (stopped). These states are defined by the kernel scheduler and are visible in procfs. Apart from R, the state reflects why a process is not running: it might be blocked on I/O (D), waiting for an event such as input or a timer (S), suspended by a stop signal (T), or defunct and awaiting reaping by its parent (Z). Understanding these states allows you to explain why a process appears stuck and whether it will respond to signals.
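One way to keep these codes readable in output is a small lookup table. This is only a sketch: `STATE_NAMES` and `describe_state` are hypothetical names, and the table covers just the five codes discussed here, so anything else falls through to an "unknown" label.

```python
# Only the codes discussed in this section; the kernel defines others.
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "Z": "zombie",
    "T": "stopped",
}


def describe_state(code: str) -> str:
    """Return a human-readable label for a one-letter state code."""
    return STATE_NAMES.get(code, f"unknown ({code})")
```

Falling back to a label instead of raising keeps the tool working when a kernel reports a code the table does not list.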
CPU time accounting is another critical detail. The kernel reports per-process CPU usage in clock ticks (often loosely called jiffies), not seconds. To convert these values to time, divide by the system tick rate, which you can query with sysconf(_SC_CLK_TCK) in C or os.sysconf("SC_CLK_TCK") in Python; it is 100 on most Linux systems. A monitoring tool typically reads /proc/stat for system-wide CPU time and /proc/[pid]/stat for per-process CPU time (the utime and stime fields), then calculates percentages over a time interval. For this project, a snapshot is sufficient, but you should still understand the unit conversion.
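The unit conversion can be sketched as follows. The helper names are assumptions made for this example; the tick rate is passed in explicitly so the arithmetic can be checked without touching the system.

```python
import os


def system_tick_rate() -> int:
    """Clock ticks per second, as reported by the C library (Unix only)."""
    return os.sysconf("SC_CLK_TCK")


def ticks_to_ms(ticks: int, ticks_per_second: int) -> int:
    """Convert a tick count to whole milliseconds."""
    # Multiply before dividing so integer division does not lose precision.
    return ticks * 1000 // ticks_per_second


def cpu_time_ms(utime_ticks: int, stime_ticks: int, ticks_per_second: int) -> int:
    """Total CPU time (user + system) in milliseconds."""
    return ticks_to_ms(utime_ticks + stime_ticks, ticks_per_second)
```

With the values from a hypothetical process (utime=120, stime=40) and a tick rate of 100, `cpu_time_ms(120, 40, 100)` gives 1600 ms.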
Memory usage in procfs is also nuanced. The status file lists multiple fields such as VmRSS (resident set size) and VmSize (virtual memory size), both reported in kB. RSS indicates how much physical memory the process currently holds, while virtual size also includes mapped but non-resident pages. For a basic tool, RSS is the most informative. Understanding the difference prevents you from misinterpreting memory usage.
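A sketch of reading these memory fields from the status file's "Key: value" lines. `parse_status` and `kb_field` are hypothetical helper names, and the sample text is invented. Kernel threads have no `Vm*` lines at all, so the lookup returns `None` instead of failing.

```python
from typing import Optional


def parse_status(text: str) -> dict:
    """Turn the 'Key:\\tvalue' lines of /proc/[pid]/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no colon
            fields[key.strip()] = value.strip()
    return fields


def kb_field(fields: dict, key: str) -> Optional[int]:
    """Read a '<number> kB' value such as VmRSS; None if the key is absent."""
    raw = fields.get(key)
    if raw is None:
        return None
    return int(raw.split()[0])


# Hypothetical excerpt of a status file.
SAMPLE_STATUS = "Name:\tbash\nState:\tS (sleeping)\nVmSize:\t   10240 kB\nVmRSS:\t    2048 kB\n"
```

For the sample above, `kb_field(parse_status(SAMPLE_STATUS), "VmRSS")` is 2048, and VmSize is the larger figure, as expected.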
Finally, procfs is a virtual filesystem, which means its contents can be inconsistent across reads. A robust tool anticipates races: a process may die between reading its directory name and reading its stat file; permissions may deny access for some processes; and the data can change between fields. Your tool should handle these gracefully by skipping or marking entries rather than crashing. This is what makes monitoring tools reliable in real systems.
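The race handling described above might look like the sketch below. `read_proc_file` is a hypothetical helper, and its `root` parameter exists only so the function can be exercised against a fake directory tree. It collapses "gone" and "denied" into `None`; a fuller tool may prefer to mark denied entries rather than skip them.

```python
from pathlib import Path
from typing import Optional


def read_proc_file(pid: int, name: str, root: str = "/proc") -> Optional[str]:
    """Read /proc/<pid>/<name>, returning None if the process is gone or off-limits."""
    path = Path(root) / str(pid) / name
    try:
        # errors="replace" keeps odd bytes in command names from raising.
        return path.read_text(errors="replace")
    except (FileNotFoundError, ProcessLookupError):
        # The process exited between listing /proc and reading the file.
        return None
    except PermissionError:
        # The file belongs to a process we are not allowed to inspect.
        return None
```

A scan loop can then simply `continue` whenever the result is `None`, so one vanished PID never aborts the whole snapshot.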
How this fits into the project
You will apply this concept in §3.1 to define required fields, in §4.2 to design parsers, and in §6.2 for test cases. It also supports P02-syscall-tracer.md by exposing process state transitions.
Definitions & key terms
- procfs: Virtual filesystem exposing kernel state.
- RSS: Resident Set Size, physical memory used by a process.
- Jiffy: Kernel time unit used for CPU accounting.
- Zombie: Defunct process awaiting parent reaping.
- State code: Single-letter process state indicator.
Mental model diagram
/proc
  /1234
    stat    -> process state, CPU time
    status  -> memory, credentials
    cmdline -> command arguments
How it works
- List numeric directories under /proc.
- For each PID, read stat and status files.
- Parse fields into a structured record.
- Print a deterministic snapshot.
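The first step of the list above, enumerating numeric directories, can be sketched as follows. `list_pids` is a hypothetical name, and the `root` parameter is an assumption added so the function can be pointed at a test directory. Sorting the result is one simple way to make the snapshot order deterministic.

```python
import os


def list_pids(root: str = "/proc") -> list:
    """Return the PIDs found under root, in ascending order."""
    pids = []
    for name in os.listdir(root):
        # Process directories are named with plain ASCII digits; everything
        # else (self, cpuinfo, sys, ...) is skipped.
        if name.isascii() and name.isdigit():
            pids.append(int(name))
    return sorted(pids)
```

Each PID returned here is only a hint: the process may already be gone by the time its files are read.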
Minimal concrete example
proc snapshot (conceptual):
PID 1042: state=R, utime=120, stime=40, rss=2048 KB
Common misconceptions
- “procfs files are stored on disk.” They are generated on demand.
- “stat is easy to parse with split.” The comm field breaks naive parsing.
- “RSS equals virtual memory size.” They are different measures.
Check-your-understanding questions
- Why can a procfs read fail even if the PID was just listed?
- What is the difference between RSS and virtual size?
- Why does /proc/[pid]/stat require special parsing?
- What does process state D mean?
Check-your-understanding answers
- The process may exit between listing and reading.
- RSS counts pages currently resident in physical memory; virtual size also includes mapped pages that are not resident (never touched or swapped out).
- The command field is in parentheses and can contain spaces.
- It is uninterruptible sleep, typically waiting for I/O.
Real-world applications
- Building monitoring agents and diagnostic tools.
- Investigating performance issues in production.
- Explaining why processes appear stuck.
Where you’ll apply it
- See §3.1 What You Will Build and §4.1 High-Level Design.
- Also used in: P02-syscall-tracer.md
References
- procfs documentation: https://docs.kernel.org/filesystems/proc.html
- “The Linux Programming Interface” - procfs sections
Key insights
procfs is the kernel’s structured narrative about process state.
Summary
If you can parse procfs, you can explain what processes are doing right now.
Homework/Exercises to practice the concept
- Inspect /proc/self/stat and identify the state field.
- Compare RSS and virtual size for a running process.
Solutions to the homework/exercises
- The state is the third field in stat, immediately after the command name.
- RSS is usually smaller than virtual size.
3. Project Specification
3.1 What You Will Build
A command-line tool myps that scans /proc, reads process state and memory usage, and prints a table of PID, user, state, CPU time, RSS, and command.
3.2 Functional Requirements
- Process listing: Enumerate numeric /proc entries.
- State parsing: Extract state, CPU time, and memory usage.
- Command display: Read command line or name reliably.
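The requirements above do not say where the USER column comes from. One plausible source, sketched here, is the `Uid:` line of the status file, whose four values are the real, effective, saved-set, and filesystem UIDs. This sketch picks the effective UID and resolves it with the standard `pwd` module; the function names and the injectable `lookup` parameter are assumptions made for the example.

```python
import pwd


def effective_uid(status_text: str) -> int:
    """Extract the effective UID from the text of /proc/[pid]/status."""
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            # Layout: Uid: <real> <effective> <saved-set> <filesystem>
            return int(line.split()[2])
    raise ValueError("no Uid line found")


def username_for_uid(uid: int, lookup=pwd.getpwuid) -> str:
    """Resolve a UID to a user name, falling back to the number itself."""
    try:
        return lookup(uid).pw_name
    except KeyError:
        # No passwd entry (common in containers): show the numeric UID.
        return str(uid)
```

Falling back to the numeric UID keeps the table complete even when a process runs under an account the local user database does not know.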
3.3 Non-Functional Requirements
- Performance: Must handle hundreds of processes quickly.
- Reliability: Must tolerate processes exiting during scan.
- Usability: Output aligned and easy to read.
3.4 Example Usage / Output
$ ./myps
PID USER STATE CPU(ms) RSS(KB) CMD
1 root S 12345 4567 /sbin/init
3.5 Data Formats / Schemas / Protocols
- /proc/[pid]/stat fields.
- /proc/[pid]/status key-value pairs.
3.6 Edge Cases
- PID disappears while reading.
- Permission denied for a process owned by another user.
- Command line is empty (kernel threads).
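The empty-cmdline case can be handled with a fallback to the short command name. In this sketch, `format_cmdline` is a hypothetical helper that takes the raw bytes of /proc/[pid]/cmdline (arguments separated by NUL bytes) and the comm value. Wrapping the name in square brackets mirrors how ps displays kernel threads.

```python
def format_cmdline(raw: bytes, comm: str) -> str:
    """Render a command for display, falling back to [comm] when empty."""
    # Arguments are NUL-separated; drop the empty piece after the last NUL.
    args = [part.decode("utf-8", errors="replace")
            for part in raw.split(b"\0") if part]
    if not args:
        # Kernel threads (and some zombies) expose an empty cmdline.
        return f"[{comm}]"
    return " ".join(args)
```

For example, `format_cmdline(b"", "kthreadd")` gives `[kthreadd]`, while a normal process shows its arguments joined by spaces.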
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
- Build in project-root.
- Run ./myps in a terminal.
3.7.2 Golden Path Demo (Deterministic)
Run on a quiet system with a fixed set of test processes.
3.7.3 Exact Terminal Transcript
$ ./myps
PID USER STATE CPU(ms) RSS(KB) CMD
1 root S 12000 4500 /sbin/init
200 user R 150 2100 ./myps
# exit code: 0
Failure demo (deterministic):
$ ./myps --pid not-a-number
error: invalid PID
# exit code: 2
Exit codes:
- 0 success
- 2 invalid input
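A sketch of how the failure demo and exit codes could be wired up. The `--pid` flag and the error text come from the transcript above; the function names, the hand-rolled argument handling, and the choice to print errors to stderr are assumptions. The actual scan is left out.

```python
import sys

EXIT_OK = 0
EXIT_INVALID_INPUT = 2


def parse_pid_argument(value: str) -> int:
    """Validate a --pid value; raise ValueError if it is not a positive integer."""
    if not (value.isascii() and value.isdigit()) or int(value) < 1:
        raise ValueError("invalid PID")
    return int(value)


def main(argv: list) -> int:
    """Return the process exit code for the given arguments (without argv[0])."""
    pid_filter = None
    if len(argv) == 2 and argv[0] == "--pid":
        try:
            pid_filter = parse_pid_argument(argv[1])
        except ValueError as exc:
            print(f"error: {exc}", file=sys.stderr)
            return EXIT_INVALID_INPUT
    elif argv:
        print("usage: myps [--pid PID]", file=sys.stderr)
        return EXIT_INVALID_INPUT
    # The /proc scan (optionally restricted to pid_filter) would go here.
    return EXIT_OK
```

A script entry point would then end with `sys.exit(main(sys.argv[1:]))`, so `--pid not-a-number` exits with code 2.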
4. Solution Architecture
4.1 High-Level Design
/proc scan -> parse stat/status -> format -> print table
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Scanner | Find numeric PIDs | Ignore non-numeric dirs |
| Parser | Parse stat/status | Handle comm field safely |
| Formatter | Align table output | Fixed column widths |
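The Formatter's fixed-column-width decision can be sketched with a single format string. The widths below are arbitrary illustrative choices, and `format_table` is a hypothetical name. Numeric columns are right-aligned and the command is left unpadded because it comes last.

```python
HEADER = ("PID", "USER", "STATE", "CPU(ms)", "RSS(KB)", "CMD")
# Fixed widths: right-align numbers, left-align text, no padding on CMD.
ROW_FORMAT = "{:>7} {:<10} {:<5} {:>10} {:>10} {}"


def format_table(rows) -> str:
    """Render (pid, user, state, cpu_ms, rss_kb, cmd) tuples as an aligned table."""
    lines = [ROW_FORMAT.format(*HEADER)]
    for row in rows:
        lines.append(ROW_FORMAT.format(*row))
    return "\n".join(lines)
```

Because the header goes through the same format string as the rows, the columns line up by construction.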
4.3 Data Structures (No Full Code)
- Process record: pid, user, state, cpu_ms, rss_kb, cmd.
- Parser state: raw line + parsed fields.
4.4 Algorithm Overview
Key Algorithm: proc scan
- List /proc directories.
- For each numeric PID, read stat/status.
- Extract fields and print.
Complexity Analysis:
- Time: O(p) for p processes.
- Space: O(p) for stored records.
5. Implementation Guide
5.1 Development Environment Setup
# Install a C or Python runtime
5.2 Project Structure
project-root/
├── src/
│ ├── myps.c
│ └── parse.c
├── tests/
│ └── ps_tests.sh
└── README.md
5.3 The Core Question You’re Answering
“How does ps know what is running without special syscalls?”
5.4 Concepts You Must Understand First
- procfs layout
  - What files exist under /proc/[pid]?
  - Book Reference: “The Linux Programming Interface” - procfs sections
- Process state codes
  - What do R, S, D, and Z mean?
  - Book Reference: “Operating Systems: Three Easy Pieces” - CPU chapters
5.5 Questions to Guide Your Design
- How will you parse the command name in /proc/[pid]/stat?
- How will you handle kernel threads with empty cmdline?
5.6 Thinking Exercise
Manual Inspection
Read /proc/self/stat, locate the state field, and map its letter to the state it stands for.
5.7 The Interview Questions They’ll Ask
- “What is procfs?”
- “Why does ps not need a special syscall?”
- “What does state D mean?”
- “How do you compute CPU time from ticks?”
5.8 Hints in Layers
Hint 1: Start with status
Use /proc/[pid]/status for readable fields.
Hint 2: Parse stat carefully
The comm field is wrapped in parentheses.
Hint 3: Convert ticks
Use the system tick rate to convert to ms.
Hint 4: Debugging
Compare output with ps -o for the same PID.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| procfs | “The Linux Programming Interface” | procfs sections |
| Scheduling | “Operating Systems: Three Easy Pieces” | CPU chapters |
5.10 Implementation Phases
Phase 1: Foundation (1 day)
Goals:
- Scan /proc and list PIDs.
Tasks:
- Filter numeric directories.
- Read status for each PID.
Checkpoint: PIDs listed with state.
Phase 2: Core Functionality (1 day)
Goals:
- Parse stat and compute CPU time.
Tasks:
- Extract state and time fields.
- Convert ticks to ms.
Checkpoint: CPU times match ps for sample PID.
Phase 3: Polish & Edge Cases (1 day)
Goals:
- Handle races and errors.
Tasks:
- Skip missing files gracefully.
- Normalize output formatting.
Checkpoint: No crashes under rapid process churn.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Source file | status vs stat | stat + status | balance precision and readability |
| Output format | table vs JSON | table | matches ps behavior |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Parser correctness | comm field parsing |
| Integration Tests | Full scan | compare with ps |
| Edge Case Tests | PID exit race | simulate short-lived process |
6.2 Critical Test Cases
- Running process: state R for current process.
- Sleeping process: state S for idle shell.
- Zombie: detect Z after a fork without wait.
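The zombie case can be reproduced on demand. The sketch below is Linux-only: it forks a child that exits immediately, deliberately delays the `wait`, and polls the child's stat file until the state reads Z. The function names and the two-second timeout are assumptions; the child is always reaped at the end so the test leaves nothing behind.

```python
import os
import time


def stat_state(pid: int) -> str:
    """Return the one-letter state from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as handle:
        line = handle.read()
    # The state is the first token after the last ')' of the comm field.
    return line[line.rindex(")") + 1:].split()[0]


def make_zombie_and_observe(timeout: float = 2.0) -> str:
    """Fork a child that exits at once; report its state before reaping it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without cleanup handlers
    try:
        deadline = time.monotonic() + timeout
        state = stat_state(pid)
        # The child may need a moment to exit; poll until it shows as Z.
        while state != "Z" and time.monotonic() < deadline:
            time.sleep(0.01)
            state = stat_state(pid)
        return state
    finally:
        os.waitpid(pid, 0)  # reap the child so no zombie is left behind
```

Until `waitpid` runs, the child's /proc entry stays in place, which is what makes the observation reliable rather than racy.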
6.3 Test Data
Sample PIDs and expected states
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Bad parsing | wrong fields | handle comm field parentheses |
| Race conditions | missing files | catch ENOENT and skip |
| Wrong CPU units | huge numbers | convert ticks correctly |
7.2 Debugging Strategies
- Compare with ps: validate a single PID.
- Log raw lines: inspect stat/status lines for parsing issues.
7.3 Performance Traps
Full scans on very large systems can be slow; avoid repeated scans in a tight loop.
8. Extensions & Challenges
8.1 Beginner Extensions
- Filter by user.
- Sort by CPU time.
8.2 Intermediate Extensions
- Add CPU usage percentages over time.
- Add memory usage charts.
8.3 Advanced Extensions
- Implement a live updating TUI.
- Export data in JSON for dashboards.
9. Real-World Connections
9.1 Industry Applications
- Monitoring agents: process-level metrics.
- Incident response: diagnosing stuck processes.
9.2 Related Open Source Projects
- procps: https://gitlab.com/procps-ng/procps - ps/top tools.
- htop: https://htop.dev/ - interactive process viewer.
9.3 Interview Relevance
Process states and procfs are common system interview topics.
10. Resources
10.1 Essential Reading
- procfs documentation - docs.kernel.org
- “The Linux Programming Interface” - procfs sections
10.2 Video Resources
- “Understanding /proc” - lectures (search title)
10.3 Tools & Documentation
- procfs docs: https://docs.kernel.org/filesystems/proc.html
- man7: https://man7.org/linux/man-pages/man5/proc.5.html
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how procfs is generated
- I can parse stat fields correctly
- I understand process state codes
11.2 Implementation
- All functional requirements are met
- Races are handled gracefully
- Output is deterministic
11.3 Growth
- I can explain this tool in an interview
- I documented lessons learned
- I can propose an extension
12. Submission / Completion Criteria
Minimum Viable Completion:
- List processes with state and command
- Parse stat safely
- Handle missing PIDs
Full Completion:
- All minimum criteria plus:
- Show CPU time and RSS
- Provide failure demo with exit code
Excellence (Going Above & Beyond):
- Live updating interface
- Export metrics for monitoring systems