Project 4: The Process Psychic (Process Inspector)
Build a ps-like tool by parsing /proc directly.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | Weekend |
| Language | C or Python (Alt: Go) |
| Prerequisites | File I/O, string parsing |
| Key Topics | /proc parsing, process states, CPU usage |
1. Learning Objectives
By completing this project, you will:
- Scan /proc for running processes.
- Parse /proc/<pid>/stat or /proc/<pid>/status.
- Display command line, state, and memory stats.
- Compute CPU usage from jiffies.
2. Theoretical Foundation
2.1 Core Concepts
- /proc: A virtual filesystem exposing kernel process data.
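- Process states: Single-letter codes such as R (running or runnable), S (interruptible sleep), D (uninterruptible sleep, usually I/O), and Z (zombie) describe scheduler state.
- Jiffies: Kernel timer ticks used for CPU accounting. The utime and stime values in /proc/<pid>/stat are reported in clock ticks; divide by sysconf(_SC_CLK_TCK) to get seconds.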
2.2 Why This Matters
On Linux, tools like ps and top are essentially /proc parsers. This project removes the mystery.
2.3 Historical Context / Background
The /proc filesystem first appeared in Eighth Edition Unix and was later adopted by other systems; on Linux it is the standard interface for process introspection.
2.4 Common Misconceptions
- “ps uses special syscalls”: It mostly reads /proc files.
- “cmdline is space-separated”: Arguments in /proc/<pid>/cmdline are separated by NUL (\0) bytes, not spaces.
3. Project Specification
3.1 What You Will Build
A CLI tool that prints PID, user, state, memory usage, and command line for running processes.
3.2 Functional Requirements
- Enumerate numeric /proc entries.
- Parse process state and memory fields.
- Resolve UID to username.
- Print a process table.
3.3 Non-Functional Requirements
- Reliability: Handle processes that exit mid-read.
- Usability: Output columns align and truncate safely.
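One way to get aligned, safely truncated columns is to let printf-style width and precision do the work. The sketch below is only an illustration: the function name format_row and the column widths are assumptions, not part of the project specification.

```c
#include <stdio.h>

/* Hypothetical column widths; adjust to taste. */
enum { PID_W = 7, USER_W = 10, CMD_W = 40 };

/* Format one table row into out.
 * "%-*.*s" pads short strings to the column width and truncates long
 * ones at the same width, so a long user name cannot shift later columns.
 * Returns what snprintf returns (the length the full row would need). */
int format_row(char *out, size_t outsz, int pid, const char *user,
               char state, const char *cmd)
{
    return snprintf(out, outsz, "%*d %-*.*s %c %.*s",
                    PID_W, pid,
                    USER_W, USER_W, user,
                    state,
                    CMD_W, cmd);
}
```

Because snprintf never writes more than outsz bytes, an oversized command line is cut off rather than overflowing the buffer.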
3.4 Example Usage / Output
$ ./myps
PID USER STATE CMD
1 root S /sbin/init
3.5 Real World Outcome
You will run myps and see a table similar to ps:
$ ./myps
PID USER STATE CMD
1 root S /sbin/init
4. Solution Architecture
4.1 High-Level Design
scan /proc -> parse stat/status -> resolve uid -> render table
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Scanner | Iterate /proc | Filter numeric dirs |
| Parser | Extract state, mem | status vs stat |
| Resolver | UID -> name | getpwuid |
| Renderer | Output table | Fixed columns |
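The Resolver row above names getpwuid. A minimal sketch of that component might look like the following; the helper name uid_to_name and the numeric fallback are assumptions made for illustration.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Write the user name for uid into out.
 * Falls back to the numeric UID when there is no passwd entry,
 * which can happen inside containers or for deleted accounts. */
void uid_to_name(uid_t uid, char *out, size_t outsz)
{
    /* getpwuid returns a pointer to static storage, so copy the name
     * out immediately; it is not safe to call from multiple threads. */
    struct passwd *pw = getpwuid(uid);
    if (pw != NULL)
        snprintf(out, outsz, "%s", pw->pw_name);
    else
        snprintf(out, outsz, "%lu", (unsigned long)uid);
}
```

Since many processes share a UID, caching the last lookup is a cheap optimization if profiling shows the resolver matters.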
4.3 Data Structures
struct proc_info { pid_t pid; char state; long rss; char cmd[256]; };
4.4 Algorithm Overview
Key Algorithm: CPU%
- Read utime+stime.
- Sample again after interval.
- Divide the process's tick delta by the total CPU tick delta over the same interval (for example, from the cpu line in /proc/stat).
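The steps above can be sketched as a pure function over two samples. The struct and function names here are hypothetical, and the sketch assumes total_ticks is the sum of the fields on the aggregate cpu line of /proc/stat.

```c
/* One sample: ticks used by the process and by the whole machine. */
struct cpu_sample {
    unsigned long long proc_ticks;   /* utime + stime from /proc/<pid>/stat */
    unsigned long long total_ticks;  /* assumed: sum of the "cpu" line in /proc/stat */
};

/* Percentage of total machine CPU time the process used between
 * the two samples. Returns 0.0 if the interval is empty or a
 * counter appears to have gone backwards. */
double cpu_percent(struct cpu_sample before, struct cpu_sample after)
{
    if (after.total_ticks <= before.total_ticks ||
        after.proc_ticks < before.proc_ticks)
        return 0.0;

    unsigned long long dproc = after.proc_ticks - before.proc_ticks;
    unsigned long long dtotal = after.total_ticks - before.total_ticks;
    return 100.0 * (double)dproc / (double)dtotal;
}
```

Under this normalization a single busy thread on a four-core machine reports roughly 25%. top's default display scales per core instead, so expect its numbers to differ by about the core count.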
Complexity Analysis:
- Time: O(n) per scan
- Space: O(n)
5. Implementation Guide
5.1 Development Environment Setup
gcc --version
5.2 Project Structure
project-root/
├── myps.c
└── README.md
5.3 The Core Question You’re Answering
“How does top know what’s running right now?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- /proc layout
- Null-separated cmdline
- Tick to seconds conversion
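For the tick-to-seconds item, a small sketch may help. It assumes the tick counts come from the utime and stime fields; the function name is hypothetical.

```c
#include <unistd.h>

/* Convert clock ticks (as reported for utime/stime) to seconds.
 * Returns -1.0 if the tick rate cannot be determined. */
double ticks_to_seconds(unsigned long long ticks)
{
    /* Query the rate at run time instead of hard-coding it;
     * it is commonly 100 on Linux, but that is not guaranteed. */
    long hz = sysconf(_SC_CLK_TCK);
    if (hz <= 0)
        return -1.0;
    return (double)ticks / (double)hz;
}
```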
5.5 Questions to Guide Your Design
Before implementing, think through these:
- Will you parse stat or status for memory values?
- How will you handle permissions for root-owned processes?
- How frequently will you sample for CPU%?
5.6 Thinking Exercise
Manual /proc
Inspect /proc/self/status and /proc/self/cmdline and compare to ps output.
5.7 The Interview Questions They’ll Ask
Prepare to answer these:
- “What is /proc and why is it virtual?”
- “How do you interpret the process state letters?”
- “Why is cmdline null-separated?”
5.8 Hints in Layers
Hint 1: Numeric directories
Check that every character of the directory name passes isdigit.
Hint 2: cmdline parsing
Replace \0 with spaces when printing.
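As a sketch of this hint, the helper below rewrites a cmdline buffer in place. The name cmdline_to_printable is an assumption, and the function relies on the caller passing the byte count returned by read, since strlen would stop at the first NUL.

```c
#include <stddef.h>

/* Turn the raw bytes of /proc/<pid>/cmdline into a printable string.
 * buf holds len bytes and must have room for at least len + 1 bytes. */
void cmdline_to_printable(char *buf, size_t len)
{
    /* Drop trailing NULs so the result has no trailing spaces. */
    while (len > 0 && buf[len - 1] == '\0')
        len--;

    /* Each remaining NUL separates two arguments. */
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0')
            buf[i] = ' ';
    }
    buf[len] = '\0';
}
```

An empty result is normal for kernel threads, whose cmdline file has no content; falling back to the process name from stat or status is one reasonable way to fill the column.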
Hint 3: Race conditions
Ignore ENOENT when processes disappear.
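One possible shape for this is a read helper that separates "process is gone" from real errors. The function name and return codes are hypothetical. ESRCH is treated like ENOENT on the assumption that a read can fail that way when the process exits between open and read.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Read up to bufsz - 1 bytes of path into buf and NUL-terminate it.
 * bufsz must be at least 1. A single read is assumed to be enough
 * for small files such as stat and status.
 * Returns the byte count, -1 if the process vanished (caller should
 * silently skip it), or -2 on any other error. */
ssize_t read_proc_file(const char *path, char *buf, size_t bufsz)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return (errno == ENOENT || errno == ESRCH) ? -1 : -2;

    ssize_t n = read(fd, buf, bufsz - 1);
    int saved = errno;          /* close may overwrite errno */
    close(fd);

    if (n < 0)
        return (saved == ENOENT || saved == ESRCH) ? -1 : -2;

    buf[n] = '\0';
    return n;
}
```

The scanner can then continue to the next PID on -1 and decide separately how to report -2, for example on permission errors.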
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| /proc | “The Linux Programming Interface” (TLPI) | Ch. 10, 12 |
5.10 Implementation Phases
Phase 1: Foundation (1-2 days)
Goals:
- List PIDs and read cmdline.
Tasks:
- Scan /proc.
- Print PID + cmdline.
Checkpoint: Every PID shown by ps appears in your output with a matching command line.
Phase 2: Core Functionality (2-3 days)
Goals:
- Add state and memory.
Tasks:
- Parse status or stat.
- Add UID -> name.
Checkpoint: Table matches ps fields.
Phase 3: Polish & Edge Cases (1-2 days)
Goals:
- Add CPU% sampling.
Tasks:
- Sample twice.
- Compute and display CPU%.
Checkpoint: CPU% is near zero for an idle process and clearly non-zero for a busy one.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
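| Source | stat vs status | status | Labeled key: value lines are easier to parse; stat is still needed for utime/stime |
| CPU% | single vs two samples | two samples | CPU% is a rate over an interval, so it needs a delta |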
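To illustrate why status is the easier source, here is a sketch that pulls three fields out of the file's text. The struct and function names are hypothetical; the field labels State, Uid, and VmRSS are the ones documented in proc(5).

```c
#include <stdio.h>
#include <string.h>

struct status_info {
    char state;      /* first letter of the State: value */
    unsigned uid;    /* real UID (first number on the Uid: line) */
    long rss_kb;     /* VmRSS in kB; stays -1 if the field is absent */
};

/* Parse the text of /proc/<pid>/status.
 * Returns 0 if both State and Uid were found, -1 otherwise. */
int parse_status(const char *text, struct status_info *out)
{
    int have_state = 0, have_uid = 0;
    out->rss_kb = -1;   /* kernel threads have no VmRSS line */

    const char *line = text;
    while (line != NULL && *line != '\0') {
        /* Each sscanf matches only if the line starts with its label;
         * the space in the format skips the tab after the colon. */
        if (sscanf(line, "State: %c", &out->state) == 1)
            have_state = 1;
        else if (sscanf(line, "Uid: %u", &out->uid) == 1)
            have_uid = 1;
        else
            sscanf(line, "VmRSS: %ld", &out->rss_kb);

        const char *nl = strchr(line, '\n');
        line = (nl != NULL) ? nl + 1 : NULL;
    }
    return (have_state && have_uid) ? 0 : -1;
}
```

Taking the text as a parameter keeps the parser independent of file I/O, so it can be tested with a fixed string.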
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Parsing | Validate fields | Compare with ps |
| UID mapping | User names | getpwuid |
| CPU% | Sampling | Compare with top |
6.2 Critical Test Cases
- Processes that exit mid-read are skipped.
- cmdline prints correctly with spaces.
- CPU% for idle process near zero.
6.3 Test Data
Use /proc/self (the tool's own process) as a known PID.
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
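| Parsing cmdline as a single C string | Only the first argument prints | Replace NUL bytes using the byte count returned by read |
| Wrong field index | Bad state | Split after the last ')' in stat, since the command name may contain spaces or parentheses |
| Permission errors | Crash | Check return values and skip or blank the affected fields |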
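The field-index pitfall usually comes from splitting the stat line on spaces. A sketch that avoids it by anchoring on the last closing parenthesis follows; the names are hypothetical, and the field positions (state is field 3, utime and stime are fields 14 and 15) follow proc(5).

```c
#include <stdio.h>
#include <string.h>

struct stat_fields {
    char state;                   /* field 3 */
    unsigned long utime, stime;   /* fields 14 and 15, in clock ticks */
};

/* Parse one /proc/<pid>/stat line. Returns 0 on success, -1 on failure. */
int parse_stat_line(const char *line, struct stat_fields *out)
{
    /* Field 2 is "(comm)" and comm itself may contain spaces or ')',
     * so everything is located relative to the LAST ')'. */
    const char *p = strrchr(line, ')');
    if (p == NULL)
        return -1;

    /* After ")": state, ten skipped fields (4 through 13), utime, stime. */
    int n = sscanf(p + 1,
                   " %c %*d %*d %*d %*d %*d %*u %*lu %*lu %*lu %*lu %lu %lu",
                   &out->state, &out->utime, &out->stime);
    return (n == 3) ? 0 : -1;
}
```

Feeding it a line whose command name is "my (weird) proc" is a quick way to confirm the parser does not drift by a field.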
7.2 Debugging Strategies
- Print raw stat line for a known PID.
- Compare with ps -o pid,stat,cmd.
7.3 Performance Traps
Scanning /proc too often is expensive; use reasonable refresh intervals.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add filtering by user.
- Add sorting by PID.
8.2 Intermediate Extensions
- Add RSS/VSZ columns.
- Add process tree view.
8.3 Advanced Extensions
- Add JSON output for monitoring integration.
- Add thread counts from /proc.
9. Real-World Connections
9.1 Industry Applications
- Lightweight monitoring agents and diagnostics.
9.2 Related Open Source Projects
- procps: https://gitlab.com/procps-ng/procps
9.3 Interview Relevance
- /proc parsing and process states are common topics.
10. Resources
10.1 Essential Reading
- proc(5) - man 5 proc
10.2 Video Resources
- /proc tutorials (search “Linux /proc”)
10.3 Tools & Documentation
- ps(1) - man 1 ps
10.4 Related Projects in This Series
- Filesystem Explorer: another /proc-heavy parser.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain /proc.
- I can explain process states.
- I can compute CPU% from jiffies.
11.2 Implementation
- Tool lists processes correctly.
- cmdline displays properly.
- CPU% looks reasonable.
11.3 Growth
- I can add more fields easily.
- I can compare output with top/ps.
12. Submission / Completion Criteria
Minimum Viable Completion:
- List PID, user, state, and cmdline.
Full Completion:
- Add memory and CPU% columns.
Excellence (Going Above & Beyond):
- Add process tree and JSON output.
This guide was generated from LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md. For the complete learning path, see the parent directory.