Project 8: Process Genealogist
Trace a PID’s full ancestry and descendants using /proc and ps.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1 week |
| Language | Python (Alternatives: Go, Rust, C) |
| Prerequisites | Project 1, fork/exec basics |
| Key Topics | PPID chains, sessions, process groups |
1. Learning Objectives
By completing this project, you will:
- Build parent/child trees from /proc.
- Trace a process back to PID 1.
- Inspect sessions, process groups, and controlling terminals.
- List a process’s open file descriptors and cwd.
2. Theoretical Foundation
2.1 Core Concepts
- fork/exec: fork creates a child as a copy of the parent; exec then replaces the child’s program image while keeping the same PID.
- Sessions and groups: Job control depends on session/PGID relationships.
- Orphans and reparenting: Orphans are adopted by PID 1, or by the nearest ancestor marked as a child subreaper.
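The identifiers behind these concepts can be read for the current process with Python's standard os module. This is a minimal sketch; the function name describe_self is illustrative and not part of the project specification.

```python
import os

def describe_self():
    """Return the IDs that tie this process to its parent, group, and session."""
    return {
        "pid": os.getpid(),
        "ppid": os.getppid(),   # parent PID; changes if the parent exits
        "pgid": os.getpgid(0),  # 0 means "the calling process"
        "sid": os.getsid(0),    # session ID; equals the session leader's PID
    }

print(describe_self())
```

Running this from an interactive shell will usually show a ppid equal to the shell's PID and a sid shared with the shell.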
2.2 Why This Matters
Understanding lineage explains why processes have certain environments, fds, and terminals.
2.3 Historical Context / Background
Unix has long tracked PPIDs, and POSIX standardized sessions and process groups for job control; /proc exposes these fields directly.
2.4 Common Misconceptions
- “PPID is always stable”: It can change when parents die.
- “pstree is magic”: It just reads /proc.
3. Project Specification
3.1 What You Will Build
A CLI tool that prints a PID’s ancestry tree, descendant tree, and session details, along with open file descriptors.
3.2 Functional Requirements
- Resolve a PID’s parent chain to PID 1.
- Build a list of descendants.
- Show PGID, SID, and controlling terminal.
- List open file descriptors and cwd.
3.3 Non-Functional Requirements
- Performance: Handle full /proc scan quickly.
- Reliability: Handle processes that exit during scan.
- Usability: Clear, tree-like output.
3.4 Example Usage / Output
$ ./process-genealogist 1234
systemd(1) -> sshd(521) -> bash(1199) -> python3(1234)
3.5 Real World Outcome
You will run the tool and see the full lineage and open resources for a process. Example:
$ ./process-genealogist 1234
systemd(1) -> sshd(521) -> bash(1199) -> python3(1234)
4. Solution Architecture
4.1 High-Level Design
scan /proc -> build PID->PPID map -> trace ancestry -> find descendants -> report
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Scanner | Collect stat/status | Use /proc numeric dirs |
| Tree builder | Build child lists | Map PPID -> children |
| Reporter | Render ancestry | Arrow or tree view |
4.3 Data Structures
ppid = {pid: ppid}
children = {pid: [child1, child2]}
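One way to populate these two maps is a single pass over the numeric directories in /proc. The sketch below is an assumption about how the scanner could look, not a prescribed implementation. The names read_ppid and build_maps are hypothetical, and proc_root is a parameter only so the code can be pointed at a fake tree in tests.

```python
import os
from collections import defaultdict

def read_ppid(pid, proc_root="/proc"):
    """Return the PPid field from <proc_root>/<pid>/status, or None if unreadable."""
    try:
        with open(os.path.join(proc_root, str(pid), "status")) as f:
            for line in f:
                if line.startswith("PPid:"):
                    return int(line.split()[1])
    except (FileNotFoundError, ProcessLookupError, PermissionError):
        pass  # the process exited or is not accessible
    return None

def build_maps(proc_root="/proc"):
    """Scan once and return (ppid, children) dictionaries."""
    ppid = {}
    children = defaultdict(list)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "sys" or "meminfo"
        pid = int(entry)
        parent = read_ppid(pid, proc_root)
        if parent is None:
            continue  # vanished between listdir and open
        ppid[pid] = parent
        children[parent].append(pid)
    return ppid, dict(children)
```

Processes that disappear mid-scan are simply left out of the snapshot, so later lookups should tolerate missing keys.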
4.4 Algorithm Overview
Key Algorithm: Ancestry Chain
- Start with PID.
- Repeatedly look up the PPID until you reach PID 1 (or a PPID that is not in the map, such as 0).
- Reverse list for display.
Complexity Analysis:
- Time: O(n) scan + O(depth) traversal
- Space: O(n)
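The ancestry chain can be sketched as a pure function over the pid -> ppid map. The function name ancestry and the cycle guard are additions for illustration; the guard only protects against an inconsistent snapshot and is not something the algorithm above requires.

```python
def ancestry(pid, ppid):
    """Return [oldest ancestor, ..., pid] by following the pid -> ppid map upward."""
    chain = []
    seen = set()
    current = pid
    # Stop at PID 1, at a PID missing from the snapshot (e.g. 0), or on a cycle.
    while current in ppid and current not in seen:
        chain.append(current)
        seen.add(current)
        if current == 1:
            break
        current = ppid[current]
    chain.reverse()  # collected child-first, displayed root-first
    return chain
```

With the map {1: 0, 521: 1, 1199: 521, 1234: 1199}, calling ancestry(1234, ppid) returns [1, 521, 1199, 1234]. An unknown PID yields an empty list.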
5. Implementation Guide
5.1 Development Environment Setup
python3 --version
5.2 Project Structure
project-root/
├── process_genealogist.py
└── README.md
5.3 The Core Question You’re Answering
“Where did this process come from, and what did it inherit?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- PPID and reparenting
- Sessions and process groups
- /proc fields in stat and status
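The stat file is easy to misparse because the command name sits in parentheses and may itself contain spaces or parentheses. The sketch below splits on the last closing parenthesis and decodes tty_nr using the bit layout described in proc(5). The helper names parse_stat and decode_tty are hypothetical.

```python
def parse_stat(text):
    """Parse the leading fields of a /proc/<pid>/stat line."""
    open_paren = text.index("(")
    close_paren = text.rindex(")")  # last ')' ends comm, even if comm contains ')'
    rest = text[close_paren + 1:].split()
    return {
        "pid": int(text[:open_paren]),
        "comm": text[open_paren + 1:close_paren],
        "state": rest[0],
        "ppid": int(rest[1]),
        "pgid": int(rest[2]),
        "sid": int(rest[3]),
        "tty_nr": int(rest[4]),  # 0 means no controlling terminal
    }

def decode_tty(tty_nr):
    """Split tty_nr into (major, minor) following the layout in proc(5)."""
    major = (tty_nr >> 8) & 0xFF
    minor = (tty_nr & 0xFF) | ((tty_nr >> 12) & 0xFFF00)
    return major, minor
```

For example, a tty_nr of 34816 decodes to (136, 0). On Linux, major 136 is typically a pseudo-terminal under /dev/pts.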
5.5 Questions to Guide Your Design
Before implementing, think through these:
- How do you handle processes that exit during scanning?
- How will you display ancestry and descendants separately?
- What resource info is most valuable to show?
5.6 Thinking Exercise
Trace your shell
Use /proc/$$/status and follow PPID until PID 1.
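The same walk can be done programmatically. This sketch reads the Name and PPid fields from each status file and formats the chain in the arrow style used in the example output. The function names and the proc_root parameter are assumptions made for illustration and testing.

```python
import os

def read_status(pid, proc_root="/proc"):
    """Parse <proc_root>/<pid>/status into a dict of field name -> string value."""
    fields = {}
    with open(os.path.join(proc_root, str(pid), "status")) as f:
        for line in f:
            key, sep, value = line.partition(":")
            if sep:
                fields[key] = value.strip()
    return fields

def trace_to_root(pid, proc_root="/proc"):
    """Follow PPid links upward and format the chain oldest-first."""
    labels = []
    while pid > 0:  # the top of the chain reports PPid 0
        try:
            status = read_status(pid, proc_root)
        except (FileNotFoundError, ProcessLookupError):
            break  # the process vanished mid-walk
        labels.append(f"{status['Name']}({pid})")
        pid = int(status["PPid"])
    return " -> ".join(reversed(labels))
```

Calling print(trace_to_root(os.getpid())) on a Linux system should print the chain from the top-level process down to the Python interpreter.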
5.7 The Interview Questions They’ll Ask
Prepare to answer these:
- “What happens to children when a parent exits?”
- “What is a session leader?”
- “How do you find which terminal owns a process?”
5.8 Hints in Layers
Hint 1: Start with status
/proc/<pid>/status exposes PPID, UID, and state.
Hint 2: Build child lists
Scan all PIDs and group by PPID.
Hint 3: Use /proc/
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Process creation | “TLPI” | Ch. 24-28 |
| Sessions/groups | “APUE” | Ch. 9 |
| /proc | “How Linux Works” | Ch. 8 |
5.10 Implementation Phases
Phase 1: Foundation (2 days)
Goals:
- Parse PID->PPID mapping.
Tasks:
- Scan /proc.
- Build map.
Checkpoint: Map matches ps -e -o pid,ppid.
Phase 2: Core Functionality (3 days)
Goals:
- Show ancestry and descendants.
Tasks:
- Trace ancestry chain.
- Build descendant tree.
Checkpoint: Output matches pstree.
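A descendant tree can be rendered from the children map with a short recursive walk. The sketch below assumes a names dictionary (pid -> command name) collected during the scan. Both that dictionary and the function name render_tree are hypothetical.

```python
def render_tree(root, children, names):
    """Return lines drawing root and all of its descendants as an indented tree."""
    lines = [f"{names.get(root, '?')}({root})"]

    def walk(pid, prefix):
        kids = sorted(children.get(pid, []))
        for i, kid in enumerate(kids):
            last = i == len(kids) - 1
            branch = "└── " if last else "├── "
            lines.append(f"{prefix}{branch}{names.get(kid, '?')}({kid})")
            # Continue the vertical bar only while siblings remain below.
            walk(kid, prefix + ("    " if last else "│   "))

    walk(root, "")
    return lines
```

Printing "\n".join(render_tree(pid, children, names)) gives output that can be compared by eye with pstree. A process with no children produces a single line.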
Phase 3: Polish & Edge Cases (2 days)
Goals:
- Add session and resource info.
Tasks:
- Read SID/PGID.
- List open files and cwd.
Checkpoint: Output shows expected fds and cwd.
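Open file descriptors and the working directory are exposed as symbolic links under /proc/<pid>/fd and /proc/<pid>/cwd. The sketch below reads them with os.readlink. The function name list_resources and the proc_root parameter are illustrative assumptions, and reading another user's process usually requires elevated privileges.

```python
import os

def list_resources(pid, proc_root="/proc"):
    """Return (cwd, {fd: target}); cwd is None and fds is empty when unreadable."""
    base = os.path.join(proc_root, str(pid))
    try:
        cwd = os.readlink(os.path.join(base, "cwd"))
    except OSError:
        cwd = None  # process exited, or permission was denied
    fds = {}
    try:
        entries = os.listdir(os.path.join(base, "fd"))
    except OSError:
        entries = []
    for name in entries:
        try:
            fds[int(name)] = os.readlink(os.path.join(base, "fd", name))
        except (OSError, ValueError):
            continue  # the fd was closed between listdir and readlink
    return cwd, fds
```

Targets such as socket:[12345] or pipe:[678] are not real paths. They identify kernel objects by inode number.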
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Data source | ps vs /proc | /proc | More detail |
| Output style | tree vs list | tree | Readability |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Mapping | Validate PPIDs | Compare with ps |
| Tree | Validate children | Compare with pstree |
| Resources | Validate fds | Compare with lsof |
6.2 Critical Test Cases
- PID with no children shows empty descendants.
- Orphaned process shows PID 1 (or a subreaper) as its parent.
- Process exits during scan without crash.
6.3 Test Data
PID chain: 1 -> 521 -> 1199 -> 1234
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Missing fields | Wrong PPID | Use /proc/ |
| Races | KeyError or FileNotFoundError | Catch file-not-found errors and skip PIDs missing from the map |
| Parsing cmdline | Empty names | Use /proc/ |
7.2 Debugging Strategies
- Print ancestry chain step-by-step.
- Compare with pstree -p output.
7.3 Performance Traps
Re-reading /proc for every lookup is slow; scan once per run and reuse the resulting maps.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add user name lookup.
- Add command-line argument display.
8.2 Intermediate Extensions
- Add process group tree.
- Show a count of environment variables.
8.3 Advanced Extensions
- Visualize output as Graphviz.
- Add filtering by user or session.
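For the Graphviz extension, one possible approach is to emit DOT text directly from the pid -> ppid map. This is a sketch under the assumption that a names dictionary (pid -> command name) is available. The function name to_dot is hypothetical.

```python
def to_dot(ppid, names):
    """Render a pid -> ppid map as a Graphviz DOT digraph string."""
    lines = ["digraph processes {"]
    for pid in sorted(ppid):
        # Escape backslashes and double quotes so labels stay valid DOT strings.
        label = names.get(pid, "?").replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    {pid} [label="{label}({pid})"];')
    for pid in sorted(ppid):
        parent = ppid[pid]
        if parent in ppid:  # skip parents outside the snapshot, such as PPID 0
            lines.append(f"    {parent} -> {pid};")
    lines.append("}")
    return "\n".join(lines)
```

The resulting text can be saved to a file and rendered with the Graphviz dot command, for example dot -Tpng tree.dot -o tree.png.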
9. Real-World Connections
9.1 Industry Applications
- Debugging runaway process trees and orphaned jobs.
9.2 Related Open Source Projects
- pstree (part of psmisc): https://gitlab.com/psmisc/psmisc
- ps (part of procps-ng): https://gitlab.com/procps-ng/procps
- lsof: https://github.com/lsof-org/lsof
9.3 Interview Relevance
- Process lineage and job control are core Unix topics.
10. Resources
10.1 Essential Reading
- proc(5) - man 5 proc
- ps(1) - man 1 ps
10.2 Video Resources
- Process tree explanations (search “Unix process tree”)
10.3 Tools & Documentation
- pstree(1) - man 1 pstree
10.4 Related Projects in This Series
- Zombie Hunter: build on PPID and reaping.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain PPID chains.
- I can explain sessions and process groups.
- I can interpret /proc fields.
11.2 Implementation
- Ancestry output is correct.
- Descendants are correct.
- Resource listing works.
11.3 Growth
- I can use this to debug real services.
- I can extend output formats.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Print a PID’s ancestry chain.
Full Completion:
- Include descendant tree and session details.
Excellence (Going Above & Beyond):
- Export Graphviz or JSON for visualization.
This guide was generated from LINUX_SYSTEM_TOOLS_MASTERY.md. For the complete learning path, see the parent directory.