Project 5: Process Monitor and /proc Explorer
Build a process monitor that reads the /proc filesystem to display running processes with their state, memory usage, CPU time, open files, and environment variables.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3 - Advanced |
| Time Estimate | 20-40 hours |
| Language | C (primary), Rust/Go/Python (alternatives) |
| Prerequisites | Projects 1-3, understanding of processes |
| Key Topics | /proc Filesystem, Process States, Memory Metrics, CPU Accounting |
1. Learning Objectives
After completing this project, you will:
- Understand the /proc virtual filesystem and what information it exposes
- Parse /proc/[pid]/stat, status, statm, and other process files
- Calculate CPU usage percentage from cumulative counters
- Understand Linux process states (R, S, D, Z, T) and their meanings
- Distinguish between virtual memory (VSZ), resident memory (RSS), and shared memory
- Read open file descriptors from /proc/[pid]/fd
- Handle race conditions when processes disappear mid-read
- Build a real-time updating display (like top/htop)
- Understand what information requires root vs regular user access
2. Theoretical Foundation
2.1 Core Concepts
The /proc Filesystem
/proc is a virtual filesystem—it doesn’t exist on disk. The kernel generates its contents on-the-fly when you read from it. It’s the kernel’s window into process and system state.
/proc/ Structure
┌─────────────────────────────────────────────────────────────────┐
│ /proc/ │
│ ├── 1/ ← Process 1 (init/systemd) │
│ │ ├── cmdline ← Command line arguments │
│ │ ├── cwd -> / ← Current working directory (symlink)│
│ │ ├── environ ← Environment variables │
│ │ ├── exe -> /sbin/init ← Executable path (symlink) │
│ │ ├── fd/ ← Open file descriptors │
│ │ │ ├── 0 -> /dev/null │
│ │ │ ├── 1 -> /dev/null │
│ │ │ └── 2 -> /dev/null │
│ │ ├── maps ← Memory mappings │
│ │ ├── stat ← Process status (one line) │
│ │ ├── statm ← Memory status │
│ │ ├── status ← Human-readable status │
│ │ └── ... │
│ ├── 1234/ ← Another process │
│ ├── self -> 1234 ← Symlink to current process │
│ ├── cpuinfo ← CPU information │
│ ├── meminfo ← Memory information │
│ ├── loadavg ← Load averages │
│ ├── uptime ← System uptime │
│ └── stat ← System-wide CPU statistics │
└─────────────────────────────────────────────────────────────────┘
Parsing /proc/[pid]/stat
This is the most information-dense file—a single line with ~52 fields:
$ cat /proc/1234/stat
1234 (python3) S 1 1234 1234 0 -1 4194304 12345 0 0 0 452 124 0 0 20 0 4 0 ...
^ ^ ^ ^ ^ ^ ^ ^ ^ ^
| | | | | | | | | |
PID comm | | PGID SID flags minflt utime stime
state
PPID
Key Fields (1-indexed):
1: pid - Process ID
2: comm - Executable name in parentheses (WATCH OUT: can contain spaces!)
3: state - R/S/D/Z/T/W/X
4: ppid - Parent process ID
5: pgrp - Process group ID
6: session - Session ID
7: tty_nr - Controlling terminal
8: tpgid - Foreground process group of controlling terminal
9: flags - Kernel flags
10: minflt - Minor page faults
11: cminflt - Minor page faults of waited-for children
12: majflt - Major page faults
13: cmajflt - Major page faults of waited-for children
14: utime - User mode time in clock ticks (often called jiffies; divide by sysconf(_SC_CLK_TCK) for seconds)
15: stime - Kernel mode time in clock ticks
16: cutime - User mode clock ticks of waited-for children
17: cstime - Kernel mode clock ticks of waited-for children
18: priority - Priority
19: nice - Nice value (-20 to 19)
20: num_threads - Number of threads
21: itrealvalue - (obsolete, always 0)
22: starttime - Time process started (clock ticks since boot)
23: vsize - Virtual memory size in bytes
24: rss - Resident set size in pages
Full list: man 5 proc
Parsing challenge: The comm field (field 2) can contain spaces, parentheses, or any character. You must find the LAST ‘)’ in the line to correctly parse the remaining fields.
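// Safe parsing of /proc/[pid]/stat
#include <stdlib.h>  // atoi
#include <string.h>  // strchr, strrchr, memcpy
// comm must hold comm_size (> 0) bytes. Returns a pointer to the ppid field.
char *parse_stat(char *line, int *pid, char *comm, size_t comm_size, char *state) {
    // Find the command name between the first ( and the LAST )
    char *start = strchr(line, '(');
    char *end = strrchr(line, ')');  // LAST ), not first!
    if (!start || !end || end < start) return NULL;
    *pid = atoi(line);
    size_t comm_len = (size_t)(end - start - 1);
    if (comm_len >= comm_size) comm_len = comm_size - 1;  // Truncate, never overflow
    memcpy(comm, start + 1, comm_len);
    comm[comm_len] = '\0';
    // After the ) expect a space, the state character, and another space
    if (end[1] != ' ' || end[2] == '\0' || end[3] != ' ') return NULL;
    *state = end[2];
    return end + 4;  // Pointer to ppid field
}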
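Once the last ')' is found, the rest of the line is plain space-separated numbers, so sscanf can pick out the fields a monitor needs. The following is a minimal sketch that assumes the field order listed above; `struct stat_fields` and `parse_stat_fields` are illustrative names, not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields this sketch extracts. */
struct stat_fields {
    char state;
    int ppid;
    unsigned long utime, stime;        /* fields 14 and 15, clock ticks */
    long priority, nice, num_threads;  /* fields 18, 19, 20 */
    unsigned long long starttime;      /* field 22, clock ticks since boot */
    unsigned long vsize;               /* field 23, bytes */
    long rss;                          /* field 24, pages */
};

/* Parses fields 3..24 of a /proc/[pid]/stat line. Returns 0 on success. */
int parse_stat_fields(const char *line, struct stat_fields *out) {
    const char *end = strrchr(line, ')');  /* the last ')' closes comm */
    if (!end) return -1;
    /* A '*' in a conversion reads a field without storing it. */
    int n = sscanf(end + 1,
                   " %c %d %*d %*d %*d %*d %*u"    /* state .. flags */
                   " %*lu %*lu %*lu %*lu %lu %lu"  /* page faults, utime, stime */
                   " %*ld %*ld %ld %ld %ld %*ld"   /* cutime .. itrealvalue */
                   " %llu %lu %ld",                /* starttime, vsize, rss */
                   &out->state, &out->ppid, &out->utime, &out->stime,
                   &out->priority, &out->nice, &out->num_threads,
                   &out->starttime, &out->vsize, &out->rss);
    return n == 10 ? 0 : -1;  /* ten fields are stored */
}
```

Because the scan starts after the last ')', a comm such as `a) b c` does not shift the field positions.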
Process States
┌─────────────────────────────────────────────────────────────────────────┐
│ PROCESS STATES │
│ │
│ ┌───────┐ │
│ │ R │ RUNNING - Currently executing or in run queue │
│ │Running│ - On CPU right now, or ready to run │
│ └───┬───┘ - Only state where process is using CPU │
│ │ │
│ │ wait for I/O or event │
│ ▼ │
│ ┌───────┐ │
│ │ S │ SLEEPING (Interruptible) - Waiting for something │
│ │Sleeping│ - Waiting for I/O, timer, signal, lock │
│ └───┬───┘ - Can be woken by signal │
│ │ │
│ │ waiting for disk (uninterruptible) │
│ ▼ │
│ ┌───────┐ │
│ │ D │ DISK SLEEP (Uninterruptible) - Cannot be interrupted │
│ │D Sleep│ - Usually waiting for disk I/O │
│ └───────┘ - Cannot be killed! (even with SIGKILL) │
│ - If stuck here for long: hardware issue │
│ │
│ ┌───────┐ │
│ │ T │ STOPPED - Execution halted by signal │
│ │Stopped│ - Received SIGSTOP or SIGTSTP (Ctrl-Z) │
│ └───────┘ - Will resume on SIGCONT │
│ │
│ ┌───────┐ │
│ │ Z │ ZOMBIE - Terminated but not reaped │
│ │Zombie │ - Process has exited │
│ └───────┘ - Parent hasn't called wait() yet │
│ - Holds exit status and resource usage │
│ - Should be short-lived; many zombies = bug │
│ │
│ Additional states: │
│ t - Stopped (trace) - Stopped by debugger │
│ X - Dead - Should never be visible │
│ I - Idle (kernel thread) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
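For display, the single state character can be mapped to a readable label. This is a small sketch covering the states in the diagram above; the function name `state_name` and the label strings are this example's own choices.

```c
/* Maps a state character from /proc/[pid]/stat to a short description. */
const char *state_name(char state) {
    switch (state) {
    case 'R': return "running";
    case 'S': return "sleeping";
    case 'D': return "disk sleep";
    case 'T': return "stopped";
    case 't': return "tracing stop";
    case 'Z': return "zombie";
    case 'X': return "dead";
    case 'I': return "idle";
    default:  return "unknown";  /* states this sketch does not know about */
    }
}
```

Returning "unknown" rather than failing keeps the listing usable if a kernel reports a state letter the program has not seen before.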
Memory Metrics
┌─────────────────────────────────────────────────────────────────────────┐
│ MEMORY METRICS │
│ │
│ Virtual Memory (VSIZE/VSZ) │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ Total address space mapped by the process │ │
│ │ Includes: │ │
│ │ - Code (text segment) │ │
│ │ - Data (heap, stack) │ │
│ │ - Memory-mapped files │ │
│ │ - Shared libraries │ │
│ │ - Memory that's been reserved but not used │ │
│ │ │ │
│ │ ⚠️ VSZ can be HUGE (10x+ actual RAM used) │ │
│ │ A 64-bit process can map terabytes without using RAM │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ Resident Set Size (RSS) │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ Physical RAM currently being used by the process │ │
│ │ Includes: │ │
│ │ - Code pages actually loaded in RAM │ │
│ │ - Heap pages that have been touched │ │
│ │ - Stack pages currently in use │ │
│ │ - Shared library pages (counted for EACH process) │ │
│ │ │ │
│ │ ⚠️ RSS overcounts because shared pages counted multiple times │ │
│ │ 10 processes using libc each count libc's pages │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ Shared Memory │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ Pages that can be shared with other processes │ │
│ │ - Shared libraries (.so files) │ │
│ │ - Memory-mapped files (mmap with MAP_SHARED) │ │
│ │ - Shared memory segments (shmget/shmat) │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ PSS (Proportional Set Size) - More accurate but slower │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ RSS but shared pages divided by number of sharers │ │
│ │ If libc is shared by 10 processes, each counts 1/10 │ │
│ │ Found in /proc/[pid]/smaps │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ Example from /proc/[pid]/status: │
│ VmSize: 450000 kB ← Virtual (may be much larger than RAM) │
│ VmRSS: 94000 kB ← Actually in RAM │
│ RssAnon: 80000 kB ← Anonymous (heap, stack) │
│ RssFile: 12000 kB ← File-backed (code, mmap files) │
│ RssShmem: 2000 kB ← Shared memory │
│ │
└─────────────────────────────────────────────────────────────────────────┘
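The values in the example above come from "Key: value kB" lines, which are simple to look up once the file has been read into memory. A minimal sketch, assuming the whole of /proc/[pid]/status is already in a NUL-terminated buffer; `status_kb` is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Returns the number on a "Key:   123 kB" line of /proc/[pid]/status text,
 * or -1 if the key is missing (kernel threads have no Vm* lines). */
long status_kb(const char *text, const char *key) {
    size_t klen = strlen(key);
    const char *line = text;
    while (line && *line) {
        /* Require the ':' right after the key so "Rss" cannot match "RssAnon". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long value;
            if (sscanf(line + klen + 1, "%ld", &value) == 1) return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line) line++;  /* step past the newline to the next line */
    }
    return -1;
}
```

For instance, `status_kb(text, "VmRSS")` would yield 94000 for the sample shown above.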
CPU Usage Calculation
CPU percentage requires TWO samples because /proc gives cumulative counters:
Sample 1 (time T1):
Process utime1 = 1000 jiffies
Process stime1 = 200 jiffies
System total_time1 = 50000 jiffies
Sample 2 (time T2, e.g., 100ms later):
Process utime2 = 1050 jiffies
Process stime2 = 210 jiffies
System total_time2 = 50500 jiffies
Calculation:
Process delta = (utime2 + stime2) - (utime1 + stime1)
= (1050 + 210) - (1000 + 200)
= 1260 - 1200
= 60 jiffies
System delta = total_time2 - total_time1
= 50500 - 50000
= 500 jiffies (across all CPUs)
CPU % = 100.0 * (process_delta / system_delta)
= 100.0 * (60 / 500)
= 12.0%
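Note: total_time sums the counters of every CPU, so this formula reaches 100% only when the process keeps all CPUs busy; one saturated core on a 4-CPU system reads 25%. Multiply by ncpus to get top-style per-core percentages, where the maximum on 4 CPUs is 400%.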
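The worked example translates directly into a small helper. This is a sketch that assumes the system total is the sum over all CPUs, as in the example; `cpu_percent` and its `ncpus` scaling parameter are illustrative.

```c
/* Share of CPU time used by one process between two samples.
 * proc1/proc2 are utime + stime at each sample; total1/total2 are the
 * system-wide totals. Pass ncpus = 1 for the share of the whole machine
 * (0..100), or the CPU count for per-core scaling (0..100 * ncpus). */
double cpu_percent(unsigned long long proc1, unsigned long long proc2,
                   unsigned long long total1, unsigned long long total2,
                   int ncpus) {
    if (proc2 < proc1 || total2 <= total1) return 0.0;  /* no usable interval */
    double proc_delta = (double)(proc2 - proc1);
    double sys_delta = (double)(total2 - total1);
    return 100.0 * proc_delta * ncpus / sys_delta;
}
```

With the numbers above, `cpu_percent(1200, 1260, 50000, 50500, 1)` gives 12.0, and passing 4 as the last argument gives 48.0.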
Reading system CPU time from /proc/stat:
$ cat /proc/stat
cpu 123456 1234 56789 1234567 12345 0 1234 0 0 0
^ ^ ^ ^ ^ ^
user nice sys idle iowait irq...
total_time = user + nice + sys + idle + iowait + irq + softirq + steal
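The sum above can be computed from the first line of /proc/stat with one sscanf call. A minimal sketch, assuming the line has already been read into a string; `parse_cpu_total` is a hypothetical name, and the last two columns (guest times) are left out on the assumption that they are already included in user and nice.

```c
#include <stdio.h>
#include <string.h>

/* Sums user..steal from the aggregate "cpu" line of /proc/stat.
 * Returns 0 on success, -1 if the line is not the aggregate line. */
int parse_cpu_total(const char *line, unsigned long long *total) {
    unsigned long long v[8] = {0};
    /* Per-CPU lines start with "cpu0", "cpu1", ...; accept only "cpu ". */
    if (strncmp(line, "cpu ", 4) != 0) return -1;
    int n = sscanf(line + 4, "%llu %llu %llu %llu %llu %llu %llu %llu",
                   &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6], &v[7]);
    if (n < 4) return -1;  /* need at least user, nice, system, idle */
    *total = 0;
    for (int i = 0; i < 8; i++) *total += v[i];
    return 0;
}
```

Older kernels print fewer columns, which is why the sketch accepts any line with at least four values and treats the missing ones as zero.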
2.2 Why This Matters
Understanding /proc is essential for:
- System monitoring: top, htop, ps all read from /proc
- Debugging: Understanding process state helps debug hangs
- Performance analysis: Memory and CPU metrics guide optimization
- Container monitoring: cgroups expose similar interfaces
- Security auditing: Check open files, environment, connections
Real-world applications:
- Prometheus node_exporter reads /proc for metrics
- Docker reads /proc for container stats
- strace attaches with ptrace(2) and reads /proc for supporting details such as thread lists and file descriptor paths
- gdb reads /proc/[pid]/maps for memory layout
2.3 Historical Context
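The /proc filesystem first appeared in Eighth Edition Research UNIX and was developed further in Plan 9 from Bell Labs; Linux adopted the idea with a text-oriented layout. It reduces the need for dedicated system calls to get process information: most of it is available by reading files.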
This “everything is a file” philosophy extends to:
- /sys - kernel and device information
- /dev - device access
- /run - runtime state
The interface is designed to be easy to read with shell scripts, enabling powerful one-liners like:
cat /proc/loadavg
grep MemFree /proc/meminfo
for p in /proc/[0-9]*; do cat "$p/comm" 2>/dev/null; done
2.4 Common Misconceptions
Misconception 1: “/proc is a real filesystem with files on disk”
- Reality: It’s virtual. Data is generated on-the-fly by the kernel.
Misconception 2: “VSZ (virtual size) shows actual memory usage”
- Reality: VSZ can be enormous. A 64-bit process can map terabytes. RSS is more meaningful.
Misconception 3: “RSS shows how much memory the process ‘costs’”
- Reality: RSS double-counts shared libraries. PSS is more accurate but harder to get.
Misconception 4: “CPU percentage > 100% is impossible”
- Reality: On multi-core systems, a process using 2 cores fully shows 200%.
Misconception 5: “I can always read any process’s /proc files”
- Reality: Permission checks apply. Non-root can’t read /proc/[other_user’s_pid]/environ.
3. Project Specification
3.1 What You Will Build
A process monitor called myps that:
- Lists all processes with key metrics (like ps aux)
- Shows detailed information for a specific process
- Displays real-time updates (like top)
- Provides more insight than standard tools
3.2 Functional Requirements
- Process listing: myps shows all processes with PID, PPID, USER, STATE, CPU%, MEM%, VSZ, RSS, COMMAND
- Specific process: myps -p PID shows detailed info for one process
- Real-time mode: myps --top updates display periodically
- Sort options: --sort cpu or --sort mem to sort by column
- Show threads: -t flag includes individual threads
- Open files: Show open file descriptors for specific process
- Environment: Show environment variables for specific process
- Memory maps: Show memory mappings for specific process
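The sort options map naturally onto qsort with one comparator per column. The sketch below uses a cut-down record, `proc_row`, as a hypothetical stand-in for the project's process structure; `sort_rows` and the comparator names are also illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for the project's process record (hypothetical). */
typedef struct {
    int pid;
    double cpu_percent;
    double mem_percent;
} proc_row;

/* qsort comparators: highest value first. */
static int by_cpu_desc(const void *a, const void *b) {
    const proc_row *x = a, *y = b;
    return (x->cpu_percent < y->cpu_percent) - (x->cpu_percent > y->cpu_percent);
}

static int by_mem_desc(const void *a, const void *b) {
    const proc_row *x = a, *y = b;
    return (x->mem_percent < y->mem_percent) - (x->mem_percent > y->mem_percent);
}

/* key is the argument given to --sort; unknown keys leave the order as is. */
void sort_rows(proc_row *rows, size_t n, const char *key) {
    if (strcmp(key, "cpu") == 0)
        qsort(rows, n, sizeof *rows, by_cpu_desc);
    else if (strcmp(key, "mem") == 0)
        qsort(rows, n, sizeof *rows, by_mem_desc);
}
```

Comparing with `<` and `>` instead of subtracting avoids truncating a fractional difference to zero when the result is converted to int.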
3.3 Non-Functional Requirements
- Race condition handling: Gracefully handle processes disappearing
- Permission handling: Work as non-root with limited info
- Performance: Refresh quickly even with many processes
- Accuracy: CPU/MEM percentages should match top/htop
3.4 Example Usage / Output
# 1. Basic process list
$ ./myps
PID PPID USER STATE %CPU %MEM VSZ RSS COMMAND
1 0 root S 0.0 0.1 167M 12M /sbin/init
42 1 root S 0.0 0.0 23M 5M /lib/systemd/systemd-journald
1234 1 user S 0.5 2.3 450M 94M /usr/bin/python3 app.py
5678 1234 user R 1.2 0.5 12M 4M ./myps
...
Total: 234 processes, 2 running, 232 sleeping
# 2. Detailed view of specific process
$ ./myps -p 1234
PID: 1234
Name: python3
State: S (Sleeping)
Parent PID: 1
Thread Count: 4
Priority: 20 (nice: 0)
Memory:
Virtual: 450 MB
Resident: 94 MB
Shared: 12 MB
Text: 8 MB
Data: 86 MB
CPU:
User Time: 45.32 seconds
System Time: 12.45 seconds
Start Time: Mar 15 10:00:00
Open Files (5):
0: /dev/pts/0 (terminal)
1: /dev/pts/0 (terminal)
2: /dev/pts/0 (terminal)
3: socket:[12345] (TCP 0.0.0.0:8080)
4: /var/log/app.log (regular file)
Environment (truncated):
PATH=/usr/local/bin:/usr/bin:/bin
HOME=/home/user
PYTHONPATH=/app/lib
# 3. Real-time mode (like top)
$ ./myps --top
myps - 10:23:45 up 5 days, 3:21, 2 users, load: 0.52 0.38 0.31
PID USER %CPU %MEM COMMAND
5678 user 25.3 4.2 ffmpeg
1234 user 5.1 2.3 python3
...
[Press 'q' to quit, 'k' to kill, 'r' to renice]
# 4. Sorted by memory
$ ./myps --sort mem | head -10
PID USER %CPU %MEM VSZ RSS COMMAND
789 user 0.1 15.2 2.1G 620M firefox
...
3.5 Real World Outcome
What success looks like:
- Accurate metrics: CPU and memory percentages match top
- Complete information: All major /proc fields accessible
- Robust: Handles processes appearing/disappearing
- Real-time updates: Smooth refresh without flicker
- Detailed analysis: Can inspect any process deeply
4. Solution Architecture
4.1 High-Level Design
┌─────────────────────────────────────────────────────────────────────┐
│ myps │
│ │
│ ┌────────────────┐ │
│ │ Parse Args │ ← -p PID, --top, --sort, -t │
│ └───────┬────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Mode Selection │ │
│ │ │ │
│ │ [List Mode] [Detail Mode] [Top Mode] │ │
│ │ │ │ │ │ │
│ │ ▼ ▼ ▼ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Scan /proc/ │ │ Read /proc/ │ │ Loop: │ │ │
│ │ │ for all PIDs │ │ [pid]/* │ │ Sample 1 │ │ │
│ │ └──────┬───────┘ │ extensively │ │ Sleep │ │ │
│ │ │ └──────────────┘ │ Sample 2 │ │ │
│ │ ▼ │ Calculate │ │ │
│ │ ┌──────────────────┐ │ Display │ │ │
│ │ │ For each PID: │ │ Handle keys │ │ │
│ │ │ Read stat │ └──────────────┘ │ │
│ │ │ Parse fields │ │ │
│ │ │ Resolve user │ │ │
│ │ │ Format output │ │ │
│ │ └──────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ /proc Reading Layer │ │
│ │ │ │
│ │ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐│ │
│ │ │ /proc/[pid]/ │ │ /proc/[pid]/ │ │ /proc/[pid]/ ││ │
│ │ │ stat │ │ status │ │ statm ││ │
│ │ │ - PID, state │ │ - Name │ │ - Memory pages ││ │
│ │ │ - utime/stime │ │ - Uid/Gid │ │ ││ │
│ │ │ - vsize/rss │ │ - VmRSS etc │ │ ││ │
│ │ └────────────────┘ └────────────────┘ └────────────────┘│ │
│ │ │ │
│ │ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐│ │
│ │ │ /proc/[pid]/ │ │ /proc/[pid]/ │ │ /proc/ ││ │
│ │ │ fd/ │ │ environ │ │ stat ││ │
│ │ │ - Open files │ │ - Environment │ │ - Total CPU ││ │
│ │ └────────────────┘ └────────────────┘ └────────────────┘│ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
4.2 Key Components
| Component | Purpose | Key Operations |
|---|---|---|
| Process Scanner | Find all PIDs in /proc | opendir("/proc"), filter numeric |
| Stat Parser | Parse /proc/[pid]/stat | Handle comm with spaces |
| Status Parser | Parse /proc/[pid]/status | Key: Value format |
| User Resolver | UID to username | getpwuid() |
| CPU Calculator | Compute CPU percentage | Two-sample differential |
| Memory Calculator | Compute memory percentage | /proc/meminfo for total |
| FD Reader | List open files | readlink /proc/[pid]/fd/* |
| Display Formatter | Format output | printf, ncurses (optional) |
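The statm file listed among the /proc sources is the quickest to parse: one line of page counts. A minimal sketch, assuming the column order size, resident, shared, text, lib, data, dt from proc(5); `struct statm`, `parse_statm`, and `pages_to_kb` are hypothetical names.

```c
#include <stdio.h>

/* Page counts taken from /proc/[pid]/statm. */
struct statm {
    unsigned long size, resident, shared, text, data;
};

/* Parses one statm line. Returns 0 on success, -1 on malformed input. */
int parse_statm(const char *line, struct statm *m) {
    /* The fifth column (lib) is read but not stored. */
    int n = sscanf(line, "%lu %lu %lu %lu %*u %lu",
                   &m->size, &m->resident, &m->shared, &m->text, &m->data);
    return n == 5 ? 0 : -1;
}

/* Converts pages to kB; page_size would come from sysconf(_SC_PAGESIZE). */
unsigned long pages_to_kb(unsigned long pages, long page_size) {
    return pages * ((unsigned long)page_size / 1024UL);
}
```

Dividing the page size by 1024 before multiplying keeps the intermediate value small, which matters for large processes on 32-bit builds.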
4.3 Data Structures
// Process information
typedef struct {
pid_t pid;
pid_t ppid;
char comm[256]; // Command name
char state; // R, S, D, Z, T, etc.
uid_t uid;
char username[32];
// CPU
unsigned long utime; // User mode jiffies
unsigned long stime; // Kernel mode jiffies
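unsigned long long starttime; // Start time (clock ticks since boot; %llu in stat)
int num_threads;
int priority;
int nice;
// Memory (units as reported by /proc; convert to KB for display)
unsigned long vsize; // Virtual size in bytes (stat field 23)
unsigned long rss; // Resident set size in pages (stat field 24)
unsigned long shared; // Shared pages (statm field 3)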
// Calculated (after two samples)
double cpu_percent;
double mem_percent;
} process_info_t;
// System-wide information
typedef struct {
unsigned long total_cpu; // From /proc/stat
unsigned long total_memory; // From /proc/meminfo
unsigned long free_memory;
unsigned long uptime_seconds;
double load_avg[3];
int num_cpus;
} system_info_t;
// For CPU calculation, need two samples
typedef struct {
pid_t pid;
unsigned long utime;
unsigned long stime;
unsigned long total_time; // System total at sample time
} cpu_sample_t;
4.4 Algorithm Overview
FUNCTION scan_processes():
processes = empty list
dir = opendir("/proc")
WHILE (entry = readdir(dir)) != NULL:
IF entry.d_name is all digits:
pid = atoi(entry.d_name)
proc = read_process_info(pid)
IF proc != NULL:
append proc to processes
closedir(dir)
RETURN processes
FUNCTION read_process_info(pid):
proc = new process_info_t
// Read /proc/[pid]/stat
path = sprintf("/proc/%d/stat", pid)
content = read_file(path)
IF content == NULL:
RETURN NULL // Process disappeared
parse_stat(content, proc)
// Read /proc/[pid]/status for UID and more
path = sprintf("/proc/%d/status", pid)
content = read_file(path)
IF content != NULL:
parse_status(content, proc)
// Resolve username
pw = getpwuid(proc.uid)
IF pw != NULL:
proc.username = pw.pw_name
ELSE:
proc.username = sprintf("%d", proc.uid)
RETURN proc
FUNCTION calculate_cpu_percentages(processes, samples):
// Get current system total CPU
sys_info = read_system_info()
FOR each process in processes:
old_sample = find_sample(samples, process.pid)
IF old_sample != NULL:
proc_delta = (process.utime + process.stime) -
(old_sample.utime + old_sample.stime)
sys_delta = sys_info.total_cpu - old_sample.total_time
IF sys_delta > 0:
process.cpu_percent = 100.0 * proc_delta / sys_delta
// Save new sample for next iteration
save_sample(process.pid, process.utime, process.stime,
sys_info.total_cpu)
FUNCTION calculate_mem_percentages(processes, total_memory):
page_size = sysconf(_SC_PAGESIZE)
FOR each process in processes:
rss_kb = process.rss * page_size / 1024
process.mem_percent = 100.0 * rss_kb / total_memory
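The memory step of the pseudocode can be written in C in a few lines. This sketch assumes the contents of /proc/meminfo are already in a string; `meminfo_total_kb` and `mem_percent` are illustrative names.

```c
#include <stdio.h>
#include <string.h>

/* Extracts MemTotal (kB) from the text of /proc/meminfo; -1 if absent. */
long meminfo_total_kb(const char *text) {
    const char *p = strstr(text, "MemTotal:");
    long kb;
    if (!p || sscanf(p, "MemTotal: %ld", &kb) != 1) return -1;
    return kb;
}

/* RSS (in pages) as a percentage of physical memory (in kB). */
double mem_percent(unsigned long rss_pages, long page_size, long total_kb) {
    if (total_kb <= 0) return 0.0;  /* avoid dividing by zero */
    double rss_kb = (double)rss_pages * (double)page_size / 1024.0;
    return 100.0 * rss_kb / (double)total_kb;
}
```

In the real program `page_size` would be `sysconf(_SC_PAGESIZE)`; it is a parameter here so the arithmetic can be checked with fixed numbers.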
5. Implementation Guide
5.1 Development Environment Setup
# Create project directory
$ mkdir -p ~/projects/myps
$ cd ~/projects/myps
# Check /proc is available
$ ls /proc/self/stat
/proc/self/stat
# Get some test PIDs
$ pgrep -a bash
1234 /bin/bash
# Read example files
$ cat /proc/self/stat
$ cat /proc/self/status
$ ls -la /proc/self/fd
5.2 Project Structure
myps/
├── myps.c # Main program
├── proc_reader.c # /proc reading functions
├── proc_reader.h
├── parser.c # File parsing functions
├── parser.h
├── display.c # Output formatting
├── display.h
├── Makefile
└── README.md
5.3 The Core Question You’re Answering
“How does the operating system expose process internals, and how can we inspect running processes without special privileges?”
The /proc filesystem is Linux’s window into the kernel. Understanding it teaches you what information the kernel tracks about each process and how to access it from userspace.
5.4 Concepts You Must Understand First
Stop and research these before coding:
- The /proc Filesystem
- What is /proc? Is it a real filesystem?
- What files exist in /proc/[pid]/?
- Book Reference: “The Linux Programming Interface” Ch. 12
- Process States
- R (Running), S (Sleeping), D (Uninterruptible), Z (Zombie), T (Stopped)
- What causes each state?
- Memory Metrics
- What’s the difference between VSZ and RSS?
- What is shared memory?
- Book Reference: “APUE” Ch. 7.6
- CPU Time Accounting
- User time vs system time
- How to calculate CPU percentage?
- What are jiffies?
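On the jiffies question: the counters in /proc/[pid]/stat are in clock ticks, and the tick rate is available from `sysconf(_SC_CLK_TCK)`. A short sketch of the conversions a detail view needs; the function names are hypothetical, and the tick rate is passed in so the arithmetic is easy to check.

```c
#include <unistd.h>

/* Converts clock ticks (the unit of utime, stime, starttime) to seconds. */
double ticks_to_seconds(unsigned long long ticks, long hz) {
    return hz > 0 ? (double)ticks / (double)hz : 0.0;
}

/* Seconds a process has been alive: system uptime minus its start offset.
 * uptime_seconds is the first number in /proc/uptime. */
double process_age_seconds(double uptime_seconds,
                           unsigned long long starttime_ticks, long hz) {
    double age = uptime_seconds - ticks_to_seconds(starttime_ticks, hz);
    return age > 0.0 ? age : 0.0;
}

/* Convenience wrapper that asks the running system for its tick rate. */
double ticks_to_seconds_now(unsigned long long ticks) {
    return ticks_to_seconds(ticks, sysconf(_SC_CLK_TCK));
}
```

At a tick rate of 100, a utime of 4532 corresponds to 45.32 seconds of user time.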
5.5 Questions to Guide Your Design
Before implementing, think through these:
- Data Collection
- Which /proc files give you which information?
- How to handle parsing errors gracefully?
- What if a file is missing or empty?
- CPU Percentage
- This requires two samples—how to structure that?
- What time interval to use? (top defaults to 3 seconds; at the usual 100 ticks per second, a 100ms interval resolves only 10 ticks)
- How to handle new processes (no previous sample)?
- Error Handling
- Process disappears between opendir and reading—what to do?
- Permission denied on some /proc entries?
- Should one bad process stop the whole listing?
5.6 Thinking Exercise
Parse /proc/[pid]/stat
This single line contains most process info:
$ cat /proc/1234/stat
1234 (python3) S 1 1234 1234 0 -1 4194304 12345 0 0 0 452 124 0 0 20 0 4 0
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
| | | | | | | | | | |
PID comm | | | SID TTY flags minflt utime stime
state
PPID
PGID
Exercise: What if the command is (evil) process? The comm would be ((evil) process). How do you parse that correctly?
Answer: Find the LAST ) in the line, not the first!
5.7 Hints in Layers
Hint 1: Start with /proc/[pid]/stat
This file has most of what you need. Parse it carefully: the comm field can contain spaces.
// Read /proc/[pid]/stat
char path[64];
snprintf(path, sizeof(path), "/proc/%d/stat", pid);
FILE *f = fopen(path, "r");
if (!f) return NULL; // Process disappeared
char line[1024];
if (!fgets(line, sizeof(line), f)) {
fclose(f);
return NULL;
}
fclose(f);
Hint 2: Directory Scanning
// List all processes
DIR *dir = opendir("/proc");
if (!dir) return;  // /proc not mounted or not readable
struct dirent *entry;
while ((entry = readdir(dir)) != NULL) {
    // Check if name is all digits (a PID)
    const char *p = entry->d_name;
    while (*p && isdigit((unsigned char)*p)) p++;  // Cast avoids UB on negative char
    if (*p == '\0' && p != entry->d_name) {
        pid_t pid = (pid_t)atoi(entry->d_name);
        // Process this PID
    }
}
closedir(dir);
Hint 3: CPU Percentage Calculation
// Sample 1: record utime1 + stime1, total_time1
// Sleep for interval (e.g., 100ms)
// Sample 2: record utime2 + stime2, total_time2
double proc_delta = (double)(utime2 + stime2) - (double)(utime1 + stime1);
double sys_delta = (double)total_time2 - (double)total_time1;
double cpu_percent = sys_delta > 0 ? 100.0 * proc_delta / sys_delta : 0.0;
// Note: total_time covers all CPUs, so 100% here means every core is busy.
// Multiply by the CPU count for top-style values (2 busy cores = 200%).
Hint 4: Open Files from /proc/[pid]/fd
This is a directory of symlinks. readlink() each entry to get the file path.
char fd_dir[64];
snprintf(fd_dir, sizeof(fd_dir), "/proc/%d/fd", pid);
DIR *dir = opendir(fd_dir);
if (!dir) return; // Permission denied or process gone
struct dirent *entry;
while ((entry = readdir(dir)) != NULL) {
if (entry->d_name[0] == '.') continue;
char link_path[128], target[256];
snprintf(link_path, sizeof(link_path), "%s/%s", fd_dir, entry->d_name);
ssize_t len = readlink(link_path, target, sizeof(target) - 1);
if (len > 0) {
target[len] = '\0';
printf(" %s: %s\n", entry->d_name, target);
}
}
closedir(dir);
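The link targets are not always paths: sockets and pipes appear as `socket:[inode]` and `pipe:[inode]`. A small sketch of how the target string might be labelled for display; `fd_target_kind` and its labels are this example's own, and telling a regular file from a directory or device would additionally need stat().

```c
#include <string.h>

/* Classifies a readlink() result from /proc/[pid]/fd by its prefix. */
const char *fd_target_kind(const char *target) {
    if (strncmp(target, "socket:[", 8) == 0) return "socket";
    if (strncmp(target, "pipe:[", 6) == 0) return "pipe";
    if (strncmp(target, "anon_inode:", 11) == 0) return "anon inode";
    if (strncmp(target, "/dev/pts/", 9) == 0 ||
        strncmp(target, "/dev/tty", 8) == 0) return "terminal";
    if (target[0] == '/') return "path";  /* file, directory, or device */
    return "other";
}
```

A deleted file still held open shows up with " (deleted)" appended to its path, so it is still classified as a path here.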
5.8 The Interview Questions They’ll Ask
- “How would you find which process is using the most CPU?”
- Two samples of /proc/[pid]/stat
- Calculate delta in utime + stime
- Sort by delta / system_delta
- “What’s the difference between /proc/meminfo and /proc/[pid]/status?”
- /proc/meminfo: system-wide memory stats
- /proc/[pid]/status: per-process memory stats
- Different levels of detail
- “How does top calculate CPU percentage?”
- Samples CPU times periodically
- Calculates delta between samples
- Divides by total CPU time delta
- Multiplies by 100 (top scales per core by default, so a process running on several cores can exceed 100%)
- “What information can you get about a process without being root?”
- Basic stat/status: yes
- fd directory: only your own processes
- environ: only your own processes
- exe symlink: usually readable
- “How would you detect if a process is leaking file descriptors?”
- Count entries in /proc/[pid]/fd over time
- If count grows unbounded, it’s leaking
- Can also check /proc/[pid]/limits for max
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| /proc filesystem | “The Linux Programming Interface” | Ch. 12 |
| Process environment | “APUE” by Stevens | Ch. 7 |
| Memory management | “Understanding the Linux Kernel” | Ch. 8-9 |
| CPU scheduling | “Linux Kernel Development” by Love | Ch. 4 |
5.10 Implementation Phases
Phase 1: Basic Process List (4-6 hours)
- Scan /proc for all PIDs
- Read basic info from /proc/[pid]/stat
- Display in ps-like format
Phase 2: Better Parsing (3-4 hours)
- Handle comm field with spaces/parens
- Parse more fields from stat
- Add status file parsing
Phase 3: User Resolution (2-3 hours)
- Read UID from status
- Convert to username with getpwuid()
- Handle unknown UIDs
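The steps of this phase can be sketched as two small functions: one that pulls the real UID out of the status text and one that resolves it with a numeric fallback. `parse_real_uid` and `uid_to_name` are hypothetical names; `getpwuid` is the standard library call.

```c
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Reads the real UID (first number) from the "Uid:" line of
 * /proc/[pid]/status text. Returns 0 on success, -1 if not found. */
int parse_real_uid(const char *status_text, uid_t *uid) {
    const char *p = strstr(status_text, "\nUid:");
    unsigned long v;
    if (!p || sscanf(p + 5, "%lu", &v) != 1) return -1;
    *uid = (uid_t)v;
    return 0;
}

/* Writes the user name for uid into out, or the numeric uid when the
 * account database has no entry for it. */
void uid_to_name(uid_t uid, char *out, size_t out_size) {
    struct passwd *pw = getpwuid(uid);
    if (pw && pw->pw_name)
        snprintf(out, out_size, "%s", pw->pw_name);
    else
        snprintf(out, out_size, "%lu", (unsigned long)uid);
}
```

`getpwuid` returns a pointer to static storage, so the name is copied out immediately; a monitor listing hundreds of processes may also want to cache the result per UID.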
Phase 4: CPU Percentage (4-6 hours)
- Read /proc/stat for system totals
- Implement two-sample calculation
- Handle process creation/deletion between samples
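Handling creation and deletion between samples mostly comes down to how the previous sample is looked up. The sketch below also compares starttime, on the assumption that a reused PID should count as a new process; `prev_sample`, `find_prev`, and `ticks_since` are illustrative names.

```c
#include <stddef.h>
#include <sys/types.h>

/* One remembered sample per process (hypothetical layout). */
typedef struct {
    pid_t pid;
    unsigned long long starttime;  /* stat field 22; identifies the instance */
    unsigned long long ticks;      /* utime + stime at sample time */
} prev_sample;

/* Finds the previous sample for the same process instance. A matching PID
 * with a different starttime means the PID was reused by a new process. */
const prev_sample *find_prev(const prev_sample *s, size_t n,
                             pid_t pid, unsigned long long starttime) {
    for (size_t i = 0; i < n; i++)
        if (s[i].pid == pid && s[i].starttime == starttime) return &s[i];
    return NULL;
}

/* Ticks consumed since the previous sample; 0 for processes first seen now. */
unsigned long long ticks_since(const prev_sample *prev, unsigned long long now) {
    if (!prev || now < prev->ticks) return 0;
    return now - prev->ticks;
}
```

Processes that exited need no special case: they are absent from the new scan, so their old samples are simply dropped when the sample table is rebuilt. A linear search is enough for a sketch; a hash table keyed by PID would scale better.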
Phase 5: Memory Percentage (2-3 hours)
- Read /proc/meminfo for total memory
- Calculate RSS as percentage
- Add human-readable size formatting
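Human-readable sizes like the 94M and 2.1G in the example output can come from a short formatter. A sketch, assuming input in kB and binary (1024-based) units; the name `format_kb` and the rounding choices are this example's own.

```c
#include <stdio.h>

/* Formats a size given in kB as, for example, "512K", "94M" or "2.1G". */
void format_kb(unsigned long kb, char *out, size_t out_size) {
    if (kb >= 1024UL * 1024UL)
        snprintf(out, out_size, "%.1fG", (double)kb / (1024.0 * 1024.0));
    else if (kb >= 1024UL)
        snprintf(out, out_size, "%luM", kb / 1024UL);  /* rounds down */
    else
        snprintf(out, out_size, "%luK", kb);
}
```

An 8-byte buffer is enough for any value a 32-bit count of kB can hold; `snprintf` truncates safely if the buffer is smaller.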
Phase 6: Detail Mode (3-4 hours)
- -p PID flag
- Read fd directory
- Read environment (if permitted)
- Show memory maps summary
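When reading the environment, note that /proc/[pid]/environ (like cmdline) is not line-oriented: entries are separated by NUL bytes, so fgets and printf("%s") on the raw buffer show only the first one. A minimal sketch of splitting such a buffer; `split_nul_entries` is a hypothetical helper, and it assumes the caller stored one extra NUL after the bytes read.

```c
#include <stddef.h>
#include <string.h>

/* Stores a pointer to each NUL-terminated entry of buf in out[] and returns
 * how many were found (at most max). buf holds len bytes read from the file
 * and must have one additional '\0' at buf[len]. */
size_t split_nul_entries(const char *buf, size_t len,
                         const char **out, size_t max) {
    size_t count = 0, pos = 0;
    while (pos < len && count < max) {
        if (buf[pos] != '\0') out[count++] = buf + pos;  /* skip empty entries */
        pos += strlen(buf + pos) + 1;                    /* jump past the NUL */
    }
    return count;
}
```

The pointers refer into the original buffer, so nothing is copied and the buffer must outlive the `out` array.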
Phase 7: Real-time Mode (4-6 hours)
- Loop with sleep
- Clear and redraw screen
- Handle keyboard input (q to quit)
- Optional: ncurses for better display
5.11 Key Implementation Decisions
| Decision | Trade-offs |
|---|---|
| Parse stat vs status | stat has more fields; status is more readable |
| Store all processes vs stream | Store: can sort. Stream: less memory |
| ncurses vs printf | ncurses: better display. printf: simpler |
| Sample interval | Shorter: more responsive. Longer: more accurate |
| Handle errors | Skip silently vs warn. Usually skip for race conditions |
6. Testing Strategy
6.1 Unit Tests
| Test | Input | Expected Result |
|---|---|---|
| parse_stat normal | "123 (cmd) S 1 …" | pid=123, comm="cmd", state='S' |
| parse_stat with space | "123 (my cmd) S 1 …" | comm="my cmd" |
| parse_stat with paren | "123 ((evil)) S 1 …" | comm="(evil)" |
| uid_to_username(0) | 0 | "root" |
| uid_to_username(65534) | 65534 | "nobody" or "65534" |
6.2 Integration Tests
# Test basic listing
$ ./myps | head -5
# Should show header and processes
# Test specific process
$ ./myps -p $$
# Should show current shell details
# Compare with ps
$ ./myps | wc -l
$ ps aux | wc -l
# Counts should be similar
# Test CPU percentage (run something busy)
$ yes > /dev/null &
$ ./myps --sort cpu | head -5
# 'yes' should be near top
# Test process disappearing
$ (sleep 0.1) & ./myps -p $!
# Should handle gracefully
6.3 Edge Cases to Test
| Case | Setup | Expected Behavior |
|---|---|---|
| Process dies during read | Kill process mid-scan | Skip gracefully |
| Kernel threads | Check /proc/2/stat | Should work (may have empty cmdline) |
| Zombie process | Create with bad parent | Show Z state |
| Stopped process | ^Z a process | Show T state |
| Permission denied | Read other user’s fd | Note in output |
| Very long command | Process with 1000-char cmdline | Truncate display |
| Many processes | 1000+ processes | Complete in <1s |
6.4 Verification Commands
# Compare with top
$ ./myps --sort cpu | head -10
$ top -b -n 1 | head -15
# Top consumers should match
# Compare with ps
$ ./myps -p 1
$ ps -p 1 -o pid,ppid,uid,stat,vsz,rss,comm
# Fields should match
# Test memory calculation
$ ./myps | awk 'NR > 1 {sum += $8} END {print sum}'   # RSS is column 8 (column 6 is %MEM)
$ free -m | awk '/Mem:/ {print $3}'
# Should be in same ballpark if every RSS value is printed in MB (RSS overcounts shared pages)
# Test for memory leaks
$ valgrind --leak-check=full ./myps
7. Common Pitfalls & Debugging
Problem 1: “Segfault when parsing comm field”
- Why: Process name contains ‘)’ or spaces
- Fix: Find the LAST ‘)’ in the line, not the first
Problem 2: “CPU percentage over 100%”
- Why: With per-core scaling (as in top), 100% means one full core, so multi-threaded processes can exceed it
- Fix: This is correct for SMP. Divide by num_cores for a normalized percentage.
Problem 3: “Permission denied on /proc/[pid]/fd”
- Why: Can only read fd directory for your own processes (unless root)
- Fix: Skip or show “permission denied” gracefully
Problem 4: “Process vanishes mid-read”
- Why: Process exited between directory scan and file read
- Fix: Handle ENOENT gracefully—the process simply ended
Problem 5: “CPU percentage is always 0”
- Why: Not taking two samples, or using same sample
- Fix: Must sample, wait, sample again to calculate delta
Problem 6: “Numbers don’t match top/htop”
- Why: Different calculation methods, timing
- Fix: Use same interval, same calculation. Small differences are normal.
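Several of these problems come down to interpreting errno after a failed open or read. A small sketch of one way to centralize that decision; the `proc_status` enum and `classify_proc_errno` are hypothetical names.

```c
#include <errno.h>

typedef enum { PROC_OK, PROC_GONE, PROC_DENIED, PROC_ERROR } proc_status;

/* Interprets errno after a failed open or read under /proc/[pid]/. */
proc_status classify_proc_errno(int err) {
    switch (err) {
    case 0:
        return PROC_OK;
    case ENOENT:  /* the directory entry vanished */
    case ESRCH:   /* the process exited while the file was open */
        return PROC_GONE;
    case EACCES:
    case EPERM:   /* owned by another user */
        return PROC_DENIED;
    default:
        return PROC_ERROR;
    }
}
```

A caller might skip the process silently on PROC_GONE, print a placeholder on PROC_DENIED, and report only PROC_ERROR, since that is the one case that suggests a real bug.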
8. Extensions & Challenges
8.1 Easy Extensions
| Extension | Description | Learning |
|---|---|---|
| Tree view | Show process hierarchy | Parent/child relationships |
| Color output | Color by state/cpu | Terminal escape codes |
| Thread view | Show /proc/[pid]/task/* | Thread enumeration |
| Search | Filter by name | String matching |
8.2 Advanced Challenges
| Challenge | Description | Learning |
|---|---|---|
| cgroups | Read cgroup info | Container monitoring |
| Network connections | Parse /proc/net/* | Network state |
| I/O stats | Read /proc/[pid]/io | Disk I/O monitoring |
| PSS calculation | Parse /proc/[pid]/smaps | Accurate memory |
| Historical graphs | Track over time | Data visualization |
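For the PSS challenge, the per-mapping values in smaps can be added up by scanning for lines that begin with "Pss:". A minimal sketch, assuming the file contents are already in a string; `sum_pss_kb` is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Adds up every "Pss:" line in the text of /proc/[pid]/smaps (values in kB). */
unsigned long sum_pss_kb(const char *smaps_text) {
    unsigned long total = 0, value;
    const char *line = smaps_text;
    while (line && *line) {
        /* Match at the start of a line only, so "SwapPss:" is not counted. */
        if (strncmp(line, "Pss:", 4) == 0 &&
            sscanf(line + 4, "%lu", &value) == 1)
            total += value;
        line = strchr(line, '\n');
        if (line) line++;  /* advance to the next line */
    }
    return total;
}
```

Reading smaps is noticeably slower than reading stat, so a monitor would probably compute this only in the detailed view of a single process.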
8.3 Research Topics
- How does htop achieve its rich display?
- What is /proc/[pid]/smaps_rollup?
- How do containers isolate /proc views?
- What’s the difference between /proc and /sys?
9. Real-World Connections
9.1 Production Systems Using This
| System | How It Uses /proc | Notable Feature |
|---|---|---|
| top/htop | Process monitoring | Real-time updates |
| ps | Process listing | Snapshot view |
| Prometheus node_exporter | Metrics collection | Time series |
| Docker stats | Container metrics | cgroup integration |
| sysstat (sar) | Historical data | Long-term trends |
| lsof | Open file listing | /proc/[pid]/fd |
9.2 How the Pros Do It
htop:
- Uses ncurses for display
- Reads /proc efficiently
- Sorts and filters in-memory
- Updates incrementally
node_exporter:
- Exposes /proc data as Prometheus metrics
- Handles many metric types
- Designed for continuous collection
9.3 Reading the Source
# htop source
$ git clone https://github.com/htop-dev/htop
$ less htop/linux/LinuxProcessList.c
# procps (ps, top) source
$ git clone https://gitlab.com/procps-ng/procps
$ less procps/proc/readproc.c
# Original htop repository (older code base)
$ git clone https://github.com/hishamhm/htop
$ less htop/linux/Platform.c
10. Resources
10.1 Man Pages
$ man 5 proc # /proc filesystem documentation
$ man 1 ps # ps command options
$ man 1 top # top command
$ man 2 readlink # Read symbolic links
$ man 3 getpwuid # User ID to name
$ man 3 sysconf # System configuration
10.2 Online Resources
10.3 Book Chapters
| Book | Chapters | Topics Covered |
|---|---|---|
| “TLPI” by Kerrisk | Ch. 12 | System and Process Information |
| “Linux System Programming” by Love | Ch. 5 | Process Management |
| “Linux Kernel Development” by Love | Ch. 3 | Process Management |
| “Understanding the Linux Kernel” | Ch. 3 | Processes |
11. Self-Assessment Checklist
Before considering this project complete, verify:
- I can explain what /proc is and how it works
- I can parse /proc/[pid]/stat correctly (including tricky comm fields)
- I understand process states (R, S, D, Z, T)
- I can calculate CPU percentage from two samples
- I understand the difference between VSZ and RSS
- My program handles processes disappearing gracefully
- I can read open file descriptors (for own processes)
- I can answer all the interview questions
- My code has no memory leaks
- Output matches ps/top reasonably closely
12. Submission / Completion Criteria
Your project is complete when:
- Functionality
- Lists all processes with key metrics
- -p PID shows detailed process info
- CPU and memory percentages are reasonably accurate
- Handles errors gracefully
- Quality
- No compiler warnings
- No valgrind errors
- Handles race conditions
- Testing
- Works on systems with many processes
- Output matches ps/top
- Edge cases handled
- Understanding
- Can explain /proc structure
- Can explain calculation methods
- Can read and interpret process state
Next Steps
After completing this project, you understand how to inspect running processes. Consider:
- Project 6: Signal Handler - Interact with processes via signals
- Project 9: Daemon Service - Create long-running processes
- Project 10: File Watcher - Monitor filesystem changes
The /proc knowledge you’ve gained is essential for any systems monitoring or debugging work.