Project 10: File Change Watcher with inotify

Build a file system watcher that monitors directories for changes (create, modify, delete, rename) using Linux’s inotify API, useful for build systems, sync tools, and log monitoring.

Quick Reference

Attribute       Value
Difficulty      Level 3 - Advanced
Time Estimate   2-3 weeks (25-40 hours)
Language        C (primary), Rust/Go (alternatives)
Prerequisites   File I/O, event-driven programming concepts
Key Topics      inotify API, event-driven I/O, recursive watching, filesystem events

1. Learning Objectives

After completing this project, you will be able to:

  • Understand the inotify API, including watch descriptors, event masks, and event structures
  • Implement efficient filesystem monitoring without wasteful polling
  • Handle recursive directory watching, including dynamically created subdirectories
  • Parse and interpret inotify events correctly, handling variable-length event structures
  • Manage watch descriptors, mapping them to paths for meaningful output
  • Handle edge cases like event overflow, watch limits, and rapid changes
  • Build practical tools like build watchers, sync tools, and log monitors

2. Theoretical Foundation

2.1 Core Concepts

Traditional file monitoring uses polling: repeatedly calling stat() on files to detect changes. This is wasteful and has poor latency. inotify is a Linux kernel feature that delivers filesystem events directly to your application.

Polling vs inotify

Polling Approach:                   inotify Approach:
┌────────────────────────────────┐  ┌────────────────────────────────┐
│                                │  │                                │
│ while (true) {                 │  │ // Kernel watches for us       │
│   for each file:               │  │                                │
│     stat(file)                 │  │ inotify_add_watch(dir)         │
│     if changed:                │  │                                │
│       process()                │  │ while (true) {                 │
│   sleep(1)                     │  │   read(inotify_fd)  // blocks  │
│ }                              │  │   process(events)              │
│                                │  │ }                              │
│ Problems:                      │  │                                │
│ - Wastes CPU                   │  │ Benefits:                      │
│ - 1 second latency             │  │ - Zero CPU when idle           │
│ - Doesn't scale to many files  │  │ - Instant notification         │
│ - Misses rapid changes         │  │ - Scales to thousands of files │
│                                │  │ - Kernel does the work         │
└────────────────────────────────┘  └────────────────────────────────┘

How inotify Works:

inotify Architecture

            User Space                          Kernel Space
     ┌─────────────────────────────┐    ┌─────────────────────────────────┐
     │                             │    │                                 │
     │  ┌───────────────────────┐  │    │  ┌────────────────────────────┐│
     │  │  Your Application     │  │    │  │   inotify Subsystem        ││
     │  │                       │  │    │  │                            ││
     │  │  inotify_fd = 5       │◄─┼────┼──┤  Watch List:               ││
     │  │                       │  │    │  │    wd=1 → /home/user/proj  ││
     │  │  read(5, buf, ...)   ─┼──┼────┼─►│    wd=2 → /var/log         ││
     │  │                       │  │    │  │    wd=3 → /tmp/build       ││
     │  │  ┌─────────────────┐  │  │    │  │                            ││
     │  │  │ Event Buffer    │◄─┼──┼────┼──┤  Event Queue:              ││
     │  │  │                 │  │  │    │  │    [CREATE, /proj/foo.c]   ││
     │  │  │ struct inotify  │  │  │    │  │    [MODIFY, /log/syslog]   ││
     │  │  │ _event          │  │  │    │  │    [DELETE, /tmp/x.o]      ││
     │  │  │   .wd           │  │  │    │  │                            ││
     │  │  │   .mask         │  │  │    │  │                            ││
     │  │  │   .cookie       │  │  │    │  │  VFS Hooks:                ││
     │  │  │   .len          │  │  │    │  │    create() → queue event  ││
     │  │  │   .name[]       │  │  │    │  │    unlink() → queue event  ││
     │  │  └─────────────────┘  │  │    │  │    write()  → queue event  ││
     │  │                       │  │    │  │    rename() → queue event  ││
     │  └───────────────────────┘  │    │  └────────────────────────────┘│
     │                             │    │                                 │
     └─────────────────────────────┘    │  Filesystem Layer               │
                                        │  ┌────────────────────────────┐│
                                        │  │ ext4, xfs, btrfs, etc.    ││
                                        │  │                            ││
                                        │  │ /home/user/project/        ││
                                        │  │   ├── main.c               ││
                                        │  │   ├── util.c               ││
                                        │  │   └── Makefile             ││
                                        │  └────────────────────────────┘│
                                        └─────────────────────────────────┘

Event Types:

Event             Meaning                  Example Trigger
IN_CREATE         File/dir created         touch newfile
IN_DELETE         File/dir deleted         rm file
IN_MODIFY         File modified            echo "x" >> file
IN_MOVED_FROM     File moved away          mv file ../other/
IN_MOVED_TO       File moved here          mv ../file .
IN_OPEN           File opened              cat file
IN_CLOSE_WRITE    Writable file closed     After writing
IN_CLOSE_NOWRITE  Read-only file closed    After reading
IN_ATTRIB         Metadata changed         chmod, chown
IN_ISDIR          Event is for directory   (flag, combined with above)
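
Because a single event mask can carry several of these bits at once (for example IN_CREATE together with IN_ISDIR), a small formatting helper is handy for output. The sketch below is one possible shape; the name format_event_mask and the "|" separator are assumptions for illustration, not part of the inotify API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical helper: write a label such as "CREATE" or "CREATE|ISDIR"
 * for the bits set in mask into out (always NUL-terminated). */
static void format_event_mask(uint32_t mask, char *out, size_t size)
{
    static const struct { uint32_t bit; const char *name; } names[] = {
        { IN_CREATE, "CREATE" },           { IN_DELETE, "DELETE" },
        { IN_MODIFY, "MODIFY" },           { IN_MOVED_FROM, "MOVED_FROM" },
        { IN_MOVED_TO, "MOVED_TO" },       { IN_OPEN, "OPEN" },
        { IN_CLOSE_WRITE, "CLOSE_WRITE" }, { IN_CLOSE_NOWRITE, "CLOSE_NOWRITE" },
        { IN_ATTRIB, "ATTRIB" },           { IN_ISDIR, "ISDIR" },
    };
    size_t used = 0;

    assert(out != NULL && size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (!(mask & names[i].bit))
            continue;
        int n = snprintf(out + used, size - used, "%s%s",
                         used ? "|" : "", names[i].name);
        if (n < 0 || (size_t)n >= size - used)
            break;                      /* label truncated; stop appending */
        used += (size_t)n;
    }
}
```

Calling format_event_mask(event->mask, label, sizeof(label)) before printing keeps the flag handling in one place.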

2.2 Why This Matters

Real-World Usage:

File watching is everywhere in modern development:

  • Build systems: webpack --watch, cargo watch, and watch-mode wrappers around make recompile on changes
  • Hot reload: Development servers restart when code changes
  • File sync: Dropbox, syncthing detect changes for sync
  • Log monitoring: tail -f, fail2ban watch log files
  • Backup systems: Detect changes for incremental backup
  • Security: Intrusion detection (watch for unauthorized changes)
  • IDE integration: Auto-refresh file trees, trigger linting

Career Impact:

Understanding inotify demonstrates:

  • Knowledge of Linux internals
  • Event-driven programming skills
  • Efficient resource usage thinking
  • Ability to build developer tools

The Numbers:

inotify is dramatically more efficient than polling:

  • Polling 10,000 files every second: ~10,000 stat() calls/sec, whether or not anything changed
  • inotify: 0 calls when nothing changes, notification as soon as the kernel queues an event
  • CPU cost: scales with the number of changes rather than the number of files watched

2.3 Historical Context

Before inotify (pre-2005):

  • Polling with stat() was the only option
  • dnotify existed but was limited (one signal per directory, one fd per directory)
  • FAM (File Alteration Monitor) used polling with a daemon

inotify (2005, Linux 2.6.13):

  • Single file descriptor for all watches
  • Event queue instead of signals
  • Much more scalable
  • Became the standard on Linux

fanotify (2010, introduced in Linux 2.6.36 and enabled in 2.6.37):

  • A later, more general notification API (not a drop-in replacement for inotify)
  • Supports filesystem-wide watching
  • Used for antivirus, hierarchical storage management
  • More complex API

Other Platforms:

  • macOS: FSEvents (different API, watches whole directory trees rather than single directories)
  • BSD: kqueue with EVFILT_VNODE
  • Windows: ReadDirectoryChangesW
  • Cross-platform: libuv, libfswatch abstract differences

2.4 Common Misconceptions

Misconception 1: “inotify watches files recursively by default”

Reality: Each watch is for ONE directory only. You must:

  • Add watches for all subdirectories manually
  • Watch for IN_CREATE | IN_ISDIR and add watches for new directories
  • Handle directories that appear after initial setup

Misconception 2: “inotify tells you the full path”

Reality: Events contain only the filename within the watched directory. You must:

  • Track which watch descriptor maps to which path
  • Combine the watch path with the filename
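
As a minimal sketch of that second step, the helper below joins a watched directory path with the name carried in an event. The function name build_event_path and its return convention are assumptions for illustration; the watch path itself would come from your own wd-to-path table.

```c
#define _POSIX_C_SOURCE 200809L   /* PATH_MAX under strict -std modes */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Join a watched directory path and an event's file name.
 * Returns 0 on success, -1 if the result would not fit in out. */
static int build_event_path(const char *dir, const char *name,
                            char *out, size_t size)
{
    int n;

    assert(dir != NULL && out != NULL && size > 0);
    if (name == NULL || name[0] == '\0')
        n = snprintf(out, size, "%s", dir);          /* event on the directory itself */
    else if (dir[0] != '\0' && dir[strlen(dir) - 1] == '/')
        n = snprintf(out, size, "%s%s", dir, name);  /* avoid a double slash */
    else
        n = snprintf(out, size, "%s/%s", dir, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Checking the return value matters near PATH_MAX: a silently truncated path would point at the wrong file.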

Misconception 3: “Events arrive immediately”

Reality: Events are batched. Multiple events can arrive in a single read(). You must:

  • Parse the buffer correctly (variable-length events)
  • Handle cookie matching for rename pairs

Misconception 4: “inotify works on any filesystem”

Reality: inotify works on most local filesystems, but its coverage has gaps:

  • NFS and other network filesystems: changes made on other hosts are not reported
  • FUSE filesystems (varies)
  • procfs, sysfs (pseudo-filesystems): not reliably monitorable

3. Project Specification

3.1 What You Will Build

A file system watcher utility with these features:

  1. Basic watching: Monitor directories for file changes
  2. Recursive watching: Automatically watch subdirectories
  3. Event filtering: Select which events to report
  4. Action triggers: Execute commands on events
  5. Pattern matching: Filter by filename patterns (globs)

3.2 Functional Requirements

  1. Watch Management
    • Add watches for specified directories
    • Recursive option to watch all subdirectories
    • Handle dynamically created subdirectories
    • Remove watches when directories are deleted
  2. Event Handling
    • Report all standard events (create, modify, delete, move)
    • Show full path to affected file
    • Handle rename tracking (cookie matching)
    • Detect and report overflow conditions
  3. Filtering
    • Filter events by type (--events=create,modify)
    • Filter by filename pattern (--glob "*.c")
    • Exclude patterns (--exclude "*.swp")
  4. Actions
    • Execute command on event (--exec "make")
    • Pass event info to command ({} substitution)
    • Debounce rapid events (configurable delay)
  5. Output
    • Human-readable format by default
    • JSON format option (--json)
    • Quiet mode (just run actions, no output)

3.3 Non-Functional Requirements

  1. Performance
    • Handle 10,000+ watched directories
    • Process 1,000+ events/second
    • Minimal memory per watch
  2. Reliability
    • Handle watch limit gracefully
    • Recover from overflow events
    • No memory leaks on long runs
  3. Usability
    • Clear error messages
    • Helpful --help output
    • Exit codes indicate status

3.4 Example Usage / Output

# 1. Watch a directory
$ ./mywatcher /home/user/project
Watching /home/user/project (recursive)

# In another terminal, make changes:
$ echo "hello" > /home/user/project/test.txt
$ mkdir /home/user/project/subdir
$ mv /home/user/project/test.txt /home/user/project/subdir/
$ rm /home/user/project/subdir/test.txt

# Output from watcher:
[CREATE]      /home/user/project/test.txt
[MODIFY]      /home/user/project/test.txt
[CLOSE_WRITE] /home/user/project/test.txt
[CREATE]      /home/user/project/subdir/
[ISDIR]       Now watching: /home/user/project/subdir
[MOVED_FROM]  /home/user/project/test.txt
[MOVED_TO]    /home/user/project/subdir/test.txt
[DELETE]      /home/user/project/subdir/test.txt

# 2. Trigger actions on events
$ ./mywatcher --exec "echo 'Changed: {}'" /var/log
[MODIFY]  /var/log/syslog
Changed: /var/log/syslog
[MODIFY]  /var/log/auth.log
Changed: /var/log/auth.log

# 3. Filter by event type
$ ./mywatcher --events=create,delete /tmp
[CREATE]  /tmp/tempfile.abc123
[DELETE]  /tmp/tempfile.abc123

# 4. Filter by pattern
$ ./mywatcher --glob "*.c" /home/user/project
Watching for *.c changes in /home/user/project
[MODIFY]  /home/user/project/main.c
[CREATE]  /home/user/project/util.c

# 5. Build system integration
$ ./mywatcher --glob "*.c" --exec "make" --debounce 500 ./src
Watching for *.c changes in ./src
[MODIFY]  ./src/main.c
[debounce: 500ms]
Running: make
gcc -c main.c -o main.o
gcc main.o -o program
Build successful

# 6. JSON output
$ ./mywatcher --json /tmp
{"event":"CREATE","path":"/tmp/foo.txt","time":"2024-03-15T10:00:00Z"}
{"event":"MODIFY","path":"/tmp/foo.txt","time":"2024-03-15T10:00:00Z"}

# 7. Non-recursive watching
$ ./mywatcher --no-recursive /home/user
# Only watches /home/user, not subdirectories

# 8. Exclude patterns
$ ./mywatcher --exclude "*.swp" --exclude ".git" ./project
# Ignores vim swap files and .git directory

3.5 Real World Outcome

When complete, you will have a tool that can:

  1. Replace simple polling scripts with efficient event-driven monitoring
  2. Power development workflows like automatic recompilation
  3. Monitor log directories for new entries
  4. Track configuration changes in real-time
  5. Integrate with build systems like make or cargo

4. Solution Architecture

4.1 High-Level Design

File Watcher Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                           mywatcher                                          │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                      Watch Manager                                   │   │
│  │                                                                      │   │
│  │  Watch Descriptor Map:                                               │   │
│  │  ┌────────────────────────────────────────────────────────────────┐ │   │
│  │  │  wd=1 → "/home/user/project"                                   │ │   │
│  │  │  wd=2 → "/home/user/project/src"                               │ │   │
│  │  │  wd=3 → "/home/user/project/include"                           │ │   │
│  │  │  wd=4 → "/home/user/project/tests"                             │ │   │
│  │  └────────────────────────────────────────────────────────────────┘ │   │
│  │                                                                      │   │
│  │  Reverse Map (for cleanup):                                          │   │
│  │  ┌────────────────────────────────────────────────────────────────┐ │   │
│  │  │  "/home/user/project" → wd=1                                   │ │   │
│  │  │  ...                                                           │ │   │
│  │  └────────────────────────────────────────────────────────────────┘ │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                      Event Loop                                      │   │
│  │                                                                      │   │
│  │  while (running) {                                                   │   │
│  │      n = read(inotify_fd, buffer, sizeof(buffer));                   │   │
│  │      for each event in buffer:                                       │   │
│  │          path = lookup_path(event.wd) + "/" + event.name             │   │
│  │          if matches_filter(path, event.mask):                        │   │
│  │              if event.mask & IN_ISDIR && event.mask & IN_CREATE:     │   │
│  │                  add_watch_recursive(path)                           │   │
│  │              print_event(event)                                      │   │
│  │              if action_configured:                                   │   │
│  │                  run_action(path)                                    │   │
│  │  }                                                                   │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                      Configuration                                   │   │
│  │                                                                      │   │
│  │  recursive: true                                                     │   │
│  │  events: [IN_CREATE, IN_MODIFY, IN_DELETE, IN_MOVED_FROM/TO]        │   │
│  │  glob_pattern: "*.c"                                                 │   │
│  │  exclude_patterns: ["*.swp", ".git"]                                 │   │
│  │  action: "make"                                                      │   │
│  │  debounce_ms: 500                                                    │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

4.2 Key Components

1. Watch Manager

  • Maintains mapping from watch descriptors to paths
  • Handles adding/removing watches
  • Tracks hierarchy for recursive watching

2. Event Parser

  • Reads from inotify fd
  • Parses variable-length events
  • Handles cookie matching for renames

3. Filter Engine

  • Event type filtering
  • Glob pattern matching
  • Exclusion patterns
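
One way to sketch the filter engine is with POSIX fnmatch(), which implements shell-style globs. The function name name_passes_filters and its parameter layout are assumptions that loosely follow config_t; the rule that exclusions take priority over the include glob is also a design choice of this sketch.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Decide whether an event for file `name` should be reported.
 * glob may be NULL (match everything); excludes may be NULL when
 * exclude_count is 0. */
static int name_passes_filters(const char *name, const char *glob,
                               const char *const *excludes,
                               size_t exclude_count)
{
    assert(name != NULL);
    for (size_t i = 0; i < exclude_count; i++) {
        if (fnmatch(excludes[i], name, 0) == 0)
            return 0;                    /* an exclusion always wins */
    }
    if (glob != NULL && fnmatch(glob, name, 0) != 0)
        return 0;                        /* include glob given but not matched */
    return 1;
}
```

With flags set to 0, a leading dot is matched by "*", so "*.swp" also catches Vim swap files such as ".main.c.swp".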

4. Action Runner

  • Command execution
  • Debouncing logic
  • Placeholder ({}) substitution
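
A minimal sketch of the {} substitution step follows. It assumes the resulting string is passed to `sh -c`, so the path is wrapped in single quotes to survive spaces and shell metacharacters; the names append and build_command are hypothetical, and the quoting assumes {} appears unquoted in the template.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src[0..n) to out at *pos; returns 0, or -1 if it would overflow. */
static int append(char *out, size_t size, size_t *pos, const char *src, size_t n)
{
    if (*pos + n + 1 > size)
        return -1;
    memcpy(out + *pos, src, n);
    *pos += n;
    out[*pos] = '\0';
    return 0;
}

/* Expand every "{}" in tmpl to the single-quoted path.
 * Returns 0 on success, -1 if out is too small. */
static int build_command(const char *tmpl, const char *path,
                         char *out, size_t size)
{
    size_t pos = 0;

    assert(tmpl != NULL && path != NULL && out != NULL && size > 0);
    out[0] = '\0';
    for (const char *p = tmpl; *p != '\0'; ) {
        if (p[0] == '{' && p[1] == '}') {
            if (append(out, size, &pos, "'", 1) < 0)
                return -1;
            for (const char *q = path; *q != '\0'; q++) {
                /* A single quote inside the path becomes '\'' */
                int rc = (*q == '\'') ? append(out, size, &pos, "'\\''", 4)
                                      : append(out, size, &pos, q, 1);
                if (rc < 0)
                    return -1;
            }
            if (append(out, size, &pos, "'", 1) < 0)
                return -1;
            p += 2;
        } else {
            if (append(out, size, &pos, p, 1) < 0)
                return -1;
            p++;
        }
    }
    return 0;
}
```

An alternative that avoids quoting altogether is to pass the path in an environment variable or as a separate argv entry to execvp().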

5. Output Formatter

  • Human-readable format
  • JSON format
  • Quiet mode
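
For the JSON format, file names need escaping because they may contain quotes, backslashes, or control characters. The sketch below produces one line in the shape shown in the --json example; json_escape and format_event_json are hypothetical names, and the fixed 8 KB scratch buffer is an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L   /* gmtime_r under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Write s into out as a JSON string body, escaping quotes, backslashes,
 * and control characters. Returns 0, or -1 if out is too small. */
static int json_escape(const char *s, char *out, size_t size)
{
    size_t pos = 0;

    assert(s != NULL && out != NULL && size > 0);
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        char tmp[16];
        int n;
        if (c == '"' || c == '\\')
            n = snprintf(tmp, sizeof(tmp), "\\%c", c);
        else if (c < 0x20)
            n = snprintf(tmp, sizeof(tmp), "\\u%04x", (unsigned)c);
        else
            n = snprintf(tmp, sizeof(tmp), "%c", c);   /* UTF-8 bytes pass through */
        if (n < 0 || pos + (size_t)n + 1 > size)
            return -1;
        memcpy(out + pos, tmp, (size_t)n);
        pos += (size_t)n;
    }
    out[pos] = '\0';
    return 0;
}

/* Format one event as a single JSON line with a UTC timestamp. */
static int format_event_json(const char *event, const char *path, time_t when,
                             char *out, size_t size)
{
    char escaped[8192], stamp[32];
    struct tm tm_utc;

    assert(event != NULL && path != NULL && out != NULL);
    if (json_escape(path, escaped, sizeof(escaped)) < 0)
        return -1;
    if (gmtime_r(&when, &tm_utc) == NULL)
        return -1;
    strftime(stamp, sizeof(stamp), "%Y-%m-%dT%H:%M:%SZ", &tm_utc);
    int n = snprintf(out, size,
                     "{\"event\":\"%s\",\"path\":\"%s\",\"time\":\"%s\"}",
                     event, escaped, stamp);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```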

4.3 Data Structures

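#include <limits.h>     // PATH_MAX
#include <signal.h>     // signal handling for the running flag
#include <stddef.h>     // size_t
#include <stdint.h>     // uint32_t
#include <time.h>       // time_t

// Watch descriptor to path mapping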
typedef struct {
    int wd;
    char path[PATH_MAX];
} watch_entry_t;

// Dynamic array of watches
typedef struct {
    watch_entry_t *entries;
    size_t count;
    size_t capacity;
} watch_table_t;

// Rename tracking (cookie matching)
typedef struct {
    uint32_t cookie;
    char from_path[PATH_MAX];
    time_t timestamp;
} pending_rename_t;

// Configuration
typedef struct {
    char **paths;               // Paths to watch
    size_t path_count;
    int recursive;              // Watch subdirectories
    uint32_t event_mask;        // Which events to report
    char *glob_pattern;         // Filename filter
    char **exclude_patterns;    // Exclusion patterns
    size_t exclude_count;
    char *action;               // Command to run
    int debounce_ms;            // Debounce delay
    int json_output;            // JSON format
    int quiet;                  // No output
} config_t;

// Main state
typedef struct {
    int inotify_fd;
    watch_table_t watches;
    pending_rename_t *pending_renames;
    size_t rename_count;
    config_t config;
    volatile sig_atomic_t running;  // cleared from a signal handler; needs <signal.h>
} watcher_t;

4.4 Algorithm Overview

Adding Recursive Watches:

add_watch_recursive(path):
    wd = inotify_add_watch(inotify_fd, path, mask)
    if wd < 0:
        if errno == ENOSPC:
            error("Watch limit reached")
        return error

    store_mapping(wd, path)

    if recursive:
        for each entry in readdir(path):
            if entry is directory and not "." or "..":
                if not excluded(entry.name):
                    add_watch_recursive(path + "/" + entry.name)
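
In C, the recursive walk above might look like the following sketch. It separates the tree walk from the action taken per directory; walk_directories, add_watch_visitor, and struct watch_ctx are hypothetical names, exclusion checks are omitted, and storing the wd-to-path mapping is left as a comment.

```c
#define _POSIX_C_SOURCE 200809L   /* lstat, PATH_MAX under strict -std modes */
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <sys/stat.h>

/* Callback invoked once per directory; return nonzero to stop the walk. */
typedef int (*dir_visitor_fn)(const char *path, void *ctx);

/* Visit root and every directory below it (symlinks are not followed). */
static int walk_directories(const char *root, dir_visitor_fn visit, void *ctx)
{
    assert(root != NULL && visit != NULL);
    int rc = visit(root, ctx);          /* watch the parent before its children */
    if (rc != 0)
        return rc;

    DIR *dir = opendir(root);
    if (dir == NULL)
        return 0;                       /* vanished or unreadable: skip it */

    struct dirent *entry;
    while (rc == 0 && (entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        char child[PATH_MAX];
        int n = snprintf(child, sizeof(child), "%s/%s", root, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof(child))
            continue;                   /* path too long: skip */
        struct stat st;
        if (lstat(child, &st) == 0 && S_ISDIR(st.st_mode))
            rc = walk_directories(child, visit, ctx);
    }
    closedir(dir);
    return rc;
}

struct watch_ctx { int inotify_fd; uint32_t mask; int added; };

/* Visitor that adds one inotify watch per directory. */
static int add_watch_visitor(const char *path, void *ctx)
{
    struct watch_ctx *w = ctx;
    int wd = inotify_add_watch(w->inotify_fd, path, w->mask);
    if (wd < 0)
        return errno == ENOSPC ? -1 : 0;   /* stop at the watch limit, skip other errors */
    w->added++;
    /* store_mapping(wd, path) would go here */
    return 0;
}
```

Adding the parent's watch before descending narrows the window in which a new subdirectory could appear unnoticed.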

Event Loop:

event_loop():
    buffer = allocate(EVENT_BUF_LEN)

    while running:
        length = read(inotify_fd, buffer, EVENT_BUF_LEN)
        if length < 0:
            if errno == EINTR:
                continue
            error and exit

        ptr = buffer
        while ptr < buffer + length:
            event = (struct inotify_event *)ptr

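            if event.mask & IN_Q_OVERFLOW:
                # wd is -1 here: events were lost, so rescan the tree
                handle_overflow()
                ptr += sizeof(struct inotify_event) + event.len
                continue

            path = lookup_path(event.wd)
            if event.len > 0:
                full_path = path + "/" + event.name
            else:
                full_path = path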

            if should_process(event, full_path):
                handle_event(event, full_path)

            # Move to next event (variable length!)
            ptr += sizeof(struct inotify_event) + event.len
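
The buffer walk can be exercised without a live inotify descriptor by feeding it bytes directly. The sketch below only tallies what it sees, which makes the special cases visible: IN_Q_OVERFLOW arrives with wd set to -1, and IN_IGNORED means the kernel has dropped a watch. The names dispatch_events and dispatch_stats_t are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/inotify.h>

/* Tallies kept by this sketch; a real watcher would call handlers instead. */
typedef struct { int events; int overflows; int ignored; } dispatch_stats_t;

/* Walk len bytes as returned by read() on an inotify fd. */
static void dispatch_events(const char *buf, size_t len, dispatch_stats_t *stats)
{
    size_t off = 0;

    assert(buf != NULL && stats != NULL);
    while (off + sizeof(struct inotify_event) <= len) {
        struct inotify_event ev;
        memcpy(&ev, buf + off, sizeof(ev));  /* copy the header: no unaligned access */

        if (ev.mask & IN_Q_OVERFLOW)
            stats->overflows++;   /* events were dropped: rescan the tree */
        else if (ev.mask & IN_IGNORED)
            stats->ignored++;     /* watch is gone: drop its wd -> path mapping */
        else
            stats->events++;      /* normal event: resolve the path and handle it */

        /* ev.len counts the padded name that follows the header */
        off += sizeof(struct inotify_event) + ev.len;
    }
}
```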

Rename Tracking:

handle_rename(event, path):
    if event.mask & IN_MOVED_FROM:
        # Store pending rename with cookie
        pending = {cookie: event.cookie, from_path: path, time: now()}
        add_pending(pending)

    if event.mask & IN_MOVED_TO:
        # Look for matching cookie
        pending = find_pending(event.cookie)
        if pending:
            print("RENAMED: %s -> %s", pending.from_path, path)
            remove_pending(pending)
        else:
            print("MOVED_TO: %s (from outside watched tree)", path)

    # Periodically clean up old pending renames (no matching MOVED_TO)
    cleanup_old_pending()
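
A fixed-size table is enough to sketch the cookie matching above. The `used` field, the MAX_PENDING cap, and the function names pending_add, pending_take, and pending_expire are assumptions of this sketch rather than part of the data structures in section 4.3.

```c
#define _POSIX_C_SOURCE 200809L   /* PATH_MAX under strict -std modes */
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_PENDING 16   /* assumed cap for this sketch */

typedef struct {
    uint32_t cookie;
    char from_path[PATH_MAX];
    time_t timestamp;
    int used;            /* slot occupied? (added for this sketch) */
} pending_rename_t;

/* On IN_MOVED_FROM: remember the source path. Returns 0, or -1 if full. */
static int pending_add(pending_rename_t *tab, uint32_t cookie,
                       const char *from, time_t now)
{
    assert(tab != NULL && from != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!tab[i].used) {
            tab[i].used = 1;
            tab[i].cookie = cookie;
            tab[i].timestamp = now;
            snprintf(tab[i].from_path, sizeof(tab[i].from_path), "%s", from);
            return 0;
        }
    }
    return -1;
}

/* On IN_MOVED_TO: copy the matching source path into from and free the slot.
 * Returns 1 if a pair was found, 0 if the file came from outside the tree. */
static int pending_take(pending_rename_t *tab, uint32_t cookie,
                        char *from, size_t size)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (tab[i].used && tab[i].cookie == cookie) {
            snprintf(from, size, "%s", tab[i].from_path);
            tab[i].used = 0;
            return 1;
        }
    }
    return 0;
}

/* Drop entries older than max_age seconds; each one is a file that left
 * the watched tree. Returns the number of entries expired. */
static int pending_expire(pending_rename_t *tab, time_t now, time_t max_age)
{
    int expired = 0;
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (tab[i].used && now - tab[i].timestamp > max_age) {
            tab[i].used = 0;
            expired++;
        }
    }
    return expired;
}
```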

5. Implementation Guide

5.1 Development Environment Setup

# Check inotify support
$ cat /proc/sys/fs/inotify/max_user_watches
8192  # Typical on older kernels; newer kernels scale the default with RAM

# Increase watch limit (temporary)
$ sudo sysctl fs.inotify.max_user_watches=524288

# Increase watch limit (permanent)
$ echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf

# Create project
$ mkdir mywatcher && cd mywatcher
$ touch mywatcher.c mywatcher.h Makefile

# Makefile
$ cat > Makefile << 'EOF'
CC = gcc
CFLAGS = -Wall -Wextra -Werror -g -O2
LDFLAGS =

all: mywatcher

mywatcher: mywatcher.c mywatcher.h
	$(CC) $(CFLAGS) -o $@ mywatcher.c $(LDFLAGS)

clean:
	rm -f mywatcher *.o

test: mywatcher
	./test_watcher.sh

.PHONY: all clean test
EOF
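
The watcher itself can read the same limit at startup to give a clearer error before it runs out of watches. A minimal sketch, assuming a hypothetical helper named read_long_from_file:

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a file such as
 * /proc/sys/fs/inotify/max_user_watches. Returns -1 on any failure. */
static long read_long_from_file(const char *path)
{
    long value = -1;

    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Comparing the result with the number of directories found during the initial walk lets the tool warn before inotify_add_watch() starts failing with ENOSPC.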

5.2 Project Structure

mywatcher/
├── mywatcher.c       # Main implementation
├── mywatcher.h       # Header with structures
├── test_watcher.sh   # Test script
├── Makefile
└── README.md

5.3 The Core Question You’re Answering

“How do you efficiently detect file system changes without continuously polling the disk?”

The answer is inotify—a kernel subsystem that delivers file change events. This is the foundation of every “hot reload” feature, every build watcher, every file sync service.

Think about:

  • How does the kernel know which processes care about which files?
  • What happens when changes occur faster than you can process them?
  • How do you track which watch descriptor corresponds to which path?

5.4 Concepts You Must Understand First

1. inotify API

  • What does inotify_init() return?
  • What is a watch descriptor?
  • How is inotify_add_watch() different from watching a file descriptor?
  • Book Reference: “The Linux Programming Interface” Ch. 19

2. Event Types

  • What’s the difference between IN_MODIFY and IN_CLOSE_WRITE?
  • When do you get IN_CREATE vs IN_MOVED_TO?
  • What does the IN_ISDIR flag indicate?
  • How do cookies help with rename tracking?

3. Event Structure

  • What does struct inotify_event look like?
  • Why is the name field variable length?
  • How do you iterate through multiple events in a buffer?

4. Limitations

  • Why doesn’t inotify watch recursively?
  • What is the watch limit and how do you change it?
  • What happens on IN_Q_OVERFLOW?

5.5 Questions to Guide Your Design

Recursive Watching:

  • When a new directory is created, how do you start watching it?
  • What if files are created in the new directory before you add the watch?
  • How do you handle deeply nested hierarchies efficiently?

Event Batching:

  • Saving a file triggers create+open+write+close—how do you coalesce?
  • What’s a reasonable debounce interval?
  • How do you avoid missing important events during debounce?

Memory Management:

  • How do you store the wd-to-path mappings efficiently?
  • What’s the memory cost per watch?
  • How do you handle cleanup when directories are deleted?

5.6 Thinking Exercise

Event Sequences

What events fire when you echo "hello" > file.txt?

Shell: echo "hello" > file.txt

The shell does:
1. open("file.txt", O_WRONLY|O_CREAT|O_TRUNC, 0644)
   If file is new:
     → IN_CREATE     (file created)
   Then:
     → IN_OPEN       (file opened)

2. write(fd, "hello\n", 6)
   → IN_MODIFY      (file contents changed)

3. close(fd)
   → IN_CLOSE_WRITE (file closed after writing)

So you see 3-4 events for a simple echo!

For `mv a.txt b.txt`:
   → IN_MOVED_FROM (a.txt) with cookie=12345
   → IN_MOVED_TO (b.txt) with cookie=12345
   The cookies match - it's the same operation!

For `mv a.txt ../other/`:
   In watched directory:
     → IN_MOVED_FROM (a.txt) with cookie=12346
   In ../other/ (if watched):
     → IN_MOVED_TO (a.txt) with cookie=12346
   If ../other/ not watched:
     → You only see IN_MOVED_FROM (file left your tree)

For `mkdir subdir`:
   → IN_CREATE (subdir) with IN_ISDIR flag
   You must add a new watch for subdir!

For `rm file.txt`:
   → IN_DELETE (file.txt)

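For `rm -r subdir/` (with subdir watched; order across watches may vary):
   → IN_DELETE for each file inside subdir (reported on subdir's watch)
   → IN_DELETE (subdir) with IN_ISDIR flag on the parent's watch
   → IN_DELETE_SELF for subdir  (only if IN_DELETE_SELF is in the watch mask)
   → IN_IGNORED for subdir      (the kernel removed the watch; drop its wd mapping)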

5.7 Hints in Layers

Hint 1: Basic Setup

int inotify_fd = inotify_init1(IN_NONBLOCK);
if (inotify_fd < 0) {
    perror("inotify_init1");
    exit(1);
}

int wd = inotify_add_watch(inotify_fd, path,
    IN_CREATE | IN_DELETE | IN_MODIFY | IN_MOVED_FROM | IN_MOVED_TO);
if (wd < 0) {
    perror("inotify_add_watch");
    exit(1);
}

// Store mapping: wd -> path

Hint 2: Reading Events

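#define EVENT_BUF_LEN (1024 * (sizeof(struct inotify_event) + 16))
// Align the buffer so the cast to struct inotify_event * is safe (GCC/Clang attribute)
char buffer[EVENT_BUF_LEN]
    __attribute__((aligned(__alignof__(struct inotify_event))));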

ssize_t length = read(inotify_fd, buffer, sizeof(buffer));
if (length < 0) {
    if (errno == EAGAIN) {
        // No events available (non-blocking)
        return;
    }
    perror("read");
    exit(1);
}

// buffer now contains one or more struct inotify_event
// Each event is variable length!

Hint 3: Parsing Events

char *ptr = buffer;
while (ptr < buffer + length) {
    struct inotify_event *event = (struct inotify_event *)ptr;

    // event->wd    = watch descriptor
    // event->mask  = event type(s)
    // event->cookie = rename tracking cookie
    // event->len   = length of name (may be 0)
    // event->name  = filename (if len > 0)

    if (event->len > 0) {
        printf("%s: %s\n", event_name(event->mask), event->name);
    }

    // Move to next event
    ptr += sizeof(struct inotify_event) + event->len;
}

Hint 4: Recursive Watching

When you see IN_CREATE with IN_ISDIR, immediately add a watch for the new directory. Also walk it to catch any files created before the watch was added.

// A directory moved into the tree (IN_MOVED_TO) needs watches too
if ((event->mask & IN_ISDIR) && (event->mask & (IN_CREATE | IN_MOVED_TO))) {
    char new_path[PATH_MAX];
    snprintf(new_path, sizeof(new_path), "%s/%s",
             get_path_for_wd(event->wd), event->name);
    add_watch_recursive(new_path);
}

5.8 The Interview Questions They’ll Ask

  1. “What’s the difference between inotify and polling?”
    • inotify: kernel sends events, zero CPU when idle, instant notification
    • Polling: repeatedly call stat(), wastes CPU, latency = poll interval
  2. “How does inotify handle recursive directory watching?”
    • It doesn’t! Each inotify_add_watch covers ONE directory
    • You must manually add watches for all subdirectories
    • Watch for IN_CREATE | IN_ISDIR to catch new directories
  3. “What happens when events arrive faster than you can process?”
    • Each inotify instance has a bounded event queue (limit set by max_queued_events)
    • If queue overflows, IN_Q_OVERFLOW event is sent
    • Some events may be lost—you might need to rescan
  4. “How would you track file renames across directories?”
    • IN_MOVED_FROM and IN_MOVED_TO events share a cookie
    • Match cookies to pair the “from” and “to”
    • If only one arrives, file left/entered your watched tree
  5. “What are the limitations of inotify?”
    • No recursive watching (must be implemented manually)
    • Watch limit (adjustable via sysctl)
    • Misses remote changes on NFS; unreliable on many FUSE filesystems
    • Event queue can overflow under heavy load

5.9 Books That Will Help

Topic              Book                                             Chapter
inotify            “The Linux Programming Interface” by Kerrisk     Ch. 19
File systems       “APUE” by Stevens & Rago                         Ch. 4
Event-driven I/O   “The Linux Programming Interface” by Kerrisk     Ch. 63
select/poll/epoll  “APUE” by Stevens & Rago                         Ch. 14.4

5.10 Implementation Phases

Phase 1: Basic inotify (2-3 hours)

  • Initialize inotify
  • Add watch for single directory
  • Read and print events
  • Test with touch, rm, mv

Phase 2: Watch Management (2-3 hours)

  • Implement wd-to-path mapping
  • Print full paths in output
  • Handle watch removal on IN_DELETE_SELF

Phase 3: Recursive Watching (3-4 hours)

  • Walk directory tree at startup
  • Add watches for all subdirectories
  • Handle new directory creation
  • Handle directory deletion (remove watch)

Phase 4: Event Filtering (2-3 hours)

  • Parse event mask from command line
  • Implement glob pattern matching
  • Implement exclusion patterns
  • Test various filter combinations

Phase 5: Rename Tracking (2-3 hours)

  • Track cookies for pending renames
  • Match MOVED_FROM/MOVED_TO pairs
  • Handle timeouts for unmatched events
  • Report renames correctly

Phase 6: Actions (2-3 hours)

  • Execute command on events
  • Implement {} path substitution
  • Add debouncing logic
  • Test with build commands

Phase 7: Polish (2-3 hours)

  • Add JSON output option
  • Improve error messages
  • Handle edge cases
  • Write documentation

5.11 Key Implementation Decisions

Decision 1: Blocking vs Non-blocking

  • Blocking: simpler, blocks on read() until events
  • Non-blocking: can handle signals, timeouts, multiple sources
  • Recommendation: Use non-blocking with select/poll for flexibility
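
A minimal sketch of the non-blocking approach with poll() follows. The function name read_with_timeout and its return convention are assumptions of this sketch; it works on any readable descriptor, including one returned by inotify_init1(IN_NONBLOCK).

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable, then read from it.
 * Returns bytes read (> 0), 0 on timeout or interruption, -1 on error.
 * An inotify fd never reports end-of-file, so 0 is unambiguous there. */
static ssize_t read_with_timeout(int fd, void *buf, size_t size, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    assert(fd >= 0 && buf != NULL);
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return errno == EINTR ? 0 : -1;   /* interrupted: caller rechecks its flags */
    if (rc == 0)
        return 0;                         /* timeout: nothing to read */
    if (!(pfd.revents & POLLIN))
        return -1;                        /* POLLERR, POLLHUP, or POLLNVAL */

    ssize_t n = read(fd, buf, size);
    if (n < 0 && (errno == EAGAIN || errno == EINTR))
        return 0;
    return n;
}
```

Returning 0 on EINTR gives the main loop a chance to check a `running` flag set by a signal handler, and the timeout doubles as the debounce tick.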

Decision 2: Watch Table Data Structure

  • Simple array: O(n) lookup, but n is usually small
  • Hash table: O(1) lookup, more complex
  • Recommendation: Start with array, optimize if needed

Decision 3: Event Buffer Size

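  • Too small: read() fails with EINVAL if the buffer cannot hold the next event, and small buffers mean more read() calls
  • Too large: wastes memory
  • Recommendation: 4KB-8KB is typically sufficient; sizeof(struct inotify_event) + NAME_MAX + 1 is the minimum that always fits one event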

Decision 4: Debounce Strategy

  • Per-file: each file has its own timer
  • Global: one timer for all events
  • Recommendation: Start global, add per-file if needed

6. Testing Strategy

6.1 Unit Tests

Test individual components:

#include <assert.h>
#include <string.h>
#include <sys/inotify.h>

// Test event mask parsing
void test_parse_events() {
    assert(parse_events("create") == IN_CREATE);
    assert(parse_events("create,modify") == (IN_CREATE | IN_MODIFY));
    assert(parse_events("all") == IN_ALL_EVENTS);
}

// Test glob matching
void test_glob_match() {
    assert(glob_match("*.c", "main.c") == 1);
    assert(glob_match("*.c", "main.h") == 0);
    assert(glob_match("test_*", "test_foo.c") == 1);
}

// Test watch table
void test_watch_table() {
    watch_table_t table = {0};
    add_watch(&table, 1, "/home/user");
    assert(strcmp(get_path(&table, 1), "/home/user") == 0);
    remove_watch(&table, 1);
    assert(get_path(&table, 1) == NULL);
}
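
The first test above assumes a parse_events helper. One possible shape is sketched here; the set of accepted names beyond "create", "modify", and "all", and the choice to return 0 for an unrecognized token, are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/inotify.h>

/* Parse a comma-separated list such as "create,modify" into an inotify
 * mask. Returns 0 if any token is unrecognized. */
static uint32_t parse_events(const char *spec)
{
    static const struct { const char *name; uint32_t bit; } table[] = {
        { "create", IN_CREATE },  { "delete", IN_DELETE },
        { "modify", IN_MODIFY },  { "move", IN_MOVE },
        { "attrib", IN_ATTRIB },  { "close_write", IN_CLOSE_WRITE },
        { "all", IN_ALL_EVENTS },
    };
    uint32_t mask = 0;

    assert(spec != NULL);
    while (*spec != '\0') {
        size_t len = strcspn(spec, ",");     /* length of the current token */
        uint32_t bit = 0;
        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
            if (strlen(table[i].name) == len &&
                strncmp(spec, table[i].name, len) == 0)
                bit = table[i].bit;
        }
        if (bit == 0)
            return 0;                        /* unknown or empty token */
        mask |= bit;
        spec += len;
        if (*spec == ',')
            spec++;
    }
    return mask;
}
```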

6.2 Integration Tests

Test complete scenarios:

#!/bin/bash
# test_watcher.sh

WATCHER=./mywatcher
TESTDIR=$(mktemp -d)
OUTPUT=$(mktemp)

cleanup() {
    rm -rf "$TESTDIR" "$OUTPUT"
    kill $WATCHER_PID 2>/dev/null
}
trap cleanup EXIT

# Test 1: Detect file creation
echo "Test 1: File creation"
$WATCHER "$TESTDIR" > "$OUTPUT" &
WATCHER_PID=$!
sleep 0.5
touch "$TESTDIR/newfile.txt"
sleep 0.5
kill $WATCHER_PID
if grep -q "CREATE.*newfile.txt" "$OUTPUT"; then
    echo "  PASS"
else
    echo "  FAIL"
fi

# Test 2: Detect modification
echo "Test 2: File modification"
touch "$TESTDIR/existing.txt"
$WATCHER "$TESTDIR" > "$OUTPUT" &
WATCHER_PID=$!
sleep 0.5
echo "hello" >> "$TESTDIR/existing.txt"
sleep 0.5
kill $WATCHER_PID
if grep -q "MODIFY.*existing.txt" "$OUTPUT"; then
    echo "  PASS"
else
    echo "  FAIL"
fi

# Test 3: Recursive watching
echo "Test 3: Recursive watching"
mkdir -p "$TESTDIR/sub1/sub2"
$WATCHER "$TESTDIR" > "$OUTPUT" &   # recursion is the default (see --no-recursive)
WATCHER_PID=$!
sleep 0.5
touch "$TESTDIR/sub1/sub2/deep.txt"
sleep 0.5
kill $WATCHER_PID
if grep -q "CREATE.*deep.txt" "$OUTPUT"; then
    echo "  PASS"
else
    echo "  FAIL"
fi

# Test 4: New directory watching
echo "Test 4: New subdirectory"
$WATCHER "$TESTDIR" > "$OUTPUT" &   # recursion is the default (see --no-recursive)
WATCHER_PID=$!
sleep 0.5
mkdir "$TESTDIR/newdir"
touch "$TESTDIR/newdir/file.txt"
sleep 0.5
kill $WATCHER_PID
if grep -q "CREATE.*file.txt" "$OUTPUT"; then
    echo "  PASS"
else
    echo "  FAIL"
fi

# Test 5: Rename tracking
echo "Test 5: Rename tracking"
touch "$TESTDIR/original.txt"
$WATCHER "$TESTDIR" > "$OUTPUT" &
WATCHER_PID=$!
sleep 0.5
mv "$TESTDIR/original.txt" "$TESTDIR/renamed.txt"
sleep 0.5
kill $WATCHER_PID
if grep -q "MOVED_FROM.*original.txt" "$OUTPUT" && \
   grep -q "MOVED_TO.*renamed.txt" "$OUTPUT"; then
    echo "  PASS"
else
    echo "  FAIL"
fi

echo "All tests completed"

6.3 Edge Cases to Test

  1. Watch limit: Add watches until ENOSPC
  2. Event overflow: Generate events faster than reading
  3. Rapid changes: Create+delete+create same file quickly
  4. Unicode filenames: Test with UTF-8 names
  5. Long paths: Near PATH_MAX length
  6. Symlinks: Watch symlink vs target
  7. Permission denied: Directory becomes unreadable
  8. Directory deleted: Watch for IN_DELETE_SELF and IN_IGNORED
  9. Unmount: Filesystem unmounted while watching
  10. Race conditions: Directory deleted during recursive add

6.4 Verification Commands

# Find the inotify file descriptor(s) of the watcher
$ ls -l /proc/$(pgrep mywatcher)/fd | grep inotify

# Check the per-user watch limit
$ cat /proc/sys/fs/inotify/max_user_watches

# List watches for a process (one "inotify wd:..." line per watch)
$ cat /proc/$(pgrep mywatcher)/fdinfo/3  # fd 3 is often inotify

# Stress test with many files (assumes the watcher is running on /tmp/test)
$ for i in {1..1000}; do touch /tmp/test/file$i; done

# Test event overflow
$ for i in {1..10000}; do touch /tmp/test/file$i; rm /tmp/test/file$i; done

# Memory usage
$ ps -o rss,vsz,comm -p $(pgrep mywatcher)

# Valgrind for leaks
$ valgrind --leak-check=full ./mywatcher /tmp &
# ... generate some events ...
# kill and check output

7. Common Pitfalls & Debugging

Problem 1: “Missing events for new subdirectories”

Symptom: Files created in new directories are not reported

Why: No watch was added for the new directory when an event with IN_CREATE | IN_ISDIR arrived

Fix:

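// A directory appeared inside a watched directory. Directories moved
// into the tree arrive as IN_MOVED_TO | IN_ISDIR and need the same handling.
if ((event->mask & IN_ISDIR) && (event->mask & (IN_CREATE | IN_MOVED_TO))) {
    char new_path[PATH_MAX];
    int n = snprintf(new_path, sizeof(new_path), "%s/%s",
                     get_path_for_wd(event->wd), event->name);

    // Skip paths that would not fit (snprintf would have truncated them)
    if (n > 0 && (size_t)n < sizeof(new_path)) {
        // Add watch for new directory
        int wd = inotify_add_watch(inotify_fd, new_path, watch_mask);
        if (wd >= 0) {
            add_to_watch_table(wd, new_path);

            // Files may have been created between the mkdir and the
            // inotify_add_watch call; walk the directory to catch them
            walk_and_report(new_path);
        }
    }
}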
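
The fix above calls walk_and_report without showing it. One possible shape for that walk is sketched below, assuming a single non-recursive pass over the new directory; `walk_new_dir` and the `entry_cb` callback type are hypothetical names, not part of the project's required interface.

```c
#define _POSIX_C_SOURCE 200809L  /* for lstat() and PATH_MAX */
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical callback: invoked once per entry found in the directory. */
typedef void (*entry_cb)(const char *path, int is_dir, void *ctx);

/* Report the immediate children of dir_path.
 * Returns the number of entries seen, or -1 if the directory could not
 * be opened (for example, it was deleted again in the meantime). */
int walk_new_dir(const char *dir_path, entry_cb cb, void *ctx) {
    assert(dir_path != NULL);

    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;

        char path[PATH_MAX];
        int n = snprintf(path, sizeof(path), "%s/%s", dir_path, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue;                 /* path would be truncated: skip */

        struct stat st;
        if (lstat(path, &st) != 0)
            continue;                 /* vanished between readdir and lstat */

        if (cb != NULL)
            cb(path, S_ISDIR(st.st_mode), ctx);
        count++;
    }
    closedir(dir);
    return count;
}
```

In a watcher, the callback would typically print a synthetic CREATE line and, for subdirectories, add a watch and call `walk_new_dir` again. Because the watch is added before the walk, a file can show up both in the walk and as a real event, so duplicates should be expected.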

Problem 2: “Events out of order or missing pair”

Symptom: See MOVED_FROM but no MOVED_TO

Why:

  • File moved to unwatched directory
  • Events from different directories can interleave
  • Cookie timeout too short

Fix:

  • Keep pending renames in a list with timestamps
  • Match cookies when MOVED_TO arrives
  • Report unpaired MOVED_FROM after timeout (file left watched tree)

// Store pending rename
if (event->mask & IN_MOVED_FROM) {
    pending_rename_t *pending = add_pending_rename(event->cookie, path);
}

// Match with MOVED_TO
if (event->mask & IN_MOVED_TO) {
    pending_rename_t *pending = find_pending_rename(event->cookie);
    if (pending) {
        printf("RENAMED: %s -> %s\n", pending->from_path, path);
        remove_pending_rename(pending);
    } else {
        printf("MOVED_TO: %s (from outside)\n", path);
    }
}

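// Periodically clean up old pending renames.
// Assumes the pending renames form a singly linked list headed by
// pending_head, each node carrying a time_t timestamp.
void cleanup_pending_renames(int timeout_ms) {
    time_t now = time(NULL);
    pending_rename_t *p = pending_head;
    while (p != NULL) {
        pending_rename_t *next = p->next;  // save: p may be freed below
        // Compare in milliseconds; timeout_ms / 1000 would round
        // sub-second timeouts down to zero
        if (difftime(now, p->timestamp) * 1000.0 > timeout_ms) {
            printf("MOVED_FROM: %s (to outside)\n", p->from_path);
            remove_pending_rename(p);
        }
        p = next;
    }
}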
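
The helpers used above (add_pending_rename, find_pending_rename, remove_pending_rename) are left to the reader. A minimal self-contained sketch of one way to back them is shown here, assuming a small fixed-size table and caller-supplied millisecond timestamps; all names (`pending_table_t`, `pending_add`, `pending_take`, `pending_expire`) and the sizes are illustrative choices, not a prescribed design.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_PENDING 32      /* illustrative capacity */
#define PATH_BUF    4096    /* illustrative path buffer size */

/* One unmatched IN_MOVED_FROM, waiting for its IN_MOVED_TO. */
typedef struct {
    int      in_use;
    uint32_t cookie;               /* event->cookie of the IN_MOVED_FROM */
    int64_t  seen_ms;              /* when the event was read */
    char     from_path[PATH_BUF];
} pending_rename_t;

typedef struct {
    pending_rename_t slots[MAX_PENDING];
} pending_table_t;

void pending_init(pending_table_t *t) {
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* Remember an IN_MOVED_FROM. Returns 0 on success, -1 if the table is full. */
int pending_add(pending_table_t *t, uint32_t cookie, const char *path,
                int64_t now_ms) {
    assert(t != NULL && path != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        pending_rename_t *p = &t->slots[i];
        if (!p->in_use) {
            p->in_use = 1;
            p->cookie = cookie;
            p->seen_ms = now_ms;
            snprintf(p->from_path, sizeof(p->from_path), "%s", path);
            return 0;
        }
    }
    return -1;
}

/* Match an IN_MOVED_TO by cookie. On a hit, copies the source path into
 * out, frees the slot and returns 1. Returns 0 if no entry matches. */
int pending_take(pending_table_t *t, uint32_t cookie, char *out, size_t out_len) {
    assert(t != NULL && out != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        pending_rename_t *p = &t->slots[i];
        if (p->in_use && p->cookie == cookie) {
            snprintf(out, out_len, "%s", p->from_path);
            p->in_use = 0;
            return 1;
        }
    }
    return 0;
}

/* Report and drop entries older than timeout_ms. Returns how many expired. */
int pending_expire(pending_table_t *t, int64_t now_ms, int64_t timeout_ms) {
    assert(t != NULL);
    int expired = 0;
    for (int i = 0; i < MAX_PENDING; i++) {
        pending_rename_t *p = &t->slots[i];
        if (p->in_use && now_ms - p->seen_ms > timeout_ms) {
            printf("MOVED_FROM: %s (to outside)\n", p->from_path);
            p->in_use = 0;
            expired++;
        }
    }
    return expired;
}
```

Passing `now_ms` in, for instance from clock_gettime(CLOCK_MONOTONIC), keeps the timeout logic independent of wall-clock jumps and makes it testable without sleeping. A linear scan is adequate here because only a handful of renames are normally in flight at once.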

Problem 3: “Hit watch limit”

Symptom: inotify_add_watch returns -1 with errno ENOSPC

Why: The per-user limit in /proc/sys/fs/inotify/max_user_watches was exceeded

Fix:

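// Needs <errno.h>, <stdio.h>, <string.h>
int wd = inotify_add_watch(inotify_fd, path, mask);
if (wd < 0) {
    if (errno == ENOSPC) {
        fprintf(stderr, "Watch limit reached. Increase with:\n");
        fprintf(stderr, "  sudo sysctl fs.inotify.max_user_watches=524288\n");
        // Could also try to continue with partial coverage
    } else {
        // Other failures (ENOENT, EACCES, ...): report which path failed
        fprintf(stderr, "inotify_add_watch(%s): %s\n", path, strerror(errno));
    }
    return -1;
}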
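
The error message can also report what the limit currently is. A small sketch follows, assuming the limit is read straight from /proc; `read_long_from_file` and `inotify_watch_limit` are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a text file.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_long_from_file(const char *path) {
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    long value = -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Current per-user watch limit, or -1 if it cannot be determined
 * (for example on a system without /proc mounted). */
long inotify_watch_limit(void) {
    return read_long_from_file("/proc/sys/fs/inotify/max_user_watches");
}
```

Printing this value next to the number of watches the tool has already added makes it easier to tell whether the watcher itself is consuming the budget or whether other processes of the same user are.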

Problem 4: “IN_Q_OVERFLOW events”

Symptom: Receive IN_Q_OVERFLOW event

Why: The kernel event queue filled up (its size is set by /proc/sys/fs/inotify/max_queued_events) because events arrived faster than they were read

Fix:

if (event->mask & IN_Q_OVERFLOW) {
    fprintf(stderr, "Warning: Event queue overflow, some events lost\n");
    // Best practice: do a full rescan of watched directories
    rescan_all_watches();
}
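
After an overflow, the events that did arrive in the same read() describe a state the rescan will rediscover anyway. One option, sketched here under that assumption, is to scan the whole buffer for IN_Q_OVERFLOW first and skip per-event handling when it is present; `scan_batch` and `batch_info_t` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/inotify.h>

/* Summary of one read() worth of inotify events. */
typedef struct {
    int events;      /* complete events found in the buffer */
    int overflowed;  /* non-zero if any event carried IN_Q_OVERFLOW */
} batch_info_t;

/* Walk the variable-length records in buf without dispatching them. */
batch_info_t scan_batch(const char *buf, size_t len) {
    assert(buf != NULL);

    batch_info_t info = {0, 0};
    const size_t hdr = sizeof(struct inotify_event);
    size_t off = 0;

    while (len - off >= hdr) {
        struct inotify_event ev;
        memcpy(&ev, buf + off, hdr);   /* copy the fixed header; avoids unaligned reads */
        if (ev.len > len - off - hdr)
            break;                     /* truncated record: stop */
        if (ev.mask & IN_Q_OVERFLOW)
            info.overflowed = 1;
        info.events++;
        off += hdr + ev.len;           /* name occupies ev.len bytes, padding included */
    }
    return info;
}
```

A read loop could call `scan_batch` on each buffer, run rescan_all_watches() once if `overflowed` is set, and only otherwise dispatch the individual events.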

Problem 5: “Memory leak with dynamic watches”

Symptom: Memory grows over time with directory creates/deletes

Why: Entries are not removed from the watch table when directories are deleted

Fix:

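// IN_DELETE_SELF: the watched directory itself was deleted.
// The kernel then removes the watch and sends IN_IGNORED for the same wd,
// so table cleanup is done in one place to avoid removing an entry twice.
if (event->mask & IN_DELETE_SELF) {
    // Report the deletion here if desired; cleanup happens on IN_IGNORED
}

// IN_IGNORED: the watch is gone (directory deleted, filesystem unmounted,
// or inotify_rm_watch() called). The wd is no longer valid.
if (event->mask & IN_IGNORED) {
    remove_from_watch_table(event->wd);
}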

8. Extensions & Challenges

8.1 Easy Extensions

  1. Colorized output: Color events by type (create=green, delete=red)
  2. Timestamp format: Add ISO 8601 timestamps
  3. Event counting: Report statistics on exit
  4. Exclude hidden files: Skip .dotfiles

8.2 Advanced Challenges

  1. Debounce with coalescing: Group rapid events into one
  2. Remote watching: Forward events over network
  3. Persistent state: Resume watching after restart
  4. fanotify integration: Use fanotify for filesystem-wide watching
  5. Efficient path storage: Use trie for paths with common prefixes

8.3 Research Topics

  1. Cross-platform abstraction: How does libuv abstract inotify/kqueue/FSEvents?
  2. Database triggers: How do databases implement change notification?
  3. Distributed file systems: How do NFS/CIFS handle notifications?
  4. Container filesystems: How does overlayfs interact with inotify?

9. Real-World Connections

9.1 Production Systems Using This

  1. Webpack/Vite: Watch source files for hot module replacement
  2. nodemon: Restart Node.js on file changes
  3. cargo watch: Rebuild Rust projects on changes
  4. Dropbox/syncthing: Detect local changes for sync
  5. systemd.path: Start services when files change
  6. fail2ban: Monitor log files for intrusion attempts
  7. rsync with inotify: Trigger sync on changes

9.2 How the Pros Do It

webpack watch mode:

  • Uses watchpack, a cross-platform wrapper over native watchers (inotify on Linux); older releases built on chokidar
  • Debounces rapid changes
  • Is commonly configured to ignore node_modules (too many files)

Dropbox:

  • Uses inotify on Linux
  • Falls back to polling for unsupported filesystems
  • Has sophisticated conflict detection

IDE file trees:

  • Watch project root recursively
  • Batch updates to avoid UI flicker
  • Handle many file types differently

9.3 Reading the Source

  1. chokidar: Popular Node.js file watcher
    • https://github.com/paulmillr/chokidar
  2. watchman (Facebook): High-performance file watcher
    • https://github.com/facebook/watchman
  3. notify (Rust): Cross-platform file watcher
    • https://github.com/notify-rs/notify
  4. fswatch: Command-line file watcher
    • https://github.com/emcrisostomo/fswatch

10. Resources

10.1 Man Pages

$ man 7 inotify            # Overview and detailed documentation
$ man 2 inotify_init       # Create instance
$ man 2 inotify_add_watch  # Add watch
$ man 2 inotify_rm_watch   # Remove watch

10.2 Online Resources

  • LWN article on inotify: https://lwn.net/Articles/604686/
  • Kernel documentation: Documentation/filesystems/inotify.txt (inotify.rst in recent kernel trees)
  • inotifywait(1) man page (from inotify-tools): a reference command-line implementation

10.3 Book Chapters

Book                  Chapter   Topic
“TLPI” by Kerrisk     Ch. 19    inotify (comprehensive)
“APUE” by Stevens     Ch. 4     Files and Directories
“TLPI” by Kerrisk     Ch. 63    Alternative I/O Models

11. Self-Assessment Checklist

Before considering this project complete, verify:

  • I can explain how inotify works at a high level
  • My watcher detects create, modify, delete, and rename events
  • Recursive watching works for existing and new directories
  • I handle the variable-length event structure correctly
  • Rename tracking works with cookie matching
  • I handle IN_Q_OVERFLOW gracefully
  • I clean up watches when directories are deleted
  • valgrind shows no memory leaks
  • I can answer all five interview questions
  • My tool is useful for real development workflows

12. Submission / Completion Criteria

Your project is complete when:

  1. Basic watching works: Detects all event types correctly
  2. Recursive watching works: Handles existing and new directories
  3. Renames tracked: Cookie matching works correctly
  4. Filters work: Event types, globs, exclusions all functional
  5. Actions work: Can trigger commands on events
  6. Clean code: No memory leaks, handles errors gracefully

Deliverables:

  • mywatcher.c - Main implementation
  • mywatcher.h - Header file
  • Makefile - Build system
  • test_watcher.sh - Test script
  • README.md - Usage documentation

Demo scenario that must work:

# Create test project
$ mkdir -p /tmp/testproj/src

# Start watcher with build command
$ ./mywatcher --recursive --glob "*.c" --exec "echo 'Would rebuild!'" /tmp/testproj &

# Make changes
$ touch /tmp/testproj/src/main.c
# Output: [CREATE] /tmp/testproj/src/main.c
#         Would rebuild!

$ echo "int main() {}" >> /tmp/testproj/src/main.c
# Output: [MODIFY] /tmp/testproj/src/main.c
#         Would rebuild!

$ mkdir /tmp/testproj/src/util
$ touch /tmp/testproj/src/util/helper.c
# Output: [CREATE] /tmp/testproj/src/util/ (directory)
#         Now watching: /tmp/testproj/src/util
#         [CREATE] /tmp/testproj/src/util/helper.c
#         Would rebuild!

$ mv /tmp/testproj/src/main.c /tmp/testproj/src/app.c
# Output: [MOVED_FROM] /tmp/testproj/src/main.c
#         [MOVED_TO]   /tmp/testproj/src/app.c
#         Would rebuild!

# Cleanup
$ kill %1