Project 1: Multi-Source Log Tailer with Rotation Handling

Build a production-grade log tailer that follows multiple files across rotations without missing lines.

Quick Reference

Attribute | Value
Difficulty | Intermediate
Time Estimate | 1-2 weeks
Language | C (Alternatives: Rust, Go)
Prerequisites | Basic C I/O, file descriptors, stat(2) basics
Key Topics | inode vs path, I/O multiplexing, non-blocking reads

1. Learning Objectives

By completing this project, you will:

  1. Distinguish inodes from file paths and explain why log rotation breaks naive tailing.
  2. Implement a multi-file tailer that survives rename() and truncate() rotations.
  3. Use poll() or select() to multiplex reads without blocking.
  4. Detect and recover from file descriptor exhaustion and stale handles.

2. Theoretical Foundation

2.1 Core Concepts

  • File descriptors and inodes: An FD refers to an open file description tied to an inode, not a path name. When logrotate renames a file, the already-open FD keeps referring to the renamed inode, while the path now names a different file (or nothing at all).
  • File rotation patterns: Common patterns include rename + create (app.log -> app.log.1, new app.log), copy-truncate, and time-based filenames.
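  • I/O multiplexing: select()/poll() allow a single thread to wait on multiple FDs without blocking on one slow source. Regular files always report as readable, so for on-disk logs the timeout argument is what paces the loop.
  • Non-blocking reads: On a regular file, read() at end-of-file returns 0 immediately rather than blocking, so a quiet log cannot stall the tailer. O_NONBLOCK matters for pipes, FIFOs, and sockets, where read() would otherwise wait for data.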

2.2 Why This Matters

Production systems rely on logs for debugging. If your tailer misses lines during rotation or blocks on a silent file, operators lose visibility during the most critical moments. Confusing a path with the inode behind it is a classic integration-point failure.

2.3 Historical Context / Background

Unix created the “everything is a file” model and exposed FDs as integers. This simplicity enables powerful composition but also introduces subtle bugs when naming and storage diverge.

2.4 Common Misconceptions

  • “If I reopen the path, I keep reading the same file.” Not if rotation created a new inode.
  • “tail -f just reads forever.” GNU tail -f keeps following the open descriptor; it is tail -F that re-checks the path and reopens the file after rotation.

3. Project Specification

3.1 What You Will Build

A command-line tool that tails multiple log files, follows each across rotation events, and prints annotated output with source file and timestamps.

3.2 Functional Requirements

  1. Multi-file follow: Accept N paths and follow all concurrently.
  2. Rotation handling: Detect rename and truncate rotations and continue from the correct position.
  3. Annotation: Prefix each line with file name and timestamp.
  4. Signal handling: On SIGTERM, flush and close all FDs cleanly.
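
For the signal-handling requirement, one common approach is to have the handler set a flag and let the main loop do the flushing and closing. The sketch below assumes that pattern; the names stop_requested, install_stop_handler, and stop_was_requested are illustrative, not part of the specification.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>

/* Set by the handler; the main loop checks it on every tick. */
static volatile sig_atomic_t stop_requested = 0;

static void on_terminate(int signo)
{
    (void)signo;
    stop_requested = 1;   /* only async-signal-safe work belongs here */
}

/* Installs the SIGTERM handler. Returns 0 on success, -1 on failure. */
int install_stop_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_terminate;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;      /* interrupted blocking calls fail with EINTR */
    return sigaction(SIGTERM, &sa, NULL);
}

/* Returns nonzero once SIGTERM has been received. */
int stop_was_requested(void)
{
    return stop_requested != 0;
}
```

The main loop would then test stop_was_requested() after each wait, flush its output, and close every FD before exiting.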

3.3 Non-Functional Requirements

  • Performance: Handle 1,000+ lines/sec without dropping output.
  • Reliability: No missed lines on rotation under normal conditions.
  • Usability: Clear CLI usage and help text.

3.4 Example Usage / Output

$ ./multitail /var/log/syslog /var/log/nginx/access.log
[2025-01-10 12:00:01] syslog: kernel: device eth0 up
[2025-01-10 12:00:01] access.log: 10.0.0.5 - - "GET /health HTTP/1.1" 200 2
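
The output format shown above could be produced by a small formatting helper. This is a sketch under the assumption that the label is the final path component and the timestamp is passed in as a struct tm; annotate_line is a hypothetical name.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Writes "[YYYY-MM-DD HH:MM:SS] name: line" into out, where name is the
 * part of path after the last '/'. Returns the length snprintf() reports,
 * or -1 if the timestamp cannot be formatted. */
int annotate_line(char *out, size_t out_size, const char *path,
                  const struct tm *when, const char *line)
{
    char stamp[32];
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;

    if (strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", when) == 0)
        return -1;
    return snprintf(out, out_size, "[%s] %s: %s", stamp, name, line);
}
```

A caller would typically obtain the struct tm from localtime_r() at the moment the line is read, since the tailer has no access to the time the line was originally written.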

3.5 Real World Outcome

You run the tool in a terminal and rotate logs with logrotate. Output continues without gaps. Example output:

$ ./multitail /var/log/app.log
[2025-01-10 12:04:05] app.log: worker[1234] started
[2025-01-10 12:04:07] app.log: request_id=8f2 status=200 ms=14
[2025-01-10 12:04:09] app.log: rotated to app.log.1, switching to new inode
[2025-01-10 12:04:10] app.log: request_id=901 status=500 ms=87

4. Solution Architecture

4.1 High-Level Design

┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│  Path Watch  │──▶│ Inode Tracker│──▶│  Read Loop   │
└──────────────┘   └──────────────┘   └──────────────┘
        │                  │                  │
        ▼                  ▼                  ▼
   stat/fstat        reopen/seek          poll/select

Log Tailer Architecture

4.2 Key Components

Component | Responsibility | Key Decisions
FileState | Track fd, inode, offset | Store dev+inode to detect rotation
Watcher | Periodic stat() checks | Time-based vs inotify fallback
ReadLoop | Multiplex reads | poll() with timeout

4.3 Data Structures

struct file_state {
    int fd;
    dev_t dev;
    ino_t ino;
    off_t offset;
    char path[PATH_MAX];
};
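
One way to populate this structure is to open the path and record its identity from fstat() on the new FD. The sketch below repeats the struct so it compiles on its own; file_state_open and its start_at_end parameter are assumptions for illustration.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct file_state {
    int fd;
    dev_t dev;
    ino_t ino;
    off_t offset;
    char path[PATH_MAX];
};

/* Opens path read-only and records its device, inode, and start offset.
 * A nonzero start_at_end skips content that already exists.
 * Returns 0 on success; on failure returns -1 with errno set and fs->fd == -1. */
int file_state_open(struct file_state *fs, const char *path, int start_at_end)
{
    struct stat st;
    int n = snprintf(fs->path, sizeof fs->path, "%s", path);

    fs->fd = -1;
    if (n < 0 || (size_t)n >= sizeof fs->path) {
        errno = ENAMETOOLONG;   /* path does not fit in the struct */
        return -1;
    }
    fs->fd = open(path, O_RDONLY);
    if (fs->fd == -1)
        return -1;
    if (fstat(fs->fd, &st) == -1) {
        int saved = errno;
        close(fs->fd);
        fs->fd = -1;
        errno = saved;
        return -1;
    }
    fs->dev = st.st_dev;
    fs->ino = st.st_ino;
    fs->offset = start_at_end ? st.st_size : 0;
    return 0;
}
```

Taking the identity from fstat() on the FD just opened, rather than from a separate stat(path), avoids a race in which the path is rotated between the two calls.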

4.4 Algorithm Overview

Key Algorithm: rotation detection

  1. fstat(fd) to capture current inode and size.
  2. stat(path) to check current path inode.
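  3. If the inode (or device) differs, finish reading the old FD, then reopen the path and start at offset 0. If the inode matches but the size is smaller than your offset, the file was truncated: keep the FD and reset the offset to 0.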

Complexity Analysis:

  • Time: O(N) per poll tick for N files
  • Space: O(N)

5. Implementation Guide

5.1 Development Environment Setup

sudo apt-get install build-essential

5.2 Project Structure

multitail/
├── src/
│   ├── main.c
│   ├── tailer.c
│   └── tailer.h
├── tests/
│   └── test_rotation.sh
├── Makefile
└── README.md

Log Tailer Project Structure

5.3 The Core Question You’re Answering

“How can I follow a file by inode rather than by name, and detect when the name changes?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Inode vs Path
    • What does stat(path) return vs fstat(fd)?
    • How does rename() affect inode references?
    • Book Reference: “The Linux Programming Interface” Ch. 15
  2. Non-blocking I/O
    • What happens when read() has no data?
    • How does O_NONBLOCK change behavior?
    • Book Reference: “Advanced Programming in the UNIX Environment” Ch. 14
  3. Polling Multiple FDs
    • Differences between select() and poll()
    • What does it mean for an FD to be readable?
    • Book Reference: “The Linux Programming Interface” Ch. 63

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. How frequently should you call stat() to detect rotation?
  2. When should you reopen a file? On inode mismatch, size shrink, or both?
  3. How do you avoid blocking on a quiet file while others are active?
  4. How do you handle a file that disappears temporarily?

5.6 Thinking Exercise

Trace Rotation by Hand

Simulate:

  1. Open app.log and read 100 bytes.
  2. mv app.log app.log.1 and create a new app.log.
  3. Ask: What does your FD point to? What does stat("app.log") return?

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why does tail -f miss lines during rotation?”
  2. “What is the difference between stat() and fstat()?”
  3. “How do you detect a truncated file?”
  4. “Why use poll() over threads?”

5.8 Hints in Layers

Hint 1: Track inode and dev. Store st_dev and st_ino from fstat() and compare with stat(path).

Hint 2: Handle truncation. If stat(path).st_size is smaller than your offset, reset offset to 0.

Hint 3: Use timeouts. Use poll() with a timeout to periodically run rotation checks.
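
As a sketch of Hint 3, the function below waits on one FD with a timeout and reports whether it woke up early. Because regular files always poll as readable, the FD here is assumed to be something like the read end of a pipe used for wake-ups; wait_tick and wake_fd are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <poll.h>

/* Waits up to timeout_ms for wake_fd to become readable.
 * Returns 1 if it is readable, 0 on timeout or if a signal interrupted
 * the wait, and -1 on error or hangup. */
int wait_tick(int wake_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = wake_fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc == -1)
        return errno == EINTR ? 0 : -1;
    if (rc == 0)
        return 0;   /* timeout: time to run rotation checks and read */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

On each return of 0 the loop would run its rotation checks and read any new data from every tracked file, which keeps the per-tick cost at O(N) for N files.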

5.9 Books That Will Help

Topic | Book | Chapter
File I/O model | “The Linux Programming Interface” | Ch. 4
Inodes and metadata | “The Linux Programming Interface” | Ch. 15
Multiplexing | “The Linux Programming Interface” | Ch. 63
Advanced I/O | “Advanced Programming in the UNIX Environment” | Ch. 14

5.10 Implementation Phases

Phase 1: Foundation (2-3 days)

Goals:

  • Parse CLI args
  • Open files and read lines

Tasks:

  1. Implement simple tail for one file.
  2. Add poll() and handle multiple FDs.

Checkpoint: Tail two files without blocking.

Phase 2: Core Functionality (3-5 days)

Goals:

  • Rotation detection
  • Offset tracking

Tasks:

  1. Track inode and size with fstat().
  2. Detect rename/truncate and reopen.

Checkpoint: No missing lines during rotation test.

Phase 3: Polish & Edge Cases (2-3 days)

Goals:

  • Robust error handling
  • Graceful shutdown

Tasks:

  1. Handle file disappearance and recreation.
  2. Add signal handling and cleanup.

Checkpoint: Pass scripted rotation test.

5.11 Key Implementation Decisions

Decision | Options | Recommendation | Rationale
Multiplexing | select() vs poll() | poll() | Scales better for many FDs
Rotation detection | inotify vs stat() | stat() | Simple and portable
Output formatting | raw vs annotated | annotated | Aids debugging and correlation

6. Testing Strategy

6.1 Test Categories

Category | Purpose | Examples
Unit Tests | Validate helpers | inode comparison logic
Integration Tests | End-to-end | rotation script
Edge Case Tests | Failure modes | file deleted mid-read

6.2 Critical Test Cases

  1. Rename rotation: mv app.log app.log.1 while tailing.
  2. Copy-truncate: cp app.log app.log.1 && : > app.log.
  3. FD exhaustion: open many files and ensure errors are surfaced.

6.3 Test Data

Line 1
Line 2
Line 3

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall | Symptom | Solution
Tracking only path | Missed lines after rotation | Compare inode/dev
Blocking read() | Program stalls | Use poll() with timeout
Ignoring truncation | Stalled or missing output after copy-truncate | Reset offset on size shrink

7.2 Debugging Strategies

  • Use lsof -p <pid> to verify which file and inode (the NODE column) each of your FDs targets.
  • Add debug logs showing inode changes and reopen decisions.

7.3 Performance Traps

Polling too frequently wastes CPU. Use a reasonable timeout (e.g., 500ms).


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a --since option that starts output at a given point in time.
  • Add colored output by file source.

8.2 Intermediate Extensions

  • Add inotify to reduce polling when available.
  • Implement JSON output for log pipelines.

8.3 Advanced Extensions

  • Persist offsets to disk for restart continuity.
  • Support reading from compressed rotated logs.

9. Real-World Connections

9.1 Industry Applications

  • Log aggregation agents: Fluent Bit, Filebeat, and Splunk forwarders.
  • Debug tooling: On-call engineers tail multiple logs during incidents.

9.2 Related Open Source Projects

  • Filebeat: https://github.com/elastic/beats - Production log shipper
  • multitail: https://github.com/flok99/multitail - Similar open-source tool

9.3 Interview Relevance

  • Questions about inode/path differences and log rotation handling.
  • Demonstrates systems debugging thinking.

10. Resources

10.1 Essential Reading

  • “The Linux Programming Interface” by Michael Kerrisk - Ch. 4, 15, 63
  • “APUE” by Stevens & Rago - Ch. 14

10.2 Video Resources

  • Log rotation walkthroughs - YouTube (“logrotate inode”)
  • Linux file I/O deep dives - Conference talks

10.3 Tools & Documentation

  • man 2 stat: Metadata retrieval
  • man 2 poll: I/O multiplexing

10.4 Related Projects in This Series

  • Project 2: Connection pools build on FD knowledge.
  • Project 3: Supervisor adds signal handling to FD management.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain inode vs path with a concrete example.
  • I can describe how log rotation affects open FDs.
  • I can explain why poll() prevents blocking.

11.2 Implementation

  • All functional requirements are met.
  • Rotation tests pass reliably.
  • Error handling covers missing files.

11.3 Growth

  • I documented the hardest bug and how I fixed it.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Follows two files without blocking.
  • Detects rename rotation correctly.
  • Clean shutdown on SIGTERM.

Full Completion:

  • Handles copy-truncate rotations.
  • Annotated output with timestamps.

Excellence (Going Above & Beyond):

  • Inotify optimization and offset persistence.
  • Documented performance characteristics.

This guide was generated from SPRINT_5_SYSTEMS_INTEGRATION_PROJECTS.md. For the complete learning path, see the parent directory.