Project 1: Multi-Source Log Tailer with Rotation Handling
Build a production-grade log tailer that follows multiple files through rename and copytruncate rotations while preserving ordered, timestamped output.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1-2 weeks |
| Main Programming Language | C |
| Alternative Programming Languages | Rust, Go |
| Coolness Level | Level 4 - Useful in production |
| Business Potential | Level 3 - Observability tooling |
| Prerequisites | File I/O, stat(), basic polling, errno handling |
| Key Topics | Inodes, open file descriptions, rotation, poll/epoll, time |
1. Learning Objectives
By completing this project, you will:
- Explain and detect file identity across renames using device/inode pairs.
- Implement a tailer that survives both rename and copytruncate log rotation.
- Design a multi-file polling loop that avoids blocking and preserves ordering.
- Produce deterministic output with stable timestamps and reproducible ordering.
- Diagnose and recover from missing files, short reads, and descriptor limits.
2. All Theory Needed (Per-Concept Breakdown)
2.1 File Identity: Inodes, Device IDs, and Open File Descriptions
Fundamentals
A path is not the file; it is a name that resolves to a file identity. The real identity is the inode number on a device. When you call open(), the kernel creates an open file description that contains the current offset and file flags. Your process receives a file descriptor that points to that open file description. This means that if a log file is renamed, your descriptor still points to the same open file description and therefore the same inode, even though the path now points to a different inode. This is why naive tailers stop receiving new lines after rotation. File identity is a pair: (st_dev, st_ino). This pair is stable for the life of the file, even if the name changes. If a file is deleted, the inode can live as long as at least one descriptor still points to it. Understanding this distinction is essential to building a tailer that can survive real-world rotation schemes.
Deep Dive into the concept
The kernel uses separate tables for file descriptors and open file descriptions. The descriptor table is per-process, and each entry points to an open file description in the system-wide table. Each open file description contains the file offset, status flags (O_APPEND, O_NONBLOCK), and a pointer to the inode. When multiple descriptors point to the same open file description, they share a single offset. This happens after fork() or dup(), and it is why a child process can accidentally advance the parent’s file offset. The inode itself is an on-disk structure that holds metadata (mode, owner, size, timestamps, block pointers). Two different paths can point to the same inode via hard links, which is why the (device, inode) pair, not the path, is the only reliable identity for a file.
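A tiny self-check makes the shared-offset behavior concrete. This sketch (the function name and path are illustrative assumptions) writes six bytes, duplicates the descriptor with dup(), and shows that the duplicate sees the same offset:

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Demonstrates that dup()'d descriptors share one file offset because
 * both point at the same open file description. The caller supplies any
 * writable path; it is created/truncated for the demonstration. */
bool dup_shares_offset(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    if (write(fd, "abcdef", 6) != 6) { close(fd); return false; }
    int fd2 = dup(fd);                 /* new FD, same open file description */
    bool shared = (fd2 >= 0) && lseek(fd2, 0, SEEK_CUR) == 6;
    if (fd2 >= 0) close(fd2);
    close(fd);
    return shared;
}
```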
Log rotation interacts with these identities in subtle ways. In the rename pattern, the log file is renamed (e.g., app.log -> app.log.1) and a fresh file is created at the original path. Your tailer that holds an FD to the old inode will continue reading the old file. That might be fine for a few seconds while the writer still writes to the old descriptor, but eventually the writer switches to the new file and your tailer misses new data. In copytruncate, the file is copied to a rotated name and then truncated in place. The inode does not change, but the file size shrinks, causing the old offset to point past EOF. If you do not detect this shrink, you will sit at EOF forever. The only correct response is to reset the offset when size decreases.
To detect rotation, you need a strategy that combines stat() (path-based identity) and fstat() (descriptor-based identity). A resilient tailer keeps both: it monitors the current path identity and compares it to the descriptor’s identity. If they differ, it closes the old descriptor and opens the new file. For copytruncate, the inode stays the same, so you must compare file size and detect a decrease, then seek back to zero. You also need to account for the time window in which the writer still writes to the old FD after rename. A robust approach is to keep the old descriptor open for a short grace window and read it until it reaches EOF, then close it. This avoids data loss during the rename handoff.
Another integration edge is descriptor inheritance. If you exec a child process and you forget to set FD_CLOEXEC, your child will inherit log descriptors, keeping rotated files open and preventing deletion. This creates disk leaks where rotated logs never get freed because some process still holds a descriptor. Production systems often experience this. The correct design is to set O_CLOEXEC on all log descriptors in the tailer, and to close unused descriptors after a fork in the child before exec.
Finally, consider resource limits. Your tailer opens multiple files, and production limits can be low. If you hit RLIMIT_NOFILE, open() returns EMFILE. A robust tailer must handle this gracefully: drop low-priority files, retry with backoff, and surface a visible error rather than crashing silently. This is where file identity knowledge intersects resource management.
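One common mitigation is a capped exponential backoff before retrying open(). The base and cap below are illustrative choices, not requirements:

```c
#include <stdint.h>

/* Returns the delay in milliseconds before retry number `attempt`
 * (0-based): 250 ms, 500 ms, 1 s, ... capped at 30 s. */
uint32_t backoff_ms(unsigned attempt) {
    uint32_t base = 250;      /* first retry after 250 ms */
    uint32_t cap  = 30000;    /* never wait longer than 30 s */
    if (attempt >= 7)         /* 250 << 7 exceeds the cap; clamp early */
        return cap;
    uint32_t delay = base << attempt;
    return delay > cap ? cap : delay;
}
```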
How this fits in projects
This concept is the core of Project 1. You use inode/device identity to detect rename rotations and file size changes to detect copytruncate. It also appears in Project 6, where log aggregation must survive rotation across deployments, and in Project 5 where open FD enumeration reveals leaked descriptors.
Definitions & key terms
- Inode: Kernel metadata record representing a file identity on disk.
- Device ID (st_dev): Identifier for the filesystem device hosting the inode.
- Open file description: Kernel object that stores file offset and flags.
- File descriptor: Process-local integer index that references an open file description.
- FD_CLOEXEC: Flag that closes the descriptor on exec.
- Copytruncate: Rotation strategy that copies then truncates in place.
Mental model diagram (ASCII)
Path "app.log" -> inode 100 on /dev/sda1
Process FD table (per-process)      Open file table (system-wide)
+----+--------+                     +----------------------------+
| 3  | -------+-------------------> | open file description:     |
| 4  | -------+   (after dup)       |   inode=100, offset=2048   |
+----+--------+                     +----------------------------+
Rename rotation:
app.log -> app.log.1 (inode 100)
app.log -> new file (inode 101)
FD still points to inode 100
How it works (step-by-step, with invariants and failure modes)
- open(path) returns an FD to inode A. Invariant: FD -> open file description -> inode A.
- You fstat(fd) and store (dev, inode, size, mtime) as the descriptor identity.
- You periodically stat(path) and compare (dev, inode) to the current FD identity.
- If the inode differs, rotation-by-rename occurred. Open a new FD and optionally keep the old FD for a grace period to drain buffered writes.
- If the inode matches but size < last_offset, copytruncate occurred. Seek to 0 and reset state.
- Failure mode: missing file (ENOENT) during rotation. You must retry with backoff and keep the last known state.
- Failure mode: EMFILE. You must reduce the watched set or apply backoff and alert.
Minimal concrete example
#include <sys/stat.h>
#include <stdbool.h>

struct file_id { dev_t dev; ino_t ino; };

struct file_id stat_path(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0) { /* handle error, e.g. ENOENT mid-rotation */ }
    return (struct file_id){ .dev = st.st_dev, .ino = st.st_ino };
}

struct file_id fstat_fd(int fd) {
    struct stat st;
    if (fstat(fd, &st) != 0) { /* handle error */ }
    return (struct file_id){ .dev = st.st_dev, .ino = st.st_ino };
}

bool rotated(struct file_id path_id, struct file_id fd_id) {
    return path_id.dev != fd_id.dev || path_id.ino != fd_id.ino;
}
Common misconceptions
- “If the path is the same, it is the same file.” False. The path can point to a new inode after rename rotation.
- “tail -f is enough for production.” Not always. tail -F exists specifically because -f follows the descriptor and fails on rename.
- “Copytruncate is safe and trivial.” It can lose lines when the writer and tailer race; you must detect size shrink.
Check-your-understanding questions
- Why can two different paths refer to the same inode?
- What changes in the rename rotation sequence, and what stays the same?
- If file size shrinks but inode is unchanged, what rotation method is likely?
- Why does FD inheritance matter for log rotation?
- What are two strategies for handling EMFILE in a tailer?
Check-your-understanding answers
- Hard links allow multiple paths to point to one inode.
- The path -> inode mapping changes; the open file description and inode behind the FD do not.
- Copytruncate, because the inode remains but size resets.
- Inherited descriptors keep old files open, preventing deletion and disk reclamation.
- Reduce watched set or implement backoff and retry with a visible alert.
Real-world applications
- Log shippers like Fluent Bit and Filebeat track inode and device to avoid missing lines.
- Systemd-journald uses identity rules to manage rotated binary journals.
- Backup agents detect file identity changes to avoid duplicate uploads.
Where you will apply it
- Project 1: See §3.2 Functional Requirements and §5.10 Phase 2.
- Project 6: Log aggregation step in deployment pipeline.
- Also used in: P06 Deployment Pipeline Tool.
References
- “The Linux Programming Interface” (Kerrisk), Chapters 4 and 15.
- “Advanced Programming in the UNIX Environment” (Stevens/Rago), Chapter 3.
- man 2 open, man 2 fstat.
Key insights
File identity is inode + device, not a path name.
Summary
A robust tailer must track kernel identity and detect both rename and copytruncate rotations. File descriptors bind to open file descriptions, which outlive path names. If you treat the path as the identity, your tool will quietly fail in production.
Homework/exercises to practice the concept
- Create a file, open it, rename it, then delete the path; observe that the FD still reads data.
- Simulate copytruncate with cp then : > file and observe the size shrink.
- Write a tiny program that prints (dev, inode) for a path before and after rotation.
Solutions to the homework/exercises
- The FD remains valid and reads content even after the path is gone because the inode is still referenced.
- The inode stays the same but size resets to zero; a tailer must detect this and seek to 0.
- The (dev, inode) pair changes after rename rotation but not after copytruncate.
2.2 Log Rotation Patterns and Tailing Semantics
Fundamentals
Log rotation is the process of moving or truncating a log file so it does not grow forever. The two most common patterns are rename and copytruncate. In rename rotation, the log file is renamed and a new file is created under the original name. In copytruncate, the contents are copied to a rotated file and the original file is truncated in place. The semantics of tailing differ dramatically between these two patterns. A tailer that follows a descriptor continues reading the old file after rename, which misses new data. A tailer that follows by name may reopen the new file but risks missing buffered writes that still go to the old file. For copytruncate, the inode is unchanged but the size shrinks. If your tailer assumes size only grows, it will sit at EOF forever. A correct tailer must detect both patterns and incorporate a safe handoff strategy that avoids duplicate lines or data loss.
Deep Dive into the concept
A log tailer must model the rotation timeline and writer behavior. Many applications keep the log file open and write to the same descriptor for long periods. Rotation tools like logrotate can signal the app to reopen logs, but in practice there is a window where the old descriptor still receives writes after rotation. Your tailer must handle this window. A naive follow-by-name tailer that immediately closes and reopens the new file after rename will miss those late writes. Conversely, a follow-by-descriptor tailer will capture late writes but will never see new content from the new file. The best strategy is a hybrid: detect the rename, open the new file, but keep the old descriptor open for a grace period. During this window, read both files; once the old file reaches EOF and remains idle for a configurable timeout, close it.
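The hybrid strategy above can be sketched as a small per-file state machine. The state names are illustrative, and max_idle_ms mirrors the config key of the same name used in this project:

```c
#include <stdint.h>

enum follow_state { FOLLOW_ACTIVE, FOLLOW_DRAINING, FOLLOW_CLOSED };

/* Decide the next per-file state. inode_matches says whether stat(path)
 * still reports the same inode as the FD; idle_ms is how long the old
 * file has produced no new data while draining. */
enum follow_state next_state(enum follow_state s, int inode_matches,
                             int64_t idle_ms, int64_t max_idle_ms) {
    if (s == FOLLOW_ACTIVE && !inode_matches)
        return FOLLOW_DRAINING;        /* rename detected: drain the old FD */
    if (s == FOLLOW_DRAINING && idle_ms >= max_idle_ms)
        return FOLLOW_CLOSED;          /* old file idle long enough: close */
    return s;
}
```

Each poll iteration feeds the latest observations in; while DRAINING, the tailer reads both the old and new descriptors.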
Copytruncate introduces a different challenge. The inode remains the same, and the file size shrinks to zero. If your tailer continues to read from the old offset, you will remain beyond EOF and never see new data. The correct approach is to detect a size decrease and reset the offset to zero. However, a race exists if the writer is concurrently appending while truncation occurs. To avoid losing lines, you should track inode, size, and a monotonically increasing line counter. When size shrinks, you should emit a rotation event and reset the offset, but you should also consider delaying a short interval to avoid reading partial writes while the file is being copied and truncated.
Another subtle issue is ordering across multiple files. If you tail multiple sources and want a unified stream, you must choose how to order lines. The most stable approach is to parse timestamps from lines and then interleave based on timestamp, breaking ties by source order or by monotonic read order. But timestamps can be missing or malformed. You need a fallback: use the local monotonic clock at read time and attach it as the tailer timestamp. This can introduce reordering under high skew if one file has delayed writes, so the system should indicate ordering policy in output and optionally provide a secondary sequence number.
Log rotation can happen while the tailer is down. If the tailer restarts, it should resume from a stored checkpoint (inode, offset, and a hash of the last line) to avoid duplicates. Because you are not building a production log shipper, you can keep this checkpoint in a local state file. But you must design the state format explicitly and handle mismatches (inode no longer exists, or line hash mismatch). This shows the integration skill: building something that survives restarts and time gaps.
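As a sketch of the checkpoint idea, here is a plain-text stand-in (the project spec itself uses a JSON state file; this one-line format and the function names are illustrative assumptions):

```c
#include <stdio.h>
#include <sys/types.h>

/* Checkpoint format sketch: one line, "<dev> <ino> <offset>". */
int save_checkpoint(const char *state_path, dev_t dev, ino_t ino, off_t off) {
    FILE *f = fopen(state_path, "w");
    if (!f) return -1;
    fprintf(f, "%llu %llu %lld\n", (unsigned long long)dev,
            (unsigned long long)ino, (long long)off);
    return fclose(f);                  /* 0 on success */
}

int load_checkpoint(const char *state_path, dev_t *dev, ino_t *ino, off_t *off) {
    FILE *f = fopen(state_path, "r");
    if (!f) return -1;
    unsigned long long d, i;
    long long o;
    int n = fscanf(f, "%llu %llu %lld", &d, &i, &o);
    fclose(f);
    if (n != 3) return -1;             /* corrupt or truncated state file */
    *dev = (dev_t)d; *ino = (ino_t)i; *off = (off_t)o;
    return 0;
}
```

On restart, the tailer compares the stored (dev, ino) to the current file; a mismatch means the safe choice is to discard the offset.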
Finally, log tailing must account for file permission changes. After rotation, the new file might have different permissions. Your tailer should handle EACCES gracefully, logging the error and continuing to retry with backoff. It should not crash or spin.
How this fits in projects
This concept directly defines your rotation detection logic and tailing policy in Project 1. It also appears in Project 6 when you stream logs during deployment and must survive restarts without missing data.
Definitions & key terms
- Rename rotation: mv app.log app.log.1, then recreate app.log.
- Copytruncate: cp app.log app.log.1, then truncate app.log in place.
- Grace period: Time window where old and new files are tailed simultaneously.
- Checkpoint: Persistent record of inode, offset, and last-line hash.
Mental model diagram (ASCII)
Time ->
T0: app.log (inode 10) grows
T1: rename: app.log -> app.log.1 (inode 10)
T2: new app.log created (inode 11)
T3: writer switches to inode 11
Hybrid tailer:
- keep FD to inode 10
- open new FD to inode 11
- read both until inode 10 idle
How it works (step-by-step, with invariants and failure modes)
- Track (dev, inode, size, offset) per file.
- Poll or use inotify for changes. On activity, read new data.
- On each iteration, compare stat(path) to the stored inode.
- If the inode differs, open the new file; mark the old file as “draining”.
- If size shrinks, treat as copytruncate; reset offset to 0.
- Merge outputs by timestamp or read-order; stamp with tailer time.
- Failure mode: permissions change after rotation -> log the error and back off.
- Failure mode: missing file -> retry with exponential delay.
Minimal concrete example
bool is_copytruncate(off_t last_size, off_t new_size, ino_t last_ino, ino_t new_ino) {
return new_ino == last_ino && new_size < last_size;
}
bool is_rename_rotation(ino_t last_ino, ino_t new_ino) {
return new_ino != last_ino;
}
Common misconceptions
- “inotify solves rotation”: inotify helps detect changes but does not solve identity tracking or copytruncate.
- “Ordering doesn’t matter”: ordering matters when logs are used for incident timelines.
- “Rotation is rare”: rotation can be daily or hourly, and some systems rotate on size.
Check-your-understanding questions
- What is the risk of reopening immediately after rename rotation?
- Why does copytruncate require size tracking, not inode tracking?
- How would you avoid duplicate lines during a tailer restart?
- What ordering policy is most defensible in incident reviews?
Check-your-understanding answers
- You can miss late writes that still go to the old FD.
- The inode does not change, so only size shrink indicates truncation.
- Store a checkpoint and de-duplicate by last-line hash or offset.
- Timestamp-based ordering with explicit tie-breaking and clear policy.
Real-world applications
- Logrotate + tailing agents in production servers.
- Database WAL tailers for replication.
- Monitoring pipelines that merge multiple service logs.
Where you will apply it
- Project 1: See §3.6 Edge Cases and §5.10 Phase 2.
- Project 6: Streaming logs during deployment.
- Also used in: P06 Deployment Pipeline Tool.
References
- Logrotate documentation and man 8 logrotate.
- “The Linux Programming Interface” Ch. 15.
Key insights
Rotation is a behavior of the filesystem and writer together; your tailer must model both.
Summary
Rotation patterns dictate how you detect identity changes and when to reopen files. A correct tailer uses a hybrid strategy to avoid missing lines or duplicates and clearly defines its ordering policy.
Homework/exercises to practice the concept
- Simulate rename rotation with a background writer and measure missed lines.
- Simulate copytruncate and verify that size shrink detection replays from 0.
- Implement a simple checkpoint file and test a restart.
Solutions to the homework/exercises
- A follow-by-name tailer misses lines written to the old FD; a hybrid tailer does not.
- The offset resets to zero and new lines appear again after truncation.
- The tailer reopens and resumes from stored inode/offset or safely restarts from 0 if mismatched.
2.3 I/O Multiplexing and FD Scalability
Fundamentals
When following multiple files, you must wait for data on any of them without blocking on one. I/O multiplexing allows a single thread to wait for readiness on multiple FDs. The classic APIs on Linux are select(), poll(), and the epoll family (epoll_create1(), epoll_ctl(), epoll_wait()). select() has a hard FD limit (FD_SETSIZE, typically 1024), which makes it unreliable for large numbers of files. poll() does not have that limit but still scales linearly with the number of watched FDs. epoll uses a kernel-managed interest list, so the kernel reports only ready FDs, which scales better with large FD counts. For a tailer that watches a few files, poll() is typically sufficient and simpler. For hundreds or thousands of files, epoll is preferable. Understanding these trade-offs is key to building a stable tailer that does not fail under scale.
Deep Dive into the concept
select() works by copying a bitmask from user space to kernel space on every call, and the kernel then scans every bit to determine readiness. This leads to O(n) overhead and a fixed maximum number of FDs. poll() uses an array of struct pollfd and avoids the FD_SETSIZE limitation, but the kernel still scans every entry every time. epoll() is different: you register your FDs and interest events once, and the kernel only reports the FDs that are ready. This makes epoll() efficient for large sets where only a few are active at a time.
In the context of log tailing, readiness does not always mean you can read a full line. A file may be readable but the data ends mid-line. You need to build a line buffer per file that holds partial line fragments until a newline arrives. This buffering is central to correct output: without it, you can emit broken lines or drop data. The multiplexing loop must therefore track per-FD buffers, last read offsets, and parse states.
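A minimal sketch of that per-file line buffering, with illustrative names and a fixed capacity:

```c
#include <stddef.h>
#include <string.h>

/* Per-file carry buffer for partial lines. The size is illustrative. */
struct line_buf { char data[4096]; size_t len; };

/* Append a raw read chunk; this sketch silently stops at capacity
 * (a real tailer would flush or grow the buffer instead). */
void lb_append(struct line_buf *b, const char *chunk, size_t n) {
    if (b->len + n <= sizeof b->data) {
        memcpy(b->data + b->len, chunk, n);
        b->len += n;
    }
}

/* Copy the first complete '\n'-terminated line into out and remove it
 * from the buffer. Returns the line length (including '\n'), or 0 if
 * no full line is buffered yet. */
size_t lb_take_line(struct line_buf *b, char *out, size_t out_cap) {
    for (size_t i = 0; i < b->len; i++) {
        if (b->data[i] == '\n') {
            size_t n = i + 1;
            if (n > out_cap) return 0;
            memcpy(out, b->data, n);
            memmove(b->data, b->data + n, b->len - n);
            b->len -= n;
            return n;
        }
    }
    return 0;
}
```

After each read(), the loop calls lb_append() with the chunk, then drains lb_take_line() until it returns 0; the trailing fragment stays buffered for the next read.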
Another subtlety is that files behave differently from sockets. For regular files, poll() will often report them as always readable because data is considered available even at EOF. This means the tailer could spin in a loop at EOF and consume CPU. You need to explicitly track offsets and only attempt reads when size grows. One pattern is: periodically fstat() the file to check size; if size > offset, read. This decouples file readiness from poll readiness. If you use inotify, you can get notified on write and modify events, but inotify does not capture every write for heavily rotating files, and it can drop events if the queue overflows. Therefore, a robust tailer should combine inotify (or polling) with size checks.
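The size-check gate can be a one-line helper around fstat(); a minimal sketch (the fcntl/unistd headers are for the surrounding read loop, not the helper itself):

```c
#include <sys/stat.h>
#include <stdbool.h>
#include <fcntl.h>
#include <unistd.h>

/* Only attempt a read when fstat() reports growth past our offset; this
 * decouples "poll says readable" from "there are new bytes to read". */
bool has_new_data(int fd, off_t last_offset) {
    struct stat st;
    if (fstat(fd, &st) != 0)
        return false;                 /* treat stat failure as "no data" */
    return st.st_size > last_offset;
}
```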
Scaling also affects FD limits. Each watched file consumes a file descriptor. If you tail hundreds of files, you can hit RLIMIT_NOFILE. Your tailer must track its FD usage and reduce the watch set or implement a sharding approach. For example, a tailer can open files only when they are updated by scanning directory modification times, which reduces the number of open FDs. Another approach is to use openat() to open and close quickly, trading CPU for FD pressure.
The multiplexing loop should also handle EINTR. Signals like SIGCHLD or SIGTERM can interrupt poll() and cause it to return -1 with errno=EINTR. Correct code must retry, but also check for shutdown flags. This is part of building a resilient long-running process that behaves predictably under signals.
How this fits in projects
This concept is used in the main loop of Project 1 for multi-file tailing. It is also relevant to Project 2 where sockets are multiplexed, and Project 6 where the deployment tool monitors multiple streams simultaneously.
Definitions & key terms
- select(): Multiplexing API with FD_SETSIZE limit.
- poll(): Multiplexing API with linear scan but no fixed FD limit.
- epoll(): Linux scalable event notification API.
- EINTR: Error indicating a system call was interrupted by a signal.
Mental model diagram (ASCII)
poll loop
+--------------------+
| poll(fds, n, t) |
+---------+----------+
|
ready fds
v
+--------------------+
| read -> parse -> |
| buffer -> emit |
+--------------------+
How it works (step-by-step, with invariants and failure modes)
- Build an array of watched FDs with desired events (POLLIN).
- Call poll() with a timeout to allow periodic housekeeping.
- For each FD with POLLIN, attempt a read into the per-file buffer.
- Parse complete lines; keep partial line in buffer.
- If EOF and file size has not grown, do not spin; sleep or wait for change.
- Failure mode: EINTR -> retry; failure mode: busy loop at EOF -> mitigate with size check.
Minimal concrete example
int n = poll(pfds, nfds, 250);       /* 250 ms timeout for housekeeping */
if (n < 0) {
    if (errno == EINTR) return;      /* interrupted by a signal: caller retries */
    /* handle real poll error */
}
for (int i = 0; i < nfds; i++) {
    if (pfds[i].revents & POLLIN) { /* read into per-file buffer */ }
}
Common misconceptions
- “poll() tells me when a file has new data”: for regular files it often reports readable even at EOF.
- “select() is fine for any number of files”: FD_SETSIZE can hard-fail at scale.
- “No buffering needed”: logs can split lines across reads.
Check-your-understanding questions
- Why is EOF handling different for regular files compared to sockets?
- What does FD_SETSIZE limit and why does it matter?
- How do you avoid spinning when a file is at EOF?
Check-your-understanding answers
- poll() reports regular files as readable even at EOF, where read() returns 0; sockets only become readable when data arrives or the peer closes.
- select() uses a fixed-size bitmask; exceeding it results in failures.
- Compare file size to last offset and only read when size grows.
Real-world applications
- Tailers and log shippers.
- Network servers using epoll for many connections.
- File watchers in IDEs and build systems.
Where you will apply it
- Project 1: See §4.1 High-Level Design and §5.10 Phase 2.
- Project 2: Connection pool multiplexing.
- Also used in: P02 HTTP Connection Pool.
References
- man 2 poll, man 2 select, man 7 epoll.
- “The Linux Programming Interface” Ch. 63.
Key insights
Multiplexing is not just about waiting on FDs; it is about managing buffers, offsets, and scale.
Summary
To build a tailer that does not block, you need a multiplexing loop with per-file buffering and careful EOF handling. The choice between poll and epoll depends on scale.
Homework/exercises to practice the concept
- Write a small program that polls two files and prints when data is available.
- Create a case where poll() returns readability at EOF and observe the behavior.
- Simulate EINTR by sending SIGUSR1 during a poll and ensure your loop recovers.
Solutions to the homework/exercises
- Use poll() with a timeout and read only when the size increases.
- You will see poll() return ready but read() return 0; this is the EOF spin trap.
- The loop should catch errno == EINTR and retry or handle shutdown flags.
2.4 Timestamping, Ordering, and Deterministic Output
Fundamentals
When tailing multiple files, you must decide how to order events. The simplest approach is “read order”: emit lines in the order they are read from the kernel. However, this can misrepresent real time because different files can be written at slightly different times, or the tailer can read one file earlier than another due to polling. A more reliable approach is to parse timestamps from log lines and then interleave lines by timestamp. This requires a stable parsing policy, handling missing timestamps, and defining tie-breakers. For deterministic output, you should avoid using the system wall clock for ordering and instead attach a monotonic timestamp when a line is read, and then normalize it for output. Determinism matters because you need to compare two runs, write tests, and reproduce a bug.
Deep Dive into the concept
Log lines contain timestamps in various formats: ISO 8601 with timezone, epoch seconds, or custom formats. A tailer that supports multiple formats must define a priority order and fallback when parsing fails. You should never block output waiting for a timestamp to appear; instead, attach a tailer timestamp and optionally mark the source timestamp if present. A robust ordering policy uses a small window buffer: collect lines from all sources for a small interval (e.g., 200 ms), then sort by source timestamp (if parseable) and by read order as a tie-breaker. This keeps output close to real time while reducing reordering artifacts.
Deterministic output requires control of time. In tests, you can inject a mock time source with a fixed starting point and deterministic increments, instead of using clock_gettime(CLOCK_REALTIME). This enables reproducible ordering and stable snapshots. When running in production, you can default to real time but keep the same formatting. The tailer should also include a monotonic sequence number per line to allow deterministic ordering in post-processing.
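A minimal sketch of an injectable time source for tests; the struct name and the fixed step are assumptions:

```c
#include <stdint.h>

/* Deterministic stand-in for clock_gettime(): a fixed start plus a
 * constant step per call, so repeated runs produce identical stamps. */
struct mock_clock { int64_t now_ms; int64_t step_ms; };

int64_t mock_now_ms(struct mock_clock *c) {
    int64_t t = c->now_ms;
    c->now_ms += c->step_ms;   /* each call advances time deterministically */
    return t;
}
```

In production, the same call site would read CLOCK_MONOTONIC instead; only the time source behind the interface changes.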
Another complexity is time skew between files. Different services might run on different clocks, leading to timestamp drift. Your tailer should not assume perfect sync. If you blindly sort by timestamp, you might place lines out of causal order. A better approach is to attach both the parsed timestamp and the read timestamp, and order within a small sliding window. You can also optionally output the delta between these timestamps to quantify skew.
Ordering across file rotations is also tricky. If the tailer reopens a file, the new file might contain older lines (for example, when a new log is created with older content or when the tailer restarts). To avoid confusion, you should include the file identity (inode) in debug output and optionally in the output line metadata. This allows users to understand that a line came from a new file.
How this fits in projects
This concept governs your output format and ensures reproducible tests in Project 1. It also appears in Project 6 where multiple subsystems produce logs that must be merged deterministically.
Definitions & key terms
- Monotonic clock: A clock that never goes backward, suitable for measuring intervals.
- Wall clock: Real time that can jump due to NTP or manual changes.
- Ordering window: A small buffer of lines to reorder by timestamp.
- Sequence number: A monotonically increasing ID for stable ordering.
Mental model diagram (ASCII)
read lines -> attach timestamps -> buffer window -> sort -> emit
                    |                                 |
               source ts?                      tie-break: seq
How it works (step-by-step, with invariants and failure modes)
- For each line, parse timestamp if possible.
- Attach read timestamp from a monotonic clock.
- Place the line into a small buffer window.
- Periodically flush buffer: sort by parsed timestamp if valid, else by read time.
- Use a sequence number to break ties for determinism.
- Failure mode: clock jumps backwards -> monotonic clock avoids this.
Minimal concrete example
struct line_meta {
    uint64_t seq;          /* monotonically increasing sequence number */
    int has_ts;            /* 1 if a source timestamp was parsed */
    int64_t ts_epoch_ms;   /* parsed source timestamp, epoch ms */
    int64_t ts_read_ms;    /* monotonic read timestamp, ms */
};
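Given metadata like the struct above (re-declared here so the sketch stands alone), the flush-time comparator might look like this, following the policy described earlier: parsed timestamp when present, read time as fallback, sequence number as tie-breaker:

```c
#include <stdint.h>

struct line_meta {
    uint64_t seq;          /* monotonically increasing sequence number */
    int has_ts;            /* 1 if a source timestamp was parsed */
    int64_t ts_epoch_ms;   /* parsed source timestamp, epoch ms */
    int64_t ts_read_ms;    /* monotonic read timestamp, ms */
};

/* Total order over buffered lines: negative if a sorts before b. */
int line_cmp(const struct line_meta *a, const struct line_meta *b) {
    int64_t ka = a->has_ts ? a->ts_epoch_ms : a->ts_read_ms;
    int64_t kb = b->has_ts ? b->ts_epoch_ms : b->ts_read_ms;
    if (ka != kb) return ka < kb ? -1 : 1;
    if (a->seq == b->seq) return 0;    /* comparing a record with itself */
    return a->seq < b->seq ? -1 : 1;   /* unique seq makes this a total order */
}
```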
Common misconceptions
- “Wall clock is always safe”: NTP adjustments can reorder lines.
- “Read order equals event order”: not always across multiple files.
- “Parsing timestamps is optional”: without it, ordering loses meaning.
Check-your-understanding questions
- Why is a monotonic clock preferable for read timestamps?
- What is the trade-off of buffering lines before output?
- How do you handle lines with missing timestamps?
Check-your-understanding answers
- It never goes backward, so ordering remains stable even if wall clock jumps.
- Buffering adds latency but improves ordering accuracy.
- Use read timestamp and mark source timestamp as missing.
Real-world applications
- Incident timeline reconstruction across services.
- Log correlation in observability systems.
- Replay systems that need deterministic ordering.
Where you will apply it
- Project 1: See §3.4 Example Usage and §5.10 Phase 3.
- Project 6: Cross-service log aggregation.
- Also used in: P06 Deployment Pipeline Tool.
References
- RFC 3339 for timestamp formats.
- clock_gettime() documentation (man 2 clock_gettime).
Key insights
Ordering is a policy choice; make it explicit and deterministic.
Summary
A production tailer needs a clear ordering policy and deterministic timestamps for reproducible output and tests. Use a monotonic clock for read timestamps and a small buffer to reduce reordering artifacts.
Homework/exercises to practice the concept
- Build a parser for ISO 8601 timestamps and validate it against known samples.
- Implement a fixed time source and verify deterministic output in tests.
- Inject out-of-order timestamps and observe the ordering behavior.
Solutions to the homework/exercises
- Parsing should produce epoch milliseconds; invalid formats return a clear error.
- A fixed time source yields identical outputs across runs.
- The buffer window should reorder within bounds and fall back to read order.
3. Project Specification
3.1 What You Will Build
A CLI tool that tails multiple log files according to a YAML configuration, detects both rename and copytruncate rotations, and emits a unified stream of timestamped lines in stable order. It will:
- Follow file identity across renames using inode/device pairs.
- Detect copytruncate by size shrink and reset offsets safely.
- Merge output across files with a deterministic ordering policy.
- Emit structured rotation events and errors.
- Expose a debug mode that prints file identities and offsets.
Excluded:
- No network shipping to external systems.
- No GUI or web interface.
- No distributed log collection beyond the local machine.
3.2 Functional Requirements
- Multi-file follow: Follow at least 3 files concurrently without blocking.
- Rotation detection: Detect rename and copytruncate rotations reliably.
- Ordering: Emit ordered output with deterministic tie-breaking.
- Config-driven: Accept a YAML config file specifying file paths and options.
- Backoff: Retry missing files with exponential backoff.
- State: Optionally persist a checkpoint file for restart safety.
3.3 Non-Functional Requirements
- Performance: Must handle 1,000 lines/sec across 3 files on a laptop.
- Reliability: No missed lines during rotation windows (within documented limits).
- Usability: Clear log messages for rotation, errors, and state changes.
3.4 Example Usage / Output
$ ./tailer --config logs.yml --order timestamp --state state.json
[tailer] watching 3 files with poll() interval=250ms
[tailer] /var/log/app.log inode=4198401
[tailer] /var/log/api.log inode=4198420
[tailer] /var/log/worker.log inode=4198451
2026-01-01T12:00:01Z app.log INFO started pid=23312
2026-01-01T12:00:02Z api.log WARN upstream timeout id=9f2c
2026-01-01T12:00:02Z worker.log INFO batch=17 processed=200
[tailer] app.log rotated (inode changed 4198401 -> 4201130), reopening
2026-01-01T12:00:10Z app.log INFO rotation detected, continuing
3.5 Data Formats / Schemas / Protocols
Config (YAML):
poll_interval_ms: 250
order: timestamp
state_file: state.json
files:
  - path: /var/log/app.log
    name: app.log
    max_idle_ms: 5000
  - path: /var/log/api.log
    name: api.log
    max_idle_ms: 5000
State file (JSON):
{
  "version": 1,
  "files": {
    "/var/log/app.log": {
      "dev": 2049,
      "ino": 4201130,
      "offset": 12890,
      "last_line_hash": "b2f4..."
    }
  }
}
Output line format:
<tailer_ts> <source_name> <level> <message>
3.6 Edge Cases
- File missing at startup (ENOENT).
- File becomes unreadable after rotation (EACCES).
- Copytruncate causes size shrink while writer is active.
- Multiple rotations within one poll interval.
- Timestamp parsing fails or is missing.
- RLIMIT_NOFILE exceeded.
3.7 Real World Outcome
This is the definitive behavior to compare against.
3.7.1 How to Run (Copy/Paste)
make
./tailer --config logs.yml --order timestamp --state state.json
3.7.2 Golden Path Demo (Deterministic)
- Use fixed timestamps in the test logs, starting at 2026-01-01T12:00:00Z with one-second increments.
- Tailer prints lines in the same order on every run.
3.7.3 If CLI: exact terminal transcript
$ ./tailer --config tests/logs.yml --order timestamp --state tests/state.json
[tailer] watching 2 files with poll() interval=100ms
[tailer] /tmp/a.log inode=1001
[tailer] /tmp/b.log inode=1002
2026-01-01T12:00:00Z a.log INFO boot
2026-01-01T12:00:01Z b.log INFO ready
[tailer] a.log truncated (size decreased), reset offset
2026-01-01T12:00:02Z a.log INFO rotated
Failure demo (missing file):
$ ./tailer --config tests/missing.yml
[tailer] error opening /tmp/missing.log: ENOENT
[tailer] retry in 1s (backoff)
Exit codes:
- 0 on graceful shutdown.
- 2 on invalid config.
- 3 on unrecoverable I/O errors.
4. Solution Architecture
4.1 High-Level Design
+---------------------+
| Config Loader |
+----------+----------+
|
v
+----------+----------+
| File Watch Registry |
+-----+------+--------+
| |
v v
+-----+--+ +--+-----+
| Reader | | Reader |
+--+-----+ +-----+--+
| |
v v
+---------------------+
| Ordering / Buffering|
+----------+----------+
|
v
+---------------------+
| Output Formatter |
+---------------------+
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Config loader | Parse YAML into runtime config | Strict schema + clear errors |
| File registry | Track inode/device/offset per file | Store both path and FD identity |
| Reader loop | Poll and read file content | poll() with size checks |
| Ordering buffer | Merge lines across files | Timestamp ordering with tie-breakers |
| State manager | Persist offsets, inode IDs | JSON state with versioning |
4.3 Data Structures (No Full Code)
struct file_state {
char path[256];
char name[64];
int fd;
dev_t dev;
ino_t ino;
off_t offset;
uint64_t seq;
int draining;
int64_t last_size;
char buffer[8192];
size_t buffer_len;
};
struct line {
uint64_t seq;
int64_t ts_source_ms;
int64_t ts_read_ms;
int has_source_ts;
char source[64];
char text[1024];
};
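The ordering buffer needs a total order over `struct line`: the parsed source timestamp when present, the read timestamp as fallback, and the global `seq` as the final tie-breaker. A comparator sketch usable with `qsort` or a binary heap:

```c
#include <stdint.h>

struct line {
    uint64_t seq;
    int64_t ts_source_ms;
    int64_t ts_read_ms;
    int has_source_ts;
    char source[64];
    char text[1024];
};

/* Total order: source timestamp, falling back to read timestamp,
 * with seq breaking ties so output is fully deterministic. */
static int line_cmp(const void *pa, const void *pb)
{
    const struct line *a = pa, *b = pb;
    int64_t ta = a->has_source_ts ? a->ts_source_ms : a->ts_read_ms;
    int64_t tb = b->has_source_ts ? b->ts_source_ms : b->ts_read_ms;
    if (ta != tb)
        return ta < tb ? -1 : 1;
    if (a->seq != b->seq)
        return a->seq < b->seq ? -1 : 1;
    return 0;
}
```

Because `seq` is unique per line, the comparator never returns 0 for distinct lines, which is what makes the merge reproducible across runs.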
4.4 Algorithm Overview
Key Algorithm: Rotation-aware read loop
- stat(path) and compare with the stored inode.
- Detect rename rotation; open a new FD and mark the old one as draining.
- Detect copytruncate by size shrink; reset offset.
- Read new data if size > offset; parse lines into buffer.
- Merge lines in ordering buffer and emit.
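The per-file check from the steps above can be sketched as a single classification function (error handling trimmed; the identity and offset fields mirror `struct file_state` from section 4.3):

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

enum rot_event { ROT_NONE, ROT_RENAME, ROT_TRUNCATE };

/* One poll-cycle check for a single file: classify what happened
 * since the last cycle. Reading, reopening, and draining the old
 * descriptor are handled by the caller. */
static enum rot_event check_rotation(const char *path, dev_t dev,
                                     ino_t ino, off_t offset)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return ROT_NONE;        /* missing: caller retries with backoff */
    if (st.st_dev != dev || st.st_ino != ino)
        return ROT_RENAME;      /* the path now names a different file */
    if (st.st_size < offset)
        return ROT_TRUNCATE;    /* copytruncate: reset offset to 0 */
    return ROT_NONE;
}
```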
Complexity Analysis:
- Time: O(n) per poll cycle for n files, plus O(m log m) for ordering buffer of m lines.
- Space: O(n) for file state + O(m) for buffered lines.
5. Implementation Guide
5.1 Development Environment Setup
sudo apt-get install -y gcc make libyaml-dev
5.2 Project Structure
log-tailer/
├── src/
│ ├── main.c
│ ├── config.c
│ ├── tailer.c
│ ├── rotate.c
│ ├── order.c
│ └── state.c
├── include/
│ ├── config.h
│ ├── tailer.h
│ └── state.h
├── tests/
│ ├── logs/
│ ├── test_rotation.sh
│ └── test_ordering.sh
├── Makefile
└── README.md
5.3 The Core Question You’re Answering
“How do I follow a file that does not stay the same file?”
You are building a tool that treats file identity as the truth, not file names. This is the difference between a toy tailer and a production tailer.
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Open file descriptions and file identity
- How does FD -> open file description -> inode mapping work?
- Why does rename not change the FD’s target?
- Log rotation patterns
- Rename vs copytruncate behavior.
- I/O multiplexing
- Why poll() can still spin at EOF for regular files.
- Timestamps and ordering
- Why monotonic clocks matter for deterministic output.
5.5 Questions to Guide Your Design
- How will you detect rename rotation without missing late writes?
- What is your strategy for copytruncate detection?
- What ordering policy will you state and implement?
- How will you handle missing or malformed timestamps?
5.6 Thinking Exercise
Rotation Timeline
T0: app.log inode=100 size=10MB
T1: rename to app.log.1 inode=100
T2: new app.log inode=101 size=0
T3: writer still writes to inode=100 for 2s
Questions:
- When should you close inode=100?
- How will you avoid missing lines between T2 and T3?
5.7 The Interview Questions They’ll Ask
- Why does tail -F behave differently from tail -f?
- How do you detect copytruncate safely?
- What is the difference between file name and file identity?
- How do you avoid busy loops at EOF?
5.8 Hints in Layers
Hint 1: Track identity
struct stat st;
stat(path, &st);
Hint 2: Compare inode
if (st.st_ino != file->ino) rotate = 1;
Hint 3: Size shrink means truncation
if (st.st_size < file->offset) file->offset = 0;
Hint 4: Use a grace period
Keep old FD for N seconds, read until idle, then close.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| File I/O model | The Linux Programming Interface | Ch. 4 |
| File attributes, inodes | The Linux Programming Interface | Ch. 15 |
| I/O multiplexing | Advanced Programming in the UNIX Environment | Ch. 14 |
| Time and clocks | The Linux Programming Interface | Ch. 10 |
5.10 Implementation Phases
Phase 1: Foundation (2-3 days)
Goals:
- Parse config.
- Open files and read new lines.
Tasks:
- Implement YAML config loader with clear errors.
- Open files, read from EOF, and print new lines.
Checkpoint: Tailer prints new lines from a single file.
Phase 2: Rotation + Multiplexing (4-6 days)
Goals:
- Multi-file polling.
- Rotation detection (rename + copytruncate).
Tasks:
- Build polling loop with per-file buffers.
- Detect inode change and reopen with grace period.
- Detect size shrink and reset offset.
Checkpoint: Tailer survives manual rotations without losing lines.
Phase 3: Ordering + State (3-4 days)
Goals:
- Deterministic ordering.
- State persistence.
Tasks:
- Implement timestamp parsing and read-time fallback.
- Add ordering buffer with tie-breaking sequence.
- Write and restore checkpoint file.
Checkpoint: Deterministic output across two runs.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Multiplexing API | select, poll, epoll | poll | Simple and enough for 3-50 files |
| Rotation strategy | follow-name, follow-fd, hybrid | hybrid | Avoids missing late writes |
| Ordering policy | read-order, timestamp-order | timestamp + seq | Better incident readability |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Parse timestamps, config, state | timestamp parser |
| Integration Tests | Rotation handling and ordering | rename + copytruncate scripts |
| Edge Case Tests | Missing files, permissions, EMFILE | simulate ENOENT |
6.2 Critical Test Cases
- Rename rotation: Ensure new inode is detected and no lines are lost.
- Copytruncate: Ensure offset reset and new data is read.
- Ordering window: Verify deterministic output under interleaved writes.
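The rename case can be exercised without logrotate. The sketch below demonstrates the property the test must protect: an already-open descriptor still reaches the renamed inode, so late writes are not lost (file names are illustrative):

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Why follow-by-FD loses nothing across a rename: the reader's old
 * descriptor points at the open file description for the renamed
 * inode, so writes made after the rename remain readable. */
static int late_writes_visible(void)
{
    FILE *w = fopen("rename_test.log", "w");
    fputs("before\n", w);
    fflush(w);

    int rfd = open("rename_test.log", O_RDONLY);     /* reader opens by path */
    rename("rename_test.log", "rename_test.log.1");  /* rotation happens */
    fputs("late\n", w);                              /* writer keeps writing */
    fclose(w);

    char buf[64];
    ssize_t n = read(rfd, buf, sizeof buf - 1);      /* reader still sees it */
    close(rfd);
    unlink("rename_test.log.1");
    if (n < 0)
        return 0;
    buf[n] = '\0';
    return strcmp(buf, "before\nlate\n") == 0;
}
```

An integration test can wrap this scenario and additionally assert that the tailer reopens the new `rename_test.log` once it appears.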
6.3 Test Data
# logs/a.log
2026-01-01T12:00:00Z a start
2026-01-01T12:00:01Z a step
# logs/b.log
2026-01-01T12:00:00Z b init
2026-01-01T12:00:02Z b done
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Follow-by-FD only | Miss new file after rename | Hybrid reopen with grace |
| No size shrink detection | Tailer stuck at EOF | Reset offset on shrink |
| No buffering per file | Broken lines | Line buffer per FD |
7.2 Debugging Strategies
- Use strace to confirm open/close and stat calls during rotation.
- Compare inode pairs with ls -i to confirm identity changes.
- Add debug logs for offsets and sizes to catch truncation.
7.3 Performance Traps
- Busy loops at EOF: poll() always reports regular files as readable, so a loop driven purely by readiness spins; sleep for the poll interval between cycles.
- Excessive fstat calls per loop; cache and throttle when idle.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add colorized output per file source.
- Add a --follow-from-start option.
8.2 Intermediate Extensions
- Add an inotify-based change detector with fallback to polling.
- Support JSON output lines with structured fields.
8.3 Advanced Extensions
- Persist checkpoints with checksum validation and recovery policies.
- Add per-file rate limits to avoid noisy sources.
9. Real-World Connections
9.1 Industry Applications
- Log shippers: Filebeat and Fluent Bit rely on inode tracking.
- Incident response: Ordered logs are essential for root cause analysis.
9.2 Related Open Source Projects
- Filebeat: Agent that tails and ships logs with inode tracking.
- Fluent Bit: Lightweight log processor with rotation handling.
9.3 Interview Relevance
- File descriptors, inodes, and rotation questions are common in systems interviews.
10. Resources
10.1 Essential Reading
- The Linux Programming Interface (Kerrisk) - Ch. 4, 10, 15.
- Advanced Programming in the UNIX Environment (Stevens/Rago) - Ch. 14.
10.2 Video Resources
- Talks on log rotation and file identity (conference recordings).
10.3 Tools & Documentation
- logrotate documentation.
- strace usage guides for open/stat.
10.4 Related Projects in This Series
- P02 HTTP Connection Pool for multiplexing techniques.
- P06 Deployment Pipeline Tool for log aggregation.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain inode vs path without notes.
- I can describe rename vs copytruncate behavior.
- I can explain why poll can spin at EOF.
11.2 Implementation
- All functional requirements are met.
- Rotation handling passes tests.
- Output ordering is deterministic.
11.3 Growth
- I can identify a trade-off I made in ordering policy.
- I can explain my grace-period design in an interview.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Tail at least 3 files with rotation detection.
- Correctly detect copytruncate and reset offset.
- Emit ordered output with timestamps.
Full Completion:
- Add checkpoint state and deterministic ordering buffer.
- Provide integration tests for rename and copytruncate.
Excellence (Going Above & Beyond):
- Inotify + polling hybrid mode.
- Configurable rotation grace period and per-file policies.