Project 4: Logging Library
Build a reusable C logging library with pluggable sinks, log levels, and safe concurrency semantics.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | ~1 week |
| Main Programming Language | C |
| Alternative Programming Languages | Rust, Go |
| Coolness Level | Level 6 - production-grade diagnostics |
| Business Potential | Level 6 - reusable logging module |
| Prerequisites | File I/O, time functions, threading basics |
| Key Topics | Sink interfaces, formatting, thread safety |
1. Learning Objectives
By completing this project, you will:
- Design a stable logging API with clear ownership and threading rules.
- Implement pluggable log sinks (stderr, file, callback).
- Provide structured log formats (logfmt/JSON) and level filtering.
- Handle log failures without crashing the application.
- Add log rotation and deterministic output for tests.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Sink Architecture & Thread Safety
Fundamentals
A log sink is a destination for log events (stderr, file, network). A logging library is a boundary between application code and these sinks. It must define how log events are emitted, how sinks are registered, and whether logging is thread-safe. If multiple threads log concurrently, messages can interleave or corrupt output unless you synchronize. The contract must specify whether logging is synchronous, whether callbacks may reenter the logger, and how the library prevents deadlocks.
Deep Dive into the Concept
A robust logging library separates log event creation from log event delivery. The simplest design is synchronous: when log_info() is called, it formats a string and writes to each sink immediately. This is easy to reason about, but it means the caller’s thread pays the cost of I/O. If a sink is slow (e.g., disk or network), logging can become a bottleneck. The contract must document this. For small libraries, synchronous logging is acceptable, but you should still avoid deadlocks.
Thread safety is a core boundary concern. If you use a global logger with multiple sinks, you need a mutex to guard the sink list and output. Without a mutex, two threads can interleave their writes and create corrupted lines. However, holding a mutex while calling a user-provided sink callback can be dangerous: if the callback calls the logger again, you can deadlock. This is a classic reentrancy problem. The safe pattern is: copy the sink list under lock, release the lock, then invoke each sink. For file sinks, you can keep a per-sink lock instead of a global lock, so slow sinks do not block others.
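The copy-under-lock pattern described above can be sketched as follows. This is a minimal illustration, not the project's required implementation; the names `logger`, `log_sink`, and `logger_emit`, and the fixed `MAX_SINKS` bound, are assumptions made for the example.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_SINKS 8

typedef struct { int level; const char *msg; } log_event;
typedef void (*log_sink_fn)(void *ctx, const log_event *ev);
typedef struct { log_sink_fn fn; void *ctx; } log_sink;

typedef struct {
    pthread_mutex_t lock;
    log_sink sinks[MAX_SINKS];
    size_t sink_count;
} logger;

static void logger_emit(logger *lg, const log_event *ev) {
    log_sink snapshot[MAX_SINKS];
    size_t n;

    /* Snapshot the sink list under the lock... */
    pthread_mutex_lock(&lg->lock);
    n = lg->sink_count;
    memcpy(snapshot, lg->sinks, n * sizeof snapshot[0]);
    pthread_mutex_unlock(&lg->lock);

    /* ...then invoke sinks with the lock released, so a sink that
     * logs again re-acquires the mutex instead of deadlocking. */
    for (size_t i = 0; i < n; i++)
        snapshot[i].fn(snapshot[i].ctx, ev);
}
```

The cost of the snapshot is a small fixed copy per log call; in exchange, no user callback ever runs while the logger's internal lock is held.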
A more advanced design is asynchronous logging: push log events onto a queue and have a background thread flush them. This improves performance and avoids blocking, but introduces complexity (queue limits, flush on shutdown, lost logs on crash). For this project, synchronous logging with optional async extension is a good balance. The key is to clearly document that logging is synchronous and thread-safe, and to ensure that each sink is called in a deterministic order.
The sink interface should be simple and stable. A common pattern is:
typedef void (*log_sink_fn)(void *ctx, const log_event *ev);
Here, ctx is a user-provided pointer (for file handles, network sockets, etc.). This pattern keeps the API flexible and avoids global state. Ownership is explicit: the library owns log_event during the call; the sink may not store pointers to its internal buffers unless documented. This is another boundary contract that prevents dangling pointers.
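As one concrete instance of this interface, a file sink can treat `ctx` as a `FILE*`. This is a sketch under the stated ownership rule; the `file_sink` name and the two-field `log_event` are illustrative assumptions.

```c
#include <stdio.h>

typedef struct { const char *ts; const char *msg; } log_event;
typedef void (*log_sink_fn)(void *ctx, const log_event *ev);

static void file_sink(void *ctx, const log_event *ev) {
    FILE *fp = ctx;                      /* ctx carries the destination */
    fprintf(fp, "%s %s\n", ev->ts, ev->msg);
    /* ev is owned by the library only for the duration of this call;
     * a sink must copy any data it wants to keep. */
}
```

Registering the same function with `stderr` as the context gives a stderr sink for free, which is exactly the flexibility the `ctx` pointer buys.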
Finally, consider per-logger instances. A logger_t* handle makes logging more explicit and testable. It also allows different components to use different sinks or log levels. This mirrors good design in other libraries and aligns with ownership rules from Projects 2 and 3.
How This Fits in This Project
You will implement a logger handle with a list of sinks (Sec. 4.2) and define thread-safe emission in Sec. 5.10 Phase 2. This concept also informs error handling in Sec. 7.1. The same handle pattern is used in Project 2’s network client. Also used in: Project 2.
Definitions & Key Terms
- Sink -> Destination for log events.
- Reentrancy -> Safe to call again before returning.
- Synchronous logging -> Log call blocks until sinks complete.
- Asynchronous logging -> Log call enqueues and returns quickly.
Mental Model Diagram (ASCII)
log_info() -> format event -> [sink list] -> stderr / file / callback
^
| (mutex to protect sink list)
How It Works (Step-by-Step)
- Caller invokes log_info(logger, ...).
- Logger builds a log_event with timestamp and fields.
- Logger copies sink list under lock.
- Logger iterates sinks and calls each sink function.
- Errors are recorded but do not crash the caller.
Minimal Concrete Example
typedef struct {
log_sink_fn fn;
void *ctx;
} log_sink;
Common Misconceptions
- “Logging is always thread-safe.” -> Only if you design it so.
- “Callbacks are harmless.” -> They can reenter and deadlock.
- “Global logger is simplest.” -> It limits testability and flexibility.
Check-Your-Understanding Questions
- Why should sink callbacks not be called under the global lock?
- What is the trade-off between synchronous and asynchronous logging?
- Why use a logger_t* handle instead of globals?
Check-Your-Understanding Answers
- To avoid reentrancy deadlocks.
- Synchronous is simpler but can block; async is faster but complex.
- Handles allow multiple configurations and explicit ownership.
Real-World Applications
- Web servers with multiple log destinations.
- Embedded systems logging to UART and file.
Where You’ll Apply It
- In this project: Sec. 4.2 (components), Sec. 5.10 Phase 2 (thread safety).
- Also used in: Project 2.
References
- “The Linux Programming Interface” - Ch. 31 (Threads)
- “Clean Code” - Ch. 3 (Interfaces)
Key Insight
A logging library is a concurrency boundary; reentrancy rules are part of the API.
Summary
A good logging system separates sinks from log event creation, defines thread safety explicitly, and avoids deadlocks by design.
Homework/Exercises to Practice the Concept
- Write a logger with two sinks and test in two threads.
- Create a sink that logs again; observe deadlock without safeguards.
- Implement per-logger handles with different sinks.
Solutions to the Homework/Exercises
- Protect sink list with a mutex and serialize output.
- Release lock before calling sinks.
- Store sink list per logger instance.
2.2 Formatting, Levels, and Structured Logs
Fundamentals
Logs are only useful if they are consistent and parseable. A logging library must define log levels (DEBUG, INFO, WARN, ERROR) and filter messages accordingly. It must also define a format: plain text, logfmt, or JSON. Structured logs allow machines to parse fields without brittle string parsing. A good interface lets callers include key/value pairs while keeping message formatting safe and deterministic.
Deep Dive into the Concept
Log levels establish severity and allow filtering. The library should define a clear ordering (DEBUG < INFO < WARN < ERROR) and provide a minimum level threshold. This allows expensive debug logs to be disabled in production. The threshold can be global or per logger. A per-logger threshold is more flexible and avoids global state. The contract is simple: if event.level < logger.min_level, the log call is a no-op.
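The threshold contract from the paragraph above fits in a few lines. This sketch assumes a `level_enabled` helper and a minimal `logger` struct, both hypothetical names for illustration.

```c
typedef enum { LOG_DEBUG, LOG_INFO, LOG_WARN, LOG_ERROR } log_level;
typedef struct { log_level min_level; } logger;

/* Events below the threshold are dropped before any formatting work,
 * which is what makes disabled debug logs nearly free. */
static int level_enabled(const logger *lg, log_level lvl) {
    return lvl >= lg->min_level;
}
```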
Formatting must be safe. In C, formatting with printf-style variadics can be dangerous if format strings are constructed dynamically. The library should treat the format string as user-provided but still avoid buffer overflows by using vsnprintf into a bounded buffer. If a message exceeds the buffer, the library should truncate and indicate truncation (e.g., append ...). This is an explicit boundary rule: you guarantee no overflow and deterministic output.
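The bounded-buffer rule can be sketched with vsnprintf's return value, which reports the length the full message would have had. The `format_message` name and the "..." truncation marker are assumptions for this example.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Format into a fixed buffer; on truncation, overwrite the tail
 * with "..." so readers can tell the line was cut. */
static void format_message(char *buf, size_t cap, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, cap, fmt, ap);   /* never overflows buf */
    va_end(ap);
    if (n >= 0 && (size_t)n >= cap && cap >= 4)
        memcpy(buf + cap - 4, "...", 4);    /* 4 bytes: "..." plus NUL */
}
```

Because vsnprintf always NUL-terminates within `cap`, the only extra work is detecting the truncation and marking it.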
Structured logging can be implemented by allowing key/value pairs. One approach is to provide an API like:
log_info(logger, "cache miss", "key", key, "ttl", ttl_str, NULL);
This is simple but not type-safe. Another approach is to define a log_field struct array. For this project, you can implement a simple log_kv API with string keys and values. For JSON output, ensure proper escaping of quotes and control characters. For logfmt, follow the standard key=value with quoting when necessary.
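JSON escaping is where hand-rolled formatters most often produce invalid output, so here is a minimal sketch covering quotes, backslashes, and control characters. The `json_escape` name and the chosen escape set are assumptions; a full implementation would also handle the remaining JSON escape rules.

```c
#include <stdio.h>

/* Minimal JSON string escaping: quotes, backslashes, and the
 * control characters that appear most often in log messages.
 * Assumes cap > 0; stops early rather than overflowing out. */
static size_t json_escape(char *out, size_t cap, const char *s) {
    size_t n = 0;
    for (; *s && n + 7 < cap; s++) {        /* worst case: \u00XX + NUL */
        unsigned char c = (unsigned char)*s;
        if (c == '"' || c == '\\') { out[n++] = '\\'; out[n++] = c; }
        else if (c == '\n') { out[n++] = '\\'; out[n++] = 'n'; }
        else if (c == '\t') { out[n++] = '\\'; out[n++] = 't'; }
        else if (c < 0x20)  n += (size_t)snprintf(out + n, cap - n, "\\u%04x", c);
        else out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```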
Time is also part of formatting. Use ISO-8601 timestamps in UTC (e.g., 2026-01-01T12:00:00Z). This is deterministic and easy to parse. The library should allow injecting a clock function for tests so that output is deterministic. This is a key boundary for testability.
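An injectable clock can be as simple as a function pointer that fills a timestamp buffer. The `clock_fn`, `wall_clock`, and `fixed_clock` names are illustrative; the point is that tests swap in the constant clock so golden-file comparisons are stable.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

typedef void (*clock_fn)(char *buf, size_t cap);

/* Production clock: ISO-8601 in UTC. */
static void wall_clock(char *buf, size_t cap) {
    time_t t = time(NULL);
    struct tm tm;
    gmtime_r(&t, &tm);
    strftime(buf, cap, "%Y-%m-%dT%H:%M:%SZ", &tm);
}

/* Test clock: a constant timestamp for deterministic output. */
static void fixed_clock(char *buf, size_t cap) {
    snprintf(buf, cap, "2026-01-01T12:00:00Z");
}
```

The logger stores a `clock_fn` field that defaults to the wall clock; a test constructs the logger with `fixed_clock` instead.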
Finally, consider the design of error logs. When logging an error, include both a human-readable message and a machine-readable field (e.g., err=KV_ERR_CONN). This makes logs searchable and improves debugging. It also ties into Project 2’s error model.
How This Fits in This Project
Formatting and levels drive your API surface (Sec. 3.2), output examples (Sec. 3.7), and test strategy (Sec. 6.2). You’ll reuse the deterministic timestamp idea in Project 5 for request logs. Also used in: Project 5.
Definitions & Key Terms
- Log level -> Severity category for filtering.
- Structured logs -> Logs with key/value fields.
- logfmt -> key=value format with quoting rules.
- ISO-8601 -> Standard timestamp format.
Mental Model Diagram (ASCII)
log_event {level, ts, msg, fields} -> formatter -> sink output
How It Works (Step-by-Step)
- Build a log_event with level, timestamp, message, fields.
- Check against min_level; drop if below.
- Format into a bounded buffer using vsnprintf.
- Escape fields for logfmt or JSON.
- Write to sinks.
Minimal Concrete Example
logger_set_level(logger, LOG_INFO);
log_info(logger, "cache miss", "key", "user:1", NULL);
Common Misconceptions
- “Logging is just printf.” -> Formatting must be safe and consistent.
- “JSON output is always valid.” -> You must escape strings correctly.
- “Levels are cosmetic.” -> They control performance and noise.
Check-Your-Understanding Questions
- Why use vsnprintf instead of sprintf?
- What advantage does logfmt have over plain text?
- Why should timestamps be injectable for tests?
Check-Your-Understanding Answers
- It prevents buffer overflows.
- It is machine-parseable without full JSON overhead.
- It makes outputs deterministic.
Real-World Applications
- Observability pipelines (ELK, Splunk).
- CLI tools with structured diagnostics.
Where You’ll Apply It
- In this project: Sec. 3.7 (output format), Sec. 5.10 Phase 1 (formatter).
- Also used in: Project 5.
References
- “Logging Best Practices” - community guide
- “Clean Code” - Ch. 3
Key Insight
A log is a data product; format and structure are part of the API.
Summary
Levels and structured formatting make logs useful for both humans and machines. Safe formatting prevents crashes and keeps output deterministic.
Homework/Exercises to Practice the Concept
- Implement a logfmt formatter with escaping.
- Add a JSON formatter and verify valid JSON output.
- Add a fixed clock for tests.
Solutions to the Homework/Exercises
- Quote values with spaces and escape quotes.
- Escape " and \n in JSON strings.
- Use a function pointer for now() in the logger.
2.3 Reliability, Rotation, and Failure Handling
Fundamentals
Logging should never crash the application. A logging library must define how it behaves when sinks fail (disk full, file permission errors). It should also support log rotation to prevent unbounded file growth. These behaviors are part of the boundary contract: callers need to know whether logging failures propagate or are swallowed.
Deep Dive into the Concept
A sink failure is an error in an auxiliary system, not necessarily in the main application. Most logging libraries treat sink failures as non-fatal: they return an error code to the caller but do not abort. For a synchronous logging API, you can return a status code from log calls, or you can store a “last sink error” in the logger handle. The design choice should match your error model. For this project, a simple approach is: log functions return int (0 success, negative error). This allows callers to check if they care, but keeps logging non-fatal.
File rotation is a common requirement. A simple rotation policy is size-based: when the log file exceeds N bytes, close it, rename it to app.log.1, shift older files (.2, .3), and open a new file. Rotation must be synchronized to avoid interleaved writes. If multiple threads log, only one should rotate at a time. You can achieve this by locking the file sink and checking size before writing. Rotation is a great boundary exercise because it requires careful state transitions and error handling.
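The rename-and-shift sequence described above can be sketched as follows. The `rotate_files` helper and the 256-byte path buffers are assumptions for this example; the caller is expected to hold the file sink's lock and to reopen the fresh log file afterward.

```c
#include <stdio.h>

/* Shift app.log.(N-1) -> app.log.N, ..., app.log -> app.log.1,
 * then the caller reopens a fresh base file. rename() on a
 * missing backup simply fails and is ignored; the oldest backup
 * is overwritten by the shift. */
static void rotate_files(const char *base, int backups) {
    char from[256], to[256];
    for (int i = backups - 1; i >= 1; i--) {
        snprintf(from, sizeof from, "%s.%d", base, i);
        snprintf(to, sizeof to, "%s.%d", base, i + 1);
        rename(from, to);
    }
    snprintf(to, sizeof to, "%s.1", base);
    rename(base, to);
}
```

Shifting from the oldest backup down to `.1` is what makes the sequence safe: no rename ever clobbers a file that still needs to move.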
Failure handling must be deterministic. If rotation fails (e.g., rename error), log it to stderr or store it in last_error, but continue logging to the existing file if possible. If opening a file fails at startup, the logger should still operate with other sinks (stderr). This is part of “fail open” behavior: the system stays usable even if a sink is broken.
Backpressure is another concern. If logging is synchronous and a sink is slow, it can block the application. You can mitigate this by offering an async extension with a bounded queue. If the queue is full, drop logs or block based on a policy. Even if you keep synchronous logging in the core, you should describe these trade-offs and provide hooks for future async support.
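A drop-when-full policy for the async extension can be sketched with a bounded ring of fixed-size slots. The `log_queue`, `queue_push`, and `QCAP` names are assumptions; `LOG_ERR_DROPPED` follows the error-code convention used elsewhere in this project.

```c
#include <pthread.h>
#include <stdio.h>

#define QCAP 128
#define LOG_ERR_DROPPED (-2)

typedef struct {
    pthread_mutex_t lock;
    char msgs[QCAP][256];   /* fixed-size slots: no allocation per log */
    size_t head, count;
} log_queue;

/* Drop-newest policy: a full queue returns an error code rather
 * than blocking the caller's thread. A consumer thread would pop
 * from head under the same lock. */
static int queue_push(log_queue *q, const char *msg) {
    int rc = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count == QCAP) {
        rc = LOG_ERR_DROPPED;
    } else {
        size_t tail = (q->head + q->count) % QCAP;
        snprintf(q->msgs[tail], sizeof q->msgs[tail], "%s", msg);
        q->count++;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}
```

The alternative policy, blocking until space frees up, preserves every log at the cost of latency; which one is right depends on whether the application can tolerate lost diagnostics better than stalls.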
Finally, deterministic testability. To test rotation, you need a deterministic size threshold and a test logger that writes fixed-size strings. For failure handling, simulate errors by opening logs in a read-only directory. The boundary is not just the API; it includes the observable behavior under stress.
How This Fits in This Project
You will implement rotation in Sec. 5.10 Phase 3 and document failure behaviors in Sec. 3.3 and Sec. 7.1. These reliability rules are reused in Project 5’s server logs. Also used in: Project 5.
Definitions & Key Terms
- Rotation -> Replacing log files once they exceed size.
- Fail-open -> Continue operation despite logging errors.
- Backpressure -> Control of logging speed under load.
Mental Model Diagram (ASCII)
write log -> check size -> rotate? -> write -> flush
^
| (sink lock)
How It Works (Step-by-Step)
- Before write, check current file size.
- If size exceeds threshold, close and rotate files.
- Reopen new log file.
- Write message; return status.
- On failure, store error and continue with other sinks.
Minimal Concrete Example
int log_sink_file_write(file_sink *s, const char *msg) {
    if (s->size > s->max_size)
        rotate_logs(s);                 /* resets s->size for the new file */
    if (fputs(msg, s->fp) == EOF)
        return -1;
    s->size += strlen(msg);             /* keep the size counter current */
    return 0;
}
Common Misconceptions
- “Logging failure should abort the app.” -> It usually shouldn’t.
- “Rotation is just renaming.” -> It must be synchronized and safe.
- “Dropping logs is always bad.” -> Sometimes it’s better than blocking.
Check-Your-Understanding Questions
- Why should logging failures be non-fatal?
- What is a safe rotation policy?
- How do you test rotation deterministically?
Check-Your-Understanding Answers
- Logging is auxiliary; it shouldn’t crash the main app.
- Size-based rotation with a lock around file operations.
- Use fixed message sizes and small rotation thresholds.
Real-World Applications
- Application servers with rotating log files.
- Embedded devices with limited storage.
Where You’ll Apply It
- In this project: Sec. 3.3 (non-functional), Sec. 5.10 Phase 3 (rotation).
- Also used in: Project 5.
References
- “The Linux Programming Interface” - File I/O chapters
- “The Pragmatic Programmer” - logging practices
Key Insight
A logging system must fail gracefully; reliability is part of its interface.
Summary
Rotation and failure handling define the reliability boundary of your logging library. They keep logs useful without destabilizing the application.
Homework/Exercises to Practice the Concept
- Implement size-based rotation with 3 backup files.
- Simulate permission errors and ensure logger still writes to stderr.
- Add a policy to drop logs when queue is full (for async extension).
Solutions to the Homework/Exercises
- Rename app.log to app.log.1, shift older files, reopen.
- Attempt to open logs in a read-only directory and fall back to stderr.
- Use a bounded queue and return LOG_ERR_DROPPED when full.
3. Project Specification
3.1 What You Will Build
A C logging library libloglite with log levels, pluggable sinks (stderr, file, callback), structured formatting (logfmt/JSON), and optional log rotation.
3.2 Functional Requirements
- Logger handle: create/destroy logger instances.
- Log levels: DEBUG, INFO, WARN, ERROR with filtering.
- Sinks: at least stderr and file sinks; optional callback sink.
- Formatting: logfmt and JSON output modes.
- Rotation: size-based rotation for file sink.
3.3 Non-Functional Requirements
- Performance: 10k logs/sec without crashes.
- Reliability: logging failures never crash app.
- Usability: deterministic output for tests.
3.4 Example Usage / Output
2026-01-01T12:00:00Z INFO server started port=8080
2026-01-01T12:00:01Z WARN cache miss key=user:1
2026-01-01T12:00:02Z ERROR db timeout_ms=3000
3.5 Data Formats / Schemas / Protocols
logfmt example:
ts=2026-01-01T12:00:02Z level=ERROR msg="db timeout" timeout_ms=3000
JSON example:
{"ts":"2026-01-01T12:00:02Z","level":"ERROR","msg":"db timeout","timeout_ms":3000}
3.6 Edge Cases
- Sink write fails (disk full).
- Logger used after destroy.
- Log message longer than buffer.
- Reentrant logging from sink.
3.7 Real World Outcome
A developer can log to stderr and a file with consistent formatting and predictable behavior under errors.
3.7.1 How to Run (Copy/Paste)
make
./log-demo
3.7.2 Golden Path Demo (Deterministic)
Use a fixed clock function to produce constant timestamps.
3.7.3 If CLI: Exact Terminal Transcript
$ ./log-demo
2026-01-01T12:00:00Z INFO server started port=8080
2026-01-01T12:00:01Z WARN cache miss key=user:1
2026-01-01T12:00:02Z ERROR db timeout_ms=3000
$ echo $?
0
Failure demo (file sink error):
$ ./log-demo --log-file /root/forbidden.log
Error [LOG_ERR_IO]: cannot open log file
$ echo $?
2
Exit Codes:
- 0: success
- 2: log file error
- 3: invalid arguments
4. Solution Architecture
4.1 High-Level Design
logger -> formatter -> sinks
| |-> stderr
| |-> file
| |-> callback
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Logger handle | Store sinks and level | Per-logger config |
| Formatter | Build log strings | logfmt + JSON |
| File sink | Write + rotate | size-based rotation |
4.3 Data Structures (No Full Code)
typedef struct {
log_level min_level;
log_sink *sinks;
size_t sink_count;
clock_fn now;
} logger;
4.4 Algorithm Overview
Key Algorithm: log_emit
- Check level filter.
- Build event with timestamp and fields.
- Format into buffer.
- Send to each sink.
- Return status.
Complexity Analysis:
- Time: O(k) for k sinks.
- Space: O(1) per log event (fixed buffer).
5. Implementation Guide
5.1 Development Environment Setup
cc --version
make --version
5.2 Project Structure
libloglite/
|-- include/
| `-- loglite.h
|-- src/
| |-- logger.c
| |-- sinks.c
| |-- format.c
| `-- rotate.c
|-- demo/
| `-- log-demo.c
`-- Makefile
5.3 The Core Question You’re Answering
“How do you design a logging interface that is safe, fast, and reusable?”
5.4 Concepts You Must Understand First
- Sink interfaces and thread safety.
- Safe formatting and structured logs.
- Failure handling and rotation.
5.5 Questions to Guide Your Design
- Will logging be synchronous or async?
- How do you prevent deadlocks when sinks reenter?
- What is your rotation policy?
5.6 Thinking Exercise
Design a log line format that is both human-readable and machine-parseable. How do you escape values?
5.7 The Interview Questions They’ll Ask
- How do you make logging thread-safe?
- How do you avoid deadlocks with callbacks?
- How do you handle log file rotation?
5.8 Hints in Layers
Hint 1: Use a logger handle
logger *logger_create(void);
Hint 2: Use vsnprintf
vsnprintf(buf, sizeof(buf), fmt, ap);
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Interfaces | “Clean Code” | Ch. 3 |
| Concurrency | “The Linux Programming Interface” | Ch. 31 |
| Defensive coding | “Code Complete” | Ch. 8 |
5.10 Implementation Phases
Phase 1: Core Logger (2-3 days)
Goals: logger handle, levels, stderr sink. Checkpoint: logs printed with timestamps.
Phase 2: Sinks & Formatting (2-3 days)
Goals: file sink, logfmt/JSON. Checkpoint: logs written to file and stderr.
Phase 3: Rotation & Hardening (1-2 days)
Goals: rotation, error handling. Checkpoint: logs rotate at size threshold.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Concurrency | no lock vs mutex | mutex | prevent interleaving |
| Format | plain vs logfmt | logfmt | machine-parseable |
| Rotation | size vs time | size | deterministic tests |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit tests | Formatting | JSON escaping |
| Integration | Sinks | file + stderr |
| Failure | Rotation errors | permission denied |
6.2 Critical Test Cases
- Reentrant sink: callback logs again -> no deadlock.
- Long message: output truncated safely.
- Rotation: file exceeds size and rotates.
6.3 Test Data
msg="hello" key=value
msg="quote \""
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Missing lock | Interleaved logs | Add mutex |
| Unsafe formatting | Buffer overflow | Use vsnprintf |
| Rotation bugs | Lost logs | Rotate before write |
7.2 Debugging Strategies
- Add a sink that writes to memory for tests.
- Use deterministic timestamps.
7.3 Performance Traps
- Synchronous logging to slow disk in hot path.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add per-module log levels.
- Add colored output for stderr.
8.2 Intermediate Extensions
- Async logging with background thread.
- Network sink for UDP logs.
8.3 Advanced Extensions
- Structured binary logs.
- Pluggable formatters.
9. Real-World Connections
9.1 Industry Applications
- Server logs and observability pipelines.
- Embedded device diagnostics.
9.2 Related Open Source Projects
- spdlog (C++) - high-performance logging.
- log.c - minimal C logger.
9.3 Interview Relevance
- Concurrency boundaries and safe I/O.
- API design and error handling.
10. Resources
10.1 Essential Reading
- “Clean Code” - Ch. 3
- “The Linux Programming Interface” - Ch. 31
10.2 Video Resources
- “Logging at Scale” - conference talk (searchable title)
10.3 Tools & Documentation
man strftime, man localtime_r
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain log levels and filtering.
- I can explain sink reentrancy hazards.
- I can explain rotation policy.
11.2 Implementation
- All functional requirements are met.
- Logs are deterministic in tests.
- Rotation works without data loss.
11.3 Growth
- I can explain logging design in interviews.
- I documented failure handling rules.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Logger handle with stderr sink.
- Level filtering.
- Safe formatting with bounded buffers.
Full Completion:
- File sink with rotation.
- Structured logfmt/JSON output.
- Thread-safe logging.
Excellence (Going Above & Beyond):
- Async logging with queue policies.
- Network sink.