Project 4: Logging Library
Build a reusable C logging library with pluggable sinks, log levels, and safe concurrency semantics.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | ~1 week |
| Main Programming Language | C |
| Alternative Programming Languages | Rust, Go |
| Coolness Level | Level 6 - production-grade diagnostics |
| Business Potential | Level 6 - reusable logging module |
| Prerequisites | File I/O, time functions, threading basics |
| Key Topics | Sink interfaces, formatting, thread safety |
1. Learning Objectives
By completing this project, you will:
- Design a stable logging API with clear ownership and threading rules.
- Implement pluggable log sinks (stderr, file, callback).
- Provide structured log formats (logfmt/JSON) and level filtering.
- Handle log failures without crashing the application.
- Add log rotation and deterministic output for tests.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Sink Architecture & Thread Safety
Fundamentals
A log sink is a destination for log events (stderr, file, network). A logging library is a boundary between application code and these sinks. It must define how log events are emitted, how sinks are registered, and whether logging is thread-safe. If multiple threads log concurrently, messages can interleave or corrupt output unless you synchronize. The contract must specify whether logging is synchronous, whether callbacks may reenter the logger, and how the library prevents deadlocks.
Deep Dive into the Concept
A robust logging library separates log event creation from log event delivery. The simplest design is synchronous: when log_info() is called, it formats a string and writes to each sink immediately. This is easy to reason about, but it means the caller’s thread pays the cost of I/O. If a sink is slow (e.g., disk or network), logging can become a bottleneck. The contract must document this. For small libraries, synchronous logging is acceptable, but you should still avoid deadlocks.
Thread safety is a core boundary concern. If you use a global logger with multiple sinks, you need a mutex to guard the sink list and output. Without a mutex, two threads can interleave their writes and create corrupted lines. However, holding a mutex while calling a user-provided sink callback can be dangerous: if the callback calls the logger again, you can deadlock. This is a classic reentrancy problem. The safe pattern is: copy the sink list under lock, release the lock, then invoke each sink. For file sinks, you can keep a per-sink lock instead of a global lock, so slow sinks do not block others.
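The copy-under-lock pattern described above can be sketched as follows. This is a minimal illustration, not the project's required implementation; the names `logger`, `log_sink`, and `logger_emit`, and the fixed `MAX_SINKS` bound, are assumptions made for the example.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_SINKS 8

typedef struct { int level; const char *msg; } log_event;
typedef void (*log_sink_fn)(void *ctx, const log_event *ev);
typedef struct { log_sink_fn fn; void *ctx; } log_sink;

typedef struct {
    pthread_mutex_t lock;
    log_sink sinks[MAX_SINKS];
    size_t sink_count;
} logger;

static void logger_emit(logger *lg, const log_event *ev) {
    log_sink snapshot[MAX_SINKS];
    size_t n;

    /* Snapshot the sink list under the lock... */
    pthread_mutex_lock(&lg->lock);
    n = lg->sink_count;
    memcpy(snapshot, lg->sinks, n * sizeof snapshot[0]);
    pthread_mutex_unlock(&lg->lock);

    /* ...then invoke sinks with the lock released, so a sink that
     * logs again re-acquires the mutex instead of deadlocking. */
    for (size_t i = 0; i < n; i++)
        snapshot[i].fn(snapshot[i].ctx, ev);
}
```

The cost of the snapshot is a small fixed copy per log call; in exchange, no user callback ever runs while the logger's internal lock is held.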
A more advanced design is asynchronous logging: push log events onto a queue and have a background thread flush them. This improves performance and avoids blocking, but introduces complexity (queue limits, flush on shutdown, lost logs on crash). For this project, synchronous logging with optional async extension is a good balance. The key is to clearly document that logging is synchronous and thread-safe, and to ensure that each sink is called in a deterministic order.
The sink interface should be simple and stable. A common pattern is:
typedef void (*log_sink_fn)(void *ctx, const log_event *ev);
Here, ctx is a user-provided pointer (for file handles, network sockets, etc.). This pattern keeps the API flexible and avoids global state. Ownership is explicit: the library owns log_event during the call; the sink may not store pointers to its internal buffers unless documented. This is another boundary contract that prevents dangling pointers.
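As one concrete instance of this interface, a file sink can treat `ctx` as a `FILE*`. This is a sketch under the stated ownership rule; the `file_sink` name and the two-field `log_event` are illustrative assumptions.

```c
#include <stdio.h>

typedef struct { const char *ts; const char *msg; } log_event;
typedef void (*log_sink_fn)(void *ctx, const log_event *ev);

static void file_sink(void *ctx, const log_event *ev) {
    FILE *fp = ctx;                      /* ctx carries the destination */
    fprintf(fp, "%s %s\n", ev->ts, ev->msg);
    /* ev is owned by the library only for the duration of this call;
     * a sink must copy any data it wants to keep. */
}
```

Registering the same function with `stderr` as the context gives a stderr sink for free, which is exactly the flexibility the `ctx` pointer buys.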
Finally, consider per-logger instances. A logger_t* handle makes logging more explicit and testable. It also allows different components to use different sinks or log levels. This mirrors good design in other libraries and aligns with ownership rules from Projects 2 and 3.
How This Fits in This Project
You will implement a logger handle with a list of sinks (Sec. 4.2) and define thread-safe emission in Sec. 5.10 Phase 2. This concept also informs error handling in Sec. 7.1. The same handle pattern is used in Project 2’s network client. Also used in: Project 2.
Definitions & Key Terms
- Sink -> Destination for log events.
- Reentrancy -> Safe to call again before returning.
- Synchronous logging -> Log call blocks until sinks complete.
- Asynchronous logging -> Log call enqueues and returns quickly.
Mental Model Diagram (ASCII)
log_info() -> format event -> [sink list] -> stderr / file / callback
^
| (mutex to protect sink list)
How It Works (Step-by-Step)
- Caller invokes log_info(logger, ...).
- Logger builds a log_event with timestamp and fields.
- Logger copies sink list under lock.
- Logger iterates sinks and calls each sink function.
- Errors are recorded but do not crash the caller.
Minimal Concrete Example
typedef struct {
log_sink_fn fn;
void *ctx;
} log_sink;
Common Misconceptions
- “Logging is always thread-safe.” -> Only if you design it so.
- “Callbacks are harmless.” -> They can reenter and deadlock.
- “Global logger is simplest.” -> It limits testability and flexibility.
Check-Your-Understanding Questions
- Why should sink callbacks not be called under the global lock?
- What is the trade-off between synchronous and asynchronous logging?
- Why use a logger_t* handle instead of globals?
Check-Your-Understanding Answers
- To avoid reentrancy deadlocks.
- Synchronous is simpler but can block; async is faster but complex.
- Handles allow multiple configurations and explicit ownership.
Real-World Applications
- Web servers with multiple log destinations.
- Embedded systems logging to UART and file.
Where You’ll Apply It
- In this project: Sec. 4.2 (components), Sec. 5.10 Phase 2 (thread safety).
- Also used in: Project 2.
References
- “The Linux Programming Interface” - Ch. 31 (Threads)
- “Clean Code” - Ch. 3 (Interfaces)
Key Insight
A logging library is a concurrency boundary; reentrancy rules are part of the API.
Summary
A good logging system separates sinks from log event creation, defines thread safety explicitly, and avoids deadlocks by design.
Homework/Exercises to Practice the Concept
- Write a logger with two sinks and test in two threads.
- Create a sink that logs again; observe deadlock without safeguards.
- Implement per-logger handles with different sinks.
Solutions to the Homework/Exercises
- Protect sink list with a mutex and serialize output.
- Release lock before calling sinks.
- Store sink list per logger instance.
2.2 Formatting, Levels, and Structured Logs
Fundamentals
Logs are only useful if they are consistent and parseable. A logging library must define log levels (DEBUG, INFO, WARN, ERROR) and filter messages accordingly. It must also define a format: plain text, logfmt, or JSON. Structured logs allow machines to parse fields without brittle string parsing. A good interface lets callers include key/value pairs while keeping message formatting safe and deterministic.
Deep Dive into the Concept
Log levels establish severity and allow filtering. The library should define a clear ordering (DEBUG < INFO < WARN < ERROR) and provide a minimum level threshold. This allows expensive debug logs to be disabled in production. The threshold can be global or per logger. A per-logger threshold is more flexible and avoids global state. The contract is simple: if event.level < logger.min_level, the log call is a no-op.
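The threshold contract from the paragraph above fits in a few lines. This sketch assumes a `level_enabled` helper and a minimal `logger` struct, both hypothetical names for illustration.

```c
typedef enum { LOG_DEBUG, LOG_INFO, LOG_WARN, LOG_ERROR } log_level;
typedef struct { log_level min_level; } logger;

/* Events below the threshold are dropped before any formatting work,
 * which is what makes disabled debug logs nearly free. */
static int level_enabled(const logger *lg, log_level lvl) {
    return lvl >= lg->min_level;
}
```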
Formatting must be safe. In C, formatting with printf-style variadics can be dangerous if format strings are constructed dynamically. The library should treat the format string as user-provided but still avoid buffer overflows by using vsnprintf into a bounded buffer. If a message exceeds the buffer, the library should truncate and indicate truncation (e.g., append ...). This is an explicit boundary rule: you guarantee no overflow and deterministic output.
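The bounded-buffer rule can be sketched with vsnprintf's return value, which reports the length the full message would have had. The `format_message` name and the "..." truncation marker are assumptions for this example.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Format into a fixed buffer; on truncation, overwrite the tail
 * with "..." so readers can tell the line was cut. */
static void format_message(char *buf, size_t cap, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, cap, fmt, ap);   /* never overflows buf */
    va_end(ap);
    if (n >= 0 && (size_t)n >= cap && cap >= 4)
        memcpy(buf + cap - 4, "...", 4);    /* 4 bytes: "..." plus NUL */
}
```

Because vsnprintf always NUL-terminates within `cap`, the only extra work is detecting the truncation and marking it.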
Structured logging can be implemented by allowing key/value pairs. One approach is to provide an API like:
log_info(logger, "cache miss", "key", key, "ttl", ttl_str, NULL);
This is simple but not type-safe. Another approach is to define a log_field struct array. For this project, you can implement a simple log_kv API with string keys and values. For JSON output, ensure proper escaping of quotes and control characters. For logfmt, follow the standard key=value with quoting when necessary.
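JSON escaping is where hand-rolled formatters most often produce invalid output, so here is a minimal sketch covering quotes, backslashes, and control characters. The `json_escape` name and the chosen escape set are assumptions; a full implementation would also handle the remaining JSON escape rules.

```c
#include <stdio.h>

/* Minimal JSON string escaping: quotes, backslashes, and the
 * control characters that appear most often in log messages.
 * Assumes cap > 0; stops early rather than overflowing out. */
static size_t json_escape(char *out, size_t cap, const char *s) {
    size_t n = 0;
    for (; *s && n + 7 < cap; s++) {        /* worst case: \u00XX + NUL */
        unsigned char c = (unsigned char)*s;
        if (c == '"' || c == '\\') { out[n++] = '\\'; out[n++] = c; }
        else if (c == '\n') { out[n++] = '\\'; out[n++] = 'n'; }
        else if (c == '\t') { out[n++] = '\\'; out[n++] = 't'; }
        else if (c < 0x20)  n += (size_t)snprintf(out + n, cap - n, "\\u%04x", c);
        else out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```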
Time is also part of formatting. Use ISO-8601 timestamps in UTC (e.g., 2026-01-01T12:00:00Z). This is deterministic and easy to parse. The library should allow injecting a clock function for tests so that output is deterministic. This is a key boundary for testability.
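An injectable clock can be as simple as a function pointer that fills a timestamp buffer. The `clock_fn`, `wall_clock`, and `fixed_clock` names are illustrative; the point is that tests swap in the constant clock so golden-file comparisons are stable.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

typedef void (*clock_fn)(char *buf, size_t cap);

/* Production clock: ISO-8601 in UTC. */
static void wall_clock(char *buf, size_t cap) {
    time_t t = time(NULL);
    struct tm tm;
    gmtime_r(&t, &tm);
    strftime(buf, cap, "%Y-%m-%dT%H:%M:%SZ", &tm);
}

/* Test clock: a constant timestamp for deterministic output. */
static void fixed_clock(char *buf, size_t cap) {
    snprintf(buf, cap, "2026-01-01T12:00:00Z");
}
```

The logger stores a `clock_fn` field that defaults to the wall clock; a test constructs the logger with `fixed_clock` instead.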
Finally, consider the design of error logs. When logging an error, include both a human-readable message and a machine-readable field (e.g., err=KV_ERR_CONN). This makes logs searchable and improves debugging. It also ties into Project 2’s error model.
How This Fits in This Project
Formatting and levels drive your API surface (Sec. 3.2), output examples (Sec. 3.7), and test strategy (Sec. 6.2). You’ll reuse the deterministic timestamp idea in Project 5 for request logs. Also used in: Project 5.
Definitions & Key Terms
- Log level -> Severity category for filtering.
- Structured logs -> Logs with key/value fields.
- logfmt -> key=value format with quoting rules.
- ISO-8601 -> Standard timestamp format.
Mental Model Diagram (ASCII)
log_event {level, ts, msg, fields} -> formatter -> sink output
How It Works (Step-by-Step)
- Build a log_event with level, timestamp, message, fields.
- Check against min_level; drop if below.
- Format into a bounded buffer using vsnprintf.
- Escape fields for logfmt or JSON.
- Write to sinks.
Minimal Concrete Example
logger_set_level(logger, LOG_INFO);
log_info(logger, "cache miss", "key", "user:1", NULL);
Common Misconceptions
- “Logging is just printf.” -> Formatting must be safe and consistent.
- “JSON output is always valid.” -> You must escape strings correctly.
- “Levels are cosmetic.” -> They control performance and noise.
Check-Your-Understanding Questions
- Why use vsnprintf instead of sprintf?
- What advantage does logfmt have over plain text?
- Why should timestamps be injectable for tests?
Check-Your-Understanding Answers
- It prevents buffer overflows.
- It is machine-parseable without full JSON overhead.
- It makes outputs deterministic.
Real-World Applications
- Observability pipelines (ELK, Splunk).
- CLI tools with structured diagnostics.
Where You’ll Apply It
- In this project: Sec. 3.7 (output format), Sec. 5.10 Phase 1 (formatter).
- Also used in: Project 5.
References
- “Logging Best Practices” - community guide
- “Clean Code” - Ch. 3
Key Insight
A log is a data product; format and structure are part of the API.
Summary
Levels and structured formatting make logs useful for both humans and machines. Safe formatting prevents crashes and keeps output deterministic.
Homework/Exercises to Practice the Concept
- Implement a logfmt formatter with escaping.
- Add a JSON formatter and verify valid JSON output.
- Add a fixed clock for tests.
Solutions to the Homework/Exercises
- Quote values with spaces and escape quotes.
- Escape " and \n in JSON strings.
- Use a function pointer for now() in the logger.
2.3 Reliability, Rotation, and Failure Handling
Fundamentals
Logging should never crash the application. A logging library must define how it behaves when sinks fail (disk full, file permission errors). It should also support log rotation to prevent unbounded file growth. These behaviors are part of the boundary contract: callers need to know whether logging failures propagate or are swallowed.
Deep Dive into the Concept
A sink failure is an error in an auxiliary system, not necessarily in the main application. Most logging libraries treat sink failures as non-fatal: they return an error code to the caller but do not abort. For a synchronous logging API, you can return a status code from log calls, or you can store a “last sink error” in the logger handle. The design choice should match your error model. For this project, a simple approach is: log functions return int (0 success, negative error). This allows callers to check if they care, but keeps logging non-fatal.
File rotation is a common requirement. A simple rotation policy is size-based: when the log file exceeds N bytes, close it, rename it to app.log.1, shift older files (.2, .3), and open a new file. Rotation must be synchronized to avoid interleaved writes. If multiple threads log, only one should rotate at a time. You can achieve this by locking the file sink and checking size before writing. Rotation is a great boundary exercise because it requires careful state transitions and error handling.
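The rename-and-shift sequence described above can be sketched as follows. The `rotate_files` helper and the 256-byte path buffers are assumptions for this example; the caller is expected to hold the file sink's lock and to reopen the fresh log file afterward.

```c
#include <stdio.h>

/* Shift app.log.(N-1) -> app.log.N, ..., app.log -> app.log.1,
 * then the caller reopens a fresh base file. rename() on a
 * missing backup simply fails and is ignored; the oldest backup
 * is overwritten by the shift. */
static void rotate_files(const char *base, int backups) {
    char from[256], to[256];
    for (int i = backups - 1; i >= 1; i--) {
        snprintf(from, sizeof from, "%s.%d", base, i);
        snprintf(to, sizeof to, "%s.%d", base, i + 1);
        rename(from, to);
    }
    snprintf(to, sizeof to, "%s.1", base);
    rename(base, to);
}
```

Shifting from the oldest backup down to `.1` is what makes the sequence safe: no rename ever clobbers a file that still needs to move.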
Failure handling must be deterministic. If rotation fails (e.g., rename error), log it to stderr or store it in last_error, but continue logging to the existing file if possible. If opening a file fails at startup, the logger should still operate with other sinks (stderr). This is part of “fail open” behavior: the system stays usable even if a sink is broken.
Backpressure is another concern. If logging is synchronous and a sink is slow, it can block the application. You can mitigate this by offering an async extension with a bounded queue. If the queue is full, drop logs or block based on a policy. Even if you keep synchronous logging in the core, you should describe these trade-offs and provide hooks for future async support.
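A drop-when-full policy for the async extension can be sketched with a bounded ring of fixed-size slots. The `log_queue`, `queue_push`, and `QCAP` names are assumptions; `LOG_ERR_DROPPED` follows the error-code convention used elsewhere in this project.

```c
#include <pthread.h>
#include <stdio.h>

#define QCAP 128
#define LOG_ERR_DROPPED (-2)

typedef struct {
    pthread_mutex_t lock;
    char msgs[QCAP][256];   /* fixed-size slots: no allocation per log */
    size_t head, count;
} log_queue;

/* Drop-newest policy: a full queue returns an error code rather
 * than blocking the caller's thread. A consumer thread would pop
 * from head under the same lock. */
static int queue_push(log_queue *q, const char *msg) {
    int rc = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count == QCAP) {
        rc = LOG_ERR_DROPPED;
    } else {
        size_t tail = (q->head + q->count) % QCAP;
        snprintf(q->msgs[tail], sizeof q->msgs[tail], "%s", msg);
        q->count++;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}
```

The alternative policy, blocking until space frees up, preserves every log at the cost of latency; which one is right depends on whether the application can tolerate lost diagnostics better than stalls.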
Finally, deterministic testability. To test rotation, you need a deterministic size threshold and a test logger that writes fixed-size strings. For failure handling, simulate errors by opening logs in a read-only directory. The boundary is not just the API; it includes the observable behavior under stress.
How This Fits in This Project
You will implement rotation in Sec. 5.10 Phase 3 and document failure behaviors in Sec. 3.3 and Sec. 7.1. These reliability rules are reused in Project 5’s server logs. Also used in: Project 5.
Definitions & Key Terms
- Rotation -> Replacing log files once they exceed size.
- Fail-open -> Continue operation despite logging errors.
- Backpressure -> Control of logging speed under load.
Mental Model Diagram (ASCII)
write log -> check size -> rotate? -> write -> flush
^
| (sink lock)
How It Works (Step-by-Step)
- Before write, check current file size.
- If size exceeds threshold, close and rotate files.
- Reopen new log file.
- Write message; return status.
- On failure, store error and continue with other sinks.
Minimal Concrete Example
int log_sink_file_write(file_sink *s, const char *msg) {
    if (s->size > s->max_size)
        rotate_logs(s);                 /* resets s->size for the new file */
    if (fputs(msg, s->fp) == EOF)
        return -1;
    s->size += strlen(msg);             /* keep the size counter current */
    return 0;
}
Common Misconceptions
- “Logging failure should abort the app.” -> It usually shouldn’t.
- “Rotation is just renaming.” -> It must be synchronized and safe.
- “Dropping logs is always bad.” -> Sometimes it’s better than blocking.
Check-Your-Understanding Questions
- Why should logging failures be non-fatal?
- What is a safe rotation policy?
- How do you test rotation deterministically?
Check-Your-Understanding Answers
- Logging is auxiliary; it shouldn’t crash the main app.
- Size-based rotation with a lock around file operations.
- Use fixed message sizes and small rotation thresholds.
Real-World Applications
- Application servers with rotating log files.
- Embedded devices with limited storage.
Where You’ll Apply It
- In this project: Sec. 3.3 (non-functional), Sec. 5.10 Phase 3 (rotation).
- Also used in: Project 5.
References
- “The Linux Programming Interface” - File I/O chapters
- “The Pragmatic Programmer” - logging practices
Key Insight
A logging system must fail gracefully; reliability is part of its interface.
Summary
Rotation and failure handling define the reliability boundary of your logging library. They keep logs useful without destabilizing the application.
Homework/Exercises to Practice the Concept
- Implement size-based rotation with 3 backup files.
- Simulate permission errors and ensure logger still writes to stderr.
- Add a policy to drop logs when queue is full (for async extension).
Solutions to the Homework/Exercises
- Rename app.log to app.log.1, shift older files, reopen.
- Attempt to open logs in a read-only directory and fall back to stderr.
- Use a bounded queue and return LOG_ERR_DROPPED when full.
3. Project Specification
3.1 What You Will Build
A C logging library libloglite with log levels, pluggable sinks (stderr, file, callback), structured formatting (logfmt/JSON), and optional log rotation.
3.2 Functional Requirements
- Logger handle: create/destroy logger instances.
- Log levels: DEBUG, INFO, WARN, ERROR with filtering.
- Sinks: at least stderr and file sinks; optional callback sink.
- Formatting: logfmt and JSON output modes.
- Rotation: size-based rotation for file sink.
3.3 Non-Functional Requirements
- Performance: 10k logs/sec without crashes.
- Reliability: logging failures never crash app.
- Usability: deterministic output for tests.
3.4 Example Usage / Output
2026-01-01T12:00:00Z INFO server started port=8080
2026-01-01T12:00:01Z WARN cache miss key=user:1
2026-01-01T12:00:02Z ERROR db timeout_ms=3000
3.5 Data Formats / Schemas / Protocols
logfmt example:
ts=2026-01-01T12:00:02Z level=ERROR msg="db timeout" timeout_ms=3000
JSON example:
{"ts":"2026-01-01T12:00:02Z","level":"ERROR","msg":"db timeout","timeout_ms":3000}
3.6 Edge Cases
- Sink write fails (disk full).
- Logger used after destroy.
- Log message longer than buffer.
- Reentrant logging from sink.
3.7 Real World Outcome
A developer can log to stderr and a file with consistent formatting and predictable behavior under errors.
3.7.1 How to Run (Copy/Paste)
make
./log-demo
3.7.2 Golden Path Demo (Deterministic)
Use a fixed clock function to produce constant timestamps.
3.7.3 If CLI: Exact Terminal Transcript
$ ./log-demo
2026-01-01T12:00:00Z INFO server started port=8080
2026-01-01T12:00:01Z WARN cache miss key=user:1
2026-01-01T12:00:02Z ERROR db timeout_ms=3000
$ echo $?
0
Failure demo (file sink error):
$ ./log-demo --log-file /root/forbidden.log
Error [LOG_ERR_IO]: cannot open log file
$ echo $?
2
Exit Codes:
- 0: success
- 2: log file error
- 3: invalid arguments
4. Solution Architecture
4.1 High-Level Design
logger -> formatter -> sinks
| |-> stderr
| |-> file
| |-> callback
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Logger handle | Store sinks and level | Per-logger config |
| Formatter | Build log strings | logfmt + JSON |
| File sink | Write + rotate | size-based rotation |
4.3 Data Structures (No Full Code)
typedef struct {
log_level min_level;
log_sink *sinks;
size_t sink_count;
clock_fn now;
} logger;
4.4 Algorithm Overview
Key Algorithm: log_emit
- Check level filter.
- Build event with timestamp and fields.
- Format into buffer.
- Send to each sink.
- Return status.
Complexity Analysis:
- Time: O(k) for k sinks.
- Space: O(1) per log event (fixed buffer).
5. Implementation Guide
5.1 Development Environment Setup
cc --version
make --version
5.2 Project Structure
libloglite/
|-- include/
| `-- loglite.h
|-- src/
| |-- logger.c
| |-- sinks.c
| |-- format.c
| `-- rotate.c
|-- demo/
| `-- log-demo.c
`-- Makefile
5.3 The Core Question You’re Answering
“How do you design a logging interface that is safe, fast, and reusable?”
5.4 Concepts You Must Understand First
- Sink interfaces and thread safety.
- Safe formatting and structured logs.
- Failure handling and rotation.
5.5 Questions to Guide Your Design
- Will logging be synchronous or async?
- How do you prevent deadlocks when sinks reenter?
- What is your rotation policy?
5.6 Thinking Exercise
Design a log line format that is both human-readable and machine-parseable. How do you escape values?
5.7 The Interview Questions They’ll Ask
- How do you make logging thread-safe?
- How do you avoid deadlocks with callbacks?
- How do you handle log file rotation?
5.8 Hints in Layers
Hint 1: Use a logger handle
logger *logger_create(void);
Hint 2: Use vsnprintf
vsnprintf(buf, sizeof(buf), fmt, ap);
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Interfaces | “Clean Code” | Ch. 3 |
| Concurrency | “The Linux Programming Interface” | Ch. 31 |
| Defensive coding | “Code Complete” | Ch. 8 |
5.10 Implementation Phases
Phase 1: Core Logger (2-3 days)
Goals: logger handle, levels, stderr sink. Checkpoint: logs printed with timestamps.
Phase 2: Sinks & Formatting (2-3 days)
Goals: file sink, logfmt/JSON. Checkpoint: logs written to file and stderr.
Phase 3: Rotation & Hardening (1-2 days)
Goals: rotation, error handling. Checkpoint: logs rotate at size threshold.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Concurrency | no lock vs mutex | mutex | prevent interleaving |
| Format | plain vs logfmt | logfmt | machine-parseable |
| Rotation | size vs time | size | deterministic tests |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit tests | Formatting | JSON escaping |
| Integration | Sinks | file + stderr |
| Failure | Rotation errors | permission denied |
6.2 Critical Test Cases
- Reentrant sink: callback logs again -> no deadlock.
- Long message: output truncated safely.
- Rotation: file exceeds size and rotates.
6.3 Test Data
msg="hello" key=value
msg="quote \""
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Missing lock | Interleaved logs | Add mutex |
| Unsafe formatting | Buffer overflow | Use vsnprintf |
| Rotation bugs | Lost logs | Rotate before write |
7.2 Debugging Strategies
- Add a sink that writes to memory for tests.
- Use deterministic timestamps.
7.3 Performance Traps
- Synchronous logging to slow disk in hot path.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add per-module log levels.
- Add colored output for stderr.
8.2 Intermediate Extensions
- Async logging with background thread.
- Network sink for UDP logs.
8.3 Advanced Extensions
- Structured binary logs.
- Pluggable formatters.
9. Real-World Connections
9.1 Industry Applications
- Server logs and observability pipelines.
- Embedded device diagnostics.
9.2 Related Open Source Projects
- spdlog (C++) - high-performance logging.
- log.c - minimal C logger.
9.3 Interview Relevance
- Concurrency boundaries and safe I/O.
- API design and error handling.
10. Resources
10.1 Essential Reading
- “Clean Code” - Ch. 3
- “The Linux Programming Interface” - Ch. 31
10.2 Video Resources
- “Logging at Scale” - conference talk (searchable title)
10.3 Tools & Documentation
man strftime, man localtime_r
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain log levels and filtering.
- I can explain sink reentrancy hazards.
- I can explain rotation policy.
11.2 Implementation
- All functional requirements are met.
- Logs are deterministic in tests.
- Rotation works without data loss.
11.3 Growth
- I can explain logging design in interviews.
- I documented failure handling rules.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Logger handle with stderr sink.
- Level filtering.
- Safe formatting with bounded buffers.
Full Completion:
- File sink with rotation.
- Structured logfmt/JSON output.
- Thread-safe logging.
Excellence (Going Above & Beyond):
- Async logging with queue policies.
- Network sink.