Project 3: Database Connection Pool

Build a robust, concurrency-safe connection pool with timeouts, health checks, and strict lifecycle invariants.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Advanced |
| Time Estimate | 2-3 weeks |
| Main Programming Language | C |
| Alternative Programming Languages | Rust, Go, C++ |
| Coolness Level | Level 3: Genuinely Clever |
| Business Potential | Level 2: Micro-SaaS / Pro Tool |
| Prerequisites | Threads, mutex/condvar basics, sockets or DB client API |
| Key Topics | Resource lifecycle, concurrency, timeouts, state machines |

1. Learning Objectives

By completing this project, you will:

  1. Model a connection pool as a state machine with explicit invariants.
  2. Implement safe acquisition and release under concurrent load.
  3. Enforce timeouts and backpressure for fairness.
  4. Detect and recycle unhealthy connections deterministically.
  5. Produce deterministic test traces for contention scenarios.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Resource Pool State Machines and Lifecycle Guarantees

Fundamentals

A connection pool is a system for managing a finite resource (connections) under concurrent demand. The pool must ensure that each connection is in exactly one state at any time: available, in-use, or dead. Without explicit state tracking, connections get leaked, double-freed, or reused while still in use. This is a resource lifecycle problem: acquire, use, release, with invariants enforced on every path. The pool itself is also a state machine: it can be initialized, active, draining, or shutdown. Every API call must validate that the pool is in the right state and must maintain invariant counts (e.g., available + in-use + dead == max).

Deep Dive into the concept

Resource pools are a classic systems pattern because they enforce scarcity. Databases and external services cannot tolerate unlimited connections, so the pool enforces a hard ceiling. This means every API call must respect capacity. If a pool contains N connections, then at any moment, exactly N resources exist, and each resource must be either free or owned by a caller. The simplest invariant is: total_connections = free + in_use + dead. You enforce this invariant with a single authoritative data structure, typically a queue for free connections and a set for in-use connections.

The pool is also a state machine with its own lifecycle. A correct design defines states like INIT (connections not created), ACTIVE (serving requests), DRAINING (no new acquisitions; wait for in-use to return), and SHUTDOWN (all connections closed). The pool API should refuse acquisitions in DRAINING or SHUTDOWN states. This prevents the most common lifecycle bug: someone acquires a connection while shutdown is in progress, causing use-after-close. The pool’s state should be explicit, not implied by ad-hoc booleans.

Connection lifecycle is more complex than just open/close. A connection can be healthy, stale, or broken. The pool must detect health before reuse. A common strategy is to validate a connection before handing it out: either by checking a flag updated by previous errors or by performing a lightweight “ping.” If validation fails, the connection is closed and replaced. This introduces state transitions inside the pool itself: AVAILABLE -> VALIDATING -> IN_USE, or AVAILABLE -> DEAD. The pool must ensure that validation itself does not leak resources; if a validation attempt fails, the connection must be returned to the pool or closed.

The correct acquisition path also needs cleanup discipline. Suppose a thread acquires a connection, does some work, and then errors before it can return it. The pool must provide an API that makes this hard to forget. In C, you cannot rely on destructors, so you must design API patterns that make release explicit and easy. For example, return a wrapper struct that must be passed to a pool_release() function. Also consider timeouts: if a thread waits too long, it should either time out and return an error or, if it did acquire a connection, it must guarantee that it releases it before returning.

Another subtlety is fairness and starvation. If the pool uses a simple mutex and a condition variable, waking threads may be unordered. Some threads may starve while others repeatedly acquire. A fair design uses a wait queue or ticketing system to ensure that waiting threads are served in order. This is a state machine problem at the pool level: WAITING -> ACQUIRED or WAITING -> TIMEOUT. Each transition must be explicit, and the pool must cleanly remove timed-out waiters from the queue.

Finally, pool semantics should be deterministic. Under the same sequence of events, the pool should behave the same way. This means that connection acquisition and release should be well-defined even when multiple threads race. You must be explicit about locking order and about when a connection becomes visible to other threads. If you fail to do this, you will see elusive bugs: two threads using the same connection, or a connection never returned to the free list. These are exactly the bugs that destroy production systems, so the pool becomes a perfect project for control flow discipline.

How this fits into the project

This concept defines the pool’s core correctness: explicit lifecycle states for the pool and for each connection, with invariants checked on every transition.

Definitions & key terms

  • Pool state: INIT, ACTIVE, DRAINING, SHUTDOWN.
  • Connection state: AVAILABLE, IN_USE, DEAD.
  • Invariant: a condition that must remain true (free + in-use + dead = max).
  • Lifecycle: acquire -> use -> release, exactly once.

Mental model diagram (ASCII)

Pool: INIT -> ACTIVE -> DRAINING -> SHUTDOWN

Connection: AVAILABLE -> IN_USE -> AVAILABLE
                       \-> DEAD (on failure)

How it works (step-by-step)

  1. Initialize pool, create N connections.
  2. On acquire, check pool state is ACTIVE.
  3. Move one connection from AVAILABLE to IN_USE.
  4. On release, validate, then return to AVAILABLE or mark DEAD.
  5. On shutdown, disallow new acquisitions and close all free connections.

Failure modes: double release, forgotten release, connection reused after close, acquisitions during shutdown.

Minimal concrete example

typedef enum { POOL_INIT, POOL_ACTIVE, POOL_DRAINING, POOL_SHUTDOWN } PoolState;

typedef struct {
    PoolState state;
    int total, free, in_use, dead;
} Pool;

Common misconceptions

  • “Connections are just sockets.” They have protocol state and can become invalid.
  • “Shutdown can just close everything immediately.” In-use connections must be handled safely.

Check-your-understanding questions

  1. Why must the pool reject acquisitions during DRAINING?
  2. What invariant ensures no connection leaks?
  3. What should happen if validation fails on release?

Check-your-understanding answers

  1. Because the pool is winding down; a new acquisition could hand out a connection that is about to be closed underneath the caller.
  2. Total connections must equal free + in-use + dead.
  3. The connection should be closed and replaced, not returned to the free list.

Real-world applications

  • Database drivers and ORM frameworks
  • HTTP client pools
  • Thread pools and worker pools

Where you’ll apply it

Lifecycle states and invariants drive the pool API in §3.2 (init, acquire, release, shutdown) and Phase 1 of the implementation plan in §5.10.

References

  • “The Linux Programming Interface” (threading and synchronization)
  • “Designing Data-Intensive Applications” (resource constraints)

Key insights

A pool is a state machine that enforces scarcity; correctness depends on explicit lifecycle transitions.

Summary

If you cannot state the pool invariants clearly, you cannot build a correct pool.

Homework/Exercises to practice the concept

  1. Write an invariant check function that asserts pool counts are consistent.
  2. Simulate acquire/release sequences and verify the invariant never fails.

Solutions to the homework/exercises

  1. Assert total == free + in_use + dead after every mutation.
  2. Use a small scripted sequence and check counts after each step.

2.2 Concurrency, Timeouts, and Backpressure

Fundamentals

A connection pool is only valuable if it works under concurrent load. Multiple threads must be able to acquire and release connections safely without races. This requires synchronization primitives (mutexes, condition variables) and a queue of waiters. Timeouts prevent threads from waiting forever, and backpressure protects the database by limiting concurrent access. Fairness ensures that threads waiting longer get served first, preventing starvation. These aspects turn the pool into a concurrency state machine: waiting threads are in a WAITING state that can transition to ACQUIRED or TIMEOUT.

In addition, the pool must expose thread-safe metrics (counts, wait time, failures) without introducing races or deadlocks. A correct design decides which fields are protected by the same lock and which can be read atomically. This avoids subtle bugs where a monitoring thread observes impossible combinations of counts or a logging path accidentally breaks invariants.

Deep Dive into the concept

Concurrency makes lifecycle management harder because multiple threads can race for the same resources. The simplest approach is a mutex protecting the pool’s state and a condition variable that threads wait on when no connections are free. But this only works if every path follows a strict protocol: acquire lock, check state, wait if needed, allocate a connection, update counts, release lock. If any path fails to update counts or signal waiters, the system can deadlock. Control flow discipline is critical.

Timeouts are the second essential element. If a thread waits on a condition variable indefinitely, it might block the whole application. Instead, implement timed waits. A thread enters WAITING state, adds itself to a wait queue, then waits with a deadline. If it times out, it must remove itself from the queue and return an error. If it is signaled, it must verify that a connection is actually available before proceeding (spurious wakeups are real). This is why the wait loop is always “while no connection, wait” not “if no connection, wait”.

Backpressure is the global policy that limits concurrent work. The pool enforces a hard maximum and refuses new acquisitions beyond it. For a more graceful policy, you can implement a maximum wait queue size: if too many threads are waiting, reject new requests immediately. This protects the system from load spikes and avoids unbounded memory usage. In the pool’s state machine, this is a transition from WAITING to REJECTED rather than WAITING to ACQUIRED.

Fairness matters because a naive condition variable does not guarantee which thread is woken. Under heavy load, some threads may starve. A fair pool can be implemented by storing waiters in a FIFO queue. When a connection is released, the pool signals the next waiter explicitly. In C, this can be done by storing a per-waiter condition variable or by using a semaphore and ticketing. The key is that fairness is an explicit policy, not an accidental property.

Health checks and timeouts interact. Suppose a connection becomes stale and validation fails. That reduces the pool size temporarily, which may increase wait times. The pool must respond by creating a replacement connection or by reducing capacity. If creation fails, the pool must enter a DEGRADED state that still serves existing connections but reports lower capacity. This is a design choice, but it must be explicit. Otherwise, failures will manifest as random timeouts that are impossible to debug.

Another concurrency hazard is thundering herds. If you wake all waiters when a single connection is released, they will all race, most will go back to sleep, and your system will burn CPU. A good pool wakes only one waiter (or uses a ticketing scheme) and keeps the wake-up path short. This is a small detail that has a large effect on performance and predictability under load.

Finally, deterministic testing of concurrency requires fixed schedules. You can simulate contention with a test harness that spawns threads with known delays. Use fixed seeds for any randomness. Record the order of acquisitions and releases and assert that no thread waits longer than expected given the policy. The pool should behave predictably under these controlled scenarios, which demonstrates that your synchronization and state transitions are correct.

How this fits into the project

This concept drives the pool’s concurrency design: waiting threads, timeouts, fairness, and backpressure. It also shapes the test strategy and error reporting.

Definitions & key terms

  • Mutex: Mutual exclusion lock protecting shared state.
  • Condition variable: Wait/notify mechanism for threads.
  • Spurious wakeup: Condition variable wakes without a signal.
  • Backpressure: Policy to limit load by refusing or delaying requests.

Mental model diagram (ASCII)

Thread -> WAITING --(signal)--> ACQUIRED
             |                    |
             v                    v
          TIMEOUT              RELEASE

How it works (step-by-step)

  1. Acquire lock.
  2. While no connection, wait with timeout.
  3. If timeout, remove waiter and return error.
  4. If signaled, re-check availability and acquire.
  5. Release lock and return connection handle.

Failure modes: deadlocks from missed signals, starvation, unbounded wait queues, spurious wakeup bugs.

Minimal concrete example

// Compute an absolute deadline (pthread_cond_timedwait takes absolute time).
struct timespec deadline;
clock_gettime(CLOCK_REALTIME, &deadline);
deadline.tv_sec += timeout_sec;               // timeout_sec chosen by caller

pthread_mutex_lock(&pool->mu);
// Loop, not if: spurious wakeups and racing acquirers require a re-check.
while (pool->free == 0) {
    if (pthread_cond_timedwait(&pool->cv, &pool->mu, &deadline) == ETIMEDOUT) {
        pthread_mutex_unlock(&pool->mu);
        return POOL_TIMEOUT;
    }
}
// acquire connection: move it from the free list to in-use, update counts
pthread_mutex_unlock(&pool->mu);

Common misconceptions

  • “Condition variables wake exactly one thread.” They can wake spuriously.
  • “Fairness is automatic.” It is not; you must implement it.

Check-your-understanding questions

  1. Why must waits be in a loop, not an if?
  2. How does a timeout transition affect pool invariants?
  3. What is a fair wakeup policy?

Check-your-understanding answers

  1. Because of spurious wakeups and races; the condition must be rechecked.
  2. The waiter must be removed so that the pool does not think it is still waiting.
  3. FIFO queueing of waiters, so the longest-waiting thread is served first.

Real-world applications

  • Database client pools in web servers
  • HTTP connection pools in SDKs
  • Thread pools and worker queues

Where you’ll apply it

Timed waits, fairness, and backpressure shape the acquire path in §3.2, the Wait Queue component in §4.2, and Phase 2 of the implementation plan in §5.10.

References

  • “Programming with POSIX Threads” by David Butenhof
  • “The Linux Programming Interface” (pthread condvars)

Key insights

Concurrency is a state machine: waiting threads transition to acquired, timed out, or rejected.

Summary

Correctness under contention requires explicit wait states, timed waits, and deterministic policies.

Homework/Exercises to practice the concept

  1. Implement a timed wait loop and simulate a timeout.
  2. Build a FIFO wait queue and verify fairness with two threads.

Solutions to the homework/exercises

  1. Use pthread_cond_timedwait and return a timeout error when it expires.
  2. Use a queue of waiter objects and signal only the front waiter on release.

3. Project Specification

3.1 What You Will Build

A connection pool library that manages a fixed number of connections (mocked or real), with a CLI demo that spawns threads to acquire, use, and release connections. The pool must handle timeouts, shutdown, and health checks without leaks.

Included:

  • Pool lifecycle (init, active, draining, shutdown)
  • Thread-safe acquire/release
  • Timeout and fairness policies
  • Health check on release

Excluded:

  • Actual database protocol implementation

3.2 Functional Requirements

  1. Pool init: create N connections and enter ACTIVE state.
  2. Acquire: block with timeout if none available.
  3. Release: validate and return or replace.
  4. Shutdown: transition to DRAINING then SHUTDOWN.
  5. Health check: detect bad connections and replace.
  6. Metrics: report counts (free, in-use, dead, waiters).

3.3 Non-Functional Requirements

  • Performance: low contention for up to 32 threads.
  • Reliability: no double-release, no leaks.
  • Usability: clear error codes for timeout and shutdown.

3.4 Example Usage / Output

$ ./pool_demo --threads 4 --size 2
[pool] acquire -> conn#1
[pool] acquire -> conn#2
[pool] wait -> timeout after 200ms

3.5 Data Formats / Schemas / Protocols

  • CLI output: log lines and optional JSON (--json).
  • JSON error shape:
    { "error": { "code": "POOL_TIMEOUT", "message": "waited 200ms" } }
    

3.6 Edge Cases

  • Acquire during shutdown
  • Release after shutdown
  • Connection fails validation on release
  • Spurious wakeups

3.7 Real World Outcome

Deterministic demo uses fixed thread delays and timeouts.

3.7.1 How to Run (Copy/Paste)

cc -std=c11 -pthread -O2 -o pool_demo src/pool_demo.c
./pool_demo --threads 4 --size 2 --timeout-ms 200 --seed 42

3.7.2 Golden Path Demo (Deterministic)

Expected output shows two threads acquiring, two waiting, then successful releases and acquisitions in FIFO order.

3.7.3 CLI Transcript (Success + Failure)

$ ./pool_demo --threads 3 --size 1 --timeout-ms 100 --seed 1
[pool] acquire -> conn#1 (thread 1)
[pool] wait -> thread 2
[pool] wait -> thread 3
[pool] release <- conn#1 (thread 1)
[pool] acquire -> conn#1 (thread 2)
[pool] timeout -> thread 3
$ echo $?
0

$ ./pool_demo --threads 1 --size 1 --shutdown-immediately
ERROR: POOL_SHUTDOWN (no new acquisitions)
$ echo $?
1

Exit codes:

  • 0 success
  • 1 user/pool error (timeout, shutdown)
  • 2 system error

4. Solution Architecture

4.1 High-Level Design

[Threads] -> [Acquire/Release API] -> [Pool State + Wait Queue]
                                   -> [Connection List]

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Pool State | Track lifecycle and counts | Explicit enum state |
| Wait Queue | Fairness and timeouts | FIFO ordering |
| Connection List | Free/in-use storage | Separate free list |
| Health Check | Validate connections | Simple ping or flag |

4.3 Data Structures (No Full Code)

typedef struct Conn { int id; int healthy; struct Conn *next; } Conn;  /* next links the free list */

typedef struct Pool {
    PoolState state;
    Conn *free_list;
    Conn *in_use;
    int free_count, in_use_count, dead_count;
    pthread_mutex_t mu;
    pthread_cond_t cv;
} Pool;

4.4 Algorithm Overview

Key Algorithm: Acquire with Timeout

  1. Lock pool.
  2. While no free conn, wait with timeout.
  3. On timeout, return error.
  4. On availability, move conn to in-use and return.

Complexity Analysis:

  • Time: O(1) per acquire/release (amortized)
  • Space: O(n) for connections

5. Implementation Guide

5.1 Development Environment Setup

cc --version

5.2 Project Structure

project-root/
├── src/
│   ├── pool.c
│   ├── pool.h
│   └── pool_demo.c
├── tests/
│   └── test_pool.c
└── Makefile

5.3 The Core Question You’re Answering

“How do I guarantee that every resource is acquired and released exactly once, even under concurrency and failure?”

5.4 Concepts You Must Understand First

  1. Mutex + condition variable patterns
  2. Timeouts and spurious wakeups
  3. Resource lifecycle invariants

5.5 Questions to Guide Your Design

  1. How will you ensure fairness for waiting threads?
  2. What happens if a thread crashes without releasing?
  3. How will you detect and replace dead connections?

5.6 Thinking Exercise

Simulate 3 threads and 2 connections with 100ms timeouts. Which thread times out first under FIFO ordering?

5.7 The Interview Questions They’ll Ask

  1. “How do you avoid deadlocks in a connection pool?”
  2. “Why must timed waits be in a loop?”
  3. “How do you handle connection validation?”

5.8 Hints in Layers

Hint 1: Start with a single-threaded pool and add locking later.

Hint 2: Implement acquire/release with counters and asserts.

Hint 3: Add timeouts once basic correctness is proven.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Threads | “Programming with POSIX Threads” | Ch. 3-5 |
| Error handling | “Effective C” | Ch. 8 |
| Concurrency patterns | “The Linux Programming Interface” | Ch. 30 |

5.10 Implementation Phases

Phase 1: Core Pool (4-5 days)

  • Implement pool struct and free list.
  • Add acquire/release with invariants.

Phase 2: Concurrency (1 week)

  • Add mutex/condvar.
  • Implement timed waits and fairness.

Phase 3: Health + Shutdown (3-5 days)

  • Add validation and replacement.
  • Implement draining and shutdown.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Wait policy | FIFO, LIFO | FIFO | Predictable fairness |
| Validation | Ping, flag | Ping on release | Simple and reliable |
| Pool size | Fixed, dynamic | Fixed | Deterministic behavior |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit Tests | Invariants | counts after acquire/release |
| Integration Tests | Concurrency | 10 threads, 2 connections |
| Edge Case Tests | Shutdown | acquire during shutdown |

6.2 Critical Test Cases

  1. Two threads acquire one connection with timeout.
  2. Release after shutdown returns error.
  3. Dead connection replaced correctly.

6.3 Test Data

threads=4, pool=2, timeout=200ms

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Missing signal | Threads hang | Signal after release |
| Incorrect counts | Invariant broken | Add asserts after mutation |
| Unsafe shutdown | Use-after-close | Enforce the DRAINING state |

7.2 Debugging Strategies

  • Log pool state transitions and counts.
  • Add thread IDs to acquire/release logs.

7.3 Performance Traps

Excessive locking can reduce throughput; keep critical sections minimal.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add metrics export (free/in-use counts).
  • Add connection idle timeout.

8.2 Intermediate Extensions

  • Dynamic pool resizing with max/min limits.
  • Per-thread connection affinity.

8.3 Advanced Extensions

  • Integrate with a real DB client library.
  • Implement circuit breaker behavior on repeated failures.

9. Real-World Connections

9.1 Industry Applications

  • Database pools in web servers
  • Service discovery clients with connection reuse
  • HikariCP (Java)
  • pgbouncer (PostgreSQL connection pool)

9.2 Interview Relevance

  • Concurrency, resource lifecycle, and fairness are common system design topics.

10. Resources

10.1 Essential Reading

  • “Programming with POSIX Threads” by Butenhof
  • “The Linux Programming Interface” by Kerrisk

10.2 Video Resources

  • Concurrency and condition variable tutorials

10.3 Tools & Documentation

  • pthread manpages

11. Self-Assessment Checklist

11.1 Understanding

  • I can state the pool invariants clearly.
  • I can explain timed waits and spurious wakeups.
  • I can explain how validation prevents reuse of dead connections.

11.2 Implementation

  • All functional requirements are met.
  • All tests pass under concurrency.
  • No leaks or double-releases.

11.3 Growth

  • I can reason about fairness trade-offs.
  • I can explain pool shutdown semantics.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Acquire/release works with invariants.
  • Timeout returns deterministic error.

Full Completion:

  • Fair wait queue, health checks, and shutdown semantics.

Excellence (Going Above & Beyond):

  • Dynamic resizing and circuit breaker logic.

13. Additional Content Rules (Compliance)

  • Deterministic demo provided in §3.7.
  • Failure demo with exit codes included.
  • Cross-links included in §2.1 and §2.2.