Project 14: Thread Pool

Build a thread pool with a task queue, worker threads, and graceful shutdown.

Quick Reference

  • Difficulty: Expert
  • Time Estimate: 2-3 weeks
  • Language: C
  • Prerequisites: pthreads, synchronization
  • Key Topics: Concurrency, mutexes, condition variables

1. Learning Objectives

By completing this project, you will:

  1. Create and manage worker threads.
  2. Build a thread-safe task queue.
  3. Coordinate workers with condition variables.
  4. Implement clean shutdown and resource cleanup.

2. Theoretical Foundation

2.1 Core Concepts

  • Mutual exclusion: Protect shared state with mutexes.
  • Condition variables: Allow threads to wait efficiently.
  • Task queues: Producers submit work, workers consume.

2.2 Why This Matters

Thread pools power servers, job systems, and parallel workloads. Correct concurrency is a key systems skill.

2.3 Historical Context / Background

Thread pools emerged to reduce thread creation overhead and to bound concurrency in servers.

2.4 Common Misconceptions

  • “Mutex alone is enough”: Without condition variables, idle workers must poll the queue in a busy loop, wasting CPU.
  • “Threads can exit anytime”: Shutdown must be coordinated so that in-flight tasks finish and every thread is joined.

3. Project Specification

3.1 What You Will Build

A thread pool API that supports:

  • pool_create(n)
  • pool_submit(pool, task_fn, arg)
  • pool_shutdown(pool)
  • Optional pool_wait(pool) (see the header sketch below)
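
One possible pool.h matching the usage in 3.4. The exact names and signatures are a suggested sketch, not a fixed specification; keeping ThreadPool opaque in the header is one design choice among several.

/* pool.h -- suggested interface sketch */
#ifndef POOL_H
#define POOL_H

#include <stddef.h>

typedef struct ThreadPool ThreadPool;   /* definition stays private in pool.c */

ThreadPool *pool_create(size_t n);                                /* NULL on failure */
int pool_submit(ThreadPool *pool, void (*fn)(void *), void *arg); /* 0 or -1 */
void pool_shutdown(ThreadPool *pool);    /* drain, stop workers, free the pool */
void pool_wait(ThreadPool *pool);        /* optional: block until queue drains */

#endif /* POOL_H */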

3.2 Functional Requirements

  1. Spawn N worker threads.
  2. Allow concurrent task submission.
  3. Workers block when the queue is empty.
  4. Shutdown waits for running tasks to finish.

3.3 Non-Functional Requirements

  • Safety: No data races or deadlocks.
  • Reliability: Clean shutdown without leaks.
  • Usability: Clear API and documentation.

3.4 Example Usage / Output

ThreadPool *pool = pool_create(4);
pool_submit(pool, do_work, arg);
pool_shutdown(pool);
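
Any function matching the task signature can serve as do_work; a minimal illustrative example (the message argument is hypothetical):

#include <stdio.h>

/* Task body: receives whatever pointer was passed to pool_submit. */
static void do_work(void *arg) {
    const char *msg = arg;
    printf("worker handled: %s\n", msg);
}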

3.5 Real World Outcome

You can run many tasks concurrently without managing threads manually. The pool provides predictable throughput and clean shutdown behavior.


4. Solution Architecture

4.1 High-Level Design

submit -> queue -> worker threads -> execute tasks

4.2 Key Components

  • Task queue: stores pending jobs. Key decision: linked list or ring buffer.
  • Worker loop: consumes tasks. Key decision: wait on a condition variable when idle.
  • Shutdown: signals workers. Key decision: a shutdown flag plus a broadcast.

4.3 Data Structures

#include <pthread.h>
#include <stddef.h>

/* One pending job in the singly linked queue. */
typedef struct Task {
    void (*fn)(void *);
    void *arg;
    struct Task *next;
} Task;

/* The struct tag lets pool.h forward-declare the type while keeping it opaque. */
typedef struct ThreadPool {
    pthread_t *threads;      /* worker thread handles */
    size_t num_threads;
    Task *head;              /* dequeue from the front */
    Task *tail;              /* enqueue at the back */
    pthread_mutex_t mutex;   /* protects the queue and the shutdown flag */
    pthread_cond_t cond;     /* signaled when work arrives or shutdown begins */
    int shutdown;            /* set once; workers exit after draining the queue */
} ThreadPool;
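
Submission is then an enqueue under the mutex followed by a signal. A minimal sketch, assuming the structures above (error handling kept to the essentials):

#include <stdlib.h>

/* Enqueue a task and wake one waiting worker; returns 0 on success. */
int pool_submit(ThreadPool *pool, void (*fn)(void *), void *arg) {
    Task *task = malloc(sizeof *task);
    if (task == NULL)
        return -1;
    task->fn = fn;
    task->arg = arg;
    task->next = NULL;

    pthread_mutex_lock(&pool->mutex);
    if (pool->shutdown) {                  /* reject work once shutdown begins */
        pthread_mutex_unlock(&pool->mutex);
        free(task);
        return -1;
    }
    if (pool->tail)
        pool->tail->next = task;
    else
        pool->head = task;
    pool->tail = task;
    pthread_cond_signal(&pool->cond);      /* one sleeping worker is enough */
    pthread_mutex_unlock(&pool->mutex);
    return 0;
}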

4.4 Algorithm Overview

Key Algorithm: Worker loop (sketched in code after the steps)

  1. Lock the mutex.
  2. Wait while the queue is empty and shutdown has not been requested.
  3. Pop a task, unlock the mutex, and execute the task.
  4. Exit when shutdown is requested and the queue is empty.
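
A minimal sketch of that loop, assuming the ThreadPool and Task structures from 4.3 (plus <stdlib.h> for free):

/* Each worker thread runs this function until shutdown. */
static void *worker(void *p) {
    ThreadPool *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->mutex);
        /* while, not if: guards against spurious wakeups (see 5.6) */
        while (pool->head == NULL && !pool->shutdown)
            pthread_cond_wait(&pool->cond, &pool->mutex);
        if (pool->head == NULL && pool->shutdown) {
            pthread_mutex_unlock(&pool->mutex);
            break;                         /* queue drained, shutdown requested */
        }
        Task *task = pool->head;           /* pop from the front */
        pool->head = task->next;
        if (pool->head == NULL)
            pool->tail = NULL;
        pthread_mutex_unlock(&pool->mutex);

        task->fn(task->arg);               /* run outside the lock (see 7.3) */
        free(task);
    }
    return NULL;
}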

Complexity Analysis:

  • Task enqueue/dequeue: O(1)
  • Worker scheduling: O(1)

5. Implementation Guide

5.1 Development Environment Setup

cc -Wall -Wextra -O2 -g -pthread -Isrc -o test_pool tests/test_pool.c src/pool.c

5.2 Project Structure

threadpool/
├── src/
│   ├── pool.c
│   └── pool.h
├── tests/
│   └── test_pool.c
└── README.md

5.3 The Core Question You’re Answering

“How do I safely share work between threads without races or deadlocks?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Mutexes
    • When to lock/unlock?
  2. Condition variables
    • Why use while instead of if?
  3. Thread lifecycle
    • How do threads exit cleanly?

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. Will you bound the queue size?
  2. How will you signal shutdown?
  3. How do you handle task submission after shutdown?

5.6 Thinking Exercise

Spurious Wakeups

Why must you check queue state in a loop after pthread_cond_wait?
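
Compare the two patterns below (fragments of the worker's locked section). pthread_cond_wait may return even though no task was enqueued, and another worker may already have taken the task that triggered the signal, so the predicate must be re-checked after every wakeup:

/* Fragile: assumes exactly one wakeup per available task. */
if (pool->head == NULL && !pool->shutdown)
    pthread_cond_wait(&pool->cond, &pool->mutex);

/* Correct: re-check the predicate every time the wait returns. */
while (pool->head == NULL && !pool->shutdown)
    pthread_cond_wait(&pool->cond, &pool->mutex);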

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is a condition variable and why do we need it?”
  2. “How do you avoid deadlocks?”
  3. “How does a thread pool improve performance?”

5.8 Hints in Layers

Hint 1: Build the queue first. Test enqueue/dequeue in a single thread.

Hint 2: Add worker threads. Start with one worker before scaling up.

Hint 3: Add shutdown logic. Use a flag and pthread_cond_broadcast (sketched below).
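
For Hint 3, one possible shutdown sequence, assuming the structures from 4.3 and that pool_create allocated both the pool and its threads array with malloc:

#include <stdlib.h>

/* Stop accepting work, let workers drain the queue, then reclaim everything. */
void pool_shutdown(ThreadPool *pool) {
    pthread_mutex_lock(&pool->mutex);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->cond);   /* wake every waiting worker */
    pthread_mutex_unlock(&pool->mutex);

    for (size_t i = 0; i < pool->num_threads; i++)
        pthread_join(pool->threads[i], NULL);

    pthread_mutex_destroy(&pool->mutex);
    pthread_cond_destroy(&pool->cond);
    free(pool->threads);
    free(pool);
}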

5.9 Books That Will Help

  • Pthreads: “The Linux Programming Interface”, Ch. 29-33
  • Concurrency: “Operating Systems: Three Easy Pieces”, Ch. 28-32

5.10 Implementation Phases

Phase 1: Foundation (4-6 days)

Goals:

  • Queue and single worker

Tasks:

  1. Implement queue data structure.
  2. Start one worker thread.

Checkpoint: Tasks execute correctly.

Phase 2: Core Functionality (5-7 days)

Goals:

  • Multi-threaded workers

Tasks:

  1. Spawn N threads.
  2. Add condition variable signaling.

Checkpoint: Multiple tasks run concurrently.

Phase 3: Polish & Edge Cases (3-5 days)

Goals:

  • Graceful shutdown

Tasks:

  1. Add shutdown flag.
  2. Join threads on exit.

Checkpoint: No leaks or deadlocks.

5.11 Key Implementation Decisions

  • Queue type: linked list vs. ring buffer. Recommendation: linked list (simpler to start with).
  • Shutdown: immediate vs. draining. Recommendation: drain the queue (predictable completion).

6. Testing Strategy

6.1 Test Categories

  • Unit tests: queue operations (enqueue/dequeue).
  • Stress tests: many tasks (e.g., 10k submissions).
  • Concurrency tests: race detection with ThreadSanitizer.

6.2 Critical Test Cases

  1. Queue empty: Workers block without spinning the CPU.
  2. Shutdown: Threads exit cleanly.
  3. High load: No lost tasks (see the stress-test sketch below).
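
A simple version of the high-load test submits many trivial tasks that each bump an atomic counter, then verifies the total. A sketch, assuming the pool.h interface from section 3 and a pool_shutdown that drains the queue before returning (the recommendation in 5.11):

#include <assert.h>
#include <stdatomic.h>
#include "pool.h"

static atomic_int done;

static void bump(void *arg) {
    (void)arg;
    atomic_fetch_add(&done, 1);
}

int main(void) {
    ThreadPool *pool = pool_create(4);
    for (int i = 0; i < 10000; i++)
        pool_submit(pool, bump, NULL);
    pool_shutdown(pool);                  /* drains the queue before returning */
    assert(atomic_load(&done) == 10000);  /* no task was lost */
    return 0;
}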

6.3 Test Data

Use a mix of CPU-bound tasks (e.g., computing Fibonacci numbers) and sleep-based tasks to exercise both busy and idle workers.

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

  • Missing lock: data races on the queue. Fix: lock the mutex around every queue operation.
  • Using if when waiting: breaks on spurious wakeups. Fix: re-check the condition in a while loop.
  • Shutdown race: hang on exit. Fix: broadcast on the condition variable, then join every thread.

7.2 Debugging Strategies

  • Use ThreadSanitizer (-fsanitize=thread); see the example build command below.
  • Add logging with thread IDs.
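
For example, a ThreadSanitizer build of the test program (paths follow the layout in 5.2):

cc -Wall -Wextra -g -fsanitize=thread -pthread -Isrc -o test_pool tests/test_pool.c src/pool.c
./test_pool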

7.3 Performance Traps

Holding the mutex while executing tasks serializes the pool. Pop the task under the lock, release the lock, then run the task.
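
In the worker loop, the difference looks like this (fragments of the loop body, not standalone code):

/* Serializes the pool: the task runs while the mutex is still held. */
task->fn(task->arg);
pthread_mutex_unlock(&pool->mutex);

/* Keeps workers parallel: drop the lock first, then run the task. */
pthread_mutex_unlock(&pool->mutex);
task->fn(task->arg);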


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add pool_wait to block until all submitted tasks complete (sketched below).
  • Add task IDs and status.
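
One way to implement pool_wait is a counter of outstanding tasks plus a second condition variable. A sketch; the pending field and the done condition variable are additions not present in the 4.3 structure:

/* Assumes two extra ThreadPool fields: size_t pending; pthread_cond_t done; */

/* Block the caller until every submitted task has finished. */
void pool_wait(ThreadPool *pool) {
    pthread_mutex_lock(&pool->mutex);
    while (pool->pending > 0)
        pthread_cond_wait(&pool->done, &pool->mutex);
    pthread_mutex_unlock(&pool->mutex);
}

/* pool_submit increments pending under the mutex; a worker decrements it after
   running a task and signals done when the count drops to zero. */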

8.2 Intermediate Extensions

  • Add bounded queue with backpressure.
  • Add per-thread statistics.

8.3 Advanced Extensions

  • Work-stealing queues.
  • Priority-based tasks.

9. Real-World Connections

9.1 Industry Applications

  • Servers: Handle concurrent requests.
  • Build systems: Parallel compilation.
  • libuv: Uses thread pools for background work.

9.2 Interview Relevance

Concurrency primitives and thread pools are common systems interview topics.


10. Resources

10.1 Essential Reading

  • “The Linux Programming Interface” - Ch. 29-33
  • “Operating Systems: Three Easy Pieces” - Ch. 28-32

10.2 Video Resources

  • Concurrency and synchronization lectures

10.3 Tools & Documentation

  • man 3 pthread_create, man 3 pthread_cond_wait
  • Related projects: an HTTP server uses a thread pool for concurrency; a debugger requires careful thread control.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain condition variables.
  • I can avoid data races and deadlocks.
  • I understand graceful shutdown.

11.2 Implementation

  • Tasks execute concurrently.
  • Shutdown is clean.
  • No race conditions detected.

11.3 Growth

  • I can implement priority queues.
  • I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Thread pool executes tasks with one worker.

Full Completion:

  • Multiple workers and graceful shutdown.

Excellence (Going Above & Beyond):

  • Bounded queue and work stealing.

This guide was generated from C_PROGRAMMING_COMPLETE_MASTERY.md. For the complete learning path, see the parent directory.