Project 2: Promise-Based Task Coordinator

Build a task execution system where jobs can depend on other jobs, using std::future and std::promise to coordinate completion and pass results between tasks.


Quick Reference

Attribute Value
Language C++ (alt: Rust, JavaScript, Go)
Difficulty Intermediate
Time 1-2 weeks
Prerequisites Project 1, move semantics
Coolness Genuinely Clever
Portfolio Value Resume Gold

Learning Objectives

By completing this project, you will:

  1. Master the future/promise paradigm: Understand that a std::promise is one end of a one-shot channel, and std::future is the other end
  2. Implement dependency-based task scheduling: Build task graphs where tasks execute only when their dependencies complete
  3. Propagate exceptions across threads: Use promise::set_exception() to transmit errors through the async boundary
  4. Handle diamond dependencies with shared_future: Understand why std::shared_future exists and when multiple consumers need the same result
  5. Avoid future-related deadlocks: Recognize that calling future::get() from the wrong thread can block forever
  6. Design task cancellation strategies: Implement cascading cancellation when upstream tasks fail
  7. Compare with callback-based approaches: Understand why futures compose better than nested callbacks

Theoretical Foundation

Core Concepts

Futures and Promises as One-Shot Channels

A std::promise<T> and std::future<T> together form a one-time communication channel between threads. The promise is the “write end” and the future is the “read end”:

Producer Thread                    Consumer Thread
      |                                  |
std::promise<T> p;                 std::future<T> f = p.get_future();
      |                                  |
   [compute result]                      |
      |                                  |
p.set_value(result); --------->    T val = f.get();  // blocks until ready
      |                                  |
   [done]                             [has val]

Key insight: Unlike a queue or channel, a future can only be “read” once. After you call get(), the future is empty. This is fundamentally different from observable patterns or reactive streams.

The Promise Contract

A promise makes exactly one of these guarantees:

  1. A value will be set via set_value()
  2. An exception will be set via set_exception()
  3. The promise will be destroyed without setting anything (broken promise exception)
std::promise<int> p;
std::future<int> f = p.get_future();

// Option 1: Success
p.set_value(42);

// Option 2: Failure
p.set_exception(std::make_exception_ptr(std::runtime_error("failed")));

// Option 3: Broken promise (if p is destroyed without calling either)
// f.get() will throw std::future_error with broken_promise

std::async vs Manual Promises

std::async is a convenience that creates a promise, spawns a thread (maybe), and returns the future:

// Using std::async
std::future<int> f = std::async(std::launch::async, []() {
    return expensive_computation();
});

// Equivalent manual approach
std::promise<int> p;
std::future<int> f = p.get_future();
std::thread t([p = std::move(p)]() mutable {
    try {
        p.set_value(expensive_computation());
    } catch (...) {
        p.set_exception(std::current_exception());
    }
});
t.detach();

When to use which:

  • std::async: Simple one-off async computations
  • Manual promises: Complex dependency graphs, custom thread pools, integration with existing infrastructure

std::shared_future for Multiple Consumers

A regular std::future can only be moved, not copied. What if two tasks depend on the same upstream result?

        Task A
       /      \
  (future)  (future)   <-- Problem: future is move-only!
     |          |
  Task B     Task C

Solution: std::shared_future can be copied and read multiple times:

std::promise<int> p;
std::shared_future<int> sf = p.get_future().share();

// Now both threads can safely call sf.get()
auto b = std::async([sf]() { return sf.get() * 2; });
auto c = std::async([sf]() { return sf.get() + 10; });

Critical difference: std::shared_future::get() returns a const T&, not T. The value is shared, not moved.

Why This Matters

Problem 1: Callback Hell

Without futures, expressing “do A, then B, then C” requires nested callbacks:

// Callback approach (hard to read, error-prone)
download_file(url, [](Result r1) {
    if (r1.success()) {
        parse_json(r1.data, [](Result r2) {
            if (r2.success()) {
                validate(r2.json, [](Result r3) {
                    // ... and so on
                });
            }
        });
    }
});

// Future approach (linear, composable)
auto f1 = download_file_async(url);
auto f2 = f1.then([](auto data) { return parse_json(data); });
auto f3 = f2.then([](auto json) { return validate(json); });

Problem 2: Shared State Complexity

Without futures, coordinating completion requires manual synchronization:

// Manual approach (error-prone)
std::mutex m;
std::condition_variable cv;
bool ready = false;
int result;

// Producer
{
    std::lock_guard lk(m);
    result = compute();
    ready = true;
}
cv.notify_one();

// Consumer
{
    std::unique_lock lk(m);
    cv.wait(lk, [&]{ return ready; });
    use(result);
}

// Future approach (encapsulated)
auto f = std::async(std::launch::async, compute);
use(f.get());

Problem 3: Error Propagation

Exceptions thrown in one thread must somehow reach the waiting thread. Futures handle this automatically:

auto f = std::async(std::launch::async, []() {
    throw std::runtime_error("something went wrong");
    return 42;
});

try {
    int result = f.get();  // Re-throws the exception here!
} catch (const std::runtime_error& e) {
    std::cerr << "Caught: " << e.what() << "\n";
}

Historical Context

Futures and promises have a rich history:

  • 1976: Baker and Hewitt describe “futures” in the context of message-passing concurrency
  • 1988: MultiLisp implements futures for parallel Lisp computation
  • 2000s: Java gets java.util.concurrent.Future, C# gets Task<T>
  • 2011: C++11 introduces std::future, std::promise, std::async
  • 2017: C++17 adds std::shared_future improvements
  • 2020: C++20 introduces coroutines (co_await works with futures)

Why C++ futures are “low-level”: Unlike JavaScript Promises or C# Tasks, C++ futures don’t have built-in .then() chaining (until C++23’s std::execution). This project teaches you to build that yourself.

Common Misconceptions

Misconception 1: “std::async always spawns a thread” Reality: With std::launch::deferred, the computation runs lazily when you call get(). Default behavior is implementation-defined.

Misconception 2: “I can call get() multiple times on a future” Reality: std::future::get() can only be called once. Use std::shared_future for multiple reads.

Misconception 3: “Futures are like callbacks” Reality: Futures are values representing eventual results. They can be stored, passed around, and composed. Callbacks are functions that get invoked.

Misconception 4: “If the producer thread dies, the future hangs forever” Reality: If the promise is destroyed without setting a value, get() throws std::future_error with broken_promise.

Misconception 5: “I should always use async over manual promises” Reality: Manual promises give you control over thread placement and lifecycle. Essential for thread pools and task graphs.


Project Specification

What You Will Build

A Task Coordinator library that:

  1. Accepts task definitions with explicit dependencies
  2. Builds a Directed Acyclic Graph (DAG) of task execution order
  3. Executes tasks in parallel when dependencies allow
  4. Propagates results from upstream to downstream tasks
  5. Handles failures by cancelling dependent tasks
  6. Provides progress reporting and timing

Functional Requirements

FR1: Task Definition

  • Tasks are defined as: ID + callable + list of dependency IDs
  • Callables can be lambdas, function pointers, or std::function
  • Tasks can return any copyable/movable type
  • Tasks can accept results from their dependencies

FR2: Dependency Management

  • Detect cycles during task graph construction
  • Ensure tasks only run after all dependencies complete
  • Support diamond dependencies (A -> B, A -> C, B -> D, C -> D)
  • Allow tasks with no dependencies (entry points)

FR3: Parallel Execution

  • Independent tasks run in parallel on a thread pool
  • Number of worker threads is configurable
  • Tasks that share dependencies correctly receive the same result

FR4: Result Propagation

  • Task results are passed to dependent tasks
  • Results are available via shared_future for diamond patterns
  • Final results are retrievable after execution completes

FR5: Error Handling

  • Exceptions in tasks are captured and propagated
  • Downstream tasks are cancelled when upstream fails
  • Clear error reporting indicates which task failed

FR6: Progress & Timing

  • Report when each task starts and completes
  • Track execution time per task
  • Report total graph execution time

Non-Functional Requirements

  • Thread Safety: All coordinator methods are thread-safe
  • No Busy Waiting: Workers wait efficiently (condition variables)
  • Clean Shutdown: Graceful termination with no resource leaks
  • Deterministic Execution: Given same inputs, same logical execution order
  • Modern C++: Uses C++17 or later features appropriately

Example Usage and Output

// Define tasks
TaskCoordinator coordinator(4);  // 4 worker threads

// Task A: Download data (no dependencies)
coordinator.add_task("A", []() -> std::string {
    std::this_thread::sleep_for(std::chrono::milliseconds(1200));
    return download_file("data.json");
}, {});

// Task B: Parse JSON (depends on A)
coordinator.add_task("B", [](const std::string& json) -> Document {
    std::this_thread::sleep_for(std::chrono::milliseconds(800));
    return parse_json(json);
}, {"A"});

// Task C: Validate schema (depends on A)
coordinator.add_task("C", [](const std::string& json) -> bool {
    std::this_thread::sleep_for(std::chrono::milliseconds(300));
    return validate_schema(json);
}, {"A"});

// Task D: Transform data (depends on B AND C)
coordinator.add_task("D", [](const Document& doc, bool valid) -> TransformedData {
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    return transform(doc);
}, {"B", "C"});

// Task E: Upload results (depends on D)
coordinator.add_task("E", [](const TransformedData& data) -> void {
    std::this_thread::sleep_for(std::chrono::milliseconds(900));
    upload(data);
}, {"D"});

// Execute the graph
auto report = coordinator.run();

Expected Output (Success Case):

$ ./task_coordinator

Submitting task graph:
  Task A: Download data.json (no dependencies)
  Task B: Parse JSON (depends on: A)
  Task C: Validate schema (depends on: A)
  Task D: Transform data (depends on: B, C)
  Task E: Upload results (depends on: D)

Building execution plan...
  Detected 5 tasks in 4 execution layers
  Layer 0: [A]
  Layer 1: [B, C]  <- can run in parallel
  Layer 2: [D]
  Layer 3: [E]

[00:00.000] Executor: Starting Task A...
[00:01.203] Executor: Task A completed (1.203s) -> "data.json contents..."
[00:01.205] Executor: Starting Task B (dependency A satisfied)...
[00:01.206] Executor: Starting Task C (dependency A satisfied)...
[00:01.509] Executor: Task C completed (0.303s) -> true
[00:02.008] Executor: Task B completed (0.803s) -> Document{...}
[00:02.010] Executor: Starting Task D (dependencies B, C satisfied)...
[00:02.512] Executor: Task D completed (0.502s) -> TransformedData{...}
[00:02.514] Executor: Starting Task E (dependency D satisfied)...
[00:03.418] Executor: Task E completed (0.904s)

================================================================================
EXECUTION COMPLETE
================================================================================
Total tasks:     5
Completed:       5
Failed:          0
Cancelled:       0

Total time:      3.418s
Sequential time: 3.706s (if run serially)
Speedup:         1.08x (limited by critical path A->B->D->E)

Per-task timing:
  Task A: 1.203s (34.2%)
  Task B: 0.803s (22.9%)
  Task C: 0.303s (8.6%)
  Task D: 0.502s (14.3%)
  Task E: 0.904s (25.7%)

Expected Output (Failure Case):

$ ./task_coordinator --inject-failure B

[00:00.000] Executor: Starting Task A...
[00:01.205] Executor: Task A completed (1.205s)
[00:01.207] Executor: Starting Task B (dependency A satisfied)...
[00:01.208] Executor: Starting Task C (dependency A satisfied)...
[00:01.512] Executor: Task C completed (0.304s)
[00:01.823] Executor: Task B FAILED: JSON parse error at line 42
[00:01.824] Executor: Cancelling Task D (dependency B failed)
[00:01.824] Executor: Cancelling Task E (dependency D cancelled)

================================================================================
EXECUTION FAILED
================================================================================
Total tasks:     5
Completed:       2 (A, C)
Failed:          1 (B)
Cancelled:       2 (D, E)

Failure chain:
  B: JSON parse error at line 42
    -> D: cancelled (dependency B failed)
      -> E: cancelled (dependency D cancelled)

Partial results available for: A, C

Real World Outcome

This project directly maps to real-world systems:

This Project Real World Equivalent
Task CI/CD job, ETL step, build target
Dependencies Job dependencies in GitHub Actions, Make dependencies
Task Graph DAG in Apache Airflow, Bazel build graph
Coordinator Scheduler in Kubernetes, Celery task queue
Result passing Data pipeline outputs

After completing this project, you will understand:

  • How build systems like Bazel schedule parallel builds
  • How CI/CD systems like GitHub Actions handle job dependencies
  • How data orchestrators like Airflow manage task execution
  • The foundation for implementing co_await and coroutine schedulers

Solution Architecture

High-Level Design

                                TASK COORDINATOR
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  ┌─────────────────┐    ┌──────────────────┐    ┌───────────────────────┐  │
│  │  Task Registry  │───>│   Graph Builder  │───>│  Execution Planner   │  │
│  │                 │    │                  │    │                       │  │
│  │ - ID -> Task    │    │ - Cycle detect   │    │ - Topological sort   │  │
│  │ - Dependencies  │    │ - Adjacency list │    │ - Parallelism layers │  │
│  └─────────────────┘    └──────────────────┘    └───────────┬───────────┘  │
│                                                              │              │
│                                                              ▼              │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                         Thread Pool Executor                         │   │
│  │                                                                      │   │
│  │  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐         │   │
│  │  │ Worker 1 │   │ Worker 2 │   │ Worker 3 │   │ Worker 4 │   ...   │   │
│  │  └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘         │   │
│  │       │              │              │              │                │   │
│  │       └──────────────┴──────────────┴──────────────┘                │   │
│  │                              │                                       │   │
│  │                              ▼                                       │   │
│  │                    ┌─────────────────┐                              │   │
│  │                    │   Ready Queue   │ <── Tasks with all deps met  │   │
│  │                    │   (thread-safe) │                              │   │
│  │                    └─────────────────┘                              │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                         Result Storage                              │   │
│  │                                                                      │   │
│  │    Task A ──> shared_future<string>                                 │   │
│  │    Task B ──> shared_future<Document>                               │   │
│  │    Task C ──> shared_future<bool>                                   │   │
│  │    ...                                                              │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘

Task Execution Flow

                    Task Lifecycle

     REGISTERED         PENDING           RUNNING         COMPLETED/FAILED
         │                 │                  │                  │
         │  Graph built    │  Deps satisfied  │   Execution      │
         ▼                 ▼                  ▼                  ▼
    ┌─────────┐       ┌─────────┐       ┌─────────┐       ┌─────────┐
    │  Task   │ ────> │  Task   │ ────> │  Task   │ ────> │  Task   │
    │ waiting │       │  ready  │       │ running │       │  done   │
    └─────────┘       └─────────┘       └─────────┘       └─────────┘
                           │                                   │
                           │                                   │
                           ▼                                   ▼
                    Ready Queue                        Notify dependents
                    (picked by worker)                 (check if now ready)

Diamond Dependency Pattern

              Task A (returns string)
             /                      \
            /                        \
    shared_future<string>      shared_future<string>
           |                          |
           ▼                          ▼
        Task B                     Task C
    (needs A's result)         (needs A's result)
           |                          |
    future<Document>            future<bool>
            \                        /
             \                      /
              \    shared_future   /
               \      (both)      /
                \                /
                 ▼              ▼
                    Task D
            (needs B's Document AND C's bool)

Key Components

Component Responsibility Key Data
Task Wraps callable with dependencies ID, callable, dep IDs, promise, state
TaskGraph Manages task relationships Adjacency list, in-degree counts
Executor Runs tasks on thread pool Worker threads, ready queue, mutex/CV
ResultStore Holds completed task results Map of ID -> shared_future
Scheduler Decides which tasks run next Topological ordering, layer assignment

Data Structures

// Task states
enum class TaskState {
    Pending,      // Waiting for dependencies
    Ready,        // Dependencies satisfied, waiting for worker
    Running,      // Currently executing
    Completed,    // Finished successfully
    Failed,       // Threw exception
    Cancelled     // Upstream dependency failed
};

// Single task representation
template<typename Result>
struct Task {
    std::string id;
    std::function<Result(/* dependency results */)> callable;
    std::vector<std::string> dependencies;
    std::promise<Result> promise;
    std::atomic<TaskState> state{TaskState::Pending};
    std::chrono::steady_clock::time_point start_time;
    std::chrono::steady_clock::time_point end_time;
};

// Type-erased task for heterogeneous storage
struct TaskBase {
    std::string id;
    std::vector<std::string> dependencies;
    std::atomic<TaskState> state{TaskState::Pending};
    virtual void execute(/* result store */) = 0;
    virtual ~TaskBase() = default;
};

// Graph structure
struct TaskGraph {
    std::unordered_map<std::string, std::unique_ptr<TaskBase>> tasks;
    std::unordered_map<std::string, std::vector<std::string>> dependents;  // reverse edges
    std::unordered_map<std::string, size_t> in_degree;  // remaining deps
};

// Thread pool with ready queue
struct Executor {
    std::vector<std::thread> workers;
    std::queue<TaskBase*> ready_queue;
    std::mutex queue_mutex;
    std::condition_variable queue_cv;
    std::atomic<bool> shutdown{false};
    std::atomic<size_t> tasks_remaining{0};
};

// Result storage (type-erased)
struct ResultStore {
    std::unordered_map<std::string, std::any> results;  // stores shared_future<T>
    std::mutex mutex;

    template<typename T>
    std::shared_future<T> get(const std::string& id);

    template<typename T>
    void store(const std::string& id, std::shared_future<T> future);
};

Algorithm Overview

Task Graph Construction:

  1. Accept task definitions via add_task()
  2. Store task and dependency edges
  3. Build reverse adjacency list (who depends on whom)
  4. Calculate in-degrees (how many deps each task has)

Cycle Detection (before execution):

  1. Perform topological sort using Kahn’s algorithm
  2. If not all tasks are visited, a cycle exists
  3. Report the cycle to the user

Execution Scheduling:

  1. Find all tasks with in-degree 0 (no dependencies)
  2. Add them to the ready queue
  3. Workers pick tasks from ready queue
  4. When a task completes:
    • Store result in ResultStore
    • Decrement in-degree of all dependents
    • If any dependent’s in-degree becomes 0, add to ready queue
  5. Repeat until all tasks complete or failure occurs

Failure Propagation:

  1. When a task fails, mark it as Failed
  2. Find all transitive dependents (BFS/DFS)
  3. Mark them as Cancelled
  4. Set their promises to a cancellation exception
  5. Continue with remaining independent tasks

Implementation Guide

Development Environment Setup

# Compiler with C++17 support
# GCC 8+, Clang 7+, or MSVC 2017+

# Ubuntu/Debian
sudo apt-get install g++ cmake build-essential

# macOS
xcode-select --install
brew install cmake

# Create project structure
mkdir -p promise-task-coordinator/{src,include,tests,examples}
cd promise-task-coordinator

# CMakeLists.txt
cat > CMakeLists.txt << 'EOF'
cmake_minimum_required(VERSION 3.16)
project(TaskCoordinator CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Thread support
find_package(Threads REQUIRED)

# Enable sanitizers for development
add_compile_options(-fsanitize=thread)
add_link_options(-fsanitize=thread)

add_library(task_coordinator
    src/task.cpp
    src/executor.cpp
    src/graph.cpp
    src/coordinator.cpp
)
target_include_directories(task_coordinator PUBLIC include)
target_link_libraries(task_coordinator Threads::Threads)

add_executable(demo examples/demo.cpp)
target_link_libraries(demo task_coordinator)

enable_testing()
add_executable(tests tests/test_main.cpp)
target_link_libraries(tests task_coordinator)
add_test(NAME unit_tests COMMAND tests)
EOF

# Build
mkdir build && cd build
cmake .. && make

Project Structure

promise-task-coordinator/
├── include/
│   ├── task_coordinator.hpp      # Main public interface
│   ├── task.hpp                  # Task types and states
│   ├── graph.hpp                 # DAG representation
│   ├── executor.hpp              # Thread pool
│   └── result_store.hpp          # Type-erased result storage
├── src/
│   ├── task.cpp
│   ├── graph.cpp
│   ├── executor.cpp
│   └── coordinator.cpp
├── tests/
│   ├── test_main.cpp
│   ├── test_basic.cpp            # Single task, simple deps
│   ├── test_diamond.cpp          # Diamond pattern
│   ├── test_failure.cpp          # Exception propagation
│   └── test_parallel.cpp         # Parallelism verification
├── examples/
│   ├── demo.cpp                  # Basic usage
│   ├── build_system.cpp          # Simulated build graph
│   └── data_pipeline.cpp         # ETL simulation
├── CMakeLists.txt
└── README.md

The Core Question You’re Answering

“How do you express ‘do A, then when A is done, do B with A’s result’ without callbacks or shared state?”

This question forces you to understand that futures are values representing eventual results. You can pass them around, store them, and compose them. The synchronization is implicit in the future/promise contract, not in explicit locks.


Concepts You Must Understand First

Before starting this project, verify you can answer these questions:

Concept Self-Check Question Where to Learn
Move semantics Why is std::future move-only? What happens if you copy it? C++ Concurrency in Action 4.2.1
std::function How does std::function store and invoke different callable types? cppreference, any C++ book
Exception safety What does it mean for an operation to provide the strong exception guarantee? Effective C++ Items 25-29
Thread basics What’s the difference between std::thread and what the thread is running? C++ Concurrency in Action 2.1
Condition variables Why must you hold a mutex when calling wait()? C++ Concurrency in Action 4.1
Template type deduction How does auto deduce types for lambda return values? Effective Modern C++ Item 1-4

Deep Concept Exploration

  1. The Future/Promise Contract
    • What exactly happens when you call promise::set_value()?
    • What memory synchronization does the future/promise pair provide?
    • What happens if you call set_value() twice?
    • C++ Concurrency in Action Chapter 4.2
  2. std::async Launch Policies
    • What’s the difference between launch::async and launch::deferred?
    • What happens with the default policy?
    • Why might std::async NOT spawn a thread?
    • C++ Concurrency in Action Chapter 4.2.1
  3. shared_future Semantics
    • Why does shared_future::get() return const T&?
    • What happens if multiple threads call get() simultaneously?
    • When do you need shared_future vs future?
    • C++ Concurrency in Action Chapter 4.2.2
  4. Exception Propagation
    • How does std::current_exception() capture an exception?
    • What’s the difference between set_exception() and set_exception_ptr()?
    • How do you re-throw a captured exception?
    • C++ Concurrency in Action Chapter 4.2.4

Questions to Guide Your Design

Task Representation

  1. How do you represent a task that can return any type? Do you use std::any, templates, or type erasure?

  2. How do you store the callable? Is std::function sufficient, or do you need something more flexible?

  3. Where does the promise live? Who owns it, and how is it accessed by the executor?

  4. How do you handle tasks with different numbers of dependencies? Does your callable signature support variadic dependency results?

Dependency Resolution

  1. How do you detect cycles in the task graph? When do you check for cycles?

  2. How do you know when all of a task’s dependencies are satisfied?

  3. How do you efficiently track which tasks are ready to run?

  4. For diamond dependencies, how do you ensure A’s result is computed once but read twice?

Execution

  1. How do workers know when new tasks are ready? Polling or notification?

  2. How do you handle a task that throws an exception? What happens to its dependents?

  3. What’s the shutdown strategy? Do you wait for all tasks, or can you cancel mid-execution?

  4. How do you handle the case where a task’s dependency was cancelled (not failed)?

Result Management

  1. How do you retrieve results for tasks with different return types?

  2. How do you handle tasks that return void?

  3. When are task results cleaned up? Do you keep them forever or allow disposal?


Thinking Exercise

Before writing code, work through this scenario on paper:

Task Graph:
    A (returns int: 10)
   / \
  B   C
  |   | \
  D   E  F
   \ /
    G (needs D and E results)

B: takes A's result, returns A * 2
C: takes A's result, returns A + 5
D: takes B's result, returns B + 1
E: takes C's result, returns C * 2
F: takes C's result (fails with exception)
G: takes D and E results, returns D + E

Questions to Answer:

  1. Draw the execution timeline assuming 2 worker threads. Which tasks run in parallel?

  2. What shared_future objects are needed? Trace the ownership of A’s result.

  3. If F fails, what happens to G? (Trick question: think carefully about the dependency graph)

  4. What is the final value of G if all tasks succeed?

  5. Write out the sequence of:
    • promise::set_value() calls
    • future::get() calls
    • shared_future copies
  6. If C throws an exception instead of F:
    • Which tasks are cancelled?
    • Which tasks complete normally?
    • What does G.get() throw?

Hints in Layers

Reveal hints progressively as needed:

Hint 1: Getting Started - Core Abstractions

Start with the minimal task representation:

// Start simple: tasks with no input dependencies
struct SimpleTask {
    std::string id;
    std::function<void()> work;
    std::vector<std::string> deps;
    std::promise<void> done;
};

// Get this working first:
// 1. Add tasks to a map
// 2. Find tasks with no deps (in_degree == 0)
// 3. Run those tasks on the thread pool
// 4. When done, signal dependents

Key insight: Get a working system with void tasks first. Add result passing later.

Hint 2: Result Passing - Type Erasure Strategy

The challenge: tasks have different result types. Solution: type erase with std::any:

class ResultStore {
    std::unordered_map<std::string, std::any> results_;
    std::mutex mutex_;

public:
    template<typename T>
    void store(const std::string& id, std::shared_future<T> future) {
        std::lock_guard lock(mutex_);
        results_[id] = future;
    }

    template<typename T>
    std::shared_future<T> get(const std::string& id) {
        std::lock_guard lock(mutex_);
        return std::any_cast<std::shared_future<T>>(results_.at(id));
    }
};

The task’s execute method retrieves typed results:

// Inside Task<R, Deps...>::execute()
auto dep_results = std::make_tuple(
    result_store.get<Dep1Type>("dep1").get(),
    result_store.get<Dep2Type>("dep2").get()
);
auto result = std::apply(callable_, dep_results);
promise_.set_value(std::move(result));
Hint 3: Scheduling - In-Degree Tracking

Use in-degree counting for efficient ready detection:

class Scheduler {
    std::unordered_map<std::string, size_t> in_degree_;  // remaining deps
    std::unordered_map<std::string, std::vector<std::string>> dependents_;

    void task_completed(const std::string& id) {
        for (const auto& dep_id : dependents_[id]) {
            if (--in_degree_[dep_id] == 0) {
                mark_ready(dep_id);
            }
        }
    }

    std::vector<std::string> initial_ready_tasks() {
        std::vector<std::string> ready;
        for (const auto& [id, degree] : in_degree_) {
            if (degree == 0) ready.push_back(id);
        }
        return ready;
    }
};

The ready queue is populated from:

  1. Initially: all tasks with in_degree == 0
  2. Dynamically: tasks whose last dependency just completed
Hint 4: Exception Handling - Cascading Cancellation

When a task fails, propagate cancellation to all transitive dependents:

void handle_task_failure(const std::string& failed_id, std::exception_ptr ex) {
    // Mark the failed task
    tasks_[failed_id]->state = TaskState::Failed;
    tasks_[failed_id]->set_exception(ex);

    // BFS to find all transitive dependents
    std::queue<std::string> to_cancel;
    for (const auto& dep_id : dependents_[failed_id]) {
        to_cancel.push(dep_id);
    }

    while (!to_cancel.empty()) {
        auto id = to_cancel.front();
        to_cancel.pop();

        if (tasks_[id]->state == TaskState::Pending ||
            tasks_[id]->state == TaskState::Ready) {

            tasks_[id]->state = TaskState::Cancelled;
            tasks_[id]->set_exception(
                std::make_exception_ptr(
                    TaskCancelledException(failed_id)));

            for (const auto& next_id : dependents_[id]) {
                to_cancel.push(next_id);
            }
        }
    }
}

Important: A cancelled task’s promise must still be fulfilled (with an exception) so that get() doesn’t hang.


The Interview Questions They’ll Ask

After completing this project, you will be ready for these common interview questions:

Question 1: “Explain std::future and std::promise. How do they differ from callbacks?”

What they want: Understanding that future/promise is a one-shot channel. The promise is the write end (producer), future is the read end (consumer). Unlike callbacks, futures are values that can be stored, passed around, and composed.

Strong answer elements:

  • Future represents an eventual value, not an invocation
  • Promise can only be set once (one-shot)
  • Automatic exception propagation across threads
  • get() blocks until value is available
  • Comparison: callbacks invert control flow, futures preserve it

Question 2: “What’s the difference between std::future and std::shared_future?”

What they want: Understanding of move semantics and multiple consumer patterns.

Strong answer elements:

  • std::future is move-only; get() can only be called once
  • std::shared_future is copyable; get() returns const reference
  • Use shared_future when multiple threads need the same result
  • Common pattern: future.share() to convert

Question 3: “How would you implement a task scheduler with dependencies?”

What they want: System design thinking, understanding of DAG execution.

Strong answer elements:

  • Represent as DAG with tasks as nodes
  • Topological sort for execution order
  • In-degree tracking for ready detection
  • Thread pool picks from ready queue
  • Completion triggers dependent check

Question 4: “What happens if a thread throws an exception and has a promise?”

What they want: Exception safety in concurrent code.

Strong answer elements:

  • Must catch and call promise::set_exception()
  • Use std::current_exception() to capture
  • If promise destroyed without setting, get() throws broken_promise
  • Wrapper pattern to ensure exception capture

Question 5: “How do you avoid deadlocks with futures?”

What they want: Understanding of blocking in get() and circular waits.

Strong answer elements:

  • Never call get() on a future from the thread that should fulfill it
  • Avoid circular dependencies in task graphs
  • Be careful with thread pool starvation (all workers blocked on get())
  • Use wait_for() with timeout for defensive programming
  • Cycle detection in dependency graphs

Question 6: “Compare std::async with using a thread pool”

What they want: Understanding of when to use each approach.

Strong answer elements:

  • std::async creates threads per call (potentially)
  • Thread pool amortizes thread creation cost
  • std::async with deferred runs lazily
  • Thread pool gives control over parallelism level
  • For task graphs, thread pool is more appropriate

Books That Will Help

Topic Book Chapter
Futures and promises fundamentals C++ Concurrency in Action (2nd Ed) - Anthony Williams Chapter 4.2
std::async and launch policies C++ Concurrency in Action (2nd Ed) - Anthony Williams Chapter 4.2.1
std::shared_future C++ Concurrency in Action (2nd Ed) - Anthony Williams Chapter 4.2.2
Exception handling in futures C++ Concurrency in Action (2nd Ed) - Anthony Williams Chapter 4.2.4
Designing concurrent data structures C++ Concurrency in Action (2nd Ed) - Anthony Williams Chapter 6
Thread pool design C++ Concurrency in Action (2nd Ed) - Anthony Williams Chapter 9.1
DAG scheduling algorithms Introduction to Algorithms (CLRS) Chapter 22 (Graph Algorithms)
Work scheduling in parallel systems The Art of Multiprocessor Programming - Herlihy & Shavit Chapter 16

Implementation Phases

Phase 1: Foundation (Days 1-3)

Goals:

  • Basic task representation with void return type
  • Task graph construction and cycle detection
  • Simple sequential execution

Tasks:

  1. Implement Task class with ID, callable, dependencies
  2. Implement TaskGraph with add, cycle detection
  3. Implement topological sort
  4. Test: add tasks, verify ordering

Checkpoint: Can build a graph of void tasks and detect cycles.

Phase 2: Thread Pool Executor (Days 4-6)

Goals:

  • Worker threads pick tasks from ready queue
  • Proper shutdown and synchronization
  • Basic parallelism working

Tasks:

  1. Implement ready queue with mutex/condition variable
  2. Implement worker thread loop
  3. Implement in-degree tracking and ready notification
  4. Test: verify tasks run in parallel

Checkpoint: Tasks run on multiple threads in dependency order.

Phase 3: Result Passing (Days 7-9)

Goals:

  • Tasks return typed results
  • Results passed to dependent tasks
  • shared_future for diamond dependencies

Tasks:

  1. Add templated Task with result type
  2. Implement ResultStore with type erasure
  3. Add dependency result retrieval
  4. Handle shared_future for multiple consumers
  5. Test: diamond dependency pattern

Checkpoint: Results flow through the graph correctly.

Phase 4: Exception Handling (Days 10-12)

Goals:

  • Exceptions captured and propagated
  • Downstream tasks cancelled on failure
  • Clean error reporting

Tasks:

  1. Wrap task execution with try/catch
  2. Call set_exception() on failure
  3. Implement cancellation cascade
  4. Generate failure report
  5. Test: various failure scenarios

Checkpoint: Failures propagate correctly, partial results available.

Phase 5: Polish & Extensions (Days 13-14)

Goals:

  • Progress reporting
  • Timing statistics
  • Clean API

Tasks:

  1. Add timing instrumentation
  2. Implement progress callbacks
  3. Clean up public API
  4. Documentation
  5. Comprehensive testing

Checkpoint: Production-ready library with examples.

Key Implementation Decisions

Decision Options Recommendation Rationale
Task type erasure std::any vs inheritance std::any Simpler, modern C++
Result storage Promises vs map of results Map of shared_futures Allows multiple reads
Cycle detection DFS vs Kahn’s algorithm Kahn’s Also gives topological order
Thread pool Fixed vs dynamic size Fixed Predictable, simpler
Ready queue std::queue vs priority_queue std::queue FIFO is fair

Testing Strategy

Test Categories

Category Purpose Example
Unit Tests Test components in isolation Future/promise pair works
Integration Tests Test component interaction Task graph + executor
Stress Tests Verify under load 1000 tasks, 16 threads
Failure Tests Verify error handling Exception propagation
Edge Cases Handle unusual scenarios Single task, empty graph

Critical Test Cases

  1. Single Task (No Dependencies)
    coordinator.add_task("A", []{ return 42; }, {});
    auto result = coordinator.run();
    EXPECT_EQ(result.get<int>("A"), 42);
    
  2. Linear Chain (A -> B -> C)
    coordinator.add_task("A", []{ return 1; }, {});
    coordinator.add_task("B", [](int a){ return a + 1; }, {"A"});
    coordinator.add_task("C", [](int b){ return b + 1; }, {"B"});
    // C should equal 3
    
  3. Diamond Pattern (A -> B,C -> D)
    coordinator.add_task("A", []{ return 10; }, {});
    coordinator.add_task("B", [](int a){ return a * 2; }, {"A"});
    coordinator.add_task("C", [](int a){ return a + 5; }, {"A"});
    coordinator.add_task("D", [](int b, int c){ return b + c; }, {"B", "C"});
    // A runs once, B and C get shared_future, D = 20 + 15 = 35
    
  4. Failure Propagation
    coordinator.add_task("A", []{ return 1; }, {});
    coordinator.add_task("B", [](int){ throw std::runtime_error("fail"); return 0; }, {"A"});
    coordinator.add_task("C", [](int){ return 3; }, {"B"});  // Should be cancelled
    
  5. Cycle Detection
    coordinator.add_task("A", []{}, {"B"});
    coordinator.add_task("B", []{}, {"A"});
    EXPECT_THROW(coordinator.run(), CycleDetectedException);
    
  6. Parallel Execution Verification
    // A, B, C have no dependencies - should run in parallel
    std::atomic<int> concurrent_count{0};
    int max_concurrent = 0;
    
    auto task = [&]{
        int c = ++concurrent_count;
        max_concurrent = std::max(max_concurrent, c);
        std::this_thread::sleep_for(100ms);
        --concurrent_count;
    };
    
    coordinator.add_task("A", task, {});
    coordinator.add_task("B", task, {});
    coordinator.add_task("C", task, {});
    coordinator.run();
    
    EXPECT_GE(max_concurrent, 2);  // At least 2 ran in parallel
    

Common Pitfalls & Debugging

Frequent Mistakes

Pitfall Symptom Solution
Calling get() twice on future Exception: “future already retrieved” Use shared_future or store result
Promise destroyed without setting Exception: “broken promise” Always set value or exception
Circular dependencies Infinite wait or exception Detect cycles before execution
Thread pool starvation Deadlock when all workers blocked Limit blocking gets, use timeouts
Forgetting to move promise Compile error (deleted copy) Use std::move() for promise capture
Race condition in ready check Tasks missed or run twice Protect in_degree with mutex

Debugging Strategies

Deadlock diagnosis:

# Run with timeout
timeout 10 ./task_coordinator || echo "Possible deadlock"

# Use GDB to see thread states
gdb -ex "thread apply all bt" ./task_coordinator

Thread Sanitizer:

g++ -fsanitize=thread -o coordinator coordinator.cpp
./coordinator  # Reports data races

Logging strategy:

#define LOG_TASK(id, msg) \
    std::cerr << "[" << std::this_thread::get_id() << "] " \
              << id << ": " << msg << std::endl

// In task execution:
LOG_TASK(id, "starting");
LOG_TASK(id, "completed with value: " + std::to_string(result));

Verification:

  • Print the topological order before execution
  • Log each dependency satisfaction
  • Print final states of all tasks

Extensions & Challenges

Beginner Extensions

  • Task timeout: Cancel tasks that exceed a time limit
  • Priority levels: High-priority tasks run first when ready
  • Progress callback: Notify external code as tasks complete

Intermediate Extensions

  • Continuation chains: Add .then() method to futures
  • Task groups: Cancel entire groups of related tasks
  • Retry logic: Automatically retry failed tasks with backoff
  • Resource limits: Tasks can declare resource requirements

Advanced Extensions

  • Distributed execution: Tasks can run on remote nodes
  • Persistent task graphs: Serialize/deserialize task graphs
  • Incremental execution: Re-run only changed tasks (like Make)
  • Coroutine integration: Make tasks co_await-able

Research Extensions

  • Optimal scheduling: Minimize total execution time given task durations
  • Fault tolerance: Continue despite worker failures
  • Speculative execution: Start likely-needed tasks early

Real-World Connections

Industry Systems Using These Patterns

System Pattern Used Scale
Bazel/Buck Task graph, incremental execution Billions of artifacts at Google
Apache Airflow DAG scheduling, dependency tracking ETL pipelines
GitHub Actions Job dependencies, parallel execution CI/CD
Kubernetes Jobs Dependency ordering, failure handling Container orchestration
Celery Task queue, result backend Python async tasks
TBB/OpenMP Task-based parallelism HPC, games

How Production Systems Differ

Your implementation is pedagogical. Production systems add:

  1. Persistence: Tasks survive process restarts
  2. Distribution: Tasks run across machines
  3. Observability: Metrics, tracing, logging
  4. Backpressure: Handle more tasks than workers
  5. Versioning: Handle task definition changes
  6. Security: Authentication, authorization

Career Applications

This project prepares you for:

  • Build system engineering: Bazel, Buck, Gradle internals
  • Data platform engineering: Airflow, Prefect, Dagster
  • Game engine development: Job systems, async loading
  • High-frequency trading: Task scheduling, latency optimization
  • Compiler engineering: Parallel compilation, incremental builds

Resources

Essential Reading

  • C++ Concurrency in Action, 2nd Edition (Anthony Williams) - Chapter 4
  • cppreference.com - std::future, std::promise, std::async pages
  • Herb Sutter’s “Effective Concurrency” series on DDJ

Video Resources

  • CppCon 2014: “C++ Futures and How to Use Them” - Scott Meyers
  • CppCon 2016: “The Continuing Future of Concurrency in C++” - Anthony Williams
  • CppCon 2019: “Back to Basics: Concurrency” - Arthur O’Dwyer

Code References

  • folly::Future (Facebook’s continuation-based futures)
  • Intel TBB task_group
  • HPX (High Performance ParalleX) futures
  • Boost.Fiber (cooperative scheduling)

Self-Assessment Checklist

Before considering this project complete, verify:

Understanding

  • I can explain why futures are better than callbacks for async coordination
  • I understand the one-shot nature of promise/future pairs
  • I can explain when to use shared_future vs future
  • I understand how exceptions propagate through futures
  • I can identify potential deadlocks in task graphs

Implementation

  • Task graph correctly rejects cycles
  • Independent tasks run in parallel
  • Diamond dependencies work (shared result)
  • Exceptions propagate to downstream tasks
  • Cancelled tasks don’t run
  • Thread pool shuts down cleanly

Testing

  • All critical test cases pass
  • Thread sanitizer reports no races
  • Performance scales with thread count
  • Edge cases handled (empty graph, single task)

Growth

  • I can now explain task-based parallelism in interviews
  • I understand how build systems schedule work
  • I’m ready to learn coroutines (C++20)
  • I can read production task scheduler code

Submission / Completion Criteria

Minimum Viable Completion:

  • Tasks with dependencies execute in correct order
  • Results pass from producers to consumers
  • Basic error handling (exceptions caught)
  • Works with at least 2 worker threads

Full Completion:

  • Diamond dependencies work with shared_future
  • Cascading cancellation on failure
  • Timing and progress reporting
  • Comprehensive test suite
  • Clean public API

Excellence (Going Above & Beyond):

  • .then() continuation support
  • Task priorities
  • Timeout and retry logic
  • Performance benchmarks
  • Comparison with production systems (Folly, TBB)

This guide was expanded from LEARN_CPP_CONCURRENCY_AND_PARALLELISM.md. For the complete learning path, see the project index.