
C++ async & Futures: High-Level Concurrency with std::async Guide 2026

Why async Over Raw Threads

Managing threads manually means creating the thread, passing output parameters or wiring up a promise, joining at the right time, and carrying exceptions across the thread boundary yourself. std::async wraps all of this into one function call: launch a task, get a future, call .get() for the result. There is no manual join, no hand-managed shared variable, and no mutex around the return value.

std::async is to threads what std::lock_guard is to mutexes — a higher-level abstraction that handles the common case safely and simply. For parallel computation where you want results back, it’s almost always the right first choice.
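To make the comparison concrete, here is a minimal sketch of the same job done both ways. The workload sum_to and the wrappers sum_with_thread and sum_with_async are hypothetical names invented for this illustration, not part of any library.

```cpp
#include <cassert>
#include <exception>
#include <future>
#include <thread>

// Hypothetical workload: sum of 0..n.
int sum_to(int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;
}

// Manual version: a promise carries the result (or exception) out of the
// thread, and the caller must remember to join.
int sum_with_thread(int n) {
    std::promise<int> promise;
    std::future<int> future = promise.get_future();
    std::thread worker([&promise, n] {
        try {
            promise.set_value(sum_to(n));
        } catch (...) {
            promise.set_exception(std::current_exception());
        }
    });
    worker.join();          // join first so a throwing get() cannot skip it
    return future.get();
}

// std::async version: the library starts the task, transfers the result,
// and waits for completion inside get().
int sum_with_async(int n) {
    return std::async(std::launch::async, sum_to, n).get();
}
```

Both functions return the same value for the same input; the second simply leaves the promise, the try/catch, and the join to the library.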

std::async — Launch and Forget

#include <future>
#include <iostream>
#include <string>
#include <chrono>
#include <thread>

// A function that takes time
int compute(int x) {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    return x * x;
}

std::string fetch_data(const std::string& url) {
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    return "Data from " + url;
}

int main() {
    auto start = std::chrono::steady_clock::now();

    // Launch tasks asynchronously (explicit policy, so they really run in parallel)
    std::future<int> result1 = std::async(std::launch::async, compute, 42);
    std::future<std::string> result2 = std::async(std::launch::async, fetch_data, "api.example.com");

    // Do other work while tasks run in background
    std::cout << "Tasks launched, doing other work...\n";

    // Get results (blocks if not ready yet)
    int value = result1.get();
    std::string data = result2.get();

    auto elapsed = std::chrono::steady_clock::now() - start;
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(elapsed);

    std::cout << "Result 1: " << value << "\n";
    std::cout << "Result 2: " << data << "\n";
    std::cout << "Total time: " << ms.count() << "ms\n";
    // ~1000ms, not 1500ms — tasks ran in parallel!
}

Launch Policies

#include <future>
#include <iostream>
#include <thread>

int work() {
    std::cout << "Thread ID: " << std::this_thread::get_id() << "\n";
    return 42;
}

int main() {
    std::cout << "Main thread: " << std::this_thread::get_id() << "\n";

    // std::launch::async — MUST run in a new thread
    auto f1 = std::async(std::launch::async, work);
    // Guaranteed different thread ID

    // std::launch::deferred — run lazily when .get() is called
    auto f2 = std::async(std::launch::deferred, work);
    // Runs in the calling thread when f2.get() is called
    // Same thread ID as main

    // Default (no policy): same as std::launch::async | std::launch::deferred,
    // so the implementation picks one
    auto f3 = std::async(work);
    // May run async or deferred; don't rely on either

    f1.get();
    f2.get();  // work() runs HERE, in main's thread
    f3.get();
}

Use std::launch::async when you need true parallelism. Use std::launch::deferred for lazy evaluation — the computation runs only if and when you actually need the result. The default policy is convenient but unpredictable.
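When code does rely on the default policy, it can at least find out which way the implementation went. The sketch below uses a zero-length wait_for, which reports std::future_status::deferred for a deferred task. The helper name is_deferred and the function answer are hypothetical, chosen for this example.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Returns true if the task behind `f` was deferred, i.e. it will only run
// inside get() or wait() on the calling thread.
template <typename T>
bool is_deferred(const std::future<T>& f) {
    assert(f.valid());  // wait_for on an invalid future is undefined behavior
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
}

// Hypothetical task used to exercise the helper.
int answer() { return 42; }
```

A caller could check is_deferred(f) right after std::async(answer) and, if it returns true, either call get() at a convenient moment or relaunch the work with std::launch::async.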

std::future — Getting the Result

#include <future>
#include <iostream>
#include <chrono>
#include <thread>

int slow_computation() {
    std::this_thread::sleep_for(std::chrono::seconds(2));
    return 42;
}

int main() {
    auto future = std::async(std::launch::async, slow_computation);

    // Poll for the result: wait_for blocks for at most 100 ms
    auto status = future.wait_for(std::chrono::milliseconds(100));
    if (status == std::future_status::ready) {
        std::cout << "Already done!\n";
    } else if (status == std::future_status::timeout) {
        std::cout << "Still computing...\n";
    } else if (status == std::future_status::deferred) {
        std::cout << "Hasn't started (deferred)\n";
    }

    // Wait without getting (no return value)
    // future.wait();

    // Wait with deadline
    auto deadline = std::chrono::steady_clock::now() + std::chrono::seconds(3);
    future.wait_until(deadline);

    // Get the result (blocks if needed, can only call ONCE)
    int result = future.get();
    std::cout << "Result: " << result << "\n";

    // future.get();  // Undefined behavior: after get(), future.valid() is false
}
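The status values returned by wait_for also make it possible to keep the calling thread busy while a task runs. The following is a rough sketch of that idea; get_while_working, the tick callback, and the default 10 ms slice are all assumptions made for this example.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Calls `tick` once per time slice until the future stops reporting
// timeout, then returns the task's value.
template <typename T, typename Tick>
T get_while_working(std::future<T>& f, Tick tick,
                    std::chrono::milliseconds slice = std::chrono::milliseconds(10)) {
    assert(f.valid());  // waiting on an invalid future is undefined behavior
    while (f.wait_for(slice) == std::future_status::timeout) {
        tick();         // e.g. update a progress indicator
    }
    // Status is now ready or deferred; get() returns the value, running
    // the task first if it was deferred. It may rethrow the task's exception.
    return f.get();
}
```

After the call the future is consumed, so f.valid() is false and it must not be waited on again.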

Exception Propagation

#include <future>
#include <iostream>
#include <stdexcept>

int risky_computation(int x) {
    if (x < 0) {
        throw std::runtime_error("Negative input not allowed");
    }
    return x * 2;
}

int main() {
    auto f1 = std::async(std::launch::async, risky_computation, 10);
    std::cout << "Good result: " << f1.get() << "\n";  // 20

    auto f2 = std::async(std::launch::async, risky_computation, -5);

    try {
        int result = f2.get();  // Exception re-thrown here!
    } catch (const std::runtime_error& e) {
        std::cout << "Caught: " << e.what() << "\n";
    }
    // Exceptions thrown in async tasks are captured by the future
    // and re-thrown when you call .get() — no lost exceptions!
}

std::promise — Manual Future Control

#include <future>
#include <thread>
#include <iostream>
#include <exception>
#include <stdexcept>

int main() {
    // Promise-future pair: promise sets the value, future reads it
    std::promise<int> promise;
    std::future<int> future = promise.get_future();

    std::thread worker([&promise]() {
        // Do some work...
        int result = 42;
        promise.set_value(result);  // Fulfill the promise
    });

    // future.get() blocks until promise is fulfilled
    std::cout << "Result: " << future.get() << "\n";
    worker.join();

    // Promise with exception
    std::promise<int> p2;
    std::future<int> f2 = p2.get_future();

    std::thread t2([&p2]() {
        try {
            throw std::runtime_error("Something failed");
        } catch (...) {
            p2.set_exception(std::current_exception());
        }
    });

    try {
        f2.get();
    } catch (const std::exception& e) {
        std::cout << "Got exception: " << e.what() << "\n";
    }
    t2.join();
}

std::packaged_task

#include <future>
#include <thread>
#include <iostream>
#include <functional>
#include <queue>
#include <vector>

int main() {
    // packaged_task wraps a callable and provides a future
    std::packaged_task<int(int, int)> task([](int a, int b) {
        return a + b;
    });

    std::future<int> result = task.get_future();

    // Run in a thread
    std::thread t(std::move(task), 10, 20);
    std::cout << "Sum: " << result.get() << "\n";  // 30
    t.join();

    // Use case: task queue
    std::queue<std::packaged_task<int()>> task_queue;

    // Enqueue tasks
    for (int i = 0; i < 5; ++i) {
        std::packaged_task<int()> t([i]() { return i * i; });
        task_queue.push(std::move(t));
    }

    // Execute tasks
    std::vector<std::future<int>> futures;
    while (!task_queue.empty()) {
        auto task = std::move(task_queue.front());
        task_queue.pop();
        futures.push_back(task.get_future());
        task();  // Execute
    }

    for (auto& f : futures) {
        std::cout << f.get() << " ";  // 0 1 4 9 16
    }
}

Multiple Parallel Tasks

#include <future>
#include <vector>
#include <iostream>
#include <numeric>
#include <chrono>
#include <iterator>

// Parallel map-reduce (needs random-access iterators and num_tasks > 0)
template<typename Iter, typename Func>
auto parallel_reduce(Iter begin, Iter end, Func f, int num_tasks = 4) {
    auto total = std::distance(begin, end);
    auto chunk = total / num_tasks;

    std::vector<std::future<typename std::iterator_traits<Iter>::value_type>> futures;

    for (int i = 0; i < num_tasks; ++i) {
        auto start = begin + i * chunk;
        auto stop = (i == num_tasks - 1) ? end : start + chunk;

        futures.push_back(std::async(std::launch::async,
            [start, stop, &f]() {
                return f(start, stop);
            }
        ));
    }

    using T = typename std::iterator_traits<Iter>::value_type;
    T result{};
    for (auto& fut : futures) {  // named fut so it does not shadow the callable f
        result += fut.get();
    }
    return result;
}

int main() {
    std::vector<int> data(1'000'000, 1);

    auto start = std::chrono::steady_clock::now();
    auto sum = parallel_reduce(data.begin(), data.end(),
        [](auto begin, auto end) {
            return std::accumulate(begin, end, 0);
        }, 8);
    auto elapsed = std::chrono::steady_clock::now() - start;

    std::cout << "Sum: " << sum << "\n";
    std::cout << "Time: "
              << std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count()
              << "us\n";
}

std::shared_future

#include <future>
#include <thread>
#include <iostream>
#include <vector>

int main() {
    // shared_future: multiple threads can wait for the same result
    std::promise<int> promise;
    std::shared_future<int> shared = promise.get_future().share();

    // Multiple consumers
    std::vector<std::thread> consumers;
    for (int i = 0; i < 5; ++i) {
        consumers.emplace_back([shared, i]() {
            int val = shared.get();  // All get the same value
            std::cout << "Consumer " << i << " got: " << val << "\n";
        });
    }

    // Producer sets the value once
    promise.set_value(42);

    for (auto& t : consumers) t.join();
    // Unlike std::future, shared_future::get() can be called multiple times
}
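A shared_future does not have to carry a value. One common use, sketched below under assumptions, is a std::shared_future<void> that acts as a start gate: every worker waits on its own copy and all are released by a single set_value(). The function name run_after_gate and the started counter are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <thread>
#include <vector>

// Starts `n` workers that all block on one shared_future<void> and proceed
// once the gate is opened. Returns how many workers passed the gate.
int run_after_gate(int n) {
    assert(n >= 0);
    std::promise<void> gate;
    std::shared_future<void> ready = gate.get_future().share();
    std::atomic<int> started{0};

    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([ready, &started] {
            ready.wait();   // each thread waits on its own copy
            ++started;
        });
    }

    gate.set_value();       // releases every waiter
    for (auto& t : workers) t.join();
    return started.load();
}
```

Capturing the shared_future by value matters here: each thread then works with its own copy, which is the usage the type is designed for.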

Real-World Patterns

#include <future>
#include <vector>
#include <string>
#include <iostream>
#include <chrono>
#include <stdexcept>
#include <thread>

// Pattern 1: Parallel API calls
struct ApiResponse { std::string data; int status; };

ApiResponse call_api(const std::string& endpoint) {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    return {endpoint + "_data", 200};
}

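// Pattern 2: First-to-complete (race pattern)
// Polls until one future is ready. Assumes `futures` is non-empty and every
// task was launched with std::launch::async; a deferred task never reports ready.
template<typename T>
T wait_for_first(std::vector<std::future<T>>& futures) {
    while (true) {
        for (auto& f : futures) {
            // Skip futures already consumed by an earlier call
            if (f.valid() &&
                f.wait_for(std::chrono::milliseconds(1)) == std::future_status::ready) {
                return f.get();
            }
        }
    }
}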

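// Pattern 3: Timeout wrapper
// Caveat: the future comes from std::async, so its destructor waits for the
// task. The exception below reaches the caller only after f has finished.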
template<typename Func, typename... Args>
auto with_timeout(std::chrono::milliseconds timeout, Func f, Args... args) {
    auto future = std::async(std::launch::async, f, args...);
    if (future.wait_for(timeout) == std::future_status::timeout) {
        throw std::runtime_error("Operation timed out");
    }
    return future.get();
}

int main() {
    // Parallel API calls
    std::vector<std::string> endpoints = {"/users", "/posts", "/comments"};
    std::vector<std::future<ApiResponse>> futures;

    for (const auto& ep : endpoints) {
        futures.push_back(std::async(std::launch::async, call_api, ep));
    }

    for (auto& f : futures) {
        auto resp = f.get();
        std::cout << resp.data << " (status " << resp.status << ")\n";
    }
}
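Because a future obtained from std::async waits in its destructor, the timeout wrapper in Pattern 3 cannot return to its caller before the task finishes. One possible alternative, sketched here as an assumption rather than a recommended design, runs the task through a std::packaged_task on a detached thread; a future obtained that way does not block on destruction. The name run_with_timeout is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

// Runs `func` on a detached thread and waits at most `timeout` for its
// result. On timeout the task keeps running in the background: it is
// abandoned, not cancelled.
template <typename Func>
auto run_with_timeout(std::chrono::milliseconds timeout, Func func) {
    assert(timeout.count() >= 0);
    using Result = decltype(func());
    std::packaged_task<Result()> task(std::move(func));
    std::future<Result> result = task.get_future();
    std::thread(std::move(task)).detach();

    if (result.wait_for(timeout) != std::future_status::ready) {
        throw std::runtime_error("Operation timed out");
    }
    return result.get();  // rethrows if the task itself threw
}
```

The trade-off is that an abandoned task still consumes a thread until it completes, and it must not touch objects that may be destroyed in the meantime, so anything it needs should be captured by value.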

Performance Considerations

On most implementations, std::async with std::launch::async creates a new thread for each call. Thread creation costs anywhere from microseconds to milliseconds, so for many small tasks this overhead dominates the useful work. Use a thread pool for thousands of short tasks, and std::async for a moderate number of substantial computations.
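One way to keep the task count moderate is to size it from the hardware rather than from the data. The sketch below sums a vector with roughly one task per hardware thread; chunked_sum is a hypothetical name, and the chunking scheme is only one of several reasonable choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` using about one std::async task per hardware thread instead
// of one task per element.
long long chunked_sum(const std::vector<int>& data) {
    // hardware_concurrency() may return 0 when the value is unknown.
    const std::size_t hw = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t tasks = std::max<std::size_t>(1, std::min(hw, data.size()));
    const std::size_t chunk = data.size() / tasks;
    assert(chunk * tasks <= data.size());  // chunks never run past the end

    std::vector<std::future<long long>> parts;
    for (std::size_t i = 0; i < tasks; ++i) {
        auto first = data.begin() + static_cast<std::ptrdiff_t>(i * chunk);
        // The last task also takes the remainder.
        auto last = (i + 1 == tasks) ? data.end()
                                     : first + static_cast<std::ptrdiff_t>(chunk);
        parts.push_back(std::async(std::launch::async, [first, last] {
            return std::accumulate(first, last, 0LL);
        }));
    }

    long long total = 0;
    for (auto& part : parts) total += part.get();
    return total;
}
```

For a million-element vector this launches a handful of tasks, not a million, so thread start-up cost stays small relative to the summation itself.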

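The std::launch::deferred policy creates no thread at all: the task runs on the calling thread the first time .get() or .wait() is called. Use it for lazy evaluation, where the result is computed only if needed. The default policy leaves the choice to the implementation, and what it does (including whether it reuses threads from an internal pool) varies between compilers.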

A future’s destructor blocks if the future came from std::async with an asynchronous launch and the task has not finished yet. Unused futures therefore still wait for completion; they do not cancel the task. There is no standard way to cancel an async task in C++. If you need cancellation, use std::jthread with a stop_token (C++20).

Common Mistakes

#include <future>

// MISTAKE 1: Discarding the future
void fire_and_forget() {
    std::async(std::launch::async, []() {
        // This BLOCKS here because the temporary future
        // is destroyed immediately, and its destructor waits!
    });
    // Not truly fire-and-forget — it's synchronous!
}

// MISTAKE 2: Calling .get() twice
void double_get() {
    auto f = std::async([]() { return 42; });
    f.get();  // OK
    // f.get();  // UNDEFINED BEHAVIOR — future is in invalid state
}

// MISTAKE 3: Assuming the default policy means a new thread
void assumed_parallel() {
    auto f = std::async([]() { return 42; });
    // Default policy might be deferred — might not run in parallel!
    // Use std::launch::async if you need parallelism
}
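A possible remedy for Mistake 1 is to keep the futures alive and hand them to whoever decides when to wait. The sketch below assumes a hypothetical launch_jobs function whose jobs just sleep for 50 ms and return their index.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Launches `count` background jobs and returns their futures so the caller
// chooses when to block. The lambda is a stand-in for real work.
std::vector<std::future<int>> launch_jobs(int count) {
    assert(count >= 0);
    std::vector<std::future<int>> pending;
    pending.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        pending.push_back(std::async(std::launch::async, [i] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            return i;
        }));
    }
    return pending;  // futures are moved out, so nothing blocks here
}
```

The jobs run concurrently while the caller does other work; blocking happens only when the caller calls get() on each future or lets the vector go out of scope.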

Practice Exercises

Exercise 1: Write a function that takes a vector of URLs (strings) and uses std::async to “fetch” them in parallel (simulate with sleep). Return a vector of results. Measure speedup vs sequential.

Exercise 2: Implement a parallel merge sort using std::async. Split the array, sort halves in parallel, then merge.

Exercise 3: Create a timeout_future wrapper that takes a function and a duration. If the function doesn’t complete within the duration, throw a timeout exception.

Exercise 4: Build a simple task scheduler using std::packaged_task and a queue. A worker thread pulls tasks from the queue and executes them. Callers hold futures to get results.

The async/future model is one of the highest-level concurrency abstractions in standard C++. It handles thread management, result transfer, and exception propagation automatically. Combined with threads for fine-grained control and mutexes for shared state, you have everything needed for real-world concurrent programs. Next, we’ll move to build systems with CMake.
