libdav1d Dynamic Threading and CPU Core Scaling

This article explores how the libdav1d AV1 decoder manages its threading model in relation to system CPU cores. It explains the mechanism behind libdav1d’s automatic thread allocation, distinguishes between frame and tile threading, and clarifies whether the library can adjust its thread pool dynamically during runtime.

Automatic CPU Core Detection and Thread Allocation

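By default, libdav1d sizes its thread pool from the number of logical CPU cores it detects on the host system. In dav1d 1.0 and later, this behavior is selected by leaving the n_threads field of Dav1dSettings at 0, which is also the value written by dav1d_default_settings(). The companion max_frame_delay field likewise treats 0 as "choose automatically". Releases before 1.0 exposed separate n_frame_threads and n_tile_threads fields instead; those had to be at least 1 and were not auto-detected.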

When these parameters are set to zero, libdav1d queries the operating system as the decoder context is opened and creates a worker pool sized to the number of logical cores reported, up to the library's internal maximum. This lets the decoder scale without manual tuning on everything from low-power dual-core mobile processors to many-core desktop and server CPUs.
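
The "0 means auto-detect" convention can be sketched as follows. This is an illustration for Unix-like systems, not libdav1d's own detection code: the helper name `resolve_thread_count` and the cap `MAX_POOL_THREADS` are hypothetical, and the library's real detection is platform-specific.

```c
#include <unistd.h>

/* Assumed upper bound for this sketch; not taken from the libdav1d headers. */
#define MAX_POOL_THREADS 256

/* Hypothetical helper: a non-zero request is honored (up to the cap),
 * while 0 falls back to the number of logical cores currently online. */
static unsigned resolve_thread_count(unsigned requested) {
    if (requested != 0)
        return requested < MAX_POOL_THREADS ? requested : MAX_POOL_THREADS;

    long online = sysconf(_SC_NPROCESSORS_ONLN); /* may return -1 on error */
    if (online < 1)
        online = 1;
    if (online > MAX_POOL_THREADS)
        online = MAX_POOL_THREADS;
    return (unsigned)online;
}
```

Calling `resolve_thread_count(0)` on an 8-core machine would be expected to yield 8, whereas `resolve_thread_count(4)` always yields 4 regardless of the hardware.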

The Dual-Threading Model: Frames and Tiles

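To achieve high-speed AV1 decoding, libdav1d combines two forms of parallelism: frame threading, in which several frames are decoded concurrently, and tile threading, in which the independent tiles of a single frame are decoded in parallel.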

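When automatic threading is enabled, libdav1d balances these two methods. In dav1d 1.0 and later, a single worker pool serves both kinds of work, and the number of frames decoded concurrently is derived from the detected core count rather than set separately. Limiting the frames in flight keeps the cores busy while bounding memory use, since every additional in-flight frame needs its own buffers.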
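
To make the balance concrete, the sketch below computes a frame-parallelism value from a thread count. It assumes the heuristic that recent 1.x sources appear to use, roughly the ceiling of the square root of the thread count capped at 8. That is an internal detail which may differ between releases, and `auto_frame_delay` is a hypothetical name, not a libdav1d function.

```c
/* Hypothetical helper illustrating an assumed heuristic:
 * frames in flight = min(8, ceil(sqrt(n_threads))). */
static unsigned auto_frame_delay(unsigned n_threads) {
    unsigned fc = 1;
    /* Find the smallest fc with fc * fc >= n_threads, i.e. ceil(sqrt(n_threads)). */
    while (fc * fc < n_threads)
        fc++;
    return fc < 8 ? fc : 8;
}
```

Under this assumption a 4-thread pool would decode 2 frames in parallel and a 16-thread pool 4, while very large pools would stop at 8, so memory use grows much more slowly than the core count.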

Is Threading Dynamically Adjusted at Runtime?

libdav1d sizes its thread pool once, at initialization, based on the available CPU cores. It does not resize the pool during active decoding if system conditions or the CPU topology change, for example through CPU hotplugging or a change of CPU affinity mid-stream.

Once the decoder context is initialized, the size of the thread pool remains fixed. The work assigned to each thread does not: an internal task scheduler hands decoding tasks (such as symbol decoding, motion compensation, and loop filtering) to whichever workers in the pre-allocated pool are free. As a result, CPU utilization stays balanced throughout the playback session even though the thread count never changes.
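
The combination of a fixed pool and dynamic task distribution can be sketched with POSIX threads. This is a deliberately simplified model (a shared atomic counter from which idle workers claim the next task), not libdav1d's scheduler; `TaskQueue`, `worker_main`, `run_batch`, and `MAX_WORKERS` are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_WORKERS 16 /* arbitrary limit for this sketch */

typedef struct {
    atomic_int next_task; /* index of the next unclaimed task */
    atomic_int done;      /* number of tasks completed so far */
    int n_tasks;          /* total number of tasks in the batch */
} TaskQueue;

/* Each worker keeps claiming the next unclaimed task until none remain,
 * so faster workers naturally end up processing more tasks. */
static void *worker_main(void *arg) {
    TaskQueue *q = arg;
    for (;;) {
        int idx = atomic_fetch_add(&q->next_task, 1);
        if (idx >= q->n_tasks)
            break;
        /* A real decoder would run one unit of decoding work here. */
        atomic_fetch_add(&q->done, 1);
    }
    return NULL;
}

/* Runs n_tasks on a pool whose size is fixed for the whole batch.
 * Returns the number of completed tasks, or -1 on invalid input
 * or if no worker thread could be started. */
static int run_batch(int n_workers, int n_tasks) {
    pthread_t threads[MAX_WORKERS];
    TaskQueue q;
    int started = 0;

    if (n_workers < 1 || n_workers > MAX_WORKERS || n_tasks < 0)
        return -1;

    atomic_init(&q.next_task, 0);
    atomic_init(&q.done, 0);
    q.n_tasks = n_tasks;

    for (int i = 0; i < n_workers; i++) {
        if (pthread_create(&threads[i], NULL, worker_main, &q) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started > 0 ? atomic_load(&q.done) : -1;
}
```

The pool size is chosen once in `run_batch` and never changes, yet no task is bound to a particular thread in advance, which is the property the text describes.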

Configuration for Developers

For developers integrating libdav1d, automatic thread scaling is controlled via the Dav1dSettings structure.

To enable automatic CPU-based scaling, the settings should be configured as follows:

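#include <dav1d/dav1d.h>

Dav1dSettings settings;
dav1d_default_settings(&settings);

// dav1d 1.0+ fields; releases before 1.0 used n_frame_threads / n_tile_threads instead
// 0 = size the worker pool from the detected logical core count
settings.n_threads = 0;
// 0 = derive the number of frames decoded in parallel from n_threads
settings.max_frame_delay = 0;

// Initialize the decoder context with these settings
Dav1dContext *c = NULL;
int ret = dav1d_open(&c, &settings);
if (ret < 0) {
    // dav1d_open() returns a negative error code on failure; handle it here
}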

Letting libdav1d manage these values gives applications sensible thread counts across diverse hardware configurations without manual profiling.