libdav1d CDF Updates in AV1 Entropy Decoding
This article provides a technical overview of how the
libdav1d AV1 decoder manages Cumulative Distribution
Function (CDF) updates during entropy decoding. It explains the purpose
of CDFs in multi-symbol arithmetic coding, the mathematical formulas
used to update symbol probabilities, and the highly optimized software
techniques libdav1d employs to perform these updates
efficiently.
The Role of CDFs in AV1 Entropy Decoding
Unlike older video codecs that rely on binary arithmetic coding (such as CABAC in H.264 and H.265), the AV1 standard uses multi-symbol arithmetic coding. This allows the decoder to process symbols with up to 16 possible values (alphabet sizes) in a single step.
To determine the probability of each symbol, the decoder utilizes a Cumulative Distribution Function (CDF). A CDF is an array of 16-bit integers where each element represents the cumulative probability of a symbol occurring in a given context. Because video statistics change dynamically, these CDFs must be constantly updated during the decoding process to match the local characteristics of the video frame.
The CDF Update Mechanism
In AV1, entropy decoding and CDF updates are tightly coupled. Each time a symbol is decoded using a specific context, the corresponding CDF is immediately updated. The update process shifts the probability distribution toward the decoded symbol, making subsequent occurrences of that symbol more expected.
The mathematical update for each element \(i\) of the CDF array is defined by the following formula:
\[\text{CDF}[i] = \text{CDF}[i] + ((\text{Target}[i] - \text{CDF}[i]) \gg \text{rate})\]
Where: * Target[i] is \(32768\) (which represents \(100\%\) probability in 15-bit precision) if \(i\) is greater than or equal to the decoded symbol index, and \(0\) otherwise. * rate is the adaptation factor that controls how quickly the CDF adapts to new statistics. This rate is dynamically calculated based on the number of times the context has been used (tracked by a counter associated with the CDF). * \(\gg\) represents a bitwise right-shift operation.
As a context is used more frequently, the adaptation rate increases (meaning the shift value increases), which makes the updates smaller and more stable over time.
Performance Optimization in libdav1d
CDF updates are one of the most computationally expensive parts of
AV1 decoding because they occur millions of times per second. To achieve
its status as the fastest AV1 software decoder, libdav1d
implements several critical optimizations in its Multi-Symbol Arithmetic
Decoder (msac) module:
SIMD Vectorization
Since CDF arrays are small (typically containing between 2 and 16
elements), they are ideal targets for Single Instruction, Multiple Data
(SIMD) vectorization. libdav1d features hand-written
assembly language implementations (using AVX2, AVX-512, and ARM NEON)
that update the entire CDF array in parallel. Instead of looping through
each symbol probability individually, a vector processor can perform the
subtraction, shifting, and addition for all CDF elements in a single
cycle.
Tight Integration with Symbol Decoding
To minimize memory latency, libdav1d combines the symbol
decoding step and the CDF update step into a single pipeline. In the
assembly-optimized decoding functions, the decoded symbol index is used
immediately to construct the target mask, and the vector registers
containing the CDF are updated before they are written back to L1
cache.
Dynamic Counter Handling
The adaptation rate is governed by a usage counter prepended to the
CDF memory block. libdav1d handles this counter efficiently
by combining the increment operation and the rate determination step
into simple bitwise operations, avoiding slow conditional branches in
the hot execution path.