How libdav1d Decodes Monochrome AV1 Video
This article explains how the libdav1d decoder handles monochrome (grayscale) AV1 video. It covers how grayscale streams are detected, the pixel layout the decoder uses to represent them, and the work the decoder saves by skipping chroma processing.
In the AV1 specification, monochrome video is signaled by setting the
mono_chrome flag to 1 in the sequence header. When libdav1d
parses this header, it recognizes that the stream contains only luma (Y)
data and no chroma (U and V) data. The decoder then configures its
internal pipeline to process a single plane of pixel data instead of the
standard three planes used in color video.
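The mapping from sequence-header fields to a plane count and layout can be sketched as follows. This is a minimal illustration, not libdav1d's parser: the SketchColorConfig struct, the SketchLayout enum, and both helper functions are hypothetical names invented for this example. The field names mono_chrome, subsampling_x, and subsampling_y follow the AV1 specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; not libdav1d's actual types. */
typedef enum { LAYOUT_I400, LAYOUT_I420, LAYOUT_I422, LAYOUT_I444 } SketchLayout;

typedef struct {
    bool mono_chrome;   /* mono_chrome flag from the sequence header */
    int subsampling_x;  /* 1 if chroma is halved horizontally */
    int subsampling_y;  /* 1 if chroma is halved vertically */
} SketchColorConfig;

/* Pick a pixel layout: monochrome wins over any subsampling setting. */
static SketchLayout sketch_layout(const SketchColorConfig *c) {
    assert(c != NULL);
    if (c->mono_chrome) return LAYOUT_I400;
    if (c->subsampling_x && c->subsampling_y) return LAYOUT_I420;
    if (c->subsampling_x) return LAYOUT_I422;
    return LAYOUT_I444;
}

/* One plane (Y) for monochrome, three planes (Y, U, V) otherwise. */
static int sketch_num_planes(const SketchColorConfig *c) {
    assert(c != NULL);
    return c->mono_chrome ? 1 : 3;
}
```

Checking mono_chrome first means the subsampling fields never influence the result for a grayscale stream, which matches the idea that there is no chroma to subsample.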
To represent monochrome frames in its output, libdav1d uses the
DAV1D_PIXEL_LAYOUT_I400 pixel layout. Under this layout, the decoder
allocates memory for the luma plane only. Compared with a 4:2:0 frame
of the same size and bit depth, this removes the two quarter-size
chroma planes, or about one third of the sample storage. Applications
integrating libdav1d should check for this layout, read only the luma
plane, and render the grayscale output without performing unnecessary
color-space conversions.
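The memory effect of a luma-only layout can be estimated with a short calculation. The sketch below assumes tightly packed planes with no stride padding or alignment, which real allocators normally add, so treat the numbers as lower bounds. FrameLayout and frame_bytes are hypothetical names for this example; libdav1d's own layout enum is Dav1dPixelLayout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout tags for this sketch. */
typedef enum { FRAME_I400, FRAME_I420, FRAME_I422, FRAME_I444 } FrameLayout;

/* Bytes needed for one tightly packed frame (no stride padding or alignment).
 * Samples deeper than 8 bits are assumed to occupy 2 bytes each. */
static size_t frame_bytes(FrameLayout layout, size_t w, size_t h, int bpc) {
    assert(bpc == 8 || bpc == 10 || bpc == 12);
    const size_t bytes_per_sample = bpc > 8 ? 2 : 1;
    const size_t luma = w * h * bytes_per_sample;
    size_t cw, ch; /* chroma plane dimensions, rounded up for odd sizes */
    switch (layout) {
    case FRAME_I400: return luma;  /* no U or V plane at all */
    case FRAME_I420: cw = (w + 1) / 2; ch = (h + 1) / 2; break;
    case FRAME_I422: cw = (w + 1) / 2; ch = h; break;
    default:         cw = w; ch = h; break;  /* FRAME_I444 */
    }
    return luma + 2 * cw * ch * bytes_per_sample;
}
```

For a 1920x1080 frame at 8 bits per sample, this gives 2,073,600 bytes for the luma-only layout against 3,110,400 bytes for 4:2:0, so the monochrome frame needs two thirds of the storage.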
Processing monochrome video also saves work inside libdav1d. Because there are no chroma planes, the decoder skips chroma-specific decoding steps such as chroma prediction and chroma residual reconstruction. The in-loop filters (the deblocking filter, the Constrained Directional Enhancement Filter (CDEF), and loop restoration) run only on the luma plane. With no U and V work to do, libdav1d spends less CPU time per frame than it would on a color stream of the same resolution and bit depth; the size of the gain depends on the content and on the chroma subsampling of the color stream being compared.
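The per-plane structure of the in-loop filters can be illustrated with a small model. This is a sketch of the idea only, not libdav1d's filter code: FilterLog, filter_frame, and total_passes are hypothetical names, and a counter increment stands in for each real filter kernel.

```c
#include <assert.h>
#include <stddef.h>

enum { STAGE_DEBLOCK, STAGE_CDEF, STAGE_RESTORATION, STAGE_COUNT };

/* Hypothetical bookkeeping: calls[stage][plane] counts filter passes. */
typedef struct { int calls[STAGE_COUNT][3]; } FilterLog;

/* Run every filter stage over every plane that exists in the frame. */
static void filter_frame(FilterLog *log, int mono_chrome) {
    assert(log != NULL);
    const int num_planes = mono_chrome ? 1 : 3;
    for (int stage = 0; stage < STAGE_COUNT; stage++)
        for (int plane = 0; plane < num_planes; plane++)
            log->calls[stage][plane]++;  /* stand-in for the real filter kernel */
}

/* Total number of (stage, plane) passes recorded in the log. */
static int total_passes(const FilterLog *log) {
    assert(log != NULL);
    int total = 0;
    for (int stage = 0; stage < STAGE_COUNT; stage++)
        for (int plane = 0; plane < 3; plane++)
            total += log->calls[stage][plane];
    return total;
}
```

In this model a monochrome frame triggers three passes (one per stage, luma only) while a color frame triggers nine. The pass count overstates the real saving, since subsampled chroma planes are smaller than the luma plane and cost less to filter.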