How libdav1d Decodes Monochrome AV1 Video

This article explains how the libdav1d decoder handles monochrome (grayscale) AV1 streams. It covers how the decoder detects a grayscale stream, the pixel layout it uses to represent the decoded frames, and the work it saves by skipping chroma processing.

In the AV1 specification, monochrome video is signaled by setting the mono_chrome flag to 1 in the color configuration of the sequence header. The flag can be set in the Main and Professional profiles; the High profile always carries chroma. When libdav1d parses this header, it recognizes that the stream contains only luma (Y) data and no chroma (U and V) data. The decoder then configures its internal pipeline to process a single plane of pixel data instead of the three planes used for color video.
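The detection rule can be sketched in a few lines of C. This is an illustration of the rule in the AV1 specification, not libdav1d's parser: the function names resolve_mono_chrome and num_planes are hypothetical, and the sketch assumes the raw flag bit has already been read from the bitstream.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a libdav1d function.
 * AV1 color_config() rule: seq_profile 1 (High) never codes the bit and
 * infers mono_chrome = 0; profiles 0 (Main) and 2 (Professional) read it. */
static inline bool resolve_mono_chrome(int seq_profile, bool coded_bit) {
    assert(seq_profile >= 0 && seq_profile <= 2);
    return seq_profile == 1 ? false : coded_bit;
}

/* NumPlanes as defined by the AV1 specification:
 * 1 plane for monochrome, otherwise 3 (Y, U, V). */
static inline int num_planes(bool mono_chrome) {
    return mono_chrome ? 1 : 3;
}
```

A decoder that derives its plane count this way can drive every later per-plane loop from a single value, which is what lets the chroma work drop out for grayscale streams.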

To represent monochrome frames in its output, libdav1d uses the DAV1D_PIXEL_LAYOUT_I400 pixel layout. Under this layout, the decoder allocates memory only for the luma plane, which reduces the memory footprint of each decoded frame; compared with 4:2:0 output at the same resolution and bit depth, the pixel data is one third smaller. Applications integrating libdav1d can check the layout reported with each picture and render the luma plane directly as grayscale, without chroma upsampling or a full YUV-to-RGB conversion.
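On the application side, consuming an I400 picture mostly comes down to reading one strided plane. The sketch below assumes 8-bit samples and a positive stride; pack_gray8 is a hypothetical helper, not part of libdav1d. In a real integration the pointer and stride would come from the luma entries of the decoded picture (data[0] and stride[0] in Dav1dPicture), and higher bit depths would need 16-bit handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy an 8-bit luma plane whose rows may be padded
 * into a tightly packed grayscale buffer.
 *   luma   - first row of the plane
 *   stride - bytes between row starts (assumed >= width here)
 *   dst    - at least width * height bytes */
static inline void pack_gray8(const uint8_t *luma, ptrdiff_t stride,
                              int width, int height, uint8_t *dst) {
    assert(luma && dst && width > 0 && height > 0 && stride >= width);
    for (int y = 0; y < height; y++) {
        /* Copy only the visible samples of each row, skipping the padding. */
        memcpy(dst + (size_t)y * (size_t)width,
               luma + y * stride,
               (size_t)width);
    }
}
```

The packed buffer can then be handed to any API that accepts 8-bit grayscale images. There is no second or third plane to look up.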

Monochrome streams are also cheaper to decode. Because there are no chroma planes, libdav1d skips chroma-specific decoding steps entirely, including chroma prediction and chroma residual reconstruction. The in-loop filters, which are among the more expensive stages (the deblocking filter, the Constrained Directional Enhancement Filter (CDEF), and loop restoration), run only on the luma plane. With no work to do for the U and V planes, libdav1d uses less CPU time and decodes faster than it would for a color stream of the same resolution. In 4:2:0 video, the two chroma planes together hold half as many samples as the luma plane, so a monochrome frame has one third fewer samples to process; the actual speedup depends on the content and the build.
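A rough way to gauge the saving is to count the samples each layout puts through reconstruction and filtering. The sketch below uses a stand-in Layout enum and a hypothetical samples_per_frame function, neither of which is libdav1d code, and rounds odd chroma dimensions up. Sample count is only a proxy, since luma and chroma do not cost the same per sample.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in layout enum for this sketch; libdav1d defines its own
 * Dav1dPixelLayout type. */
typedef enum { LAYOUT_I400, LAYOUT_I420, LAYOUT_I422, LAYOUT_I444 } Layout;

/* Hypothetical helper: total samples per frame across all planes. */
static inline uint64_t samples_per_frame(int width, int height, Layout layout) {
    assert(width > 0 && height > 0);
    const uint64_t luma = (uint64_t)width * (uint64_t)height;
    if (layout == LAYOUT_I400)
        return luma;                            /* luma plane only */
    const int ss_hor = layout != LAYOUT_I444;   /* 1 if chroma is halved horizontally */
    const int ss_ver = layout == LAYOUT_I420;   /* 1 if chroma is halved vertically */
    const uint64_t cw = (uint64_t)((width + ss_hor) >> ss_hor);
    const uint64_t ch = (uint64_t)((height + ss_ver) >> ss_ver);
    return luma + 2 * cw * ch;                  /* Y plus U and V */
}
```

For a 1920x1080 frame this gives 3,110,400 samples for I420 against 2,073,600 for I400. The gap is wider against 4:2:2 and 4:4:4 sources.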