Parse AV1 Headers with libdav1d Without Decoding

This article explains how to use libdav1d, the open-source AV1 decoder, to parse AV1 stream headers and metadata without decoding any pixel data. It covers the relevant API function, the workflow around it, and the benefits of lightweight header parsing for media inspection, demuxing, and stream analysis.

Parsing AV1 Headers with libdav1d

libdav1d can parse AV1 headers without decoding any pixel data. AV1 bitstreams are structured into Open Bitstream Units (OBUs). The metadata needed to understand a stream’s properties (resolution, profile, level, color space, and bit depth) is stored in specific control OBUs, primarily the Sequence Header OBU.

Because libdav1d separates bitstream parsing from the heavy lifting of pixel reconstruction, you can extract this metadata efficiently.
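
To make the OBU structure concrete, here is a minimal sketch in plain C that walks a buffer of OBUs and reports where the first Sequence Header OBU starts. It assumes the OBU header layout from the AV1 specification (a one-byte header carrying a 4-bit obu_type, an extension flag, and a has_size_field flag, followed by an optional LEB128 payload size), with obu_type 1 denoting a sequence header. The names read_leb128 and find_sequence_header_obu are hypothetical helpers for illustration, not part of libdav1d.

```c
#include <stddef.h>
#include <stdint.h>

enum { OBU_SEQUENCE_HEADER = 1 };  /* obu_type value from the AV1 spec */

/* Decode a LEB128 value (at most 8 bytes, the limit the AV1 spec allows).
 * Returns the number of bytes consumed, or 0 on truncated/overlong input. */
static size_t read_leb128(const uint8_t *p, size_t avail, uint64_t *value)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8 && i < avail; i++) {
        v |= (uint64_t)(p[i] & 0x7f) << (7 * i);
        if (!(p[i] & 0x80)) {       /* high bit clear: last byte */
            *value = v;
            return i + 1;
        }
    }
    return 0;
}

/* Walk the OBUs in buf and return the offset of the first Sequence Header
 * OBU (its header byte included), or -1 if none is found or the data is
 * malformed. On success, *obu_len receives the total length of that OBU. */
static long find_sequence_header_obu(const uint8_t *buf, size_t sz,
                                     size_t *obu_len)
{
    size_t pos = 0;
    while (pos < sz) {
        const uint8_t hdr = buf[pos];
        if (hdr & 0x80)
            return -1;                          /* forbidden bit must be 0 */

        const int type     = (hdr >> 3) & 0x0f;
        const int has_ext  = (hdr >> 2) & 1;
        const int has_size = (hdr >> 1) & 1;

        size_t hdr_len = 1 + (size_t)has_ext;
        if (pos + hdr_len > sz)
            return -1;

        /* Without a size field, the OBU runs to the end of the buffer. */
        uint64_t payload = sz - pos - hdr_len;
        if (has_size) {
            const size_t n = read_leb128(buf + pos + hdr_len,
                                         sz - pos - hdr_len, &payload);
            if (n == 0)
                return -1;
            hdr_len += n;
        }
        if (payload > sz - pos - hdr_len)
            return -1;                          /* size points past the end */

        if (type == OBU_SEQUENCE_HEADER) {
            *obu_len = hdr_len + (size_t)payload;
            return (long)pos;
        }
        pos += hdr_len + (size_t)payload;       /* skip to the next OBU */
    }
    return -1;
}
```

A walk like this is mainly useful for inspection tools that want to know which OBUs a packet contains. The documentation comments in libdav1d's public header indicate that its parser ignores OBUs other than sequence headers, so pre-filtering the buffer this way should be optional rather than required.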

The API Approach: dav1d_parse_sequence_header

To extract stream parameters without initializing a full decoding pipeline, libdav1d exposes a dedicated function in its public API:

int dav1d_parse_sequence_header(Dav1dSequenceHeader *out, const uint8_t *buf, size_t sz);

This function lets you skip creating a full decoder context (Dav1dContext), which would otherwise allocate threads and picture buffers.

Step-by-Step Workflow

  1. Locate the Sequence Header OBU: Extract the raw bitstream packet containing the Sequence Header OBU from your media container (such as an MP4, MKV, or WebM file) or your network stream.
  2. Allocate the Header Structure: Define a Dav1dSequenceHeader structure to hold the parsed metadata.
  3. Call the Parser: Pass a pointer to the raw OBU buffer and its size to dav1d_parse_sequence_header().
  4. Extract Metadata: Once the function returns 0, the Dav1dSequenceHeader structure is populated (a negative return value signals an error). Useful fields include:
    • profile, plus the level of each operating point (operating_points[i].major_level and minor_level)
    • max_width and max_height
    • layout (chroma subsampling format, e.g., 4:2:0, 4:2:2, 4:4:4)
    • hbd (bit depth code: 0, 1, or 2 for 8, 10, or 12 bits per component)
    • Color description metadata: pri (color primaries), trc (transfer characteristics), and mtrx (matrix coefficients)
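
Some of the parsed values need a small amount of interpretation before they are shown to a user. The sketch below assumes the field semantics described in libdav1d's headers.h: bit depth is carried in the hbd field as a code (0, 1, or 2 for 8, 10, or 12 bits per component), and the level is stored per operating point as a major_level/minor_level pair. The helpers take plain integers so the example compiles without libdav1d; bit_depth_from_hbd and format_level are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Map libdav1d's hbd code (0, 1, 2) to bits per component (8, 10, 12).
 * Returns -1 for values outside that range. */
static int bit_depth_from_hbd(int hbd)
{
    return (hbd >= 0 && hbd <= 2) ? 8 + 2 * hbd : -1;
}

/* Format an AV1 level such as "4.1" from the major/minor pair stored in
 * an operating point. Returns the number of characters written
 * (excluding the terminating NUL), as snprintf does. */
static int format_level(char *dst, size_t dst_sz,
                        int major_level, int minor_level)
{
    return snprintf(dst, dst_sz, "%d.%d", major_level, minor_level);
}
```

With a populated header named seq, these would be called as bit_depth_from_hbd(seq.hbd) and format_level(text, sizeof text, seq.operating_points[0].major_level, seq.operating_points[0].minor_level).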

Benefits of Header-Only Parsing