How libdav1d Handles Frame Reordering and Delayed Output

This article explains how the high-performance AV1 decoder, libdav1d, manages delayed output and frame reordering. It explores the relationship between input bitstream order and output display order, detailing how the decoder utilizes internal queuing, picture buffering, and its asynchronous API to deliver decoded frames in the correct sequence while minimizing latency.

Decode Order vs. Display Order in AV1

Like most modern video compression standards, AV1 uses inter-frame prediction. This means frames are not always compressed in the order they are meant to be displayed. For example, a future frame (like an alt-ref frame) may be decoded first so it can be used as a reference for preceding frames.

Because of this non-chronological decoding process, the decoder cannot simply output every frame as soon as it is parsed from the bitstream. A frame that is decoded early but displayed later must be stored in a buffer and output only when the bitstream signals that it should be shown, so that frames reach the user in the correct chronological sequence.

The Decoded Picture Buffer (DPB) and Reference Management

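To manage frame reordering, libdav1d keeps decoded pictures in reference-counted buffers that together play the role of a Decoded Picture Buffer (DPB). AV1 defines eight reference frame slots for this purpose. The DPB serves two primary purposes:

1. Reference Storage: Holding decoded frames that future frames need for temporal prediction.
2. Reordering Buffer: Holding decoded frames that are not shown immediately until the bitstream signals that they should be output.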

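AV1 frame headers contain syntax elements such as the show_frame and show_existing_frame flags, alongside an order_hint value that describes the frame's relative position in output order. Presentation timestamps (PTS) are not part of the AV1 bitstream itself; they come from the container, and the application can attach them to the input data so that libdav1d passes them through to the matching output picture. libdav1d reads the flags to determine whether a decoded frame should be output immediately (show_frame = 1), kept in the DPB only as a reference (show_frame = 0), or output later from a reference slot (show_existing_frame = 1).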
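The effect of these flags on output order can be illustrated with a small model. This is a sketch under simplifying assumptions, not libdav1d code: ToyHeader, toy_output_order(), and the single slot field are hypothetical, and a real AV1 frame header refreshes slots through a bitmask rather than one index.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REF_SLOTS 8  /* AV1 defines eight reference frame slots */

/* Hypothetical, heavily simplified frame header (illustration only). */
typedef struct {
    int id;                  /* label of the coded frame, e.g. its display index */
    int show_frame;          /* 1: output right after decoding */
    int show_existing_frame; /* 1: decode nothing, output a stored slot instead */
    int slot;                /* slot to store into, or slot to show */
} ToyHeader;

/* Walks the headers in bitstream order and writes the ids of the frames
 * that get output, in output order, to `out`. Returns how many were output. */
static size_t toy_output_order(const ToyHeader *hdr, size_t n, int *out)
{
    int slots[NUM_REF_SLOTS];
    size_t n_out = 0;

    for (int i = 0; i < NUM_REF_SLOTS; i++)
        slots[i] = -1; /* -1 marks an empty slot */

    for (size_t i = 0; i < n; i++) {
        assert(hdr[i].slot >= 0 && hdr[i].slot < NUM_REF_SLOTS);

        if (hdr[i].show_existing_frame) {
            /* Output a frame that was decoded earlier and kept as a reference. */
            assert(slots[hdr[i].slot] != -1);
            out[n_out++] = slots[hdr[i].slot];
            continue;
        }

        slots[hdr[i].slot] = hdr[i].id; /* "decode" and store as a reference */
        if (hdr[i].show_frame)
            out[n_out++] = hdr[i].id;   /* shown frames are output at once */
    }
    return n_out;
}
```

Feeding this model a shown key frame (id 0), a hidden frame (id 4), three shown frames (ids 1 to 3), and finally a show_existing_frame header pointing at the hidden frame's slot yields the output order 0, 1, 2, 3, 4, even though frame 4 was decoded second.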

The Asynchronous API Pipeline: Send and Receive

libdav1d handles the delay caused by frame reordering using an asynchronous “push/pull” API model, in which submitting input and retrieving output are separate calls that need not alternate one-to-one. This pipeline relies on two primary functions: dav1d_send_data() and dav1d_get_picture().

1. Feeding the Decoder (dav1d_send_data)

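The host application submits compressed AV1 bitstream packets to the decoder using dav1d_send_data(). On success the function returns 0 and the data has been consumed, so the decoder now holds its own reference to it. If the decoder cannot accept more input yet, the function returns DAV1D_ERR(EAGAIN) and the data is not consumed; the application should call dav1d_get_picture() to retrieve pending frames and then submit the same data again.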

2. Retrieving Decoded Frames (dav1d_get_picture)

To retrieve decoded, reordered frames, the application calls dav1d_get_picture().

* If a frame is ready for display: The function returns 0 and populates a Dav1dPicture structure with the decoded frame in the correct presentation order. The application releases the picture with dav1d_picture_unref() once it is done with it.
* If more data is required: If no frame can be output yet, dav1d_get_picture() returns DAV1D_ERR(EAGAIN), a negative error code. This indicates that the application must feed more input data via dav1d_send_data() before an output frame can be released.
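The send and receive handshake can be sketched with a toy model. Nothing below calls libdav1d: ToyDecoder, toy_send_data(), toy_get_picture(), toy_decode(), and the TOY_DELAY pipeline depth are hypothetical stand-ins, and plain -EAGAIN takes the place of DAV1D_ERR(EAGAIN). The part meant to carry over is the shape of the loop: a packet is only advanced once it has been consumed, and EAGAIN on either side is treated as "try the other call", not as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_DELAY 2 /* hypothetical pipeline depth, in frames */

/* Hypothetical stand-in for a decoder context: frames still in flight. */
typedef struct {
    int inflight[TOY_DELAY];
    int count;
} ToyDecoder;

/* Send side: 0 if the frame was consumed, -EAGAIN if the pipeline is full. */
static int toy_send_data(ToyDecoder *d, int frame_id)
{
    if (d->count == TOY_DELAY)
        return -EAGAIN;
    d->inflight[d->count++] = frame_id;
    return 0;
}

/* Receive side: 0 and a picture once the pipeline is full,
 * -EAGAIN while it is still filling. */
static int toy_get_picture(ToyDecoder *d, int *pic)
{
    if (d->count < TOY_DELAY)
        return -EAGAIN;
    *pic = d->inflight[0];
    for (int i = 1; i < d->count; i++)
        d->inflight[i - 1] = d->inflight[i];
    d->count--;
    return 0;
}

/* Feeds n input frames and collects whatever comes out along the way.
 * Returns the number of pictures written to `out`. */
static size_t toy_decode(ToyDecoder *d, const int *in, size_t n, int *out)
{
    size_t i = 0, n_out = 0;

    while (i < n) {
        int res = toy_send_data(d, in[i]);
        if (res == 0) {
            i++;                     /* consumed: move on to the next packet */
        } else {
            assert(res == -EAGAIN);  /* not consumed: retry it after a get */
        }

        int pic;
        res = toy_get_picture(d, &pic);
        if (res == 0) {
            out[n_out++] = pic;      /* a picture was ready */
        } else {
            assert(res == -EAGAIN);  /* nothing ready yet: keep feeding */
        }
    }
    return n_out;
}
```

With five input frames and a depth of two, this loop returns four pictures and leaves the last frame inside the toy decoder, which is the delayed output that has to be collected at the end of the stream.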

This handshaking process naturally manages the “delayed output” latency inherent in temporal video coding.

Flushing the Delayed Frames

At the end of a video stream, or when a user performs a seek operation, there may be decoded frames left inside the decoder that have not yet been output. The two cases are handled differently:

* End of stream (drain): The application stops sending data and calls dav1d_get_picture() repeatedly in a loop. The decoder outputs all remaining buffered frames in their proper display order until the function returns DAV1D_ERR(EAGAIN), which indicates that no frames are left in the pipeline.
* Seek (flush): The application calls dav1d_flush(), which discards the delayed frames and clears the internal decoder state so that decoding can restart cleanly at the new position.
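The difference between collecting the leftover frames and throwing them away can be sketched as follows. This is a toy model, not libdav1d code: ToyDecoder, toy_get_picture(), toy_drain(), and toy_flush() are hypothetical names, loosely modelled on calling dav1d_get_picture() until it reports EAGAIN at the end of a stream and on calling dav1d_flush() before a seek.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX 8 /* hypothetical capacity of the toy buffer */

/* Hypothetical stand-in: decoded pictures not yet output, in display order. */
typedef struct {
    int buffered[TOY_MAX];
    int count;
} ToyDecoder;

/* Hands out the next buffered picture, or -EAGAIN once nothing is left. */
static int toy_get_picture(ToyDecoder *d, int *pic)
{
    if (d->count == 0)
        return -EAGAIN;
    *pic = d->buffered[0];
    for (int i = 1; i < d->count; i++)
        d->buffered[i - 1] = d->buffered[i];
    d->count--;
    return 0;
}

/* End of stream: keep pulling until the decoder reports EAGAIN.
 * Returns the number of pictures written to `out`. */
static size_t toy_drain(ToyDecoder *d, int *out)
{
    size_t n = 0;
    int pic;

    while (toy_get_picture(d, &pic) == 0)
        out[n++] = pic;
    return n;
}

/* Seek: discard everything that is still buffered. */
static void toy_flush(ToyDecoder *d)
{
    d->count = 0;
}
```

Draining suits the end of playback, where every remaining frame should still be shown. Flushing suits a seek, where the frames belong to the old position and showing them would be wrong.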