How libdav1d Handles Frame Reordering and Delayed Output
This article explains how the high-performance AV1 decoder,
libdav1d, manages delayed output and frame reordering. It
explores the relationship between input bitstream order and output
display order, detailing how the decoder utilizes internal queuing,
picture buffering, and its asynchronous API to deliver decoded frames in
the correct sequence while minimizing latency.
Decode Order vs. Display Order in AV1
Like most modern video compression standards, AV1 uses inter-frame prediction, and an encoder may code frames in an order that differs from the order in which they are displayed. For example, a frame from later in the display sequence (such as an alt-ref frame) may be decoded first so it can serve as a reference for frames that are displayed before it.
Because of this non-chronological decoding process, the decoder cannot simply output frames in the order they are parsed from the bitstream. It must decode the frames, store them in a buffer, and reorder them so they can be presented to the user in the correct chronological sequence.
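The general idea of holding frames until their turn comes can be sketched in a few lines of C. This is a generic illustration of in-order release, not libdav1d's internal algorithm; the function name release_in_order and the MAX_FRAMES limit are assumptions made for the example.

```c
#include <stddef.h>

#define MAX_FRAMES 16  /* arbitrary limit chosen for this sketch */

/* Releases frames in display order while they arrive in decode order.
 * decode_order[i] is the display index of the i-th decoded frame and must
 * form a permutation of 0..n-1, with n <= MAX_FRAMES.
 * The output sequence is written to out; the return value is the peak
 * number of frames that had to be held at the same time. */
static int release_in_order(const int *decode_order, int n, int *out) {
    int ready[MAX_FRAMES] = {0};
    int next = 0, held = 0, peak = 0, n_out = 0;

    for (int i = 0; i < n; i++) {
        ready[decode_order[i]] = 1;   /* frame is decoded and now buffered */
        held++;
        if (held > peak) peak = held;

        /* Output every frame whose turn has come. */
        while (next < n && ready[next]) {
            out[n_out++] = next++;
            held--;
        }
    }
    return peak;
}
```

With a decode order of 0, 3, 1, 2, frame 3 is decoded early and waits in the buffer until frames 1 and 2 have been output, so at most two frames are held at once.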
The Decoded Picture Buffer (DPB) and Reference Management
To manage frame reordering, libdav1d implements a Decoded Picture Buffer (DPB). The DPB serves two primary purposes:
1. Reference Storage: Holding decoded frames that future frames need for temporal prediction.
2. Reordering Buffer: Holding decoded frames until their presentation time arrives.
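AV1 frame headers contain syntax elements such as the show_frame and show_existing_frame flags, alongside an order hint (order_hint) that records each frame's position in display order. Presentation timestamps (PTS) normally come from the container rather than from the bitstream; the application attaches them to the input data, and libdav1d passes them through to the output picture. libdav1d reads the flags to determine whether a decoded frame should be displayed right away or merely kept in the DPB as a reference until a later show_existing_frame header outputs it.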
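The interplay of these flags can be illustrated with a toy model. The syntax element names (show_frame, show_existing_frame, frame_to_show_map_idx, refresh_frame_flags) and the eight reference slots come from the AV1 specification; the types ToyFrameHeader and ToyDpb, the frame_id label, and the function toy_process are hypothetical and greatly simplified, and they do not reflect libdav1d's internal data structures.

```c
#include <stdint.h>

#define NUM_REF_SLOTS 8   /* AV1 defines eight reference frame slots */
#define NO_OUTPUT (-1)

/* Hypothetical, reduced view of an AV1 frame header. */
typedef struct {
    int frame_id;                /* made-up label for the decoded frame */
    int show_frame;              /* 1: output right after decoding */
    int show_existing_frame;     /* 1: output a frame already in a slot */
    int frame_to_show_map_idx;   /* slot used when show_existing_frame is 1 */
    uint8_t refresh_frame_flags; /* bit i set: store this frame in slot i */
} ToyFrameHeader;

typedef struct {
    int slot[NUM_REF_SLOTS];     /* frame_id held in each reference slot */
} ToyDpb;

/* Processes one header in bitstream order and returns the frame_id that is
 * output, or NO_OUTPUT when the frame is only kept as a reference. */
static int toy_process(ToyDpb *dpb, const ToyFrameHeader *h) {
    if (h->show_existing_frame)
        return dpb->slot[h->frame_to_show_map_idx];

    for (int i = 0; i < NUM_REF_SLOTS; i++)
        if (h->refresh_frame_flags & (1u << i))
            dpb->slot[i] = h->frame_id;

    return h->show_frame ? h->frame_id : NO_OUTPUT;
}
```

Feeding this model a shown key frame (id 0), a hidden alt-ref frame stored in slot 1 (id 1), two shown inter frames (ids 2 and 3), and finally a show_existing_frame header pointing at slot 1 produces the output order 0, 2, 3, 1.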
The Asynchronous API Pipeline: Send and Receive
libdav1d handles the delay caused by frame reordering
using a non-blocking, asynchronous “push/pull” API model. This pipeline
relies on two primary functions: dav1d_send_data() and
dav1d_get_picture().
1. Feeding the Decoder (dav1d_send_data)
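The host application inputs compressed AV1 bitstream packets into the decoder using dav1d_send_data(). On success (a return value of 0), the function consumes the data and takes over the caller's reference to it. If it returns DAV1D_ERR(EAGAIN), the decoder cannot accept more input yet: the application keeps the data, retrieves one or more pictures with dav1d_get_picture(), and then submits the same data again.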
2. Retrieving Decoded Frames (dav1d_get_picture)
To retrieve decoded, reordered frames, the application calls dav1d_get_picture().
* If a frame is ready for display: The function returns 0 and populates a picture structure (Dav1dPicture) with the decoded frame in the correct presentation order.
* If more data is required: If the decoder does not yet have enough input to output the next frame, dav1d_get_picture() returns DAV1D_ERR(EAGAIN), a negative error code. This indicates that the application must feed more input data via dav1d_send_data() before an output frame can be released.
This handshaking process naturally manages the “delayed output” latency inherent in temporal video coding.
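The shape of this send and receive loop can be sketched against a small mock decoder. Everything in the block (MockDecoder, mock_send_data, mock_get_picture, decode_all, MOCK_CAPACITY) is a hypothetical stand-in written for this example, not libdav1d's API; the mock only imitates the return-value convention, using -EAGAIN where the real library would return DAV1D_ERR(EAGAIN).

```c
#include <errno.h>

#define MOCK_CAPACITY 2   /* packets the mock holds before it pushes back */

/* Hypothetical stand-in for a decoder context. */
typedef struct {
    int queued[MOCK_CAPACITY];
    int count;
} MockDecoder;

/* Send side: 0 when the packet is consumed, -EAGAIN when the caller
 * must retrieve pictures before more input can be accepted. */
static int mock_send_data(MockDecoder *d, int packet_id) {
    if (d->count == MOCK_CAPACITY) return -EAGAIN;
    d->queued[d->count++] = packet_id;
    return 0;
}

/* Receive side: 0 and a picture when one is ready, -EAGAIN when more
 * input is needed. The mock withholds output until its queue is full
 * to imitate delayed output. */
static int mock_get_picture(MockDecoder *d, int *picture_id) {
    if (d->count < MOCK_CAPACITY) return -EAGAIN;
    *picture_id = d->queued[0];
    for (int i = 1; i < d->count; i++) d->queued[i - 1] = d->queued[i];
    d->count--;
    return 0;
}

/* Feeds every packet and collects pictures as they become available.
 * Returns the number of pictures written to out, or a negative error. */
static int decode_all(MockDecoder *d, const int *packets, int n, int *out) {
    int n_out = 0;
    for (int i = 0; i < n; ) {
        int res = mock_send_data(d, packets[i]);
        if (res == 0) i++;                     /* consumed: next packet */
        else if (res != -EAGAIN) return res;   /* real error: give up */
        /* On -EAGAIN the same packet is retried after pulling output. */
        int pic;
        while (mock_get_picture(d, &pic) == 0) out[n_out++] = pic;
    }
    return n_out;
}
```

After four packets have been sent, only three pictures have come out: the last one is still held inside the mock, which is the delayed output that an end-of-stream drain has to collect.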
Flushing the Delayed Frames
At the end of a video stream, some decoded frames usually remain inside the decoder. They have not been output yet because the decoder is still waiting for subsequent input.
To retrieve these remaining delayed frames, the application must drain the decoder:
* Drain Sequence: The application stops calling dav1d_send_data(). No special end-of-stream packet is required.
* Emptying the DPB: The application calls dav1d_get_picture() repeatedly in a loop. The decoder outputs all remaining buffered frames in their proper display order until the function returns DAV1D_ERR(EAGAIN), meaning no frames are left in the pipeline.
A seek is handled differently. The frames still buffered belong to the old playback position, so the application calls dav1d_flush(), which discards them and resets the decoder's internal state, before sending data from the new position.
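A drain loop of this kind can be sketched with a mock decoder. All names here (MockDecoder, mock_get_picture, drain_all) are hypothetical; in particular, the explicit draining field is a simplification made for the mock, since the real library exposes no such flag to the application. The mock returns -EAGAIN where libdav1d would return DAV1D_ERR(EAGAIN).

```c
#include <errno.h>

#define MOCK_MAX_HELD 8

/* Hypothetical stand-in for a decoder that is holding delayed frames. */
typedef struct {
    int held[MOCK_MAX_HELD]; /* buffered pictures, in display order */
    int count;               /* number of valid entries in held */
    int delay;               /* pictures kept back while input may arrive */
    int draining;            /* mock-only switch: input has ended */
} MockDecoder;

/* Returns 0 and a picture when one may be released, -EAGAIN otherwise.
 * While not draining, the mock keeps `delay` pictures in reserve. */
static int mock_get_picture(MockDecoder *d, int *picture_id) {
    int reserve = d->draining ? 0 : d->delay;
    if (d->count <= reserve) return -EAGAIN;
    *picture_id = d->held[0];
    for (int i = 1; i < d->count; i++) d->held[i - 1] = d->held[i];
    d->count--;
    return 0;
}

/* End of stream: stop sending and pull pictures until -EAGAIN.
 * Returns the number of pictures written to out. */
static int drain_all(MockDecoder *d, int *out) {
    int n = 0, pic;
    d->draining = 1;
    while (mock_get_picture(d, &pic) == 0) out[n++] = pic;
    return n;
}
```

The loop ends on the first -EAGAIN, at which point the mock holds no more pictures and the decoder can be closed or reused.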