Does libdav1d Support AV1 Spatial Scalability?

This article explores whether the popular open-source AV1 decoder, libdav1d, supports all spatial scalability features defined in the AV1 video coding specification. It details how the decoder handles Scalable Video Coding (SVC), manages multiple operating points, and processes reference picture resampling to deliver scalable video playback.

Full Support for AV1 Spatial Scalability

Yes, libdav1d supports all spatial scalability features defined in the AV1 specification. The library is developed by the VideoLAN and FFmpeg communities with sponsorship from the Alliance for Open Media (AOMedia), and it is designed to be a conformant, highly optimized AV1 decoder. Because scalability is part of the core AV1 specification rather than an optional extension, decoding support is built directly into the library.

How libdav1d Handles Spatial Scalability

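Spatial scalability allows a single AV1 bitstream to contain multiple resolution layers (such as 360p, 720p, and 1080p). Each layer is identified by a spatial_id in the OBU extension header, and AV1 allows up to four spatial layers. libdav1d manages this capability through several key mechanisms: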

1. Operating Points Selection

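The AV1 specification uses “operating points” to define combinations of spatial and temporal layers within a stream, and a sequence header can signal up to 32 of them. libdav1d lets client applications choose one through the operating_point field of Dav1dSettings (0 to 31, default 0). The decoder then decodes only the layers that belong to the selected operating point and drops OBUs from every other layer, which saves processing power when a lower resolution or frame rate is sufficient. A related setting, all_layers, controls whether every decoded spatial layer is output or only the highest one in the operating point.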
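The layer filtering described above can be sketched in a few lines of C. This is a minimal illustration of the operating_point_idc bitmask as the AV1 specification defines it (bits 0 to 7 flag temporal layers, bits 8 to 11 flag spatial layers), not code taken from libdav1d. The function names are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if an OBU tagged with (temporal_id, spatial_id) belongs to the
 * operating point described by idc, and 0 if a decoder should drop it.
 * Hypothetical helper, modelled on the drop rule in the AV1 specification. */
int obu_in_operating_point(uint32_t idc, unsigned temporal_id, unsigned spatial_id)
{
    assert(temporal_id < 8 && spatial_id < 4);
    if (idc == 0)
        return 1; /* idc of 0 means no scalability information: keep everything */
    int in_temporal = (idc >> temporal_id) & 1;      /* bits 0..7  */
    int in_spatial  = (idc >> (spatial_id + 8)) & 1; /* bits 8..11 */
    return in_temporal && in_spatial;
}

/* Returns the highest spatial layer included in idc, or -1 if no spatial
 * layer bit is set. */
int highest_spatial_layer(uint32_t idc)
{
    int highest = -1;
    for (int s = 0; s < 4; s++)
        if ((idc >> (s + 8)) & 1)
            highest = s;
    return highest;
}
```

For instance, an idc of 0x103 selects spatial layer 0 with temporal layers 0 and 1, so an OBU with spatial_id 1 is dropped under that operating point.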

2. Reference Picture Resampling (RPR)

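Spatial scalability in AV1 relies heavily on reference scaling, the AV1 counterpart of what VVC calls Reference Picture Resampling. This feature lets a frame predict from a reference of a different resolution: the reference samples are rescaled on the fly during motion compensation, so a higher-resolution frame can use a lower-resolution frame as a reference. The specification bounds the ratio, so in each dimension a reference may be at most twice as large and at most 16 times smaller than the current frame. libdav1d includes optimized assembly for this scaled motion compensation, which keeps prediction between spatial layers efficient.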
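As a rough illustration of the arithmetic involved, the sketch below checks the size constraint and computes the fixed-point scale factor in the way the AV1 specification's motion vector scaling process describes it, with 14 fractional bits. It is not libdav1d's implementation, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define REF_SCALE_SHIFT 14 /* fractional bits of the scale factor in the AV1 spec */

/* Returns 1 if a reference of ref_w x ref_h may be used by a frame of
 * frame_w x frame_h: the reference may be at most 2x larger and at most
 * 16x smaller in each dimension. */
int ref_size_allowed(int frame_w, int frame_h, int ref_w, int ref_h)
{
    assert(frame_w > 0 && frame_h > 0 && ref_w > 0 && ref_h > 0);
    return 2 * frame_w >= ref_w && 2 * frame_h >= ref_h &&
           frame_w <= 16 * ref_w && frame_h <= 16 * ref_h;
}

/* Fixed-point ratio of reference size to frame size along one dimension,
 * rounded to nearest. A value of 1 << REF_SCALE_SHIFT means no scaling. */
int32_t ref_scale_factor(int ref_dim, int frame_dim)
{
    assert(ref_dim > 0 && frame_dim > 0);
    int64_t scaled = ((int64_t)ref_dim << REF_SCALE_SHIFT) + frame_dim / 2;
    return (int32_t)(scaled / frame_dim);
}
```

A 1280-wide frame predicting from a 640-wide reference gets a factor of 8192, which is half of 1 << 14: each step of one pixel in the current frame advances half a pixel in the reference.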

3. Inter-Layer Prediction

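To improve compression efficiency, scalable AV1 streams use inter-layer prediction, where higher-resolution layers predict pixel data from already decoded lower-resolution layers. AV1 has no separate coding tool for this: an enhancement layer references the lower layer's frame through ordinary inter prediction with a scaled reference. libdav1d therefore resolves these dependencies in its normal decoding path, maintaining visual quality as long as the layers arrive in the decoding order the dependency chain requires.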
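The dependency chain matters most when layers can go missing, for example when a conferencing server forwards only some of them. The sketch below shows one way an application might work out the highest layer it can still decode. It assumes that every spatial layer in the operating point depends on all lower layers in that operating point, which is a common layout but not something the specification mandates. The function name and the received_mask parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* idc:           operating_point_idc of the selected operating point
 * received_mask: bit s is set when spatial layer s arrived for this
 *                temporal unit (hypothetical bookkeeping by the application)
 *
 * Returns the highest spatial layer that can be decoded, or -1 if even the
 * lowest layer of the operating point is missing.
 * Assumption: each layer depends on every lower layer in the operating point. */
int highest_decodable_layer(uint32_t idc, unsigned received_mask)
{
    assert((received_mask & ~0xFu) == 0); /* only four spatial layers exist */
    unsigned wanted = (idc >> 8) & 0xF;   /* spatial layer bits of the idc */
    int highest = -1;
    for (int s = 0; s < 4; s++) {
        if (!((wanted >> s) & 1))
            continue; /* layer is not part of this operating point */
        if (!((received_mask >> s) & 1))
            break;    /* chain is broken: higher layers cannot be decoded */
        highest = s;
    }
    return highest;
}
```

With three spatial layers in the operating point (idc 0x701) and layer 1 missing, the function returns 0, since layer 2 cannot be reconstructed without the layer below it.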

Performance and Optimization

Decoding multiple spatial layers costs more than decoding a single layer, because every layer in the selected operating point has to be reconstructed. libdav1d offsets this cost with frame-level and tile-level multi-threading and with hand-written assembly for several instruction sets, including SSSE3, AVX2, AVX-512, and NEON. This helps keep scalable AV1 decoding fast enough for real-time applications such as video conferencing and live streaming.