Does libdav1d Support AV1 Spatial Scalability?
This article explores whether the popular open-source AV1 decoder,
libdav1d, supports the spatial scalability features defined
in the AV1 video coding specification. It details how the decoder
handles scalable bitstreams, selects among multiple operating points,
and predicts from reference frames of a different resolution to deliver
scalable video playback.
Full Support for AV1 Spatial Scalability
Yes, libdav1d supports all spatial scalability features
defined in the AV1 specification. Developed by the VideoLAN and FFmpeg
communities and sponsored by the Alliance for Open Media (AOMedia),
libdav1d is designed to be a standards-compliant, highly optimized AV1
decoder. Because scalability is part of the AV1 bitstream specification
itself rather than a separate extension, decoding support is built
directly into the library.
How libdav1d Handles Spatial Scalability
Spatial scalability allows a single AV1 bitstream to contain multiple
resolution layers (such as 360p, 720p, and 1080p). libdav1d
manages this capability through several key mechanisms:
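1. Operating Point Selection
The AV1 specification uses "operating points" to define different
combinations of spatial and temporal layers within a stream. Each
operating point is described in the sequence header by a bitmask,
operating_point_idc, that names the layers it includes.
libdav1d lets client applications select a specific operating point
through the operating_point field of Dav1dSettings (0 to 31, default 0).
The decoder then drops the OBUs of layers outside the selected
operating point, so higher-resolution layers cost no decoding time.
A companion setting, all_layers, controls whether every decoded spatial
layer is output or only the highest one.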
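
The layer filter described above can be sketched in a few lines. This is an illustration of the rule in the AV1 specification, not libdav1d's code: the function name obu_in_operating_point is hypothetical, and the sketch assumes the spec's layout of operating_point_idc (bits 0 to 7 flag temporal layers, bits 8 to 11 flag spatial layers, and a value of 0 means the stream is not scalable).

```python
def obu_in_operating_point(operating_point_idc: int,
                           temporal_id: int,
                           spatial_id: int) -> bool:
    """Return True if an OBU with these layer IDs belongs to the operating point.

    Hypothetical helper, written from the AV1 specification's drop rule.
    """
    if operating_point_idc == 0:
        # No scalability information: every OBU is kept.
        return True
    in_temporal_layer = (operating_point_idc >> temporal_id) & 1
    in_spatial_layer = (operating_point_idc >> (spatial_id + 8)) & 1
    return bool(in_temporal_layer and in_spatial_layer)


# 0x103: spatial layer 0 only, temporal layers 0 and 1.
base_only = 0x103
# 0x303: spatial layers 0 and 1, temporal layers 0 and 1.
both_layers = 0x303

print(obu_in_operating_point(base_only, 0, 1))    # enhancement layer is dropped
print(obu_in_operating_point(both_layers, 0, 1))  # enhancement layer is kept
```

A decoder applies this test only to OBUs that carry an extension header; sequence headers and temporal delimiters are always kept.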
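2. Reference Scaling (Reference Picture Resampling)
Spatial scalability in AV1 relies on prediction from scaled references,
the counterpart of what newer codecs such as VVC call Reference Picture
Resampling (RPR). A higher-resolution frame can use a lower-resolution
frame as a reference: the reference is sampled at scaled positions
during motion compensation inside the decoding loop. The specification
limits the ratio, so a reference may be at most twice as large or
sixteen times smaller than the current frame in each dimension.
libdav1d includes optimized code paths for this scaled motion
compensation, which keeps prediction between spatial layers efficient.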
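
As a rough illustration of the scaling arithmetic, the sketch below computes the fixed-point scale factors that the AV1 specification derives for a reference of a different size, and applies the spec's size limits. The function name reference_scale is hypothetical and the code is not taken from libdav1d; it assumes the spec's 14-bit precision (REF_SCALE_SHIFT = 14), where a factor of 16384 means "same size".

```python
REF_SCALE_SHIFT = 14  # fixed-point precision used by the AV1 specification


def reference_scale(ref_w: int, ref_h: int, cur_w: int, cur_h: int):
    """Return (x_scale, y_scale) for predicting a cur_w x cur_h frame
    from a ref_w x ref_h reference. Hypothetical helper."""
    # The reference may be at most 2x larger or 16x smaller per dimension.
    allowed = (2 * cur_w >= ref_w and 2 * cur_h >= ref_h and
               cur_w <= 16 * ref_w and cur_h <= 16 * ref_h)
    if not allowed:
        raise ValueError("reference size outside the range AV1 allows")
    # Rounded division: reference size over current size, in fixed point.
    x_scale = ((ref_w << REF_SCALE_SHIFT) + cur_w // 2) // cur_w
    y_scale = ((ref_h << REF_SCALE_SHIFT) + cur_h // 2) // cur_h
    return x_scale, y_scale


# A 1280x720 enhancement layer predicting from a 640x360 base layer:
# each step in the current frame advances half a sample in the reference.
print(reference_scale(640, 360, 1280, 720))  # (8192, 8192)
```

A factor below 16384 means the reference is smaller than the current frame, which is the usual case when an enhancement layer predicts from the base layer.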
3. Inter-Layer Prediction
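To improve compression efficiency, scalable AV1 streams use
inter-layer prediction, where higher-resolution layers predict pixel
data from already decoded lower-resolution layers. AV1 has no separate
coding tool for this: an enhancement-layer frame uses ordinary inter
prediction, with a lower-layer frame held in one of the reference frame
slots as its reference. libdav1d decodes the layers in bitstream order,
so each frame finds its references already decoded, and the dependency
chain between layers is respected.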
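
The dependency chain can be pictured with a toy model of the reference frame slots. Everything here is illustrative: the Frame class, its fields, and decode_in_order are hypothetical names, and the model only tracks which slots are filled, not pixel data. The one fact taken from the specification is that AV1 keeps eight reference frame slots.

```python
from dataclasses import dataclass

NUM_REF_SLOTS = 8  # AV1 maintains eight reference frame slots


@dataclass
class Frame:
    name: str
    spatial_id: int
    refs: tuple     # slot indices this frame predicts from
    store_to: int   # slot that is refreshed once the frame is decoded


def decode_in_order(frames):
    """Walk frames in bitstream order and check every reference exists."""
    slots = [None] * NUM_REF_SLOTS
    decoded = []
    for frame in frames:
        missing = [s for s in frame.refs if slots[s] is None]
        if missing:
            raise ValueError(f"{frame.name}: reference slots {missing} are empty")
        slots[frame.store_to] = frame.name
        decoded.append(frame.name)
    return decoded


base = Frame("S0 (360p)", spatial_id=0, refs=(), store_to=0)
enhancement = Frame("S1 (720p)", spatial_id=1, refs=(0,), store_to=1)

print(decode_in_order([base, enhancement]))  # both layers decode
```

Passing only the enhancement frame raises an error in this model, which mirrors why a decoder cannot skip a lower layer that a selected higher layer depends on.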
Performance and Optimization
While decoding multiple spatial layers can be computationally
expensive, libdav1d uses multi-threading at both the frame and
tile level, along with platform-specific assembly optimizations
(such as SSSE3, AVX2, and AVX-512 on x86, and NEON on Arm). This
helps keep the decoding of scalable AV1 streams fast enough for
real-time applications like video conferencing and live streaming.