In the graphs above, aomdec stops scaling after 6 threads on the Chimera video (from Netflix) and barely scales beyond 2 threads on the Feels Like Summer video (from YouTube/Google). dav1d keeps scaling well past those points. For a more extreme example, we tested on a 32-core AMD Epyc processor:

aomdec reaches its peak performance at 8 threads, while dav1d doesn’t stop scaling until over a thousand threads. This results in hugely improved performance on systems with many cores, which are becoming more and more common.

While the scaling is great, it also reveals a small weakness in dav1d. Scaling should stop at 64 threads, the number of hardware threads on this processor, but it doesn’t. This means that a single tile or frame thread isn’t able to fully utilize a CPU core or hardware thread, so many more threads are needed, and choosing how many threads of each type to use becomes guesswork. It would be an improvement if dav1d handled this behind the scenes.
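To make "stops scaling" concrete, here is a minimal Python sketch of how one might quantify it from a set of measurements. The function names, the 5% threshold and the sample numbers are all assumptions made up for illustration; they are not the benchmark results from the graphs and not part of dav1d.

```python
def scaling_report(fps_by_threads):
    """Return (threads, speedup, efficiency) rows relative to the lowest thread count."""
    counts = sorted(fps_by_threads)
    base_n = counts[0]
    base_fps = fps_by_threads[base_n]
    rows = []
    for n in counts:
        speedup = fps_by_threads[n] / base_fps
        # Efficiency of 1.0 means every added thread contributes a full thread's worth of work.
        efficiency = speedup / (n / base_n)
        rows.append((n, speedup, efficiency))
    return rows


def saturation_point(fps_by_threads, min_gain=0.05):
    """Return the thread count beyond which no tested count is more than min_gain faster."""
    counts = sorted(fps_by_threads)
    best = counts[0]
    for n in counts[1:]:
        if fps_by_threads[n] > fps_by_threads[best] * (1 + min_gain):
            best = n
    return best


# Placeholder numbers, NOT measured results: a decoder that flattens out at 8 threads.
example = {1: 10.0, 2: 19.0, 4: 36.0, 8: 60.0, 16: 61.0, 32: 60.5}

for threads, speedup, efficiency in scaling_report(example):
    print(f"{threads:>3} threads: {speedup:5.2f}x speedup, {efficiency:6.1%} efficiency")
print("scaling stops at", saturation_point(example), "threads")
```

On real data, a decoder whose threads each saturate a hardware thread would be expected to hit its saturation point near the machine's hardware thread count. A saturation point far above that suggests the individual threads spend part of their time waiting.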

Bit depths and chroma subsampling

The assembly code in dav1d 0.1.0 focuses on performance at 8-bit color depth with 4:2:0 chroma subsampling, since almost all content uses that format right now. dav1d nevertheless supports all color depths (8, 10 and 12 bit) and chroma formats (4:0:0, 4:2:0, 4:2:2 and 4:4:4), and here is a quick overview of its performance.

As you can see, 8-bit performance is very high in all chroma formats and is more than twice as fast as aomdec. Moving to 10- and 12-bit color depth, performance degrades quickly, but 4:0:0 (monochrome) and 4:2:0 are still faster than aomdec, while 4:2:2 is on par. Only 4:4:4 shows significantly lower performance than aomdec.

This benchmark was run on Zen, which is slightly biased towards dav1d. On Intel Haswell/Skylake, aomdec’s relative performance is a little higher.

Overall performance

Looking at some of the most viewed content, which is all 8-bit 4:2:0, we see that dav1d is quite fast. 1080p 120fps, 1440p 60fps and 4k 30fps are no problem on a 5-year-old mid-range quad-core processor, and 1080p 60fps and 1440p 30fps will play back fine on any dual-core with AVX2 (except maybe the extreme low-power versions).
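The resolution and frame rate combinations in each group are less arbitrary than they may look: they demand roughly the same number of pixels per second. A quick Python sketch of that arithmetic follows, assuming the usual 16:9 frame sizes (with "4k" taken to mean 3840x2160) and counting luma pixels only.

```python
# Assumed 16:9 frame sizes; "4k" is taken to mean UHD (3840x2160).
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "4k": (3840, 2160),
}


def pixel_rate(name, fps):
    """Luma pixels per second the decoder has to produce for real-time playback."""
    width, height = RESOLUTIONS[name]
    return width * height * fps


quad_core_targets = [("1080p", 120), ("1440p", 60), ("4k", 30)]
dual_core_targets = [("1080p", 60), ("1440p", 30)]

for label, targets in (("quad-core", quad_core_targets), ("dual-core", dual_core_targets)):
    for name, fps in targets:
        print(f"{label}: {name} {fps}fps = {pixel_rate(name, fps) / 1e6:.1f} megapixels/s")
```

The quad-core group works out to about 221 to 249 megapixels per second and the dual-core group to about 111 to 124, so halving the core count roughly halves the pixel rate. With 4:2:0 subsampling the chroma planes add another 50% in samples on top of these luma figures.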