
An up-to-date objective AV1 benchmark


In April 2021 I rented a large virtual machine in the Amazon Web Services cloud to run some AV1 benchmarks. They test the aom, SVT-AV1 and rav1e AV1 encoders against the x264 H.264/AVC encoder. The versions used were very recent development versions at the time, pulled directly from the source code repositories. Benchmark information and results follow.

The AV1 standard was released in 2018 and is now supported by all major web browsers except Safari. YouTube is a major provider of AV1-encoded content, while Netflix and Facebook also provide some video content encoded with AV1.

Benchmark information

  • Source content: crowd_run sample, available from Xiph.org :: Derf’s Test Media Collection, 1920x1080, 50 Hz, 8-bit YUV, progressive, 500 frames, 10 seconds (readme)
  • Benchmark type: objective metric
  • Encoding mode: single pass, constant quality mode
  • Machine: AWS EC2 c5.9xlarge spot instance (Intel Xeon Scalable Cascade Lake, 36 vCPUs, 72 GiB RAM, all-core turbo frequency 3.4 GHz, with AVX-512)
  • Operating System: Ubuntu Server 20.04 LTS (HVM), SSD Volume Type, 64-bit x86
  • Encoder front-end: ffmpeg
  • Default encoding parameters: 2x2 tiles, multithread enabled
  • Quality metric: VMAF with default model file vmaf_v0.6.1.json
  • Encoding date: April 12 to April 15, 2021
  • Instance usage time: 73.9 hours (includes restarting some rav1e encodings)
  • Instance cost information: region US East (Ohio), spot price US$ 0.342 per hour
  • Total cost: US$ 29.81 (includes data transfer, data storage, extra t2.micro instance and taxes)
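As a quick sanity check on the cost figures above, the sketch below splits the total into the compute part and everything else. It assumes the spot price stayed constant at the listed value for the whole run, which may not be exactly how the bill was computed.

```python
# Figures taken from the benchmark information list.
usage_hours = 73.9
spot_price_usd_per_hour = 0.342
total_cost_usd = 29.81

# Assumption: the spot price was constant during the whole run.
instance_cost_usd = usage_hours * spot_price_usd_per_hour

# Whatever remains covers data transfer, storage, the t2.micro instance and taxes.
other_costs_usd = total_cost_usd - instance_cost_usd

print(f"c5.9xlarge instance: US$ {instance_cost_usd:.2f}")
print(f"everything else:     US$ {other_costs_usd:.2f}")
```

Under that assumption the c5.9xlarge instance accounts for roughly US$ 25.27 of the total, leaving about US$ 4.54 for the other items.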

Important notes:

  • There are no timing results on these benchmarks. Sorry, there will be next time.
  • Many rav1e benchmarks with speed 0 are missing because the spot instance was shut down in the middle of the execution. These will also be left for next time.
  • There was a pause in commits in the SVT-AV1 repository, so its commit date is from February instead of April.

The list of custom-built components used, with commit ID and commit date, follows. The Rust compiler, version 1.51.0, was installed with rustup. All other components required by ffmpeg, including libx264, were installed from the Ubuntu package repository.

| component | commit ID                  | commit date |
|-----------|----------------------------|-------------|
| aom       | v3.0.0-gd6f767b4           | 2021-04-05  |
| dav1d     | 0.8.2-gae8958bd            | 2021-04-02  |
| ffmpeg    | n4.5-dev-gb972dab3         | 2021-04-09  |
| rav1e     | v0.5.0-alpha-g02106e0b     | 2021-04-08  |
| SVT-AV1   | v0.8.6-76-g44486d23        | 2021-02-10  |
| vmaf      | v2.1.1-ge2373266           | 2021-03-23  |

The following encoder configurations were benchmarked (the full command lines of all encodings are available in the encoding script’s state file):

| encoder | preset parameter              | quality parameter                                                         |
|---------|-------------------------------|---------------------------------------------------------------------------|
| aom     | -cpu-used 0 to 6              | -crf 25, 30, 35, 40, 45, 50, 55                                           |
| aom     | none (use defaults)           | none (use defaults)                                                       |
| rav1e   | -speed 0 to 10                | -qp 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200     |
| rav1e   | none (use defaults)           | none (use defaults)                                                       |
| SVT-AV1 | -preset 0 to 8                | -qp 30, 35, 40, 45, 50, 55                                                |
| SVT-AV1 | none (use defaults)           | none (use defaults)                                                       |
| x264    | -preset ultrafast to placebo  | -crf 17, 20, 23, 26, 29, 32                                               |
| x264    | none (use defaults)           | none (use defaults)                                                       |
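To give a feel for the size of this configuration grid, here is a minimal sketch that enumerates every preset and quality combination listed above, plus one default run per encoder. It is not the actual encoding script. The ffmpeg codec names (libaom-av1, librav1e, libsvtav1, libx264) and the assumption that "ultrafast to placebo" means all ten standard x264 presets are mine. The real command lines, including tile and threading options, are the ones in the state file.

```python
from itertools import product

# Assumption: "ultrafast to placebo" covers the ten standard x264 presets.
X264_PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast",
                "medium", "slow", "slower", "veryslow", "placebo"]

# encoder -> (assumed ffmpeg codec name, preset option, preset values,
#             quality option, quality values)
GRID = {
    "aom":     ("libaom-av1", "-cpu-used", range(0, 7),  "-crf", range(25, 56, 5)),
    "rav1e":   ("librav1e",   "-speed",    range(0, 11), "-qp",  range(70, 201, 10)),
    "SVT-AV1": ("libsvtav1",  "-preset",   range(0, 9),  "-qp",  range(30, 56, 5)),
    "x264":    ("libx264",    "-preset",   X264_PRESETS, "-crf", range(17, 33, 3)),
}

def encoder_options():
    """Yield (encoder, ffmpeg option list) for every configuration in the grid."""
    for name, (codec, preset_opt, presets, quality_opt, qualities) in GRID.items():
        # One run with encoder defaults: no preset or quality option at all.
        yield name, ["-c:v", codec]
        for preset, quality in product(presets, qualities):
            yield name, ["-c:v", codec,
                         preset_opt, str(preset),
                         quality_opt, str(quality)]

configs = list(encoder_options())

for name in GRID:
    count = sum(1 for encoder, _ in configs if encoder == name)
    print(f"{name}: {count} encodings")
print(f"total: {len(configs)} encodings")
```

Counted this way the full grid has 321 encodings, of which rav1e alone contributes 155, which helps explain why interrupted rav1e runs were the costly ones.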

Parameter characterization

Before the actual benchmarks, this section shows how each encoder's parameters relate to the resulting bit rate and VMAF score. These charts help situate each encoder's (abstract) parameter values in terms of real-world metrics. You may skip this section and go straight to the benchmarks if you prefer.

[Charts: aom preset vs bitrate; aom preset vs VMAF; aom quality vs bitrate; aom quality vs VMAF]

[Charts: rav1e preset vs bitrate; rav1e preset vs VMAF; rav1e quality vs bitrate; rav1e quality vs VMAF]

[Charts: svt-av1 preset vs bitrate; svt-av1 preset vs VMAF; svt-av1 quality vs bitrate; svt-av1 quality vs VMAF]

[Charts: x264 preset vs bitrate; x264 preset vs VMAF; x264 quality vs bitrate; x264 quality vs VMAF]

Benchmarks separated per encoder

These are the actual benchmarks, showing one encoding curve per preset. Each symbol in a chart represents one tested preset × quality combination, and the curves are interpolated from those points. Encoding without any extra parameters is indicated by the “default” symbol. The zoomed-in charts plot the same data, but with the VMAF and bitrate ranges limited to the highest quality region (VMAF above 85).

[Charts: aom bitrate vs VMAF; aom bitrate vs VMAF, zoomed in]

[Charts: rav1e bitrate vs VMAF; rav1e bitrate vs VMAF, zoomed in]

[Charts: svt-av1 bitrate vs VMAF; svt-av1 bitrate vs VMAF, zoomed in]

[Charts: x264 bitrate vs VMAF; x264 bitrate vs VMAF, zoomed in]

Consolidated multiple encoder benchmark

Finally, here is the consolidated chart, with selected curves from each encoder. The x264 “slower” preset curve is used as the reference: the chart separates encodings that perform worse than it from those that perform better.

[Chart: multiple encoder VMAF vs bitrate]
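The "better or worse than the reference curve" idea can be sketched in a few lines: interpolate the reference VMAF at the bitrate of a given encoding and compare. This is only an illustration. It uses plain linear interpolation, which is not necessarily the interpolation used for the charts, and the numbers below are made up for the example rather than taken from the benchmark results.

```python
from bisect import bisect_right

def vmaf_at(curve, bitrate):
    """Linearly interpolate VMAF at `bitrate`.

    `curve` is a list of (bitrate, vmaf) points sorted by distinct bitrates.
    Returns None outside the measured range (no extrapolation).
    """
    rates = [rate for rate, _ in curve]
    if not rates[0] <= bitrate <= rates[-1]:
        return None
    i = bisect_right(rates, bitrate)
    if i == len(rates):          # exactly the last measured point
        return curve[-1][1]
    (r0, v0), (r1, v1) = curve[i - 1], curve[i]
    return v0 + (v1 - v0) * (bitrate - r0) / (r1 - r0)

def compare(point, reference_curve):
    """Classify one (bitrate, vmaf) encoding against the reference curve."""
    bitrate, vmaf = point
    ref = vmaf_at(reference_curve, bitrate)
    if ref is None:
        return "out of range"
    if vmaf > ref:
        return "better"
    if vmaf < ref:
        return "worse"
    return "equal"

# Hypothetical reference curve (kbit/s, VMAF); NOT measured data.
reference = [(2000, 70.0), (4000, 80.0), (8000, 90.0)]

print(compare((6000, 88.0), reference))   # higher VMAF at the same bitrate
print(compare((3000, 72.0), reference))   # lower VMAF at the same bitrate
print(compare((10000, 95.0), reference))  # beyond the measured bitrates
```

Comparing at equal bitrate is only one possible convention. Comparing bitrates at equal VMAF would answer the related question of how much bandwidth an encoder saves for the same quality.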

Conclusion

In this benchmark, most of the work revolved around creating the benchmark scripts and finding a cloud server that was both cheap and powerful enough to run the benchmarks in a reasonable time. As next steps, the encoder configurations could be tweaked to remove encoding modes with too low quality and to add modes with very high quality. Other quality metrics and different encoding modes might also be used. Different videos and source types are a must.

The scripts used to run the benchmarks and generate the charts are available on GitLab. The generated metadata for all the encoded files, including codec and container information and encoder settings, is available on this site in the file mediainfo.json. The data files for the charts and the complete VMAF log file for all the encoded frames are available in the benchmarks directory.