profile picture

Beating QOI - Part 1

February 11, 2024 - images rust

This is the first part of a series on how the png crate was able to acheive performance on-par or better than QOI, while remaining fully compatible with the PNG standard.

The purpose of lossless image compression is to reduce the size of an image while retaining the full pixel information. Modern formats generally use the same couple conceptual steps...

The QOI format

The Quite OK Image format (QOI) was proposed as an extremely simple image format. It achieves compression nearly as good as PNG while being many times faster than existing PNG encoders and decoders.

A quick read of the QOI spec might suggest that it is using a very different approach. But in fact, the format actually fits into the same framework.

In QOI, the entropy coding is done using a fixed set of prefix codes all of which are a multiple of eight bytes. A code word starting with 0b11 indicates a run, 0b01 indicates small delta from the predicted value (i.e. the difference from the previous pixel), 0xFF indicates a literal, etc.

Using a fixed set of codes reduces the maximum possible compression ratio, but it turns out to have less of an impact than one might expect. The QOI format achieve quite respectable compression ratios across a wide range of images.

Applying fixed codes to PNG

It turns out that we can use the same strategy for encoding PNGs. Ordinarily, PNG encoding requires two passes over the image data: once to gather statistics to determine the ideal entropy code, and then a second time to process the image data. However, we can instead use a offline process to build a single entropy code trained on a large corpus. At runtime, our encoder can always pick this same code, thereby skipping an entire pass over the image contents.

Unlike with QOI, the resulting codes do not have to be multiples of 8-bits. Small deltas from the predicted value and smaller RLE counts get assigned shorter code lengths because they occur more frequently. Another adavantage of the PNG format is that it allows the "Paeth" predictor which incorporates three prevously seen pixel values and tends to produce slightly better predictions than QOI's simple "previous" predictor (which PNG also supports).

At the same time, there are some limitations. QOI has the concept of a indexed lookup of a previously seen pixel value which PNG lacks. Similarly, to acheive QOI-class encoding performance, we cannot devote much time to finding back-references and so only RLE encoding is used.