Yestermorning I presented an ad hoc compression method known as RLE, or run-length encoding, and I got many more reactions than I expected. Many suggested plenty of other approaches, despite the fact that the post was about RLE and not about <insert your favorite compression method>. But something very nice happened: people got interested, and many suggestions came forth.

I decided to implement the median-based prediction method as suggested by Dark Shikari here. Let us see the results!

The median prediction, as used in the FFV1 lossless video codec (which uses only intra-frame coding), predicts the pixel ‘?’ from the following context:

    c b
    a ?

The prediction is given by:

    ? = median(a, b, a + b − c)

Which means that, for color images, the predicted color of the pixel ‘?’ is given by the median of its neighbors’ colors. In RGB (or, more likely, some YUV-like color space), the prediction is applied to each color plane separately. The difference between the actual color and the predicted color is then encoded.
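For a single color plane, the median predictor can be sketched as follows (this is my own minimal sketch of the standard MED formulation with left, above, and above-left neighbors; out-of-image neighbors are naively treated as zero here, which is a simplification):

```python
def med_predict(plane, x, y):
    """Median (MED) prediction for pixel (x, y) of a single color plane.

    Uses the left (a), above (b), and above-left (c) neighbors;
    out-of-image neighbors are treated as 0 (a simplification)."""
    a = plane[y][x - 1] if x > 0 else 0
    b = plane[y - 1][x] if y > 0 else 0
    c = plane[y - 1][x - 1] if (x > 0 and y > 0) else 0
    # median of a, b, and the gradient estimate a + b - c
    return sorted([a, b, a + b - c])[1]

plane = [[10, 12, 13],
         [11, 14, 90]]
# neighbors of (2, 1): a=14, b=13, c=12 -> median(14, 13, 15) = 14
print(med_predict(plane, 2, 1))  # 14
```

For full-color images, the same function would simply be applied once per plane.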

With “continuous” colors, the residual (the difference) is likely to be much smaller in magnitude than the original range of the colors; encoding it with an efficient code therefore yields compression.

What happens when you have a palette image? As Dark Shikari suggested, let us use the palette color closest to the prediction as the predicted color. But that leaves the problem of encoding the error. Palette images have a problem in this regard: the colors are not necessarily sorted in any smart way! Encoding the error pretty much means encoding the correct palette index.

To circumvent this problem, I use the same technique as with yesterday’s simpler context prediction: I compute the ranking of each color in the palette at each pixel. The color closest to the prediction gets rank 0; the furthest, rank 3.
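The ranking step might be sketched like this (a hypothetical helper of my own; the distance function here is a plain squared-Euclidean placeholder, not one of the two measures actually compared in this post):

```python
def ranks_for_pixel(palette, predicted, distance):
    """Rank each palette entry by its distance to the predicted color:
    the closest entry gets rank 0, the furthest gets rank len(palette)-1."""
    order = sorted(range(len(palette)),
                   key=lambda i: distance(palette[i], predicted))
    ranks = [0] * len(palette)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks

def sq_dist(a, b):
    # Placeholder metric: squared Euclidean distance in RGB.
    return sum((x - y) ** 2 for x, y in zip(a, b))

palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (255, 255, 255)]
print(ranks_for_pixel(palette, (250, 10, 10), sq_dist))  # [1, 0, 3, 2]
```

The pixel’s actual palette index is then emitted as its rank, which is 0 whenever the prediction finds the right color.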

To measure the difference between the palette colors and the predicted color, I tried two approaches. The first is based on the approximate perceived brightness difference, where the difference is given by |luma(c1) − luma(c2)|, with luma(c) = 0.299R + 0.587G + 0.114B. The second is to count the number of components that differ, something like [R1 ≠ R2] + [G1 ≠ G2] + [B1 ≠ B2]. I labeled the first “median-luma” and the second “median-diff”.
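The two measures might look like this (a sketch; I am assuming the usual Rec. 601 luma weights for the “approximate perceived brightness”):

```python
def luma(c):
    # Rec. 601 approximation of perceived brightness (assumed weights).
    r, g, b = c
    return 0.299 * r + 0.587 * g + 0.114 * b

def dist_luma(a, b):
    """'median-luma': absolute difference in approximate brightness."""
    return abs(luma(a) - luma(b))

def dist_diff(a, b):
    """'median-diff': number of color components that differ."""
    return sum(x != y for x, y in zip(a, b))

red, blue = (255, 0, 0), (0, 0, 255)
print(dist_luma(red, blue))      # brightness gap between pure red and blue
print(dist_diff(red, blue))      # 2: the R and B components differ
```

Either function can be plugged into the ranking step as the distance measure.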

The “median-luma” method yields an average compression ratio of 1.77:1, and the “median-diff” method 1.82:1, just a wee bit short of the context prediction, which yields 1.9:1. The median prediction is followed by the same steps as the context predictor, so the results directly compare the quality of the predictions. Why median-diff does a bit better than median-luma, I am not sure; intuitively, I would say that the bitmaps exhibit some color structure that makes some color components (not palette entries, but separate components such as R, G, and B) more likely to be the same, as the images are pretty much primary-colored. What’s your take on this?

*

* *

So the important point here is not that RLE is state-of-the-art (because it certainly is not) but that compression algorithms perform better when there are multiple stages of analysis, prediction, and encoding. A good prediction or transform helps the following stages exploit redundancy. In our case, we have median or context prediction that transforms the original pixel stream into a ranking stream, which, it turns out, has more repetitions than the original stream. The repetitions are then exploited by an auto-trigger RLE encoder, whose output is further encoded by a code reminiscent of Huffman codes.
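As a toy illustration of why the rank transform helps the later stages (the naive run-length pairs below stand in for the auto-trigger RLE and the Huffman-like code, which are simplified away here):

```python
def rle(seq):
    """Naive run-length encoding: a list of [symbol, run length] pairs."""
    out = []
    for s in seq:
        if out and out[-1][0] == s:
            out[-1][1] += 1      # extend the current run
        else:
            out.append([s, 1])   # start a new run
    return out

# A well-predicted image yields a rank stream dominated by 0s,
# which collapses into few, long runs:
ranks = [0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0]
print(rle(ranks))  # [[0, 4], [1, 1], [0, 3], [2, 1], [0, 2]]
```

The better the predictor, the more 0s in the rank stream, the longer the runs, and the shorter the final code.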

And most codecs are structured like that at the global scale, although they are a lot more complex.
