July 2013

Please note that republishing this article in full or in part is only allowed under the conditions described here.

Dubious HTTP II - Unusual HTTP Content-Encodings

The Content-Encoding header is usually used to specify how the content is compressed. The usual values are gzip (RFC1952) and deflate (RFC1951). While combining these encodings does not make much sense, the HTTP standard (RFC2616) allows other Content-Encodings and also allows multiple encodings to be applied.

To determine the behavior of the browsers I tested with:

Microsoft Internet Explorer (MSIE) versions 8 and 10

Firefox 22

Google Chrome 28

Opera 12.15 (before WebKit)

Rekonq (KDE project) 2.2.1 - Konqueror (KDE) seems to behave the same

To evaluate the behavior of intermediate systems I let virustotal.com (2013/7/1) check some URLs with unusual content-encodings. I also looked at the source code of common IDS:

Bro IDS 2.1

Snort IDS 2.9.4.6

Suricata IDS 1.4.3

To reproduce the results you might point your browser to my test site or set up your own using my test suite.

Supported Encodings

Microsoft Internet Explorer and Bro IDS seem to support only gzip and deflate

Firefox, Google Chrome, Opera, Rekonq and virustotal.com also support deflate according to RFC1950, i.e. deflate data wrapped in a zlib header and checksum

Firefox, Chrome, Opera, Rekonq, virustotal.com and Snort IDS support x-gzip as an alias to gzip

Opera and Rekonq support x-deflate as an alias to deflate

Suricata IDS supports gzip and x-gzip, but no deflate
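The three wire formats involved here are easy to confuse, so the following minimal sketch produces each of them with Python's standard zlib and gzip modules. The payload and the helper name decode are made up for illustration; the wbits values are those documented for the zlib module (31 for the gzip container, 15 for the zlib container, -15 for a raw deflate stream).

```python
import gzip
import zlib

payload = b"hello world " * 20  # illustrative test data

# gzip container (RFC1952): header starting with 1f 8b, CRC-32 trailer
as_gzip = gzip.compress(payload)

# zlib container (RFC1950): 2-byte header, Adler-32 trailer
as_zlib = zlib.compress(payload)

# raw deflate stream (RFC1951): no header, no checksum
raw = zlib.compressobj(wbits=-15)
as_raw_deflate = raw.compress(payload) + raw.flush()


def decode(data: bytes, wbits: int) -> bytes:
    """Decompress with an explicit window-bits setting (hypothetical helper)."""
    return zlib.decompress(data, wbits)
```

A decoder that only accepts one of the two deflate variants will reject the other, which is one way the differences between clients listed above can arise.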

Interpretation of Content-Encoding Header

MSIE and Rekonq do not support continuation lines (e.g. "Content-encoding:\r\n gzip")

Opera and Firefox interpret "Content-encoding: gzip x" as gzip

Firefox also interprets "Content-encoding: x gzip" as gzip
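A continuation line is a header line that starts with a space or tab and belongs to the previous header. The following minimal sketch shows how a parser might unfold such lines before looking at Content-Encoding. The function name unfold_headers is hypothetical and this is not the code of any tested browser or IDS; it also simply overwrites repeated header names, which a real parser would have to handle.

```python
def unfold_headers(raw: bytes) -> dict:
    """Parse a raw header block, joining continuation lines to their header."""
    headers = {}
    name = None
    for line in raw.split(b"\r\n"):
        if not line:
            break  # an empty line ends the header block
        if line[:1] in (b" ", b"\t") and name is not None:
            # continuation line: append to the value of the previous header
            folded = headers[name] + " " + line.strip().decode("latin-1")
            headers[name] = folded.strip()
        else:
            key, _, value = line.partition(b":")
            name = key.strip().lower().decode("latin-1")
            headers[name] = value.strip().decode("latin-1")
    return headers
```

A parser without the continuation branch would see an empty Content-Encoding value for the example above and treat the compressed body as raw data.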

Mismatch Between Specified and Real Encoding

if an encoding of deflate is specified but gzip content is provided, MSIE8, MSIE10 and Opera detect this and apply gzip decoding, but specifying gzip and sending deflate is not detected

the others don't try to guess the encoding

virustotal.com reports invalid data if real and specified encoding don't match
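How the detecting clients recognize gzip data is not covered by these tests. One plausible approach, shown here as a minimal sketch, is to check for the gzip magic bytes 1f 8b before decoding a body declared as deflate, and otherwise to try the zlib container before falling back to raw deflate. The function name decode_declared_deflate is hypothetical, and the fallback is a heuristic rather than a guaranteed classification.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"


def decode_declared_deflate(body: bytes) -> bytes:
    """Decode a body declared as 'deflate', tolerating gzip, zlib or raw deflate."""
    if body[:2] == GZIP_MAGIC:
        return zlib.decompress(body, 31)   # actually gzip (RFC1952)
    try:
        return zlib.decompress(body, 15)   # zlib container (RFC1950)
    except zlib.error:
        return zlib.decompress(body, -15)  # raw deflate (RFC1951)
```

The reverse case is harder to sniff, because a deflate stream has no magic bytes comparable to those of gzip, which may be why it goes undetected.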

Stacking of Multiple Content-Encodings

Chrome, Firefox and Rekonq can handle multiple encodings, like double gzip or deflate followed by gzip

the rest only understand a single encoding, but

Opera tries to use the first encoding

virustotal.com tries to use the last encoding

MSIE8 and MSIE10 ignore any encoding and will use the raw data

based on looking at the source code, Bro IDS and Snort IDS seem to use the last encoding, while Suricata uses the first (but Suricata only knows about gzip/x-gzip anyway)
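RFC2616 requires that multiple content codings are listed in the order in which they were applied, so a receiver has to undo them in reverse order. The following is a minimal sketch of that rule; the names decode_one and decode_stacked are hypothetical, and "deflate" is assumed here to mean the zlib container.

```python
import gzip
import zlib


def decode_one(body: bytes, coding: str) -> bytes:
    """Undo a single content coding; unknown codings raise ValueError."""
    coding = coding.strip().lower()
    if coding in ("gzip", "x-gzip"):
        return zlib.decompress(body, 31)  # gzip container
    if coding == "deflate":
        return zlib.decompress(body, 15)  # assumption: zlib container
    raise ValueError("unsupported content coding: %r" % coding)


def decode_stacked(body: bytes, content_encoding: str) -> bytes:
    """Undo all listed codings, starting with the one applied last."""
    codings = [c for c in content_encoding.split(",") if c.strip()]
    for coding in reversed(codings):
        body = decode_one(body, coding)
    return body


payload = b"test data " * 10
# deflate applied first, then gzip: "Content-Encoding: deflate, gzip"
body = gzip.compress(zlib.compress(payload))
```

With this body, decode_stacked(body, "deflate, gzip") recovers the payload, while a system that applies only the last listed coding is left with data that is still deflate-compressed, and one that applies only the first fails to decode at all. Either way it never inspects the real content.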



Behavior on Unknown Encodings

virustotal.com reports invalid data if Content-encoding is not gzip, x-gzip or deflate

the tested IDS ignore any content-encoding header they don't understand without even logging the problem

all browsers ignore the content-encoding header if they don't know the encoding, thus using the raw data

Transfer-Encoding versus Content-Encoding

the HTTP standard explicitly allows compression using the Transfer-Encoding header, but only Opera and Rekonq support it (gzip and deflate)
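To illustrate what such a response looks like on the wire, here is a minimal sketch that compresses a body with gzip and then applies chunked framing, since RFC2616 requires chunked to be the last transfer coding applied. The function names chunk and build_response, the chunk size and the Content-Type are assumptions for illustration, not part of the test suite.

```python
import gzip


def chunk(data: bytes, size: int = 1024) -> bytes:
    """Apply chunked transfer coding: hex length, CRLF, data, CRLF per chunk."""
    out = bytearray()
    for i in range(0, len(data), size):
        piece = data[i:i + size]
        out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, no trailer headers
    return bytes(out)


def build_response(body: bytes) -> bytes:
    """Build a raw HTTP/1.1 response using gzip as a transfer coding."""
    compressed = gzip.compress(body)
    head = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/plain\r\n"
        b"Transfer-Encoding: gzip, chunked\r\n"
        b"\r\n"
    )
    return head + chunk(compressed)
```

A client that understands only chunked would remove the framing and hand the still-compressed bytes to the application.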

Conclusion

If an attacker has full control over a web server serving malware, he can use Content-Encoding or Transfer-Encoding to easily bypass security systems.