Google Docs Billion Laughs

June 8, 2012

This is a writeup of a bug (now fixed) I reported to Google last year.

A billion laughs attack was present in the Google Docs document parser. When the engine would parse a document it would resolve internal entities by expanding them. This eventually earned me a spot on the Google Wall of Sheep, but alas, no reward because it’s a DoS bug and DoS bugs don’t qualify.

With any type of DoS bug on a large service, it’s fairly difficult to determine the exact severity. There might be things that mitigate this somewhat, like there is probably throttling, the APIs might restrict on a per user basis, etc. Regardless, with one request it’s likely that it could tie up a processor for some amount of time (evidence suggests at least a couple minutes, but I suspect a lot more). This was clearly a bug; Google Docs was resolving internal entities without limit which is a CPU intensive operation and requires very little client processing/bandwidth.

Here’s a link 17th order billion laughs document: test17. Unzip and look in content.xml. To increase order of attack, change the entity it points to (eg &a19;, &a20, etc). The attack itself is very straightforward, and for those who don’t want to look at the doc, it just looks something like this, generated mostly from a legitimate ODT file:

<!--?xml version="1.0" encoding="UTF-8"?--> <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE billion [ <!ENTITY a0 "Bomb!"> <!ENTITY a1 "&a0;&a0;"> <!ENTITY a2 "&a1;&a1;"> ... <!ENTITY a25 "&a24;&a24;"> ]> ... <text:p text:style-name="Standard">Test &a17;</text:p>

Here are some interesting metrics on various laugh sizes:

16th order took about 4 seconds for the app to return

17th order took about 8 seconds for the app to return

18th order took about 20 seconds and the upload is eventually rejected, after several empty files are created with the same name

20th order took about 1:20 and the upload is eventually rejected, dozens of empty files are created (I’m not sure what was going on with that).

21st order never seemed to “finish”

One quick note is I never found anywhere that external entities were resolved. It’s hard to tell for sure because certain egress/chroot-type mitigations could have just helped make this hard to exploit. Correct or not, I sort of suspected the external entity thing might not be a problem. Openoffice resolved internal but not external entities, so (right or wrong) I guessed that it’s somewhat likely that Google is re-using or at least sort of mimicking openoffice’s document parser.

The reporting process was fairly nighmarish, but I put the blame mostly on MSVR rather than Google… not to point fingers, all I know is I never heard much of anything back.

In any serious service I think DoS bugs get a bad rap. I’ve met a lot of people who consider DoS bugs as low severity just based on that classification alone. In reality, I think people tend to care a lot more about when an online service is unavailable. When Sony did everything terribly, did most people care about the data they lost? Some did (I did), but I think most just wanted to start playing games again already. So what’s more severe? A clickjacking bug in Facebook that allows you to do a targeted attack to take over someone’s account, or a DoS bug that can bring down Facebook?

With some imagination, I wonder if something like this could have been used in the right (wrong?) hands to bring down a service as large as Google Docs.