OTORI - Example 1: Apache Solr

article by Ben Lincoln

This article describes security testing-related software whose use may be restricted or prohibited in your place of residence or your workplace. The penalties for violating laws and regulations regarding security testing-related tools can be severe. Ensuring that you are allowed to use this software is your responsibility.

The instructions in this tutorial are slightly less detailed than those in other articles on this website. This is because the software described should not be used if you are unfamiliar with basic- to intermediate-level use of Linux. Proceeding without that knowledge is very likely to result in damage and/or loss of data.

Apache Solr was the software which introduced me to the exciting world of XXE vulnerabilities, and to this day it's still one of the most useful systems that I know of which has this type of problem.

Although it can only be used to access ASCII text which does not contain XML markup, there are (to my knowledge) no restrictions on the length of files that can be retrieved.
All variations of the vulnerability are very easy to exploit.
Solr has zero security controls built in (project statement: "First and foremost, Solr does not concern itself with security either at the document level or the communication level."), so it is usually possible to exploit without knowing any credentials. This makes it an excellent initial entrypoint for a pen-test.
As a general-purpose search engine, it is used in a variety of other open-source packages and commercial products, so it is somewhat widely-available.
At least one of the three related vulnerabilities exists in a large number of versions - versions from 1.4.1 (2010-06-25) to 4.3 (2013-05-06) can be exploited.

In addition, the Solr developers helpfully included a prebuilt example system with every release, which makes Solr by far the easiest system to learn the basics of On The Outside, Reaching In. Some developers of commercial products have also appreciated this example system so much that they've incorporated it with minimal/no changes into their own systems, so this will also be a very realistic exercise.

Test Environment Systems

You'll need two systems for this exercise:

The attacker/pen-testing system. This is where On The Outside, Reaching In will run. This tutorial was written and tested using both a vanilla Debian 7 x64 VM and a Kali Linux 1.0 x64 VM in this role. Both were fully patched at the time this document was written.
The target system. This tutorial was written and tested using Ubuntu 12.04 x64 for this role. The Solr modules were re-tested on 21 June 2014 with libxml2 version 2.7.8.dfsg-5.1ubuntu4.9 installed on the target system to ensure they still worked as expected.

I strongly recommend that you build VMs specifically for this testing, or at least snapshot existing VMs before you begin, so that if something goes wrong, you won't lose any data/work. This software is a pre-alpha preview of a tool designed for penetration testing (in other words, it actively misuses its target(s)). Take appropriate safety precautions.

You can probably make Windows® work for one or both systems, but I have not tested this.

On the attacking systems, On The Outside, Reaching In was run using Python 2.7.3, because it was the default Python version for Debian 7 x64 (including Kali 1.0) as of this writing. It might work with Python 3, or it might not.

Test Environment Setup

Setting up the attacking system is easy — just unpack the current release of On The Outside, Reaching In.

For the target system:

Make sure you have a JRE installed. If you run the command java -version and get an error or get a version earlier than 1.6, install or upgrade the JRE. On Ubuntu and Debian, you can do this by running (using sudo or as the root account) apt-get install openjdk-7-jre or apt-get install openjdk-6-jre depending on whether you want Java 6 or 7. This tutorial was written and tested using Java 6, but 7 should be fine too.
Download one of the vulnerable versions of Solr. All historical releases of Solr are available from apache.org. I recommend starting with 1.4.1, because it's the smallest download, and it is vulnerable to all three modules described below. 4.0.0 is the most recent version which is vulnerable to all three. If you only want to test the CVE-2013-6407-DARH module, you can go as high as 4.3.0.
Unpack the archive. In my case, the content ended up in /home/blincoln/solr/apache-solr-1.4.1, and this will be referred to as the "Solr base directory" for the remainder of the tutorial.
cd into the Solr base directory, and then into the example subdirectory.
Run this command to start the example Solr instance:

java -jar start.jar
You should see a long chain of log entries. At the end should be one that includes the interface Solr is listening on. If you followed these instructions exactly, it will read Started SocketConnector @ 0.0.0.0:8983.
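If you are scripting the setup of a test VM, the listening address can be pulled out of the startup output instead of read by eye. The following is a minimal sketch in Python 3, assuming the log line looks like the one quoted above; the timestamp prefix in the sample is made up for illustration, and the function name is hypothetical.

```python
import re

# Matches the startup line quoted above, with or without spaces around "@".
LISTEN_RE = re.compile(r"Started SocketConnector\s*@\s*([0-9.]+):(\d+)")


def find_listener(log_text):
    """Return (address, port) from the first matching log line, or None."""
    match = LISTEN_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Hypothetical sample; only the "Started SocketConnector" part comes from the tutorial.
sample_log = "INFO: Started SocketConnector @ 0.0.0.0:8983"
print(find_listener(sample_log))
```

An address of 0.0.0.0 means Solr is listening on every interface, which is what makes it reachable from the attacking system.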

Preparing to Exploit Solr

Screenshots of key steps are below the list of instructions.

In a web browser on the attacking system, navigate to http://TARGET:8983/, where TARGET is the name or IP address of the target system.
You should see an HTTP 404 error page, but this is a special, helpful 404 page which gives you a link to the Solr web interface. Go ahead and click that link.
If you are testing against Solr 1.4.1, then literally the only thing you can do via the web interface is perform administrative functions by clicking the Solr Admin link. Of course, you should go ahead and do that.
Yes, by simply clicking on that link, you now have unrestricted access to Solr's web administration interface. When the Solr developers say "First and foremost, Solr does not concern itself with security either at the document level or the communication level.", they mean business! Feel free to explore the system - if you are using version 3.6.2 or earlier for this tutorial, you should be able to find at least two cross-site scripting bonus exploits.^[1]
All three modules described below require that you know the paths to the vulnerable request handlers. There are a couple of ways to determine this, but the easiest is to read the Solr configuration XML file via the web admin interface. In older versions of Solr, you can simply click the config link (in the example instance, this will take you to /solr/admin/file/?file=solrconfig.xml). In newer versions with the fancier dashboard, you may have trouble finding a link that will take you to that file, so just go ahead and try changing /#/ in the URL to /solr/admin/file/?file=solrconfig.xml. For example, if the main dashboard URL was http://172.16.17.2:8983/solr/#/, then change it to http://172.16.17.2:8983/solr/admin/file/?file=solrconfig.xml, and you should be taken to the configuration file.
Search in the text file for DocumentAnalysis, and at least one hit should be a requestHandler definition that looks something like this:

<requestHandler name="/analysis/document" class="solr.DocumentAnalysisRequestHandler" startup="lazy"/>
The value of the name attribute (in this case, /analysis/document) is the URI stem which is bound to Solr's Document Analysis Request Handler - the handler which is vulnerable up to and including version 4.3.0. In other words, if the target system's base Solr URL is http://172.16.17.2:8983/solr/#/, then one URL for a Document Analysis Request Handler is http://172.16.17.2:8983/solr/analysis/document. Make a note of this, because it will be important later. Solr can have multiple bindings to the same request handler, so, especially in a real-world deployment, the vulnerable stem may very well be something other than /analysis/document. This is why it's important to know how to locate the correct URI stem.
While you're looking at the file, do another search for AdminHandlers. The URI stems you'll find aren't important for this tutorial, but are definitely worth exploring if you are actually pen-testing a Solr instance.^[2]
If you want to try out the CVE-2013-6407-URH-DI module, you'll need to know the URI stem for Solr queries. In older versions, you can simply submit the query *:* from the main admin page and note the URL you're taken to (probably something like http://172.16.17.2:8983/solr/select, with a bunch of parameters after that). For newer versions, it's probably just /select after the base Solr URL (e.g. http://172.16.17.2:8983/solr/select), but please let me know if there's a better way to determine this.
If you want to try out the CVE-2013-6407-URH-DI module or the CVE-2013-6407-URH-NMVF module, you'll need to know the URI stem for Solr's Update Request Handler. In every installation I've seen, this handler was simply /update (e.g. http://172.16.17.2:8983/solr/update), but if anyone knows of a way to determine it from the web interface, please let me know.
At this point, you should have all the information you need to use any of the three Solr-related modules, which are described below.
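The search through solrconfig.xml can also be done programmatically once you have saved a copy of the file. This is a rough sketch in Python 3, not part of On The Outside, Reaching In. The first requestHandler line in the sample is the one shown above; the second binding (/custom/analysis) and both function names are hypothetical, included only to show that several stems can point at the same handler class.

```python
import xml.etree.ElementTree as ET

# First entry is from the example instance; the second is a made-up extra binding.
SAMPLE_CONFIG = """<config>
  <requestHandler name="/analysis/document" class="solr.DocumentAnalysisRequestHandler" startup="lazy"/>
  <requestHandler name="/custom/analysis" class="solr.DocumentAnalysisRequestHandler"/>
</config>"""


def find_handler_stems(config_xml, class_fragment):
    """Return the name of every requestHandler whose class contains class_fragment."""
    root = ET.fromstring(config_xml)
    return [
        handler.get("name")
        for handler in root.iter("requestHandler")
        if class_fragment in handler.get("class", "")
    ]


def handler_url(base_url, stem):
    """Join a base Solr URL (with or without a trailing "#/") and a URI stem."""
    base = base_url.split("#")[0].rstrip("/")
    return base + "/" + stem.lstrip("/")


for stem in find_handler_stems(SAMPLE_CONFIG, "DocumentAnalysis"):
    print(handler_url("http://172.16.17.2:8983/solr/#/", stem))
```

Searching on a fragment of the class name rather than on the stem mirrors the manual approach: it finds the handler however it has been bound.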

Preparing to Exploit Solr - Screenshots

Most helpful 404 page evar

The initial Solr page in versions up to 3.6.2

The Solr admin interface, including the query *:*

DocumentAnalysisRequestHandler binding in solrconfig.xml

AdminHandlers binding in solrconfig.xml

Finding the query handler (note the URL) in older versions of Solr

The Solr web interface in newer versions (4.3.0, etc.)

Illustrations related to the preceding section. As you can see from the query test screenshot, this particular instance of Solr was used for testing early versions of the module which failed to delete at least one document.

CVE-2013-6407-DARH - DocumentAnalysisRequestHandler Module

OK — let's start accessing some files we shouldn't be able to get to!

The CVE-2013-6407-DARH module is (for most purposes) the best Solr module included with On The Outside, Reaching In. It's also very easy to use. We'll start out with the basic example of obtaining the /etc/passwd file from the target system, because knowing the names of its users can be very helpful.

In your notes from the previous section, look up the full URL for the Document Analysis Request Handler that you found by examining solrconfig.xml. For the example instance, this will be something like http://172.16.17.2:8983/solr/analysis/document.
Determine the base path that you'd like On The Outside, Reaching In to drop output files into. It will recreate (as closely as possible) the directory structure from the target system. For purposes of this tutorial, we'll have it use a subdirectory of the current directory called output-solr. If that directory doesn't already exist, that's fine — it will be created automatically.
On the attacking system, cd into the directory where you unpacked On The Outside, Reaching In, then run this command - substituting your own values for the two just mentioned:

python ./otori.py --clone --module "CVE-2013-6407-DARH" --singleuri "file:///etc/passwd" --module-options "http://172.16.17.2:8983/solr/analysis/document" --outputbase "./output-solr" --overwrite --noemptyfiles
Almost immediately, you should see a result like the one in the first illustration, below - a single successful file-cloning operation. If you received an error, make sure you use the correct URL for your own simulated target system.
If you browse through the output directory structure (./output-solr/etc/ in this case), you should find the /etc/passwd from the target system.
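To make the relationship between the requested URI and the output location concrete, here is a small sketch in Python 3 of one way such a mapping could work. It is an illustration under my own assumptions, not the code On The Outside, Reaching In actually uses, and the function name is hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlparse


def local_path_for(uri, output_base):
    """Map a file:// URI from the target onto a path under output_base."""
    remote_path = unquote(urlparse(uri).path)
    # Normalise ("/a/../b" becomes "/b") and drop the leading slash so the
    # result always stays inside output_base.
    relative = posixpath.normpath(remote_path).lstrip("/")
    return posixpath.join(output_base, relative)


print(local_path_for("file:///etc/passwd", "./output-solr"))
```

The normalisation step matters for any tool like this: content from the target should never be able to steer writes outside the output directory on the attacking system.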

Well, that was neat, but the real power of On The Outside, Reaching In is in its ability to automate a large number of this type of request. In particular, because Solr is Java-based, On The Outside, Reaching In can "walk" a directory structure for you, recreating it (as best as possible based on which files can and cannot be retrieved) on the attacking system. In most cases, it's perfectly possible to point it at the root of the target server's filesystem and let it go, but this will usually take many hours to complete (and will generate enough traffic that an attentive administrator or IDS/IPS may notice your activities), so a more targeted attack is advisable if possible.

From examining the /etc/passwd file you just retrieved, determine the base path on the target system for users' home directories. When writing this tutorial, that base path was /home, as it is on many (but not all) Unix/Linux systems. Convert that path into a URI — in this case, file:///home/.
Run the following command to clone as much of the /home directory structure as possible from the target system:

python ./otori.py --clone --module "CVE-2013-6407-DARH" --rooturi "file:///home/" --module-options "http://172.16.17.2:8983/solr/analysis/document" --outputbase "./output-solr" --overwrite --noemptyfiles --noemptydirs
This operation will take longer than the first example. In addition to the obvious reason that On The Outside, Reaching In is attempting to download a number of files instead of just one, there is another hitch: although Java's XML library allows for directory listings to be obtained (this is the foundation upon which the dynamic recursion is based), it does not provide a way of determining whether the text it returns is actually a list of files or the content of a file.^[3] Therefore, the method that On The Outside, Reaching In uses is (when operating in recursive mode) to treat most results as if they were directory listings, and attempt to request files accordingly. If more than a certain number of results come back indicating that no such file exists, then that particular result will be treated as a file instead of a directory listing. Obviously, this is fairly noisy and — especially for elaborate directory structures — can slow down the operation.
In my case, I got lucky and found a home directory (/home/chibi/) with insecure file permissions, so I was able to retrieve the SSH key files for that user. Most SSH key files are stored as base64-encoded data, which means that they can be successfully retrieved by the Solr modules because they only contain ASCII text.
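The file-versus-directory guesswork described above can be sketched roughly as follows. This is a simplified illustration in Python 3 under my own assumptions (the threshold value, the function names, and the fake target contents are all hypothetical), not the logic the tool actually implements. The fetch function is a stand-in for whatever retrieves text from the target, and it ignores a trailing slash in the same way footnote 3 describes.

```python
# Hypothetical target contents, keyed by URI without a trailing slash.
FAKE_TARGET = {
    "file:///home": "chibi\nblincoln",
    "file:///home/chibi": ".bashrc\nnotes.txt",
    "file:///home/chibi/.bashrc": "export A=1\nalias ll=ls\n# one\n# two\n# three",
    "file:///home/chibi/notes.txt": "hello",
}


def fake_fetch(uri):
    """Return the text at uri, or None if nothing could be retrieved."""
    return FAKE_TARGET.get(uri.rstrip("/"))


def classify(uri, fetch, miss_limit=3):
    """Guess whether uri is a directory listing or a file.

    Every line of the result is treated as a possible child name. If more
    than miss_limit of those children cannot be retrieved, or none can,
    the result is assumed to be file content.
    """
    text = fetch(uri)
    if text is None:
        return "missing", []
    base = uri if uri.endswith("/") else uri + "/"
    misses = 0
    children = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        child = base + name
        if fetch(child) is None:
            misses += 1
            if misses > miss_limit:
                return "file", []
        else:
            children.append(child)
    if children:
        return "directory", children
    return "file", []


print(classify("file:///home/", fake_fetch))
print(classify("file:///home/chibi/.bashrc", fake_fetch))
```

Note how every line of a real file costs one failed request before the threshold is reached, which is where both the noise and the slowdown come from. A directory whose children are all unreadable would be misclassified as a file by this sketch.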

If system administrators are at all diligent, you probably won't be able to retrieve SSH private keys for other users in a real pen-test. But what is very likely — at least in larger organizations — is that Solr will be running as a service account, and you will be able to retrieve any SSH keys in the home directory for that service account.

However, let's say you came up empty, and it's getting late in the day. You don't mind kicking off a long-running process, because you're going to get some sleep and start actively pen-testing again in the morning.

First, build up a list of directories you want to be sure to ignore. Ideally, you would grab everything, but there are a few directory trees on Linux that can turn into rabbit holes that will slow you down, at least if you let On The Outside, Reaching In run to deep levels. There is a basic pattern file included with On The Outside, Reaching In (data/uri_blacklist-linux-basic.txt) which will exclude /dev/, /sys/, and most of the contents of /proc/, and that file will be used for this example^[4]. Of course, you can create your own as well — it's just a list of regular expressions.
Run the following command, which will perform another recursive file-cloning operation from the root of the target server's filesystem, but excluding the paths previously mentioned, limiting the maximum depth of recursion to 10 directories down in any given tree, and discarding any files which contain only whitespace:

python ./otori.py --clone --module "CVE-2013-6407-DARH" --rooturi "file:///" --module-options "http://172.16.17.2:8983/solr/analysis/document" --outputbase "./output-solr-hail_mary" --overwrite --maxdepth 10 --uriblacklist "./data/uri_blacklist-linux-basic.txt" --noemptyfiles --nowhitespacefiles --noemptydirs
Several-to-many hours later (depending on the complexity of the target filesystem), you should have a recreated directory structure against which you can use one of my favourite Unix/Linux pen-testing techniques: grepping for passwords! Unix and Linux admins can rarely resist putting at least one password for a privileged account in plaintext somewhere on a server, and since this operation ran in the context of the account used to run Solr, it may very well have access to that type of file.
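Since the exclusion file is just a list of regular expressions, it is easy to reason about what a given pattern list and depth limit will skip. The sketch below (Python 3) shows one plausible way to apply them. The patterns are hypothetical examples written for this illustration, not the contents of data/uri_blacklist-linux-basic.txt, and whether the tool matches from the start of the URI or anywhere within it is an assumption on my part.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns, one regular expression per line as in a pattern file.
PATTERN_LINES = [
    r"^file:///dev/",
    r"^file:///sys/",
    r"^file:///proc/\d+/",
]


def load_patterns(lines):
    """Compile every non-blank line as a regular expression."""
    return [re.compile(line.strip()) for line in lines if line.strip()]


def depth(uri):
    """Count the path components in a URI (file:///a/b/c has depth 3)."""
    return len([part for part in urlparse(uri).path.split("/") if part])


def should_fetch(uri, patterns, max_depth):
    """Return False for URIs that are too deep or match an exclusion pattern."""
    if depth(uri) > max_depth:
        return False
    return not any(pattern.search(uri) for pattern in patterns)


patterns = load_patterns(PATTERN_LINES)
for candidate in ("file:///etc/passwd", "file:///dev/sda", "file:///proc/1234/mem"):
    print(candidate, should_fetch(candidate, patterns, 10))
```

Testing your own patterns this way before a long run is cheap, and it beats discovering eight hours later that the crawl spent the night inside /sys/.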

CVE-2013-6407-DARH - Screenshots

Retrieving the /etc/passwd file


Retrieving as much of the /home/chibi directory structure as possible

Retrieved SSH private key

Illustrations related to the preceding section.

One final example scenario for this module: the use of the memory-exhaustion denial-of-service functionality. I am not normally a fan of DoS attacks, because they tend to remove possibilities instead of opening new ones. However, in certain cases, they can be helpful. Perhaps you have a way to change the configuration of a running service, but you don't have the ability to restart that service to cause the change to take effect — causing that service to stop responding may prompt a system administrator to restart it for you. Maybe you need to create a distraction by causing the outage of a system you're not interested in to divert attention away from your other activities. You may also be a maniac who can't be reasoned with, on a mission to break as many things as possible in order to obtain the maximum possible number of "lulz".

The Solr modules are also the only ones (as of this writing) which can generally be counted on to reliably work in DoS mode, so this is as good a place as any to demonstrate them.

While you are testing this functionality, I strongly recommend having a kill -9 command at the ready on the target system so that you can immediately shut down Solr once the out-of-memory condition has occurred. This will save you some time, since typing commands may become very tedious with that system under load.

First, the "Billion Laughs" method:

On the attacking system, run the following command:

python ./otori.py --dos-lulz --module "CVE-2013-6407-DARH" --module-options "http://172.16.17.2:8983/solr/analysis/document"
If you are monitoring memory usage on the target system, you should see the Java process' memory jump way up. If everything worked as intended, the Solr web application will also be unresponsive, although this depends on the target system's patch level and configuration.

Second, the "Quadratic Blowup" method:

On the attacking system, run the following command:

python ./otori.py --dos-quad --module "CVE-2013-6407-DARH" --module-options "http://172.16.17.2:8983/solr/analysis/document"
Again, if you are monitoring memory usage on the target system, you should see the Java process' memory jump, but a true DoS condition may be elusive depending on a variety of factors.
Sometimes it can be helpful to inch closer and closer to the memory limit of the target application. In addition to the string which is repeated in memory (controlled using the --dos-string parameter), the two denial-of-service modes both have two fine-tuning values. For --dos-quad, these are --dos-quad-base and --dos-quad-mult. For --dos-lulz, they are --dos-lulz-base and --dos-lulz-exp. Tuning either or both of these values will change the memory-usage estimates displayed in the On The Outside, Reaching In output.
If you haven't already caused Solr to stop responding, try running the following command a few times to see if you can bump the Java process' memory usage up closer to its limit:

python ./otori.py --dos-quad --module "CVE-2013-6407-DARH" --module-options "http://172.16.17.2:8983/solr/analysis/document" --dos-quad-base 1000 --dos-quad-mult 1000
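For a feel of how the two tuning values interact, the arithmetic behind the two general techniques can be sketched as below (Python 3). This is the textbook calculation for each attack class, not the tool's own estimate: I am assuming that the "base" and "exp" values correspond to the number of references per level and the number of nesting levels in a Billion Laughs document, and that "base" and "mult" correspond to the repeat count inside the entity and the number of references to it in a Quadratic Blowup document. The function names and sample strings are hypothetical.

```python
def lulz_expanded_chars(dos_string, base, exp):
    """Nested entities: each of exp levels references the previous level base times."""
    return len(dos_string) * base ** exp


def quad_expanded_chars(dos_string, base, mult):
    """One long entity (dos_string repeated base times), referenced mult times."""
    return len(dos_string) * base * mult


# The classic Billion Laughs shape: "lol", 10 references per level, 9 levels.
print(lulz_expanded_chars("lol", 10, 9))
# A quadratic example using 1000 and 1000 with a one-character string.
print(quad_expanded_chars("A", 1000, 1000))
```

The difference in growth is the practical point: the nested form grows exponentially with the number of levels, so small changes to the exponent move the total a long way, while the quadratic form scales linearly in each value and is easier to nudge toward a memory limit in small steps.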

CVE-2013-6407-DARH - Denial-of-Service Screenshots

Normal Java memory use for Solr

Billion Laughs

Post-Lulz Java memory usage

Quadratic Blowup

Post-QB Java memory usage

Illustrations related to the preceding section.

CVE-2013-6407-URH-DI - UpdateRequestHandler / Document Insert

The CVE-2013-6407-URH-DI module works by attempting to insert a valid document into Solr, read it back, and then delete that document. Of course, the document's contents include the text obtained via the XXE attack. If that text includes something that Solr can't store (binary data, or unescaped HTML/XML-style markup), then the insert operation will fail.

Since this module is making use of Solr's core functionality, it is more likely to go unnoticed than the other two modules in this tutorial. However, it is also noticeably slower. It will also only work against Solr versions 4.0.0 and earlier, unlike the CVE-2013-6407-DARH module.

Using the CVE-2013-6407-URH-DI module is very similar to CVE-2013-6407-DARH, but you will need two different URLs from your initial notes: the URL of the Update Request Handler, and the URL of the Solr query interface. If you are using the example instance, these are almost certainly something like http://172.16.17.2:8983/solr/update and http://172.16.17.2:8983/solr/select, respectively.
All of the other arguments to On The Outside, Reaching In are the same as for the CVE-2013-6407-DARH module. For example, to attempt to download the entire /opt directory tree (to a maximum depth of 10 directories), the command would be:

python ./otori.py --clone --module "CVE-2013-6407-URH-DI" --rooturi "file:///opt" --module-options "http://172.16.17.2:8983/solr/update" "http://172.16.17.2:8983/solr/select" --outputbase "./output-solr" --overwrite --maxdepth 10 --uriblacklist "./data/uri_blacklist-linux-basic.txt" --noemptyfiles --nowhitespacefiles --noemptydirs

Again, all of the other arguments are exactly the same, so you can request individual files, attempt a denial-of-service, and so on. All that must be changed is the module name and its options (in this case, the CVE-2013-6407-URH-DI module requires two URLs instead of the one required by the CVE-2013-6407-DARH module).

CVE-2013-6407-URH-NMVF - UpdateRequestHandler / Non-Multi-Valued-Field Module

The CVE-2013-6407-URH-NMVF module is a modified, cut-down version of CVE-2013-6407-URH-DI. It makes a request to insert a document, but the document is crafted to violate a constraint limiting the number of values for a particular field to one. The XXE-based content is reflected back to the client in the error message, and no document is actually inserted into the Solr index. It is a very close equivalent to CVE-2013-6407-DARH in terms of speed and noisiness of errors in the Solr logs, but the vulnerability only exists up through version 4.0.0 of Solr (version 4.3.0 for CVE-2013-6407-DARH). Unless access to the Document Analysis Request Handler has been disabled, you should probably use CVE-2013-6407-DARH instead of CVE-2013-6407-URH-NMVF. However, if you would like to practice:

Only one URL is required from your notes - the URL of the Update Request Handler (http://172.16.17.2:8983/solr/update, in the example).
All of the other arguments to On The Outside, Reaching In are the same as for the CVE-2013-6407-DARH and CVE-2013-6407-URH-DI modules. For example, to attempt to download the file /etc/hosts, the command would be:

python ./otori.py --clone --module "CVE-2013-6407-URH-NMVF" --singleuri "file:///etc/hosts" --module-options "http://172.16.17.2:8983/solr/update" --outputbase "./output-solr" --overwrite --noemptyfiles

Footnotes

1.	One of them is in analysis.jsp (/solr/admin/analysis.jsp in the example instance). The other is in debug.jsp (/solr/admin/debug.jsp in the example instance).
2.	You can also unpack the solr.war file for the version of Solr you're testing (e.g. from /example/webapps under the base Solr directory on the target system) and examine the JSP files to find more paths - not all of the content is linked from the main web admin interface.
3.	For example, if there is a file named /home/chibi/somefile.txt on the target system, then both file:///home/chibi/somefile.txt and file:///home/chibi/somefile.txt/ are URIs which will cause Java to retrieve the contents of that file.
4.	A targeted scraping of /proc/ can be very useful - see OTORI - Exploring the Linux Filesystem.