[ Beneath the Waves ]

OTORI - Exploring the Linux Filesystem

article by Ben Lincoln


This article describes security testing-related software whose use may be restricted or prohibited in your place of residence or your workplace. The penalties for violating laws and regulations regarding security testing-related tools can be severe. Ensuring that you are allowed to use this software is your responsibility.

The Linux filesystem includes all sorts of handy features that can help you find out more about potential target systems in an authorized pen-test, and even discover vulnerabilities in cases of poorly-written or improperly-configured software. This is not an exhaustive list, but it does include my favourites (besides the obvious ones like /etc/shadow). Some of this content also exists on other Unix-like operating systems, of course, but this writeup will assume a Linux host. The specific focus is on files that can be retrieved via XXE, and even more specifically on retrieving those files using On The Outside, Reaching In.

Table of contents

  1. Processes: Command Lines and Environment Variables
  2. Processes: File Descriptors
  3. Processes: Executables
  4. Mounted Filesystems
  5. Symbol Tables
  6. System Information

Processes: Command Lines and Environment Variables

For a process with ID N:

Any or all of these may contain credentials, paths (allowing targeted retrieval of files), and other sensitive/valuable information. Unfortunately, the null bytes in the two most interesting files mean you'll probably need to retrieve them using an On The Outside, Reaching In module which targets a PHP-based application, as the Java XML parser will reject those special characters.

I know what you are probably thinking - "It's 2014 (or later)! Surely no one would be foolish enough to put sensitive information like passwords in the command line for starting a service." That's where you are wrong, my friend! For example, up until version 1.9.0 (released in February of 2014), reSIProcate (specifically, the Repro component) had its web administration password set via the command line at startup. I have personally seen such a version used in a fairly well-known "enterprise" software package.

In addition, a lot of people still write batch scripts (for both Unix/Linux and Windows®) which accept credentials as parameters. If you can view the command line for those batch scripts while they're executing, you've just captured those handy credentials without having to do a true memory-scraping attack against that process.

Both of these cases are easy to find with shell access to a system, but if all you have is access to a service vulnerable to XXE attacks, they can also be captured that way, which may remedy your lack of shell access.
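Once one of these null-delimited files has been retrieved, it still needs to be split apart before it is readable. The following is a minimal sketch of that step, assuming the file holds the usual Linux cmdline or environ layout (entries separated by null bytes). The function names and the sample data are hypothetical and are not part of On The Outside, Reaching In.

```python
from typing import Dict, List


def split_null_delimited(raw: bytes) -> List[str]:
    """Split the raw bytes of a retrieved cmdline or environ file into strings."""
    # Entries are separated by NUL bytes, usually with a trailing NUL.
    parts = raw.split(b"\x00")
    # Drop empty entries (this also discards genuinely empty arguments).
    return [p.decode("utf-8", errors="replace") for p in parts if p]


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Turn environ content into a dict, skipping entries without '='."""
    env = {}
    for entry in split_null_delimited(raw):
        name, sep, value = entry.partition("=")
        if sep:
            env[name] = value
    return env


if __name__ == "__main__":
    # Hypothetical sample content, not taken from a real system.
    sample = b"/usr/sbin/exampled\x00--password\x00example-secret\x00"
    print(split_null_delimited(sample))
```

Splitting on the null byte rather than on whitespace keeps arguments that contain spaces intact, which matters when a password or path includes one.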

On The Outside, Reaching In 0.2.1 and later include three files which attempt blind reads of some of this content for every process ID from 1 to 32768:


This file will attempt to capture the following content from within each process' directory. The list is a bit long, so it will take a while to run: about 10 seconds for each process when using the SOS-12-007-YU-OOB module, if you're lucky. That means a complete harvesting will probably take 3-4 days or longer.


This abbreviated file will attempt to capture the following content from within each process' directory. This typically runs much faster than the full list. When using the SOS-12-007-YU-OOB module, I was able to get a rate of about 2-3 seconds per process ID directory. You'll probably want to let this run over a weekend.


This file captures only the two files which are both likely to exist and most likely to include information that would be valuable in compromising the system. The single exception is the mount information files, which are not included because they're usually identical across all processes anyway. When using the SOS-12-007-YU-OOB module, I was able to get a rate of about 0.5-1 second per process ID directory. At that speed, it will take something like 5-9 hours to spin through all 32768 process IDs.
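The run-time estimates for the three lists follow from multiplying the per-process rate by the 32768 process IDs. A small sketch of that arithmetic is below. The function name is hypothetical, and the rates are simply the ones quoted above.

```python
PID_COUNT = 32768  # process IDs 1 through 32768, as covered by the lists


def estimated_hours(seconds_per_pid: float, pid_count: int = PID_COUNT) -> float:
    """Estimate the total run time in hours for one pass over all process IDs."""
    return seconds_per_pid * pid_count / 3600.0


if __name__ == "__main__":
    # (label, low rate, high rate) in seconds per process ID directory
    rates = [("full list", 10, 10), ("abbreviated list", 2, 3), ("minimal list", 0.5, 1)]
    for label, low, high in rates:
        print(f"{label}: {estimated_hours(low):.1f} to {estimated_hours(high):.1f} hours")
```

At 10 seconds per process this works out to a little under four days, and at 0.5-1 second per process to somewhere between roughly four and a half and nine hours, which lines up with the figures given for each list.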

With any of these lists, you'll want to have a find . -type d -empty -delete command at the ready to clean up the small mess afterwards.

If you're using a module that supports recursion, you can have it walk the directory structure in a much more efficient manner. For example:


python ./otori.py --clone --module "CVE-2013-6407-DARH" --rooturi "file:///proc/" --module-options "http://repro.vuln.local:8983/solr/analysis/document" --uriblacklist "./data/uri_blacklist-linux-proc_exploration.txt" --outputbase "./output-solr-proc" --overwrite --noerrorfiles --noemptyfiles --nowhitespacefiles --noemptydirs --maxdepth 7

...however, because most of the interesting files contain at least one null byte or other character that the Java XML parser rejects, this is likely to return fewer useful results than one of the modules that can return binary data.

Command Line
Repro - admin password

The full command used to launch an instance of reSIProcate's Repro component, including the password, retrieved using the SOS-12-007-YU-OOB module against a Squiz Matrix instance on the same target system.


Processes: File Descriptors

Any time a process on Linux opens a file handle or a network connection, it consumes a "file descriptor". These are tracked as symlinks in the fd subdirectory of each process' directory in /proc. If you can read them (e.g. using the lists/linux-process_information.txt list mentioned above), you may be able to obtain valuable data. Generally these files are only accessible to root or to the account that owns the process, so unless the server is badly misconfigured, you probably won't be able to read most of them, although for some processes they are world-readable.

File Descriptor Symlinks
Retrieved file content

An example of an open file retrieved using an XXE attack.
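To make the symlink layout concrete, here is a minimal local sketch that lists where each descriptor in an fd directory points. It assumes shell-level access for illustration only. It is not how the XXE modules retrieve content, and the function name is hypothetical.

```python
import os
from typing import Dict


def read_fd_targets(fd_dir: str) -> Dict[str, str]:
    """Map each descriptor number in fd_dir to the target of its symlink."""
    targets = {}
    try:
        names = os.listdir(fd_dir)
    except OSError:
        # Directory is missing or not readable by the current account.
        return targets
    for name in names:
        try:
            targets[name] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # Descriptor was closed in the meantime, or the entry is not a symlink.
            continue
    return targets


if __name__ == "__main__":
    # /proc/self/fd is the fd directory of the current process on Linux.
    for fd, target in sorted(read_fd_targets("/proc/self/fd").items()):
        print(fd, "->", target)
```

Targets that are regular file paths are the ones worth requesting remotely. Entries such as sockets and pipes show up as pseudo-targets rather than readable paths.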


Processes: Executables

The exe symlink within each process' directory in /proc points to the actual executable for the process. If it's binary and it's very small, then the modules based on vulnerable PHP applications (Squiz Matrix and Mahara) may be able to retrieve it. These files could easily contain useful information, but usually the constraints on the various modules prevent them from being retrieved, at least on my test systems.

Mounted Filesystems

/proc/mounts, /proc/N/mounts (where N is a valid process ID), and /etc/fstab each contain information about filesystems accessible to the OS or process. For local filesystems, this information may be of lesser use to an attacker, but if a filesystem is mounted over the network, it's possible to obtain at least some information regarding the existence of another machine on the network, and possibly the credentials used to authenticate to it.

Mounted Filesystems
The mounts file for a process
System-level /etc/fstab file and credentials

An example of sensitive data retrieved about mounted filesystems using XXE. The mounts file for a process reveals the UNC path to a Windows fileshare and the username used to connect to it. The /etc/fstab file includes the UNC path and the path to the file containing the credentials used for access. If that credential file is accidentally left accessible to the user under which an XXE-vulnerable app is running, it can be read remotely (as shown in the second screenshot).
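Picking the network mounts out of a retrieved mounts or fstab file can be sketched as follows. This assumes the standard whitespace-separated layout (device, mount point, filesystem type, options). The set of filesystem types is illustrative rather than exhaustive, and the function name and sample data are hypothetical.

```python
from typing import Dict, List

# Illustrative set of network filesystem types; extend as needed.
NETWORK_FS_TYPES = {"cifs", "smbfs", "nfs", "nfs4"}


def find_network_mounts(mounts_text: str) -> List[Dict[str, str]]:
    """Return the network-mounted entries from mounts or fstab content."""
    results = []
    for line in mounts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and fstab comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete entry
        device, mount_point, fs_type, options = fields[:4]
        if fs_type not in NETWORK_FS_TYPES:
            continue
        # Options are comma-separated, some in key=value form.
        opts = {}
        for opt in options.split(","):
            key, _, value = opt.partition("=")
            opts[key] = value
        results.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "username": opts.get("username", ""),
            "credentials": opts.get("credentials", ""),
        })
    return results


if __name__ == "__main__":
    # Hypothetical sample content, not taken from a real system.
    sample = (
        "/dev/sda1 / ext4 rw,relatime 0 1\n"
        "//fileserver.example/share /mnt/share cifs "
        "rw,credentials=/etc/example.cred 0 0\n"
    )
    for mount in find_network_mounts(sample):
        print(mount)
```

A non-empty credentials value is the path of the next file worth requesting, and a non-empty username is useful even when that credential file turns out to be unreadable.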


Symbol Tables

If you spend any time looking at the Linux-related code in the Exploit Database, you'll notice that a lot of it depends upon reading symbol information from /proc/kallsyms. The more flexible code will also use files in /boot whose names begin with System.map (/boot/System.map-3.5.0-23-generic, /boot/System.map-3.5.0-45-generic, etc.).

Most relatively recent versions of Linux will display the content of /proc/kallsyms with all addresses zeroed-out for anything other than the root account. However, my experience is that the map files in /boot will display the real addresses for all users. Not all distributions allow access to these files for accounts other than root, but some (e.g. Red Hat) do.

Most of the time, this information will probably be trivia, but if you ever find yourself in a position to execute a small amount of code on the server but need to know the symbol information in advance, maybe there is e.g. a vulnerable installation of Apache Solr running that you can use to retrieve these files.
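A quick way to tell whether a retrieved symbol file is useful is to check whether its addresses have been zeroed out. The sketch below assumes the usual "address type name" line layout shared by /proc/kallsyms and the System.map files. The function names and the sample data are hypothetical.

```python
from typing import List, Tuple

Symbol = Tuple[int, str, str]  # (address, type letter, name)


def parse_symbols(text: str) -> List[Symbol]:
    """Parse 'address type name' lines, ignoring anything malformed."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            address = int(fields[0], 16)
        except ValueError:
            continue  # first field is not a hexadecimal address
        # Any further field (e.g. a module name in brackets) is ignored.
        symbols.append((address, fields[1], fields[2]))
    return symbols


def addresses_are_masked(symbols: List[Symbol]) -> bool:
    """True if there is at least one symbol and every address is zero."""
    return bool(symbols) and all(address == 0 for address, _, _ in symbols)


if __name__ == "__main__":
    # Hypothetical sample content, not taken from a real system.
    sample = "0000000000000000 T example_symbol\n0000000000000000 t example_helper\n"
    print(addresses_are_masked(parse_symbols(sample)))
```

If the check reports masked addresses for /proc/kallsyms, the System.map files in /boot are the next thing to try.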

System Information

While it may not be as likely to help you compromise a system, a variety of files will help you figure out what sort of system it is, and what capacity it has.
