With all the web applications out on the internet today, and especially the ones built and configured by novices, it's easy to find vulnerabilities. Some are more perilous than others, but the consequences of even the slightest breach can be tremendous in the hands of a skilled hacker. Directory traversal is a relatively simple attack but can be used to expose sensitive information on a server.
Modern web applications and web servers usually contain quite a bit of information in addition to the standard HTML and CSS, including scripts, images, templates, and configuration files. A web server typically restricts users from accessing anything above the root directory, or web document root, on the server's file system by enforcing access controls such as access control lists.
Directory traversal attacks arise when misconfigurations or insufficient input validation allow access to directories above the root, permitting an attacker to view or modify system files. This type of attack is also known as path traversal, directory climbing, backtracking, or the dot-dot-slash (../) attack because of the characters used.
Directory traversal vulnerabilities can be found by testing HTTP requests, forms, and cookies, but the easiest way to see if an application is vulnerable to this type of attack is by simply determining if a URL uses a GET query. A GET request contains the parameters directly in the URL and would look something like this:
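As a minimal sketch of what "parameters directly in the URL" means, the Python snippet below pulls the GET parameters out of a URL. The URL, the `page` parameter, and the `query_parameters` helper are hypothetical and chosen only for illustration; they are not taken from the application tested in this article.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL for illustration only.
EXAMPLE_URL = "http://example.com/index.php?page=about.html"


def query_parameters(url: str) -> dict:
    """Return the GET parameters embedded in a URL as {name: [values]}."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    # A parameter whose value looks like a file name is a natural
    # candidate to test for directory traversal.
    print(query_parameters(EXAMPLE_URL))
```

Any parameter whose value resembles a file name or path is worth a closer look, since the application may pass it straight to the file system.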
It takes a bit of guesswork, but sometimes sensitive information can be exposed by climbing up the directory tree. On the command line, cd changes directories, and cd .. (two dots) moves to the parent directory, one level above the current directory.
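The same parent-directory resolution can be sketched in Python. This is an illustration only: the base directory `/var/www/html/pages` and the `resolve` helper are assumptions, and `posixpath.normpath` resolves the path purely as a string rather than touching a real file system.

```python
import posixpath

# Hypothetical directory the application reads its pages from.
BASE_DIR = "/var/www/html/pages"


def resolve(user_path: str, base_dir: str = BASE_DIR) -> str:
    """Join user input onto base_dir and collapse any ../ segments."""
    return posixpath.normpath(posixpath.join(base_dir, user_path))


if __name__ == "__main__":
    # Each extra ../ climbs one directory closer to the file system root.
    for depth in range(1, 6):
        candidate = "../" * depth + "etc/passwd"
        print(depth, resolve(candidate))
```

Once the path has climbed to the file system root, additional ../ segments have no further effect, which is why the exact depth does not need to be known in advance and overshooting is harmless.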
By appending ../ directly to the file path in the URL, we can attempt to change into higher directories in an effort to view system files and information not meant to be internet-facing. We can start out by trying to go up a few levels to access /etc/passwd, but we can see this throws some errors:
After climbing a few more levels, we finally hit paydirt and the contents of /etc/passwd are displayed to us right in the browser:
The /etc/passwd file contains information about users on the system, such as usernames, identifiers, home directories, and password information (although this is typically set to x or *, as the actual password information is usually stored elsewhere).
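To make the layout concrete, here is a small sketch that splits one /etc/passwd line into its seven colon-separated fields. The sample line and the `PasswdEntry` and `parse_passwd_line` names are illustrative assumptions, not output from the target system.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    username: str
    password: str  # usually "x" or "*"; the real hash is stored elsewhere
    uid: int
    gid: int
    gecos: str     # free-form comment field, often the full name
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split a single /etc/passwd line into its seven fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


if __name__ == "__main__":
    # Illustrative sample line in the standard format.
    sample = "root:x:0:0:root:/root:/bin/bash"
    entry = parse_passwd_line(sample)
    print(entry.username, entry.uid, entry.home, entry.shell)
```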
Other files of interest include the /etc/group file, which contains information about the groups to which users belong:
The /etc/profile file, which defines umask and default variables for users:
The /etc/issue file, which contains system information or a message to be displayed at login:
The /proc/version file, which lists the Linux kernel version in use:
The /proc/cpuinfo file, which contains CPU and processor information:
And the /proc/self/environ file, which contains the environment variables of the process reading it (in this case, the web server process):
Directory traversal on other operating systems works in a similar manner, but there are slight differences involved. For instance, Windows uses the backslash character as a directory separator and the root directory is a drive letter (often C:\). Some notable files to look for on Windows are:
Of course, there are a lot more files that could yield interesting things, so if system-level access is attained, it would be wise to spend some time digging around for sensitive information.
In certain situations, such as when a web application filters special characters, encoding can be used to circumvent input validation so that the attack succeeds. We've seen this used in other attacks such as SQL injection, and the same techniques apply to directory traversal.
The two primary methods of encoding that are normally used are URL encoding and Unicode encoding. On Unix systems that typically utilize forward slashes, URL encoding for the character sequence ../ would look like one of these:
- %2e%2e%2f
- %2e%2e/
- ..%2f
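A short sketch shows why these variants slip past a filter that only looks for the literal ../ string, and why a check should run on the decoded value instead. The `naive_filter_blocks` and `contains_traversal` helpers are hypothetical names used for illustration.

```python
from urllib.parse import unquote

# The three URL-encoded forms of ../ listed above.
VARIANTS = ["%2e%2e%2f", "%2e%2e/", "..%2f"]


def naive_filter_blocks(raw: str) -> bool:
    """A weak filter that only looks for the literal ../ sequence."""
    return "../" in raw


def contains_traversal(raw: str) -> bool:
    """Decode first, then look for a '..' path segment."""
    decoded = unquote(raw).replace("\\", "/")
    return ".." in decoded.split("/")


if __name__ == "__main__":
    for variant in VARIANTS:
        print(variant, naive_filter_blocks(variant), contains_traversal(variant))
```

The design point is ordering: validation has to happen after the input is decoded, on the same string the file system will eventually see.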
Unicode encoding for the same character sequence:
On Windows systems that typically use backslashes, URL encoding for ..\ would look like:
- %2e%2e%5c
- %2e%2e\
- ..%5c
Unicode encoding for the same sequence:
Oftentimes an application will only allow a certain file type to be viewed, whether it's a page that explicitly ends in .php or a PDF document. We can get around this by appending a null byte to the request in order to terminate the filename and bypass this restriction, like so:
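The effect of the null byte can be sketched as follows. The payload, the .pdf restriction, and the helper names are assumptions for illustration; `as_c_string` merely mimics how a C-style string API stops reading at the first NUL character.

```python
from urllib.parse import unquote


def naive_extension_check(filename: str) -> bool:
    """Accept the request only if the name ends in .pdf."""
    return filename.endswith(".pdf")


def as_c_string(filename: str) -> str:
    """What a C-style API would see: everything before the first NUL."""
    return filename.split("\x00", 1)[0]


def is_acceptable(filename: str) -> bool:
    """Reject any name containing a NUL before checking the extension."""
    return "\x00" not in filename and naive_extension_check(filename)


if __name__ == "__main__":
    # Hypothetical request value; %00 is the URL-encoded null byte.
    payload = unquote("../../../../etc/passwd%00.pdf")
    print(naive_extension_check(payload))  # the extension check is satisfied
    print(as_c_string(payload))            # but the path ends before .pdf
    print(is_acceptable(payload))          # rejecting NUL closes the gap
```

Python's built-in open() already refuses file names with an embedded null byte, so this sketch only models the mismatch between the extension check and a lower-level file API.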
While directory traversal has the potential to be a devastating attack for an administrator, it is fortunately relatively easy to protect against. The most important thing to do is use appropriate access control lists and ensure the proper file privileges are set in place. Also, unless it is absolutely necessary, avoid storing any sensitive information or configuration files inside the web document root. If there is nothing of importance on the server to begin with, the repercussions of an attack are greatly reduced.
As with most web-facing applications, another important step is to ensure proper input validation is in place. In fact, if it can be avoided, it's better to keep user input out of file system operations entirely. Whitelisting known good input can also be used as an additional measure to minimize the risk of an attacker exploiting any misconfigurations.
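One way these two measures might look in practice is sketched below: a whitelist of known pages, plus a check that the fully resolved path still sits inside the document root. The document root, the page names, and the function names are hypothetical; this is a sketch under those assumptions rather than a complete defense.

```python
import os

# Hypothetical whitelist of pages the application is willing to serve.
ALLOWED_PAGES = {"about.html", "contact.html"}


def safe_resolve(doc_root: str, user_path: str) -> str:
    """Resolve user_path under doc_root, refusing anything that escapes it."""
    root = os.path.realpath(doc_root)
    # realpath collapses ../ segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(root, user_path))
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("path escapes the document root")
    return candidate


def resolve_whitelisted(doc_root: str, page: str) -> str:
    """Only serve pages that appear in the whitelist."""
    if page not in ALLOWED_PAGES:
        raise ValueError("page is not whitelisted")
    return safe_resolve(doc_root, page)


if __name__ == "__main__":
    root = "/var/www/html"  # hypothetical document root
    print(resolve_whitelisted(root, "about.html"))
    try:
        safe_resolve(root, "../../etc/passwd")
    except ValueError as exc:
        print("rejected:", exc)
```

Comparing resolved paths rather than searching the input for ../ means encoded or otherwise disguised sequences are judged by where they actually lead.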
Another thing that can be done (especially if an admin wants to go above and beyond the call of duty) is to actually test whether the application is vulnerable to directory traversal. It's easy enough to attempt these procedures manually, but tools such as DirBuster, ZAP, and DotDotPwn can automate most of the testing.
Directory traversal allows an attacker to exploit security misconfigurations in an attempt to view or modify sensitive information. This is one of the simpler attacks to perform, but the results can be disastrous, particularly if personal or financial data is gleaned or if critical information about the server is compromised and used as a pivot point. As you can see, in the world of hacking, information is king.