Directory Traversal

Description

Web servers are generally set up to restrict public access to a specific portion of the Web server's file system. In a Directory Traversal or Path Traversal attack, an intruder manipulates a URL in such a way that the Web server executes or reveals the contents of a file anywhere on the server, residing outside of the Web server's root directory. Path Traversal attacks take advantage of special character sequences in URL input parameters, cookies, and HTTP request headers.

A common Path Traversal attack uses the "../" character sequence to alter the document or resource location requested in a URL. Although most Web servers block this technique by filtering or normalizing such sequences, alternate encodings of the "../" sequence can bypass basic security filters. Even if a Web server properly restricts Path Traversal attempts in the URL path, any application that exposes an HTTP-based interface is also potentially vulnerable to such attacks.

These variations include valid and invalid encodings of the path separators:

The forward slash character, for example, the URL-encoded sequence "%2e%2e%2f" (which decodes to "../") or the overlong, invalid UTF-8 encoding "..%c0%af" .
The backslash character, for example, the double URL encoding "..%255c" or the Unicode look-alike "..%u2216" .
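The effect of these encodings can be checked with a short sketch. It uses Python's standard urllib.parse.unquote as a stand-in for a server's URL decoder; the helper names decode_rounds and lenient_two_byte_decode are hypothetical, and the lenient decoder only imitates how a permissive legacy decoder might treat an overlong sequence.

```python
from urllib.parse import unquote


def decode_rounds(encoded: str) -> tuple[str, str]:
    """Return the value after one and after two rounds of URL decoding."""
    once = unquote(encoded)
    twice = unquote(once)
    return once, twice


def lenient_two_byte_decode(raw: bytes) -> str:
    """Imitate a permissive decoder for a two-byte UTF-8 sequence.

    It strips the marker bits without rejecting overlong forms,
    which a strict UTF-8 decoder is required to do.
    """
    code_point = ((raw[0] & 0x1F) << 6) | (raw[1] & 0x3F)
    return chr(code_point)


if __name__ == "__main__":
    for sample in ["%2e%2e%2f", "..%5c", "..%255c"]:
        once, twice = decode_rounds(sample)
        print(f"{sample!r} -> {once!r} -> {twice!r}")

    overlong = bytes.fromhex("c0af")  # the bytes behind "%c0%af"
    try:
        overlong.decode("utf-8")
    except UnicodeDecodeError:
        print("strict UTF-8 decoding rejects c0 af")
    print("lenient decoding yields", repr(lenient_two_byte_decode(overlong)))
```

A filter that inspects only the raw request string sees none of the literal "../" or "..\" sequences that these inputs eventually produce.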

Examples

Several typical Path Traversal attacks are shown below:

Path Traversal Attacks Against a Web Server

Example Code 1

http://example.test/../../../secret/file

This attack is the "classic" version of a Path Traversal attack. Most Web servers and applications will at least filter the '../' character string. However, it is worth noting that many applications running under Windows might also be vulnerable to the '..\' character string (backslash instead of slash).

Example Code 2

http://example.test/..%5c..%5c..%5csecret/file

The second attack uses escaped encoding ('%5c' translates to '\'). It relies on the assumption that the target application either has no relevant security checks for Path Traversal in place or that those checks are done before the translation of the escaped characters.

Example Code 3

http://example.test/..%255c..%255c..%255csecret/file

The third attack is a special version that is widely known for its use against a Web server that (unintentionally) decoded escaped characters twice, while the security checks were done only after the first decoding. Because '%25' translates to '%', the first decoding turned each '%255c' into '%5c'. At that point the third attack looked exactly like the second attack in its still-encoded form, so the security checks in place did not detect it. After the second decoding, each '%5c' was replaced by '\' and the attack string was complete.

Note that the string "%5c" within the URL is a URL escape code. Escape codes are used to represent characters in the form %nn, where nn stands for a two-digit hexadecimal number. The escape code "%5c" represents the character "\". The problem in this case was that the IIS root directory enforcer did not check for escape codes and allowed the request to proceed. The escape codes were decoded afterwards, so the traversal path reached the Web server's operating system and the command was executed.

This example demonstrates how 'creative' exploitable programming errors can be. Multiple decoding of masked characters is a common problem for many applications.
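The double-decoding flaw described above can be reproduced in a few lines. The following is a minimal sketch, not a reconstruction of any real server's code: flawed_check, fully_decode, and stricter_check are hypothetical names, and urllib.parse.unquote stands in for the server's decoder.

```python
from urllib.parse import unquote


def flawed_check(raw_path: str) -> str:
    """Validate after the first decode, then (wrongly) decode again."""
    once = unquote(raw_path)
    if "../" in once or "..\\" in once:
        raise ValueError("traversal blocked")
    return unquote(once)  # second decode happens after the check


def fully_decode(raw_path: str, max_rounds: int = 5) -> str:
    """Decode repeatedly until the value stops changing."""
    current = raw_path
    for _ in range(max_rounds):
        decoded = unquote(current)
        if decoded == current:
            return current
        current = decoded
    raise ValueError("too many encoding layers")


def stricter_check(raw_path: str) -> str:
    """Validate only the fully decoded path, treating both separators alike."""
    decoded = fully_decode(raw_path)
    if ".." in decoded.replace("\\", "/").split("/"):
        raise ValueError("traversal blocked")
    return decoded


if __name__ == "__main__":
    attack = "..%255c..%255csecret/file"
    print("flawed check lets through:", repr(flawed_check(attack)))
    try:
        stricter_check(attack)
    except ValueError as exc:
        print("stricter check:", exc)
```

Decoding until the value is stable is one possible design; an application that must accept a literal '%' in file names might instead reject any input that still changes on a second decode.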

Path Traversal Attacks Against a Web Application

Original URL

http://example.test/cgi-bin/index.cgi?web/web.html

Example of a Path Traversal Attack

http://example.test/cgi-bin/index.cgi?../cgi-bin/index.cgi

Obviously, the Web pages on this Web server are not addressed directly. Rather, this work is done by a script called index.cgi. The script evaluates the parameter (web/web.html) included in the URL after the question mark and outputs the designated file, probably doing some standard extra work such as adding a header and footer. If the attacker guessed the directory structure and the script did not perform appropriate input validation, the script would probably display its own source code to the attacker in a Web page, thus giving away valuable hints for further attacks.
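One way a script like index.cgi could validate its parameter is to resolve the requested path and confirm that the result still lies inside the directory it is meant to serve. The sketch below assumes a hypothetical document root (DOC_ROOT) and a hypothetical helper name (resolve_inside_root); neither comes from the example server.

```python
from pathlib import Path

# Hypothetical directory that the script is allowed to serve from.
DOC_ROOT = Path("/var/www/site/htdocs").resolve()


def resolve_inside_root(user_path: str, root: Path = DOC_ROOT) -> Path:
    """Resolve user_path against root and reject anything outside it."""
    # resolve() collapses ".." components and follows symlinks that exist.
    candidate = (root / user_path).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        raise PermissionError(f"path escapes document root: {user_path!r}") from None
    return candidate


if __name__ == "__main__":
    print(resolve_inside_root("web/web.html"))
    try:
        resolve_inside_root("../cgi-bin/index.cgi")
    except PermissionError as exc:
        print("rejected:", exc)
```

The check runs on the resolved path rather than on the raw string, so it does not depend on recognizing every spelling of "../". An absolute path supplied as the parameter is rejected as well, because joining it to the root discards the root.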

Path Traversal Attacks Using Special Characters

Original URL

http://example.test/cgi-bin/index.cgi?web/web.html

Example of a Path Traversal Attack

http://example.test/cgi-bin/index.cgi?../cgi-bin/index.cgi%00.html

One input validation technique consists of checking the extension of a file name parameter. The underlying idea is to display only files with a 'correct' extension such as '.html' or '.txt', thus preventing the application from displaying, for example, script code. The attack above uses the escaped encoded NULL character ('%00') to create a URL that ends with '.html' and therefore passes this validation step. However, it is likely that the script, when it uses the parameter, will stop evaluating the parameter string as soon as it reaches the NULL character, and once again might be tricked into displaying its source code to the attacker.
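The mismatch between the extension check and the string that is finally used can be illustrated as follows. This is a sketch under stated assumptions: naive_extension_ok, c_string_view, and validate_page_param are hypothetical names, and c_string_view merely imitates a component that treats NULL as the end of a string.

```python
import posixpath
from urllib.parse import unquote

ALLOWED_EXTENSIONS = {".html", ".txt"}


def naive_extension_ok(raw_param: str) -> bool:
    """Check only that the decoded parameter ends in an allowed extension."""
    ext = posixpath.splitext(unquote(raw_param))[1].lower()
    return ext in ALLOWED_EXTENSIONS


def c_string_view(value: str) -> str:
    """Imitate a consumer that stops reading at the first NULL character."""
    return value.split("\x00", 1)[0]


def validate_page_param(raw_param: str) -> str:
    """Reject NULL characters before applying the extension check."""
    decoded = unquote(raw_param)
    if "\x00" in decoded:
        raise ValueError("NULL character in parameter")
    ext = posixpath.splitext(decoded)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension not allowed: {ext!r}")
    return decoded


if __name__ == "__main__":
    attack = "../cgi-bin/index.cgi%00.html"
    print("naive check passes:", naive_extension_ok(attack))
    print("file actually opened:", repr(c_string_view(unquote(attack))))
    try:
        validate_page_param(attack)
    except ValueError as exc:
        print("rejected:", exc)
```

This covers only the NULL character and the extension; it would still need to be combined with a check that the path stays inside the permitted directory.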

Possible ASCII Characters Used in Path Traversal Attacks

ASCII	Escaped encoding
NULL	%00
Space	%20
%	%25
.	%2e
/	%2f
:	%3a
\	%5c

What Do I Need to Do?

General recommendations to prevent Path Traversal attacks:

Do not implement file access functionality that is based on user input, unless there is no other alternative.
If you must allow user input, try to constrain it to a list of allowed files/paths.
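Constraining input to a list of allowed files can be as simple as mapping short identifiers to fixed paths, so that user input never becomes part of a file name. The identifiers, paths, and the lookup_page helper below are illustrative assumptions only.

```python
from pathlib import Path

# Hypothetical allow list: the request carries a key, never a path.
ALLOWED_PAGES = {
    "home": Path("web/web.html"),
    "contact": Path("web/contact.html"),
}


def lookup_page(page_id: str) -> Path:
    """Return the fixed path for a known identifier, or fail."""
    try:
        return ALLOWED_PAGES[page_id]
    except KeyError:
        raise LookupError(f"unknown page: {page_id!r}") from None


if __name__ == "__main__":
    print(lookup_page("home"))
    try:
        lookup_page("../cgi-bin/index.cgi")
    except LookupError as exc:
        print("rejected:", exc)
```

Because anything not in the mapping is refused outright, encoded or otherwise disguised traversal sequences have nothing to act on.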

You should also ensure that:

A code page (such as charset=ISO-8859-1) is defined, so that it is clear which characters are problematic.
The given input is filtered for malicious metacharacters.

In addition to the measures above, the Web server provides two main security mechanisms:

The root directory, which limits users' access to a specific directory in the Web server's file system.
The administrators' access control list, which limits users' access to specific files within the root directory.