Exploring the Dangers of Path Traversal Vulnerabilities in Web Applications

A path traversal vulnerability is a critical security flaw that allows attackers to access files and directories outside the intended scope of a web application. By manipulating variables that reference files with ".." sequences and other directory traversal patterns, attackers can navigate the server's file system to gain unauthorized access to sensitive files, configuration files, and other critical data.

This might include:

  • Application code and data.
  • Credentials for back-end systems.
  • Sensitive operating system files.

In some cases, an attacker might be able to write to arbitrary files on the server, allowing them to modify application data or behavior, and ultimately take full control of the server.

Reading arbitrary files via path traversal

Imagine a shopping application that displays images of items for sale. This might load an image using the following HTML:

<img src="/loadImage?filename=218.png">

The loadImage URL takes a filename parameter and returns the contents of the specified file. The image files are stored on disk in the location /var/www/images/. To return an image, the application appends the requested filename to this base directory and uses a filesystem API to read the contents of the file. In other words, the application reads from the following file path:

/var/www/images/218.png

This application implements no defenses against path traversal attacks. As a result, an attacker can request the following URL to retrieve the /etc/passwd file from the server's filesystem:

https://insecure-website.com/loadImage?filename=../../../etc/passwd

This causes the application to read from the following file path:

/var/www/images/../../../etc/passwd

The sequence ../ is valid within a file path, and means to step up one level in the directory structure. The three consecutive ../ sequences step up from /var/www/images/ to the filesystem root, and so the file that is actually read is:

/etc/passwd

On Unix-based operating systems, this is a standard file containing details of the users that are registered on the server, but an attacker could retrieve other arbitrary files using the same technique.
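To make the resolution step concrete, here is a minimal sketch of the behaviour described above. The function name naive_image_path is hypothetical; the base directory and payload come from the example. It uses posixpath so the result is the same on any platform.

```python
import posixpath

BASE_DIR = "/var/www/images/"  # base directory from the example above


def naive_image_path(filename: str) -> str:
    # Mirrors the vulnerable behaviour: user input is appended to the
    # base directory without any validation.
    return BASE_DIR + filename


requested = naive_image_path("../../../etc/passwd")
print(requested)                      # /var/www/images/../../../etc/passwd
print(posixpath.normpath(requested))  # /etc/passwd
```

The second line of output is the path the filesystem effectively reads once the ../ segments are collapsed.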

On Windows, both ../ and ..\ are valid directory traversal sequences. The following is an example of an equivalent attack against a Windows-based server:

https://insecure-website.com/loadImage?filename=..\..\..\windows\win.ini

Example:

Reading `/etc/passwd` file through filename parameter

File path traversal, traversal sequences blocked with absolute path bypass

Many applications that place user input into file paths implement defenses against path traversal attacks. These can often be bypassed.

If an application strips or blocks directory traversal sequences from the user-supplied filename, it might be possible to bypass the defense using a variety of techniques.

It might be possible to use an absolute path from the filesystem root, such as filename=/etc/passwd, to directly reference a file without using any traversal sequences.

Example Scenario:

Imagine a web application that allows users to download files from a specific directory on the server. The application includes a check to prevent path traversal attacks by blocking traversal sequences like "../". However, it does not adequately handle absolute path inputs.

Code vulnerable to path traversal attack
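The original listing is not reproduced here, so the following is only a sketch of the kind of check this scenario describes. BASE_PATH and build_download_path are hypothetical names, and posixpath is used so the Unix behaviour is shown on any platform.

```python
import posixpath

BASE_PATH = "/var/www/files"  # hypothetical download directory


def build_download_path(file_name: str) -> str:
    # Block the obvious traversal sequences...
    if "../" in file_name or "..\\" in file_name:
        raise ValueError("traversal sequence blocked")
    # ...but join() discards BASE_PATH entirely when file_name is absolute.
    return posixpath.join(BASE_PATH, file_name)


print(build_download_path("report.pdf"))   # /var/www/files/report.pdf
print(build_download_path("/etc/passwd"))  # /etc/passwd
```

The second call contains no traversal sequence at all, yet the resulting path points outside the download directory.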

A check like this correctly identifies and blocks traversal sequences like "../". However, it does not prevent an attacker from using an absolute path to access files outside the intended directory.

Secure Implementation:

Secure code that is not vulnerable to path traversal attack
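A hedged sketch of what such a secure version might look like is shown below. The function name resolve_download_path and the final containment check are assumptions added for illustration. The file_name variable and the use of os.path.realpath() follow the description in this section.

```python
import os


def resolve_download_path(base_path: str, file_name: str) -> str:
    # Reject traversal sequences and absolute paths up front.
    if "../" in file_name or "..\\" in file_name:
        raise ValueError("traversal sequence blocked")
    if file_name.startswith("/"):
        raise ValueError("absolute paths are not allowed")

    # Resolve symbolic links and relative components to canonical forms.
    real_base = os.path.realpath(base_path)
    real_full = os.path.realpath(os.path.join(real_base, file_name))

    # Assumed extra step: the canonical path must still sit inside the
    # canonical base directory (this also catches symlinks pointing outside).
    if not real_full.startswith(real_base + os.sep):
        raise ValueError("path escapes the base directory")
    return real_full
```

A caller would pass the configured download directory and the user-supplied name, and treat any ValueError as a rejected request.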

The code now also checks if the file_name starts with a "/", preventing absolute paths.

The code resolves both the base path and the full path to their canonical forms using os.path.realpath(). This ensures that any symbolic links or relative paths are properly resolved.

Real World Example:

Fetching /etc/passwd file as absolute path

File path traversal, traversal sequences stripped non-recursively

In this variant, the application removes or rejects traversal sequences (such as ../ or ..\) in a single, non-recursive pass, typically using string manipulation functions or regular expressions that detect sequences like ../ in the input path.

For example, a basic input validation might remove "../" from user inputs to prevent directory traversal. However, if this is done non-recursively, an attacker can bypass the protection with nested sequences such as "....//" or "..././", which collapse back into "../" once the inner sequence is stripped. Encoding tricks that a simple filter does not catch can have the same effect.
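The effect is easy to reproduce. The sketch below uses two hypothetical helpers: strip_once models a single-pass filter, and strip_until_stable repeats the pass until nothing changes. It illustrates the flaw under those assumptions and is not a recommended defense on its own.

```python
def strip_once(path: str) -> str:
    # Single left-to-right pass; the result is never re-checked.
    return path.replace("../", "")


def strip_until_stable(path: str) -> str:
    # Repeat the pass until it no longer changes the string.
    previous = None
    while previous != path:
        previous, path = path, path.replace("../", "")
    return path


print(strip_once("....//....//....//etc/passwd"))          # ../../../etc/passwd
print(strip_once("..././etc/passwd"))                      # ../etc/passwd
print(strip_until_stable("....//....//....//etc/passwd"))  # etc/passwd
```

Even the repeated version only handles this one pattern, which is why the later sections rely on canonicalizing the path instead of filtering strings.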

Example:

The /etc/passwd file can still be accessed because the traversal sequences are stripped non-recursively

File Path Traversal with Stripped Traversal Sequences and Superfluous URL-Decode

A file path traversal vulnerability can become even more problematic when an application tries to strip traversal sequences but fails to handle cases where superfluous URL-decoding allows an attacker to bypass these defenses. This kind of sanitization can sometimes be bypassed by URL encoding, or even double URL encoding, the ../ characters. This results in %2e%2e%2f and %252e%252e%252f respectively. Let's walk through an example of this scenario.

Vulnerable Code Example

Consider a web application that allows users to download files by specifying a file name in a URL parameter. The application attempts to prevent path traversal by stripping out traversal sequences like ../, but it doesn't account for URL-encoded input that can bypass these defenses.

This code fails to strip the encoded characters from the filename
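Since the original listing is not available, here is a minimal sketch of the flaw under one assumption: the filter runs on the still-encoded value and the value is URL-decoded afterwards. The name strip_then_decode is hypothetical.

```python
from urllib.parse import unquote


def strip_then_decode(file_name: str) -> str:
    # The filter only sees the still-encoded value, so "%2e%2e%2f"
    # passes through untouched.
    stripped = file_name.replace("../", "")
    # Decoding afterwards turns the encoded sequence back into "../".
    return unquote(stripped)


print(strip_then_decode("../../etc/passwd"))                    # etc/passwd
print(strip_then_decode("%2e%2e%2f%2e%2e%2fetc/passwd"))        # ../../etc/passwd
```

The plain payload is neutralized, while the encoded one comes out of the function as a working traversal sequence.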

An attacker can exploit this by URL-encoding the traversal sequence so that it passes through the stripping logic unchanged, for example as %2e%2e%2f%2e%2e%2f.

Here, %2e is the URL-encoded representation of . and %2f is the URL-encoded representation of /. When decoded, %2e%2e%2f%2e%2e%2f becomes ../../.
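The single and double encodings mentioned earlier can be checked with Python's standard urllib.parse module. This short sketch only shows how many decoding passes each form needs before "../" reappears.

```python
from urllib.parse import quote, unquote

single = "%2e%2e%2f"        # "../" URL-encoded once
double = "%252e%252e%252f"  # "../" URL-encoded twice ("%" becomes "%25")

print(unquote(single))           # ../
print(unquote(double))           # %2e%2e%2f
print(unquote(unquote(double)))  # ../
print(quote(single, safe=""))    # %252e%252e%252f
```

A filter that runs after only one decoding pass sees the double-encoded payload as harmless text, and any later decode restores the traversal sequence.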

Real World Example:

Bypassing the traversal sequence stripping to access the /etc/passwd file

File path traversal, validation of start of path

When dealing with file path traversal vulnerabilities, validating that the requested file path starts with a specific base directory is effective only if the check runs on the fully resolved path. Applied correctly, it ensures that users cannot navigate outside the intended directory, thereby preventing unauthorized access to sensitive files. Applied to the raw input, it can be bypassed. Here's how the bypass works and how you can implement such a validation correctly.

An application may require the user-supplied filename to start with the expected base folder, such as /var/www/images. In this case, it might be possible to include the required base folder followed by suitable traversal sequences. For example: filename=/var/www/images/../../../etc/passwd.

Vulnerable Code Example

Initially, let's consider a flawed implementation that attempts to construct a file path but fails to properly validate it:

This code fails to validate the base path.
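A sketch of such a flawed check, assuming it inspects only the raw string prefix, might look like this. The name is_allowed_naive is hypothetical, and the base folder is the one from the example above.

```python
import posixpath

BASE_PATH = "/var/www/images"  # expected base folder from the example


def is_allowed_naive(file_name: str) -> bool:
    # Only the raw string prefix is inspected; "../" segments that follow
    # the prefix are never resolved.
    return file_name.startswith(BASE_PATH)


payload = "/var/www/images/../../../etc/passwd"
print(is_allowed_naive(payload))    # True
print(posixpath.normpath(payload))  # /etc/passwd
```

The payload satisfies the prefix requirement and still resolves to a file outside the base folder.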

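An attacker could input something like filename=/var/www/images/../../../etc/passwd, which starts with the expected base folder but resolves to /etc/passwd.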

Secure Implementation

To secure the application, we need to ensure that the constructed file path starts with the base path after resolving any relative paths. Here’s how we can do this:

This code properly validates whether the resolved file path lies within the base directory.
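The original listing is not shown here, so the following is a hedged sketch of that validation. The function name is_within_base is hypothetical; os.path.realpath() and os.sep are the calls this section refers to.

```python
import os


def is_within_base(base_path: str, file_name: str) -> bool:
    # Canonicalize both paths so "../" segments and symlinks are resolved.
    real_base = os.path.realpath(base_path)
    real_full = os.path.realpath(os.path.join(real_base, file_name))
    # The trailing separator makes the prefix match stop at a directory
    # boundary instead of matching sibling names that share the prefix.
    return real_full.startswith(real_base + os.sep)
```

Because the comparison happens after resolution, a filename that starts with the base folder and then climbs out of it with ../ sequences is rejected.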

The code checks whether the canonical full path starts with the canonical base path. Appending os.sep (the path separator, which gives the base path a trailing slash) makes the prefix match stop at a directory boundary, so a sibling directory such as /var/www/images_backup cannot pass as /var/www/images. This prevents access to files outside the base directory, such as /etc/passwd.

Real World Example

Including the base path in the filename successfully accesses the /etc/passwd file

File path traversal, validation of file extension with null byte bypass

Null byte injection is a technique used by attackers to bypass file extension validation by inserting a null byte (%00 in URL encoding) into the input. This can cause the application to misinterpret the input, allowing the attacker to bypass security checks. Here’s how to handle this and ensure robust file path validation and extension checks.

Vulnerable Code Example

Consider an application that validates the file extension but does not handle null byte injection:

This code fails to handle the null byte character, which lets an attacker bypass other security checks

An attacker could exploit this by inserting a null byte before the required extension, for example ../../etc/passwd%00.txt, to bypass the extension check.

After URL-decoding, ../../etc/passwd%00.txt becomes ../../etc/passwd\x00.txt. Many programming languages, including C and C-based languages, interpret the null byte as a string terminator, potentially leading to the file name being treated as ../../etc/passwd.
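The mismatch between the extension check and the file that is opened can be sketched as follows. One caveat: Python's own open() raises ValueError on an embedded null byte, so the truncation performed by a C-style API is only simulated here by the hypothetical helper as_seen_by_c_api. ALLOWED_EXTENSION is likewise an assumption.

```python
from urllib.parse import unquote

ALLOWED_EXTENSION = ".txt"  # hypothetical allow-listed extension


def passes_extension_check(file_name: str) -> bool:
    # The check looks at the end of the whole string, null byte included.
    return file_name.endswith(ALLOWED_EXTENSION)


def as_seen_by_c_api(file_name: str) -> str:
    # Simulates a C-style API, which stops reading at the first null byte.
    return file_name.split("\x00", 1)[0]


decoded = unquote("../../etc/passwd%00.txt")
print(passes_extension_check(decoded))  # True
print(as_seen_by_c_api(decoded))        # ../../etc/passwd
```

The validation and the filesystem call disagree about where the filename ends, and that disagreement is what the attacker exploits.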

Real World Example:

Null byte injection in the input bypasses the security checks and gives access to the /etc/passwd file

Secure Implementation

To secure the application against null byte injection and ensure proper file extension validation, follow these steps:

  1. Remove Null Bytes: Strip out null bytes from the user input.
  2. Canonicalize and Validate: Ensure the file path is within the allowed directory.
  3. Validate File Extension: Check the file extension after canonicalizing the path.

Here’s how to implement it:

Code after removing null byte
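A minimal sketch of these steps is shown below. It rests on several assumptions: the function receives the raw, still URL-encoded parameter value and decodes it exactly once, and the names safe_resolve and ALLOWED_EXTENSIONS are hypothetical. If the web framework already decodes the parameter, the decoding step should be dropped to avoid a superfluous second decode.

```python
import os
from urllib.parse import unquote

ALLOWED_EXTENSIONS = {".txt"}  # hypothetical allow-list


def safe_resolve(base_path: str, raw_file_name: str) -> str:
    # URL-decode once (assumes the value has not been decoded yet).
    file_name = unquote(raw_file_name)
    # Remove null bytes so they cannot hide the real end of the name.
    file_name = file_name.replace("\x00", "")

    # Construct the full path and canonicalize both paths.
    real_base = os.path.realpath(base_path)
    real_full = os.path.realpath(os.path.join(real_base, file_name))

    # Validate the extension of the resolved path, not of the raw input.
    if os.path.splitext(real_full)[1].lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("file extension not allowed")
    # Ensure the resolved path is still inside the base directory.
    if not real_full.startswith(real_base + os.sep):
        raise ValueError("path escapes the base directory")
    return real_full
```

With this sketch, the payload ../../etc/passwd%00.txt becomes ../../etc/passwd.txt after the null byte is removed, and the directory check then rejects it.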

Explanation

  1. URL-Decoding: The input is URL-decoded to handle any encoded traversal sequences.
  2. Remove Null Bytes: The code replaces any null bytes in the file name with an empty string, mitigating null byte injection.
  3. Construct Full Path: The code constructs the full path by combining the base path and the user-provided file name.
  4. Canonicalization: The code uses os.path.realpath() to resolve both the base path and the full path to their canonical forms, ensuring proper path resolution.
  5. Validate File Extension: The file extension is validated after the full path has been canonicalized, ensuring the check is performed on the actual resolved path.
  6. Directory Whitelisting: The code checks if the canonical full path starts with the canonical base path, ensuring the requested file is within the allowed directory.

By following these steps, the application can effectively prevent file path traversal attacks, including those involving null byte injection.
