How to identify Path Traversal vulnerabilities

Tutorials
by Jacob Riggs
14-01-2020

Your vote is:

4.78 of 68 votes

Websites are made up of two types of file, those intended to be accessible by the browser (like JavaScript and CSS files), and those that are not. Web servers often route URLs to particular template files or assets on the file system. Typically, the layout of the files on the server mirror the URL structure of the website. A poorly configured web server can be too permissive about what files it returns and this can allow an attacker to access files which were never intended for public access. This may allow commands to be executed on the web server and data can be copied or modified.

How does this work?

For example, a website hosts a document at the https://jacobriggs.io/documents?files=whitepaper.pdf URL.

An attacker visits this URL triggering a request to download the document via their browser. The attacker notices the name of the requested document is passed in the files parameter of the query string. Unfortunately, in this instance, the raw file name whitepaper.pdf is used in the download URL, so an attacker can abuse the files parameter to climb out of the directory that holds documents within the files location.

An attacker can now craft a URL using relative path syntax ../ to explore the local file system.

If the normal file path URL is https://jacobriggs.io/documents?files=whitepaper.pdf then the instruction that is passed to the server when requesting this is to open the /etc/apache2/jacobriggs.io/documents/whitepaper.pdf location. If the attacker crafted URL is https://jacobriggs.io/documents?/files=../../../../ssl/private/private.key then the instruction that is passed to the server when requesting this is to access the /etc/ssl/private/private.key location.

Since the server does not validate file paths, an attacker is now able to access sensitive files.

Tutorial

For this example I will be using PortSwigger’s web security lab. To solve the lab, we must retrieve the contents of the /etc/passwd file. Whilst this is a rudimentary task and can be done without the need of automation, I will be using their lab environment to demonstrate Burp Suite tooling and common path traversal fuzzing techniques used in practice.

First we open the lab environment website and navigate to a product listing.

Preview of website

We view the source code of the web page and notice that the requested images are being passed in the filename parameter.

Filename source code

We open up Burp, turn on Proxy > Intercept, and we request the image directly via its normal file path URL in our browser.

We then check our captured HTTP GET request, and send the vulnerable parameter to Intruder.

Filename send to intruder

We then navigate to Intruder > Positions and we review the payload positions within the request template. We can see that the cookie session parameter within our base request is automatically assigned a payload position. This is not needed, as the current session will be sufficient for our attack, so we highlight and clear the entry.

Filename position clear

We then navigate to Intruder > Payloads to configure our attack. Under Payload Options we click Load and add the dotdotpwn.txt file which contains a list of common attack strings to help automate our fuzzing.

Payloads load dotdotpwn

Now we need to instruct Burp to identify which of our captured responses contain the strings we're interested in. We know that the passwd location is structured in the format of a standard user account table. We also know that this table always contains the user root within it. So we navigate to Intruder > Options and instruct Burp to flag any responses matching the root: string.

Options Grep match root

As everything is now configured, we can navigate to Intruder > Target to double check we are targeting the correct host and initiate our attack.

Intruder target attack

For each query sent within our payload parameters, Burp captures the responses. This is where we pay particular attention to any anomolies within the length of the responses, and any flags matching the root: string. We can see that request 9 is identified as a match, and that when viewing the server response the full structure of the passwd table format is returned.

Payload response review

We have now achieved the path traversal attack by proving we can access the passwd file and content within it. Whilst this is a basic example, it demonstrates the simple risks associated with websites allowing the placement of user input in file paths.

Prevention

To prevent path traversal vulnerabilities, it is important that web applications avoid passing arbitrary strings of user supplied input to filesystem APIs. Where passing arbitrary strings of user supplied input is necessary, two layers of defence can be adopted:

User supplied input should be validated by the web application before processing it. This validation should compare against a whitelist of permitted values.
Once user supplied input has been validated, the web application should append it to the base directory, ensure the path is canonical to the filesystem, and that the canonicalised path starts with the expected base directory. .

ABOUT THE AUTHOR

Follow Jacob Riggs

Jacob Riggs

Jacob Riggs is a senior cyber security professional based in the UK with over a decade of experience working to improve the cyber security of various private, public, and third sector organisations. His contributions focus on expanding encryption tools, promoting crypto-anarchist philosophy, and pioneering projects centred on leveraging cryptography to protect the privacy and political freedoms of others.

E3FE 4B44 56F5 69BE 76C1 E169 E3C7 0A52 9AEF DB6F