
 URL Encoding and Manipulation


In technical literature, the terms URL encoding, UTF encoding, escape-encoding, percent-encoding, and Web encoding are used interchangeably. To understand malicious attacks such as XSS or SQL injection, you need some insight into URL encoding techniques.

Web applications transfer data between the client and the server using the HTTP or HTTPS protocol. User input normally reaches the server in one of three places: in an HTTP header (for example, the cookie field), in the request body (the POST data), or in the query portion of the requested URL. Data that is transferred in a URL has to be specially encoded to obey the syntax rules of URLs.

The standard (RFC 2396, since superseded by RFC 3986) distinguishes between two character classes:

  • The unreserved class comprises the characters:
    • a-z, A-Z, 0-9 - _ . ! ~ * ' ( )
  • The reserved class contains the following characters:
    • ; / ? : @ & = + $ ,

Characters from the reserved class have a special meaning in URL syntax, so using them as plain data could cause a URL to be misinterpreted. Escape-encoding allows these characters to be carried as data without breaking the syntax. A character is encoded as a triplet: a percent character (%) followed by the two hexadecimal digits that represent the octet code of the original character.
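
The triplet scheme can be sketched in a few lines of Python. This is only an illustration: the helper name percent_encode_char and the sample string "name=a&b" are made up for this example, and urllib.parse.quote is used merely as one readily available encoder.

```python
from urllib.parse import quote, unquote


def percent_encode_char(ch: str) -> str:
    """Encode one character as '%XX' triplets, one triplet per UTF-8 octet."""
    return "".join("%{:02X}".format(octet) for octet in ch.encode("utf-8"))


# Triplets for the characters of the reserved class listed above.
triplets = {ch: percent_encode_char(ch) for ch in ";/?:@&=+$,"}

# The standard library applies the same scheme to a whole string.
# safe="" asks quote() to encode "/" as well, which it leaves alone by default.
encoded = quote("name=a&b", safe="")
decoded = unquote(encoded)

print(triplets["&"], triplets["="])  # %26 %3D
print(encoded)                       # name%3Da%26b
print(decoded)                       # name=a&b
```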

The percent character acts as the escape indicator within a URL and therefore has to be escaped itself as "%25" in order to be used as data in a URL. For URI encoding, we recommend that you do not escape or unescape the same string more than once. Unescaping an already unescaped string can turn a literal percent character and the two characters that follow it into a different character, and escaping an already escaped string encodes its percent characters a second time.
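
The following sketch shows why repeating either operation corrupts data. The sample value "100%25" (a literal string that happens to contain a percent character followed by two digits) is an assumption chosen for illustration.

```python
from urllib.parse import quote, unquote

data = "100%25"  # literal data: the six characters 1 0 0 % 2 5

escaped_once = quote(data, safe="")           # correct: "%" becomes "%25"
escaped_twice = quote(escaped_once, safe="")  # wrong: "%" is escaped again

print(escaped_once)   # 100%2525
print(escaped_twice)  # 100%252525

# Unescaping exactly once restores the original data.
print(unquote(escaped_once))            # 100%25

# Unescaping a second time misreads the literal "%25" as an escaped "%".
print(unquote(unquote(escaped_once)))   # 100%

# One unescape is no longer enough after a double escape.
print(unquote(escaped_twice))           # 100%2525
```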

If several layers of an application each decode the input, multiply escape-encoded data may circumvent security checks that are applied after only the initial decoding pass. An example of a multiple escape-encoding of this type is shown below using the character sequence "\" or "..\".

The backslash "\" can be encoded as "%5c" or, with an additional layer of encoding, as any of the following permutations:

  • %255c
  • %%35c
  • %%35%63
  • %25%35%63
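
To make the layering visible, the sketch below decodes each permutation repeatedly and records every intermediate result. The helper decode_passes is hypothetical, and the behavior shown is that of Python's urllib.parse.unquote, which leaves a percent character untouched when it is not followed by two hexadecimal digits; other decoders may treat such malformed input differently.

```python
from urllib.parse import unquote


def decode_passes(value: str, max_passes: int = 5) -> list:
    """Decode repeatedly and return every intermediate string, input first."""
    steps = [value]
    for _ in range(max_passes):
        decoded = unquote(steps[-1])
        if decoded == steps[-1]:
            break  # nothing left to decode
        steps.append(decoded)
    return steps


variants = ["%5c", "%255c", "%%35c", "%%35%63", "%25%35%63"]
for variant in variants:
    print(" -> ".join(decode_passes(variant)))
# %5c -> \
# %255c -> %5c -> \
# %%35c -> %5c -> \
# %%35%63 -> %5c -> \
# %25%35%63 -> %5c -> \
```

A filter that inspects the value after a single decoding pass sees only "%5c" for the last four variants. The backslash appears only if a later layer decodes the value again.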

Examples of Possible URL Attacks

These different escape-encoding sequences give an example of possible entry points for URL attacks, such as:

URL Attack as a Multiple Decoding Attack

Example URL Attack


Attack Result

The directory list of C:\ is revealed.

URL Attack as an XSS Attack

Example URL Attack


Attack Result

<script src=""></script>

URL Attack as an SQL Injection Attack

Example URL Attack

http://target/login.asp?userid=bob%27%3b%20update%20logintable%20set%20passwd%3d%270wn3d%27%3b--%00

Attack Result

Executed database query:

SELECT preferences FROM logintable WHERE userid='bob'; update logintable set passwd='0wn3d';--
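
The sketch below decodes the example URL (written without whitespace) to show what the server receives in the userid parameter. The string concatenation at the end is an assumption about how a vulnerable application might build its query. The source does not show the application code.

```python
from urllib.parse import urlsplit, parse_qs

url = ("http://target/login.asp?userid=bob%27%3b%20update%20logintable"
       "%20set%20passwd%3d%270wn3d%27%3b--%00")

# Extract the query portion and decode the parameter exactly once.
query = urlsplit(url).query
userid = parse_qs(query)["userid"][0]

# repr() makes the quote characters and the trailing NUL byte visible.
print(repr(userid))

# Assumed vulnerable pattern: the decoded value is pasted into the SQL text.
naive_sql = "SELECT preferences FROM logintable WHERE userid='" + userid + "'"
print(repr(naive_sql))
```

The decoded value closes the string literal after bob and appends a second statement. Everything after the "--" is treated as a comment by the database.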

Useful Hint:

There is a reference table of ASCII characters in URL encoding form (hexadecimal format) at .

What Do I Get from the SAP NetWeaver Platform?

For Web Dynpro for Java:

The Web Dynpro architecture integrates output encoding automatically. You therefore do not need to implement additional output encoding functionality manually.

What Do I Need to Do?

The variety of character encoding schemes, and the ways in which they can be combined, allows a practically unlimited number of malicious encodings. Developers are therefore responsible for securing their applications against encoding attacks of this type, in accordance with the following rules:

  • Read 'Request for Comments' (RFC) 3986, Uniform Resource Identifier (URI): Generic Syntax, carefully to ensure the correct syntax processing of URLs (search at ).
  • User input has to be regarded as potentially malicious code.
  • Avoid submitting data using the 'GET' method, because the data is appended to the URL and can be easily manipulated. It is better to use the 'POST' method instead.
  • Do not rely on client-side content checks.
  • Validate and sanitize all data on the server side.

    Always restrict the type of acceptable data as much as possible using strict validation rules.

  • Always perform independent validation and sanity checks of the supplied data.
  • Ensure that the application does not repeat any character-decoding process. Decoding should be done by the operating system. If the data remains encoded or contains unacceptable characters, treat the data as malicious and reject the input.
  • Thoroughly test your application for system behavior on encoded and incorrect data formats.
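
The server-side rules above can be sketched as a single validation step. This is a minimal illustration under stated assumptions: the function name validate_decoded, the allowlist pattern, and the 32-character limit are hypothetical, and the value is assumed to have already been decoded exactly once by the layer that received the request.

```python
import re

# Hypothetical strict allowlist: short identifiers made of safe characters.
ALLOWED = re.compile(r"[A-Za-z0-9_.-]{1,32}")
# A percent triplet that survived decoding indicates multiple encoding.
STILL_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def validate_decoded(value: str) -> str:
    """Validate a value that was already decoded once. Never decodes again."""
    if STILL_ENCODED.search(value):
        raise ValueError("input is still percent-encoded")
    if not ALLOWED.fullmatch(value):
        raise ValueError("input contains unacceptable characters")
    return value


print(validate_decoded("bob"))  # bob

# "..%5c.." is what remains of "..%255c.." after a single decoding pass.
for suspicious in ["..%5c..", "bob'; update logintable", "a\x00b"]:
    try:
        validate_decoded(suspicious)
    except ValueError as err:
        print("rejected:", repr(suspicious), "-", err)
```

The explicit check for remaining triplets is redundant with this particular allowlist, which does not permit "%" at all. It is kept separate so that the rejection reason is clear and so that the check survives if the allowlist is later relaxed.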