Tuesday, August 31, 2010

org.owasp.esapi.encodeForURL (shocked again)

Another quick look at OWASP ESAPI-2.0-rc6.

java.lang.String encodeForURL(java.lang.String input) throws EncodingException
Encode for use in a URL. This method performs URL encoding on the entire string.

In the docs, "URL encoding" is defined by a link to Wikipedia!
http://en.wikipedia.org/wiki/Percent-encoding
Don't we have RFCs for this?

Curious what it really does? Look at the code:

return java.net.URLEncoder.encode(input,
ESAPI.securityConfiguration().getCharacterEncoding());


What? Read the JavaDoc of URLEncoder!
Translates a string into application/x-www-form-urlencoded format.
This is not meant for building URLs but for encoding form data!
There is another JavaDoc:
The URLEncoder and URLDecoder classes can also be used,
but only for HTML form encoding,
which is not the same as the encoding scheme defined in RFC2396.
And another one:
The recommended way to manage the encoding and decoding of URLs is to use URI

Reading the URI docs, you will learn about all the deviations Java has from RFC2396.
javase/6/docs/api/java/net/URI.html
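
The difference those JavaDoc quotes describe is easy to see side by side. The following is only a small sketch: the class name, the helper names and the example.com host are made up for illustration, and UTF-8 is assumed as the character encoding (ESAPI reads it from its configuration).

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public final class FormVsUriEncoding {

    // What encodeForURL effectively does: application/x-www-form-urlencoded.
    // UTF-8 is an assumption here, not necessarily the ESAPI default.
    static String formEncode(String input) {
        try {
            return URLEncoder.encode(input, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // The multi-argument URI constructors quote illegal characters in each
    // component following (Java's reading of) RFC2396.
    static String uriWithPath(String path) {
        try {
            return new URI("http", "example.com", path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("my file.txt"));        // my+file.txt
        System.out.println(uriWithPath("/docs/my file.txt")); // http://example.com/docs/my%20file.txt
        System.out.println(formEncode("~user"));              // %7Euser
        System.out.println(uriWithPath("/~user"));            // http://example.com/~user
    }
}
```

A space becomes + in one case and %20 in the other, and ~ is escaped by URLEncoder although it is a perfectly legal URI character.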

The OWASP JavaScript version of "the same" is even "better"
(I bet a beer that it does not produce the same results as the Java code):


encodeForURL: function(sInput) {
return !sInput ? null : escape(sInput);
}


Reading the MDC docs:
escape and unescape Functions
The escape and unescape functions do not work properly for non-ASCII characters
and have been deprecated. In JavaScript 1.5 and later, use encodeURI, decodeURI,
encodeURIComponent, and decodeURIComponent.
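
To check the beer bet without a browser, here is a rough Java emulation of the legacy JavaScript escape function next to URLEncoder. This is a sketch, not ESAPI code: jsEscape is a hypothetical helper written from the ECMAScript description of escape (the characters A-Z a-z 0-9 @ * _ + - . / are kept, code units below 256 become %XX, everything else becomes %uXXXX), and UTF-8 is assumed on the Java side.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public final class EscapeVsUrlEncoder {

    // Characters the legacy JavaScript escape() leaves untouched.
    private static final String ESCAPE_SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./";

    // Hypothetical emulation of JavaScript escape(), one UTF-16 code unit at a time.
    static String jsEscape(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (ESCAPE_SAFE.indexOf(c) >= 0) {
                out.append(c);
            } else if (c < 256) {
                out.append(String.format("%%%02X", (int) c));
            } else {
                out.append(String.format("%%u%04X", (int) c));
            }
        }
        return out.toString();
    }

    // The Java side of encodeForURL, with UTF-8 assumed as the encoding.
    static String javaEncode(String input) {
        try {
            return URLEncoder.encode(input, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a b", "a+b", "\u00E9", "\u0100"}) {
            System.out.println(jsEscape(s) + "  vs  " + javaEncode(s));
        }
    }
}
```

All four inputs come out differently: a%20b vs a+b, a+b vs a%2Bb, %E9 vs %C3%A9, and %u0100 vs %C4%80.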


Bad naming or ignorance?

There is just a small chance that the authors really meant to code
"HTML form encoding" and not to solve URI building and encoding,
and that the method just has a bad name. I would suggest Encoder.encodeForHtmlForm
instead of the misleading encodeForURL with its even more confusing wiki link!



If OWASP really meant to solve
encoding for URIs or http scheme URLs, there should be totally
different code behind it!


If you really plan to encode URI components, you need an API that
encodes path, path-segment, query and fragment with the separate rules defined by
the RFC (and I vote for the "new" RFC3986 instead of Java's buggy implementation of the old RFC2396).
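
To make "separate rules" concrete, here is a minimal sketch of what such an API could look like. None of this exists in ESAPI: the class and method names are hypothetical, the character sets are taken from the RFC3986 grammar (unreserved, sub-delims, pchar), and the query variant is deliberately stricter than the RFC because a single parameter value must not contain a raw & or =. It uses StandardCharsets, so it assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public final class Rfc3986Encoder {

    // RFC3986: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    private static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
    // RFC3986: sub-delims
    private static final String SUB_DELIMS = "!$&'()*+,;=";
    // RFC3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
    private static final String PCHAR = UNRESERVED + SUB_DELIMS + ":@";

    // Percent-encode every UTF-8 byte that is not in the allowed set.
    private static String encode(String input, String allowed) {
        StringBuilder out = new StringBuilder();
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80 && allowed.indexOf((char) v) >= 0) {
                out.append((char) v);
            } else {
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(v >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(v & 0xF, 16)));
            }
        }
        return out.toString();
    }

    // One path segment: "/" must be escaped, other pchars may stay.
    public static String encodePathSegment(String segment) {
        return encode(segment, PCHAR);
    }

    // One query parameter name or value: only unreserved characters stay,
    // so "&", "=" and "+" can never be misread as delimiters.
    public static String encodeQueryParam(String value) {
        return encode(value, UNRESERVED);
    }

    // Fragment: pchar plus "/" and "?".
    public static String encodeFragment(String fragment) {
        return encode(fragment, PCHAR + "/?");
    }
}
```

With this sketch, encodePathSegment("a b/c") gives a%20b%2Fc, encodeQueryParam("x&y=1 2") gives x%26y%3D1%202, and encodeFragment("sec 1/a?b") gives sec%201/a?b. Same input characters, three different rule sets.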

If you write or find an RFC3986-compliant Java URI implementation,
let me know.
Until then I will not
replace my code with an UNRELIABLE OWASP REFERENCE IMPLEMENTATION.


Strong suggestion again: search for "Jena IRI".

4 comments:

  1. The encodeForURL method is intended to encode parameters that are part of a URL. This method is not for building a complete URI; doing so is far beyond the scope of the method (and there are numerous *correct* ways to do so using only the core Java API).

    From a security perspective, imagine the following scenario:

    Code in page:
    <% String myVariable = request.getParameter("myVariable"); %>
    <a href="http://mysite.com/someaction?key=<%=myVariable%>">Link</a>

    A blind man could spot the issue there from 100 miles away. Here you would use:

    <% String myVariable = request.getParameter("myVariable"); %>
    <a href="http://mysite.com/someaction?key=<%= ESAPI.encoder().encodeForURL(myVariable) %>">Link</a>

    (This is a very dumbed down version to illustrate a single point)

    This could be considered encoding form data for a GET request. However, as is evident from the code above, this is not a form-generated request at all, but a dynamically built GET request whose parameters are subject to injection attacks if not properly escaped.

  2. So consider this "bad naming".
    However, your comment does not change the fact that the Java and JavaScript implementations are out of sync (they produce different results).

    1) a space (0x20) is encoded as %20 by escape and encodeURI, but as + by URLEncoder

    2) as mentioned, escape is buggy and does not work above the 0xFF range, producing %u0100 instead of the %C4%80 produced by Java and encodeURIComponent
    ...

  3. 3) and of course all 3 methods have different
    lists of encoded chars.

    So what?
    Are we trying to code something reasonable here,
    or just align the JS code with silly Java code?

  4. In the core Java API THERE IS NO CORRECT OR EASY WAY TO encode parts of a URI, and it is a shame that the ESAPI API ignores this fact, especially since URI and path related issues are at the top of the vulnerability lists.
    We are currently evaluating this Spring API and will publish the results soon:
    http://static.springsource.org/spring/docs/3.0.x/javadoc-api/org/springframework/web/util/UriUtils.html
