• frist (unregistered)


  • Ray10k (unregistered)

    I still don't quite get what caused the bug. Did it occasionally encounter the + encode string or something?

  • G (unregistered)

    Should have been titled "%2B or not %2B"

  • Bert (unregistered) in reply to Ray10k

    Yeah. Not entirely made clear here, though.

  • wiseguy (unregistered)

    The code and explanation don't match: it says UrlDecode i.e. it is turning + into (space)

  • Dieter H (unregistered)

    Not really a WTF.

  • chreng (unregistered)

    +1 for a good bug hunt and nice solution!

  • Nobody (unregistered) in reply to wiseguy

    Exactly! The symptoms would be the same but the explanation of the cause is backwards.

  • Paula Bean (unregistered)

    TRWTF is that this is a Remy post and there's no hidden cornify clickety thingy.

  • (nodebb) in reply to G
    Should have been titled "%2B or not %2B"

    Hahahaaaaaaa -- funniest snark I've seen in awhile. Thanks, G

  • (nodebb) in reply to Ray10k
    I still don't quite get what caused the bug. Did it occasionally encounter the + encode string or something?

    Ya, that's it. The function is decoding something that is not encoded. So, it was getting a '+' here and there as part of a token. This is '%2B' as part of a URL. Problem is, the literal strings '+' and '%2B' are not equal, so comparison for validity check will be false.

  • Carl Witthoft (google)

    Remy wrote "Another development team needed their own versions of the code, organized a bit differently, for infrastructure reasons. "
    The correct wording should replace "for infrastructure" with "because" .

  • Paul Neumann (unregistered)

    And TRWTF is the un-de-en-coding issue in the article first decumented by wiseguy. Decoding a '+' yields a ' '. Encoding a '+' yields "%2B". Either way, "+" <> " " <> "%2B"

  • Mike Unfried (github)

    Maybe the programmer thought UrlDecode would protect against SQL Injection Attacks? ;)

  • Sha (unregistered) in reply to Ray10k

    The string is cryptographic data that's been encoded (by the OpenID server) using MIME base 64 encoding. It's not a lot of data (about a line's worth) so most of the time, when MIME encoded, there is no "+" sign in the encoded data. The string is NOT encded with the URL-encoding (the one that turns spaces into plusses that you see if you ever request a page with a space in the filename), but the code was incorrectly calling urlDecode, treating it as if it was encoded this way.

    The thing to remember is the url-encoding is a very simple encoding -- it's not a heck of a lot more than "Space" to "+" and non-ascii to "%<hexadecimal code>", and MIME base 64 only generates standard ascii characters, so unless the MIME encoded string contained a plus (or perhaps %<hex code> and a few other unlikely combinations), the UrlDecode(..) would do nothing at all.

    70% of time, due to the resonalble shortness of the original crypo data, the MIME-encoded version wouldn't contain a "+" so would pass though the urldecode(..) unscathed and on to the rest of the system that would MIME-decode it sucessfully and do whatever cryptographic checking is required. 30% of the time it did, the "+" was converted to a space, and it was no longer base-64 encoded correctly, so it would fail that and, hence, failure. (Either that or it would now treat the "+" as a seperator and only have part of the token!)

    So... yeah. It was encoded, but not double-encoded with mime-64 AND URL-encoding.

    Fix: Stop decoding the string with an encoding that was never applied to it in the first place.

  • Ulysses (unregistered)

    Least it didn't trim the JSON or any bools-to-strings.

  • Felix (unregistered) in reply to Ray10k

    Hi Ray .. I can't tell you for sure, because, given "unrealistic deadlines" (I guess we're not the only shop with this problem), having the bug finally fixed was enough for me :) I GUESS the culprit was an occasional plus sign, but it could have been a percent-sequence as well. For completely understanding the bug, you need some knowledge about Openid-Connect authentication flows ... with the "hybrid with proof key" flow, the client has to actively get the access token from the identity provider, authenticating through a challenge-response process based on a randomly picked "authorization key". This key (as part of a JSON object) was retreived by the code presented here. Sometimes, authentication failed because the key was corrupted, obviously by the needless urldecode(). So far for the technical explanation. Kudos to Remy for simplifying it, because as the actual domain of the bug is complicated, the original reason (aka TRWTF) is simple to understand: decoding something that isn't encoded in the first place :) Still took me DAYS to locate :o well ....

  • Banyaluka (unregistered) in reply to Carl Witthoft

    Remy wrote "Another development team needed their own versions of the code, organized a bit differently, for infrastructure reasons. " The correct wording should replace "for infrastructure" with "because" . Why? Aren't they equal in this sense? Like "forgive me father for I have sinned"?

  • Lawrence (unregistered) in reply to Bananafish

    I want that T-Shirt! I already have (bb|[^b]{2})

  • dave (unregistered)

    Another WTF - arguably way worse - is the fact that this code swallows errors by returning a 'default' in this case. Why would you want to preclude your callers from handling errors sanely? If it's an OAuth token, what is a sensible default OAuth token?

  • Felix (unregistered) in reply to dave

    The description is simplified. What's returned here is an object containing an access code which is necessary for obtaining an access token. For the object, default(T) actually is null. And returning null COULD be the right thing to do, IF the error code from the http call was 404. Well, that's about to be refactored as soon as there's some time to do so (let's keep the hope) ... I still think the biggest WTF was the call to urldecode.

  • Nadir (unregistered) in reply to Lawrence

    Two bs or two (not-b)s?

  • (nodebb)

    A lot of our legacy code used JavaScript escape() instead of encodeURIComponent() for putting data into query strings, or even into post data strings. Of course escape('+') becomes '+' and so for the longest time it was common knowledge that the system could simply not support plus signs in email addresses (it would get converted to space and then stripped server-side)

  • moridin84 (unregistered)

    "Felix cringed a little at seeing a call to CreateHttpClient with each execution- there was a lot of overhead there,"

    There is no overhead with CreateHttpClient right? HttpClient objects are pretty light. As long as you aren't making a 100 of them a second or soomething.

  • Felix (unregistered) in reply to moridin84

    moridin84, there is a lot of overhead. Arguably the worst is that disposing the HttpClient instance brings down the TCP connection, defeating request pipelining and enforcing a new handshake each and every time. But there's more, HttpClient manages cookies, default headers, custom message handlers (used here e.g. for correlation ids) etc, all of these are set up again with each and every instantiation. Therefore, it's best practice to keep the HttpClient static and only instantiate a HttpRequestMessage for each request.

  • Richard (unregistered) in reply to moridin84

    In addition to what Felix said, the ASP.NET Monsters post from August has more information:


  • Rubberduck (unregistered)

    The TWTF is that he didn't fix the potential deadlock...

  • owlstead (unregistered)

    I've seen code within a connection.read method that added XML when retrieving information because "that was expected by the caller". Not documented of course. Not being sure if data is escaped/encoded or not can easily escape a coders attention.

Leave a comment on “Un-Encoding”

Log In or post as a guest

Replying to comment #:

« Return to Article