When evaluating a new development tool or framework, the first thing the smart developer does is check out the vendor documentation. Those docs will either be your best friend, or your worst enemy. A great product with bad documentation will provide nothing but frustration, but even a mediocre product with clean and clear documentation at least lets you get useful stuff done.

Stuart Longland's company has already picked the product, unfortunately, so Stuart's left combing through the documentation. This particular product exposes a WebSocket-based API, and thus includes JavaScript samples in its documentation. Obviously, one could use any language they like to talk to WebSockets, but examples are nice…

webSocket.onopen = function () {
    let session_id = "SESSION-ID"; // Received from API create session request
    let network_id = "NETWORK-ID"; // Received from API create session request
    let reference = "REFERENCE-ID"; // Reference handle created by client to link messages to relevant callbacks
    let wire = 1; // Connection ID, incremental value to identify messages of network/connection
    let type = 1; // Client type, use value 1 (FRONTEND)
    const OPEN = JSON.stringify({
        "method": "open",
        "id": network_id,
        "session": session_id,
        "ref": reference,
        "wire": wire,
        "type": type
    });
    this.send(decodeURIComponent(escape(OPEN)));
};

This all seems mostly reasonable until you get to the last line:

this.send(decodeURIComponent(escape(OPEN)));

escape is a deprecated function similar to encodeURIComponent. So this encodes our JSON string, then decodes it, then sends it over the WebSocket. Which seems… like a useless step. And it probably is: this is probably a developer's brain-fart that happened to end up in the code-base, and then later on, ended up in the documentation.

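But it might not be. Because escape and encodeURIComponent are not the same function. They don't escape characters the same way: encodeURIComponent percent-encodes the UTF-8 bytes of each character, while escape encodes the raw character code (%XX for codes up to 0xFF, %uXXXX for anything above that).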

escape('äöü'); //"%E4%F6%FC"
encodeURIComponent('äöü'); //"%C3%A4%C3%B6%C3%BC"

And that Unicode awareness goes for the inverse functions, too.

unescape(escape('äöü')); //outputs "äöü"
decodeURIComponent(encodeURIComponent('äöü')); //outputs "äöü"
unescape(encodeURIComponent('äöü')); //outputs "Ã¤Ã¶Ã¼"
decodeURIComponent(escape('äöü')); // throws a URIError exception
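To see which payloads the documentation's round trip actually tolerates, here's a minimal sketch. The helper name survivesDocRoundTrip is made up for illustration and isn't part of the vendor's API; it just wraps the same decodeURIComponent(escape(...)) call and reports whether the string comes out unchanged.

```javascript
// Hypothetical helper (not from the vendor docs): reports whether the
// decodeURIComponent(escape(...)) round trip leaves a string unchanged.
function survivesDocRoundTrip(text) {
  try {
    return decodeURIComponent(escape(text)) === text;
  } catch (err) {
    // decodeURIComponent signals malformed input with a URIError.
    if (err instanceof URIError) {
      return false;
    }
    throw err;
  }
}

// Pure ASCII: every %XX that escape emits is valid single-byte UTF-8.
console.log(survivesDocRoundTrip('{"method":"open","wire":1}')); // true

// Latin-1 range: escape emits %E4%F6%FC, which is not valid UTF-8.
console.log(survivesDocRoundTrip('\u00E4\u00F6\u00FC')); // false

// Above 0xFF: escape emits %u20AC, which decodeURIComponent rejects.
console.log(escape('\u20AC')); // "%u20AC"
console.log(survivesDocRoundTrip('\u20AC')); // false
```

So the extra step is a no-op for plain ASCII and a failure for pretty much everything else.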

Now, it's unlikely that this particular JSON message contains any characters that would cause any problems: REFERENCE-ID, SESSION-ID and the others are probably just long hex strings. So in real-world use, this probably would never cause an actual problem.

But in the situations where it does, this would create a surprising and probably annoying-to-debug glitch.
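Here's a rough sketch of what that glitch could look like, assuming a client picks a reference handle with a non-ASCII character in it. The fakeSocket object, the buildOpenMessage helper, and the "büro-42" reference are all invented for the example so it runs without a server; nothing here is confirmed vendor behavior beyond the message fields shown in the sample above.

```javascript
// Hypothetical stand-in for a WebSocket, so the sketch runs without a server.
const fakeSocket = {
  sent: [],
  send(data) {
    this.sent.push(data);
  },
};

// Hypothetical helper that builds the "open" message from the docs' sample.
function buildOpenMessage(sessionId, networkId, reference) {
  return JSON.stringify({
    method: "open",
    id: networkId,
    session: sessionId,
    ref: reference,
    wire: 1,
    type: 1,
  });
}

// A made-up reference handle containing "ü" (U+00FC).
const message = buildOpenMessage("SESSION-ID", "NETWORK-ID", "b\u00FCro-42");

// The documentation's approach: throws before send() is ever called.
let docStyleError = null;
try {
  fakeSocket.send(decodeURIComponent(escape(message)));
} catch (err) {
  docStyleError = err;
}
console.log(docStyleError instanceof URIError); // true
console.log(fakeSocket.sent.length); // 0

// Sending the JSON string as-is works fine.
fakeSocket.send(message);
console.log(JSON.parse(fakeSocket.sent[0]).ref === "b\u00FCro-42"); // true
```

A real WebSocket's send() transmits strings as UTF-8 text on its own, so simply calling this.send(OPEN) is likely all the sample needed, unless the vendor's server expects something unusual.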

Character encodings are hard, but good documentation is even harder.
