When evaluating a new development tool or framework, the first thing the smart developer does is check out the vendor documentation. Those docs will either be your best friend, or your worst enemy. A great product with bad documentation will provide nothing but frustration, but even a mediocre product with clean and clear documentation at least lets you get useful stuff done.
Stuart Longland's company has already picked the product, unfortunately, so Stuart's left combing through the documentation. This particular product exposes a web-socket based API, and thus includes JavaScript samples in its documentation. Obviously, one could use any language they like to talk to web-sockets, but examples are nice…
webSocket.onopen = function () {
    let session_id = "SESSION-ID"; // Received from API create session request
    let network_id = "NETWORK-ID"; // Received from API create session request
    let reference = "REFERENCE-ID"; // Reference handle created by client to link messages to relevant callbacks
    let wire = 1; // Connection ID, incremental value to identify messages of network/connection
    let type = 1; // Client type, use value 1 (FRONTEND)
    const OPEN = JSON.stringify({
        "method": "open",
        "id": network_id,
        "session": session_id,
        "ref": reference,
        "wire": wire,
        "type": type
    });
    this.send(decodeURIComponent(escape(OPEN)));
};
This all seems mostly reasonable until you get to the last line:
this.send(decodeURIComponent(escape(OPEN)));
escape is a deprecated function similar to encodeURIComponent. So this encodes our JSON string, then decodes it, then sends it over the web-socket. Which seems… like a useless step. And it probably is: this is probably a developer's brain-fart that happened to end up in the code-base, and then later on, ended up in the documentation.
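For an ASCII-only payload like the one in the sample, the round trip does appear to be a no-op. A quick check, reusing the placeholder values from the vendor's snippet (real IDs are assumed here to be plain ASCII):

```javascript
// Placeholder values copied from the vendor sample; real IDs are assumed to be ASCII.
const message = JSON.stringify({
    method: "open",
    id: "NETWORK-ID",
    session: "SESSION-ID",
    ref: "REFERENCE-ID",
    wire: 1,
    type: 1
});

// escape() percent-encodes the quotes, braces, colons and commas...
const escaped = escape(message);
// ...and decodeURIComponent() turns every one of them back.
const roundTripped = decodeURIComponent(escaped);

console.log(roundTripped === message); // true
```

Every character in that string is below U+0080, so both functions agree on what each %XX sequence means.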
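But it might not be. Because escape and encodeURIComponent are not the same function. They don't escape characters the same way, because one of them is Unicode-aware and the other isn't: encodeURIComponent emits the UTF-8 bytes of each character, while escape emits the raw code unit, as %XX for values up to 0xFF and %uXXXX above that.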
escape('äöü'); //"%E4%F6%FC"
encodeURIComponent('äöü'); //"%C3%A4%C3%B6%C3%BC"
And that unicode awareness goes for the inverse method, too.
unescape(escape('äöü')); //outputs "äöü"
decodeURIComponent(encodeURIComponent('äöü')); //outputs "äöü"
unescape(encodeURIComponent('äöü')); //outputs "Ã¤Ã¶Ã¼"
decodeURIComponent(escape('äöü')); // throws a URIError exception
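The mismatched pair doesn't always fail loudly. A minimal sketch, using a hypothetical roundTrip helper (not part of the vendor's API) that wraps the documented call and reports what happened:

```javascript
// Hypothetical helper: runs the documented round trip and reports the outcome.
function roundTrip(text) {
    try {
        return { ok: true, value: decodeURIComponent(escape(text)) };
    } catch (err) {
        return { ok: false, error: err.name };
    }
}

console.log(roundTrip("REFERENCE-ID")); // { ok: true, value: 'REFERENCE-ID' }
console.log(roundTrip("äöü"));          // { ok: false, error: 'URIError' }
// "Ã¤" is what the UTF-8 bytes of "ä" look like when read as Latin-1,
// so here the round trip silently changes the text instead of throwing.
console.log(roundTrip("Ã¤"));           // { ok: true, value: 'ä' }
```

The last case is the nastier one: escape produces %C3%A4, which happens to be valid UTF-8, so decodeURIComponent returns a different string without complaint.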
Now, it's unlikely that this particular JSON message contains any characters that would cause any problems: REFERENCE-ID, SESSION-ID and the others are probably just long hex-strings. So in real-world use, this probably would never cause an actual problem.
But in the situations where it does, this would create a surprising and probably annoying-to-debug glitch.
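If the round trip really is just a leftover, a sketch of the handler without it might look like the following. This assumes the vendor's server expects the JSON text as-is, which the documentation excerpt doesn't confirm; a WebSocket sends string data as UTF-8 text, so non-ASCII characters need no extra encoding step on the client. The webSocket object below is a hypothetical stand-in that only records what would have been sent, so the sketch runs without a server.

```javascript
// Hypothetical stand-in for a real WebSocket so the sketch runs anywhere;
// it only records what would have been sent.
const sent = [];
const webSocket = { send(data) { sent.push(data); } };

webSocket.onopen = function () {
    const OPEN = JSON.stringify({
        method: "open",
        id: "NETWORK-ID",
        session: "SESSION-ID",
        ref: "REFERENCE-ID-äöü", // non-ASCII on purpose
        wire: 1,
        type: 1
    });
    // No escape()/decodeURIComponent() round trip: send the JSON text as-is.
    this.send(OPEN);
};

webSocket.onopen(); // a real socket would fire this event itself
console.log(JSON.parse(sent[0]).ref); // REFERENCE-ID-äöü
```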
Character encodings are hard, but good documentation is even harder.