It's interesting how the UTF-8 encoding of U+00A9 COPYRIGHT SIGN (C2 A9), when interpreted as Windows-1252, decodes as the same character, but with a U+00C2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX in front of it.
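A minimal Python sketch of the effect described in this comment, assuming Python's `cp1252` codec is an acceptable stand-in for whatever decoder produced the mojibake (the variable names are only illustrative):

```python
# Encode U+00A9 COPYRIGHT SIGN as UTF-8, then misread the bytes as Windows-1252.
utf8_bytes = "\u00a9".encode("utf-8")
mojibake = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex(" "))  # c2 a9
print(mojibake)             # Â©
```

The byte C2 decodes to U+00C2 and the byte A9 decodes to U+00A9, which is why the original character survives with an extra letter in front of it.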
According to Google Translate, the series of Chinese characters Cole T. found in the Amazon search box translates as "Enameled Cast Iron Dutch Skillet". Are we sure it's mojibake and not a legitimate Amazon search that somehow got transferred from a different domain?
Yup, that's the case for all characters in the range U+0080 to U+00BF, at least under ISO-8859-1; Windows-1252 reassigns bytes 80 to 9F, so there it only holds from U+00A0 up. In binary, the 8-bit code point 10abcdef gets encoded in UTF-8 as the 2-byte sequence 11000010 10abcdef. Fun little consequence of how UTF-8 works.
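The bit layout can be checked by hand. The sketch below is only an illustration: `utf8_two_byte` is a hypothetical helper, not a library function, and Python's `cp1252` codec (with undefined bytes replaced) is assumed as the Windows-1252 decoder.

```python
def utf8_two_byte(cp: int) -> bytes:
    """Hand-encode a code point in U+0080..U+07FF as 110xxxxx 10yyyyyy."""
    assert 0x80 <= cp <= 0x7FF
    return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])

# For U+0080..U+00BF the lead byte is always C2 and the second byte
# equals the code point itself.
for cp in range(0x80, 0xC0):
    encoded = utf8_two_byte(cp)
    assert encoded == chr(cp).encode("utf-8")
    assert encoded == bytes([0xC2, cp])

# Code points that come back as "Â" + the same character under Windows-1252.
survivors = [
    cp for cp in range(0x80, 0xC0)
    if utf8_two_byte(cp).decode("cp1252", errors="replace") == "\u00c2" + chr(cp)
]
print(f"U+{survivors[0]:04X}..U+{survivors[-1]:04X}")  # U+00A0..U+00BF
```

Under this codec only U+00A0 to U+00BF round-trip, because Windows-1252 maps bytes 80 to 9F to other characters or leaves them undefined.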
Well, Romans did not generally use spaces, so maybe spaces aren't Latin characters, in some sense of the word 'Latin' at least.
Indeed, and had I not seen a lot of mojibake before in my life, I probably wouldn't even have noticed this one. But the A with circumflex was the telltale sign.