The Daily WTF: Curious Perversions in Information Technology

Medinoc · 2024-02-06

This looks like the number of bytes required to encode a codepoint in UTF-8.

2024-02-06

"Maybe someone really liked Matlab and wanted 1-based indexes." SQL AUTO_INCREMENT is also 1-based, and we've already established that this is a database programmer. Why are you bringing MATLAB into this?

"Maybe it's mapping Unicode code points to the number of bytes they require to be represented? That doesn't jive either." It jives with the UTF-8 byte count for all the code points given. That does make U+270F seem an odd place to stop, but they're obviously committed to thinking in decimal.
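For anyone who wants to check the UTF-8 theory, here is a minimal sketch of the byte count as plain range checks. The class and method names are made up for illustration; the boundaries (0x7F, 0x7FF, 0xFFFF, 0x10FFFF) are the standard UTF-8 ones.

```java
public class Utf8Width {
    // Number of bytes UTF-8 needs for a single Unicode code point.
    // Sketch only: surrogates (U+D800..U+DFFF) are not rejected and report 3.
    static int utf8Length(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a code point: " + codePoint);
        }
        if (codePoint <= 0x7F) return 1;    // ASCII range
        if (codePoint <= 0x7FF) return 2;
        if (codePoint <= 0xFFFF) return 3;  // rest of the Basic Multilingual Plane
        return 4;                           // supplementary planes
    }

    public static void main(String[] args) {
        System.out.println(utf8Length(127));   // 1
        System.out.println(utf8Length(128));   // 2
        System.out.println(utf8Length(2048));  // 3
        System.out.println(utf8Length(9999));  // 3, since 9999 is U+270F
    }
}
```

Four comparisons cover every code point, so nothing about the encoding itself forces a stop at 9999.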

Oh, and TRWTF isn't even shown here, because you just know from that first column that this is going to be used in one of those "iterate over the mapping until I find the key I'm after instead of just indexing by the key" patterns that make things O(n) when they should be O(1) / O(log n).
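To make the complaint concrete, a rough sketch of the two lookup styles. The table contents and all names here are hypothetical stand-ins, since the consuming code isn't shown.

```java
import java.util.HashMap;
import java.util.Map;

public class WidthLookup {
    // Hypothetical stand-in for the table: rows of {key, width}.
    static final int[][] TABLE = { {1, 1}, {2, 1}, {3, 1}, {128, 2}, {2048, 3} };

    // The feared pattern: walk every row until the key matches. O(n) per lookup.
    static int scan(int key) {
        for (int[] row : TABLE) {
            if (row[0] == key) {
                return row[1];
            }
        }
        return -1; // key not present
    }

    // Index built once from the same rows; each lookup is then O(1) on average.
    static final Map<Integer, Integer> INDEX = new HashMap<>();
    static {
        for (int[] row : TABLE) {
            INDEX.put(row[0], row[1]);
        }
    }

    static int lookup(int key) {
        return INDEX.getOrDefault(key, -1);
    }

    public static void main(String[] args) {
        System.out.println(scan(128));   // 2
        System.out.println(lookup(128)); // 2
    }
}
```

Swapping the HashMap for a TreeMap, or binary-searching the sorted rows, would give the O(log n) variant.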

Yazeran1 · 2024-02-06

Which hints at some truly frightening string-manipulation code elsewhere (think FORTRAN-style positional data using this to determine the byte offset to start reading the date or whatnot *shivers*)! Why else would you need to know how many bytes a given UTF-8 character takes up???

Medinoc · 2024-02-06

You might need to know how many bytes a given UTF-8 character string takes up if you're dealing with fixed-size storage, but AFAIK you're supposed to already have classes for this in the standard library...
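In Java, for instance, the standard library already does this via `String.getBytes` with `StandardCharsets.UTF_8`. A small sketch, where the class name, the `fits` helper and the 5-byte limit are invented for illustration:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Storage {
    // Encoded size of a whole string in UTF-8, using only the standard library.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Hypothetical helper: does the string fit in a fixed-size field of maxBytes?
    static boolean fits(String s, int maxBytes) {
        return utf8ByteLength(s) <= maxBytes;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // "héllo", with é written as an escape
        System.out.println(s.length());        // 5 UTF-16 code units
        System.out.println(utf8ByteLength(s)); // 6 bytes, because é takes two
        System.out.println(fits(s, 5));        // false
    }
}
```

This allocates a byte array just to measure it, which is fine for a sketch but worth avoiding in a hot path.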

2024-02-06

It's a mapping between the number of WTFs in the codebase and the number of human sacrifices required to appease the divine wrath incurred by the horrible code quality.

WTFGuy · 2024-02-06

@Medinoc Classes in SQL? One shudders to imagine such a horror.

Yazeran1 · 2024-02-06

Yes, just as horrible as using XML parsing of fields inside SQL in order to be able to search for specific fields (which SHOULD have been in separate columns in the database, but for some reason were just mashed into XML and stored in a single db field instead)... And then the people that be don't understand why it performs so horribly when searching large data sets...

Yazeran

2024-02-06

And there is no doubt that someone actually wrote the code to generate this table, which could easily be used to replace this table.
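A guess at what such a generator might look like, assuming the UTF-8 theory holds. The `{key, width},` row format and all names are hypothetical, and the 1..9999 range is taken from the comments here.

```java
import java.nio.charset.StandardCharsets;

public class TableGenerator {
    // Width of one code point, found by actually encoding it with the
    // standard library. Only meaningful outside the surrogate range
    // (U+D800..U+DFFF); 1..9999 is safely below it.
    static int width(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    // One table row in a hypothetical {key, width} source format.
    static String row(int codePoint) {
        return "{" + codePoint + ", " + width(codePoint) + "},";
    }

    public static void main(String[] args) {
        for (int cp = 1; cp <= 9999; cp++) {
            System.out.println(row(cp));
        }
    }
}
```

Calling `width` directly gives the same answers as the printed table, which is rather the point.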

2024-02-06

On the assumption that the UTF-8 explanation is correct, we would have a "programmer" who doesn't understand:

Unicode
hexadecimal
arrays
if statements

Really good at counting, though. Would do it all day if you paid them.

jeremypnet · 2024-02-06

The submitter really needs to tell us what the answer is.

FTR it could be bytes needed to store the Unicode code point in UTF-8, although I wouldn't be able to explain why they stopped at 9999. Maybe they got bored, or why 1 is considered a Unicode code point but 0 isn't..

Addendum 2024-02-06 16:27: Well I screwed the punctuation of that up real good.

Anyway, the theory of UTF-8 width seems plausible, but there are easier ways to calculate it and Java probably has one built in.

molleafauss · 2024-02-06

Bonus points for the "hungarian" m_ prefix of the variable name. Never forget this is a member of a class!

2024-02-06

Other things which shouldn't exist: The bracing style of the 2-D array members.

prueg · 2024-02-06

When all you have is ~~a hammer~~ database programming experience, every problem looks like a ~~nail~~ table.

Addendum 2024-02-06 21:38: When all you have is ~a hammer~ database programming experience, every problem looks like a ~nail~ table.

TRWTF is the lack of a preview function.

Watson · 2024-02-07

/jive/jibe/

nerd4sale · 2024-02-07

"you can also write stored procedures in Java."

Ah yes, that 90s fad which has been discouraged by Oracle since 9i came out in 2001. The real WTF is still having Java stored procedures in Oracle.

Scarlet_Manuka · 2024-02-07

One thing the Oracle database does well is backwards compatibility. You wrote one stored proc in Java in 1992 and never used it again? Oracle has your back, that thing will work forever (maybe with some database parameters needing to be set).

Of course, this is a two-edged sword. It also prevents them from correcting a lot of design choices that turned out to be mistakes in retrospect. Though in many cases (e.g. empty string = null) it's not really possible for them to do anything else; so much code has been written (and continues to be written) assuming this behaviour that changing it would break all the things.

Max Character Width
