The Daily WTF: Curious Perversions in Information Technology

2023-08-21 Reply Admin

The frist problem is obvious. The Num field represents a counting number but is typed as a string. Which means that "12" sorts before "2". Oops.

Now the real problem of course was whoever typed the field as a string, and the second problem is whoever stored the data there without leading zeros.

The solution here is pure code-monkey dumb. Complete with out of order clauses, repeated replacements that are NOP, and the non-delimited ".2" & ".3" at the end. Which would have he fun effect of turning a Num value of "21" into "021" and really fouling up their sorting.

MaxiTB · 2023-08-21 Reply Admin

Sometimes people really don't understand the purpose of a persistent organized storage device like a rational database.

2023-08-21 Reply Admin

rational database

So, not MySQL then.

jeremypnet · 2023-08-21 Reply Admin

then I suspect someone just had no clue what was happening here

count me in that group.

What the hell is it trying to do?

Addendum 2023-08-21 07:16: Is it just putting leading zeroes on some numbers?

2023-08-21 Reply Admin

Num holds something like 1.2.3 and that is supposed to come before 1.12.3.1 so an alphabetical sort doesn't work. Obviously you want to avoid being in this situation in the first place as I don't suppose there is a right way to achieve it in SQL.

2023-08-21 Reply Admin

Q: Why TF did you implement it that way?

A: John told us to use replace() several times.

2023-08-21 Reply Admin

Input data didn't meet spec, and this fudge was easier than getting the supplying system fixed?

2023-08-21 Reply Admin

bit of googling brings you to:

sql = sql + "ORDER BY Section, cast('/' + replace(Num , '.', '/') + '/' as hierarchyid)"

for the right result

:-)

Eric Ray · 2023-08-21 Reply Admin

Is Num something like a semantic version number? Version 1.2.3 should sort earlier than 1.12.0. And yes, if the version number is just saved as a string, that's going to be difficult to sort in SQL.

2023-08-21 Reply Admin

Nothing about this appears to be rational.

(I realise you almost certainly meant "relational", but the opportunity was too good to miss. )

MaxiTB · 2023-08-21 Reply Admin

Yeah, just a typo :-)

2023-08-21 Reply Admin

Surely then you'd just use string parts and cast it to an integer.

Seen code before to compare version numbers that splits it into parts and converts each part to an integer to compare it. Something like:

    CAST( LEFT(versionNo, CHARINDEX('.', versionNo) - 1) AS Integer) AS VersionMajor

This is horrible to do as it gets complex fast(grabbing the second reference requires you to use the index of the first dot+1 as the index for the second, then the third requires the index of the second, etc.

If you do bother to set this up as a computed column though you can at least get away with

    ORDER BY
        versionMajor DESC,
        versionMinor DESC,
        versionBuild DESC

Which at least is very easy to understand and works brilliantly.

2023-08-21 Reply Admin

Look carefully! The base of all the replaces is: '.' +Num + '.'

That says to me that the requirement was simply "0-pad single digits". I'd bet the idea of the .'s was to ensure that the pattern matched the entire input. Somewhere outside the code we are being shown I presume the extra .'s are being stripped away again.

2023-08-21 Reply Admin

I've done almost this exact thing. I wanted some characters to be ignored when full-text indexing a column so I had to copy it into another column without those characters. It was either use nested replace or move out of an SQL-only process into a combination of SQL and code. The nested replace was a lot easier to do

2023-08-21 Reply Admin

It never ceases to amaze me how many people here don't understand the code before they comment. Perhaps that's the source of so many of the WTFs we each create in the wild.

Conrad Buck got it almost right. It's simply zero-padding any single digit numbers. But there's no need to strip those added dots later because this transform is part of an ORDER BY clause. It's not affecting the returned values, just the sequence they're presented in.

AGlezB · 2023-08-21 Reply Admin

To me that looks a lot like sorting a dynamic pivot table and the replacement feels a lot like date handling.

2023-08-21 Reply Admin

Any time I see SQL constructed from string concatenation, it's automatically a WTF.

2023-08-21 Reply Admin

I didn't know Rational made a database. Wouldn't it clash with DB2?

LorenPechtel · 2023-08-21 Reply Admin

We have a dot-separated number--say, a version number (the best example would be an IP address but this code won't work above 99) that needs to sort correctly so it's padding a zero to single digits. Other than the fact they didn't really do it right I don't see what else they could have done given what's in the database vs the desired outcome.

BatConley · 2023-08-21 Reply Admin

Oh, man, is this creative.

The only way this makes sense is if the query result set contains colums named ".01.", ".02.", ".03.".

(Since it is possible, but stupid, to name table columns that way, let's assume they are created during the query itself.)

The "Num" variable specifies which of these columns is used to sort, in addition to the column named "Section". Replace had to be used because you can't have expressions In ORDER BY.

There must be later processing somewhere to process the replace functions, since that could not be done as part of the query run.

2023-08-21 Reply Admin

I believe we've just uncovered a subset of relational databases called "rational databases". It appears that the set of rational databases is currently identical to the empty set.

Ralf · 2023-08-22 Reply Admin

This is obviously padding an "a.b.c" format with zeroes to have proper numerical sorting work via a string compare ordering. And the source could be either a version number or a date, it doesn't really matter.

Granted it's quite ugly, but what's the alternative? One could make a complicated subquery breaking this into 3 SQL fields just to use "ORDER BY a, b, c". Which could potentially do the same thing in a way more overkill way.

Not pretty but not really a wtf.

The only thing is that with our internal custom SQL engine at work, I'd have done this using a regex replacement. Maybe they don't have that luxury.

dkf · 2023-08-22 Reply Admin

That they're doing this in an ORDER BY is what is worst of all, as it prevents the database from optimizing the thing with an index. What this ought to become is a stored generated column that can be indexed over so that the cost of building these strings is only borne when they're inserted (or that column is updated). It would still be ugly code (because SQL isn't good at string manipulation), but at least it would be fast.

2023-08-22 Reply Admin

Some alternatives I can think of in the time it takes me to type out a post:

Do the input padding on insertion, not on each data sort. Better to be ugly once and be done with it than be ugly all the time.
Do the input padding on the frontend or wherever the data is coming from. The database should be storing your input and not thinking about it's correctness.
Do actual input validation on the frontend. Do I even need to explain this?
Have a separate integer ID calculated based off the complex key on insertion. Like for example have 1.01.02 become 10102. That's definitively going to sort the same way. And you can always either reformat it for output or just keep a formatted string for display in your DB if you are lazy.

Either way it boils down to doing these things either on insertion (only once, much better performance vise) or on the display side where they aren't your problem. Better to hang the UI thread than the database after all.

Steve_The_Cynic · 2023-08-22 Reply Admin

Maybe they don't have that luxury.

Maybe they didn't want to have two problems...

2023-08-22 Reply Admin

Only 13? Amateurs! the system I work on has nested replaces that are 20 deep!

LorenPechtel · 2023-08-22 Reply Admin

As for the various alternatives being proposed--all rely on modifying the database or what's in it.

I doubt whoever wrote this had the ability to solve it correctly.

2023-08-23 Reply Admin

As long as it is SQLServer ....

Scarlet_Manuka · 2023-08-25 Reply Admin

That they're doing this in an ORDER BY is what is worst of all, as it prevents the database from optimizing the thing with an index.

That's what function-based indices are for!

More and More Replacements

Leave a comment on “More and More Replacements”