• WTFGuy (unregistered)

    The frist problem is obvious. The Num field represents a counting number but is typed as a string. Which means that "12" sorts before "2". Oops.

    Now the real problem of course was whoever typed the field as a string, and the second problem is whoever stored the data there without leading zeros.

    The solution here is pure code-monkey dumb. Complete with out of order clauses, repeated replacements that are NOP, and the non-delimited ".2" & ".3" at the end. Which would have he fun effect of turning a Num value of "21" into "021" and really fouling up their sorting.

  • (nodebb)

    Sometimes people really don't understand the purpose of a persistent organized storage device like a rational database.

  • IxamBT (unregistered) in reply to MaxiTB

    rational database

    So, not MySQL then.

  • (nodebb)

    then I suspect someone just had no clue what was happening here

    count me in that group.

    What the hell is it trying to do?

    Addendum 2023-08-21 07:16: Is it just putting leading zeroes on some numbers?

  • Pag (unregistered)

    Num holds something like 1.2.3 and that is supposed to come before 1.12.3.1 so an alphabetical sort doesn't work. Obviously you want to avoid being in this situation in the first place as I don't suppose there is a right way to achieve it in SQL.

  • Sauron (unregistered)

    Q: Why TF did you implement it that way?

    A: John told us to use replace() several times.

  • Tim Ward (unregistered)

    Input data didn't meet spec, and this fudge was easier than getting the supplying system fixed?

  • Stephan (unregistered)

    bit of googling brings you to:

    sql = sql + "ORDER BY Section, cast('/' + replace(Num , '.', '/') + '/' as hierarchyid)"

    for the right result

    :-)

  • (nodebb)

    Is Num something like a semantic version number? Version 1.2.3 should sort earlier than 1.12.0. And yes, if the version number is just saved as a string, that's going to be difficult to sort in SQL.

  • Robin (unregistered) in reply to MaxiTB

    Nothing about this appears to be rational.

    (I realise you almost certainly meant "relational", but the opportunity was too good to miss. )

  • (nodebb) in reply to Robin

    Yeah, just a typo :-)

  • Scragar (unregistered) in reply to Eric Ray

    Surely then you'd just use string parts and cast it to an integer.

    Seen code before to compare version numbers that splits it into parts and converts each part to an integer to compare it. Something like:

        CAST( LEFT(versionNo, CHARINDEX('.', versionNo) - 1) AS Integer) AS VersionMajor
    

    This is horrible to do as it gets complex fast(grabbing the second reference requires you to use the index of the first dot+1 as the index for the second, then the third requires the index of the second, etc.

    If you do bother to set this up as a computed column though you can at least get away with

        ORDER BY
            versionMajor DESC,
            versionMinor DESC,
            versionBuild DESC
    

    Which at least is very easy to understand and works brilliantly.

  • Conrad Buck (unregistered)

    Look carefully! The base of all the replaces is: '.' +Num + '.'

    That says to me that the requirement was simply "0-pad single digits". I'd bet the idea of the .'s was to ensure that the pattern matched the entire input. Somewhere outside the code we are being shown I presume the extra .'s are being stripped away again.

  • Tim R (unregistered)

    I've done almost this exact thing. I wanted some characters to be ignored when full-text indexing a column so I had to copy it into another column without those characters. It was either use nested replace or move out of an SQL-only process into a combination of SQL and code. The nested replace was a lot easier to do

  • WTFGuy (unregistered)

    It never ceases to amaze me how many people here don't understand the code before they comment. Perhaps that's the source of so many of the WTFs we each create in the wild.

    Conrad Buck got it almost right. It's simply zero-padding any single digit numbers. But there's no need to strip those added dots later because this transform is part of an ORDER BY clause. It's not affecting the returned values, just the sequence they're presented in.

  • (nodebb)

    To me that looks a lot like sorting a dynamic pivot table and the replacement feels a lot like date handling.

  • Randal L. Schwartz (github)

    Any time I see SQL constructed from string concatenation, it's automatically a WTF.

  • Conradus (unregistered) in reply to MaxiTB

    I didn't know Rational made a database. Wouldn't it clash with DB2?

  • (nodebb)

    We have a dot-separated number--say, a version number (the best example would be an IP address but this code won't work above 99) that needs to sort correctly so it's padding a zero to single digits. Other than the fact they didn't really do it right I don't see what else they could have done given what's in the database vs the desired outcome.

  • (nodebb)

    Oh, man, is this creative.

    The only way this makes sense is if the query result set contains colums named ".01.", ".02.", ".03.".

    (Since it is possible, but stupid, to name table columns that way, let's assume they are created during the query itself.)

    The "Num" variable specifies which of these columns is used to sort, in addition to the column named "Section". Replace had to be used because you can't have expressions In ORDER BY.

    There must be later processing somewhere to process the replace functions, since that could not be done as part of the query run.

  • efahl (unregistered)

    I believe we've just uncovered a subset of relational databases called "rational databases". It appears that the set of rational databases is currently identical to the empty set.

  • (nodebb)

    This is obviously padding an "a.b.c" format with zeroes to have proper numerical sorting work via a string compare ordering. And the source could be either a version number or a date, it doesn't really matter.

    Granted it's quite ugly, but what's the alternative? One could make a complicated subquery breaking this into 3 SQL fields just to use "ORDER BY a, b, c". Which could potentially do the same thing in a way more overkill way.

    Not pretty but not really a wtf.

    The only thing is that with our internal custom SQL engine at work, I'd have done this using a regex replacement. Maybe they don't have that luxury.

  • (nodebb)

    That they're doing this in an ORDER BY is what is worst of all, as it prevents the database from optimizing the thing with an index. What this ought to become is a stored generated column that can be indexed over so that the cost of building these strings is only borne when they're inserted (or that column is updated). It would still be ugly code (because SQL isn't good at string manipulation), but at least it would be fast.

  • LZ79LRU (unregistered) in reply to Ralf

    <Granted it's quite ugly, but what's the alternative?>

    Some alternatives I can think of in the time it takes me to type out a post:

    1. Do the input padding on insertion, not on each data sort. Better to be ugly once and be done with it than be ugly all the time.

    2. Do the input padding on the frontend or wherever the data is coming from. The database should be storing your input and not thinking about it's correctness.

    3. Do actual input validation on the frontend. Do I even need to explain this?

    4. Have a separate integer ID calculated based off the complex key on insertion. Like for example have 1.01.02 become 10102. That's definitively going to sort the same way. And you can always either reformat it for output or just keep a formatted string for display in your DB if you are lazy.

    Either way it boils down to doing these things either on insertion (only once, much better performance vise) or on the display side where they aren't your problem. Better to hang the UI thread than the database after all.

  • (nodebb) in reply to Ralf

    Maybe they don't have that luxury.

    Maybe they didn't want to have two problems...

  • Fordom Greeman (unregistered)

    Only 13? Amateurs! the system I work on has nested replaces that are 20 deep!

  • (nodebb)

    As for the various alternatives being proposed--all rely on modifying the database or what's in it.

    I doubt whoever wrote this had the ability to solve it correctly.

  • PlanB (unregistered) in reply to Stephan

    As long as it is SQLServer ....

  • (nodebb) in reply to dkf

    That they're doing this in an ORDER BY is what is worst of all, as it prevents the database from optimizing the thing with an index.

    That's what function-based indices are for!

Leave a comment on “More and More Replacements”

Log In or post as a guest

Replying to comment #:

« Return to Article