- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
…yeah, that didn't really make it any better :laughing:
INB4 you did that manually
Admin
20,000 fields under the select.
Admin
Next step: ANSI joins.
Admin
I readily admit that I know just enough SQL to be dangerous, but there are a couple of specific things, beyond the overall complexity, that make me wonder if they are WTFs:
Why is the
enterprise
table used three times with different aliases?Obviously, I don't know their data, but wouldn't/shouldn't the
order_id
be unique, rendering the rest of the terms in theGROUP BY
clause superfluous?Somebody please kill those magic numbers. WTF is special about
n1_status_id
s 3 and 10, or all thosecontact_type_id
s?BTW, @RaceProUK, that really did help a lot.
Admin
From the WHERE,
E
is the customer's enterprise,rs
is the reseller's enterprise, andprs
is the reseller's referrer.Admin
They're different things:
Customer, reseller, referrer (apparently). This is a major reason why you should always use ANSI joins:
It makes it clearer how the table is being used and makes it more difficult to forget to specify the join condition / spot missing ones. I've had to clean up retarded queries from multiple cow-orkers that made mistakes because they used those old style joins. :facepunch:
Also: :hanzo:d
Admin
You have to put anything that isn't an aggregate function (
sum
,max
, etc) in theGROUP BY
clause.Admin
Depends on your database engine. I know oracle requires that, but mysql does not.
Admin
Eh...fair enough.
Admin
Ah, that makes sense. You're (well, not you; the idiot that wrote that query) doing three separate joins to get three separate sets of rows. One might misread the other form of the query to think you were getting one set of rows that satisfied all three conditions; I did.
Admin
To me, the big deal is all those subqueries and the UNION. If you really want to slow things down, that can be a very effective way to do it.
Admin
Bit-masking in SQL? :wtf:
Admin
http://www.postgresql.org/docs/8.0/static/functions-bitstring.html
Admin
I work with worse on a daily basis, sadly. It's such a treat to deal with massive stored procedures that take an "object ID" as a parameter and do different selects, calls and inserts based on said object ID, which later is read by a SQL job from a table which does other things based on the object ID and other parameters. It's an entire application built in SQL so it can be automated.
Admin
Honestly, it looks pretty reasonable once formatted. I would have personally moved the person selector in a subquery in the from clause.
Speed: It might be slow for a number of reasons (indexes, partitions, sort space, statistics, optimizer, etc.), but the query looks reasonable. If you have worked with complex data models and need to make reports that's how SQL looks (and this one doesn't include anything really fancy like hierarchical clauses).
ANSI JOIN: NO! They look nice and work for simple models but when you start making complex joins, or develop complex queries, you stop using them: it makes the code unreadable or unworkable. The key is properly naming your alias.
Admin
This isn't really funny. :wink:
I had a job once where I had to deal with a query like this. It wasn't fun at all.
Admin
Except for the magic numbers and the person_id selector subquery (and MAYBE the lack of ANSI joins), I can't see an issue with this query. Even the what.thedailywtf.com queries would look worse than this if formatted into one line
If anyone doesn't agree, feel free to show me your improved version
Admin
Hey, @antiquarian, I think we have a live one!
Admin
I once had to fix a 1250 line report query. After trying a few hours, I rewrote it.
Admin
You don't write nontrivial queries on one line. Though I agree if this came out of the DB layer's debug statements, intermediate newlines may have been outside the query text, and thus already gone.
Admin
Deliverance?
Admin
There are a whole host of reasons how this query could have ended up in one line on the front page, I doubt it is how the original query looked. But I agree, if it was the case, it would be TRWTF
Admin
Just a standard Front Page Troll™. Nothing to see here, please move along.
Admin
No trolling, simply years of experience.
ANSI joins work fine when your query is structured like a simple tree, but when your join conditions are graph like, you end jumping all over the place to try to make sense of what is going on.
Admin
Yeah, this is really not so bad. Some of the operations could have been done before/after, but doing everything in a single SQL guarantees read consistency without the headache of keeping what could potentially become a much longer running transaction hanging out there. Just because an operation has a lot of words in it doesn't mean its a bad one.
As for the aliases, I do wish they'd spent the extra bytes and used words like referring_enterprise instead of PRS. Makes a lot of difference to the readability of any query.
Admin
They all say that.
Admin
Not using ANSI joins mean that you're going to be jumping all over the place to make sense of what's going on regardless of your conditions.
Admin
It's the query writing equivalent of application based referential integrity.
Admin
Ah that's a tiny query - I look after a bunch of legacy Access applications (yes, I know) where the original developer decided it was a good design decision to have numerous 300-line queries (full of subqueries and unions) stored in tables and executed via VBA when required.
Admin
Which is why I said before: you need to properly name your aliases. When you have 20-30 lines of join conditions you get to read them on a single screen page.
You have also to take into account that I'm talking about complex SQL: not simply joining a few tables, but also some subqueries.
Admin
I don't give a shit how well or poorly you named your aliases.
Yes, whether they're next to the table or not. I'm not sure what your point is.
Why should that matter? I use stuff like that all the time. Now I just think you like to make things difficult for yourself.
That said, I've used non-ANSI joins in some situations where they actually made things easier, but those have been very rare. Separating your joins from their conditions is just asking for trouble, IME.
Admin
http://www.tailgatingideas.com/wp-content/uploads/2009/01/math_cartoon.jpg
Admin
Yeah, once formatted it's not so horrible. Not pretty, but I've seen much worse. The query definitely needs ANSI joins, that old-school Oracle join syntax has got to go (yes, I know, they defined it before ANSI, but that was decades ago). There are a couple of pieces that jump out as particularly evil, though. Whatever the hell is going on here...
Sign(Nvl(Sum(DISTINCT(r.order_id * ( -(Sign(Nvl(r.override_person_id, 0)) -1)) )), 0)) AS rsrc_ctrl_violation,
...that logic could be pulled out into a view. And that DECODE with the long list of magic numbers is a disaster.
Admin
As someone who's written an eight-way inner join on subqueries, including a unioned, joining subquery, I'd rather have the ANSI join syntax, TYVM.
Also, try to write an outer join without ANSI syntax in a non-Orrible database. ;)
Yeah, if they could get their heads out of the Oracle 9 days, they could put those in virtual columns....Still not as :wtf: as the recursive query I once experimented with to "unpack" CLOBs of comma-separated child record IDs into 1 row per child record ID, though -- that one was scrapped in favor of doing the job in C++ though because it was just too dang slow, despite the C++ code having to make a bunch of network round trips to talk to the DB.
Admin
I've seen some pretty huge ones, but our huge ones tend to be in the data processing logic, with 100-200 columns and 2-3 unions across database links. Anonimized, slightly, from the comments saying what's going on:
Of course, each select there gets different sets of filters, different group by's, etc...
Admin
There's a typo: I'm pretty sure "C.CUST_SERVICE_REGI ON" in the last GROUP BY is supposed to be "C.CUST_SERVICE_REGION".
Admin
In TDWTF system Database WTFs are represented by two separate, yet equally important groups: The programs that dump the entire DB processing everything in code, and the programs that attempt to do all the logic in SQL. These are their stories.
BUMP BUMM!
Admin
And then you need to restructure one of the joins into an
OUTER JOIN
and the whole query goes to shit.Source: year(s) of experience. One too many...
Admin
It could be worse. The developer (who happens to live in India) could take your flowchart literally, and interpret the loops and branches usings LOOP and IF/THEN. So all these values were iterated over, additional queries run, branches branched, to make the whole thing excruciatingly slow. In one of the more complex things I've done in my career, I reduced the 1076 lines monstrosity to 440 lines, used a (VERY complex) query, and brought running time from around one hour to 16 seconds.
Admin
Coding confession time:
It also hits a temporary table in the meantime. In my defense, the queries in cursors are already a good 20-30 lines long, and joining them against each other is fairly impossible.
Admin
Based on my fifteen years of experience, I find ANSI joins way easier to read. Personal preference for "readable" is subjective, but I think a preference for ANSI style joins is the majority opinion. There are also issues with doing outer joins using the old-style syntax (if your DBMS even supports it, which some no longer do).
Anyway, to actually help the original poster:
(1) I suspect that instead of two big queries UNIONed together, this could be combined into one, which would be faster.
(2) I think there is a bug. I suspect this:
Should actually be:
I mean, I can't be sure, by my intuition is that the latter is the intended logic. If this is stored server-side in a stored procedure or something, a couple of inline comments could be used to make it clear.
Admin
I get it! TRWTF is SQL itself, right?
Admin
Or never did -- I don't think SQLite3 ever supported any sort of old-style outer join syntax, for instance...
Admin
Probably there is an entry for each status (e.g. received, checked, confirmed, sent, received, paid) the order runs through. That could be used to track the timestamps per order step. And probably several storage locations can be needed to fullfil an order. That would explain the addition of the "o" table.
Admin
Ah, good old optimizer_features_enable('9.2.0')...because why retest the application every five years there is a major new Oracle release? Let's throw away all improvements done on the optimizer in the last 13 years...god forbid it might even make sense out of this mess!
Admin
I suspect you are right about the bug -- and once again, ANSI join syntax would have made things clearer. I would want to see something like this... LEFT OUTER JOIN country itm_ctry ON item.country = itm_ctry.country_cd LEFT OUTER JOIN country e_ctry ON e.country = e_ctry.country_cd
.
Admin
Did you notice how they used a UNION to set a column? reissue_ind is set based on which part of the query you are in. Was this necessary or could you modify the WHERE clause and have a calculated column? Note that the problem isn't that they did that, it is that answering the question of whether it was the right thing to do is nearly impossible now.
BTW for the curious the set of tables is different so this way avoid OUTER joins, but that isn't necessarily a good thing. It also accepts two sets of variables, using one for the former and another for the latter. This likely means they are duplicating the inputted parameters but there is no way to be sure.
By using ANSI joins (and a couple OUTER joins) you would end up with half the code and could more easily tell what the relationships are.
Also using ANSI joins does not preclude you from excluding things in the WHERE clause. For instance I usually try to do primary keys in the ON and anything more complex in the WHERE to avoid that issue to great effect.
Admin
For some reason it appears that this was a query that slowly aggregated over a period of time to become the query we see here. This is similar to the Winchester mystery house theory that one must keep building to stay alive. Unfortunately the only real solution is a tactical nuclear strike to eliminate the enemy and start over. I'm sure that a good BOFH would have a solution that involves "extreme prejudice" somewhere, and that might be necessary as well.
I'm sure that this started out with "good intentions", but the path to hell is paved with them, and this is the end of the line.
Admin
No, a good BOFH would have a solution that involves a bag of lime and little-travelled forest path accessible by car.
Admin
A quick and obvious example: