The Daily WTF: Curious Perversions in Information Technology

Red5 · 2006-10-18 Reply Admin

"The client instead bought a much faster and clustered database server."

If all else fails, just throw more money at it. That's the real WTF here.

2006-10-18 Reply Admin

Alex Papadimoulis:
he ... rewrote the procedure to use a single query to handle both the INSERT and the SELECT. ... It ran in less than five minutes instead of several hours. Sadly, since he was "just a Contractor," his improvement never saw the light of day. The client instead bought a much faster and clustered database server.

Hey, they're the clients. Who're we to blame them if they wish to spend massive amounts of cash on a new server instead of improving the design to overcome such trouble from square one :)

2006-10-18 Reply Admin

Well, at least the consultants have something new to screw up.

2006-10-18 Reply Admin

Oh dear Elvis, that's so sad. Too bad the ones that can't code are paid more. But it's not the coding skills that count, it's about how good you can sell yourself... It's a hard knock life for us...

PS. Not gonna tell you what my captcha was...

2006-10-18 Reply Admin

I think I'm going to cry.

R.Flowers · 2006-10-18 Reply Admin

Alex Papadimoulis:
...the consultant's service had to handle the constraints of bulk-sending, often times tens of thousands of individual SMS messages per transaction. Yes, it's one of those systems used by those "Text 782537 Your $1.99 Joke-A-Day Subscription" services.

The fact that tens of thousands of people (and I presume more) actually buy these services... that depresses me to no end.

Red5:
"The client instead bought a much faster and clustered database server."
If all else fails, just throw more money at it. That's the real WTF here.

Q. But is it scalable?

A. What? Of course it's scalable - you can always spend more money!

BradC · 2006-10-18 Reply Admin

So are these two SQL statements functionally identical? (and identical speed to run?)

SELECT *
FROM table1, table2
WHERE table1.key = table2.key

and

SELECT *
FROM table1
INNER JOIN table2
ON table1.key = table2.key

I usually use the second, but the example uses the first--is this a secondary wtf? Or just a usage convention difference?

2006-10-18 Reply Admin

Of course, it wan't scalable. The consultants blamed the problem on the lack of hardware and when they put in the hardware they stole JM's fix. So of course, now it scales, but not because there is a cluster of servers.

biter · 2006-10-18 Reply Admin

I had to do some SMS integration on an existing project as my first assignment at a new job. The previous programmer opted to leave it unfinished. I'm guessing he discovered how much of a pain in the ass it was going to be. The service is slow, unreliable, and - best of all - runs in a browser window for gasp hours. Sadly, it is the most well-coded project I have taken over to date.

jaspax · 2006-10-18 Reply Admin

I'm sure that somewhere out there are competent, proefessional, and dedicated consultants that really are worth every bit as much as they're paid. However, reading this site almost convinces me that they don't exist. I'm developing a bitter prejudice against outsourcing and consulting. Which is bad, because my boss is trying to convince me that we should outsource our next component...

(Ah, but our business is one giant WTF, so what do I care?)

2006-10-18 Reply Admin

Fools! I am pretty sure the optimal solution involved TRUNCATE TABLE or perhaps DROP DATABASE.

2006-10-18 Reply Admin

Red5:
"The client instead bought a much faster and clustered database server."
If all else fails, just throw more money at it. That's the real WTF here.

Not if you're the hardware vendor... then it's a FTW.

2006-10-18 Reply Admin

The first one is easier to understand for those that don't use SQL commonly.

adamsane · 2006-10-18 Reply Admin

Yes, they are identical functionally.

However, on some db systems they will work differently.

Command #1:

Join all rows together
Use the where clause to exclude rows
Return the result set

Command #2:

Join only the rows were the On clause is true
Return the result set

Most of the large (expensive) DB like oracle and MSSQL will convert this to the sane execution plan.

2006-10-18 Reply Admin

So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

ammoQ · 2006-10-18 Reply Admin

BradC:
So are these two SQL statements functionally identical? (and identical speed to run?)
SELECT *
FROM table1, table2
WHERE table1.key = table2.key
and
SELECT *
FROM table1
INNER JOIN table2
ON table1.key = table2.key
I usually use the second, but the example uses the first--is this a secondary wtf? Or just a usage convention difference?

In the world of Oracle, the first version is more common. And yes, the query optimizer usually understands how to execute that efficiently.

2006-10-18 Reply Admin

Anonymous:
So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

The wtf is the loop to make individual inserts. (If I'm reading correctly.)

2006-10-18 Reply Admin

Anonymous:
Anonymous:
So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

The wtf is the loop to make individual inserts. (If I'm reading correctly.)

I'm obviously not reading correctly. The non-wtf still takes a long time because it's still a crapload of data to move? Maybe there's more info we don't see? I dunno.

captcha = bedtime. I wish.

2006-10-18 Reply Admin

Anonymous:
Anonymous:
So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

The wtf is the loop to make individual inserts. (If I'm reading correctly.)

Yes, sure. I was talking about the sane and optimized version of the query... ?

GrandmasterB · 2006-10-18 Reply Admin

This whole thing makes no sense.... even fetching each of the 2000 rows individually should NOT take 3 hours. 3 seconds, maybe. Were they running the db on an old palm pilot or something?

2006-10-18 Reply Admin

GrandmasterB:
This whole thing makes no sense.... even fetching each of the 2000 rows individually should NOT take 3 hours. 3 seconds, maybe. Were they running the db on an old palm pilot or something?

I'll chalk it up to anonymizing taking out necessary info.

rbriem · 2006-10-18 Reply Admin

BradC:

So are these two SQL statements functionally identical? (and identical speed to run?)
SELECT *
FROM table1, table2
WHERE table1.key = table2.key
and
SELECT *
FROM table1
INNER JOIN table2
ON table1.key = table2.key
I usually use the second, but the example uses the first--is this a secondary wtf? Or just a usage convention difference?

The INNER JOIN suppresses any records from either table that don't have a matching key field in the other (pretty sure - been a while since I SQL'ed); IIRC, this is the default behavior for some databases, so this *may* be functionally equivalent. Speedwise, an inner joind could be faster than an outerjoin depending on index usage.

Vague enough for you? Good. Now I'm gonna go order me a batch of business cards that say "Doer of Things".

CodeRage · 2006-10-18 Reply Admin

Honestly, I think the real WTF here is using the technique of copying a bunch of rows from one table to another to indicate a change in status. This sounds like a really stupid design. I guess something like this might need to be done if there were two different databases and systems involved, but the optimization done by our hero would indicate that these two tables are in the same database.

Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?

Please fill me in if I'm completely missing something here!

2006-10-18 Reply Admin

GrandmasterB:
This whole thing makes no sense.... even fetching each of the 2000 rows individually should NOT take 3 hours. 3 seconds, maybe. Were they running the db on an old palm pilot or something?

Actually you'll notice that they join to several different tables. If some of these tables are large (IE users table) and they do not have proper index's (entirely possible) then this query could end up taking quite some time.

However, given the information we have, I believe that this query should preform faster.

--doc0tis

zip · 2006-10-18 Reply Admin

Anonymous:
Anonymous:
Anonymous:

So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

The wtf is the loop to make individual inserts. (If I'm reading correctly.)
Yes, sure. I was talking about the sane and optimized version of the query... ?

This bothers me too. 5 minutes for 2000 insert commands? That seems really slow. I must be missing something.

Jeff S · 2006-10-18 Reply Admin

Anonymous:
Anonymous:
So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

The wtf is the loop to make individual inserts. (If I'm reading correctly.)

It's not only that they are individual inserts, it's also the fact that each row must first be returned to client, then then the client sends each row back to the server. Even using a CURSOR on the SQL Server itself would be more efficient that forcing the client to process each row ,because at least the data never needs to leave SQL. Of course, the simple set-based INSERT that was never implemented is the easiest and most efficient solution.

2006-10-18 Reply Admin

Anonymous:
I think I'm going to cry.

I'm with you! I didn't realize how fortunate I was to be at a company (or at least a group within the company) where getting the best solution matters more than making yourself look good, but I promise never to take it for granted again.

(captcha: craptastic -- you'd only see that on thedailywtf!)

Jeff S · 2006-10-18 Reply Admin

zip:
Anonymous:
Anonymous:
Anonymous:

So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

The wtf is the loop to make individual inserts. (If I'm reading correctly.)
Yes, sure. I was talking about the sane and optimized version of the query... ?
This bothers me too. 5 minutes for 2000 insert commands? That seems really slow. I must be missing something.

In addition to the slowness caused by what I mentioned in my previous post, I can only guess that maybe a cross join was involved accidentally in the sql statement constructed .. we can't see the whole thing, but it is doing all joins in the WHERE clause so it's very possible. That's perhaps the biggest consequence of not using ANSI syntax -- it is not always clear when all tables are not joined properly and it is very easy to miss a relation or two.

2006-10-18 Reply Admin

The WTF is if the contractor who is more qualified than the consultants remains a contractor.

GoatCheez · 2006-10-18 Reply Admin

Alex Papadimoulis:
A "smallish" broadcast of 2,000 messages took about three hours to process. From a technical perspective, this involved inserting 2,000 rows in a "Queue" table in their Microsoft SQL Server database. Obviously, the users were not to thrilled about this either, especially when they wanted to broadcast news to their subscribers.

So lemme get this straight, Alex couldn't post the WTF until 4pm today because his joke sending SMS service was backed up for 3 hours? Now THERE'S the WTF!

;-P

2006-10-18 Reply Admin

Not that this is related to the WTF, but... most modern databases will treat those to queries identically. The second form is usually preferrable, though, since some outer joins will be ambiguous when expressed that way. Also, at least in my opinion, the second form is easier to understand for joins with many tables.

Alex: this is the first time I've literally gasped at a daily WTF. This really, really hurt. As a DBA myself, I hate to see the way companies are penny wise but pound foolish about getting in soemone with database expertise.

GrandmasterB · 2006-10-18 Reply Admin

Anonymous:
Actually you'll notice that they join to several different tables. If some of these tables are large (IE users table) and they do not have proper index's (entirely possible) then this query could end up taking quite some time.

I noticed it... but still, 2 hours? I suspect its the inserts that are the time killer here, since the same select query is run in both case. The difference is 2000 inserts vs 1. If the db is over-indexed /super-enterprisey, low on memory or disk space, or just in the proximity of a 'consultant' the performance can get dragged down.

2006-10-18 Reply Admin

zip:
Anonymous:
Anonymous:
Anonymous:

So what I don't get (don't know much about databases myself) is:
Why does the non-WTF query still take nearly five minutes for 2000 SMS on a presumably
large server? A SMS is not longer than ~160 characters afaik, that's not THAT
much of data.

The wtf is the loop to make individual inserts. (If I'm reading correctly.)
Yes, sure. I was talking about the sane and optimized version of the query... ?
This bothers me too. 5 minutes for 2000 insert commands? That seems really slow. I must be missing something.

Fools - Alex anonymized out the cursor that was created to handle the 2000 row result set

2006-10-18 Reply Admin

I want my new title to be:

"Grand Imperial PooBah of all Things Great and Small"

2006-10-18 Reply Admin

CodeRage:
Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?

Separate queue tables are good for processing transactions that may fail and be re-tried multiple times. You can have a single "INSERT INTO foo SELECT FROM ..." that executes in a couple of seconds. Then you have a process (or set of processes) that work through those sending the messages. This is why MSMQ exists.

2006-10-18 Reply Admin

CodeRage:

Honestly, I think the real WTF here is using the technique of copying a bunch of rows from one table to another to indicate a change in status. This sounds like a really stupid design. I guess something like this might need to be done if there were two different databases and systems involved, but the optimization done by our hero would indicate that these two tables are in the same database.
Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?
Please fill me in if I'm completely missing something here!

Such designs aren't uncommon, and have merit. Think about it. There's a queue table (Pending) and a history table (Complete). The history table may be date oriented or partitioned for manageability and performance, especially for lookups and purges. However, your batch broadcast shouldn't have to look through all your historical data to find only the Pending records.

If one were designing it now, for a DBMS such as Oracle, you'd use composite partitioning (range by date, list by status), or you'd have a function-based index to allow you to highlight only the records in a Pending status.

But if you want something simple, and you don't really understand all the features and functions of the underlying RDBMS -- and exactly how many code-monkey contractors do? -- you'd design it exactly in such a way.

2006-10-18 Reply Admin

My eyes...the goggles do nothing!

As for the forms of select, on MS SQL Server 2000, the first form is about 6 times as slow as the second form. They will return the same results. I'm basing this off of a stored procedure that would take 3 hours to run and when I switched it from the layout of #1 to #2 it took only 30 minutes. I think 7 tables were being joined together. I can't say anything about the speed of MS SQL Server 2005 as I keep being put on adding new features to old systems, reusing old code for a new site, or bug fixing.

BradC:
So are these two SQL statements functionally identical? (and identical speed to run?)
SELECT *
FROM table1, table2
WHERE table1.key = table2.key
and
SELECT *
FROM table1
INNER JOIN table2
ON table1.key = table2.key
I usually use the second, but the example uses the first--is this a secondary wtf? Or just a usage convention difference?

2006-10-18 Reply Admin

CodeRage:
Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?
Please fill me in if I'm completely missing something here!

If you're dealing with large volumes of data, archiving old rows out of a table sometimes makes sense as a way to optimize queries over that table. It's not ideal normalized design, but there is a logical argument for it.

Of course, we don't know what details were anonymized/edited out, either... and we obviously can't assume everyone working on this project was competent!

2006-10-18 Reply Admin

CodeRage:
Honestly, I think the real WTF here is using the technique of copying a bunch of rows from one table to another to indicate a change in status. This sounds like a really stupid design. I guess something like this might need to be done if there were two different databases and systems involved, but the optimization done by our hero would indicate that these two tables are in the same database.
Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?
Please fill me in if I'm completely missing something here!

you are obviously a contractor. the consultant secrets are not for your eyes...

emurphy · 2006-10-18 Reply Admin

CodeRage:
Honestly, I think the real WTF here is using the technique of copying a bunch of rows from one table to another to indicate a change in status. This sounds like a really stupid design. I guess something like this might need to be done if there were two different databases and systems involved, but the optimization done by our hero would indicate that these two tables are in the same database.
Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?
Please fill me in if I'm completely missing something here!

They're copying a bunch of rows from a set of tables to another table. Consider:

Table 1 contains records like

"send message M1 to group G1"

"send message M2 to group G2"

...

"send message M99 to group G1"

"send message M100 to group G2"

Table 2 contains records like

"group G1 contains user U1"

"group G1 contains user U2"

...

"group G1 contains user U100"

"group G2 contains user U101"

...

"group G2 contains user U150"

What this query does is throw a bunch of expanded data into a third table:

"send message M1 to user U1"

"send message M1 to user U2"

etc.

Then another process pulls a row at a time from the third table, actually sends the message, and then (probably) deletes the row.

ParkinT · 2006-10-18 Reply Admin

Isn't that typical?

Don't solve the software problem. Just throw more hardware at it and hope it keeps working!

How many times have you seen an example of this in your career?

Yes, I am asking for some responses to the question.

emurphy · 2006-10-18 Reply Admin

Anonymous:
As for the forms of select, on MS SQL Server 2000, the first form is about 6 times as slow as the second form. They will return the same results. I'm basing this off of a stored procedure that would take 3 hours to run and when I switched it from the layout of #1 to #2 it took only 30 minutes. I think 7 tables were being joined together. I can't say anything about the speed of MS SQL Server 2005 as I keep being put on adding new features to old systems, reusing old code for a new site, or bug fixing.

Are you sure it wasn't just taking advantage of memory cache the second time? As a sanity check, try doing #2 and then #1.

Even if they run at equal speeds, I still prefer #2 because it separates the conditions into useful blocks:

select (list of fields)

from A

join B on (list of conditions that join B to A)

join C on (list of conditions that join C to A and/or B)

where (list of non-join conditions, e.g. the ever-popular "salary >= 80000")

CodeRage · 2006-10-18 Reply Admin

Anonymous:
CodeRage:

Honestly, I think the real WTF here is using the technique of copying a bunch of rows from one table to another to indicate a change in status. This sounds like a really stupid design. I guess something like this might need to be done if there were two different databases and systems involved, but the optimization done by our hero would indicate that these two tables are in the same database.
Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?
Please fill me in if I'm completely missing something here!

Such designs aren't uncommon, and have merit. Think about it. There's a queue table (Pending) and a history table (Complete). The history table may be date oriented or partitioned for manageability and performance, especially for lookups and purges. However, your batch broadcast shouldn't have to look through all your historical data to find only the Pending records.
If one were designing it now, for a DBMS such as Oracle, you'd use composite partitioning (range by date, list by status), or you'd have a function-based index to allow you to highlight only the records in a Pending status.
But if you want something simple, and you don't really understand all the features and functions of the underlying RDBMS -- and exactly how many code-monkey contractors do? -- you'd design it exactly in such a way.

Well yes, but since you still have to 'look through all your historical records' to find the records with which to populate the queue, why not just kill two birds with one stone and have the process that finds these messages actually be the one that also sends the message and marks it as sent?

Or, if there really is a need to send many in bulk in a short amount of time (so, for example, the time between the first and last message is minimized so it is as simultaneous as possible for the recipients), why not just create a flat file of messages to send and have a second process grab that file and send them quickly. Even stranger about this WTF is they still have not removed and/or marked the messages in the first table as "sent to queue", so I imagine this has to be done also. Given you still have to do this in some way, all these multi table solutions seem like a huge amount of DB processing, and the one table/status field solution I give seems to me it would still have better performance.

2006-10-18 Reply Admin

Maybe the SMS DB is nowhere near the web server or something

emurphy · 2006-10-18 Reply Admin

Anonymous:
If one were designing it now, for a DBMS such as Oracle, you'd use composite partitioning (range by date, list by status), or you'd have a function-based index to allow you to highlight only the records in a Pending status.

Not necessarily. In this case, history of attempts normalizes differently (e.g. "send message M to group G" x 1) from pending/complete/failed transmissions (e.g. "send message M to user U" x number of users in group G), and you probably wouldn't explicitly store complete transmissions (except during testing) - you would simply assume that anything derivable from the history of attempts, but not in the transmission table, was transmitted without incident and thus deleted from the transmission table.

Red5 · 2006-10-18 Reply Admin

Anonymous:
I want my new title to be:
"Grand Imperial PooBah of all Things Great and Small"

LOL, You picked up on another minor WTF, job titles and their ambiguous meanings.

I choose "Commander of the Known Universe" for my next title :)

2006-10-18 Reply Admin

The real WTF(tm) is that the client was performing these SQL Queries at all. Why didn't they have a data layer, developed and maintained by a competent DBA? Then he could have written Stored Procedures to do all of this kind of transactional stuff, it would have been fast and all that money spent for new systems could be saved.

I will say however, that all that aside, the original WTF with the query is a very worthy WTF indeed.

maweki · 2006-10-18 Reply Admin

JOIN is evil, indeed very evil.

You can conclude that JOIN was invented by Oracle :D

achille · 2006-10-18 Reply Admin

BradC:
So are these two SQL statements functionally identical? (and identical speed to run?)
SELECT *
FROM table1, table2
WHERE table1.key = table2.key
and
SELECT *
FROM table1
INNER JOIN table2
ON table1.key = table2.key
I usually use the second, but the example uses the first--is this a secondary wtf? Or just a usage convention difference?

The first syntax was used before Oracle 9i came out, what it means is simply: perform a cartesian product of the two rows and select only the ones that referr to the same person.

There virtually no difference other than the syntax. I personally like the old way better. The only problem is when you want to do a full outer join (and other other outer join restrictions), which you can't really (You have to use the Union operator)

2006-10-18 Reply Admin

CodeRage:
Honestly, I think the real WTF here is using the technique of copying a bunch of rows from one table to another to indicate a change in status. This sounds like a really stupid design. I guess something like this might need to be done if there were two different databases and systems involved, but the optimization done by our hero would indicate that these two tables are in the same database.
Why not use just ONE table, and have some sort of STATUS field on it, and set such status field to "SENT" or similiar after each messages was processed/sent/whatever? Really, given the need to fix the problem quickly, our hero's solution was great, but WTF, what kind of design is this in the first place?
Please fill me in if I'm completely missing something here!

I'm not defending the WTF code, but CodeRage, you missed the fact that the data is coming from several tables, not just one. They were cherry-picking values from across the data model and inserting them into a single record. A good coder with deficient data modeling skills would see this as a good solution.

However, since this is a "batch" and the data seems to be coming from far and wide, I would imagine that there is a whole lot of data duplication here. A single record probably looks something like this:

(sender, recipient, time_sent, subject, message, etc.) I draw this conclusion from the OP: "...the web UI would send the service a message with a list of recipients and the service would convert it to individually-addressed SMS messages and then send the messages to the appropriate queue." The last three columns in the record are dupes.

Additionally: 3 hours for 2000 records....16 rows a minute? BS. Something else must have been wrong. Even 2000 records in 5 minutes sounds slow on a reasonably busy system. And don't respond with "Well, perhaps the sender service was hammering the DB with updates and deletes." It couldn't hammer the DB; It's quite obvious that they couldn't populate the queues fast enough to give the sender service any considerable amount of work to do to generate your scape-goat updates and deletes.

Very Slow Service

Leave a comment on “Very Slow Service”