The Daily WTF: Curious Perversions in Information Technology

Buzer · 2009-02-05 Reply Admin

At least he didn't use XML on DataType column...

dsckeld · 2009-02-05 Reply Admin

Wouldn't that only be enterprisey enough if the offsets had been defined in a table?

Apart from that: If someone seems to be guarding their secrets a little too zealously, don't be afraid, be VERY afraid.

2009-02-05 Reply Admin

If you are going to base this on elapsed times, then you need to get the data buffered the same, and the disks the same speed. That is hard to do on your laptop. But you could measure teh CPU used instead, and still see a large difference.

Aaron · 2009-02-05 Reply Admin

It could have been an indexed view, no?

Still would make more sense to parse during import, but nevertheless, if it took less than 1 second to look up records, the view could very well have been indexed even if the table wasn't.

2009-02-05 Reply Admin

For 1000 selects based on the indexed VARCHAR, 30ms. For 1000 selects based on a substring, a whopping 379220ms.

I dare to ask, but you ARE aware about expression indices, aren't you?

Example: http://www.postgresql.org/docs/8.3/interactive/indexes-expressional.html

Quango · 2009-02-05 Reply Admin

Honeyman:
> For 1000 selects based on the indexed VARCHAR, 30ms. For 1000 selects based on a substring, a whopping 379220ms.
I dare to ask, but you ARE aware about expression indices, aren't you?

Example: http://www.postgresql.org/docs/8.3/interactive/indexes-expressional.html

I think since DTS was mentioned it's some flavour of Microsoft SQL Server not PostGres - don't think MS SQL supports this.

Even so doing something inefficiently in a slightly more efficient way is probably going to be slower than doing efficiently just once.

2009-02-05 Reply Admin

Maybe I'm just dumb, but what does the beginning of the story have to do with the end?

What does a retarded table that is super slow to query have to do with the missing data?

Quango · 2009-02-05 Reply Admin

I'm a sucker for such a challenge.. Just ran this SQL on my workstation

1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

2009-02-05 Reply Admin

greg:
Maybe I'm just dumb, but what does the beginning of the story have to do with the end?
What does a retarded table that is super slow to query have to do with the missing data?

Agreed. Even if the table were fixed, I don't see how the missing number would magically appear, unless maybe it was missing due to column spacing. Did he ever find it? Inquiring minds want to know!

2009-02-05 Reply Admin

Even so doing something inefficiently in a slightly more efficient way is probably going to be slower than doing efficiently just once.

A lookup over the indexed field does not care if the index is based on the field directly or on the function over the fields. Doing SELECT ... WHERE f1(f2(field)) = 'blaA' (if an index exists over the f1(f2(field)) ) is not "slightly more efficient" than doing SELECT ... WHERE field = 'blaB' (if an index exists over the field) - it is virtually the same.

don't think MS SQL supports this.

Even though I've never used MS SQL, I seriously doubt MS SQL doesn't support it - or some similar feature.

2009-02-05 Reply Admin

Quango:
Vista x64, raid disk array, NOT a slow system.

This supremely brilliant antithesis.

jo42 · 2009-02-05 Reply Admin

Quango:
I'm a sucker for such a challenge.. Just ran this SQL on my workstation
1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

You're running Vista. He is running XP.

2009-02-05 Reply Admin

Could have been a materialised view with indexes. The view definition probably wouldn't have shown that.

2009-02-05 Reply Admin

My best guess is that they were still waiting on the record to show up after the previous night's import.

Andy Goth · 2009-02-05 Reply Admin

Quango:
raid disk array

I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)

2009-02-05 Reply Admin

jo42:
Quango:
I'm a sucker for such a challenge.. Just ran this SQL on my workstation
1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

You're running Vista. He is running XP.

I read this site for a loooong time and appreciate reading the comments. Sometimes they are even better than the article, but this (combined with the previous) was the first comment that actually made me laugh out loud. Thank you.

2009-02-05 Reply Admin

So... you left us hanging! What happened to the missing NPI number?

2009-02-05 Reply Admin

Andy Goth:
Quango:
raid disk array
I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)

So, anyway, I got these Redundant Array of Independent Disks Arrays going and...

2009-02-05 Reply Admin

Harold:
If you are going to base this on elapsed times, then you need to get the data buffered the same, and the disks the same speed. That is hard to do on your laptop. But you could measure teh CPU used instead, and still see a large difference.

Wrong. CPU time is not what's important here. What's important is how long it takes to return the data.

If this is MS SQL, and it can't use indexes (preferably a clustered index seek), it has to do lots of disk I/O and a very small amount of CPU to find the right records. Performing disk I/O involves a lot of waiting time. That is measured by wall clock time, and not by CPU time.

David Walker

2009-02-05 Reply Admin

Ben:
jo42:
Quango:
I'm a sucker for such a challenge.. Just ran this SQL on my workstation
1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

You're running Vista. He is running XP.
I read this site for a loooong time and appreciate reading the comments. Sometimes they are even better than the article, but this (combined with the previous) was the first comment that actually made me laugh out loud. Thank you.

I would like to second that. While I don't know if it would make THAT big of a difference I find Vista's use of system resources appalling.

And my friends wonder what my problem is. . .

Captcha: populus. LOVE IT!

2009-02-05 Reply Admin

That's my NPI number!

heltoupee · 2009-02-05 Reply Admin

Redundant RAID is actually a product. On extremely high-availability systems, you will have 2 separate mirrored arrays (each with their own inbuilt redundancy) running off of 2 separate controllers. This way if one of the controllers goes tits-up, you can failover to the other without being up the creek.

2009-02-05 Reply Admin

Well, it is to solve a little ambiguity. "I got a new RAID"... a new RAID what? A RAID card? A RAID cage? A RAID disk array? Understand yet?

Andy Goth:
Quango:
raid disk array
I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)

2009-02-05 Reply Admin

Meh, It could be worse. Way worse.

2009-02-05 Reply Admin

MindChild:
Well, it is to solve a little ambiguity. "I got a new RAID"... a new RAID what? A RAID card? A RAID cage? A RAID disk array? Understand yet?

Some bug spray.

2009-02-05 Reply Admin

Dan:
greg:
Maybe I'm just dumb, but what does the beginning of the story have to do with the end?
What does a retarded table that is super slow to query have to do with the missing data?

Agreed. Even if the table were fixed, I don't see how the missing number would magically appear, unless maybe it was missing due to column spacing. Did he ever find it? Inquiring minds want to know!

I second these comments! Sure the database design is a huge WTF, but the story seems to be missing a punchline.

2009-02-05 Reply Admin

TRWTF is Grandpa Simpson posting WTF stories:

Grandpa: One way to get rid of them is to tell 'em stories that dont go anywhere. Like the time we went over to Shelbyville during the war, I wore an onion on my belt -- which was the style at the time. You couldn't get those white ones, you could only get those big yellow ones... now where was I... Oh yeah, the important thing was I was wearing an onion on my belt, which was the style at the time, you couldnt get those... (trails off)

2009-02-05 Reply Admin

Clearly one that can calculate the average run time and not the total run time.

dpm · 2009-02-05 Reply Admin

heltoupee:
Redundant RAID is actually a product.

Sure, and I won't complain about anyone using it that sense. But we all know that almost anyone who says "Redundant RAID" is merely being . . . redundant.

2009-02-05 Reply Admin

Anonymous:
Quango:
Vista x64, raid disk array, NOT a slow system.

This supremely brilliant antithesis.

http://en.wikipedia.org/wiki/Litotes

Not a word I manage to work into conversation every day.

2009-02-05 Reply Admin

Gee, I didn't know health care providers had a federally-assigned ID. Now I have to add a column to my database table for that.

2009-02-05 Reply Admin

Can't believe someone hasn't noticed this yet.

That's over 9000 times as long!

Code Dependent · 2009-02-05 Reply Admin

Johann:
So, anyway, I got these Redundant Array of Independent Disks Arrays going and...

My RAID array is inexpensive.

Auction_God · 2009-02-05 Reply Admin

The design of the table seems OK for the "import" (except that I would have added an "Error" column to track which rows have errors. That is the best way to do it so that you don't get DTS failing to import rows. Then a Stored Procedure can cycle through and process the just imported rows and build "the real table" with the data correctly parsed out. Those rows with errors can get flagged as such and the appropriate person (MORT) notified of the issues.

An alternative that was mentioned in a previous post is to create a materialized view on the table that gets regenerated after the import is completed. This 2nd method doesn't allow you to process errors as nicely.

2009-02-05 Reply Admin

Fibonacci:
That's my NPI number!

Well, too bad for you that it's not a valid NPI! :P 1123581322, on the other hand, would be just fine.

2009-02-05 Reply Admin

How is it that the WTF's alternate between "so poorly edited the reader can't figure out what caused the initial problem" and "so heavily edited that the original submitter shows up to complain and correct the record?"

2009-02-05 Reply Admin

I don't know what the issue is, but I definitely clicked through thinking that the RSS feed had a truncated version of the story or something. Nope :(

We'll never know why the NPI number was missing.

(though actually my guess is that indeed it's some type of materialized view, and it's just stale)

2009-02-05 Reply Admin

Andy Goth:
I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)

I don't get it. Why are people making fun of my redundant RAID inexpensive disk array?

2009-02-05 Reply Admin

So...the WTF is that indexes are faster than table scans?

Yeah...hey, that sure is WTFy, alright.

Say, how is the view from the bottom of the barrel, anyway?

2009-02-05 Reply Admin

Was the view created a materialized view or not? While the underlying implementation is pretty crappy, a materialized view would get rid of all the performance problems and would be a really good reason for the dev in question to insist that people only use the view.

Shial · 2009-02-05 Reply Admin

Looks like the table isn't the only thing missing a few lines.

halcyon1234 · 2009-02-05 Reply Admin

I can't believe put everything into separate ROWS. A single cell would have sufficed, if he'd set it to NVARCHAR(Max). Sure you would use an extra delimiter per record, but you'd eliminate the table scan problem.

ParkinT · 2009-02-05 Reply Admin

dsckeld:
Wouldn't that only be enterprisey enough if the offsets had been defined in a table?

A wooden table, of course.

2009-02-05 Reply Admin

Andy Goth:
Quango:
raid disk array
I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)

I suffer from RAS syndrome you insensitive clod!

2009-02-05 Reply Admin

One more vote for telling us what happened to the missing number.

Or is the start of the story just there to explain why the guy looked at the other guy's code?

2009-02-05 Reply Admin

Downfall:
How is it that the WTF's alternate between "so poorly edited the reader can't figure out what caused the initial problem" and "so heavily edited that the original submitter shows up to complain and correct the record?"

I agree with this. How hard is it to explain what happened without either: a. Fictionalizing it so much that it is unbelievable b. Having it make no sense.

I miss the days when the DailyWTF was less "Story time with Alex" and more "Look at this bad code...make sure you're not dumb enough to do this!'

At least the Error'd and CodeSOD sections still speak for themselves.

2009-02-05 Reply Admin

dpm:
heltoupee:
Redundant RAID is actually a product.
Sure, and I won't complain about anyone using it that sense. But we all know that almost anyone who says "Redundant RAID" is merely being . . . redundant.

I'd submit that RAID these days has lost its meaning, it's no longer a valid acronym, and so in some ways Redundant RAID does make some sense. RAID 0 provides no data redundancy, so you could argue either that a) it's not truly RAID and therefore should be called something else or that b) RAID really doesn't mean anything more than "a bunch of disks as a single logical disk" and therefore Redundant RAID is a perfectly valid way of referring to the old-school RAID we know and love. Also, Redundant RAID implies a tinge or sarcasm and therefore gets my vote.

Completely off-topic I know, but then again this story had no real ending so...

wee · 2009-02-05 Reply Admin

greg:
Maybe I'm just dumb, but what does the beginning of the story have to do with the end?
What does a retarded table that is super slow to query have to do with the missing data?

My thoughts as well. I've seen this before, and, yeah, it's a WTF. But what happened to the missing data?

2009-02-05 Reply Admin

Comment FTW!

2009-02-05 Reply Admin

CtrlAltDel:
Comment FTW!

Quoting original comment when replying FAIL!

The Secret DTS Package

Leave a comment on “The Secret DTS Package”