• Buzer (cs)

    At least he didn't use XML on DataType column...

  • dsckeld (cs)

    Wouldn't that only be enterprisey enough if the offsets had been defined in a table?

    Apart from that: If someone seems to be guarding their secrets a little too zealously, don't be afraid, be VERY afraid.

  • Harold (unregistered)

    If you are going to base this on elapsed times, then you need to get the data buffered the same, and the disks the same speed. That is hard to do on your laptop. But you could measure teh CPU used instead, and still see a large difference.

  • Aaron (cs)

    It could have been an indexed view, no?

    Still would make more sense to parse during import, but nevertheless, if it took less than 1 second to look up records, the view could very well have been indexed even if the table wasn't.

  • Honeyman (unregistered)
    Comment held for moderation.
  • Quango (cs) in reply to Honeyman
    Honeyman:
    > For 1000 selects based on the indexed VARCHAR, 30ms. For 1000 selects based on a substring, a whopping 379220ms.

    I dare to ask, but you ARE aware about expression indices, aren't you?

    Example: http://www.postgresql.org/docs/8.3/interactive/indexes-expressional.html

    I think since DTS was mentioned it's some flavour of Microsoft SQL Server not PostGres - don't think MS SQL supports this.

    Even so doing something inefficiently in a slightly more efficient way is probably going to be slower than doing efficiently just once.

  • greg (unregistered)

    Maybe I'm just dumb, but what does the beginning of the story have to do with the end?

    What does a retarded table that is super slow to query have to do with the missing data?

  • Quango (cs)

    I'm a sucker for such a challenge.. Just ran this SQL on my workstation

    1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

    My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

  • Dan (unregistered) in reply to greg
    greg:
    Maybe I'm just dumb, but what does the beginning of the story have to do with the end?

    What does a retarded table that is super slow to query have to do with the missing data?

    Agreed. Even if the table were fixed, I don't see how the missing number would magically appear, unless maybe it was missing due to column spacing. Did he ever find it? Inquiring minds want to know!

  • Honeyman (unregistered) in reply to Quango
    Even so doing something inefficiently in a slightly more efficient way is probably going to be slower than doing efficiently just once.
    A lookup over the indexed field does not care if the index is based on the field directly or on the function over the fields. Doing SELECT ... WHERE f1(f2(field)) = 'blaA' (if an index exists over the f1(f2(field)) ) is not "slightly more efficient" than doing SELECT ... WHERE field = 'blaB' (if an index exists over the field) - it is virtually the same.
    don't think MS SQL supports this.
    Even though I've never used MS SQL, I seriously doubt MS SQL doesn't support it - or some similar feature.
  • Anonymous (unregistered) in reply to Quango
    Quango:
    Vista x64, raid disk array, NOT a slow system.

    This supremely brilliant antithesis.

  • jo42 (cs) in reply to Quango
    Quango:
    I'm a sucker for such a challenge.. Just ran this SQL on my workstation

    1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

    My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

    You're running Vista. He is running XP.
  • jamface (unregistered)

    Could have been a materialised view with indexes. The view definition probably wouldn't have shown that.

  • The Dopefish (unregistered) in reply to Dan

    My best guess is that they were still waiting on the record to show up after the previous night's import.

  • Andy Goth (cs) in reply to Quango
    Quango:
    raid disk array
    I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)
  • Ben (unregistered) in reply to jo42
    jo42:
    Quango:
    I'm a sucker for such a challenge.. Just ran this SQL on my workstation

    1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

    My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

    You're running Vista. He is running XP.
    I read this site for a loooong time and appreciate reading the comments. Sometimes they are even better than the article, but this (combined with the previous) was the first comment that actually made me laugh out loud. Thank you.
  • William Furr (unregistered)

    So... you left us hanging! What happened to the missing NPI number?

  • Johann (unregistered) in reply to Andy Goth
    Andy Goth:
    Quango:
    raid disk array
    I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)

    So, anyway, I got these Redundant Array of Independent Disks Arrays going and...

  • DWalker (unregistered) in reply to Harold
    Harold:
    If you are going to base this on elapsed times, then you need to get the data buffered the same, and the disks the same speed. That is hard to do on your laptop. But you could measure teh CPU used instead, and still see a large difference.

    Wrong. CPU time is not what's important here. What's important is how long it takes to return the data.

    If this is MS SQL, and it can't use indexes (preferably a clustered index seek), it has to do lots of disk I/O and a very small amount of CPU to find the right records. Performing disk I/O involves a lot of waiting time. That is measured by wall clock time, and not by CPU time.

    David Walker

  • Addison (unregistered) in reply to Ben
    Ben:
    jo42:
    Quango:
    I'm a sucker for such a challenge.. Just ran this SQL on my workstation

    1000 selects by indexed NPI: 453ms 1000 selects by substring NPI: .. still waiting after 13 minutes

    My workstation: Core quad Q6600, 8GB ram, sql 2005 64bit, running Vista x64, raid disk array, NOT a slow system. What the heck kind of laptop do you have that does the first in 30ms?

    You're running Vista. He is running XP.
    I read this site for a loooong time and appreciate reading the comments. Sometimes they are even better than the article, but this (combined with the previous) was the first comment that actually made me laugh out loud. Thank you.

    I would like to second that. While I don't know if it would make THAT big of a difference I find Vista's use of system resources appalling.

    And my friends wonder what my problem is. . .

    Captcha: populus. LOVE IT!

  • Fibonacci (unregistered)

    That's my NPI number!

  • heltoupee (cs) in reply to Andy Goth

    Redundant RAID is actually a product. On extremely high-availability systems, you will have 2 separate mirrored arrays (each with their own inbuilt redundancy) running off of 2 separate controllers. This way if one of the controllers goes tits-up, you can failover to the other without being up the creek.

  • MindChild (unregistered) in reply to Andy Goth

    Well, it is to solve a little ambiguity. "I got a new RAID"... a new RAID what? A RAID card? A RAID cage? A RAID disk array? Understand yet?

    Andy Goth:
    Quango:
    raid disk array
    I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)
  • Patrick (unregistered)

    Meh, It could be worse. Way worse.

  • Walleye (unregistered) in reply to MindChild
    MindChild:
    Well, it is to solve a little ambiguity. "I got a new RAID"... a new RAID what? A RAID card? A RAID cage? A RAID disk array? Understand yet?

    Some bug spray.

  • Anon (unregistered) in reply to Dan
    Dan:
    greg:
    Maybe I'm just dumb, but what does the beginning of the story have to do with the end?

    What does a retarded table that is super slow to query have to do with the missing data?

    Agreed. Even if the table were fixed, I don't see how the missing number would magically appear, unless maybe it was missing due to column spacing. Did he ever find it? Inquiring minds want to know!

    I second these comments! Sure the database design is a huge WTF, but the story seems to be missing a punchline.

  • Bart (unregistered)

    TRWTF is Grandpa Simpson posting WTF stories:

    Grandpa: One way to get rid of them is to tell 'em stories that dont go anywhere. Like the time we went over to Shelbyville during the war, I wore an onion on my belt -- which was the style at the time. You couldn't get those white ones, you could only get those big yellow ones... now where was I... Oh yeah, the important thing was I was wearing an onion on my belt, which was the style at the time, you couldnt get those... (trails off)
  • ObstinateCoder (unregistered) in reply to Quango

    Clearly one that can calculate the average run time and not the total run time.

  • dpm (cs) in reply to heltoupee
    heltoupee:
    Redundant RAID is actually a product.
    Sure, and I won't complain about anyone using it that sense. But we all know that almost anyone who says "Redundant RAID" is merely being . . . redundant.
  • The Linguist (unregistered) in reply to Anonymous
    Comment held for moderation.
  • Robert S. Robbins (unregistered)

    Gee, I didn't know health care providers had a federally-assigned ID. Now I have to add a column to my database table for that.

  • Brett Allen (unregistered)

    Can't believe someone hasn't noticed this yet.

    That's over 9000 times as long!

  • Code Dependent (cs) in reply to Johann
    Johann:
    So, anyway, I got these Redundant Array of Independent Disks Arrays going and...
    My RAID array is inexpensive.
  • Auction_God (cs)

    The design of the table seems OK for the "import" (except that I would have added an "Error" column to track which rows have errors. That is the best way to do it so that you don't get DTS failing to import rows. Then a Stored Procedure can cycle through and process the just imported rows and build "the real table" with the data correctly parsed out. Those rows with errors can get flagged as such and the appropriate person (MORT) notified of the issues.

    An alternative that was mentioned in a previous post is to create a materialized view on the table that gets regenerated after the import is completed. This 2nd method doesn't allow you to process errors as nicely.

  • Robert (unregistered) in reply to Fibonacci
    Fibonacci:
    That's my NPI number!

    Well, too bad for you that it's not a valid NPI! :P 1123581322, on the other hand, would be just fine.

  • Downfall (unregistered)

    How is it that the WTF's alternate between "so poorly edited the reader can't figure out what caused the initial problem" and "so heavily edited that the original submitter shows up to complain and correct the record?"

  • Lerch (unregistered) in reply to Downfall

    I don't know what the issue is, but I definitely clicked through thinking that the RSS feed had a truncated version of the story or something. Nope :(

    We'll never know why the NPI number was missing.

    (though actually my guess is that indeed it's some type of materialized view, and it's just stale)

  • Smyle (unregistered) in reply to Andy Goth
    Andy Goth:
    I always chuckle inwardly whenever someone says "RAID array". Spelling it out even further ("RAID disk array") made me laugh out loud. Now I'm just waiting for someone to start talking about their redundant RAID. :^)

    I don't get it. Why are people making fun of my redundant RAID inexpensive disk array?

  • Some Wonk (unregistered)

    So...the WTF is that indexes are faster than table scans?

    Yeah...hey, that sure is WTFy, alright.

    Say, how is the view from the bottom of the barrel, anyway?

  • Fedaykin (unregistered) in reply to Buzer

    Was the view created a materialized view or not? While the underlying implementation is pretty crappy, a materialized view would get rid of all the performance problems and would be a really good reason for the dev in question to insist that people only use the view.

  • Shial (cs)

    Looks like the table isn't the only thing missing a few lines.

  • halcyon1234 (cs)

    I can't believe put everything into separate ROWS. A single cell would have sufficed, if he'd set it to NVARCHAR(Max). Sure you would use an extra delimiter per record, but you'd eliminate the table scan problem.

  • ParkinT (cs) in reply to dsckeld
    dsckeld:
    Wouldn't that only be enterprisey enough if the offsets had been defined in a table?
    A wooden table, of course.
  • D00fus (unregistered) in reply to Andy Goth
    Comment held for moderation.
  • Ambrose (unregistered)

    One more vote for telling us what happened to the missing number.

    Or is the start of the story just there to explain why the guy looked at the other guy's code?

  • Nathan (unregistered) in reply to Downfall
    Downfall:
    How is it that the WTF's alternate between "so poorly edited the reader can't figure out what caused the initial problem" and "so heavily edited that the original submitter shows up to complain and correct the record?"

    I agree with this. How hard is it to explain what happened without either: a. Fictionalizing it so much that it is unbelievable b. Having it make no sense.

    I miss the days when the DailyWTF was less "Story time with Alex" and more "Look at this bad code...make sure you're not dumb enough to do this!'

    At least the Error'd and CodeSOD sections still speak for themselves.

  • Luke (unregistered) in reply to dpm
    dpm:
    heltoupee:
    Redundant RAID is actually a product.
    Sure, and I won't complain about anyone using it that sense. But we all know that almost anyone who says "Redundant RAID" is merely being . . . redundant.
    I'd submit that RAID these days has lost its meaning, it's no longer a valid acronym, and so in some ways Redundant RAID does make some sense. RAID 0 provides no data redundancy, so you could argue either that a) it's not truly RAID and therefore should be called something else or that b) RAID really doesn't mean anything more than "a bunch of disks as a single logical disk" and therefore Redundant RAID is a perfectly valid way of referring to the old-school RAID we know and love. Also, Redundant RAID implies a tinge or sarcasm and therefore gets my vote.

    Completely off-topic I know, but then again this story had no real ending so...

  • wee (cs) in reply to greg
    greg:
    Maybe I'm just dumb, but what does the beginning of the story have to do with the end?

    What does a retarded table that is super slow to query have to do with the missing data?

    My thoughts as well. I've seen this before, and, yeah, it's a WTF. But what happened to the missing data?

  • CtrlAltDel (unregistered) in reply to ParkinT

    Comment FTW!

  • CtrlAltDel (unregistered) in reply to CtrlAltDel
    CtrlAltDel:
    Comment FTW!

    Quoting original comment when replying FAIL!

Leave a comment on “The Secret DTS Package”

Log In or post as a guest

Replying to comment #:

« Return to Article