• Peter (unregistered)

    I bet the tables also weren't properly normalized.

  • Mean Mr. Mustard (unregistered)

    How the heck did this DBA earn his title?

    (First!)

  • jtl (unregistered)

    the real wtf is a government agency actually had a win!

  • Mean Mr. Mustard (unregistered) in reply to Mean Mr. Mustard

    OK, nearly first.

  • (cs) in reply to Peter
    Peter:
    I bet the tables also weren't properly normalized.

    Define "properly normalized." You do know that normalization is anti-performance, right?

  • Chris (unregistered) in reply to Mean Mr. Mustard

    (Fail!)

  • (cs)

    Setting or adding a primary key should help just a tad. Doing lots of joins without a key ? Doesn't seem like its relational.

  • JJ (unregistered)

    I've used databases before, of course, but I'm surprised by what a HUGE difference a few indices would make.

  • Paul (unregistered) in reply to andrewbadera

    I'll be they managed to combine anti-performance and no normalization just fine, though.

  • FlySwat (unregistered)

    And yet, even today the Ruby on Rails guys don't know what a index or a primary key is.

  • JohnLocke (unregistered) in reply to jtl
    jtl:
    the real wtf is a government agency actually had a win!

    When I read [i]"There's no way it could've run that fast. I must've totally destroyed the production database."[i], I thought the DBA would then say: "Damn, I'm gonna have to restore the database from a backup, and revoke your database access until we find out what really happened."

  • Matt S (unregistered)

    We're getting taxed faster! Awesome! The Process be praised!

  • b1xml2 (unregistered) in reply to JJ

    when it comes to the magic figure of millions of rows per table, indexing can be the difference between hours and seconds.

    but as they say, seeing is believing and when you touch your hands on production grade data in excess of say 2 million rows, then you will know the difference.

  • Haikulicious (unregistered)

    Race past rebel base. Lt. Rat so startles able Bert's ape car.

  • (cs)

    This guy's heart was in the right place but it's clear that he's just a rookie. The real solution is to point out the fact the DB server is running on a 30Mhz processor, then say you can get them a deal on one of these. Don't work smarter, work harder!

    Whoops, it was supposed to say 150Mhz. Fixed! -ed.

  • Michael (unregistered)

    Indexs, Keys, who needs them. I tend to store my data in colums rather than rows. Easier to read.

  • Bobby Tables (unregistered)

    Oh. Yes. Little Bobby Tables, we call him.

  • (cs)

    I would be tempted to type "DROP DATABASE database_name" and just leave it in the query window.

    I wouldn't execute it, but just to give the DBA sitting there a heart attack.

  • Dave"); DROP TABLE users; -- (unregistered) in reply to Bobby Tables

    I find lots of people don't like me, especially on the Web, but I've no idea why.

  • (cs)

    Clearly, the REAL WTF™ is that he was rewarded for fixing a problem.

  • Matt (unregistered) in reply to JJ

    Even having done this before (This database could have been the ERP database for an insurance company I used to work for), I'm still amazed at the performance gains you get through proper indexing and Primary/Foreign Key relationships.

    And sadly, no longer surprised when I encounter a database with no keys, indexes, or defined relationships.

  • commenter (unregistered)

    I bet you could throw a really good party with the money saved from sacking the "DBA".

  • Martin (unregistered)
    As for us US taxpayers, we benefited as well since we're getting taxed faster!

    For a moment, I thought this was a success story. Thanks for pointing that out.

  • (cs)

    At least this could be fixed with a few CREATE INDEX statements

    I was expecting it to look like this: http://thedailywtf.com/Articles/The_Flat-File_Society.aspx

    I bet he was sweating bullets waiting for those indexes to be generated. On my Athlon 2500+, generating an index (in SQLite) on a table with 1-2M rows takes 2-4 minutes.

  • Bob Dole (unregistered)

    Indexes: none

    Primary keys: none

    Didn't it say they had DBAs? Exactly what qualifies one to be a DBA if they don't even know enough to have something as basic as PKs and Indexes?

  • (cs) in reply to ObiWayneKenobi
    ObiWayneKenobi:
    Clearly, the REAL WTF™ is that he was rewarded for fixing a problem.
    Indeed
  • Crabs (unregistered)

    The difference indexing makes is pretty simple if you think about it. Indexing basically stores another table ordered by whatever column you indexed. With this, you can do a binary search, which is O(log n), which means searching 1,000,000 records takes only twice the time that searching 1000 records. Without indexing, the table will have to be reordered with every query. As far as I know, most databases use mergesort, which is O(n log n), then the binary search O(log n).

    so, with 2 Million records in the table: indexed lg(2,000,000) ~ 21 instructions non-indexed 2,000,000 lg(2,000,000) + lg(2,000,000) = ~42,000,021 instructions

  • (cs) in reply to Bob Dole
    Bob Dole:
    # Indexes: none # Primary keys: none

    Didn't it say they had DBAs? Exactly what qualifies one to be a DBA if they don't even know enough to have something as basic as PKs and Indexes?

    Sometimes, especially when working for a big beaurocracy, the dba's blindly (admittedly stupidly) take your instructions to create stuff and simply execute it without thinking about what you're [not] doing that shoud [not] be done - because it's not their job to check your schema design. Sad.

    OTOH, we get to read about them here :)

  • Bob (unregistered) in reply to JJ
    JJ:
    I've used databases before, of course, but I'm surprised by what a HUGE difference a few indices would make.

    Let's see, table scanning each 2M row table at least once, possibly many times. Possibly creating temp indexes or hash tables on any joined table. Versus a few index reads and table probes for the relevant records with proper indexes. And you are surprised by the difference?

  • Steve (unregistered) in reply to andrewbadera

    Hence doing denormalization afterwards.

  • (cs)

    The real WTF is that they aren't using flat files to store the data instead of a slow database!

  • True American Patriot(TM) (unregistered) in reply to andrewbadera
    andrewbadera:
    Peter:
    I bet the tables also weren't properly normalized.

    Define "properly normalized." You do know that normalization is anti-performance, right?

    Sir, why do you hate America so much? You hate us for our normalized tables, isn't it?

  • Peets (unregistered) in reply to akatherder

    "I would be tempted to type "DROP DATABASE database_name" and just leave it in the query window."

    Yeah, and making sure that you don't somehow automatically press "Enter" after that instruction as you are used to do countless times a day.

    Remind me not to let you near a live terminal, I prefer other types of excitement in my life..

  • blah (unregistered)

    I guess in the government, DBA stands for Dumb Blonde Attraction.

  • jimicus (unregistered) in reply to akatherder
    I wouldn't execute it, but just to give the DBA sitting there a heart attack.

    And you should be fired for such a stunt. It's a production system, you don't even type the command unless you're sure it's what you want to do. It's too easy to hit enter and before you know it, goodnight vienna to the system.

  • my name is missing (unregistered)

    So what did the DBA do for his zillion dollars per hour?

  • (cs) in reply to Bob Dole
    Bob Dole:
    # Indexes: none # Primary keys: none

    Didn't it say they had DBAs? Exactly what qualifies one to be a DBA if they don't even know enough to have something as basic as PKs and Indexes?

    In my experience: a receding hair line, thick fishbowl glasses, an inability to interact with coworkers, and a bookshelf of dusty m204 and Oracle books. They hide in their hobbit holes and the rest of the organization chugs along, blissfully content.

  • draeath (unregistered) in reply to JJ

    only 150mhz each core, 4 cores. Something like an index on such a large database will DEFINITELY improve performance...

  • Walleye (unregistered) in reply to Mean Mr. Mustard
    Mean Mr. Mustard:
    How the heck did this DBA earn his title?

    The IRS's main database is a huge entity called the 'Master Tax File' and consists of assembly language modules which were originally written in 1962 and have been continually patched ever since. He might be a genius at manipulating these tables without knowing anything about relational databases.

  • (cs)

    you know you're in trouble when you ask enterprise mangler to generate a db diagram, and you get a little row of ants, off into the distance...

  • (cs) in reply to b1xml2
    b1xml2:
    but as they say, seeing is believing and when you touch your hands on production grade data in excess of say 2 million rows, then you will know the difference.
    Try 500 million rows sometime. And if you're still not having fun, try 5-10 billion. 2 million is a walk in the park - you can actually make major schema changes within a few minutes.

    OT, what is with all the dumb SQL injection comments in here? Are people really so ignorant of databases that they see more than a superficial connection between this story and those ones?

  • RealDatabaseDeveloper (unregistered) in reply to andrewbadera

    You know that "properly normalized" is pro-performance?

    It's comments like yours that make real DBA's watch developers like a hawk. There is a reason we run Relational Database Engines everywhere these days, when you model the problem correctly and index it properly it runs very very well.

  • Mike Dimmick (unregistered) in reply to Crabs
    Crabs:
    As far as I know, most databases use mergesort, which is O(n log n) <snip>

    No, they use a simple linear search which is O(n). The engine will only sort if a sort has been specified in an ORDER BY clause. It may sort to do a join, or it may use hashing.

    Far more important than the number of CPU instructions is the amount of disk I/O. An operation on a database is rarely CPU-bound - only if the data will fit in the processor cache. It may be memory-bandwidth-bound if the dataset fits in main memory and the data is already cached. Otherwise you're in the (relatively) glacial arena of disk.

    Remember that even a 150MHz processor could manage to execute 42m instructions in well under a second if all the instructions and data fit in its caches! Even the Pentium was superscalar and pipelined, and able to execute simple arithmetic and logical instructions at more than one per clock cycle.

    Programmers need to understand the hierarchy of bandwidths and latencies in a modern system. Watch Herb Sutter's presentation (will make most sense to C++ programmers, but managed/Java/scripting programmers should also take note).

  • SoonerMatt (unregistered) in reply to commenter
    commenter:
    I bet you could throw a really good party with the money saved from sacking the "DBA".

    Sooo true. It's hard to believe he was letting that slide. Was he/she using the lack of indexes, etc as a bargaining chip for the future?

  • Little Bobby Tables (unregistered)

    Wow! A story where a developer got the credit they deserve, I almost wept a tear of joy!

  • Jeff (unregistered) in reply to Haikulicious

    You get a star for effort, good sir

  • SoonerMatt (unregistered) in reply to Aaron
    Aaron:
    b1xml2:
    but as they say, seeing is believing and when you touch your hands on production grade data in excess of say 2 million rows, then you will know the difference.
    Try 500 million rows sometime. And if you're still not having fun, try 5-10 billion. 2 million is a walk in the park - you can actually make major schema changes within a few minutes.

    OT, what is with all the dumb SQL injection comments in here? Are people really so ignorant of databases that they see more than a superficial connection between this story and those ones?

    Are you working with a POS system?

  • Wayne (unregistered) in reply to akatherder

    Bah... these are the DBA's who didn't know enough to build an index? I doubt they'd be phased by "DROP DATABASE" or even "DROP TABLE".

  • Mike Dimmick (unregistered) in reply to RealDatabaseDeveloper
    RealDatabaseDeveloper:
    You know that "properly normalized" is pro-performance?

    It's comments like yours that make real DBA's watch developers like a hawk. There is a reason we run Relational Database Engines everywhere these days, when you model the problem correctly and index it properly it runs very very well.

    There's a trade-off to be made between putting something in a different table, and increasing the number of rows per page in the main table, but incurring increased I/O cost to do the join to put them back together for one row, and putting more related things in one table, decreasing the number of rows per page in this table and increasing the table scan cost, but reducing the cost when fetching one row.

    Both 'fully normalized' and 'fully denormalized' are usually wrong. Somewhere in between is commonly necessary.

  • Code Slave (unregistered) in reply to jimicus
    jimicus:
    I wouldn't execute it, but just to give the DBA sitting there a heart attack.

    And you should be fired for such a stunt. It's a production system, you don't even type the command unless you're sure it's what you want to do. It's too easy to hit enter and before you know it, goodnight vienna to the system.

    More importantly, Bobby's changes should have first been tested on a test server, tested, documented, handed over to the DB Admin, implemented on the beta server, re-tested and then implemented on production .

    The Real WTF, is that they were experimenting with a production system.

Leave a comment on “Hastening an Inevitable”

Log In or post as a guest

Replying to comment #:

« Return to Article