Admin
var foo = new MyBusinessObject();
Assert.AreEqual("this is a test", foo.ToString());
Can't get smaller than a string.
Admin
With good partitioning and truncates, you can dereference the existing data in large chunks. Of course, that takes proper planning and a database that supports these concepts.
But you are right: truncating involves some writes, and deleting data on a per-row basis involves many writes. As our resident database guru frequently says (partially quoting Tom Kyte): Row by row equals slow by slow.
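For example, an Oracle-style sketch (table and partition names invented); dropping a partition is one cheap metadata operation where row-by-row deletes would log every row:
-- Range-partitioned table so old data can be aged out in one chunk.
CREATE TABLE orders (
    order_id    NUMBER PRIMARY KEY,
    order_date  DATE NOT NULL,
    amount      NUMBER(10,2)
)
PARTITION BY RANGE (order_date) (
    PARTITION p2009 VALUES LESS THAN (DATE '2010-01-01'),
    PARTITION p2010 VALUES LESS THAN (DATE '2011-01-01')
);
-- Age out all of 2009 at once instead of row by row:
ALTER TABLE orders DROP PARTITION p2009 UPDATE GLOBAL INDEXES;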
Admin
The usual way is to partition on date and then drop old partitions. Of course, if you're using the table as a message queue (assumed from the MSMQ ref), then you're being disingenuous - MQs are not persistent data.
Admin
TRWTF is that you expected people to RTFA. Ooh, it's about databases -- better get my genius thoughts posted.
captcha: jumentum .... I'm not touching that one.
Admin
I see it. The real WTF is that he misspelled "brillant", right?
Admin
You have to be liberal with the title VP in a small company because only officers can sign certain legal documents.
Admin
After someone fixed it, the VP insisted that they un-fix it in the name of data security. It's at that point they should have introduced table permissions. The problem isn't that they used a trigger to enforce table permissions; it's that they knowingly reintroduced a bug to enforce table permissions.
Keep in mind though that the unit tests (or "rogue process" as you say) are probably running in a test environment.
Admin
TRWTF is using the product id as the primary key of the audit table. You might not need a primary key on the audit table if it is infrequently read, but if you do, it has to be its own synthetic key.
Admin
Even if it's infrequently read, considering how big audit tables can get, I can't imagine an index being unwelcome.
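Something like this is what I have in mind (Oracle-flavoured; all names hypothetical):
-- The audit table gets its own sequence-fed synthetic key, because
-- product_id repeats (one row per change) and can't be the primary key.
CREATE SEQUENCE products_audit_seq;

CREATE TABLE products_audit (
    audit_id    NUMBER       PRIMARY KEY,  -- fed from products_audit_seq
    product_id  NUMBER       NOT NULL,
    change_type VARCHAR2(6)  NOT NULL,     -- 'INSERT' / 'UPDATE' / 'DELETE'
    changed_at  TIMESTAMP    DEFAULT SYSTIMESTAMP NOT NULL
);

-- Even an infrequently read audit table benefits from a lookup index:
CREATE INDEX products_audit_prod_ix ON products_audit (product_id, changed_at);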
Admin
You would be amazed at how easy it is to accumulate real business data at this sort of rate. I worked for a startup that was in the TB/hr range. THAT was a fun table to archive twice a day...
Admin
I think this VP worked for my company at one point. Here is a trigger from an old version of one of my production databases:
ALTER TRIGGER [dbo].[secRoleTr] ON [dbo].[SecRole]
FOR INSERT, UPDATE, DELETE
AS
    ROLLBACK TRANSACTION -- commit transaction -- No changes allowed
I wonder how the roles were initially loaded into the system? I also wonder why permissions weren't good enough to do the job? After all, anyone who can alter or bypass permissions can also disable the trigger.
Admin
I think it's reasonable to assume that this is part of some unit tests on their database system, yes. Stick test data in the database, test that your stored procs work as they're supposed to, remove the test data when you're done.
Admin
This is a thing with banks. I've worked with a bank (traditional, not investment) that did the same. Everyone in this one building was a VP. Heh.
Admin
Some of the tables on one product I work on have a Disabled column, but for others we have to copy on use to ensure that an update doesn't have unwanted effects, and so the fact is recorded elsewhere. In those tables it makes more sense to delete rows than mark them as disabled.
Admin
[quote user="boog"][quote user="Anonymously Yours"]The WTF is a two-parter. First, that an absurd trigger was put in to stop delete commands, to protect the data from a "rogue process" performing deletes instead of appropriately limiting that power using table permissions.[/quote]
Not quite. The trigger sounds like an auditing trigger, and nothing more. Preventing deletes appears to actually be a side effect due to a bug, where they were using :new.product_id instead of :old.product_id. On a delete, :new would be NULL (since it's deleting the record), so when the trigger creates the record in the audit table, it fails because the primary key can't contain NULL.
After someone fixed it, the VP insisted that they un-fix it in the name of data security. It's at that point they should have introduced table permissions. The problem isn't that they used a trigger to enforce table permissions; it's that they knowingly reintroduced a bug to enforce table permissions.[/quote]
An intentionally broken piece of code was put into a trigger to stop delete commands. The fact that we don't know if that was the initial purpose is irrelevant.
[quote user="boog"][quote user="Anonymously Yours"]Second, to resolve this problem, the author wrote a rogue process to silently bypass the trigger, which could inevitably lead to validating the VP's concern that some process could delete records from a table where that should be impossible.[/quote]
Keep in mind though that the unit tests (or "rogue process" as you say) are probably running in a test environment.[/quote]
If this trigger only existed in a test environment the VP of security wouldn't have stepped in to complain about "rogue processes" (his words, not mine, from the article). What happens when a clone of production has data loss? You restore it with a copy of production. The main purpose of a test environment is that if something explodes, it does so with data you can afford to lose. Protecting the test environment but not production is very unlikely.
Now, ultimately we do not know if the trigger override code made it into production or not. Assuming it did not, the author has still created a WTF to counter another WTF. Jody has made a unit test that does not accurately reflect what is happening in production, which could lead to code/procs/etc. being promoted that shouldn't be, given that the real behavior of said work cannot be observed until it reaches production.
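For what it's worth, a corrected Oracle-style version of the audit trigger described above might look something like this (the audit table layout and sequence are assumptions, not from the article):
CREATE OR REPLACE TRIGGER products_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON products
FOR EACH ROW
DECLARE
    v_id   products.product_id%TYPE;
    v_type VARCHAR2(6);
BEGIN
    IF DELETING THEN
        -- On a DELETE, :new is all NULLs; the bug was reading :new.product_id
        -- here, so every delete violated the audit table's primary key.
        v_id := :old.product_id;  v_type := 'DELETE';
    ELSIF INSERTING THEN
        v_id := :new.product_id;  v_type := 'INSERT';
    ELSE
        v_id := :new.product_id;  v_type := 'UPDATE';
    END IF;
    INSERT INTO products_audit (audit_id, product_id, change_type, changed_at)
    VALUES (products_audit_seq.NEXTVAL, v_id, v_type, SYSTIMESTAMP);
END;
/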
Admin
I thank my lucky stars that I've never encountered this level of stupidity. Why take on an expert and then ignore what they have to say? If they really knew better, why weren't they doing it themselves? Of course, the reason they paid you is so that they can ignore you and then blame you later when someone brings back a product that has increased in price and demands a refund...
Admin
TRWTF is that the table name is plural.
Admin
No. If the test passes, the code is fine. End of story. At least this is the view propagated amongst managers where I am.
Admin
What about ... not deleting row after test?
After all, your test DB gets reset to consistent state after each run ... right? RIGHT?
Admin
How about using BEGIN TRANSACTION and ROLLBACK instead of deleting?
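Something like this (an Oracle-flavoured sketch; table and values invented):
-- Everything the test inserts - including rows its triggers insert -
-- disappears on ROLLBACK, and no delete trigger ever fires.
INSERT INTO products (product_id, name) VALUES (999999, 'unit-test product');

-- ... run the stored procedures under test against product 999999 ...

ROLLBACK;  -- caveat: any DDL in the test (e.g. disabling a trigger)
           -- would auto-commit in Oracle and defeat this approach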
Admin
Yes, just dynamically replacing the table name in your code is of course a big no-no. Archive tables are a well-known way of keeping your database's performance on par while it grows. Also, foreign keys influence the optimizer a great deal, so they are not merely convention, not even on archive tables, and a bunch of code to do a SELECT...INTO is crappy programming.
Captcha: IMAKNOBQUOTINGCAPTCHAS.... hum.
Admin
... that TRWTF is buried in this comment
"This trigger was preventing deletes because it's trying to insert NULL into the primary key of the PRODUCTS_AUDIT table on DELETES...".
It is clear from this that no significant testing of the entire process occurred, or the trigger would have always caused a primary key error on a delete.
Regardless of your position on deleting rows, this indicates that testing was - as usual - inadequate.
Admin
I really hope you guys are all kidding... The classes (entity services) that retrieve data from the database are also units, and hence could & should be unit tested. We don't use a "mock database" for this, but a test database, set up using the same script as the deployment one.
We once had a WTFcoder who wrote tests on the toString method as well; we're still cleaning up his mess.
Captcha: erat - how appropriate!
Admin
It sounds like you might have a hoarding problem. You need to learn to let go.
Admin
Ummm ... yes, that's the WTF.
Admin
Disk space needs to be provisioned. Disk space alerts need to be investigated. Try doing both of those for the same cost as the time it takes to think "I wanna delete some stuff".
You are ultimately right in that hard-deleting things is usually a bad idea - however, the idea that the best solution is to keep everything is worrying too.
Archive old data and remove it from your 'live' sources.
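Something along these lines (hypothetical names, Oracle date arithmetic):
-- Move rows that haven't sold in a year into an archive table with the
-- same columns, then clear them from the live table in one transaction.
INSERT INTO products_archive
    SELECT * FROM products
     WHERE last_sold_date < ADD_MONTHS(SYSDATE, -12);

DELETE FROM products
 WHERE last_sold_date < ADD_MONTHS(SYSDATE, -12);

COMMIT;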
Admin
TRWTF is that she ran her tests against a live database...
Admin
It has been my experience that this is the sort of justification used in larger companies, not in smaller ones.
Admin
Some databases have a better way to handle this sort of thing: you can have the table automatically split across two drives or servers.
Typically for transaction data, last year's stuff gets pushed off to the other drive.
So if you query current data, it runs quickly off the main drive, but if you also need old data, then it uses both. From a user's point of view, it acts like a single table.
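For example, Oracle-style range partitioning can pin old and current data to different storage (tablespace names invented):
CREATE TABLE transactions (
    txn_id    NUMBER NOT NULL,
    txn_date  DATE   NOT NULL,
    amount    NUMBER(12,2)
)
PARTITION BY RANGE (txn_date) (
    -- last year's stuff sits on cheap disks, current data on fast ones
    PARTITION p_archive VALUES LESS THAN (DATE '2010-01-01') TABLESPACE slow_disks,
    PARTITION p_current VALUES LESS THAN (MAXVALUE)          TABLESPACE fast_disks
);

-- Queries that filter on txn_date touch only the relevant partition,
-- but applications still see one table:
SELECT SUM(amount) FROM transactions WHERE txn_date >= DATE '2010-06-01';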
Admin
Second, without knowing more about Jody's test script, it's hard to say whether altering the configuration while unit testing is a WTF. For all we know, all of the actual tests are complete before Jody disables the trigger, deletes the test data, and re-enables the trigger. In that case, the tests themselves may still accurately reflect production.
Admin
Aaaah, I miss those guys.
Admin
Disclaimer for TRWTF howlers: I don't do this. Just seens it thataway.
Admin
But as far as ex-members go, you know who I haven't seen in a long freakin' time? Gene Wirchenko. Always signing his comments. Whatever happened to that guy?
Admin
As has sadly only been mentioned once or twice, the correct solution is to control deletion by permissions. That way the VP's supposed "rogue process" is handled too.
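For example, with nothing fancier than GRANT/REVOKE (account names hypothetical):
-- The application account can read and write but never delete;
-- only a dedicated maintenance account gets DELETE.
REVOKE DELETE ON products FROM app_user;
GRANT SELECT, INSERT, UPDATE ON products TO app_user;
GRANT DELETE ON products TO maintenance_job;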
W.r.t. the cost of disks etc., some folks seem to have forgotten that systems exist to deliver business benefit for a company, not to keep developers employed.
If, by ageing vast amounts of data out to cheap commodity drives, you can save money on expensive disks, then you damned well ought to. The fact that disk is cheaper than ever before does NOT mean that a terabyte-gobbling application should not use cheap storage for such things as audit.
And finally, databases exist to store information. If you can find a way to do that with less data than before, well, that's called efficient design. Storing terabytes of dross data does not mean that information useful to the business is being retained.
Admin
And I can certainly sympathize with the annoyance of having to revert fixes because something (or someone) was dependent on the bug.
It's possible, but I think swapping :new and :old is a pretty simple mistake. It's hard to say for sure why it happened. At the very least, I think we can agree that the resulting disabled-deletion state did nothing to facilitate the observation that table permission definitions were inadequate. :)
Very true; considering they had a VP who insisted that they re-introduce a bug, I'm sure the organization is a wonderfully-WTFy WTF-land chock full of WTF-goodness. If I was Jody, I'd have grepped the codebase/commit-logs for that VP's name, and then I'd have a whole slew of WTFs to submit.
Admin
I have - yes, I shamefully admit - implemented a trigger in order to trap a bug in the application that we simply couldn't find after more than a year. Once in a while, product data would be written with the wrong decimals: 12,00 instead of 1,20, 49,00 instead of 4,90, and so on. It happened rarely enough that we were never prepared for it, but often enough to upset our customers. We had no clue how it happened.
So we made this killer trigger: it would dump everything that had happened and was happening on the database at the precise moment new.value was 10 times bigger than old.value. We even put some arcane Oracle functions in it. It was fully documented so it wouldn't be dropped by accident. It was awesome; this bug was DEAD!
It's been up for six months and has never fired. The problem has never happened again.
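For the curious, a cut-down sketch of what such a trigger might look like (Oracle-style; the log table and all names are my reconstruction, not the original):
CREATE OR REPLACE TRIGGER product_value_watch
BEFORE UPDATE OF value ON products
FOR EACH ROW
WHEN (new.value >= old.value * 10)  -- no colons on new/old in a WHEN clause
BEGIN
    -- Capture who and what was touching the row at the moment of the bad write.
    INSERT INTO value_bug_log (logged_at, product_id, old_value, new_value,
                               db_user, client_module)
    VALUES (SYSTIMESTAMP, :old.product_id, :old.value, :new.value,
            SYS_CONTEXT('USERENV', 'SESSION_USER'),
            SYS_CONTEXT('USERENV', 'MODULE'));
END;
/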
Admin
I agree; I think soft deletes (e.g. [status] = 'DEL') would be fine here. 700,000 obsolete records isn't a lot nowadays, and as other people have pointed out, if you might one day need to refer back to them, then an archive table or whatever is going to slow down the creation and execution of queries.
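A minimal sketch of the soft-delete style, assuming a NOT NULL status column:
-- Mark the row instead of removing it:
UPDATE products SET status = 'DEL' WHERE product_id = 12345;

-- Everyday queries simply exclude the dead rows:
SELECT product_id, name FROM products WHERE status <> 'DEL';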
Admin
There's some data that it makes sense to keep forever, like data that changes over time and for which the value at any given time in the past may be needed. For example, in a retail system, we probably want to know the price of a product at the time it was sold, so if a customer returns it and asks for a refund we know how much to give him.
But there are also many cases where old data is not particularly useful. Like, if a customer changes his phone number, do we really need to know what his phone number was at the time of his purchase 5 years ago? If there's a problem with that sale, are we going to call the phone number he had at the time of the sale, or the phone number he has now?
In a case like that, keeping old data doesn't just waste disk space, which is, indeed, getting less and less important as disk space gets cheaper. It also makes the schema more complicated: the key to the customer record (or wherever the phone number is stored) now has to include a timestamp. Queries are more complicated: at a minimum, we now have to get the record with the right customer id AND the latest timestamp instead of just the right customer id. In queries that involve joins there is now the possibility of multiple records from this table matching. In the simple case we could just "order by updatetimestamp desc limit 1", but what if there are joins to multiple tables that all have these timestamps? Now we have to add a host of order-bys and limits or distincts and the whole thing becomes a pain. And sooner or later someone will make a mistake and we'll have sales being counted twice or something.
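Even the simple case adds noise to every lookup (a sketch; table and column names hypothetical, and the row-limit syntax varies by dialect):
-- Current phone number when every change is kept as a new timestamped row:
SELECT phone_number
  FROM customer_phones
 WHERE customer_id = 42
 ORDER BY update_ts DESC
 FETCH FIRST 1 ROW ONLY;  -- or LIMIT 1 / ROWNUM = 1, depending on the database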
Admin
The poster doesn't say what his business is, but it doesn't necessarily follow that a product is obsolete just because it hasn't been sold for a year.
Suppose that among the products you sell are Christmas decorations. (Am I allowed to say that? Or am I supposed to say "The Winter Holiday That May Not Be Named"?) The fact that you haven't sold one of these for 11 months would probably mean nothing at all. If you haven't sold one for 12 months maybe it's just timing: The last one you sold was December 18, 2009, it's now December 19, 2010, you might still sell some this season. If it's supposed to be a high volume product that might indicate a problem. But if it's a high-ticket item and you normally sell only a couple a year, maybe that's fine.
Or if you sell snow shovels, and there was no snow in your area one winter, you might go almost two years without selling one.
I certainly wouldn't go automatically deleting everything that hasn't sold in 12 months.
Admin
Why was the month of the code change redacted? I mean, I understand redacting the VP's name. You don't want someone to call him and say, "Hey Fred, I see an employee of yours just made a post about you on the web saying that you're an idiot." But why redact the month? Are you afraid that someone will track down the VP involved by getting a list of all VPs who ordered code changes in September? (I think I saw them solve a crime that way on CSI once ...) Or are you afraid of offending all the June-lovers who would be outraged at the thought that a foolish code change was made in their favorite month?