- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Nobody is going to read this on the 3rd or 4th page but here goes.
I agree that data integrity is the responsibility of both the RDBMS and the app.
As far as trusting the RDBMS, I couldn't tell you how many times I get a developer with some problem and he wants to blame the RDBMS. The people who wrote the code for the RDBMS are human and make mistakes and blunders just like the folks in your IT shop. However, I bet that by the time the RDBMS is released, the code has been tested more than most of my developers have tested ANYTHING in their lives combined.
Admin
Actually, keeping the other members of my team from pissing in the pool is enough to make it worth it. I have yet to work on anything big enough to require a DBA, but I do have a reputation for finding and verbally abusing anyone who makes me question the validity of data in the database.
Admin
Data Integrety is the sole responsiblity of the database. Data Validation is mostly the job of the programmer. Having a well formed highly normalized database makes it Harder to shoot yourself in the foot. Your App will get a bunch of exceptions and you will have to handle them with validation... That makes testing easier. With a highly normalized database your program becomes more extensible and maintainable. With a highly normalized database you can ensure that you wont have any orphaned records.. or absolute crap data... you will easily be able to move it into a datawarehouse without much need for scrubbing the data. You wont have to worry about loosing data because your structure allowed you to corrupt your database (lets say 2 rows being excatly the same.
Making sure a user enters a properly formated phone number into a Phone number field... Job of the App... Making sure the correct phone number is always returned/updated when asked for job of the database.
Good Database design ensures you can trust your data... share your data with other applications with confidence, allows flexibilty for the developer, increases performance of the App... increases dev time a fair amount.... but when you get use to it... can save you and your team weeks/months of time in testing, maintenance, re-doing this work for another applicaition because yours doesn't take care of it and the list goes on.
My programs know nothing about the database structure it talks too and my database structure knows nothing about the programs. Decoupled things are good.
Jon
Admin
Here here....! So say we all!
Admin
Well now, that makes no sense. DBA is not supposed to be a developer (though many of them do actually start that way) ... DBA does not provide you with stored procedures; DBA provides you with a database that performs well.
With this, I can totally relate to. I'll quote one of my favourite authors (who just happened to be one of the persons filling in for Alex just a few weeks ago) ... The data is what is important, applications come and go!
DROP DEVELOPER CASCADE APPLICATION(S);
CREATE NEW DEVELOPER;
EVALUATE DEVELOPER;
GRANT DEVELOPER PRIVILEGES;)
Admin
Umm... perhaps you don't know it, but ZIP-codes can (and often*do*) have letters in them. Oh... I see, you're only coding something to be used in some limited area where they don't. Well, good luck to you. Hope you never expand outside the area you originally targeted for :)
Admin
Oh please... You can use MySQL well; that just requires you know it well enough to do so (honestly, MyISAM tables suck; you do have better engine choices)
And this goes for *any* database I know of. You can use them badly, or you can use them right (and actually, nearly any way in between) ... you just need to freakin' know enough of them.
The problem with MySQL is that it's a bit too easy to begin with. You can start without understanding anything about what you're doing. With any system that allows this to happen, I've noticed it to be a bad thing (however, I try to make my own applications in such a way that they require no previous understanding of what is about to happen - perhaps I should think about changing jobs or something).
Admin
I think you're a bit off here... lemme explain...
The only point the database needs to take care of, is to not accept an invalid value. That way, it will never return an invalid value (as an invalid value is never accepted into the database).
All the app needs to do here is make it clear to the user that the only accepted sort value for "phone number" is like *** (whatever). After that, if the app attempts to insert an invalid value into the database, the database tells the app that this is not doable and makes a rollback (if this failed, anything we attempted since last commit is most likely not commitable). When the app receives the message that the database failed because it didn't receive a correctly formatted "phone number" it can tell the user that it failed because it didn't receive a correctly formatted "phone number". After this the user needs to modify the format into one that suits the database; before it does, the data modification/insertion will not succeed and data integrity throughout the system is safe.
I fail to see what is so difficult with all this for so many people?
Admin
Oh, man. Been there. At one job I told this guy who had built a complex database with no relationships that I'd straighten it out, but he had to create a primary key on all the tables. So he goes and adds an identity column on each one.
On another job, I walk in and the main DB has a table, said table has millions of rows, said table (a) has no date/time data types, they're all in varchars, so you have to do a convert any time you want to query, and (b) all of the chars and varchars are nchars and nvarchars! It was such a huge table that fixing it was almost impossible due to space considerations, not to mention that it was a 24/7 operation.
The developers reason? He uploaded the tables from an Ingres II/SCO system via Visual Studio into SQL Server and these were the data types that were produced.
Admin
Why waste a round trip to the DB when the client can validate the number itself?
First, the client side language generally has a richer set of constructs to validate input, and may even have a larger context to do so in, ie. there may be more information available at input time about the data than what will be stored in the DB, so the client can often do checks that are not possible in the DB.
Secondly, assuming we are talking about real DB servers here, and not local Access and MSDE style DB's, there is quite a hit in the roundtrip, especially when dealing with the multi-tiered environments common with web applications. In your scenario, a client mistypes a phone number, say for instance, he hits q instead of 1 since the keys are next to each other. That faulty phone number now gets transmitted over his low bandwidth consumer line, across the internet, hits the web server, is transmitted to the buisiness application server, is put into a business object and played around with through N business rules, and finally committed to a transaction with the database, at which time the database rejects it, rolls back the entire transaction, forcing the business layer to undo all it's data hashing, causing the webserver to serve up a special rejection, to be transmitted back across the internet to the user's browser which must render a new page describing the issue..... It would be SOOOO much simpler to have a little line of JScript catch that Q as invalid before causing all that hassle. The DB can validate it too, just to be safe, but by the time the error propagates to the DB, it has already caused alot of extraneous work to be done.
Admin
It remember me at my old job, where possible size over 8060 was not a trouble, since "it would not happen in real world data"... Yeah sure, but the day it will happen, what will we do?
Admin
I'm not saying that the client can't make the validation. Why not; if that saves the client the roundtrip, it's all for the good. However...
My view is that the DB is the place that should *always* validate data also; that's one of the things it's designed to handle - enforcing data integrity. As said, this does not mean that the client should not validate data itself. But as said, it's the data that is important ... client applications come and go but the data needs to stay intact. Designing a database for a single application is one of the shortsightest decisions I see done again and again. Most of the applications don't survive more than a couple of years, but if the database design is good enough, this is no problem.
On this, I do agree. However, as I said, the database *must* do the validation also. If the client (for any given reason) fails to validate the value, the invalid value must be catched at least in the database (if not before). It's just a failsafe for the data integrity... I personally try to design my databases "well" - usually saves me a lot of work if things go wrong in the client end. Also it usually stops quite many wtf's from ever happening in the first place :) On a sidenote, I hate most the clients who want things done "quick and dirty", it just doesn't fit me :(
The CAPTCHA appears to be "stfu" ... I believe that's a personal message for me :D
Admin
Skinner, principally, is either a) Being ironical, or b) being stupid
captcha: genius: Surely not
Admin
Personally, I'm a little disappointed in the efforts of the developer on this one. Sure Varchar(8000) is easy, but to really show off Frank should have used images instead. at least then he could have gotten around the whole 8060 problem. sheesh!
Admin
i hope this is a troll - you've got it completely backwards.
Admin
Hm, turned of AdblockPlus for the Non-WTF Jobs, eh?
Admin
No they aren't...
97.4% reduction: correct
3846% faster: correct
3800% decrease in query time: WRONG
It's quite simple, really. If you have an x% reduction of an amount y, what does that leave you?
(100 - x)/100 * y
or, if you prefer,
(1 - x%) * y
This means that a 3800% reduction is equivalent to multiplication with -37. In which case the query would now finish 74 seconds before it began (will begin?). If that were true, J.T. would be a hot candidate for next year's Nobel Prize in Physics.
Only if you know how to use it.
Admin
Actually it wasn't a 3800% decrease anyway. It was a 97.4% decrease. If it had been reversed to its former state, that would have been a 3800% INcrease.
Sorry to nitpick. It's my job...
Admin
Admin
---Web Developers don't understand being a DBA, they're not meant to, they're there to develop "internets"... being a DBA helps in this.
I am what you call a common old web developer and I can never even dream of doing something like Mr. Frank here. Mr. Frank proves that not everybody has the gift of commonsense.
Captcha: Web Developers are human too.;)
Admin
--why should the method of 'doing it in dev' be any different to the method of doing it in production?
<sarcasm>
So when the time comes to deploy the app and the boss is asking for the reason for delay the whole thing can be mounted on DBA.
Boss: Why is my app not deployed?
Developer: I already submited the code but the DBA is not uploading my DB changes to production and keeps on giving hard time to us....
DBA: But sir....
Boss: That's not true. I know the app is working in DEV.
DBA: But sir....
Boss: You are costing us $XXXXXXXXXXXX. You are no good. Our Devs know more about data then you............
</sarcasm>
Admin
accepted Forum standard:
enum eMyBoolean {
TRUE,
MOSTLY_TRUE,
COULD_BE_TRUE,
SOMETIMES_IT_IS_TRUE,
NOPE_IT_IS_FALSE,
I_WAS_WRONG_IT_IS_REALLY_TRUE,
FALSE,
FILE_NOT_FOUND,
FILE_FOUND_AND_SO_I_DELETED_IT
};
Admin
I'm sorry, but both the DBA and dev here sound pretty snotty and annoying.
The dev - yeah, obviously he didn't know what he was doing.
But the DBA, while he may have just been doing his job, didn't really put the query into context.
It reminded me of DBAs (which I was eventually to become one for a brief period) at a Fortune 50 company right out of school.
They had this (pretty understandable) rule that you couldn't have "SELECT *" in SQL queries on production servers.
Well, if you know anything about DBMSes you know that specifying the fields you want to pull back can help query performance just slightly.
The thing that was foolish, was they would very easily let you submit apps to production that repeatedly made recursive calls to the database (much more inefficently than doing a single optimized SQL call), just because they couldn't run a parser to check for this kind of thing. (whereas grep'ing for 'SELECT *' is easy)
The optimization thing on a 2sec query also made me puke a bit. Is the query running once per day, per week, per month, per minute?
If someone told me to re-optimize a 2sec query that runs once a month, down to a .2sec query, I would litterally mushroom punch him in the face.
I get paid $60 / hour. How much is 1.8sec worth of cpu time per day / week / month?
But sure, if the queryr had to run fairly often (more than once an hour), optimize it all you want.