- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
And in addition, has anyone ever checked how many of those comments come from registered users and how many are anonymous?
Admin
Or perhaps how long it takes for a thread to move into a language war?
Admin
There was one not so long ago written in Perl. There have been plenty in C++ (I don't remember whether it was VC++ or otherwise). SQL isn't strictly an MS product, they just make one of the DB engines (that is to say, I could do this exact same WTF in DB2 or Oracle or MySQL if I wanted to).
So, rather than "shoot the tool maker", it's a case of "randomly shoot someone who may or may not have had anything to do with making the tool" or possibly "just shoot MS, even if they're completely innocent".
Admin
I think it's been said before on the forum, but I'll stress it, just in case: M$ development tools make it very easy for clueless people to become DBAs, programmers and so on. Even when the tools themselves are ok, those using them don't do it right, because they lack a minimum of formal training. And that's because with those tools, they can afford it.
Now, honestly, the Redmond guys aren't the only ones making programming too easy. Think Javascript. But that was an acident. It's promoting mediocrity on purpose I criticise.
Oh, and I think the last posts belong to another thread, in another section.
Admin
Performance of a query is 100% unrelated to the nr of lines in a query. I can write 2 line queries that have horrible performance. And I can write 2000 line queries that produce results in a matter of miliseconds.
Also, using "in ()" or "inner join" to filter out data performs the same in any modern database because the query governor will work out the same query plan for both constructions.
This query is not slow due to the way it is written, although the construction is horrible.
Admin
With some of the SQL "code" I've seen recently at this site, I guess the language doesn't even have to be Turing complete for people to make an absolute mess of it :-)
Admin
Oh? Has it occured to you that a longer query is inherently more complex? And more error-prone because of sheer code size? Think again, Mr. Expert :-)
Let me guess: you have studied the output of EXPLAIN on all modern database servers :D
You could be right in this particular case. But I wouldn't generalize.
Admin
I think it's pretty clear that the original developer was paid per line of code. Heh. They multiplied their pay for that query!
/fears days when I find contributions for The Daily WTF
Admin
Eh?
I think you do not understand that just because you create a view from your sql statement it will not perform any faster. It is the same SQL query, you just access it more conveniently through a view name... a messed up sql select will perform equally poorly when you create a view from it.
What I really find amusing, as long as I don't have to clean it up, is when people create a bunch of views, and then start joining them together... 'But it looks so simple, all I'm doing is joining these three views together, why is it taking such a long time to execute?'... Priceless.
Admin
I don't know if you were being sarcastic here or not, but I didn't realize "everyone knows" that SQL Server is slow.
In fact, in my experience, it is comparably fast compared to other fine databases such as MySQL, Oracle, and AS/400, and also has a ton more features (Views, Stored Procedures, DTS).
I have a feeling this is either sarcasm that I'm missing, or more senseless MS bashing.
Admin
I don't know if you were being sarcastic here or not, but I didn't realize "everyone knows" that SQL Server is slow.
In fact, in my experience, it is comparably fast compared to other fine databases such as MySQL, Oracle, and AS/400, and also has a ton more features (Views, Stored Procedures, DTS).
I have a feeling this is either sarcasm that I'm missing, or more senseless MS bashing.
Admin
Gag.. Cough.. This is awful... I can think of two different ways to accomplish this without giving much thought at all...
Admin
What askeeto means is that the amount of cases in which the performance of a query and its length cq complexity are not inversely related is significant, or even close to 50%, as he showed with his two pseudo-examples, and I assume he assumes that the theoretical query is error-free.
But I claim 100% truth in 0% of the cases.
Which is an inherently false statement.
Still, comparing each value to 22 10-char strings doesn't strike me as terribly efficient.
Admin
I'd bet that this Query had been clicked together by some visual SQL Admin tool...
Admin
Your sarcasm meter needs repair. I actually tried to put a "censor filter" to replace M$ and Micro$oft with humorous replacement text linking to that class Penny Arcade comic (which I now cannot find ...). Naturally, this worthless forums software doesn't like "$" symbols in censor words.
Admin
No, telling people it can only handle 10 isn't enough. They design it so physically you can't fit more than 10 live (you can pack a lot of dead people in an elevator I suppose, but that generally isn't something that people do) people in the little box. Then they make it able to handle 20.
Admin
:)
This has got to be made into some kind of quote, or experession to use in a certain situation.
eg.
"This sucks"
"Well, you sure can pack a lot of dead people in that elevator"
Admin
Not likely. I can easily tripple the LOC count for that query without making any signifiacnt changes
SELECT
DISTINCT
SMS_G_System_SYSTEM.Name
See, I've already trippled my pay compared to this guy, and I didn't even have to think about it. I wish I could get paid by lines of code, I'd be retired by the end of the year. Course I pitty the poor guy who replaces me. Particularly if whitespace is counted too - my teachers always liked double spacing in school, and tripple spacing must be better, right?
SELECT
DISTINCT
SMS_G_System_SYSTEM.Name
Ewww.... I hope none of you get paid per LOC on something I have to take over.
Admin
Alex, regarding censor words, it's probably using a reg ex for this, try "$"
Admin
I've more experience with MSSql but when recently doing a data 'cleanup' for Oracle 8i ran into the IN (n, n...) limitation of 1000. I've seen alot of comments against MSSql temp tables but create #mytable( attributes) was always a favorite of mine.
Admin
Sybase is evil?
http://en.wikipedia.org/wiki/Microsoft_SQL_Server
Admin
Yeah, but then you get frat boys stuffing pledges into elevators during Greek week or Rush. [:)]
Admin
Are you referring to the one where Tycho secretly switched Gabe's XBox controller with an 800lb Gorilla per the old Folgers commercials?
Admin
In SQL Server, there doesn't seem to be much difference between joins and equivalent IN/NOT IN conditions. In fact, some joins get rewritten by SQL Server as IN/NOT IN conditions before execution, so I hope it's not that much slower...
Admin
You're obviously a script kiddie, so just stick to what you know.
There are plenty of examples on this site alone to prove you are mercifully free of the ravages of intelligence.
Admin
The use of SELECT DISTINCT does not necessarily indicate poor normalization.
You're right, most of the time it doesn't. However, most of the time I see people using distinct, it's because they're not joining properly or some similar poor practice that needs to be corrected. It's like using count(*)-- most of the time it just isn't necessary.
Admin
If we are talking performance, complexity is not important. When querying a 90 million record table I can't afford to write "simple" queries that are less "error prone". I have to write the query that returns the data the fastest. And if that requires a 200 line super complex query, so be it.
Try it instead of making a smartass remark.
Admin
What information are you adding? I said the exact same thing in my original post:
Admin
Since Version 6.0 (released 1995), MS Sql Server has contained no Sybase written code. (They had a little party when they replaceed the last module)
Admin
Actually, it doesn't happen that often here. Now, over on Slash-Dot, that's a different story. Usually, it takes only about 10-20 messages, and it takes that long only because many discussions there have nothing to do with computers. Also, if the first mention of Microsoft on a slash-dot thread isn't a bash, the first response to the message (with 100% certainty) will be.
Admin
In my experience, horrible construction usually goes hand-in-hand with horrible inefficiency.
I happen to agree with the people who say this looks like an otherwise good coder suffering from temporary insanity (due to time pressure, or frustration, or whatever). I've been there - "JUST WORK DAMMIT".
Admin
Ahh, SMS. My nemesis rears his ugly head, yet again.
Admin
<font style="font-family: arial;" size="2">
</font><font size="2">wouldn't </font><font style="font-family: arial;" size="2">horrible inefficiency be good?
</font>
Admin
One thing that really helps is that the moderator over here (me) is pro-Microsoft. The "MS is evil" posts get quickly deleted for wasting everyone's time ... Slashdot is the place to go for that ...
Admin
Admin
Alex / James -- have you seen this:
http://www.bbspot.com/toys/slashtitle/
It's hilarious .... keep refreshing or clicking "next" to see the different "slashdot" articles that come up. It's pretty funny, and they did a good job of incorporating some Microsoft hate into at least every other article ....
Admin
I'm not an SQL guru, but I think SQL servers can optimise queries better when you just tell them what you want, one step at a time (hence my plead for using views), instead of making a big mess in which you try to impose a particular query plan yourself.
And what about my second argument, the one about query correctness? Using views is like expressing a complex algorithm as many small functions instead of one big lump of spaghetti code. Doesn't that matter at all?
Admin
Well, that's where you're wrong. You start with that approach, sure. But when it becomes painfully obvious that the server is fouling up the query (not having the 'out of the box' knowledge that you have it's understandable, of course) you have to start optimising using a "forced" query plan.
The gains you can achieve, regardless of DBMS, are enormous. You only do this on "problem" queries, of course. profiling and "explain" are your friends here.
It's an appealing view, if you'll pardon the pun. but it pretty much destroys performance in most cases.
Nope. Query "correctness" matters not a jot if it takes until the heat death of the universe to complete.
Complex joins on views are hideously hard for query optimisers to deal with. Couple this with the fact that views are generally created based on relatively 'difficult' queries that have already been optimised to do that job; pulling two or more such queries together will tend to pull the optimiser in about 40 directions simultaneously, with the net result that it gives up and ends up tablescanning pretty much everything, almost always at the bottom of massively nested subqueries.
It's hellish difficult to optimise queries based on views. The first thing to do in cases when you're faced with one of these is to reconstitute it into one query, which can be an "interesting" task in itself; I once spent 2 days optimising one query at $major_financial_institution. It started off as a 20 line query using a shitload of complex views and nasty conditions joining to other tables, and generally took up to an hour to run. The 'finished' version was big (well, very big, in fact, getting on for 300 lines) but ran in 20 seconds in the worst case, and averaged under 10 seconds. A large part of the work was to do with flattening the views (and subselects thereupon) back into the main query. It was a maddening task, and frankly it would have been easier if I'd started from scratch.
Simon
Admin
I like your idea of using views to reduce the spagetti of SQL, but the use of views has NO impact on optimisation for SQL server.
Assume I create a view:
When you run the query:
What SQL Server will execute is the following
In most cases this will produce the same query plan as the quey below, but as the complexity of your view increases this becomes less and less predictable:
There is no caching of pre-fetching of view results done. And the query plan of the view is not even stored. SQL server will cache segments of the query plan, but that is unrelated to the fact that it is a view. It does exactly the same for normal queries.
Admin
and why did this forum destroy my post?
Admin
<font face="Verdana, Arial" size="2">Fry: "Oh the fools! If only they'd built it with 6001 hulls! When will they learn?"
</font>
Admin
Your sarcasm meter needs repair. I actually tried to put a "censor filter" to replace M$ and Micro$oft with humorous replacement text linking to that class Penny Arcade comic (which I now cannot find ...). Naturally, this worthless forums software doesn't like "$" symbols in censor words.
You are truly a man after my own heart.
Admin
The part about how it is often "better to start from scratch" rather than using existing spaghetti code is absolutely true, but the rest is pretty questionable.
Using views that properly fit into the SELECT you are writing is EXACTLY the same as using derived tables in all respects, except for the "side effect" of having more readable code. Now, using random views that just happen to be there w/o understanding what they do, and with logic and joins that you don't need, and just joining them all together and randomly messing with things until it "looks" right (i.e., often by adding DISTINCT as mentioned earlier) is of course going to generate horrible execution plans. Maybe that is the argument you are trying to make? But that is NOT an arguement against using views in general ! That is like saying "code in C++ can be slow and buggy; therefore C++ is bad."
Admin
Well, at least his FROM clause prevents the query from dead-locking itself. :)
ML
Admin
Amateurs!
Now if you really want to slow the system down, it's just silly to link them by any fields. So get rid of it.
You really want a Cartesian join with the system's largest table. And if that query finishes within the day, you're going to have to join that table a couple of times.
Now, we're cooking...some silicon.
Admin
Join is valid here because he wants to only select "name" from SMS_R_System if it also exists in SMS_G_System_SYSTEM.
The 'distinct' gives him unique names (i.e. it could be a 1:M). He could use 'group by name' also to give him unique names.
Admin
Without knowing the database it's hard to say if this is a screwed up situation or not. That query could have been auto generated by a tool of some sort without the knowledge to optimize, for instance.
I like to point out to those who make condemnations quickly something: what's good for the system isn't always good for the business. Case in point: I have a stored procedure which uses (gasp) cursors. It does per record logic on a subset of data that gets fairly indepth. I used to have the business layer do it but that took upwards of five minutes because of the heavy data manipulation. Placing this routine in a stored procedure on the server it now takes less than a second to execute. Is it "horrible" code, abusing SQL Server for things it's not supposed to be used for?
You bet.
Did my end users love me when I made the change?
You bet.
Sometimes priorities have to be set, and priority #1 is almost never "good code."
Admin
Would it be even faster, shorter, and more efficient if it was set-based?
You bet.
And if it was set-based, would your users AND your fellow developers love you even more?
You bet!
Admin
Neither does Postgresql. (Just tried it, with 300 elements in an "in" statement - it worked fine)
What kind of bone-headed database allows only 255 elements in an "in" clause?
Admin
Having written hairy SQL myself all the time, I can tell what the person's trying to do, and they should have chosen EITHER a subquery OR an innerjoin, but not an innerjoin with a subquery containing an innerjoin. The alternating IN and NOT IN forms didn't help readability either.
When I first looked at the query, I thought "WTF is this inner join for? None of the data from the other table is used, so why include it" then I realized that it WAS being used... the query-writer was looking for "this ResourceID exists in both _G_ and _R_ tables".
The obvious solution would be to alter the table and add a column something like "we_care_about_this_row boolean" (sorry, I use postgres, so I dont't know SQL Server's datatypes) and just use that column instead of this IN/NOT IN junk, but if the schema just can't be changed, just replacing the IN (SELECT ...) with NOT IN (the list in the subquery) like you did would certainly speed it up.