Admin
WTF... This cannot be in production code, right?
Admin
They must have created an app to generate all the possible orders... :-|
Admin
For the sake of the coder's sanity, I hope he didn't type that out longhand... although to be fair, I'm not sure there's too much sanity left to salvage based on that mess.
"Well, this is taking hours, and it's clearly the wrong way to do it... but I've started, so I'll finish". There's a certain tragic beauty in that.
Admin
Oh gosh, well apart from the obvious simple way
SELECT * FROM @Table_Input ORDER BY @OrderByCriteria
there's the LIKE statement without wildcards, which is just a way to use =, though probably slower.
Admin
Ummmmmm, that would be great if that ever worked but I think the whole point of the post was to show a really bad way of doing dynamic ORDER BY columns. If all anyone had to do was "ORDER BY @OrderByCriteria" this probably wouldn't have happened...
Admin
Most likely this was a quick fix for a program that only expected two possible permutations, but as time went by, more were added, and instead of fixing the problem, it was just perpetuated.
The WTF here: fix it before it gets out of hand.
Dev Manager's excuse: "If it works, why fix it?"
Admin
Nice way to prevent SQL-Injection
Admin
Since some permutations are missing, that's an unlikely explanation, I'm afraid. Some poor soul coded this by hand.
Admin
Sadly, and very strangely, this isn't an option with MSSQL.
The CASE option mentioned is one way; there are precious few other ways.
Admin
Some permutations are missing, I imagine, because this isn't the entire code snippet, as indicated by the "... (original snippet continues in this manner)" text.
Admin
Depends on the database. I've noticed, for instance, that PostgreSQL will use an index rather than a sequential scan for any LIKE expression that doesn't begin with '%'. For example, if you do something like:
...PostgreSQL will take that and use the index (if an appropriate one exists) to narrow down the options to those that start with SMITH.
If you do this:
...then it will, again, use the index to narrow down the candidates before it starts gronking on all of the Smiths to find the SMITH%JOHNs.
Given that they have performed this optimisation, it would seem that if you did this:
...that it would likewise hit the index. Since there is no % in the expression, it can match on the whole expression during its index search and land on it every bit as fast as if you had used the = operator.
Your mileage may vary with other database engines. I only mention PostgreSQL because of my familiarity with it.
Captcha: dubya (speaking of slow)
Admin
No, what's really sad here is that this is very likely used to serve up some report that allows dynamic sorting. I'm guessing that the application generates the column list dynamically and passes it as a parameter, and that's what's really sad here - it would have been far simpler and more maintainable to just use dynamic SQL. Probably someone there got the idea into their head that they HAD to use stored procedures because they're more [secure / efficient / maintainable] (pick any or all of the above).
Admin
Actually, it's loaded with the single-character wildcard (underscore), although that's probably not intended.
Admin
declare @dynsql varchar(1000)
declare @orderby varchar(1000)
set @orderby = 'somecolumn, somecolumn2'
set @dynsql = 'select * from table order by ' + isnull(@orderby, 'reason')
execute(@dynsql)
Back when it was ASP instead of ASP.NET, I wrote a nice set of functions for reporting that let you just pass in SQL, or a stored procedure, and it would build the output, including the option to sort on the fly. It worked really well for the small amount of code I put into it (a couple pages of code made ASP almost as easy as an ASP.NET datagrid...).
I know, the execute probably isn't the best way to do this... but... I couldn't come up with a better way to do it, and have the stored procedures sort without a ton of extra planning/coding. And of course, I need to use execute all the time to access an external server, so, I'm used to that type of logic.
Admin
Read The Fine Submission, people.
Emphasis added.
Admin
Don't make the mistake of thinking this technique is only useful for sorting--it lends itself nicely to a search algorithm for the next Google killer.
Admin
Example: 6 columns to order means 6! = 720 if statements... 7 columns would produce more than 5,000 possible orderings...
Thinking about writing a program which produces the sql-if-elseif-statements.
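That generator is only a few lines in any scripting language. A minimal Python sketch (the column names, table name, and @OrderByCriteria convention are invented for illustration):

```python
from itertools import permutations

def generate_branches(columns, table="SomeTable"):
    """Emit the IF / ELSE IF ladder covering every permutation of columns."""
    branches = []
    for i, perm in enumerate(permutations(columns)):
        keyword = "IF" if i == 0 else "ELSE IF"
        crit = ", ".join(perm)
        branches.append(f"{keyword} @OrderByCriteria = '{crit}'\n"
                        f"    SELECT * FROM {table} ORDER BY {crit}")
    return "\n".join(branches)

# 3 columns already yield 3! = 6 branches; 6 columns would yield 720.
print(generate_branches(["LastName", "FirstName", "City"]))
```

Generating the ladder at least keeps every branch consistent, though it does nothing about the factorial growth.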
Admin
Just did an experiment in SQL Server 2000 and got interesting results. Both of these queries produce subtly different execution plans:
The difference lies in how the index seek is performed.
vs. Note the use of >= AND <= rather than just =. The execution cost actually ends up being a tiny bit larger, according to Query Analyzer (50.01% of the batch for the LIKE version, 49.99% for the = version). It's entirely possible that's just rounding error, though.
SQL Server 2005 comes up with some weird execution plans involving inner joins instead of the bookmark lookup when I tell it to use the index on CompanyName (which it otherwise ignores). Not sure what's going on there...
Admin
Been there, done that. Quick-n-dirty is bad if it becomes something you need a lot and may need to expand it. Well actually it's always bad no matter what...
When I saw a fellow developer create a table with six columns for six different standard messages, I thought: hey, those should be six rows, not columns. I mentioned to the guy that that's not how it's supposed to go, but we let it slide, because fixing it would have taken a lot of work, and it was just six messages... That happened two years ago. The application is still in use and has been tweaked a few times since. Last time I checked, the table had 50 columns and 1 row. It started innocent, and, well... it works... :(
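The rows-versus-columns point is easy to show concretely. A hypothetical sketch using Python's built-in sqlite3 (table and message names invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows, not columns: the 51st message is an INSERT, not an ALTER TABLE.
conn.execute("CREATE TABLE messages (msg_key TEXT PRIMARY KEY, msg_text TEXT)")
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [("greeting", "Hello"), ("farewell", "Goodbye")])
# Adding a new standard message later touches no schema:
conn.execute("INSERT INTO messages VALUES (?, ?)", ("error", "Oops"))

count = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(count)  # → 3
```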
Admin
I don't believe Sybase allows you to build SQL on the fly (or at least, it didn't when we started creating our stored procedures, and by golly, we're not going to start doing something different just because the Intarweb says we can). I wouldn't want to count on those strings being passed correctly, though... I'd have used magic numbers, and probably been included in a WTF of my own at some point.
(cue sound of Coyote screeching to a halt)
Wait a minute. If we can't build SQL interactively, how the heck does "Select * from @anything" work?
Admin
@anything would have to be a table object, otherwise you will get a "muhaha, you can't do that stupid!" from the sql compiler.
http://www.developer.com/db/article.php/3414331 - the bottom example
Admin
Dang! My SQL engine doesn't yell at me, you must be using "option explicit"...
Admin
Haven't read much production code, have you son?
captcha: gotcha
Admin
The REAL WTF here is something I see frequently: programmers not understanding the difference between IMPERATIVE and DECLARATIVE languages.
SQL is a declarative language. If you try to use the same algorithmic approach in SQL that you use in C# or Fortran, the performance of your SQL code will be awful. Generic or dynamic code is excellent in C# but terrible in SQL.
In a declarative language (like SQL) you provide the interpreter with the RESULT you are looking for, not the METHOD to get the result. If you make things dynamic, the server/interpreter always sees the requests as something new and cannot use its internal optimizer efficiently. On the other hand, if you hardcode everything, excellent optimization can be done by the interpreter.
On SQL Server, if the server can reuse a previously calculated execution plan for a piece of code, the performance will be greatly improved. This is the main purpose of stored procedure. But when a "smartass" writes dynamic code using EXEC(), the server has no way to predict what the dynamic code will be, and therefore cannot reuse an execution plan. And performance goes down the drain.
So basically the guy who wrote the permutations is doing the right thing. Using dynamic SQL would not be optimal. Unless there is a ridiculous amount of permutations, in which case this is a design issue.
If there is a relatively small number of permutations, say under 100, a good DBA will use a scripting language to generate the SQL. And the stored procedure itself will contain the fully-generated SQL.
So before crapping on someone's code, it would be better to learn about the language involved.
Admin
MSSQL is claimed to implement the most basic level of SQL-92, which is almost the same as SQL-89. However, I came across a case where it follows SQL-89 semantics. In short, very little of its functionality is implemented according to current SQL standards.
Admin
If you have 100 possible permutations, and all of them are used, the database engine still has to keep plans for 100 different statements, no matter if they are created at runtime using exec() or explicitly written in the stored procedure like in the original posting. Anyway, if the string given to EXEC() is always the same, the database should be able to reuse the plan it created last time. (Disclaimer: I mostly use Oracle, but I'm confident SQL Server can do the same.)
Admin
Um... No. In SQL Server 2000 Microsoft started caching execution plans for ALL QUERIES, not just Stored Procedures.
http://groups.google.com/group/microsoft.public.sqlserver.programming/browse_thread/thread/cbfbbeea3993bd88/5fa4f9cd1cde7868%235fa4f9cd1cde7868
Assuming small enough result sets, it would be better to just do the sorting in the application and save the SQL Server some CPU cycles, since it's usually the bottleneck of any n-tier architecture. Most reporting tools have this functionality built in, as do ODBC and .NET.
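For small result sets that suggestion is a one-liner in most client languages. A Python sketch with made-up rows (the column names are invented):

```python
from operator import itemgetter

# Hypothetical rows as returned by an unordered SELECT.
rows = [
    {"last": "Smith", "first": "John"},
    {"last": "Adams", "first": "Zoe"},
    {"last": "Smith", "first": "Anne"},
]

# The "ORDER BY" columns are plain data here, so no SQL is rebuilt
# per sort order and there is no injection surface.
order_by = ("last", "first")
rows.sort(key=itemgetter(*order_by))

print([(r["last"], r["first"]) for r in rows])
# → [('Adams', 'Zoe'), ('Smith', 'Anne'), ('Smith', 'John')]
```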
Admin
you owe me a new keyboard
Admin
I can think of two reasons for code similar to the example... but not identical to it (the @Table_Input parameter).
For servers other than MS SQL (which caches ad-hoc plans rather than only precompiled stored procs), there may be performance implications of using dynamic SQL.
The other one is related to security: if you use dynamic SQL, then you need to grant permissions to the underlying database objects (tables, views) rather than just to the stored procedure.
Admin
You must be new around here...
Admin
I think I know why it looks like this.
Original code was SQL Injection vulnerable. Rewrite had to be done in an hour, with no time to do a proper redesign. Retrofitting security quickly is an exercise in bad engineering and gives birth to dirt ugly code like this.
Admin
One could perhaps generate a text string containing the exact SQL code to execute and then use sp_executesql to run it.
Admin
On the other hand, most databases including MSSQL cache all execution plans, no matter where the SQL "comes from", so executing the same SQL over and over again with EXEC or sp_executesql will only have to generate a single execution plan, and often the results themselves will be cached as well.
The net result is that there's no performance advantage to having a "dynamic" stored procedure with complex control logic over just generating the SQL client-side and executing it on the server.
Incidentally, I'm not inclined toward dynamic SQL within SQL server itself. I was talking about generating it from the client, a practice which you're going to see becoming far more common in the near future with the propagation of ORM tools and LINQ.
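Client-side generation can be done safely if the order-by column is checked against a whitelist before being spliced in, since identifiers can't be bound as parameters. A minimal sketch using Python's sqlite3 (table and column names invented):

```python
import sqlite3

ALLOWED_ORDER_COLUMNS = {"name", "created"}  # fixed whitelist, not user input

def fetch_sorted(conn, column):
    # Validate before splicing: identifiers cannot be passed as ? parameters.
    if column not in ALLOWED_ORDER_COLUMNS:
        raise ValueError(f"cannot order by {column!r}")
    return conn.execute(f"SELECT id, name FROM items ORDER BY {column}").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT, created TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)",
                 [(1, "zebra", "2007-01-02"), (2, "apple", "2007-01-01")])

print(fetch_sorted(conn, "name"))  # → [(2, 'apple'), (1, 'zebra')]
```

The whitelist is what makes the string splice acceptable; anything outside it, including injection attempts, is rejected before it reaches the SQL text.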
Admin
This will be prepared every time, but for a query that on the surface looks like a simple table scan, that should be zero.
EXECUTE STATEMENT 'SELECT * FROM ' || :Table_Input || ' ORDER BY ' || :OrderByCriteria
The app should probably be doing the ordering anyway, unless it's a large result set and you want to stop some time before seeing all the results.
Rick DeBay
Admin
Not sure I'd characterize the difference in those terms. In one sense C# is a declarative language too--you want the compiler to generate machine code that does a particular task in a particular way. Some specialized C++ compilers have optimizers which can find codec algorithms in their input files and replace them with equivalents from a library of hand-optimized SSE instructions. In another sense SQL can be interpreted imperatively if you use a RDBMS engine with no optimizer, although that means e.g. all table joins run in O(n*m) time for two tables of size n and m even if the join condition specifies equality between unique columns on both tables. If SQL was truly declarative, it wouldn't matter which of several possible mathematically equivalent ways to express a query were used, but in most real-world implementations it makes a huge difference.
I do agree with the rest of what was said. It is true that interpreting SQL statements is quite expensive compared to other languages, and a lot of coders don't grasp the concepts of algorithmic complexity or implementation cost. Imagine what performance would be like if you were writing code in C# or some similar language which did the following:
As silly as the above sounds, it does work, and with some caching and slightly more intelligent design it can even work well.
A simple SQL statement like
select ca, cb from tbl where cb like 'xyz' order by ca;
has to answer a bunch of questions before it even starts to execute (and for clarity I'm leaving out several necessary steps like locks and transaction handling):
Each of these questions on average requires O(log(N)) disk seeks to answer (which is why modern DB servers are dead in the water without adequate RAM caching). Just imagine how many steps there would be if there were joins, group by clauses, etc. Often there is a combinatorial explosion of possible questions to answer, so the DB has to skip a few steps and choose a suboptimal but working solution just to avoid spending the rest of the Sun's runtime analyzing the query (that's "Sun" as in "the big fusion reactor at the center of the solar system", not "the hardware vendor in California").
Some DB engines provide a facility to precompile the constant parts of an SQL expression. This incurs the cost of most of the steps above once, then produces an object which starts at step #11 with specific data values each time it is invoked; however, sometimes the result is not as optimized as it would be if the specific data values are known. Consider the potential to optimize "like 'foo'", which is equivalent to "= foo" vs. "like '%foo%'" which requires something like a permuted string index or a full table scan. If the planner is given "like [variable]" then it can't use plans that might be faster but only work for special cases of [variable].
Usually prepared queries provide the ability to substitute column values (so you could substitute a variable in place of 'xyz' in the example) but not column names, table names, or any part of SQL except constant subexpressions (so you couldn't substitute a variable in place of 'tbl' in the example, or change the "order by" clause). The reason for this limitation is due to the implementation, which is trying to save costs per query by fixing the execution plan of the query in advance--change the query in all but a few specific ways, and you'll have to redo most or all of the work you were trying to avoid.
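The value-versus-identifier limitation is easy to demonstrate. A sketch using Python's sqlite3 driver (table and data invented, reusing the column names from the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ca TEXT, cb TEXT)")
conn.execute("INSERT INTO tbl VALUES ('1', 'xyz')")

# A constant value can be replaced by a placeholder...
rows = conn.execute(
    "SELECT ca, cb FROM tbl WHERE cb LIKE ? ORDER BY ca", ("xyz",)
).fetchall()
print(rows)  # → [('1', 'xyz')]

# ...but a table (or column) name cannot: the statement is parsed and
# planned before any values are bound, so "?" is rejected in that position.
try:
    conn.execute("SELECT ca, cb FROM ? WHERE cb LIKE ?", ("tbl", "xyz"))
except sqlite3.OperationalError as exc:
    failure = str(exc)
```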
On the other hand, if there are only a few ORDER BY expressions, a clever client could simply prepare and cache those queries that it does generate. Clever servers in turn can do this for the clients. Not-so-clever servers take this one step too far, by caching the results of the query too.
Admin
Actually, there is a subtle difference between the two anyway: '=' ignores trailing spaces and LIKE doesn't.
SELECT * FROM Northwind..Customers WHERE CompanyName = 'Gourmet Lanchonetes '
SELECT * FROM Northwind..Customers WHERE CompanyName LIKE 'Gourmet Lanchonetes '
Admin
Speaking as someone who works at an organization that routinely deals with tables of rather large size (we have 300 tables of >= 1M rows) the idea of using CASE and CONVERT to generate ad-hoc ORDER BY is rather boggling.
Such an approach is only useful if a table is relatively small and/or there is a need to be able to sort by every possible permutation. I can only imagine how loud our users would scream about response times on the order of hours (if not days) necessitated by such a dynamic ORDER BY applied to such large tables...especially since our transaction response time target is 3 seconds walltime.
Experience has shown that, in general, few permutations of ordering are needed for any given table. In a requisition line table, for example, who would want to sort all the rows by unit-of-measure code (each, box, pint, etc.) without first sorting by item? (Seems kind of lame to order the rows so that "each/MRI machine" is followed by "each/mug, coffee".)
Where tables are large, sorting of necessity must be backed up by an appropriate index, and the number of indexes is sharply limited by management and performance issues. So the "multiple-SQL-statement" approach is quite reasonable in such environments, since there are probably no more such statements than there are indexes.
Admin
Looking at this, and thinking about maintainability and this being pushed into production, made me feel physically nauseous.
No, seriously... I don't feel well.
Admin
The Case statement is terribly slow, and forces a table scan regardless of the indexes on the table. Best to use dynamic SQL, with a view to encapsulate the JOIN logic.
For MS, you'd use:
declare @SQL nvarchar(4000)
select @SQL = 'select * from Table order by ' + @OrderByCriteria
exec sp_executesql @SQL
Admin
Your first seven steps just described (more or less) the way XML serialization in C# works :) [though truth be told it's only between runs, not during runs, and I hear you can cache the generated code/dll these days]
Admin
Ordering output should not be handled by the database, this is for the front end to do.
Admin
Captcha: dubya (and his database design too...)
Admin
WTF?! No, that's one of the things the database is for.
Admin
MY EYES! MY EYES! IT BURNS!
Admin
This isn't bad; in certain cases I could consider using this. It's in a stored procedure, so the interface simply has to stay the same and you can update the data layer at will. It prevents SQL injection, which is a big plus. It's very simple and repeats itself A LOT!!! Make a program that generates this dynamically and you have your maintainability. There are probably better, more maintainable ways of doing this, like trimming the string as it comes in so that it can only have a fixed number of chars, no more, no less, and making sure each expected element is present.
Admin
People who liked this WTF also liked Fischer Random Chess...
Admin
Paid per line of code --> he'll be rich ;-)