Admin
ARGHGHG. My eyes! They burn! Please shoot this idiot! He does not deserve to live!
Admin
For us nubs who haven't had to use overly much SQL yet, could you post the single SELECT statement?
(Or better yet, a link to a guide to good and speedy ways to use SQL? Or a link to bad and slow ways, purely for entertainment's sake? Like the UI halls of shame...)
Admin
This beast of a WTF stored procedure also included several temp tables, a few cursors, 20 inserts, 38 updates, 15 deletes, and 66 selects. It was 2,000 lines long and took 30 seconds to run. It was painful just to look at, much less rewrite.
Admin
To do it within the database layer, see the posts from OneFactor and Maurits. Alternatively, you could dynamically build a SELECT statement in the middle tier, containing only the relevant clauses:
SELECT sProjectID, sProjectName, ...
FROM tbl_Projects
WHERE 1 = 1 /* always true, so each criterion below can be appended with AND */
AND sProjectID = ### /* omitted if sProjectID is not specified */
/* 4 more criteria */
AND sProjectName = 'xxx' /* omitted if sProjectName is not specified */
Also, if you want to send e-mail from the database layer, I suggest just writing the relevant data to an EmailToBeSent-style table, and having a separate process periodically scan that table, send whatever it says to send, and purge or archive each entry after it's successfully processed.
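Something like this, roughly (the table name and columns here are made up for illustration):
CREATE TABLE dbo.EmailToBeSent (
    EmailID   INT IDENTITY(1,1) PRIMARY KEY,
    Recipient NVARCHAR(256)  NOT NULL,
    Subject   NVARCHAR(256)  NOT NULL,
    Body      NVARCHAR(4000) NOT NULL,
    QueuedAt  DATETIME NOT NULL DEFAULT GETDATE(),
    SentAt    DATETIME NULL  -- NULL means not yet sent
);
-- The external sender process loops over unsent rows...
SELECT EmailID, Recipient, Subject, Body
FROM dbo.EmailToBeSent
WHERE SentAt IS NULL;
-- ...hands each one to a mail server, then marks it (or DELETEs it to purge):
UPDATE dbo.EmailToBeSent
SET SentAt = GETDATE()
WHERE EmailID = 42; -- the ID of the row just sent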
Admin
There is a problem with all the proposed alternatives: SQL Server compiles a stored procedure the first time it is run, and that compilation may not use the optimal execution plan for the way the procedure is called later.
Here is an alternative procedure design (based on the Northwind data) which gets compiled separately for each combination of parameters actually used. It is a sort of dynamic SQL wrapped inside the procedure, using the parameter support of the system procedure sp_executesql:
ALTER PROC dbo.usp_GetOrders
@OrderID AS INT = NULL,
@CustomerID AS NCHAR(5) = NULL,
@EmployeeID AS INT = NULL,
@OrderDate AS DATETIME = NULL
AS
DECLARE @sql AS NVARCHAR(4000);
SET @sql =
N'SELECT OrderID, CustomerID, EmployeeID, OrderDate, filler'
+ N' FROM dbo.Orders'
+ N' WHERE 1 = 1'
+ CASE WHEN @OrderID IS NOT NULL THEN
N' AND OrderID = @oid' ELSE N'' END
+ CASE WHEN @CustomerID IS NOT NULL THEN
N' AND CustomerID = @cid' ELSE N'' END
+ CASE WHEN @EmployeeID IS NOT NULL THEN
N' AND EmployeeID = @eid' ELSE N'' END
+ CASE WHEN @OrderDate IS NOT NULL THEN
N' AND OrderDate = @dt' ELSE N'' END;
EXEC sp_executesql
@sql,
N'@oid AS INT, @cid AS NCHAR(5), @eid AS INT, @dt AS DATETIME',
@oid = @OrderID,
@cid = @CustomerID,
@eid = @EmployeeID,
@dt = @OrderDate;
GO
Thanks to Itzik Ben-Gan of Solid Quality Learning for this!
With regard to sending emails from the database, I would not use it for a "real" application, but I use it all the time for alerts on SQL Server job or DTS package failures and errors.
Admin
You are joking, right? This completes in at least O(N * M) time, where M is the number of criteria, when it COULD have completed in O(N) time with a normal WHERE clause.
Admin
omg. this is the best wtf yet.
Maybe... just maybe this poor programmer was not aware of using ISNULL for optional fields in the WHERE clause... then his ingenuity got the better of him.
Admin
I'm no SQL guru, but for optional-parameter searches I always use
"SELECT * FROM Table WHERE Col1 LIKE '%" + parm1 + "%' AND Col2 LIKE '%" + parm2 + "%'" etc....
It makes it one statement... is there something wrong with this? Is it a serious performance hit, or does it prevent the use of indexes or something? Please let me know.
Admin
I would also prefer 2 over 1, since I have a feeling that there is a better chance the query optimizer can optimize the statement based on the constant values. Anyway, as long as you don't have indexes on the searchable columns, all versions should perform equally fast (that means: full table scan). If you have, e.g., 2 of 6 columns indexed, you could make a mix of 3 and 2 to get the best possible performance (3 for the indexed columns, 2 for the rest). Dynamic SQL is obviously best in terms of performance, but more difficult to maintain.
Admin
To make sure it works all the time, no matter which eMail program the client has installed.
Admin
T-SQL might be a bitch (must admit I don't know it), but Oracle's PL/SQL is a nice language to write business logic. We regularly put almost everything into the database, except a thin presentation layer and some programs communicating with other systems using "strange" APIs. Of course, this means that replacing the database with anything other than Oracle is currently not an option. Maybe in a few years competing products like Fyracle or EnterpriseDB will reach a level of PL/SQL compatibility that makes them a viable alternative.
Admin
Wow, this is the smartest way I've seen to run custom queries on a database that doesn't allow regular users to execute dynamic SQL, only stored procedures.[Y]
Well, it might not be the best performance, but who cares. hardware costs aren't the issue these days, right?...[H]
right guys?...
[uncomfortable silence]
Admin
wtf??? I was staying away from this thread -- until this. You mean your only alternatives to send e-mail are the server's DB service or the client's mail app? Have you ever heard of an MTA?
dZ.
Admin
Hehe, well, he stated:
"T-SQL might be a bitch (must admit I don't know it), but Oracle's PL/SQL is a nice language to write business logic. We regulary put almost everything into the database, [...]"
Yeah heck, who needs application servers if you have a DB server? Sure, this must scale like a charm. And there you also have the explanation for why he sends email from his DB. He's lacking a middle tier... what a WTF!
Admin
This is a great WTF from the SQL Server team. They couldn't be bothered with an immutable or constant variable declaration so that these optimizations can take place where they belong.
Admin
This seems to be mainstream thinking where I work [+o(]
Sometimes I think the WTFers are just trying to keep their jobs, like writing crap that only they will understand, consequently making them undisposable (is that a word?) for the company, and it seems to work.
I prefer to live dangerously [:D]
Admin
Re-inventing the WHERE clause... using WHERE clauses...
Somewhere along the line of coding this, you'd expect the quarter to drop and go cha-ching, and he'd think: wait... what am I doing?
Admin
Almost ;-) but I get the point, I think you mean something like this:
public class ITGuy : Employee, IUndisposable {
    public void Undispose() {
        WriteCrap();
    }
}
programming == universal language ;-)
Admin
At the very least, he should have checked whether "LEN(@sProjectID) > 0" and "LEN(@sProjectName) > 0"
before creating the search table.
Admin
I am also fairly inexperienced in SQL, but whenever I've had a situation where I needed a dynamic query with optional parameters I also use something similar to LIKE "%" & param1 & "%". Does running a query like this come with any sort of performance hit over doing it some other way?
Admin
Of course the stored procedure connects to a MTA to send the mails. If I had a middleware, I would let it send the mails.
Admin
It doesn't have to scale like a charm. The application (warehouse management) has - depending on the client - about 50 to 200 users. A single server with 2-4 processors running Oracle can easily handle that.
Admin
I'm fairly inexperienced as well, but from what I have read, using LIKE can potentially slow things down, especially if the pattern starts with a "%", as in "SELECT * FROM Table WHERE Column LIKE '%F%'". http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that as "SELECT * FROM Table WHERE Column LIKE 'F%'" then "the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server." I don't know if the same rules apply to other databases.
On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs. Am I missing something that makes the examples you all are talking about ok?
Admin
The same rule is true for Oracle and probably for every database that has a "like" operator and a query optimizer.
Dynamic SQL is not unspeakably bad, but it should not be used without a good reason. One particular problem with dynamic SQL and Oracle (at least in version 8i, I'm not sure about the more current versions) is that whenever the literals in the query change, Oracle will not reuse a previously parsed statement; for example, "select * from x where y=1" and "select * from x where y=2" are two different statements, and Oracle will parse the second one again although the only difference from the first is the literal value. Using bind variables instead of literals is better for Oracle.
Admin
If you cannot visualize why
LIKE '%def%'
might be inefficient, regardless of the RDBMS being used, think about it logically:
If I give you a large book with an index and ask you to give me a list of page numbers with all key words that begin with the letters 'def' (LIKE 'def%'), you can jump right to the index at the end, jump to the d's, quickly scan the index alphabetically, and there they all are.
But if I ask to give me a list of all words containing the letters 'def' (LIKE '%def%'), then you need to read the entire index (or the book itself) and search all of the letters of every possible word to determine which pages to return.
Admin
While Oracle has gotten better about handling dynamic SQL with 9i, and shows more improvement with 10g, I still feel that bind variables are a better way of doing things overall, due to all the things you mentioned.
On a different note, regarding having the database send emails: there are a lot of systems out there which are written to do exactly that, and all for the reasons that Maurits mentioned earlier. Oracle's HR, PeopleSoft, and Rational, just to name a few.
Admin
Forget performance, you've got a bigger problem:
What if I put in my search criteria as "a'; drop table xyz; --"?
Admin
In most cases, it simply doesn't work. At least not if you use dynamic SQL inside PL/SQL in Oracle, since these procedures expect exactly one DML statement.
That aside, it's bad enough if the program crashes when you try to search for "John O'Connor".
Admin
Yes, but M is a constant having nothing to do with the number of rows. Since constant factors don't matter in big-O terms, this is still O(N).
Admin
I agree with you for the most part if it's a very small project. However, most of the projects I work on are large or become large after a very short time. So I almost always make 3 logical layers for my apps, even if they're relatively small to start with. I try to keep business logic, other than RI, keys, cascade deletes and such, out of the database. On most new projects, I'm building the database from scratch and there are few if any other programs accessing it.
The object table system (objectTypes, objectAttributes) sounds truly horrific. Most of my 3-layered solutions have a class which matches a table in the database, more or less, with the fields in the table matching the primitive fields in the class. Not everything matches up that neatly, and it's a huge amount of work, even with CodeSmith, so I've been scouring the web looking at several OOP databases and O/R mappers. I wish OOP databases would become more mainstream, but I don't really know if they'd work.
I'm sure Yukon WILL have some truly horrific WTFs. When I heard they were going to have the .NET runtime in it, I was initially dismayed because I foresaw developers using C# or VB to do all sorts of things that SQL is better suited for. Slowly I've become more comfortable with the idea, and I think it's one step closer to SQL Server being an Object-Relational DBMS. In the past, I didn't realize that the primary reason I (and other developers) didn't like putting a lot of business logic in the database was the limitations of T-SQL. If you take away that limitation, by allowing .NET languages and/or true objects in the database, then there's less reason to have a middle tier, IMHO. Also, in most cases, you want business logic to apply to the data regardless of what client program is accessing it. With a middle tier, you have to copy DLLs around to multiple client programs, or use remoting, or (uggg!!!) copy & paste code so the same logic applies in all places. If all this was in the database, you wouldn't have to jump through these hoops. Objects are a great way to encapsulate business logic and, to a lesser extent, data. So it would make sense that if our databases become more robust, more OOP, then the middle tier could go ahead and shrink.
But I digress. The way things are now, it's an absolute ass whip to try to debug email-generating code, text parsing, complicated formulas, and other business tasks when they are written in T-SQL. I'm actually very comfortable with T-SQL, but it just isn't very well suited to that kind of work. It was created to manage sets of data. My boss at my last job had this huge, complicated stored procedure that looped over a cursor, appended HTML together, and sent out an email. It would have been so simple to do the same thing in VB.NET, but everyone seemed to love putting that crap in stored procedures. He was also more of a Microsoft Access developer than a .NET/SQL Server developer, so he had little concept of a 3-layered/tiered architecture. When I got here, a contractor/project manager our company uses tried to get me to do the same thing. I told him no way.
Admin
Ok, here's where I'm a little fuzzy on how SQL injection attacks work. I'm going to give two scenarios, one where that SQL injection attack will work, and one where it will not. If I'm wrong about either, please let me know:
1. In this situation, I take your input, "a'; drop table xyz; --" from, say, a web form, and set a string named param1 equal to it. Then I set a string named sql equal to "SELECT * FROM Table WHERE Blah LIKE '%" & param1 & "%'", so sql becomes "SELECT * FROM Table WHERE Blah LIKE '%a'; drop table xyz; -- %'". Then I execute sql and rue the day.
2. In this situation, I take your input, "a'; drop table xyz; --" from, say, a web form, and send it to a stored procedure named GetEverythingByBlah. The meat of this sp is "SELECT * FROM Table WHERE Blah LIKE '%' + @param1 + '%'". When the stored procedure is run it selects everything from Table where the value of column Blah is like "a'; drop table xyz; --". So, if for some weird reason there happened to be a row in Table where column Blah had a value of "rgwrgwerga'; drop table xyz; --aeherghaseth", then that row would be returned. Otherwise, no rows would be returned. Regardless, the SQL injection attack fails.
So, am I right?
Admin
All: When comparing to null you cannot use =, which is what throws a lot of folks off initially. SELECT @varExample = null ... IF @varExample = null fails. You have to use IS to test against null. IF @varExample IS NULL. The function ISNULL(@varExample, valueToUseWhenNull) takes a variable or statement and if the variable or statement resolves to a null value, substitutes the valueToUseWhenNull. (I think this is a poorly named SQL function as I initially expected it to return a boolean value as to whether something is null or not when I first started working with SQL Server.) This test against null for equality is the first stumbling block for many.
The single SQL statement, without using dynamic SQL subject to SQL injection attacks, that I would use would be along the lines of: SELECT Column1, Column2, Column3, Column4 FROM SomeTable WHERE (Column1 LIKE @Column1Parameter OR @Column1Parameter IS NULL) AND (Column2 LIKE @Column2Parameter OR @Column2Parameter IS NULL) AND (Column3 LIKE @Column3Parameter OR @Column3Parameter IS NULL). You can expand this at will. In the stored procedure or query definition, simply specify default values of null for all the parameters you might be searching on.
Some folks choose to rewrite that using ISNULL, along the lines of: SELECT Column1, Column2, Column3, Column4 FROM SomeTable WHERE Column1 LIKE ISNULL(@Column1Parameter, Column1) AND Column2 LIKE ISNULL(@Column2Parameter, Column2) AND Column3 LIKE ISNULL(@Column3Parameter, Column3)
The reason the above is more difficult for many to grasp is the set nature of SQL Server. Each statement is evaluated for each row of the table, but the fact that the references to Column1, Column2, and Column3 in the above statement are translated to the values of the row under consideration escapes many.
I prefer the first statement because it seems clearer to me what the logic is, and I prefer to keep SQL functions out of a query if I can. (Why? Gut instinct, that I should probably check with the execution plans that SQL Server produces.)
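To make the first form concrete, here's a rough sketch as a complete procedure, reusing the tbl_Projects columns mentioned earlier in the thread (the types and exact names are my guesses, purely illustrative):
CREATE PROC dbo.usp_SearchProjects
    @sProjectID   VARCHAR(20)  = NULL,  -- NULL means "don't filter on this"
    @sProjectName VARCHAR(100) = NULL
AS
SELECT sProjectID, sProjectName
FROM dbo.tbl_Projects
WHERE (sProjectID   =    @sProjectID   OR @sProjectID   IS NULL)
  AND (sProjectName LIKE @sProjectName OR @sProjectName IS NULL);
GO
-- e.g. EXEC dbo.usp_SearchProjects @sProjectName = 'Payroll%'  -- search by name only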
Admin
hmmm.. this one's got a WTF in the text provided by the submitter...
"... all the while ignoring the WHILE clause." (???)
Admin
I believe 2 performs better than 1 in the presence of indexes because of the possibility of applying short-circuit boolean logic. The indexes go all funny in 1 because the isnull function gets passed an argument which changes from row to row. I seem to recall that indexes work best when functions depend only on passed in parameters and work worst when functions depend on both passed in parameters and row values and that "functional indexes" can help out with functions which depend only on row values.
Thanks to all for reminding me of solution 4, which is to build the SQL dynamically (either in the middle tier or in the sproc). Does it actually perform better than clever WHERE clauses, even though the dynamic SQL gets compiled on each request?
Admin
From what I know, you have it correct. The worst thing you can do is concatenate the values you have received from the user into the query string you are about to execute.
This is where I can get hazy, due to not knowing all of the specifics on SQL Server. In general, the db user that an application is using should only have read/write permissions (unless you're a coding god and just use sa). So, having just those permissions, can this db user actually drop a table? I'm not necessarily saying that a knowledgeable person couldn't do damage with SQL injection through a limited db application account, but table dropping is generally out of the question, right?
Admin
If you don't need to search on the optional parameters being null and you want an AND search, just set the default value for those parameters to null and have something like the following where clause:
where
(firstField = case when @firstParam is null then firstField else @firstParam end)
and
(secondField = case when @secondParam is null then secondField else @secondParam end)
...
Admin
Dang, can make that simpler. Change every comparison to a coalesce: (firstField = coalesce(@firstParam, firstField))
Admin
Actually, one way around the performance of the (@var is null OR field = @var) is this: (@var is null OR (@var is not null AND field = @var))
This prevents the DB from doing an exhaustive search as a result of the right side of the OR expression, since the first test fails. Do this and you'll almost never need to use dynamic SQL again.
Admin
Funny thing about databases, memory usage means almost nothing. Database optimizations are all about reducing disk read/writes. I'll take O(n*n*n) in main memory over O(3n) in disk reads (and writes!) any day.
Admin
I usually use column = COALESCE(@column, column) for such situations. It tends to perform better in my experience.
Besides, coalesce is a cool word.
Admin
Then you can test whether there's an idiot sysadmin at work who allows the DB Users used by scripts to have access to everything.
Admin
The actual SQL doesn't matter. If the account running is not sa, I could just put in " a' or 1=1" or something similar.
Admin
O(3n) is the same as O(n). There is definitely a value of n that makes you regret your decision (e.g. when the swapping starts).
Admin
The stored procedure will execute with the permissions granted to the stored procedure's owner. So if dbo created the stored procedure, then dbo is who it executes under. Stored procedures are a great way to minimize security risks by restricting direct access to tables and providing access through views/stored procedures, but if you don't take care it can blow up in your face.
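For example, something along these lines (the principal name is hypothetical, and usp_GetOrders is the procedure from earlier in the thread):
-- Deny the web app's login direct table access...
REVOKE SELECT, INSERT, UPDATE, DELETE ON dbo.Orders FROM WebAppUser;
-- ...and let it in only through the procedure, which runs as its owner (dbo).
GRANT EXECUTE ON dbo.usp_GetOrders TO WebAppUser;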
Admin
Ok, so an sp executes with the permissions of the user who created it. Presumably, you wouldn't give JoeWebUser access to an sp that did anything horrible. But let's say you give him access to an sp that creates a sql statement dynamically, like @DynamicSqlString = "SELECT * FROM Table WHERE Blah = '" + @param1 + "'", and then executes it, like Exec(@DynamicSqlString). If JoeWebUser enters something SQL-Injection-irific for @param1, then you're probably screwed because the sp will execute with the full set of permissions held by the developer that created the sp. However, if you were dynamically to create this same SQL statement with the same SQL-Injection-irific input for @param1 and then execute it directly from your application, then, since you'd be logged into the database as JoeWebUser with limited permissions, the SQL injection attack probably wouldn't work on account of JoeWebUser not being allowed to drop tables, delete rows, etc.
Is that right?
Admin
Yes, if you simply concatenate the parameter values onto your dynamic SQL string you're just as screwed as you were before you did anything. It should be noted that this isn't the recommended method of doing this though.
Parameters are translated... although I'm not sure that's the proper term. They are not treated as literal replacements. The SQL Server engine ensures the value inside the parameter is suitable for use as a value by treating it differently than a text query.
JoeUser enters
' OR 0 = 0 --
into your username box. You use parameters to encapsulate this value which is passed to SQL Server to a stored procedure there.
Instead of using a dynamic query, you use a parameter:
SELECT Username, Password FROM Accounts WHERE Username = @Param1 AND Password = @Param2
SQL Server stuffs the value of @Param1 straight into the variables that are being compared. It doesn't first do a text replacement of the values and then execute the query (that would be no better than a dynamic query). What gets "executed" is effectively
SELECT Username, Password FROM Accounts WHERE Username = ''' OR 0 = 0 --' AND Password = @Param2
(Note: That's not what is actually executed. Can't stress that enough. :))
If you had used dynamic SQL though and concatenated the string values together you would've had someone with access to your system.
Parameters can be used outside of stored procedures. They can be used with straight SQL queries as well. Well, EXEC() can't take parameters but sp_executesql can. See MSDN at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_sp_ea-ez_2h7w.asp.
Where a dynamic query is beneficial is where parameter placement is limited. You cannot, for instance, use a parameter as the column name or table name to select from. You cannot use it to indicate which columns to select from the table. That requires dynamic SQL.
Using parameters also allows execution plan reuse, whereas constants and dynamic SQL garner no such benefit.
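A quick sketch of sp_executesql with a plain query, using the Accounts example from above and JoeUser's payload (lengths and types are just illustrative):
DECLARE @sql NVARCHAR(4000);
SET @sql = N'SELECT Username FROM Accounts WHERE Username = @uname';
-- The payload is passed as a value; it is never spliced into the SQL text.
EXEC sp_executesql @sql,
     N'@uname NVARCHAR(50)',
     @uname = N''' OR 0 = 0 --';
-- Returns no rows unless some account is literally named "' OR 0 = 0 --".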
Admin
In Oracle, you can use bind variables in dynamic SQL. You can cobble up a nice SQL string using ':1', ':2', etc, for the placeholders, then call it with the appropriate values. So theoretically it should be able to reuse the query. Also you avoid the whole problem with reserved characters in the literal string then.
Unfortunately on every other level, Oracle makes baby Jesus cry [:'(]
I haven't checked if this is possible with SQL Server though - I must investigate...
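In PL/SQL it looks something like this (Northwind-ish table and column names, purely for illustration):
DECLARE
  v_sql   VARCHAR2(200);
  v_count NUMBER;
BEGIN
  -- :1 and :2 are bind placeholders; the parsed statement can be
  -- reused across calls with different values.
  v_sql := 'SELECT COUNT(*) FROM orders WHERE customer_id = :1 AND employee_id = :2';
  EXECUTE IMMEDIATE v_sql INTO v_count USING 'ALFKI', 5;
END;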
Admin
I write Oracle batch code like that all the time. Well, not quite, since I don't like to update in place. More like this:
create table first_pass as
  select vars_i_care_about
  from ...
  where whatever ...;
Then I index what I know I'll need. Then I might join another table, do another create, etc. This is typically for marketing data warehousing work. Table sizes in the 10s to 100s of millions.
I used to construct very complex SQL queries, extracts, joins, etc., but they were next to impossible to truly troubleshoot and prove. And the total process could run for hours, which could fail and need to be redone. If I kept it piecemeal, I could fix and restart at a known point.
Cutting the work into these kinds of chunks allowed me to control the size and time of each step, and made sure that a change in the optimizer didn't burn me a year from now, since my joins were pretty simple.
Also, I know people who are managers who spent years in the XyWrite editor doing this type of multiple-pass transform of data, massaging it until it was what they wanted. A real program would do it in one pass, but they only had a crappy macro language to work with. Might be based on that.
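In practice it looks roughly like this (all table and column names made up for illustration):
-- Pass 1: cut the raw data down to just what's needed, then index it.
create table first_pass as
  select customer_id, order_total
  from   raw_orders
  where  order_date >= date '2005-01-01';
create index first_pass_cust_idx on first_pass (customer_id);
-- Pass 2: a simple join against the indexed intermediate table.
-- Each step is cheap to verify, and a failure only restarts this step.
create table second_pass as
  select f.customer_id, f.order_total, c.region
  from   first_pass f, customers c
  where  f.customer_id = c.customer_id;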
Admin
There's a good article about dynamic search conditions here:
http://www.sommarskog.se/index.html