• (cs) in reply to Maurits

    ARGHGHG. My eyes! They burn! Please shoot this idiot! He does not deserve to live!

  • (cs) in reply to Anonymous
    Anonymous:
    I can find three nice things to say about this procedure:
    1. It is nicely formatted so that it is very easy to see what is being done.
    2. In some hardware configurations it may well run faster than selecting all rows and filtering them on the client side.
    3. It is encapsulated such that someone who actually understands how to write a WHERE clause can replace this stored procedure with a decent one without affecting any other code.
    Of course, the whole procedure body could (and should) consist of a single SELECT statement, the format of which is so obvious that it's not even worth typing it in here.  I'd suggest that the programmer here might be very competent in other types of coding, but shouldn't be left to generate SQL without some guidance.
     
    A programmer capable of this who claims any SQL expertise should be hung by their toenails and forced to watch episodes of "The Nanny" until they recant.  Otherwise, this seems like a management WTF to have put this person on the task.



    For us nubs who haven't had to use overly much SQL yet, could you post the single SELECT statement?

    (Or better yet, a link to a guide to good and speedy ways to use SQL? Or a link to bad and slow ways, purely for entertainment's sake? Like the UI halls of shame...)

  • (cs) in reply to Maurits

    This beast of a WTF stored procedure also included several temp tables, a few cursors, 20 inserts, 38 updates, 15 deletes, and 66 selects. It was 2,000 lines long, and took 30 seconds to run. It was painful just to look at, much less rewrite.

     

  • (cs) in reply to Cyresse
    Cyresse:

    For us nubs who haven't had to use overly much SQL yet, could you post the single SELECT statement?


    To do it within the database layer, see the posts from OneFactor and Maurits.  Alternatively, you could dynamically build a SELECT statement in the middle tier, containing only the relevant clauses:

    SELECT sProjectID, sProjectName, ...
    FROM tbl_Projects
    WHERE 1 = 1
    AND sProjectID = ### /* omitted if sProjectID is not specified */
    /* 4 more criteria */
    AND sProjectName = 'xxx' /* omitted if sProjectName is not specified */

    Also, if you want to send e-mail from the database layer, I suggest just writing relevant data to an EmailToBeSent-type table, and having a separate process periodically scan that table and send whatever it says to send, purging or archiving each entry after it's successfully processed.
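A rough sketch of that outbox pattern in Python with sqlite3 (the table layout and the send callback are invented for illustration; a real sender would hand each row to an MTA/SMTP relay):

```python
import sqlite3

def process_outbox(conn, send):
    """Send every pending row, then mark it as processed."""
    pending = conn.execute(
        "SELECT id, recipient, body FROM EmailToBeSent WHERE sent = 0"
    ).fetchall()
    for row_id, recipient, body in pending:
        send(recipient, body)  # e.g. hand off to an SMTP relay
        conn.execute("UPDATE EmailToBeSent SET sent = 1 WHERE id = ?", (row_id,))
    conn.commit()
    return len(pending)

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE EmailToBeSent
                (id INTEGER PRIMARY KEY, recipient TEXT, body TEXT,
                 sent INTEGER DEFAULT 0)""")
conn.execute("INSERT INTO EmailToBeSent (recipient, body) VALUES (?, ?)",
             ("a@example.com", "hello"))

sent_log = []
first_pass = process_outbox(conn, lambda to, body: sent_log.append(to))
second_pass = process_outbox(conn, lambda to, body: sent_log.append(to))
```

The second pass finds nothing to do, which is the property that makes the poller safe to run on a timer.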

  • (cs) in reply to emurphy

    There is a problem with all the proposed alternatives: SQL Server will compile the stored procedure the first time it is run, and that compilation may not use the optimal execution plan for most of the times the procedure is run later.

    Here is an alternative procedure design (based on the Northwind data) which will compile for each combination of parameters. It is a sort of dynamic SQL wrapped inside the procedure, using the parameters of the system procedure sp_ExecuteSQL:

    ALTER PROC dbo.usp_GetOrders
      @OrderID    AS INT      = NULL,
      @CustomerID AS NCHAR(5) = NULL,
      @EmployeeID AS INT      = NULL,
      @OrderDate  AS DATETIME = NULL
    AS

    DECLARE @sql AS NVARCHAR(4000);

    SET @sql =
        N'SELECT OrderID, CustomerID, EmployeeID, OrderDate, filler'
      + N' FROM dbo.Orders'
      + N' WHERE 1 = 1'
      + CASE WHEN @OrderID IS NOT NULL THEN
          N' AND OrderID = @oid' ELSE N'' END
      + CASE WHEN @CustomerID IS NOT NULL THEN
          N' AND CustomerID = @cid' ELSE N'' END
      + CASE WHEN @EmployeeID IS NOT NULL THEN
          N' AND EmployeeID = @eid' ELSE N'' END
      + CASE WHEN @OrderDate IS NOT NULL THEN
          N' AND OrderDate = @dt' ELSE N'' END;

    EXEC sp_executesql
      @sql,
      N'@oid AS INT, @cid AS NCHAR(5), @eid AS INT, @dt AS DATETIME',
      @oid = @OrderID,
      @cid = @CustomerID,
      @eid = @EmployeeID,
      @dt  = @OrderDate;
    GO

    Thanks to Itzik Ben-Gan of Solid Quality Learning for this!

     

    With regard to sending emails from the database, I would not use it for a "real" application, but I use it all the time for alerts on SQL Server job or DTS package run failures and errors.

     

  • BogusDude (unregistered) in reply to loneprogrammer
    loneprogrammer:
    This isn't so bad.  At least it completes in O(N) time and memory usage.

    A real WTF would have to use at least two tables to be O(N*N) in either time or memory.  Possibly in both if you really bugger it up.


    You are joking, right? This completes in at least O(N * M) time, where M is the number of criteria, when it COULD have completed in O(N) time with a normal WHERE clause.

  • Recca (unregistered) in reply to BogusDude

    omg. this is the best wtf yet.

    Maybe... just maybe this poor programmer was not aware of using ISNULL for optional fields in the WHERE clause... then his ingenuity got the better of him.

  • (cs) in reply to Recca

    im no SQL guru but for optional parameter selects for searches i always use

    "SELECT * FROM Table WHERE Col1 LIKE '%" + parm1 + "%' AND Col2 LIKE '%" + parm2 + "%'" etc....

    makes it one statement...is there something wrong with this? Is this a serious performance hit or does it not allow use of indexes or something? please let me know..

  • (cs) in reply to OneFactor
    Anonymous:

    1. isnull in the where clause: where isnull(@city,city) = city AND isnull(@customer, customer) = customer AND...
    2. OR in the where clause: where (@city is null OR @city = city) AND (@customer is null or @customer = customer) AND...
    3. Using if statements to break it out into 2^n cases where n = number of optional parameters.



    I would also prefer 2 over 1 since I have a feeling that there is a better chance that the query optimizer can optimize the statement based on the constant values. Anyway, as long as you don't have indexes on the searchable columns, all versions should perform equally fast (that means: full table scan). If you have e.g. 2 of 6 columns indexed, you could make a mix of 3. and 2. to get the best possible performance (3. for the indexed columns, 2. for the rest). Dynamic SQL is obviously best in terms of performance, but more difficult to maintain.
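Option 2 also maps directly onto bound parameters, so it stays a single static statement; a small sqlite3 illustration (schema invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (city TEXT, customer TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("Wien", "Acme"), ("Graz", "Acme"), ("Wien", "Globex")])

# Pattern 2: each criterion drops out when its parameter is NULL.
SQL = """SELECT city, customer FROM orders
         WHERE (:city IS NULL OR city = :city)
           AND (:customer IS NULL OR customer = :customer)"""

all_rows = conn.execute(SQL, {"city": None, "customer": None}).fetchall()
wien_only = conn.execute(SQL, {"city": "Wien", "customer": None}).fetchall()
```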
  • (cs) in reply to Ytram
    Ytram:
    ...I created an sp_send_cdosys_mail stored procedure for sending HTML-formatted email.


    Why would you use the database to send e-mails?



    To make sure it works all the time, no matter which eMail program the client has installed.
  • (cs) in reply to Fregas
    Anonymous:

    There's a bit of debate on this thread on whether email, HTML, business logic, etc should exist on the database tier.  My question to those that promote the "absence" of a middle tier is: what DON'T you put into the database, other than the presentation layer?  My last job put all kinds of business logic in stored procs: sending & creating HTML emails using T-SQL cursors, randomly generating passwords, etc.  It was a maintenance nightmare...

    Also, keep in mind that in some cases, it's the database that gets replaced, rather than the client programs written in C#, VB, etc.

    I think this is why we really need business object servers, or OOP databases or something.  T-SQL just sucks at these kinds of things.  The next version of SQL Server is going to have the .NET runtime in it, but I guess that's a two-edged sword.



    T-SQL might be a bitch (must admit I don't know it), but Oracle's PL/SQL is a nice language to write business logic. We regularly put almost everything into the database, except a thin presentation layer and some programs communicating with other systems using "strange" APIs. Of course, this means that replacing the database with anything other than Oracle is currently not an option. Maybe in a few years competing products like Fyracle or EnterpriseDB may reach a level of PL/SQL compatibility that makes them a viable alternative.
  • (cs) in reply to ammoQ

    Wow, this is the smartest way I've seen to run custom queries on a database that doesn't allow regular users to execute dynamic SQL, only stored procedures.

    Well, it might not be the best performance, but who cares. Hardware costs aren't the issue these days, right?...

    right guys?...

    [uncomfortable silence]

  • (cs) in reply to ammoQ
    ammoQ:
    Ytram:
    ...I created an sp_send_cdosys_mail stored procedure for sending HTML-formatted email.


    Why would you use the database to send e-mails?



    To make sure it works all the time, no matter which eMail program the client has installed.


    wtf??? I was staying away from this thread -- until this.  You mean your only alternatives to send e-mail are the server's DB service or the client's mail app?  Have you ever heard of an MTA?

        dZ.
  • thedarkknight (unregistered) in reply to DZ-Jay

    Hehe, well, he stated:

    "T-SQL might be a bitch (must admit I don't know it), but Oracle's PL/SQL is a nice language to write business logic. We regularly put almost everything into the database, [...]"

    Yeah heck, who needs application servers, if you have a DB server. Sure this must scale like a charm. And there you also have the explanation why he sends email from his DB. He's lacking a middle tier... what a WTF!

  • (cs) in reply to pshotts
    pshotts:

    There is a problem with all the proposed alternatives: SQL Server will compile the stored procedure the first time it is run, and that compilation may not use the optimal execution plan for most of the times the procedure is run later.


    This is a great WTF from the SQL Server team.  They couldn't be bothered with an immutable or constant variable declaration so that these optimizations can take place where they belong.
  • (cs) in reply to wtijsma

    wtijsma:
    Well, it might not be the best performance, but who cares. Hardware costs aren't the issue these days, right?...

    This seems to be mainstream thinking where I work.

    Sometimes I think the WTFers are just trying to keep their jobs, like writing crap that only they will understand, consequently making them undisposable (is that a word?) for the company, and it seems to work.

    I prefer to live dangerously.

  • (cs) in reply to lucio

    Re-inventing the WHERE clause... using WHERE clauses...

    Somewhere along the line of coding this, you'd expect the quarter to drop and go cha-ching and he'd think wait... what am I doing.

  • (cs) in reply to lucio
    lucio:

    consequently making them undisposable (is that a word?)



    Almost ;-) but I get the point, I think you mean something like this:

    public class ITGuy : Employee, IUndisposable{
      public void Undispose(){
         WriteCrap();
      }
    }

    programming == universal language ;-)
  • anonymous (unregistered)

    At the very least, he should have checked whether "LEN(@sProjectID) > 0" and "LEN(@sProjectName) > 0" before creating the search table.

  • SeekerDarksteel (unregistered) in reply to fuzzbucket

    fuzzbucket:
    im no SQL guru but for optional parameter selects for searches i always use

    "SELECT * FROM Table WHERE Col1 LIKE '%" + parm1 + "%' AND Col2 LIKE '%" + parm2 + "%'" etc....

    makes it one statement...is there something wrong with this? Is this a serious performance hit or does it not allow use of indexes or something? please let me know..

    I am also fairly inexperienced in SQL, but whenever I've had a situation where I needed a dynamic query with optional parameters I also use something similar to LIKE "%" & param1 & "%".  Does running a query like this come with any sort of performance hit over doing it some other way?

  • (cs) in reply to DZ-Jay
    DZ-Jay:
    ammoQ:
    Ytram:
    ...I created an sp_send_cdosys_mail stored procedure for sending HTML-formatted email.


    Why would you use the database to send e-mails?



    To make sure it works all the time, no matter which eMail program the client has installed.


    wtf??? I was staying away from this thread -- until this.  You mean your only alternatives to send e-mail are the server's DB service or the client's mail app?  Have you ever heard of an MTA?

        dZ.


    Of course the stored procedure connects to an MTA to send the mails. If I had a middleware, I would let it send the mails.
  • (cs) in reply to thedarkknight
    Anonymous:

    Hehe, well, he stated:

    "T-SQL might be a bitch (must admit I don't know it), but Oracle's PL/SQL is a nice language to write business logic. We regularly put almost everything into the database, [...]"

    Yeah heck, who needs application servers, if you have a DB server. Sure this must scale like a charm. And there you also have the explanation why he sends email from his DB. He's lacking a middle tier... what a WTF!



    It doesn't have to scale like a charm. The application (warehouse management) has - depending on the client - about 50 to 200 users. A single server with 2-4 processors running Oracle can easily handle that.

  • (cs) in reply to SeekerDarksteel
    Anonymous:

    fuzzbucket:
    im no SQL guru but for optional parameter selects for searches i always use

    "SELECT * FROM Table WHERE Col1 LIKE '%" + parm1 + "%' AND Col2 LIKE '%" + parm2 + "%'" etc....

    makes it one statement...is there something wrong with this? Is this a serious performance hit or does it not allow use of indexes or something? please let me know..

    I am also fairly inexperienced in SQL, but whenever I've had a situation where I needed a dynamic query with optional parameters I also use something similar to LIKE "%" & param1 & "%".  Does running a query like this come with any sort of performance hit over doing it some other way?



    I'm fairly inexperienced as well, but from what I have read, using LIKE can potentially slow things down, especially if the search pattern starts with a "%", as in "SELECT * FROM Table WHERE Column LIKE '%F%'".  http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that to be "SELECT * FROM Table WHERE Column LIKE 'F%'" then "the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server."  I don't know if the same rules apply to other databases.

    On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs.  Am I missing something that makes the examples you all are talking about ok?
  • (cs) in reply to UncleMidriff
    UncleMidriff:

    I'm fairly inexperienced as well, but from what I have read, using LIKE can potentially slow things down, especially if the search pattern starts with a "%", as in "SELECT * FROM Table WHERE Column LIKE '%F%'".  http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that to be "SELECT * FROM Table WHERE Column LIKE 'F%'" then "the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server."  I don't know if the same rules apply to other databases.

    On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs.  Am I missing something that makes the examples you all are talking about ok?


    The same rule is true for Oracle and probably for every database that has a "like" operator and a query optimizer.

    Dynamic SQL is not unspeakably bad, but should not be used without a good reason. One particular problem with dynamic SQL and Oracle (at least in version 8i, I'm not sure about the more current versions) is that whenever the literals in the query change, Oracle will not reuse a previously parsed statement; for example, "select * from x where y=1" and "select * from x where y=2" are two different statements and Oracle will parse the second one again although the only difference to the first one is the literal value. Using bind variables instead of literals is better for Oracle.
  • (cs) in reply to ammoQ

    If you cannot visualize why

    LIKE '%def%'

    might be inefficient, regardless of the RDBMS being used, think about it logically:

    If I give you a large book with an index and ask you to give me a list of page numbers with all key words that begin with the letters 'def' (LIKE 'def%'), you can jump right to the index at the end, jump to the d's, quickly scan the index alphabetically, and there they all are. 

    But if I ask to give me a list of all words containing the letters 'def' (LIKE '%def%'), then you need to read the entire index (or the book itself) and search all of the letters of every possible word to determine which pages to return.
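You can watch the optimizer make exactly this choice in plan output. For instance, in sqlite3 (with `case_sensitive_like` turned on so its LIKE-prefix optimization can kick in; SQL Server behaves analogously with `LIKE 'def%'`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w TEXT)")
conn.execute("CREATE INDEX idx_w ON words (w)")
conn.executemany("INSERT INTO words VALUES (?)",
                 [("default",), ("define",), ("abcdef",)])
conn.execute("PRAGMA case_sensitive_like = ON")

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry a human-readable detail string last.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

prefix_plan = plan("SELECT w FROM words WHERE w LIKE 'def%'")     # can seek the index
contains_plan = plan("SELECT w FROM words WHERE w LIKE '%def%'")  # must scan everything
prefix_rows = conn.execute("SELECT w FROM words WHERE w LIKE 'def%'").fetchall()
```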

     

  • (cs) in reply to ammoQ
    ammoQ:
    UncleMidriff:

    I'm fairly inexperienced as well, but from what I have read, using LIKE can potentially slow things down, especially if the search pattern starts with a "%", as in "SELECT * FROM Table WHERE Column LIKE '%F%'".  http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that to be "SELECT * FROM Table WHERE Column LIKE 'F%'" then "the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server."  I don't know if the same rules apply to other databases.

    On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs.  Am I missing something that makes the examples you all are talking about ok?


    The same rule is true for Oracle and probably for every database that has a "like" operator and a query optimizer.

    Dynamic SQL is not unspeakably bad, but should not be used without a good reason. One particular problem with dynamic SQL and Oracle (at least in version 8i, I'm not sure about the more current versions) is that whenever the literals in the query change, Oracle will not reuse a previously parsed statement; for example, "select * from x where y=1" and "select * from x where y=2" are two different statements and Oracle will parse the second one again although the only difference to the first one is the literal value. Using bind variables instead of literals is better for Oracle.


    While Oracle has gotten better about handling dynamic SQL with 9i, and shows more improvement with 10g, I still feel that bind variables are a better way of doing things overall, due to all the things you mentioned.

    On a different note regarding having the database send emails, there are a lot of systems out there which are written to do exactly that, and all for the reasons that Maurits mentioned earlier.  Oracle's HR, Peoplesoft and Rational, just to name a few.
  • Rick Scott (unregistered) in reply to Jeff S
    UncleMidriff:
    Anonymous:

    fuzzbucket:
    im no SQL guru but for optional parameter selects for searches i always use

    "SELECT * FROM Table WHERE Col1 LIKE '%" + parm1 + "%' AND Col2 LIKE '%" + parm2 + "%'" etc....

    makes it one statement...is there something wrong with this? Is this a serious performance hit or does it not allow use of indexes or something? please let me know..

    I am also fairly inexperienced in SQL, but whenever I've had a situation where I needed a dynamic query with optional parameters I also use something similar to LIKE "%" & param1 & "%".  Does running a query like this come with any sort of performance hit over doing it some other way?



    I'm fairly inexperienced as well, but from what I have read, using LIKE can potentially slow things down, especially if the search pattern starts with a "%", as in "SELECT * FROM Table WHERE Column LIKE '%F%'".  http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that to be "SELECT * FROM Table WHERE Column LIKE 'F%'" then "the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server."  I don't know if the same rules apply to other databases.


    On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs.  Am I missing something that makes the examples you all are talking about ok?



    Forget performance, you've got a bigger problem:


    What if I put in my search criteria as "a'; drop table xyz; --"?

  • (cs) in reply to Rick Scott
    Anonymous:


    Forget performance, you've got a bigger problem:

    What if I put in my search criteria as "a'; drop table xyz; --"?



    In most cases, it simply doesn't work. At least not if you use dynamic SQL inside PL/SQL in Oracle, since these procedures expect exactly one DML statement.
    That aside, it's bad enough if the program crashes when you try to search for "John O'Connor".

  • (cs) in reply to BogusDude
    Anonymous:
    You are joking, right. This completes in at least O(N * M) time where M is the number of criteria, where it COULD have completed in O(N) time with a normal where clause.

    Yes, but M is a constant having nothing to do with the number of rows.  Since constants don't matter, this is still O(N).

  • Fregas (unregistered) in reply to brian.j.parker
    Anonymous:

    I don't know if anyone is "promoting" the absence of a middle tier, just pointing out that there is a cost to implementing a more complex system.  For a smaller system, the efficiencies of a three-tier system might not make up for the cost of an expert programmer's hours.  Some of us are implementing systems alone, or as part of a very small team.  That doesn't mean that the database is the best place to generate a random password; but when choosing between two tiers, it might be the best place to send an email.

    Sometimes the database gets replaced, but in practice I believe that happens less often than replacing a client program.  Often I think you'll see several database clients.  One makes the best predictions one can based on what one knows.

    Implementing anything object-oriented does indeed suck in T-SQL, or really, in any relational database.  I've seen more than one person try to use a system (with tables like "ObjectTypes", "ObjectAttributes", and "Objects") to try to fake it.  There are object oriented databases (hit Google) and I'm curious whether any will gain significant market share.

    .NET in Yukon SQL Server 2005 will open the door to all kinds of great WTF's, I'm sure.  Things that should be done in T-SQL gratuitously being written in .NET, code that should NOT be on the database layer being done there... but the inclusion of an XML column type will be even worse, I bet.  How many tables-inside-a-column will we see?

    I agree with you for the most part if it's a very small project.  However, most of the projects I work on are large or become large after a very short time.  So I almost always make 3 logical layers to my app, even if it's relatively small to start with.  I try to keep business logic, other than RI, keys, cascade deletes and such, out of the database.  On most new projects, I'm building the database from scratch and there are few if any other programs accessing it.

    The object table system (objectTypes, objectAttributes) sounds truly horrific.  Most of my 3-layered solutions have a class which matches a table in the database, more or less, with the fields in the table matching the primitive fields in the class.  Not everything matches up that neatly, and it's a huge amount of work, even with CodeSmith, so I've been scouring the web looking at several OOP databases and O/R mappers.  I wish OOP databases would become more mainstream but I don't really know if they'd work.

    I'm sure Yukon WILL have some truly horrific WTFs.  When I heard they were going to have the .NET runtime in it, I was initially dismayed because I foresaw developers using C# or VB to do all sorts of things that SQL is better suited for.  Slowly I've become more comfortable with the idea, and I think it's one step closer to SQL Server being an Object-Relational DBMS.  In the past, I didn't realize that the primary reason I (and other developers) didn't like putting a lot of business logic in the database was the limitations of T-SQL.  If you take away that limitation, by allowing .NET languages and/or true objects in the database, then there's less reason to have a middle tier, IMHO.  Also, in most cases, you want business logic to apply to the data regardless of what client program is accessing it.  With a middle tier, you have to copy DLLs around to multiple client programs, or use remoting, or (uggg!!!) copy & paste code so the same logic applies in all places.  If all this was in the database, you wouldn't have to jump through these hoops.  Objects are a great way to encapsulate business logic, and to a lesser extent, data.  So it would make sense that if our databases become more robust, more OOP, then the middle tier could go ahead and shrink.

    But I digress.  The way things are now, it's an absolute ass whip to try to debug email-generating code, text parsing, complicated formulas and other business tasks when they are written in T-SQL.  I'm actually very comfortable with T-SQL but it just isn't very well suited to that kind of work.  It was created to manage sets of data.  My boss at my last job had this huge, complicated stored procedure that looped over a cursor, appended HTML together and sent out an email.  It would have been so simple to do the same thing in VB.NET but everyone seemed to love putting that crap in stored procedures.  He was also more of a Microsoft Access developer than a .NET/SQL Server developer, so he had little concept of a 3-layered/tiered architecture.  When I got here, a contractor/project manager our company uses tried to get me to do the same thing.  I told him no way.

     

  • (cs) in reply to Rick Scott
    Anonymous:
    UncleMidriff:
    Anonymous:

    fuzzbucket:
    im no SQL guru but for optional parameter selects for searches i always use

    "SELECT * FROM Table WHERE Col1 LIKE '%" + parm1 + "%' AND Col2 LIKE '%" + parm2 + "%'" etc....

    makes it one statement...is there something wrong with this? Is this a serious performance hit or does it not allow use of indexes or something? please let me know..

    I am also fairly inexperienced in SQL, but whenever I've had a situation where I needed a dynamic query with optional parameters I also use something similar to LIKE "%" & param1 & "%".  Does running a query like this come with any sort of performance hit over doing it some other way?



    I'm fairly inexperienced as well, but from what I have read, using LIKE can potentially slow things down, especially if the search pattern starts with a "%", as in "SELECT * FROM Table WHERE Column LIKE '%F%'".  http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that to be "SELECT * FROM Table WHERE Column LIKE 'F%'" then "the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server."  I don't know if the same rules apply to other databases.


    On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs.  Am I missing something that makes the examples you all are talking about ok?



    Forget performance, you've got a bigger problem:


    What if I put in my search criteria as "a'; drop table xyz; --"?



    Ok, here's where I'm a little fuzzy on how SQL injection attacks work.  I'm going to give two scenarios, one where that SQL injection attack will work, and one where it will not.  If I'm wrong about either, please let me know:

    1.  In this situation, I take your input, "a'; drop table xyz; --" from, say, a web form, and set a string named param1 equal to it.  Then I set a string named sql equal to "SELECT * FROM Table WHERE Blah LIKE '%" & param1  & "%'", so sql becomes "SELECT * FROM Table WHERE Blah LIKE '%a'; drop table xyz; -- %'".  Then I execute sql and rue the day.

    2.  In this situation, I take your input, "a'; drop table xyz; --" from, say, a web form, and send it to a stored procedure named GetEverythingByBlah.  The meat of this sp is "SELECT * FROM Table WHERE Blah LIKE '%' + @param1 + '%'".  When the stored procedure is run it selects everything from Table where the value of column Blah is like "a'; drop table xyz; --".  So, if for some weird reason there happened to be a row in Table where column Blah had a value of "rgwrgwerga'; drop table xyz; --aeherghaseth", then that row would be returned.  Otherwise, no rows would be returned.  Regardless, the SQL injection attack fails.

    So, am I right?
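That matches my understanding; here is a quick check of both scenarios using Python's sqlite3 (standing in for any parameterized client API; the payload and table name are from the posts above):

```python
import sqlite3

payload = "a'; drop table xyz; --"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (blah TEXT)")
conn.execute("INSERT INTO xyz VALUES ('harmless')")

# Scenario 1: splice the input into the SQL string. The payload's quote
# terminates the literal early and smuggles in a second statement.
try:
    conn.execute("SELECT * FROM xyz WHERE blah LIKE '%" + payload + "%'")
    concat_blew_up = False
except (sqlite3.Error, sqlite3.Warning):
    concat_blew_up = True  # this driver refuses the multi-statement string

# Scenario 2: bind the input as a parameter. It is matched purely as data.
rows = conn.execute("SELECT * FROM xyz WHERE blah LIKE '%' || ? || '%'",
                    (payload,)).fetchall()
table_survived = conn.execute(
    "SELECT name FROM sqlite_master WHERE name = 'xyz'").fetchone() is not None
```

So yes: in scenario 2 the attack degrades to a harmless (non-matching) search term. In scenario 1 you get lucky here only because this particular driver rejects multi-statement strings; an API that allows batches would have dropped the table.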

  • Brian Schkerke (unregistered) in reply to anonymous

    All:  When comparing to null you cannot use =, which is what throws a lot of folks off initially.  SELECT @varExample = null ... IF @varExample = null fails.  You have to use IS to test against null.  IF @varExample IS NULL.  The function ISNULL(@varExample, valueToUseWhenNull) takes a variable or statement and if the variable or statement resolves to a null value, substitutes the valueToUseWhenNull.  (I think this is a poorly named SQL function as I initially expected it to return a boolean value as to whether something is null or not when I first started working with SQL Server.)  This test against null for equality is the first stumbling block for many.
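The same three-valued logic is easy to poke at in any engine; for example in sqlite3, whose `ifnull` is the analogue of T-SQL's `ISNULL`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# '=' against NULL yields unknown (NULL), never true...
eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
# ...so you test with IS NULL instead,
is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]
# and ifnull(x, y) substitutes y when x is NULL.
substituted = conn.execute("SELECT ifnull(NULL, 'fallback')").fetchone()[0]
```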

    The single SQL statement, without using dynamic SQL subject to SQL injection attacks, that I would use would be along the lines of:  SELECT Column1, Column2, Column3, Column4 FROM Table WHERE (Column1 LIKE @Column1Parameter OR @Column1Parameter IS NULL) AND (Column2 LIKE @Column2Parameter OR @Column2Parameter IS NULL) AND (Column3 LIKE @Column3Parameter OR @Column3Parameter IS NULL).  You can expand this at will.  In the stored procedure or query definition simply specify default values of null for all the parameters you might be searching on.

    Some folks choose to rewrite that using ISNULL along the lines of:  SELECT Column1, Column2, Column3, Column4 FROM Table WHERE Column1 LIKE ISNULL(@Column1Parameter, Column1) AND Column2 LIKE ISNULL(@Column2Parameter, Column2) AND Column3 LIKE ISNULL(@Column3Parameter, Column3)

    The reason the above is more difficult for many to grasp is the set nature of SQL Server.  Each statement is evaluated for each row of the table, but the fact that the references to Column1, Column2, and Column3 in the above statement are translated to the values of the row under consideration escapes many.

    I prefer the first statement because it seems clearer to me what the logic is, and I prefer to keep SQL functions out of a query if I can.  (Why?  Gut instinct, that I should probably check with the execution plans that SQL Server produces.) 

  • Grant Partridge (unregistered)

    hmmm.. this one's got a WTF in the text provided by the submitter...

    "... all the while ignoring the WHILE clause." (???)

  • OneFactor (unregistered) in reply to ammoQ
    ammoQ:
    Anonymous:

    1. isnull in the where clause: where isnull(@city,city) = city AND isnull(@customer, customer) = customer AND...
    2. OR in the where clause: where (@city is null OR @city = city) AND (@customer is null or @customer = customer) AND...
    3. using if statements to break it out into 2^n cases, where n = number of optional parameters.



    I would also prefer 2 over 1 since I have a feeling that there is a better chance that the query optimizer can optimize the statement based on the constant values. Anyway, as long as you don't have indexes on the searchable columns, all versions should perform equally fast (that means: full table scan). If you have e.g. 2 of 6 columns indexed, you could make a mix of 3. and 2. to get the best possible performance (3. for the indexed columns, 2. for the rest). Dynamic SQL is obviously best in terms of performance, but more difficult to maintain.

    I believe 2 performs better than 1 in the presence of indexes because of the possibility of applying short-circuit boolean logic. The indexes go all funny in 1 because the isnull function gets passed an argument which changes from row to row. I seem to recall that indexes work best when functions depend only on passed in parameters and work worst when functions depend on both passed in parameters and row values and that "functional indexes" can help out with functions which depend only on row values.

    Thanks to all for reminding me of solution 4 which is to build the SQL dynamically (either in middle tier or in sproc). Does it actually perform better than clever where clauses even though the dynamic sql gets compiled each request?

  • (cs) in reply to UncleMidriff
    UncleMidriff:
    Anonymous:
    UncleMidriff:
    Anonymous:

    fuzzbucket:
    im no SQL guru but for optional parameter selects for searches i always use

    "SELECT * FROM Table WHERE Col1 LIKE '%" + parm1 + "%' AND Col2 LIKE '%" + parm2 + "%'" etc....

    makes it one statement...is there something wrong with this? Is this a serious performance hit or does it not allow use of indexes or something? please let me know..

    I am also fairly inexperienced in SQL, but whenever I've had a situation where I needed a dynamic query with optional parameters I also use something similar to LIKE "%" & param1 & "%".  Does running a query like this come with any sort of performance hit over doing it some other way?



    I'm fairly inexperienced as well, but from what I have read, using LIKE can potentially slow things down, especially if the pattern begins with a "%", as in "SELECT * FROM Table WHERE Column LIKE '%F%'".  http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that to be "SELECT * FROM Table WHERE Column LIKE 'F%'" then "the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server."  I don't know if the same rules apply to other databases.


    On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs.  Am I missing something that makes the examples you all are talking about ok?



    Forget performance, you've got a bigger problem:


    What if I put in my search criteria as "a'; drop table xyz; --"?



    Ok, here's where I'm a little fuzzy on how SQL injection attacks work.  I'm going to give two scenarios, one where that SQL injection attack will work, and one where it will not.  If I'm wrong about either, please let me know:

    1.  In this situation, I take your input, "a'; drop table xyz; --" from, say, a web form, and set a string named param1 equal to it.  Then I set a string named sql equal to "SELECT * FROM Table WHERE Blah LIKE '%" & param1  & "%'", so sql becomes "SELECT * FROM Table WHERE Blah LIKE '%a'; drop table xyz; -- %'".  Then I execute sql and rue the day.

    2.  In this situation, I take your input, "a'; drop table xyz; --" from, say, a web form, and send it to a stored procedure named GetEverythingByBlah.  The meat of this sp is "SELECT * FROM Table WHERE Blah LIKE '%' + @param1 + '%'".  When the stored procedure is run it selects everything from Table where the value of column Blah is like "a'; drop table xyz; --".  So, if for some weird reason there happened to be a row in Table where column Blah had a value of "rgwrgwerga'; drop table xyz; --aeherghaseth", then that row would be returned.  Otherwise, no rows would be returned.  Regardless, the SQL injection attack fails.

    So, am I right?


    From what I know, you have it correct.  The worst thing you can do is concatenate the values you have received from the user into the query string you are about to execute.

    This is where I can get hazy, due to not knowing all of the specifics on SQL Server.  In general the db user that an application is using should only have read/write permissions (unless you're a coding god and just use sa).  So, just having those permissions, can this db user actually drop a table?  I'm not necessarily saying that a knowledgeable person couldn't do damage with SQL Injection with a limited db application account, but table dropping is generally out of the question, right?
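    The two scenarios above can be demonstrated in a few lines with Python's sqlite3 module standing in for the database; the Accounts table is hypothetical.  Note that sqlite3 refuses multi-statement strings, so the classic "; drop table" payload is shown here in its single-statement "OR '1'='1'" form, which illustrates the same point:

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Accounts (Username TEXT, Password TEXT)")
    conn.execute("INSERT INTO Accounts VALUES ('admin', 'secret')")

    malicious = "nobody' OR '1'='1"

    # Scenario 1: string concatenation.  The attacker's quote terminates the
    # literal and the OR '1'='1' makes the predicate true for every row.
    unsafe = "SELECT * FROM Accounts WHERE Username = '" + malicious + "'"
    print(conn.execute(unsafe).fetchall())   # leaks the admin row

    # Scenario 2: a bound parameter.  The whole input is treated as a value,
    # never re-parsed as SQL, so no row matches.
    safe = "SELECT * FROM Accounts WHERE Username = ?"
    print(conn.execute(safe, (malicious,)).fetchall())   # []
    ```

    This is the whole difference: concatenation lets user input change the statement's structure; a parameter can only ever be a value.
    
    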

  • Eric Thompson (unregistered) in reply to OneFactor

    If you don't need to search on the optional parameters being null and you want an AND search, just set the default value for those parameters to null and have something like the following where clause:
    where
      (firstField = case when @firstParam is null then firstField else @firstParam end)
      and
      (secondField = case when @secondParam is null then secondField else @secondParam end)
    ...

  • Eric Thompson (unregistered) in reply to Eric Thompson

    Dang, can make that simpler. Change every comparison to a coalesce: (firstField = coalesce(@firstParam, firstField))
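    The coalesce version of the optional-filter clause can be sketched with sqlite3 (which also has COALESCE); the table and field names are hypothetical:

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (firstField TEXT, secondField TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?)",
                     [("a", "x"), ("a", "y"), ("b", "y")])

    def search(first=None, second=None):
        # COALESCE(param, column) = column is trivially true when param is
        # NULL, so a NULL parameter disables that filter.
        sql = """SELECT * FROM T
                 WHERE firstField = COALESCE(:first, firstField)
                   AND secondField = COALESCE(:second, secondField)"""
        return conn.execute(sql, {"first": first, "second": second}).fetchall()
    ```

    One caveat worth knowing: if the column itself is NULL, COALESCE(param, column) = column is not true even with a NULL parameter, so rows with NULL in that column silently drop out — the "(param IS NULL OR ...)" form does not have that problem.
    
    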

  • (cs) in reply to Chris F
    Chris F:
    OneFactor:

    I think the problem stems from the "6 optional parameters for a search" requirement. I've seen various standard solutions proposed before:

    1. isnull in the where clause: where isnull(@city,city) = city AND isnull(@customer, customer) = customer AND...
    2. OR in the where clause: where (@city is null OR @city = city) AND (@customer is null or @customer = customer) AND...
    3. using if statements to break it out into 2^n cases, where n = number of optional parameters.

    #1 has poor performance but is highly maintainable and readable
    #2 has better performance and remains highly maintainable and readable
    #3 has the best performance but is the most difficult to read and maintain


    As an aside: Now I'm no DB expert, but I tend to default to #2 until the performance kills me.  I don't know what it is about such queries, but MS SQL Server seems unable to parse out the expression (null is null OR null = city).  When the performance becomes unbearable I simply switch over to dynamic SQL.  It sucks, it's hard to maintain, but I don't know any better way.

    There's got to be something I'm missing.  Perhaps it's the fact that @city can change from row to row and the query planner can't automatically recognize that certain variables are immutable for the duration of the query.  Is there a way to declare them immutable so that #2 is as performant as dynamic SQL?

    Maurits:
    Another option is to (shudder) DYNAMICALLY BUILD a sql string WITHIN the stored procedure, and then run it...

    DECLARE @sql nvarchar(4000)

    SELECT @sql = '...' +
        'WHERE 1 = 1 ' +
        CASE WHEN @ProductID IS NULL THEN ''
             WHEN @ProductID = '' THEN ''
             ELSE 'AND ProductID = ''' + Replace(@ProductID, '''', '''''') + ''' '
        END +
        CASE WHEN @ProductName IS NULL THEN ''
             WHEN @ProductName = '' THEN ''
             ELSE 'AND ProductName = ''' + Replace(@ProductName, '''', '''''') + ''' '
        END

    ew... slow and hard to maintain.


    In my experience, the dynamic SQL method by far the most performant for non-trivial search methods.  What would you do?




    Actually, one way around the performance of the (@var is null OR field = @var) is this:  (@var is null OR (@var is not null AND field = @var))

    This prevents the DB from doing an exhaustive search as a result of the right side of the OR expression, since the first test fails.  Do this and you'll almost never need to use dynamic SQL again.
  • Arik the Red (unregistered) in reply to loneprogrammer

    loneprogrammer:
    This isn't so bad.  At least it completes in O(N) time and memory usage.

    A real WTF would have to use at least two tables to be O(N*N) in either time or memory.  Possibly in both if you really bugger it up.

    Funny thing about databases, memory usage means almost nothing. Database optimizations are all about reducing disk read/writes. I'll take O(n*n*n) in main memory over O(3n) in disk reads (and writes!) any day.

  • Matt B (unregistered) in reply to Arik the Red

    I usually use column = COALESCE(@column, column) for such situations. It tends to perform better in my experience.

    Besides, coalesce is a cool word.

  • dhromed (unregistered) in reply to Rick Scott
    Anonymous:


    Forget performance, you've got a bigger problem:


    What if I put in my search criteria as "a'; drop table xyz; --"?



    Then you can test whether there's an idiot sysadmin at work who allows the DB Users used by scripts to have access to everything.

  • Rick Scott (unregistered) in reply to dhromed
    Anonymous:
    Anonymous:


    Forget performance, you've got a bigger problem:


    What if I put in my search criteria as "a'; drop table xyz; --"?



    Then you can test whether there's an idiot sysadmin at work who allows the DB Users used by scripts to have access to everything.


    The actual SQL doesn't matter. Even if the account running it is not sa, I could just put in " a' or 1=1" or something similar.

  • (cs) in reply to Arik the Red
    Anonymous:

    Funny thing about databases, memory usage means almost nothing. Database optimizations are all about reducing disk read/writes. I'll take O(n*n*n) in main memory over O(3n) in disk reads (and writes!) any day.



    O(3n) is the same as O(n). There is definitely a value for n that makes you regret your decision (e.g. when the swapping starts)
  • Brian Schkerke (unregistered) in reply to Ytram
    Ytram:

    From what I know, you have it correct.  The worst thing you can do is concatenate the values you have received from the user into the query string you are about to execute.

    This is where I can get hazy, due to not knowing all of the specifics on SQL Server.  In general the db user that an application is using should only have read/write permissions(unless you're a coding god and just use sa).  So, just having those permissions, can this db user actually drop a table?  I'm not necessarily saying that a knowledgable person couldn't do damage with SQL Injection with a limited db application account, but table dropping is generally out of the question, right?


    The stored procedure will execute with the permissions granted to the stored procedure owner.  So if dbo created the stored procedure then dbo is who it executes under.  Stored procedures are a great way to minimize security risks by restricting access to tables and providing access through views/stored procedures, but if you don't take care it can blow up in your face.


  • (cs) in reply to Brian Schkerke
    Anonymous:
    Ytram:

    From what I know, you have it correct.  The worst thing you can do is concatenate the values you have received from the user into the query string you are about to execute.

    This is where I can get hazy, due to not knowing all of the specifics on SQL Server.  In general the db user that an application is using should only have read/write permissions(unless you're a coding god and just use sa).  So, just having those permissions, can this db user actually drop a table?  I'm not necessarily saying that a knowledgable person couldn't do damage with SQL Injection with a limited db application account, but table dropping is generally out of the question, right?


    The stored procedure will execute with the permissions granted to the stored procedure owner.  So if dbo created the stored procedure then dbo is who it executes under.  Stored procedures are a great way to minimize security risks by restricting access to tables and providing access through views/stored procedures, but if you don't take care it can blow up in your face.


    Ok, so an sp executes with the permissions of the user who created it.  Presumably, you wouldn't give JoeWebUser access to an sp that did anything horrible.  But let's say you give him access to an sp that creates a sql statement dynamically, like @DynamicSqlString = "SELECT * FROM Table WHERE Blah = '" + @param1 + "'", and then executes it, like Exec(@DynamicSqlString).  If JoeWebUser enters something SQL-Injection-irific for @param1, then you're probably screwed because the sp will execute with the full set of permissions held by the developer that created the sp.  However, if you were dynamically to create this same SQL statement with the same SQL-Injection-irific input for @param1 and then execute it directly from your application, then, since you'd be logged into the database as JoeWebUser with limited permissions, the SQL injection attack probably wouldn't work on account of JoeWebUser not being allowed to drop tables, delete rows, etc.

    Is that right?
  • (cs) in reply to UncleMidriff

    Yes, if you simply concatenate the parameter values onto your dynamic SQL string you're just as screwed as you were before you did anything. It should be noted that this isn't the recommended method of doing this though.

    Parameters are translated... although I'm not sure that's the proper term.  They are not treated as literal replacements.  The SQL Server engine ensures the value inside the parameter is suitable for use as a value by treating it differently than a text query.

    JoeUser enters
        ' OR 0 = 0 --
    into your username box.  You use parameters to encapsulate this value which is passed to SQL Server to a stored procedure there.

    Instead of using a dynamic query, you use a parameter:

    SELECT Username, Password FROM Accounts WHERE Username = @Param1 AND Password = @Param2

    SQL Server stuffs the value of @Param1 straight into the variables that are being compared.  It doesn't first do a text replacement of the values and then execute the query (that would be no better than a dynamic query).  What gets "executed" is effectively

    SELECT Username, Password FROM Accounts WHERE Username = ''' OR 0 = 0 --' AND Password = @Param2

    (Note:  That's not what is actually executed.  Can't stress that enough. :))

    If you had used dynamic SQL though and concatenated the string values together you would've had someone with access to your system.

    Parameters can be used outside of stored procedures.  They can be used with straight SQL queries as well.  Well, EXEC() can't take parameters but sp_executesql can.  See MSDN at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_sp_ea-ez_2h7w.asp. 

    Where a dynamic query is beneficial is the fact that placement and usage of parameters is limited.  You cannot, for instance, use a parameter as the column name or table name to select from.  You cannot use it as an indicator as to which columns to select from the table.  That requires dynamic SQL.

    Using parameters also allows execution plan reuse, whereas constants and dynamic SQL garner no such benefit.
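    The point above — that an identifier like a column or table name can never be a parameter — is usually handled by validating the identifier against a fixed whitelist before splicing it into the SQL text, while the value still travels as a parameter.  A sketch with sqlite3 and hypothetical names:

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Products (ProductID INTEGER, ProductName TEXT)")
    conn.execute("INSERT INTO Products VALUES (1, 'widget')")

    ALLOWED_COLUMNS = {"ProductID", "ProductName"}   # hypothetical whitelist

    def search_by(column, value):
        # The column name cannot be bound, so reject anything not in the
        # whitelist before it ever reaches the SQL text; the value itself
        # is still passed as a parameter.
        if column not in ALLOWED_COLUMNS:
            raise ValueError("unknown column: " + column)
        sql = "SELECT * FROM Products WHERE {} = ?".format(column)
        return conn.execute(sql, (value,)).fetchall()
    ```

    This keeps the dynamic part of the statement confined to a small, known set of strings, so the injection surface stays closed even though the SQL text varies.
    
    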

  • (cs) in reply to ammoQ

    ammoQ:
    UncleMidriff:

    I'm fairly inexperienced as well, but from I have read, using LIKE can potentially slow things down, especially if the first character is preceeded by a"%" as in, "SELECT * FROM Table WHERE Column LIKE '%F%'".  http://www.sql-server-performance.com/transact_sql.asp says that if you can rewrite that to be "SELECT * FROM Table WHERE Column LIKE 'F%'" then, "<FONT face=Verdana size=2>the Query Optimizer has the ability to potentially use an index to perform the query, speeding performance and reducing the load on SQL Server.</FONT>"  I don't know if the same rules apply to other databases.

    On a somewhat unrelated note, I thought that dynamic SQL was unspeakably bad and to be avoided at all costs.  Am I missing something that makes the examples you all are talking about ok?


    The same rule is true for Oracle and probably for every database that has a "like" operator and a query optimizer.

    Dynamic SQL is not unspeakably bad, but should not be used without a good reason. One particular problem with dynamic SQL and Oracle (at least in version 8i, I'm not sure about the more current versions) is that whenever the literals in the query change, Oracle will not reuse a previously parsed statement; for example, "select * from x where y=1" and "select * from x where y=2" are two different statements and Oracle will parse the second one again although the only difference to the first one is the literal value. Using bind variables instead of literals is better for Oracle.

    In Oracle, you can use bind variables in dynamic SQL. You can cobble up a nice SQL string using ':1', ':2', etc, for the placeholders, then call it with the appropriate values. So theoretically it should be able to reuse the query. Also you avoid the whole problem with reserved characters in the literal string then.
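    The same "dynamic text, bound values" idea works from the middle tier in any language.  A sketch with sqlite3 (table and column names hypothetical): only the shape of the statement is built dynamically, every value goes through a placeholder, so the statement text is stable per filter combination and the engine can reuse the parsed plan:

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (City TEXT, Customer TEXT, Amount INT)")
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                     [("Vienna", "Alice", 10), ("Graz", "Bob", 20)])

    def search(**filters):
        # Build only the clauses for filters that were supplied.  The column
        # names here are assumed to come from code, not from user input.
        clauses, params = [], []
        for column, value in filters.items():
            clauses.append("{} = ?".format(column))
            params.append(value)
        sql = "SELECT * FROM Orders"
        if clauses:
            sql += " WHERE " + " AND ".join(clauses)
        return conn.execute(sql, params).fetchall()
    ```

    With n optional filters this yields at most 2^n distinct statement texts, each fully parameterized — the performance of dynamic SQL without the literal-value reparsing problem described above.
    
    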

    Unfortunately, on every other level, Oracle makes baby Jesus cry.

    I haven't checked if this is possible with SQL Server though - I must investigate...

     

  • (cs)

    I write Oracle batch code like that all the time.  Well, not quite, since I don't like to update in place.

    More like this:

    create table first_pass as
    select vars_i_care_about where
    whatever ...;

    Then I index what I know I'll need.  Then I might join another table, do another create, etc.

    This is typically for marketing data warehousing work, with table sizes in the 10s to 100s of millions of rows.

    I used to construct very complex SQL queries — extracts, joins, etc. — but they were next to impossible to truly troubleshoot and prove.  And the total process could run for hours, which could fail and need to be redone.  If I kept it piecemeal, I could fix and restart at a known point.

    Cutting the work into these kinds of chunks let me control the size and time of each step, and made sure that a change in the optimizer didn't burn me a year from now, since my joins were pretty simple.

    Also, I know people who are managers who spent years in the Xywrite editor doing this type of multiple-pass transform of data, massaging it until it was what they wanted.  A real program would do it in one pass, but they only had a crappy macro language to work with.  Might be based on that.
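    The staged, checkpointed style described in this post can be sketched with sqlite3 standing in for Oracle; the table names and data are hypothetical, the point is the materialize-index-continue rhythm:

    ```python
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_data (id INTEGER, region TEXT, amount INT)")
    conn.executemany("INSERT INTO raw_data VALUES (?, ?, ?)",
                     [(1, "east", 5), (2, "west", 7), (3, "east", 9)])

    # Pass 1: materialize only the rows/columns of interest.
    conn.execute("""CREATE TABLE first_pass AS
                    SELECT region, amount FROM raw_data WHERE amount > 5""")

    # Pass 2: index the intermediate table for the next join/aggregate.
    conn.execute("CREATE INDEX idx_fp_region ON first_pass (region)")

    # Pass 3: aggregate from the checkpointed table; if this step fails it
    # can be rerun without repeating pass 1.
    rows = conn.execute("""SELECT region, SUM(amount) FROM first_pass
                           GROUP BY region ORDER BY region""").fetchall()
    print(rows)
    ```

    Each pass leaves a durable table behind, so a failure midway costs only the current step, not the whole pipeline — the restartability the post is describing.
    
    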

  • Bob (unregistered) in reply to Chris F

    There's a good article about dynamic search conditions here:

    http://www.sommarskog.se/index.html

Leave a comment on “SELECTing The Hard Way”
