- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
I must confess I had this sort of AND counting in my code somewhere back in 1998, when I was writing my first CGI scripts in Perl. Though even back then I knew how to use placeholders in SQL statements...
captcha: burned. Yeah, I know...
Admin
I'd much rather be censured than censored, considering one of my posts was deleted.
Admin
Does
even work?I once spent forever debugging a piece of code that used that sort of construction (not my code, something imported from elsewhere, thank you very much). The
string was getting corrupted. would probably make more sense.And
should certainly be , in any event.Admin
And people wonder why SQL injections are so common....
Real C programmers don't use stdlib! ;) Kidding, of course, but implode is easy to write.
I bet this code sucks, but it's a shot:
char *implode(char *separator, char **quoted_statements) { if (!separator || !quoted_statements) return NULL;
}
.... hmmm, I named variables as if it were specific to this WTF but that should be a general implode... well, I didn't look at php.net for specifics but it should be close.
captcha: poindexter! and how...
Admin
... and of course that check for realloc returning NULL is completely wrong, it should store the original, call realloc and check if the original pointer == the new pointer or not.
Admin
and if you're wondering why I used i instead of just statement++ ... no caffeine yet today.
captcha: kungfu! wish I had brought my C-fu instead... ;)
Admin
I have to speak up in support of StressBomb's approach here.
There's nothing about the base problem that would prevent one from writing very clean, maintainable code that outputs the minimal, readable sql without generating scat like "where 1=1 [[and <clause>] ...]. StressBomb's implementation is not necessarily completely correct or ideal, but that is dodging his/her point - the general approach is better, one should prefer to write a readable hack over an equally readable but more correct solution.
The obvious better approach as others have pointed out would be to use a join function if available; or steal or write one if there isn't one handy. Though for something as easy to write as a join function, I would probably just write it myself rather than go to the trouble of copying someone else's - most of the time spent is in writing and running the unit tests anyway.
That said, writing any serious web app in C is definitely a WTF all by itself.
Admin
Re: "Once found, that string is searched and a pointer to a copy of the value associated with the key is returned. This pointer is never freed." What? Why? Either he used strstr and you missed it or he should have used strstr.
Admin
Argh. One should NOT prefer a readable hack over an equally readable but more correct solution is what I had meant to say. Obviously.
Admin
We'll file that under "should have used strstr." The thing leaked memory, was wide open for injection, had buffer overruns all over the place. etc. The corporate culture was much the same; this thing was one of the WTFs that was easy to share without much background information - but is also indicative of just how many layers deep in excrement it was buried.
And the login validation was that a cookie of a certain name existed - not that it had any valid content corresponding to an authentication token, but just that you had one.
Admin
Am I the only one thinking of prepared statements here?????
In postgres, this is what I would do: PREPARE the_query(data1, data2, ..., dataN) SELECT * from foo where (col1 = $1 OR $1 IS NULL) AND (col2 = $2 OR $2 IS NULL) ...;
Then, whenever I want to search, I can do: EXECUTE the_query(arg1, arg2, ..., argN);
Substituting data for arg1 etc and tossing in NULLs where I dod not have any data to input. The prepare statement has the added benifit of only having to be optimized once and then can be used multiple time. SQL injections are also now impossible b/c postgres knows that I am only providing data. I even get free type checking and there is no longer a need to keep recreating the query and have it be re-optimized or risking memory leeks on ever run.
Admin
I like your solution. One small change I think I would make ... char * delim = " WHERE "; For each field: sprintf(..., delim, ...); delim = " AND ";
And obviously leave " WHERE " out of the static part. That way if you don't have any conditions you don't wind up with: "SELECT * FROM foo WHERE"
Admin
I wouldn't let anyone perform appendicitis on me. How would one even go about doing that? Inject my appendix with some kind of blockage? Should such a strange event happen, I'll be sure to leave the procedure of performing an appendectomy to professionals who know what they're doing :P
Admin
I have actually used an "and counter" before, but I have an excuse--our target (embedded device) had no C compiler, just an assembler. So to do complex statements like: if (a && b<20 && c >30 && c < 40) { do something }
I'd have to do (in psuedo-C, since the assembly's pretty opaque):
conditions = 4; if (a) conditions--; if (b<20) conditions--; if (c>30) conditions--; if (c<40) conditions--; if conditions==0, do something
Admin
[quote] Yes, it's possible to create a where clause builder but as the previous poster points out there are a few gotchas with it.
What's bad about your example is the maintenance burden placed on others to have to read and understand your parsing routines, then decide if it's important to the work they're doing. [quote]
Honestly I don't understand where the "maintenance hurdle" is. In the end when I build the parts of the query into the final query, it's as checking if I have something in the whereStatement I've built (in case I could have nothing) and if so adding "WHERE "+whereStatement, or else doing nothing.
I suppose assembling strings with no context all over your code is far more readable and maintainable for some people. I don't want to work with those people.
Admin
It's an artifact of code re-use, which I am sure you would support. Let me put it into perspective:
"Select Field from Table Where 1=1 " + BuildWhereClause(ParmCollection)
Function BuildWhereClause(parmCollection) as string For each Item in parmCollection parmstring = parmstring + " and " + Item Next Return parmstring end function
Looking at the psuedo-code above you can see that there is absolutly no additional checks needed anywhere and the statement just works.
Of course to truly put it in perspective, forget the dynamic query and use stored procs and prepared statements, problem solved.
Admin
I don't see why we need the 1=1 in there. Is it there just so we could keep arguing? What prevents buildWhereClause being marginally smarter than prepending the "AND"?
Maybe we're working in totally different conditions and producing totally different queries. But in what I do, I may have to build a WHERE expression, and later put it in parens "(..)" and OR it with another expression, ending with:
(X and Y and Z) or (A and B and C)
But your "reusable" code prepends "and" as if it's simply impossible not to want to "AND" with something precending the expression, how do I solve this? Oh, just like you solved it in the first example, we add more dummy statements! Here we go:
WHERE 1=1 AND (1=1 and X and Y and Z) or (1=1 and X and Y and Z)
And I'm not even talking YET about the cases where you have to build the SELECT ... FROM TABLE part dynamically as well. But relax people, as we have solution for this too:
SELECT 1 as IgnoreThis, ..., ... FROM ...
Admin
Your function doesn't do what it says it does. I would suggest (although I'm not a fan of the "1=1" hack):
Admin
Actually I agree with you, remember my last statement...
Of course to truly put it in perspective, forget the dynamic query and use stored procs and prepared statements, problem solved.
Admin
Actually it does exactly the same thing, but without the additional check for length, but I will always have a "Where 1=1" clause on my select. The point here was to make the function as dumb as possible yet still do the work necessary.
Of course now we are being pedantic because we all know that even this solution leads to major screw ups with sql injection and what not. As i said before, make the point moot and use prepared statements with stored procedures.
Admin
if you don't know the number or type of columns you'll check against your WHERE clause, you'll need to prepare the statement every time, arriving at slower code (two roundtrips to the server), and exactly the same problem as before, just this time with question marks:
SELECT .. FROM .. WHERE 1=1 AND x=? AND y=? AND z=?
Not to mention you can't run any sort of query as a prepared statement just yet, only select few (pun not intended :) ).
If we'll talk injection, using the database API quoting/escaping function, combined with not changing the charset/collation at runtime via query makes injections impossible.
Admin
Nope. My function builds (and returns) the entire WHERE clause (as per its name).
The original one just turns a parameter collection into a string delimited by (and starting with) " and ".
Admin
.. and still, I only need to send parameter1 as:
'; DROP TABLE something; DROP DATABASE everything; COMMIT;
Admin
I would argue that using "WHERE 1 = 1" is trying to be clever. Cleverness typically reduces the amount of code but takes a bit more thinking for someone new to figure out what's going on. Also, from the the database perspective it's more work to figure if "select customer_name from customers where 1 = 1" is supposed to return all or it was a bug in the sql query generation.
Admin
If you're appending the AND statement (as opposed to prepend) then presumably you never end up with the above situation.
Instead you get "WHERE condition AND" or simply "WHERE AND", in which case you can simply strip off trailing ANDs or WHERE ANDs after you build your query.
Admin
Uhm, then it's the Necronomicon (http://en.wikipedia.org/Necronomicon) of computer programs?
Admin
I don't know, the following make me edgy:
I mean, what happens if the web page doesn't return ANY terms?
Personally, my favored solution involves recognizing that there are 2 ends to the problem ... the last term and the first term. I might have tried something like this (hope I get this syntax right):
Okay, maybe even this could be improved, but at least it avoids the counter of and.
Admin
Ooops ... this first thing should have been:
Instead of
Sorry, changed names partway through and didn't eradicate them all...
Admin
[quote user="Coyne"] if ( termword == whereword ) { //You get nothing, cheater! sprintf( stmt, "%s WHERE 1 = 0", stmt); } [quote]
Congratulations, you just topped the WTF in the original article, by:
forbidding the users to search for the word "WHERE" (in caps...), try in Google, you can search for "WHERE" hmmm.. funny how this doesn't get injected in their SQL.
adding a protection (same from 1.) that doesn't protect against absolutely anything.
running a query that by definition returns nothing (1=0). Oh wait.. maybe, like Alex likes to joke, there are parallel universes where 1 = 0 and thus this would make sense.
Admin
LOL! You think Google uses SQL, idiot..
Admin
.Net users shall use the following techniques:
Parameterised queries
AND
string[] terms = ...; string.Join(" and ", terms);
Admin
I'll let you further embarass yourself. Enlighten me, what exactly do they use?
Admin
Jesus.. seriously amazing the comments in here these days.
For those suggesting you write conditional branches into your query building code, you're making the same mistake as the person who originally wrote the code, unnecessary code logic.
Does that not speak to you? The equivalent of your "solution" is being posted on the daily wtf, this would indicate that your "solution" IS NOT A GOOD ONE.
The more "intelligence" you put in your code, the more vectors you're creating for yourself to stuff up, programmers are not infallible, hell probably 1/4 of all conditionals don't get written exactly right the first time.
The difference between your code, and the guys who just put "where 1=1" in every query, is that they KNOW their code will work, aside from stuffing up the string manip. there's nothing that can go wrong.
And readability is not even an issue, where 1=1 or where true is pretty damn obvious, and also, who are you trying to make the sql look pretty for? The dbms??
Admin
I'm pretty new to SQL, but this is how I do it in PHP:
($whereFilters is an array of qualification strings)
$where = ""; foreach ($whereFilters as $value) { $where .= (($where == "") ? " WHERE " : " AND ") . $value; }
$sql = "SELECT * FROM whatever" . $where;
Admin
"For those suggesting you write conditional branches into your query building code, you're making the same mistake as the person who originally wrote the code, unnecessary code logic."
So you really suggest we should build queries like these:
SELECT 1 as IgnoreThis, ..., ... FROM ... WHERE 1=1 AND (1=1 and X and Y and Z) or (1=1 and X and Y and Z)
This is what "unnecessary" code logic leads to in anything but the most simple query. I've not even began talking about joins.
"The more "intelligence" you put in your code, the more vectors you're creating for yourself to stuff up, programmers are not infallible, hell probably 1/4 of all conditionals don't get written exactly right the first time."
Seriously, if a programmer I work with comes crying that he can't understand this:
whereStatement = subExpr.length?'WHERE '+subExpr.join(' AND '):'';
... well, he's one fired programmer.
I've built such complex querying systems that are at the same time stable and easy to use, that I can't believe so many people believe it's above their head to properly build their damn simple query. And well.. I never thought what I'm doing is really "complex querying system", but the comments in this article really put things in perspective for me, about the kind of programming skills lots of the web developers have nowadays.
And it makes me very sad.
Admin
Deliberately misunderstanding my point doesn't make you right.
First, you used an OOP example, the wtf was written in C. Second, I never said that the conditional logic to build a query was hard to understand, I said it was an extra place for errors to appear. Third, you obviously have no idea about defensive programming. Fourth, that query you just wrote out, yes, that's how it should end up, the 1=1's will be optimised out by the dbms, and eliminate the need for conditional processing on the programmers part.
The simplest means, with the least code on your part, is nearly always the best.
Admin
"First, you used an OOP example, the wtf was written in C."
There we go, sorry for making it too hard for you to understand my point:
whereStatement = strlen(subExpr)?'WHERE '+joinArr(subExpr,' AND '):'';
"Third, you obviously have no idea about defensive programming."
This is not defensive programming. This is programming delusional paranoia. If you're afraid that a stupidly simple string join will make your program complex.. I'm truly amazed.
Admin
"Third, you obviously have no idea about defensive programming."
Actually I'd like to show you an example of what defensive programming you're suggesting here. Imagine that in a twist of fate it turns out you have to use OR-s and not AND-s... or even more interesting you may need to use a combination of both, not known in advance...
Your "defensive" implementation will produce this:
WHERE 1=1 OR ... OR ... OR ...
Oh.. shoot :P I suppose that assuming that your expression always starts with an "AND" is not exactly following the "assume nothing" idea behind defensive programming, is it...
Admin
Uh huh, and what if the query actually IS complex, with multiple places (possibly different functions, objects etc) where 'where' clauses can be added? I guess we could... search the query string for the substr 'where'? or... we could maintain the sql statement in different pieces until we are finished... or we could have a flag (and++ and--) that indicates whether we have a 'where' clause or not... or we could have a convention of adding 'where 1=1' to all of our statements and all clauses are added as 'and x'.
And what on earth is joinArr()? You're writing a function like that.. in C, manipulating strings, for no purpose. quick.... here it comes... SEGFAULT! Especially when your version of the function doesn't have bounds checking... clever
So I'll say it again, only program what you have to, for every extra x lines you're adding y bugs, as well as wasting time.
Admin
"Uh huh, and what if the query actually IS complex, with multiple places (possibly different functions, objects etc) where 'where' clauses can be added? I guess we could... search the query string for the substr 'where'?"
I can't believe that you just said this b.s., and want to seriously make it seem as if I'm the one overcomplicating the situation.
FYI, in my code, since every part of the query is dynamic, I usually collect valid expression which are valid on their own, means, I could wrap them in parens: (..) or join them with OR, AND or XOR, or whatever I see fit.
At any point the expressions are valid since this is the basic guiding principle I build them upon. I don't have extraneous operators or commands on left or right which "assume" to be added to the correct filler junk in the query.
In the end, I just add the expressions in the query skeleton:
query = 'SELECT '+selectExpr+' FROM '+tableName; if (whereExpr) query+= ' WHERE '+whereExpr; if (orderExpr) query+= ' ORDER BY '+orderExpr; if (limitExpr) query+= ' LIMIT '+limitExpr;
I've used this query build process as a basis for driver-independent abstraction for generic queries in a MVC framework. I have YET to ever have a query gone wrong by using it, if even because the methods will validate all parameters for me before the query is built and executed.
Admin
"Uh huh, and what if the query actually IS complex, with multiple places (possibly different functions, objects etc) where 'where' clauses can be added? I guess we could... search the query string for the substr 'where'?"
I can't believe that you just said this b.s., and want to seriously make it seem as if I'm the one overcomplicating the situation.
FYI, in my code, since every part of the query is dynamic, I usually collect valid expression which are valid on their own, means, I could wrap them in parens: (..) or join them with OR, AND or XOR, or whatever I see fit.
At any point the expressions are valid since this is the basic guiding principle I build them upon. I don't have extraneous operators or commands on left or right which "assume" to be added to the correct filler junk in the query.
In the end, I just add the expressions in the query skeleton:
query = 'SELECT '+selectExpr+' FROM '+tableName; if (whereExpr) query+= ' WHERE '+whereExpr; if (orderExpr) query+= ' ORDER BY '+orderExpr; if (limitExpr) query+= ' LIMIT '+limitExpr;
I've used this query build process as a basis for driver-independent abstraction for generic queries in a MVC framework. I have YET to ever have a query gone wrong by using it, if even because the methods will validate all parameters for me before the query is built and executed.
Admin
"And what on earth is joinArr()? You're writing a function like that.. in C, manipulating strings, for no purpose. quick.... here it comes... SEGFAULT! Especially when your version of the function doesn't have bounds checking... clever"
If you feel insecure in your coding and assume elementary string functions will immediately cause segfaults, you shouldn't transfer this quality on me.
I simply can't imagine how you'll react if you're given to write a language lexer and parser, or translator, which I do and use in my daily work. You'll give up before you've even started: I mean, imagine all the possible segfaults in a full-blown parser!
I actually am not afraid to code. Maybe you've picked the wrong profession.
Admin
No idea what to respond to your description of your own system... it's just so beside the point...
a list of the problems with your char* joinArr(char**,const char[]) function will have to do:
To make this safe for memory bounds checking its argument list would actually need to be (char**,int elems,const char[]) C doesn't allow you to concatenate strings together like that, in fact C doesn't have "strings". Your function can't be implemented without leaking memory all over the place.
Actually I'm thinking that this scenario would make a damn good one for testing job applicants, separates the 'programmers' from those who just know how to write code very effectively.
Admin
"a list of the problems with your char* joinArr(char**,const char[]) function will have to do: "
Dude, I didn't write this function. How about you stop lecturing me what I supposedly did wrong in it, and scroll up a bit and direct your comments to the guy who wrote it.
Admin
Canumbler, and FYI, I use C#, Java and PHP for web programming.
If you're looking for an unnecessary complication in this whole situation, it's using C for a project like this in the first place.
Admin
Good lord, never, ever, ever write in a lower level language, please god please.
Lexers and parsers, are written with high level tools, you don't have to worry about things like segfaults.
Naww, it's just my clients actually have uptime requirements. Seriously, it's not a dick measuring contest, and it's people with your attitude that give rise to sites like this. I promise you, all the people who get stuff submitted here aren't "afraid to code" either, and are supremely confident in their abilities. Look up some statistics on the amount of bug-free lines of code programmers write on average a day sometime.
Admin
"it's people with your attitude that give rise to sites like this."
It'd be a nice time to stop assuming things you've no clue about. It's becoming so lame.
"I promise you, all the people who get stuff submitted here aren't "afraid to code" either, and are supremely confident in their abilities."
Don't forget what we're discussing: you want me to say I don't have the confidence to build the most basic query of all properly. My dog could build this query properly.
Admin
I realise I'm speaking something blasphemous here, but... in this particular case WHO CARES if the strings are never freed? It's not wonderful, but it's not a WTF.
The OS cleans up all the process's allocated memory on exit anyway, and few platforms can reclaim freed memory from a running process (except by paging it out, in which case it actually doesn't matter if it's freed or just inactive). For short lived processes the memory footprint won't be particularly significantly affected by free()ing the strings, especially as there's a good chance the smaller free()d blocks wouldn't get reused due to memory fragmentation.
You might save a little in terms of the process' total resident footprint, but probably not actually all that much. And, hey, you sure can't get any double free() security problems/crashes.
Obviously it's much better to properly manage your memory, since (eg) the code might be converted to a long-running fastcgi/scgi server rather than a short lived CGI program, or might be re-used elsewhere. But for a CGI program it's really not actually that bad.
Admin
This is horrible, horrible stuff - no ifs and buts about it.
But it is not as bad as it seems:
(1) there is is only one code file and it is only 40 KB in size. 40 KB is not so big - in my experience that will typically equal between 50 and 100 A4 pages of printout. That is not a tome, at least. (2) (this is my assumption but I feel it is reasonable) there are no external reference to other code modules (except standard libraries) since everything is contained in one monolithic code file. I see this as a chance since one does not have to check tons of other code files (which would be problably equally horrible). (3) From the code snippets provided there appears to be a lot of cut-and-paste coding. Again I see this as a chence so functions can be coded in order to replace repetitive occurences of cut-and paste coding with the obvious improvements resulting thereof. I had to handle cases of cut-and-paste coding before and was able to cut the size of the code base by 30-50 percent of the original.
The biggest problem I see with this pile of shit is that most likely all variables are global. So my strategy for straightening out this crap would to refactor all cut-and-paste coding as functions and calls thereof (thereby lowering the absolute number of variable references in the code), test the revised code and then in the next step try get a grip on the use of global variables in order to minimize/eliminate their use by refactoring as limited scope variables as required and test again.
Just hope that you are not stuck in a developmestuction environment; otherwise a test environment would have to be built. I would flatly refuse to touch this crap without extensively testing every refactoring step along the way.
Admin
Bollocks - you are just confusing people with this kind of coding.