- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
well, SELECT COUNT(*) FROM [TableName] is notoriously unreliable....
Admin
Brilliant WTF!
But I can't believe any one would do somthing like this!
<font color="#0000ff">SELECT </font><font color="#ee82ee">Count</font>(*) isn't exactly advanced SQL!
Admin
Do you still have that Swiss Army knife from yesterday's post? I need to stab myself in the eyes. On second thought, never mind. This has got to be the highest WTF-per-line ratio that I will ever see in my life, so the damage is already done.
Extra WTF credit must go to this person for selecting ALL columns from the database. I would award a special "WTF with Gold Leaf Cluster" for using count += 1 instead of count++.
It's too bad people that program like this are able to get past the syntax errors so their code can actually run. It's a great argument for cryptic compiler syntax errors. To figure out the problem our WTFer would have had to go to a real programmer who would have either laughed his ass off or fallen dead right on the spot.
Admin
<FONT style="BACKGROUND-COLOR: #efefef">Add to the funny, the DataReader is not being closed when finished. Kudos!!</FONT>
Admin
Yes, COUNT(*) should have been used.
It's also worth pointing out that SqlDataReader provides no way of accessing the number of returned records.
And ++ vs. +=1, that's trivial.
Admin
Hey, at least they put the
Admin
I've seen this several times in the past with straight ASP/VBScript. When I questioned it, I was told that some of the ODBC drivers in use did not provide a count back to the dataset so this was the work around.
shudder...
Admin
select count is for weeinies
for life!Admin
<FONT size=3>[I've seen this several times in the past with straight ASP/VBScript. When I questioned it, I was told that some of the ODBC drivers in use did not provide a count back to the dataset so this was the work around.
shudder...]</FONT>
<FONT size=3>That shouldn't be the issue. As others have mentioned, a count of records in the table should happen through select count(*) on the database, not thru ADO or on the application server. recordset.Count is handy, or in .NET: DataTable.Rows.Count, but even if you cannot use those options, a seperate query to database for the number of rows is immensely more efficient than looping thru the records.</FONT>
<FONT size=3>LAME!</FONT>
Admin
Select count(smallest_column) is equal but faster than select count(*), no?
Admin
For those commenting on the trivial minor aspects of the code as written please remember that this is Alex's interpretations of three different samples all in differing languages. So quibbles about += vs ++ and closing the dbconnection really aren’t pertinent. The major WTF here is that:
a) somebody brought a full query to the client side to get the count.
b) most client side datasets will give you a row count automatically if you ask so why iterate through it again, since the component had to iterate through when it filled it.
c) the biggest WTF is that since Alex kept getting more and more of these sorts of wtfs it is an apparently common method of getting row counts.
Admin
Wouldn't that leave out rows where the value of smallest_column is null?
Admin
Blast, I finally register and when I go to post, I wasn't logged in.
Admin
At least optimizing the code will be easy. Then again, maybe not...
Admin
I am only guessing, but I expect a database could easily keep track of how many rows are in each table and quickly return that number when it sees count(*).
Admin
In SQL, Count(SomeColumn) is NOT equivalent to COUNT(*).
Also -- using the statistics is not a good way to get counts, because it depends on the stats being updated, and also it obviously will not work with filters or more complex SQL statements.
Remember, Alex's code is a cumulative example of several WTF's in which the technique itself is to be considered, not the specifics of the syntax he has written or a particular table or column name he has selected.
This WTF is way too common, unfortunately, so it doesn't even shock me that much I guess...
Admin
Alas, I, too, have witnessed this vulgarity.
Mr. Angry DBA
Admin
work for very small tables :P but what if a table has more than 32767 rows?!
Admin
But transactions and table versioning make it much, much harder for ACID databases, since there's no single row count for the entire table; each transaction has its own view of the table which needs various degrees of isolation from other transactions, so they usually end up needing to do a full table scan:
PITA, especially when you're not that bothered about a completely accurate count. InnoDB can make estimates, but they're effectively useless; on this particular table, a SHOW TABLE STATUS estimated rows count can be off by as much as 2 million in both directions, despite it using fixed length rows and not being modified between invocations. Bah.
Admin
Yes, using count(somefield) would be a worse WTF because it would yield incorrect record counts if somefield could be NULL. At least the code as-is gives a correct result. On the other hand, maybe an incorrect result would have led someone to find this mess sooner.
Admin
Re: "SELECT rows FROM sysindexes"
Agreed with the comment regarding statistics being updated correctly, and also the post I am replying to assumes a particular backend DB with that code - not all DBs have that table, etc.
Admin
In MS SQL Server, if you want a rough estimate of rows without fear of a lengthy query you can use the trick described here:
http://db.ittoolbox.com/documents/document.asp?i=2551
Admin
Ints are usually 4 bytes nowadays
Admin
VONE clueless developer who doesn't know SQL!
Muahahahaha!
TWO clueless developers who don't know SQL!
Muahahahaha!
THREE clueless developers who don't know SQL!
Admin
I once saw something much worse even than this. A quick summary of the action would be: -
Get first_table
Get second_table
foreach record1 in first_table
if record1 meets some_criterion then
foreach record2 in second_table
if record2 matches record1 in someway then
store some fields of record1 and record2 in a temporary_table1
endif
next
endif
next
This was then repeated twice more, once with temporary_table1 and third_table (results into temporary_table2), then again with temporary_table2 and fourth_table (results returned to caller). Oh, and it was all in PL/SQL.
And ran daily.
Needless to say, after a few user complaints as to why the output was taking 18 *HOURS* to generate I was asked to look at the SP and see if I could optimize it. I replaced the code with 3 straightforward select statements and the new routine ran in just over 9 minutes.
BLQ.
Admin
<font face="Georgia" size="2">I have acutally seen one example better than this. I inherited an shpoody ASP application about 3 years ago, that was full of WTF's like this. I wish I'd kept copies of it all becuase it made for inlightining reading ... bit I digrees.
You can probalby guess how he would retrieve the number....The developer on this porject had a thing about recordsets. He clearly mistrusted them, and would alays copy their intire contents into an array rather than acess them directly. Mad ... perhaps ... but there we a number of pages in the site where there were 'statistic' report summaries with links to a full break down. So for example there might be:
</font>
<font face="Georgia" size="2">Yep ... a full </font><font color="#9acd32" face="Georgia" size="2">"SELECT * from orders_table WHERE date_of_order = #Today#" <font color="#000000">would be placed into a recordset, thiswould be looped through to populate and array, and the eventualy figure of 250 would be gained by returning array.length
*shakes head in utter bermusement*
</font></font>
Admin
<FONT style="BACKGROUND-COLOR: #efefef">...unroll the loops...unroll the loops...un....</FONT>
Admin
That's funny .... if he was using ADO, and he liked arrays that much, too bad he didn't know about the GetRows() method of a recordset. It returns a 2D array oif all rows/columns in a recordset extremely quickly (no looping needed). Filling up an array with GetRows() and then working with the resulting array actually can be a significant performance improvement over working with the recordset.
Admin
<FONT style="BACKGROUND-COLOR: #efefef">The funniest of it all is people laughing their ass and then suggesting </FONT>
<FONT style="BACKGROUND-COLOR: #efefef">select count(*)</FONT>
<FONT style="BACKGROUND-COLOR: #efefef">You should use</FONT>
<FONT style="BACKGROUND-COLOR: #efefef">select count(1)</FONT>
Admin
Actually, what is funnier is that I am personally laughing my ass off at you for thinking COUNT(1) is faster than COUNT(*) .... you understand what COUNT(*) means, right? it is not like SELECT * ....
Admin
>Extra WTF credit must go to this person for selecting ALL columns from the database
Well, it depends on how many columns you have. If there're only 3 columns, using select * actually takes advantage of the MySQL query cache. be noticed that the query has to be EXACTLY the same to be cached.
Admin
Found this in some VB6 code today.
I think they heard of disconnected recordsets (see the comment in the code) but decided to write their own.
Function TrimBlanks(strSql As String) As ADODB.Recordset
On Error GoTo Error_Handler
Dim counter As Integer
Dim strString As String
Dim rs2 As New ADODB.Recordset
Dim blnCharFound As Boolean
Dim rs As ADODB.Recordset
Dim x As Long
Dim xx As Double
Dim xxx As String
Dim xxxx As Integer
Set rs2 = New ADODB.Recordset
Set rs = New ADODB.Recordset
'rs.CursorLocation = adUseClient
rs.LockType = adLockOptimistic
rs.Open strSql, cn
blnCharFound = False
With rs
For counter = 0 To .Fields.Count - 1
If .Fields.Item(counter).Type = adChar Then
rs2.Fields.Append .Fields.Item(counter).Name, adVarChar, _
.Fields.Item(counter).DefinedSize ', .Fields.Item(counter).Attributes
rs2.Fields.Item(counter).Attributes = .Fields.Item(counter).Attributes
rs2.Fields.Item(counter).NumericScale = .Fields.Item(counter).NumericScale
rs2.Fields.Item(counter).Precision = .Fields.Item(counter).Precision
blnCharFound = True
Else
rs2.Fields.Append .Fields.Item(counter).Name, .Fields.Item(counter).Type, _
.Fields.Item(counter).DefinedSize
rs2.Fields.Item(counter).Attributes = .Fields.Item(counter).Attributes
rs2.Fields.Item(counter).NumericScale = .Fields.Item(counter).NumericScale
rs2.Fields.Item(counter).Precision = .Fields.Item(counter).Precision
End If
Next
rs2.Open
If Not .EOF Then
.MoveFirst
Do While Not .EOF
DoEvents
rs2.AddNew
For counter = 0 To .Fields.Count - 1
If Not IsNull(.Fields.Item(counter)) _
And Not IsEmpty(.Fields.Item(counter)) Then
If .Fields.Item(counter).Type = adChar Then
rs2.Fields.Item(counter).Value = Trim(.Fields.Item(counter).Value)
Else
rs2.Fields.Item(counter).Value = .Fields.Item(counter).Value
End If
End If
Next
.MoveNext
Loop
End If
End With
' If blnCharFound Then
Set TrimBlanks = rs2.Clone
' Else
' Set TrimBlanks = rs.Clone
' End If
Set rs = Nothing
Set rs2 = Nothing
Exit Function
Error_Handler:
ShowError ("TrimBlanks")
End Function
Admin
They'd be equally fast except that any Null values in the "smallest_column" will not be counted so you could potentially end up with different counts.
Admin
I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.
Admin
<FONT size=1>I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.
</FONT>
<FONT size=2>So your saying that we can avoid scanning the whole table and then returning a row of one integer by scanning the whole table and returning a row of the size of each row for each row in the table all through the servers NIC. I can see how thats a lot more scalable, thanks for the tip ;)</FONT>
<FONT size=1></FONT>
Admin
I knew you wouldn't understand
Admin
Because you're wrong.
The time it takes to do the aggregation on the DBMS server will always been (worse-case, never real-world) equal to and best case, significantly faster than doing the aggregation outside of the database.
This is why the database does it -- the network overhead alone if the table is large could make the difference. Even ignoring that, the above still holds true.
Admin
Congratulations! [:D] the "WTF Comment of the Day!"
Yikes .... someone please tell me he forgot the <sarcasm> tags .... wow ....
Admin
Why don't they try XML?
Admin
"How about JavaScript?" :p
Admin
Uh, wait just one moment here. This code shouldn't even compile!
Hello! It's a pointer to the heap! Let's try cmd->Connection, kiddies. While we're at it, let's not even worry about cleaning up the heap afterwards. Using that old <font color="#0000ff">delete </font>doohickey was never a great idea anyway. And this being a static function, let's not worry about it being called multiple times, so we won't bother checking our vars are NULL, after all if we're not worrying whether our resources are properly released, it probably doesn't matter how they're allocated.
This is WTF on so many levels, it's scary. Surprised nobody has pointed that out yet, although a lot of people seem to be DBA's. I won't hold that against you, though.
Admin
C# doesn't have a reference-type and a pointer-type, just reference-types and value-types. Meaning it doesn't have a pointer indirection operator (->). It's garbaged-collected, there is no delete operator either.
And the meaning of static w.r.t to a function has no relation to the meaning of static w.r.t to a local variable. This is true even in C, so I don't know where you got your last point.
Admin
I'm a little uncertain, while I don't think there is really any difference between them in MS Sql Server, I am ignorant of the theoretical difference and of other implementations.
It isn't the slow processing that kills you, its the sudden stop when the pool runs out.
There was a whole school of people who thought that everything would faster if the used GetRows() in asp because using arrays was faster than recordsets.[^o)]
Admin
We have "WTF Comment #2" of the day! [:D]
Extra points for being condescending and having no idea what they are talking about . Congratulations![<:o)]
Admin
Wow... you imply you develop enterprise software, yet you don't see the flaw in your reasoning there. No wonder you posted anonymously.
Admin
And the golden globe WTF goes to the poster of that comment..
agreed that it might be good to let other computers perform some tasks..
but making the db server select all rows with all fields.
sending all of the rows over the network.
filling up the ram of the consumer with data that isnt going to be used.
and then let another computer count them.
THAT IS A TRUE WTF!
//Roger
Admin
in lack of edit functionallity:
filling up the ram of the consumer with data that isnt going to be used.
and then let the cunsumer computer count them.
Admin
Better yet,.... Assembly !!!!!!
Admin
I think the real WTF is that the original poster of that article left off (and perhaps hinted to in a later post) that this was meant to be sarcastic. No smiley, no grin, no wink, no XML sarcasm tags (as another pointed out earlier in this thread). It seems they got what they wanted; a lot of white noise. [:@]
Regards,
Joe
Admin
No way
You are definately joking.
This is
No
No nono. This can't be. You lie.