The Daily WTF: Curious Perversions in Information Technology

2005-02-25 Reply Admin

well, SELECT COUNT(*) FROM [TableName] is notoriously unreliable....

Sweets · 2005-02-25 Reply Admin

Brilliant WTF!

But I can't believe any one would do somthing like this!

SELECT Count(*) isn't exactly advanced SQL!

2005-02-25 Reply Admin

Do you still have that Swiss Army knife from yesterday's post? I need to stab myself in the eyes. On second thought, never mind. This has got to be the highest WTF-per-line ratio that I will ever see in my life, so the damage is already done.

Extra WTF credit must go to this person for selecting ALL columns from the database. I would award a special "WTF with Gold Leaf Cluster" for using count += 1 instead of count++.

It's too bad people that program like this are able to get past the syntax errors so their code can actually run. It's a great argument for cryptic compiler syntax errors. To figure out the problem our WTFer would have had to go to a real programmer who would have either laughed his ass off or fallen dead right on the spot.

2005-02-25 Reply Admin

Add to the funny, the DataReader is not being closed when finished. Kudos!!

seizethedave · 2005-02-25 Reply Admin

Yes, COUNT(*) should have been used.

It's also worth pointing out that SqlDataReader provides no way of accessing the number of returned records.

And ++ vs. +=1, that's trivial.

znaps · 2005-02-25 Reply Admin

Hey, at least they put the

 while (dr.Read()) { count += 1; }


all on one line to make it seem faster..

2005-02-25 Reply Admin

I've seen this several times in the past with straight ASP/VBScript. When I questioned it, I was told that some of the ODBC drivers in use did not provide a count back to the dataset so this was the work around.
shudder...

Phil Scott · 2005-02-25 Reply Admin

select count is for weeinies

SELECT rows FROM sysindexes WHERE id = OBJECT_ID('someTable') AND indid < 2

for life!

fregas · 2005-02-25 Reply Admin

[I've seen this several times in the past with straight ASP/VBScript. When I questioned it, I was told that some of the ODBC drivers in use did not provide a count back to the dataset so this was the work around.
shudder...]

That shouldn't be the issue. As others have mentioned, a count of records in the table should happen through select count(*) on the database, not thru ADO or on the application server. recordset.Count is handy, or in .NET: DataTable.Rows.Count, but even if you cannot use those options, a seperate query to database for the number of rows is immensely more efficient than looping thru the records.

LAME!

2005-02-25 Reply Admin

Select count(smallest_column) is equal but faster than select count(*), no?

2005-02-25 Reply Admin

For those commenting on the trivial minor aspects of the code as written please remember that this is Alex's interpretations of three different samples all in differing languages. So quibbles about += vs ++ and closing the dbconnection really aren’t pertinent. The major WTF here is that:

a) somebody brought a full query to the client side to get the count.

b) most client side datasets will give you a row count automatically if you ask so why iterate through it again, since the component had to iterate through when it filled it.

c) the biggest WTF is that since Alex kept getting more and more of these sorts of wtfs it is an apparently common method of getting row counts.

2005-02-25 Reply Admin

:
Select count(smallest_column) is equal but faster than select count(*), no?

Wouldn't that leave out rows where the value of smallest_column is null?

mugs · 2005-02-25 Reply Admin

:

[image] wrote:

Select count(smallest_column) is equal but faster than select count(*), no?

Wouldn't that leave out rows where the value of smallest_column is null?

Blast, I finally register and when I go to post, I wasn't logged in.

sardonic · 2005-02-25 Reply Admin

At least optimizing the code will be easy. Then again, maybe not...

2005-02-25 Reply Admin

:
Select count(smallest_column) is equal but faster than select count(*), no?

I am only guessing, but I expect a database could easily keep track of how many rows are in each table and quickly return that number when it sees count(*).

Jeff S · 2005-02-25 Reply Admin

In SQL, Count(SomeColumn) is NOT equivalent to COUNT(*).

Also -- using the statistics is not a good way to get counts, because it depends on the stats being updated, and also it obviously will not work with filters or more complex SQL statements.

Remember, Alex's code is a cumulative example of several WTF's in which the technique itself is to be considered, not the specifics of the syntax he has written or a particular table or column name he has selected.

This WTF is way too common, unfortunately, so it doesn't even shock me that much I guess...

Mr. Angry DBA · 2005-02-25 Reply Admin

Alas, I, too, have witnessed this vulgarity.

Mr. Angry DBA

HerrDaur · 2005-02-25 Reply Admin

work for very small tables :P but what if a table has more than 32767 rows?!

2005-02-25 Reply Admin

I am only guessing, but I expect a database could easily keep track of how many rows are in each table and quickly return that number when it sees count(*).

MySQL MyISAM does that, ala:

mysql> select count(*) from News;
+----------+
| count(*) |
+----------+
|  3675187 |
+----------+
1 row in set (0.04 sec)

But transactions and table versioning make it much, much harder for ACID databases, since there's no single row count for the entire table; each transaction has its own view of the table which needs various degrees of isolation from other transactions, so they usually end up needing to do a full table scan:

mysql> select count(*) from NewsInno;
+----------+
| count(*) |
+----------+
|  3675187 |
+----------+
1 row in set (3 min 20.55 sec)

PITA, especially when you're not that bothered about a completely accurate count. InnoDB can make estimates, but they're effectively useless; on this particular table, a SHOW TABLE STATUS estimated rows count can be off by as much as 2 million in both directions, despite it using fixed length rows and not being modified between invocations. Bah.

2005-02-25 Reply Admin

Yes, using count(somefield) would be a worse WTF because it would yield incorrect record counts if somefield could be NULL. At least the code as-is gives a correct result. On the other hand, maybe an incorrect result would have led someone to find this mess sooner.

Blue · 2005-02-25 Reply Admin

Re: "SELECT rows FROM sysindexes"

Agreed with the comment regarding statistics being updated correctly, and also the post I am replying to assumes a particular backend DB with that code - not all DBs have that table, etc.

2005-02-25 Reply Admin

In MS SQL Server, if you want a rough estimate of rows without fear of a lengthy query you can use the trick described here:

http://db.ittoolbox.com/documents/document.asp?i=2551

mugs · 2005-02-25 Reply Admin

HerrDaur:
work for very small tables :P but what if a table has more than 32767 rows?!

Ints are usually 4 bytes nowadays

2005-02-25 Reply Admin

VONE clueless developer who doesn't know SQL!

Muahahahaha!

TWO clueless developers who don't know SQL!

Muahahahaha!

THREE clueless developers who don't know SQL!

2005-02-25 Reply Admin

I once saw something much worse even than this. A quick summary of the action would be: -

Get first_table
Get second_table
foreach record1 in first_table
if record1 meets some_criterion then
    foreach record2 in second_table
      if record2 matches record1 in someway then
        store some fields of record1 and record2 in a temporary_table1
      endif
    next
endif
next

This was then repeated twice more, once with temporary_table1 and third_table (results into temporary_table2), then again with temporary_table2 and fourth_table (results returned to caller). Oh, and it was all in PL/SQL.

And ran daily.

Needless to say, after a few user complaints as to why the output was taking 18 *HOURS* to generate I was asked to look at the SP and see if I could optimize it. I replaced the code with 3 straightforward select statements and the new routine ran in just over 9 minutes.

BLQ.

2005-02-25 Reply Admin

I have acutally seen one example better than this. I inherited an shpoody ASP application about 3 years ago, that was full of WTF's like this. I wish I'd kept copies of it all becuase it made for inlightining reading ... bit I digrees.

The developer on this porject had a thing about recordsets. He clearly mistrusted them, and would alays copy their intire contents into an array rather than acess them directly. Mad ... perhaps ... but there we a number of pages in the site where there were 'statistic' report summaries with links to a full break down. So for example there might be:

There were 250 orders today - click here to see full list

You can probalby guess how he would retrieve the number....

Yep ... a full "SELECT * from orders_table WHERE date_of_order = #Today#" would be placed into a recordset, thiswould be looped through to populate and array, and the eventualy figure of 250 would be gained by returning array.length

*shakes head in utter bermusement*

2005-02-25 Reply Admin

...unroll the loops...unroll the loops...un....

Jeff S · 2005-02-25 Reply Admin

:
I have acutally seen one example better than this. I inherited an shpoody ASP application about 3 years ago, that was full of WTF's like this. I wish I'd kept copies of it all becuase it made for inlightining reading ... bit I digrees.

The developer on this porject had a thing about recordsets. He clearly mistrusted them, and would alays copy their intire contents into an array rather than acess them directly. Mad ... perhaps ... but there we a number of pages in the site where there were 'statistic' report summaries with links to a full break down. So for example there might be:


There were 250 orders today - click here to see full list



You can probalby guess how he would retrieve the number....

Yep ... a full "SELECT * from orders_table WHERE date_of_order = #Today#" would be placed into a recordset, thiswould be looped through to populate and array, and the eventualy figure of 250 would be gained by returning array.length

*shakes head in utter bermusement*

That's funny .... if he was using ADO, and he liked arrays that much, too bad he didn't know about the GetRows() method of a recordset. It returns a 2D array oif all rows/columns in a recordset extremely quickly (no looping needed). Filling up an array with GetRows() and then working with the resulting array actually can be a significant performance improvement over working with the recordset.

2005-02-25 Reply Admin

The funniest of it all is people laughing their ass and then suggesting

select count(*)

You should use

select count(1)

Jeff S · 2005-02-25 Reply Admin

:

The funniest of it all is people laughing their ass and then suggesting 

select count(*)

You should use

select count(1)

Actually, what is funnier is that I am personally laughing my ass off at you for thinking COUNT(1) is faster than COUNT(*) .... you understand what COUNT(*) means, right? it is not like SELECT * ....

2005-02-25 Reply Admin

>Extra WTF credit must go to this person for selecting ALL columns from the database

Well, it depends on how many columns you have. If there're only 3 columns, using select * actually takes advantage of the MySQL query cache. be noticed that the query has to be EXACTLY the same to be cached.

2005-02-25 Reply Admin

Found this in some VB6 code today.
I think they heard of disconnected recordsets (see the comment in the code) but decided to write their own.

Function TrimBlanks(strSql As String) As ADODB.Recordset

On Error GoTo Error_Handler

Dim counter As Integer
Dim strString As String
Dim rs2 As New ADODB.Recordset
Dim blnCharFound As Boolean
Dim rs As ADODB.Recordset
Dim x As Long
Dim xx As Double
Dim xxx As String
Dim xxxx As Integer

    Set rs2 = New ADODB.Recordset
    Set rs = New ADODB.Recordset
    'rs.CursorLocation = adUseClient
    rs.LockType = adLockOptimistic
    rs.Open strSql, cn

    blnCharFound = False
    With rs
        For counter = 0 To .Fields.Count - 1
            If .Fields.Item(counter).Type = adChar Then
                rs2.Fields.Append .Fields.Item(counter).Name, adVarChar, _
                    .Fields.Item(counter).DefinedSize ', .Fields.Item(counter).Attributes
                rs2.Fields.Item(counter).Attributes = .Fields.Item(counter).Attributes
                rs2.Fields.Item(counter).NumericScale = .Fields.Item(counter).NumericScale
                rs2.Fields.Item(counter).Precision = .Fields.Item(counter).Precision
                blnCharFound = True
            Else
                rs2.Fields.Append .Fields.Item(counter).Name, .Fields.Item(counter).Type, _
                    .Fields.Item(counter).DefinedSize
                rs2.Fields.Item(counter).Attributes = .Fields.Item(counter).Attributes
                rs2.Fields.Item(counter).NumericScale = .Fields.Item(counter).NumericScale
                rs2.Fields.Item(counter).Precision = .Fields.Item(counter).Precision
            End If
        Next
        rs2.Open
        If Not .EOF Then
            .MoveFirst
            Do While Not .EOF
                DoEvents
                rs2.AddNew
                For counter = 0 To .Fields.Count - 1
                    If Not IsNull(.Fields.Item(counter)) _
                    And Not IsEmpty(.Fields.Item(counter)) Then
                        If .Fields.Item(counter).Type = adChar Then
                            rs2.Fields.Item(counter).Value = Trim(.Fields.Item(counter).Value)
                        Else
                            rs2.Fields.Item(counter).Value = .Fields.Item(counter).Value
                        End If
                    End If
                Next
                .MoveNext
            Loop
        End If
    End With

'    If blnCharFound Then
        Set TrimBlanks = rs2.Clone
'    Else
'        Set TrimBlanks = rs.Clone
'    End If
    Set rs = Nothing
    Set rs2 = Nothing

Exit Function
Error_Handler:
    ShowError ("TrimBlanks")

End Function

Bustaz Kool · 2005-02-25 Reply Admin

:
Select count(smallest_column) is equal but faster than select count(*), no?

They'd be equally fast except that any Null values in the "smallest_column" will not be counted so you could potentially end up with different counts.

2005-02-25 Reply Admin

I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.

2005-02-25 Reply Admin

I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.

So your saying that we can avoid scanning the whole table and then returning a row of one integer by scanning the whole table and returning a row of the size of each row for each row in the table all through the servers NIC. I can see how thats a lot more scalable, thanks for the tip ;)

2005-02-25 Reply Admin

I knew you wouldn't understand

LordHunter317 · 2005-02-25 Reply Admin

Because you're wrong.

The time it takes to do the aggregation on the DBMS server will always been (worse-case, never real-world) equal to and best case, significantly faster than doing the aggregation outside of the database.

This is why the database does it -- the network overhead alone if the table is large could make the difference. Even ignoring that, the above still holds true.

Jeff S · 2005-02-25 Reply Admin

:
I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.

Congratulations! [:D] the "WTF Comment of the Day!"

Yikes .... someone please tell me he forgot the <sarcasm> tags .... wow ....

2005-02-25 Reply Admin

Why don't they try XML?

2005-02-25 Reply Admin

:
Why don't they try XML?

"How about JavaScript?" :p

2005-02-25 Reply Admin

Uh, wait just one moment here. This code shouldn't even compile!

    SqlCommand cmd = new SqlCommand();
    cmd.Connection = getConn();

Hello! It's a pointer to the heap! Let's try cmd->Connection, kiddies. While we're at it, let's not even worry about cleaning up the heap afterwards. Using that old delete doohickey was never a great idea anyway. And this being a static function, let's not worry about it being called multiple times, so we won't bother checking our vars are NULL, after all if we're not worrying whether our resources are properly released, it probably doesn't matter how they're allocated.

This is WTF on so many levels, it's scary. Surprised nobody has pointed that out yet, although a lot of people seem to be DBA's. I won't hold that against you, though.

LordHunter317 · 2005-02-25 Reply Admin

:
Uh, wait just one moment here. This code shouldn't even compile!
 SqlCommand cmd = new SqlCommand();
 cmd.Connection = getConn();
Hello! It's a pointer to the heap! Let's try cmd->Connection, kiddies. While we're at it, let's not even worry about cleaning up the heap afterwards. Using that old delete doohickey was never a great idea anyway. And this being a static function, let's not worry about it being called multiple times, so we won't bother checking our vars are NULL, after all if we're not worrying whether our resources are properly released, it probably doesn't matter how they're allocated.

Seeing as this is C#, not C++, you're wrong on all counts.

C# doesn't have a reference-type and a pointer-type, just reference-types and value-types. Meaning it doesn't have a pointer indirection operator (->). It's garbaged-collected, there is no delete operator either.

And the meaning of static w.r.t to a function has no relation to the meaning of static w.r.t to a local variable. This is true even in C, so I don't know where you got your last point.

Free · 2005-02-25 Reply Admin

Jeff S:

[image] wrote:

The funniest of it all is people laughing their ass and then suggesting 

select count(*)

You should use

select count(1)

Actually, what is funnier is that I am personally laughing my ass off at you for thinking COUNT(1) is faster than COUNT(*) .... you understand what COUNT(*) means, right? it is not like SELECT * ....

I'm a little uncertain, while I don't think there is really any difference between them in MS Sql Server, I am ignorant of the theoretical difference and of other implementations.

It isn't the slow processing that kills you, its the sudden stop when the pool runs out.

There was a whole school of people who thought that everything would faster if the used GetRows() in asp because using arrays was faster than recordsets.[^o)]

2005-02-25 Reply Admin

:
Uh, wait just one moment here. This code shouldn't even compile!
 SqlCommand cmd = new SqlCommand();
 cmd.Connection = getConn();
Hello! It's a pointer to the heap! Let's try cmd->Connection, kiddies. While we're at it, let's not even worry about cleaning up the heap afterwards. Using that old delete doohickey was never a great idea anyway. And this being a static function, let's not worry about it being called multiple times, so we won't bother checking our vars are NULL, after all if we're not worrying whether our resources are properly released, it probably doesn't matter how they're allocated.

This is WTF on so many levels, it's scary. Surprised nobody has pointed that out yet, although a lot of people seem to be DBA's. I won't hold that against you, though.

We have "WTF Comment #2" of the day! [:D]

Extra points for being condescending and having no idea what they are talking about . Congratulations![<:o)]

mugs · 2005-02-25 Reply Admin

:
I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.

Wow... you imply you develop enterprise software, yet you don't see the flaw in your reasoning there. No wonder you posted anonymously.

2005-02-26 Reply Admin

:
I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.

And the golden globe WTF goes to the poster of that comment..

agreed that it might be good to let other computers perform some tasks..

but making the db server select all rows with all fields.
sending all of the rows over the network.
filling up the ram of the consumer with data that isnt going to be used.
and then let another computer count them.

THAT IS A TRUE WTF!

//Roger

2005-02-26 Reply Admin

in lack of edit functionallity:

filling up the ram of the consumer with data that isnt going to be used.
and then let the cunsumer computer count them.

2005-02-26 Reply Admin

:
:
Why don't they try XML?

"How about JavaScript?" :p

Better yet,.... Assembly !!!!!!

JoeDelekto · 2005-02-26 Reply Admin

:

[image] wrote:

I'm guessing many of you here have little or no experience developing enterprise software. An important goal in developing enterprise software is scalability and in order to scale, often times you want to distribute the load away from your SQL server to a middle-tier server. By counting the records on the middle-tier you are saving valuable processor cycles on the database server.

And the golden globe WTF goes to the poster of that comment..

agreed that it might be good to let other computers perform some tasks..

but making the db server select all rows with all fields.
sending all of the rows over the network.
filling up the ram of the consumer with data that isnt going to be used.
and then let another computer count them.

THAT IS A TRUE WTF!

//Roger

I think the real WTF is that the original poster of that article left off (and perhaps hinted to in a later post) that this was meant to be sarcastic. No smiley, no grin, no wink, no XML sarcasm tags (as another pointed out earlier in this thread). It seems they got what they wanted; a lot of white noise. [:@]

Regards,

Joe

2005-02-26 Reply Admin

No way

You are definately joking.

This is

No

No nono. This can't be. You lie.

Count Rowula

Leave a comment on “Count Rowula”