The Daily WTF: Curious Perversions in Information Technology

ammoQ · 2006-01-24 Reply Admin

Jeff S:
ammoQ:
Jeff S:

Some of the posts here mention doing the old "Count()+1" or "Max(ID)+1" techniques to get a new ID, but this @@ROWCOUNT method is even more ingenius because not only is it often wrong like the others, it is also exponentially less efficient!

Why? Isn't it basically the same like count()? I aggree that max(ID) is much faster if ID is indexed, a reasonable though unsafe (considering the circumstances) assumption.

Which do you think is a more efficient way to return the # of rows in a table to the client?

Option 1: SELECT COUNT(*) FROM Table

Option 2: SELECT * FROM Table; SELECT @@ROWCOUNT

IMO both are O(n). I can't say how efficient (or inefficient) the database handles option 2.

2006-01-24 Reply Admin

Rick:
Its obvious that the 'designer' wanted to save the unnecessary space that would be used up by a UserId column in the table. You can always determine the UserId with the code below.
select * from USER_TABLE where UserName < "JOE BLOW"
SET @intUserId = @@ROWCOUNT

I consider this to be a very 'normal' way of doing things.

What would happen if you deleted a user?

Select * from Users where clue > 0
0 Records returned.

Jeff S · 2006-01-24 Reply Admin

ammoQ:
Jeff S:
ammoQ:
Jeff S:

Some of the posts here mention doing the old "Count()+1" or "Max(ID)+1" techniques to get a new ID, but this @@ROWCOUNT method is even more ingenius because not only is it often wrong like the others, it is also exponentially less efficient!

Why? Isn't it basically the same like count()? I aggree that max(ID) is much faster if ID is indexed, a reasonable though unsafe (considering the circumstances) assumption.

Which do you think is a more efficient way to return the # of rows in a table to the client?

Option 1: SELECT COUNT(*) FROM Table

Option 2: SELECT * FROM Table; SELECT @@ROWCOUNT

IMO both are O(n). I can't say how efficient (or inefficient) the database handles option 2.

Have you been drinking today or just acting goofy or what?

Look carefully again. What if the table has 1,000,000 rows. What gets returned to the client with option 1? What gets returned to the client with option 2? Which do you think is more efficient?

ammoQ · 2006-01-24 Reply Admin

Jeff S:
ammoQ:
Jeff S:
ammoQ:
Jeff S:

Some of the posts here mention doing the old "Count()+1" or "Max(ID)+1" techniques to get a new ID, but this @@ROWCOUNT method is even more ingenius because not only is it often wrong like the others, it is also exponentially less efficient!

Why? Isn't it basically the same like count()? I aggree that max(ID) is much faster if ID is indexed, a reasonable though unsafe (considering the circumstances) assumption.

Which do you think is a more efficient way to return the # of rows in a table to the client?

Option 1: SELECT COUNT(*) FROM Table

Option 2: SELECT * FROM Table; SELECT @@ROWCOUNT

IMO both are O(n). I can't say how efficient (or inefficient) the database handles option 2.

Have you been drinking today or just acting goofy or what?

Look carefully again. What if the table has 1,000,000 rows. What gets returned to the client with option 1? What gets returned to the client with option 2? Which do you think is more efficient?

I thought it was a stored procedure, where option 2 doesn't really return the result of the "SELECT *" to the client. I must admit that I don't know TSQL well. Option2 returns a result set for the "SELECT *", then the result of "SELECT @@ROWCOUNT"?

Disgruntled DBA · 2006-01-24 Reply Admin

How many DBAs will be sacrificed to this table-scanner?
Contention issues? Heck, you can have huge transactions to mitigate that.
Concurrency issues? Heck, just move to Oracle, and they will evaporate away.

Now, let's see....

Countless developers using a poor design at $60 an hour.
New UNIX system (because Oracle on Windows is "obviously crap") $200,000
Oops. Forgot the Oracle license: $20,000 (depends on how good a negotiator you are.)
Reworking the application to run on Oracle would require developers at $70 per hour....

Jumping off this Hindenberg before the New Jersey airfield....priceless....

2006-01-24 Reply Admin

Alex Papadimoulis:
When you're working on a system that's stagnated in the development phase for a few years and has chewed through more developers and teams than you even realized existed, it can be a daunting task just to explain why the system is as ridiculously bad as it is. As much as I'd love to feature such systems here, I suspect that most would prefer not to read thirty-seven pages just to get to the punch line. No less, Rick Harris was able find a *single line* of code that is a fair representation of such a system he was working on ...
SET @intUserId = @@ROWCOUNT
And no, this was not taken out of context. And yes, it really was that bad. I'll leave it as an exersize for the reader to try to immagine what possibly could be surrounding this code.

My contribution to the "The real WTF is...." discusssion. The real WTF is that he is using hungarian notation! intUserId? That is SO 1997. Even Microsoft doesn't recommend Hungarian any more.

Jeff S · 2006-01-24 Reply Admin

ammoQ:
Jeff S:
ammoQ:
Jeff S:
ammoQ:
Jeff S:

Some of the posts here mention doing the old "Count()+1" or "Max(ID)+1" techniques to get a new ID, but this @@ROWCOUNT method is even more ingenius because not only is it often wrong like the others, it is also exponentially less efficient!

Why? Isn't it basically the same like count()? I aggree that max(ID) is much faster if ID is indexed, a reasonable though unsafe (considering the circumstances) assumption.

Which do you think is a more efficient way to return the # of rows in a table to the client?

Option 1: SELECT COUNT(*) FROM Table

Option 2: SELECT * FROM Table; SELECT @@ROWCOUNT

IMO both are O(n). I can't say how efficient (or inefficient) the database handles option 2.

Have you been drinking today or just acting goofy or what?

Look carefully again. What if the table has 1,000,000 rows. What gets returned to the client with option 1? What gets returned to the client with option 2? Which do you think is more efficient?

I thought it was a stored procedure, where option 2 doesn't really return the result of the "SELECT *" to the client. I must admit that I don't know TSQL well. Option2 returns a result set for the "SELECT *", then the result of "SELECT @@ROWCOUNT"?

Yes, I guess you don't .. :) even stored procedures return all resultsets back to the client; in fact, many stored procs are even supposed to !

Now, based on that, do you still feel they both are the same efficiency?

JohnO · 2006-01-24 Reply Admin

CornedBee:
Doesn't SQL server have sequences?

SELECT curval('my_id_sequence) as newid;

Now the value of newid is mine, all mine. Nobody else will ever get it! (Until the sequences wraps around, anyway.)

It has identity columns and scope_identity() which is better.

If you couldn't use @@IDENITY Pre-2000, you could do (from memory w/o error handling):

SET TRANSACTIONISOLATIONLEVEL REPEATABLE_READ

BEGIN TRAN

SELECT @nextid = MAX(id) + 1

FROM TABLE

INSERT TABLE ....

COMMIT TRAN

If you have a unique or primary key contraint on the ID column, you can drop the transactionisolationlevel piece and just retry on failure due to unique/primary key constraint violation errors. You invariably have an index on ID so MAX(id) is not usually a table scan.

Identity columns can create performance bottlenecks and you do have to wait for successful completion of the SQL to get the value. I think GUIDs may be the way to go now that they are directly supported by DBMSs and application development platforms.

2006-01-24 Reply Admin

Anonymous:
Why is it that nobody ever thought of having the generated ID returned as a parameter of the execute statement that executed the sql statement?

because INSERT doesn't return things, for a very good reason :

CREATE TABLE animals (
    id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name CHAR(30) NOT NULL,
);

INSERT INTO animals (name) VALUES 
    ('dog'),('cat'),('penguin'),('lax'),('whale'),('ostrich');

maldrich · 2006-01-24 Reply Admin

Now see, if they would just upgrade to STORRAY, there would be no need to figure out all that hard stuff about scope_identity() and concurrency. That SQL is just too complicated!

2006-01-24 Reply Admin

Anonymous:
Anonymous:
YEAH! I'm FIRST!

I'm not, am I? Ahh well, I came pretty close. Why do some morons keep posting that? A lot of them aren't even first because people actually beat them to it.
And what's so great about being first, anyway? Do you really think you get any honour of being The First of one of the many posts on this site? What is it?
All I can say is it seems most people find it quite irritating. And if you wan't honour, then posting "FIRST" is probably the thing not to do.
Sorry, I, for one, am pretty irritated by all those "FIRST" posts (which are usually second posts).

At least you were the first one to complain. You win the kewpie doll. If there were more prizes available, perhaps the First Post winner could win something.

What prize to I get for being the first one to post that it is obviously "auto generated code"?

2006-01-24 Reply Admin

well, i guess the best way to deal with that is to have a CHAR(1) column with something from { 'A', 'L', 'D' } in it for active, locked and deleted - you don't need to worry about rowcount decreasing then, so it's safe(er) to use, although 'COUNT(*)+1' is better...

2006-01-25 Reply Admin

In response to Jeff S and AmmoQ discussion.

SELECT COUNT(*) ... can often be precalculated and cached

2006-01-25 Reply Admin

In SQL Server 2000, count(*) is still quite slow. There actually is another way using the sysindexes table (from "Inside SQL Server 2000"):

SELECT rows

FROM sysindexes

WHERE id = OBJECT_ID ('your table')

AND indid < 2

It's not 100% accurate all the time, but depending on what you need to do it can be quite handy.

If you use SQL Server at all, then the book "Inside SQL Server" is mandatory.

ammoQ · 2006-01-25 Reply Admin

Jeff S:
Yes, I guess you don't .. :) even stored procedures return all resultsets back to the client; in fact, many stored procs are even supposed to !

It's different in Oracle's PL/SQL, sorry for acting like a fool.

Now, based on that, do you still feel they both are the same efficiency?

Well, O(n)=O(n), but it might be difficult to explain that to the customer who sees 10 secs vs. 20 mins response time. ;-)

ammoQ · 2006-01-25 Reply Admin

Anonymous:
In response to Jeff S and AmmoQ discussion.

SELECT COUNT(*) ... can often be precalculated and cached

In general probably yes, but in this case it might fail, since the number of records has inevitably changed.

2006-01-25 Reply Admin

jvancil:
But... presumably this would always return 1... assuming there was only one "JOE BLOW" in the database... The only place this makes sense is if you are adding a new user to the end of the table and.... no wait. IT NEVER MAKES SENSE![H]

If it doesn't make sense, perhaps it is an attempt at a Chewbacca Defense in diguise ;-)

2006-01-25 Reply Admin

SQL Server has identity columns. You can't access the sequence directly. You simply insert, and after the insert, you retrieve the identity for your insert from @@scopeidentity. Some people incorrectly use @@identity, but this can be incorrect if your insert fires a trigger that inserts into another table. Because the @@identity is the most recent identity generated, whereas the @@scopeidentity is the most recent identity within your SQL block.

If you knew how many times I've found catch(...) followed by {} in the program I'm updating, you'd forgive my insanity.

2006-01-25 Reply Admin

Anonymous:

Why is it that nobody ever thought of having the generated ID returned as a parameter of the execute statement that executed the sql statement?

That is exactly because of this problem that we settled down for a table with an ID count. We set a lock on the table before picking up the number and release the hold once updated with the new value to prevent concurrency. The GetNewID procedure will retry to get a number for a certain time before giving up if the record is locked by another process.

Actually, the PostgreSQL people are thinking about an extension to INSERT that allows to return arbitrary columns as result, including auto-generated keys, it will go like "INSERT INTO table (name, street) VALUES ('blah', 'blub') RETURNING name, id;" However, currently they're not really shure how to deal with Triggers and Rules, and so it is not implemented yet. It might come with V8.2 or 8.3.

RiX0R · 2006-01-25 Reply Admin

ammoQ:
Anonymous:

In response to Jeff S and AmmoQ discussion.

SELECT COUNT(*) ... can often be precalculated and cached

In general probably yes, but in this case it might fail, since the number of records has inevitably changed.

Actually it's about to change.

But it's trivial for an rdbms to maintain a 'rowcount' variable for each table, then incrementing for each INSERT and decrementing for each DELETE, making SELECT COUNT(*) an O(1) operation. I don't know if any do this, though.

If not, I imagine it would be a linear scan through the index, which is obviously O(N) but probably still quite cheap.

ammoQ · 2006-01-25 Reply Admin

RiX0R:
ammoQ:
Anonymous:

In response to Jeff S and AmmoQ discussion.

SELECT COUNT(*) ... can often be precalculated and cached

In general probably yes, but in this case it might fail, since the number of records has inevitably changed.

Actually it's about to change.

That's nitpicking ;-) Unless other parts of the program also do a count(*), it has changed since the last execution.

But it's trivial for an rdbms to maintain a 'rowcount' variable for each table, then incrementing for each INSERT and decrementing for each DELETE, making SELECT COUNT(*) an O(1) operation. I don't know if any do this, though.

It's that trivial for a single user database. In a multiuser database system where ACID ist more than a buzzword the following can happen:

1. user A deletes 10 rows (no commit yet)
2. user B inserts 5 rows (no commit yet)
3. user A rollbacks his transaction
4. user A does a count(*) and since we assume a consistent look at the data, he should not see the uncommited inserts of B
5. user A deletes 10 rows
6. user B does a count(*), does not see the uncommited deletes of A but sees his own uncommited inserts
7. user C does a count(*), should see neighter the uncommited deletes of A nor the uncommited inserts of B
8. user B does a commit
9. user A does a count(*), sees his own uncommited deletes
10. user A does a commit

Now try to give each user a correct count(*) with a single rowcount variable.

If not, I imagine it would be a linear scan through the index, which is obviously O(N) but probably still quite cheap.

Just tried that on Oracle, it does a full table scan.

2006-01-25 Reply Admin

Anonymous:
Anonymous:
YEAH! I'm FIRST!

I'm not, am I? Ahh well, I came pretty close. Why do some morons keep posting that? A lot of them aren't even first because people actually beat them to it.
And what's so great about being first, anyway? Do you really think you get any honour of being The First of one of the many posts on this site? What is it?
All I can say is it seems most people find it quite irritating. And if you wan't honour, then posting "FIRST" is probably the thing not to do.
Sorry, I, for one, am pretty irritated by all those "FIRST" posts (which are usually second posts).

At least you were the first one to complain. You win the kewpie doll. If there were more prizes available, perhaps the First Post winner could win something.

This clearly explain why we need:

<font>SET</font> @intUserId = <font>@@ROWCOUNT</font>

It get the "unique id". It solved the "YEAH! I'm FIRST!" problem.

2006-01-25 Reply Admin

ferrengi:

I want to be able to grasp just how the much the code sucks.

Me too. And being rather a newbie to programming, I suggest the following: why don’t you (we, for the rare occasions where I do know a better solution) guys make it a challenge to explain clearly what the good solution would be?

Jeff S · 2006-01-25 Reply Admin

ammoQ:
RiX0R:
ammoQ:
Anonymous:

In response to Jeff S and AmmoQ discussion.

SELECT COUNT(*) ... can often be precalculated and cached

In general probably yes, but in this case it might fail, since the number of records has inevitably changed.

Actually it's about to change.

That's nitpicking ;-) Unless other parts of the program also do a count(*), it has changed since the last execution.

But it's trivial for an rdbms to maintain a 'rowcount' variable for each table, then incrementing for each INSERT and decrementing for each DELETE, making SELECT COUNT(*) an O(1) operation. I don't know if any do this, though.

It's that trivial for a single user database. In a multiuser database system where ACID ist more than a buzzword the following can happen:

1. user A deletes 10 rows (no commit yet)
2. user B inserts 5 rows (no commit yet)
3. user A rollbacks his transaction
4. user A does a count(*) and since we assume a consistent look at the data, he should not see the uncommited inserts of B
5. user A deletes 10 rows
6. user B does a count(*), does not see the uncommited deletes of A but sees his own uncommited inserts
7. user C does a count(*), should see neighter the uncommited deletes of A nor the uncommited inserts of B
8. user B does a commit
9. user A does a count(*), sees his own uncommited deletes
10. user A does a commit

Now try to give each user a correct count(*) with a single rowcount variable.

If not, I imagine it would be a linear scan through the index, which is obviously O(N) but probably still quite cheap.

Just tried that on Oracle, it does a full table scan.

AmmoQ -- you still aren't getting it. What you described simple does not happen in the order you describe, since those transactions block one another until things are done. But even if the database DID allow of these things to happen at once (say, on virtual copies of each table and somehow it merges them all together at the right point when things are commited), don't you think that a database that can do THAT would be able to keep *multiple* simple row counts?

In addition, if you are doing a table scan in Oracle on a COUNT(*) statement, then either a) oracle sucks even more than I thought or b) you have no indexes or PK on your table.

Finally, if you are contributing to a discussion in which you have no idea how the technology being discussed works (i.e., SQL Server), at least state that out front to save a lot of time and confusion. If you are prompting some discussion so you can learn about that technology (nothing wrong with that), why not ask specific questions instead of making statements like "this SQL statement is is O(n) which is equivalent to that SQL statement" and "stored procs cannnot return multiple resultsets" and so on, which are wild guesses on your part. Ask *questions* if you are not sure, don't make wild assumptions in the form of statements that you know to be true. It makes an intelligent discussion quite difficult and makes you look a little ignorant. I am not saying that you *are* ignorant, but it can make you look that way.

2006-01-25 Reply Admin

hopefully there's a
"/* This is one of the wrong ways of doing this that we considered"
before and a
"*/"
after it :-)

2006-01-25 Reply Admin

Anonymous:
ferrengi:

I want to be able to grasp just how the much the code sucks.

Me too. And being rather a newbie to programming, I suggest the following: why don’t you (we, for the rare occasions where I do know a better solution) guys make it a challenge to explain clearly what the good solution would be?

Sorry to spoil the fun, but there appear to be a lot of newbie lurkers, versus me, a lurker (who forgot ot login).

This OP code sucks because before you could get access to @@ROWCOUNT, you had to run a SELECT query returning every row of the database. This is slow... especially on a large database. But, what if, in addition to having millions of records, you had thousands of users and you ran this query during the business day? In the split second between when you ran the SELECT (that returns the whole table) and the SELECT (that returns @@ROWCOUNT), users inserted new rows? @@ROWCOUNT only shows you the results from your own last query, not the actual number. If you then insert new rows, using this now erroneous number as your user ID, multiple users will have the same ID.

Now lets say you delete some rows in the middle of the table. Same problem. Multiple users end up with the same ID.

None of this is a big deal in and of itself. But tie those user IDs to say, finance records, purchasing records, employee performace records, mailing addresses... you get the drift. The wrong information goes to the wrong person and from the database maintainer's point of view, it's impossible to figure out who belongs where.

The solution? That spoils the fun! Basically, use a 'natural' business process generated ID as your user ID. Or, if your business produces no such key, allow the database to generate it for you. There may be times where you would want to generate your own unique ID, but you wouldn't be asking such questions if those conditions applied to you, because it's horrible to do, as you can see by the other posts.

2006-01-25 Reply Admin

Perhaps it is part of a self incrementing insert script vs an identity? Example:

DECLARE @intUserID INT
SET @intUserId = @@ROWCOUNT

INSERT INTO myTable (id, firstname, lastname)
VALUES ((@intUserID +1), 'T', 'DOG')

ammoQ · 2006-01-25 Reply Admin

Jeff S:

AmmoQ -- you still aren't getting it. What you described simple does not happen in the order you describe, since those transactions block one another until things are done.

I knew someone would raise this question.

But even if the database DID allow of these things to happen at once (say, on virtual copies of each table and somehow it merges them all together at the right point when things are commited),

Let's say there is a database system that does that, called "Oracle".

don't you think that a database that can do THAT would be able to keep *multiple* simple row counts?

It could, but for what reason? No reasonable application does count(*) without a where clause repeatedly.

In addition, if you are doing a table scan in Oracle on a COUNT(*) statement, then either a) oracle sucks even more than I thought or b) you have no indexes or PK on your table.

or c) Oracle keeps its data in a way where a full index scan is not faster than a full table scan

Finally, I know this is about SQL Server; a readily state that don't know much about it; and if you read my posts, you will see I asked several questions. O(n) was the answer to your claim "exponentially slower" which is simply untrue.

Jeff S · 2006-01-25 Reply Admin

ammoQ:
[
Finally, I know this is about SQL Server; a readily state that don't know much about it; and if you read my posts, you will see I asked several questions. O(n) was the answer to your claim "exponentially slower" which is simply untrue.

sorry. I meant thousands of times slower, not exponentially.

Some simple questions (and I still cannot believe I am playing along with you here, but what they heck): how do you compare two algorithms that are both O(n) to see which is more efficient? Are all alogrithms that are o(n) equally efficient in your mind? Do you understand that an algorithm that counts rows in a table by sending each one to a printer, one row per page, and then has the page count derived by measuring the weight of the output pages divided by the weight of each is also O(n) ?

ammoQ · 2006-01-25 Reply Admin

Jeff S:

Some simple questions (and I still cannot believe I am playing along with you here, but what they heck): how do you compare two algorithms that are both O(n) to see which is more efficient? Are all alogrithms that are o(n) equally efficient in your mind? Do you understand that an algorithm that counts rows in a table by sending each one to a printer, one row per page, and then has the page count derived by measuring the weight of the output pages divided by the weight of each is also O(n) ?

My mistake was believing that "select * from someTable;" (within a trigger or stored procedure) would send the output to nirvana.
There is still something I do not understand, but I'm sure you can help me with that:
Does "select * from someTable;" really fetch all the data (and store it to later return it to the client) or does it merely open a cursor and returns that?
If it is a cursor, how does it know @@rowcount before fetching to the last row?

OneFactor · 2006-01-25 Reply Admin

ammoQ:
Jeff S:

Some simple questions (and I still cannot believe I am playing along with you here, but what they heck): how do you compare two algorithms that are both O(n) to see which is more efficient? Are all alogrithms that are o(n) equally efficient in your mind? Do you understand that an algorithm that counts rows in a table by sending each one to a printer, one row per page, and then has the page count derived by measuring the weight of the output pages divided by the weight of each is also O(n) ?

My mistake was believing that "select * from someTable;" (within a trigger or stored procedure) would send the output to nirvana.
There is still something I do not understand, but I'm sure you can help me with that:
Does "select * from someTable;" really fetch all the data (and store it to later return it to the client) or does it merely open a cursor and returns that?
If it is a cursor, how does it know @@rowcount before fetching to the last row?

I think the cursor vs full fetch is supposed to be "at the discretion of the database" and developers should not be relying on any particular behavior under those conditions and assume the DB does whatever it thinks is best under the circumstances.

By the way, have you tried select 1 from someTable instead of select * from someTable? I think that if the table has an index on a primary key, it may be able to avoid the TableScan. Then again, if the table is small, Oracle may rightly decide to do the TableScan...

2006-01-25 Reply Admin

If you're inserting a user into a database, it is always always always better to let the database figure out what number to use, through whatever auto-incrementing key you've setup to take care of the user table.

I've used databases where the row count was the only auto-incrementing mechanism you had. Not modern relational databases, but the little embedded databases you have to use in small-memory or non-hosted environments can be amazingly limited.

maldrich · 2006-01-25 Reply Admin

There's just one correct way to do what we all infer the author of the original "code" was doing: the function scope_identity() paired with an Identity column in the table. It's fast, simple, and it always works, no matter how many users are on the system.

Select * from table returns the entire table to the client.

@@rowcount then contains the number of rows that were selected. That is very slow. O to the whatever, it makes no difference.

If Identity is not available for some reason, then you need to maintain a counter elsewhere in the database with the next available value, and use a stored proc and some careful transaction handling to perform inserts that are numbered correctly. Not to many people do that, because the Identity function is easier and it works. I think mainly pre-SQL Server 2000 systems have workarounds like that.

ammoQ · 2006-01-25 Reply Admin

OneFactor:
I think the cursor vs full fetch is supposed to be "at the discretion of the database" and developers should not be relying on any particular behavior under those conditions and assume the DB does whatever it thinks is best under the circumstances.

By the way, have you tried select 1 from someTable instead of select * from someTable? I think that if the table has an index on a primary key, it may be able to avoid the TableScan. Then again, if the table is small, Oracle may rightly decide to do the TableScan...

"select count(1)" also does a full table scan. The result is ~ 4 Mio, I don't think this is a "small" table. But an Oracle table is not an ISAM file or something similar. Scanning through the table is not slower than scanning through the index. I tried
"select count(*) from sometable where id>0" and it takes excactly as long (but does a index range scan that eventually retrieves all rows).

Bustaz Kool · 2006-01-25 Reply Admin

Disgruntled DBA:
Reworking the application to run on Oracle would require developers at $70 per hour....

Hiring fleet of Oracle DBAs at $150/hr...

Hiring everyone's favorite Oracle Consultant at $400/hr...

Bustaz Kool · 2006-01-25 Reply Admin

OR... You could use a Natural Key (Braces for avalanche of indignity [:|])

twks · 2006-01-25 Reply Admin

Anonymous:
I dunno, without looking at surrounding code, that could just be a typo (@@ROWCOUNT instead of @@IDENTITY). Which is bad, but if it's unintentional, I don't think it qualifies.

Unlike this little morsel of evil I inherited, which obviously took some thought and a certain degree of skill:

SELECT ...
FROM ITINERARYITEMS AS I
JOIN HOTELS AS H
ON CONVERT (
 INT
 , SUBSTRING(
 CONVERT (VARCHAR, I.Data)
 , PATINDEX('%<var name="" id=""><string>%', I.Data) + 23
 , PATINDEX(
 '%</string>%'
 , SUBSTRING(
 CONVERT (VARCHAR, I.Data)
 , PATINDEX('%<var name="" id=""><string>%', I.Data) + 23
 , 20
 )
 ) - 1
 )
 ) = H.ID
...
WHERE BookingType = 'hotel'
AND I.STATE=1
AND PATINDEX('%<var name="" id=""><string>%', Data) > 0
AND TIME BETWEEN @StartDate AND @EndDate

(names anonymized a bit and formatted so you can see what it's doing. The original was all on one line, naturally)

Yes, it's joining between an int column in one table and a number inside some XML in a text column in another! (and that number is stored in a <string> tag!?)

Wow, this is a very impressive WTF! In fact this code should have its own thread. My guess is that the foreign key did actually exist as a separate column in the </string></string></var></string></var></var>ITINERARYITEMS table, but this developer thought this would be more of a challenge.

2006-01-25 Reply Admin

If I recall IDENTITY still has issues with Sybase where if there is a system problem the seeding values get all screwed up, thus we avoid identity columns in Sybase and yes we are on a late version!

2006-01-25 Reply Admin

Anonymous:
Sure he could use identity fields, if he knew they existed. But this adds the excitement of concurrency issues and the performance hit of retrieving the whole table to get the row count.

Well, there's no concurrency issues, so long as each call occurs within the same database transaction as the initial insert.

My question for you is, sup with this "1 = 1" nonsense? You're not still writing ad-hoc queries that append AND clauses, are you?

This is the 21st Century -- abstract that shit, monkey!

2006-01-25 Reply Admin

Anonymous:

What would happen if you deleted a user?

That's a pretty poor idea -- deleting a user. It's begging for NPEs whenever you pull up any record of work that user performed or any other record tied to their records.

Far better to have a user liveness field.

2006-01-25 Reply Admin

that's broken too. (try running it on your users table)
whatever happened to:

declare @intUserId int
select @intUserId = @userId from USERS where UserName = 'Yer Mum'

lofwyr · 2006-01-25 Reply Admin

ammoQ:
Jeff S:
Some of the posts here mention doing the old "Count()+1" or "Max(ID)+1" techniques to get a new ID, but this @@ROWCOUNT method is even more ingenius because not only is it often wrong like the others, it is also exponentially less efficient!

Why? Isn't it basically the same like count()? I aggree that max(ID) is much faster if ID is indexed, a reasonable though unsafe (considering the circumstances) assumption.

More like sql%rowcount.

l.

lofwyr · 2006-01-25 Reply Admin

Jeff S:

AmmoQ -- you still aren't getting it. What you described simple does not happen in the order you describe, since those transactions block one another until things are done.

Say good bye to multiuser systems then. Haven't heard that for a while. sqlserver is really that backwards? Can't believe that.

Jeff S:

But even if the database DID allow of these things to happen at once (say, on virtual copies of each table and somehow it merges them all together at the right point when things are commited), don't you think that a database that can do THAT would be able to keep *multiple* simple row counts?

Might be a nice idea, but how and where would you store that? Those multiple versions you are talking about are called undo (rollback) in Oracle.

Jeff S:

In addition, if you are doing a table scan in Oracle on a COUNT(*) statement, then either a) oracle sucks even more than I thought or b) you have no indexes or PK on your table.

Depending on the statistics, Oracle's optimizer could decide to use a full table scan - they are not "evil" on Oracle.

Jeff S:

Finally, if you are contributing to a discussion in which you have no idea how the technology being discussed works (i.e., SQL Server), at least state that out front to save a lot of time and confusion. If you are prompting some discussion so you can learn about that technology (nothing wrong with that), why not ask specific questions instead of making statements like "this SQL statement is is O(n) which is equivalent to that SQL statement" and "stored procs cannnot return multiple resultsets" and so on, which are wild guesses on your part. Ask *questions* if you are not sure, don't make wild assumptions in the form of statements that you know to be true. It makes an intelligent discussion quite difficult and makes you look a little ignorant. I am not saying that you *are* ignorant, but it can make you look that way.

I have a question (only got Sybase/Oracle/a bit DB2 experience): You really can't delete two different records in the same table from two different sessions without causing a lock?

l.

ammoQ · 2006-01-25 Reply Admin

lofwyr:

I have a question (only got Sybase/Oracle/a bit DB2 experience): You really can't delete two different records in the same table from two different sessions without causing a lock?

That might (with some luck) be true with older versions of SQL Server (page level locking); but I think what Jeff meant was that delete blocks select count(*) and vice versa.

2006-01-26 Reply Admin

Locking in SQL Server (below 2005) is problematic. Although row-level locking does exist and is used, deadlocking still occurs and can be frequent on a high traffic SQL Server DB if you're not careful. In this regard Oracle has been superior.

Many years ago, Oracle used to have a beautiful, smug, and absolutely correct video on their website demonstrating this.

SQL Server 2005 introduces a new type of transaction called snapshot isolation, which is supposed to remove these sorts of problems.

ammoQ · 2006-01-26 Reply Admin

Anonymous:
Locking in SQL Server (below 2005) is problematic. Although row-level locking does exist and is used, deadlocking still occurs and can be frequent on a high traffic SQL Server DB if you're not careful. In this regard Oracle has been superior.

Many years ago, Oracle used to have a beautiful, smug, and absolutely correct video on their website demonstrating this.

SQL Server 2005 introduces a new type of transaction called snapshot isolation, which is supposed to remove these sorts of problems.

Hey, that snapshot isolation is nice! (Given your hint, I've just read some desriptions of it) This is how oracle works by default.
It's a very good thing for SQL Server that MS added this feature, it even makes the concurrent "delete - select count(*)" - transactions described in one of my previous posts possible.

jwenting · 2006-01-26 Reply Admin

Anonymous:
I wonder what would happen if they removed a user?

Nothing. The record in the user database would continue to exist but be flagged as inactive.
Thus data relating to the defunct user would still be retrievable as being related to him.

Personally I can see the validity of such reasoning which would enable the use of the table index (which should be the rowid) as userId a logical decision.
Using the rowcount on insert as the programmatic ID would then make some sense as well, saves going back to the database for a potentially expensive operation to retrieve the user information you just stored.
Somewhat risky if there's a good chance of 2 users being inserted at the same time (you might get the wrong number) but in a small scale system where only one person has the authority to insert new users that's not likely to happen.

Not that I'd do it, but I can understand the logic.

Anonymous:
Anonymous:

Why is it that nobody ever thought of having the generated ID returned as a parameter of the execute statement that executed the sql statement?

because INSERT doesn't return things, for a very good reason :

Not necessarilly true. If the insert is done using a stored procedure it may well return something.

ammoQ · 2006-01-26 Reply Admin

Anonymous:
Why is it that nobody ever thought of having the generated ID returned as a parameter of the execute statement that executed the sql statement?

The guys from Oracle thought of it:

insert into sometable(bla) values ('blubb') returning id into myhostvar;

(id is supposedly filled in the before-insert-trigger, since oracle has no auto-increment fields)

hank miller · 2006-01-26 Reply Admin

Anonymous:
Sorry to spoil the fun, but there appear to be a lot of newbie lurkers, versus me, a lurker (who forgot ot login).

This OP code sucks because before you could get access to @@ROWCOUNT, you had to run a SELECT query returning every row of the database.

Ahh.... That explains much. At first I thought rowCount was a variable in the program itself. I was thinking the real WTF was the poor variable name scheme that meant you had no idea why rowCount was being manipulated elsewhere. After looking close you would realize if should be userCount...

I don't work with sql much, and when I do it isn't in a language that has anything like the @@rowCount syntax. (either len(results), or 'select count(*) from ...;)

2006-01-26 Reply Admin

I guess that it's for calculate the last user in the DB.

quantity of users = last user in the db.

Makes sense in a world without deletes and updates.

A Representative Sample

Leave a comment on “A Representative Sample”