- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
IMO both are O(n). I can't say how efficient (or inefficient) the database handles option 2.
Admin
What would happen if you deleted a user?
Select * from Users where clue > 0
0 Records returned.
Admin
Have you been drinking today or just acting goofy or what?
Look carefully again. What if the table has 1,000,000 rows. What gets returned to the client with option 1? What gets returned to the client with option 2? Which do you think is more efficient?
Admin
I thought it was a stored procedure, where option 2 doesn't really return the result of the "SELECT *" to the client. I must admit that I don't know TSQL well. Option2 returns a result set for the "SELECT *", then the result of "SELECT @@ROWCOUNT"?
Admin
How many DBAs will be sacrificed to this table-scanner?
Contention issues? Heck, you can have huge transactions to mitigate that.
Concurrency issues? Heck, just move to Oracle, and they will evaporate away.
Now, let's see....
Countless developers using a poor design at $60 an hour.
New UNIX system (because Oracle on Windows is "obviously crap") $200,000
Oops. Forgot the Oracle license: $20,000 (depends on how good a negotiator you are.)
Reworking the application to run on Oracle would require developers at $70 per hour....
Jumping off this Hindenberg before the New Jersey airfield....priceless....
Admin
My contribution to the "The real WTF is...." discusssion. The real WTF is that he is using hungarian notation! intUserId? That is SO 1997. Even Microsoft doesn't recommend Hungarian any more.
Admin
Yes, I guess you don't .. :) even stored procedures return all resultsets back to the client; in fact, many stored procs are even supposed to !
Now, based on that, do you still feel they both are the same efficiency?
Admin
It has identity columns and scope_identity() which is better.
If you couldn't use @@IDENITY Pre-2000, you could do (from memory w/o error handling):
SET TRANSACTIONISOLATIONLEVEL REPEATABLE_READ
BEGIN TRAN
SELECT @nextid = MAX(id) + 1
FROM TABLE
INSERT TABLE ....
COMMIT TRAN
If you have a unique or primary key contraint on the ID column, you can drop the transactionisolationlevel piece and just retry on failure due to unique/primary key constraint violation errors. You invariably have an index on ID so MAX(id) is not usually a table scan.
Identity columns can create performance bottlenecks and you do have to wait for successful completion of the SQL to get the value. I think GUIDs may be the way to go now that they are directly supported by DBMSs and application development platforms.
Admin
because INSERT doesn't return things, for a very good reason :
Admin
Now see, if they would just upgrade to STORRAY, there would be no need to figure out all that hard stuff about scope_identity() and concurrency. That SQL is just too complicated!
Admin
What prize to I get for being the first one to post that it is obviously "auto generated code"?
Admin
well, i guess the best way to deal with that is to have a CHAR(1) column with something from { 'A', 'L', 'D' } in it for active, locked and deleted - you don't need to worry about rowcount decreasing then, so it's safe(er) to use, although 'COUNT(*)+1' is better...
Admin
In response to Jeff S and AmmoQ discussion.
SELECT COUNT(*) ... can often be precalculated and cached
Admin
Admin
It's different in Oracle's PL/SQL, sorry for acting like a fool.
Well, O(n)=O(n), but it might be difficult to explain that to the customer who sees 10 secs vs. 20 mins response time. ;-)
Admin
In general probably yes, but in this case it might fail, since the number of records has inevitably changed.
Admin
If it doesn't make sense, perhaps it is an attempt at a Chewbacca Defense in diguise ;-)
Admin
SQL Server has identity columns. You can't access the sequence directly. You simply insert, and after the insert, you retrieve the identity for your insert from @@scopeidentity. Some people incorrectly use @@identity, but this can be incorrect if your insert fires a trigger that inserts into another table. Because the @@identity is the most recent identity generated, whereas the @@scopeidentity is the most recent identity within your SQL block.
If you knew how many times I've found catch(...) followed by {} in the program I'm updating, you'd forgive my insanity.
Admin
Actually, the PostgreSQL people are thinking about an extension to INSERT that allows to return arbitrary columns as result, including auto-generated keys, it will go like "INSERT INTO table (name, street) VALUES ('blah', 'blub') RETURNING name, id;" However, currently they're not really shure how to deal with Triggers and Rules, and so it is not implemented yet. It might come with V8.2 or 8.3.
Admin
Actually it's about to change.
But it's trivial for an rdbms to maintain a 'rowcount' variable for each table, then incrementing for each INSERT and decrementing for each DELETE, making SELECT COUNT(*) an O(1) operation. I don't know if any do this, though.
If not, I imagine it would be a linear scan through the index, which is obviously O(N) but probably still quite cheap.
Admin
That's nitpicking ;-) Unless other parts of the program also do a count(*), it has changed since the last execution.
It's that trivial for a single user database. In a multiuser database system where ACID ist more than a buzzword the following can happen:
1. user A deletes 10 rows (no commit yet)
2. user B inserts 5 rows (no commit yet)
3. user A rollbacks his transaction
4. user A does a count(*) and since we assume a consistent look at the data, he should not see the uncommited inserts of B
5. user A deletes 10 rows
6. user B does a count(*), does not see the uncommited deletes of A but sees his own uncommited inserts
7. user C does a count(*), should see neighter the uncommited deletes of A nor the uncommited inserts of B
8. user B does a commit
9. user A does a count(*), sees his own uncommited deletes
10. user A does a commit
Now try to give each user a correct count(*) with a single rowcount variable.
Just tried that on Oracle, it does a full table scan.
Admin
This clearly explain why we need:
It get the "unique id". It solved the "YEAH! I'm FIRST!" problem.
Admin
Me too. And being rather a newbie to programming, I suggest the following: why don’t you (we, for the rare occasions where I do know a better solution) guys make it a challenge to explain clearly what the good solution would be?
Admin
AmmoQ -- you still aren't getting it. What you described simple does not happen in the order you describe, since those transactions block one another until things are done. But even if the database DID allow of these things to happen at once (say, on virtual copies of each table and somehow it merges them all together at the right point when things are commited), don't you think that a database that can do THAT would be able to keep *multiple* simple row counts?
In addition, if you are doing a table scan in Oracle on a COUNT(*) statement, then either a) oracle sucks even more than I thought or b) you have no indexes or PK on your table.
Finally, if you are contributing to a discussion in which you have no idea how the technology being discussed works (i.e., SQL Server), at least state that out front to save a lot of time and confusion. If you are prompting some discussion so you can learn about that technology (nothing wrong with that), why not ask specific questions instead of making statements like "this SQL statement is is O(n) which is equivalent to that SQL statement" and "stored procs cannnot return multiple resultsets" and so on, which are wild guesses on your part. Ask *questions* if you are not sure, don't make wild assumptions in the form of statements that you know to be true. It makes an intelligent discussion quite difficult and makes you look a little ignorant. I am not saying that you *are* ignorant, but it can make you look that way.
Admin
hopefully there's a
"/* This is one of the wrong ways of doing this that we considered"
before and a
"*/"
after it :-)
Admin
Sorry to spoil the fun, but there appear to be a lot of newbie lurkers, versus me, a lurker (who forgot ot login).
This OP code sucks because before you could get access to @@ROWCOUNT, you had to run a SELECT query returning every row of the database. This is slow... especially on a large database. But, what if, in addition to having millions of records, you had thousands of users and you ran this query during the business day? In the split second between when you ran the SELECT (that returns the whole table) and the SELECT (that returns @@ROWCOUNT), users inserted new rows? @@ROWCOUNT only shows you the results from your own last query, not the actual number. If you then insert new rows, using this now erroneous number as your user ID, multiple users will have the same ID.
Now lets say you delete some rows in the middle of the table. Same problem. Multiple users end up with the same ID.
None of this is a big deal in and of itself. But tie those user IDs to say, finance records, purchasing records, employee performace records, mailing addresses... you get the drift. The wrong information goes to the wrong person and from the database maintainer's point of view, it's impossible to figure out who belongs where.
The solution? That spoils the fun! Basically, use a 'natural' business process generated ID as your user ID. Or, if your business produces no such key, allow the database to generate it for you. There may be times where you would want to generate your own unique ID, but you wouldn't be asking such questions if those conditions applied to you, because it's horrible to do, as you can see by the other posts.
Admin
Perhaps it is part of a self incrementing insert script vs an identity? Example:
DECLARE @intUserID INT
SET @intUserId = <FONT size=+0>@@ROWCOUNT</FONT>
INSERT INTO myTable (id, firstname, lastname)
VALUES ((@intUserID +1), 'T', 'DOG')
Admin
I knew someone would raise this question.
Let's say there is a database system that does that, called "Oracle".
It could, but for what reason? No reasonable application does count(*) without a where clause repeatedly.
or c) Oracle keeps its data in a way where a full index scan is not faster than a full table scan
Finally, I know this is about SQL Server; a readily state that don't know much about it; and if you read my posts, you will see I asked several questions. O(n) was the answer to your claim "exponentially slower" which is simply untrue.
Admin
sorry. I meant thousands of times slower, not exponentially.
Some simple questions (and I still cannot believe I am playing along with you here, but what they heck): how do you compare two algorithms that are both O(n) to see which is more efficient? Are all alogrithms that are o(n) equally efficient in your mind? Do you understand that an algorithm that counts rows in a table by sending each one to a printer, one row per page, and then has the page count derived by measuring the weight of the output pages divided by the weight of each is also O(n) ?
Admin
My mistake was believing that "select * from someTable;" (within a trigger or stored procedure) would send the output to nirvana.
There is still something I do not understand, but I'm sure you can help me with that:
Does "select * from someTable;" really fetch all the data (and store it to later return it to the client) or does it merely open a cursor and returns that?
If it is a cursor, how does it know @@rowcount before fetching to the last row?
Admin
I think the cursor vs full fetch is supposed to be "at the discretion of the database" and developers should not be relying on any particular behavior under those conditions and assume the DB does whatever it thinks is best under the circumstances.
By the way, have you tried select 1 from someTable instead of select * from someTable? I think that if the table has an index on a primary key, it may be able to avoid the TableScan. Then again, if the table is small, Oracle may rightly decide to do the TableScan...
Admin
I've used databases where the row count was the only auto-incrementing mechanism you had. Not modern relational databases, but the little embedded databases you have to use in small-memory or non-hosted environments can be amazingly limited.
Admin
There's just one correct way to do what we all infer the author of the original "code" was doing: the function scope_identity() paired with an Identity column in the table. It's fast, simple, and it always works, no matter how many users are on the system.
Select * from table returns the entire table to the client.
@@rowcount then contains the number of rows that were selected. That is very slow. O to the whatever, it makes no difference.
If Identity is not available for some reason, then you need to maintain a counter elsewhere in the database with the next available value, and use a stored proc and some careful transaction handling to perform inserts that are numbered correctly. Not to many people do that, because the Identity function is easier and it works. I think mainly pre-SQL Server 2000 systems have workarounds like that.
Admin
"select count(1)" also does a full table scan. The result is ~ 4 Mio, I don't think this is a "small" table. But an Oracle table is not an ISAM file or something similar. Scanning through the table is not slower than scanning through the index. I tried
"select count(*) from sometable where id>0" and it takes excactly as long (but does a index range scan that eventually retrieves all rows).
Admin
Hiring fleet of Oracle DBAs at $150/hr...
Hiring everyone's favorite Oracle Consultant at $400/hr...
Admin
OR... You could use a Natural Key (Braces for avalanche of indignity [:|])
Admin
Wow, this is a very impressive WTF! In fact this code should have its own thread. My guess is that the foreign key did actually exist as a separate column in the </string></string></var></string></var></var>ITINERARYITEMS table, but this developer thought this would be more of a challenge.
Admin
If I recall IDENTITY still has issues with Sybase where if there is a system problem the seeding values get all screwed up, thus we avoid identity columns in Sybase and yes we are on a late version!
Admin
Well, there's no concurrency issues, so long as each call occurs within the same database transaction as the initial insert.
My question for you is, sup with this "1 = 1" nonsense? You're not still writing ad-hoc queries that append AND clauses, are you?
This is the 21st Century -- abstract that shit, monkey!
Admin
That's a pretty poor idea -- deleting a user. It's begging for NPEs whenever you pull up any record of work that user performed or any other record tied to their records.
Far better to have a user liveness field.
Admin
that's broken too. (try running it on your users table)
whatever happened to:
declare @intUserId int
select @intUserId = @userId from USERS where UserName = 'Yer Mum'
Admin
More like sql%rowcount.
l.
Admin
Say good bye to multiuser systems then. Haven't heard that for a while. sqlserver is really that backwards? Can't believe that.
Might be a nice idea, but how and where would you store that? Those multiple versions you are talking about are called undo (rollback) in Oracle.
Depending on the statistics, Oracle's optimizer could decide to use a full table scan - they are not "evil" on Oracle.
I have a question (only got Sybase/Oracle/a bit DB2 experience): You really can't delete two different records in the same table from two different sessions without causing a lock?
l.
Admin
That might (with some luck) be true with older versions of SQL Server (page level locking); but I think what Jeff meant was that delete blocks select count(*) and vice versa.
Admin
Admin
Hey, that snapshot isolation is nice! (Given your hint, I've just read some desriptions of it) This is how oracle works by default.
It's a very good thing for SQL Server that MS added this feature, it even makes the concurrent "delete - select count(*)" - transactions described in one of my previous posts possible.
Admin
Nothing. The record in the user database would continue to exist but be flagged as inactive.
Thus data relating to the defunct user would still be retrievable as being related to him.
Personally I can see the validity of such reasoning which would enable the use of the table index (which should be the rowid) as userId a logical decision.
Using the rowcount on insert as the programmatic ID would then make some sense as well, saves going back to the database for a potentially expensive operation to retrieve the user information you just stored.
Somewhat risky if there's a good chance of 2 users being inserted at the same time (you might get the wrong number) but in a small scale system where only one person has the authority to insert new users that's not likely to happen.
Not that I'd do it, but I can understand the logic.
Not necessarilly true. If the insert is done using a stored procedure it may well return something.
Admin
The guys from Oracle thought of it:
insert into sometable(bla) values ('blubb') returning id into myhostvar;
(id is supposedly filled in the before-insert-trigger, since oracle has no auto-increment fields)
Admin
Ahh.... That explains much. At first I thought rowCount was a variable in the program itself. I was thinking the real WTF was the poor variable name scheme that meant you had no idea why rowCount was being manipulated elsewhere. After looking close you would realize if should be userCount...
I don't work with sql much, and when I do it isn't in a language that has anything like the @@rowCount syntax. (either len(results), or 'select count(*) from ...;)
Admin
I guess that it's for calculate the last user in the DB.
quantity of users = last user in the db.
Makes sense in a world without deletes and updates.