The Daily WTF: Curious Perversions in Information Technology

kastein · 2009-03-13 Reply Admin

First. Or 758475th, if using this code... Seriously, how do people come up with nonsense like this and think it's a good idea? At least they should have the decency to do something like a "select id from invoices", then abuse that to find a random ID that isn't used in php, instead of doing a new query every single time they fail to find an unused ID.

2009-03-13 Reply Admin

But why is it suddenly slowing down? This would just gradually become slower as the table filled up, and take forever after no slots were left.

2009-03-13 Reply Admin

Once again, anybody who was a decent programmer needed to stop and say "there has to be a better way than this..." oh well like many other WTF's I suspect this wasn't exactly a decent programmer...

2009-03-13 Reply Admin

If you graphed average the time it takes to generate the ID as the 'table filled up,' as you say, it would not be gradual, it would take exponentially longer for each entry that is put in the table (approaching the max number of IDs, where you're right, it takes infinite time). So, to answer your question, when it gets close to the max number of unique IDs, it takes a really really long time. Sure, the majority of the searches early on don't take long, but then when you're running short on available IDs...

Code Dependent · 2009-03-13 Reply Admin

Yup, this code has wound up right where it belongs.

On The Daily WTF?

obediah · 2009-03-13 Reply Admin

One of the most trivial tasks on the internet is finding a "developer" that advocates implementing all database checks and constraints (and most datatypes) in php, and letting MySQL do what it does best (never complain).

And since their code is a perfect and beautiful as they are, "all checks and constraints" starts out as nothing and quickly ends up looking like a cancerous tumor covered in poison ivy and genital warts.

snoofle · 2009-03-13 Reply Admin

User1                              User2
GetNewId                           GetNewId
n=Random(000000,999999) // 1234    n=Random(000000,999999) // 1234
select                             select
if 1234 not in db                  if 1234 not in db
   insert 1234                        insert 1234

*universe explodes*

pjt33 · 2009-03-13 Reply Admin

Burdieburd:
But why is it suddenly slowing down?

I don't see a "suddenly" in the article. It's probably just passed the point at which someone perceives that it's slower than it used to be. Also note the lack of index, so with n rows it could take O(n) time per query and the expected number of queries is N / (N - n) where N is the number of possible keys. nN / (N - n) grows fairly fast once n isn't small.

Incidentally, I'm not familiar with PHP but am I correct in thinking that the error handling is dodgy? It seems to echo an error message and carry straight on.

citricsquid · 2009-03-13 Reply Admin

Code Dependent:
Yup, this code has wound up right where it belongs.
On The Daily WTF?

In a big business production environment, of course!

2009-03-13 Reply Admin

He could have run that code in multiple threads to speed it up!

2009-03-13 Reply Admin

Sorry to be pedantic but "eachother" is NOT a word. It's a big pet peeve of mine since way back in school, I had a teacher attempt to teach us that "eachother" was correct. It's two words people, "EACH" and "OTHER".

/pedantic

I guess the only possible explanation for this code being written is that the developer had exactly zero knowledge of SQL. I... guess... I'm grasping at straws here because this is pretty terrible :)

2009-03-13 Reply Admin

Vollhorst:
He could have run that code in multiple threads to speed it up!

Argghh! That made me Laugh.

So what happens when there are 999999 invoices in the database...??

Dave

2009-03-13 Reply Admin

Vollhorst:
He could have run that code in multiple threads to speed it up!

He should have had something in there to measure how long the last search for an ID took, and then fork into multiple threads, the number of which being proportional to the time of the last search. That's how I would have done it, at least.

2009-03-13 Reply Admin

I was waiting for a nice For Loop.

Nether · 2009-03-13 Reply Admin

Maybe this programmer was inspired by the Futurama episode where Bender was to be executed when a big random number generator reached 0.

"42!" "89!" "347!" "12!" "Hey look, 0!"

zzzzzaaaaaappp!

2009-03-13 Reply Admin

That would be better:

for(int i = 0; i < 10000; i++)
 switch(i){
   case 0: if (PSEUDO_INSERT_WITH_0_SUCCEEDED){ i = 10000; break; }
   case 1: if (PSEUDO_INSERT_WITH_1_SUCCEEDED){ i = 10000; break; }
   ...
 }

2009-03-13 Reply Admin

StrVariant:
I was waiting for a nice For Loop.

With a Case statement?

2009-03-13 Reply Admin

It took me a while, but I've figured out what was wrong. He should have been storing every try in an array and then checked the array before seeing if the ID was valid. Otherwise that code could have generated the same reserved id multiple times, and that is just inefficient.

JamesQMurphy · 2009-03-13 Reply Admin

DaveShaw:
So what happens when there are 999999 invoices in the database...??

You sing a drinking song: 999999 invoices of data in the database, 999999 invoices of data are there, DELETE one good, like you know you should, 999998 invoices of data in the database...

ad naseum

2009-03-13 Reply Admin

Was there an index on the 'code' column? Just being an agitant :)

Code Dependent · 2009-03-13 Reply Admin

SomeCoder:
It's a big pet peeve of mine since way back in school, I had a teacher attempt to teach us that "eachother" was correct.

We pedants all have our crosses to bear.

SomeCoder:
I guess the only possible explanation for this code being written is that the developer had exactly zero knowledge.

Fixed that for you.

jaspax · 2009-03-13 Reply Admin

The sad thing is that after reading the title and the first paragraph, I could see exactly what the WTF was going to be. The rest of the article just confirmed my suspicions.

2009-03-13 Reply Admin

Stop posting comments!!! I keep colliding with your stupid IDs!

2009-03-13 Reply Admin

Or perhaps he should have taken advantage of Perl's relationship with PHP and put it into a hash function to get the number out, then increment until you find the next untaken number, after all, 1000000 is prime, right guys?

2009-03-13 Reply Admin

IANARP (I am not a real programmer), and even I can tell that random numbers have limited application in business invoicing...

2009-03-13 Reply Admin

blah:
Stop posting comments!!! I keep colliding with your stupid IDs!

POST!

2009-03-13 Reply Admin

$code = mt_rand(1000000, 99999999);

Because you know, Just do what it takes.

2009-03-13 Reply Admin

Why bother with a loop? Just increase the range and you're fine. At least, that's the thinking behind the transaction id generator' in a particular PHP/MySQL financial app I inherited:

function tid()

{

$tid=srand((double) microtime() * 1000000);

$randa = rand(48,57);

$randb = rand(48,57);

$randc = rand(48,57);

$randd = rand(48,57);

$rande = rand(48,57);

$randf = rand(48,57);

$randg = rand(48,57);

$randh = rand(48,57);

$randi = rand(48,57);

$str = sprintf("0%c%c%c%c%c%c%c%c%c",$randa,$randb,$randc,$randd,$rande,$randf,$randg,$randh,$randi);

return $str;

}

NO I KID YOU NOT. And, of course the 'tid' column in the database was char(10) with no primary key constraint. By the time I arrived, they were getting strange errors in certain transactions and couldn't figure out why.

CAPTCHA (which I have to mention in this instance): validus

2009-03-13 Reply Admin

Untouchable:
$code = mt_rand(1000000, 99999999);
Because you know, Just do what it takes.

And of course, increase the size of the field

2009-03-13 Reply Admin

Oh and yes you did notice that it only generates 9 'random' digits and prefixes everything with an inexplicable 0. I suppose they were going to 'open up' that additional range if they started getting busy.

WayneCollins · 2009-03-13 Reply Admin

I must confess that I have written random ID generators in code to feed to a table.

In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

2009-03-13 Reply Admin

Obviously, the company has done well enough to hire a programmer with more skills!

2009-03-13 Reply Admin

Been awhile, but maybe...

@@IDENTITY returns the last inserted identity value

Just have your row identfied as an identity

2009-03-13 Reply Admin

AnthonyB:
It took me a while, but I've figured out what was wrong. He should have been storing every try in an array and then checked the array before seeing if the ID was valid. Otherwise that code could have generated the same reserved id multiple times, and that is just inefficient.

My first guess would have been that he was seeding the random number generator with the same seed each time. I like the way Informix handles auto-number/sequences this better than Oracle.

2009-03-13 Reply Admin

Why aren't the invoice numbers allowed to be sequential ?

WayneCollins · 2009-03-13 Reply Admin

LHC:
Been awhile, but maybe...
@@IDENTITY returns the last inserted identity value

Just have your row identfied as an identity

Yeah, but that's still two requests.

A stored procedure could do it in one request. I should have done that, but I was still revolted by TSQL and PL/SQL and tried to do everything from a "real" language...

obediah · 2009-03-13 Reply Admin

rycamor:
Why bother with a loop? Just increase the range and you're fine. At least, that's the thinking behind the transaction id generator' in a particular PHP/MySQL financial app I inherited:
function tid()

{

$tid=srand((double) microtime() * 1000000);

$randa = rand(48,57); ...

$str = sprintf("0%c%c%c%c%c%c%c%c%c",$randa,$randb,$randc,$randd,$rande,$randf,$randg,$randh,$randi);

return $str;

}

What's the magic of 48 and 57?

They really should have beefed it up a bit:

$randa = rand(rand(40,49), rand(50,59); ...

And then shuffle $randa through $randi just to be sure :)

2009-03-13 Reply Admin

This code is stupid and inefficient. He should've preemptively generated several IDs, say 100, and query all of those at the same time, like so (python-esque pseudocode):

idlist = [randomid() for i in range(100)] existing_invoices = query("SELECT * FROM invoices WHERE invoicenumber IN (" + idlist.join(",") + ")") for id in idlist: if not existing_invoices.contains(id): query("INSERT... ") break

JamesQMurphy · 2009-03-13 Reply Admin

WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

128 bits? You're essentially using GUIDs. A company that I worked at 10 years ago used GUIDs for their ID column -- SQL Server supports it.

themagni · 2009-03-13 Reply Admin

WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

You could use the primary key, add each one sequentially, then if you don't want customers to know what their number is, hash it up and give them that number. Save the hash in the db and you can search by that, so you could write something like "select * from customers where hashID = value"

I do PLC programming and I know that much.

WayneCollins · 2009-03-13 Reply Admin

themagni:
WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

You could use the primary key, add each one sequentially, then if you don't want customers to know what their number is, hash it up and give them that number. Save the hash in the db and you can search by that, so you could write something like "select * from customers where hashID = value"

I do PLC programming and I know that much.

Yeah, that's great, until some other app, or multiple instances of the same app, want to insert things into the same table.

WayneCollins · 2009-03-13 Reply Admin

JamesQMurphy:
128 bits? You're essentially using GUIDs. A company that I worked at 10 years ago used GUIDs for their ID column -- SQL Server supports it.

This was Oracle. I might have overlooked its GUID type if it had one. What I ended up doing was randomly generating two longs in C#, then concatting their "ToString()" output. Which, come to think of it, means that there was less than 128 bits of randomness, since multiple pairs of numbers yield the same concatted string.

Oops.

2009-03-13 Reply Admin

Obviously the solution here is another table, remaining_ids, with all possible IDs. Then when you need an ID you randomly select one from that table and delete it. No endless looping trying to find the next available number!

Kaosadvokit · 2009-03-13 Reply Admin

I almost hate to say this, but since I love playing devil's advocate, I actually enjoy it.

In principle, it is done correctly, the problem is that the range of allowed random numbers is insufficient. Version 4 UUIDs are generated as 122-bit pseudorandom numbers. If he was issuing a GUID and retrying in the event of a conflict he would be applauded for having error detection/correction that many people omit.

Of course, it would make even more sense to use an auto-increment integer column with a unique constraint, as so many have pointed out, but that doesn't mean that this method is invalid.

2009-03-13 Reply Admin

but why:
Why aren't the invoice numbers allowed to be sequential ?

Because you can't have a customer place order number 800234 one day and then order 800235 the next day and realize that you have no other customers! Same reason you don't start your invoice numbers at 000001.

2009-03-13 Reply Admin

After your insert query, call this:

SELECT LAST_INSERT_ID() FROM

It will be thread safe if you're not using persistent connections.

2009-03-13 Reply Admin

jaspax:
The sad thing is that after reading the title and the first paragraph, I could see exactly what the WTF was going to be. The rest of the article just confirmed my suspicions.

Same thing here. And on top of that, I know exactly how this code came into existence:

Developer builds invoicing system. PHB says "I don't want the unique invoice numbers incrementing because then our clients will think we don't do much business. Can you make them random and still unique?"

lolwtf · 2009-03-13 Reply Admin

I could tell almost exactly what was coming after the first paragraph, but the random number still came as a surprise. Guess I haven't been here long enough.

Also, obligatory: while($comment != 'WTF') $comment = generateRandomComment(); post($comment);

2009-03-13 Reply Admin

The follow WTF was in a project I had to fix some month ago. I think it the only way to make it even more worse:

good_key = false while(good_key == false) key = generate_random_key(5) g = Guest.find_by_login_code(key) if g.blank? key = generate_random_key(5) good_key = true end end

2009-03-13 Reply Admin

hm... why not let MySQL handle ID generation?

CREATE TABLE IF NOT EXISTS `invoices` (
  `code` bigint NOT NULL auto_increment,
  `DtCode` date NOT NULL default '0000-00-00',
  `Type` varchar(100) NOT NULL default '',
  PRIMARY KEY(`code`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

If you really need the code to be a certain number of characters then just use str_pad() after you fetch the value. Or you could just use a real DBMS (since MySQL is horrible for anything non-trivial).

The Quest for the Unique ID

Leave a comment on “The Quest for the Unique ID”