• (cs)

    First. Or 758475th, if using this code... Seriously, how do people come up with nonsense like this and think it's a good idea? At least they should have the decency to do something like a "select id from invoices", then abuse that to find a random ID that isn't used in php, instead of doing a new query every single time they fail to find an unused ID.

  • Burdieburd (unregistered)

    But why is it suddenly slowing down? This would just gradually become slower as the table filled up, and take forever after no slots were left.

  • *sigh* (unregistered)

    Once again, anybody who was a decent programmer needed to stop and say "there has to be a better way than this..." oh well like many other WTF's I suspect this wasn't exactly a decent programmer...

  • Josh (unregistered) in reply to Burdieburd

    If you graphed average the time it takes to generate the ID as the 'table filled up,' as you say, it would not be gradual, it would take exponentially longer for each entry that is put in the table (approaching the max number of IDs, where you're right, it takes infinite time). So, to answer your question, when it gets close to the max number of unique IDs, it takes a really really long time. Sure, the majority of the searches early on don't take long, but then when you're running short on available IDs...

  • (cs)
    Yup, this code has wound up right where it belongs.
    On The Daily WTF?
  • (cs)

    One of the most trivial tasks on the internet is finding a "developer" that advocates implementing all database checks and constraints (and most datatypes) in php, and letting MySQL do what it does best (never complain).

    And since their code is a perfect and beautiful as they are, "all checks and constraints" starts out as nothing and quickly ends up looking like a cancerous tumor covered in poison ivy and genital warts.

  • (cs)
    User1                              User2
    GetNewId                           GetNewId
    n=Random(000000,999999) // 1234    n=Random(000000,999999) // 1234
    select                             select
    if 1234 not in db                  if 1234 not in db
       insert 1234                        insert 1234
    
    *universe explodes*
    
    
  • (cs) in reply to Burdieburd
    Burdieburd:
    But why is it suddenly slowing down?
    I don't see a "suddenly" in the article. It's probably just passed the point at which someone perceives that it's slower than it used to be. Also note the lack of index, so with n rows it could take O(n) time per query and the expected number of queries is N / (N - n) where N is the number of possible keys. nN / (N - n) grows fairly fast once n isn't small.

    Incidentally, I'm not familiar with PHP but am I correct in thinking that the error handling is dodgy? It seems to echo an error message and carry straight on.

  • (cs) in reply to Code Dependent
    Code Dependent:
    Yup, this code has wound up right where it belongs.
    On The Daily WTF?

    In a big business production environment, of course!

  • Vollhorst (unregistered)

    He could have run that code in multiple threads to speed it up!

  • SomeCoder (unregistered)

    Sorry to be pedantic but "eachother" is NOT a word. It's a big pet peeve of mine since way back in school, I had a teacher attempt to teach us that "eachother" was correct. It's two words people, "EACH" and "OTHER".

    /pedantic

    I guess the only possible explanation for this code being written is that the developer had exactly zero knowledge of SQL. I... guess... I'm grasping at straws here because this is pretty terrible :)

  • DaveShaw (unregistered) in reply to Vollhorst
    Vollhorst:
    He could have run that code in multiple threads to speed it up!

    Argghh! That made me Laugh.

    So what happens when there are 999999 invoices in the database...??

    Dave

  • Eric (unregistered) in reply to Vollhorst
    Vollhorst:
    He could have run that code in multiple threads to speed it up!
    He should have had something in there to measure how long the last search for an ID took, and then fork into multiple threads, the number of which being proportional to the time of the last search. That's how I would have done it, at least.
  • StrVariant (unregistered) in reply to Vollhorst

    I was waiting for a nice For Loop.

  • (cs) in reply to kastein

    Maybe this programmer was inspired by the Futurama episode where Bender was to be executed when a big random number generator reached 0.

    "42!" "89!" "347!" "12!" "Hey look, 0!"

    zzzzzaaaaaappp!

  • Vollhorst (unregistered)

    That would be better:

    for(int i = 0; i < 10000; i++)
     switch(i){
       case 0: if (PSEUDO_INSERT_WITH_0_SUCCEEDED){ i = 10000; break; }
       case 1: if (PSEUDO_INSERT_WITH_1_SUCCEEDED){ i = 10000; break; }
       ...
     }
    
  • OldCoder (unregistered) in reply to StrVariant
    StrVariant:
    I was waiting for a nice For Loop.
    With a Case statement?
  • AnthonyB (unregistered)

    It took me a while, but I've figured out what was wrong. He should have been storing every try in an array and then checked the array before seeing if the ID was valid. Otherwise that code could have generated the same reserved id multiple times, and that is just inefficient.

  • (cs) in reply to DaveShaw
    DaveShaw:
    So what happens when there are 999999 invoices in the database...??

    You sing a drinking song: 999999 invoices of data in the database, 999999 invoices of data are there, DELETE one good, like you know you should, 999998 invoices of data in the database...

    ad naseum

  • Pita (unregistered)

    Was there an index on the 'code' column? Just being an agitant :)

  • (cs) in reply to SomeCoder
    SomeCoder:
    It's a big pet peeve of mine since way back in school, I had a teacher attempt to teach us that "eachother" was correct.
    We pedants all have our crosses to bear.
    SomeCoder:
    I guess the only possible explanation for this code being written is that the developer had exactly zero knowledge.
    Fixed that for you.
  • (cs)

    The sad thing is that after reading the title and the first paragraph, I could see exactly what the WTF was going to be. The rest of the article just confirmed my suspicions.

  • blah (unregistered)

    Stop posting comments!!! I keep colliding with your stupid IDs!

  • icywindow (unregistered)

    Or perhaps he should have taken advantage of Perl's relationship with PHP and put it into a hash function to get the number out, then increment until you find the next untaken number, after all, 1000000 is prime, right guys?

  • VLPRONJ (unregistered)

    IANARP (I am not a real programmer), and even I can tell that random numbers have limited application in business invoicing...

  • LHC (unregistered) in reply to blah
    blah:
    Stop posting comments!!! I keep colliding with your stupid IDs!

    POST!

  • Untouchable (unregistered)
    $code = mt_rand(1000000, 99999999);

    Because you know, Just do what it takes.

  • rycamor (unregistered)

    Why bother with a loop? Just increase the range and you're fine. At least, that's the thinking behind the transaction id generator' in a particular PHP/MySQL financial app I inherited:

    function tid()

    {

    $tid=srand((double) microtime() * 1000000);

    $randa = rand(48,57);

    $randb = rand(48,57);

    $randc = rand(48,57);

    $randd = rand(48,57);

    $rande = rand(48,57);

    $randf = rand(48,57);

    $randg = rand(48,57);

    $randh = rand(48,57);

    $randi = rand(48,57);

    $str = sprintf("0%c%c%c%c%c%c%c%c%c",$randa,$randb,$randc,$randd,$rande,$randf,$randg,$randh,$randi);

    return $str;

    }

    NO I KID YOU NOT. And, of course the 'tid' column in the database was char(10) with no primary key constraint. By the time I arrived, they were getting strange errors in certain transactions and couldn't figure out why.

    CAPTCHA (which I have to mention in this instance): validus

  • Untouchable (unregistered) in reply to Untouchable
    Untouchable:
    $code = mt_rand(1000000, 99999999);
    Because you know, Just do what it takes.
    And of course, increase the size of the field
  • rycamor (unregistered) in reply to rycamor

    Oh and yes you did notice that it only generates 9 'random' digits and prefixes everything with an inexplicable 0. I suppose they were going to 'open up' that additional range if they started getting busy.

  • (cs)

    I must confess that I have written random ID generators in code to feed to a table.

    In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

    By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

  • VLPRONJ (unregistered) in reply to DaveShaw

    Obviously, the company has done well enough to hire a programmer with more skills!

  • LHC (unregistered) in reply to WayneCollins

    Been awhile, but maybe...

    @@IDENTITY returns the last inserted identity value

    Just have your row identfied as an identity

  • Zapp Brannigan (unregistered) in reply to AnthonyB
    AnthonyB:
    It took me a while, but I've figured out what was wrong. He should have been storing every try in an array and then checked the array before seeing if the ID was valid. Otherwise that code could have generated the same reserved id multiple times, and that is just inefficient.
    My first guess would have been that he was seeding the random number generator with the same seed each time. I like the way Informix handles auto-number/sequences this better than Oracle.
  • but why (unregistered)

    Why aren't the invoice numbers allowed to be sequential ?

  • (cs) in reply to LHC
    LHC:
    Been awhile, but maybe...

    @@IDENTITY returns the last inserted identity value

    Just have your row identfied as an identity

    Yeah, but that's still two requests.

    A stored procedure could do it in one request. I should have done that, but I was still revolted by TSQL and PL/SQL and tried to do everything from a "real" language...

  • (cs) in reply to rycamor
    rycamor:
    Why bother with a loop? Just increase the range and you're fine. At least, that's the thinking behind the transaction id generator' in a particular PHP/MySQL financial app I inherited:

    function tid()

    {

    $tid=srand((double) microtime() * 1000000);

    $randa = rand(48,57); ...

    $str = sprintf("0%c%c%c%c%c%c%c%c%c",$randa,$randb,$randc,$randd,$rande,$randf,$randg,$randh,$randi);

    return $str;

    }

    What's the magic of 48 and 57?

    They really should have beefed it up a bit:

    $randa = rand(rand(40,49), rand(50,59); ...

    And then shuffle $randa through $randi just to be sure :)

  • ShatteredArm (unregistered)

    This code is stupid and inefficient. He should've preemptively generated several IDs, say 100, and query all of those at the same time, like so (python-esque pseudocode):

    idlist = [randomid() for i in range(100)] existing_invoices = query("SELECT * FROM invoices WHERE invoicenumber IN (" + idlist.join(",") + ")") for id in idlist: if not existing_invoices.contains(id): query("INSERT... ") break

  • (cs) in reply to WayneCollins
    WayneCollins:
    I must confess that I have written random ID generators in code to feed to a table.

    In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

    By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

    128 bits? You're essentially using GUIDs. A company that I worked at 10 years ago used GUIDs for their ID column -- SQL Server supports it.

  • (cs) in reply to WayneCollins
    WayneCollins:
    I must confess that I have written random ID generators in code to feed to a table.

    In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

    By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

    You could use the primary key, add each one sequentially, then if you don't want customers to know what their number is, hash it up and give them that number. Save the hash in the db and you can search by that, so you could write something like "select * from customers where hashID = value"

    I do PLC programming and I know that much.

  • (cs) in reply to themagni
    themagni:
    WayneCollins:
    I must confess that I have written random ID generators in code to feed to a table.

    In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

    By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

    You could use the primary key, add each one sequentially, then if you don't want customers to know what their number is, hash it up and give them that number. Save the hash in the db and you can search by that, so you could write something like "select * from customers where hashID = value"

    I do PLC programming and I know that much.

    Yeah, that's great, until some other app, or multiple instances of the same app, want to insert things into the same table.

  • (cs) in reply to JamesQMurphy
    JamesQMurphy:
    128 bits? You're essentially using GUIDs. A company that I worked at 10 years ago used GUIDs for their ID column -- SQL Server supports it.

    This was Oracle. I might have overlooked its GUID type if it had one. What I ended up doing was randomly generating two longs in C#, then concatting their "ToString()" output. Which, come to think of it, means that there was less than 128 bits of randomness, since multiple pairs of numbers yield the same concatted string.

    Oops.

  • Anon (unregistered)

    Obviously the solution here is another table, remaining_ids, with all possible IDs. Then when you need an ID you randomly select one from that table and delete it. No endless looping trying to find the next available number!

  • (cs)

    I almost hate to say this, but since I love playing devil's advocate, I actually enjoy it.

    In principle, it is done correctly, the problem is that the range of allowed random numbers is insufficient. Version 4 UUIDs are generated as 122-bit pseudorandom numbers. If he was issuing a GUID and retrying in the event of a conflict he would be applauded for having error detection/correction that many people omit.

    Of course, it would make even more sense to use an auto-increment integer column with a unique constraint, as so many have pointed out, but that doesn't mean that this method is invalid.

  • Anon (unregistered) in reply to but why
    but why:
    Why aren't the invoice numbers allowed to be sequential ?

    Because you can't have a customer place order number 800234 one day and then order 800235 the next day and realize that you have no other customers! Same reason you don't start your invoice numbers at 000001.

  • Anonymous (unregistered) in reply to WayneCollins

    After your insert query, call this:

    SELECT LAST_INSERT_ID() FROM

    It will be thread safe if you're not using persistent connections.

  • I've seen this (unregistered) in reply to jaspax
    jaspax:
    The sad thing is that after reading the title and the first paragraph, I could see exactly what the WTF was going to be. The rest of the article just confirmed my suspicions.

    Same thing here. And on top of that, I know exactly how this code came into existence:

    Developer builds invoicing system. PHB says "I don't want the unique invoice numbers incrementing because then our clients will think we don't do much business. Can you make them random and still unique?"

  • (cs)

    I could tell almost exactly what was coming after the first paragraph, but the random number still came as a surprise. Guess I haven't been here long enough.

    Also, obligatory: while($comment != 'WTF') $comment = generateRandomComment(); post($comment);

  • Jonas (unregistered)

    The follow WTF was in a project I had to fix some month ago. I think it the only way to make it even more worse:

    good_key = false while(good_key == false) key = generate_random_key(5) g = Guest.find_by_login_code(key) if g.blank? key = generate_random_key(5) good_key = true end end

  • Bob (unregistered)

    hm... why not let MySQL handle ID generation?

    CREATE TABLE IF NOT EXISTS `invoices` (
      `code` bigint NOT NULL auto_increment,
      `DtCode` date NOT NULL default '0000-00-00',
      `Type` varchar(100) NOT NULL default '',
      PRIMARY KEY(`code`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
    

    If you really need the code to be a certain number of characters then just use str_pad() after you fetch the value. Or you could just use a real DBMS (since MySQL is horrible for anything non-trivial).

Leave a comment on “The Quest for the Unique ID”

Log In or post as a guest

Replying to comment #249239:

« Return to Article