The Daily WTF: Curious Perversions in Information Technology

2009-03-15 Reply Admin

Could it be that the Random Number generator is not being seeded before use. So its walking through the same set of "random" numbers on each call?

I looked it up in A PHP manual and seeding was required until version 4.2.0. It would certainly make a bad solution worse.

2009-03-15 Reply Admin

Untouchable:
Untouchable:
$code = mt_rand(1000000, 99999999);
Because you know, Just do what it takes.
And of course, increase the size of the field

What, rob us of tomorrow's WTF? Better yet, switch to SQLite for the 'backend'and forget all about field sizes, data constraints, etc. If you think MySQL can be made not to squeal, you haven't seen anything yet. (This should not be taken as an endorsement of SQLite as a Enterprise or Production ready DB, you should always consult your DBA (or at least a competent one) before making any decisions on switching your backend toolsets.)

I'm still trying to figure out who the business person was who said 'Sure we want random numbers on our invoices! It's not like we'd use the number itself to give us an indication of how old the order is, where it might be in the paper records, etc. Also, all that fun associated with someone picking order 747747 and sending it to the shipping address for 774747 would be avoided! Not to mention that it just confuses our pickers to be able to refer to an invoice by the last 2 digits because we NEVER have more than 60 or so active orders in train, no, far more important that they learn to exercise their verbal skills by having to quote the full number every time.'

TRWTF seems to be that business would find this a logical (and possibly preferred) option for invoice numbers. Alarm bells should have been ringing at acceptance testing.

2009-03-15 Reply Admin

Maybe it was intentional:

Algorithm runs fine at first
Pertformance slows, then nosedives
Consultant comes in to "upgrade" the system
Profit!

2009-03-15 Reply Admin

Kaosadvokit:
I almost hate to say this, but since I love playing devil's advocate, I actually enjoy it.
In principle, it is done correctly, the problem is that the range of allowed random numbers is insufficient. Version 4 UUIDs are generated as 122-bit pseudorandom numbers. If he was issuing a GUID and retrying in the event of a conflict he would be applauded for having error detection/correction that many people omit.

Of course, it would make even more sense to use an auto-increment integer column with a unique constraint, as so many have pointed out, but that doesn't mean that this method is invalid.

On the plus side a UUI has almost no chance of a collision with anyone else's UUID (whether an invoice or otherwise), weven without a key constraint.

On the minus side imagine trying to query an invoice with a UUID over the phone"

"Hello, yes I have a problem with invoice number eff four seven ay see one zero be dash eight see see dash four three seven two ..."

"Sorry was that eff four seven ay see one zero be dash eight see see dash four two seven two ?"

"No I said eff four seven ay see one zero be dash eight see see dash four three seven two... oh never mind, I'll just pay it."

2009-03-15 Reply Admin

I've seen this:

Developer builds invoicing system. PHB says "I don't want the unique invoice numbers incrementing because then our clients will think we don't do much business. Can you make them random and still unique?"

Developer uses autonumber field in database for the key, and adds N random digits (where 10**N = number of imaginary customers) for the "invoice number".

2009-03-16 Reply Admin

I have worked on a system where invoices numbers where genreated from random numbers, just like this. And this was defenitely not a WTF but a very strong requirement from a customer who had insisted not to have non consecutive invoices numbers, just to make it easier not to write up an invoice into the accounting system and have the tax inspector get mad when trying to figure out if the customer was cheating or not.

2009-03-16 Reply Admin

DaveShaw:
Vollhorst:
He could have run that code in multiple threads to speed it up!

Argghh! That made me Laugh.

So what happens when there are 999999 invoices in the database...??

Dave

You just need an infinite number of threads

dkf · 2009-03-16 Reply Admin

acid:
Better yet, switch to SQLite for the 'backend'and forget all about field sizes, data constraints, etc. If you think MySQL can be made not to squeal, you haven't seen anything yet. (This should not be taken as an endorsement of SQLite as a Enterprise or Production ready DB, you should always consult your DBA (or at least a competent one) before making any decisions on switching your backend toolsets.)

Funnily enough, I know someone who did just that. Apparently he's more satisfied with it than the alternative (Oracle) in that situation. I'm not sure whether I would have been that brave...

2009-03-16 Reply Admin

Anyone who chooses "pick a random ID and see if it already exists" over "start at zero and work upwards" is not really a software developer. This one clearly hasn't realised yet but sooner or later he'll get the picture and figure out which career path is actually right for him (or her).

2009-03-16 Reply Admin

SomeCoder:
... I guess the only possible explanation for this code being written is that the developer had exactly zero knowledge of SQL. I... guess... I'm grasping at straws here because this is pretty terrible :)

In my experience, there are a lot of PHP (and other web-app platform) developers out there with, as you say, zero knowledge of SQL.

As we all know, even a little SQL know-how goes a long way towards reducing code bloat and increasing robustness and yet code like we see above seems strangely common.

It's like some developers don't consider SQL to be "real code" or "worth learning" for some reason.

Probably they are just lazy, and used to working in situations where no-one ever sees their code, just the output.

2009-03-16 Reply Admin

Anon:
but why:
Why aren't the invoice numbers allowed to be sequential ?

Because you can't have a customer place order number 800234 one day and then order 800235 the next day and realize that you have no other customers! Same reason you don't start your invoice numbers at 000001.

So mess with the output.

2009-03-16 Reply Admin

I'm surprised 210 posts has failed to mention the birthday paradox: http://en.wikipedia.org/wiki/Birthday_paradox It's one of the reasons this is a really poor idea.

+1 nerd points to me.

Smash King · 2009-03-16 Reply Admin

WayneCollins:
By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

Sorry if this has already been answered and the case is closed, but for the sake of my sanity I won't read more than 200 posts just to know it.

In a previous job I used a database called Firebird (which is actually good for medium-size databases) and one of my co-workers found out a pretty good way to avoid concurrent inserts.

Firebird has a structure called generators, which do exactly what you asked. You can create a generator associated to a primary key and run a command like "SELECT GEN_ID(<GeneratorName>, 1) FROM Tablename" and it will return you the value of the last call to the generator added by one.

The best part is that I didn't have to worry whether someone would insert a record between my inserts to the parent table and the child table, because I already had the ID. There was no need to do a select to obtain the last ID at all. And even if someone called the generator before my insert to the parent table, their result would not be the same number I got.

All in all, it was a pretty nice database for our not-monstrously-huge applications. Too bad it can't compete in the same league as Oracle or MySQL.

2009-03-16 Reply Admin

Archimidas:
SomeCoder:
Sorry to be pedantic but "eachother" is NOT a word. It's a big pet peeve of mine since way back in school, I had a teacher attempt to teach us that "eachother" was correct. It's two words people, "EACH" and "OTHER".
/pedantic

I guess the only possible explanation for this code being written is that the developer had exactly zero knowledge of SQL. I... guess... I'm grasping at straws here because this is pretty terrible :)

This is how language evolves. If 'eachother' isn't aword itwill besoon.

as will alot of other words....

2009-03-16 Reply Admin

TRWTF:

WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits(snip)

Smash King · 2009-03-16 Reply Admin

WayneCollins:
themagni:
WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

You could use the primary key, add each one sequentially, then if you don't want customers to know what their number is, hash it up and give them that number. Save the hash in the db and you can search by that, so you could write something like "select * from customers where hashID = value"

I do PLC programming and I know that much.

Yeah, that's great, until some other app, or multiple instances of the same app, want to insert things into the same table.

No, that's a very good way to implement it. You still have a sequential primary key, you just display its hash instead. And you save both to the database... I like it.

Your concerns about multiple instances or other apps have nothing to do with the hash. And this "random ID" is far more likely to rise a problem than concurrent access would be.

WayneCollins · 2009-03-16 Reply Admin

In my experience, there are a lot of PHP (and other web-app platform) developers out there with, as you say, zero knowledge of SQL.
As we all know, even a little SQL know-how goes a long way towards reducing code bloat and increasing robustness and yet code like we see above seems strangely common.

It's like some developers don't consider SQL to be "real code" or "worth learning" for some reason.

Probably they are just lazy, and used to working in situations where no-one ever sees their code, just the output

Yeah, I'd say that's the mindset I was stuck in, and I'm working my way out of. SQL is so non-imperative and so unlike my preferred ways of programming that I tried to avoid it as much as I could, and tended to search the database docs only until I learned enough to get me something that my "real" language could process into something usable. Which is how I managed to miss several ways to insert a key from a sequence and get that key back in the same request.

nocturnal · 2009-03-16 Reply Admin

SomeCoder:
Sorry to be pedantic but "eachother" is NOT a word.

word /wɜrd/ Show Spelled Pronunciation [wurd] Show IPA –noun

a unit of language, consisting of one or more spoken sounds or their written representation, that functions as a principal carrier of meaning. Words are composed of one or more morphemes and are either the smallest units susceptible of independent use or consist of two or three such units combined under certain linking conditions, as with the loss of primary accent that distinguishes blackʹbirdʹ from blackʹ birdʹ. Words are usually separated by spaces in writing, and are distinguished phonologically, as by accent, in many languages.

So you are saying that 'eachother' conveys no meaning? Being a word has no correlation to being in the dictionary.

WayneCollins · 2009-03-16 Reply Admin

I think some of us unconsciously embark on a quest for the One True Programming Paradigm. Since most languages are Turing-complete, it's easy to convince ourselves that we've found it, and to look at stumbling blocks of other paradigms as signs that it falls short of the One True Programming Paradigm and overlook the stumbling blocks inherent in our own preferred approaches.

The sad truth is that you need a whole diverse collection of tools to solve a diverse set of problems well. You can hack something out in C# for which a much better solution exists in SQL, and you can hack something out in SQL that would be better solved in the application. But it's better to just learn more tools and learn when to use them and abandon the quest for the One True Programming Paradigm.

2009-03-16 Reply Admin

If we keep going, there will be more comments than words in the original post :)

2009-03-16 Reply Admin

Josh:
... it would take exponentially longer ...

Why must people persist in abusing the work "exponentially"? Clearly the function is 1/(1-(num/max)). Do you see the exponential anywhere here?

The exponential funciton only arises naturally in divide-and-conquour algorithms and differential equations (which you can think of as linear feedback systems).

Stuff that "gets bigger" isn't necessarily exponential. Your fat ass gets bigger when you eat more cakes. Is it an exponential ass? No it aint.

2009-03-16 Reply Admin

WayneCollins:
JamesQMurphy:
128 bits? You're essentially using GUIDs. A company that I worked at 10 years ago used GUIDs for their ID column -- SQL Server supports it.

This was Oracle. I might have overlooked its GUID type if it had one. What I ended up doing was randomly generating two longs in C#, then concatting their "ToString()" output. Which, come to think of it, means that there was less than 128 bits of randomness, since multiple pairs of numbers yield the same concatted string.

Oops.

You should seriously think about getting a different career path.

1 - Why didn't you just asked C# to generate a GUID ? 2 - Concatenating randomness don't decrease it (even if there are several ways to concatenate to get the same result). 3 - If you used a pseudo-random generator, the amount of randomness in the concatenation depends on the underlying algorithm. With some old congruent generators, for instance, there is no added randomness.

WayneCollins · 2009-03-16 Reply Admin

Mark V Shaney:
WayneCollins:
JamesQMurphy:
128 bits? You're essentially using GUIDs. A company that I worked at 10 years ago used GUIDs for their ID column -- SQL Server supports it.

This was Oracle. I might have overlooked its GUID type if it had one. What I ended up doing was randomly generating two longs in C#, then concatting their "ToString()" output. Which, come to think of it, means that there was less than 128 bits of randomness, since multiple pairs of numbers yield the same concatted string.

Oops.

You should seriously think about getting a different career path.

I did. I chose "non-WTF developer", which involves not doing boneheaded things like that. So far it's working out pretty well for me.

1 - Why didn't you just asked C# to generate a GUID ?

Because I was too impatient to learn how to get that GUID into the database.

2 - Concatenating randomness don't decrease it (even if there are several ways to concatenate to get the same result).

No, but if there are several ways to concatenate to get the same result, then you won't get twice as much randomness. If you think you are getting twice as much randomness and you're not, bad things could end up happening.

3 - If you used a pseudo-random generator, the amount of randomness in the concatenation depends on the underlying algorithm. With some old congruent generators, for instance, there is no added randomness.

I used the "SecureRandom" class in the .NET library. I figured that ought to give me as much randomness as I could possibly get my hands on.

nguyenn · 2009-03-16 Reply Admin

Because y'know, God didn't create the world with sequential primary keys, otherwise evolution would be alot more obvious.

2009-03-16 Reply Admin

GUIDs are great for keys that the user will never see. It makes exporting data to another database FAR easier, particularly when the data you're exporting comes from several tables and uses a lot of foreign keys.

I had to write some exporter code for a database that used unsigned ints for the keys, and tracking the foreign keys and creating/updating new ones during the import so as not to clash with any existing records in the target database was... not nice.

2009-03-16 Reply Admin

So does Microsoft CMS Server. Never did find out why.

2009-03-16 Reply Admin

JamesQMurphy:
WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?
128 bits? You're essentially using GUIDs. A company that I worked at 10 years ago used GUIDs for their ID column -- SQL Server supports it.

You didn't incidentally work on Microsoft CMS Server did you? Never did find out why they use GUIDs as the primary ID.

2009-03-16 Reply Admin

WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

Are you a troll? Or just an idiot?

steenbergh · 2009-03-16 Reply Admin

DaveShaw:
So what happens when there are 999999 invoices in the database...??
Dave

The sales department goes out to dinner...

2009-03-16 Reply Admin

DaveShaw:
Vollhorst:
He could have run that code in multiple threads to speed it up!

Argghh! That made me Laugh.

So what happens when there are 999999 invoices in the database...??

Dave

You take one down and pass it around?

2009-03-16 Reply Admin

Been awhile, but maybe...
@@IDENTITY returns the last inserted identity value

Just have your row identfied as an identity

Actually, its better to use @@SCOPE_IDENTITY in case the table has triggers firing inserts. @@IDENTITY returns the last ID in the scope of the transaction, whereas SCOPE_IDENTITY gets the ID of the table inthe current scope.

2009-03-16 Reply Admin

1 and 12 11 and 2

2009-03-16 Reply Admin

hatterson:
Brett:
"bumble sort"

So how did it know when it was done? Did it just run forever and give you a higher and higher degree of confidence of sorting?

That was the point of the joke. :)

hatterson · 2009-03-16 Reply Admin

Brett:
hatterson:
Brett:
"bumble sort"

So how did it know when it was done? Did it just run forever and give you a higher and higher degree of confidence of sorting?

That was the point of the joke. :)

I, for one, am firmly behind this new sorting paradigm.

If having a high confidence level rather than absolute knowledge is good enough for primes then it should be good enough for sorting.

design the function to take two arguments. The array needing sorting and the level of confidence to sort it to.

2009-03-16 Reply Admin

WoW.

2009-03-16 Reply Admin

Seeing people code in ignorance of the uniqueidentifier data type makes me fearful of leaving my cloistered MS SQL/C# environment.

2009-03-16 Reply Admin

David:
On the minus side imagine trying to query an invoice with a UUID over the phone"
"Hello, yes I have a problem with invoice number eff four seven ay see one zero be dash eight see see dash four three seven two ..."

"Sorry was that eff four seven ay see one zero be dash eight see see dash four two seven two ?"

"No I said eff four seven ay see one zero be dash eight see see dash four three seven two... oh never mind, I'll just pay it."

This sounds like profit, I'll put AJ on it right now.. thanks.... I'll add you to my facebook money machine.

2009-03-16 Reply Admin

Brett:
This reminds me of a sorting algorithm that a friend invented (in jest). He called it a "bumble sort".
The code randomly looked at an array location and compared it's contents to the location immediately after. If it was lower than the first, it swapped the location's contents and repeated the process on another randomly selected array location.

I <3 Bumble Sort!

2009-03-16 Reply Admin

mort8104:
It's like some developers don't consider SQL to be "real code" or "worth learning" for some reason.
Probably they are just lazy, and used to working in situations where no-one ever sees their code, just the output.

Funny, I actually worked with some Java developers who were trying like crazy to get the office to buy them a component that would abstract data calls so that they didn't have to write any SQL at all. Their view was that SQL directly written by code was effectively the same as 'hard coding', and they wanted to call abstracted data fetches instead.

My point to them that:

a) Calling an abstracted data fetch still required them to put something in code which identified the data they required

b) An extra module was only going to make the code less efficient and therefore slower

c) SQL is a reasonably standard and ubiquitous language whereas their module was going to be propietary

d) FFS, think gloves

seemed to fall on deaf ears. This situation is (I think) very similar to the one Alex represented in a recent soapbox stand about the fun (or lack thereof) in programming jobs. Code does not have to be abstracted to all getup to make it good, and most of the time is better off if it isn't. Developers of most corporate software should know at least some SQL to do their jobs IMHO, and this distinct reluctance and / or outright refusal to learn SQL in any form is something that can and should be tested for during the interview process.

2009-03-17 Reply Admin

acid:
Their view was that SQL directly written by code was effectively the same as 'hard coding', and they wanted to call abstracted data fetches instead.

Arrgh... FFS indeed. Anyone who has done a little thinking about the matter should realize that SQL is actually at a higher level of abstraction than Java or any other object-oriented language (aren't logic and set theory CompSci prerequisites?). This sort of institutionalized ignorance really just makes me want to knock some heads together.

2009-03-17 Reply Admin

Well, a GUID is created similar way - except nobody checks it's unique...

2009-03-17 Reply Admin

Smash King:
Firebird has a structure called generators, which do exactly what you asked. You can create a generator associated to a primary key and run a command like "SELECT GEN_ID(<GeneratorName>, 1) FROM Tablename" and it will return you the value of the last call to the generator added by one.
The best part is that I didn't have to worry whether someone would insert a record between my inserts to the parent table and the child table, because I already had the ID. There was no need to do a select to obtain the last ID at all. And even if someone called the generator before my insert to the parent table, their result would not be the same number I got.

Actually in Firebird 2.1 and later you can simply use a trigger to assign the id and use INSERT ... RETURNING to get the id (same method as you can use in Oracle). No need for querying to sequence/generator yourself.

equex · 2009-03-17 Reply Admin

Burdieburd:
But why is it suddenly slowing down? This would just gradually become slower as the table filled up, and take forever after no slots were left.

Its just that people start noticing it after a certain level of sluggishness has been reached. Before that, you subconciously dismiss it as a random system hickup/lag. Happens here all the time. Tables that grow and grow over time and suddenly you notice that the database query/procedure that processes it goes slower.

2009-03-17 Reply Admin

I wonder that no one of you genies have recognized the missing brace in the PHP code.

DWalker59 · 2009-03-17 Reply Admin

Well, after 5 pages of comments, I don't know if anyone is still reading, but anyway...

I recently saw a Web site devoted to (intentionally) bad sort algorithms. Actually, that's pretty interesting, because if you understand why a sort algorithm is bad, you might have learned something.

Anyway, this post reminded me of one particular sort algorithm: Scramble the dataset randomly, and then check all of the elements to see if it's sorted. If not, try again.

I thought that was really clever, and something I would never have thought of!

DWalker59 · 2009-03-17 Reply Admin

WayneCollins:
I must confess that I have written random ID generators in code to feed to a table.
In my defense, the ID was 128 bits, so it would be a lot longer before you would expect a collision. Also, there was a primary key constraint on the ID, so you couldn't put in a duplicate record... which would bring you back to the ID generating loop, except much later in the growth of the table and with a round trip to the database every time.

By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

Huh? In MSSQL Server, use NewID() or NewSequentialId() to generate a new ID; don't do it yourself. And if you set the column default to NewId() or NewSequentialId(), and you're inserting records, and you want to get the IDs back, use the Output clause of the Insert statement.

Actually, this works even if you don't let the column default to the ID, but instead use "newid()" as the value to insert. The Output clause will give you the IDs and anything else that you want, from the inserted records, and it's in one request, as you wanted. This works whether you're inserting one record, or more.

2009-03-17 Reply Admin

WayneCollins:
By the way, is there a good way to insert a row with a unique key and get back the key in a single request to an Oracle or MSSQL server?

Sorry, not going to read all five pages (or however many it is by now) just to make sure this isn't a repeat, but:

As of Microsoft SQL Server 2005, Transact-SQL allows you to use an OUTPUT Clause on INSERT, UPDATE, and DELETE statements.

OUTPUT Clause (Transact-SQL)

DaveK · 2009-03-17 Reply Admin

pink_fairy:
Not only does the Queen own them (under specific circumstances, as pointed out by DaveK above), but they're vicious. Not only that, but they have a penis. I'm told that ducks do, too, but somehow I don't have any fear of being raped by a duck.

I blame Zeus.

DaveK · 2009-03-17 Reply Admin

pepe:
I wonder that no one of you genies have recognized the missing brace in the PHP code.

We were hoping you'd let us out of the bottle first. Then we'll grant you three wishes.

2009-03-18 Reply Admin

its so simple, yet so stupid

The Quest for the Unique ID

Leave a comment on “The Quest for the Unique ID”