Admin
dick
Admin
113 is not the same as 113 to you? wtf?
Admin
Some of them even failed on both the programming and the stats in a single comment! I think boog was tormenting them earlier, in accordance with the law.
Admin
[quote user="the beholder"] Generated Guid:
Andy's half-assed attempt at a Guid: take the 21, swap them and copy the rest of the GUID by using the function Hortical called irrelevant. The result is:
and it would be used in a unique column in the same table. I think I do not need to explain why the system barfed when the 1st and 2nd chars were the same, do I?
[/quote]
Dude, if you're going to start quoting GUIDs at people, please stick to the same version of GUID as the article. Things like that are important 'round here. FTFY.
Admin
Actually, I have used 2 systems where we have gotten a single reused GUID.
From Wikipedia's UUID entry:
"In other words, only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%. The probability of one duplicate would be about 50% if every person on earth owns 600 million UUIDs.
However, these probabilities only hold when the UUIDs are generated using sufficient entropy. Otherwise the probability of duplicates may be significantly higher, since the statistical dispersion may be lower."
Since we already know how bad the entropy is on most Microsoft encryption schemes, is it any wonder that UUIDs get more duplicates than they should?
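For anyone who wants to sanity-check the quoted Wikipedia figure, here is a rough sketch using the standard birthday approximation. It assumes 122 random bits per UUID (the version 4 layout) and perfectly uniform generation; both are assumptions for illustration, not claims about any particular implementation.

```python
import math

def collision_probability(n: float, random_bits: int = 122) -> float:
    """Birthday approximation: p ~ 1 - exp(-n^2 / (2 * N)), with N = 2**random_bits."""
    space = 2.0 ** random_bits
    # expm1 keeps precision when the exponent is tiny (small n).
    return -math.expm1(-(n * n) / (2.0 * space))

# One billion UUIDs per second for 100 years (365.25-day years).
n_generated = 1e9 * 60 * 60 * 24 * 365.25 * 100
p = collision_probability(n_generated)
print(f"{n_generated:.3e} UUIDs -> collision probability ~ {p:.2f}")
```

Under these assumptions the result lands a little above one half, so the quoted "about 50%" is best read as a round figure. None of it applies once the generator has less entropy than assumed, which is the second half of the quote.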
Admin
Because we're taking a value representing the unique identifier in a database table, and using it to derive a unique identifier for another row in the same table. So if the inputs and outputs are identical, we get two rows with the same unique id. And that's bad.
Admin
Four pages, and no one has pointed out that Andy appears to have been hired by Sony to secure their game consoles and servers.
Captcha: ludus
It's NEVER ludus.
Admin
Nope. The BEST way is still to add, for instance, U to the beginning of the unprocessed ID and change that to P for the processed ID, as someone already suggested. It's the simplest solution and doesn't really have any drawbacks.
Admin
I hope this is a troll.
Admin
And, let me tell you, the impact is something to see. :)
Admin
Except he was using decimal digits: 10^4 = 10,000.
Admin
When you process the processed ID? Why are we processing IDs anyway, I thought we were processing something that HAD an ID?
Admin
The point is that the second set is supposed to be a new batch of GUIDs; the original ones are not discarded.
In that case the second batch has multiple matches with the first one (you left out 111 in the second batch though :-o ).
Admin
The same chance as all 100 of them get in their right envelopes.
If one letter is not in its right envelope it takes the space of the letter that should be there and that one therefore must be in a wrong envelope too.
In other words: There is a minimum of two letters that can be in wrong envelopes.
... or did I now explain something which is common knowledge ?
Admin
Wow Andy's family is long lived. Methuselah had nothing on these guys.
Admin
Unless you've heard of something called a "database" which might hold a "table" of "columns"; note the plural. For the more advanced, why not consider the possibility of a "value" in one of these "columns" that indicates the "status". Or, as I prefer to call it, "Frinkahedratus"...
Admin
ABAP, for example. But to be fair it's a short form of the proper "with start index" substring construct.
Admin
For reasons significantly connected with youthful ignorance, I once wrote some VBScript code to generate GUIDs for rows in an existing dataset using VBScript's random number function. About 6,000 records as I recall, so well within GUID's capacity to avoid duplicates in theory. So, why write a deduplication script? ;-)
Oops. There were huge numbers of duplicates. Some GUIDs got repeated, exactly, as many as eight times as I recall. Ended up having to write some code to not just check for duplicates but then keep generating new GUIDs on a row-by-row basis until it finally hit on one which hadn't already been used. That took quite a few iterations to get anywhere...
One always knows in theory that a language's 'random numbers' are actually nothing of the sort, but it was still a shock to see it demonstrated quite so conclusively!
Admin
Sure you might consider it but there's no reason to do it that way. Of course, it's more enterprisey.
Admin
As for identity vs. GUID, we've got an app where a significant majority of the time, most of the content processed by the app is handled "offline", at least as far as the data storage is concerned. The original developer realized this, so he assigned a GUID value to every record. However, he also believed that every table's primary key must be an integer primary key. (Ignoring the fact that, now about 6 months in, one of the tables has half a billion records, which means we're going to run out of signed-integer space within the next 18 months, more or less. Assuming the rate of growth doesn't increase, which it is extremely likely to do.)
So every object has both a GUID and an int identifier. And yeah, sometimes he refers to stuff by its GUID and sometimes by its ID, with no apparent rhyme or reason.
Admin
Actually, there was a bug in SQL Server for a while that would do the same thing. It was because their first attempt was purely time-based, so if multiple requests for a new GUID occurred within the same millisecond (?) they'd get the same seed and therefore the same GUID.
Actually, it might have been the exact same bug.
Admin
no, just something which is completely obvious
Admin
I am getting stupider just reading these comments. Guys do you really have nothing better to wave your cocks at?
Admin
It really disturbs me that after 174 comments nobody mentioned that the ID column did not have a unique constraint.
Admin
You're fun.
I pray to god you're not just adding each name I post to a blacklist.
But no matter how sophisticated your script is, wouldn't this have been easier if you just never mentioned it in the first place? Just write the script and keep quiet about it. I would have never changed names or attempted to obscure my terminology.
Instead, each time you think of how to block something, you have to get a big head and make a point of it. Which just causes a bigger problem.
What a t0ol.
Admin
This is quite a paradox. The idea that a brit could be funny I might find laughable, but a brit suggested, so a brit made me laugh...
Admin
how many years is Universe Lifetime these days?
Admin
The probability that two people have been born on the same weekday is 25%.
Admin
What you should have done was use something along the lines of a Mersenne twister.
Admin
You're not too smart, are you?
Admin
What part of "duplicate" don't you understand?
Admin
You guys are all morons. There are only two choices - either a duplicate ID is generated, or it isn't. 1 out of 2 is a 50% chance.
Admin
better
Admin
Left out in all the BS about statistics is that Andy needs a severe beatdown in the worst way.
Admin
I am truly impressed by your epic trolls.
Admin
So here's the upgraded version: The Law of The Internet: Troll or be trolled.
Admin
It was a troll?
Admin
If you don't like trolling, but can't take anyone else's comments seriously, why bother posting at all? It seems to me people like you ruin things in a joyless, robotic fashion. One day you'll grow up and realize all this means nothing. Have a nice day, loser.
Admin
There was some contest at a fast-food place I used to frequent where they gave you a ticket with each purchase that might result in you getting some prize, ranging from a free cola to a new car.
On the back of the ticket they gave the probability of winning any given prize on it in X number of visits. Unfortunately, the person who wrote these numbers was, umm, statistically challenged.
Of course a very large percentage of the tickets were the fries or a cola. So they said that the probability of getting a ticket for a cola in one visit was, I forget the exact number, say 15%. Then they said that the probability of winning the cola in 2 visits was 30%, 3 visits: 45%, 5 visits: 75%, 10 visits: "more than 100%".
That is, apparently they thought that if the probability of something happening on one try is X, then the probability of it happening in two tries is 2X, in three tries is 3X, etc. Leading them to the curious phenomenon that if the chance of winning on one try is 15%, then on ten tries the chance of winning must be 150%. Hmmm.
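For comparison, a small sketch of the two calculations. It assumes each visit is an independent draw with the same 15% chance (the figure recalled above, not an exact one); the function names are made up for illustration.

```python
def chance_of_at_least_one_win(p_single: float, visits: int) -> float:
    """Probability of winning at least once in `visits` independent tries."""
    return 1.0 - (1.0 - p_single) ** visits

def ticket_arithmetic(p_single: float, visits: int) -> float:
    """The ticket's method: just multiply. It is the expected number of wins,
    not a probability, so it can exceed 1."""
    return p_single * visits

for visits in (1, 2, 3, 5, 10):
    correct = chance_of_at_least_one_win(0.15, visits)
    printed = ticket_arithmetic(0.15, visits)
    print(f"{visits:2d} visits: correct {correct:.1%}, ticket says {printed:.0%}")
```

With these numbers the real chance after ten visits is roughly 80%, and it never reaches 100% no matter how many visits you make.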
Admin
On the actual serious side -- I hope that isn't against the rules.
I don't know the algorithm to generate GUIDs. As I understand it there are some bytes that have values that identify how the GUID was created.
But suppose we tried to just create a GUID as a string of 32 random hex digits. So we write something like:
Right? No, really really wrong. Random has, what, a 64-bit seed? Or maybe it's only 32 bits? So there are only 2^64 or 2^32 possible strings that we could generate, not 2^128.
Do GUID generation functions use their own random-number generator with a 128-bit seed? How is this seed generated? As the seed value would exactly determine the resulting string, once you've generated the seed, you don't need to generate the string. Just use the seed. So ... how do you create the seed in a way that assures uniqueness?
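The seed-size point can be shown with a toy sketch. It uses Python's `random.Random` as a stand-in for whatever generator the original snippet had in mind, and an artificially tiny 8-bit seed space so the whole space can be enumerated; both choices are assumptions for illustration only.

```python
import random

HEX_DIGITS = "0123456789abcdef"

def naive_guid(seed: int) -> str:
    """Build 32 'random' hex digits from a seeded PRNG.
    The output is fully determined by the seed."""
    rng = random.Random(seed)
    return "".join(rng.choice(HEX_DIGITS) for _ in range(32))

# Toy assumption: pretend the generator only accepts an 8-bit seed.
SEED_BITS = 8
distinct = {naive_guid(seed) for seed in range(2 ** SEED_BITS)}

print(len(distinct), "distinct strings from", 2 ** SEED_BITS, "possible seeds")
print("string space if digits were truly independent:", 16 ** 32)
```

However long the output string is, the number of distinct strings can never exceed the number of distinct seeds. That is why real generators need an entropy source at least as wide as the identifier they produce.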
Admin
So the moral of this story is, for some TDWTF readers, basic logic also escapes them?
Let's create two hypothetical sets, the first being 'N' which contains a series of 'n' characters total, the second being 'S' which is always a string of 'x' length, where x >= 2, each character within the elements of S is an element of N.
Let's create a subset of S called U, which represents the working set of unique identifiers defined within the system in play; for an element of U, let's call it E, if all you're doing is swapping out the first two characters and repeating the rest of E, after those two characters, the probability of a duplicate string occurring in U, is roughly 'n'.
The number of strings represented by the set S is n^x, the number of sub-strings represented by the first two characters (NN) of every element of S is n^2 (n * n).
When you're talking duplicates or probability, you consider the number of elements within the first option, which is 'n' number of elements, let's call this first character C, the second one, which is a duplicate must be the same as the first character (NC), so there's only 1 single character that can be used here (n*1).
So if everything, after the first two characters are swapped, is included, you end up with a drastically smaller pool than you should have. The initial Guid you started with is likely to be unique, the one you created is likely to collide with that same guid one in 16 times, or one in 'n' times, depending on the unique identifier implementation.
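A quick sketch of the swap scheme as it is described in this thread (swap the first two characters, copy the rest). The function name and the tail of the identifier are made up for illustration; only the first two hex digits are varied.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"

def swap_first_two(identifier: str) -> str:
    """Derive the 'new' ID by swapping the first two characters."""
    return identifier[1] + identifier[0] + identifier[2:]

# Hypothetical remainder of a GUID; its value does not matter here.
TAIL = "3e4567-e89b-12d3-a456-426614174000"

unchanged = sum(
    1
    for a, b in product(HEX_DIGITS, repeat=2)
    if swap_first_two(a + b + TAIL) == a + b + TAIL
)
total = len(HEX_DIGITS) ** 2

print(f"{unchanged} of {total} prefixes map to themselves "
      f"(1 in {total // unchanged})")
```

If the first two digits are uniformly distributed, one input in sixteen comes back unchanged and collides with its own source row.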
You also have to remember, the odds of a unique identifier, in this case a GUID, being used a second time only pertains to the functional system which applies the unique identifier.
So while the statistical odds, or poor implementation, of the GUID system might lead to potentially more repeats than theoretically possible, that only pertains to that closed system, the CLSID entries within HKEY_CLASSES_ROOT, in your registry are irrelevant because they do not overlap with the functional system in play.
If you encounter a duplicate when you generate a key, generate a new damn key. Easy fix.
The real question is why they didn't just call Guid.NewGuid(), or implement a system specific variation that would check the subset in play for a duplicate and repeat as necessary until a new Guid is found.
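The "generate, check, repeat" idea might look something like the sketch below. It uses Python's `uuid.uuid4()` as a stand-in for `Guid.NewGuid()`, keeps the existing IDs in an in-memory set rather than a database table, and the function name and `max_attempts` parameter are invented for the example.

```python
import uuid
from typing import Callable, Set

def new_unique_id(
    existing: Set[str],
    generate: Callable[[], str] = lambda: str(uuid.uuid4()),
    max_attempts: int = 1000,
) -> str:
    """Return an ID not already in `existing`, retrying on collisions."""
    for _ in range(max_attempts):
        candidate = generate()
        if candidate not in existing:
            existing.add(candidate)  # record it so later calls see it
            return candidate
    raise RuntimeError("could not find an unused ID; generator may be broken")

used_ids: Set[str] = set()
first = new_unique_id(used_ids)
second = new_unique_id(used_ids)
print(first, second, first != second)
```

In a real system the check would more likely be a unique constraint on the column, with the insert retried on violation, so that two writers cannot both pass the check at the same time.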
Admin
Everyone already figured that out, professor. But thanks for the overly complicated (and subtly incorrect, ironically) explanation.
Admin
If it's subtly incorrect, explain how.
I'll admit to being wrong, if you can tell me where the flaw lies.