The Daily WTF: Curious Perversions in Information Technology

boog · 2011-07-14 Reply Admin

oheso:
Bruce W:
My 10^657 great-grandfather Andy was right!"

I'm not sure we need to worry about Andy reproducing

Are you kidding? This guy's proven he can duplicate even when it should be impossible.

2011-07-14 Reply Admin

The REAL entertaining part about the site is the guys who have even more free time and use it to rant about the comments themselves.

2011-07-14 Reply Admin

Prison coder #19232132:
boog:
Lies:
the code switches the first 2 chars and then appends them in the original order to itself...
Which code snippet did you read?

some of us can read code, others can only read comments: it says:

// Swap two chars of dataset ID // to create processed ID

but it does (i added comments so you can read it ;) ):

//get a string out of a mysterious guid
var dsID = dataSetGuid.ToString();
//create a string builder to perform transform
var pdsID = new StringBuilder();
//append second char from dsID
pdsID.Append(dsID[1]);
//append first char from dsID
pdsID.Append(dsID[0]);
//append the first chars from ID (in their original order)
pdsID.Append(dsID.Substring(2));
//create a guid from the 4 char string called pdsID
return new Guid(pdsID.ToString());

dsId.Substring(2); returns the chars of the string from index 2 (inclusive) and on, not the first two chars.

Look it up:

String.Substring Method (Int32) .NET Framework 1.1

Retrieves a substring from this instance. The substring starts at a specified character position.

[Visual Basic] Overloads Public Function Substring( _ ByVal startIndex As Integer _ ) As String
[C#] public string Substring( int startIndex );
[C++] public: String* Substring( int startIndex );
[JScript] public function Substring( startIndex : int ) : String;

2011-07-14 Reply Admin

boog:
Prison coder #19232132:
boog:
Which code snippet did you read?
some of us can read code, others can only read comments
You're cute, but that doesn't answer my question.
Prison coder #19232132:
//append the first chars from ID (in their original order) pdsID.Append(dsID.Substring(2));
What kind of ass-backwards language has a substring form that takes a length with no start index? Even VB does this right.

mm...that's the problem with trolling, you often(=always) come out looking like an ass... perhaps if us prison coders had better training.

anyhow - you're right of course, the code looks like c# and the substring would take all chars from position 2 to end of string

which raises a different question - if dsID is 2 chars - then the original statement of the article stands, but if it's longer then it really doesn't...(it might generate a legit guid)

2011-07-14 Reply Admin

Anon:
The REAL entertaining part about the site is the guys who have even more free time and use it to rant about the comments themselves.

This community service is provided at no extra cost to yourself .... (Just happens to be the middle of the night and I was woken up just to see why the server wasn't available remotely. One taxi ride later ... )

2011-07-14 Reply Admin

boog:
Prison coder #19232132:
boog:
Which code snippet did you read?
some of us can read code, others can only read comments
You're cute, but that doesn't answer my question.
Prison coder #19232132:
//append the first chars from ID (in their original order) pdsID.Append(dsID.Substring(2));
What kind of ass-backwards language has a substring form that takes a length with no start index? Even VB does this right.

Dude, .substring(2) returns the substring that begins at index 2. c++ and I think C# do that.

2011-07-14 Reply Admin

Prison coder #19232132:
boog:
Lies:
the code switches the first 2 chars and then appends them in the original order to itself...
Which code snippet did you read?

some of us can read code, others can only read comments: it says:

// Swap two chars of dataset ID // to create processed ID

but it does (i added comments so you can read it ;) ):

//get a string out of a mysterious guid
var dsID = dataSetGuid.ToString();
//create a string builder to perform transform
var pdsID = new StringBuilder();
//append second char from dsID
pdsID.Append(dsID[1]);
//append first char from dsID
pdsID.Append(dsID[0]);
//append the first chars from ID (in their original order)
pdsID.Append(dsID.Substring(2));
//create a guid from the 4 char string called pdsID
return new Guid(pdsID.ToString());

dsID.Substring(2) does not get the first two characters of the string. It gets all the characters EXCEPT the first two characters of the string.

If they tried to create a GUID consisting of the first 2 characters swapped plus the first two characters in the original order, that would total -- try out my second grade math here -- 4 characters. But a GUID is 32 hex digits (36 characters with the hyphens). So that would not be a valid GUID.

The function does NOT create a 4-character string. It creates a full-length GUID string (32 hex digits plus hyphens) with the first two characters swapped. Like, if the original was 12000000-0000-0000-0000-000000000000, the output would be 21000000-0000-0000-0000-000000000000, not 2112.

Thus, probability that the output string will be identical to the input string: 1/16.

Note that if the previous poster were correct in his understanding of substring(2), then the probability that the output string would be identical to the input string would be zero, as the two would have different lengths.
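
For anyone who would rather run it than argue, here is a minimal Python sketch of the same transformation. It assumes the C# snippet behaves as described above, with Python's `s[2:]` standing in for `Substring(2)`; the name `make_processed_id` is made up for illustration and is not from the article.

```python
def make_processed_id(ds_id: str) -> str:
    """Swap the first two characters and keep the rest unchanged."""
    # ds_id[1] + ds_id[0] mirrors the two Append calls;
    # ds_id[2:] plays the role of dsID.Substring(2).
    return ds_id[1] + ds_id[0] + ds_id[2:]


original = "12000000-0000-0000-0000-000000000000"
print(make_processed_id(original))  # 21000000-0000-0000-0000-000000000000

HEX = "0123456789abcdef"
# Count the leading pairs that the swap leaves unchanged.
unchanged = sum(1 for a in HEX for b in HEX if make_processed_id(a + b) == a + b)
print(unchanged, "of", len(HEX) ** 2)  # 16 of 256
```

The output keeps its full length, and 16 of the 256 possible leading pairs come back unchanged, which is where the 1/16 comes from.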

2011-07-14 Reply Admin

Hey, this function works 15 times out of 16, or over 93% of the time! That's probably better than most of the code that I have to work with ...

2011-07-14 Reply Admin

Lies:
Anon:
For each of the 16 characters, there is a 1/16 * 1/16 chance to have that character repeated.
Add all those probabilities up (1/256 + 1/256 ..., 16 times) and you get 1/16.

For all of you who are confused:

let's assume that there are only 3 letters in the alphabet - A ,B and C, the options are:

AA AB AC BA BB BC CA CB CC

or 3^2 = 9 options. the fact that the second character is identical to the first one is not relevant.

the code switches the first 2 chars and then appends them in the original order to itself, so that:

AA = AAAA AB = BAAB AC = CAAC BA = ABBA BB = BBBB BC = CBBC CA = ACCA CB = BCCB CC = CCCC

you still have the orig 9 options so you get 3^2 alternatives, which gives you a 1/9 chance for each of these.

the calculation remains n^2 where n is the number of letters in the alphabet.

some have pointed out that i misread the code and that it reverses the first 2 chars and then takes the rest of the original string.

so i'll adjust the output assuming that the orig. string is only 2 chars long (the rest of the data is irrelevant from a probability standpoint regarding the 1/16 or 1/256 argument):

AA = AA AB = BA AC = CA BA = AB BB = BB BC = CB CA = AC CB = BC CC = CC

you still get 3^2 = 9 distinct cases or 1/9 chance for each combination.

2011-07-14 Reply Admin

"He didn't have a better idea, but was confident that, given enough time, he could cobble something together that utilized the computer's serial number, CPU footprint and a number of other factors."

I would use numeric time, assuming the SLA allowed for the performance hit of synchronization/mutex lock.

2011-07-14 Reply Admin

hmm...yes, never mind - i misunderstood the usage of pdId and dsId..

2011-07-14 Reply Admin

yes, and if your goal was to create a new combination that is unique (meaning different from the original), then it fails in 3 out of the 9 cases (AA, BB, CC).

As has been said, you create the same combination in 3/9 or as it is more commonly stated, 1/3 of the time.
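
A quick way to check the three-letter case is to enumerate it. This is only a sketch in Python, assuming the "processed" value is just the pair with its two characters reversed:

```python
from itertools import product

ALPHABET = "ABC"
pairs = ["".join(p) for p in product(ALPHABET, repeat=2)]
# A pair collides with its swapped version when reversing it changes nothing.
collisions = [p for p in pairs if p[::-1] == p]
print(len(pairs), collisions)  # 9 ['AA', 'BB', 'CC']
```

Swap in the 16 hex digits for `ALPHABET` and the same count gives 16 collisions out of 256.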

2011-07-14 Reply Admin

Could someone explain to a non-Windows coder why it matters whether or not the first two digits are identical? I get the point that it does, and don't really want to argue about how likely that is, it just seems broken that something purporting to generate a unique identifier is that sensitive to initial input.

I mean, if there's a library routine like this:

int random(unsigned int seed) { srand(seed); return rand(); }

and Andy calls it from his program as random(42), then yes there's some justification in calling Andy an idiot, but perhaps not as much as in calling the person who wrote the library routine an idiot.

2011-07-14 Reply Admin

Reminiscent of this....

http://stackoverflow.com/questions/1705008/simple-proof-that-guid-is-not-unique

2011-07-14 Reply Admin

Jerry:
jrh:
So, to fix this they need to go back to where they are creating the DatasetID, check to see if the first 2 characters match and if they do create a new guid until they don't. Amiright?
No, no no! That will use up too many guids, aren't you paying attention? If the first two characters are the same, just swap the first and the third.

I actually can't think of a stupider way to fix the problem. You win.

That should also be a featured comment.

2011-07-14 Reply Admin

Hortical:
Prison coder #19232132:
boog:
Lies:
the code switches the first 2 chars and then appends them in the original order to itself...
Which code snippet did you read?

some of us can read code, others can only read comments: it says:

// Swap two chars of dataset ID // to create processed ID

but it does (i added comments so you can read it ;) ):

//get a string out of a mysterious guid
var dsID = dataSetGuid.ToString();
//create a string builder to perform transform
var pdsID = new StringBuilder();
//append second char from dsID
pdsID.Append(dsID[1]);
//append first char from dsID
pdsID.Append(dsID[0]);
//append the first chars from ID (in their original order)
pdsID.Append(dsID.Substring(2));
//create a guid from the 4 char string called pdsID
return new Guid(pdsID.ToString());

dsId.Substring(2); returns the chars of the string from index 2 (inclusive) and on, not the first two chars.

And some can read code, comments and documentation.

I wish I worked with more people like Hortical.

2011-07-14 Reply Admin

It takes the GUID from the API, or else it gets the hose again. It does this, whenever it is told.

Coyne · 2011-07-14 Reply Admin

Mathias:
"Oh," Andy said, embarrassed. "I see. But what are the chances of that?"
About one in sixteen, statistically speaking.

Exactly.

Amazing, the number of people who can't figure stuff like this out.

At our location, one of the programmers was going to generate a pin for each employee: His proposal called for 4 digits, with no two employees having the same pin.

When I pointed out that we had 40,000 employees, and a 4-digit pin has 10,000 unique pins, his response was, "So, what?"

(Duhhhhh...)
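
The arithmetic, spelled out as a small Python sketch (the variable names are mine, and decimal digits are assumed):

```python
employees = 40_000
digits = 4
distinct_pins = 10 ** digits
print(distinct_pins >= employees)  # False

# Smallest decimal PIN length that gives everyone a distinct PIN.
needed = 1
while 10 ** needed < employees:
    needed += 1
print(needed)  # 5
```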

2011-07-14 Reply Admin

Please somebody, I'm so confused. What's the final word on this? Does SubString() take a length or a start index? I first read it as start index and wondered what the problem was (other than just using the GUID out of the box instead of manipulating it). If "length" (starting at index 0), then the story works, but that's not how I read it at first.

Coyne · 2011-07-14 Reply Admin

Ptorq:
Could someone explain to a non-Windows coder why it matters whether or not the first two digits are identical? I get the point that it does, and don't really want to argue about how likely that is, it just seems broken that something purporting to generate a unique identifier is that sensitive to initial input.
I mean, if there's a library routine like this:

int random(unsigned int seed) { srand(seed); return rand(); }

and Andy calls it from his program as random(42), then yes there's some justification in calling Andy an idiot, but perhaps not as much as in calling the person who wrote the library routine an idiot.

Windows or non-windows doesn't matter. If your GUID is this:

BB63DC25-D37B-4107-9D63-74825C2C7443

Then swapping the first two hex digits doesn't accomplish much. In any language.

ContraCorners · 2011-07-14 Reply Admin

Jay:
Prison coder #19232132:
boog:
Lies:
the code switches the first 2 chars and then appends them in the original order to itself...
Which code snippet did you read?

some of us can read code, others can only read comments: it says:

// Swap two chars of dataset ID // to create processed ID

but it does (i added comments so you can read it ;) ):

//get a string out of a mysterious guid
var dsID = dataSetGuid.ToString();
//create a string builder to perform transform
var pdsID = new StringBuilder();
//append second char from dsID
pdsID.Append(dsID[1]);
//append first char from dsID
pdsID.Append(dsID[0]);
//append the first chars from ID (in their original order)
pdsID.Append(dsID.Substring(2));
//create a guid from the 4 char string called pdsID
return new Guid(pdsID.ToString());

dsID.Substring(2) does not get the first two characters of the string. It gets all the characters EXCEPT the first two characters of the string.

If they tried to create a GUID consisting of the first 2 characters swapped plus the first two characters in the original order, that would total -- try out my second grade math here -- 4 characters. But a GUID is 32 hex digits (36 characters with the hyphens). So that would not be a valid GUID.
<snipped for length>
Note that if the previous poster were correct in his understanding of substring(2), then the probability that the output string would be identical to the input string would be zero, as the two would have different lengths.

Correct! But the probability that your newly-generated, 4 character "processed id" would be identical to some other "processed id" from some other record in your table would be somewhat high (relative to the probability of having duplicated GUIDs) wouldn't it?

2011-07-14 Reply Admin

Lies:
Lies:
Anon:
For each of the 16 characters, there is a 1/16 * 1/16 chance to have that character repeated.
Add all those probabilities up (1/256 + 1/256 ..., 16 times) and you get 1/16.

For all of you who are confused:

let's assume that there are only 3 letters in the alphabet - A ,B and C, the options are:

AA AB AC BA BB BC CA CB CC

or 3^2 = 9 options. the fact that the second character is identical to the first one is not relevant.

the code switches the first 2 chars and then appends them in the original order to itself, so that:

AA = AAAA AB = BAAB AC = CAAC BA = ABBA BB = BBBB BC = CBBC CA = ACCA CB = BCCB CC = CCCC

you still have the orig 9 options so you get 3^2 alternatives, which gives you a 1/9 chance for each of these.

the calculation remains n^2 where n is the number of letters in the alphabet.

some have pointed out that i misread the code and that it reverses the first 2 chars and then takes the rest of the original string.

so i'll adjust the output assuming that the orig. string is only 2 chars long (the rest of the data is irrelevant from a probability standpoint regarding the 1/16 or 1/256 argument):

AA = AA AB = BA AC = CA BA = AB BB = BB BC = CB CA = AC CB = BC CC = CC

you still get 3^2 = 9 distinct cases or 1/9 chance for each combination.

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

HOW MANY OF THOSE HAVE TWO OF THE SAME CHARACTER!!!?!?!?!?!?!!

TTTTTTHHHHHHHHHHRRRRRRRRREEEEEEEEEEE!!!!!!!!!!!!!!!

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

2011-07-14 Reply Admin

Coyne:
Amazing, the number of people who can't figure stuff like this out.

The problem is they're thinking too hard. They start to trip when approaching the question, but instead of trying to pull themselves up to reassess the question, they just dive right into the concrete.

2011-07-14 Reply Admin

Why hasn't anyone pointed out that using a guid in the first place is kind of silly? Why not just use a counter? Using a guid was kind of wtf in the first place if you ask me..

boog · 2011-07-14 Reply Admin

Wow. Reading through some of the comments here, I'm thinking this roughly characterizes what many of the commentators on this site are feeling right now.

2011-07-14 Reply Admin

Hortical:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

HOW MANY OF THOSE HAVE TWO OF THE SAME CHARACTER!!!?!?!?!?!?!!

TTTTTTHHHHHHHHHHRRRRRRRRREEEEEEEEEEE!!!!!!!!!!!!!!!

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

Probability:
Re: A More Unique Identifier 2011-07-14 13:04 • by Probability, Lies... Damn (unregistered) 353518 in reply to 353516
hmm...yes, never mind - i misunderstood the usage of psId and dsId..

and for those who are still not sure: "12345".substring(2) = "345"

2011-07-14 Reply Admin

Coyne:
swapping the first two hex digits doesn't accomplish much. In any language.

Now that you've pointed it out, I see. Somehow I read a call to something to generate a GUID in there, instead of just taking an existing one and lightly frobbing it.

2011-07-14 Reply Admin

Probability:
Hortical:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

HOW MANY OF THOSE HAVE TWO OF THE SAME CHARACTER!!!?!?!?!?!?!!

TTTTTTHHHHHHHHHHRRRRRRRRREEEEEEEEEEE!!!!!!!!!!!!!!!

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

Probability:
Re: A More Unique Identifier 2011-07-14 13:04 • by Probability, Lies... Damn (unregistered) 353518 in reply to 353516
hmm...yes, never mind - i misunderstood the usage of psId and dsId..

and for those who are still not sure: "12345".substring(2) = "345"

ah, fuck it.

You're either obstinately wrong or talking about something that's irrelevant to the discussion.

2011-07-14 Reply Admin

Ace:
Why hasn't anyone pointed out that using a guid in the first place is kind of silly? Why not just use a counter? Using a guid was kind of wtf in the first place if you ask me..

You missed one important sentence:

The tricky part in all this was that the processing application would never know how many IDs were issued or what IDs had been issued: It would somehow have to provide an ID that was always unique.

2011-07-14 Reply Admin

Scott:
Please somebody, I'm so confused. What's the final word on this? Does SubString() take a length or a start index? I first read it as start index and wondered what the problem was (other than just using the GUID out of the box instead of manipulating it). If "length" (starting at index 0), then the story works, but that's not how I read it at first.

Microsoft:
String.Substring Method (Int32, Int32)
Retrieves a substring from this instance. The substring starts at a specified character position and has a specified length.

First arg is start position and the second is the length to read.

Taken from MSDN

Not exactly spam now is it?

2011-07-14 Reply Admin

Oh, forgot to mention length defaults to length of string - start position
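
For the non-.NET crowd, the two overloads can be imitated with Python slicing. This is a rough stand-in only: the helper name `substring` is made up, and unlike the real method it does no bounds checking.

```python
from typing import Optional


def substring(s: str, start: int, length: Optional[int] = None) -> str:
    """Rough stand-in for Substring(start) and Substring(start, length)."""
    if length is None:
        # One-argument form: everything from start to the end of the string.
        length = len(s) - start
    return s[start:start + length]


print(substring("12345", 2))     # 345
print(substring("12345", 2, 2))  # 34
```

Either way the first argument is a start index, so `Substring(2)` drops the first two characters rather than returning them.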

2011-07-14 Reply Admin

Coyne:
At our location, one of the programmers was going to generate a pin for each employee: His proposal called for 4 digits, with no two employees having the same pin.
When I pointed out that we had 40,000 employees, and a 4-digit pin has 10,000 unique pins, his response was, "So, what?"

(Duhhhhh...)

Maybe he was planning on using hex digits?

No, probably not...

2011-07-14 Reply Admin

boog:
Jem:
Are you all seriously arguing about 8th grade math?
What else are 8th grade math nerds to do during summer break?

Really? Did they hold you back a year?

2011-07-14 Reply Admin

:):
article:
After all of 10 minutes, Jeremy discovered the root of the problem:
<VB>

Good catch! Oh, wait, it isn't VB. You fucking retard.

2011-07-14 Reply Admin

butthurt vb programmer:
:):
article:
After all of 10 minutes, Jeremy discovered the root of the problem:
<VB>
Good catch! Oh, wait, it isn't VB, it's...

What is it?

2011-07-14 Reply Admin

Emu:
Retrieves a substring from this instance. The substring starts at a specified character position and has a specified length.
First arg is start position and the second is the length to read.

Thank you, Emu. There's no indication what language or toolkit this code is (some intelligent guesses could be made I suppose), and I'm not a Windows programmer.

So, if this is true, wouldn't this code take a guid like {f4204ca9... and turn it into {4f204ca9... ?

In that case, there's a WTF, but it wouldn't cause collisions with a 1/16 probability.

Right?

2011-07-14 Reply Admin

No Windoze!:
This couldn't happen in Linux.

Once you get to the point where your operating system is virtually infallible, the developer becomes the weakest link.

2011-07-14 Reply Admin

Hortical:
Probability:
Hortical:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

HOW MANY OF THOSE HAVE TWO OF THE SAME CHARACTER!!!?!?!?!?!?!!

TTTTTTHHHHHHHHHHRRRRRRRRREEEEEEEEEEE!!!!!!!!!!!!!!!

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!!!!!!!!!!!!!!!!!!!!!!

Probability:
Re: A More Unique Identifier 2011-07-14 13:04 • by Probability, Lies... Damn (unregistered) 353518 in reply to 353516
hmm...yes, never mind - i misunderstood the usage of psId and dsId..

and for those who are still not sure: "12345".substring(2) = "345"

ah, fuck it.

You're either obstinately wrong or talking about something that's irrelevant to the discussion.

No, he's both right and relevant. And while I think you might be trolling I'll answer anyway, at least for the sake of other readers who did not understand today's WTF.

proposition: The system would generate a GUID for the unprocessed dataset and another for the processed dataset. They wouldn't need to be related but Andy related them anyway. Here's an example:

Generated Guid:

21 EC2020-3AEA-1069-A2DD-08002B30309D

Andy's half-assed attempt at a Guid: take the 21, swap them and copy the rest of the GUID by using the function Hortical called irrelevant.

The result is:

12 EC2020-3AEA-1069-A2DD-08002B30309D

and it would be used in a unique column in the same table. I think I do not need to explain why the system barfed when the 1st and 2nd chars were the same, do I?
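
In case it does need explaining, here is a small Python sketch. It assumes both IDs land in one unique column, modeled here with a plain `set`; the helper names `swap_first_two` and `insert_both` are invented for the example. The second GUID is the "BB..." one posted earlier in the thread.

```python
def swap_first_two(guid_text: str) -> str:
    """Swap the first two characters, leave everything else alone."""
    return guid_text[1] + guid_text[0] + guid_text[2:]


def insert_both(table: set, ds_id: str) -> bool:
    """Insert ds_id and its processed twin; return False on a uniqueness violation."""
    for value in (ds_id, swap_first_two(ds_id)):
        if value in table:
            return False
        table.add(value)
    return True


print(insert_both(set(), "21EC2020-3AEA-1069-A2DD-08002B30309D"))  # True
print(insert_both(set(), "BB63DC25-D37B-4107-9D63-74825C2C7443"))  # False
```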

2011-07-14 Reply Admin

Nexzus:
WC:
No, 1 in 256 is the chance of the 2 characters being a certain character.
The chance that the second character is the same as the first is only 1 in 16.

Look at it this way: The first character is given to you. You don't need to worry about it, it's already there. The second character is the one we're predicting the probability of.

A? - 1 in 16 chance that ? is an A
B? - 1 in 16 chance that ? is a B

etc.

It's fun to ask people the chance of being born on the same day of the week as yourself. Assuming they at least know the bare minimum about odds and multiplication, about 90%* will say 1 in 49.

Yes, I have a strange definition of fun.

*number pulled out of my ass.

Then, your ass obeys Sturgeon's Law as well it should.
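
The day-of-the-week question quoted above works the same way. A short Python sketch, assuming birthdays are spread evenly and independently over the seven weekdays:

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)
pairs = list(product(DAYS, repeat=2))
# Keep only the outcomes where both people share a weekday.
same_day = [p for p in pairs if p[0] == p[1]]
print(Fraction(len(same_day), len(pairs)))  # 1/7
```

There are 49 outcomes, but 7 of them are matches, so the answer is 1 in 7 and not 1 in 49.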

2011-07-14 Reply Admin

:):
butthurt vb programmer:
:):
article:
After all of 10 minutes, Jeremy discovered the root of the problem:
<VB>
Good catch! Oh, wait, it isn't VB, it's...
What is it?

C#

hatterson · 2011-07-14 Reply Admin

Prison coder #19232132:
boog:
Prison coder #19232132:
boog:
Which code snippet did you read?
some of us can read code, others can only read comments
You're cute, but that doesn't answer my question.
Prison coder #19232132:
//append the first chars from ID (in their original order) pdsID.Append(dsID.Substring(2));
What kind of ass-backwards language has a substring form that takes a length with no start index? Even VB does this right.

mm...that's the problem with trolling, you often(=always) come out looking like an ass... perhaps if us prison coders had better training.

anyhow - you're right of course, the code looks like c# and the substring would take all chars from position 2 to end of string

which raises a different question - if dsID is 2 chars - then the original statement of the article stands, but if it's longer then it really doesn't...(it might generate a legit guid)

The issue is that (apparently) the values of dsID and pdsID are both used to (supposedly) uniquely identify datasets.

pdsID is simply dsID with the first two characters switched so in all GUIDs where the first two characters are the same (1/16th of GUIDs) the pdsID = dsID which then causes the issue.

Reynoldsjt · 2011-07-14 Reply Admin

For the love of all that is holy, just spell it out for people. Take the total number of combinations, subtract the ones whose two characters differ, and divide by the total number of combinations: (16^2 - 16*15) / 16^2 = 16/256 = 1/16

2011-07-14 Reply Admin

the beholder:
I think I do not need to explain why the system barfed when the 1st and 2nd chars were the same, do I?

No, you don't, because you don't know what I was referring to possibly because I didn't know what the other guy was referring to.

The discussion was pulled off on some tangent where people didn't know what the chances of this problem were (the guy i was epsonsfvdfvdf vdfvd

just tired of talking about it

2011-07-14 Reply Admin

LANMind:
:):
butthurt vb programmer:
:):
article:
After all of 10 minutes, Jeremy discovered the root of the problem:
<VB>
Good catch! Oh, wait, it isn't VB, it's...
What is it?
C#

yeah, I looked up the var keyword, didn't know that was in c#

2011-07-14 Reply Admin

Reynoldsjt:
For the love of all that is holy, just spell it out for people. Take the total number of combinations, subtract the ones whose two characters differ, and divide by the total number of combinations: (16^2 - 16*15) / 16^2 = 16/256 = 1/16

No. For the thousandth time, this is wrong.

The GUID in the story starts with "66". The chance of getting a 6 is 1/16, and the chance of it being followed by another 6 is another 1/16. So when Andy asked "what are the chances of that happening?", the chance of getting that GUID is 1/16 * 1/16 = 1/256.

2011-07-14 Reply Admin

Yep, Andy must have felt that the two guids needed to be related in some way and coded to generate the second guid based on swapping the 1st two characters of the 1st guid.

No clue as to why he didn't just generate a new one, unless he wanted to 'track' the relationship of the 2 ids. In which case, it's best to relate the guids using a lookup table. But Andy probably didn't want to do that.

Cal

2011-07-14 Reply Admin

Lies:
Lies:
Anon:
For each of the 16 characters, there is a 1/16 * 1/16 chance to have that character repeated.
Add all those probabilities up (1/256 + 1/256 ..., 16 times) and you get 1/16.

For all of you who are confused:

let's assume that there are only 3 letters in the alphabet - A ,B and C, the options are:

AA AB AC BA BB BC CA CB CC

or 3^2 = 9 options. the fact that the second character is identical to the first one is not relevant.

the code switches the first 2 chars and then appends them in the original order to itself, so that:

AA = AAAA AB = BAAB AC = CAAC BA = ABBA BB = BBBB BC = CBBC CA = ACCA CB = BCCB CC = CCCC

you still have the orig 9 options so you get 3^2 alternatives, which gives you a 1/9 chance for each of these.

the calculation remains n^2 where n is the number of letters in the alphabet.

some have pointed out that i misread the code and that it reverses the first 2 chars and then takes the rest of the original string.

so i'll adjust the output assuming that the orig. string is only 2 chars long (the rest of the data is irrelevant from a probability standpoint regarding the 1/16 or 1/256 argument):

AA = AA AB = BA AC = CA BA = AB BB = BB BC = CB CA = AC CB = BC CC = CC

you still get 3^2 = 9 distinct cases or 1/9 chance for each combination.

Well, there's your problem. That's the whole point of the article - that when the first two characters are the same, we get a collision.

The fact that the second character is identical to the first one is supremely relevant.

2011-07-14 Reply Admin

boog:
oheso:
Bruce W:
My 10^657 great-grandfather Andy was right!"
I'm not sure we need to worry about Andy reproducing
Are you kiddy-fiddling? This guy's proven he can duplicate even when it should be impossible.

It's nice talking to you!

2011-07-14 Reply Admin

From the sound of the story, the Processed GUID isn't allowed to collide with the original GUID. The Processed GUID is "generated" by swapping the first two characters. So, if the first two characters of the original GUID are identical, the Processed GUID will be the same as the original GUID.

So, given that the GUID will collide in any case where the first two characters are the same, we need to look for any situation where the first two characters are the same. You state that the probability of the first two characters both being 6 is 1/256, which is correct, but we don't care what the first one is, only that the second letter is the same as the first. Since the probability of any given character being in both the first and second place is 1/256, and there are 16 possible characters, multiplying gives us...

1/16 chance of collision.
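
The same sum with exact fractions, as a Python sketch. It assumes the first two hex digits are uniform and independent, which is the assumption everyone in the thread has been making:

```python
from fractions import Fraction

HEX_DIGITS = 16
# Probability that the first two characters are both one specific digit, e.g. "66".
p_specific_pair = Fraction(1, HEX_DIGITS) ** 2
# Any of the 16 digits can be the repeated one, and those cases are mutually exclusive.
p_any_repeat = HEX_DIGITS * p_specific_pair
print(p_specific_pair, p_any_repeat)  # 1/256 1/16
```

1/256 answers "what are the chances of exactly 66", while 1/16 answers "what are the chances of any doubled pair", and only the second one matters for the collision.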

2011-07-14 Reply Admin

Lies:
Reynoldsjt:
For the love of all that is holy, just spell it out for people. Take the total number of combinations, subtract the ones whose two characters differ, and divide by the total number of combinations: (16^2 - 16*15) / 16^2 = 16/256 = 1/16

No. For the thousandth time, this is wrong.

The GUID in the story starts with "66". The chance of getting a 6 is 1/16, and the chance of it being followed by another 6 is another 1/16. So when Andy asked "what are the chances of that happening?", the chance of getting that GUID is 1/16 * 1/16 = 1/256.

Except that isn't the question at hand. The question is what are the chances that the first two digits of the GUID will be identical. If every digit has a 1/16 chance of being any one particular character, then the chance of the first two digits being identical is 1/16: whatever the first digit turns out to be, the second digit has a 1/16 chance of matching it.

A More Unique Identifier
