- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
String.Concat("F", "R", "I", "X", "T").ToLower().Replace("X", "S")
Admin
@MatsH: If replace does not work case insensitive, your replace won't work :)
Admin
I would expect to see a Math.Random() in there somewhere to optimize uniqueness.
Admin
True WTF is a GUID is only a 128-bit value. So if you have 20 character column it seems some clever boy could find an encoding scheme that could store the whole value.
Can't wait for the follow up post where they discover their truncated unique ids are no longer quite unique a few years from now.
Admin
Missed the must start with 'N' part at first. Okay so you have 19 characters to store a 16 byte value. Still seems doable; though you might need cleverness to deal with code page limitations in the DB.
Admin
Assuming 19 ASCII characters, then there are 19*7 = 133 bits available. GUIDs are 128 bits so that fits comfortably.
We even have 5 bits available to do other WTF stuff!
Admin
'N' +
date +%s%N
Fits 20 chars exactly.
Ok, bit of a problem if you're db entries are coming in more than one per nanosecond, but hey, that's what sleep() is for! :-)
Admin
Some date() function implementations don't have very high granularity. And they can also be quite expensive (slow) but I'm not sure if guid() can be done faster.
Admin
106 symbols are required to store 128 bits into 19 characters. (106^19 < 2^128 < 107^19). You can go down to 85 if you allow 20 characters.
ASCII can do it, but there are a good number of non-printing characters that are largely obsolete.
Without being clever, you just break it up into bytes and store the bytes as the corresponding ASCII character, but your Database might get upset with you. Presumably that's what the original coder was worried about.
Admin
The real WTF is using anything except id for the single primary key. What value are you adding? Table.Id is not as redundant as Table.TableId. It is not as random as Table.[22345668905233560890]. Can anyone make a valid case for naming a single non-composite primary key with anything other than id?
Admin
SCNR
Admin
Some people prefer to keep the names of the member fields intact in the database (think roll-your-own ORM). So id is taken by the id of the driver's licence, and thus the unique primary key becomes Table.Id
Yes, I know it's a bad example as the id of the driver's licence should be unique as well, but still.
Admin
Sure, ASCII is 7-bit, but not all the characters are printable. No good if you want to display the identifiers somewhere (which I assume they do, otherwise why the length & first character restrictions?).
Admin
If we limit ourselves to printable characters including space that gives us 95 usable characters which according to wolfram alpha lets us store just over 124 bits worth of information in 19 characters.
By my recoking to store the whole GUID in 19 character positions the system needs to be able to safely handle 107 possibilities for each character.
Admin
Stupid question maybe, but why not just convert a GUID string into something like Base36?
Would that not keep the "uniqueness" of the GUID while coming in under the character limit?
Admin
Nevermind, some quick and simple tests online show that even using Base64 results in more than 19 characters.
Admin
WTF???
(I guess I should go reread the man pages once every decade or so.)
Admin
Part 1: the constraint that you must use a varchar(20) column. Why? Is there some religious objection to using a proper sequence ID?
Part 2: the constraint that it must be more uniquerer than a number that just goes up and up and up without meaningful limit.
$me suspects that this is a result of someone having been bitten at some time in the past by flaky operation on an incorrectly managed DB.
Admin
This is a simple and common misunderstanding. When the boss said to use a GUID he probably meant Guaranteed Useless Informix Database.
Admin
Perhaps a good time to recall the old superlative of unique: "most uniquest"!
Admin
The pseudo-random requirement for the id is to make it harder for hackers / crackers to guess the next id. How many times have we seen a web interface where the table id is passed in through the uri?
Admin
Thanks for doing that. I was lazy.
Admin
Admin
But... erm... what is the single supercomparative then? uniquester?
Admin
I think you got your base and exponent mixed up
Admin
Fitting a real GUID/UUID into 20 chars is easy. They fit into VARBINARY(16) easily, too.
create a UUID strip '-' and trim '{}' now you can use the result with HEX()/UNHEX into 16 bytes.
eg. SELECT HEX(id) as id, name FROM my_table WHERE id=UNHEX(?) or INSERT INTO my_table (id, name, rank) VALUES (UNHEX(?),?,?)
Admin
Security through obscurity is what you're describing... and only prevents access to "unauthorized records" (from bad code) by writing more bad code.
Admin
I use TableNameID because foreign key columns will have the same name as the primary key (always join on x.foo = y.foo), and I don't have to alias the column when I select from a join.
Admin
+1
Admin
Admin
Admin
My first job used a db that enforced a 2 field primary key. Key one was yyyymmdd Key two was hhnnss
When records were created, the 'duplicate key' error was captured, a sleep(1) was issued and the insert tried again.
Performance was often an issue but my suggestion for using an integer key was dismissed with a withering look that said "hey new guy - shut up - we're the experts here".
Funnily enough, they're no longer in business.
Admin
Making sure your company will get the next project to fix the software you wrote earlier is a VERY important part of future planning!
Admin
Why oh why can't they add the N in the code that uses the value?
Oh right...
Admin
TRWTF is English.
Why we even have words like more and most.
Just add er and est onto the end. Oh what, it doesn't sound good? Well we're no longer a poetic language, get pragmatically used to it. most ragged -> raggedest Switching letters instead of just adding the suffix, what, it sounds the same, does it just not look nice. Get used to it. jumpier -> jumpyer I before E except after C except... random exceptions. What... NO not at all. Get used to it. weird -> wierd. (IMO, it looks weird either way). The plural of x is x. Not anymore. sheep -> sheeps The plural of oo is ee. The hell no. geese -> gooses
I don't care if it sounds wierd. That will go away with use. Let's get consistent for once.
Admin
In absence of any addition information I would have to assume the whatever needs the leading 'N' probably also is expecting not more than 20 characters, given the requirement is to use a varchar 20 field. Its also quite likely that whatever can't really take 20 characters max but rather 20 bytes max.
Given the constraints are rather bizarre to start with its reasonable there may be other limitations too such as the bytes must represent printable characters in the native code page.
You could do a number of things you could do the most naive thing possible for instance create a view that just concats 'N' on to the front of the id column but this might result in 21 bytes being return to the caller.
My guess would be there is more stupid in here some place. There is virtually no reason anything should depend on a constant fixed first byte inside a single database attribute unless its really multi-value, in that 'N' defines some kind of class, or grouping, etc. Hey they are using a varchar as a unique identifier and stuffing it with bogus broken guids; I think we can assume their schema probably sucks too.
Admin
"Can anyone make a valid case for naming a single non-composite primary key with anything other than id?"
from table1 a join table2 b on a.Table1ID=b.Table1ID
The real-real-wtf is your lack of imagination.
Admin
So, yes, English is a WTF, but other languages are WTF as well. They use silly things like diacriticals and even funny letters.
The next tirade is on our measurement systems, but that is a big WTF that we all know.
Admin
Base64 will always result in more characters than the source string, because it's designed to pack a series of 8-bit bytes into a series of 7-bit-clean bytes. You lose 1/8 of the storage capacity of each byte when you Base64-encode something, by design.
But what you could do, hypothetically, is take a GUID in hex form, strip out the dashes, and treat the result as if it were an already-base-64-encoded string. Run that through a Base64 decode and it would get smaller. But you're still losing out in the long run, because simply expressing a GUID as a string of hex digits in the first place automatically doubles the space it takes (each byte of the actual GUID is represented by two characters, or two bytes). So you would go from 128 bytes of original data to ( (2*7)/8 ) * 128 = 224 bytes of "compressed" data.
TRWTF is a project manager that doesn't understand relational databases or bytes.
( CAPTCHA: 'consequat' -- Tim couldn't do a single developer task correctly; consequatly, they made him the project manager.)
Admin
Exactly. And once the field name when used as a foreign key has to be different from the field name in the original record, you're bound to get people calling it different things. I'm presently working on a system where sometimes transactions.id is referenced as "transactions_id", other times as "transaction_id", "transactionid", "tranid", etc etc. Ditto when you have to alias it.
Admin
It's more uniquer; the double supersuperlative would be most uniquestest.
Admin
Yes, I can: "join using()"
love its!
Admin
Must start with an N and be unique and (at most) 20 chars long.
First, where does that requirement come from? Is the "N" negotiable? Can I change the width of the column? Is this a value that is stored in the DB or is it in a report?
Second, can I store a real guid in the DB and provide this column's value as a computed field? That way I can ensure uniqueness, and play with a formula for computing the field until I get it actually producing a unique value.
Or, can I use a sequence number in this column?
It boggles the mind that the engineer who coded up this solution did not ask these questions first. It also boggles the mind that he did not do some calculations/test runs to see how unique his "unique" value would be.
Admin
This assumes that any ASCII character is fair game, including NUL, ACK, BEL, and so on. Whether characters like TAB and CR would be allowed is debatable... But let's say you have 100 printable characters to play with (it's almost certainly a little less):
100^19 = 1.0e+38 permutations 256^16 = 3.4e+38 permutations
So you're short on space by more than a factor of three.
Admin
Not work with turkey Localization.
Admin
Yes. So that the ID is clearly specified in multi-table statements. I use [TableName]ID for surrogate keys. People who use simply "ID" in every table are noobs.
Admin
Dear Dr. Boggled,
The application has existed for years and years with the varchar(20) key. That key was selected because it matched a key provided by an external data supplier. Yes, it's supposed to be displayed on reports, so it can't be pure binary bytes stored in a varchar.
A new requirement recently arose that amounted to a second, completely distinct source of rows for this same table. The 'N' prefix was chosen to distinguish this class of IDs, and to guaranteed that the purely-numeric values already on file would never overlap the 'N' synthetically generated ones.
The focus of the posting was the wacko usage of the GUID, not the underlying schema. I'm very surprised at the turn this discussion has taken.
Admin
In India, I can have driver's license in two states at same time. Both current.
Admin
If you can guarantee that it's always the same type of GUID (probably v4), then you only need to store 124 bits. Or use a v1, and chop out the MAC part, if you can guarantee the IDs will always be generated by the same comp.
I.e. both are bad ideas. The second is worse that the first.
Admin
Given that the shown SQL casts the guid to an Absolute BigInt, the format appears to be 'N' followed by Numeric characters. The article isn't clear if only numeric are allowed after the 'N', but probably.
A couple more words in the article could easily clarify what is allowed.
The proposed solutions and alternatives have been based on the full ascii set being allowed in the key's value. Which is unlikely.