• MatsH (unregistered)

    String.Concat("F", "R", "I", "X", "T").ToLower().Replace("X", "S")

  • mm (unregistered)

    @MatsH: If replace does not work case insensitive, your replace won't work :)

  • Gyxi (unregistered)

    I would expect to see a Math.Random() in there somewhere to optimize uniqueness.

  • Geoff (unregistered)

    True WTF is a GUID is only a 128-bit value. So if you have 20 character column it seems some clever boy could find an encoding scheme that could store the whole value.

    Can't wait for the follow up post where they discover their truncated unique ids are no longer quite unique a few years from now.

  • Geoff (unregistered) in reply to Geoff
    Geoff:
    True WTF is a GUID is only a 128-bit value. So if you have 20 character column it seems some clever boy could find an encoding scheme that could store the whole value.

    Can't wait for the follow up post where they discover their truncated unique ids are no longer quite unique a few years from now.

    Missed the must start with 'N' part at first. Okay so you have 19 characters to store a 16 byte value. Still seems doable; though you might need cleverness to deal with code page limitations in the DB.

  • Colin 't Hart (unregistered) in reply to Geoff

    Assuming 19 ASCII characters, then there are 19*7 = 133 bits available. GUIDs are 128 bits so that fits comfortably.

    We even have 5 bits available to do other WTF stuff!

  • faoileag (unregistered)

    'N' + date +%s%N

    Fits 20 chars exactly.

    Ok, bit of a problem if you're db entries are coming in more than one per nanosecond, but hey, that's what sleep() is for! :-)

  • (cs) in reply to faoileag

    Some date() function implementations don't have very high granularity. And they can also be quite expensive (slow) but I'm not sure if guid() can be done faster.

  • Xeno (unregistered)

    106 symbols are required to store 128 bits into 19 characters. (106^19 < 2^128 < 107^19). You can go down to 85 if you allow 20 characters.

    ASCII can do it, but there are a good number of non-printing characters that are largely obsolete.

    Without being clever, you just break it up into bytes and store the bytes as the corresponding ASCII character, but your Database might get upset with you. Presumably that's what the original coder was worried about.

  • Smug Unix User (unregistered)

    The real WTF is using anything except id for the single primary key. What value are you adding? Table.Id is not as redundant as Table.TableId. It is not as random as Table.[22345668905233560890]. Can anyone make a valid case for naming a single non-composite primary key with anything other than id?

  • faoileag (unregistered) in reply to Colin 't Hart
    Colin 't Hart:
    Assuming 19 ASCII characters, then there are 19*7 = 133 bits available. GUIDs are 128 bits so that fits comfortably.

    We even have 5 bits available to do other WTF stuff!

    4 - you forgot the file_not_found flag ;-)

    SCNR

  • faoileag (unregistered) in reply to Smug Unix User
    Smug Unix User:
    Can anyone make a valid case for naming a single non-composite primary key with anything other than id?
    Think of a class DriversLicence with the following members: id, name, penaltyPointsAccrued.

    Some people prefer to keep the names of the member fields intact in the database (think roll-your-own ORM). So id is taken by the id of the driver's licence, and thus the unique primary key becomes Table.Id

    Yes, I know it's a bad example as the id of the driver's licence should be unique as well, but still.

  • mangobrain (unregistered) in reply to Colin 't Hart

    Sure, ASCII is 7-bit, but not all the characters are printable. No good if you want to display the identifiers somewhere (which I assume they do, otherwise why the length & first character restrictions?).

  • Peter Michael Green (unregistered) in reply to Colin 't Hart
    Colin 't Hart:
    Assuming 19 ASCII characters, then there are 19*7 = 133 bits available. GUIDs are 128 bits so that fits comfortably.
    That works if we assume all ascii characters are usable whether or not they are printable.

    If we limit ourselves to printable characters including space that gives us 95 usable characters which according to wolfram alpha lets us store just over 124 bits worth of information in 19 characters.

    By my recoking to store the whole GUID in 19 character positions the system needs to be able to safely handle 107 possibilities for each character.

  • Owen (unregistered)

    Stupid question maybe, but why not just convert a GUID string into something like Base36?

    Would that not keep the "uniqueness" of the GUID while coming in under the character limit?

  • Owen (unregistered) in reply to Owen
    Owen:
    Stupid question maybe, but why not just convert a GUID string into something like Base36?

    Would that not keep the "uniqueness" of the GUID while coming in under the character limit?

    Nevermind, some quick and simple tests online show that even using Base64 results in more than 19 characters.

  • Will (unregistered) in reply to faoileag
    faoileag:
    ...date +%s%N...
    Learned something useful today!

    WTF???

    (I guess I should go reread the man pages once every decade or so.)

  • (cs)
    You might think that transforming a database sequence ID might be a good way to go
    Seems to me that whatever you are doing is constrained somewhat. The people given the job of providing the technical solution have also been inflicted with half of a technical solution to a problem. It's just a shame that it is a solution to some other problem.

    Part 1: the constraint that you must use a varchar(20) column. Why? Is there some religious objection to using a proper sequence ID?

    Part 2: the constraint that it must be more uniquerer than a number that just goes up and up and up without meaningful limit.

    $me suspects that this is a result of someone having been bitten at some time in the past by flaky operation on an incorrectly managed DB.

  • George Uses Idempotent Dereferencing (unregistered)

    This is a simple and common misunderstanding. When the boss said to use a GUID he probably meant Guaranteed Useless Informix Database.

  • faoileag (unregistered) in reply to Steve The Cynic
    Steve The Cynic:
    the constraint that it must be more uniquerer than
    Ah, you don't see the comparative form of unique very often these days! In an age of uniformity it obviously fell out of usage...

    Perhaps a good time to recall the old superlative of unique: "most uniquest"!

  • JAPH (unregistered)

    The pseudo-random requirement for the id is to make it harder for hackers / crackers to guess the next id. How many times have we seen a web interface where the table id is passed in through the uri?

  • Geoff (unregistered) in reply to Peter Michael Green
    Peter Michael Green:
    Colin 't Hart:
    Assuming 19 ASCII characters, then there are 19*7 = 133 bits available. GUIDs are 128 bits so that fits comfortably.
    That works if we assume all ascii characters are usable whether or not they are printable.

    If we limit ourselves to printable characters including space that gives us 95 usable characters which according to wolfram alpha lets us store just over 124 bits worth of information in 19 characters.

    By my recoking to store the whole GUID in 19 character positions the system needs to be able to safely handle 107 possibilities for each character.

    Thanks for doing that. I was lazy.

  • (cs) in reply to faoileag
    faoileag:
    Steve The Cynic:
    the constraint that it must be more uniquerer than
    Ah, you don't see the comparative form of unique very often these days! In an age of uniformity it obviously fell out of usage...

    Perhaps a good time to recall the old superlative of unique: "most uniquest"!

    No, no, "more uniquerer" is the double supercomparative, a rare form that is frequently dismissed as incorrect by those not in the know. By the same token, "most uniquest" is the (single) supersuperlative.

  • faoileag (unregistered) in reply to Steve The Cynic
    Steve The Cynic:
    faoileag:
    Steve The Cynic:
    the constraint that it must be more uniquerer than
    Ah, you don't see the comparative form of unique very often these days! In an age of uniformity it obviously fell out of usage...

    Perhaps a good time to recall the old superlative of unique: "most uniquest"!

    No, no, "more uniquerer" is the double supercomparative, a rare form that is frequently dismissed as incorrect by those not in the know. By the same token, "most uniquest" is the (single) supersuperlative.
    Ooops, stupid me!

    But... erm... what is the single supercomparative then? uniquester?

  • khgkjhgjgh (unregistered) in reply to Xeno

    I think you got your base and exponent mixed up

  • Uncle Code Monkey (unregistered)

    Fitting a real GUID/UUID into 20 chars is easy. They fit into VARBINARY(16) easily, too.

    create a UUID strip '-' and trim '{}' now you can use the result with HEX()/UNHEX into 16 bytes.

    eg. SELECT HEX(id) as id, name FROM my_table WHERE id=UNHEX(?) or INSERT INTO my_table (id, name, rank) VALUES (UNHEX(?),?,?)

  • SQLer (unregistered) in reply to JAPH
    JAPH:
    The pseudo-random requirement for the id is to make it harder for hackers / crackers to guess the next id. How many times have we seen a web interface where the table id is passed in through the uri?

    Security through obscurity is what you're describing... and only prevents access to "unauthorized records" (from bad code) by writing more bad code.

  • (cs) in reply to Smug Unix User
    Smug Unix User:
    The real WTF is using anything except id for the single primary key. What value are you adding? Table.Id is not as redundant as Table.TableId. It is not as random as Table.[22345668905233560890]. Can anyone make a valid case for naming a single non-composite primary key with anything other than id?
    *responds to ObviousTroll*

    I use TableNameID because foreign key columns will have the same name as the primary key (always join on x.foo = y.foo), and I don't have to alias the column when I select from a join.

  • Uncle Code Monkey (unregistered) in reply to joe.edwards

    +1

  • (cs) in reply to faoileag
    faoileag:
    Steve The Cynic:
    faoileag:
    Steve The Cynic:
    the constraint that it must be more uniquerer than
    Ah, you don't see the comparative form of unique very often these days! In an age of uniformity it obviously fell out of usage...

    Perhaps a good time to recall the old superlative of unique: "most uniquest"!

    No, no, "more uniquerer" is the double supercomparative, a rare form that is frequently dismissed as incorrect by those not in the know. By the same token, "most uniquest" is the (single) supersuperlative.
    Ooops, stupid me!

    But... erm... what is the single supercomparative then? uniquester?

    My supposition is "more uniquer". It's the "erer" that's the indicator of "double" in all this. I've heard tell of a double supersuperlative, but there's some dispute as to whether it is "most uniquestest" or "most uniquerest".

  • (cs) in reply to khgkjhgjgh
    khgkjhgjgh:
    I think you got your base and exponent mixed up
    An important thing to get right. You wouldn't want to find yourself saying "All your exponent are belong to us!" would you?
  • Mr A (unregistered) in reply to faoileag

    My first job used a db that enforced a 2 field primary key. Key one was yyyymmdd Key two was hhnnss

    When records were created, the 'duplicate key' error was captured, a sleep(1) was issued and the insert tried again.

    Performance was often an issue but my suggestion for using an integer key was dismissed with a withering look that said "hey new guy - shut up - we're the experts here".

    Funnily enough, they're no longer in business.

  • J-P (unregistered)

    ...and if you keep it down to 13 characters, like the original developer of this code did, you're doing what developers like to call planning for the future.

    Making sure your company will get the next project to fix the software you wrote earlier is a VERY important part of future planning!

  • Valued Service (unregistered) in reply to Geoff
    Geoff:
    Geoff:
    True WTF is a GUID is only a 128-bit value. So if you have 20 character column it seems some clever boy could find an encoding scheme that could store the whole value.

    Can't wait for the follow up post where they discover their truncated unique ids are no longer quite unique a few years from now.

    Missed the must start with 'N' part at first. Okay so you have 19 characters to store a 16 byte value. Still seems doable; though you might need cleverness to deal with code page limitations in the DB.

    Why oh why can't they add the N in the code that uses the value?

    Oh right...

  • Valued Service (unregistered) in reply to faoileag
    faoileag:
    Steve The Cynic:
    the constraint that it must be more uniquerer than
    Ah, you don't see the comparative form of unique very often these days! In an age of uniformity it obviously fell out of usage...

    Perhaps a good time to recall the old superlative of unique: "most uniquest"!

    TRWTF is English.

    Why we even have words like more and most.

    Just add er and est onto the end. Oh what, it doesn't sound good? Well we're no longer a poetic language, get pragmatically used to it. most ragged -> raggedest Switching letters instead of just adding the suffix, what, it sounds the same, does it just not look nice. Get used to it. jumpier -> jumpyer I before E except after C except... random exceptions. What... NO not at all. Get used to it. weird -> wierd. (IMO, it looks weird either way). The plural of x is x. Not anymore. sheep -> sheeps The plural of oo is ee. The hell no. geese -> gooses

    I don't care if it sounds wierd. That will go away with use. Let's get consistent for once.

  • Geoff (unregistered) in reply to Valued Service
    Valued Service:
    Geoff:
    Geoff:
    True WTF is a GUID is only a 128-bit value. So if you have 20 character column it seems some clever boy could find an encoding scheme that could store the whole value.

    Can't wait for the follow up post where they discover their truncated unique ids are no longer quite unique a few years from now.

    Missed the must start with 'N' part at first. Okay so you have 19 characters to store a 16 byte value. Still seems doable; though you might need cleverness to deal with code page limitations in the DB.

    Why oh why can't they add the N in the code that uses the value?

    Oh right...

    In absence of any addition information I would have to assume the whatever needs the leading 'N' probably also is expecting not more than 20 characters, given the requirement is to use a varchar 20 field. Its also quite likely that whatever can't really take 20 characters max but rather 20 bytes max.

    Given the constraints are rather bizarre to start with its reasonable there may be other limitations too such as the bytes must represent printable characters in the native code page.

    You could do a number of things you could do the most naive thing possible for instance create a view that just concats 'N' on to the front of the id column but this might result in 21 bytes being return to the caller.

    My guess would be there is more stupid in here some place. There is virtually no reason anything should depend on a constant fixed first byte inside a single database attribute unless its really multi-value, in that 'N' defines some kind of class, or grouping, etc. Hey they are using a varchar as a unique identifier and stuffing it with bogus broken guids; I think we can assume their schema probably sucks too.

  • Mark (unregistered) in reply to Smug Unix User

    "Can anyone make a valid case for naming a single non-composite primary key with anything other than id?"

    1. visual symmetry when doing joins to tables that use the id column as a foreign key.

    from table1 a join table2 b on a.Table1ID=b.Table1ID

    1. If the primary key isn't synthetic select SocialSecurityNumber from table3

    The real-real-wtf is your lack of imagination.

  • (cs) in reply to Valued Service
    faoileag:
    TRWTF is English.
    Maybe true, but for those of you following at home, when doing internationalization, English usually has the shortest (character wise) phrases. If you have silly things like dialog boxes (funny plural) and only provide space for English, other languages will overflow.

    So, yes, English is a WTF, but other languages are WTF as well. They use silly things like diacriticals and even funny letters.

    The next tirade is on our measurement systems, but that is a big WTF that we all know.

  • Jazz (unregistered) in reply to Owen
    Owen:
    Owen:
    Stupid question maybe, but why not just convert a GUID string into something like Base36?

    Would that not keep the "uniqueness" of the GUID while coming in under the character limit?

    Nevermind, some quick and simple tests online show that even using Base64 results in more than 19 characters.

    Base64 will always result in more characters than the source string, because it's designed to pack a series of 8-bit bytes into a series of 7-bit-clean bytes. You lose 1/8 of the storage capacity of each byte when you Base64-encode something, by design.

    But what you could do, hypothetically, is take a GUID in hex form, strip out the dashes, and treat the result as if it were an already-base-64-encoded string. Run that through a Base64 decode and it would get smaller. But you're still losing out in the long run, because simply expressing a GUID as a string of hex digits in the first place automatically doubles the space it takes (each byte of the actual GUID is represented by two characters, or two bytes). So you would go from 128 bytes of original data to ( (2*7)/8 ) * 128 = 224 bytes of "compressed" data.

    TRWTF is a project manager that doesn't understand relational databases or bytes.

    ( CAPTCHA: 'consequat' -- Tim couldn't do a single developer task correctly; consequatly, they made him the project manager.)

  • jay (unregistered) in reply to joe.edwards
    joe.edwards:
    Smug Unix User:
    The real WTF is using anything except id for the single primary key. What value are you adding? Table.Id is not as redundant as Table.TableId. It is not as random as Table.[22345668905233560890]. Can anyone make a valid case for naming a single non-composite primary key with anything other than id?
    *responds to ObviousTroll*

    I use TableNameID because foreign key columns will have the same name as the primary key (always join on x.foo = y.foo), and I don't have to alias the column when I select from a join.

    Exactly. And once the field name when used as a foreign key has to be different from the field name in the original record, you're bound to get people calling it different things. I'm presently working on a system where sometimes transactions.id is referenced as "transactions_id", other times as "transaction_id", "transactionid", "tranid", etc etc. Ditto when you have to alias it.

  • ABCD (unregistered) in reply to faoileag
    faoileag:
    Steve The Cynic:
    faoileag:
    Steve The Cynic:
    the constraint that it must be more uniquerer than
    Ah, you don't see the comparative form of unique very often these days! In an age of uniformity it obviously fell out of usage...

    Perhaps a good time to recall the old superlative of unique: "most uniquest"!

    No, no, "more uniquerer" is the double supercomparative, a rare form that is frequently dismissed as incorrect by those not in the know. By the same token, "most uniquest" is the (single) supersuperlative.
    Ooops, stupid me!

    But... erm... what is the single supercomparative then? uniquester?

    It's more uniquer; the double supersuperlative would be most uniquestest.

  • tippfaul (unregistered) in reply to Smug Unix User
    Smug Unix User:
    The real WTF is using anything except id for the single primary key. What value are you adding? Table.Id is not as redundant as Table.TableId. It is not as random as Table.[22345668905233560890]. Can anyone make a valid case for naming a single non-composite primary key with anything other than id?

    Yes, I can: "join using()"

    • saves typing
    • less error-prone
    • identity of joined columns is known to the interpreter

    love its!

  • (cs)

    Must start with an N and be unique and (at most) 20 chars long.

    First, where does that requirement come from? Is the "N" negotiable? Can I change the width of the column? Is this a value that is stored in the DB or is it in a report?

    Second, can I store a real guid in the DB and provide this column's value as a computed field? That way I can ensure uniqueness, and play with a formula for computing the field until I get it actually producing a unique value.

    Or, can I use a sequence number in this column?

    It boggles the mind that the engineer who coded up this solution did not ask these questions first. It also boggles the mind that he did not do some calculations/test runs to see how unique his "unique" value would be.

  • J (unregistered) in reply to Colin 't Hart

    This assumes that any ASCII character is fair game, including NUL, ACK, BEL, and so on. Whether characters like TAB and CR would be allowed is debatable... But let's say you have 100 printable characters to play with (it's almost certainly a little less):

    100^19 = 1.0e+38 permutations 256^16 = 3.4e+38 permutations

    So you're short on space by more than a factor of three.

  • (cs) in reply to MatsH
    MatsH:
    String.Concat("F", "R", "I", "X", "T").ToLower().Replace("X", "S")

    Not work with turkey Localization.

  • sqlblindman (unregistered) in reply to Smug Unix User

    Yes. So that the ID is clearly specified in multi-table statements. I use [TableName]ID for surrogate keys. People who use simply "ID" in every table are noobs.

  • Jay (unregistered) in reply to DrPepper

    Dear Dr. Boggled,

    The application has existed for years and years with the varchar(20) key. That key was selected because it matched a key provided by an external data supplier. Yes, it's supposed to be displayed on reports, so it can't be pure binary bytes stored in a varchar.

    A new requirement recently arose that amounted to a second, completely distinct source of rows for this same table. The 'N' prefix was chosen to distinguish this class of IDs, and to guaranteed that the purely-numeric values already on file would never overlap the 'N' synthetically generated ones.

    The focus of the posting was the wacko usage of the GUID, not the underlying schema. I'm very surprised at the turn this discussion has taken.

  • (cs) in reply to faoileag
    faoileag:
    Smug Unix User:
    Can anyone make a valid case for naming a single non-composite primary key with anything other than id?
    Think of a class DriversLicence with the following members: id, name, penaltyPointsAccrued.

    Some people prefer to keep the names of the member fields intact in the database (think roll-your-own ORM). So id is taken by the id of the driver's licence, and thus the unique primary key becomes Table.Id

    Yes, I know it's a bad example as the id of the driver's licence should be unique as well, but still.

    In India, I can have driver's license in two states at same time. Both current.

  • RandomUser423721 (unregistered)

    If you can guarantee that it's always the same type of GUID (probably v4), then you only need to store 124 bits. Or use a v1, and chop out the MAC part, if you can guarantee the IDs will always be generated by the same comp.

    I.e. both are bad ideas. The second is worse that the first.

  • HowItWorks (unregistered)

    Given that the shown SQL casts the guid to an Absolute BigInt, the format appears to be 'N' followed by Numeric characters. The article isn't clear if only numeric are allowed after the 'N', but probably.

    A couple more words in the article could easily clarify what is allowed.

    The proposed solutions and alternatives have been based on the full ascii set being allowed in the key's value. Which is unlikely.

Leave a comment on “Mini-GUID”

Log In or post as a guest

Replying to comment #405410:

« Return to Article