• n_a (unregistered) in reply to pecus
    pecus:
    I read that as: If you expect a million datasets, you'd better allow more than 2.6E16 values for your randomly generated GUIDs, or you'll likely have at least one collision in your DB.

    Next, remember to upgrade your GUIDs when you pass the million-record mark.

    Finally, we all know that one-in-a-million chances have a 50:50 probability of occurring in everyday life.

    If you need millions of GUIDs existing at the same time (in your [R]DB no less), there is a high probability that GUIDs aren't what you need.

    That is to say, I don't support "let's generate lotsa GUIDs and cross our fingers they don't collide" either. Relying on pseudo-randomness being uniquely random on this scale is a wrong thing to do ("You can never be sure", quoting Dilbert).

  • (cs)

    Dear Sybase,

    Your shitty, slow, broken, obsolete RDBMS doesn't belong in any organisation anywhere, and nor do the barely functional 90s-era "utilities" you ship with it.

    Please hurry up and go out of business so that future generations of developers don't have to be subjected to the execrable, extortionately-priced "code" you call a database system.

    If you won't do that, at least stop shipping products with installers written in Java. Writing shit in Java doesn't make it less shit, it just makes it slow shit, you dumb fucks.

    Kind regards, The_Assimilator (former, unwilling, Sybase developer and sometimes DBA)

  • (cs) in reply to no name
    no name:
    dkf:
    Ah, but someone else randomly generating GUIDs might hit the one you're using, and then you're in trouble!

    Or not. I don't plan to lose sleep over it.

    If your source of entropy is good enough (and my calculations are correct), the chances of two GUIDs colliding are 1 in 340 billion billion billion billion. Or 1 / (the Irish national debt) ;-)

    I would feel MUCH more comfortable if the chances were 1 / United States national debt.
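
    For a sense of scale, here is a minimal sketch of the birthday-bound approximation behind figures like the one quoted above. It assumes purely random Version 4 GUIDs with 122 random bits and a good entropy source; `collision_probability` is a hypothetical helper name, not anything from the article.

```python
import math

RANDOM_BITS = 122  # assumption: Version 4 GUID, 128 bits minus version and variant bits


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate chance of at least one collision among n uniformly random IDs."""
    space = 2 ** bits
    # Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * space)).
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))


for n in (10**6, 10**9, 10**12):
    print(f"{n:>17,d} GUIDs -> p ~= {collision_probability(n):.3e}")
```

    The probability grows roughly with the square of the number of GUIDs, which is why the per-pair odds alone understate the risk for very large sets.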

  • Jim Reaper (unregistered) in reply to Jasper
    Jasper:
    OH MY GOD! This guy has our SOURCE CODE! Call the police! He's a hacker who broke into our systems and stole our source code!

    An intern at my old job accidentally emailed our source code to a customer. He then went on to make company history, by emailing and asking for it back.

  • TK (unregistered) in reply to Soviut
    Soviut:
    Steve Parker:
    Oh, wait - we already did that for you, decades ago, and open sourced most of it, too, so that you could (if you want to) join in the (r)evolution.
    Good to see you've drunk the Kool-Aid, now direct yourself towards the TM Repository and peer at yourself in the mirror.
    Got to love the hypocrisy of MS zealots. Thanks!
  • nah (unregistered) in reply to Anonymouse
    Anonymouse:
    You know what bothers me? This:
    Alex:
                        Directory.CreateDirectory(realBasePath);
                        CreateDll(); 
    

    ... I mean, think about it: they go through all the trouble of creating a "somewhat unique-ish" directory, have the path for it in a local variable, and then they call a method without parameters, i.e. one that can't know about the directory which was just created for its purpose. ..

    You left out the part where the path + dll name is put in an object variable. Presumably one that CreateDll will use.

    Alex:
        string realBasePath = basePath + directorySubIndex.ToString();
        this.s_dllPath = realBasePath + @"\dbdata.dll";

        // hopefully, this won't exist yet!
        if (!Directory.Exists(realBasePath))
        {
            Directory.CreateDirectory(realBasePath);
            CreateDll();
            /*** ADD A CALL TO CreateTheOtherDLL() HERE ***/
        }
    
  • itsmo (unregistered) in reply to Anonymous
    Anonymous:
    TheCPUWizard:
    IMHO, it is unfortunate that "privacy concerns" forced the abandonment of the original algorithm(s) and degraded GUIDs into simply very big random numbers. [SoKoUUID - Sort of, Kind of, Usually Unique Identifiers]
    I sort of agree but at the same time I kind of think that encoding the computer's MAC address into the GUID is a major privacy concern. It is also very easy to subvert since you can manually set the MAC address of a virtual machine to anything you want and start creating GUIDs, which might be good for framing some poor slob. If this sounds unlikely I would direct you towards the author of the Melissa virus, who served hard time because the GUIDs in his virus code identified his machine's MAC address (he was also a bit of a moron but that's beside the point).

    FTFY

  • Anon (unregistered) in reply to Olius
    Olius:
    Two things Alex...
    1. Why not use a proper, free database - MySQL, Postgres, Ingres, etc.? You can bind to these from any language, even using ODBC if there is no other option. I'll assume it was someone else's choice and you're maintaining the software, I guess.

    2. Why stick by a language, set of libraries, set of software which forces you to go out of your way to quash errors which would result in UI dialog boxes? I can't imagine any kind of environment which would accidentally create GUI components when you're writing a server process. But then again, I don't touch Windows, let alone .NET.

    But I'm not taking the mic here, I'm genuinely interested in your answers.

    Why not read the article and try and understand what BuildMaster is? This is a tool Alex sells to customers, he has to add support for as many different environments as possible because that maximizes the number of potential customers. Telling your customers that you won't support their favorite database because of your ideological crusade for FOSS is a good way to end up out-of-business. Why don't you make a competing product that only runs on Linux and only using FOSS databases and see how many copies you can sell?

  • Stephen Cleary (unregistered) in reply to Anonymous

    I couldn't pass up this post without making a few corrections...

    Anonymous:
    The original principle behind it was that it would be truly unique - generated based on the computer's MAC address and a current timestamp, such that it should be impossible to create a collision without subverting the algorithm or the system on which it was run.

    Correct. RFC 4122 Version 1 GUIDs use a node identifier (usually a MAC address) combined with a timestamp (and a clock sequence, which helps handle clock resets/corrections). However, there is also a "Version" field.

    Anonymous:
    This is no longer the case as the revised specification drops the MAC address part for security reasons. So now GUIDs are generated from the timestamp without the entropy provided by a unique MAC address.

    No. Version 1 GUIDs are still exactly the same. I believe you're thinking of Version 4 GUIDs, which are random. Version 4 GUIDs do not have a timestamp (or clock sequence). They are 122 bits of randomness, with 6 bits used to identify them as a Version 4 RFC 4122 GUID.

    Note that Version 1 GUIDs cannot collide with Version 4 GUIDs, because the value of the Version field is different.
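
    The Version field described above can be inspected directly. This is a small sketch using Python's standard `uuid` module rather than the Windows/.NET calls the article deals with; `version_digit` is a hypothetical helper added only for illustration.

```python
import uuid

v1 = uuid.uuid1()  # Version 1: timestamp + clock sequence + node identifier
v4 = uuid.uuid4()  # Version 4: random bits plus fixed version/variant bits

print(v1, "-> version", v1.version)
print(v4, "-> version", v4.version)


def version_digit(u: uuid.UUID) -> str:
    """Return the version as it appears in the text form.

    The version is stored in the high nibble of the 7th byte, which is the
    first hex digit of the third group in the canonical string.
    """
    return str(u).split("-")[2][0]


print("version digits:", version_digit(v1), version_digit(v4))
```

    Because that digit differs between the two versions, a Version 1 value can never compare equal to a Version 4 value, whatever the remaining bits are.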

    Anonymous:
    Now consider sequential GUIDs. The last source of entropy, the timestamp, is removed from the equation and GUIDs are now simply an incrementing number. There is no randomisation other than the starting point. The fundamental principle that makes a globally unique identifier unique has been removed.

    I think that there's some confusion about "sequential GUIDs". A "sequential GUID" is just a Version 1 GUID - nothing more, nothing less. It is wrong to take any GUID and increment it. Period. Always has been and always will be.

    For more information: http://nitoprograms.blogspot.com/2010/11/few-words-on-guids.html

    P.S. Does anyone find the spam-detection comment guardian really annoying?

  • (cs) in reply to Jimmy Jones
    Jimmy Jones:
    frits:
    // GUID generated 2009-08-24; could not find anyone else using it "{16AA8FB8-4A98-4757-B7A5-0FF22C0A6E33}");

    Hilarious! I see this all the time. Is it really so costly to generate GUIDs in your program?

    GUIDs contain your MAC address so if you want to avoid collisions with other GUIDs it's best to generate them on your machine, not the user's.

    Citation needed ^^.

    P.S. I already know this is wrong in most cases.

    P.P.S. Give me one good reason why I should publish my MAC address.

  • Olivier (unregistered) in reply to Brian Hudell

    Well, if I was a hardcore risk taker maybe I would find it exciting to have invisible longjmps in my code. As it stands I'm a wimp, and I'd rather have error codes since you can't forget to document those in a func decl. (More or less true, might not explain why a func call failed, but at least you get to know it can fail.)

    Oh yes, and you don't get exhilaratingly mysterious crashes when one DLL is built with exceptions and another (3rd party) explicitly told the compiler never to check for them.

    Captcha: ratis (improper construction of the Esperanto verb 'ratifis', meaning 'ratify (past tense)')

  • Olivier (unregistered) in reply to frits

    I quote: "In the OSF-specified algorithm for generating new (V1) GUIDs, the user's network card MAC address is used as a base for the last group of GUID digits... Most of the other digits are based on the time while generating the GUID."

    -- wikipedia.org/wiki/Globally_unique_identifier#Algorithm

    Now, it's true that they changed that later on, but it doesn't make Jimmy wrong.
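
    The layout in that quote is easy to see with a Version 1 generator. A minimal sketch with Python's standard `uuid` module follows; `EXAMPLE_NODE` is a made-up 48-bit value standing in for a real MAC address, and this says nothing about what the Windows API does.

```python
import uuid

EXAMPLE_NODE = 0x001122334455  # hypothetical node ID standing in for a MAC address

u = uuid.uuid1(node=EXAMPLE_NODE)
last_group = str(u).split("-")[-1]

print(u)
print("last group:", last_group)
print("node field:", f"{u.node:012x}")
# The remaining fields are mostly a 60-bit timestamp counted in 100 ns ticks.
print("timestamp ticks:", u.time)
```

    Anyone who reads such a GUID can read the node value straight out of the last group, which is the privacy concern raised elsewhere in this thread.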

  • (cs) in reply to Olivier
    Olivier:
    I quote: "In the OSF-specified algorithm for generating new (V1) GUIDs, the user's network card MAC address is used as a base for the last group of GUID digits... Most of the other digits are based on the time while generating the GUID."

    -- wikipedia.org/wiki/Globally_unique_identifier#Algorithm

    Now, it's true that they changed that later on, but it doesn't make Jimmy wrong.

    This article is talking about generating GUIDs using Windows system calls (wrapped by .NET). On Windows 2000 and later, the MAC address is not used, for security reasons.

  • jger (unregistered) in reply to jger
    Note from Alex: I actually used the word I meant to use this time. "Device" as in "story-telling" device of an open letter.

    I see, thanks for clarification. =)

    J-

  • Anon (unregistered) in reply to no name
    no name:

    If your source of entropy is good enough (and my calculations are correct), the chances of two GUIDs colliding are 1 in 340 billion billion billion billion. Or 1 / (the Irish national debt) ;-)

    Try using two cloned virtual machine images and the chances go to 1 :)

  • Dog (unregistered)

    "If a MessageBox shows and no user is there to click it, does it still appear?" - I should create an article about 'Schrödinger's MessageBox' on Wikipedia...

  • (cs) in reply to John Evans
    John Evans:
    Lampshade, Alex. Hanging a Lampshade.

    Actually, your link confirms Alex' version:

    This practice is also known as "hanging a clock on it", "hanging a lantern on it", or "spotlighting it".

    Should you now be hanging a clock on your own WTF?

  • hachu (unregistered) in reply to Steve Parker
    Steve Parker:
    Oh, wait - we already did that for you, decades ago, and open sourced most of it, too, so that you could (if you want to) join in the [r]evolution.

    Oh yeah, this BSD thing is awesome!

  • Olius (unregistered) in reply to Anon

    Slightly pointless reply that'll never get read as we're a day after the article was posted...

    I've used Sybase Anywhere: it is a single-user, workstation-only database which is useful to embed in any product you wish. I would imagine BuildMaster ships with it to use as its back end.

    It's not something I've come across as a standalone product one would buy. It is something which applications use as a database when they (naturally) don't want to use Jet or any other similar product, don't want to write their own DBMS, want to use SQL for convenience, etc.

    BuildMaster could ship with anything; they have that choice. They could use SQLite.

    As far as I've read about BuildMaster on the site, it is a server-class product to support many developers and large projects. Sure, start off with an embedded SQL environment and move on to a "proper" DBMS when you need to scale.

    All the databases I mentioned run on Windows and can be interfaced from any of your favourite languages.

    And you'll still be able to sell your product. You'll even find it cheaper and easier to sell your product because you won't be fiddling about with either explaining to customers that your software costs X + the cost of the licenses for all this other software, or bundling the cost in to your price and dealing with these companies directly to do licensing deals.

    This isn't Windows vs Linux, it's just common sense with an open mind.

  • (cs) in reply to Anonymous
    Anonymous:
    (note to DBAs: if you generate GUIDs sequentially they're not GUIDs anymore).

    Yes, they are still GUIDs even when generated sequentially. They are globally unique.

  • (cs) in reply to Anon
    Anon:
    no name:

    If your source of entropy is good enough (and my calculations are correct), the chances of two GUIDs colliding are 1 in 340 billion billion billion billion. Or 1 / (the Irish national debt) ;-)

    Try using two cloned virtual machine images and the chances go to 1 :)

    That is incorrect. The part that GUARANTEES uniqueness across different computers (the MAC address of the network card) might be the same in two virtual machines (or in the virtual machine and the real machine, although I'm not sure about this since the network card in a VM is a virtually-created card that is different than the real card), but the rest of the GUID will almost certainly be different from any other GUID created on the same or a different virtual machine. And the hardware is different between a VM and its host (although two VMs may have essentially the same hardware, or they may not (since memory size might be different)).

    The probability will not be anywhere near 1 that you will create the same GUID in two different virtual machines, or in a virtual and a real machine.

    Try it and see; I'll bet you can't generate a GUID collision.
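
    For anyone who does want to try it, here is a rough sketch in Python's standard `uuid` module. It is only an approximation of the scenario: `SHARED_NODE` is a made-up node ID pretending to be a MAC address shared by two cloned VMs, and everything runs in one process, where the module itself keeps consecutive Version 1 timestamps distinct.

```python
import uuid

SHARED_NODE = 0x02AABBCCDDEE  # hypothetical node ID, as if two clones shared one MAC

# Version 1: identical node, but the timestamp (and clock sequence) still vary.
a = uuid.uuid1(node=SHARED_NODE)
b = uuid.uuid1(node=SHARED_NODE)
print(a)
print(b)
print("same node:", a.node == b.node)
print("same timestamp:", a.time == b.time)

# Version 4: no node at all, so look for duplicates in a random batch.
batch = [uuid.uuid4() for _ in range(100_000)]
duplicates = len(batch) - len(set(batch))
print("duplicates in batch:", duplicates)
```

    A real test of cloned images would need two machines and two clocks, so treat this as an illustration of which fields differ, not as proof.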

  • (cs) in reply to John Muller
    John Muller:
    As for incrementing from a known GUID, ...

    Maybe, MAYBE, you could get away with each thread picking a starting GUID, and incrementing from that. But,

    You don't know what sequential GUIDs are, or how they are generated, do you?

  • (cs) in reply to Anonymous
    Anonymous:
    wtf:
    ThingGuy McGuyThing:
    I wish you would go into this. I did a quick Google search and couldn't find any reason why sequential GUIDs are any more likely to collide than non-sequential. Perhaps you could enlighten me?

    I'm curious about this as well - not that I doubt the statement, I just don't see where it comes from. (an actual, sincere, non-snarky question? on this site? Oh, come on!)

    Since there seems to be some genuine interest I feel I should explain my comments in regard to sequential GUIDs. First let me make it clear that using sequential GUIDs isn't an issue if they only need to be internally unique, so no disrespect to DBAs implementing internal strategies. Of course, the point of a GUID is that it is globally unique, as in only one in the whole world now or ever.

    This is the very definition of a GUID - it's a globally unique identifier. The original principle behind it was that it would be truly unique - generated based on the computer's MAC address and a current timestamp, such that it should be impossible to create a collision without subverting the algorithm or the system on which it was run. This is no longer the case as the revised specification drops the MAC address part for security reasons. So now GUIDs are generated from the timestamp without the entropy provided by a unique MAC address. This already means that GUIDs are not truly globally unique, since it is possible for the same GUID to be generated on different machines. So, we have lost a valuable source of entropy but it is still very unlikely that a collision will occur.

    Now consider sequential GUIDs. The last source of entropy, the timestamp, is removed from the equation and GUIDs are now simply an incrementing number. There is no randomisation other than the starting point. The fundamental principle that makes a globally unique identifier unique has been removed.

    Tharg said it absolutely right above - if you are using GUIDs as an internally unique identifier and you manage that properly then there is no risk in using an incrementing GUID strategy. But the fact is that as soon as you generate GUIDs sequentially they cease to be globally unique. And it's amazing how many developers just cannot fathom that idea.

    An honourable mention to the Birthday "paradox" since that does apply to this scenario as well but that's somewhat beside the point when considering sequential GUIDs. A sequential GUID is no different to an arbitrary 32 character hex string - it lacks the "GU" part of "GUID".

    Anyway, I said I wasn't going to get into this...

    Timestamp and MAC address are not the only things that go into making GUIDs. Sheesh.

    Sequential GUIDs are much, much different than an arbitrary 32-character hex string.

  • (cs) in reply to Swedish tard
    Swedish tard:
    WTF is up with all the people here that think that MAC addresses are in any way unique? Even the same line of cards from the same manufacturer can have collisions in MAC addresses. Depending on the uniqueness of MAC addresses for generating unique GUIDs... I mean, come on!

    MAC addresses are supposed to be unique, but of course some card manufacturers have made mistakes (or don't care).

  • Z (unregistered)

    It's 00.49 here, but this already Made My Day!

  • Jay (unregistered) in reply to DWalker59
    DWalker59:
    Anonymous:
    (note to DBAs: if you generate GUIDs sequentially they're not GUIDs anymore).

    Yes, they are still GUIDs even when generated sequentially. They are globally unique.

    No they're not, dumbass. They might be globally unique but you have lost any guarantee of that by subverting the algorithm.

  • Design Pattern (unregistered) in reply to topspin
    topspin:
    All this busy talk about GUIDs and only one guy so far has complained about loading a DLL from a resource?? What's up with that?

    Seriously, having a DLL stored in a resource and writing that to a file at runtime is NOT "actually a pretty clever technique", it certainly is THE REAL WTF in all upper case.

    QFT!

    But if I print it out, scan it with a scanner and then upload the localised messages using OCR software, it starts to have a charm!

  • Design Pattern (unregistered)

    Bonus task for Alex: Modify the code so that it also runs on an embedded system without a filesystem!

    That would actually mean: Remove the whole WTF-ness of writing a DLL which is embedded as a resource to the local filesystem for the whole purpose of loading localised resources!

  • Anon (unregistered) in reply to Anon

    H,

    Thanks for confirming that GUIDs are unique. I've been hunting around all over to find one no one else is using.

  • UseTheLibrary (unregistered)

    Is there a reason Path.Combine() doesn't do the trick for this kid and instead has to roll his own version? The real WTF is this guy's 'fix', along with the rest of these comments worrying about a GUID Collision lol.

  • L. (unregistered) in reply to no name

    Why do you people keep focusing on sources of entropy when they are totally unnecessary for GUIDs?

    I mean, the goal of a GUID is to be unique, all right? Another important factor is that its generation must be quick, and that collisions ought to be totally impossible (thus allowing you to remove a UNIQUE constraint in order to speed up the whole process - don't try this at home).

    When there are so many ways (most of them extremely simple in terms of logic) to guarantee that a GUID remains unique (at least on one machine / server pool / bunch of stuff managed by the same guy), you still believe it makes sense to waste processing time on creating enough randomness to achieve uniqueness???

    I know many people mix up hashes and randoms to make GUIDs, but that is simply retarded. If you want your ID to be unique, use a counter, a ridiculously low timestamp, a thread ID, a CPU ID and a "managed entity (like a server)" ID.

    It will be unique by design (unless your OS allows threads with the same ID or CPUs with the same ID, or has quantum-state variable counters) and much, much faster.

    It will not be "secret" or "unhackable" and is thus not meant for session IDs and the like, but GUIDs and session IDs are two totally different requirements anyway, so why mix them up?

    Now I could be wrong, but so far every benchmark and every test I've done with actual GUID generators based on the aforementioned design has been conclusive: they are quite a bit faster than the usual hash(hash(random*random+random-random)) crap.

    And why ... well that's just as dumb, every variable that is used in this guid is there even before your code does anything - except the timestamp at the start of the loop (assuming you use a loop, otherwise you'll need a timestamp for every guid, bit more expensive but hey .. you coded it that way) and the counter is there if you have a loop / don't need it if you don't have a loop ...
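
    A rough sketch of the kind of entropy-free generator described above might look like this. Everything here is an assumption made for illustration: `SERVER_ID`, `next_id`, the field widths and the 128-bit layout are hypothetical, and since Python has no portable CPU ID, the process ID stands in for it.

```python
import itertools
import os
import threading
import time

SERVER_ID = 42  # hypothetical ID handed out by whoever manages the machines

_counter = itertools.count()
_lock = threading.Lock()


def next_id(server_id: int = SERVER_ID) -> int:
    """Pack server ID, process ID, thread ID, timestamp and counter into 128 bits."""
    with _lock:
        count = next(_counter)  # process-wide counter, never reused until it wraps
    timestamp_ms = time.time_ns() // 1_000_000  # coarse millisecond timestamp
    return (
        (server_id & 0xFFFF) << 112                 # 16 bits: managed entity
        | (os.getpid() & 0xFFFF) << 96              # 16 bits: process (stand-in for CPU ID)
        | (threading.get_ident() & 0xFFFF) << 80    # 16 bits: thread
        | (timestamp_ms & 0xFFFFFFFFFFFF) << 32     # 48 bits: timestamp
        | (count & 0xFFFFFFFF)                      # 32 bits: counter
    )


ids = [next_id() for _ in range(10_000)]
print(f"{ids[0]:032x}")
print(f"{ids[-1]:032x}")
print("all unique:", len(set(ids)) == len(ids))
```

    The uniqueness here rests entirely on the server IDs being assigned without duplicates and on the masked process and thread values not clashing, which is the "managed by the same guy" condition from the post; unlike random GUIDs, nothing protects you if that bookkeeping goes wrong.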

  • Anonymouser Coward (unregistered) in reply to hoodaticus
    hoodaticus:
    Tharg:
    You wouldn't be firing this DBA for using sequential GUID's, as I just use guaranteed-not-to-repeat integers for my primary keys.
    What a brillant idea! Just think of the flexibility, such as when you have to merge that table into another one with the same identity scheme :).

    yeah, table1.id and table2.id, totally indistinguishable - no possibility of merging those two tables if they use autonumbers. CAN'T BE DONE! NO WAY OUT! MUST USE GUUIDs...

    not

  • Anonymouser Coward (unregistered) in reply to Swedish tard
    Swedish tard:
    WTF is up with all the people here that think that MAC addresses are in any way unique? Even the same line of cards from the same manufacturer can have collisions in MAC addresses. Depending on the uniqueness of MAC addresses for generating unique GUIDs... I mean, come on!

    Yeah, I've had a batch of Netgear cards that all had the same MAC address, fluke at the factory, I had to reprogram them.

    MAC addresses only need to be unique on your segment. The way they are composed should make them unique but there is no guarantee in reality, and since you can reprogram them on most devices, they should NOT be used to guarantee any type of uniqueness.

  • Jeff Albion (unregistered)

    P.P.P.S. I should also note that I'm not a fan of open letters. I think they're pompous, mean-spirited, and completely unproductive

    This is why Sybase offers the free "Create an Enhancement Case" tool in Case-Express, so that customers don't feel that they have to write anonymous letters on web forums to make product suggestions. (See: http://case-express.sybase.com/cx/ ). We certainly appreciate the feedback and really want to hear what features our customers are looking for in the product - opening an official request with technical support is the best way to make sure the product suggestion makes its way back to the developers for consideration.

    I might also add that we also have official Sybase (monitored) newsgroups where you can make similar product suggestions / discuss product issues directly with technical support, developers, and other interested SQL Anywhere customers:

    http://www.sybase.com/detail_list?id=11507&multi=true&SR=Y&show=1255

    Additionally, we have the SQLA "Stack-Exchange" like site for Q&A for developers to ask questions and get suggestions for your particular issue/usage: http://sqlanywhere-forum.sybase.com/


    I have opened a support enhancement case #11728912 / and associated engineering case CR #703288 to be potentially considered in a future version of the product. The request relates to how the language DLLs are packaged (managed versus unmanaged resources) and if the language resource is missing, how we report this back to the client (maybe throw an Exception instead of a MessageBox).

    Thank you for the feedback / product suggestion - in the future though, please try to direct your product request into an official Sybase channel rather than unofficial web forums.

    Thanks again,

    -- Jeff Albion, Sybase iAnywhere, an SAP Company

  • Jeff Albion (unregistered)

    >> if the language resource is missing, how we report this back to the client (maybe throw an Exception instead of a MessageBox).

    This issue has already been resolved as CR #651423 / CR #651363.

Leave a comment on “Dear Sybase: MessageBoxes Don’t Belong In Drivers”
