There are many ways to mess up doing unique identifiers. It's a hard problem, and that's why we've sorta agreed on a few distinct ways to do it. First, we can just autonumber. Easy, but it doesn't always scale that well, especially in distributed systems. Second, we can use something like UUIDs: mix a few bits of real data in with a big pile of random data, and you can create a unique ID. Finally, there are some hashing-related options, where the data itself generates its ID.
Tiffanie was digging into some weird crashes in a database application, and discovered that their MODULES
table couldn't decide which was correct, and opted for two: MODULE_ID
, an autonumbered field, and MODULE_UUID
, which one would assume, held a UUID. There were also the requsite MODULE_NAME
and similar fields. A quick scan of the table looked like:
MODULE_ID | MODULE_NAME | MODULE_UUID | MODULE_DESC |
---|---|---|---|
0 | Defects | 8461aa9b-ba38-4201-a717-cee257b73af0 | Defects |
1 | Test Plan | 06fd18eb-8214-4431-aa66-e11ae2a6c9b3 | Test Plan |
Now, using both UUIDs and autonumbers is a bit suspicious, but there might be a good reason for that (the UUIDs might be used for tracking versions of installed modules, while the ID is the local database-reference for that, so the ID shouldn't change ever, but the UUID might). Still, given that MODULE_NAME
and MODULE_DESC
both contain exactly the same information in every case, I suspect that this table was designed by the Department of Redunancy Department.
Still, that's hardly the worst sin you could commit. What would be really bad would be using the wrong datatype for a column. This is a SQL Server database, and so we can safely expect that the MODULE_ID
is numeric, the MODULE_NAME
and MODULE_DESC
must be text, and clearly the MODULE_UUID
field should be the UNIQUEIDENTIFIER
type, right?
Well, let's look at one more row from this table:
MODULE_ID | MODULE_NAME | MODULE_UUID | MODULE_DESC |
---|---|---|---|
11 | Releases | Releases does not have a UUID | Releases |
Oh, well. I think I have a hunch what was causing the problems. Sure enough, the program was expecting the UUID field to contain UUIDs, and was failing when a field contained something that couldn't be converted into a UUID.