The Daily WTF: Curious Perversions in Information Technology

chubertdev · 2014-03-31 Reply Admin

oheso:
PedanticCurmudgeon:
GUIDs are a PITA to work with and don't offer any advantages over identity columns that would compensate for that.

They can possibly mitigate a security hole arising from a guessable index. Mind, if you've got such a security hole, it means your code is broken, not that you're using the wrong index.

Or if you just don't want your user knowing he's one of the first 10 people to register for your application ... But then, this use case can be solved by seeding the index with ARBITRARILY_LARGE_INT.

I'd be more worried about exposing UserIDs to the end user.

oheso · 2014-03-31 Reply Admin

chubertdev:
I'd be more worried about exposing UserIDs to the end user.

Quite.

oheso · 2014-03-31 Reply Admin

chubertdev:
Then use GUIDs as a temporary supplemental identifier for a temporary solution. There's no need to have a GUID column just in case you need to merge data. Use an identity column, and if you need to merge the data, use a GUID, merge it, then remove the GUID. Unless you're merging (or something else that may require a GUID), it's not the right type of column.

This requires updating all the foreign keys during the merge.

chubertdev · 2014-03-31 Reply Admin

oheso:
chubertdev:
Then use GUIDs as a temporary supplemental identifier for a temporary solution. There's no need to have a GUID column just in case you need to merge data. Use an identity column, and if you need to merge the data, use a GUID, merge it, then remove the GUID. Unless you're merging (or something else that may require a GUID), it's not the right type of column.

This requires updating all the foreign keys during the merge.

Still less work than pointlessly maintaining GUIDs when you're not merging.

2014-03-31 Reply Admin

Valued Service:
Yep, developers can't make good DBAs, and DBAs can't make good developers.
And they both suck at each other's domain.

I mean, neither thought to use GUIDs, and avoid the calculations / identity scope issue altogether.

GUIDs are lousy in indexes though.

Loose Bree · 2014-03-31 Reply Admin

GUIDo ninjas NOT wanted!

[GU]UIDs might be OK as keys in small tables, but will bloat indexes on bigger tables.

A competent developer who spends 2~4 weeks studying query planning, normalization and indexes will usually make a better DBA than most DBAs.

TheCPUWizard · 2014-03-31 Reply Admin

Arrgh... Once again mixing DBA [ADMINISTRATOR] with a Database DEVELOPER.... two different roles....and both completely distinct from C#/VB/C++/et.al. Developers....

2014-03-31 Reply Admin

Anyone want to ELI5 what the hell this is even trying to do?

2014-03-31 Reply Admin

If this is MS SQL, then all they need to do is return @@IDENTITY.

Assuming the table has an Identity column, which if it doesn't, it should..

http://technet.microsoft.com/en-us/library/ms187342.aspx

oheso · 2014-03-31 Reply Admin

Jeremy:
Anyone want to ELI5 what the hell this is even trying to do?

It's a function that's trying to return the next integer higher than the max of the named column. All the rest is guesswork, but it's likely this is being used to insert a new record. Obviously, the correct approach is to simply replace this function with an auto-incrementing index on the table. Then if you need to retrieve the index of the record just inserted (which is NOT what the code does), then you may employ SCOPE_IDENTITY or OUTPUT, as noted.

chubertdev · 2014-03-31 Reply Admin

SELECT TOP 10 UserLoginName FROM user_table

UserLoginName
==============================================
1
11
111
1111
11111
111111
1111111
11111111
111111111
1111111111

2014-03-31 Reply Admin

What does a DBA actually do? Most the ones I've had the unfortunate experience of meeting were idiotic pencil pushing monkeys.

2014-03-31 Reply Admin

Johnny Leroux:
What does a DBA actually do? Most the ones I've had the unfortunate experience of meeting were idiotic pencil pushing monkeys.

I think you just answered your own question...

chubertdev · 2014-03-31 Reply Admin

Something something monkey something football.

2014-03-31 Reply Admin

When in doubt just throw feces

2014-03-31 Reply Admin

argh... Giving developers a bad name, cmon now! A number of us pride ourselves on predominantly working VERY close to the database (or solely in the database) and would shudder to see such code.

2014-03-31 Reply Admin

For those unfamiliar with what Marle was suggesting and why.

Having the UserLoginName as an identity field means you can't (usually) specify it during the insert, it is automatically assigned a (max + increment) value preventing race conditions.

SCOPE_IDENTITY then retrieves the UserLoginName of the record you have just created for output.

@@IDENTITY is not used because if the table has a trigger that does an insert to another table with an identity field the other table's created identity would be returned.

2014-03-31 Reply Admin

The DBA should have fixed this by declaring an index on the UserLoginName column. No code changes necessary.

Also, someone should point out to Merle that they are using Firebird and therefore there's no such thing as SCOPE_IDENTITY().

2014-03-31 Reply Admin

Triggers are forbidden in most large databases (that I came across, at least), along with CLR functions... Just based on experience it could lead to very huge problems.

2014-03-31 Reply Admin

Remy Porter:
Well, yes, the results would be different- identity columns would work consistently, whereas this approach has a non-zero chance of running into a race condition and attempting to reuse the same identity.

$i have many identities.

Wait a minute, didn't $i just say that last week? Come to think of it, how did $i wait a minute? Was $i awakened by another identity performing faster tham me?

2014-03-31 Reply Admin

Lorne Kates:
Planar:
Indeed. I wonder what kind of user would want to use a GUID as their login name...
My name is actually a GUID, you just can't tell because I spell it in Canadian.

SPELL IT IN DUTCH.

2014-04-01 Reply Admin

Geoff:
Monkey:
Just realised UserLoginName is actually an integer id column. <bleurgh>

I thought that was the TWTF myself.

Me 2

chubertdev · 2014-04-01 Reply Admin

RS:
Triggers are forbidden in most large databases (that I came across, at least), along with CLR functions... Just based on experience it could lead to very huge problems.

Luddites!

2014-04-01 Reply Admin

That still does not make it anything near correct, leaving a central WTF in place. The only correct way to do something like this is to let the only instance make the ID incrementation that can do it, and that is the DB itself during the INSERT.

Therefore the pattern is indeed: use an auto-incremented ID column (the concept exists in virtually every DB, even MySQL), just do the frickin' INSERT already and ask the database afterwards to get the ID it generated for you. I learned that pattern when I did my very first PHP-based web site ages ago. No wonder current Software has to many problems with concurrency when apparently developers have no idea what they are doing.

About the GUID vs. integer ID thing: GUIDs have their uses. As soon as you are dealing with distributed data, you are in trouble when using just integer IDs. So, it depends: are you looking for a local ID or does it have to be globally unique (sic)? GUIDs exist for a reason.

Severity One · 2014-04-01 Reply Admin

The Balance:
The difference however, is that the well trained developers can usually do the job of a DBA, whereas well trained DBAs wouldn't know the first thing about development.

Our DBAs are very well trained: they do exactly as I say. :)

Joking apart, there is a general understanding between developers and DBAs that programming (stored procedures and such) is largely the domain of the developers, and when it comes to things like optimisation and keeping the damn thing running in the first place, that's where the DBAs have the expertise.

Also, the phrase "developers" is used very loosely by Marle. Some people develop cancer; that doesn't make them developers, not even if they develop the programmatic equivalent of cancer.

ShachMaT · 2014-04-01 Reply Admin

Geoff:
Monkey:
Just realised UserLoginName is actually an integer id column. <bleurgh>

I thought that was the TRWTF myself.

FTFY. I'm an ass!

2014-04-01 Reply Admin

MAX()+1 was bad back in the days when you had to work without triggers or autoincrement columns. It's even worse now.

As I recall my preferred workaround was to add one to a column in a nextid table, then select the result, then insert into the desired table, then commit, as this would at least ensure that two clients couldn't retrieve the same id. (There was of course contention for ids for otherwise unrelated tables but that was the least of their problems.)

2014-04-01 Reply Admin

Loose Bree:
GUIDo ninjas NOT wanted!
[GU]UIDs might be OK as keys in small tables, but will bloat indexes on bigger tables.

A competent developer who spends 2~4 weeks studying query planning, normalization and indexes will usually make a better DBA than most DBAs.

Depends on the purpose of the DB.

For example, if you know your dataset is limited (the amount of active customers for a local homebuilder), it's better not to normalize the data.

I've seen DBAs normalize to save space, and put the address into a separate table.

Some of our customers live in the same address, why duplicate that data?

But then, one person at that address moves and the other doesn't. How do you update the address?

Obviously you could create a new instance, but then if your UI can't do that, then your fried.

So, yeah, I'm beginning to learn that the traditional solutions employed by DBAs doesn't work for all needs.

lucidfox · 2014-04-01 Reply Admin

TRWTF is stored procedures, period.

lucidfox · 2014-04-01 Reply Admin

TRWTF is stored procedures, period.

no laughing matter · 2014-04-01 Reply Admin

Kyle Huff:
The DBA should have fixed this by declaring an index on the UserLoginName column. No code changes necessary.

An index might solve the performance issues, but it does not solve the issues with parallel access at all!

Kyle Huff:
Also, someone should point out to Merle that they are using Firebird and therefore there's no such thing as SCOPE_IDENTITY().

What in the world let's you think this is a Firebird DB?

To the contrary, this firebirdsql.org page strongly suggests the opposite!

The SQL syntax shown in the article is MS-SQL, not Firebird.

2014-04-01 Reply Admin

Not WTF here. Person unsure about SQL, but good enough to actually know some of the syntax attempts a fix and submits it for review.

Fix is reviewed and found wanting. Note providing direction sent back.

Looks good to me. Maybe the RealWTF is simply that they actually did a code review at all?

chubertdev · 2014-04-01 Reply Admin

Valued Service:
Loose Bree:
GUIDo ninjas NOT wanted!
[GU]UIDs might be OK as keys in small tables, but will bloat indexes on bigger tables.

A competent developer who spends 2~4 weeks studying query planning, normalization and indexes will usually make a better DBA than most DBAs.

Depends on the purpose of the DB.

For example, if you know your dataset is limited (the amount of active customers for a local homebuilder), it's better not to normalize the data.

I've seen DBAs normalize to save space, and put the address into a separate table.

Some of our customers live in the same address, why duplicate that data?

But then, one person at that address moves and the other doesn't. How do you update the address?

Obviously you could create a new instance, but then if your UI can't do that, then your fried.

So, yeah, I'm beginning to learn that the traditional solutions employed by DBAs doesn't work for all needs.

If my wife and I enter the same address into a system, the text values may be the same, but the system better treat them as two different addresses, because that's what they are, even if they're currently equal to each other.

chubertdev · 2014-04-01 Reply Admin

lucidfox:
TRWTF is stored procedures, period.

Again:

Luddites!

chubertdev · 2014-04-01 Reply Admin

JohnBoy:
Not WTF here. Person unsure about SQL, but good enough to actually know some of the syntax attempts a fix and submits it for review.
Fix is reviewed and found wanting. Note providing direction sent back.

Looks good to me. Maybe the RealWTF is simply that they actually did a code review at all?

It's a fairly basic WTF, but it's still a WTF. As I mentioned earlier, I've seen someone lose their job for making this mistake.

2014-04-01 Reply Admin

chubertdev:
Valued Service:
Loose Bree:
GUIDo ninjas NOT wanted!
[GU]UIDs might be OK as keys in small tables, but will bloat indexes on bigger tables.

A competent developer who spends 2~4 weeks studying query planning, normalization and indexes will usually make a better DBA than most DBAs.

Depends on the purpose of the DB.

For example, if you know your dataset is limited (the amount of active customers for a local homebuilder), it's better not to normalize the data.

I've seen DBAs normalize to save space, and put the address into a separate table.

Some of our customers live in the same address, why duplicate that data?

But then, one person at that address moves and the other doesn't. How do you update the address?

Obviously you could create a new instance, but then if your UI can't do that, then your fried.

So, yeah, I'm beginning to learn that the traditional solutions employed by DBAs doesn't work for all needs.

If my wife and I enter the same address into a system, the text values may be the same, but the system better treat them as two different addresses, because that's what they are, even if they're currently equal to each other.

That's true, since one is your address and one is your wife's, but I would sure do a lot less recycling if people would recognize that my address equals my husband's address (and my kids' address). We get three of everything from our college (where our son will attend in the fall) and we barely want one!

So there is at least one instance where the addresses should be recognized as being the same.

chubertdev · 2014-04-01 Reply Admin

married:
That's true, since one is your address and one is your wife's, but I would sure do a lot less recycling if people would recognize that my address equals my husband's address (and my kids' address). We get three of everything from our college (where our son will attend in the fall) and we barely want one!
So there is at least one instance where the addresses should be recognized as being the same.

My wife and I are smart enough not to subscribe to the same mailing lists.

Addendum (2014-04-01 14:15):

married:
That's true, since one is your address and one is your wife's, but I would sure do a lot less recycling if people would recognize that my address equals my husband's address (and my kids' address). We get three of everything from our college (where our son will attend in the fall) and we barely want one!
So there is at least one instance where the addresses should be recognized as being the same.

My wife and I are smart enough not to subscribe to the same mailing lists.

But if we did both sign up to receive a particular piece of literature, then I would expect that we both receive a copy.

2014-04-01 Reply Admin

To all who think the DBA's "solution" is correct, I dare you to implement the recommended fix in the "real world." TRWTF is the existence of the stored procedure.

Based on the original procedure, the following statements have >50% odds of being true:

There are more than one stored procedures that insert new records.
There are more than one applications that call said procedures
There are more than one applications that perform direct INSERTs against the table
There are more than one development teams that manage the unknown quantity of applications
There are more than zero applications for which a current copy of the source cannot be found
There are more than zero teams that know their applications have a dependency on this procedure
Any change management procedures that might be in place will be unable to reach more than 25% stakeholders

chubertdev · 2014-04-01 Reply Admin

chubertdev:
married:
That's true, since one is your address and one is your wife's, but I would sure do a lot less recycling if people would recognize that my address equals my husband's address (and my kids' address). We get three of everything from our college (where our son will attend in the fall) and we barely want one!
So there is at least one instance where the addresses should be recognized as being the same.

My wife and I are smart enough not to subscribe to the same mailing lists.

Addendum (2014-04-01 14:15):

married:
That's true, since one is your address and one is your wife's, but I would sure do a lot less recycling if people would recognize that my address equals my husband's address (and my kids' address). We get three of everything from our college (where our son will attend in the fall) and we barely want one!
So there is at least one instance where the addresses should be recognized as being the same.

My wife and I are smart enough not to subscribe to the same mailing lists.

But if we did both sign up to receive a particular piece of literature, then I would expect that we both receive a copy.

I'm also choking on the irony.

chubertdev · 2014-04-01 Reply Admin

Sucatraps:
To all who think the DBA's "solution" is correct, I dare you to implement the recommended fix in the "real world." TRWTF is the existence of the stored procedure.
Based on the original procedure, the following statements have >50% odds of being true:

There are more than one stored procedures that insert new records.

There are more than one applications that call said procedures

There are more than one applications that perform direct INSERTs against the table

There are more than one development teams that manage the unknown quantity of applications

There are more than zero applications for which a current copy of the source cannot be found

There are more than zero teams that know their applications have a dependency on this procedure

Any change management procedures that might be in place will be unable to reach more than 25% stakeholders

I'm guessing that a lot of your work could end up on this site.

2014-04-01 Reply Admin

chubertdev:
married:
That's true, since one is your address and one is your wife's, but I would sure do a lot less recycling if people would recognize that my address equals my husband's address (and my kids' address). We get three of everything from our college (where our son will attend in the fall) and we barely want one!
So there is at least one instance where the addresses should be recognized as being the same.

My wife and I are smart enough not to subscribe to the same mailing lists.

Addendum (2014-04-01 14:15):

married:
That's true, since one is your address and one is your wife's, but I would sure do a lot less recycling if people would recognize that my address equals my husband's address (and my kids' address). We get three of everything from our college (where our son will attend in the fall) and we barely want one!
So there is at least one instance where the addresses should be recognized as being the same.

My wife and I are smart enough not to subscribe to the same mailing lists.

But if we did both sign up to receive a particular piece of literature, then I would expect that we both receive a copy.

Sure, I'm stupid. Thanks.

We went to the same college, we are separate people, we get two. We didn't "sign up" for any of it. I guess you didn't graduate from college.

chubertdev · 2014-04-01 Reply Admin

married:
Sure, I'm stupid. Thanks.
We went to the same college, we are separate people, we get two. We didn't "sign up" for any of it. I guess you didn't graduate from college.

I graduated from a college that you can contact via phone, email, fax, etc...to ask them to only send one.

chubertdev · 2014-04-01 Reply Admin

Basically, my point is that the burden is on YOU to only have one sent, not them. That is the mistake that you made with your initial comment.

Nagesh · 2014-04-01 Reply Admin

faoileag:
Try as I might, the reason for the "while" loop in the stored procedure escapes me. Apart from the fact that it can result in an endless loop if UserLoginName is not unique, why not simply do a "set @tempid = (select max(UserLoginName) from user_table;) + 1"?
And why the advice "use SCOPE_IDENTITY to get the id for the inserted record"? If I understand the code correctly, it returns the id of the last inserted record + 1, so (presumably) the id of the next record to insert?

Or am I completely misunderstanding TransactSQL?

Rubbish!

2014-04-01 Reply Admin

no laughing matter:
Kyle Huff:
The DBA should have fixed this by declaring an index on the UserLoginName column. No code changes necessary.
An index might solve the performance issues, but it does not solve the issues with parallel access at all!
Kyle Huff:
Also, someone should point out to Merle that they are using Firebird and therefore there's no such thing as SCOPE_IDENTITY().
What in the world let's you think this is a Firebird DB?
To the contrary, this firebirdsql.org page strongly suggests the opposite!

The SQL syntax shown in the article is MS-SQL, not Firebird.

It's a joke, buddy.

2014-04-01 Reply Admin

chubertdev:
Sucatraps:
To all who think the DBA's "solution" is correct, I dare you to implement the recommended fix in the "real world." TRWTF is the existence of the stored procedure.
Based on the original procedure, the following statements have >50% odds of being true:

There are more than one stored procedures that insert new records.

There are more than one applications that call said procedures

There are more than one applications that perform direct INSERTs against the table

There are more than one development teams that manage the unknown quantity of applications

There are more than zero applications for which a current copy of the source cannot be found

There are more than zero teams that know their applications have a dependency on this procedure

Any change management procedures that might be in place will be unable to reach more than 25% stakeholders

I'm guessing that a lot of your work could end up on this site.

He has a point. Sloppy coding infers sloppy procedures in other areas so changing the underlying table by changing UserLoginName to an Identity column may break more than it fixes.

Which is I assume why Marle said "look into" making the suggested changes.

Mr. AHole DBA · 2014-04-01 Reply Admin

darkyuris:
silly DBAs. living in past, thinking that data alone is what's important.

Silly developer, not realizing that providing fast, highly concurrent, ACID compliant, data efficiently while adhering to all necessairy compliance and uptime is closer to the duties of a DBA.

It seems you're not that concerned with 'the data', you could always pipe your output to /dev/null. That'll be really fast bro, really fast!

Don't confuse your lack of understanding of the role outside of development, as the DBAs lack of understanding of 'data'. Pretty sure any good DBA with years of experience has a much better understanding of 'data' than a good dev with years of experienc

Mr. AHole DBA · 2014-04-01 Reply Admin

The Balance:
I think the number of poorly trained people who drift into becoming sysadmins and DBAs generally outweighs the number of poorly trained people who drift into becoming developers.
The difference however, is that the well trained developers can usually do the job of a DBA, whereas well trained DBAs wouldn't know the first thing about development.

Dog:
Most DBAs couldn't code their way out of a wet paper bag. Most DBAs I've worked with are arrogant, stupid, and a fucking pain to deal with.

Dog:
That's because it was written by a fucking DBA with the brain capacity of a toadstool.

Dog: That's a fitting name for you. I suggest you keep it, perhaps add 'female' to the front of it to make it more accurate... you know, focus on data quality.

It's time to feast on crying developer tears, this is my favourite part on this site. Today is a especially good day, as the developer tears seem to be emitting from plump newborn babies, making it akin to eating 'develop tears veal'.

Ok crying developers, as you overfill my cups with your delicious tears, realize a couple of things:

1- You are agreeing with your favourite whipping boy, Nagesh. Full stop. That alone should make you question your existence, but let's continue on.

2- The more you cry and scream, the more obvious it is that you aren't a big time developer and probably worked in mid tech boring ass shops without a major tech department to put your ego in the trash bin where it belongs.

If you SERIOUSLY feel that a developer can do the job of a DBA, you are working in the most simple of environments and always have. It is rather comical watching you guys think you know what being a DBA providing 5 9's uptime (most of you don't even know what that means and have to google it, you're so far removed from operations), in a highly concurrent environment (read over 5000 transactions per second per DB), on data sets terabytes upon terabytes, and that's just on the SQL Relational database side! You think you could handle a Hadoop or Google Elastic search digesting hundreds of gigs a day and handle all bottlenecks along the line?? LOL. The ego and delusion here is off the hook.

3- The crying about not having access to sensitive data or environments, shows a complete lack of understanding of security policy on the enterprise, federal, payment card or finance industries. It means you have worked in a shop that doesn't really have anything that anyone really wants, writing apps that no one really cares about as it really has no exposure.

Please, do continue to cry more. You fillith thy cup.

Mr. AHole DBA · 2014-04-01 Reply Admin

Kyle Huff:
The DBA should have fixed this by declaring an index on the UserLoginName column. No code changes necessary.
Also, someone should point out to Merle that they are using Firebird and therefore there's no such thing as SCOPE_IDENTITY().

And if the table is highly concurrent and huge? Also, why are the DBAs looked to provide band-aids for developer mistakes? It's as if the DBAs are the 'garbage collection' for the devs implementing worst-practices.

How about you fix your code so it doesn't suck and fits to the standards of a experienced developer instead of creating potentially massive IO, wasted memory, another index to be maintained every single time the base table changes, and more options for the query optimizer to review?

chubertdev · 2014-04-01 Reply Admin

Mr. AHole DBA:
Kyle Huff:
The DBA should have fixed this by declaring an index on the UserLoginName column. No code changes necessary.
Also, someone should point out to Merle that they are using Firebird and therefore there's no such thing as SCOPE_IDENTITY().

And if the table is highly concurrent and huge? Also, why are the DBAs looked to provide band-aids for developer mistakes? It's as if the DBAs are the 'garbage collection' for the devs implementing worst-practices.

How about you fix your code so it doesn't suck and fits to the standards of a experienced developer instead of creating potentially massive IO, wasted memory, another index to be maintained every single time the base table changes, and more options for the query optimizer to review?

The problem is that there are a lot of DBAs who have the attitude shown by people like you and JJ from today's story, but most of them have the skills of JJ, rather than someone like you who can actually back up and justify what he's saying. I've been lucky to put myself in a position to work with more people like you, but from what I've heard about in this industry (these sites, user group meetings, etc), there are a lot of DBAs who will do something a given way "just because", and that's not acceptable. They would have probably produced a blank stare after reading your second paragraph.

That being said, there are a similar number of developers who act this way, but it's a bit different. DBAs go into the front end code far less often than developers touch the databases, so I can understand that DBAs are a bit more defensive of their territory.

The Fix

Leave a comment on “The Fix”