Admin
There, are we happy now?
Admin
"Management was satisfied with the reduction ... " should not be taken as meaning that there actually was a reduction.
Admin
well you'd obviously need a UUID
that is, a Unique Universe IDentifier
Admin
Actually, it's (nTrolled % 1) times.
Admin
Broken clock effect. Every so often, the function would generate a duplicate hash by chance, and occasionally this would correspond to actual duplicate data.
Admin
I think it's implemented something like this: http://rubbishsoft.com/longguid/
I hate akismet with a white-hot passion! Piece of crap-ware!
Admin
Once in a while? You mean once in every 9999999999999 * 9999999999999 * 9999999999999 = 1.0E39 data entries?
Admin
It's client side. You can never guarantee that Math.Seed() and Math.Rand() have the same implementation on all clients.
Admin
So, nobody has the wit to simply put a unique constraint on the data field in question - ah, I know, the application can maintain data integrity, silly me.
Admin
Isn't that the "Right Way (TM)"?
Admin
My guess: Sam's colleague automated the de-duplication script and set it to run once per day. The "hash" code was a smokescreen to let him slough off for a week or so.
Admin
Let's say the hashes randomly match 10% of the time, regardless of any other property of the entry. Basic statistics would suggest you'd see a 10% reduction in the absolute number of duplicate entries, as long as you didn't bother to check whether there was a corresponding decline in valid entries relative to the total number of records processed.
That's assuming they even bothered to count how many duplicate records were still slipping through (which would have told anyone with a sense of what the system was doing that something wasn't right) and didn't simply count the number of rejections as the number of duplicate entries prevented, ignoring the actual duplication.
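To make that concrete, here's a tiny simulation (hypothetical numbers and code of my own, not anything from the article): a content-blind "hash" that matches a random 10% of the time rejects roughly 10% of all entries, valid or not, while most of the real duplicates still get through.
[code]
// Hypothetical simulation: a "hash" that matches a random 10% of the time
// rejects entries independently of whether they are actually duplicates.
function simulate(totalEntries, duplicateRate, falseMatchRate) {
  let rejected = 0;
  let duplicatesStored = 0;
  for (let i = 0; i < totalEntries; i++) {
    const isDuplicate = Math.random() < duplicateRate;
    const hashMatches = Math.random() < falseMatchRate; // content-independent
    if (hashMatches) {
      rejected++;            // reported as a "prevented duplicate"
    } else if (isDuplicate) {
      duplicatesStored++;    // a real duplicate sails through
    }
  }
  return { rejected, duplicatesStored };
}

console.log(simulate(100000, 0.05, 0.10));
// Roughly 10% of ALL entries get rejected (including valid ones),
// while about 90% of the real duplicates are still stored.
[/code]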
Admin
...right, but you're ignoring a key piece of information here: management noticed the decline in duplicate entries...
You have a group of people who get distracted by shiny things analyzing the statistics of their data.
Admin
As if you're surprised! This is TDWTF after all...
Admin
This is the most maddening thing about this whole article. It should be "lay."
Admin
Regarding the "post twice" vs. "multiple entry of the same data"...
I vote for the former. If people in different areas have different (unique) sources of data, and each only knows about their own source, then the latter is unlikely to happen.
That doesn't excuse this being a poor way to handle it, though...
Admin
I really do wish all you losers who use client side scripts to validate data would just dry up and blow away.
Admin
Did anybody else hear the whooshing sound? It gets louder every time I hear it...
Huh, weird. The way things are going today, I'm sure we'll hear it again, and again...
Admin
There's no WTF here. OK, except maybe that the OP and his colleague were working with unclear requirements. And that the unique ID generator is called a hash. And languages that define modulo arithmetic on non-integers. And that the modulo operator was a no-op due to the input data. And the unnecessary use of integers more than 32 bits wide. But yeah, other than that stuff, not a WTF.
Admin
Validating client side AND server side saves electrons. Not everyone has high-speed internet access, and not everyone enjoys an 800 KB page refresh with each incorrect form submission.
Admin
I think the "idea" was to create a hash of all the field values and pass around a single hash value rather than comparing each field every time (or passing around possibly 30, 50, etc. values). In fact, I would store the hash of the record in an indexed column for easy comparison (assuming they have a DB)... Of course, this would require a trigger to ensure the hash is recomputed whenever the data changes, etc. Conceptually it's a good idea, but the implementation was an epic fail.
I once read that the only thing worse than inaccurate data is inaccurate data that you think is right...
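For what it's worth, a minimal sketch of that idea (hypothetical field names, Node's built-in crypto module, not the code from the article): derive the hash deterministically from the normalized field values, so identical data always yields an identical digest.
[code]
// Hypothetical sketch: a deterministic content hash over a record's fields,
// so duplicate detection reduces to comparing a single digest string.
const crypto = require('crypto');

function recordHash(record) {
  // Sort the keys so field order can't change the hash.
  const canonical = Object.keys(record)
    .sort()
    .map(key => `${key}=${String(record[key]).trim().toLowerCase()}`)
    .join('|');
  return crypto.createHash('sha256').update(canonical).digest('hex');
}

const a = { name: 'Ada Lovelace', region: 'EU', amount: '42' };
const b = { region: 'EU', amount: '42', name: 'Ada Lovelace ' };

console.log(recordHash(a) === recordHash(b)); // true: same data, same digest
[/code]
Unlike a random "hash", a digest like this can sit in an indexed (or unique-constrained) column, and the database itself will refuse exact duplicates.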
Admin
(Feeling a bit trolled somehow...)
Admin
Are you sure it would happen randomly? Because Math.random() % 1 isn't a random value - it will always be 0, so the hashed value will always be "000000000000000000000000000000000000000000000000".
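For illustration only - this is a guess at the failure mode based on hints elsewhere in this thread, not the article's actual code - any integer taken modulo 1 is 0, so a "hash" assembled from such pieces degenerates into a fixed string of zeros:
[code]
// Illustration only (a guess at the failure mode, not the article's code):
// rounding makes the operand an integer, and any integer % 1 is 0, so the
// "hash" is the same run of zeros every time.
function bogusHash(length) {
  let hash = '';
  for (let i = 0; i < length; i++) {
    hash += Math.round(Math.random() * 9999999999999) % 1;
  }
  return hash;
}

console.log(bogusHash(48)); // "000...0" - every call returns identical zeros
[/code]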
Admin
Oh, god, the pain. It is blinding.
I have got to stop coming by TDWTF.
Admin
Once in 1000000000000000000000000000000000000000000000000. Yeah, I really don't think you'll ever have the same hash twice.
Admin
Yes, this code is a massive WTF. But the biggest WTF is their business process. They should be splitting up the work when it's assigned, so that no two people ever perform duplicate work in the first place.
Admin
Yup! This is no guarantee against collisions.
Admin
No, actually the code is the larger WTF, because duplicate data can be handled, whereas coordinating the efforts of users across the world could be a physical impossibility...
Management trusted the developer to build them something to prevent this problem; instead they got something that actually makes them lose data and only randomly may or may not prevent duplicate data from being entered.
I'd rather have duplicate data than no/missing data.
Admin
You replied to a comment on TDWTF, therefore you've been trolled.
I replied to a comment on TDWTF, therefore I've been trolled. Anybody who disagrees...
Admin
Nagesh and his outsourcing office strike again!
Admin
Oh. Oops.
Admin
FTFY
Admin
[q]Sam wanted to use the hashing logic for a similar problem. [/q]
TRWTF is code reuse in an enterprise situation, amirite?
Admin
You're in luck! You're no longer coming to TDWTF, you're coming to the TOEFDWTF.
(That's the once every few days WTF)
Admin
You only need to compute the hash for each string once (expensive) and can then compare the hashes (very cheap).
Comparing the strings unhashed would be moderately expensive for every comparison.
If the data were guaranteed to be short, then perhaps you would not benefit much from the hash.
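A quick sketch of that trade-off (hypothetical names, Node's crypto module, not the article's code): hash each entry once, and every duplicate check afterwards is a cheap lookup of a fixed-size digest instead of a comparison against full strings.
[code]
// Sketch of the trade-off: pay the hashing cost once per entry, then
// duplicate checks are constant-time lookups on a fixed-size digest
// rather than comparisons against every stored string.
const crypto = require('crypto');
const seenDigests = new Set();

function isDuplicate(entry) {
  const digest = crypto.createHash('sha256').update(entry).digest('hex');
  if (seenDigests.has(digest)) {
    return true;
  }
  seenDigests.add(digest);
  return false;
}

console.log(isDuplicate('ACME Corp,2011-06-13,42.00')); // false - first time seen
console.log(isDuplicate('ACME Corp,2011-06-13,42.00')); // true  - exact repeat
[/code]
As the comment above says, for very short strings the hashing overhead may outweigh the savings.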
Admin
Is that how it sounded to you?