The Daily WTF: Curious Perversions in Information Technology

2017-08-22 Reply Admin

meh meh

2017-08-22 Reply Admin

I would have been first, but after the confirmation page my comment wasn't there.

2017-08-22 Reply Admin

(6 minutes later) First!

2017-08-22 Reply Admin

Time to work out, OP, time to work out!

2017-08-22 Reply Admin

Lisa had already fixed the problem - INCORRECTLY

the correct approach would have been to set a flag preventing deletion until the confirmation had been generated and viewed.

Man with one watch always knows the time; man with two watches is never quite sure.

2017-08-22 Reply Admin

Lisa had already fixed this race condition: wait at least 5 minutes before moving the data

Lisa has the best fixes.

(Better fix: log the transaction, read the log to generate the confirmation page)

2017-08-22 Reply Admin

Yep, if your "solution" to a race condition is just jamming in a delay somewhere, all you're doing is kicking the can down the road, as Lisa found out. Eventually your application will be a horrible tangle of conflicting waits that are impossible to debug when the timebomb inevitably explodes. As IP Guru says, the only real way to solve a concurrency problem is to build in positive confirmation that whatever was supposed to happen actually happened.

2017-08-22 Reply Admin

Oh really? And what if the user never ends up requesting that confirmation page? That is a much harder issue to control than making sure the clock is in sync (or making sure your timer relies on just one clock or runs as a change interval).

Your 'solution' is just as flawed if not more so since in your solution the data would stay there forever and the transaction would never be executed at all. In that case I prefer a buggy confirmation page

In a greenfield situation and recognizing this issue upfront you would most likely model the flow differently. For instance caching the confirmation text for a limited time could have been an option.

2017-08-22 Reply Admin

Several here have given options (Log, flag, etc). Another method is to have the DB tell the backend which records to move (Alter the SELECT statement to filter out anything not ready to move). With proper coding this happens in a Stored Procedure... wait, which website is this again?
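A minimal sketch of that filter, using Python's built-in sqlite3 so it runs anywhere. The table and column names (`applications`, `paid`, `confirmation_shown`, `expires_at`) are made up for illustration and are not from the article. The `expires_at` deadline is an assumption added so that records whose owner never views the confirmation still get moved.

```python
import sqlite3

# Hypothetical schema: one row per application in the front-end database.
SCHEMA = """
CREATE TABLE applications (
    id INTEGER PRIMARY KEY,
    paid INTEGER NOT NULL,
    confirmation_shown INTEGER NOT NULL,
    expires_at INTEGER NOT NULL  -- deadline in seconds, on the database's clock
)
"""

def ready_to_move(conn, now):
    """Ids the back end may take: paid, and either confirmed or past the deadline."""
    rows = conn.execute(
        "SELECT id FROM applications "
        "WHERE paid = 1 AND (confirmation_shown = 1 OR expires_at <= ?) "
        "ORDER BY id",
        (now,),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO applications VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 500),  # paid, confirmation viewed: ready
        (2, 1, 0, 500),  # paid, not viewed, deadline not reached: wait
        (3, 1, 0, 100),  # paid, never viewed, deadline passed: ready
        (4, 0, 0, 100),  # unpaid: never moved
    ],
)
print(ready_to_move(conn, now=200))  # prints [1, 3]
```

The back end only ever asks the database what is ready, so it never has to compare its own clock against anyone else's.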

2017-08-22 Reply Admin

That's absolutely a situation that happens. Some users are using a cell phone, some have a bad connection, etc. It's entirely possible (and it does happen) for a connection to drop between paying for the document and seeing the confirmation page. Setting a flag would result in missing applications for these users.

2017-08-22 Reply Admin

So... a flag?

2017-08-22 Reply Admin

Lots of backseat driving here when TRWTF is having any server without NTP sync.

2017-08-22 Reply Admin

Another port and service to run on government-owned machines? Machines of unknown quantity and purpose with this level of legacy-ness? No thanks!

2017-08-22 Reply Admin

In fact, TRWTF is comparing timestamps created by different machines. The database is the single source of truth, and database time should be the single source of time!
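A rough sketch of what "database time only" could look like, again with sqlite3. The `transactions` table is hypothetical. Note that SQLite runs in-process, so here the "database clock" is really the host's clock; with a client/server database the same pattern would read the server's clock instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The database stamps each row itself; the application never supplies a time.
conn.execute(
    "CREATE TABLE transactions ("
    " id INTEGER PRIMARY KEY,"
    " created_at INTEGER NOT NULL"
    "   DEFAULT (CAST(strftime('%s', 'now') AS INTEGER))"
    ")"
)

def record_transaction(conn):
    """Insert a row and let the database fill in created_at."""
    cur = conn.execute("INSERT INTO transactions DEFAULT VALUES")
    return cur.lastrowid

def older_than(conn, seconds):
    """Ids at least `seconds` old, judged entirely by the database's clock."""
    rows = conn.execute(
        "SELECT id FROM transactions "
        "WHERE CAST(strftime('%s', 'now') AS INTEGER) - created_at >= ? "
        "ORDER BY id",
        (seconds,),
    )
    return [row[0] for row in rows]
```

Both the stamp and the age check come from the same clock, so drift between the web server and the back end stops mattering for this comparison.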

2017-08-22 Reply Admin

I'd say there is more than one issue here, the biggest one being: why is the taxpayer paying someone who isn't capable of setting up NTP sync on a government-owned server, especially since there are so many time-sensitive components on the system (certificates being one, services, etc.)?

The other issue is that this was implemented in a way that was more convenient in terms of development effort vs using proper development principles that would be independent of time (state machines come to mind).
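For what it's worth, a time-independent state machine could look something like the sketch below. The state names are guesses at the workflow in the article, not the real system's.

```python
from enum import Enum

class State(Enum):
    SUBMITTED = "submitted"
    PAID = "paid"
    CONFIRMED = "confirmed"
    MOVED = "moved"

# Allowed transitions; anything not listed here is rejected.
TRANSITIONS = {
    State.SUBMITTED: {State.PAID},
    State.PAID: {State.CONFIRMED},
    State.CONFIRMED: {State.MOVED},
    State.MOVED: set(),
}

def advance(current, target):
    """Return the new state, or raise if the workflow does not allow the move."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

With this table the data cannot be moved until the confirmation step has happened, no matter what any clock says. A dropped connection would need its own explicit edge (an expiry event, say), which this sketch leaves out.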

The_Dark_Lord · 2017-08-22 Reply Admin

Another solution would be to COPY the data from the front-end DB to the back-end DB, then erase from the front-end DB only the PII (or obfuscate, anonymise, etc.), leaving generic data such as transaction number, amount, etc.

Then the record would always be there to display.
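Something like this, as a toy sketch with plain dicts standing in for the two databases. The field names and the `PII_FIELDS` set are invented for the example.

```python
PII_FIELDS = {"name", "address", "date_of_birth"}  # hypothetical field names

def scrub(record):
    """Return a copy of the record with the PII fields dropped."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}

def transfer(txn_id, frontend, backend):
    """Copy the full record to the back end, then leave only a stub up front."""
    record = frontend[txn_id]
    backend[txn_id] = dict(record)
    frontend[txn_id] = scrub(record)

frontend = {
    "T-1001": {
        "name": "A. Applicant",
        "address": "1 Example St",
        "date_of_birth": "1970-01-01",
        "amount": "25.00",
        "transaction_id": "T-1001",
    }
}
backend = {}
transfer("T-1001", frontend, backend)
print(frontend["T-1001"])
```

The stub left in the front end still carries enough to render a confirmation page, with nothing sensitive in it.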

2017-08-22 Reply Admin

Then the frontend would fill up with records and the application would get slow and laggy and somebody would have to push the 'Turbo' button on the computer they are using for a server.

2017-08-22 Reply Admin

It takes more than an NTP server when it's a VM; you have to get something in there like VMware Tools to avoid clock drift.

2017-08-22 Reply Admin

Driver's licenses have to be issued, as do pet licenses. Buildings have to be inspected and certified

Dear Jane, in some countries pets (or their owners) don't need any licenses. Still, they all live. Mrs. Benz didn't need a license to safely drive a car. And let's not forget about the ancient Egyptians, who built the pyramids without any certification AFAICT. So no, they don't "have to". You're forced to "have to". And to think this is the way it has to be.

HTH Michał

2017-08-22 Reply Admin

This is not a political forum.

HTH.

2017-08-22 Reply Admin

What happens when the backend moves to a new timezone?

2017-08-22 Reply Admin

Driver's licenses have to be issued, as do pet licenses.

Wait, what? When did that happen?

2017-08-22 Reply Admin

+1 for the old-school "turbo button" reference

2017-08-22 Reply Admin

In that case, you (the server) rely upon a timeout.

Say, a five minute timeout.

A five minute timeout that applies on your own machine, the server, not a random timeout that relies upon some spavined "universal clock" out there.

You see where I'm going with this? (Probably not.)
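To spell it out, here is a minimal sketch of a timeout that lives entirely on one machine. It uses Python's `time.monotonic`, which is not affected by wall-clock adjustments. The `LocalDeadline` class is a made-up name for illustration.

```python
import time

class LocalDeadline:
    """A timeout measured on this machine's monotonic clock only."""

    def __init__(self, seconds, clock=time.monotonic):
        self._clock = clock  # injectable so the logic can be tested
        self._expires = clock() + seconds

    def expired(self):
        return self._clock() >= self._expires

# Usage: start a five minute deadline when the payment is taken.
deadline = LocalDeadline(5 * 60)
print(deadline.expired())  # prints False
```

Because start and end are read from the same clock, it does not matter whether any other server agrees on what time it is.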

2017-08-22 Reply Admin

As to the TNP, there was in fact one (well, several in each layer of the network). But policy (not unreasonably), is to have everything locked down by default, e.g., no incoming/outgoing connections blocked. Each one is opened up as needed. What was missed was opening up the port to the TNP.

As to stubbing the data, yes, we actually do that for a portion of the app, but even that is short-lived due to a very stringent data protection policy (again, not unreasonable given that it's people's data and a government system), so this too is a limited solution in this case.

2017-08-22 Reply Admin

**correction: that should read "ALL incoming/outgoing connections blocked"

2017-08-22 Reply Admin

Yeah, we had a customer who complained that the time stamps in our database were different depending on which time zone their users were in. We did an analysis and some of the time we'd get the time from the OS on the user's workstation and some of the time it was retrieved from the DB server. Obviously the DB server is correct but about 75% of the calls were to the OS. So we told the customer, sorry, this is how it works. Management determined that it wasn't worth the time to fix it.

2017-08-22 Reply Admin

He's the kind of guy that when someone says "but the program has to support incoming faxes", he goes off on how they didn't even have faxes in the old days and people got along just fine, and in fact we don't need computer programs for anything. Man has lived for eons without computers or electricity or any of the creature comforts we have today.

2017-08-22 Reply Admin

We have an API that uses an HMAC for verification. One of the fields that we ask to be passed is the date that the system generated the request; we then invalidate requests we deem to be too old.

This was great until we expanded to a different time zone and no one could figure out why all of the requests were being bounced!
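A minimal sketch of how a check like that can be made timezone-proof: sign and compare Unix epoch seconds, which are the same everywhere, instead of a local date string. The message layout, the `MAX_AGE_SECONDS` window, and the function names are assumptions for the example, not a description of our actual API.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 300  # assumed freshness window

def sign(secret, body, timestamp):
    """HMAC-SHA256 over the epoch timestamp and the request body."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify(secret, body, timestamp, signature, now=None):
    """Reject stale requests first, then compare signatures in constant time."""
    if now is None:
        now = int(time.time())  # epoch seconds, independent of timezone
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return False
    return hmac.compare_digest(sign(secret, body, timestamp), signature)
```

The timestamp is part of the signed message, so a caller cannot freshen up an old request by changing the date field alone.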

2017-08-22 Reply Admin

I usually sit on my backend when I'm passing into another timezone

2017-08-22 Reply Admin

UTC is your friend. We do the same, using UTC without issue (I think there were some issues early in development, but they were resolved quickly).

The only issue we had was when testing on an iPad that wasn't set to auto-sync. The time eventually drifted more than two minutes out, so we had authentication failures.

Why wasn't it set to auto-sync? To test the timeout of course!

2017-08-22 Reply Admin

NTP is great - except in one case it went bananas (some odd interaction with the virtual machine whatsits) and my server ended up several hundred years in the past.

Clearing up the database was ... Fun?

2017-08-22 Reply Admin

Several hundred years? Wow!

2017-08-22 Reply Admin

NTP should be one of these exclusion cases. There's no reason for all the servers not to be using NTP, if only to permit proper debugging.

gordonjcp · 2017-08-23 Reply Admin

I don't understand why you think not deleting the flag to say "show a confirmation page" that's been set after the transaction is processed would somehow stop the transaction being processed.

2017-08-23 Reply Admin

I can think of a couple of reasons for a government server not to use NTP by default: https://www.cvedetails.com/vulnerability-list/vendor_id-2153/NTP.html

2017-08-23 Reply Admin

The confirmation page requires the data. The transaction removes the data. The transaction cannot proceed until the confirmation page has unset the flag.

2017-08-23 Reply Admin

You are a moron if you think either that NTP is not essential, or that those CVEs are remotely relevant for ntpd running as an NTP client.

As per the OP, the server was using NTP anyway (for some reason she is calling it "TNP"; I'm guessing she means NTP and not picric acid), but they had misconfigured the firewall to block responses. So yeah, typical government IT.

2017-08-23 Reply Admin

https://en.wikipedia.org/wiki/Etiquette_in_technology#Netiquette

2017-08-23 Reply Admin

Pre-generate the confirmation page, then delete the data. When the user requests the confirmation page they'll see the pre-generated page even though the data's probably already gone. Have a timeout to delete the pre-generated page in case the user never requests it.
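Roughly like this, as an in-memory sketch. The `ConfirmationCache` class and its method names are hypothetical, and a real deployment would presumably want the cache shared between web servers.

```python
import time

class ConfirmationCache:
    """Holds pre-rendered confirmation pages for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._pages = {}  # txn_id -> (expires_at, html)

    def store(self, txn_id, html):
        """Save the rendered page before the source data is deleted."""
        self._pages[txn_id] = (self._clock() + self._ttl, html)

    def fetch(self, txn_id):
        """Return the page, or None if it is missing or has expired."""
        entry = self._pages.get(txn_id)
        if entry is None or self._clock() >= entry[0]:
            self._pages.pop(txn_id, None)
            return None
        return entry[1]

    def purge(self):
        """Drop pages nobody came back for; returns how many were removed."""
        now = self._clock()
        expired = [k for k, (expires_at, _) in self._pages.items() if now >= expires_at]
        for key in expired:
            del self._pages[key]
        return len(expired)
```

The timeout here only decides when a cached copy is thrown away, so getting it slightly wrong costs a user a confirmation page at worst and never touches the transaction itself.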

2017-08-23 Reply Admin

The real WTF is not implementing some sort of lock to ensure that the data isn't deleted from the frontend database until it's no longer needed.

Also just because a database is "temporary" doesn't provide an excuse to make it insecure.

2017-08-23 Reply Admin

Sure the frontend DB would grow, but you could easily create a maintenance task to control that. For example, set a TTL on complete records so after they are pushed to the backend DB they are pruned from the frontend DB in 48 hours (or whatever turns out to be a suitable lifecycle).

Your database would be larger than it is currently, but the TTL would keep it at a fairly stable size (assuming consistent traffic). I have a few PII collections which use TTLs to keep us from falling foul of compliance: the data expires before it could become an issue, even if nothing else cleans it up first.

2017-08-23 Reply Admin

No no no.

What you missed is that any form of brute-force timeout is completely the wrong solution for a state-based workflow like this. What you needed was a proper message-based protocol.

It's depressing that useless dingbat hacks like this are used anywhere at all in a server-based IT system these days. I suppose it's no surprise that they are used in GovIT, and the resultant cretinism is actually defended by the ignorant fools in charge.

2017-08-24 Reply Admin

Absolutely, trying to "Fix" a race condition by shoving lead boots on one of the participants is just storing up trouble for later, and making it all the more confusing for the poor sod who has to deal with it then.

I've seen it in the private sector too, but then I guess firms that routinely tolerate hacky crap like this gradually drift out of business as their IT fails to deliver, and I've seen that too.

More poor-to-crappy govt IT; been feeding my family fixing* that shite for years now.

* alright, keeping it limping down the road.

2017-08-24 Reply Admin

https://www.youtube.com/watch?v=pnq96W9jtuw

2017-08-26 Reply Admin

If the server was an IBM mainframe it might be impossible to change the time without rebooting.... or buying an incredibly expensive piece of add on equipment.

And the initial time might be set manually by the operators doing the reboot.

There's probably an LCD display blinking 00:00.

(There's a tiny bit of justification for the design, since the hardware guarantees that no two timestamps can be the same - the least significant bits of the timestamp are uniquifiers - but when you're trying to align logs, write code to parse the binary times, with all the leap seconds, can't figure out WTF is going on, then realize that it's a surreal time clock, there's a tiny bit of justification for breaking the designers on the flywheel).

2017-08-29 Reply Admin

As engineers, we (all of us) have to have a certain amount of pragmatism when applying any kind of solution. There are always situations where non-optimal solutions are put in place as a stopgap to keep things running while a more appropriate solution is created. The story above is one such situation...and it worked for the time it was implemented. We did eventually roll out the stubbing approach, but the "kick the can down the road" solution worked well enough in the interim (zero timed out applications - it demonstrably did SOMETHING right) to buy time to address the issue more appropriately.

As to the message based protocol - it IS a message based protocol. However, "message end" is determined to be when payment is taken (for reasons already outlined by others here), and the confirmation page was left a little in limbo. Hence the need for a timeout/stub to display the info to the user while the "message" was being moved and processed.

Scarlet_Manuka · 2017-08-31 Reply Admin

Have a timeout to delete the pre-generated page in case the user never requests it.

Yeah, that should work. Let's see, how long to make the timeout? Five minutes seems pretty reasonable to me...

2017-09-04 Reply Admin

Lisa's fix was stupid! She should connect the process of moving the data and generating the confirmation page so that the data will be moved because the confirmation page is done being generated instead of because some arbitrary time lapse has passed! Her way of "fixing" this problem leads to a fragile system!

Time for Lisa to stop playing around and go back to the kitchen!

2017-09-04 Reply Admin

"Stupid" would perhaps in this case be defined as:

not reading the previous comments outlining why that solution isn't viable
assuming that you fully understand the complexity of a system based on a brief description
assuming that the gender hasn't been anonymized away along with the name of the OP in the article

Time To Transfer
