  • meh (unregistered)

    meh meh

  • User Number 1 (unregistered)

    I would be first. But after the confirmation page my comment wasn't there.

  • User Number 1 (unregistered)

    (6 minutes later) First!

  • Troll (unregistered)

    Time to work out, OP, time to work out!

  • IP Guru (unregistered)

    Lisa had already fixed the problem - INCORRECTLY

    the correct approach would have been to set a flag preventing deletion until the confirmation had been generated and viewed.

    Man with one watch always knows the time, man with two watches is never quite sure

  • Unhelpful (unregistered)

    Lisa had already fixed this race condition: wait at least 5 minutes before moving the data

    Lisa has the best fixes.

    (Better fix: log the transaction, read the log to generate the confirmation page)

  • Brian (unregistered) in reply to IP Guru

    Yep, if your "solution" to a race condition is just jamming in a delay somewhere, all you're doing is kicking the can down the road, as Lisa found out. Eventually your application will be a horrible tangle of conflicting waits that are impossible to debug when the timebomb inevitably explodes. As IP Guru says, the only real way to solve a concurrency problem is to build in positive confirmation that whatever was supposed to happen actually happened.

  • Moi (unregistered) in reply to Brian

    Oh really? And what if the user never ends up requesting that confirmation page? That is a much harder issue to control than making sure the clock is in sync (or making sure your timer relies on just 1 clock or runs as a change-interval).

    Your 'solution' is just as flawed, if not more so, since in your solution the data would stay there forever and the transaction would never be executed at all. In that case I prefer a buggy confirmation page.

    In a greenfield situation and recognizing this issue upfront you would most likely model the flow differently. For instance caching the confirmation text for a limited time could have been an option.

  • Random DB Guy (unregistered)

    Several here have given options (Log, flag, etc). Another method is to have the DB tell the backend which records to move (Alter the SELECT statement to filter out anything not ready to move). With proper coding this happens in a Stored Procedure... wait, which website is this again?
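
A minimal sketch of the "let the DB decide what is ready to move" idea, written against SQLite for brevity. The `applications` table, its `paid_at` and `confirmation_viewed` columns, and the five-minute fallback are all assumptions for illustration, not the system described in the article. The cutoff is computed by the database's own clock, so no timestamps from different machines are compared, and the fallback covers users who never load the confirmation page.

```python
import sqlite3

# Hypothetical schema: applications(id, paid_at TEXT in UTC, confirmation_viewed INTEGER).
# A record is ready once the confirmation was viewed, or once five minutes
# have passed since payment according to the database clock.
READY_SQL = """
SELECT id FROM applications
WHERE paid_at IS NOT NULL
  AND (confirmation_viewed = 1
       OR paid_at <= datetime('now', '-5 minutes'))
ORDER BY id
"""


def records_ready_to_move(conn):
    """Return the ids the mover may transfer right now."""
    return [row[0] for row in conn.execute(READY_SQL)]
```

The mover would call `records_ready_to_move(conn)` on each pass and transfer only those ids; unpaid or still-pending rows are simply not selected.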

  • "Lisa" (unregistered) in reply to Moi

    That's absolutely a situation that happens. Some users are using a cell phone, some have a bad connection, etc. It's entirely possible (and does), for a connection to drop between paying for the document and seeing the confirmation page. Setting a flag would result in missing applications for these users.

  • Unhelpful (unregistered) in reply to Random DB Guy

    So... a flag?

  • I dunno LOL ¯\(°_o)/¯ (unregistered)

    Lots of backseat driving here when TRWTF is having any server without NTP sync.

  • Idort (unregistered) in reply to I dunno LOL ¯\(°_o)/¯

    Another port and service to run on government-owned machines? Machines of unknown quantity and purpose with this level of legacy-ness? No thanks!

  • Idort (unregistered) in reply to I dunno LOL ¯\(°_o)/¯

    In fact, TRWTF is comparing timestamps created by different machines. The database is the single source of truth, and database time should be the single source of time!

  • Trolololol (unregistered)

    I'd say there is more than one issue here, the biggest one being: why is the taxpayer paying someone who isn't capable of setting up NTP sync on a government owned server, especially since there are so many time sensitive components on the system, certificates being one, services etc.

    The other issue is that this was implemented in a way that was more convenient in terms of development effort vs using proper development principles that would be independent of time (state machines come to mind).

  • The_Dark_Lord (nodebb)

    Another solution would be to COPY the data from the front-end DB to the back-end DB, then erase from the front-end DB only the PII (or obfuscate, anonymise, etc.), leaving generic data such as transaction number, amount, etc.

    Then the record would always be there to display.

  • Jerepp (unregistered) in reply to The_Dark_Lord

    Then the frontend would fill up with records and the application would get slow and laggy and somebody would have to push the 'Turbo' button on the computer they are using for a server.

  • Waldo (unregistered) in reply to I dunno LOL ¯\(°_o)/¯

    Takes more than an NTP server when it's a VM, you have to get something in there like VMware tools to avoid clock drift.

  • Michał (unregistered)

    Driver's licenses have to be issued, as do pet licenses. Buildings have to be inspected and certified

    Dear Jane, in some countries pets (or their owners) don't need any licenses. Still, they all live. Mrs. Benz didn't need a license to safely drive a car. And let's not forget about ancient Egyptians, who built the pyramids without any certification AFAICT. So no, they don't "have to". You're forced to "have to". And to think this is the way it has to be.

    HTH Michał

  • Idort (unregistered) in reply to Michał

    This is not a political forum.

    HTH.

  • John (unregistered)

    What happens when the backend moves to a new timezone?

  • masonwheeler (github)

    Driver's licenses have to be issued, as do pet licenses.

    Wait, what? When did that happen?

  • i_like_speed (unregistered) in reply to Jerepp

    +1 for the old-school "turbo button" reference

  • Sole Purpose of VIsit (unregistered) in reply to Moi

    In that case, you (the server) rely upon a timeout.

    Say, a five minute timeout.

    A five minute timeout that applies on your own machine, the server, not a random timeout that relies upon some spavined "universal clock" out there.

    You see where I'm going with this? (Probably not.)

  • "Lisa" (unregistered)

    As to the TNP, there was in fact one (well, several in each layer of the network). But policy (not unreasonably), is to have everything locked down by default, e.g., no incoming/outgoing connections blocked. Each one is opened up as needed. What was missed was opening up the port to the TNP.

    As to stubbing the data, yes, we actually do that for a portion of the app, but even that is short-lived due to a very stringent data protection policy (again, not unreasonable given it's people's data and a government system), so this too is a limited solution in this case.

  • "Lisa" (unregistered)

    **correction: that should read "ALL incoming/outgoing connections blocked"

  • P.B. Floyd (unregistered) in reply to Idort

    Yeah, we had a customer who complained that the time stamps in our database were different depending on which time zone their users were in. We did an analysis and some of the time we'd get the time from the OS on the user's workstation and some of the time it was retrieved from the DB server. Obviously the DB server is correct but about 75% of the calls were to the OS. So we told the customer, sorry, this is how it works. Management determined that it wasn't worth the time to fix it.

  • P.B. Floyd (unregistered) in reply to Idort

    He's the kind of guy that when someone says "but the program has to support incoming faxes", he goes off on how they didn't even have faxes in the old days and people got along just fine, and in fact we don't need computer programs for anything. Man has lived for eons without computers or electricity or any of the creature comforts we have today.

  • WonkoTheSane (unregistered)

    We have an API that uses an HMAC for verification. One of the fields that we ask to be passed is the date that the system generated the request; we then invalidate requests we deem to be too old.

    This was great until we expanded to a different time zone and no one could figure out why all of the requests were being bounced!

  • BetterSafeThanSorry (unregistered) in reply to John

    I usually sit on my backend when I'm passing into another timezone

  • Gerry (unregistered) in reply to WonkoTheSane

    UTC is your friend. We do the same, using UTC without issue (I think there were some early on in development, but they were resolved quickly).

    Only issue we had was testing on an iPad that wasn't set to auto-sync. The time eventually drifted more than two minutes out, so we had authentication failures.

    Why wasn't it set to auto-sync? To test the timeout of course!
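
A rough sketch of the HMAC-plus-timestamp scheme discussed in these two comments, assuming the timestamp is sent as whole seconds since the Unix epoch (which does not depend on the sender's time zone). The function names, the message layout, and the 120-second tolerance are illustrative assumptions, not either commenter's actual API.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 120  # assumed tolerance; pick whatever clock drift you can accept


def sign(secret: bytes, body: bytes, timestamp: int) -> str:
    """HMAC-SHA256 over the timestamp and body, so neither can be altered."""
    message = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, timestamp: int, signature: str, now=None) -> bool:
    """Reject requests that are too old (or too far in the future) or badly signed."""
    if now is None:
        now = int(time.time())  # epoch seconds, the same value in every time zone
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return False
    expected = sign(secret, body, timestamp)
    return hmac.compare_digest(expected, signature)  # constant-time comparison
```

Clock drift still matters with this approach: a client whose clock is off by more than the tolerance fails verification, as the iPad did.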

  • Anonymous Coward (unregistered)

    NTP is great - except in one case it went bananas (some odd interaction with the virtual machine whatsits) and my server ended up several hundred years in the past.

    Clearing up the database was ... Fun?

  • Olivier (unregistered) in reply to Anonymous Coward

    Several hundred years? Wow!

  • Olivier (unregistered) in reply to "Lisa"

    NTP should be one of these exclusion cases; there's no reason for all the servers not to use NTP, if only to permit proper debugging.

  • gordonjcp (nodebb) in reply to Moi

    I don't understand why you think not deleting the flag to say "show a confirmation page" that's been set after the transaction is processed would somehow stop the transaction being processed.

  • Idort (unregistered) in reply to Olivier

    I can think of a couple of reasons for a government server not to use NTP by default: https://www.cvedetails.com/vulnerability-list/vendor_id-2153/NTP.html

  • Idort (unregistered) in reply to gordonjcp

    The confirmation page requires the data. The transaction removes the data. The transaction cannot proceed until the confirmation page has unset the flag.

  • Herr Otto Flick (unregistered) in reply to Idort

    You are a moron if you think either that NTP is not essential, or that those CVEs are remotely relevant for ntpd running as an NTP client.

    As per the OP, the server was using NTP anyway (which, for some reason, she is calling "TNP"; I'm guessing she means NTP and not picric acid), they had misconfigured the firewall to block responses. So yeah, typical government IT.

  • A moron (unregistered) in reply to Herr Otto Flick

    https://en.wikipedia.org/wiki/Etiquette_in_technology#Netiquette

  • anonymous (unregistered) in reply to Moi

    Pre-generate the confirmation page, then delete the data. When the user requests the confirmation page they'll see the pre-generated page even though the data's probably already gone. Have a timeout to delete the pre-generated page in case the user never requests it.
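
One way to sketch the pre-generated page idea, assuming an in-memory store keyed by a transaction id. The class name, the 300-second default, and the injectable `clock` are hypothetical. The expiry uses a single local monotonic clock, so it does not depend on clocks agreeing across machines.

```python
import time


class ConfirmationCache:
    """Holds pre-rendered confirmation pages for a limited time."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # one local clock; nothing is compared across machines
        self._pages = {}     # transaction id -> (expiry, rendered page)

    def store(self, txn_id, rendered_page):
        """Save the page rendered before the source data is deleted."""
        self._pages[txn_id] = (self._clock() + self._ttl, rendered_page)

    def fetch(self, txn_id):
        """Return the page, or None if it was never stored or has expired."""
        entry = self._pages.get(txn_id)
        if entry is None:
            return None
        expires_at, page = entry
        if self._clock() >= expires_at:
            del self._pages[txn_id]
            return None
        return page

    def purge_expired(self):
        """Drop pages nobody requested in time; returns how many were removed."""
        now = self._clock()
        stale = [key for key, (expiry, _) in self._pages.items() if now >= expiry]
        for key in stale:
            del self._pages[key]
        return len(stale)
```

The flow would be: render the page, `store` it, then let the mover delete the data at any point. A real deployment would need a store shared between web processes; the in-memory dictionary here only illustrates the shape.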

  • anonymous (unregistered)

    The real WTF is not implementing some sort of lock to ensure that the data isn't deleted from the frontend database until it's no longer needed.

    Also just because a database is "temporary" doesn't provide an excuse to make it insecure.

  • MoreThanOneStepSolution (unregistered) in reply to Jerepp

    Sure the frontend DB would grow, but you could easily create a maintenance task to control that. For example, set a TTL on complete records so after they are pushed to the backend DB they are pruned from the frontend DB in 48 hours (or whatever turns out to be a suitable lifecycle).

    Your database would be larger than it is currently but the TTL would keep it at a fairly stable size (assuming consistent traffic). I have a few PII collections which use TTLs to ensure we don't fall foul of compliance by expiring data before it could become an issue if it isn't cleaned before then.
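
The maintenance task described here might look roughly like the following, sketched against SQLite. The `applications` table and its `pushed_at` column (UTC text set when the record reaches the backend) are assumptions, and the 48 hours is just the figure from the comment. The cutoff is evaluated by the database itself, so only one clock is involved.

```python
import sqlite3


def prune_pushed_records(conn, ttl_hours=48):
    """Delete records that were pushed to the backend more than ttl_hours ago.

    Rows with a NULL pushed_at have not been transferred yet and are kept.
    Returns the number of rows removed.
    """
    cur = conn.execute(
        "DELETE FROM applications "
        "WHERE pushed_at IS NOT NULL "
        "AND pushed_at <= datetime('now', ?)",
        (f"-{int(ttl_hours)} hours",),
    )
    conn.commit()
    return cur.rowcount
```

Run from a scheduled job, this keeps the frontend table at a roughly stable size under steady traffic.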

  • Sole Purpose of VIsit (unregistered) in reply to "Lisa"

    No no no.

    What you missed is that any form of brute-force timeout is completely the wrong solution for a state-based workflow like this. What you needed was a proper message-based protocol.

    It's depressing that useless dingbat hacks like this are used anywhere at all in a server-based IT system these days. I suppose it's no surprise that they are used in GovIT, and the resultant cretinism is actually defended by the ignorant fools in charge.

  • MiserableOldGit (unregistered) in reply to Sole Purpose of VIsit

    Absolutely, trying to "Fix" a race condition by shoving lead boots on one of the participants is just storing up trouble for later, and making it all the more confusing for the poor sod who has to deal with it then.

    I've seen it in the private sector too, but then I guess firms that routinely tolerate hacky crap like this gradually drift out of business as their IT fails to deliver, and I've seen that too.

    More poor to crappy govt IT, been feeding my family fixing* that shite for years now.

    * alright, keeping it limping down the road.

  • Richard Wells (unregistered) in reply to masonwheeler

    https://www.youtube.com/watch?v=pnq96W9jtuw

  • Tim Lord (unregistered)

    If the server was an IBM mainframe it might be impossible to change the time without rebooting.... or buying an incredibly expensive piece of add on equipment.

    And the initial time might be set manually by the operators doing the reboot.

    There's probably a lcd display blinking 00:00.

    (There's a tiny bit of justification for the design, since the hardware guarantees that no two timestamps can be the same - the least significant bits of the timestamp are uniquifiers - but when you're trying to align logs, write code to parse the binary times, with all the leap seconds, can't figure out WTF is going on, then realize that it's a surreal time clock, there's a tiny bit of justification for breaking the designers on the flywheel).

  • "Lisa" (unregistered) in reply to Sole Purpose of VIsit

    As engineers, we (all of us), have to have a certain amount of pragmatism when applying any kind of solution. There are always situations where non-optimal solutions are put in place as a stop gap to keep things running while a more appropriate solution is created. The story above is one such situation...and it worked for the time it was implemented. We did eventually roll out the stubbing approach but the "kick the can down the road" solution worked well enough in the interim (zero timed out applications - it demonstrably did SOMETHING right), to buy time to address the issue more appropriately.

    As to the message based protocol - it IS a message based protocol. However, "message end" is determined to be when payment is taken (for reasons already outlined by others here), and the confirmation page was left a little in limbo. Hence the need for a timeout/stub to display the info to the user while the "message" was being moved and processed.

  • Scarlet_Manuka (nodebb) in reply to anonymous

    Have a timeout to delete the pre-generated page in case the user never requests it.

    Yeah, that should work. Let's see, how long to make the timeout? Five minutes seems pretty reasonable to me...

  • Dr. λ the Binder of Variables and Applicator of Terms (unregistered)

    Lisa's fix was stupid! She should connect the process of moving the data and generating the confirmation page so that the data will be moved because the confirmation page is done being generated instead of because some arbitrary time lapse has passed! Her way of "fixing" this problem leads to a fragile system!

    Time for Lisa to stop playing around and go back to the kitchen!

  • "Lisa" (unregistered) in reply to Dr. λ the Binder of Variables and Applicator of Terms

    "Stupid" would perhaps in this case be defined as:

    • not reading the previous comments outlining why that solution isn't viable
    • assuming that you fully understand the complexity of a system based on a brief description
    • assuming that the gender hasn't been anonymized away along with the name of the OP in the article

Leave a comment on “Time To Transfer”
