• (nodebb)

    Yeah, the moment I heard it was using Excel, I figured it was a row-capacity problem even without digging into the details, but I had forgotten that the old limit was 64 kibirows instead of a megarow (mebirow?). Ugh.

    It's really hard to see how anyone involved thought that Excel was a good tool for large-scale data analysis. Small-scale, OK, kindasorta, but not at whole-country scale.

    Addendum 2020-10-06 06:41: Confirmed. The new capacity is a mebirow, 1048576 rows.
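
    The arithmetic behind the two limits can be sketched as follows. This is only an illustration: the function name `rows_dropped` and the one-row header allowance are assumptions, not anything described in the article.

```python
# Per-sheet row limits of the two Excel file formats.
XLS_MAX_ROWS = 2 ** 16    # 65,536 rows in the legacy .xls format
XLSX_MAX_ROWS = 2 ** 20   # 1,048,576 rows in .xlsx


def rows_dropped(record_count: int, max_rows: int, header_rows: int = 1) -> int:
    """Return how many records would not fit on a single sheet.

    header_rows is an assumed allowance for a header line (hypothetical).
    """
    capacity = max_rows - header_rows
    return max(0, record_count - capacity)


if __name__ == "__main__":
    # 70,000 records overflow an .xls sheet but fit comfortably in .xlsx.
    print(rows_dropped(70_000, XLS_MAX_ROWS))
    print(rows_dropped(70_000, XLSX_MAX_ROWS))
```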

  • Prime Mover (unregistered)

    ... and they believe a mebirow is going to be adequate because ...?

  • MiserableOldGit (unregistered)

    Having worked on UK Gov processes I can confirm they are in love with Excel, especially when it's the wrong tool for the job. Part of the reason is they absolutely hamstring themselves with process when it comes to using anything else, especially if it involves distributing an executable or making a metadata change in a database.
    In one department I worked with, it used to take 8 weeks just to get an acknowledgement that a project had been added to the queue for the pen testing team to provide an estimate for their input. Once you've sat through a few mind-numbing meetings about that, using Excel (or Word, Access, Outlook VBA) stops looking like a WTF and starts looking like a welcome relief. But that's as developers: we might have hated doing it, but we could at least write a "proper" solution with those tools and get it out doing its work before another century rolls on by.
    I don't think, in spite of what PHE said (or maybe the BBC reported they said), a developer was involved in this, at least not one who has worked with data, it just makes sense to convert the file to XLS (or even XLSX) UNLESS there is some real dumbass data editing/aggregation taking place manually, in which case I'd bet there's lots more errors lurking in this thing.

  • MiserableOldGit (unregistered)

    It just makes no sense to convert the file ... etc etc ... damn!

  • Divbad (unregistered)

    Might be worth mentioning that this contract was given to Dido Harding, famed for her data breaches & massive fines at TalkTalk (mobile phone), without any of the usual legally-required contract bid tender process. Nothing to do with her being joint-owner of a racecourse where all sorts of contracts are "bid" for by government-minister-owned companies. Also unrelated, her being married to a government minister. Oh no, that would be too improper!

  • Dave (unregistered)

    That's conspiratorial drivel. If the contract isn't at a fair, arm's-length price, we can apply a confiscation order to the entire amount, not just the overpayment. That's been introduced quite recently without much fanfare, and it's radically changed the tendering process even absent a public health emergency.

  • MiserableOldGit (unregistered) in reply to Divbad
    Baroness Harding of Winscombe

    FTFY

    Haha, I'd forgotten that. I don't know how as it's been all over the news since it happened.

    Highly connected Tory supporter with no meaningful experience in public health and a track record of some of the biggest IT cock-ups in recent British history got parachuted into the job, what could possibly go wrong?

    What's the betting her swabs would test positive for traces of BoJo?

  • dcummings (unregistered)

    Using Excel for this kind of critical data pipeline is quite a big WTF. But as @MiserableOldGit says, the real WTF is the levels of process and bureaucracy in place that mean it's probably a reasonable decision for every individual involved. Trying to get some kind of script or program to collate and analyse the data approved would just be way too slow and have to follow way too many pointless rules.

  • Robin (unregistered)

    Ha, as a UK resident my first reaction to hearing the story yesterday (after lifting my head off the desk) was "This sounds like classic DailyWTF material, hopefully someone close to it has juicy details to send in". Didn't expect it to hit the site so soon!

  • (nodebb)

    Since the days of magnetic tape, when I've transmitted a batch of data to another vendor, I've always, always, sent along a note saying something like "this batch contains 69148 records." I remember writing a program to read a mag tape and count the records to validate those batch counts.

    In more recent decades :-) I've favored CSV files because the old Unix standby "wc filename" announces the record count. (No, dear English colleagues, it doesn't stand for what you're thinking, it stands for Word Count.)

    It's good practice. But it takes a little extra time to reconcile record counts.
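
    A minimal sketch of that reconciliation step, assuming the batch arrives as CSV with a header row and the sender's note supplies the declared count. The function names are hypothetical. It uses the `csv` module rather than a raw line count so that a quoted field containing a newline still counts as one record.

```python
import csv
import io


def count_records(text_stream, has_header=True):
    """Count data records in a CSV stream, skipping blank lines."""
    reader = csv.reader(text_stream)
    if has_header:
        next(reader, None)  # discard the header row if present
    return sum(1 for row in reader if row)


def reconcile(text_stream, declared_count, has_header=True):
    """Raise if the batch does not contain the number of records the sender declared."""
    actual = count_records(text_stream, has_header)
    if actual != declared_count:
        raise ValueError(
            f"batch declared {declared_count} records but contains {actual}"
        )
    return actual


if __name__ == "__main__":
    sample = "id,result\n1,positive\n2,negative\n3,positive\n"
    print(reconcile(io.StringIO(sample), declared_count=3))
```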

    All that being said, it doesn't make sense to come down too hard on public health workers in this epidemic mess. Not every data screwup is a huge scandal. Some of them are just mistakes.

  • Dave Thompson (unregistered)

    Serco was the contractor PHE had contracted in to manage this whole debacle, so I'm not sure if it was a UK Gov employee or a Serco employee. Either one wouldn't surprise me.

  • don dowdle (unregistered)

    I don't see a WTF at all. I noticed that in the article and in the comments no one has mentioned a better solution for portable data analysis. Any type of SQL is stuck on a server and requires quite a bit of training to extract data in any meaningful way. I think modern times have created a lot of data for even the casual user and there really isn't a user-friendly method of digesting that data.

  • Aspie Architect (unregistered)

    For me another big WTF is that Excel costs peanuts. The guilty parties charged £12million for this "solution". My taxes paid for this.

  • ZPedro (unregistered)

    I'm on the record as supporting end users in experimenting with programming with the tools they have available (their office suite, mostly): https://thedailywtf.com/articles/comments/Poke-a-Dot#comment-307071 , so I don't think anyone will be surprised when I say I concur this is an IT/organisational issue.

  • Richard (unregistered)

    Another WTF underlying this one is that either

    • Excel does not report loss of data when converting, or
    • Users ignored the warning. (Maybe the Cry Wolf problem with all the other warnings users are conditioned to ignore.)

  • (nodebb)

    Sad... when 10 or 20 lines of python or R could generate every desired report or statistic from the source CSV file...

    Addendum 2020-10-06 10:35: ... and even dump the report into a nice shiny PPT file for PHBs to love
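
    As a rough illustration of how short such a script can be, here is a sketch using only the Python standard library. The column names `specimen_date` and `result`, and the value `positive`, are assumptions for the example; the real file layout is not described anywhere in this thread.

```python
import csv
import io
from collections import Counter


def daily_positive_counts(text_stream, date_field="specimen_date", result_field="result"):
    """Tally positive results per date from a CSV stream.

    The field names are hypothetical placeholders for whatever the lab file uses.
    """
    counts = Counter()
    for row in csv.DictReader(text_stream):
        if row[result_field].strip().lower() == "positive":
            counts[row[date_field]] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = (
        "specimen_date,result\n"
        "2020-10-01,positive\n"
        "2020-10-01,negative\n"
        "2020-10-02,positive\n"
        "2020-10-01,positive\n"
    )
    for day, total in daily_positive_counts(io.StringIO(sample)).items():
        print(day, total)
```

    Because the file is streamed row by row, the row count of the input is limited by the machine, not by a sheet size.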

  • Harris Mirza (unregistered) in reply to Richard

    Excel does report loss of data, both when importing a CSV too large for the spreadsheet and when saving a too large spreadsheet to XLS. They are either using some sort of VBA based hack, ignoring the errors, or using an older version of Excel that doesn't report errors. Possibly all three.

  • (nodebb)

    six levels of IT management, file requests in triplicate, and have a testing and approval cycle to ensure the report meets the specifications.

    Hmmm.... which one of those avoided hurdles is the one which might have detected the problem?

  • Shill (unregistered)

    For my money, the big sin here is not keeping the original data in a separate file instead of overwriting it.
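
    One way to keep the original is to copy every incoming file into an archive directory before anything else touches it. The sketch below is an assumption about how that might look, not a description of any real pipeline; `archive_original` and the naming scheme are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def archive_original(source: Path, archive_dir: Path) -> Path:
    """Copy the incoming file to an archive directory before processing it.

    The archived name is prefixed with part of the file's SHA-256 digest,
    so a later file with the same name but different content gets its own copy.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    target = archive_dir / f"{digest[:16]}_{source.name}"
    if not target.exists():  # never overwrite an existing archived copy
        shutil.copy2(source, target)
    return target
```

    Any conversion or import step then works on the incoming file, and the archived copy stays available for re-running the process if rows go missing.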

  • Divbad (unregistered) in reply to Dave

    ?

    It's well documented, widely-published, and there are no libel cases.

    https://en.wikipedia.org/wiki/Dido_Harding https://www.politico.eu/article/the-businesswoman-leading-britains-mid-pandemic-shakeup-of-public-health/

    or you can ask Google yourself how the PPE contracting process went.

  • Risk Adverse (unregistered)

    One piece flow. On each positive test, send a notification message straight from the facility. Adjust who gets notified until numbers go down, whether they were exposed or not. Also adjust the notification. Not responding to SMS?

    huh.

    ... ... ... Hit 'em with "The Purge" siren.

  • Paul Nickerson (google)

    It always irks me when I see .xls files going around. Microsoft Office Excel 2003's extended support ended 2014-04-08, six and a half years ago. This was the last version of Excel that did not support the .xlsx file format.

    So, are these people using .xls because they want to be compatible with ancient unsupported versions of Excel, or are they actually running an ancient unsupported version of Excel themselves? Or maybe these files were originally created in 2014 or earlier, and they've just been updated and re-saved since then?

    Now I'm less irked, and more baffled. Or maybe I'm naive, as I haven't worked in the kinds of companies and government departments that lead to this kind of unnecessary file format usage.

  • Simon (unregistered)

    The UK satirical magazine "Private Eye" has been reporting this for months, particularly the medical correspondent "M. D." (Dr. Phil Hammond). Hammond quite clearly says that being outdoors is far safer than being stuck indoors, for example, and has the stats to show it. "M. D." and Private Eye were the first to say outsourcing it to the usual gang of consultants would end in disaster. Of course it would. I have worked for many pharmaceutical companies doing software, and this is just NOT the way you do research. As a 48-year-old Englishman who has been resident in Hungary for six years, I can perfectly well assure you that my chances of being struck by lightning are about 1000 times higher than of dying from COVID-19. I'm far more likely, as a drinker and smoker, to have a heart attack or a brain haemorrhage.

    Of course KPMG, Morgan Stanley, Deloitte and so on take the money and run... sometimes you have to refuse to pay these idiots. Unfortunately we British citizens haven't the choice, since voting and elections were cancelled.... the first time since the Second World War.

  • (nodebb)

    I was once offered a student job to do data crunching on industry data, create a script for future use.

    I asked what tools will be available. Matlab? Python? Octave maybe?

    Excel with VBA...

  • MiserableOldGit (unregistered) in reply to Harris Mirza
    Excel does report loss of data, both when importing a CSV too large for the spreadsheet and when saving a too large spreadsheet to XLS. They are either using some sort of VBA based hack, ignoring the errors, or using an older version of Excel that doesn't report errors. Possibly all three.

    Actually it doesn't necessarily report when you import, I think it assumes you knew what you did and lets you carry on until you come to save/export and even then, it helps you through fixing the problem.

    I think pre-2007 it would report on import, my memory is not good enough to say for sure, but I'll go with the hunch that it did.

    The VBA hack is as simple as DoCmd.SetWarnings False (I think that covers this one), but I struggle to imagine that someone dumb enough to implement a process like this would even know what VBA is. Maybe there's a flag in the options panel somewhere.

    If I was truly bored enough I could search some of those real WTF forums like ExcelExperts and I bet I could find this person asking for help ....

  • (nodebb)

    Look at the upside! We have found a new cure for Covid-19! :)

  • Klimax (unregistered) in reply to Paul Nickerson

    Most likely result of old compatibility settings from time when both formats still coexisted and nobody thought/remembered to change it.

  • MiserableOldGit (unregistered) in reply to Klimax

    The institutional inertia to overcome to make simple, no-impact changes in a government organisation is just overwhelming, easier to just do it and beg forgiveness if it backfires than ask permission, but either way your ass will be roasted if you get it wrong, or even get it right and get found out.

    Common sense should've dictated we use CSV or fixed width or something involving data exchange. But, again, this is government/impossibly big corporate, common sense took a back seat for a laugh and a taco decades ago.

  • Chris D (unregistered)

    That should be 65,536 rows, not 65,636.

  • Simon (unregistered) in reply to R3D3

    Well, as MiserableOldGit pointed out, essentially a plaintext file like CSV is quite handy sometimes. What you then have a problem with, of course, is the GDPR means you must encrypt it, indeed the whole idea of "track and trace" implicitly goes against GDPR. You want a General Data Protection Regulation that says "you may not share my data without my consent", then an app that says "Please share my data without my consent".... Bzzzzt.

    There is also the small matter that the whole system assumes that everyone in the UK has a smartphone (presumably with their GPS enabled). I don't have a smartphone. I don't want a smartphone. I had one once, and when I turned the GPS on it drained the battery in about three hours, so you can track and trace people for three hours or so until their battery dies. This might work in a conurbation, but mirabile dictu there are places in the United Kingdom that do not get a good mobile phone signal. I lived in one of them. It took me five years of campaigning to get high-speed internet into the village I lived in, twelve miles from Cambridge University, that great fountain of knowledge where students come to drink.

    You're asking people 1. Keep away from everyone else. 2. Let us find out EXACTLY where you are, at all times of day, so that we can send it to a corporate body to do what they like with it. But don't worry, we'll make sure your data is secure, only the UK Govt, Deloitte, Serco, KPMG, Morgan Stanley, Uncle Tom Cobbley and All will ever be able to find out where you are at all times of day, but trust us, we'll keep your data secure, except when we leave it in the back of a taxi....

  • clive (unregistered)

    Modern phones will happily GPS all day - Strava etc rely on this.

  • Dave (unregistered)

    In duh-fense of Excel, I've got to add that Excel is a database. That's what 99.9% of the world uses it for. When was the last time you saw it used as an actual spreadsheet?

    And as a database it's the universal creation and exchange mechanism for data tables that virtually everything can work with. So it was the right choice, they just got hit by backwards-compatibility to Excel Classic, for which see "universal creation and exchange mechanism for data tables that virtually everything can work with".

  • Dave (unregistered) in reply to MiserableOldGit

    DoCmd.SetWarnings False

    "The data entry people keep getting these warnings pop up and it's confusing them, could you do something to make them go away?".

  • Dave (unregistered) in reply to Divbad

    It's well-documented nonsense. It doesn't matter what the process is, it matters what the outcome is. If the contracts aren't fairly priced, the entire sum is forfeit. There is no point doing what you allege, they'd lose lots of money doing it.

  • Richard (unregistered) in reply to Simon

    Is it a GDPR problem? Yes, you have to ensure the data is secure, but the person having the test has consented to the result being used in track and trace. It's a pretty explicit expectation of what the system is doing. Similarly those using the app have consented to the app being used to detect nearby infections.

    GDPR requires data security. You can encrypt transfer of CSV so that's no issue. You can also anonymise it or deal in only aggregated data. As this data is feeding into the tracing system it cannot be anonymous at this point.

  • Ruts (unregistered) in reply to Simon

    Had a snigger that line-break formatting of your comment meant I could just see

    It took me five years of campaigning to get high

    That aside, the tracing system does NOT expect or require the UK population to have smartphones. That's just the app. And it uses Bluetooth, not GPS. Not going to defend the app though.

    Certainly can't defend SNAFU behaviour using Excel badly!

  • MiserableOldGit (unregistered) in reply to Dave

    I don't know where you are getting this from, contract law does not work like that. You can sue for non-performance, failure, fraud, or whatever, but the price is whatever they want and you are prepared to pay at the time, that's why they are supposed to seek bids.

    It's only in cost plus contracts where measures such as you describe would apply and then it's still a bit of a joke on government contracts. The last one I was on, "big well-known service provider" was tripling my day rate when they billed the client, and charging for team members (like analysts and testers) who were just not going to participate in the actual project. And when the contract runs into trouble, they just run back to the client with their hand out for more because by then it's politically impossible for the ministers to admit the outsourcing deal was ill-conceived and poor value and seeing the service provider walk away would be disastrous. It's a lovely business model, you profit, or if you fail, you profit anyway.

    All of these emergency contracts handed out without competition have had their under performance thresholds removed and/or heavily revised with the actual wording largely redacted from public view for "commercial sensitivity" reasons, even though it isn't if there was no competitive tender process.

    Everything Divbad said is a matter of public record.

  • MiserableOldGit (unregistered) in reply to Richard

    Also, I don't think this data is directly related to the track and trace app, it's data coming from the testing labs about actual test results to be used for statistical analysis. As far as I understand the tracing function is not part of this process.

    So ... it shouldn't have personally identifiable information in it, because it isn't needed, but they should be treating it as if it does and encrypting it because it would be foolish not to ....

    Well, the UK government has form on screwing up by not anonymising data for statistics and then actually losing an unencrypted database of citizens' data in the post.

    There are exceptions to GDPR which would cover this anyway, check Article 6.

  • Dave (unregistered) in reply to MiserableOldGit

    Compensation orders. Anything involving unlawful acts is liable to confiscation of the entire amount gained, not just the unlawful gain. It's quite recent, and it has radically changed the legal position. It's similar to the General Anti Avoidance Principle in that it catches everything.

    There have been recent examples of landlords evading building regulations, where the fines were less than the extra profits so they paid the fines and ignored the regulations. They've now had every penny of rent confiscated, without deduction of costs.

    You would have to be mad and stupid to do what the OP alleged - and many journalists have mistakenly claimed. Boris is mad, but not stupid. Dido Harding is neither, just evil. The contracts are fine. The government isn't, and there are enough real criticisms to make; the mistaken piling on just gives justification to Boris/Trump supporters because they remember those instead of the real problems.

  • Prime Mover (unregistered)

    The good news is: apart from that's our tax money that's been embezzled, it doesn't really matter in the long run. We're all going to catch this thing unless we are personally prudent in our daily interactions (like: don't have any at all if at all possible), and (for those of us still alive, which may be as few as a paltry 99% or so of us) we will end up just writing this year off as a bad holiday.

  • MiserableOldGit (unregistered) in reply to Dave

    Well Boris is both mad and stupid so we can kick that one into the long grass.

    These contracts are being awarded with no competitive tendering process, that's a fact. The terms have been watered down in favour of the contractor, that's a fact. Effort has been made, without justification, to conceal these facts. Dido Harding has no relevant experience, that's a fact. She's married to a Tory MP, herself a vocal Tory supporter, these are all facts.

    I suspect Boris (personally) enjoys vicarious protection, as do most civil servants and ministers. He might be prosecutable for malfeasance (not even sure on that), but he can't be personally surcharged unless a theft type crime is proven. UK constitutional law is a hideous mess, so no doubt it would take years and make barristers insanely rich if it ever were tried. Regardless, I can't see any way a compensation order could be used in the way you describe, it certainly couldn't cut across contract law, offering to do something for far more than it's really worth is not unlawful, it's business. What's unlawful is the payback, but that's incredibly hard to prove, especially if it is at arm's length, like a peerage or a party donation. Robert Jenrick just got caught doing something very similar and got away scot-free. If we are in the realm of fraud and/or provable corruption that's another matter, but at that level the juice may be worth the squeeze ... look at the fines being given to banks for money laundering proceeds of crime and tax evasion, they barely match the profit on the transaction, and that's for the tiny fraction of transactions that get highlighted.

  • ip-guru (unregistered)

    & the biggest WTF of all is the current "Solution" to this problem: split the data into smaller chunks & use more than 1 Excel file.

    you simply could not make this up.

  • MiserableOldGit (unregistered) in reply to ip-guru

    The sad thing is, having worked in exactly those environments, that struck me as surprisingly sensible.

  • Simon (unregistered) in reply to Dave

    Spot on. Morgan Friedman pointed this out thirty years ago, with hidden costs. "You want to know about corporate failure? Have you ever thought about government failure? But you don't see that: the profit-and-loss system. Actually, the loss is more important than the profit. A man enters business and if he doesn't have enough customers he goes bust. What happens if the government goes bust? They require you, with the threat of force, to pay them more".

  • Simon (unregistered) in reply to MiserableOldGit

    The even sadder thing is really that relational databases are misnamed to start with. The one thing that relational databases do extremely badly is relationships. Oh, we'll have primary keys and secondary keys and so forth then we'll get a server that has to grind all these keys to find out what was there all along, and RAID drives and put it in the Cloud and add a few more processors and get the Data Admin department who won't let you log in because anyone... ANYONE... with a tiny amount of power will use that tiny amount they have mercilessly. I suppose that sounds like I'm a campaigner for open source or something... as a software engineer I do expect to get paid for my work, I also expect that the company takes the copyright or if I work freelance then I keep the copyright, that's not the point. Perhaps it is. I know several accountants who left the company my wife works for, to work for KPMG in Hungary, Deloitte in Hungary, SERCO in Hungary, Morgan Stanley in Hungary. Who do you think gets the contracts? We could, in theory vote... er no... all elections have been cancelled.

  • Simon (unregistered) in reply to MiserableOldGit

    I have never been able to make my mind up whether Boris is totally mad or absolutely brilliant. I don't mean as a practical man, obviously there he is a total idiot. But he is a brilliant orator - I don't know if he learned from Trump, but he can't put a sentence together without backwards running sentences until reels my mind. I believe on the other side of the pond there is some kind of election going on.... in Europe we are not allowed to have any. The price of freedom is, er, what was it now, eternal vigilance? We as software professionals, I know this is kinda standing on a soapbox and so on, have a duty of care to ensure security of the individual, to ensure our own safety, and those of the people around us. I work on safety-critical software and do risk analysis. I can tell you quite easily, the actuarial books clearly indicate, I am far more likely to die from eating this lovely cheesecake slice than I am from COVID-19.

    One of the strange things about that kind of work, is that you find out that people die in incredibly unlikely ways. RoSPA is very good at this. More people in England died in December 2019 from soft furnishings than from COVID. The statistical notes are https://www.england.nhs.uk/statistics/statistical-work-areas/ambulance-quality-indicators/ambulance-quality-indicators-data-2019-20/

  • Simon (unregistered)

    Or that according to the BBC, two people have died this year trying to climb Mount Snowdon: https://www.bbc.com/news/uk-wales-54107309

    This is probably a lot fewer than usually die, because in South Wales (the old one, not New South Wales - whoever thought that was a snappy name for a province anyway?) there are restrictions but hey-ho, here we are with COVID, we have our masks on and let's go and die at the top of Mount Snowdon, which by world standards is Not Very Big. Fortunately they were only Welsh so it doesn't matter too much :)

  • KimZ (unregistered)

    .. or use free, open source data tools such as Google's rather excellent and surprisingly free OpenRefine....

  • Just another Embedded Designer (unregistered)

    Apart from being "I Have a hammer [Excel] therefore everything is a nail [data processing]", the real issue is how MUCH of this is obviously MANUAL processing of

    1. receive email
    2. save attachment
    3. Import attachment into Excel
    4. Transfer data into another Excel spreadsheet

    The issue on this Distributed Processing system is that the batches were sent too large and too low frequency in order for manual operators to process the work.

    For this type of Distributed work, small batches sent often is ALWAYS better, even if just monitoring temperatures of hospital drug fridges and freezers (yes this is done for safety and health reasons). This provides the ability to work with at least some of the data and work out what is missing and not have other SNAFUs like the one computer used to collate the data broke so we are three days late.
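
    The "small batches sent often" idea can be sketched as a generator that splits a stream of records into fixed-size batches. This is an illustration under assumptions: the name `batches` and the batch size are hypothetical, and nothing here reflects how the labs actually sent their data.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # input exhausted
        yield batch


if __name__ == "__main__":
    # Seven records sent in batches of three; the last batch is short.
    for number, batch in enumerate(batches(range(7), 3), start=1):
        print(number, len(batch), batch)
```

    Sending each batch with its own sequence number and record count makes a missing or truncated batch visible to the receiver.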

  • Osric (unregistered) in reply to Steve_The_Cynic

    Even that capacity of a million-ish is just kicking the can down the road...
