The Daily WTF: Curious Perversions in Information Technology

nerd4sale · 2020-11-18

Wait, so the mainframe guy is not the cause of the WTF? There's got to be a frist for everything I guess.

2020-11-18

I presume they HAD backups and didn't depend on the SAN being a RAID+x having enough redundancy?

2020-11-18

Great to see someone get praise on this website, for a change. Yes, the old geezers do know something that the young whippersnappers have yet to learn.

BTW: TRWTF is sending your most junior tech-support helpdesk employee for training. Been there, done that. Still wearing the T-shirt.

2020-11-18

Erwin wore a "best"? Is that a new-fangled combination of vest and belt?

2020-11-18

I'd point at the message that said "possible data loss". That ain't the same thing as "This WILL delete ALL your data".

BTW If you turn up to the office in the UK in a vest and suspenders, you would probably be reported to HR....

2020-11-18

TRWTF is not having one of the guys (or girls) who have been doing it for the last ten years present but relying on some external guy.

2020-11-18

A best and a velt?

2020-11-18

I was expecting to find out that Erwin was sabotaging the SAN in a misguided attempt to prove that mainframes were superior.

2020-11-18

If you are paying them for the teaching, why not use it? If something like this happens, you can hold them responsible.

2020-11-18

It's always nice if you can blame the s*it on somebody else, but that doesn't help much if you're the one cleaning it up...

2020-11-18

I submitted this article, and I wanted to say that Remy did a great job keeping the main underlying facts in the story. I was not a storage or MF admin at the time (I was a virtualization admin who suffered some consequences), but this is one of my favorite 'war stories' to talk about from my career. I have a couple more I hope to submit soon.

2020-11-18

How did mainframes solve the storage problem? Is it just a case of not having much data to store?

2020-11-18

Halfway through this story I was preparing to rise to my feet.

A little further on and I had my hands raised.

At the story's end I was giving Erwin the standing ovation that he deserves.

I myself have been the recipient of a process for performing a task. And, like Erwin, I ensure that the person handing it off to me dictates in full detail exactly what he does. And I write it down carefully.

(Oh, and as soon as the technician or whoever is out of the door, I write a script or build a robot (coughITAPPMONcough) to do as much of this job as possible automatically. Like the time when a colleague taught me how to use an "automated" procedure which evolved into 5 pages of my A4 notebook crammed with instructions on exactly what to do.)

2020-11-18

Wow, yes, please do. This is already one of my favourite stories on TDWTF and I've read a few of them, I can tell you.

Steve_The_Cynic · 2020-11-18

BTW If you turn up to the office in the UK in a vest and suspenders, you would probably be reported to HR....

Doubly so if you're a guy and anyone can tell you're wearing suspenders.

2020-11-18

Doubly so if you're a guy and anyone can tell you're wearing suspenders.

You clearly never worked in the UK Civil Service. Such things are mandatory if you wish to ascend the ranks there.

Even in a boring software firm in Surrey we had a guy on the team who regularly turned up with eyeliner and lipstick and nail varnish because he hadn't actually seen a bathroom since his night out on the town. I tried to avoid getting near enough to find out if he had perfume or what his breath smelled of .... ahem.

2020-11-18

'I'd point at the message that said "possible data loss". That ain't the same thing as "This WILL delete ALL your data".'

This, and the great difference between "initialize" and "re-initialize". What's wrong with those people, aren't they capable of writing clear messages, or do they make it unclear for ... reasons? (Can't be in order to improve their reputation, if the result is deleting everything their product is storing.)

FWIW, when I wrote a tool that had to overwrite a disk, I wrote "will delete everything on the disk" ..., even though usually the disk would be empty. Better to err on the side of caution here. Additionally I would display the contents (so the user can check that the disk was actually empty, unless they intentionally want to overwrite a disk), and required them to type "yes" (at least the vendor here did that). All of it seemed kind of obvious to me (and still does), and it's not even my main area of expertise (unlike the vendor here -- at least it should be theirs).
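For illustration, a minimal Python sketch of that kind of prompt. This is not the original tool; the function name `confirm_wipe`, its parameters, and the exact wording of the messages are assumptions made up for the example. It states the consequence plainly, shows what is currently on the disk, and proceeds only on a literal "yes".

```python
def confirm_wipe(disk_name, entries, read_input=input, write=print):
    """Ask for explicit confirmation before a destructive overwrite.

    Hypothetical helper: `entries` is whatever listing of the disk's
    current contents the caller was able to obtain.
    Returns True only if the user typed exactly "yes".
    """
    # Say what WILL happen, not what "possibly" might happen.
    write(f"WARNING: this will delete everything on {disk_name}.")

    # Show the current contents so the user can check the disk is the
    # one they think it is (and whether it really is empty).
    if entries:
        write(f"{disk_name} currently contains {len(entries)} item(s):")
        for entry in entries:
            write(f"  {entry}")
    else:
        write(f"{disk_name} appears to be empty.")

    # Anything other than a literal "yes" (including "y" or just
    # pressing Enter) counts as a refusal.
    answer = read_input('Type "yes" to continue: ')
    return answer.strip() == "yes"
```

Passing `read_input` and `write` as parameters is only there so the prompt can be exercised without a terminal; a real tool would likely also name the device path and model in the warning.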

2020-11-18

I have to admit, I didn't see that coming.

And I have to admit, I half expected Erwin to have screwed it up by turning a 5-minute procedure into an hour-long one. Or showing his disdain for the SAN by being horribly incompetent.

But no, he was being diligent and getting clarity into the procedure, making sure to do things right. And careful at that. And asking lots of questions.

Though the UI could be improved, including not calling it "Initialize" or "Reinitialize" but something plainly obvious like "Add replacement disk to array" or "Replace missing disk". If it was going to delete all the data because it effectively deletes the entire array, it should say that: "Reinitialize will delete the entire array, all data will be lost!". Plus, since replacing failed disks is a common operation, there should be an extremely visible button for it on all pages: "A disk is missing, and I found a new disk, replace missing disk with the new disk?"

If the disk has a hot or cold spare, it could even say that "Missing disk replaced with hot spare. Use new disk as spare?"

Common operations are common and should be easy to do. Things like deleting the array are extremely uncommon and may only be done once in the unit's lifetime, so hiding it away is a good idea.
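A rough Python sketch of that idea, under stated assumptions: the `ArrayStatus` fields and the `prompts` function are invented for illustration and do not describe any real controller's interface. The common, safe actions are offered up front with wording taken from the suggestions above, and the destructive one only shows up when the user goes looking for it.

```python
from dataclasses import dataclass


@dataclass
class ArrayStatus:
    """Hypothetical snapshot of what the controller knows."""
    missing_disk: bool
    new_disk_found: bool
    rebuilt_on_hot_spare: bool = False


def prompts(status, show_advanced=False):
    """Return the prompts to display, most common operation first."""
    actions = []
    if status.new_disk_found:
        if status.rebuilt_on_hot_spare:
            # The array is already whole again; the new disk can only
            # usefully become the next spare.
            actions.append(
                "Missing disk replaced with hot spare. Use new disk as spare?"
            )
        elif status.missing_disk:
            actions.append(
                "A disk is missing, and I found a new disk, "
                "replace missing disk with the new disk?"
            )
    if show_advanced:
        # Rare and destructive: hidden by default, and the label states
        # the consequence outright.
        actions.append(
            "Reinitialize will delete the entire array, all data will be lost!"
        )
    return actions
```

With a missing disk and a freshly inserted one, `prompts(ArrayStatus(missing_disk=True, new_disk_found=True))` returns only the replace-disk prompt; the reinitialize warning never appears unless `show_advanced=True` is passed.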

2020-11-18

I suspect the details of whatever the UI said and what went on at that point have been lost in time and subsequently made up. The issue there is they probably did send along a non-expert to offer "training" because they didn't take the client's concerns seriously. Erwin might have been a petty pedantic PITA on a mission to prove a point, but they made it easy for him.

2020-11-18

I had a very similar experience around the same time, but I was working in the tourist industry. Not "conservative" but certainly a hub for IT WTF of epic proportions. My sector was doing incredibly well, had no idea what to do with the wheelbarrows of cash they were getting, other than spaff it away on absurd marketing consultants and very ill-advised capital IT "investments".

I think it was around 2004 for us, I remember speaking to friends of friends in the insurance industry who started to drool at the tech we were buying, and laugh uncontrollably at what was being done with it.

We got a nice big SAN, complete with this robot backup machine (it was a beauty to watch it swap a tape) and a load of fibre optic connectors between them and the servers. Of course it fixed nothing, because it may have served up 2 TB, but if the morons in marketing have no concept of resources, they'll gobble that in weeks, and they did.

It became clear the IT manager who had done this had no clue what he was doing, so just before he went on holiday the company (in a rare moment of clarity) got a couple of consultants in to be his standby as they didn't trust his assurances that nothing could possibly go wrong while he was away. During his tour of the server room, they asked the same question about swapping disks and he said "oh, it's all hot swappable, see?" and popped two disks out the array and pushed them back in.

He'd left the building and was on his way to the airport before the horrible reality of the cluster-fuck he'd caused truly came to light ... the SAN had some sort of caching that (sort of) kept shit going for 45 minutes, but then everything went off-line for an 8 hour array rebuild and there was nothing we could do.

We did have an Erwin dealing with the "Medium Iron" that travel companies use, his track record was only slightly better, but he was better at company politics.

2020-11-18

I'm a field tech. Whenever I'm left alone working, everything goes well and the customer is satisfied. Whenever I get an "Erwin" getting on my nerves and breaking my line of thought every damn minute, well, shit happens.

Jaime · 2020-11-18

I once worked at a place where the "little iron", an AS/400 (a.k.a. iSeries) about the size of a small chest fridge, was so fragile that accidentally unplugging the network cable meant that it had to be cold booted. Meanwhile, the x86 junk surrounding it was all virtualized and clustered so that a large chunk of hardware would have to go down before anyone noticed.

R3D3 · 2020-11-19

I remember a case where a technician was sent for an expensive repair on a large scientific apparatus. The company had outsourced support to local tech companies, and didn't even provide them with documentation — remember, science apparatus, so we are talking about 100k€, if not several, in purchase costs, and adequately expensive support contracts, plus billing for technicians to actually provide on-site support.

After a week of wasting several scientists' time, the device still didn't work and the measurement PC was broken too. The company stepped back from billing the technician hours, but by all rights the institute should have been compensated a few thousand € in wasted working hours.

2020-11-19

I was expecting to find out that Erwin was sabotaging the SAN in a misguided attempt to prove that mainframes were superior.

Nah. Guys like this have their professional pride.

2020-11-19

Doubly so if you're a guy and anyone can tell you're wearing suspenders.

Who are you to look down on my dear papa?!

2020-11-19

I have a sneaking suspicion, based on the vagaries of the story, that this is/was a Compaq/HP Smart Array. Those were rife with WTFery, starting with the clumsy web interface intended to make it "easy" but in practice requiring that you either had a Compaq tech come out and do it for you, because they did it enough to know the procedures cold, or read the manual through front to back and only then walk through whatever procedure step by step.

The UI was positively littered with vague, inconsistently used, and confusing terminology like "initialize" vs "re-initialize", or "attach" on one screen vs "online" on another ...

That is before you got to the hardware. SAS shelves and SCSI had the same SCSI cable connector. Wanna guess if you could put them on the same chain and controller? Well, you could cable it that way and it would work ... for a while ... then bad stuff would happen. Why not use a different cable to avoid mistakes when moving things, or at least detect the error and let the user know they should not start the arrays cabled that way? Because... where would be the fun in that, I guess.

2020-11-19

Nah. Guys like this have their professional pride.

Really? I've worked with plenty who spent a good chunk of their day sabotaging and backstabbing. I remember one who used to take every integration project as an opportunity to DoS-attack the other servers with floods of meaningless and unnecessary requests to try and "prove" some sort of point.

2020-11-20

I've worked with plenty who spent a good chunk of their day sabotaging and backstabbing.

That would, I believe, be the kind that doesn't listen to field techs at all, not the kind that makes them go slower so they can take notes.

2020-11-20

True ... I recall getting dragged into an "emergency" meeting with this one, the head of IT and some poor unfortunate field tech who he'd brought in to install some software tool. Apparently the tool needed MySQL for its datastore, and at the time we had no MySQL presence. Turned out he had assumed SQL Server and MySQL were one and the same, and it had become my fault when the field tech had pointed out he couldn't put his datastore on our SQL Servers. Well, everyone's fault except his.

I did say it would probably take little more than half an hour to spin up a VM, put the MySQL version of their choice on it and sort out whatever network settings and drivers were necessary, and we could deal with who was going to look after it later on ... but as shouty man had already burned through an hour arguing with the tech and trying to make him solve a problem he couldn't, the poor guy just wanted to get out of there before he wrecked even more appointment slots.

He didn't really listen to anyone... I mean he had ears, but they were just there so his glasses would stay on.

urkerab · 2020-11-24

The real WTF is that the controller doesn't automatically rebuild the drive (at the very least after you tell it about the replacement, but some controllers will even skip that step for you).

Which I guess is another advantage of having a hot spare, as they always rebuild automatically, and also the operation of adding a new hot spare (or equivalent) is much less dangerous.

2020-11-27

Doing something, and writing down the steps involved in doing that thing, are often two very different skills. People are often surprised to find that someone can do one and not the other, but they shouldn't be; that is, in fact, normal.

2020-11-29

The tech might've kept on working, but he made one fatal slip
When he tried to train the Erwin with the big iron on his hip.
Big iron, big iron,
When he tried to train the Erwin with the big iron on his hip.

Big Iron
