The Daily WTF: Curious Perversions in Information Technology

2007-02-15

Actually the real wtf is that they didn't audit his code when it was causing problems with the bank's internet connection.

2007-02-15

doug:
Hate to be the spoilsport, but maybe the fan is only magnifying a problem that is still not fixed. The inability to handle an exception when the network connection is down.

EXACTLY!

Rick · 2007-02-15

tamosius:
doug:
Hate to be the spoilsport, but maybe the fan is only magnifying a problem that is still not fixed. The inability to handle an exception when the network connection is down.

EXACTLY!

And every production application that you have written handles all network connection failures gracefully.

Give me a break. Who are you kidding?

I try to give appropriate log messages for network failures, but the key word there is 'try'. My applications would take 100 times as long to write if I were to reliably test every possible network connection failure. And I would be out of work.

2007-02-15

|+|:
The real WTF is the idiot who wrote this story and ended it in the middle.

Actually the real WTF here is the number of people that can't seem to understand a fairly straightforward story. I guess maybe the OP should have dumbed it down a little.

2007-02-15

Rick:
tamosius:
doug:
Hate to be the spoilsport, but maybe the fan is only magnifying a problem that is still not fixed. The inability to handle an exception when the network connection is down.

EXACTLY!

And every production application that you have written handles all network connection failures gracefully.

Give me a break. Who are you kidding?

If the program's making direct TCP/IP connections, there are a few things you can do, although there's irreducible unreliability in any network connection. And then there's reliance on NFS-mounted files, LDAP servers, proprietary database drivers, services on the local PC that use the internet ... oh, and was the application web-based?

Kaosadvokit · 2007-02-15

Rick:
tamosius:
doug:
Hate to be the spoilsport, but maybe the fan is only magnifying a problem that is still not fixed. The inability to handle an exception when the network connection is down.

EXACTLY!

And every production application that you have written handles all network connection failures gracefully.

Give me a break. Who are you kidding?

You forgot to mention that the application was having "weird things" happen, which may result from properly handling network failures. The nature of the failure was unspecified, but the application apparently continued to function as best it could despite repeated network failures, which sounds to me like it handled them just fine.

Of course, the end user was either not informed of the failure (end users often are not allowed to know these things since it reflects poorly on the product) or ignored the error message and observed behavior that was inconsistent with what would be expected if the operation succeeded (which happens more often than end-users, especially testers, realize).

2007-02-15

Ours submitter anonymity (he we will call "from part of Steve") felt this puncture, therefore. It would methodically develop accurately and a module that has admirably worked in atmospheres of the development and the test of the bank, but has pronounced UAT. That that is probadores had uncovered a bug in the module that they could not reproduce. The official description was that the module has rendered "disowned things" to arrive.

2007-02-15

Crookeddy:
Ours submitter anonymity (he we will call "from part of Steve") felt this puncture, therefore. It would methodically develop accurately and a module that has admirably worked in atmospheres of the development and the test of the bank, but has pronounced UAT. That that is probadores had uncovered a bug in the module that they could not reproduce. The official description was that the module has rendered "disowned things" to arrive.

If the way this post was written isn't a joke, then I'd say it's a wtf in and of itself.

2007-02-15

Oh for the love of Bob...

OK, imagine for a second a real QA person filed a good report: "Intermittently, the application shows a 'Network connection timeout after N seconds in routine XX::YY' error box. Difficult to reproduce. Also, some operations take a long time in one test, and a very short time in another; also intermittent."

We'd all say "Hm, some network problem". Not to mention the dev going in would know what to do.

From the information we've been given, it's absolutely possible that the code produced such an error. Nothing in the description disproves it. But the information also states that the dunderhead QA report was just that "weird things" happen.

Given this information, it seems that two types of people view it in two ways:

some give the benefit of the doubt to the coder and scorn the poor QA reporting, not to mention getting a chuckle out of the root cause
others instantly blame the coder for not having the proper error reporting, when it cannot be known from the report what error reporting there was, since it's not mentioned

Why fall in the second group? Cynical? Aggressive? Projecting? Daddy never loved you? Not to mention that if you make these kinds of leaps and assumptions all the time, I hope your code is reviewed well and isn't placed in any mission-critical systems where those assumptions might fail.

For me, I'll take my chuckle and move on.

2007-02-15

I guess this guy's modules were the only ones on that server that access any network resources. That, or all the other modules silently swept their exceptions under the rug.

I wouldn't be surprised. Recently I took over a webapp that swallowed ALL exceptions and did NO logging. A server error such as a db connection failure would result in a valid response, just with no relevant data.

When I added exception handling and logging, I started getting questions about wtf I was doing to break the application, since QA were getting error messages and stack traces in the logs when some backend modules were down. Luckily I'm not a junior engineer and have a good enough reputation to be listened to when I pointed out that we need enough information when something goes wrong so we can FIX problems instead of ignoring them.

Captcha: howdy
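
The contrast described in this comment (swallowing every exception versus logging it and surfacing an error) can be sketched roughly as follows. This is only an illustration in Python, not the webapp in question; `fetch_accounts`, `BackendDown`, and both handler names are hypothetical.

```python
import logging

logger = logging.getLogger("webapp")


class BackendDown(Exception):
    """Hypothetical error raised when a backend module is unreachable."""


def fetch_accounts():
    # Stand-in for a database call; in this sketch it always fails.
    raise BackendDown("db connection refused")


def handler_swallowing():
    # Anti-pattern: the failure disappears and the caller receives a
    # "valid" response that simply has no data in it.
    try:
        return {"status": 200, "accounts": fetch_accounts()}
    except Exception:
        return {"status": 200, "accounts": []}


def handler_logging():
    # The failure is logged with its stack trace and reported as an error.
    try:
        return {"status": 200, "accounts": fetch_accounts()}
    except BackendDown:
        logger.exception("backend unavailable while fetching accounts")
        return {"status": 503, "error": "Service temporarily unavailable"}
```

With the first handler nobody can tell an outage from an empty result; with the second, the stack trace lands in the log, which is exactly what prompted the questions from QA.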

2007-02-15

anonymous:
Crookeddy:
Ours submitter anonymity (he we will call "from part of Steve") felt this puncture, therefore. It would methodically develop accurately and a module that has admirably worked in atmospheres of the development and the test of the bank, but has pronounced UAT. That that is probadores had uncovered a bug in the module that they could not reproduce. The official description was that the module has rendered "disowned things" to arrive.

If the way this post was written isn't a joke, then I'd say it's a wtf in and of itself.

Why it is that is wtf ay? I do not include, when you come and you examine the individuals in my country, my language you speak one I keep excessively condecending the observations the himself.

2007-02-15

Pwned by Babelfish.

Captcha = gotcha

Indeed.

2007-02-15

What it is this "Bebelfish", way to speak over

marvin_rabbit · 2007-02-15

n0ha:
Iceman:
That's all very well, but if it is a browser based application, the browser is going to return a server error when the server gets disconnected from the network, since the connection between the browser and server has been broken. No matter how gracefully the application code handles network interruptions, the browser is going to barf.

That's only true if you use full page refresh. In our application [at the bank of course! :)], there are mainly ajax-style connections to the server, which will yield only a "please try again" warning if the request fails.

I think that if I were an end user, I'd report that as a 'weird thing' happening.

Addendum (2007-02-15 17:00): Edit: I formally take that comment back when I reconsider that this was not an end-user report but rather a report from, presumably, QA testers.

2007-02-15

marvin_rabbit:
n0ha:
Iceman:
That's all very well, but if it is a browser based application, the browser is going to return a server error when the server gets disconnected from the network, since the connection between the browser and server has been broken. No matter how gracefully the application code handles network interruptions, the browser is going to barf.

That's only true if you use full page refresh. In our application [at the bank of course! :)], there are mainly ajax-style connections to the server, which will yield only a "please try again" warning if the request fails.
I think that if I were an end user, I'd report that as a 'weird thing' happening.

Addendum (2007-02-15 17:00): Edit: I formally take that comment back when I reconsider that this was not an end-user report but rather a report from, presumably, QA testers.

Did you see the server room pics????? This is not a big national banking corporation; this is a small enterprise. I suspect the "QA" people are users who are conscripted to do UAT prior to a release.

2007-02-15

That's just stupid. It's obvious that he should have used a connectionless protocol. And now he's trying to blame the poor tech for his incompetence.

Captcha: dubya - no fair. I hate that guy...

real_aardvark · 2007-02-15

StupidGuy:
doug:
The inability to handle an exception when the network connection is down.

Correct!

Sorry, but a network-application that can't handle network errors... Hummm... chchchch

Well, I know, checking return codes or catching Exceptions sounds like wheenies - but I'd like this :-)

Last but not least: There exist servers, it's in a bank... so I assume (or hope) they have good Network-guys, ... Why didn't Network-monitoring-software recognise the problem of a network-Connection going up-n-down 5 Times a second ??

Lucky you -- you've never worked in a financial institution, have you?

lackluster:
Gedoon:
People people! Do READ the story... <snip/>

Pretty sure I did, and it was a terrible story at that. Fan was there to replace a broken AC, but wasn't turned on? Issues still being caused when it's not on? Hmm.. Fan comes on, say spinning at 3200rpm, and "Steve" can actually see the network going up and down? The bank actually uses broken ethernet cables? And why the hell is UAT being performed in a restricted access server room, one with no AC at that? Is there any sort of professionalism left in this world?

Well, you might want to go back and read the story again:

"terrible?" No. Funny.
Fan wasn't turned on? See cause of problem. Fan needs to be on to cause problem.
"Say 3200 rpm?" Yes, I can say it. With the amount of power that this implies, Steve might even be able to see the floor going up and down. But: "see the network going up and down?" I hereby forgo network analysers. I need no sniffing. I need a pocket Steve, with magic X-Ray eyes. ($1.99 at Walmart.)
"Broken ethernet cables." Broken? WTF?
"UAT ... in a restricted access server room." No, I think you'll find that (as implied) the servers were in the server room, and the users were in the user room. With Professor Plum, and the Lead Piping. Otherwise the "odd things" that the UAT group reported would have amounted to "your software makes me all sweaty." Precisely which parallel universe do you inhabit? I think I prefer the (not very funny) would-be Borat impressionist, a few comments above.

In the spirit of the Good Fairy B-Nice, I should at this point give "Steve" a big pat on the back for writing his application in such a way that it actually managed to work at all. A good 90% of the TCP-based applications I've seen fail to close sockets cleanly in unexpected circumstances -- and this sort of scenario is almost guaranteed to fall foul of FIN-WAIT-2 timeouts. Kudos to Steve.
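
For what "closing sockets cleanly in unexpected circumstances" might look like, here is a minimal Python sketch. The helper name `close_cleanly` is made up for illustration, and nothing here is known about Steve's actual code; the point is only that the close path tolerates a connection that has already died.

```python
import socket


def close_cleanly(sock):
    """Signal end-of-stream to the peer, then release the descriptor."""
    try:
        # On a TCP socket this sends a FIN, so the peer's recv()
        # returns b"" instead of waiting forever.
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        # The connection may already be gone (peer reset, cable pulled).
        pass
    finally:
        # Always release the file descriptor, even if shutdown failed.
        sock.close()


if __name__ == "__main__":
    # Local demonstration with a connected pair of sockets.
    left, right = socket.socketpair()
    close_cleanly(left)
    print(right.recv(16))  # the peer sees end-of-stream
    close_cleanly(right)
```

This does not by itself prevent a half-closed connection from lingering when the other end never answers; it only makes sure this side always does its part.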

jefrainmx · 2007-02-15

He could also create a new exception, a business-rule exception: MoveTheFineFanFromTheNetCableException.

2007-02-15

|-|:
|+|:
The real WTF is the idiot who wrote this story and ended it in the middle.

Actually the real WTF here is the number of people that can't seem to understand a fairly straightforward story. I guess maybe the OP should have dumbed it down a little.

I agree. SOME programmers would have to be the most pedantic bunch in history - wasting time making irrelevant comments (like me, I can't believe I'm doing this). Fairly straightforward story. Obviously the fan was the cause of the problem. If there was a requirement in the specification for exception handling of network faults then he should have put that in. But quite often in large organisations, deadlines are more important than PERFECT code. I'd say the production environment, as opposed to the UAT environment, could possibly have had redundant network connections (you'd think, being a bank and all).

2007-02-15

How do some of you propose to develop an application that's so ultra-robust as to handle every possible random network error? Honestly, in this day and age, those errors are on the same order as "out of memory".

First of all, how do you even catch the exception? Depending on whether you happen to be accessing a shared file, database, web site, web service, directory/LDAP server, mail server, FTP server, or some other network resource, there are literally hundreds of different possible exceptions/errors that could be generated by the library. Many libraries don't propagate basic network errors; they create their own wrappers around them that say something like "service unavailable". Faulty network connections, like low memory, cause very strange errors to occur, often completely different errors on each successive code execution (I saw this with one of our old workstations here which had a faulty connection, and very similar results on another machine with a defective hard drive).

Even if you caught every single error, then what? What do you do? Try again? What if it still doesn't work? Try 5 more times? 10 more times? What if this is a web site where hundreds or thousands of users may be online and the call you're making is very expensive, computationally or bandwidth-wise?

In many cases, when you're dealing with exceptions rather than error codes, it's better NOT to try to handle those kinds of errors, because they indicate a problem that's totally unrelated to the application, which was true in this case. Let the UI put up a friendly error message if it must, one which indicates that Something Very Bad happened internally, while the low-level exception gets logged and probably e-mailed to the sysadmins or a bug tracker.

Or just wrap a try { ... } catch { } around every single line of code, right!?
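
One possible middle ground between "retry forever" and "wrap every line" is a small, bounded retry that then logs the low-level error and hands the UI a friendly message. The sketch below is in Python and rests on assumptions: `call_with_retries` and `UserFacingError` are hypothetical names, and treating `OSError` as "network trouble" is a simplification.

```python
import logging
import time

logger = logging.getLogger("app")


class UserFacingError(Exception):
    """Hypothetical error whose message is safe to show in the UI."""


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(); retry a bounded number of times on OSError."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:
            if attempt == attempts:
                # Out of retries: keep the details for the sysadmins,
                # show the user something generic.
                logger.exception("giving up after %d attempts", attempts)
                raise UserFacingError(
                    "Something went wrong. Please try again later."
                ) from exc
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay...
            sleep(base_delay * 2 ** (attempt - 1))
```

Whether any retry at all is appropriate for an expensive call on a busy site is exactly the judgment call raised above; setting `attempts=1` turns this into plain "log it and show a friendly error".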

2007-02-15

anonymous guy:
The real wtf (and I say that not only in spite of, but because of how many of you feel about the phrase) is that:
The official description was that the module caused "weird things" to happen.
My response would be: I checked it out, weird things are not happening, problem solved. And then proceed to do nothing until a better bug description was forthcoming or I was fired.

"A lot of things are happening. None of the things that are happening are weird. Some are undesirable, a few are unpredicted, but absolutely none of them are supernatural or caused by witchcraft."

[resolved]

[fixed]

Have a nice day.

2007-02-15

CodeBoy:
It does seem odd though that UAT would take place in an overheated, ultra-secure server room.

UAT should be done with "realistic" data sets. This usually means a copy of production data (hopefully obfuscated). I would hope that a bank would keep any copies of production data as secure as the live system, which means UAT servers should be in the secure server room.

Even with some degree of obfuscation, sensitive production data should NEVER EVER be used on developer machines. Developers like to set up their own playgrounds on their desktop or laptop. They also don't generally have the paranoia ingrained into sysadmins.

My company's production databases only ever leave the server room on encrypted media (and in locked containers). Developers can't touch copies of production data that have not been "cleaned" by a thoroughly paranoid DBA. Accidentally divulging customer data can cost millions, so such protections are no longer optional for really any business.

People wonder why all that customer data was on the VA laptop that was stolen... it was probably a developer or report writer with a "test copy" of the data.

2007-02-15

olsner:
The application should have been built from the start to gracefully handle temporary disconnections... And I hope it's well-known that delays in network propagation and IPC/thread scheduling are more or less arbitrarily large. Monkeys-in-the-server-room protection comes for free if you simply prepare for "unexpected" transient failures from the start.

I think that is part of the problem. This latest code does not handle network breaks; however, the older code does. The result: even while there was a real physical problem with the network, the old code papered over a problem that should have been fixed. Clearly someone had not been monitoring the network logs properly, otherwise they would have noticed that the timing of the breaks followed a regular pattern.

I wonder, with all the traffic overhead, how much of a hit they were taking. People could have been complaining about how slow the network/software was when in fact the problem was neither.
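
Spotting a regular pattern in link-down events does not need anything fancy. As a rough Python sketch (the function name, the threshold, and the idea that timestamps have already been pulled out of the logs are all assumptions), one can check whether the gaps between consecutive breaks are nearly constant:

```python
from statistics import mean, pstdev


def looks_periodic(timestamps, max_jitter=0.1):
    """Return True if the gaps between events are nearly constant.

    timestamps: event times in seconds, sorted ascending.
    max_jitter: allowed ratio of the gaps' standard deviation to their mean.
    """
    if len(timestamps) < 4:
        return False  # too few events to call it a pattern
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False
    return pstdev(gaps) / average_gap <= max_jitter
```

A fan blade hitting a cable would produce very evenly spaced drops, while ordinary network trouble tends to be far more irregular.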

2007-02-15

cklam:
SomeCoder:
Wait, so I'm a little confused. The fan bumping into the cable was the cause of the original problem that was causing his code to be blamed for doing "weird things", right?
Maybe it's just too early for me to be reading TDWTF *chugs coffee*

[SNIP]

I could not help looking out of the window, and finally the penny dropped: every time the connection was reset a heavy truck passed the property.

[SNIP]

I helped two other techs at my old workplace solve a similar problem. They had extended a network between two buildings of a company using a microwave link. Both microwave transceivers were inside the buildings, behind the outside walls. The network worked fine except on the days it failed completely. Listening in on the conversation, I pointed out that the days when they had failures were also the days they had heavy rain. Of course the transceivers were supposed to work even when it was raining, but I pointed out that rain outside is a bunch of droplets with plenty of gaps, whereas a brick wall with lots of rain hitting it appears as a solid sheet of water. When they checked, they found one building was in fact draining the roof water right in front of the transceiver! In light rain, no problem; in heavy rain, the transceiver was trying to punch through a waterfall. They moved the transceivers outside and under overhangs. Problem solved.

jokeyxero · 2007-02-15

Seconded. Networks are not perfect connections, they die and the app should handle that.

2007-02-15

It's better than a CUM Release.

2007-02-15

Sorry, but server rooms with non-functioning AC quickly overheat, causing the servers to shut down, if the room contains any respectable number of servers.

I know this because it happened to one of my prod servers only a couple months ago. Within 2 hours we were up to 90 F. and we were shutting down 'non-essential' prod servers trying to prevent the essential servers from shutting down on their own.

Why yes, since you ask, I do work for a bank.

2007-02-15

Let's be real, it could happen, screw the comments about backup ACs and safeguards. They're guarding your money, which is FDIC insured, FDIC doesn't require AC, and techs that do, well, should get paid less.

Banks are a business just like any other except they comply with a WHOLE lot more federal regulations and pay a WHOLE lot more to attorneys to make sure they do and well, someone's gotta pay for that.

~Signed A bank manager with a BS in Comp Sci

2007-02-15

Hmm -- somewhat related: Years ago I worked in the Microfilm industry. We had a brand new microfiche camera that sported a "Digital Titling System"; only the second camera I am aware of to have such a feature at the time. The only trouble was the fact that the titles were blurred. But only some of them. Other images on the same film roll were nice and sharp.

After weeks of investigation by the technician, we discovered that the cameras were on the same electrical circuit as one of those huge pedestal fans that was in use in a room without AC. Just plugging in the fan caused the issue (not even running) since the motor coils acted as a huge inductor, sucking in RF noise and causing the CRT image to fuzz...

Spacecoyote · 2007-02-16

Leave this guy alone, obviously his native language is Spanish and he doesn't realize that his English is bad.

icelava · 2007-02-16

lackluster:
Is there any sort of professionalism left in this world?

Of course there isn't any. The world is a lot uglier than you think.

2007-02-16

Iceman:
olsner:
The application should have been built from the start to gracefully handle temporary disconnections... And I hope it's well-known that delays in network propagation and IPC/thread scheduling are more or less arbitrarily large. Monkeys-in-the-server-room protection comes for free if you simply prepare for "unexpected" transient failures from the start.

That's all very well, but if it is a browser based application, the browser is going to return a server error when the server gets disconnected from the network, since the connection between the browser and server has been broken. No matter how gracefully the application code handles network interruptions, the browser is going to barf.

captcha: pinball

If an application cannot report a connection-lost kind of error without anyone interpreting it as "weird errors", it's shit coding then. End of discussion.

vr602 · 2007-02-16

Many many moons ago, I worked on an IT helpdesk for an industrial gas company. A girl phoned me up, claiming that "very often when she was on the phone, her screen would go all strange". Obviously I was fairly sceptical about this one, not believing that a telephone could generate enough interference to disturb a monitor ( in fact it was an HP green-screen dumb terminal ). However, she insisted that this was a real problem, so I went up to her floor to have a look. I tried using her phone, no problem at all, couldn't reproduce the issue or anything like it. Much puzzlement. In an inspired moment, I asked her to try using the phone; of course, sitting where she was, with the phone under the monitor, the handset cable rubbed on the underside of the screen, tweaking the contrast knob with every movement of her head, and the screen indeed "went all strange"! Moving the telephone 3 inches to the right seemed to solve the problem...

2007-02-16

Can a fan replace broken air con? Maybe for people (who sweat) but not for servers - it's not going to do much but stir the hot air.

2007-02-16

I'm reminded now of a time when the company I worked for installed a new barcode scanner system in an industrial laundry up in Leeds. The system had been well track tested and there were no special requirements for this customer. After a couple of weeks of trouble-free usage I got a lot of angry phone calls from the customer citing misread barcodes and the associated trouble that caused in the database. A site visit was deemed good for customer relations. On arriving at the site I found that the customer had moved the scan station from its original location - and tie-wrapped the network cable neatly to the industrial 3-phase lecky supply. Maybe that had something to do with it.

2007-02-16

Yeah! I do like paranoid programmers ;)

2007-02-16

Zylon:
Been there...:
A programmer tried for months to isolate a problem with the accounting module, which would occasionally leave the corporate books out of balance by just a few dollars each month. One night he was working late, investigating the problem and poring over the COBOL source code. As he studied the code, the cleaning lady was just finishing with her sweeping in the room. She reached over into the card tub bin, took a card at random, and swept the little pile of dirt onto the card. Then she threw the card and dirt into the trash can.
There does seem to be an entire subclass of computer WTFs based around clueless maintenance and construction workers. My favorite-- computer scientist working on garbage-collection algorithms has all his notes on the subject in a box labeled "GARBAGE". Yeah, do I even need to finish this one?

Bad luck I'd say :D That's why my stuff is in boxes/folders with names like "TOP SECRET", "BREAKABLE", "TOXIC HAZARD", "DANGER: EXPLOSIVES" etc... :P

2007-02-16

That happened to me many, many times when I worked for a company which provided software maintenance&management for an insurance company (read: fixing bugs and tweaking their software to support their changing business model on a daily basis).

I never worked for a bank, but judging from what I read here they are very much like insurance people. Very often I got a call from our customer informing me that a weird error had happened at some agency. No, they couldn't give me an accurate description of the error, but somehow some data in the database had been corrupted. No, they couldn't tell me what exactly the agency was doing before the error happened. No, no one was able to reproduce the error in a test environment, and of course trying to reproduce it in the production environment was out of the question (no one wants to corrupt more data just to make my job easier, understandably, and I'm sure it would have been impossible to reproduce in production anyway).

So, what should I do? I chased ghosts for a couple days, flagged it as "intermittent problem, please inform me if it happens again" (almost never did) and forgot about it.

Waste of time, but I was paid to do it. Way I see it, computer people have a lot in common with hookers: we are not paid to solve problems, we are paid to give satisfaction to customers. If me chasing ghosts gives them pleasure, so be it.

Btw, 90% of the time this kind of problem could be traced to someone launching ill-thought SQL queries, BY HAND, against the production database. My guess is someone in our company IT dept thought he had l33t skillz and could hack away some problem or other by launching SQL commands by hand, not bothering to go through the proper applications that access the database with respect for the business rules, of course. The result was data that did not conform to the business rules, hence "data corruption". I never found out who it was. Yes, I could prove it, but the customer vehemently denied anyone was recklessly launching SQL by hand, and the customer is always right. Yes, this could be fixed with a sane permissions policy and a few triggers enforcing business rules, but I guess they thought that would stop them from using their ugly SQL hacks, so it was never done despite my advice.

End of the day, if I am paid to track down a problem with a poor description and which probably is nonexistent in the first place, then I will probably do it. Who knows, sometimes the customer is right and I actually find something to fix.
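
To make the "few triggers enforcing business rules" idea concrete, here is a small self-contained sketch using Python's built-in `sqlite3` module. The `policy` table and the rule that a premium must be positive are invented for illustration; the real system's schema and database engine are unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE policy (
    id INTEGER PRIMARY KEY,
    premium REAL NOT NULL
);

-- Reject any insert, hand-written or not, that violates the
-- (hypothetical) rule that a premium must be positive.
CREATE TRIGGER policy_premium_positive
BEFORE INSERT ON policy
WHEN NEW.premium <= 0
BEGIN
    SELECT RAISE(ABORT, 'premium must be positive');
END;
""")


def insert_policy(premium):
    """Return True if the row was accepted, False if the database refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO policy (premium) VALUES (?)", (premium,))
        return True
    except sqlite3.DatabaseError:
        return False
```

For a rule this simple a CHECK constraint would do the same job; a trigger is shown because that is what the comment suggests, and because triggers can express rules that span several rows or tables. Either way the rule holds no matter which client issues the SQL.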

2007-02-16

Ahem, no.

It's one thing if your app can't connect to something external, that's what queues are for... yes, by all means do your best to be robust, especially if it's a system you can't get your hands on.

However, if the client's connection to your app itself is unreliable, there's not much you can do, unless you want all your requests to come from javascript with its own timeouts and its own error pages. I think that degree of paranoia and the associated bloat would be a nice dailyWTF.
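
The "that's what queues are for" point can be illustrated with a tiny in-memory outbox: messages for the external system are queued first and only removed once a send succeeds. This Python sketch uses made-up names (`Outbox`, `send`) and keeps everything in memory; a real deployment would presumably want a persistent queue.

```python
from collections import deque


class Outbox:
    """Buffer messages while the external system is unreachable."""

    def __init__(self, send):
        self._send = send        # callable that raises OSError on failure
        self._pending = deque()

    def submit(self, message):
        """Queue a message and try to deliver everything queued so far."""
        self._pending.append(message)
        self.flush()

    def flush(self):
        """Send queued messages in order; stop at the first failure."""
        while self._pending:
            try:
                self._send(self._pending[0])
            except OSError:
                return False     # keep the message for a later attempt
            self._pending.popleft()
        return True

    def __len__(self):
        return len(self._pending)
```

Calling `flush()` again later, from a timer or on the next `submit()`, delivers whatever piled up while the link was down, in the original order.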

2007-02-16

Just because the UAT server is in the secure server room, why do you assume that the UAT testing also occurs there?

Much more likely that the UAT testers sit in their own room (possibly with A/C, but not really relevant) where a number of UAT client machines are connected to the UAT server via the network. The physical cable connection at the server end was being intermittently disconnected by a badly situated fan device. I assumed that the reason Steve wanted to access the server room was to determine what was different about the UAT server compared to the previous dev/test servers, where the code ran fine.

I think the real wtf is/are: (a) that the people responsible for the server room didn't realise that a network cable was being disturbed by the fan i.e. that the developer was responsible for diagnosing a hardware problem (b) that it took so much time and effort for Steve to be granted access to the server room to investigate, having already taken all reasonable steps to show the code was not the problem.

2007-02-16

cklam:
every time the connection was reset a heavy truck passed the property.

That's a clever trick. So, was the truck delivering the error message?

2007-02-16

[[My response would be: I checked it out, weird things are not happening, problem solved. And then proceed to do nothing until a better bug description was forthcoming or I was fired.]]

Amen.

2007-02-16

Our anonymous submitta' (we'll call him "Steve") felt dis stin', too. 'S coo', bro. He'd carefully and medodically developed some module dat wo'ked finely in de bank's development and test environments, but failed UAT. Testers had discovered some bug in de module dat dey couldn't reproduce. De official descripshun wuz dat da damn module caused "funky doodads" t'happen. 'S coo', bro.

2007-02-16

the_real_tel:
I think the real wtf is/are: (a) that the people responsible for the server room didn't realise that a network cable was being disturbed by the fan i.e. that the developer was responsible for diagnosing a hardware problem (b) that it took so much time and effort for Steve to be granted access to the server room to investigate, having already taken all reasonable steps to show the code was not the problem.

There are quite a few companies that have a culture of shoot first, ask questions later. I had been called on the carpet several times before, when I was not to blame, at a company that literally deployed the biggest house of cards in the world. I left that place, and found a better place to work. I think that the company "Steve" worked for has the same mindset and culture. Some say it's ignorance on the part of upper management, some say they have no protection from their supervisor, I say they can all blow it out their collective... well, anyway...

2007-02-16

Arrastia:
That happened to me many, many times when I worked for a company which provided software maintenance&management for an insurance company (read: fixing bugs and tweaking their software to support their changing business model on a daily basis).
[snip...]

End of the day, if I am paid to track down a problem with a poor description and which probably is nonexistent in the first place, then I will probably do it. Who knows, sometimes the customer is right and I actually find something to fix.

No, that's an entirely different situation. You were being paid to track down the problems of end users who can not reasonably be expected to know what a good bug report is. Your job involves a lot of people skills and coaxing information out of non-computer people. In that situation, it's perfectly reasonable to chase down false leads and do what you did.

Having a QA dept which can reproduce the problem, but won't even let you into the room to see them doing it is not reasonable.

2007-02-16

Ben:
Also, I must say that usually when a user says "something weird" is happening, I assume (from experience) that it means "I didn't pay attention to what I was clicking on and have no idea the damage I've done".

Or "I know what I did but if I play dumb, someone else might fix it and no blame will stick to me"

2007-02-16

I remember a case where a home user claimed the computer would crash when the dog barked. The skeptical tech eventually went to the house, and nothing seemed wrong till they put the cat out. After a flurry of barking, the computer reset.

Apparently the dog's 'invisible fence' was plugged into the same outlet as the computer.

cklam · 2007-02-16

Grinden:
cklam:
every time the connection was reset a heavy truck passed the property.

That's a clever trick. So, was the truck delivering the error message?

No, fish (mostly).

2007-02-16

anonymous guy:
Arrastia:
That happened to me many, many times when I worked for a company which provided software maintenance&management for an insurance company (read: fixing bugs and tweaking their software to support their changing business model on a daily basis).
[snip...]

End of the day, if I am paid to track down a problem with a poor description and which probably is nonexistent in the first place, then I will probably do it. Who knows, sometimes the customer is right and I actually find something to fix.
No, that's an entirely different situation. You were being paid to track down the problems of end users who can not reasonably be expected to know what a good bug report is. Your job involves a lot of people skills and coaxing information out of non-computer people. In that situation, it's perfectly reasonable to chase down false leads and do what you did.

Having a QA dept which can reproduce the problem, but won't even let you into the room to see them doing it is not reasonable.

Go back and re-read the OP. There is ZERO evidence that it was a QA dept. All he said was UAT and testers - that could very likely mean end-users tasked with testing the app before rollout.

2007-02-16

olsner:
The application should have been built from the start to gracefully handle temporary disconnections... And I hope it's well-known that delays in network propagation and IPC/thread scheduling are more or less arbitrarily large

That's nice, though there's a difference between disruption of intermediate networks (where you just lose packets), and losing the physical connection at the server (which can cause all kinds of services to unbind from the adaptor).

A lot of folk assume that the server's LAN connection isn't going to "come and go". Unfortunately, it can happen.

The Bank Has Spoken!
