Admin
Uptime doesn't matter. Downtime does.
Which would you rather have: a machine that stays up for forty days and reboots reliably in thirty seconds (five nines), or a machine that stays up for four hundred days and reboots, with 95% reliability, in four hours (somewhat fewer nines)?
The latter might be impressive, but in, ahem, mission-critical systems, it's a bit of a bust.
It's all about downtime.
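For anyone who wants to check the nines, here is a rough Python sketch of the arithmetic. It assumes availability is simply uptime divided by uptime plus downtime over one cycle, and it ignores the 5% chance that the slow machine's reboot fails outright, so the second figure is optimistic. The function names are made up for illustration.

```python
import math

DAY = 86_400   # seconds in a day
HOUR = 3_600   # seconds in an hour


def unavailability(uptime_s: float, downtime_s: float) -> float:
    """Fraction of one up/down cycle spent down."""
    return downtime_s / (uptime_s + downtime_s)


def nines(uptime_s: float, downtime_s: float) -> int:
    """Count of leading nines in the availability figure (99.9% -> 3)."""
    return math.floor(-math.log10(unavailability(uptime_s, downtime_s)))


# Machine A: up for forty days, back in thirty seconds.
fast_nines = nines(40 * DAY, 30)

# Machine B: up for four hundred days, back in four hours
# (assuming the reboot succeeds at all, which it only does 95% of the time).
slow_nines = nines(400 * DAY, 4 * HOUR)

print(f"40 days up, 30 s down:  {fast_nines} nines")
print(f"400 days up, 4 h down:  {slow_nines} nines")
```

Under those assumptions the first machine comes out at five nines and the second at three, which is the gap the comment is pointing at.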
Admin
Back when I worked for AOL, we had some old, old, old, old Netscape equipment with uptimes of 5 and 7 years, respectively. They were old HP-UX machines, and were critical for some Netscape 4.5-related issues (I forget the details, but even when Netscape 7.0 came out, we still had over 15,000 Netscape 4.5 users). I don't know what would have happened if these machines had died, but I was told "it would be bad." I managed them as part of a "give the new guy the stuff we don't want" collection (which also included DMOZ). There was no documentation and I didn't have root access. The funny thing is NOBODY had root access to any box! As part of their "multi-turnkey" security, the only way to get root access to a box was through an OOB during a set 15-minute window. The root passwords were scrambled by an automatic process every 15 minutes, and you had to jump through hoops to get whatever the scrambled password was in a given 15-minute window.
So they were unpatched, running on several old SCSI disks. No backups. The consensus was that we were afraid to shut them down at all, because the disks might not spin up again.
That was back in 2005. I wonder if they are still going?
Admin
[quote user="Kuba"][quote user="Martin"] This is akin to labeling the "OFF" switch with an "ON" label and blaming the user for not looking in the manual. [/quote]
When office equipment started coming out with "world" rather than "US" markings, I went to do something with a PC and there was the 0/1 switch... Except they used stylized graphics, so it was really a circle / line switch. Which is which? Open eye = on, closed eye = off was my initial reading. Wrong! Progress in ergonomics: cross-culturally counterintuitive markings on critical operating controls.
Admin
A clbuttic mistake.
Admin
Actually, you can NEVER really "test MS Windows scripts". DLLs can change at any moment and the whole system could change. So Windows has all the problems that they said the Tandem has AND THEN SOME! And most computers measure uptime in YEARS! UNIX systems do, I understand mainframes do, etc... Disks last at LEAST 3 years on average, although I have had many last over 10! So it is WINDOWS that gets the blame for crashes!!!!!
Admin
The REAL WTF is that Chris's company did business with apartheid-era South Africa.
Admin
Flipping the power switch on that Tandem for a few seconds would not require a complete restart. Tandem machines (still) do a power-fail restart and continue where they were interrupted. Thus, after power-on, that machine, its database and its application were very likely in perfect shape, without the need for a "Cold-Load" (or IPL, or Ctrl-Alt-Del, or reboot), or any need to run the broken scripts. It may have lost its network connections to the ATMs (if those were based on TCP/IP), but otherwise no damage done.
Admin
Old thread, found in a search result. Any extreme-reliability design would take PERSISTENT storage into account. Storing code, procedures and values in volatile RAM (no matter how many standby CPUs are present) is NOT persistent, period. It doesn't matter how many UPS backup units etc. you have: batteries fail, wiring fails, circuit breakers fail, power contacts fail, etc. I've written over a million lines of code in my lifetime and I can tell you that potential power interruption, at ALL levels, must be considered and is part of design reliability. The operating system, software and other scripts must all rely on PERSISTENT redundant storage.
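As a small illustration of the "persistent" half of that point, here is a Python sketch of one common pattern for getting a value out of RAM and onto disk so that a power cut leaves either the old contents or the new ones, not a half-written file. It is not what the Tandem did, and it does nothing about redundancy; the `durable_write` name is made up for this example, and the directory sync step assumes a POSIX system.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data`, syncing it to disk first."""
    directory = os.path.dirname(os.path.abspath(path))

    # Write to a temporary file in the same directory, so the final
    # rename stays on one filesystem and can be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the bytes to the device
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

    # Assumption: POSIX. Sync the directory so the rename itself is persisted.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "state.bin")
        durable_write(target, b"balance=100")
        durable_write(target, b"balance=90")
        with open(target, "rb") as f:
            print(f.read())
```

Whether the bytes really reach the platters still depends on the drive and controller honoring the sync, which is one more layer where a power interruption has to be considered.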