killall sleep

2014-03-27 Reply Admin

TRWTF is checking on processes only once every hour.

2014-03-27 Reply Admin

Why would this be slow? Sleeping processes only consume memory, but not CPU time.

2014-03-27 Reply Admin

Depends on if your server has enough memory or is swapping all the time

DaveK · 2014-03-27 Reply Admin

Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.

2014-03-27 Reply Admin

If this is a cron job, then it would eventually bring the server to its knees because it'll start more and more of these processes. (If, instead, you have a cron job that invokes an Upstart job, for instance, you won't have that problem. Or just, yaknow, have a cron job that does the check once and terminates, and gets invoked every hour. But that's way too obvious for a Scripting Expert.) The entire purpose of cron facilities is to relieve you of the need to invoke a ton of infinitely-looping processes like this.
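A minimal sketch of the "check once and terminate" approach described above. The process name some_process, the script path in the crontab comment, and the function name check_once are all placeholders, not taken from the article. It assumes pgrep is available on the system.

```shell
#!/bin/sh
# check_once NAME: perform a single check and return; cron supplies the schedule,
# so the script contains no loop and no sleep.
check_once() {
    name=$1
    # pgrep -x matches the exact process name and exits non-zero when nothing matches.
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "$name is running"
    else
        echo "WARNING: $name is not running"
        return 1
    fi
}

# In the script that cron runs, the whole body is then just:
#   check_once some_process
#
# Hypothetical crontab entry (minute 0 of every hour):
#   0 * * * * /usr/local/bin/check_some_process.sh
```

Because each run exits, nothing accumulates in the process table between checks.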

2014-03-27 Reply Admin

i=1
while [ $i = 1 ]; do

This is a fail in its own class.

No need to use a "variable" (it never gets changed, so not very "variable"), 1 = 1 would do the same
Launches unrequired test command
= compares strings, you should use -eq to compare numbers.
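The last point can be seen in a small sketch. The values 01 and 1 are made up for illustration: they are different strings but the same number.

```shell
#!/bin/sh
# '=' compares strings; '-eq' compares integers.
a=01
b=1

if [ "$a" = "$b" ]; then string_result=equal; else string_result=different; fi
if [ "$a" -eq "$b" ]; then number_result=equal; else number_result=different; fi

echo "string comparison:  $string_result"
echo "numeric comparison: $number_result"
```

The string comparison reports "different" and the numeric one reports "equal".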

ratchet freak · 2014-03-27 Reply Admin

YARCI: Yet Another Resource Conscious Idiot

when something takes too many resources and you wish to avoid it then you should ensure that your alternative doesn't take even more resources

2014-03-27 Reply Admin

DaveK:
Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load.

Maybe the expert / guru felt it was slow and implemented his own memory swapping based on infinite loops with sleeps?

2014-03-27 Reply Admin

DaveK:
Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.

I agree; from a pure performance perspective I don't see why this should be a problem. In some ways it might be better, because you are not spawning a new shell every time the job runs and incurring all that process start-up overhead. That said, unless we are talking about some really old systems with tiny amounts of memory and CPU power and/or impossibly large numbers of shell scripts, I doubt it matters either way.

Still, I'd call this a WTF. Why? Because if you used cron you'd have an idea of what should be happening when, and at least a log of the ultimate return code. If something goes wrong you can determine which of the implied many scripts is having a problem fairly quickly and easily. Some transient errors will correct themselves too: if a network resource becomes unavailable but is back the next time cron starts the script, things will probably be fine.

Under this scheme I don't know how you determine which script has died, if there is a problem (somehow I doubt there is rigorous logging). I would assume the thing is more brittle, because if there are temporary errors the scripts will probably just die and not restart on their own.

2014-03-27 Reply Admin

And again an 'expert' is judged on a single item. Boring

2014-03-27 Reply Admin

Of course if all of these monitor jobs got started when the system booted they'd all land on it about the same time. With cron presumably you'd skew the start times a bit.

DaveK:
Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.

2014-03-27 Reply Admin

Geoff:
DaveK:
Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.

I agree; from a pure performance perspective I don't see why this should be a problem. In some ways it might be better, because you are not spawning a new shell every time the job runs and incurring all that process start-up overhead.

On the other hand, you now have the overhead of starting the sleep process (which is not usually a shell builtin) while cron sleeps internally.

That said, unless we are talking about some really old systems with tiny amounts of memory and CPU power and/or impossibly large numbers of shell scripts, I doubt it matters either way.

Agreed. If the server was really so slow, there must be another reason. Even though the number of sleeps running adds to the ps list, ps and grep won't take a noticeable amount of time when called every hour.

Still I'd call this a wtf. [...] I would assume the thing is more brittle because if there are temporary errors the scripts will probably just die and not restart on their own.

Quite likely. And checking the process regularly is just wrong to start with. Just have the script that starts the process wait for it (i.e. do nothing, since that's what it does by default if you don't explicitly put it in the background) and when it terminates, the start script itself can give a useful message and (e.g.) restart the process. Been there, done that, no big deal, no cron needed, no delayed response. Probably too easy for a real guru.
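A rough sketch of that wait-and-restart idea. The function name supervise and its restart limit are illustrative assumptions; the limit exists only so the example terminates, and a real start script might loop indefinitely.

```shell
#!/bin/sh
# supervise MAX CMD [ARGS...]: run CMD in the foreground; each time it exits,
# report the exit status and start it again, for at most MAX starts.
supervise() {
    max=$1
    shift
    starts=0
    while [ "$starts" -lt "$max" ]; do
        starts=$((starts + 1))
        "$@"                 # the shell blocks here until the command exits
        status=$?
        echo "run $starts exited with status $status"
    done
}

# Example with a stand-in command that exits immediately with status 3:
supervise 2 sh -c 'exit 3'
```

The response is immediate because the script is already waiting on the process when it dies, rather than polling for it once an hour.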

2014-03-27 Reply Admin

You can't really do :

ps -ef | grep some_process | wc -l

as the grep will normally match itself - so you'll always get one match.

You can work around this using :

ps -ef | grep '[s]ome_process' | wc -l

should work as intended....
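A small sketch of why the bracket trick works, run against canned text instead of live ps output so the result is predictable. The sample lines are made up. The pattern is quoted so the shell cannot treat the brackets as a filename glob, and grep -c stands in for the wc -l step.

```shell
#!/bin/sh
# Canned lines standing in for `ps -ef` output (invented for illustration).
daemon_line='root  101    1  0 10:00 ?      00:00:01 some_process --daemon'

naive_ps="$daemon_line
user  202  150  0 10:05 pts/0  00:00:00 grep some_process"

careful_ps="$daemon_line
user  202  150  0 10:05 pts/0  00:00:00 grep [s]ome_process"

# Plain pattern: the grep command's own entry contains "some_process" too.
naive=$(printf '%s\n' "$naive_ps" | grep -c 'some_process')

# Bracket pattern: the regex [s]ome_process matches the text "some_process",
# but not the literal text "[s]ome_process" shown in grep's own entry.
careful=$(printf '%s\n' "$careful_ps" | grep -c '[s]ome_process')

echo "naive=$naive careful=$careful"
```

With these sample lines the naive count is 2 and the bracketed count is 1.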

2014-03-27 Reply Admin

Cron has its own problems, and a loop may be better in some cases.

See what Tom Limoncelli says about it: http://queue.acm.org/blogposting.cfm?id=76566

Tom Limoncelli (Everything Sysadmin):
A friend of mine told me of a situation where a cron job took longer to run than usual. As a result the next instance of the job started running and now they had two cronjobs running at once. The result was garbled data and an outage.
The problem is that they were using the wrong tool. Cron is good for simple tasks that run rarely. It isn't even good at that. It has no console, no dashboard, no dependency system, no API, no built-in way to have machines run at random times, and it's a pain to monitor. [...]

As others have pointed out, the sleep builtin should not cause performance problems. The server probably just had too many regular tasks to handle the load, and changing to cron would probably not help.

2014-03-27 Reply Admin

I came to post the same trick. I suppose I 'discovered' this independently; I've never seen it anywhere before. I usually see:

ps | grep process | grep -v grep | ...

I also wonder if the author was counting on the grep match, and the warning occurs if a process with that name is ever caught running :)

I would use it with a process name of $0 in that case...

ochrist · 2014-03-27 Reply Admin

This is the same way I read The Daily WTF:

Check the website. Nothing interesting. Go back to sleep.

2014-03-27 Reply Admin

Instead of "$i = 1", he should have used ":".

2014-03-27 Reply Admin

"Bourne"? How does that work?

From Wiktionary:

Noun

bourne (countable and uncountable, plural bournes)

(countable, archaic) A boundary.
(archaic) A goal or destination.
(countable) A stream or brook in which water flows only seasonally.

Exactly which of the above usages is intended here?

ochrist · 2014-03-27 Reply Admin

QJo:
"Bourne"? How does that work?
From Wiktionary:

Noun

bourne (countable and uncountable, plural bournes)

(countable, archaic) A boundary.

(archaic) A goal or destination.

(countable) A stream or brook in which water flows only seasonally.

Exactly which of the above usages is intended here?

Maybe The Bourne shell

TheCPUWizard · 2014-03-27 Reply Admin

QJo:
"Bourne"? How does that work? ... Exactly which of the above usages is intended here?

http://en.wikipedia.org/wiki/Jason_Bourne

<grin>

2014-03-27 Reply Admin

Bourne refers to a unix shell.

This one: http://en.wikipedia.org/wiki/Bash_%28Unix_shell%29 Or possibly this: http://en.wikipedia.org/wiki/Bourne_shell

2014-03-27 Reply Admin

Always code to the Principle of least astonishment. The right thing to do is make a daemon if the task requires it. If a daemon is deemed to be too heavy handed then a cron job might be the next place to look.

2014-03-27 Reply Admin

Is it just me or did this comments section turn into some kind of amateur unix sysadmin pissing contest?

Bobby Tables · 2014-03-27 Reply Admin

QJo:
"Bourne"? How does that work?
From Wiktionary:

Noun

bourne (countable and uncountable, plural bournes)

(countable, archaic) A boundary.

(archaic) A goal or destination.

(countable) A stream or brook in which water flows only seasonally.

Exactly which of the above usages is intended here?

Bourne-again shell. Something commonly known as bash or the "bash shell" (which would actually make it the "Bourne-again shell shell"). Or it could be the Bourne shell (sh) it's based on. At any rate - it's definitely a shell.

2014-03-27 Reply Admin

I always just do this:

while [ 1 ]; do

Captcha: persto

persist. How fitting for a discussion on infinite loops

Steve The Cynic · 2014-03-27 Reply Admin

cron:
As others have pointed out, the sleep builtin should not cause performance problems. The server probably just had too many regular tasks to handle the load, and changing to cron would probably not help.

It might, if the problem is too many processes, because by running the real work from cron instead of from a script, you eliminate one (maybe two, depending on whether sleep is a built-in or external command) processes from the process table.

This can be significant - in a previous life I worked on large(*) Solaris-on-Sparc iron where the process table on the dev machines was limited to 8000-ish processes, and the normal load when the machine was up and running but not doing much was around 7000...

(*) The main production machines had 192 processors and oooooooodles of memory. The physical dev machines were partitioned into multiple logical machines, and the logical machine I normally used was "only" 32 processors.

2014-03-27 Reply Admin

Bobby Tables:
QJo:
"Bourne"? How does that work?
From Wiktionary:

Noun

bourne (countable and uncountable, plural bournes)

(countable, archaic) A boundary.

(archaic) A goal or destination.

(countable) A stream or brook in which water flows only seasonally.

Exactly which of the above usages is intended here?

Bourne-again shell. Something commonly known as bash or the "bash shell" (which would actually make it the "Bourne-again shell shell"). Or it could be the Bourne shell (sh) it's based on. At any rate - it's definitely a shell.

Brilliant, thanks. I am suitably enlightened.

2014-03-27 Reply Admin

So what do you do if you have a long running script, and want to re-run it every hour after it is finished?

With this sleep method: run the script, it takes 90 min, then an hour later you run it again.

With cron: You set it to run every hour. The first hour runs the script. Then with 30 min left to go before the script finishes, you start another instance? That's no good. Do you just set the script to run every 150 min?

What's the best option here?
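One common answer to the overlap half of this question is a lock, so that a cron-started run skips itself while the previous one is still active. The sketch below is only one way to do it, using an atomic mkdir as the lock. The names run_exclusive and the wrapper path in the crontab comment are hypothetical. Note that it does not give "one hour after the previous run finished" spacing, and a job that is killed without cleaning up leaves a stale lock directory behind.

```shell
#!/bin/sh
# run_exclusive LOCKDIR CMD [ARGS...]: run CMD only if no other run holds the lock.
# mkdir is atomic: it either creates the directory or fails because it exists.
run_exclusive() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"      # release the lock once the job has finished
        return "$status"
    else
        echo "previous run still active, skipping"
        return 0
    fi
}

# Hypothetical hourly crontab entry for a wrapper script built on this function:
#   0 * * * * /usr/local/bin/run_long_job.sh
```

With this in place, the 90-minute job from the example would run at hour 0, be skipped at hour 1, and run again at hour 2.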

DrakeSmith · 2014-03-27 Reply Admin

TheGingerDog:
You can't really do :
ps -ef | grep some_process | wc -l

as the grep will normally match itself - so you'll always get one match.

You can work around this using :

ps -ef | grep '[s]ome_process' | wc -l

should work as intended....

Or you could just use pgrep...

operagost · 2014-03-27 Reply Admin

ratchet freak:
YARCI: Yet Another Resource Conscious Idiot
when something takes too many resources and you wish to avoid it then you should ensure that your alternative doesn't take even more resources

Yet another example of the inner-platform effect.

2014-03-27 Reply Admin

On my system at least, running a sleep takes up about 500 bytes of memory. You'd run out of PIDs long before you had any memory or slowness issues from too many sleeps.

operagost · 2014-03-27 Reply Admin

Taco:
And again an 'expert' is judged on a single item. Boring

That "expert" can't be very good at time management. I'm lazy. I work on Windows systems. I like to schedule stuff with "at". Why recreate a poor version of a scheduler? Powershell is making it easier, but *nix users should already be experts at replacing everything with a very small shell script.

dkf · 2014-03-27 Reply Admin

nitePhyyre:
So what do you do if you have a long running script, and want to re-run it every hour after it is finished?

If grepping the output of ps is taking over an hour, you've got bigger problems than that.

2014-03-27 Reply Admin

RFox:
Of course if all of these monitor jobs got started when the system booted they'd all land on it about the same time. With cron presumably you'd skew the start times a bit.

Actually, it's probably the other way around. You could set up cron so everything started at a different time, but in practice you typically see all the jobs started with "0 * * * *" or "@hourly" or thrown into /etc/cron.hourly (depending on which flavor of cron you're using). These all start jobs at minute 0 of each hour.

A loop like "while [ 1 ]; do something; sleep 3600; done" takes an hour plus the duration of "something" to run. They'll all initially start at the same time, but assuming each of the tasks has a different "something" they'll each have a slightly different duration and will get out of sync fairly quickly. This is especially true on a busy system, since that will affect the duration of the "something", forcing the jobs farther out of sync. It's kind of self-balancing that way.
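The drift can be put into numbers with a little shell arithmetic. The two work durations (5 and 20 seconds) are invented for illustration, and the sketch ignores process start-up time.

```shell
#!/bin/sh
# start_time N WORK: seconds after launch at which iteration N (0-based) begins,
# for a loop that does WORK seconds of work and then sleeps 3600 seconds.
start_time() {
    echo $(( $1 * (3600 + $2) ))
}

# Two hypothetical jobs whose "something" takes 5 s and 20 s respectively.
a=$(start_time 24 5)
b=$(start_time 24 20)
echo "after 24 iterations the jobs start $(( b - a )) seconds apart"
```

Under these assumptions, two jobs launched together are six minutes apart after 24 iterations.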

2014-03-27 Reply Admin

tim:
Is it just me or did this comments section turn into some kind of amateur unix sysadmin pissing contest?

As opposed to the usual "let me show you how to write that trivial code"* amateur programmer pissing contests that we have?

*non-ironically as in writing a Fizz-Buzz, rather than for humor, such as trying to introduce cliches or different WTFs.

2014-03-27 Reply Admin

if you really want to get pedantic, while [ 1 ]; do will work just as well, why even invoke a comparison when all you want is 'true'

DaveK · 2014-03-27 Reply Admin

tim:
Is it just me or did this comments section turn into some kind of amateur unix sysadmin pissing contest?

It's just you. Your description of a perfectly moderate and even-tempered discussion of the issues raised by today's WTF is needlessly hyperbolic.

2014-03-27 Reply Admin

The fact that nearly every article on this site spawns a host of "I don't see the problem", "TRWTF is", and "here is how to really solve this problem"-type comments tells me just how tough this shit is.

CAPTCHA: Tego, n.: Lego bricks made out of teak.

2014-03-27 Reply Admin

The problem might have been in the ps command. In the '90s I worked on HP-UX machines on which the man page warned that using ps would lock the process queue while it ran. Users were loudly complaining about slow start-up times, so I poked around. I found a script that ran every second and did a 'ps | grep something | grep -v grep' despite the fact that the process it was checking had a known pid. Changed that to a kill -0 and the bottleneck disappeared.
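A minimal sketch of the kill -0 check, using a background sleep as a stand-in for the monitored process. The helper name is_alive is an assumption for illustration. Signal 0 sends nothing; it only reports whether the PID exists and may be signalled by the caller.

```shell
#!/bin/sh
# is_alive PID: succeed if a process with that PID exists and we may signal it.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &                 # stand-in for the process with a known pid
pid=$!

if is_alive "$pid"; then
    echo "process $pid is running"
fi

kill "$pid"
wait "$pid" 2>/dev/null    # reap the child so the PID is really gone

if ! is_alive "$pid"; then
    echo "process $pid is gone"
fi
```

No process listing is scanned at all, which is why it is so much cheaper than the ps pipeline when the pid is already known.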

D-Coder · 2014-03-27 Reply Admin

Geoff:
DaveK:
Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.
<<snip>>
Under this scheme I don't know how you determine which script has died, if there is a problem (somehow I doubt there is rigorous logging). I would assume the thing is more brittle, because if there are temporary errors the scripts will probably just die and not restart on their own.

Duh! You write a script with a loop and a one-hour sleep that makes sure all the other scripts are running correctly!

2014-03-27 Reply Admin

The other reason to use this pattern (sleep in an infinite loop) is that you can instantly trigger a check that everything's running:

killall sleep

(Note that on some non-Linux platforms, killall does something very different and unfriendly)

2014-03-27 Reply Admin

QJo:
Brilliant, thanks. I am suitably enlightened.

I think you mean "brillant".

VinDuv · 2014-03-27 Reply Admin

coder:
if you really want to get pedantic, while [ 1 ]; do will work just as well, why even invoke a comparison when all you want is 'true'

You can also use “while [ 0 ]; ” which does the same thing and is more confusing.

The best way to do this is probably “while true ;”, or “while : ;”.
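A short sketch of why [ 0 ] and [ 1 ] behave the same: with a single argument, the test command only checks whether the string is non-empty. The helper name describe is made up for the example.

```shell
#!/bin/sh
# describe VALUE: report what a one-argument test makes of VALUE.
describe() {
    if [ "$1" ]; then echo "true"; else echo "false"; fi
}

echo "one:   $(describe 1)"
echo "zero:  $(describe 0)"
echo "empty: $(describe '')"

# 'true' and ':' succeed without evaluating any test expression.
if true; then echo "true succeeds"; fi
if :; then echo "colon succeeds"; fi
```

Both 1 and 0 come out true because they are non-empty strings; only the empty string is false.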

NamingException · 2014-03-27 Reply Admin

If we were meant to write infinite loops, there would be a native language construct for them.

2014-03-27 Reply Admin

I doubt the "couple of hundred instances of sleep" were the real problem. It's not exactly uncommon on servers running things like apache to have a few hundred sleeping processes and a sleeping process should be pretty low overhead.

More likely the problem was lots of random badly written scripts chewing up resources.

2014-03-27 Reply Admin

Anonymous') OR 1=1:
QJo:
Brilliant, thanks. I am suitably enlightened.

I think you mean "brillant".

Yes, I can't argue there, I do indeed mean "brillant".

2014-03-27 Reply Admin

TRWTF is

Scripts doing infinite loops and then sleeping for an hour were often the norm.

If you were doing something after the loop, it's not that infinite, now is it?

Of course in this, you're doing something in the loop...

no laughing matter · 2014-03-27 Reply Admin

coder:
if you really want to get pedantic, while [ 1 ]; do will work just as well, why even invoke a comparison when all you want is 'true'

If all you want is true, why do you additionally run test?

Your valid options are:

while true; do ...
while false; do ...
while [ ! -e $file ]; do

Nagesh · 2014-03-27 Reply Admin

TheCPUWizard:
QJo:
"Bourne"? How does that work? ... Exactly which of the above usages is intended here?

http://en.wikipedia.org/wiki/Jason_Bourne
<grin>

I am also thinking of JSON bourne that super spy who is no match for James Bond.

2014-03-27 Reply Admin

nitePhyyre:
So what do you do if you have a long running script, and want to re-run it every hour after it is finished?
With this sleep method: run the script, it takes 90 min, then an hour later you run it again.

With cron: You set it to run every hour. The first hour runs the script. Then with 30 min left to go before the script finishes, you start another instance? That's no good. Do you just set the script to run every 150 min?

What's the best option here?

man at

Bourne to be While
