• Brompot (unregistered)

    TRWTF is checking on processes only once every hour.

  • Black Bart (unregistered)

    Why would this be slow? Sleeping processes only consume memory, but not CPU time.

  • Neil (unregistered) in reply to Black Bart

    Depends on if your server has enough memory or is swapping all the time

  • (cs) in reply to Neil

    Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.

  • Chris Angelico (unregistered)

    If this is a cron job, then it would eventually bring the server to its knees because it'll start more and more of these processes. (If, instead, you have a cron job that invokes an Upstart job, for instance, you won't have that problem. Or just, yaknow, have a cron job that does the check once and terminates, and gets invoked every hour. But that's way too obvious for a Scripting Expert.) The entire purpose of cron facilities is to relieve you of the need to invoke a ton of infinitely-looping processes like this.
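The run-once alternative described above might look like the following. This is a minimal sketch rather than the script from the article: the function name `check_once`, the process name, and the crontab path are all hypothetical, and it assumes `pgrep` is installed.

```shell
#!/bin/sh
# check_once NAME: report once whether a process named exactly NAME is running,
# then return. No loop and no sleep: cron supplies the schedule.
check_once() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "$1 is running"
    else
        echo "WARNING: $1 is not running"
        return 1
    fi
}

# Hypothetical crontab entry that would run a script wrapping this check at
# minute 0 of every hour:
#   0 * * * * /usr/local/bin/check_some_process.sh
```

Because the script exits after each check, nothing accumulates in the process table between runs.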

  • Martin (unregistered)

    i=1
    while [ $i = 1 ]; do

    This is a fail in its own class.

    • No need to use a "variable" (it never gets changed, so not very "variable"); 1 = 1 would do the same
    • Launches unrequired test command
    • = compares strings, you should use -eq to compare numbers.
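The difference between `=` and `-eq` in the last point only shows up when two values are the same number written differently. A small sketch, with hypothetical helper names `same_string` and `same_number`:

```shell
#!/bin/sh
# same_string A B: true when A and B are identical strings.
same_string() { [ "$1" = "$2" ]; }
# same_number A B: true when A and B are equal as integers.
same_number() { [ "$1" -eq "$2" ]; }

if same_string 01 1; then echo "string: equal"; else echo "string: different"; fi
if same_number 01 1; then echo "number: equal"; else echo "number: different"; fi
```

For the loop in question both forms behave identically, since `i` is always exactly `1`.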
  • (cs)

    YARCI: Yet Another Resource Conscious Idiot

    when something takes too many resources and you wish to avoid it then you should ensure that your alternative doesn't take even more resources

  • Bob (unregistered) in reply to DaveK
    DaveK:
    Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load.
    Maybe the expert / guru felt it was slow and implemented his own memory swapping based on infinite loops with sleeps?
  • Geoff (unregistered) in reply to DaveK
    DaveK:
    Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.

    I agree; from a pure performance perspective I don't see why this should be a problem. In some ways it might be better, because you are not spawning a new shell every time the job runs and incurring all that process start-up overhead. That said, unless we are talking about some really old systems with tiny amounts of memory and CPU power and/or impossibly large numbers of shell scripts, I doubt it matters either way.

    Still, I'd call this a WTF. Why? Because if you used cron you'd have an idea of what should be happening when, and at least a log of the ultimate return code. If something goes wrong, you can determine which of the implied many scripts is having a problem fairly quickly and easily. Some transient errors will correct themselves too: if a network resource becomes unavailable but is back the next time cron starts the script, things will probably be fine.

    Under this scheme I don't know how you determine which script has died if there is a problem (somehow I doubt there is rigorous logging). I would assume the thing is more brittle, because if there are temporary errors the scripts will probably just die and not restart on their own.

  • Taco (unregistered)

    And again an 'expert' is judged on a single item. Boring

  • RFox (unregistered) in reply to DaveK

    Of course if all of these monitor jobs got started when the system booted they'd all land on it about the same time. With cron presumably you'd skew the start times a bit.

    DaveK:
    Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.
  • foo AKA fooo (unregistered) in reply to Geoff
    Geoff:
    DaveK:
    Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.

    I agree; from a pure performance perspective I don't see why this should be a problem. In some ways it might be better, because you are not spawning a new shell every time the job runs and incurring all that process start-up overhead.

    On the other hand, you now have the overhead of starting the sleep process (which is not usually a shell builtin) while cron sleeps internally.

    That said unless we are talking about some really old systems with tiny amounts of memory and CPU power and or impossibly large numbers of shell scripts I doubt it matters either way.
    Agreed. If the server was really so slow, there must be another reason. Even though the number of sleeps running adds to the ps list, ps and grep won't take a noticeable amount of time when called every hour.
    Still I'd call this a wtf. [...] I would assume the thing is more brittle because if there are temporary errors the scripts will probably just die and not restart on their own.
    Quite likely. And checking the process regularly is just wrong to start with. Just have the script that starts the process wait for it (i.e. do nothing, since that's what it does by default if you don't explicitly put it in the background or so), and when it terminates, the start script itself can give a useful message and (e.g.) restart the process. Been there, done that, no big deal, no cron needed, no delayed response. Probably too easy for a real guru.
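The wait-and-restart approach described above can be sketched as follows. The helper name `run_and_restart` is hypothetical, and the restart cap exists only so the example terminates; a real start script would presumably loop indefinitely and might pause between restarts.

```shell
#!/bin/sh
# run_and_restart MAX CMD [ARGS...]: run CMD in the foreground, wait for it to
# exit, report its exit status, and start it again, at most MAX times in total.
run_and_restart() {
    max=$1
    shift
    n=0
    while [ "$n" -lt "$max" ]; do
        "$@"                      # blocks until the command terminates
        status=$?
        n=$((n + 1))
        echo "run $n of '$*' exited with status $status"
    done
}

# Example: a command that fails immediately is restarted twice.
run_and_restart 2 false
```

The message appears the moment the process exits, so there is no hourly polling delay.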
  • TheGingerDog (unregistered)

    You can't really do :

    ps -ef | grep some_process | wc -l

    as the grep will normally match itself - so you'll always get one match.

    You can work around this using :

    ps -ef | grep [s]ome_process | wc -l

    should work as intended....
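The bracket trick works because grep's own command line then contains the literal text `[s]ome_process`, which the pattern `[s]ome_process` (meaning "s followed by ome_process") does not match. A reproducible sketch using made-up listings in place of real `ps -ef` output; `count_lines` is a hypothetical helper:

```shell
#!/bin/sh
# Canned, made-up stand-ins for `ps -ef` output, so the result is reproducible.
# Each listing includes the line for the grep process itself.
naive_listing='user  101    1  some_process --daemon
user  202  150  grep some_process'
bracket_listing='user  101    1  some_process --daemon
user  202  150  grep [s]ome_process'

# count_lines PATTERN LISTING: number of lines in LISTING that match PATTERN.
count_lines() { printf '%s\n' "$2" | grep -c -- "$1"; }

count_lines 'some_process' "$naive_listing"       # counts grep's own line too
count_lines '[s]ome_process' "$bracket_listing"   # counts only the real process
```

Quoting the pattern also keeps the shell from expanding `[s]ome_process` as a glob if a file named `some_process` happens to exist in the current directory.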

  • cron (unregistered)

    Cron has its own problems, and a loop may be better in some cases.

    See what Tom Limoncelli says about it: http://queue.acm.org/blogposting.cfm?id=76566

    Tom Limoncelli (Everything Sysadmin):
    A friend of mine told me of a situation where a cron job took longer to run than usual. As a result the next instance of the job started running and now they had two cronjobs running at once. The result was garbled data and an outage.

    The problem is that they were using the wrong tool. Cron is good for simple tasks that run rarely. It isn't even good at that. It has no console, no dashboard, no dependency system, no API, no built-in way to have machines run at random times, and it's a pain to monitor. [...]

    As others have pointed out, the sleep builtin should not cause performance problems. The server had probably just too many regular tasks to handle the load, and changing to cron would probably not help.

  • Bob Hope (unregistered) in reply to TheGingerDog

    I came to post the same trick. I suppose I 'discovered' this independently; I've never seen it anywhere before. I usually see:

    ps | grep process | grep -v grep | ...

    I also wonder if the author was counting on the grep match, and the warning occurs if a process with that name is ever caught running :)

    I would use it with a process name of $0 in that case...

  • (cs)

    This is the same way I read The Daily WTF:

    Check the website. Nothing interesting. Go back to sleep.

  • Chaarmann (unregistered)

    Instead of "$i = 1", he should have used ":".

  • QJo (unregistered)

    "Bourne"? How does that work?

    From Wiktionary:

    Noun

    bourne (countable and uncountable, plural bournes)

    1. (countable, archaic) A boundary.

    2. (archaic) A goal or destination.

    3. (countable) A stream or brook in which water flows only seasonally.

    Exactly which of the above usages is intended here?

  • (cs) in reply to QJo
    QJo:
    "Bourne"? How does that work?

    From Wiktionary:

    Noun

    bourne (countable and uncountable, plural bournes)

    1. (countable, archaic) A boundary.

    2. (archaic) A goal or destination.

    3. (countable) A stream or brook in which water flows only seasonally.

    Exactly which of the above usages is intended here?

    Maybe The Bourne shell

  • (cs) in reply to QJo
    QJo:
    "Bourne"? How does that work? ... Exactly which of the above usages is intended here?

    http://en.wikipedia.org/wiki/Jason_Bourne

    <grin>
  • Jay (unregistered) in reply to QJo

    Bourne refers to a unix shell.

    This one: http://en.wikipedia.org/wiki/Bash_%28Unix_shell%29 Or possibly this: http://en.wikipedia.org/wiki/Bourne_shell

  • Smug Unix User (unregistered)

    Always code to the Principle of least astonishment. The right thing to do is make a daemon if the task requires it. If a daemon is deemed to be too heavy handed then a cron job might be the next place to look.

  • tim (unregistered)

    Is it just me or did this comments section turn into some kind of amateur unix sysadmin pissing contest?

  • (cs) in reply to QJo
    QJo:
    "Bourne"? How does that work?

    From Wiktionary:

    Noun

    bourne (countable and uncountable, plural bournes)

    1. (countable, archaic) A boundary.

    2. (archaic) A goal or destination.

    3. (countable) A stream or brook in which water flows only seasonally.

    Exactly which of the above usages is intended here?

    Bourne-again shell. Something commonly known as bash or the "bash shell" (which would actually make it the "Bourne-again shell shell"). Or it could be the Bourne shell (sh) it's based on. At any rate - it's definitely a shell.

  • Jon (unregistered) in reply to Chaarmann

    I always just do this:

    while [ 1 ]; do

    Captcha: persto

    persist. How fitting for a discussion on infinite loops

  • (cs) in reply to cron
    cron:
    As others have pointed out, the sleep builtin should not cause performance problems. The server had probably just too many regular tasks to handle the load, and changing to cron would probably not help.
    It might, if the problem is too many processes, because by running the real work from cron instead of from a script, you eliminate one (maybe two, depending on whether sleep is a built-in or external command) processes from the process table.

    This can be significant - in a previous life I worked on large(*) Solaris-on-Sparc iron where the process table on the dev machines was limited to 8000-ish processes, and the normal load when the machine was up and running but not doing much was around 7000...

    (*) The main production machines had 192 processors and oooooooodles of memory. The physical dev machines were partitioned into multiple logical machines, and the logical machine I normally used was "only" 32 processors.

  • QJo (unregistered) in reply to Bobby Tables
    Bobby Tables:
    QJo:
    "Bourne"? How does that work?

    From Wiktionary:

    Noun

    bourne (countable and uncountable, plural bournes)

    1. (countable, archaic) A boundary.

    2. (archaic) A goal or destination.

    3. (countable) A stream or brook in which water flows only seasonally.

    Exactly which of the above usages is intended here?

    Bourne-again shell. Something commonly known as bash or the "bash shell" (which would actually make it the "Bourne-again shell shell"). Or it could be the Bourne shell (sh) it's based on. At any rate - it's definitely a shell.

    Brilliant, thanks. I am suitably enlightened.

  • nitePhyyre (unregistered)

    So what do you do if you have a long running script, and want to re-run it every hour after it is finished?

    With this sleep method: run the script, it takes 90 min, then an hour later you run it again.

    With cron: You set it to run every hour. The first hour runs the script. Then with 30 min left to go before the script finishes, you start another instance? That's no good. Do you just set the script to run every 150 min?

    What's the best option here?
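One frequently used guard against the overlap described here is a lock that the job takes when it starts. A minimal sketch, assuming a hypothetical helper `run_exclusive` and a lock path chosen by the caller; `mkdir` is used because creating a directory either succeeds or fails as a single step, and the return value 75 is an arbitrary choice for "skipped".

```shell
#!/bin/sh
# run_exclusive LOCKDIR CMD [ARGS...]: run CMD only if no other run holds
# LOCKDIR. Returns CMD's exit status, or 75 if the run was skipped.
run_exclusive() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"          # release the lock for the next run
        return "$status"
    else
        echo "previous run still active, skipping" >&2
        return 75
    fi
}

# Hypothetical use from an hourly cron job:
#   run_exclusive /tmp/long_job.lock /usr/local/bin/long_job.sh
```

With this guard, the hourly cron entry simply skips the slots that fall inside a 90-minute run. A caveat of the sketch: if the job is killed before `rmdir` runs, the stale lock has to be removed by hand.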

  • (cs) in reply to TheGingerDog
    TheGingerDog:
    You can't really do :

    ps -ef | grep some_process | wc -l

    as the grep will normally match itself - so you'll always get one match.

    You can work around this using :

    ps -ef | grep [s]ome_process | wc -l

    should work as intended....

    Or you could just use pgrep...

  • (cs) in reply to ratchet freak
    ratchet freak:
    YARCI: Yet Another Resource Conscious Idiot

    when something takes too many resources and you wish to avoid it then you should ensure that your alternative doesn't take even more resources

    Yet another example of the inner-platform effect.

  • Andrew (unregistered)

    On my system at least, running a sleep takes up about 500 bytes of memory. You'd run out of PIDs long before you had any memory or slowness issues from too many sleeps.

  • (cs) in reply to Taco
    Taco:
    And again an 'expert' is judged on a single item. Boring
    That "expert" can't be very good at time management. I'm lazy. I work on Windows systems. I like to schedule stuff with "at". Why recreate a poor version of a scheduler? Powershell is making it easier, but *nix users should already be experts at replacing everything with a very small shell script.
  • (cs) in reply to nitePhyyre
    nitePhyyre:
    So what do you do if you have a long running script, and want to re-run it every hour after it is finished?
    If grepping the output of ps is taking over an hour, you've got bigger problems than that.
  • Chelloveck (unregistered) in reply to RFox
    RFox:
    Of course if all of these monitor jobs got started when the system booted they'd all land on it about the same time. With cron presumably you'd skew the start times a bit.

    Actually, it's probably the other way around. You could set up cron so everything started at a different time, but in practice you typically see all the jobs started with "0 * * * *" or "@hourly" or thrown into /etc/cron.hourly (depending on which flavor of cron you're using). These all start jobs at minute 0 of each hour.

    A loop like "while [ 1 ]; do something; sleep 3600; done" takes an hour plus the duration of "something" to run. They'll all initially start at the same time, but assuming each of the tasks has a different "something" they'll each have a slightly different duration and will get out of sync fairly quickly. This is especially true on a busy system, since that will affect the duration of the "something", forcing the jobs farther out of sync. It's kind of self-balancing that way.
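The drift is easy to put numbers on. A small sketch, assuming two hypothetical jobs whose "something" takes a constant 5 and 20 seconds; `nth_start` is an illustrative helper, not anything from the article:

```shell
#!/bin/sh
# nth_start N DURATION: seconds after launch at which iteration N (counting
# from 0) of "something; sleep 3600" begins, if "something" takes DURATION s.
nth_start() { echo $(( $1 * (3600 + $2) )); }

# Gap between the two jobs' start times after 24 iterations (roughly a day).
echo $(( $(nth_start 24 20) - $(nth_start 24 5) ))
```

Under these assumptions the two jobs, which started together, are 360 seconds apart after 24 iterations.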

  • ¯\(°_o)/¯ I DUNNO LOL (unregistered) in reply to tim
    tim:
    Is it just me or did this comments section turn into some kind of amateur unix sysadmin pissing contest?
    As opposed to the usual "let me show you how to write that trivial code"* amateur programmer pissing contests that we have?

    *non-ironically as in writing a Fizz-Buzz, rather than for humor, such as trying to introduce cliches or different WTFs.

  • coder (unregistered) in reply to Martin

    if you really want to get pedantic, while [ 1 ]; do will work just as well, why even invoke a comparison when all you want is 'true'

  • (cs) in reply to tim
    tim:
    Is it just me or did this comments section turn into some kind of amateur unix sysadmin pissing contest?
    It's just you. Your description of a perfectly moderate and even-tempered discussion of the issues raised by today's WTF is needlessly hyperbolic.
  • BlueBearr (unregistered)

    The fact that nearly every article on this site spawns a host of "I don't see the problem", "TRWTF is", and "here is how to really solve this problem"-type comments tells me just how tough this shit is.

    CAPTCHA: Tego, n.: Lego bricks made out of teak.

  • Dan Mercer (unregistered)

    The problem might have been in the ps command. In the 90's I worked on hp-ux machines in which the man page warned that using ps would lock the process queue while it ran. Users were loudly complaining about slow start up times so I poked around. I found a script that ran every second and did a 'ps | grep something | grep -v grep' despite the fact that the process it was checking had a known pid. Changed that to a kill -0 and the bottleneck disappeared.
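The `kill -0` check mentioned here sends no signal at all; it only asks whether the PID exists and may be signalled. A minimal sketch, assuming the PID is already known (for example from a pid file); `is_alive` is a hypothetical helper name:

```shell
#!/bin/sh
# is_alive PID: true if a process with this PID exists and we are allowed to
# signal it. Signal 0 performs the check without delivering anything.
is_alive() { kill -0 "$1" 2>/dev/null; }

if is_alive "$$"; then
    echo "this shell (PID $$) is alive"
fi
```

Unlike `ps | grep`, this never walks the whole process list.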

  • (cs) in reply to Geoff
    Geoff:
    DaveK:
    Shouldn't matter. Sleeping processes get swapped out when the memory is needed, swapped back in once an hour; that's not a whole lot of I/O load. Plus all the read-only code pages for the shell and sleep executables are loaded only once and shared between all running instances. I'd want to see some more detailed CPU and memory usage data before jumping to conclusions.
    <<snip>>

    Under this scheme I don't know how you determine which script has died if there is a problem (somehow I doubt there is rigorous logging). I would assume the thing is more brittle, because if there are temporary errors the scripts will probably just die and not restart on their own.

    Duh! You write a script with a loop and a one-hour sleep that makes sure all the other scripts are running correctly!

  • Joe (unregistered)

    The other reason to use this pattern (sleep in an infinite loop) is that you can instantly trigger a check that everything's running:

    killall sleep

    (Note that on some non-Linux platforms, killall does something very different and unfriendly)

  • Anonymous') OR 1=1 (unregistered) in reply to QJo
    QJo:
    Brilliant, thanks. I am suitably enlightened.

    I think you mean "brillant".

  • (cs) in reply to coder
    coder:
    if you really want to get pedantic, while [ 1 ]; do will work just as well, why even invoke a comparison when all you want is 'true'
    You can also use “while [ 0 ]; ” which does the same thing and is more confusing.

    The best way to do this is probably “while true ;”, or “while : ;”.
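The reason `[ 0 ]` and `[ 1 ]` behave the same is that `[` with a single argument only tests whether that string is non-empty. A short sketch; `count_to` is a hypothetical helper showing the `:` form with an explicit exit:

```shell
#!/bin/sh
# [ STRING ] is true for any non-empty string, so "0" and "1" behave the same.
for s in 0 1 ""; do
    if [ "$s" ]; then echo "[ '$s' ] is true"; else echo "[ '$s' ] is false"; fi
done

# count_to N: an "infinite" loop on the : builtin, ended explicitly with break.
count_to() {
    n=0
    while :; do
        n=$((n + 1))
        [ "$n" -ge "$1" ] && break
    done
    echo "$n"
}
count_to 3
```

Only the empty string comes out false; the digits carry no numeric meaning here.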

  • (cs)

    If we were meant to write infinite loops, there would be a native language construct for them.

  • Peter Michael Green (unregistered)

    I doubt the "couple of hundred instances of sleep" were the real problem. It's not exactly uncommon on servers running things like apache to have a few hundred sleeping processes and a sleeping process should be pretty low overhead.

    More likely the problem was lots of random badly written scripts chewing up resources.

  • QJo (unregistered) in reply to Anonymous') OR 1=1
    Anonymous') OR 1=1:
    QJo:
    Brilliant, thanks. I am suitably enlightened.

    I think you mean "brillant".

    Yes, I can't argue there, I do indeed mean "brillant".

  • itn (unregistered)

    TRWTF is

    Scripts doing infinite loops and then sleeping for an hour were often the norm.

    If you were doing something after the loop, it's not that infinite, now is it?

    Of course in this, you're doing something in the loop...

  • (cs) in reply to coder
    coder:
    if you really want to get pedantic, while [ 1 ]; do will work just as well, why even invoke a comparison when all you want is 'true'
    If all you want is true, why do you additionally run test?

    Your valid options are:

    while true; do ...
    while false; do ...
    while [ ! -e $file ]; do

  • (cs) in reply to TheCPUWizard
    TheCPUWizard:
    QJo:
    "Bourne"? How does that work? ... Exactly which of the above usages is intended here?

    http://en.wikipedia.org/wiki/Jason_Bourne

    <grin>

    I am also thinking of JSON bourne that super spy who is no match for James Bond.

  • a real sysadmin (unregistered) in reply to nitePhyyre
    nitePhyyre:
    So what do you do if you have a long running script, and want to re-run it every hour after it is finished?

    With this sleep method: run the script, it takes 90 min, then an hour later you run it again.

    With cron: You set it to run every hour. The first hour runs the script. Then with 30 min left to go before the script finishes, you start another instance? That's no good. Do you just set the script to run every 150 min?

    What's the best option here?

    man at
