• (disco)

    For some reason I'm compelled to start singing

    99 handles of files on the wall, 99 handles of files … Take one down, pass it around, 98 handles of files on the wall …


    Filed under: Raymond Chen's blog is full of such global solutions to a local problem, That's a link, not a tag, You're welcome

  • (disco)

    Unfortunately, it’s sometimes necessary to close all opened file descriptors in a process (when trying to exec(3) another one in a clean environment), and AFAIK this method is the only truly portable one...

    For instance: https://github.com/twisted/twisted/blob/d7c231ce36efca2a8d97fec602d626db63182ac8/twisted/internet/process.py#L554 (note that it’s a fallback method, not used on most systems)

    TRWTF is the 263 FD limit. There is no situation that could possibly require that many file descriptors being open simultaneously. Not to mention that select(2)-based programs will start freaking out at 1024 FDs...
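
    For readers who have not seen the brute-force fallback being discussed, here is a minimal sketch of the idea. It is not the Twisted code linked above; the helper name `close_fds_in_range` is made up for illustration, and the caller is assumed to supply the range.

```python
import errno
import os


def close_fds_in_range(low, high):
    """Try to close every descriptor number in [low, high).

    Returns how many of them were actually open.
    """
    closed = 0
    for fd in range(low, high):
        try:
            os.close(fd)
        except OSError as exc:
            # EBADF only means this number was not an open descriptor.
            if exc.errno != errno.EBADF:
                raise
        else:
            closed += 1
    return closed
```

    In a forked child one would typically call this with `low=3` so that stdin, stdout and stderr survive. The standard library's `os.closerange(low, high)` does the same job and ignores errors.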

  • (disco)

    Of course, there's another way of ensuring you don't leave loads of file handles open: close them when you're done with them.

    But then that would be Doing It Right™.

  • (disco)

    Which is what the python with statement is for. You create a context within which a file is open, and the interpreter closes the file as soon as you leave said context. Simples.
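
    A minimal sketch of that pattern, for anyone who has not used it. The function name `write_and_read_back` and the temporary file are only there to make the example self-contained.

```python
import os
import tempfile


def write_and_read_back(text):
    """Write text to a temp file and read it back, using `with` both times."""
    fd, path = tempfile.mkstemp()
    os.close(fd)  # mkstemp hands back a raw descriptor; we reopen by path
    try:
        with open(path, "w") as out:
            out.write(text)
        # Leaving the block closed `out`, even if write() had raised.
        with open(path) as src:
            data = src.read()
        return data, out.closed, src.closed
    finally:
        os.remove(path)
```

    Both file objects report `closed` as true once their block has ended, with no explicit `close()` call anywhere.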

  • (disco) in reply to theheadofabroom

    Yep. just like C#'s using statement.

    If you program in either language and don't know about those statements don't code another line until you read up on them (and if you use a different language please take a moment to see if your language of choice has a similar construct)

    Seriously. They'll change your life and it won't take very long to learn about them.

  • (disco) in reply to VinDuv

    It's not select that will freak out; the problem is the size of the structure allocated to hold the FD bitsets. FD_SET does not check its bounds, and neither do the related macros, so it's easy to corrupt memory by writing past the end of the allocated structure.

    I use select with an increased fd range (up to 9000); it's just a matter of having permission from the system to open that many file descriptors, and to resize in the code that damned struct.
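
    An alternative that sidesteps the fixed-size bitset is `poll`, which takes a list of descriptors rather than a bitmap of FD_SETSIZE bits. A minimal Python sketch, assuming a POSIX system where `select.poll` is available; the helper name `readable_fds` is made up for illustration.

```python
import os
import select


def readable_fds(fds, timeout_ms=0):
    """Return the descriptors in `fds` that have data ready to read."""
    poller = select.poll()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    ready = poller.poll(timeout_ms)  # list of (fd, event mask) pairs
    return sorted(fd for fd, events in ready if events & select.POLLIN)


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"x")
    print(readable_fds([r], 1000))
    os.close(r)
    os.close(w)
```

    Because the descriptors are passed as values rather than as bit positions, a high descriptor number cannot overflow anything.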

  • (disco) in reply to Jerome_Grimbert
    Jerome_Grimbert:
    and to resize in the code that damned struct.

    o_Ô Does that really work in practice?

  • (disco) in reply to accalia
    accalia:
    just like C#'s using statement.

    Or Java's try-with-resources

  • (disco) in reply to Yamikuronue

    huh.... now that is a nice bit of syntactic sugar.

    of course i haven't done any Java since java6SE so i would have missed that.

  • (disco) in reply to accalia

    Yeah, people like to judge Java by Java 6, but 7 and 8 each bring quite nice incremental improvements.

  • (disco) in reply to Yamikuronue

    undoubtedly. and if i ever do Java again i'll make full use of them (i hear java 8 introduced lambdas. fracking finally!)

  • (disco) in reply to Yamikuronue

    I do java all the time, and we moved to 7 last year. I still haven't started using anything new. :frowning: We should hopefully move to 8 soon, but have to wait for the framework to catch up. :cry:

  • (disco) in reply to Yamikuronue
    Yamikuronue:
    Yeah, people like to judge Java by Java 6, but 7 and 8 each bring quite nice incremental improvements.

    Every release gets it closer to being 1/3rd as good as .net!

    Except the JVM, that's still just as broken as ever.

  • (disco) in reply to blakeyrat
    blakeyrat:
    Except the JVM, that's still just as broken as ever.

    Huh. I recall more often encountering the sentiment "Yeah, Java is pretty bad, but the JVM is actually a solid piece of tech."

    I haven't developed against either one since about 2002 though, and that was classroom projects. So I have no idea.

  • (disco) in reply to kilroo
    kilroo:
    Huh. I recall more often encountering the sentiment "Yeah, Java is pretty bad, but the JVM is actually a solid piece of tech."

    Right; until you ask for the local user folder in Windows, then it give you wrong bullshit. Which is why virtually all (if not all) Java GUI programs are broken on Windows-- they inherit Oracle's bugs in addition to their own.

  • (disco) in reply to VinDuv
    VinDuv:
    Does that really work in practice?

    Well, you have to declare your own structure, or edit the system include files... again and again. Overriding the default size of structure with compiler flags is also possible, but far less portable. Editing the system include file(s) is of course a WTF.

  • (disco) in reply to accalia
    accalia:
    Yep. just like C#'s using statement.
    Assuming `Dispose()` has been implemented correctly, of course ;)
    accalia:
    They'll change your life and it won't take very long to learn about them.
    QFT
    Yamikuronue:
    Or Java's try-with-resources
    Hmm… been a while since I last used Java; evidently, it's moved on somewhat.
    accalia:
    i hear java 8 introduced lambdas. fracking finally!
    QFT
  • (disco) in reply to RaceProUK
    RaceProUK:
    Assuming Dispose() has been implemented correctly, of course

    true enough, but if your third party library messes that up there's nothing you can do about it.

    if it's your own library... well then you can fix it. :-P

  • (disco) in reply to blakeyrat
    blakeyrat:
    Right; until you ask for the local user folder in Windows, then it give you wrong bullshit. Which is why virtually all (if not all) Java GUI programs are broken on Windows-- they inherit Oracle's bugs in addition to their own.

    Good point. Most of the contexts in which I've come across that particular sentiment have been server-based (which doesn't necessarily mean such a thing is completely irrelevant, but does increase the likelihood that no one was thinking about it).

  • (disco) in reply to blakeyrat
    blakeyrat:
    Right; until you ask for the local user folder in Windows, then it give you wrong bullshit.
    I may be (am) crazy, but I'm sure there's a suitable WinAPI to get the local user folder; I guess Oracle have their own WTF solution.
  • (disco) in reply to RaceProUK
    RaceProUK:
    I may be (am) crazy, but I'm sure there's a suitable WinAPI to get the local user folder; I guess Oracle have their own WTF solution.

    http://forums.thedailywtf.com/forums/t/24771.aspx

  • (disco) in reply to blakeyrat
    blakeyrat:
    http://forums.thedailywtf.com/forums/t/24771.aspx
    You probably don't agree that there are some good pieces of OSS out there, but one thing I think we *will* agree on is far too much of it is written by fucktards who can't be arsed to learn how Windows actually works.
  • (disco) in reply to accalia

    What about when first party libraries mess up, like Microsoft did with WCF?

  • (disco) in reply to Medinoc
    Medinoc:
    What about when *first party* libraries mess up, like Microsoft did with WCF?
    Throwing exceptions in `Dispose()`? That's enough to make my quills curl…
  • (disco) in reply to Medinoc

    ....

    and there's the reason i never used WCF and never plan to.

  • (disco) in reply to VinDuv
    VinDuv:
    TRWTF is the 263 FD limit. There is no situation that could possibly require that many file descriptors being open simultaneously.

    But there's also no reason to arbitrarily limit file descriptors or any other resource. No, enabling a hack like the one in the article is not a valid reason, that would be like limiting a network to 256 addresses so you can scan all of them more easily to find new printers.

  • (disco) in reply to VinDuv

    Also: if you really need to close all open files for some reason, aren't there still better workarounds? For example, you could globally override the open function to keep a list of all open files somewhere, then just consult it to close them all. Or are there other methods that open files used there?
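
    A rough sketch of what such bookkeeping could look like in Python. It wraps `open` in a registry object rather than overriding it globally, and the class name `OpenFileRegistry` is hypothetical. It only sees files opened through it, which is the weakness of the approach.

```python
import os


class OpenFileRegistry:
    """Remembers every file opened through it so they can be closed in bulk."""

    def __init__(self):
        self._files = []

    def open(self, *args, **kwargs):
        # Inside the method, the bare name `open` is still the builtin.
        f = open(*args, **kwargs)
        self._files.append(f)
        return f

    def close_all(self):
        """Close whatever is still open; return how many were closed here."""
        count = 0
        for f in self._files:
            if not f.closed:
                f.close()
                count += 1
        self._files.clear()
        return count


if __name__ == "__main__":
    registry = OpenFileRegistry()
    registry.open(os.devnull)
    print(registry.close_all())
```

    Sockets, pipes and anything opened by a third-party library bypass the registry entirely.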

  • (disco) in reply to anonymous234

    But there's also no reason to arbitrarily limit file descriptors or any other resource.

    Oh yes there is!

    1. Let's say you have a process that opens up 400,000 file descriptors and creates so much disk buffer activity that an admin can't even log into the system to correct the problem. And no, that doesn't have to be a malicious application, just one where somebody keeps opening up files and never closing them, and everything appears to work fine until the system becomes unresponsive 3 months down the line.
    2. The 256-limit means you can store which file descriptor you're accessing in 1 byte, which I'm guessing is demanded by hard drive firmware. If you bump it up to 2 bytes, you'll still have a hard limit of 65,535.

    Operating systems are designed with both the limits of hardware and the stupidity of programmers in mind.

  • (disco) in reply to EatenByAGrue
    EatenByAGrue:
    The 256-limit means you can store which file descriptor you're accessing in 1 byte, which I'm guessing is demanded by hard drive firmware. If you bump it up to 2 bytes, you'll still have a hard limit of 65,535.

    Most of the time, the limit on the number of simultaneously open files isn't something you're going to get close to in desktop code. Servers can get quite a bit more open at once, but it's usually easier to split things into multiple processes than to make resource cleaning perfect at the hundreds-of-thousands-open-at-once level.

    I've only rarely ever got close to the 1024 limit of the default fd_set; that's really quite a lot…

  • (disco) in reply to anonymous234
    anonymous234:
    For example, you could globally override the open function to keep a list of all open files somewhere
    The use case I’m thinking of is a library wanting to start a helper process in a clean environment. It needs to close all FDs opened by the host process after `fork`ing, without having prior knowledge of these FDs. Hooking `open` to do this would be a bit too much invasive, wouldn’t it?

    The correct solution would be to use posix_spawn with a non-NULL file_actions, but I’m not really sure how portable it is.
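
    For what it is worth, Python exposes this as `os.posix_spawn` (3.8 and later, POSIX only). A minimal sketch, assuming `sys.executable` points at a usable interpreter; the probe script and the helper name `spawn_probe` are made up for the demonstration.

```python
import os
import sys

# Child program: exit 0 if the descriptor named in argv[1] is closed, 1 if open.
_PROBE = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(0)\n"
    "sys.exit(1)\n"
)


def spawn_probe(fd, close_in_child):
    """Spawn the probe, optionally asking posix_spawn to close `fd` first."""
    argv = [sys.executable, "-c", _PROBE, str(fd)]
    file_actions = [(os.POSIX_SPAWN_CLOSE, fd)] if close_in_child else []
    pid = os.posix_spawn(sys.executable, argv, os.environ,
                         file_actions=file_actions)
    _, status = os.waitpid(pid, 0)
    # Assumes the child exited normally rather than dying from a signal.
    return os.WEXITSTATUS(status)
```

    Note that the caller still has to know which descriptors to list in `file_actions`, so this does not remove the need to discover them.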

  • (disco) in reply to VinDuv
    VinDuv:
    The use case I’m thinking of is a library wanting to start a helper process in a clean environment. It needs to close all FDs opened by the host process after forking, without having prior knowledge of these FDs. Hooking open to do this would be a bit too much invasive, wouldn’t it?

    Run the helper process via the exec and most of those FDs will be auto-closed for you (since that's a feature that's enabled by default).

  • (disco) in reply to anonymous234
    anonymous234:
    But there's also no reason to arbitrarily limit file descriptors or any other resource. No, enabling a hack like the one in the article is not a valid reason, that would be like limiting a network to 256 addresses so you can scan all of them more easily to find new printers

    Oddly, and a little OT, I have spent time in the past trying to get companies to pursue a rational strategy on printer IPs; the scattergun approach meant that in one organisation they had over 3000 printers believed to be in use on their network, but only 450 of them could be found by IT. This is not good when you want to drive out a load of updates from a central location.

  • (disco) in reply to dkf
    dkf:
    Run the helper process via the `exec` and most of those FDs will be auto-closed for you (since that's a feature that's enabled by default).

    O_CLOEXEC isn’t set by default by most FD-creating functions (open, fopen, mq_open, timerfd_create, socket...) and I’m pretty sure most people don’t bother setting it...

  • (disco) in reply to VinDuv

    O_CLOEXEC isn't available except on linux, and the alternative is not thread safe.

    I've seen code like that for earlier versions of python when subprocess.Popen wasn't available and/or didn't have close_fds=True

    But I don't think I've seen anything that tried to close zillions of file handles
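
    A small sketch of what `close_fds` changes on a POSIX system with Python 3, where it has defaulted to true since 3.2. The probe script and the helper name `fd_state_in_child` are made up for illustration.

```python
import os
import subprocess
import sys

# Child program: report whether the descriptor named in argv[1] is open.
_PROBE = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    print('closed')\n"
    "else:\n"
    "    print('open')\n"
)


def fd_state_in_child(fd, close_fds):
    """Run the probe in a child process and return 'open' or 'closed'."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, str(fd)],
        close_fds=close_fds,
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

    With `close_fds=False` an inheritable descriptor leaks into the child; with `close_fds=True` everything except 0, 1, 2 and anything named in `pass_fds` is closed before the child runs.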

  • (disco) in reply to blakeyrat
    blakeyrat:
    Right; until you ask for the local user folder in Windows, then it give you wrong bullshit. Which is why virtually all (if not all) Java GUI programs are broken on Windows-- they inherit Oracle's bugs in addition to their own.

    Is this where I should insert the javascript isn't DOM rant?

  • (disco)

    ...This triggered a cascading series of Bad file descriptor errors, and every time the function ran, it would grind the system to a halt while it tried to close every possible file ever opened.

    *Envisions similar process used to close every comment/topic in Discourse & resulting effects...*

  • (disco) in reply to Yamikuronue

    Unfortunately us Android schmucks are still on Java 6. :cry:

  • (disco) in reply to EatenByAGrue

    Most operating systems allow setting per-user limits on that kind of stuff. E.g. RHEL 5 limits the number of open files to 4096 per process for regular users.

    Hardcoding "reasonable" limits is terrible, just look at the IPv4 address shortage, they will soon start growing in value and become traded goods like oil or bitcoins! Oh, and TCP connections also count as file descriptors. A properly written push server (or NAT router) can handle tens or even hundreds of thousands of connections and doesn't complain.

    Or for example stupid decisions like "150 characters is more than enough for a company name" on a travel passport application site - government organization names can run 200+ characters with all the repeating "state" "federal" "government" "municipal" "ministry" qualifiers. Had to "hack" the HTML to submit my passport application because using abbreviations or the commonly known short name for my university caused the application to be rejected.

    Some OSes (like HP-UX) allow setting limits on almost any system resource, such as per-process CPU cores and memory. Very useful when your app starts leaking memory and crashes at 2GB rather than consuming all RAM and swap and slowing down the whole server.
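
    On POSIX systems the per-process open-file limit mentioned above can be inspected from Python with the `resource` module. A minimal read-only sketch; the helper name `nofile_limits` is made up for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    def show(value):
        # RLIM_INFINITY is the sentinel for "no limit".
        return "unlimited" if value == resource.RLIM_INFINITY else value

    return show(soft), show(hard)


if __name__ == "__main__":
    print(nofile_limits())
```

    The soft limit is the one actually enforced; an unprivileged process may raise it up to the hard limit with `resource.setrlimit`.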

  • (disco) in reply to zlogic
    zlogic:
    Hardcoding "reasonable" limits is terrible
    Matches:
    I present to you: How NOT to limit text length!

    http://what.thedailywtf.com/users/matches/activity

    :wtf:

  • (disco) in reply to VinDuv
    VinDuv:
    The use case I’m thinking of is a library wanting to start a helper process in a clean environment. It needs to close all FDs opened by the host process after forking, without having prior knowledge of these FDs. Hooking open to do this would be a bit too much invasive, wouldn’t it?
    [pjh@sofa discourse]$ ls /proc/self/fd -l | anonymize 
    total 0
    lrwx------ 1 pjh pjh 64 Feb 23 20:38 0 -> /dev/pts/9
    l-wx------ 1 pjh pjh 64 Feb 23 20:38 1 -> pipe:[12616147]
    lrwx------ 1 pjh pjh 64 Feb 23 20:38 2 -> /dev/pts/9
    lr-x------ 1 pjh pjh 64 Feb 23 20:38 3 -> /proc/2624/fd/
    [pjh@sofa discourse]$ 
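
    The same listing can be read from inside a process. A minimal Python sketch, assuming a Linux-style `/proc/self/fd` as in the shell session above; the helper name `list_open_fds` is made up for illustration.

```python
import os


def list_open_fds(fd_dir="/proc/self/fd"):
    """Return the sorted descriptor numbers currently open in this process."""
    fds = []
    for name in os.listdir(fd_dir):
        fd = int(name)
        try:
            os.fstat(fd)
        except OSError:
            # This was the descriptor listdir itself used to read the
            # directory; it is already closed again by now.
            continue
        fds.append(fd)
    return fds if fds == sorted(fds) else sorted(fds)


if __name__ == "__main__":
    print(list_open_fds())
```

    A forked child could close exactly these descriptors instead of probing every possible number, at the cost of depending on `/proc` being mounted.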
    
  • (disco) in reply to spadgas
    spadgas:
    O_CLOEXEC isn't available except on linux

    It's in the POSIX spec, so no. I don't know if the semantics of it (or rather of using fcntl with F_SETFD to set the FD_CLOEXEC bit) have been screwed around with, as the results of searching were a bit confusing, but the spec itself is clear.

    Windows is the platform that doesn't have it. It has the equivalent concept though.
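
    A minimal sketch of the `fcntl` route from Python, assuming a POSIX system. The helper names `set_cloexec` and `has_cloexec` are made up for illustration.

```python
import fcntl
import os


def set_cloexec(fd, enabled=True):
    """Set or clear the close-on-exec flag on an open descriptor."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if enabled:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def has_cloexec(fd):
    """Return True if the descriptor will be closed across exec."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


if __name__ == "__main__":
    r, w = os.pipe()
    set_cloexec(r, False)
    print(has_cloexec(r), os.get_inheritable(r))
    os.close(r)
    os.close(w)
```

    Setting the flag in a separate step after creating the descriptor leaves a window in which another thread can fork and exec; passing `O_CLOEXEC` at creation time closes that window.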

  • (disco) in reply to spadgas
    spadgas:
    the alternative is not thread safe.

    I'm not sure how exec() is thread-safe to begin with, you never know what fired and what didn't in other threads.

    Anyway you typically exec() after a fork(), so there are no other threads around to begin with. If you really need just exec(), I'd advise to either close down the other threads normally, or fork()/exit() the parent and exec() in the child to get rid of the other threads.

    Unless I'm missing something...

  • (disco) in reply to blakeyrat
    blakeyrat:
    Right; until you ask for the local user folder in Windows, then it give you wrong bullshit.

    Actually (yes I did do that), that's a problem with the standard platform library, not the JVM. I don't know for sure why Sun and/or Oracle haven't done anything about it in the past fifteen years, but I would guess they don't see much value for them in Java software on Windows desktops and instead choose to focus on the server.

  • (disco) in reply to PleegWat
    PleegWat:
    Anyway you typically exec() after a fork(), so there are no other threads around to begin with.

    But a fork() is even more problematic, because you have no idea what the state of the locks is. The state of the locks you know about can be guaranteed (lock them all before forking) but that doesn't help with whatever other locks there are about, such as in the standard library…

  • (disco)

    Does anyone notice that that routine would close the stdout (1) and stderr (2)?

  • (disco) in reply to leeyc0

    Yes, as a matter of fact: I noticed that it closes everything. No matter about state.

    One might as well go on an object scan in a C++ program and carefully release each object to save memory, without regard to whether the program is still using that object.

    The real world equivalent would be removing bolts from a flying commercial jet, with the goal of saving metal.

    Gives me the willies to think about any of them.

  • (disco) in reply to Yamikuronue
    Yamikuronue:
    Yeah, people like to judge Java by Java 6, but 7 and 8 each bring quite nice incremental improvements.

    And a bonus free copy of the Ask Toolbar just for choosing to update today.

  • (disco) in reply to Yamikuronue

    Pretty soon it will have proper stack unwinding and nearly be as good as C++.

  • (disco)

    I saw a similar situation at my work recently. They did the equivalent of select_all; delete for a test case. I had to remind them it is a shared system and they'd likely be deleting things that don't belong to them. Please keep track of things you create and delete them. I don't care if you make some "with" concept or anything else that works with regards to exceptions.

  • (disco) in reply to VinDuv
    VinDuv:
    TRWTF is the 263 FD limit. There is no situation that could possibly require that many file descriptors being open simultaneously. Not to mention that select(2)-based programs will start freaking out at 1024 FDs...

    No, TRWTF is posix and its assumptions about fds.

Leave a comment on “A Small Closing”
