in reply to old file descriptors not being cleaned up

Thanks for the replies guys. Here are my answers.

@anon: I monitor the process from another terminal. I run ps to get the pid and then do ls /proc/PID/fd and watch the list grow until it crashes at 256.

@mr_mischief: I explicitly close them after sysread returns 0. That is in the elsif block where $len == 0. The close() returns true. Also the read is from a pipe to the ping command, not a file.

@Khen1950fx: I'm not using seek.
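
In case it helps, the shape of that read/close logic is roughly this (paraphrased, since I can't paste the whole script here; variable names are approximate):

    my $len = sysread($fh, my $buf, 4096);
    if ($len) {
        $data .= $buf;     # still collecting this ping's output
    }
    elsif (defined $len && $len == 0) {
        # EOF: the ping process has finished, so close its pipe here.
        # This is the close that returns true when I check it.
        close($fh);
    }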

Re^2: old file descriptors not being cleaned up
by mr_mischief (Monsignor) on Dec 12, 2010 at 18:48 UTC

    Um, no. Apparently you're not following me here.

    You open all of the file handles (that's what they are called even if it's a pipe open) in a for loop. That loop executes before the while loop below it. That while loop contains the statement in which you attempt to close a file handle.

    They are all opened before you close the first one, as I already told you. The one loop executes before the other, you see. Explicitly closing them will not keep them from all being opened at once unless you close them in the same loop where you open them. The while loop won't go back in time to close files for you. I know the Perl 5 team is good, but they haven't mastered time travel into the past just yet. I doubt you have the hardware necessary anyway.
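
    To make this concrete, here is a stripped-down sketch of the shape I'm describing (not your actual code; @hosts, the ping arguments, and the counts are stand-ins):

        #!/usr/bin/perl
        use strict;
        use warnings;

        my @hosts = map { "host$_" } 1 .. 300;   # more hosts than a 256 ulimit
        my @pings;

        # This loop runs to completion first, so every pipe is open at
        # once. Somewhere around the 250th host, open fails with
        # "too many open files" -- exactly the crash you describe.
        for my $host (@hosts) {
            open(my $fh, '-|', 'ping', $host)
                or die "pipe open failed for $host: $!";
            push @pings, $fh;
        }

        # Only now, after every handle already exists, does any close run.
        while (my $fh = shift @pings) {
            my @output = <$fh>;   # drain the pipe to EOF
            close($fh) or warn "close failed: $! (wait status $?)";
        }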

    The control flow in your program must follow the rules of control flow in the language you are using to implement the program.

    You have to think of your program as executing over time. Making two system calls that relate to one another does not mean they happen in the same part of the program or at nearly the same time. You must make the calls in the part of the program where they are needed. The language is only doing what you ask, not guessing what you meant. I know Perl is designed to make small choices for you based on context when things are left implicit. It won't reorder your explicit control flow across multiple lines and multiple blocks, though.

    If you want fewer files open at a time, you are going to have to close some of the files before you open others.
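
    Contrast that with closing inside the same loop that opens, so only one pipe exists at a time (this serializes the pings for simplicity; working in batches smaller than your ulimit keeps some concurrency):

        for my $host (@hosts) {
            open(my $fh, '-|', 'ping', $host)
                or die "pipe open failed for $host: $!";
            my @output = <$fh>;   # read this host's output to EOF
            close($fh) or warn "close failed for $host: $! ($?)";
        }                         # one open and one close per pass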

    That the order of a set of actions influences their outcome isn't some crackpot conspiracy theory I'm trying to sell you in a survivalist pamphlet. That's just reality. If you go into your house after work tonight and open all the windows, then all of your windows will be open. If you then start closing them, you will still have had all of your windows open at some point. In order to have only a portion of the windows in your house open at a time, you must open some portion of them then close some portion of them before opening the rest. Your actions in the present can't change the past.

    You came here asking for help. If you want to argue that you're smarter or more experienced than I am, that's fine. Let's have that argument; I'm game, although this really isn't the venue. Don't argue with the advice I gave when you asked, though, until you try it or at least take the time to understand it. Arguing against the help you requested based on your own flawed understanding will not further anyone's understanding of your situation. It may inform people about you in some ways, though. Confidence is good, but being cocksure about a concept you just asked for help with is just silly. Either you need help or you don't. Asking for help and then dismissing, unconsidered, whatever help you get is not only rude but wasteful.

      I don't understand the tone of your reply. I never said anything remotely smug.

      But anyway, I don't understand what you're trying to say about closing them. It's not that I want fewer handles open at one time; it's that I want them to close and completely go away between invocations of that subroutine. I don't see how operating on the handles in two loop structures makes any difference to their persistence. Say I got rid of the for() loop and manually opened a number of handles as variables in the @pings array. You're saying that would make a difference? The handles are all opened in the context of a subroutine and operated on solely within that same subroutine. Asked another way: what situation would make a file handle that just had close() called on it not release back to the OS?

      Thanks.

        Asked another way, what situation would make a file handle that just had close() called on it not release back to the OS?

        close can fail, but if close succeeds, then there are four possibilities:

        • the monitoring program is lying (cached data...)
        • the operating system is lying (think of chroot)
        • you're looking at the wrong thing at the wrong time and reaching the wrong conclusion
        • cosmic rays/gremlins
        To me, it just looks like you're opening too many pings at one time.
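
        One way to rule out the third possibility is to count descriptors from inside the script itself, right after each close. A sketch, using the same /proc view as your ls check (works on Solaris 10 and Linux):

            # How many descriptors does this process hold right now?
            sub fd_count {
                my @fds = glob("/proc/$$/fd/*");
                return scalar @fds;
            }

            warn "open fds after close: ", fd_count(), "\n";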

        The subroutine is not shown entirely in your code. The point of declaration for your @pings array is not shown in your code, which means it is likely outside the scope shown. There is no evidence in your code that the file handles are being closed before hitting your ulimit. According to your code, there is a file handle opened for every host in your @hosts array. If your @hosts array has more members than your ulimit of file handles, then your code, as provided by you and nobody else, will attempt to open more file handles than your ulimit allows.
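
        You can check that arithmetic directly. A sketch, assuming @hosts is the array from your script:

            use POSIX qw(sysconf _SC_OPEN_MAX);

            # sysconf reports the per-process descriptor limit (your 256).
            my $max_fds = sysconf(_SC_OPEN_MAX);
            warn sprintf "%d hosts vs. a limit of %d descriptors\n",
                scalar(@hosts), $max_fds;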

        If you don't understand where you come across as cocky (I never said "smug", but close enough), I invite you to read the one curt reply you gave three people (in Re: old file descriptors not being cleaned up):

        Thanks for the replies guys. Here are my answers. @anon: I monitor the process from another terminal. I run ps to get the pid and then do ls /proc/PID/fd and watch the list grow until it crashes at 256. @mr_mischief: I explicitly close them after sysread returns 0. That is in the elsif block where $len == 0. The close() returns true. Also the read is from a pipe to the ping command, not a file. @Khen1950fx: I'm not using seek.

        The only potentially useful thing you said to me was your attempt to correct me about a pipe being seen as a file. I've got news for you: a pipe works through file handles, opens, and closes because, although it is a special case, it is indeed treated as if it were a file. The same file descriptor limits apply. For all cases in which a piped file handle, a fifo, a device file (which was in my example in response to you), and a vanilla file on a mounted file system act the same, the distinction is irrelevant. Only the ways in which they differ matter.

        Furthermore, in your rudely short and impersonal reply you glossed over the very problem I warned you about. I told you it's a good idea to close them where you open them. You specifically tell me that you are closing them, then point out where you are closing them. That place is not in the same loop in which you are opening them, which is exactly what I told you was your most likely problem. I went on in my reply at Re: old file descriptors not being cleaned up to write you three example programs that illustrate the issue I warned you about. You made no mention of running them, perusing the code, comparing them to your situation, or explaining why your code doesn't suffer from the same problem as the first example.

        You were also entirely dismissive of both other people in the thread. The anonymous monk asked (in Re: old file descriptors not being cleaned up) what lsof said about the ownership of the open file handles. You ignored that. He or she also asked what the close calls return, but instead of answering in your reply, or even asking what was meant, you simply ignored it. Khen1950fx asked in Re: old file descriptors not being cleaned up about seek(), which you dismissed, then advised trying a seek on the file handle (which, again, is still what it is called even when there's a pipe involved). You dismissed the first mention of the seek function but ignored the second completely.

        Your exact description of your problem is:

        The problem is that the file descriptors (or file handles) are sticking around even though the processes exit and I close()'d them. The script eventually dies with a "too many open files" message. The box is Solaris 10, Perl 5.12.2, and the ulimit is 256 handles. This can be raised to 1024 but it would still crash if I need to ping more than that. I've tried kill, readline, waitpid, set sig-child/pipe to ignore but they still build up.

        If you're actually successfully closing the filehandles, then they won't stay open. Your example code does not show a test for making sure the close of the pipe is successful. The anonymous monk already asked about this, but you ignored it.

        Since it's a pipe for reading from the chained command and you've already tested the length, there's little reason for the close to fail. Still, to be really sure the filehandles are closed all the way down to the OS file descriptors, you should test the close call. Once you can confirm you're actually closing them when you think you are, you can move on to seeing whether you're opening more than your ulimit allows before attempting to close them.
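
        Testing it takes one statement (assuming a lexical handle $fh). On a piped open, a false return from close means either the close itself failed ($! is set) or the child exited non-zero ($! is 0 and the wait status is in $?):

            unless (close($fh)) {
                warn $! ? "close failed: $!\n"
                        : "ping exited with status " . ($? >> 8) . "\n";
            }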

        You could have spent just a little time reading the replies to your urgent (to you) request for help. You could have taken a bit of your time seeing how the replies we took our time to give you free of charge applied to your situation. You then could have replied to each person individually without completely ignoring certain points or refuting others without explanation. You could have refrained from belaboring points of terminology that make a distinction without a difference in an attempt at... what, exactly? General pedantry was the goal perhaps? Only you could answer that, and I really don't care anyway.

        I'm only still interested in helping you for a few reasons. Your own evaluation of whether I have the right to consider your response rude is certainly not one of them. In fact, helping you with this problem has nothing to do with you at all. I like solving problems, and some other visitor to the site with a similar problem might find this thread later. So, if you still want some help, perhaps you could take what's been offered and ask for more if that truly doesn't help. Otherwise, three of us have already given our time to offer you tips you have openly and curtly refused to consider. Troubleshooting is an iterative process. Until you rule out those tips as unhelpful through actual thought and reasoning, and not a knee-jerk display of self-importance and infallibility, there's little reason for us to devote any more time to you or your problem.

        My one very specific tip to give you right now about your code, rather than your behavior, is this: Be sure the size of @hosts is smaller than your ulimit for file descriptors and see if the problem goes away. Keep another array outside the sub and feed the sub at most your ulimit minus a small margin (five or so, to account for STDIN, STDOUT, STDERR, and a couple of other open files) hosts at a time, as sketched below. If that fixes your problem, BTW, then my original diagnosis, which you completely ignored, was exactly right.
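
        In code, that batching looks something like this (a sketch; ping_hosts is a hypothetical stand-in for your subroutine, which isn't shown in full):

            use POSIX qw(sysconf _SC_OPEN_MAX);

            my $limit = sysconf(_SC_OPEN_MAX) || 256;   # the ulimit, e.g. 256
            my $batch = $limit - 5;   # margin for STDIN, STDOUT, STDERR, etc.

            # The full host list lives outside the sub; feed the sub at
            # most $batch hosts per call.
            my @all_hosts = @hosts;   # @hosts as in your script
            while (my @chunk = splice(@all_hosts, 0, $batch)) {
                ping_hosts(@chunk);   # hypothetical name for your sub
            }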