powerman has asked for the wisdom of the Perl Monks concerning the following question:

My script uses many file descriptors. Too many: it runs right up to the maximum.
The problem is: how can I test how many file descriptors are still available to my process?
My current solution is too slow for me:
sub FD_used { local *FD; opendir FD, "/proc/self/fd"; return @{[readdir FD]} - 2; };
I need something faster. Any ideas?

Background: this script is a very quick "spider" that uses non-blocking sockets to reach a speed of more than 200 URLs per second. So I open many simultaneous connections. Some of them are UDP sockets to the DNS server, others are TCP sockets to HTTP servers. More connections mean more URLs per second. So now I open ~200 sockets per second, and I need to test how many sockets are open before trying to open a new one. But 200 calls of my FD_used() take too much time (0.05 sec on a Celeron 333) and slow down the other code! Errr!!

Replies are listed 'Best First'.
Re: resource control: FD
by derby (Abbot) on Apr 24, 2002 at 17:01 UTC
    or via POSIX:
    #!/usr/local/bin/perl
    use POSIX qw( sysconf _SC_OPEN_MAX );

    $max = sysconf( &_SC_OPEN_MAX );
    print "This process can have ", $max, " open files\n";

    # We already have STDIN, STDOUT, and STDERR open
    $x = 3;
    while( $x < $max ) {
        open( $x, "/etc/motd") || die "cannot open file $!\n";
        $x++;
    }
    print "Should not have failed!\n";

    print "Should fail now:\n";
    open( $x, "/etc/motd" ) || die "cannot open file $!\n";

    -derby

Re: resource control: FD
by ferrency (Deacon) on Apr 24, 2002 at 16:49 UTC
    On a side note which might be related:

    A coworker ran into a problem where he was creating socket connections in an eval block with an alarm() to time them out if he didn't receive a response in a certain time period. It worked really really well, he was able to speed up his code by huge amounts. Until he inexplicably started running out of FD's.

    It turned out that the FD's weren't being released after timing out of the eval block, so every time it timed out he accumulated another open FD that never went away (and they were being created implicitly inside perl, actually, so he didn't even have a handle to call close() on).

    I'm not sure if this is at all related to your problem, but you might want to check on it. Most systems I've played with have the number of available FD's up in the thousands, so if you really only have a few hundred open, I'd think something weird is going on if you're getting errors.

    Alan

Re: resource control: FD
by belg4mit (Prior) on Apr 24, 2002 at 20:04 UTC
    There is nothing wrong with keeping a counter in a module. Combining this with overloading open, close, socket... so that the counter stays valid is one possibility (not recommended). Opening until you get an error is better. Even better is to make the user responsible and let them set the maximum (in which case you need not worry about how many they have open), defaulting to a reasonable value like 16. This is much friendlier for a module anyway: what if I don't want to swamp my box? This is basically how FileCache works (hence my jumping the gun below/earlier).

    UPDATE: *sigh* Pay no attention to the man behind the curtain

    ~~~ This is a curtain ~~~

    Perl comes with a standard module for handling (UPDATE: the non-socket case of) this: FileCache. Perl 5.8 will have an improved version that is available over there (or in bleadperl of course; requires perl 5.6+).

    --
    perl -pew "s/\b;([mnst])/'$1/g"

Re: resource control: FD
by Fletch (Bishop) on Apr 24, 2002 at 16:57 UTC

    You can check from your shell (man ulimit or check your shell's manpage), or install BSD::Resource and use getrlimit.

    $ perl -MBSD::Resource -le '@l = getrlimit(RLIMIT_NOFILE); print "soft $l[0]\thard $l[1]"'
    soft 1024 hard 1024
Re: resource control: FD
by perlplexer (Hermit) on Apr 24, 2002 at 16:34 UTC
    I don't know if I understand the problem correctly but...
    If you know that you can open only 200 sockets then why don't you just keep a counter and increment/decrement it whenever you open/close sockets?
    Or maybe even make it dynamic; i.e., keep opening sockets until you get an error and note how many sockets you had open at that time - this will be your maximum.

    --perlplexer
      I can't keep a counter because my goal is to make a Perl MODULE from this script (something like LWP::Parallel, but more efficient because of non-blocking sockets in place of select()).
      And this module will be used from a user's script... and that script can open its own files... any number of files... and inside my module I don't know how many files the main script has open.

      Update: I never said that I open only 200 sockets! To download 200 URLs/sec I must open many more than 200 sockets (~900), but once they are open and I reach a speed of 200 URLs/sec, I open one new socket each time a URL has been downloaded and its socket is closed.
      So, when the script starts it opens ~900 sockets, and every second 200 of these 900 sockets are closed and 200 new ones are opened.

        powerman,

        unfortunately a public api for that piece of the task structure has never been exposed. I looked for some api but couldn't find one. If you're opening pretty fast, you could always use fileno to let you know how close you are to the max but once you get going heavy and start closing, reuse will kick in and that will break.
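
        A rough sketch of the fileno idea, assuming a POSIX system where a new descriptor always gets the lowest free number. The helper name fd_headroom_hint is made up for illustration. Once descriptors have been closed and reused, the result is only an upper bound on the free slots, which is the breakage described above.

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_OPEN_MAX);

# Hypothetical helper: estimate how many descriptors are still free,
# judging only by the number the kernel gave to a just-opened handle.
# The kernel hands out the lowest free number, so descriptors 0 .. fileno
# are all in use; any higher ones that are open are not counted, so the
# result is an upper bound, exact only while there are no gaps.
sub fd_headroom_hint {
    my ($fh) = @_;
    my $max = sysconf(_SC_OPEN_MAX);    # per-process limit, if known
    return unless defined $max;
    my $fd = fileno($fh);
    return unless defined $fd;
    return $max - ($fd + 1);
}

open(my $fh, '<', '/dev/null') or die "cannot open /dev/null: $!\n";
my $hint = fd_headroom_hint($fh);
print "at most $hint descriptors left\n" if defined $hint;
```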

        I have to ask but what are you going to do when you get close to the max? Sleep? Can't you replicate that same behaviour by wrapping the open(s) in an eval? It seems to me, the following would be equivalent:

        # the way you want
        if( magical_how_many_open() < $what_i_need ) {
            do_wait_some_how();
        }

        # the way you're being pushed
        # (open returns false rather than dying, so die explicitly)
        eval { open( $fh, "whatever" ) or die "$!\n"; };
        if( $@ ) {
            if( $@ =~ /Too many open files/ ) {
                do_wait_some_how();
            }
            else {
                degrade_gracefully();
            }
        }
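
        A variant of the same idea that checks the error number instead of the message text, since the text can change with the locale. This is only a sketch: try_open and its 'ok' / 'wait' / 'fail' return values are invented for illustration, and the caller would plug in its own waiting and fallback logic.

```perl
use strict;
use warnings;
use Errno qw(EMFILE ENFILE);

# Hypothetical helper: try to open a file and report one of
# 'ok', 'wait' (descriptor table is full) or 'fail' (anything else).
sub try_open {
    my ($path) = @_;
    if (open(my $fh, '<', $path)) {
        return ($fh, 'ok');
    }
    my $errno = $! + 0;    # numeric errno, independent of locale
    return (undef, 'wait') if $errno == EMFILE || $errno == ENFILE;
    return (undef, 'fail');
}

my ($fh, $status) = try_open('/etc/motd');
print "status: $status\n";
```

        EMFILE is the per-process limit and ENFILE the system-wide one; both are treated as "wait and retry" here.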

        -derby

SOLUTION
by powerman (Friar) on Apr 24, 2002 at 21:13 UTC
    Thanks a lot to all, and especially to belg4mit!
    The simple solution to my question is: create an internal counter in my module and increment/decrement it when the module opens/closes sockets; allow the user to update this counter if the user needs more than, say, 100 FD's.
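
    A minimal sketch of what such a counter could look like. The package name FDBudget and its methods are invented for illustration and are not the code of the actual module; the default of 100 comes from the description above.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM);

{
    package FDBudget;    # hypothetical helper, not a CPAN module

    my $limit = 100;     # descriptors this module may hold at once
    my $used  = 0;       # descriptors it currently holds

    # Let the calling script raise or lower the budget.
    sub set_limit { $limit = $_[1]; return $limit }
    sub available { return $limit - $used }

    # Create a TCP socket only if the budget allows it.
    sub open_socket {
        return if $used >= $limit;
        socket(my $sock, Socket::PF_INET, Socket::SOCK_STREAM, 0) or return;
        $used++;
        return $sock;
    }

    # Close a socket obtained from open_socket and give the slot back.
    sub close_socket {
        my ($class, $sock) = @_;
        close $sock;
        $used-- if $used > 0;
        return;
    }
}

FDBudget->set_limit(200);
if (my $sock = FDBudget->open_socket) {
    # ... connect and use the socket here ...
    FDBudget->close_socket($sock);
}
print "slots left: ", FDBudget->available, "\n";
```

    The counter only tracks descriptors opened through the module itself, so the check costs a comparison instead of a directory read.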
Re: resource control: FD
by powerman (Friar) on Apr 24, 2002 at 16:18 UTC
    I don't understand how to update my question, so I "reply for update". ;-)
    If you want to ask "why not just check the error code returned by socket() to find out that the script has already reached the maximum FD's?" the answer is simple: the script must never reach the maximum FD's! For many reasons. I can explain all of them on request.