hv has asked for the wisdom of the Perl Monks concerning the following question:
I'm writing code that includes an option to get a tail -f effect, and it works fine. However I believe it is working rather inefficiently due to an unexpected detail of 4-arg select().
In the code, I open a number of files using:

    sysopen $fh, $file, O_RDONLY | O_NONBLOCK

then sit in a loop with:

    $ract = $eact = $rvec; # vector of all filenos to read
    ($nfound, $timeleft) = select($ract, undef, $eact, $timer);

.. before using sysread() to get new data from the marked files.
This all works fine, except that each time round the loop select() returns saying that every one of the files is readable. On checking various manpages and Stevens I find that I've misunderstood select() - it is telling me not that there is data to read on the marked files, but that a read from this filehandle would not block, and in particular (Stevens): "If we encounter the end of file on a descriptor, that descriptor is considered readable by select()."
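To make the behaviour concrete, here is a minimal sketch that reproduces it on a single file. The scratch file from File::Temp and the 0.25 second timeout are assumptions chosen for illustration, not part of the real code; the point is only that a descriptor sitting at end of file is still marked readable, and the following sysread() then returns 0.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_NONBLOCK);
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for one of the files being tailed.
my ($tmp, $file) = tempfile(UNLINK => 1);
print {$tmp} "one line\n";
close $tmp;

sysopen(my $fh, $file, O_RDONLY | O_NONBLOCK) or die "sysopen $file: $!";

# Drain the file so the descriptor is sitting at end of file.
my $buf;
1 while sysread($fh, $buf, 4096);

# Build the read vector and ask select() whether $fh is "readable".
my $rvec = '';
vec($rvec, fileno($fh), 1) = 1;
my $ract = $rvec;
my $nfound = select($ract, undef, undef, 0.25);

# Check whether select() marked the descriptor, then try to read from it.
my $marked = vec($ract, fileno($fh), 1);
my $got    = sysread($fh, $buf, 4096);
printf "nfound=%d marked=%d sysread=%d\n", $nfound, $marked, $got;
```

On the systems I would expect this to run on, select() returns immediately rather than waiting out the timeout, which is exactly why the loop never gets to sleep.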
Looking at prior art, in File::Tail the author does lots of clever stuff to try to predict when a file is likely to have more data ready for read, and read from it only then; in the implementation of tail(1) for the ppt project (here) the author chooses not to retain open file handles at all, but instead repeatedly stat()s the file(s) to determine whether something has changed.
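For comparison, the stat()-based approach can be sketched roughly as below. This is not the ppt code; changed_files() and the size:mtime signature are hypothetical names and choices of mine, just to show the shape of the idea: remember what stat() reported last time and only read from files where it differs.

```perl
use strict;
use warnings;

# Hypothetical helper: return the files whose size or mtime differs from
# what was recorded in %$seen on the previous call, and update %$seen.
# A file not seen before is reported as changed on its first call.
sub changed_files {
    my ($seen, @files) = @_;
    my @changed;
    for my $file (@files) {
        my @st = stat($file) or next;     # skip files that have vanished
        my $sig = "$st[7]:$st[9]";        # size:mtime
        push @changed, $file
            if !defined $seen->{$file} || $seen->{$file} ne $sig;
        $seen->{$file} = $sig;
    }
    return @changed;
}
```

A caller would keep one %seen hash, sleep for the polling interval, and then sysread() only from the files that changed_files(\%seen, @files) returns. That trades the wasted reads for one stat() per file per interval, so it still polls rather than sleeping in the kernel until data arrives.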
So, is there any reasonable workaround to this, to avoid repeatedly reading from filehandles that have no new data to supply? The intended use for this code will typically have one or more instances each tailing around 60 file descriptors on a production box, and if they can't spend most of their time sleeping in the kernel under select() I fear it will have a noticeable impact on performance of the machine as a whole.
Hugo