in reply to Re: some forking help
in thread some forking help

With a 100MB file and 50+ strings to search for, there could be some speed advantage to forking a separate process for each search string and letting them run in parallel, especially if the regexen are precompiled before forking.

Of course, the sheer simplicity of merlyn's solution probably outweighs whatever time parallelism might save, once you realize that the tricky task of gathering up the individual counts from each of the child processes is not as straightforward as it may at first appear.
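To make the "gathering up the counts" point concrete, here is a minimal sketch of the forking approach, assuming one pipe per child: each child scans the file for its one precompiled pattern and writes its count back up the pipe. The file contents and patterns are made-up examples, and this is not merlyn's posted solution.

```perl
use strict;
use warnings;

# Made-up sample data so the sketch is self-contained;
# in the real problem this would be the 100MB file.
my $file = '/tmp/forkscan_demo.txt';
open my $out, '>', $file or die "open: $!";
print {$out} "foo bar\nfoo foo\nbaz 42\n";
close $out;

my @patterns = ( qr/foo/, qr/ba[rz]/ );   # precompiled before forking

my %reader;                 # pattern => read end of that child's pipe
for my $re (@patterns) {
    pipe( my $r, my $w ) or die "pipe: $!";
    defined( my $pid = fork ) or die "fork: $!";
    if ( $pid == 0 ) {      # child: scan the file, report one count, exit
        close $r;
        open my $fh, '<', $file or die "open: $!";
        my $count = 0;
        while (<$fh>) { $count++ while /$re/g }
        print {$w} "$count\n";
        exit 0;
    }
    close $w;               # parent keeps only the read end
    $reader{$re} = $r;
}

# Parent: gather the individual counts -- the part a single-process
# solution never has to worry about.
for my $re (@patterns) {
    chomp( my $count = readline $reader{$re} );
    print "$re => $count\n";
}
wait for @patterns;         # reap the children
```

With the sample data above, the `foo` child reports 3 matches and the `ba[rz]` child reports 2. Note that each child re-reads the whole file, so any win depends on the file being in the buffer cache and on having multiple CPUs to scan it on.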

dmm

You can give a man a fish and feed him for a day ...
Or, you can teach him to fish and feed him for a lifetime

Re: Re(2): some forking help
by fokat (Deacon) on Dec 25, 2001 at 00:32 UTC
    The only way a fork()ing solution would be faster than the solutions posted so far is on an MP (multiprocessor) machine, where each process could scan the file separately. This assumes that the file fits within the buffer cache.

    Otherwise, the cost of the context switches will make this solution run slower.

    Just my $0.02 :)

    Merry Christmas to all the fellow monks!

      I'm not sure. My gut feeling is that searching a file is fairly I/O-bound, and therefore involves a significant amount of waiting on the disk regardless; why not capitalize on that by waiting in parallel?

      dmm

      You can give a man a fish and feed him for a day ...
      Or, you can teach him to fish and feed him for a lifetime