janmartin has asked for the wisdom of the Perl Monks concerning the following question:

This is about Parallel::ForkManager on 64 bit strawberryperl.

I need to process thousands of images, each totally independent of the others.

Processing has CPU-intensive parts, and parts that hardly use the CPU but the GPU instead. The hardware has 8 cores, so I am thinking of 8 processes, each process starting multi-threaded programs.

Each image is a different size, and processing time varies accordingly. Therefore the processes should run for different lengths of time.

When a process finishes, a new process should start, so that there are always 8 processes running.

After a few rounds, the differing run times should spread the CPU usage out. But it doesn't.

Looking at the Windows 8 Resource Monitor I see that all 8 processes run perfectly synchronized even after a long time.

It looks like 8 processes are started, and only after all 8 have finished are 8 new processes started?

Code is straight from: http://search.cpan.org/~szabgab/Parallel-ForkManager-1.06/lib/Parallel/ForkManager.pm

use Parallel::ForkManager;
...
@links = (
    ["http://www.foo.bar/rulez.data", "rulez_data.txt"],
    ["http://new.host/more_data.doc", "more_data.doc"],
    ...
);
...
# Max 30 processes for parallel download
my $pm = Parallel::ForkManager->new(30);

foreach my $linkarray (@links) {
    $pm->start and next;    # do the fork

    my ($link, $fn) = @$linkarray;
    warn "Cannot get $fn from $link"
        if getstore($link, $fn) != RC_OK;

    $pm->finish;    # do the exit in the child process
}
$pm->wait_all_children;
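Adapted to the image case, the same pattern would look roughly like this (a minimal sketch; `process_image` and the `glob` pattern are stand-ins for whatever the real per-image work and file list are):

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my @images = glob '*.jpg';    # placeholder for the real file list

# one worker per core, as described above
my $pm = Parallel::ForkManager->new(8);

for my $image (@images) {
    $pm->start and next;      # parent: fork a child, move on to the next image

    process_image($image);    # child: hypothetical per-image work

    $pm->finish;              # child exits; its slot is freed immediately
}
$pm->wait_all_children;

sub process_image { }         # stand-in for the OP's processing step
```

Note that `start` only blocks while all 8 slots are busy, so a new child should be forked as soon as any one child exits, not after all eight have finished.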

Replies are listed 'Best First'.
Re: Parallel::ForkManager and CPU usage?
by BrowserUk (Patriarch) on Sep 19, 2014 at 15:58 UTC
    Looking at the Windows 8 Resource Monitor I see that all 8 processes run perfectly synchronized even after a long time.

    Could you explain what you mean by that?

    Ie. What are you seeing in the resource monitor and how are you interpreting it?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Parallel::ForkManager and CPU usage?
by Laurent_R (Canon) on Sep 19, 2014 at 21:38 UTC
    "Each process to start multi-thread programs."

    Are you saying that each process is launching several threads? You would have to explain that, because it sounds like it defeats the purpose of Parallel::ForkManager.

    There are a number of technical differences between processes and threads, but they are essentially the same type of animal: my choice of words might be poor, but both are streams of code execution. If you have 8 processes running on 4 CPUs (or a CPU with four cores), you will get context switches, i.e. the machine will switch from one process to another many times per second. A context switch has a cost (CPU registers, caches and execution stacks have to be saved and restored), but that is often acceptable: while one process is waiting, say, for data from the disk, another process can get a lot done before the first is ready to continue. A thread is essentially a lightweight process, which means a context switch is likely to be less costly (depending on the operating system, it may actually be almost as expensive).

    Anyway, the basic idea of something like Parallel::ForkManager is to limit the number of processes in order to avoid too many context switches. But if each process can run many threads, then the point of the operation is essentially lost.

    In my experience (which has nothing to do with Windows), it can be useful to run more processes than you have CPUs. With our relatively IO-intensive workloads, the best performance comes from running about twice as many processes as CPUs. So, on our 4-CPU server, we usually run 8 to 10 processes in parallel, sometimes a few more, because some specific processes just wait for the others to complete and sleep most of the time. I can't extrapolate anything to Windows, but I would assume it is reasonable to allow somewhat more processes than cores, say anywhere between the number of cores and twice that number. Only testing will tell, because so many things are out of your control (the specific characteristics of your program, hardware and OS caching strategies, etc.) that it is almost impossible to predict what will happen, unless perhaps you have a really deep, thorough knowledge of your hardware and OS.

    In brief, using multiple processes and multi-threaded processes sounds to me like a bad idea in general, but, to tell the truth, if you can guarantee that each process generates only, say, two threads, you might end up close to the optimum. But if each process generates dozens of threads, you are much more likely to be very far from it.

      In my experience (which has nothing to do with Windows),

      Unfortunately, that renders most of what you've written entirely useless to the OP.

      • There are no such things as "lightweight processes" on windows.

        There are threads, and processes. (And fibres, but they're not accessible from Perl.)

        Every process consists of at least one thread; and is started fresh; never cloned from an existing process.

      • There are no such things as forked processes on windows.

        Fork is emulated using threads.

      In brief, using multiple processes and multi-threaded processes sounds to me like a bad idea in general,

      It's quite normal under Windows.


Re: Parallel::ForkManager and CPU usage?
by CountZero (Bishop) on Sep 19, 2014 at 19:45 UTC
    You can use the run_on_start and run_on_finish callbacks to print some debug info about when the subprocesses start and stop. That will give you some info to understand better how the children are managed.
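    A minimal sketch of those callbacks (the `sleep` stands in for the OP's real, variable-length work):

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(8);

$pm->run_on_start(sub {
    my ($pid, $ident) = @_;
    print "started  job $ident as pid $pid at ", time, "\n";
});

$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident) = @_;
    print "finished job $ident (pid $pid, exit $exit_code) at ", time, "\n";
});

for my $ident (1 .. 20) {
    $pm->start($ident) and next;    # pass an identifier for the callbacks to report
    sleep 1 + int rand 3;           # stand-in for work of varying length
    $pm->finish;
}
$pm->wait_all_children;
```

    If children really are replaced one at a time, the "started"/"finished" lines should interleave; if they come in blocks of eight, that confirms what the OP is seeing in the Resource Monitor.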

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Parallel::ForkManager and CPU usage?
by codiac (Beadle) on Sep 20, 2014 at 02:35 UTC
    Not enough information for a real answer. What state are the cores in? Do you have 8 GPUs? Probably not, so if you run 8 jobs at once how does the GPU handle that concurrency? Are your cores cache thrashing? Do you have enough RAM for 8 jobs? Does your GPU have enough RAM for 8 jobs? If you have 1 thread per run but run the script in 8 different shells at once, does it perform differently? If you run 4 threads instead of 8, does it behave the same?
Re: Parallel::ForkManager and CPU usage?
by locked_user sundialsvc4 (Abbot) on Sep 20, 2014 at 16:12 UTC

    Since each process is “processing images,” you will probably be able to run quite a few more processes than you have cores, because each process will spend most of its time waiting for disk-I/O. The amount of memory won’t be outrageous, either. You should arrange for the number of processes to be adjustable. Fiddle with it to find the “sweet spot” for your system.

    And, if you are on a Unix/Linux system, don’t forget the -N numprocs parameter of good ol’ xargs. You could write a simple Perl script that expects the name of a file on the command-line and which processes just one file. Then, build a file with a list of all the filenames (or pipe an ls command output), and feed that into xargs. The job is done, in multi-process style, but without writing any complicated Perl code. Maybe just the ticket if this is a “one-off” task?
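    Such a one-file-per-invocation worker might look like this (a sketch; `process_image` is a stand-in for whatever the OP's pipeline actually does to one image — and note that on GNU xargs the parallelism flag is actually -P, as pointed out below):

```perl
use strict;
use warnings;

# process_one.pl -- sketch of a worker that handles exactly one file,
# meant to be driven by xargs; process_image() is hypothetical.
sub process_image {
    my ($file) = @_;
    die "No such file: $file\n" unless -e $file;
    # ... the real CPU/GPU work on $file would go here ...
    return "processed $file";
}

if (@ARGV) {
    print process_image($ARGV[0]), "\n";
}

# Driven from a shell with GNU xargs, 8 workers at a time:
#   ls *.jpg | xargs -n 1 -P 8 perl process_one.pl
```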

      1. Whether the processing will be CPU- or IO- bound depends a lot on the processing. If the processing is complex enough, the IO will be negligible.
      2. If the processes "spend most of their time waiting for disk-IO" and all those images are on the same disk, then starting a lot of processes, all competing for the same disk, is not the best thing to do. Disks nowadays have caches and clever firmware doing read-aheads and other tricks to minimize the need to move the reading heads too much, but with enough processes reading big enough images you can easily render all the caching ineffective and spend time waiting for the heads to move to read the next bit of one of the files. The fact that the tasks are IO-bound doesn't necessarily mean you should start many.
      3. If the processing takes long enough, then starting and destroying a new process for each and every image may not matter much, but it might still help to start eight processes and keep them instead. The easiest solution would be to split the list into eight parts at the start and start a script to process each batch. With thousands of images of a fairly random size, they should all end their work at around the same time, give or take a few images.
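      Splitting the list up front, as in point 3, is only a few lines of Perl. A sketch (the `glob` pattern is a placeholder for the OP's real file list; dealing the files out round-robin gives each batch a similar mix of sizes):

```perl
use strict;
use warnings;

# deal items out round-robin into $n batches
sub split_round_robin {
    my ($n, @items) = @_;
    my @batches = map { [] } 1 .. $n;
    my $i = 0;
    push @{ $batches[ $i++ % $n ] }, $_ for @items;
    return @batches;
}

# hypothetical use: eight batches, then start one long-lived
# worker script per batch instead of one process per image
my @batches = split_round_robin(8, glob '*.jpg');
```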

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

      Usage: xargs [-0prtx] [--interactive] [--null] [-d|--delimiter=delim] [-E eof-str] [-e[eof-str]] [--eof[=eof-str]] [-L max-lines] [-l[max-lines]] [--max-lines[=max-lines]] [-I replace-str] [-i[replace-str]] [--replace[=replace-str]] [-n max-args] [--max-args=max-args] [-s max-chars] [--max-chars=max-chars] [-P max-procs] [--max-procs=max-procs] [--show-limits] [--verbose] [--exit] [--no-run-if-empty] [--arg-file=file] [--version] [--help] [command [initial-arguments]]

      I think you mean -P; there is no -N, at least not on Ubuntu or Solaris.

      # ls | xargs -N
      xargs: invalid option -- 'N'

      Had no idea xargs had that many options