Hello monks. I seek your wisdom.

I have observed something odd regarding multiprocessing performance on Windows. When I run the test below, there seems to be a *huge* amount of process switching overhead. When I run the same test on a Linux server, it runs as expected (almost no overhead). Here are the results.

############################################################
#
# ithreads on Centos Linux, 64 bit, 8 CPU's
# Perl v5.10.1 built for x86_64-linux-thread-multi
# Threads   Clock   CPU    ==>  Speed   Overhead
# -------   -----   ----        -----   --------
#    1      18.2    18.2   ==>   1.0x       0%
#    2       9.1    18.2   ==>   2.0x       0%
#    3       6.2    18.3   ==>   2.9x       1%
#    5       3.7    18.2   ==>   4.9x       0%
#    8       2.4    18.4   ==>   7.6x       1%
#
# ithreads on my Windows 7, 64 bit, 8 CPU's
# Perl v5.12.3 built for MSWin32-x86-multi-thread
# Threads   Clock   CPU    ==>  Speed   Overhead
# -------   -----   ----        -----   --------
#    1      25.0    25.0   ==>   1.0x       0%
#    2      14.6    28.1   ==>   1.7x      12%
#    3      12.9    37.0   ==>   1.9x      48%
#    5       9.9    47.8   ==>   2.5x      91%
#    8       8.2    62.1   ==>   3.0x     148%
#
############################################################

Running a single child process establishes a baseline, 1.0x speed at 0% overhead. With Linux, running 5 processes, I see a 4.9x speed improvement with less than 1% overhead. Very good. But with Windows, running 5 processes, I see only a 2.5x speed improvement with about 91% overhead! In other words, the speed improvement was only about half of what it should have been and the CPU time almost doubled. What was the CPU doing this extra 91% of the time?
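For readers checking the table, here is a minimal sketch of how the Speed and Overhead columns appear to be derived. The formulas are an assumption inferred from the numbers above (the post does not state them), and the helper name `speed_and_overhead` is hypothetical.

```perl
use strict;
use warnings;

# Assumed formulas, inferred from the table (not stated in the post):
#   speed    = baseline clock / clock
#   overhead = (cpu - baseline cpu) / baseline cpu, as a percentage
# speed_and_overhead is a hypothetical helper, for illustration only.
sub speed_and_overhead {
    my ($base_clock, $base_cpu, $clock, $cpu) = @_;
    my $speed    = $base_clock / $clock;
    my $overhead = 100 * ($cpu - $base_cpu) / $base_cpu;
    return ($speed, $overhead);
}

# Windows figures from the table:
#   1 thread:  clock 25.0, CPU 25.0 (baseline)
#   5 threads: clock  9.9, CPU 47.8
my ($speed, $overhead) = speed_and_overhead(25.0, 25.0, 9.9, 47.8);
printf "%.1fx speed, %.0f%% overhead\n", $speed, $overhead;
# prints "2.5x speed, 91% overhead"
```

Under this reading, overhead is the extra CPU time consumed relative to the single-thread run, so 91% means the same total work cost nearly twice the CPU time.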

I realize that the test results are only accurate to within about 10%, since I ran them on live, but mostly idle, machines. The deviations in the Windows results are much larger than 10%, however, so I think they are significant. Here is the test code.

use strict;
use warnings;
use threads;
use Time::HiRes 'time';

my $nr_children = 1;    # number of worker threads; change per test run
my @threads;

# Start the workers and wait for all of them, timing the wall clock.
my $start = time;
foreach my $i (1 .. $nr_children) {
    $threads[$i] = threads->create(\&Work, $i);
}
foreach my $i (1 .. $nr_children) {
    $threads[$i]->join();
}
my $stop = time - $start;
printf "\nclock: %.1f sec\n", $stop;

# User CPU time consumed by the whole process (all threads).
my @run = times;
printf "user: %.1f sec\n", $run[0];
exit;

#####

# CPU-bound work: the total iteration count is fixed and split
# evenly across the threads.
sub Work {
    my ($i) = @_;
    foreach ( 1 .. (20e5/$nr_children) ) {
        my $acct_nrs = "abc\txyz\tdef\tabc\tghi\tghi";
        my @temp = split(m/\t/, $acct_nrs, -1);
        # De-duplicate via a hash, then sort.
        @temp = ( sort keys %{{ map { $_ => 1 } @temp }} );
        my $ans = join(', ', @temp);
    }
    print " $i";
    return;
}

The processes run compute bound and keep all 8 CPU's (when using 8 child processes) at 100% simultaneously, both on Windows and Linux. There is no I/O (except one print at the end), no blocking, no locking, and no shared memory. The processes last long enough that the setup time shouldn't be very important. Thus I'm left thinking that the overhead would be due to process switching by the operating system.

This test uses ithreads. I also ran a similar test using forks, and the results on both Linux and Windows were almost identical to the ithreads results.

I realize that if the processes were normally blocked, this wouldn't be as big an issue. But my job is compute bound, so event loops (POE, Coro, etc.) wouldn't help. Not even POE's "Wheel", which uses fork, from what I read.

In summary, my questions are: 1) Is my test valid? 2) Is my conclusion valid? 3) Is there a way to get better multiprocessing performance on Windows?

Thanks, John.


In reply to Multiprocessing on Windows by JohnRS
