BernieC has asked for the wisdom of the Perl Monks concerning the following question:

I have a program that can reasonably easily be set up to "parallel process". Back in my Unix days this was easy to do. I've looked at perlfork and it says
On some platforms such as Windows where the fork() system call is not available, Perl can be built to emulate fork() at the interpreter level. While the emulation is designed to be as compatible as possible with the real fork() at the level of the Perl program, there are certain important differences that stem from the fact that all the pseudo child "processes" created this way live in the same real process as far as the operating system is concerned.
and I don't know what it means to have 'pseudo child "processes" created this way live in the same real process as far as the operating system is concerned'. I'm using Strawberry perl on win10. My hope is to be able to get my compute-bound Perl program to use more of the several "cores" in my cpu. Has anyone tried this? Does it work?

Re: Parallel processing on Windows
by Corion (Patriarch) on Sep 20, 2022 at 18:17 UTC

    The interesting thing about fork is that it works at all.

    If you are not really bound to compatibility with a program that already uses fork(), I would rather look at threads and then communicate between them via Thread::Queue. This is a far better approach than trying to make fork emulation work under Windows, at least as long as you initialize resources from within each thread instead of trying to share them between threads.

      I'll look at that. I gotta think about how to coördinate the child processes and get data to/from them. But it looks good if I can understand how threads are different than processes :).

      Just to experiment, I tried fork()/wait() because I'm familiar with that. I wrote a little test program to see if the forked children were running and I got:

      >parallel
      this is a test of parallel processing
      Starting the first child
      first child pid is -14236
      I'm the first child and I'm gonna wait for a while
      second child pid is -10372
      I'm the second child and I'm gonna wait for a while
      number 1 signing off
      number 1 is done
      number 2 signing off
      number 2 is done
      we're done
      but I haven't a clue about the negative process IDs {and, indeed, I looked at task manager and there wasn't either process in the list}, so I couldn't tell if they were really independent processes and would run on different CPU cores.

      Back in my Unix days I wrote a complete TCP server in Perl! Worked like a champ. Sucks that Windows doesn't have a fork/kill/wait process structure, so it looks like I have a lot of learning to do to get something like that to work on Win10.

      Thanks!

        I recently wrote some simple multi-process code at Re^7: Multiprocess - child process cannot be finished successfully. Yes, the "fake" Windows PID from a Perl fork is negative and my wait statement accounts for that. Code shown does run on Windows and should run also on Unix. This demo code worked better than I thought it would - meaning that sleep in each sub process worked and did not interfere with each other. I am not sure how sleep() is implemented on Windows, but it worked better than expected!
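For reference, here is a minimal sketch of the fork()/waitpid() pattern discussed above. It is not the code from the linked node, just an illustration that assumes two children which each sleep for a second. According to perlfork, wait() and waitpid() accept the pseudo-process ID that fork() returns on Windows, so passing the pid back unchanged should work on both platforms. perlfork also says the pseudo-children live inside the one real perl process, which would explain why Task Manager shows no extra entries.

```perl
use strict;
use warnings;

$| = 1;    # unbuffered output, so buffered text is not duplicated into children

my @pids;
for my $n (1 .. 2) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: a real process on Unix, a pseudo-process on Windows.
        print "child $n working\n";
        sleep 1;
        exit 0;
    }
    print "child $n has pid $pid\n";    # negative under Windows fork emulation
    push @pids, $pid;
}

# Pass each pid to waitpid() exactly as fork() returned it,
# so negative pseudo-pids need no special handling.
my %exit_status;
for my $pid (@pids) {
    waitpid($pid, 0);
    $exit_status{$pid} = $? >> 8;
}
print "all children done\n";
```

Since the emulation is built on interpreter threads, the children can in principle be scheduled on different cores, but they share one process, so this is closer to using threads than to real Unix processes.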

        However, it sounds like just using threads is the best way for your compute bound project. The less data you share between threads (ideally nothing that is r/w), the better. Threads can get complicated if there is a lot of sharing going on.

        This is a code fragment from some code years ago... The program has about 70,000 input strings. For each input string, it is desired to know which of the other input strings are "close enough" according to some complex rules. For each input, a regex is generated that is run against all other inputs. This is an NxN algorithm. For 70K inputs it took ~1.5 hours. I have a 4 core machine. Running 4 threads, the speedup was something like 3.8x (can't get to exactly 4.0, but that is a very good result). So anyway execution time went to ~20 minutes and that was "good enough" and I stopped improving things.

        Anyway see below for an example of parallelizing a number cruncher job.

        ### This is a non-runnable code fragment
        ### only to show a general idea of threads pulling work
        ### from a common input queue and putting results on a
        ### common output queue.

        use threads;
        use Thread::Queue;

        my @all_inputs;    # global data for all threads, read only
                           # (must be filled before the threads are created)

        my $workQueue = Thread::Queue->new();
        my $doneQueue = Thread::Queue->new();

        ### Worker threads #####
        my $thread_limit = 4;
        my @threads;
        push @threads, threads->create(sub{DoWork($workQueue, $doneQueue)})
            for 1 .. $thread_limit;

        foreach my $input ( @all_inputs )   #init the work queue with all inputs
        {
            $workQueue->enqueue($input);
        }
        $workQueue->enqueue(undef) for 1 .. $thread_limit;  #"work finished" markers
                                                            #each thread will give up
                                                            #when it sees an undef
        $workQueue->end();

        $_->join() for @threads;   #waits for all threads to finish!
        print "END of threading...\n";

        #### Get results off of Queue
        my @results;
        while ($doneQueue->pending() && ($_ = $doneQueue->dequeue()) )
        {
            push (@results, $_);  #results are pointers to array (AoA)
        }

        sub DoWork
        {
            my ($workQueue, $doneQueue) = @_;
            #an undef from the queue ends this thread's job!!
            while (defined(my $input = $workQueue->dequeue()))
            {
                if ($input =~ m|/|)
                {
                    $doneQueue -> enqueue([$input,'SKIPPING THIS ENTRY!']);
                    next;
                }
                my $regex = get_regex_patterns ($input);
                $regex =~ s/\(|\)//g;  #captured values not needed, only yes/no
                my @matches = grep{ m/$regex/ and $_ ne $input}@all_inputs;
                push (@matches,'') if @matches==0;
                $doneQueue -> enqueue([$input,@matches]);
            }
        }
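Since the fragment above is not runnable on its own, here is a small self-contained sketch of the same work-queue idea. The is_close() function and the five sample words are made up for illustration; they stand in for the real "close enough" rules, which are not shown in this thread. It assumes a perl built with thread support (Strawberry Perl is) and a Thread::Queue new enough to have end(), which as far as I know means 3.01 or later.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for the real matching rules:
# two words are "close" when they start with the same letter.
sub is_close {
    my ($x, $y) = @_;
    return lc(substr($x, 0, 1)) eq lc(substr($y, 0, 1));
}

# Filled before the threads start, so every thread gets its own copy.
my @all_inputs   = qw(apple avocado banana blueberry cherry);
my $thread_limit = 4;

my $work_queue = Thread::Queue->new;
my $done_queue = Thread::Queue->new;

sub do_work {
    # dequeue() blocks until an item is available and returns undef
    # once the queue has been ended and is empty.
    while (defined(my $input = $work_queue->dequeue)) {
        my @matches = grep { $_ ne $input && is_close($input, $_) } @all_inputs;
        $done_queue->enqueue([$input, @matches]);
    }
}

my @threads = map { threads->create(\&do_work) } 1 .. $thread_limit;

$work_queue->enqueue(@all_inputs);
$work_queue->end;          # no more work: idle workers fall out of their loop
$_->join for @threads;     # wait for every worker to finish

# All workers are done, so drain the results without blocking.
my %matches_for;
while (defined(my $result = $done_queue->dequeue_nb)) {
    my ($input, @matches) = @$result;
    $matches_for{$input} = \@matches;
}
print "$_: @{ $matches_for{$_} }\n" for sort keys %matches_for;
```

Nothing is shared between the workers except the two queues, which is the "share as little as possible" advice from earlier in the thread. The price is that each thread holds its own copy of @all_inputs.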

        Back in my Unix days I wrote a complete TCP server in Perl! worked like a champ. Sucks that Windows doesn't have a fork/kill/wait ...

        Note that you can write network servers in Perl, that work fine on both Unix and Windows, without forking and without threads, simply by taking an event-driven approach via IO::Select.

        Here's a complete working example of one I used for testing Syslog a while back: Test Syslog Server
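To make the event-driven idea concrete, here is a minimal sketch of a single-process echo server built on IO::Select. It is not the syslog test server linked above. The "echo: " reply format, the serve_once() helper, and the in-process test client are all made up for this illustration, and port 0 just asks the OS for any free port.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Listen on an ephemeral localhost port (port 0 lets the OS pick one).
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 5,
    ReuseAddr => 1,
) or die "cannot listen: $!";

my $select = IO::Select->new($listener);

# One pass of the event loop: accept new clients, echo data from known ones.
sub serve_once {
    my ($timeout) = @_;
    for my $fh ($select->can_read($timeout)) {
        if ($fh == $listener) {
            my $client = $listener->accept or next;
            $select->add($client);
        }
        else {
            my $n = sysread($fh, my $buf, 4096);
            if ($n) {
                syswrite($fh, "echo: $buf");
            }
            else {                      # EOF or error: forget this client
                $select->remove($fh);
                close $fh;
            }
        }
    }
}

# Demo: a client in the same process talks to the server.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listener->sockport,
    Proto    => 'tcp',
) or die "cannot connect: $!";

syswrite($client, "hello\n");
serve_once(1);    # first pass accepts the pending connection
serve_once(1);    # second pass reads the request and echoes it
my $reply = '';
sysread($client, $reply, 4096);
print $reply;
```

A real server would simply call serve_once() in an endless loop. Note that this handles many connections from one thread, so it helps with I/O-bound servers but does nothing to spread a compute-bound job over several cores.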

        I am looking at the threads modules {threads and threads::shared} and it's not encouraging. The threads module comes with the warning
        The "interpreter-based threads" provided by Perl are not the fast, lightweight system for multitasking that one might expect or hope for. Threads are implemented in a way that make them easy to misuse. Few people know how to use them correctly or will be able to provide help. The use of interpreter-based threads in perl is officially discouraged.
        But the doc doesn't say what you should do about the discouragement. Should I give it a try, or is there yet something else/newer for this? Or is this just kinda impossible in Perl on Windows...
      Thanks all. I'm a long-time Unix hacker but threads are something new to me, so it'll take me a while to try to understand them. But I'll give it a shot -- how hard can it be :o). Thanks again....

        I'm a long-time unix hacker but threads are something new to me

        How long is "long-time"? The POSIX Threads API (aka pthreads) was defined in 1995! ... admittedly, pthreads and Unix have a long and checkered history.

Re: Parallel processing on Windows
by eyepopslikeamosquito (Archbishop) on Sep 20, 2022 at 23:01 UTC
Re: Parallel processing on Windows
by BillKSmith (Monsignor) on Sep 20, 2022 at 22:04 UTC
    If you really intend to use 'cores' you may want to investigate the "Many-Core Engine" module MCE and its friends. My own experience is limited to running a few examples on Windows 7. The application that I had in mind requires Tk, which is not compatible with MCE.
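As a rough idea of what MCE code looks like, here is a minimal sketch using MCE::Map, one of the modules in the MCE distribution on CPAN (it is not part of core Perl, so it has to be installed first). The crunch() function is a made-up stand-in for a compute-bound job, and the worker count is left at the module's default.

```perl
use strict;
use warnings;
use MCE::Map;    # from the MCE distribution on CPAN

# Hypothetical CPU-heavy function: sum of squares from 1 to $n.
sub crunch {
    my ($n) = @_;
    my $sum = 0;
    $sum += $_ * $_ for 1 .. $n;
    return $sum;
}

# mce_map is used like map, but the input list is split among
# several workers and the results come back in input order.
my @results = mce_map { crunch($_) } 1 .. 8;
print "@results\n";
```

The appeal is that the code barely differs from a plain map, so it is easy to try on an existing number cruncher before committing to a hand-written thread pool.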
    Bill
      Hello BillKSmith,

      > The application that I had in mind requires Tk which is not compatible with MCE

      this is not entirely true: see Re: Perl Tk and Threads by master zentara:

      > 1. The thread must be created before any Tk widgets are invoked...

      > 2. Do not put any Tk code into the thread, and do not try to access Tk widgets from the thread. Use shared variables to communicate with the main thread, and have a timer or fileevent in the main Tk thread, read from the thread.

      See also Re^3: Parallel download Tk where choroba points to his fully threaded Tk application, which is able to use different threading mechanisms on Windows, MCE among them.

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
        Thanks for the correction. It appears that I may have given up too soon.
        Bill
Re: Parallel processing on Windows
by Fletch (Bishop) on Sep 20, 2022 at 19:46 UTC

    Rather than attempting to work around the inferior OS you could always install a docker host and run things under a real(ish) *NIX with actual forks.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

      Classy