in reply to Perl and parallel processors

Perl does nothing with parallel processors by itself (AFAIK). If your algorithm lets you, probably the easiest way to get some extra efficiency on multi-processor systems is to fork() off a bunch of processes that work together. Parallel::ForkManager might be useful.
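To make the fork() idea concrete, here is a minimal sketch using only core Perl. The helper name `parallel_sum`, the round-robin split of the work, and the use of one pipe per child to return results are all assumptions made for illustration; Parallel::ForkManager wraps the same fork/wait bookkeeping behind a tidier interface.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: sum @$items using $workers forked child processes.
# Each child sums its share and reports the partial result through a pipe.
sub parallel_sum {
    my ($items, $workers) = @_;
    my (@readers, @pids);

    for my $w (0 .. $workers - 1) {
        # Worker $w takes every $workers-th item, starting at index $w.
        my @chunk = @{$items}[ grep { $_ % $workers == $w } 0 .. $#$items ];

        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: do the work, write the result, leave immediately.
            close $reader;
            my $sum = 0;
            $sum += $_ for @chunk;
            print {$writer} "$sum\n";
            close $writer;              # flushes the partial sum
            POSIX::_exit(0);            # skip END blocks and parent buffers
        }

        # Parent: keep the read end and remember the child.
        close $writer;
        push @readers, $reader;
        push @pids,    $pid;
    }

    my $total = 0;
    for my $i (0 .. $#pids) {
        my $line = readline($readers[$i]);
        close $readers[$i];
        waitpid($pids[$i], 0);
        die "worker $pids[$i] failed" if $? != 0 || !defined $line;
        chomp $line;
        $total += $line;
    }
    return $total;
}

print parallel_sum([1 .. 100], 4), "\n";   # prints 5050
```

The children share nothing with the parent after the fork, so anything they compute has to come back explicitly (here as a line of text on a pipe). That only pays off when each chunk of work is large compared with the cost of forking and passing the result back.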

Another option is to use threads, but working with threads in Perl isn't as straightforward as you might imagine. In particular, don't expect to be able to share objects from classes that weren't written to be thread-aware (which probably includes most modules on CPAN).

For numeric processing PDL supports something called PDL threads, but exactly how that relates to multi-processors is foggy to me.

Re^2: Perl and parallel processors
by andye (Curate) on Apr 24, 2007 at 13:32 UTC
    Hi Joost - just a quick note to point out that PDL threading isn't (yet, as far as I know) something that lets you use multiple processors.

    It's probably easiest to think of a PDL thread as simply an optimised loop: instead of looping in Perl, it lets you loop in PDL's C code, which is of course quite a lot quicker.
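
    As a rough illustration of that "optimised loop" view, here is a small sketch that computes row sums twice: once with an explicit Perl loop, once by letting PDL loop internally. It assumes PDL is installed; the variable names are made up for the example, while `sequence`, `slice`, `sum` and `sumover` are standard PDL functions.

    ```perl
    use strict;
    use warnings;
    use PDL;

    # A 3x4 ndarray: 4 rows of 3 elements each, holding the values 0..11.
    my $m = sequence(3, 4);

    # Explicit loop in Perl: one slice and one sum per row.
    my @loop_sums = map { $m->slice(":,($_)")->sum } 0 .. 3;

    # Implicit loop: sumover reduces the first dimension, and PDL repeats
    # that for every row inside its C code.
    my $rowsums = $m->sumover;

    # Both give the row sums 3, 12, 21 and 30.
    print join(", ", @loop_sums), "\n";
    print join(", ", $rowsums->list), "\n";
    ```

    Neither version uses more than one processor here; the second is just faster because the per-row iteration never comes back to the Perl interpreter.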

    The docs say:

    why threading ?

    Well, code that uses threading should be (considerably) faster than code that uses explicit for-loops (or similar perl constructs) to achieve the same functionality. Especially on supercomputers (with vector computing facilities/parallel processing) PDL threading will be implemented in a way that takes advantage of the additional facilities of these machines. Furthermore, it is a conceptually simply construct (though technical details might get involved at times) and can greatly reduce the syntactical complexity of PDL code (but keep the admonition for documentation in mind). Once you are comfortable with the threading way of thinking (and coding) it shouldn't be too difficult to understand code that somebody else has written than (provided he gave you an idea what exspected input dimensions are, etc.). As a general tip to increase the performance of your code: if you have to introduce a loop into your code try to reformulate the problem so that you can use threading to perform the loop (as with anything there are exceptions to this rule of thumb; but the authors of this document tend to think that these are rare cases ;).
    http://pdl.sourceforge.net/PDLdocs/Indexing.html#threading (emphasis mine)

    Best wishes, andye

      Note from THE FUTURE about PDL parallel processing:
      • In 1998 (PDL v1.99987), there was extremely basic support for POSIX threading
      • In 2011, John Cerney added PDL::ParallelCPU and the machinery for doing "broadcasting" (then, confusingly, called "threading") to "pthread" over a specified number of CPUs/cores, first released with 2.4.10
      • In 2021, 2.058 added the ability for this to take place over ndarrays with dimensions not exactly divisible by the given number of cores
      • 2.059 then made the number of cores default to the number available at program startup (found using code lifted from git)