Be aware! "PDL threading" has nothing whatever to do with threads. PDL "threads" do not, for example, run concurrently, and they do not allow you to make use of multiple cores. A really unfortunate choice of term IMO.
Whether you can use PDL to vectorise your algorithm will depend very much upon what the algorithm does.
PDL allows you to apply a calculation to a whole array 'simultaneously'. So, for example, instead of coding:
    my $result = 0;
    for my $i ( 0 .. length( $str ) - 1 ) {
        $result += ord( substr $str, $i, 1 ) * 3 / 4;
    }
    return $result;
it might allow you to code (something like--I've forgotten whatever little PDL syntax I once knew):
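    use PDL;
    # Build a piddle of character codes, then scale and sum it in one expression.
    my $pdl = pdl( [ map { ord } split //, $str ] );
    my $result = sum( $pdl * 3 / 4 );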
In essence, this simply moves the explicit for loop of the first example into C code. The data is still looped over, but the loop runs more quickly because it is in C, and because the data is stored as a C-style array (in contiguous memory) that is subject to far less overhead than perl scalars.
So firstly, both your data and algorithm have to lend themselves to being piddled. Secondly, you'll need to re-write them to use piddles.
But it is not "vectorisation" in the SIMD or graphics card sense. With the CPU's SIMD instructions (SSE, AVX), several data items are operated on by a single instruction; a 128-bit SSE register, for example, holds four single-precision floats. With GPUs (as found on graphics cards), thousands of data values can be processed concurrently. PDL does not do this. It operates on one data value at a time. The speed up comes from moving the loops into C.
If you have a system available to you that has multiple cores, it would be trivial to have N copies of your existing subroutine running concurrently using threads. Provided the work splits into independent pieces, this should cut the run time to something close to time/N.
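To make the threads option concrete, here is a minimal sketch using the core threads module. It assumes a perl built with ithreads, and it uses the per-character calculation from the loop above as a stand-in for your real subroutine. The names weighted_sum and parallel_weighted_sum are made up for the illustration.

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for the real subroutine: the same calculation as the
# explicit loop earlier in this post, applied to one piece of the input.
sub weighted_sum {
    my ($str) = @_;
    my $result = 0;
    $result += ord( substr $str, $_, 1 ) * 3 / 4 for 0 .. length($str) - 1;
    return $result;
}

# Split the input into at most $n chunks, run one thread per chunk,
# then add up the partial results.
sub parallel_weighted_sum {
    my ( $str, $n ) = @_;
    my $len = length $str;
    return 0 unless $len;

    my $chunk = int( ( $len + $n - 1 ) / $n );    # ceiling of $len / $n
    my @workers;
    for ( my $off = 0; $off < $len; $off += $chunk ) {
        my $piece = substr $str, $off, $chunk;
        push @workers, threads->create( \&weighted_sum, $piece );
    }

    my $total = 0;
    for my $thr (@workers) {
        my ($partial) = $thr->join;    # wait for the thread, collect its result
        $total += $partial;
    }
    return $total;
}

# Some test data: 100,000 printable ASCII characters.
my $str = join '', map { chr( 32 + $_ % 95 ) } 1 .. 100_000;
printf "serial:   %s\n", weighted_sum($str);
printf "parallel: %s\n", parallel_weighted_sum( $str, 4 );
```

Starting a perl thread is not free (the interpreter is cloned for each one), so a calculation as cheap as this will probably not see the time/N improvement; it is only there to show the shape of the code. The gain shows up when each piece of work takes much longer than starting a thread does.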
If you are seeking to spread your load across multiple systems, you'll need something more than either PDL or threads: something along the lines of Parallel::MPI. But you would need to install or develop the appropriate software on each system you hope to utilise, and convert your existing subroutine to MPI. This is a distinctly non-trivial process to set up.
That is a very brief run through of the options. A full discussion would require a book :). If you were to post the subroutine, or something similar if you are protective of your algorithm, then it might be possible to make a clearer assessment of what might help you.
In reply to Re: How to vectorize a subroutine--perhapse using PDL piddles or threads
by BrowserUk
in thread How to vectorize a subroutine--perhapse using PDL piddles or threads
by Feynman