in reply to PDL vs C speed question

Your example could be clearer. You don't give the type of @data, and $pears does not appear to be used. You also don't say how the data array gets loaded or what its maximum size is. If the data will fit into a C array of doubles, I think you might see a speedup of as much as 50-fold.

For 10,000 data points I was getting 6 seconds for the C code and 340 seconds for the Perl.

sub sum {
    my $a = shift;
    $a + $a;
}

my @data;
for (0..10000) {
    push(@data, 3.1415926 * $_);
}

for ($i = 0; $i < @data; $i++){
    for ($j = $i + 1; $j < @data; $j++){
        sum($data[$i] * $data[$j]);
    }
}
__END__
#include <limits.h>

double test(double x) {
    return x + x;
}

#define SIZE 10000

main() {
    int i, j;
    double data[SIZE];
    int x;

    for (x = 0; x < SIZE; x++) {
        data[x] = 3.1415926 * x;
    }
    for (i = 0; i < SIZE; i++){
        for (j = 0; j < SIZE; j++){
            test(data[i] * data[j]);
        }
    }
}

There may be a faster way to do this in Perl though.
-- gam3
A picture is worth a thousand words, but takes 200K.

Re^2: PDL vs C speed question
by glwtta (Hermit) on May 06, 2005 at 03:50 UTC
    Sorry, I should clarify: @data is an array of PDL values. The piddles themselves are simple one-dimensional vectors of floats. The maximum size of @data is about 45000, and the piddles usually have around 100 values.

    $pears is not used because that's the end result I'm going after; it will just get stored to a file or database, so there isn't much there, optimization-wise. Similarly, it doesn't matter how @data gets loaded; the time that takes is negligible compared to the main computation.

    So @data is preloaded with PDL vectors of floats, which are the z-scores of the original values. All I need is the fastest way to divide the sum of the products of these standard scores, for each pair of elements of @data, by the size of the piddles, to get the correlation coefficient (they are all the same size, and I know the size from the start).
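    For concreteness, the computation described above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the function names zscore_correlation and all_pair_correlations are hypothetical, the z-scores are assumed to have been computed with the population standard deviation (so that dividing by the vector size gives r), and the vectors are assumed to be stored back to back in one flat array of doubles.

```c
#include <assert.h>
#include <stddef.h>

/* Correlation of two vectors that are ALREADY z-scored.
 * Assumption: z-scores used the population standard deviation,
 * so r is the sum of products divided by the vector size m. */
double zscore_correlation(const double *a, const double *b, size_t m)
{
    double sum = 0.0;
    size_t k;

    assert(m > 0);
    for (k = 0; k < m; k++) {
        sum += a[k] * b[k];
    }
    return sum / (double)m;
}

/* Hypothetical driver: data holds n vectors of m doubles each,
 * stored back to back (row-major). out must have room for
 * n * (n - 1) / 2 results, written in the order
 * (0,1), (0,2), ..., (1,2), ... */
void all_pair_correlations(const double *data, size_t n, size_t m,
                           double *out)
{
    size_t i, j, p = 0;

    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            out[p++] = zscore_correlation(data + i * m, data + j * m, m);
        }
    }
}
```

    With about 45000 vectors there are roughly a billion pairs, so in practice each result would probably be written to the file or database as it is produced rather than collected in one output array.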

    Say N is the number of elements in @data, and M is the size of those elements; as far as I can tell I need to perform a minimum of N^2/2*M multiplications, N^2/2*(M-1) additions and N^2/2 divisions. I just don't know how much slower than C Perl (and PDL specifically) is at such things.
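    To put numbers on that estimate, here is a small sketch that counts the operations exactly. It uses N*(N-1)/2 distinct pairs, which N^2/2 approximates, and assumes each pair costs M multiplications, M-1 additions to accumulate the sum, and one division. The struct and function names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical container for the operation counts. */
struct op_counts {
    unsigned long long pairs;
    unsigned long long multiplications;
    unsigned long long additions;
    unsigned long long divisions;
};

/* Count the arithmetic needed to correlate every distinct pair
 * of n z-scored vectors, each of length m. */
struct op_counts count_ops(unsigned long long n, unsigned long long m)
{
    struct op_counts c;

    assert(n >= 1 && m >= 1);
    c.pairs = n * (n - 1) / 2;          /* distinct pairs i < j */
    c.multiplications = c.pairs * m;    /* one product per element */
    c.additions = c.pairs * (m - 1);    /* summing m products */
    c.divisions = c.pairs;              /* one divide by m per pair */
    return c;
}
```

    For N = 45000 and M = 100 this gives 1,012,477,500 pairs, and therefore about 101 billion multiplications and 100 billion additions, so the inner multiply-and-add loop is where nearly all the time goes.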

    btw, I am not quite sure what your perl example is doing (especially the sum() sub), I don't think that's what I have in mind.

    It should be pretty obvious that I know very little C, though I am sure I should be able to scrape together something this trivial if it's worth it (and a 50X speedup would definitely be worth it, if that is actually the case).

    Thanks for the help