glwtta has asked for the wisdom of the Perl Monks concerning the following question:
@data contains several tens of thousands of piddles with precomputed standard scores; $samples is the number of values in each piddle (side note: I would've thought that should be '$samples - 1', but this gives the same result as all the public implementations I've tried).

    for ($i = 0; $i < @data; $i++) {
        for ($j = $i + 1; $j < @data; $j++) {
            $pears = sum($data[$i] * $data[$j]) / $samples;
        }
    }
So my question is: if I were to reimplement this part in C, would I expect some sort of significant speedup? I'm looking for a very general ballpark, like 2-3x, or "don't bother, probably none at all".
Also I was wondering if stuffing the whole thing into a 2D piddle, rather than keeping it in a Perl array might offer some benefit? I'm inclined to think that probably not, but you never know. (memory is not an issue at all here, just execution time).
What I have right now is probably fast enough to make the project feasible - the largest dataset I have would take around 12 hours. But I have hundreds of these datasets, and even though I have the hardware to run quite a few in parallel, and most are much smaller than the largest, reducing the run-time even a little at this stage could save quite a bit of time by the time I'm done.
Any thoughts? Or other ideas I haven't thought of?
Replies are listed 'Best First'.
Re: PDL vs C speed question
  by gam3 (Curate) on May 06, 2005 at 02:07 UTC
    by glwtta (Hermit) on May 06, 2005 at 03:50 UTC
Re: PDL vs C speed question
  by DrHyde (Prior) on May 06, 2005 at 10:34 UTC
    by glwtta (Hermit) on May 06, 2005 at 23:12 UTC
Re: PDL vs C speed question
  by glwtta (Hermit) on May 06, 2005 at 23:03 UTC
Re: PDL vs C speed question
  by Roy Johnson (Monsignor) on May 06, 2005 at 23:34 UTC
    by glwtta (Hermit) on May 07, 2005 at 23:58 UTC
Re: PDL vs C speed question
  by Anonymous Monk on May 09, 2005 at 01:46 UTC
    by glwtta (Hermit) on May 09, 2005 at 12:36 UTC