@data contains several tens of thousands of piddles with precomputed standard scores; $samples is the number of values in each piddle (side note: I would've thought that should be '$samples - 1', but this gives the same result as all the public implementations I've tried). The loop that takes all the time is:

    for ($i = 0; $i < @data; $i++) {
        for ($j = $i + 1; $j < @data; $j++) {
            # with standard scores, the Pearson correlation of a pair
            # is just the mean of their elementwise product
            $pears = sum($data[$i] * $data[$j]) / $samples;
        }
    }
So my question is: if I were to reimplement this part in C, would I expect some sort of significant speedup? I'm looking for a very general ballpark, like 2-3x, or "don't bother, probably none at all".
Also, I was wondering if stuffing the whole thing into a 2D piddle, rather than keeping it in a Perl array, might offer some benefit? I'm inclined to think probably not, but you never know. (Memory is not an issue at all here, just execution time.)
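For what it's worth, this is roughly what I had in mind for the 2D version (untested sketch; it assumes each element of @data is a 1-D piddle of length $samples, and $m / $pears are just names I've made up here):

    use PDL;

    # glue the standardized piddles into one 2-D piddle with
    # dims ($samples, scalar @data): one row per original piddle
    my $m = pdl(@data);

    # with standard scores, every pairwise correlation is a dot
    # product divided by $samples, so all of them collapse into a
    # single matrix multiplication (PDL's overloaded 'x' operator)
    my $pears = ($m x $m->transpose) / $samples;

    # $pears is an N x N matrix; the correlation of piddle $i and
    # piddle $j is $pears->at($j, $i) (it's symmetric, of course)

That does redundantly compute the diagonal and both triangles, and it materializes the full N x N result at once, but since memory isn't a problem and it all happens at C speed inside PDL, I'd guess it comes out ahead of the nested Perl loops.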
What I have right now is probably fast enough to make the project feasible - the largest dataset I have would take around 12 hours; but I have hundreds of these datasets, and even though I have the hardware to do quite a few in parallel, and most are much, much smaller than the largest, reducing the run-time even a little at this stage might save quite a bit of time by the time I'm done.
Any thoughts? Or other ideas I haven't thought of?