benwills has asked for the wisdom of the Perl Monks concerning the following question:
I'm getting consistently inconsistent results with Perl's Benchmark module and wanted to see if there was something I'm missing. I've checked the Benchmark documentation and can't find anything in there on this (though perhaps I missed it). And I also used Super Search and didn't find anything there.
Context: I have a couple dozen regular expressions that will be run millions of times a day, and some billions of times a day, so I'm trying to squeeze every bit of performance out of them. I'm testing right now on a box that's doing nothing other than being connected to the internet and running these tests (no servers running, etc.), so I can't imagine the inconsistency comes from something else on the machine grabbing resources at random times.
I began noticing that subroutines which ran earlier in the benchmarking process often reported slower times. I then ran additional benchmarks, swapping which subroutine ran first. Even after swapping the order, whichever ran first would still often be the slower one, when before it had been the faster one. I couldn't reproduce this with 100% consistency, but consistently enough to notice.
In other words, if sub_1 is actually the faster subroutine by about 15% (as measured by multiple 60-second benchmarks), then in a benchmark of five seconds or less where it runs first, it will often show up as slower than sub_2.
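To make the setup concrete, here's a minimal sketch of the kind of comparison I'm running. The two subs and their regexes below are placeholders, not my real patterns; the negative counts ask Benchmark for "at least N CPU seconds" per sub:

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my $input = 'x' x 1_000;   # stand-in for the real sample text

    # Two candidate implementations of the same match (placeholders).
    sub candidate_one { my $n = () = $input =~ /x+/g;    return $n }
    sub candidate_two { my $n = () = $input =~ /x{1,}/g; return $n }

    # Short run first, long run second, same labels in both.
    cmpthese( -5,  { a_1 => \&candidate_one, a_2 => \&candidate_two } );
    cmpthese( -60, { a_1 => \&candidate_one, a_2 => \&candidate_two } );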
The pattern of inconsistency I've noticed, in short: whichever subroutine is benchmarked first tends to come out slower, and short (five-second) runs are far less consistent than 60-second runs.
I realize this is all pretty hand-wavey. But I tested it enough times with enough variations that I feel comfortable bringing it here.
If I need to run a bunch of tests and put together some concrete examples and hard numbers, I can do that. But before I did, I wanted to see if it was common knowledge that this happens, or if this is something I should look into more concretely to understand what's going on.
And finally, if there's a more consistent and precise way to do this kind of testing, what would you suggest? I like Benchmark because I can iterate quickly as I have new ideas. Running NYTProf (which I rely on at other times) simply takes too much time to iterate through a bunch of variations as I think of them.
Perl 5.18.2 running on Ubuntu 14.04
Edited:
Okay, here's an example.
Benchmark: running a_1, a_2 for at least 5 CPU seconds ...
       a_1:  5 wallclock secs ( 5.00 usr +  0.05 sys =  5.05 CPU) @ 67.33/s (n=340)
       a_2:  6 wallclock secs ( 5.29 usr +  0.00 sys =  5.29 CPU) @ 73.35/s (n=388)
      Rate  a_1  a_2
a_1 67.3/s   --  -8%
a_2 73.3/s   9%   --

Benchmark: running a_1, a_2 for at least 60 CPU seconds ...
       a_1: 60 wallclock secs (60.05 usr +  0.00 sys = 60.05 CPU) @ 74.44/s (n=4470)
       a_2: 63 wallclock secs (63.13 usr +  0.00 sys = 63.13 CPU) @ 73.28/s (n=4626)
      Rate  a_2  a_1
a_2 73.3/s   --  -2%
a_1 74.4/s   2%   --
So a_1 comes out 8-9% slower when it's the first subroutine tested in the 5-second run, but about 2% faster when it's run for 60 seconds.
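For what it's worth, the percentage matrix looks like it follows straight from the two rates (my own arithmetic, not lifted from Benchmark.pm: each cell appears to be row rate over column rate, minus one):

    # Reproducing the -8% / 9% pair in the 5-second table from its rates:
    printf "%.0f%%\n", 100 * ( 67.33 / 73.35 - 1 );   # -8%  (a_1 relative to a_2)
    printf "%.0f%%\n", 100 * ( 73.35 / 67.33 - 1 );   #  9%  (a_2 relative to a_1)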
And then the only change I make is to switch the names of a_1 and a_2 so that they run in the opposite order.
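Concretely, the swap is just exchanging the labels in the hash handed to cmpthese; as far as I can tell, Benchmark runs the labeled subs in sorted-name order, so the label decides which goes first (again using the placeholder sub names from the sketch above):

    # Original order: the sub labeled a_1 runs first.
    cmpthese( -5, { a_1 => \&candidate_one, a_2 => \&candidate_two } );

    # Swapped: same two subs, labels exchanged, so the old a_1 now runs second.
    cmpthese( -5, { a_1 => \&candidate_two, a_2 => \&candidate_one } );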
And now (below), the original a_1 (now a_2) is 9% faster on the 5-second test, since it's the second one run, and it's 6% faster on the 60-second run.
Benchmark: running a_1, a_2 for at least 5 CPU seconds ...
       a_1:  5 wallclock secs ( 4.98 usr +  0.08 sys =  5.06 CPU) @ 68.58/s (n=347)
       a_2:  5 wallclock secs ( 5.24 usr +  0.00 sys =  5.24 CPU) @ 74.62/s (n=391)
      Rate  a_1  a_2
a_1 68.6/s   --  -8%
a_2 74.6/s   9%   --

Benchmark: running a_1, a_2 for at least 60 CPU seconds ...
       a_1: 60 wallclock secs (59.26 usr +  0.84 sys = 60.10 CPU) @ 70.72/s (n=4250)
       a_2: 63 wallclock secs (62.98 usr +  0.00 sys = 62.98 CPU) @ 74.63/s (n=4700)
      Rate  a_1  a_2
a_1 70.7/s   --  -5%
a_2 74.6/s   6%   --
This is the pattern I'm seeing fairly often.