No, you can't have references of two different types that have the same address, so there would be no point for the code to compare the type of references.
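The point about addresses can be seen directly: in Perl, `==` on two references compares the addresses they hold, so numeric equality already implies the same referent (and therefore the same type). A minimal sketch (variable names are mine):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two references compare numerically equal only when they hold the
# same address, i.e. point at the very same variable -- so an address
# match already implies identical referents (and identical types).
my @array = (1, 2, 3);
my $r1 = \@array;
my $r2 = \@array;      # another reference to the same array
my $r3 = [1, 2, 3];    # a different, anonymous array

print "same referent\n"      if $r1 == $r2;   # true: same address
print "different referent\n" if $r1 != $r3;   # true: different address
```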
I ran your code and the "strings" case was always the fastest. The other cases except for "take a reference" were within 20% of this speed. Some runs had "quick_refs" faster than "numbers", some vice versa.
Each of these facts reinforces my opinion that this is yet another example of a premature nano-optimization. q-:
Even the "take a reference" case was only twice as slow. Having some comparisons be twice as slow is likely to make my real-life script, um... 0.1% slower. I don't care. I've already wasted more time than that would ever save me adjusting my .sig. (:
Update:

"I wind up with numbers fastest, hands down"

I don't see how you can call 15% in a benchmark "hands down". In a benchmark, I call 20% "indeterminate". And, yes, I know you didn't start this thread. :)
- tye (yeah, this part)
(Edit: I agree with tye, it's not the difference between string, int, or ref compares that's going to make or break the script. But it was still a fun benchmark. Oh, and 10 or 15% over 3 seconds is fairly consistently repeatable. 5% would be "noise". :))
Interesting that you wind up with strings fastest. I wind up with numbers fastest, hands down, using 5.6.1 on Cygwin.
I agree that this is a micro (or nano:)) optimization; I was just responding to his query as to why the ref compare was so slow in his code.
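The original ref_cmp.pl isn't posted in the thread, so here is a hedged sketch of what such a benchmark might look like using the standard Benchmark module. The case bodies, and the guess that "qrefs" means references pre-converted to plain numbers, are assumptions based only on the case names in the output:

```perl
#!/usr/bin/perl
# Hedged reconstruction: the real ref_cmp.pl is not shown, so these
# case bodies are guesses inferred from the case names in the output.
use strict;
use warnings;
use Benchmark qw(cmpthese);

my ($x, $y)   = (42, 42);
my ($s, $t)   = ('foo', 'foo');
my ($qx, $qy) = (0 + \$x, 0 + \$x);   # addresses as plain numbers

cmpthese(-1, {
    numbers => sub { $x == $y },      # plain numeric compare
    strings => sub { $s eq $t },      # plain string compare
    refs    => sub { \$x == \$x },    # takes two references per call
    qrefs   => sub { $qx == $qy },    # compares pre-taken addresses
});
```

The `-1` asks Benchmark to run each case for at least one CPU second, the same mechanism as the "at least 3 CPU seconds" in the output below.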
$ perl ref_cmp.pl
Benchmark: running numbers 2, numbers 4, qrefs 2, qrefs 4, refs 2, refs 4, strings 2, strings 4, each for at least 3 CPU seconds...
 numbers 2:  4 wallclock secs ( 3.08 usr +  0.00 sys =  3.08 CPU) @ 740880.04/s (n=2278947)
 numbers 4:  5 wallclock secs ( 3.18 usr +  0.03 sys =  3.21 CPU) @ 497290.48/s (n=1593816)
   qrefs 2:  3 wallclock secs ( 3.08 usr + -0.01 sys =  3.07 CPU) @ 640138.26/s (n=1967785)
   qrefs 4:  4 wallclock secs ( 3.02 usr +  0.00 sys =  3.02 CPU) @ 428683.86/s (n=1296340)
    refs 2:  4 wallclock secs ( 3.11 usr +  0.01 sys =  3.12 CPU) @ 414780.49/s (n=1292456)
    refs 4:  5 wallclock secs ( 3.02 usr +  0.01 sys =  3.03 CPU) @ 280746.21/s (n=851784)
 strings 2:  3 wallclock secs ( 3.06 usr +  0.02 sys =  3.08 CPU) @ 642239.02/s (n=1974885)
 strings 4:  4 wallclock secs ( 3.11 usr +  0.01 sys =  3.12 CPU) @ 476537.87/s (n=1484892)
              Rate  refs 4  refs 2 qrefs 4 strings 4 numbers 4 qrefs 2 strings 2 numbers 2
refs 4    280746/s      --    -32%    -35%      -41%      -44%    -56%      -56%      -62%
refs 2    414780/s     48%      --     -3%      -13%      -17%    -35%      -35%      -44%
qrefs 4   428684/s     53%      3%      --      -10%      -14%    -33%      -33%      -42%
strings 4 476538/s     70%     15%     11%        --       -4%    -26%      -26%      -36%
numbers 4 497290/s     77%     20%     16%        4%        --    -22%      -23%      -33%
qrefs 2   640138/s    128%     54%     49%       34%       29%      --       -0%      -14%
strings 2 642239/s    129%     55%     50%       35%       29%      0%        --      -13%
numbers 2 740880/s    164%     79%     73%       55%       49%     16%       15%        --
--
Mike
Actually, with this very code I got 15% variation between runs of identical code. I find this is often true when benchmarking nano-optimizations like this. That is why I consider 20% to be "indeterminate" (I add 5% to "be safe") for nano-optimization benchmarks.
Now, for macro-optimization benchmarks, I consider 5% to be "indeterminate". The problem with nano-optimization benchmarks is that the run time of the code being timed is so minuscule per iteration that slight variations in the "outside" code can have a relatively large impact on the timing results.
And, of course, the other problem with nano-optimization is that even if you find a 2-fold speed-up for one of these tiny, tiny operations, the actual change you end up seeing in how long it takes your script to run is usually a tiny fraction of that.
- tye (but they are fun, aren't they?)
Very interesting! I would have thought that taking a reference was a compile-time thing!
Perhaps comparing two variables, rather than a variable against a literal, makes a difference too. I suspect that would slow down the numeric case as well, but I don't have time to try it right now.
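For anyone who does have the time, one quick way to try that guess with the standard Benchmark module might look like this (the case names are mine; results will of course vary by machine and perl build):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Pit a variable-vs-variable numeric compare against a
# variable-vs-literal one; any consistent gap between the two
# rates would support the guess above.
my ($x, $y) = (42, 42);

cmpthese(-1, {
    'var vs var'     => sub { $x == $y },
    'var vs literal' => sub { $x == 42 },
});
```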