in reply to RE:(3) Hash of hashes question
in thread Hash of hashes question

I think it is the because sort() is very fast, and you are comparing it with an if/then (?:) control structure, which is inherantly slow.

Well, another thing to do besides benchmarking is profiling:

#!/usr/bin/perl use strict; use Devel::OpProf qw(profile print_stats zero_stats ); profile(1); # turn on profiling my $now = 8; # we'll pretend it's between 8 and 9 PM my %url = ( monday => { @{[map(($_,1), (1..1000))]} } ); #start profiling zero_stats; $now = (sort grep {$_ <= $now} keys %{$url{'monday'}})[-1]; print_stats; zero_stats; $now = ($now < $_ && $_ < 8 ? $_ : $now) for keys %{$url{'monday'}}; print_stats;

Well, this tells us a lot!
First algorithm
null operation 1006
block 1001
private variable1001
grep iterator 1000
numeric le (<=) 1000
scalar variable 1000
pushmark7
next statement5
glob value 3
constant item 2
subroutine entry2
private hash1
private array 1
sort1
keys1
conditional expression 1
list slice1
hash dereference 1
hash element 1
array dereference 1
scalar assignment 1
block entry 1
grep 1
Second algorithm
private variable 3000
null operation 2006
logical and (&&) 2001
glob value 1996
numeric lt (<) 1992
scalar dereference 1992
next statement1006
conditional expression 1001
foreach loop iterator 1001
scalar assignment 1000
iteration finalizer 1000
constant item 993
pushmark 4
subroutine entry2
block 1
keys 1
foreach loop entry1
array dereference 1
block entry 1
private array 1
private hash1
hash dereference1
hash element 1
loop exit 1

So we see that just the control loop by iself takes almost as much "action" as the whole first algorithm

Paris Sinclair    |    4a75737420416e6f74686572
pariss@efn.org    |    205065726c204861636b6572
I wear my Geek Code on my finger.

Replies are listed 'Best First'.
RE:(5) Hash of hashes question
by Russ (Deacon) on Jun 19, 2000 at 23:10 UTC
    use Devel::OpProf qw(profile print_stats zero_stats ); profile(1); # turn on profiling

    Wow! Devel::OpProf. My toolbox just got bigger...

    So, the follow-up question: is there a better way to code max() than the control structure I used? The theory is that sort() should be less efficient than max(). There must be a more efficient way to code max(), then. max() should be in O(1), where Perl's sort is (usually) in O(N log N). How can we get the best performance for the "greatest value less than another value" algorithm?

    Russ

      Okay, here is the max() given in Mastering Algorithms with Perl (O'Reilly):

      sub max { # Numbers. my $max = shift; foreach ( @_ ) { $max = $_ if $_ > $max } return $max; }

      So, modifying that to fit into our existing benchmark/profiling code, it looks like:

      foreach ( keys %{$url{'monday'}} ) { $now = $_ if $_ > $now }

      And then, the score:

       grep

      constant item2
      numeric le (<=)1000
      shift1
      sort1
      hash dereference2
      hash element1
      pushmark10
      grep1
      grep iterator1000
      scalar variable2001
      next statement6
      list slice1
      scalar assignment2
      keys1
      subroutine exit1
      block1000
      array dereference1
      glob value5
      private variable3
      subroutine entry4
       if/then/else

      constant item993
      foreach loop entry1
      shift1
      foreach loop iterator1001
      iteration finalizer1000
      hash dereference2
      hash element1
      pushmark7
      scalar variable4992
      next statement1007
      logical and (&&)2001
      scalar assignment1001
      conditional expression1000
      keys1
      subroutine exit1
      loop exit1
      array dereference1
      glob value6
      private variable3
      subroutine entry4
      numeric lt (<)1992
       book max()

      constant item1
      foreach loop entry1
      shift1
      foreach loop iterator1001
      iteration finalizer1000
      hash dereference2
      numeric gt (>)1000
      hash element1
      pushmark7
      scalar variable2292
      next statement2006
      logical and (&&)2001
      scalar assignment147
      keys1
      subroutine exit1
      loop exit1
      array dereference1
      glob value6
      private variable3
      subroutine entry4
      Benchmark: timing 10000 iterations of bench_grep, bench_if_then, bench_max...
      bench_grep: 115 wallclock secs (111.04 usr +  0.01 sys = 111.05 CPU) @ 90.05/s (n=10000)
      bench_if_then: 73 wallclock secs (70.02 usr +  0.01 sys = 70.03 CPU) @ 142.80/s (n=10000)
       bench_max: 64 wallclock secs (62.72 usr +  0.01 sys = 62.73 CPU) @ 159.41/s (n=10000)
      

      Well, we have some surprises here! grep comes in last, even though it has the least operations. As you can see, not all operations are equal... but the book technique is the big winner. Which tells me it was worth the $35US.

      The results also show that Devel::OpProf is a useful tool, but only as an accessory to Benchmark. On it's own, it can be misleading.

      Paris Sinclair    |    4a75737420416e6f74686572
      pariss@efn.org    |    205065726c204861636b6572
      I wear my Geek Code on my finger.
      
        Very interesting.

        I started with the Mastering Perl Algorithms code (I left it in a subroutine), and abandoned it for the ternary when the ternary was faster. Of course, the subroutine entry was the main speed bottleneck.

        This is fascinating. In the if/then/else results, we pay a considerably higher cost for the "constant item," "scalar variable" and "scalar assignment" operations. I wonder why that is so... Even the numeric comparison is more expensive in if/then/else than book max.

        Also, your grep benchmarked slower than if/then/else. I saw the opposite. Would you post your complete code, to let me compare?

        Thanks, you've made a fantastic post!

        Russ