in reply to RE: RE:(3) Hash of hashes question
in thread Hash of hashes question

use Devel::OpProf qw(profile print_stats zero_stats ); profile(1); # turn on profiling

Wow! Devel::OpProf. My toolbox just got bigger...

So, the follow-up question: is there a better way to code max() than the control structure I used? The theory is that sort() should be less efficient than max(). There must be a more efficient way to code max(), then. max() should be in O(1), where Perl's sort is (usually) in O(N log N). How can we get the best performance for the "greatest value less than another value" algorithm?

Russ

Replies are listed 'Best First'.
RE: RE:(5) Hash of hashes question
by Aighearach (Initiate) on Jun 20, 2000 at 04:00 UTC

    Okay, here is the max() given in Mastering Algorithms with Perl (O'Reilly):

    sub max { # Numbers. my $max = shift; foreach ( @_ ) { $max = $_ if $_ > $max } return $max; }

    So, modifying that to fit into our existing benchmark/profiling code, it looks like:

    foreach ( keys %{$url{'monday'}} ) { $now = $_ if $_ > $now }

    And then, the score:

     grep

    constant item2
    numeric le (<=)1000
    shift1
    sort1
    hash dereference2
    hash element1
    pushmark10
    grep1
    grep iterator1000
    scalar variable2001
    next statement6
    list slice1
    scalar assignment2
    keys1
    subroutine exit1
    block1000
    array dereference1
    glob value5
    private variable3
    subroutine entry4
     if/then/else

    constant item993
    foreach loop entry1
    shift1
    foreach loop iterator1001
    iteration finalizer1000
    hash dereference2
    hash element1
    pushmark7
    scalar variable4992
    next statement1007
    logical and (&&)2001
    scalar assignment1001
    conditional expression1000
    keys1
    subroutine exit1
    loop exit1
    array dereference1
    glob value6
    private variable3
    subroutine entry4
    numeric lt (<)1992
     book max()

    constant item1
    foreach loop entry1
    shift1
    foreach loop iterator1001
    iteration finalizer1000
    hash dereference2
    numeric gt (>)1000
    hash element1
    pushmark7
    scalar variable2292
    next statement2006
    logical and (&&)2001
    scalar assignment147
    keys1
    subroutine exit1
    loop exit1
    array dereference1
    glob value6
    private variable3
    subroutine entry4
    Benchmark: timing 10000 iterations of bench_grep, bench_if_then, bench_max...
    bench_grep: 115 wallclock secs (111.04 usr +  0.01 sys = 111.05 CPU) @ 90.05/s (n=10000)
    bench_if_then: 73 wallclock secs (70.02 usr +  0.01 sys = 70.03 CPU) @ 142.80/s (n=10000)
     bench_max: 64 wallclock secs (62.72 usr +  0.01 sys = 62.73 CPU) @ 159.41/s (n=10000)
    

    Well, we have some surprises here! grep comes in last, even though it has the least operations. As you can see, not all operations are equal... but the book technique is the big winner. Which tells me it was worth the $35US.

    The results also show that Devel::OpProf is a useful tool, but only as an accessory to Benchmark. On it's own, it can be misleading.

    Paris Sinclair    |    4a75737420416e6f74686572
    pariss@efn.org    |    205065726c204861636b6572
    I wear my Geek Code on my finger.
    
      Very interesting.

      I started with the Mastering Perl Algorithms code (I left it in a subroutine), and abandoned it for the ternary when the ternary was faster. Of course, the subroutine entry was the main speed bottleneck.

      This is fascinating. In the if/then/else results, we pay a considerably higher cost for the "constant item," "scalar variable" and "scalar assignment" operations. I wonder why that is so... Even the numeric comparison is more expensive in if/then/else than book max.

      Also, your grep benchmarked slower than if/then/else. I saw the opposite. Would you post your complete code, to let me compare?

      Thanks, you've made a fantastic post!

      Russ

        My complete code is a CGI that spits out the results inside the tables, so it's a bit ugly. Here I've commented out the CGI line, so it will print out the HTML without the header. You might want to redirect the output to a file, and view it in a browser.

        I did wrap each version into a sub, as a max() should usually be used that way.

        Also note that this code REQUIRES perl 5.6.0. If you have an old version, you will need to replace "our" with "use vars"

        #!/usr/bin/perl use strict; #use CGI qw( header ); print header(); use Benchmark; use Devel::OpProf qw(profile zero_stats stats ); profile(1); # turn on profiling our $now = 8; # we'll pretend it's between 8 and 9 PM our %url = ( monday => { @{[map(($_,1), (1..1000))]} } ); #start profiling $|++; print "<TABLE>\n"; print "<TR><TD>\n"; my_profile( "grep",\&test_grep ); print "</TD><TD>\n"; my_profile( "if/then/else",\&test_if_then ); print "</TD><TD>\n"; my_profile( "book max()",\&test_max ); print "</TD></TR>\n"; # now lets benchmark print "<TR><TD COLSPAN=3><PRE>"; timethese(10000, { bench_grep => q{ test_grep(); }, bench_if_then => q{ test_if_then(); }, bench_max => q{ test_max(); }} ); print "</PRE></TD></TR></TABLE>\n"; sub test_grep { $now = (sort grep {$_ <= $now} keys %{$url{'monday'}})[-1]; } sub test_if_then { $now = ($now < $_ && $_ < 8 ? $_ : $now) for keys %{$url{'monday'} +}; } sub test_max { foreach ( keys %{$url{'monday'}} ) { $now = $_ if $_ > $now } } sub my_profile { my $title = shift; my $test_sub = shift || return; zero_stats; &$test_sub; stats_td( $title ); # print_stats; } sub stats_td { my $title = shift; my %stats = %{stats()}; print "<TABLE><TR><TH COLSPAN=2>&nbsp;$title<BR><HR></TH></TR>\n"; while ( my ( $stat, $value ) = each %stats ) { $stat =~ s/</\&lt\;/; $stat =~ s/>/\&gt\;/; if ( $value ) { $value =~ s/>/\&gt\;/; $value =~ s/</\&lt\;/; print "<TR><TH>$stat</TH><TD>$value</TD></TR>\n"; } } print "</TABLE>\n"; }

        I just whipped this up without much planning, so there are some problems with it. One is that the call to stat() should be in my_profile() not stats_td(). But, because of it's location in stats_td(), it is a O(1) error, and so it shouldn't skew the results much.

        Paris Sinclair    |    4a75737420416e6f74686572
        pariss@efn.org    |    205065726c204861636b6572
        I wear my Geek Code on my finger.