RE:(3) Hash of hashes question

True. Of course, in this application, there appear to be a maximum of 24 possible entries...

Being curious, I did a quick benchmark. However, its results defy logic (my (sort grep keys)[-1] is always faster than a O(1) "max()" algorithm!)

Since I am obviously missing something, would you be willing to post some benchmarking code showing the efficiency differences between "sort" and "max()" in this context?

Russ

P.S. Just for the sake of helping me find my silliness, here is what I was running. Changing the map to larger numbers (to make the source hash bigger) made no difference to the benchmark times.

#!/usr/bin/perl -w
use strict;
use Benchmark;
my $now = 8; # we'll pretend it's between 8 and 9 PM
my %url = (
  monday => {
    @{[map(($_,1), (1..1000))]}
  }
);
timethese(100000, {
  a => q{
    $now = (sort grep {$_ <= $now} keys %{$url{'monday'}})[-1];
  },
  b => q{
    $now = ($now < $_ && $_ < 8 ? $_ : $now) for keys %{$url{'monday'}
+};
  }
});
[download]

The results were the same even without doing the assignment each time in b (only assign when we should, rather than each time through the loop). I left it this way for no apparent reason...

Comment on RE:(3) Hash of hashes question Download Code

Replies are listed 'Best First'.

RE: RE:(3) Hash of hashes question
by Aighearach (Initiate) on Jun 19, 2000 at 11:56 UTC

I think it is the because sort() is very fast, and you are comparing it with an if/then (?:) control structure, which is inherantly slow.

Well, another thing to do besides benchmarking is profiling:

#!/usr/bin/perl
use strict;
use Devel::OpProf qw(profile print_stats zero_stats );
profile(1);    # turn on profiling
my $now = 8; # we'll pretend it's between 8 and 9 PM
my %url = (
       monday => {
           @{[map(($_,1), (1..1000))]}
       }
       );
#start profiling
zero_stats;
$now = (sort grep {$_ <= $now} keys %{$url{'monday'}})[-1];
print_stats;

zero_stats;
$now = ($now < $_ && $_ < 8 ? $_ : $now) for keys %{$url{'monday'}};
print_stats;
[download]

Well, this tells us a lot!

First algorithm

null operation 1006

block 1001

private variable 1001

grep iterator 1000

numeric le (<=) 1000

scalar variable 1000

pushmark 7

next statement 5

glob value 3

constant item 2

subroutine entry 2

private hash 1

private array 1

sort 1

keys 1

conditional expression 1

list slice 1

hash dereference 1

hash element 1

array dereference 1

scalar assignment 1

block entry 1

grep 1

Second algorithm

private variable 3000

null operation 2006

logical and (&&) 2001

glob value 1996

numeric lt (<) 1992

scalar dereference 1992

next statement 1006

conditional expression 1001

foreach loop iterator 1001

scalar assignment 1000

iteration finalizer 1000

constant item 993

pushmark 4

subroutine entry 2

block 1

keys 1

foreach loop entry 1

array dereference 1

block entry 1

private array 1

private hash 1

hash dereference 1

hash element 1

loop exit 1

So we see that just the control loop by iself takes almost as much "action" as the whole first algorithm

Paris Sinclair    |    4a75737420416e6f74686572
pariss@efn.org    |    205065726c204861636b6572
I wear my Geek Code on my finger.

[reply]
[d/l]

RE:(5) Hash of hashes question

by Russ (Deacon) on Jun 19, 2000 at 23:10 UTC

use Devel::OpProf qw(profile print_stats zero_stats ); profile(1); # turn on profiling

Wow! Devel::OpProf. My toolbox just got bigger...

So, the follow-up question: is there a better way to code max() than the control structure I used? The theory is that sort() should be less efficient than max(). There must be a more efficient way to code max(), then. max() should be in O(1), where Perl's sort is (usually) in O(N log N). How can we get the best performance for the "greatest value less than another value" algorithm?

Russ

[reply]

RE: RE:(5) Hash of hashes question

by Aighearach (Initiate) on Jun 20, 2000 at 04:00 UTC

Okay, here is the max() given in Mastering Algorithms with Perl (O'Reilly):

sub max { # Numbers.
    my $max = shift;
    foreach ( @_ ) { $max = $_ if $_ > $max }
    return $max;
}
[download]

So, modifying that to fit into our existing benchmark/profiling code, it looks like:

foreach ( keys %{$url{'monday'}} ) { $now = $_ if $_ > $now }
[download]

And then, the score:

grep
constant item	2
numeric le (<=)	1000
shift	1
sort	1
hash dereference	2
hash element	1
pushmark	10
grep	1
grep iterator	1000
scalar variable	2001
next statement	6
list slice	1
scalar assignment	2
keys	1
subroutine exit	1
block	1000
array dereference	1
glob value	5
private variable	3
subroutine entry	4

if/then/else
constant item	993
foreach loop entry	1
shift	1
foreach loop iterator	1001
iteration finalizer	1000
hash dereference	2
hash element	1
pushmark	7
scalar variable	4992
next statement	1007
logical and (&&)	2001
scalar assignment	1001
conditional expression	1000
keys	1
subroutine exit	1
loop exit	1
array dereference	1
glob value	6
private variable	3
subroutine entry	4
numeric lt (<)	1992

book max()
constant item	1
foreach loop entry	1
shift	1
foreach loop iterator	1001
iteration finalizer	1000
hash dereference	2
numeric gt (>)	1000
hash element	1
pushmark	7
scalar variable	2292
next statement	2006
logical and (&&)	2001
scalar assignment	147
keys	1
subroutine exit	1
loop exit	1
array dereference	1
glob value	6
private variable	3
subroutine entry	4

Benchmark: timing 10000 iterations of bench_grep, bench_if_then, bench_max...
bench_grep: 115 wallclock secs (111.04 usr +  0.01 sys = 111.05 CPU) @ 90.05/s (n=10000)
bench_if_then: 73 wallclock secs (70.02 usr +  0.01 sys = 70.03 CPU) @ 142.80/s (n=10000)
 bench_max: 64 wallclock secs (62.72 usr +  0.01 sys = 62.73 CPU) @ 159.41/s (n=10000)

Well, we have some surprises here! grep comes in last, even though it has the least operations. As you can see, not all operations are equal... but the book technique is the big winner. Which tells me it was worth the $35US.

The results also show that Devel::OpProf is a useful tool, but only as an accessory to Benchmark. On it's own, it can be misleading.

Paris Sinclair    |    4a75737420416e6f74686572
pariss@efn.org    |    205065726c204861636b6572
I wear my Geek Code on my finger.

[reply]
[d/l]
[select]

RE:(7) Hash of hashes question

by Russ (Deacon) on Jun 20, 2000 at 05:28 UTC

RE: RE:(7) Hash of hashes question

by Aighearach (Initiate) on Jun 20, 2000 at 05:45 UTC