in reply to Re: Beauty is in the eye of the beholder
in thread Beauty is in the eye of the beholder

One thousand pardons, m'lord japhy, but you wrote:

But if you do this a lot of times, you're kissing inefficiency on its ratty lips. Use a hash:

In the oft-maligned name of pragmatism, I cobbled together the following brief, unscientific test:

#!/usr/bin/perl -w

use Benchmark;

@List = ("aaaa" .. "zzzz");
print "Elements:", scalar @List, "\n";

sub first (&@) {
    my $cref = shift;
    $cref->($_) and return $_ for @_;
}

$t0 = new Benchmark;
$first_match = first {$_ eq "zzzz"} @List;
$t1 = new Benchmark;
$td = timediff($t1, $t0);
print "Found it! (sub)\n" if $first_match;
print "sub took: ", timestr($td), "\n";

# Crufty way to figure out how big I am on a Linux box - /msg me with
# improvements, please! -McD
$size = (split(" ", `ps -hlp $$`))[6];
print "Size: $size\n";

$t0 = new Benchmark;
my $Found=0;
for (@List) {
    if ($_ eq "zzzz") {
        $Found=1;
        last;
    }
}
$t1 = new Benchmark;
$td = timediff($t1, $t0);
print "Found it! (scan)\n" if $Found;
print "scan took: ", timestr($td), "\n";

$size = (split(" ", `ps -hlp $$`))[6];
print "Size: $size\n";

$t0 = new Benchmark;
my %seen;
@seen{@List} = ();
$t1 = new Benchmark;
$td = timediff($t1, $t0);
if (exists $seen{"zzzz"}) {
    print "Found it! (hash)\n";
}
print "hash took: ", timestr($td), "\n";

$size = (split(" ", `ps -hlp $$`))[6];
print "Size: $size\n";

To make this a worst-case for the linear teams, I searched for the last item in the list.

As it turns out, the least appealing code is the fastest - but while the hash approach may not be kissing performance inefficiency on its ratty lips, it's certainly been caught in some kind of carnal embrace with memory inefficiency.
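
On the size probe: the script shells out to ps and picks a column by position, which appears to be the virtual size (VSZ) on a procps-style Linux ps. A possibly less crufty sketch, assuming a Linux /proc filesystem, reads the VmSize line from /proc/PID/status instead. The helper name vm_size_kb is made up for illustration, and the number it reports should only roughly track what ps shows.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return the virtual
# memory size of a process in kB by reading /proc, or undef if the
# status file is not available (e.g. on a non-Linux box).
sub vm_size_kb {
    my $pid = shift // $$;
    open my $fh, '<', "/proc/$pid/status" or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmSize:\s+(\d+)\s+kB/;
    }
    return;
}

my $before = vm_size_kb();

# Build a smaller version of the lookup hash so the growth is visible.
my %seen;
@seen{ "aaa" .. "zzz" } = ();

my $after = vm_size_kb();

if (defined $before && defined $after) {
    print "Size before: $before kB, after: $after kB\n";
} else {
    print "No /proc status file here; size unavailable\n";
}
```

This avoids forking a ps process for each measurement, at the cost of being tied to Linux.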

Here are the results on my box:

./existance.pl
Elements:456976
Found it! (sub)
sub took:  2 wallclock secs ( 1.49 usr +  0.01 sys =  1.50 CPU)
Size: 58632
Found it! (scan)
scan took:  1 wallclock secs ( 0.40 usr +  0.00 sys =  0.40 CPU)
Size: 58632
Found it! (hash)
hash took:  2 wallclock secs ( 0.90 usr +  0.15 sys =  1.05 CPU)
Size: 88324
Of course, we've strayed far from meditation and into experimentation. I'm sorry, what were we optimizing for again? :-)

Peace,
-McD

Re: Re: Re: Beauty is in the eye of the beholder
by japhy (Canon) on Feb 21, 2001 at 02:47 UTC
    You quoted me, and then didn't follow my lead. "But if you do this a lot of times, you're kissing inefficiency on its ratty lips."

    You performed these tests ONCE each. Use the first() function several times. Scan through the array several times. Create the hash ONCE, and use exists() many times.
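
The build-once, look-up-many pattern described above can be sketched roughly like this. It is a minimal illustration, not the test from the thread: the list size, the 100 random targets, and the variable names are all assumptions chosen to keep the run short.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;

my @list    = ("aaa" .. "zzz");
my @targets = map { $list[rand @list] } 1 .. 100;

# Pay the construction cost exactly once...
my $t0 = Benchmark->new;
my %seen;
@seen{@list} = ();
my $t1 = Benchmark->new;

# ...then each membership test is a single hash probe via exists().
my $hits = grep { exists $seen{$_} } @targets;
my $t2 = Benchmark->new;

print "build took:   ", timestr(timediff($t1, $t0)), "\n";
print "lookups took: ", timestr(timediff($t2, $t1)), "\n";
print "hits: $hits of ", scalar(@targets), "\n";
```

Timing the build and the lookups separately shows where the cost sits: the hash only pays off once the number of lookups is large enough to amortize the build.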

    japhy -- Perl and Regex Hacker
      Nuts. You're absolutely right.

      I misunderstood what you meant at first - now I see. This is a perfect example of the classic tradeoff of speed vs. memory.

      Which brings us back to meditation, after all, doesn't it?

      Time for a beer, methinks. All this tinkering and meditating is thirsty work. Thanks for following up!

      Peace,
      -McD

Re: Re: Re: Beauty is in the eye of the beholder
by japhy (Canon) on Feb 21, 2001 at 03:02 UTC
    Here's my test. first() can be made faster by passing an array reference, not the array.

    (RESULTS)
    Elements:17576
    sub took: 52 (51.14 usr + 0.00 sys = 51.14 CPU)
    scan took: 17 (13.85 usr + 0.00 sys = 13.85 CPU)
    hash took: 0 ( 0.17 usr + 0.13 sys = 0.30 CPU)

    (CODE)
    #!/usr/bin/perl -w

    use Benchmark;

    @List = ("aaa" .. "zzz");
    print "Elements:", scalar @List, "\n";

    @rand_list = map $List[rand @List], 1 .. 100;

    sub first (&@) {
        my $cref = shift;
        $cref->($_) and return $_ for @_;
    }

    $t0 = new Benchmark;
    for (@rand_list) {
        $rand_element = $_;
        $first_match = first {$_ eq $rand_element } @List;
    }
    $t1 = new Benchmark;
    $td = timediff($t1, $t0);
    print "sub took: ", timestr($td), "\n";

    $t0 = new Benchmark;
    for (@rand_list) {
        $rand_element = $_;
        for (@List) { last if $_ eq $rand_element }
    }
    $t1 = new Benchmark;
    $td = timediff($t1, $t0);
    print "scan took: ", timestr($td), "\n";

    $t0 = new Benchmark;
    my %seen;
    @seen{@List} = ();
    $t1 = new Benchmark;
    # The timer stops here, so "hash took" covers building the hash only,
    # not the 100 lookups below.
    $td = timediff($t1, $t0);
    for (@rand_list) { 1 if exists $seen{$_} }
    print "hash took: ", timestr($td), "\n";
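
The array-reference idea mentioned at the top of this post is not shown in the code, so here is one way it might look. This is a sketch under assumptions, not japhy's implementation: the name first_in is hypothetical, and it relies on the (&\@) prototype so that Perl passes the named array as a single reference rather than pushing every element onto the argument list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of first(): with the (&\@) prototype the caller
# still writes a plain array, but the sub receives one array reference.
sub first_in (&\@) {
    my ($cref, $aref) = @_;
    for (@$aref) {
        # $_ is aliased to each element, so the caller's block can test it.
        return $_ if $cref->($_);
    }
    return;    # nothing matched
}

my @list   = ("aaa" .. "zzz");
my $target = "zzz";

my $match = first_in { $_ eq $target } @list;
print defined $match ? "Found $match\n" : "Not found\n";
```

The call site looks the same as with the original first(), but the second argument now has to be a real array variable, not an arbitrary list.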


    japhy -- Perl and Regex Hacker