RE: List processing (benchmark results)

Update: btrott found that I had left out a piece of the original code. New code and results are shown here.

Being interminably curious, I benchmarked a few of the suggested snippets of code.

Benchmark: timing 1000 iterations of Odud, PMGurus, btrott, visnu...
      Odud: 10 wallclock secs ( 9.61 usr +  0.01 sys =  9.62 CPU)
   PMGurus:  6 wallclock secs ( 5.41 usr +  0.01 sys =  5.42 CPU)
    btrott:  7 wallclock secs ( 7.17 usr +  0.04 sys =  7.21 CPU)
     visnu: 11 wallclock secs (11.31 usr +  0.00 sys = 11.31 CPU)
[download]

So, combining the two actions into one (btrott's suggestion) results in an immediate 25% improvement and further refinement by the Perl Monks Guru squad reduced the times by another 25% over that. The one-liner (and I always lean toward one-liners in Perl) performed worse than the original... How depressing! :-)

I am pasting my entire testing code (since I have been known to do silly things in the name of benchmarking ;-))

#!/usr/bin/perl -w
use strict;
use Benchmark;
use vars qw/@words @common/;

@words  = qw/Notice the use of an anonymous inner class Here an instan
+ce of an
unnamed class derived from Thread is created and a run method that
calls endApp is defined for the class

All of this means that when the application is about to terminate
the JVM starts the thread representing by the passed-in thread
object When the thread starts the run method is called The run
method calls endApp which closes the log file This flushes the
output buffer

To underscore the effect of the shutdown hook comment out the
addShutdownHook lines in ShutdownDemo You'll see that the log
file is empty when the program terminates

You can register multiple shutdown hooks In this case each thread
that represents a hook is started in an unspecified order and the
various threads run simultaneously You cannot register or
unregister a shutdown hook after the shutdown sequence has started
Doing so results in an IllegalStateException

Because shutdown hooks run as threads you must use thread-safe
programming techniques Otherwise you risk having threads interfere
with each other Also it's wise to design your application for
simple and fast shutdown processing For example you might run
into trouble if your application uses services during shutdown that
are themselves in the processing of being shut down

There are cases where shutdown processing does not happen even if
you have registered shutdown hooks One example is corrupted native
methods for example when you dereference a null pointer in C code

This feature is somewhat similar to the atexit library function in
/;

@common = qw|a an and at the to|;

timethese(1000, {
  Odud => q{
    my (@only, %r);

    foreach $r (@words){
      $r{lc $r} = 1;
    }
    my @uniqwords = sort keys %r;

    @seen{@common} = ();
    foreach $item (@uniqwords) {
      push(@only,$item) unless exists $seen{$item};
    }
    my @newwords = @only;
  },
  PMGurus => q{
    my %seen;
    @seen{@common} = (1) x @common;
    my @newwords = sort grep !$seen{+lc}++, @words;
  },
  btrott => q{
    my(%seen, %r);
    @seen{@common} = ();
    for my $r (@words) {
      next if exists $seen{ lc $r };
      $r{lc $r} = 1;
    }
    my @newwords = sort keys %r;
  },
  visnu => q{
    my @newwords = sort keys %{+{ map { !$common{$_ = lc $_} ? ($_, 1)
+ : () } 
                                      @words }};
  },
});
[download]

Russ
Brainbench 'Most Valuable Professional' for Perl

Comment on RE: List processing (benchmark results) Select or Download Code

Replies are listed 'Best First'.
RE: RE: List processing (benchmark results) by visnu (Sexton) on Jul 12, 2000 at 06:26 UTC
that is disappointing... the overhead of map creating the temporary array must be too great. hmmm.... now i'm trying to think of a way to pre-allocate so i can skew the benchmark results more in my favor..	[reply]
RE:(3) List processing (benchmark results) by Russ (Deacon) on Jul 12, 2000 at 06:27 UTC
I hope you can. I'd love to see the one-liner win out! :-) Russ Brainbench 'Most Valuable Professional' for Perl	[reply]
RE: RE:(3) List processing (benchmark results) by btrott (Parson) on Jul 12, 2000 at 07:16 UTC
All of the code posted in the golf thread are one-liners. Here's mine, again: `my @words = sort grep !$seen{$_=lc}++, @words;` [download]	[reply] [d/l]
RE: RE: List processing (benchmark results) by btrott (Parson) on Jul 12, 2000 at 07:14 UTC
I ran this myself, and visnu's code is broken in your benchmarking. visnu's code is checking the %common hash for the common words, and you don't have such a hash defined. Since the other benchmarked code has setup code within the benchmarks (setting up the common words hash), we should do that for visnu's, so add this at the beginning: `my %common; @common{@common} = (1) x @common;` [download] Also, the PMGurus code doesn't work quite right, as it doesn't lower-case the words in the grep. After taking out the PMGurus code, and fixing up visnu's, I got: `Benchmark: timing 1000 iterations of Odud, btrott, visnu... Odud: 4 secs ( 3.55 usr 0.00 sys = 3.55 cpu) btrott: 2 secs ( 2.27 usr 0.00 sys = 2.27 cpu) visnu: 4 secs ( 3.98 usr 0.00 sys = 3.98 cpu)` [download]	[reply] [d/l] [select]
RE:(3) List processing (benchmark results) by Russ (Deacon) on Jul 13, 2000 at 01:09 UTC
I made the changes you mentioned. I had missed visnu's %common, and your code and PMGurus code ignored the common words, in my version. Also, since the PMGurus code does lower-case the words in the grep, I have included those results. Thanks for pointing out what I had missed. Here is the latest code, and results: @common = qw\|a an and at the to\|; timethese(1000, { Odud => q{ my (@only, %r, %seen); foreach $r (@words){ $r{lc $r} = 1; } my @uniqwords = sort keys %r; foreach my $item (@uniqwords) { push(@only,$item) unless exists $seen{$item}; } my @newwords = @only; }, PMGurus => q{ my %seen; @seen{@common} = (1) x @common; my @newwords = sort grep !$seen{+lc}++, @words; }, btrott_OneLiner => q{ my %seen; @seen{@common} = (1) x @common; my @newwords = sort grep !$seen{$_=lc}++, @words; }, btrott_FirstPass => q{ my (%seen, %r); @seen{@common} = (1) x @common; for my $r (@words) { next if exists $seen{ lc $r }; $r{lc $r} = 1; } my @newwords = sort keys %r; }, visnu => q{ my %common; @common{@common} = (1) x @common; my @newwords = sort keys %{+{ map { !$common{$_ = lc $_} ? ($_, 1) + : () } @words }}; }, }); Benchmark: timing 1000 iterations of Odud, PMGurus, btrott_FirstPass, +btrott_OneLiner, visnu... Odud: 9 wallclock secs ( 9.61 usr + 0.00 sys = 9.61 CPU) PMGurus: 6 wallclock secs ( 5.40 usr + 0.00 sys = 5.40 CPU) btrott_FirstPass: 7 wallclock secs ( 7.07 usr + 0.00 sys = 7.07 CPU) btrott_OneLiner: 6 wallclock secs ( 5.73 usr + 0.00 sys = 5.73 CPU) visnu: 10 wallclock secs (10.29 usr + 0.00 sys = 10.29 CPU) [download] Russ Brainbench 'Most Valuable Professional' for Perl	[reply] [d/l]