in reply to List processing performance

Update: btrott found that I had left out a piece of the original code. New code and results are shown here.

Being interminably curious, I benchmarked a few of the suggested snippets of code.

Benchmark: timing 1000 iterations of Odud, PMGurus, btrott, visnu... Odud: 10 wallclock secs ( 9.61 usr + 0.01 sys = 9.62 CPU) PMGurus: 6 wallclock secs ( 5.41 usr + 0.01 sys = 5.42 CPU) btrott: 7 wallclock secs ( 7.17 usr + 0.04 sys = 7.21 CPU) visnu: 11 wallclock secs (11.31 usr + 0.00 sys = 11.31 CPU)

So, combining the two actions into one (btrott's suggestion) results in an immediate 25% improvement and further refinement by the Perl Monks Guru squad reduced the times by another 25% over that. The one-liner (and I always lean toward one-liners in Perl) performed worse than the original... How depressing! :-)

I am pasting my entire testing code (since I have been known to do silly things in the name of benchmarking ;-))

#!/usr/bin/perl -w use strict; use Benchmark; use vars qw/@words @common/; @words = qw/Notice the use of an anonymous inner class Here an instan +ce of an unnamed class derived from Thread is created and a run method that calls endApp is defined for the class All of this means that when the application is about to terminate the JVM starts the thread representing by the passed-in thread object When the thread starts the run method is called The run method calls endApp which closes the log file This flushes the output buffer To underscore the effect of the shutdown hook comment out the addShutdownHook lines in ShutdownDemo You'll see that the log file is empty when the program terminates You can register multiple shutdown hooks In this case each thread that represents a hook is started in an unspecified order and the various threads run simultaneously You cannot register or unregister a shutdown hook after the shutdown sequence has started Doing so results in an IllegalStateException Because shutdown hooks run as threads you must use thread-safe programming techniques Otherwise you risk having threads interfere with each other Also it's wise to design your application for simple and fast shutdown processing For example you might run into trouble if your application uses services during shutdown that are themselves in the processing of being shut down There are cases where shutdown processing does not happen even if you have registered shutdown hooks One example is corrupted native methods for example when you dereference a null pointer in C code This feature is somewhat similar to the atexit library function in /; @common = qw|a an and at the to|; timethese(1000, { Odud => q{ my (@only, %r); foreach $r (@words){ $r{lc $r} = 1; } my @uniqwords = sort keys %r; @seen{@common} = (); foreach $item (@uniqwords) { push(@only,$item) unless exists $seen{$item}; } my @newwords = @only; }, PMGurus => q{ my %seen; @seen{@common} = (1) x @common; my @newwords = sort grep !$seen{+lc}++, @words; }, btrott => q{ my(%seen, %r); @seen{@common} = (); for my $r (@words) { next if exists $seen{ lc $r }; $r{lc $r} = 1; } my @newwords = sort keys %r; }, visnu => q{ my @newwords = sort keys %{+{ map { !$common{$_ = lc $_} ? ($_, 1) + : () } @words }}; }, });
Russ
Brainbench 'Most Valuable Professional' for Perl

Replies are listed 'Best First'.
RE: RE: List processing (benchmark results)
by visnu (Sexton) on Jul 12, 2000 at 06:26 UTC
    that is disappointing... the overhead of map creating the temporary array must be too great. hmmm.... now i'm trying to think of a way to pre-allocate so i can skew the benchmark results more in my favor..
      I hope you can. I'd love to see the one-liner win out! :-)

      Russ
      Brainbench 'Most Valuable Professional' for Perl

        All of the code posted in the golf thread are one-liners. Here's mine, again:
        my @words = sort grep !$seen{$_=lc}++, @words;
RE: RE: List processing (benchmark results)
by btrott (Parson) on Jul 12, 2000 at 07:14 UTC
    I ran this myself, and visnu's code is broken in your benchmarking. visnu's code is checking the %common hash for the common words, and you don't have such a hash defined. Since the other benchmarked code has setup code within the benchmarks (setting up the common words hash), we should do that for visnu's, so add this at the beginning:
    my %common; @common{@common} = (1) x @common;
    Also, the PMGurus code doesn't work quite right, as it doesn't lower-case the words in the grep.

    After taking out the PMGurus code, and fixing up visnu's, I got:

    Benchmark: timing 1000 iterations of Odud, btrott, visnu... Odud: 4 secs ( 3.55 usr 0.00 sys = 3.55 cpu) btrott: 2 secs ( 2.27 usr 0.00 sys = 2.27 cpu) visnu: 4 secs ( 3.98 usr 0.00 sys = 3.98 cpu)
      I made the changes you mentioned. I had missed visnu's %common, and your code and PMGurus code ignored the common words, in my version. Also, since the PMGurus code does lower-case the words in the grep, I have included those results.

      Thanks for pointing out what I had missed.

      Here is the latest code, and results:

      @common = qw|a an and at the to|; timethese(1000, { Odud => q{ my (@only, %r, %seen); foreach $r (@words){ $r{lc $r} = 1; } my @uniqwords = sort keys %r; foreach my $item (@uniqwords) { push(@only,$item) unless exists $seen{$item}; } my @newwords = @only; }, PMGurus => q{ my %seen; @seen{@common} = (1) x @common; my @newwords = sort grep !$seen{+lc}++, @words; }, btrott_OneLiner => q{ my %seen; @seen{@common} = (1) x @common; my @newwords = sort grep !$seen{$_=lc}++, @words; }, btrott_FirstPass => q{ my (%seen, %r); @seen{@common} = (1) x @common; for my $r (@words) { next if exists $seen{ lc $r }; $r{lc $r} = 1; } my @newwords = sort keys %r; }, visnu => q{ my %common; @common{@common} = (1) x @common; my @newwords = sort keys %{+{ map { !$common{$_ = lc $_} ? ($_, 1) + : () } @words }}; }, }); Benchmark: timing 1000 iterations of Odud, PMGurus, btrott_FirstPass, +btrott_OneLiner, visnu... Odud: 9 wallclock secs ( 9.61 usr + 0.00 sys = 9.61 CPU) PMGurus: 6 wallclock secs ( 5.40 usr + 0.00 sys = 5.40 CPU) btrott_FirstPass: 7 wallclock secs ( 7.07 usr + 0.00 sys = 7.07 CPU) btrott_OneLiner: 6 wallclock secs ( 5.73 usr + 0.00 sys = 5.73 CPU) visnu: 10 wallclock secs (10.29 usr + 0.00 sys = 10.29 CPU)
      Russ
      Brainbench 'Most Valuable Professional' for Perl