Re^3: Can this code be optimized further?

I added a testing mode to the code, so you can verify that each sub yields proper results. When running, if you pass a n argument to the program, it will run a test instead of a benchmark. Benchmark runs for three seconds instead of 100_000 iterations, now.

I also tweaked a few of the routines, which made substantial differences in their runtimes. And I dumped a couple of uninteresting subs. The code is in the readmore.

#!perl
use strict;
use warnings;
use Benchmark qw/cmpthese/;

my $testing = @ARGV;
my @master=("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4
+");


# samy_kumar
sub original {
    my (@a, @b) ;
    my @temp = @master;
    foreach my $value (@temp) {
    push @a, $1 if ($value =~ /a_(.*)/) ;
    push @b, $1 if ($value =~ /b_(.*)/) ;
    }
    print "@a <==> @b\n" if $testing;
}


# dragonchild
sub hash_style {
    my %values;
    my @temp = @master;
    foreach (@temp) {
        push @{$values{$1}}, $2 if /^([ab])_(.*)/;
    }
    print "@{$values{a}} <==> @{$values{b}}\n" if $testing;
}

sub switch_style {
    my (@a, @b);
    my @temp = @master;
    foreach (@temp) {
    my $prefix = substr( $_, 0, 1 );
    
    if ( $prefix eq 'a' ) {
        push @a, substr( $_, 2 );
    } else {
        push @b, substr( $_, 2 );
    }
    }
    print "@a <==> @b\n" if $testing;
}

sub map_style {
    my @temp = @master;
    my @a = map {/a_(.*)/ ? $1 : ()} @temp;
    my @b = map {/b_(.*)/ ? $1 : ()} @temp;
    print "@a <==> @b\n" if $testing;
}

sub new_map {
    my @temp = @master;
    my @a = map {/a_(.*)/} @temp;
    my @b = map {/b_(.*)/} @temp;
    print "@a <==> @b\n" if $testing;
}

sub grep_style {
    my @temp = @master;
    my @a_arr = grep { s/^a_(.*)/$1/} @temp;
    my @b_arr = grep { s/^b_(.*)/$1/} @temp;
    print "@a_arr <==> @b_arr\n" if $testing;
}

sub new_grep {
    my @temp = @master;
    my @a = grep { s/^a_//} @temp;
    my @b = grep { s/^b_//} @temp;
    print "@a <==> @b\n" if $testing;
}


sub trinary {
  my @temp = @master;
  my (@a, @b) ;
  m[^([ab])_(.*)$] and push @{$1 eq 'a' ? \@a : \@b}, $2 for @temp;
  print "@a <==> @b\n" if $testing;
}

sub tri_substr {
    my @temp = @master;
    my (@a, @b) ;
    push @{substr($_,0,1) eq 'a' ? \@a : \@b}, substr($_,2) for @temp;
    print "@a <==> @b\n" if $testing;
}

sub tri_substr2 {
    my @temp = @master;
    my (@a, @b) ;
    (substr($_,0,1) eq 'a')
      ? push @a , substr($_,2)
      : push @b , substr($_,2)
    for @temp;
    print "@a <==> @b\n" if $testing;
}

my %contenders = (
   Original    => \&original,
   Hash        => \&hash_style,
   Map        => \&map_style,
   New_Map    => \&new_map,
   Grep        => \&grep_style,
   New_Grep    => \&new_grep,
   Switch    => \&switch_style,
   Trinary    => \&trinary,
   Tri_Substr    => \&tri_substr,
   Tri_Substr2    => \&tri_substr2,
);

if ($testing) {
    while (my ($k,$v) = each %contenders) {
    print "$k: ";
    &$v;
    print "---\n";
    }
}
else {
    cmpthese( -3, \%contenders);
}
[download]

My results:

              Rate Grep Trinary Hash  Map Original New_Map New_Grep Tr
+i_Substr Switch Tri_Substr2
Grep         969/s   --    -30% -33% -37%     -43%    -50%     -56%   
+    -68%   -71%        -76%
Trinary     1379/s  42%      --  -4% -10%     -19%    -29%     -37%   
+    -55%   -59%        -66%
Hash        1442/s  49%      5%   --  -6%     -15%    -26%     -34%   
+    -53%   -57%        -64%
Map         1528/s  58%     11%   6%   --     -10%    -21%     -31%   
+    -50%   -54%        -62%
Original    1702/s  76%     23%  18%  11%       --    -12%     -23%   
+    -44%   -49%        -57%
New_Map     1937/s 100%     40%  34%  27%      14%      --     -12%   
+    -36%   -42%        -52%
New_Grep    2202/s 127%     60%  53%  44%      29%     14%       --   
+    -28%   -34%        -45%
Tri_Substr  3050/s 215%    121% 111% 100%      79%     57%      39%   
+      --    -9%        -24%
Switch      3353/s 246%    143% 132% 119%      97%     73%      52%   
+     10%     --        -16%
Tri_Substr2 4000/s 313%    190% 177% 162%     135%    106%      82%   
+     31%    19%          --
[download]

Tri_Substr2 is effectively a synthesis of switch and trinary. You can see the differences little design changes make. Using references slows things down. Matching and substituting back in instead of just deleting makes a big difference for grep vs. new_grep.

Caution: Contents may have been coded under pressure.

Comment on Re^3: Can this code be optimized further? Select or Download Code

Replies are listed 'Best First'.
Re^4: Can this code be optimized further? by dragonchild (Archbishop) on Feb 10, 2005 at 18:35 UTC
`sub hash_style2 { my %values; my @temp = @master; push @{$values{substr($_,0,1)}}, substr($_,2) for @temp; print "@{$values{a}} <==> @{$values{b}}\n" if $testing; } sub ikegami { my @temp = @master; local $_ = "@temp"; my @a = /\ba_(\d+)\b/g; my @b = /\bb_(\d+)\b/g; print "@a <==> @b\n" if $testing; }` [download] When I rerun adding this method (removing the three worst performers), here's what I get: `Rate Trinary New_Grep Original ikegami Hash2 Tri_Substr + Switch Tri_Substr2 Trinary 10273/s -- -5% -22% -34% -40% -41% + -41% -50% New_Grep 10766/s 5% -- -18% -31% -37% -38% + -38% -48% Original 13096/s 27% 22% -- -16% -23% -25% + -25% -37% ikegami 15683/s 53% 46% 20% -- -8% -10% + -10% -24% Hash2 17016/s 66% 58% 30% 8% -- -2% + -3% -18% Tri_Substr 17437/s 70% 62% 33% 11% 2% -- + -0% -16% Switch 17453/s 70% 62% 33% 11% 3% 0% + -- -15% Tri_Substr2 20651/s 101% 92% 58% 32% 21% 18% + 18% --` [download] Changing to substr() brings the hash into respectable range. Being right, does not endow the right to be rude; politeness costs nothing. Being unknowing, is not the same as being stupid. Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence. Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.	[reply] [d/l] [select]
Re^5: Can this code be optimized further? by Roy Johnson (Monsignor) on Feb 10, 2005 at 18:52 UTC
Yes, it looks like using substr instead of matching with capture is the most significant improvement to be made. In fact, changing the original to do that causes it to run faster than anything except Tri_Substr2. `sub orig_substr { my (@a, @b) ; my @temp = @master; foreach my $value (@temp) { push @a, substr($value, 2) if (substr($value,0,2) eq 'a_') ; push @b, substr($value, 2) if (substr($value,0,2) eq 'b_') ; } print "@a <==> @b\n" if $testing; }` [download] Caution: Contents may have been coded under pressure.	[reply] [d/l]