Re: Can this code be optimized further?
by dragonchild (Archbishop) on Feb 10, 2005 at 13:56 UTC
|
There's a few improvements that can be made.
use strict;
my @temp = qw( a_1 b_1 a_2 a_3 a_4 b_2 a_5 b_3 a_6 b_4 );
my %values;
foreach (@temp) {
/^([ab])_(.*)/ && do {
push @{$values{$1}}, $2;
next;
};
die "'$_' doesn't match the pattern of ^[ab]_.*\n";
}
print "$_ => @{$values{$_}}\n" for qw( a b );
Being right, does not endow the right to be rude; politeness costs nothing. Being unknowing, is not the same as being stupid. Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence. Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.
| [reply] [d/l] |
Re: Can this code be optimized further?
by eXile (Priest) on Feb 10, 2005 at 14:33 UTC
|
my @temp=("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4")
+;
my @a_arr = grep { s/^a_(.*)/$1/} @temp;
my @b_arr = grep { s/^b_(.*)/$1/} @temp;
PS: using @a and @b should be OK, but $a and $b have special meaning (in 'sort'), so I try to never use 'a' and 'b' as variable names. | [reply] [d/l] |
Re: Can this code be optimized further?
by BrowserUk (Patriarch) on Feb 10, 2005 at 14:53 UTC
|
#!/usr/bin/perl -w
use strict;
my @temp=("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4")
+;
my (@a, @b) ;
m[^([ab])_(.*)$] and push @{$1 eq 'a' ? \@a : \@b}, $2 for @temp;
print "@a <==> @b";
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.
| [reply] [d/l] |
Re: Can this code be optimized further?
by Fletch (Bishop) on Feb 10, 2005 at 14:06 UTC
|
## I'd personally use qw( ) here, but this is probalby just sample dat
+a
my @temp = ("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4
+");
my( @a, @b );
my %arrays;
eval qq{\$arrays{ "$_" } = \\\@$_} for qw( a b );
for( @temp ) {
die "Invalid prefix on element '$_'\n" unless /^([ab])_(.*)/;
push @{ $arrays{ $1 } }, $2;
}
Update: Gah, left off a couple of \\ to get a reference in the eval.
| [reply] [d/l] |
|
|
eval qq{\$arrays{ "$_" } = \\\@$_} for qw( a b ); ?!?!?
Am I the only one that thinks unncessary evals are just ugly? While I commend you for trying to keep the surrounding code unchanged (by creating @a and @b), I think the better solution is to improve the surrounding code by making sure the appropriate data structures are used - in this case, a HoA is appropriate.
Being right, does not endow the right to be rude; politeness costs nothing. Being unknowing, is not the same as being stupid. Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence. Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.
| [reply] [d/l] |
|
|
If you can think of a better way to get a reference to an arbitrary lexical without eval sure. The original question was how to get things into arrays @a and @b; this gets things into the lexical arrays.
Granted they may really need a HoA to begin with, but this technique (or the equivalent $arrays{ $_ } = do { no strict 'refs'; \@{ $_ } } for a package variable) sometimes is useful.
| [reply] [d/l] [select] |
|
|
You can use a hash to reroute things to the lexical arrays so you don't have to eval:
my (@a, @b);
my %router = (a => \@a, b => \@b);
foreach my $value (@temp) {
push @{$router{$1}}, $2 if $value =~ /([ab])_(.*)/;
}
print "@a <==> @b";
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
|
|
And if you had 10 arrays you'd want to keep that hash up to date? 20? 30?
(Of course not, you'd use a HoA and not screw with this in the first place. :)
That being said, in this specific case of just a and b I'd personally have probably used something like BrowserUK's @{ $1 eq 'a' ? \@a : \@b} below; but the eval's more flexible if you have a large number of destinations (then again with a large number of destinations you'd probably want an HoA). Of course at the moment I'd be more likely to just use Enumerable#partition and be done with it, but that's another language all together.
| [reply] [d/l] |
|
|
Re: Can this code be optimized further?
by Anonymous Monk on Feb 10, 2005 at 14:23 UTC
|
I wouldn't use push. If @temp is large (which assume it is, if it isn't there's no need to bother with optimizing it), repeated pushes means repeated mallocing the array needed for @a and @b as it extends.
I'd use:
my @temp = (...);
my @a = map {/a_(.*)/ ? $1 : ()} @temp;
my @b = map {/b_(.*)/ ? $1 : ()} @temp;
If you expect to not have many matches, that is, most elements in @temp don't contain a_ or b_, I'd try to see whether the simpler regex is enough gain to have both the map and the grep:
my @a = map {/a_(.*)/; $1} grep /a_/ @temp;
my @b = map {/b_(.*)/; $1} grep /b_/ @temp;
| [reply] [d/l] [select] |
Re: Can this code be optimized further?
by phaylon (Curate) on Feb 10, 2005 at 14:41 UTC
|
This works w/o pcre's and more than a and b, though it stores in a hash.
#!/usr/bin/perl
use warnings;
use strict;
my @temp = qw(
a_1 b_1 a_2 a_3
a_4 b_2 a_5 b_3
a_6 b_4
);
my %parts;
push @{ $parts{ substr $_, 0, 1 }}, substr $_, 2
foreach @temp;
use Data::Dumper;
print Dumper \%parts;
| [reply] [d/l] |
|
|
| [reply] |
|
|
Yep, but I often talk to people not involved, so it's just a habit. Besides, I find it more aesthetic than "re's" or "RegExp's" or something :)
| [reply] |
Re: Can this code be optimized further?
by rir (Vicar) on Feb 10, 2005 at 20:22 UTC
|
my ( @a, @b);
{
no strict "refs";
push @{substr( $_,0,1)}, substr($_,2) foreach ( @temp)
}
There is not much point in avoiding sym-refs here.
Be well,
rir | [reply] [d/l] |
Re: Can this code be optimized further?
by cog (Parson) on Feb 10, 2005 at 14:49 UTC
|
Use the Benchmark module to try out all these answers :-) | [reply] |
Re: Can this code be optimized further?
by holli (Abbot) on Feb 10, 2005 at 18:56 UTC
|
This is not an optimization, but for the sake of TIMTOWTDI:
#!/usr/bin/perl -w
use strict;
my @temp=("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4")
+;
my (@a, @b) ;
my %dispatch =
(
"a" => sub { push @a, shift },
"b" => sub { push @b, shift },
);
foreach my $value (@temp)
{
if ( $value =~ /^([ab])_(.+)$/ )
{
my $sub = $dispatch{$1};
&$sub($2);
}
}
print "@a <==> @b\n";
| [reply] [d/l] |
Re: Can this code be optimized further?
by runrig (Abbot) on Feb 10, 2005 at 16:58 UTC
|
As long as everyone's just chiming in with different ways, if you don't mind modifying the original array: for (@temp) {
push @a, $_ if s/^a_//;
push @b, $_ if s/^b_//;
}
You could even modify this to use a HoA-type solution, e.g.(update: fixed code..still untested):my %hoa;
s/^([a-z])_// and push @{$hoa{$1}}, $_ for @temp;
| [reply] [d/l] [select] |
Re: Can this code be optimized further?
by jdporter (Paladin) on Feb 10, 2005 at 20:58 UTC
|
This solution is more golfish than fast. (Anyone care to benchmark it?)
eval join '', map /(.*)_(.*)/ ? "push \@$1,'$2';" : (), @temp;
Here's another which is shorter but perhaps slower, since it does many evals instead of one:
/(.*)_(.*)/ && eval "push \@$1,'$2'" for @temp;
| [reply] [d/l] [select] |
Re: Can this code be optimized further?
by RazorbladeBidet (Friar) on Feb 10, 2005 at 15:23 UTC
|
#!/usr/bin/perl -w
use strict;
use Time::HiRes qw( gettimeofday tv_interval);
my @temp=("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4")
+;
my @startTime = gettimeofday;
for ( 1..10000 ) {
my (@a, @b) ;
foreach my $value (@temp) {
push @a, $1 if ($value =~ /a_(.*)/) ;
push @b, $1 if ($value =~ /b_(.*)/) ;
}
}
print "Time: ".tv_interval( \@startTime )."\n";
@temp = qw( a_1 b_1 a_2 a_3 a_4 b_2 a_5 b_3 a_6 b_4 );
@startTime = gettimeofday;
for ( 1..10000 ) {
my %values;
foreach (@temp) {
/^([ab])_(.*)/ && do {
push @{$values{$1}}, $2;
next;
};
die "'$_' doesn't match the pattern of ^[ab]_.*\n";
}
}
print "Time: ".tv_interval( \@startTime )."\n";
@temp = qw( a_1 b_1 a_2 a_3 a_4 b_2 a_5 b_3 a_6 b_4 );
@startTime = gettimeofday;
for ( 1..10000 ) {
my (@a, @b);
foreach (@temp) {
my $prefix = substr( $_, 0, 2 );
if ( $prefix eq 'a_' ) {
push @a, substr( $_, 2 );
} elsif ( $prefix eq 'b_' ) {
push @b, substr( $_, 2 );
}
}
}
print "Time: ".tv_interval( \@startTime )."\n";
@startTime = gettimeofday;
for ( 1..10000 ) {
my( @a, @b );
my %arrays;
eval qq{\$arrays{ "$_" } = \\\@$_} for qw( a b );
for( @temp ) {
die "Invalid prefix on element '$_'\n" unless /^([ab])_(.*)/;
push @{ $arrays{ $1 } }, $2;
}
}
print "Time: ".tv_interval( \@startTime )."\n";
@startTime = gettimeofday;
for ( 1..10000 ) {
my @a = map {/a_(.*)/ ? $1 : ()} @temp;
my @b = map {/b_(.*)/ ? $1 : ()} @temp;
}
print "Time: ".tv_interval( \@startTime )."\n";
@startTime = gettimeofday;
for ( 1..10000 ) {
my @a_arr = grep { s/^a_(.*)/$1/} @temp;
my @b_arr = grep { s/^b_(.*)/$1/} @temp;
}
print "Time: ".tv_interval( \@startTime )."\n";
# GAVE ME substr outside of string at ./testme2.pl line 88.
#@startTime = gettimeofday;
#for ( 1..10000 ) {
# my %parts;
# push @{ $parts{ substr $_, 0, 1 }}, substr $_, 2
# foreach @temp;
#}
#print "Time: ".tv_interval( \@startTime )."\n";
@startTime = gettimeofday;
for ( 1..10000 ) {
my (@a, @b) ;
m[^([ab])_(.*)$] and push @{$1 eq 'a' ? \@a : \@b}, $2 for @temp;
}
print "Time: ".tv_interval( \@startTime )."\n";
Output:
Time: 0.575739
Time: 0.717803
Time: 0.46604
Time: 1.731715
Time: 0.777471
Time: 0.462416
Time: 0.103439
I'd go with BrowserUK's code :)
Update!!!:
Somewhere along the way the @temp was being affected, hence the warnings on phaylon's code.
Here's the updated output w/ mine removed and phaylon's added (also a little more descriptive):
Baseline Time: 0.584196
Regexp w/ Hash Time: 0.7385
Eval Time: 1.774581
Map Time: 0.76179
Grep Time: 0.456991
Hash w/ substr Time: 0.411059
Regexp w/ eq Time: 0.683154
Router Time: 0.668033
Hence, the HoA using substrings would be the most flexible and quickest (albeit a bit risky). Grep is a close second with the eval being 3x as long. | [reply] [d/l] [select] |
|
|
#!/usr/bin/perl -w
use strict;
use Benchmark qw/cmpthese/;
my @temp=("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4")
+;
# samy_kumar
sub original {
my (@a, @b) ;
foreach my $value (@temp) {
push @a, $1 if ($value =~ /a_(.*)/) ;
push @b, $1 if ($value =~ /b_(.*)/) ;
}
}
# dragonchild
sub hash_style {
my %values;
foreach (@temp) {
/^([ab])_(.*)/ && do {
push @{$values{$1}}, $2;
next;
};
}
}
# roy jonhson
sub router_style {
my (@a, @b);
my %router = (a => \@a, b => \@b);
foreach my $value (@temp) {
push @{$router{$1}}, $2 if $value =~ /([ab])_(.*)/;
}
}
sub switch_style {
my (@a, @b);
foreach (@temp) {
my $prefix = substr( $_, 0, 2 );
if ( $prefix eq 'a_' ) {
push @a, substr( $_, 2 );
} elsif ( $prefix eq 'b_' ) {
push @b, substr( $_, 2 );
}
}
}
sub map_style {
my @a = map {/a_(.*)/ ? $1 : ()} @temp;
my @b = map {/b_(.*)/ ? $1 : ()} @temp;
}
sub grep_style {
my @a_arr = grep { s/^a_(.*)/$1/} @temp;
my @b_arr = grep { s/^b_(.*)/$1/} @temp;
}
sub grep_map_style {
my @a = map {/a_(.*)/; $1} grep /a_/, @temp;
my @b = map {/b_(.*)/; $1} grep /b_/, @temp;
}
sub trinary {
my (@a, @b) ;
m[^([ab])_(.*)$] and push @{$1 eq 'a' ? \@a : \@b}, $2 for @temp;
}
cmpthese( 100_000, {
"Original" => \&original,
"Hash" => \&hash_style,
"Router" => \&router_style,
"Switch" => \&switch_style,
"Map" => \&map_style,
"Grep" => \&grep_style,
"Grep + Map" => \&grep_map_style,
"Trinary" => \&trinary
});
__DATA__
C:\test>perl 429768.pl
Rate Grep Switch Map Router Original Grep + Map Ha
+sh Trinary
Grep 32489/s -- -67% -70% -78% -81% -83% -8
+5% -87%
Switch 98425/s 203% -- -9% -34% -42% -49% -5
+5% -60%
Map 108460/s 234% 10% -- -27% -36% -44% -5
+1% -56%
Router 149031/s 359% 51% 37% -- -12% -23% -3
+2% -40%
Original 168634/s 419% 71% 55% 13% -- -13% -2
+3% -32%
Grep + Map 194175/s 498% 97% 79% 30% 15% -- -1
+2% -21%
Hash 220264/s 578% 124% 103% 48% 31% 13%
+-- -11%
Trinary 246914/s 660% 151% 128% 66% 46% 27% 1
+2% --
C:\test>perl 429768.pl
Rate Grep Switch Map Router Original Grep + Map Ha
+sh Trinary
Grep 32658/s -- -66% -69% -78% -81% -83% -8
+5% -86%
Switch 95602/s 193% -- -10% -36% -43% -49% -5
+5% -60%
Map 106610/s 226% 12% -- -28% -37% -43% -5
+0% -55%
Router 148810/s 356% 56% 40% -- -12% -21% -3
+0% -37%
Original 168350/s 415% 76% 58% 13% -- -11% -2
+1% -29%
Grep + Map 188324/s 477% 97% 77% 27% 12% -- -1
+2% -21%
Hash 213220/s 553% 123% 100% 43% 27% 13%
+-- -10%
Trinary 236967/s 626% 148% 122% 59% 41% 26% 1
+1% --
C:\test>perl 429768.pl
Rate Grep Switch Map Router Original Grep + Map Ha
+sh Trinary
Grep 32819/s -- -66% -69% -78% -81% -82% -8
+6% -87%
Switch 96899/s 195% -- -9% -35% -42% -47% -5
+8% -61%
Map 106724/s 225% 10% -- -28% -37% -42% -5
+3% -57%
Router 148810/s 353% 54% 39% -- -12% -19% -3
+5% -40%
Original 168350/s 413% 74% 58% 13% -- -8% -2
+6% -32%
Grep + Map 182815/s 457% 89% 71% 23% 9% -- -2
+0% -26%
Hash 228833/s 597% 136% 114% 54% 36% 25%
+-- -7%
Trinary 246305/s 650% 154% 131% 66% 46% 35%
+8% --
C:\test>
| [reply] [d/l] |
|
|
Rate Grep Trinary Hash Map Original New_Map New_Grep Tr
+i_Substr Switch Tri_Substr2
Grep 969/s -- -30% -33% -37% -43% -50% -56%
+ -68% -71% -76%
Trinary 1379/s 42% -- -4% -10% -19% -29% -37%
+ -55% -59% -66%
Hash 1442/s 49% 5% -- -6% -15% -26% -34%
+ -53% -57% -64%
Map 1528/s 58% 11% 6% -- -10% -21% -31%
+ -50% -54% -62%
Original 1702/s 76% 23% 18% 11% -- -12% -23%
+ -44% -49% -57%
New_Map 1937/s 100% 40% 34% 27% 14% -- -12%
+ -36% -42% -52%
New_Grep 2202/s 127% 60% 53% 44% 29% 14% --
+ -28% -34% -45%
Tri_Substr 3050/s 215% 121% 111% 100% 79% 57% 39%
+ -- -9% -24%
Switch 3353/s 246% 143% 132% 119% 97% 73% 52%
+ 10% -- -16%
Tri_Substr2 4000/s 313% 190% 177% 162% 135% 106% 82%
+ 31% 19% --
Tri_Substr2 is effectively a synthesis of switch and trinary. You can see the differences little design changes make. Using references slows things down. Matching and substituting back in instead of just deleting makes a big difference for grep vs. new_grep.
Caution: Contents may have been coded under pressure.
| [reply] [d/l] [select] |
|
|
|
|
|
|
Redefine @temp before each algorithm or pass it in as an argument.
I see that Benchmark has a different output format, but the results should be close to the same, no?
| [reply] |
|
|
|
|
Re: Can this code be optimized further?
by TedPride (Priest) on Feb 10, 2005 at 16:15 UTC
|
I don't think regex is needed here, and you really should make the code somewhat more forgiving to data variations. What if you use all letters of the alphabet instead of just a and b?
use strict;
use warnings;
my @temp=("a_1","b_1","a_2","a_3","a_4","b_2","a_5","b_3","a_6","b_4")
+;
my ($key, $val, %hash);
for (@temp) {
($key, $val) = split '_';
if (exists($hash{$key})) { push @{$hash{$key}}, $val; }
else { $hash{$key} = [$val]; }
}
for (sort keys %hash) {
print "$_: " . join(' ', @{$hash{$_}}) . "\n";
}
You could probably optimize this further by allocating the entire expected space for each array when initially created, but I'll leave that for the next guy. | [reply] [d/l] |
Re: Can this code be optimized further?
by ikegami (Patriarch) on Feb 10, 2005 at 18:23 UTC
|
use strict;
use warnings;
my @temp = qw( a_1 b_1 a_2 a_3 a_4 b_2 a_5 b_3 a_6 b_4 );
local $_ = join(' ', @temp);
my @a = /a_(\d+)/g; # or: my @a = /\ba_(\d+)\b/g;
my @b = /b_(\d+)/g; # or: my @b = /\bb_(\d+)\b/g;
$, = ", ";
$\ = "\n";
print(@a);
print(@b);
__END__
output
======
1, 2, 3, 4, 5, 6
1, 2, 3, 4
| [reply] [d/l] |