http://qs1969.pair.com?node_id=615781

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Can someone explain what the differences would be between these two chunks of code?
my @galleries_found = $mech->content =~ m/(gallery\.php\?gid=\d+)/gi;
And
push(@pics, $1) while $mech->content =~ m#(images/thumb/\d+/\d+/\d+\.jpg)#gi;
Assuming both are collecting the same data, that is. What's the difference, and what's the better way of forming an array from all matches found in a $mech->content fetch?

Replies are listed 'Best First'.
Re: Difference between this array assignment and push
by kyle (Abbot) on May 16, 2007 at 14:42 UTC

    One difference is that the first creates a smaller array. This is because the $1 variable is special and carries a little extra special data with it. When you stuff it into an array, that stuff stays. (I remember this fact without remembering the details. Perhaps a monk more familiar with Perl internals could explain better.)

    use Data::Dumper;
    use Devel::Size qw(total_size);

    my $bigfoo = 'foo' x 1_000;

    my @assigned = $bigfoo =~ /(foo)/g;

    my @pushed;
    push @pushed, $1 while $bigfoo =~ /(foo)/g;

    printf "pushed: %d\n", total_size( \@pushed );
    printf "assigned: %d\n", total_size( \@assigned );

    if ( Dumper( \@pushed ) eq Dumper( \@assigned ) ) {
        print "They look the same.\n";
    }
    else {
        print "They look different.\n";
    }

    __END__
    pushed: 52132
    assigned: 32052
    They look the same.

    The effect is lessened if you push @pushed, "$1":

    pushed: 32132
    assigned: 32052

    Even then @pushed still comes out larger, probably because it started small and grew through the loop while @assigned was the right size to begin with.

    My guess would be that the assigned method is faster too (especially with the repeated calls to $mech->content), but I haven't tested that.
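On the repeated calls: a while condition is evaluated again on every pass, so a method call on its left-hand side runs once per iteration. A minimal sketch of fetching once and looping over the saved copy follows. Here fetch_content and $calls are hypothetical stand-ins for $mech->content and a call counter; they are not part of the original code.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for $mech->content; counts how often it is called.
sub fetch_content {
    $calls++;
    return 'foo' x 5;
}

# Fetch once, then iterate over the saved copy instead of
# calling the method inside the loop condition.
my $content = fetch_content();

my @pushed;
push @pushed, $1 while $content =~ /(foo)/g;

print "matches: ", scalar(@pushed), ", calls: $calls\n";
```

The list-assignment form evaluates its left-hand side only once anyway, so this only matters for the push/while form.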

      The following benchmark concurs with the idea that "Assign" is faster than "Push" for this example.

      use Benchmark qw(:all);

      my $count  = 1000;
      my $foobar = 'foobar' x 1000;

      timethese($count, {
          'Assigned' => sub { my @assigned = $foobar =~ /(foo)/g; },
          'Pushed'   => sub { my @pushed; push @pushed, $1 while $foobar =~ /(foo)/g; },
      });

      Benchmark Results:

      Benchmark: timing 1000 iterations of Assigned, Pushed...
      Assigned: 1 wallclock secs ( 0.80 usr + 0.00 sys = 0.80 CPU) @ 1250.00/s (n=1000)
      Pushed: 1 wallclock secs ( 0.93 usr + 0.00 sys = 0.93 CPU) @ 1075.27/s (n=1000)
      Interesting analysis, and the facts about $1 taught me something new. I knew that $1 and friends are special, but that was all. Given the context provided, I prefer the assignment technique. The only reason I would go with push and while is if I needed to check $1 against some condition before deciding to push it onto the target array.
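A rough sketch of that conditional case, next to a grep over the list-context match that should give the same result. The $html string, the narrowed capture (just the gid digits) and the ">= 10" condition are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical page content standing in for $mech->content.
my $html = 'gallery.php?gid=7 gallery.php?gid=120 gallery.php?gid=15';

# push/while: inspect each capture before deciding to keep it.
my @wanted_loop;
while ( $html =~ m/gallery\.php\?gid=(\d+)/gi ) {
    push @wanted_loop, $1 if $1 >= 10;
}

# Assignment: collect every capture in list context, then filter with grep.
my @wanted_grep = grep { $_ >= 10 } $html =~ m/gallery\.php\?gid=(\d+)/gi;

print "loop: @wanted_loop\n";
print "grep: @wanted_grep\n";
```

For a simple test on the captured value, grep keeps the one-statement form. The loop seems more useful when the decision needs more than the capture itself, such as pos() or other state.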

      Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Re: Difference between this array assignment and push
by rinceWind (Monsignor) on May 16, 2007 at 14:37 UTC

    I would go with the first approach as it's much clearer what you are doing.

    In the second case, @pics may have some previous contents that you are adding to. The code iterates over the matches in a while loop; I see no reason why this would ever be better than picking up all matches as a list and assigning or pushing them to the array.
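A small sketch of the "pushing them as a list" variant, for when @pics already has contents. The $html string and the pre-existing entry are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical page content standing in for $mech->content.
my $html = join "\n",
    '<img src="images/thumb/1/2/3.jpg">',
    '<img src="images/thumb/4/5/6.jpg">';

# Assume @pics already holds an entry from earlier processing.
my @pics = ('images/thumb/0/0/0.jpg');

# push supplies list context, so m//g returns every capture at once
# and a single push appends all of them to the existing array.
push @pics, $html =~ m#(images/thumb/\d+/\d+/\d+\.jpg)#gi;

print "$_\n" for @pics;
```

This keeps the append behaviour of the second snippet without the explicit loop.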

    --
    wetware hacker
    (Qualified NLP Practitioner and Hypnotherapist)

Re: Difference between this array assignment and push
by ikegami (Patriarch) on May 16, 2007 at 15:23 UTC

    I'm betting the first one is slightly faster (fewer Perl ops). The second uses less overhead memory (only one match on the stack at a time). These factors are probably negligible.

    Perl is very efficient at expanding (and shrinking) arrays at either end, so there's no advantage to knowing the size of the array before the assignment. If Perl wasn't so good at this, the second one would be much slower due to multiple reallocations of the array.