Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

cull

by tlm (Prior)
on Mar 31, 2007 at 20:14 UTC ( [id://607645]=CUFP: print w/replies, xml ) Need Help??

I imagine there's a few million versions out there of the cull function I'm about to describe, but I couldn't find one in the obvious places (e.g. List::Util, List::MoreUtils, etc.), so here it goes.

The cull function faintly resembles grep. It takes as arguments two references, the first one to a sub, and the second one to an array. It returns all the elements of the array for which the sub evaluates to true, but, contrary to grep it has the side effect of removing these elements from the original array.

Example:

my @x = qw( eenie meenie minie moe ); my @y = cull { /m.*nie/ } @x; print "@y\n"; print "@x\n"; __END__ meenie minie eenie moe
Note that when invoked as shown, it is not necessary to include the sub keyword nor to precede the array with \.

As with grep and map, the sub passed as the argument to cull does not get its argument from $_[0], but rather it reads it from $_.

One important difference between cull and grep is that cull takes exactly two arguments: the first one must be a subref (or a code block, but not a boolean expression), and the second one must be an array. Hence, for example, the second statement below would result in an error:

my @x = grep { /m.*nie/ } qw( eenie meenie minie moe ); my @y = cull { /m.*nie/ } qw( eenie meenie minie moe );
Of course, these are OK too:
my @y = cull sub { /m.*nie/ }, @x; my @y = cull( sub { /m.*nie/ }, @x );
As always, I look forward to reading your comments and suggestions.

Update: the point in the comments about the unnecessary expansion of the array indices is a good one, so I changed the original line
for my $i ( reverse 0 .. $#$array ) {
as suggested. Thanks.

sub cull (&\@) { my ( $test, $array ) = @_; my @culled; for ( my $i = $#$array; $i > -1; --$i ) { local $_ = $array->[ $i ]; unshift @culled, splice @$array, $i, 1 if $test->(); } return @culled; }

Replies are listed 'Best First'.
Re: cull
by hossman (Prior) on Mar 31, 2007 at 23:35 UTC

    I haven't seen any other implementations of a function like this before -- but i haven't exactly been looking for one either.

    My only suggestion would be using wantarray to check your calling context -- if it's void or scalar don't bother maintaining @culled, just record the count and save yourself some memory.

    Update: One other thing I just noticed; I'm no expert on the internal optimizations of the perl compiler, but would a C style for loop be a little better in this case? I seem to recall hearing that for (1..$big_num) had been optimized to not create a big ass list of sequential integers, but I'd be surprised if it's smart enough to make the same optimization in your reverse case.

      but I'd be surprised if it's smart enough to make the same optimization in your reverse case
      Unfortunately there's no surprise for you:
      $ perl -we 'for(0..999999999){last}' $ perl -we 'for(reverse 0..999999999){last}' Out of memory! $
      (That's perl 5.8.8.)
Re: cull
by parv (Parson) on Apr 01, 2007 at 21:48 UTC

    In order to avoid expansion of array indices list, I wrote ...

    use List::MoreUtils qw( firstidx ); sub cull_parv (&\@) { my ( $test , $array ) = @_; my @culled; while ( -1 < ( my $i = firstidx { &$test } @$array ) ) { push @culled , splice @$array , $i , 1 ; } return @culled; }

    ...which seems to be speedier when the size of the array increases ...

      Interesting. Before reading your comment, I came up with the solution below. The only differences are style (e.g. $code->() vs. &$code) that I refrain from building a return array if the context is void or scalar.

      I guess TMTOWTDI leads to some random convergence at times. ;-)

      use strict; use warnings; use List::MoreUtils qw[firstidx]; sub cull (&\@) { my ($code, $array) = (shift, shift); my @matches; my $count; # let List::MoreUtils (fast!) find the items # splice() returns the items removed.... while ( (my $idx = firstidx { $code->() } @$array) > -1 ) { my $item = splice(@$array,$idx,1); wantarray ? push(@matches, $item) : $count++; } wantarray ? @matches : $count; } my @test = qw[how I wish I could remember pi easily]; my @new = cull { $_ eq 'I' } @test; print join(':',@new), "\n"; print join(':',@test), "\n";
      <radiant.matrix>
      Ramblings and references
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: cull
by jdporter (Paladin) on Apr 01, 2007 at 16:15 UTC

    Because of the way you've written it, the second arg must be a ref to an actual array, which means your first example dies:

    Type of arg 2 to main::cull must be array (not constant item)... near +"qw( eenie meenie minie moe );" Too many arguments for main::cull ... near "qw( eenie meenie minie moe + );" Execution aborted due to compilation errors.

    (This is with perl 5.8.8)

    A word spoken in Mind will reach its own level, in the objective world, by its own weight
      OP has already listed that point.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://607645]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (5)
As of 2024-03-29 13:06 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found