Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

findone { coderef } @array

by shotgunefx (Parson)
on Feb 25, 2002 at 23:22 UTC ( [id://147423]=CUFP: print w/replies, xml ) Need Help??

Like grep but returns the first match. successive calls return successive elements.
This was the solution to a reference problem I was working on awhile back. It was one of those things that I wanted to do, just to do it because it should be possible. (Have a sub that remembers some context info.)
Plus I like the construct :).
#!/usr/bin/perl ###################################################################### +############### # findone # usage: findone { coderef }, @array , OPTIONAL start_index OPTIONAL +end_index # Like grep but returns the first match. successive calls return succe +ssive elements. # Can be nested. Uses Weak references to prevent the leaking of memory +. ###################################################################### +############### BEGIN { use warnings; use strict; use WeakRef; my %find_cache = (); # Temporary Storeage. sub findone(&\@;$$) { my $coderef = shift ; # Generate a key from the caller function so we can track whic +h call it is. my $codekey = join(":",caller(),"$_[0]"); # Generate key my %persistant_args = () ; ############################################################## +########## # Clean up old keys to prevent leaking mem. If the data does +not exist, # then it has been freed and we don't need to keep position in +fo. ############################################################## +########## while(my ($k,$v) = each %find_cache){ delete $find_cache{$k} unless defined ($v->{dataref}); } unless (defined $find_cache{$codekey} ){ %persistant_args = ('index' =>($_[1]||0), 'dataref' => $_[ +0] ); }else{ %persistant_args = %{$find_cache{$codekey}}; } my $end_index = $_[2] || $#{ $_[0] }; for (; $persistant_args{index} <= $end_index; $persistant_ar +gs{index}++ ){ $_ = $_[0]->[$persistant_args{index}]; if (&$coderef){ $persistant_args{index}++; $find_cache{$codekey} = {%persistant_args}; weaken ($find_cache{$codekey}->{dataref}); return wantarray ? ($_ ,($persistant_args{index} -1 ) + ) : $_; } } delete $find_cache{$codekey}; return; } } ########################### # Silly Example # ########################### my @words = (qw(this Is a silly coNtrived Test)) x 5; print "\@words is $#words\n"; while ( my ( $val ,$index ) = findone { m/[A-Z]/ } @words ){ print "_" x 40,"\n"; print "Matched uppercase letter $val at $index \n"; while ( my ( $otherval ,$otherindex ) = findone { !m/[A-Z]/ } +@words , 10, 22 ){ print "\tInner matched all lower $otherval at $otherindex\n"; } sleep(1); }
updated Changes in response to jynx
updated Changed to accept end_index as well.
updated Changed to use stringified reference in cache key.
Now you can nest them.
my @AoA = (Array of arrays);

while ( my ($val,$index) = findone { findone{ other_criteria } @{$_} } @AoA ){

}
UPDATE: Subtle Bug/Feature, see this. I have a newer version that uses Filter::Simple, but I still hope to fix this without resorting to it.
UPDATE: Rough filter simple version here.

-Lee

"To be civilized is to deny one's nature."

Replies are listed 'Best First'.
Re: findone { coderef } @array
by jynx (Priest) on Feb 26, 2002 at 04:15 UTC

    A couple of comments on style:

    • You didn't use warnings or strict, even though your code seems to run fine with warnings turned on. The only difference for turning on strict is localizing %find_cache with my. My testing wasn't that extensive, just a little playing around, so there may be some deeper issues that i completely missed.

    • On the wantarray inside the if (&$coderef) block, it seems to read better if you put the return outside the ?: operator. This isn't necessarily like (ab)using map in a void context, but i consider it to be roughly similar (map has the for loop, ?: has the if/else block)

    • Since you don't sprinkle $_ anywhere but inside the for loop, the local $_; call can probably be moved to just inside the for loop. Whenever i see that at the top of a block i start expecting heavy usage of $_ to go on, but in this case i was reading carefully for no reason. Putting localization code in the smallest possible block is also generally a good thing (imho).

    • You include a hook for giving a starting point, but nothing for the ending point. It would be a simple change and add some (admittedly only moderately useful) functionality. This seems like a natural way to extend findone further though...

    • And speaking of extending code, now that you can findone you can write a wrapper to find as many as you want :-)

    Unfortunately my CPAN search skills aren't very good, so i don't know if anyone else has written a lazy grep (which is what this is with some extra bells and whistles, if my understanding is correct). If no one has posted one to CPAN or if the code submitted isn't that good this seems a good candidate for packaging and sending in...

    my $0_02,
    jynx

      Thanks for the feedback. I left out the my and the use (warnings && strict) while pasting. My bad.
      The reason I have the wantarray is because I wanted to be able to ignore the index with out forcing list context.
      my ($match,$index) = findone { $_ > 100 } @nums ; my $matchonly = findone { $_ > 100 } @nums ;
      Your right about the local $_ (Lava flow from first try). I lost it all together as $_ is localized by for anyway.
      Could you elaborate on what you had in mind for an end hook?

      What I think is interesting about this (for me anyway) is being able to retain state without resorting to objects. Using an %args hash for a function and this technique, it should be pretty easy to write iterator functions without OO. Nothing wrong with objects but for something like this, the grep syntax feels more natural and looks better to me.
      I don't know if there is anything on CPAN for this. The closest I found was first() in List::Util
      It was mainly an "I should be able to do that.." exercise.

      -Lee

      "To be civilized is to deny one's nature."

        For the wantarray, i see the usefulness, but i was commenting on moving the return outside of the ?:. It's not really a big deal, but it would look something like:
        return wantarray ? ($_,'blah') : $_;
        For the end hook, you have a start index, you can receive an extra scalar as the end index, and only pick it up if they've put in a start index as well. Prototype would be kinda like:

        findone { code } @list
        findone { code } @list, $start_index
        findone { code } @list, $start_index, $end_index

        So if they know they don't need to walk further than a certain place on the array they can be certain to stop there.

        jynx

Re: findone { coderef } @array
by dash2 (Hermit) on Feb 27, 2002 at 17:57 UTC
    I expect you have thought of this, but rather than the code key, you could have a function which returned an iterator. This probably makes for uglier calling syntax:
    my $next = grepper {$_ == &somethingorother} @array; while (my $found = $next->()) { # ... do something with each match }

    but it may be more beautiful underneath. Basically, you create a closure which holds the data and code you need, and a counter which remembers how far along you have got.

    dave hj~

    PS: oh ok. You're an abbot, you know this...

      I thought about that but I was trying to make something like a built-in. To be honest I've spent WAY to much time on this. I don't need to do this but it one of those things that I want to do, just to do.

      I do have a version that works usually (Still tweaking) using Filter::Simple that basically changes any call that is not on a line with for/foreach/while to a version findfirst that doesn't keep track.

      UPDATE
      Here's a rough version.
      package FindOne; ###################################################################### +############### # findone # usage: findone { coderef }, @array , OPTIONAL start_index OPTIONAL +end_index # Like grep but returns the first match. successive calls return succe +ssive elements. # Can be nested. Uses Weak references to prevent the leaking of memory +. ###################################################################### +############### use 5.006; use strict; use warnings; use Carp; use Filter::Simple; use WeakRef; FILTER_ONLY code => sub { my @code = split (/\n/); # This is suboptimal I'm sure. foreach (@code){ s/findone/FindOne::findfirst/g unless m/\b(for|foreach +|while)\W/; } $_ = join ("\n",@code); }; require Exporter; our @ISA = qw(Exporter); our $DEBUG = 0; # If $DEBUG is any true value, will tell you when calls to findone # get changed by source filter. # If any value above 1, will give more info. our @EXPORT = qw( &findone &findfirst ); our $VERSION = '0.02'; my %find_cache = (); # Temporary Storeage. sub findone(&\@;$$) { my $coderef = shift ; # Generate a key from the caller function so we can track whic +h call it is. my $codekey = join(":",caller(),"$_[0]","$coderef"); # Gene +rate key warn "Codekey is $codekey and code ref is $coderef" if $DEBUG +> 1; my %persistant_args = () ; ############################################################## +########## # Clean up old keys to prevent leaking mem. If the data does +not exist, # then it has been freed and we don't need to keep position in +fo. ############################################################## +########## while(my ($k,$v) = each %find_cache){ delete $find_cache{$k} unless defined ($v->{dataref}); } unless (defined $find_cache{$codekey} ){ warn "new find call" if $DEBUG > 1; %persistant_args = ('index' =>($_[1]||0), 'dataref' => $_[ +0] ); }else{ %persistant_args = %{$find_cache{$codekey}}; } my $end_index = $_[2] || $#{ $_[0] }; for (; $persistant_args{index} <= $end_index; $persistant_ar +gs{index}++ ){ warn "Element is $_" if $DEBUG > 1; $_ = $_[0]->[$persistant_args{index}]; if (&$coderef){ $persistant_args{index}++; $find_cache{$codekey} = {%persistant_args}; weaken ($find_cache{$codekey}->{dataref}); warn "$_ matched!" if $DEBUG > 1; return wantarray ? ($_ ,($persistant_args{index} -1 ) + ) : $_; } } delete $find_cache{$codekey}; return; } sub findfirst(&\@;$$) { my $coderef = shift ; carp "Really calling _findfirst" if $DEBUG; my %args = ('index' =>($_[1]||0), 'dataref' => $_[0] ); my $end_index = $_[2] || $#{ $_[0] }; for (; $args{index} <= $end_index; $args{index}++ ){ warn "Element is $_ at $args{index}" if $DEBUG > 1; $_ = $_[0]->[$args{index}]; if (&$coderef){ warn $_," matched!" if $DEBUG > 1; return wantarray ? ($_ ,($args{index} -1 ) ) : $_; }else{ warn $_, "Didn't match" if $DEBUG > 1; } } return; } 1; __END__ # Below is stub documentation for your module. You better edit it! =head1 NAME FindOne - Perl extension lazy searching for array elements. =head1 SYNOPSIS use FindOne; while (my ($val,$index) = findone { $_ > 10 } @tokens ){ print "Found $val at $index in \@tokens\n"; } my $foo = 'START'; die "Couldn't find $foo" unless findone { m/$foo/ } @tokens; =head1 DESCRIPTION findone usage: findone { coderef }, @array , OPTIONAL start_index OPTIONAL en +d_index This is a basically a lazy grep. Will return the first element found. On subsequent calls it will return subsequent elements. undef if none are left. If called in a list context, it will return ($match,$index); If called in a scalar context, just returns $match. Internally it uses Filter::Simple to change any calls to findone with a call to findfirst unless the current line contains a do, for or foreach statement. This avoids the following gotcha. my @tokens = (tokens here); while ($line =<>){ chomp($line); die "$line is not a valid token" unless findone { m/^$line/ } @tok +ens; } Because it keeps track of where and what it's called with, it will never match past the first match which means it will fail on the second block. (Unless there are multiple matches in @token +s. Then it will fail on the Matches+1 iteration.) $FindOne::DEBUG = 1; # Will output debug info. =bugs Plenty, I'm sure. =head2 EXPORT findone() findfirst() =head1 AUTHOR Pumphret Lee, E<lt>perl@leeland.net<gt> =head1 SEE ALSO WeakRef Filter::Simple =disclaimer This module is provided with no guarantee at all. Use at your own risk. You are responsible for anything that goes wrong if you use this. Enjoy! =copyright COPYRIGHT Copyright (c) 2000-2002, Lee Pumphret. All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself. L<perl>. =cut

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://147423]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others taking refuge in the Monastery: (4)
As of 2024-03-28 21:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found