The problem
Some years ago I wrote a class to find all possible matches of a pattern against a string, including overlapping matches. If there were an /a switch that did this, using it would look like this:
    foreach ('abc' =~ /.+?/a) {
        print "$_\n";
    }
    __END__
    a
    ab
    abc
    b
    bc
    c
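For readers who want to see the effect without any module: one well-known way to get every match, overlaps included, is to record the current match inside a `(?{ })` code block and then force a failure with `(?!)`, so the engine backtracks through all alternatives. This is only a minimal sketch of that general technique, not necessarily how the class described here works; `all_matches` and `@found` are names made up for the example.

```perl
use strict;
use warnings;
use re 'eval';    # allows a code block next to an interpolated pattern on older perls

our @found;       # hypothetical package variable filled by the code block

# Hypothetical helper: return every match of $re in $string, overlaps included.
sub all_matches {
    my ($string, $re) = @_;
    local @found = ();
    # Record the current match, then fail with (?!) so the engine backtracks
    # and tries the next way (and the next start position) to match.
    $string =~ /$re(?{ push @found, $& })(?!)/;
    my @copy = @found;
    return @copy;
}

print "$_\n" for all_matches('abc', qr/.+?/);
```

The overall match always fails by design; only the side effect of the code block matters.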
I figure I'm going to release it to CPAN. Before I do that I'd appreciate some feedback.
Description of the classes
I currently call it Regexp::AllMatches and use an OO interface. Here's how it's used:
    use Regexp::AllMatches;

    my $matcher = Regexp::AllMatches->new(STRING => qr/PATTERN/);

    while (my ($match) = $matcher->next) {
        print "$match\n";
    }
$matcher is a simple iterator, and the only methods are
* new
* clone
* next
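To make the shape of that interface concrete, here is a rough sketch of an iterator with the same three methods. It is an illustration under assumptions, not the module's code: the package name `Sketch::AllMatches` is hypothetical, the matches are plain strings rather than match objects, and everything is collected eagerly in `new`, which the real class may well not do.

```perl
use strict;
use warnings;
use re 'eval';    # allows a code block next to an interpolated pattern on older perls

package Sketch::AllMatches;    # hypothetical name, not the real module

our @found;                    # filled by the (?{ }) block in new()

sub new {
    my ($class, $string, $re) = @_;
    local @found = ();
    # (?!) always fails, so the engine backtracks through every possible match.
    $string =~ /$re(?{ push @found, $& })(?!)/;
    return bless { matches => [@found], pos => 0 }, $class;
}

sub next {
    my ($self) = @_;
    # Returning an empty list is what ends "while (my ($match) = ...)".
    return if $self->{pos} >= @{ $self->{matches} };
    return $self->{matches}[ $self->{pos}++ ];
}

sub clone {
    my ($self) = @_;
    # Copy both the match list and the position so the clone advances independently.
    return bless { matches => [ @{ $self->{matches} } ], pos => $self->{pos} }, ref $self;
}

package main;

my $matcher = Sketch::AllMatches->new('abc' => qr/.+?/);
while (my ($match) = $matcher->next) {
    print "$match\n";
}
```

Because `next` returns an empty list when it runs out, a match that is an empty string or "0" does not end the loop early.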
$match is a match object that stringifies to the matched string ($&) and implements the following methods:
* prematch ($`)
* match ($&)
* postmatch ($')
* group ($<digits>)
* groups
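A match object like this can be approximated with `overload` and the offsets from `@-` and `@+`. The sketch below is only meant to show the idea; `Sketch::Match` and its constructor arguments are hypothetical, and the real match object may be built quite differently.

```perl
use strict;
use warnings;

package Sketch::Match;    # hypothetical name, not the real match class

# Stringify to the matched text, like $&.
use overload '""' => sub { $_[0]->match }, fallback => 1;

# Assumed constructor: the subject string plus copies of @- and @+.
sub new {
    my ($class, $string, $starts, $ends) = @_;
    return bless { string => $string, starts => [@$starts], ends => [@$ends] }, $class;
}

sub prematch  { my $self = shift; return substr $self->{string}, 0, $self->{starts}[0] }
sub match     { my $self = shift; return $self->group(0) }
sub postmatch { my $self = shift; return substr $self->{string}, $self->{ends}[0] }

# group(n) corresponds to $n; undef if that group did not take part in the match.
sub group {
    my ($self, $n) = @_;
    my ($s, $e) = ($self->{starts}[$n], $self->{ends}[$n]);
    return undef unless defined $s && defined $e;
    return substr $self->{string}, $s, $e - $s;
}

sub groups {
    my $self = shift;
    return map { $self->group($_) } 1 .. $#{ $self->{starts} };
}

package main;

my $string = 'hello world';
$string =~ /(o) (w)/ or die "no match";
my $m = Sketch::Match->new($string, [@-], [@+]);

print "$m\n";
print join('|', $m->prematch, $m->match, $m->postmatch), "\n";
print join(',', $m->groups), "\n";
```

Storing offsets rather than the special variables themselves keeps the object valid after later matches overwrite `$&` and friends.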
I also wrote Regexp::AllMatches::Extended that implements some extra convenience methods at the cost of memory and speed.
* curr
* prev
* reset
* all
Regexp::AllMatches and Regexp::AllMatches::Extended will be two different modules. The match object is currently defined in Regexp::AllMatches, and is at the moment not for public instantiation.
So, what do you think of
lodin
Update: Regexp::Exhaustive is the new name. I'm not too happy about Regexp::Exhaustive::Extended though. Any ideas? How about Regexp::Exhaustive::Extra(s) or Regexp::Exhaustive::Convenient?
Update: After a bit of cleaning Regexp::Exhaustive::Extended became nothing but a generic iterator decorator, so it's gone. The all method is now put directly in Regexp::Exhaustive instead.
Update: Uploaded to CPAN as Regexp::Exhaustive.
Re: RFC: Regexp::AllMatches
by moritz (Cardinal) on Aug 06, 2007 at 17:06 UTC
by lodin (Hermit) on Aug 06, 2007 at 17:30 UTC
by wind (Priest) on Aug 07, 2007 at 00:25 UTC
Re: RFC: Regexp::AllMatches
by blokhead (Monsignor) on Aug 07, 2007 at 02:07 UTC
by lodin (Hermit) on Aug 07, 2007 at 02:37 UTC