in reply to Getting + and * to generate multiple captures
Yet another completely different approach is to embed code in the regexp.
    # We need use re 'eval' because we use interpolation and (?{...})
    # in the same regexp. Beware of the implications of this directive.
    use re 'eval';

    our @matches;     # Don't use a lexical for this.
    local *matches;   # Protect our caller's variables.

    /
       (?{ [] })  # Create a stack
       $text
       (?:
          \s
          (\w+)
          (?{ [ @{$^R}, $1 ] })  # Save last match on the stack.
       )+
       (?{ @matches = @{$^R}; })  # Success! Save the result.
    /x;
Since Perl 5.8.0, the $1 in the above can be replaced with $^N.
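As a rough, self-contained sketch of the $^N variant: the helper name words_after, the @words variable, and the sample input are made up for illustration and are not part of the original code. It assumes Perl 5.8.0 or later and that $text contains no whitespace or regexp metacharacters (under /x, whitespace in an interpolated pattern would be ignored).

```perl
use strict;
use warnings;
use re 'eval';  # $text is interpolated into a pattern that also contains (?{...})

our @words;     # package variable rather than a lexical, following the advice above

# Hypothetical helper: returns an array ref holding every \w+ that follows
# $text in $input, or an empty array ref if the pattern does not match.
sub words_after {
    my ($text, $input) = @_;
    local @words = ();              # protect the caller's copy of @words
    my $ok = $input =~ /
        (?{ [] })                   # start with an empty stack in $^R
        $text
        (?:
            \s
            (\w+)
            (?{ [ @{$^R}, $^N ] })  # $^N: text of the most recently closed group
        )+
        (?{ @words = @{$^R} })      # copy the stack out on success
    /x;
    return $ok ? [@words] : [];
}

my $found = words_after('fruits:', 'fruits: apple banana cherry');
print "@$found\n";                  # apple banana cherry
```

Using $^N means the code block does not need to know the number of the capture group it follows, which helps if more groups are added to the pattern later.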
It's possible to simplify the above code since the regexp engine will never backtrack through (?{ [ @{$^R}, $1 ] }) in this particular regexp, but it's much safer to assume there's always the possibility of backtracking through any (?{...}). That's why $^R is used.
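To illustrate why backtracking matters, here is a small sketch using a different, hypothetical pattern (not the one from the post) in which the engine does have to give back one iteration of the loop. The names compare_capture_styles, @naive and @safe are invented for this example. The naive version pushes straight onto an array, so anything pushed during an iteration that is later backtracked stays there; the $^R version relies on Perl restoring $^R when a code block is backtracked.

```perl
use strict;
use warnings;

our (@naive, @safe);   # package variables, as recommended for (?{...}) blocks

# Hypothetical helper: runs both capture styles on $input and returns
# two array refs (naive results, $^R-based results).
sub compare_capture_styles {
    my ($input) = @_;
    local @naive = ();
    local @safe  = ();

    # Naive: a push inside the loop is a plain side effect and is
    # not undone when the engine backtracks out of an iteration.
    $input =~ / (?: (\w) (?{ push @naive, $1 }) )+ \d /x;

    # $^R-based: the old value of $^R is restored on backtracking, so
    # only the iterations that are part of the final match survive.
    $input =~ /
        (?{ [] })
        (?: (\w) (?{ [ @{$^R}, $1 ] }) )+
        \d
        (?{ @safe = @{$^R} })
    /x;

    return ([@naive], [@safe]);
}

my ($naive, $safe) = compare_capture_styles('ab1');
print "naive: @$naive\n";   # may include the backtracked '1'
print "safe:  @$safe\n";
```

With the input 'ab1', the greedy loop first consumes all three characters and then has to release the final '1' so that \d can match it, which is exactly the situation where the two styles can disagree.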
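Update: The stack is unnecessarily big in the above code, because every iteration copies all of the previous captures into a new array. The following builds a linked list of two-element nodes instead, which greatly reduces the size of the stack and probably also speeds things up greatly.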
    sub flatten_list {
       my ($rv, $p) = @_;
       @$rv = ();
       while ($p) {
          unshift @$rv, $p->[1];
          $p = $p->[0];
       }
    }

    our @matches;
    local *matches;

    /
       $text
       (?:
          \s
          (\w+)
          (?{ [ $^R, $1 ] })
       )+
       (?{ flatten_list \@matches, $^R })
    /x;
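To make the data structure concrete, the sketch below runs flatten_list (repeated here so the example is self-contained) on a hand-built chain. The chain and its sample words are invented for illustration; it is meant to mirror what three loop iterations would leave in $^R, assuming $^R is undefined when the match starts.

```perl
use strict;
use warnings;

# Same routine as above: walks the chain from the newest node back to
# the oldest, unshifting each value so the result ends up in match order.
sub flatten_list {
    my ($rv, $p) = @_;
    @$rv = ();
    while ($p) {
        unshift @$rv, $p->[1];   # slot 1 holds this node's captured value
        $p = $p->[0];            # slot 0 points at the previous node (or undef)
    }
}

# Hypothetical chain: each node is [ previous_node, captured_text ].
my $chain = [ [ [ undef, 'apple' ], 'banana' ], 'cherry' ];

my @out;
flatten_list(\@out, $chain);
print "@out\n";                  # apple banana cherry
```

Each iteration of the regexp now allocates one two-element array instead of copying every earlier capture, so the work per iteration no longer grows with the number of matches.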
Re^2: Getting + and * to generate multiple captures
by jgeisler (Initiate) on Aug 17, 2006 at 19:29 UTC
by ikegami (Patriarch) on Aug 17, 2006 at 19:53 UTC