Yet another completely different approach is to embed code in the regexp.
    # We need use re 'eval' because we use interpolation and (?{...})
    # in the same regexp. Beware of the implications of this directive.
    use re 'eval';

    our @matches;    # Don't use a lexical for this.
    local *matches;  # Protect our caller's variables.

    /
       (?{ [] })                    # Create a stack
       $text
       (?:
          \s
          (\w+)
          (?{ [ @{$^R}, $1 ] })     # Save last match on the stack.
       )+
       (?{ @matches = @{$^R}; })    # Success! Save the result.
    /x;
Since Perl 5.8.0, the $1 in the above can be replaced with $^N.
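For anyone who wants to try it, here is a minimal self-contained sketch of the same regexp using $^N. The values of $text and $input are made-up sample data (they aren't from the original question), and the code runs at file scope, so the local *matches from above is left out.

```perl
use strict;
use warnings;
use re 'eval';   # kept from the code above: interpolation plus (?{...})

# Hypothetical sample data, chosen only for illustration.
my $text  = 'colors:';
my $input = 'colors: red green blue';

our @matches;    # Package variable, as in the code above.

$input =~ /
   (?{ [] })                     # Start with an empty stack
   $text
   (?:
      \s
      (\w+)
      (?{ [ @{$^R}, $^N ] })     # Push the text of the group that just closed
   )+
   (?{ @matches = @{$^R}; })     # Success! Save the result.
/x;

print join(',', @matches), "\n";   # prints: red,green,blue
```

Using $^N means the code block doesn't need to know the number of the capture group it follows, so it keeps working if groups are added earlier in the regexp.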
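It's possible to simplify the above code, since the regexp engine will never backtrack through (?{ [ @{$^R}, $1 ] }) in this particular regexp, but it's much safer to assume that backtracking through any (?{...}) is always possible. That's why $^R is used: when the engine backtracks past a code block, it restores the previous value of $^R, so anything saved by an abandoned match attempt is discarded automatically.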
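Update: The stack is unnecessarily big in the above code, because every iteration copies the whole array built so far into a new one. The following keeps each match in a two-element node that points to the previous node instead. That greatly reduces the size of the stack, which probably also speeds things up greatly.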
    sub flatten_list {
       my ($rv, $p) = @_;
       @$rv = ();
       while ($p) {
          unshift @$rv, $p->[1];
          $p = $p->[0];
       }
    }

    our @matches;
    local *matches;

    /
       $text
       (?:
          \s
          (\w+)
          (?{ [ $^R, $1 ] })
       )+
       (?{ flatten_list \@matches, $^R })
    /x;
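To make the data structure easier to picture, here is a small sketch that builds the same kind of chain by hand and runs flatten_list on it, with no regexp involved. The $chain variable and the sample words are made up for illustration; $chain stands in for what $^R would hold after three iterations, assuming $^R was undefined when the match started.

```perl
use strict;
use warnings;

# Same helper as above: walk the chain from the newest node back to the
# oldest, unshifting so the result ends up in match order.
sub flatten_list {
   my ($rv, $p) = @_;
   @$rv = ();
   while ($p) {
      unshift @$rv, $p->[1];
      $p = $p->[0];
   }
}

# Hand-built stand-in for $^R: each node is [ previous node, captured text ].
my $chain = undef;                          # No node before the first match
$chain = [ $chain, $_ ] for qw( red green blue );

# $chain is now [ [ [ undef, 'red' ], 'green' ], 'blue' ].
my @words;
flatten_list(\@words, $chain);

print join(',', @words), "\n";   # prints: red,green,blue
```

Each iteration allocates one two-element array instead of copying everything collected so far, and the chain is only flattened once, after the match has succeeded.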
In reply to Re: Getting + and * to generate multiple captures
by ikegami
in thread Getting + and * to generate multiple captures
by jgeisler