in reply to Re: Arbitrary number of captures in a regular expression
in thread Arbitrary number of captures in a regular expression

TimToady:  This is an unfortunate consequence of the fact that semantics of Perl 5 regex were designed by a young idiot.

I came to PM late and possibly didn't get the whole problem
connected with the above topic. Wouldn't this be very simple to solve
by code assertion?

Did I miss something?
my @foobar = (
    'foo m 1 m 2 m 3 m 4 bar',
    'foo m 2 m 4 m 7 bar',
    'foo m 1 bar',
);
my ($cnt, @match) = (0, ());

/^foo (?:m (\d+)(?{push @{$match[$cnt]}, $^N}) )+bar(?{++$cnt})/ for @foobar;

print map "@$_\n", @match;

Or is it a no-go here to put code into regexes?

Regards
mwa

Re^3: Arbitrary number of captures in a regular expression
by TimToady (Parson) on Sep 25, 2007 at 18:33 UTC
    I came to PM late and possibly didn't get the whole problem connected with the above topic.
    That's what the "in thread" link up at the top is for, I believe... :-)

    And, in fact, ikegami already posted a code assertion solution earlier in the thread.

    Wouldn't this be very simple to solve by code assertion?
    I suspect your definition of "very simple" must be different from mine. It's nice to have an escape hatch like code assertions for when the basic mechanism is insufficient (and indeed, Perl 6 provides more such escape mechanisms and also makes them easier to use), but it would be even better if the basic capture mechanism did what you wanted it to do. That's my idea of simple.
Re^3: Arbitrary number of captures in a regular expression
by ikegami (Patriarch) on Sep 25, 2007 at 19:50 UTC

    Using lexical (my) variables from outside the regex in (?{ ... }) is dangerous. Your code will break if it's moved to a function. Use package (our/use vars) variables instead.
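
A minimal sketch of the package-variable advice, assuming a hypothetical helper named numbers_between_foo_and_bar and a hypothetical package array @captured (neither name comes from the thread). It only addresses the scoping concern: it still pushes inside the code block, so the backtracking caveat raised in this same reply applies to it unchanged.

```perl
use strict;
use warnings;

# Package variable (assumed name): the (?{ ... }) block refers to it by
# package, so it does not depend on capturing a lexical from the
# surrounding scope.
our @captured;

sub numbers_between_foo_and_bar {
    my ($text) = @_;

    # Give every call its own dynamically scoped, empty copy.
    local @captured = ();

    # Same pattern as in the question, but pushing onto the package array.
    $text =~ /^foo (?:m (\d+)(?{ push @captured, $^N }) )+bar/
        or return;

    return @captured;
}

print join(' ', numbers_between_foo_and_bar('foo m 1 m 2 m 3 m 4 bar')), "\n";
# prints "1 2 3 4"
```

Because @captured is localized per call, anything pushed during a failed match is discarded when the function returns.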

    Also, it's unsafe to modify @match at the point where you modify it. If any backtracking occurs through the (?{ ... }) that changes @match, you won't get the correct result. Right now, the only time your code backtracks is when the match is unsuccessful. Even if you realized that and found it acceptable, you're playing with fire, for the smallest change to the regexp can change that.

    See earlier post Re: Arbitrary number of captures in a regular expression for the safe approach.
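
The earlier post is not reproduced here. As a rough sketch of one backtracking-safe pattern (not necessarily the code in that node), the captures can be accumulated in $^R, which perlre documents as being restored when a code block is backtracked over, and copied out only in a final block that runs once the rest of the pattern has matched. The names captures_for and @result are assumptions made for illustration.

```perl
use strict;
use warnings;

# Package variable (assumed name) that receives the final list.
our @result;

sub captures_for {
    my ($text) = @_;
    local @result = ();

    # 1st block: seed $^R with an empty array reference.
    # 2nd block: build a NEW array (old list plus latest capture) instead of
    #            modifying anything in place, so backtracking simply
    #            restores the previous $^R.
    # 3rd block: runs only after "bar" has matched; copy the list out.
    $text =~ /^foo (?{ [] })(?:m (\d+)(?{ [ @{$^R}, $^N ] }) )+bar(?{ @result = @{$^R} })/
        or return;

    return @result;
}

print join(' ', captures_for('foo m 1 m 2 m 3 m 4 bar')), "\n";
# prints "1 2 3 4"
```

The design choice is that nothing outside the regex is touched until the last block, so a partial or abandoned match path leaves no trace in @result.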

      ikegami: ... If any backtracking occurs through the (?{ ... }) that changes @match, you won't get the correct result. ...

      OK, I got your point. If the data isn't as regular as in my example,
      the whole wizardry will break. Making it robust takes much more effort
      (which you already went through in another node => Re: Arbitrary number of captures in a regular expression).

      Thank you very much for your hint,

      Regards
      mwa