in reply to Arbitrary number of captures in a regular expression

This is an unfortunate consequence of the fact that semantics of Perl 5 regex were designed by a young idiot. They got an older (and possibly wiser) idiot to design Perl 6 regex, so you should be able to say something like:
$str ~~ mm/foo [ m (\d+) ]* bar/;
my @matches = @$0;
Indeed, the notion that repeating groups shouldn't throw away all but the final capture is fundamental to making Perl 6 regexes powerful enough to parse Perl 6. In a similar vein, the ordinary scalar comma operator does not throw away its left argument anymore in Perl 6 either. Perl 6 has pretty much cleaned out all the dirty little spots where Perl 5 has "return the last one" semantics.
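For contrast, here is a minimal Perl 5 sketch of the "return the last one" behaviour described above. The sample string and the variable names (`$str`, `$last`, `@all`) are made up for illustration, and the two-step workaround is just one common approach, not necessarily the one used elsewhere in this thread.

```perl
use strict;
use warnings;

my $str = 'foo m 1 m 2 m 3 bar';

# Perl 5: a quantified capture group keeps only its final iteration.
my ($last) = $str =~ /foo (?:m (\d+) )*bar/;
print "last capture: $last\n";    # prints "last capture: 3"

# One possible workaround: validate the overall shape first, then pull
# the numbers out of the repeated part with a /g match in list context.
my @all;
if ($str =~ /^foo ((?:m \d+ )*)bar$/) {
    @all = $1 =~ /m (\d+)/g;
}
print "all captures: @all\n";     # prints "all captures: 1 2 3"
```

The second step works because a /g match in list context returns every capture, so the repetition is moved out of the quantifier and into the match operator.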

Update: I also forgot the 'bar'...

Re^2: Arbitrary number of captures in a regular expression
by demerphq (Chancellor) on Sep 25, 2007 at 23:24 UTC

    At what cost tho? Maintaining that array and rolling it back during backtracking must impose a runtime cost for what IMO is not all that common a use case.

    ---
    $world=~s/war/peace/g

      At what cost tho? Maintaining that array and rolling it back during backtracking must impose a runtime cost for what IMO is not all that common a use case.
      Er, you're falling into Perl-5-Think here. The very fact that I used parens means that I do want to capture the array. If I didn't, I'd have used square brackets for the groupings I didn't want to capture. In Perl 6 we made it just as easy to not capture as it is to capture, so there's no need to guess about use cases in advance. You just write it how you want it.
Re^2: Arbitrary number of captures in a regular expression
by mwah (Hermit) on Sep 25, 2007 at 17:24 UTC
    TimToady:  This is an unfortunate consequence of the fact that semantics of Perl 5 regex were designed by a young idiot.

    I came to PM late and possibly didn't get the whole problem
    connected w/ the above topic. Wouldn't this be very simple to solve
    by code assertion?

    Did I miss sth.?
    my @foobar = (
        'foo m 1 m 2 m 3 m 4 bar',
        'foo m 2 m 4 m 7 bar',
        'foo m 1 bar',
    );
    my ($cnt, @match) = (0, ());
    /^foo (?:m (\d+)(?{push @{$match[$cnt]}, $^N}) )+bar(?{++$cnt})/ for @foobar;
    print map "@$_\n", @match;

    Or is it a no-go here to put code into regexes?

    Regards
    mwa
      I came to PM late and possibly didn't get the whole problem connected w/ the above topic.
      That's what the "in thread" link up at the top is for, I believe... :-)

      And, in fact, ikegami already posted a code assertion solution earlier in the thread.

      Wouldn't this be very simple to solve by code assertion?
      I suspect your definition of "very simple" must be different from mine. It's nice to have an escape hatch like code assertions for when the basic mechanism is insufficient (and indeed, Perl 6 provides more such escape mechanisms and also makes them easier to use), but it would be even better if the basic capture mechanism did what you wanted it to do. That's my idea of simple.

      Using lexical (my) variables from outside the regex in (?{ ... }) is dangerous. Your code will break if it's moved to a function. Use package (our/use vars) variables instead.

      Also, it's unsafe to modify @match at the point where you modify it. If any backtracking occurs through the (?{ ... }) that changes @match, you won't get the correct result. Now, the only time your code backtracks is when the match is unsuccessful. Even if you realized that and found it acceptable, you're playing with fire, because the smallest change to the regexp can change that.

      See earlier post Re: Arbitrary number of captures in a regular expression for the safe approach.
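The two points above (package variables, and changes that are undone on backtracking) can be sketched roughly as follows. This is a minimal sketch built on the `local` idiom described in perlre, not a copy of the linked node; the names `@acc`, `@captures` and `captures_of` are invented for the example.

```perl
use strict;
use warnings;

our @acc;         # scratch list, only ever changed via local inside the regex
our @captures;    # final result, assigned once the whole pattern has matched

sub captures_of {
    my ($str) = @_;
    local @acc      = ();
    local @captures = ();
    return unless $str =~ /
        ^foo[ ]
        (?: m[ ] (\d+) [ ]
            (?{ local @acc = (@acc, $^N) })   # undone if the engine backtracks
        )+
        bar
        (?{ @captures = @acc })               # reached only on overall success
    /x;
    return @captures;
}

print join(' ', captures_of('foo m 1 m 2 m 3 m 4 bar')), "\n";   # prints "1 2 3 4"
print scalar(() = captures_of('foo m 1 m 2 baz')), "\n";         # prints "0"
```

Because every change to `@acc` is made with `local`, the regex engine restores the previous value whenever it backtracks past that code block, and the results are copied out only at the very end of the pattern.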

        ikegami: ... If any backtracking occurs through the (?{ ... }) that changes @match, you won't get the correct result. ...

        OK, I got your point. If the data isn't as regular as the samples given,
        the whole wizardry will break. Making that robust takes much more effort
        (which you already went through in another node => Re: Arbitrary number of captures in a regular expression).

        Thank you very much for your hint,

        Regards
        mwa