I expect there will be more control over this feature in perl6 - we've certainly discussed the need, specifically with reference to //gc-style matches, though I don't think Larry has settled on the mechanisms yet.

I don't think I fully understand the vi interpretation from your description, but even if there is no explicit combination of overrides that would request it, I'd expect it to be easy to add such a thing by subclassing the grammar grammar.

Hugo

Re^3: zero-length match increments pos() (saner)
by tye (Sage) on Feb 23, 2005 at 07:44 UTC

    The thought plickens...

    I wanted to add some more examples to make sure the point is clear and so needed a handy copy of sed and eventually turned to my Zaurus (since I was in bed) and produced:

    $ echo bbbaaabbb | sed -e 's/\(b*\)/(\1)/g'
    (bbb)()a()a()a(bbb)
    $

    which added a new point on the speculum (ducks [1]).

    I eventually calmed down and convinced myself it was just a quirk of busybox's imitation of sed and found a real copy of sed on FreeBSD and produced:

    $ echo bbbaaabbb | sed -e 's/\(b*\)/(\1)/g'
    (bbb)a()a()a(bbb)
    $

    to compare to Perl:

    $ echo bbbaaabbb | perl -pe 's/(b*)/(\1)/g'
    (bbb)()a()a()a(bbb)()
    ()
    $ echo bbbaaabbb | perl -lpe 's/(b*)/(\1)/g'
    > (bbb)()a()a()a(bbb)()
    $

    So we see that the ancient lords of s///g, sed and vi(ex), agree that it doesn't make sense for two successive matches to end at the same point.

    We also see how easy it is to overlook this point. The authors of busybox (or the regex library it uses) realized that once you reach the end, you are done, but not that it doesn't make sense for two matches to end at the same point other than at the end: (bbb)()a()a()a(bbb)
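For anyone who wants to experiment with the sed-style rule, here is a minimal sketch in Perl 5. The helper name `subst_saner` is hypothetical (it is not a core function or an existing module), and the rule it implements is only the one described above: a match is rejected when it ends where the previous match ended. It has been checked against the `b*` example only, so treat it as an illustration and not as a faithful reimplementation of sed.

```perl
use strict;
use warnings;

# subst_saner is a hypothetical helper, not a core Perl function.
# It replaces every match of $re in $text with $wrap->(matched text),
# but rejects a match that ends where the previous match ended.
sub subst_saner {
    my ($text, $re, $wrap) = @_;
    my $len      = length $text;
    my $out      = '';
    my $pos      = 0;     # where the next search starts
    my $last_end = -1;    # end offset of the last accepted match

    while ($pos <= $len) {
        pos($text) = $pos;    # assigning pos() also clears Perl's zero-length-match flag
        last unless $text =~ /$re/g;
        my ($start, $end) = ($-[0], $+[0]);

        if ($end == $last_end) {
            # Empty match right behind the previous match: skip it,
            # copy one character through, and search again further on.
            last if $pos >= $len;
            $out .= substr($text, $pos, 1);
            $pos++;
            next;
        }

        # Copy the unmatched gap, then the wrapped match.
        $out .= substr($text, $pos, $start - $pos);
        $out .= $wrap->(substr($text, $start, $end - $start));
        $last_end = $end;

        if ($end == $start) {
            # Accepted empty match: step over one character so the loop advances.
            $out .= substr($text, $end, 1) if $end < $len;
            $pos = $end + 1;
        }
        else {
            $pos = $end;
        }
    }
    $out .= substr($text, $pos) if $pos < $len;
    return $out;
}

my $input = 'bbbaaabbb';

# Built-in Perl 5 behaviour, for comparison.
(my $perl5 = $input) =~ s/(b*)/($1)/g;

my $saner = subst_saner($input, qr/b*/, sub { "($_[0])" });

print "perl5: $perl5\n";
print "saner: $saner\n";
```

The loop sets `pos()` by hand before each search so that Perl's own zero-length bookkeeping does not interfere with the rule being tested.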

    So I'm sure Perl6 will need to support a Perl5-compatible mode, but it'd be nice if it also supported a sed / vi / saner mode (and, personally, I'd make that the default mode -- the Perl5 mode has even been accused of being a "bug" right here at PerlMonks more than once, and not only by me).

    While thinking about this, I also envisioned a fun 'watch me backtrack' mode.

    - tye        

    [1] That's enough to make a Welsh Harlequin blush.

      Do we need a mode for this? Getting all the matches? It's possible to do with an embedded code block (as I think you know :-)

      perl -le"$_='bbaabb'; /b*(?{print '.' x $-[0],qq<($&)>})(*FAIL)/g"
      (bb)
      (b)
      ()
      .(b)
      .()
      ..()
      ...()
      ....(bb)
      ....(b)
      ....()
      .....(b)
      .....()
      ......()

      On earlier perls than mine you can spell (*FAIL) as (?!)
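As a script, the same idea with the `(?!)` spelling might look like the sketch below. The variable names are illustrative, and the matches are collected in a package variable (`our @seen`) instead of being printed directly, on the assumption that this is the safer choice for code blocks inside a regex on older perls.

```perl
use strict;
use warnings;

our @seen;    # package variable, filled from inside the (?{ ... }) block
my $text = 'bbaabb';

# (?!) can never match, so the engine backtracks through every way
# b* can match at every starting offset, running the code block each time.
$text =~ /b*(?{ push @seen, ('.' x $-[0]) . "($&)" })(?!)/;

print "$_\n" for @seen;
```

Each recorded line is indented with one dot per character of starting offset, as in the one-liner above.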

      ---
      $world=~s/war/peace/g

      I also envisioned a fun 'watch me backtrack' mode.

      These are precisely the matches that will be returned by another option, which I think was called ':exhaustive'.

      I expect that option also to be very useful for combinatorial exercises.
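To give a feel for the kind of combinatorial exercise meant here, the following sketch enumerates every way to split a string into two non-empty parts. It uses the Perl 5 forced-backtracking idiom shown earlier in the thread, not the Perl 6 option itself, whose name and exact behaviour are not settled in this discussion. The variable names are illustrative.

```perl
use strict;
use warnings;

our @splits;    # package variable, filled from inside the (?{ ... }) block
my $word = 'abcd';

# The code block runs each time both groups match and $ is satisfied;
# the trailing (?!) then forces the engine to try the next combination.
$word =~ /^(.+)(.+)$(?{ push @splits, "$1|$2" })(?!)/;

print "$_\n" for @splits;
```

Because `.+` is greedy, the longest first part is tried first, so the splits come out from longest prefix to shortest.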

      Hugo