The thought plickens...
I wanted to add some more examples to make sure the point was clear, so I needed a handy copy of sed. I eventually turned to my Zaurus (since I was in bed) and produced:
    $ echo bbbaaabbb | sed -e 's/\(b*\)/(\1)/g'
    (bbb)()a()a()a(bbb)
    $
which added a new point on the speculum (ducks[1]).
I eventually calmed down and convinced myself it was just a quirk of busybox's imitation of sed, then found a real copy of sed on FreeBSD and produced:
    $ echo bbbaaabbb | sed -e 's/\(b*\)/(\1)/g'
    (bbb)a()a()a(bbb)
    $
to compare to Perl:
    $ echo bbbaaabbb | perl -pe 's/(b*)/(\1)/g'
    (bbb)()a()a()a(bbb)()
    () $ echo bbbaaabbb | perl -lpe 's/(b*)/(\1)/g'
    > (bbb)()a()a()a(bbb)()
    $
So we see that the ancient lords of s///g, sed and vi(ex), agree that it doesn't make sense for two successive matches to end at the same point.
We also see how easy it is to overlook this point. The authors of busybox (or the regex library it uses) realized that once you reach the end, you are done, but not that it doesn't make sense for two matches to end at the same point other than at the end:
    (bbb)()a()a()a(bbb)
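The sed / vi rule described above can be sketched in a few lines. This is only an illustration, written in Python rather than sed or Perl, and it is not how any of those tools is actually implemented. The function name `sub_sed_style` and its `wrap` callback are made up for this example. The one rule it encodes is: an empty match is rejected when it would start exactly where the previous accepted match ended.

```python
import re

def sub_sed_style(pattern, wrap, text):
    """Global substitution that refuses an empty match which begins
    exactly where the previous accepted match ended (hypothetical helper)."""
    rx = re.compile(pattern)
    out = []
    pos = 0
    last_end = -1  # end offset of the previous accepted match, -1 if none yet
    while pos <= len(text):
        m = rx.match(text, pos)  # try to match starting exactly at pos
        empty_after_match = (
            m is not None and m.end() == m.start() and m.start() == last_end
        )
        if m is not None and not empty_after_match:
            out.append(wrap(m.group(0)))
            last_end = m.end()
            if m.end() > pos:
                # Non-empty match: resume scanning right after it.
                pos = m.end()
                continue
        # No match, a rejected empty match, or an accepted empty match:
        # copy one character through unchanged and move on.
        if pos < len(text):
            out.append(text[pos])
        pos += 1
    return "".join(out)

print(sub_sed_style(r"b*", lambda s: "(" + s + ")", "bbbaaabbb"))
```

For the input `bbbaaabbb` (without the trailing newline) this prints `(bbb)a()a()a(bbb)`, the same string FreeBSD's sed produced above. Dropping the `empty_after_match` check would give the Perl-style result with an extra `()` after each `(bbb)`.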
So I'm sure Perl6 will need to support a Perl5-compatible mode, but it'd be nice if it also supported a sed / vi / saner mode (and, personally, I'd make that the default mode; the Perl5 mode has even been accused of being a "bug" right here at PerlMonks more than once, and not only by me).
While thinking about this, I also envisioned a fun 'watch me backtrack' mode, where matching 'bbaabb' =~ /b*/g would return the following matches in the following order:
    [bbaabb]
    (bb)
    (b)
    ()
    (b)
    ()
    .()
    ..()
    ...(bb)
    ...(b)
    ...()
    ....(b)
    ....()
    .....()
    [bbaabb]
while matching 'bbaabb' =~ /b*?/g would return the following matches in the following order:
    [bbaabb]
    ()
    (b)
    (bb)
    ()
    (b)
    .()
    ..()
    ...()
    ...(b)
    ...(bb)
    ....()
    ....(b)
    .....()
    [bbaabb]
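To make the two orderings concrete, here is a rough sketch that enumerates them for this one pattern. It is hand-rolled for `b*` and `b*?` only and does not hook into any real regex engine's backtracking. The function name `candidate_matches` is invented for the example, and it is written in Python purely for illustration. At each starting offset it counts the run of `b` characters, then lists every length the quantifier could settle on: longest first for the greedy form, shortest first for the non-greedy one.

```python
def candidate_matches(text, greedy=True):
    """Return (offset, matched_text) for every way b* (greedy=True) or
    b*? (greedy=False) could match, in the order the lengths are tried."""
    results = []
    for start in range(len(text) + 1):
        # Count how many consecutive 'b' characters begin at this offset.
        run = 0
        while start + run < len(text) and text[start + run] == "b":
            run += 1
        # Greedy tries the longest length first; non-greedy the shortest.
        lengths = range(run, -1, -1) if greedy else range(0, run + 1)
        for n in lengths:
            results.append((start, text[start:start + n]))
    return results

# Show each candidate indented by its starting offset.
for offset, matched in candidate_matches("bbaabb", greedy=True):
    print(" " * offset + "(" + matched + ")")
```

Passing `greedy=False` yields the second ordering. Either way there are 13 candidates for `bbaabb`, matching the two listings above.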
- tye
[1] That's enough to make a Welsh Harlequin blush.
In reply to "Re^3: zero-length match increments pos() (saner)" by tye, in thread "zero-length match increments pos()" by Errto