in reply to Re: Global regexp
in thread Global regexp

I meant adjacent digits.

To be honest, my problem was more complex than just matching digits. I wanted to find all possible matches for any regexp.
Take
print $_, ", " for ('1aaab2' =~ /(a+b)/g)
#prints aaab,

It prints only one result instead of the list I need: 'ab', 'aab', 'aaab'.
The solution given by Corion works perfectly in this case:
print $_, ", " for ('1aaab2' =~ /(?=(a+b))/g)
#prints aaab, aab, ab,

But it has one side effect. See an example:
while ('1234' =~ /(\d\d)/g) {
print "$`<$&>$'", ", ";
}
#prints <12>34, 12<34>,

Extended regexp:
while ('1234' =~ /(?=(\d\d))/g) {
print "$`<$&>$'", ", ";
}
#prints <>1234, 1<>234, 12<>34,

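So the side effect is that with this extended regexp the $`, $& and $' variables can no longer be used as usual: the match itself is always empty, because the digits are only seen inside the lookahead.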

Re^3: Global regexp (all possible)
by ikegami (Patriarch) on Jun 17, 2008 at 16:35 UTC

    To be honest, my problem was more complex than just matching digits. I wanted to find all possible matches for any regexp.

    That's simple enough too.

    local our @results;  # Not "my".
    /(\d\d)(?{ push @results, $1 })(?!)/;
      Thanks!
      I guess the magic (?!) does all the work.
        (?!) never succeeds (like 5.10's (*FAIL)), so it forces the regexp engine to backtrack and look for another match if there is one.
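
A minimal, self-contained sketch of the idiom above, applied to the '1aaab2' example from the question. The variable names ($str, @results) and the stored start offset are illustrative assumptions, not part of ikegami's post. The offset is computed from pos(), which inside a (?{ }) block gives the current match position, so the prefix and suffix can be rebuilt without $`, $& and $'.

```perl
use strict;
use warnings;

our @results;          # package variable, not "my", as advised above
my $str = '1aaab2';

# Each time (a+b) matches, the code block records the start offset and the
# captured text; (?!) then fails on purpose, so the engine backtracks and
# tries the next possibility until none are left.
$str =~ /(a+b)(?{ push @results, [ pos() - length($1), $1 ] })(?!)/;

for my $m (@results) {
    my ($start, $text) = @$m;
    printf "%s<%s>%s\n",
        substr($str, 0, $start),               # what $` would hold
        $text,                                 # what $& would hold
        substr($str, $start + length $text);   # what $' would hold
}
```

The overall match always fails, so the return value of the match is of no use here; only the contents of @results matter.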
Re^3: Global regexp
by Corion (Patriarch) on Jun 17, 2008 at 09:28 UTC

    Which is to be expected, because my approach never matches anything in the "real body" of the regular expression. If you want different behaviour of the regex engine, you can only achieve that by making it match different things, which will result in the match variables containing different values. If you want to keep the behaviour of $`, $& and $', then you will need to fiddle with pos. You haven't stated why you don't want to do that.
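
One way "fiddling with pos" could look, as a rough sketch rather than Corion's actual code: after every successful match, pos is moved back to one character past the start of that match, so the next attempt may overlap the previous one. The names $str and @seen are assumptions made for the example. Note that this finds one match per starting position only, not every possible length.

```perl
use strict;
use warnings;

my $str = '1234';
my @seen;

while ($str =~ /\d\d/g) {
    # @- and @+ hold the start and end offsets of the last match.
    my ($from, $to) = ($-[0], $+[0]);

    push @seen, substr($str, 0, $from)                   # prematch
              . '<' . substr($str, $from, $to - $from)   # the match itself
              . '>' . substr($str, $to);                 # postmatch

    # Resume one character after the start of this match, not after its end.
    pos($str) = $from + 1;
}

print join(', ', @seen), "\n";
```

Because the pattern really consumes the digits here, $`, $& and $' would also hold the expected values inside the loop; the sketch uses offsets only to stay independent of them.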

      I think dealing with pos will result in ugly code: probably a loop over the string length, or otherwise time-consuming code.
      So you mean Perl regexps are always greedy: they match as much as possible, and there's no way to change that besides your approach with (?= )?

        Basically yes - Perl's regular expression engine always returns the leftmost match, and greedy quantifiers make that match as long as they can. Any other behaviour will need another regular expression engine. If you're running Perl 5.10, you have the option of using a different regular expression engine (see re::engine).

Re^3: Global regexp (Regexp::Exhaustive)
by lodin (Hermit) on Jun 17, 2008 at 18:09 UTC

    See Regexp::Exhaustive to get every possible match of a pattern against a string. It supports the use of $& et al (without global penalty).

    use Regexp::Exhaustive 'exhaustive';

    my @matches = exhaustive(
        'asdf' => qr/..??/,
        qw[ $` $& $' ],
    );

    printf "%s<%s>%s\n", @$_ for @matches;

    __END__
    <a>sdf
    <as>df
    a<s>df
    a<sd>f
    as<d>f
    as<df>
    asd<f>

    lodin

      Nice package. Thanks.
      A look at the code in Regexp/Exhaustive.pm showed that it uses the same constructions as ikegami's post: (?!) and (?{ push @array, ... })
Re^3: Global regexp
by starbolin (Hermit) on Jun 17, 2008 at 16:31 UTC

    So, have the regex engine do what it does best, that is, return the longest matching string. Then, have a routine that takes that string and provides the permutations.
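
    A rough sketch of that idea, under some loud assumptions: every wanted match lies inside the longest one, and the pattern does not depend on the surrounding text (no lookarounds or anchors). The names $re, $longest and @found are made up for the example, and "permutations" is taken to mean the substrings of the longest match that themselves match the whole pattern.

```perl
use strict;
use warnings;

my $re  = qr/a+b/;
my $str = '1aaab2';
my @found;

# Step 1: let the engine return its usual (greedy) match.
if ($str =~ /($re)/) {
    my $longest = $1;

    # Step 2: test every substring of that match against the anchored pattern.
    for my $start (0 .. length($longest) - 1) {
        for my $len (1 .. length($longest) - $start) {
            my $sub = substr($longest, $start, $len);
            push @found, $sub if $sub =~ /\A$re\z/;
        }
    }
}

print join(', ', @found), "\n";
```

    This tests a quadratic number of substrings, so it is only reasonable for short matches.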


    s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}