suhailck has asked for the wisdom of the Perl Monks concerning the following question:

perl -e 'my @words = split /(?<=[A-Z])(?{print pos})/, "CCamelString";
print "\n";
print join "_", map { lc } @words;'
OUTPUT
112277 #HERE
c_c_amels_tring
Can anyone please tell me why pos prints each value twice in the program above?

Replies are listed 'Best First'.
Re: pos CONFUSION in REGEX
by moritz (Cardinal) on Dec 14, 2009 at 16:42 UTC
    Your regex matches with length zero at position 1, so after the first match pos is set to 1.

    Then the regex engine tries again starting from position 1, and matches again. Since it matched with zero width twice in a row at the same position, the regex engine fears an infinite loop and artificially bumps the position.

    Most patterns don't have side effects, so usually you don't observe this behaviour - but it's the reason why split // works (unless that case is special-cased).
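
A minimal sketch of the effect described above, assuming the pattern in the question was meant to be `(?<=[A-Z])`. Instead of printing, the code block records each position in an array (`@seen` is a name made up for this illustration), so the repeated attempts can be inspected after the split. The original poster's output suggests each position shows up twice.

```perl
use strict;
use warnings;

# Positions at which the embedded code block ran. A package variable is
# used so the (?{ }) block behaves the same on older perls.
# (@seen is a hypothetical name, not from the original post.)
our @seen;

# Zero-width split point: after every uppercase letter.
my @words = split /(?<=[A-Z])(?{ push @seen, pos() })/, 'CCamelString';

print "code block ran at: @seen\n";
print join('_', map { lc } @words), "\n";
```

The code block runs every time the lookbehind succeeds, including on attempts that split later rejects, which is why a side effect such as print can fire more often than the number of fields would suggest.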

      Thank you, moritz,
      and Anonymous Monk, for your advice to use code tags :)
Re: pos CONFUSION in REGEX
by ikegami (Patriarch) on Dec 14, 2009 at 18:18 UTC
    Because it matches zero characters, the pattern could match at exactly the same starting position and with the same length more than once. An infinite number of times, really. Perl detects this and causes the pattern not to match there, so backtracking occurs.

    You can see the same thing in this simpler example:

    $ perl -le'"aaaaab" =~ /[aA](?{ print pos })[bB]/'
    1
    2
    3
    4
    5

    Or better yet:

    $ perl -le'"aaaaac" =~ /[aA](?{ print pos })[bB]/'
    1
    2
    3
    4
    5
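
The same experiment as a small script, offered as a sketch rather than anything from the original reply. It pushes each position onto an array instead of printing it, so the attempts that were later abandoned stay visible whether or not the overall match succeeds. The names `@tried` and `%ran` are invented for this illustration.

```perl
use strict;
use warnings;

our @tried;   # hypothetical name: every pos at which the block fired
our %ran;     # hypothetical name: positions recorded per input string

for my $string ('aaaaab', 'aaaaac') {
    @tried = ();

    # The push is a side effect: it is NOT undone when the engine
    # backtracks out of a failed attempt.
    my $matched = $string =~ /[aA](?{ push @tried, pos() })[bB]/ ? 'yes' : 'no';

    $ran{$string} = [@tried];
    print "$string: matched=$matched, block ran at: @tried\n";
}
```

Judging by the one-liner output above, both strings should report the block firing at every position after an "a", even though only the last attempt on "aaaaab" leads to a successful match.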

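    To avoid this, the effect of the expression in (?{ }) needs to be backtrackable. Returning a value through $^R works, because $^R is restored when the engine backtracks past the block: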

    $ perl -le'"aaaaab" =~ /[aA](?{ pos })[bB](?{ print $^R })/'
    5
    $ perl -le'"aaaaac" =~ /[aA](?{ pos })[bB](?{ print $^R })/'
    $
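
A script version of the $^R approach, again only a sketch. The first block just returns pos, and the second block, which is reached only after `[bB]` has matched, copies $^R into a variable. The names `$end` and `%final` are made up for this example.

```perl
use strict;
use warnings;

our $end;     # hypothetical name: set only when the whole pattern matches
our %final;   # hypothetical name: recorded value per input string

for my $string ('aaaaab', 'aaaaac') {
    $end = undef;

    # First block: no side effect, its value lands in $^R.
    # Second block: runs only once [bB] has matched, so it sees the
    # $^R belonging to the attempt that actually succeeded.
    if ($string =~ /[aA](?{ pos() })[bB](?{ $end = $^R })/) {
        print "$string: matched, recorded position $end\n";
    }
    else {
        print "$string: no match, nothing recorded\n";
    }

    $final{$string} = $end;
}
```

The design point is that nothing observable happens until the match has committed, so the abandoned attempts leave no trace.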

    Unfortunately, that can't be done in your case.

Re: pos CONFUSION in REGEX
by Anonymous Monk on Dec 14, 2009 at 16:28 UTC