in reply to Re: Return string that starts and ends with specific characters
in thread Return string that starts and ends with specific characters

I had basically the same idea, but tried \0 to force the backtracking (which didn't work).

Why and how does ^ work?

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!

Re^3: Return string that starts and ends with specific characters
by Eily (Monsignor) on Jun 02, 2016 at 16:59 UTC

    It works with (?=\0), so I'd say the difference comes from zero-width assertions preventing the optimization.

    Edit: In this special case (a single word without a j at the end) it also works with \b

      > ... it also works with \b

      ... unless the string ends with the target character/pattern:

      c:\@Work\Perl\monks>perl -wMstrict -le "my @answer; 'asdfgfhjkljx' =~ /f.*j(?{push @answer, $&})\b/; print for @answer; "
      fgfhjklj
      fgfhj
      fhjklj
      fhj

      c:\@Work\Perl\monks>perl -wMstrict -le "my @answer; 'asdfgfhjklj' =~ /f.*j(?{push @answer, $&})\b/; print for @answer; "
      fgfhjklj

      Personally, I feel more comfortable forcing backtracking with something that cannot possibly match, like (?!) or, from Perl 5.10 on, (*FAIL) or (*F), even though they take more keystrokes to type.

      Update: Oops... I missed "... this special case (single word without a j at the end) ..." in Eily's post; see Eily's reply below.


      Give a man a fish:  <%-{-{-{-<

        > ... unless the string ends with the target character/pattern

        That's why I said it only worked in this special case, not ending in j. But I agree with you: if you're not trying to golf, (*FAIL) is probably the best option, because it is far more explicit even if you've never seen the construct before. (?!) looks odd enough that anyone who doesn't recognize it will check the documentation, so that's still good.
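For anyone who wants to try the (*FAIL) variant discussed above, here is a minimal sketch. The sub name all_matches and the package variable @found are made up for illustration, and it assumes Perl 5.10 or later. The code block records each candidate match, then (*FAIL) rejects it, so the engine backtracks through every remaining possibility.

```perl
use strict;
use warnings;

# Package variable so the (?{ ... }) block sees the same array on every call.
our @found;

# Hypothetical helper: return every substring that starts with 'f'
# and ends with 'j', in the order the regex engine tries them.
sub all_matches {
    my ($string) = @_;
    local @found = ();
    # (?{ ... }) records the current match; (*FAIL) can never succeed,
    # so the overall match fails only after all alternatives were tried.
    $string =~ /f.*j(?{ push @found, $& })(*FAIL)/;
    return @found;
}

print "$_\n" for all_matches('asdfgfhjklj');
```

Unlike the \b version, this should list all four substrings (fgfhjklj, fgfhj, fhjklj, fhj) even though the string ends in j.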

      Cool. I picked it up from perl golf, and use ^ because it's shorter :)

Re^3: Return string that starts and ends with specific characters
by Anonymous Monk on Jun 02, 2016 at 16:55 UTC

    I think the regex engine is smart enough to detect that the string contains no \0, but it can't do the same for ^

      Yes, regexes are optimized: if the pattern has a leading or trailing literal string, it's taken into consideration (I'm too lazy to prove it with re deparse).

      but this looks like a fortunate bug, because ^ is a metacharacter in some positions, it's treated differently.

      Cheers Rolf

        > but this looks like a fortunate bug, because ^ is a metacharacter in some positions, it's treated differently.

        I finally understood that it's not a bug: ^ is always a metacharacter, so to match a literal ^ one needs to escape it.

        And "Match the beginning of the line" after characters have already been consumed will always fail unless a modifier like /m is used.

        DB<1> p "a^b" =~ /a^b/
        DB<2> p "a^b" =~ /a\^b/
        1
        DB<3> p "\nb" =~ /^b/m
        1

        Cheers Rolf
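The three behaviours of ^ described above can be checked side by side with a small script. This is only a sketch: the hash %matched and its keys are invented for the demonstration, and the third case uses "a\nb" (not a string from the thread) to show ^ matching in the middle of a pattern under /m.

```perl
use strict;
use warnings;

# Each entry records whether the pattern matched (1) or not (0).
my %matched = (
    # Unescaped ^ after 'a': asserts start of string, so it cannot match here.
    unescaped => ( "a^b"  =~ /a^b/    ? 1 : 0 ),
    # Escaped \^ matches a literal caret.
    escaped   => ( "a^b"  =~ /a\^b/   ? 1 : 0 ),
    # With /m, ^ also matches right after a newline, even mid-pattern.
    multiline => ( "a\nb" =~ /a\n^b/m ? 1 : 0 ),
);

printf "%-9s %s\n", $_, $matched{$_} ? 'matches' : 'fails'
    for qw(unescaped escaped multiline);
```

Only the unescaped case without /m should report a failure, which is what makes a bare ^ usable as a short always-fail assertion once at least one character has been consumed.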