in reply to Re^4: Regex help \b & \Q
in thread Regex help \b & \Q

Better:

my $count = () = $title =~ m{ (?:^|\s)\K \Q$kw\E (?! [^\s,;] ) }xig;
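For anyone unfamiliar with the `= () =` idiom in that line, here is a minimal sketch of what it does. The helper name `count_keyword` and the sample title are made up for illustration; only the regex comes from the post.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the one-liner above.
sub count_keyword {
    my ($title, $kw) = @_;
    # Assigning to the empty list () forces list context on the /g match;
    # assigning that list assignment to a scalar yields the number of matches.
    my $count = () = $title =~ m{ (?:^|\s)\K \Q$kw\E (?! [^\s,;] ) }xig;
    return $count;
}

# Hypothetical sample data: 'c++11' is rejected by the trailing lookahead.
print count_keyword('C++ and c++, not c++11', 'C++'), "\n";    # prints 2
```

\Q...\E quotes the `+` characters, which is why the keyword can be interpolated safely even under /x.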

Re^6: Regex help \b & \Q
by AnomalousMonk (Archbishop) on Apr 14, 2016 at 19:43 UTC
    (?:^|\s)\K is more efficient than (?<! \S )

    But it does not meet the requirements of the latest update (of the latest update (of the latest update (of the latest update...))) of the "specification" in the OP. However, it's easily fixed:
        (?: ^ | [\s,;]) \K
    and I'm happy to accept that it's more efficient.

    Update: In fact,  (?<! [^\s,;]) works just as well as  (?: ^ | [\s,;]) \K and has a certain orthogonality. It's still double-negatory, though. I've no idea about efficiency.
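A quick way to check that the two left-boundary forms agree is to count matches with each on a few strings. This is only a sketch: the sub names, the keyword and the sample titles are assumptions, not taken from the OP's data.

```perl
use strict;
use warnings;

# Left boundary via alternation plus \K (needs Perl 5.10 or later).
sub count_with_K {
    my ($title, $kw) = @_;
    my $n = () = $title =~ m{ (?: ^ | [\s,;] ) \K \Q$kw\E (?! [^\s,;] ) }xig;
    return $n;
}

# Left boundary via negative lookbehind on the negated class.
sub count_with_lookbehind {
    my ($title, $kw) = @_;
    my $n = () = $title =~ m{ (?<! [^\s,;] ) \Q$kw\E (?! [^\s,;] ) }xig;
    return $n;
}

# Hypothetical sample titles and keyword.
for my $title ('C++', 'a;c++,b', 'xC++', 'C++ c++;C++x') {
    printf "%-16s \\K: %d  lookbehind: %d\n", "'$title'",
        count_with_K($title, 'C++'), count_with_lookbehind($title, 'C++');
}
```

On these samples both subs return the same counts; that is evidence for the claim, not a proof of equivalence on every input.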

    BTW: It should be noted that  \K is only available from Perl version 5.10 onward.

    The s and m flags weren't necessary

    In order to limit the "degrees of freedom" of (and the necessity for thought about) the  . ^ $ operators and for readability, I always use  /xms in every regex.
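As a reminder of what those "degrees of freedom" are, the sketch below shows how /m changes ^ and $ and how /s changes the dot. The sample strings are hypothetical.

```perl
use strict;
use warnings;

my $text = "foo\nbar\n";    # hypothetical two-line sample

# Without /m, ^ and $ anchor to the string as a whole, so nothing matches here.
my @plain = $text =~ /^(\w+)$/g;
# With /m, ^ and $ also match at embedded line starts and ends.
my @multi = $text =~ /^(\w+)$/mg;

# Without /s the dot refuses to match a newline; with /s it matches any character.
my $dot_plain = "a\nb" =~ /a.b/  ? 1 : 0;
my $dot_s     = "a\nb" =~ /a.b/s ? 1 : 0;

printf "plain: %d, with /m: %d, dot: %d, dot with /s: %d\n",
    scalar @plain, scalar @multi, $dot_plain, $dot_s;
# prints: plain: 0, with /m: 2, dot: 0, dot with /s: 1
```

Always writing /xms means these three operators behave the same way in every regex, whether or not a given pattern happens to use them.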


    Give a man a fish:  <%-{-{-{-<

Re^6: Regex help \b & \Q
by rsFalse (Chaplain) on Apr 14, 2016 at 20:07 UTC
    (?! [^\s,;] )
    Double negation can be difficult to understand, so some readers might prefer the positive form:
    (?= [\s,;] | \z)
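The positive form needs the explicit \z because a positive lookahead fails at the end of the string, whereas the negative one succeeds there for free. A small sketch comparing the two right-boundary forms follows; the sub names, keyword and sample strings are assumptions for illustration.

```perl
use strict;
use warnings;

# Right boundary as a negative lookahead on the negated class.
sub ends_negative {
    my ($s, $kw) = @_;
    return $s =~ m{ \Q$kw\E (?! [^\s,;] ) }xi ? 1 : 0;
}

# Right boundary as a positive lookahead: a separator, or end of string.
sub ends_positive {
    my ($s, $kw) = @_;
    return $s =~ m{ \Q$kw\E (?= [\s,;] | \z ) }xi ? 1 : 0;
}

# Hypothetical samples.
for my $s ('C++', 'C++;', 'C++11', 'c++ rocks') {
    printf "%-12s negative: %d  positive: %d\n", "'$s'",
        ends_negative($s, 'C++'), ends_positive($s, 'C++');
}
```

On these samples the two forms agree: only 'C++11' is rejected.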
      Sometimes double negation is difficult to understand ...

      More than just sometimes, IMHO, but it's tolerable if taken in moderation. E.g., if you need a "digit boundary" assertion analogous to  \b in that it also matches at the start/end of a string, then  (?<! \d) and  (?! \d) are very attractive. Then  (?<! \d) \d{4} (?! \d) matches  '1234' 'x1234' '1234x' 'x1234x' but none of  '12345' 'x12345x' etc. Extend this to  (?<! \D) and  (?! \D) and you have a sometimes-useful double-negation asserting "non-digit boundary".



        Wow!
        But it seems that (?<! \D) and (?! \D) are not "equivalent" to \B, because \B doesn't match at the beginning or end of the string when a digit is there:
        # I've changed all x-es to spaces (for comparison).
        for my $line ('1234', ' 1234', '1234 ', ' 1234 ', '12345', ' 12345 ', ' 123456 ') {
            print map { sprintf "%10s: $_\n", "'$line'" }
                join ' ', map { $line =~ qr/$_/x ? 'OK' : 'NO' }
                    '(?<! \d) \d{4} (?! \d)',
                    '(?<! \D) \d{4} (?! \D)',
                    '\b \d{4} \b ',
                    '\B \d{4} \B ',
        }
        __END__
            '1234': OK OK OK NO
           ' 1234': OK NO OK NO
           '1234 ': OK NO OK NO
          ' 1234 ': OK NO OK NO
           '12345': NO OK NO NO
         ' 12345 ': NO NO NO NO
        ' 123456 ': NO OK NO OK