$string =~ s/[^<B>]($swapString)[^<\/B>]/<B>$1<\/B>/gi;

...removed the first character before the match and the last character after the match. Can somebody explain this to me?

Yes, this is quite simple. On the left-hand side you match:

[^<B>]         <- this matches any *single* char that is not a <, B or >
($swapString)  <- this puts the match for $swapString into $1
[^<\/B>]       <- this matches any *single* char that is not a <, /, B or >

So you are matching the characters before and after $swapString as well, but because you do not capture them they disappear: nothing in the replacement puts them back.
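A minimal sketch of that effect, assuming a made-up input string and a single-word $swapString (the original post does not show its data):

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only to make the effect visible.
my $string     = "a foo b";
my $swapString = "foo";

# Same shape as the original substitution: each character class
# consumes one character, and only $1 is put back by the replacement.
(my $broken = $string) =~ s/[^<B>]($swapString)[^<\/B>]/<B>$1<\/B>/gi;

print "$broken\n";    # a<B>foo</B>b
```

The spaces on either side of foo were part of the match, so they are gone from the result.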

The suggestion that you move the capturing parentheses kind of works, but gives this output:

$string =~ s/([^<B>]($swapString)[^<\/B>])/<B>$1<\/B>/gi;
there is a <B>foo bar</B> and a <B>bar foo</B> and also<B> foo </B>and +<B> bar.</B>

This is a better way to do things, using what are known as look-around assertions (here a negative look-behind and a negative look-ahead).

$string =~ s/(?<!<B>)($swapString)(?!<\/B>)/<B>$1<\/B>/gi;
# this gives: there is a <B>foo bar</B> and a <B>bar foo</B> and also <B>foo</B> and + <B>bar</B>.

which I expect is what you had in mind.

The look-around assertions give you a sneak peek at what is or is not immediately around a match. They do not eat up the string, so you do not need to put back the bits they match.
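As a rough illustration, here is the look-around substitution wrapped in a small helper. The name bold_terms and the sample string are assumptions made for this sketch, not part of the original thread:

```perl
use strict;
use warnings;

# Hypothetical sample data.
my $string     = "there is a foo and a bar.";
my $swapString = "foo|bar";

# bold_terms is an illustrative helper name, not from the original post.
sub bold_terms {
    my ($text, $terms) = @_;
    # The assertions only inspect the neighbouring text; the match itself
    # is just the term, so nothing around it is lost.
    $text =~ s/(?<!<B>)($terms)(?!<\/B>)/<B>$1<\/B>/gi;
    return $text;
}

my $once  = bold_terms($string, $swapString);
my $twice = bold_terms($once,   $swapString);

print "$once\n";    # there is a <B>foo</B> and a <B>bar</B>.
print $once eq $twice
    ? "a second pass changes nothing\n"
    : "a second pass changed the text\n";
```

Because the spaces and the full stop are never part of the match, they survive, and terms that are already wrapped are skipped on a second pass.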

The assertions are:

(?=pattern)
A zero-width positive look-ahead assertion. For example, /\w+(?=\t)/ matches a word followed by a tab, without including the tab in $&.

(?!pattern)
A zero-width negative look-ahead assertion. For example /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar". Note however that look-ahead and look-behind are NOT the same thing. You cannot use this for look-behind.

If you are looking for a "bar" that isn't preceded by a "foo", /(?!foo)bar/ will not do what you want. That's because the (?!foo) is just saying that the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will match. You would have to do something like /(?!foo)...bar/ for that. We say "like" because there's the case of your "bar" not having three characters before it. You could cover that this way: /(?:(?!foo)...|^.{0,2})bar/. Sometimes it's still easier just to say: if (/bar/ && $` !~ /foo$/)

For look-behind see below.

(?<=pattern)
A zero-width positive look-behind assertion. For example, /(?<=\t)\w+/ matches a word that follows a tab, without including the tab in $&. Works only for fixed-width look-behind.

(?<!pattern)
A zero-width negative look-behind assertion. For example /(?<!bar)foo/ matches any occurrence of "foo" that does not follow "bar". Works only for fixed-width look-behind.
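The four assertions above can be tried out with a short script. This is only a sketch; the test strings and variable names are invented for illustration:

```perl
use strict;
use warnings;

# (?=pattern): the tab must follow the word but is not part of the match.
my ($ahead) = "name\tvalue" =~ /(\w+(?=\t))/;      # "name"

# (?<=pattern): the tab must precede the word, again without being matched.
my ($behind) = "name\tvalue" =~ /((?<=\t)\w+)/;    # "value"

# (?!foo) in front of "bar" looks AHEAD, so it does not reject "foobar".
my $ahead_hit = "foobar" =~ /(?!foo)bar/ ? 1 : 0;      # 1

# (?<!foo) is the look-behind form and does reject "foobar".
my $behind_hit = "foobar" =~ /(?<!foo)bar/ ? 1 : 0;    # 0
my $other_hit  = "bazbar" =~ /(?<!foo)bar/ ? 1 : 0;    # 1

# The prematch test quoted from the documentation gives the same answer.
my $text = "foobar";
my $prematch_hit = ($text =~ /bar/ && $` !~ /foo$/) ? 1 : 0;    # 0

print "look-ahead word:  $ahead\n";
print "look-behind word: $behind\n";
print "results: $ahead_hit $behind_hit $other_hit $prematch_hit\n";
```

The third and fourth checks show the pitfall described above: only the look-behind form keeps "bar" from matching inside "foobar".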

hope this helps

tachyon


In reply to Re: Double regex match not working correctly. by tachyon
in thread Double regex match not working correctly. by the_0ne
