Hi monks, I have a question about a two-step regex substitution I'm trying to perform. I want to wrap both multi-word strings and single-word strings found in a larger string in <B> tags. This code...
@array = ( "foo bar", "bar foo" );
@array2 = ( "foo", "bar" );
$string = "there is a foo bar and a bar foo and also foo and bar.";
print "string before - $string\n";
$swapString = join ("|", @array);
print "swapString 1 - $swapString\n";
$string =~ s/($swapString)/<B>$1<\/B>/gi;
print "string after first - $string\n";
...works correctly by giving me this...
string after first - there is a <B>foo bar</B> and a <B>bar foo</B> and also foo and bar.
The 'foo bar' and 'bar foo' are tagged correctly. Then I perform the substitution on the single words, but I don't want to put bold tags inside text that is already tagged, like the <B>foo bar</B>, so I run this regex...
$swapString = join ("|", @array2);
print "swapString 2 - [$swapString]\n";
$string =~ s/[^<B>]($swapString)[^<\/B>]/<B>$1<\/B>/gi;
print "string after first - $string\n";
This gives me this...
string after first - there is a <B>foo bar</B> and a <B>bar foo</B> and also<B>foo</B>and<B>bar</B>
...now this looks like it mostly worked, but notice the spaces around the newly inserted bold tags: they're gone, and so is the period at the end. I don't understand how this...
$string =~ s/[^<B>]($swapString)[^<\/B>]/<B>$1<\/B>/gi;
...removed the character immediately before the match and the character immediately after it. Can somebody explain this to me? Obviously I'm doing something wrong, but the regex looks right to me, and it does work except for the characters before and after the match. I'm not sure why they disappear.
Thanks again, monks, for any knowledge you can share.
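For reference, here is a minimal sketch of one possible way around the problem. It is not necessarily the best one. A bracketed `[^<B>]` is a character class: it matches exactly one character that is not `<`, `B`, or `>`. Because that character sits outside the capture group, the substitution replaces it along with the match. The sketch assumes both lists can be combined into a single alternation with the longer phrases first, and uses the zero-width `\b` anchor so that nothing around the match is consumed. The names `@phrases`, `@words`, `$alternation`, and `$tagged` are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical names; the data is the same as in the post above.
my @phrases = ( "foo bar", "bar foo" );   # multi-word strings
my @words   = ( "foo", "bar" );           # single words

my $string = "there is a foo bar and a bar foo and also foo and bar.";

# Longer alternatives are listed first so "foo bar" is tried before "foo".
# quotemeta escapes any regex metacharacters in the search strings.
my $alternation = join "|", map { quotemeta($_) } @phrases, @words;

# \b is a zero-width assertion: it checks for a word boundary but does
# not consume the neighbouring space or period, so they survive.
( my $tagged = $string ) =~ s/\b($alternation)\b/<B>$1<\/B>/gi;

print "$tagged\n";
```

With the sample string this should print `there is a <B>foo bar</B> and a <B>bar foo</B> and also <B>foo</B> and <B>bar</B>.` with the spaces and the final period intact. A single pass also means already-tagged text is never rescanned, so the second regex is not needed.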