The (..)|(..) alternation in the first part captures into either $1 or $2. When the first branch matches we simply replace $1 with $1, so its effect is to match tags and pass them through untouched, which means you never substitute text inside them. Your regex can effectively be distilled down to:

    $link_results =~ s,(\Q$term\E),<b>$1</b>,gi;

So all it does is put bold tags around whatever is in $term. It is case insensitive courtesy of the /i modifier. The \Q activates quotemeta to escape regex specials in $term. The \E is not required in this case, but it deactivates quotemeta. See perlman:perlfunc. Here is a little widget to do the sort of thing you want. Just push all the terms you want to bold into @terms.
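Before the widget, a quick aside on why the \Q matters. This is only a minimal sketch: bold_term and bold_term_unquoted are made-up helper names and the sample string is invented for illustration, so treat it as a demonstration rather than part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): bold every case-insensitive
# occurrence of the literal string $term in $text.
sub bold_term {
    my ($text, $term) = @_;
    # \Q...\E escapes regex metacharacters in $term, so the '.' below
    # matches a literal dot rather than any character.
    $text =~ s,(\Q$term\E),<b>$1</b>,gi;
    return $text;
}

# The same substitution without \Q, to show what goes wrong.
sub bold_term_unquoted {
    my ($text, $term) = @_;
    $text =~ s,($term),<b>$1</b>,gi;
    return $text;
}

my $text = 'v1.0 is not v110';

print bold_term($text, 'V1.0'), "\n";
print bold_term_unquoted($text, 'V1.0'), "\n";
```

The first line bolds only v1.0 (matched despite the capital V, thanks to /i). The second also bolds v110, because the unescaped dot matches any character. Now, the widget: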

    # test string
    my $link_results = '<p>Hello: World Hello hello drewboy Drewboy: Drewboy!</p>';

    # define an array of the terms we want to bold
    my @terms = ( 'Hello:', 'Drewboy!' );

    # make all the terms regex safe by quotemeta-ing them
    $_ = quotemeta $_ for @terms;

    # join all terms with a pipe | so we find any of them - alternation
    my $bold = join '|', @terms;

    # make all the subs - case sensitive and global
    $link_results =~ s#(<[^>]+?>)|($bold)#$1 ? $1 : "<b>$2</b>"#eg;

    # proof is in da pudding
    print $link_results;

To avoid bolding where you don't want it, we switch off case insensitivity and insist on the punctuation which is apparently present. You could also add the \b or \B boundary assertions to help ensure that you only match the desired term. I'll leave that as an exercise for you. Using HTML::Parser is a more robust way to get at the text outside of tags for processing.
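One possible take on the boundary idea, offered as a sketch rather than the definitive answer. Note the swap: a plain \b after a term such as 'Hello:' only succeeds when a word character follows the colon, so this version uses the lookarounds (?<!\w) and (?!\w) instead of \b or \B. The helper name bold_whole_terms and the sample HTML are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): bold whole-term matches
# that lie outside of tags.
sub bold_whole_terms {
    my ($html, @terms) = @_;

    # Escape regex metacharacters, then build the alternation.
    my $alt = join '|', map { quotemeta($_) } @terms;

    # Branch 1 matches a tag and puts it back unchanged.
    # Branch 2 matches a term only when no word character touches it
    # on either side; unlike \b, this also works for terms that
    # begin or end with punctuation.
    $html =~ s{(<[^>]+>)|(?<!\w)($alt)(?!\w)}
              {defined $1 ? $1 : "<b>$2</b>"}eg;

    return $html;
}

my $html = '<p class="head">Go ahead, use your head. Hello: World</p>';
print bold_whole_terms($html, 'head', 'Hello:'), "\n";
```

Here "ahead" and the class="head" attribute are left alone, while the standalone "head" and "Hello:" get bolded.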

Update

Corrected a technical inexactitude ;-) thanks to scain

Here is an atonement - this is how you do it right using HTML::Parser. We define a hash of tags where the text they contain is OK for substitution. We make our substitution array as before. We then use the power of Parser to selectively make some substitutions - only in the text between the selected tags and absolutely positively not in the tags themselves.

    package Filter;
    use strict;
    use base 'HTML::Parser';

    my ($filter, $sub_OK);

    # tags whose text content is OK for substitution
    my @ok_tags = qw ( h1 h2 h3 h4 p );
    my %ok_tags;
    $ok_tags{$_}++ for @ok_tags;

    # the terms to bold, made regex safe and joined into an alternation
    my @terms = ( 'head', 'Parser' );
    $_ = quotemeta $_ for @terms;
    my $bold = join '|', @terms;

    sub start {
        my ($self, $tag, $attr, $attrseq, $origtext) = @_;
        # the flag follows the most recent start tag only; end tags do not reset it
        $sub_OK = exists $ok_tags{$tag} ? 1 : 0;
        $filter .= $origtext;
    }

    sub text {
        my ($self, $text) = @_;
        $text =~ s#\b($bold)\b#<b>$1</b>#g if $sub_OK;
        $filter .= $text;
    }

    sub comment {
        # uncomment to not strip comments
        # my ($self, $comment) = @_;
        # $filter .= "<!-- $comment -->";
    }

    sub end {
        my ($self, $tag, $origtext) = @_;
        $filter .= $origtext;
    }

    my $parser = Filter->new;
    my $html = join '', <DATA>;
    $parser->parse($html);
    $parser->eof;

    print $html;
    print "\n\n------------------------\n\n";
    print $filter;

    __DATA__
    <html>
    <head>
    <title>Title</title>
    </head>
    <body>
    <h1>Hello Parser</h1>
    <p>You need HTML::Parser to get ahead</p>
    <p>So use your head
    <h2>Parser rocks my head!</h2>
    <a href="html.head.parser.com">html.head.parser.com</a>
    <hr>
    <pre>
    use HTML::Parser;
    head
    </pre>
    <!-- HTML PARSER ROCKS MY HEAD! -->
    </body>
    </html>

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print


In reply to Re: Replacement based on pattern by tachyon
in thread Replacement based on pattern by drewboy
