So, let me get this straight. You have a user-entered search phrase and you want to highlight HTML content where it matches those words.

First, let me recommend that when you change your node, you mark it as Update: and either use strike notation or simply post your updated material in a separate paragraph, leaving your original post content alone.

Second, you want to parse out a search phrase into words and put them in an array -- use split to accomplish this.
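
For the split step, something like the following minimal sketch should do. The variable name $phrase and the sample input are my own assumptions for illustration, and splitting on whitespace is only one reasonable way to define a "word":

```perl
use strict;
use warnings;

# Hypothetical user input; in a real script this would come from a form field.
my $phrase = '  perl   regex highlight ';

# split with a single-space string splits on runs of whitespace and
# ignores leading whitespace, so no empty "words" end up in the array.
my @search_words = split ' ', $phrase;

print join('|', @search_words), "\n";   # prints "perl|regex|highlight"
```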

Third, you'll want to step through that array, using a construct like this (untested):

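for my $word (@search_words) { $html =~ s/(\Q$word\E)/<b><u>$1<\/u><\/b>/g; }

This acts on the HTML in $html and replaces each occurrence of $word with a highlighted version of itself: the parentheses capture the matched text and $1 puts it back between the tags. The \Q...\E pair escapes any regex metacharacters in the user's input, so each word is matched literally. The /g modifier makes the substitution apply to every match in $html rather than only the first, and that includes matches inside tags and attribute values.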

Fourth, if you want to parse out sections or tags of HTML, applying your substitution to some while ignoring others, you'll probably want to use a CPAN module to do that. I've used HTML::TreeBuilder for such things before, but you may want to search around a little for something that suits your needs.
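
To show what "applying your substitution to some while ignoring others" means, here is a rough core-Perl sketch that highlights only the text between tags. Note that this is not HTML::TreeBuilder and not a real parser: it just splits on anything that looks like a tag, so it can be fooled by comments, script blocks, or a > inside an attribute value. The function name highlight_text_only is hypothetical, made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: highlight words in text, leaving tags untouched.
sub highlight_text_only {
    my ($html, @search_words) = @_;
    return $html unless @search_words;

    # Build one alternation of literal words, so each word is matched
    # against the original text and never against markup added for
    # an earlier word.
    my $alternation = join '|', map { quotemeta($_) } @search_words;

    # The capturing group makes split return the tags as well as
    # the text between them.
    my @pieces = split /(<[^>]*>)/, $html;

    for my $piece (@pieces) {
        next if $piece =~ /^</;    # looks like a tag: leave it alone
        $piece =~ s/($alternation)/<b><u>$1<\/u><\/b>/g;
    }
    return join '', @pieces;
}

my $html = '<a href="perl.html">learn perl</a>';
print highlight_text_only($html, 'perl'), "\n";
# prints: <a href="perl.html">learn <b><u>perl</u></b></a>
```

The href is left alone here, whereas the plain loop in step three would have rewritten it too. For anything beyond tidy markup, a proper parsing module is still the safer route.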

Update: Ah, I see that Fletch already recommended this. Well, now you've heard it from two people! :)


No good deed goes unpunished. -- (attributed to) Oscar Wilde

In reply to Re: Perl regex by ptum
in thread Perl regex by axl163
