csotzing has asked for the wisdom of the Perl Monks concerning the following question:

Hi!
I need to find a regular expression that will let me boldface a specific string wherever it is not inside an HTML tag.

For instance, given the search string "string":
Original:
<string>text<other tag>more text...
string</string>another string< string tag>

The results I want:
<string>text<other tag>more text...
<B>string</B></string>another <B>string</B>< string tag>

Please help!!

Spiffed up html entities 2002-02-09 by dvergin

Re: Reg Expr help
by talexb (Chancellor) on Feb 10, 2002 at 04:59 UTC
    Trying to use a regexp to manipulate HTML is notoriously difficult.

    For example, it might look like you could key on the presence (or absence) of a closing angle bracket after string in your example, but that only holds if the rest of the tag is on the same line as string. If the tag continues on the following line, you'd be cooked.
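
    A minimal sketch of that failure mode, using only core Perl. The helper name bold_outside_tags_naive is hypothetical (it is not from this thread), and the negative lookahead is just one plausible way to "key on the closing angle bracket"; it is shown to illustrate the problem, not as a recommended solution.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: bold $word on ONE line of input
# unless the next angle bracket after it on that line is a closing '>'.
sub bold_outside_tags_naive {
    my ($line, $word) = @_;
    # (?![^<>]*>) fails the match when a '>' follows with no '<' or '>' in between,
    # i.e. when the word appears to sit inside a tag on this line.
    $line =~ s{\b(\Q$word\E)\b(?![^<>]*>)}{<B>$1</B>}g;
    return $line;
}

# Whole tag on one line: the word inside the tag is left alone.
print bold_outside_tags_naive('another string< string tag>', 'string'), "\n";
# prints: another <B>string</B>< string tag>

# Same tag wrapped onto a second line, processed line by line: the first line
# never sees the closing '>', so the word inside the tag is bolded by mistake.
print bold_outside_tags_naive($_, 'string'), "\n"
    for ('another string< string', 'tag>');
# prints: another <B>string</B>< <B>string</B>
#         tag>
```

    Reading the whole document into one string before substituting would cope with this particular wrap, but other constructs (comments, a '>' inside an attribute value) can still fool a pattern like this, which is why a real parser such as HTML::Parser is the safer route.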

    It seems like using the module HTML::Parser might be a better solution for what you are looking for. In your case, you would install a handler for text (that is, non-tags), catch the string you're after and bold just that string. You would pass everything else through untouched.

    Hopefully more knowledgeable monks will chip in with more information -- I haven't used this module, just read about it.

    --t. alex

    "Of course, you realize that this means war." -- Bugs Bunny.

Re: Reg Expr help
by gellyfish (Monsignor) on Feb 10, 2002 at 09:44 UTC

    talexb is right: you really should be using HTML::Parser for the bulk of this. A quick example:

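    #!/usr/bin/perl -w
    use strict;
    use HTML::Parser;

    my $tagged_string = "<string>text<other tag>more text...\nstring</string>another string<string tag>";
    my $replace = 'string';

    # Tags, comments and so on go to the default handler and are printed unchanged;
    # only the text between tags is passed to text().
    # 'dtext' hands over entity-decoded text; use 'text' to keep entities as written.
    my $parser = HTML::Parser->new(
        api_version => 3,
        default_h   => [ sub { print shift }, 'text' ],
        text_h      => [ \&text, 'dtext' ],
    );

    $parser->parse($tagged_string);
    $parser->eof;    # flush any text the parser is still buffering

    sub text {
        my ($text) = @_;
        # \Q...\E keeps any regex metacharacters in $replace literal
        $text =~ s%\b(\Q$replace\E)\b%<b>$1</b>%g;
        print $text;
    }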

    /J\

Re: Reg Expr help
by tachyon (Chancellor) on Feb 10, 2002 at 09:52 UTC

    Hi csotzing. Some time ago I posted an answer to this exact question that uses HTML::Parser. You can find it at Re: Replacement based on pattern. It does precisely what you want. The thread also highlights the problems of using regexen to parse HTML.

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Reg Expr help
by csotzing (Sexton) on Feb 10, 2002 at 12:22 UTC
    Thanks, everyone, for the good advice. :-)

    -cs