in reply to Replacing 3 and 4 digit numbers.

Does the variable contain HTML before you try your digits substitution?

If the answer is no, do the digits_4/digits_3 substitution in a single pass.
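
A minimal sketch of the single-pass idea, assuming digits_4 and digits_3 stand for "four-digit number" and "three-digit number" and that the goal is to wrap each one in bold markup (the original post is not shown here, so both are guesses). The helper name bold_digits is hypothetical. One pattern with \d{3,4} handles both lengths at once, so a second pass never gets a chance to re-match inside text the first pass already changed.

```perl
use strict;
use warnings;

# Hypothetical helper: bold every standalone 3- or 4-digit number.
sub bold_digits {
    my ($text) = @_;
    # (?<!\d) and (?!\d) stop the pattern from matching part of a longer number.
    $text =~ s/(?<!\d)(\d{3,4})(?!\d)/<b>$1<\/b>/g;
    return $text;
}

print bold_digits("Call 123 or 4567, not 89 or 12345."), "\n";
# prints: Call <b>123</b> or <b>4567</b>, not 89 or 12345.
```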

If the answer is yes, use a parser such as XML::Twig, HTML::TreeBuilder::XPath, or XML::LibXML to find the text nodes with XPath, then apply your regex to those nodes only.

You could also use only regex, but it's harder to write; see "match *bold* formatting, but avoid html" and "Re: match *bold* formatting, but avoid html (tokenize)".
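
For illustration, here is a rough sketch of what a regex-only tokenizing approach could look like. It is not the code from the linked threads; the helper name hilite_outside_tags is hypothetical, and it assumes tags never contain a literal ">" inside an attribute value and that there are no comments, script, or style blocks. Those assumptions are exactly why a real parser is the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: bold matches of $targets in text, never inside tags.
sub hilite_outside_tags {
    my ( $html, $targets ) = @_;
    # The capture group makes split return the tags as well as the text
    # between them: even indexes are text, odd indexes are tags.
    my @tokens = split /(<[^>]*>)/, $html;
    for my $i ( 0 .. $#tokens ) {
        next if $i % 2;    # skip tags, so attributes are left alone
        $tokens[$i] =~ s/($targets)/<b>$1<\/b>/g;
    }
    return join '', @tokens;
}

print hilite_outside_tags( '<a href="#tags">tags</a> and bold', 'tags|bold' ), "\n";
# prints: <a href="#tags"><b>tags</b></a> and <b>bold</b>
```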

Re^2: Replacing 3 and 4 digit numbers. (html hilite highlight bold)
by beech (Parson) on Apr 07, 2016 at 08:52 UTC

    You can use a similar approach with almost any XML/HTML DOM/tree module. You can skip the pretty printer if you're using a browser to view the results.

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use XML::LibXML;
    use XML::LibXML::PrettyPrint;

    my $html = q{<p> Looking for targets
    <p> Text nodes to <i> bold </i>
    <p> Inside <em> all kinds <i> of <a href="#tags"> tags </a></i></em>
    <p> Maybe even <em><i>sep</i><u>a</u><i>rat</i><u>ed</u></em> in the future
    <p> But <b targets="targets" bold="bold" tags="tags" separated="separated"> not </b> inside attributes
    };

    my $xpp = XML::LibXML::PrettyPrint->new;
    my $dom = XML::LibXML->new()->load_html( string => $html );

    print $xpp->pretty_print( $dom );    # before highlighting
    hilite_text( $dom, 'target|bold|tags|separated' );
    print $xpp->pretty_print( $dom );    # after highlighting

    # Wrap the first match of $targets in each text node in <b>...</b>.
    sub hilite_text {
        my( $dom, $targets ) = @_;
        for my $text ( $dom->findnodes( '//text()' ) ){
            # The limit of 2 keeps everything after the first match in $after.
            my( $before, $word, $after ) = split /($targets)/, $text->nodeValue, 2;
            if( defined $word and length $word ){
                my $parent = $text->parentNode;
                $before = $dom->createTextNode( $before );
                $after  = $dom->createTextNode( $after );
                my $bold = $dom->createElement('b');
                $bold->appendText( $word );
                $parent->replaceChild( $before, $text );
                # insertAfter keeps the order even if $text had following
                # siblings; addSibling would append at the end of the list.
                $parent->insertAfter( $bold,  $before );
                $parent->insertAfter( $after, $bold );
            }
        }
        return $dom;
    }
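
The helper above highlights only the first match in each text node. If every occurrence should be bolded, one possible variation is sketched below. The name hilite_all is hypothetical, and the sketch assumes $targets contains no capture groups of its own, since extra groups would shift which split fields are matches.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: wrap every match of $targets in <b>, text nodes only.
sub hilite_all {
    my ( $dom, $targets ) = @_;
    for my $text ( $dom->findnodes('//text()') ) {
        # With one capture group, split returns plain text at even indexes
        # and matches at odd indexes; -1 keeps trailing empty fields.
        my @parts = split /($targets)/, $text->nodeValue, -1;
        next if @parts < 2;    # no match in this text node
        my $parent = $text->parentNode;
        for my $i ( 0 .. $#parts ) {
            next unless length $parts[$i];
            my $node;
            if ( $i % 2 ) {
                $node = $dom->createElement('b');
                $node->appendText( $parts[$i] );
            }
            else {
                $node = $dom->createTextNode( $parts[$i] );
            }
            # Insert each piece in order, just before the original text node.
            $parent->insertBefore( $node, $text );
        }
        $parent->removeChild($text);    # the pieces now stand in for it
    }
    return $dom;
}

my $demo = XML::LibXML->new->load_html(
    string => '<p>one target and two targets <i>bold</i></p>' );
hilite_all( $demo, 'target|bold' );
print $demo->toStringHTML;
```

The list returned by findnodes is built before any node is changed, so the newly created text nodes are not visited again.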