HTML::TokeParser::Simple makes this task very simple:
use strict;
use warnings;
use HTML::TokeParser::Simple;

my $p = HTML::TokeParser::Simple->new(*DATA);

while ( my $token = $p->get_token() ) {
    # Only text tokens are edited; tags, attributes and comments pass through as-is
    if ( $token->is_text() ) {
        $token->[1] =~ s/2004/2006/;
    }
    print $token->as_is;
}

__DATA__
<html>
<head>
</head>
<body>
<h1 id="2004">Euro 2004 : The English were robbed</h1>
<p>We <strong>will</strong> have revenge in the 2006 World Cup!</p>
<!-- Last edited in 2004 -->
</body>
</html>
The 2004 occurring as an HTML attribute and the 2004 in the comment remain unchanged.
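If you need the same thing on a string rather than a filehandle, here is a minimal sketch of the idea wrapped in a sub. The name replace_in_text and its arguments are my own invention for illustration, not part of the module. It assumes the constructor accepts a reference to a string as the document, the way HTML::TokeParser does, and it adds /g and \Q...\E so every literal occurrence inside a text token is replaced.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper (not part of the module): replace the literal
# string $from with $to in text tokens only, leaving markup untouched.
sub replace_in_text {
    my ( $html, $from, $to ) = @_;

    # Assumption: a scalar reference is treated as the document itself
    my $p   = HTML::TokeParser::Simple->new( \$html );
    my $out = '';

    while ( my $token = $p->get_token ) {
        # Element 1 of a text token holds the raw text
        $token->[1] =~ s/\Q$from\E/$to/g if $token->is_text;
        $out .= $token->as_is;
    }
    return $out;
}

my $html = '<h1 id="2004">Euro 2004</h1><!-- Last edited in 2004 -->';
print replace_in_text( $html, '2004', '2006' ), "\n";
```

Collecting the output in a string instead of printing each token makes it easier to chain with other processing, at the cost of holding the whole document in memory.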
Cheers
ViceRaid
In reply to Re: Munging Rendered HTML While Preserving Formatting by ViceRaid, in thread Munging Rendered HTML While Preserving Formatting by Limbic~Region