in reply to Parsing HTML with tags intact

Kind of hackish... but since the start handler is called before the text handler, you could store the tag name in a global variable, which you can then use in the text method to wrap the content as desired:

...
my $tagname;

sub start {
    my ($self, $tag, $attr, $attrseq, $origtext) = @_;
    $tagname = $tag;  # store tag name for later use
    if ($tag =~ /^span$/i && $attr->{'class'} =~ /^main-content$/i) {
        # set if we find <span class="main-content"
        $content_flag = 1;
    }
}

sub text {
    my ($self, $text) = @_;
    # If we're in <H1>...</H1> or
    my $tagged_text = $text !~ /^\s*$/ ? "<$tagname>$text</$tagname>" : $text;
    if ($content_flag) {
        $main_content .= $tagged_text;
    }
}
...

Output:

<bold_text> Here's the body 1 </bold_text> <p> para1 </p> <p> para2 </p>

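(Might need some further tweaking (like the $text !~ /^\s*$/ ? ... check) to handle edge cases. For example, with nested tags $tagname only remembers the most recently opened tag, so text following an inner element gets wrapped in the wrong tag. But you get the idea.)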
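
For what it's worth, here is a minimal self-contained sketch of the same idea that keeps a stack of open tags instead of a single variable, so nested elements are less likely to trip it up. It is only an illustration under a few assumptions: it uses the HTML::Parser version 3 handler API rather than the subclass-style methods above, the sample HTML and the @open_tags stack and end handler are made up for the example, and it treats the first closing span it cannot match to an inner tag as the end of the main-content region.

```perl
use strict;
use warnings;
use HTML::Parser;

my @open_tags;            # hypothetical stack of tags opened inside the region
my $content_flag = 0;     # true while inside <span class="main-content">
my $main_content = '';

sub start {
    my ($tag, $attr) = @_;
    if (!$content_flag) {
        # enter the region when we find <span class="main-content">
        $content_flag = 1
            if $tag eq 'span' && ($attr->{class} // '') =~ /^main-content$/i;
        return;
    }
    push @open_tags, $tag;    # remember tags opened inside the region
}

sub end {
    my ($tag) = @_;
    return unless $content_flag;
    # find the nearest matching open tag and drop it, together with
    # anything left unclosed above it (e.g. <br>)
    for my $i (reverse 0 .. $#open_tags) {
        if ($open_tags[$i] eq $tag) {
            splice @open_tags, $i;
            return;
        }
    }
    # an unmatched </span> is assumed to close the main-content span
    if ($tag eq 'span') {
        $content_flag = 0;
        @open_tags    = ();
    }
}

sub text {
    my ($text) = @_;
    return unless $content_flag;
    # wrap non-blank text in the innermost open tag, if there is one
    if ($text !~ /^\s*$/ && @open_tags) {
        my $tag = $open_tags[-1];
        $text = "<$tag>$text</$tag>";
    }
    $main_content .= $text;
}

my $parser = HTML::Parser->new(
    api_version => 3,
    start_h     => [\&start, 'tagname, attr'],
    end_h       => [\&end,   'tagname'],
    text_h      => [\&text,  'text'],
);
$parser->unbroken_text(1);    # deliver each text segment in one piece

# made-up sample input for illustration
my $html = q{<span class="main-content"><h1>Here's the body 1</h1> }
         . q{<p>para1</p> <p>para2</p></span><p>ignored</p>};

$parser->parse($html);
$parser->eof;

print "$main_content\n";
```

Attributes of the inner tags are not reproduced here. If those matter, it may be simpler to append the original tag text in the start and end handlers rather than rebuilding the tags around the text.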