Re: Regex to match first html tag previous to text

A regular task of mine is rewriting HTML, and my poison of choice is HTML::TokeParser::Simple. It may look verbose, but imo that is a good trade-off for an intuitive interface. I find it easy to write and easy to read a week later. :-) ymmv.

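#!/usr/bin/perl

use strict;
use warnings;
use HTML::TokeParser::Simple;

# Parse the sample document stored after __DATA__.
my $p = HTML::TokeParser::Simple->new(*DATA);

my $html          = q{};
my $in_email_link = 0;

while (my $t = $p->get_token){

  # Opening <a> whose href contains "email.": start skipping.
  $in_email_link++, next if
    $t->is_start_tag(q{a})
      and
    $t->get_attr(q{href})
      and
    $t->get_attr(q{href}) =~ m|email\.|;

  # Closing </a> of the link being skipped: stop skipping.
  $in_email_link--, next if
    $in_email_link
      and
    $t->is_end_tag(q{a});

  # Drop everything inside the email link (the link text).
  next if $in_email_link;

  # Keep every other token exactly as it appeared in the source.
  $html .= $t->as_is;

}

print $html;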

__DATA__
<p>one</p>
<a href="detail.jsp?key=7147&rc=d_20071128&p=2&pv=1">Next Page</a><br/>
<p>two</p>
<a class="foot" href="email.jsp?key=7147">E-mail Story</a>
<p>three</p>
<a href="detail.jsp?key=7147&rc=d_20071128&p=2&pv=1">Next Page</a><br/>

output:

<p>one</p>
<a href="detail.jsp?key=7147&rc=d_20071128&p=2&pv=1">Next Page</a><br/>
<p>two</p>

<p>three</p>
<a href="detail.jsp?key=7147&rc=d_20071128&p=2&pv=1">Next Page</a><br/>
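
If you want the same filter on a string rather than on DATA, something along these lines might do. It is only a sketch: strip_links and its $pattern argument are names I made up for illustration, and it assumes the document is passed to new() as a scalar reference, which HTML::TokeParser treats as the literal text to parse.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper (not part of the module): return $source with every
# <a> element whose href matches $pattern removed, link text included.
sub strip_links {
  my ($source, $pattern) = @_;

  # A reference to a plain scalar is parsed as the document itself.
  my $p = HTML::TokeParser::Simple->new(\$source);

  my $out     = q{};
  my $in_link = 0;

  while (my $t = $p->get_token){

    # Matching opening tag: start skipping tokens.
    if ($t->is_start_tag(q{a})){
      my $href = $t->get_attr(q{href});
      if (defined $href and $href =~ $pattern){
        $in_link = 1;
        next;
      }
    }

    # Closing tag of the skipped link: stop skipping.
    if ($in_link and $t->is_end_tag(q{a})){
      $in_link = 0;
      next;
    }

    next if $in_link;

    # Everything else is copied through untouched.
    $out .= $t->as_is;
  }

  return $out;
}

print strip_links(
  qq{<p>two</p>\n<a class="foot" href="email.jsp?key=7147">E-mail Story</a>\n<p>three</p>\n},
  qr/email\./,
);
```

Passing the pattern in as a qr// object keeps the sub reusable for other kinds of link; the newlines around a removed link are left in place, just as in the output above.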
