As Molt says, parsing HTML with regexes is very fragile and you'd be better off using a real HTML parser to do this.
Here's a simple example using HTML::Parser.
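    use warnings;
    use strict;
    use HTML::Parser;

    # Slurp the whole document from STDIN or the files named on the command line.
    my $html = do { local $/; <> };

    # Collect each run of entity-decoded text ('dtext') into @text.
    my @text;
    my $p = HTML::Parser->new(text_h => [\@text, 'dtext']);
    $p->parse($html);
    $p->eof;    # signal end of document so any buffered text is flushed

    print map { $_->[0] } @text;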
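To see why the regex route is fragile, here is a minimal sketch using only core Perl. The sub name strip_tags_naive and the sample markup are made up for illustration; the substitution is just the sort of pattern people usually reach for, not anything taken from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: the "obvious" regex approach to removing tags.
sub strip_tags_naive {
    my ($html) = @_;
    (my $text = $html) =~ s/<[^>]*>//g;    # drop everything from '<' to the next '>'
    return $text;
}

# A '>' inside an attribute value and a comment containing markup both confuse it.
my $sample = '<p title="a > b">Hello <!-- <b>hidden</b> --> world</p>';
print strip_tags_naive($sample), "\n";
```

The pattern stops at the first '>' it meets, so part of the attribute value and the contents of the comment leak into the output. A real parser tracks quoting and comments for you.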
"The first rule of Perl club is you do not talk about
Perl club."
-- Chip Salzenberg
In reply to Re: Stripping of HTML content
by davorg
in thread Stripping of HTML content
by Nemp