in reply to Remove html tags to obtain plain text

I hope you realise that your HTML is pretty seriously broken. If that's carelessness on your part, it's not a good sign. If it's typical of what you are likely to get in real life, I understand.

I found it easier to write my own parser than to work through the HTML::Parser docs, and I don't remember them defining how it behaves with broken HTML like yours. My parser, XML::Lenient, is specifically intended to cope with that. Two ways of extracting your text are shown in the code below:

    use Modern::Perl;
    use XML::Lenient;

    my $p = XML::Lenient->new();

    # Sample input from the original post; <font>, <tr> and <table> are never closed.
    my $string = '<style>table{border-collapse: collapse;margin-left: 1cm;'
        . 'font-Family: courier;width: 60%}.hoverTable tr{background: #D8D8D8;} '
        . '.hoverTable tr:hover{background-color: #ffff99; }</style>'
        . '<table border=2 class="hoverTable">[20160628_151916] '
        . '<tr><td bgcolor="#366092"><font color="White"> PLAIN TEXT TO BE EXTRACTED</td>';

    # Way 1: take what lies within <td>, then reduce it to its inner text.
    say $p->innertext($p->within($string, 'td'));

    # Way 2: address the text by path.
    say $p->wpath($string, 'td/font');

As you don't tell us why you want to use HTML::Parser, I have no idea whether my module would be better for you, though. And remember that I'm biased, like any parent.

Regards,

John Davies

Re^2: Remove html tags to obtain plain text
by Mj1234 (Sexton) on Jun 30, 2016 at 05:19 UTC
    This is just part of the complete HTML text. I am trying to use HTML::Parser as I am unable to use HTML::Strip.

      I am trying to use HTML::Parser as I am unable to use HTML::Strip.

      Then use one of the other "html strip" modules.
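
For comparison, here is a minimal sketch of the HTML::Parser route the reply above says it is attempting. It is only an illustration, not something posted in the thread: the helper name `td_text` is hypothetical, and it assumes the wanted text always sits inside a `td` element, as in the sample. It uses the version 3 handler API to collect decoded text only while a `td` is open, so the unclosed `font` tag does not matter.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not from the thread): return the text found inside
# <td> elements, with leading and trailing whitespace trimmed.
sub td_text {
    my ($html) = @_;
    my $in_td  = 0;    # nesting depth of currently open <td> elements
    my @chunks;

    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'tagname' passes the lower-cased element name to the handler.
        start_h => [ sub { $in_td++ if $_[0] eq 'td' }, 'tagname' ],
        end_h   => [ sub { $in_td-- if $_[0] eq 'td' && $in_td }, 'tagname' ],
        # 'dtext' passes text with entities already decoded.
        text_h  => [ sub { push @chunks, $_[0] if $in_td }, 'dtext' ],
    );

    # Skip CSS and JavaScript bodies entirely.
    $parser->ignore_elements(qw(style script));

    $parser->parse($html);
    $parser->eof;    # flush any buffered text

    my $text = join '', @chunks;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

my $sample = '<style>td{color:red}</style><table border=2>[20160628_151916] '
    . '<tr><td bgcolor="#366092"><font color="White"> PLAIN TEXT TO BE EXTRACTED</td>';
print td_text($sample), "\n";
```

Text outside any `td`, such as the bracketed timestamp in the sample, is deliberately skipped; drop the `$in_td` checks if everything except the style block should be kept.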