in reply to How to encode after using HTML::Strip

Hi, HTML::Strip has a known bug: Bug #42834 for HTML-Strip, "HTML::Strip breaks UTF-8". The workaround quoted in that report makes your code work as you want:
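    use HTML::Strip;
    use Encode;
    use utf8;

    sub parse_workaround {
        my $html = shift;
        my $hs = HTML::Strip->new();
        # Hand the parser UTF-8 octets rather than a character string
        my $octets = encode_utf8($html);
        utf8::downgrade($octets);
        my $stripped = $hs->parse($octets);
        $hs->eof;
        # Turn the stripped octets back into a character string
        return decode_utf8($stripped);
    }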
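If it helps to see what the encode/decode pair in the workaround is doing, here is a minimal sketch that uses only the core Encode module and leaves HTML::Strip out entirely. The sample string and variable names are mine, purely for illustration. It shows that encode_utf8 produces a byte string (longer than the character string when non-ASCII characters are present) and that decode_utf8 restores the original characters.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Illustrative sample: a character string with one non-ASCII character (U+00E9).
my $chars = "caf\x{e9}";

# encode_utf8 turns characters into UTF-8 octets; U+00E9 becomes two bytes.
my $octets = encode_utf8($chars);

# decode_utf8 reverses the step and gives back the original characters.
my $round_trip = decode_utf8($octets);

printf "characters: %d, octets: %d, round trip ok: %s\n",
    length($chars), length($octets), ($round_trip eq $chars ? "yes" : "no");
# prints: characters: 4, octets: 5, round trip ok: yes
```

In parse_workaround the strip step sits between those two calls, so the parser only ever sees octets and your code gets characters back.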
And substituting it into your original code:
    my $clean_text = parse_workaround( $raw_html );
    # my $hs = HTML::Strip->new();
    # my $clean_text = $hs->parse( $raw_html );
    # $hs->eof;