Notice how anything outside the printable ASCII range (apart from tab, line feed and carriage return) is going to be considered "bad". The "operative theory" here is that you are supposed to use an entity for all such characters, no matter what the HTML header says.

    sub _text {
        my ($self,$text) = @_;

        while ( $text =~ /([^\x09\x0A\x0D -~])/g ) {
            my $bad = $1;
            $self->gripe(
                'text-use-entity',
                char   => sprintf( '\x%02lX', ord($bad) ),
                entity => $char2entity{ $bad },
            );
        }
    }
Well, crap. I hate that sort of attitude in a module, but if you really want to toe that line, there's a not-too-offensive way to do that: filter the HTML data so that all the non-ASCII characters are turned into numeric entities:
    $html =~ s/([^\x09\x0A\x0D -~])/sprintf("&#%d;",ord($1))/eg;
    $lint->parse($html);
There! That shut him up. Maybe you don't want to go to such lengths, and most likely the right solution would be to fix that function in Lint.pm ... Your choice.
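For anyone who wants to try the filter in isolation, here is a minimal sketch of the same substitution as a standalone script. The helper name `entify_non_ascii` and the sample string are made up for illustration and are not part of HTML::Lint. It also assumes the HTML arrives as raw UTF-8 bytes, so it decodes them first; without that step `ord` would see individual bytes and emit one bogus entity per byte.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of HTML::Lint): replace every character
# outside tab, LF, CR and printable ASCII with a numeric character reference.
sub entify_non_ascii {
    my ($html) = @_;
    $html =~ s/([^\x09\x0A\x0D -~])/sprintf("&#%d;", ord($1))/eg;
    return $html;
}

# Assumption: the input is raw UTF-8 bytes. Decode to characters first so
# that ord() returns code points rather than byte values.
my $bytes = "<p>caf\xC3\xA9 \xE2\x82\xAC5</p>";   # "café €5" encoded as UTF-8
my $html  = decode('UTF-8', $bytes);

my $clean = entify_non_ascii($html);
print "$clean\n";   # prints <p>caf&#233; &#8364;5</p>
```

The resulting string can then be handed to `$lint->parse` as in the snippet above. Note that the character class also catches control characters other than tab, LF and CR, so those get turned into entities as well.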
In reply to Re: HTML::Lint and utf-8 document woes
by graff
in thread HTML::Lint and utf-8 document woes
by GrandFather