in reply to HTML::Lint and utf-8 document woes
Notice how everything outside printable ASCII (apart from tab, LF and CR) is going to be considered "bad". The "operative theory" here is that you are supposed to use an entity for all such characters, no matter what the html header says:

    sub _text {
        my ($self,$text) = @_;

        # flag every character that is not tab, LF, CR, or space through tilde
        while ( $text =~ /([^\x09\x0A\x0D -~])/g ) {
            my $bad = $1;
            $self->gripe(
                'text-use-entity',
                char   => sprintf( '\x%02lX', ord($bad) ),
                entity => $char2entity{ $bad },
            );
        }
    }
Well, crap. I hate that sort of attitude in a module, but if you really want to toe that line, there's a not-too-offensive way to do that... Filter the html data so that all the non-ASCII characters are turned into numeric entities before the lint check sees them:
    $html =~ s/([^\x09\x0A\x0D -~])/sprintf("&#%d;",ord($1))/eg;
    $lint->parse($html);
There! That shut him up. Maybe you don't want to go to such lengths, and most likely the right solution would be to fix that function in Lint.pm ... Your choice.
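One thing to watch with the filter above: ord() only gives you real code points if $html holds decoded characters. If it still holds raw utf-8 bytes, each byte of a multi-byte character becomes its own (wrong) entity. Here is a minimal sketch of the filter with a decode step in front, assuming the input is utf-8 encoded bytes. The helper name entify_non_ascii is made up for illustration and is not part of HTML::Lint; Encode ships with Perl.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of HTML::Lint): turn every character that
# the _text check would complain about into a numeric entity.
sub entify_non_ascii {
    my ($octets) = @_;

    # Assumption: the input is UTF-8 encoded bytes. Decode to characters
    # first so that ord() yields code points rather than byte values.
    my $html = decode('UTF-8', $octets);

    # Same character class as the _text check: anything other than
    # tab, LF, CR, or printable ASCII (space through tilde).
    $html =~ s/([^\x09\x0A\x0D -~])/sprintf("&#%d;", ord($1))/eg;

    return $html;
}

# UTF-8 bytes for "cafe" with e-acute, then a euro sign
my $octets = "caf\xC3\xA9 \xE2\x82\xAC 5";
print entify_non_ascii($octets), "\n";    # caf&#233; &#8364; 5
```

The result is plain ASCII, so you can hand it straight to $lint->parse() without the text-use-entity gripes. If your data is already decoded (say, read through a :encoding(UTF-8) layer), skip the decode call and run only the substitution.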
Re^2: HTML::Lint and utf-8 document woes
by GrandFather (Saint) on Nov 01, 2006 at 04:02 UTC
by graff (Chancellor) on Nov 01, 2006 at 04:17 UTC
by rhesa (Vicar) on Nov 01, 2006 at 04:39 UTC