
The problem lies in this subroutine defined in Lint.pm:

sub _text {
    my ($self,$text) = @_;

    while ( $text =~ /([^\x09\x0A\x0D -~])/g ) {
        my $bad = $1;
        $self->gripe(
            'text-use-entity', 
                char => sprintf( '\x%02lX', ord($bad) ),
                entity => $char2entity{ $bad },
        );
    }
}

Notice that every character other than tab, line feed, carriage return, and printable ASCII (space through tilde) is treated as "bad". The "operative theory" here is that you are supposed to use an entity for every such character, no matter what encoding the HTML header declares.

Well, crap. I hate that sort of attitude in a module, but if you really want to toe that line, there's a not-too-offensive way to do it: filter the HTML data so that all the non-ASCII characters are turned into numeric entities:

$html =~ s/([^\x09\x0A\x0D -~])/sprintf("&#%d;",ord($1))/eg;

$lint->parse ($html);

There! That shut him up. Maybe you don't want to go to such lengths, and most likely the right solution would be to fix that function in Lint.pm ... Your choice.
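
One caveat with the filter above: ord() only gives you the real code point if $html holds decoded characters rather than raw UTF-8 bytes. Here is a minimal, self-contained sketch of the whole step, assuming the document arrives as UTF-8 bytes. The helper name entitize and the sample string are made up for illustration and are not part of HTML::Lint; decode comes from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of HTML::Lint): replace every character
# other than tab, LF, CR and printable ASCII with a numeric entity.
sub entitize {
    my ($chars) = @_;
    $chars =~ s/([^\x09\x0A\x0D -~])/sprintf('&#%d;', ord($1))/eg;
    return $chars;
}

# Assumption: the input is raw UTF-8 bytes, so decode it first. Otherwise
# ord() would see individual bytes instead of whole characters.
my $bytes = "<p>caf\xC3\xA9: 5 \xE2\x82\xAC</p>";
my $html  = entitize( decode( 'UTF-8', $bytes ) );

print "$html\n";    # prints <p>caf&#233;: 5 &#8364;</p>
```

The result is pure ASCII, so it should get past the check in _text when you hand it to $lint->parse as before.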

In reply to Re: HTML::Lint and utf-8 document woes by graff
in thread HTML::Lint and utf-8 document woes by GrandFather
