in reply to HTML::Lint and utf-8 document woes

A lone \xE7 byte isn't valid UTF-8. It looks like the Latin-1 (ISO-8859-1) encoding of ç instead. Since your c-cedilla shows properly in this page, and the document charset is iso-8859-1, that's most likely the case.
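To make the difference concrete, here is a minimal sketch using the core Encode module. The helper names hex_bytes and is_valid_utf8 are made up for illustration, and "français" is just a sample word. It shows how U+00E7 comes out in each encoding, and that a strict UTF-8 decode rejects a bare E7 byte.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: render a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Hypothetical helper: true if the byte string decodes as strict UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;    # a copy, so decode() may consume it safely
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK); 1 } ? 1 : 0;
}

my $char = "\x{E7}";     # U+00E7, c-cedilla, as a character string

print "latin1: ", hex_bytes(encode('iso-8859-1', $char)), "\n";   # E7
print "utf-8:  ", hex_bytes(encode('UTF-8',      $char)), "\n";   # C3 A7

# E7 followed by an ASCII letter is malformed UTF-8; C3 A7 is fine.
print "fran\\xE7ais valid?      ", is_valid_utf8("fran\xE7ais"),      "\n";   # 0
print "fran\\xC3\\xA7ais valid? ", is_valid_utf8("fran\xC3\xA7ais"), "\n";   # 1
```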

FWIW, I'm rather partial to HTML::Tidy myself...

Re^2: HTML::Lint and utf-8 document woes
by GrandFather (Saint) on Nov 01, 2006 at 03:41 UTC

    The bytes in the .pl file are actually C3 A7. They may have been converted when the code was pasted into PerlMonks, and then rendered differently inside code tags.
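One way to check which bytes are really on disk is to read the file in raw mode and dump anything outside ASCII. This is only a sketch: non_ascii_hex is a made-up helper name, and the scratch file stands in for the actual .pl file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the non-ASCII bytes of a file as hex pairs.
sub non_ascii_hex {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                       # slurp mode: read the whole file at once
    my $data = <$fh>;
    close $fh;
    # In list context the /g match returns every captured high byte.
    return join ' ', map { sprintf '%02X', ord($_) } $data =~ /([\x80-\xFF])/g;
}

# Demo: write the UTF-8 bytes for c-cedilla to a scratch file, then inspect it.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "fran\xC3\xA7ais\n";
close $out;

print non_ascii_hex($path), "\n";   # C3 A7
```

A file saved as Latin-1 would show a single E7 here instead.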


    DWIM is Perl's answer to Gödel
      OK, C3 A7 is indeed the correct UTF-8 encoding. I suppose that means HTML::Lint does do bad things. In fact, it looks to me like the _text() method in HTML::Lint::Parser gets the wrong encoding back from HTML::Parser.