(W) The first chunk parsed appears to contain undecoded UTF-8, and one or more argspecs that decode entities are used for the callback handlers.

The result of decoding will be a mix of encoded and decoded characters for any entities that expand to characters with a code above 127. This is not a good thing.

The solution is to apply Encode::decode_utf8() to the data before feeding it to $p->parse(). For $p->parse_file(), pass a file that has been opened in ":utf8" mode.

The parser can process raw undecoded UTF-8 sanely if "utf8_mode" is enabled, or if the "attr", "@attr" and "dtext" argspecs are avoided.
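
The following is a minimal sketch of two ways to avoid this warning, assuming Perl 5.8 or later with the Encode module installed. The sample markup and the variable names ($octets, $decoded_text, $raw_text) are made up for illustration. The first parser is fed a decoded character string; the second is fed the raw octets with utf8_mode turned on.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use HTML::Parser ();

# Hypothetical input: raw UTF-8 octets, e.g. as read from a socket
# or from a file opened without an encoding layer.
my $octets = "<p>caf\xC3\xA9 &eacute;</p>";

# Option 1: decode the octets to a character string before parsing.
# The "dtext" argspec then yields characters throughout.
my $decoded_text = '';
my $p = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { $decoded_text .= shift }, 'dtext' ],
);
$p->parse(decode_utf8($octets));
$p->eof;

# Option 2: leave the input undecoded and enable utf8_mode, so that
# expanded entities are emitted as UTF-8 octets like the rest of the text.
my $raw_text = '';
my $q = HTML::Parser->new(
    api_version => 3,
    text_h      => [ sub { $raw_text .= shift }, 'dtext' ],
);
$q->utf8_mode(1);
$q->parse($octets);
$q->eof;
```

With option 1 the handler receives character strings; with option 2 it receives octets, which the caller still has to decode (or write out as bytes) consistently.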