Re^3: UTF8 related proof of concept exploit released at T-DOSE

Why go through that trouble if ":encoding(UTF-8)" does exactly the same thing, the same safe way, only with less code?

If it is sufficient that the app simply never sees a malformed byte sequence (or anything following a malformed character) when reading from a source that is expected to be UTF-8, you're right: it's better to handle it via the ":encoding(UTF-8)" layer in PerlIO.

But if there's any need to diagnose exactly what is malformed, or to recover any usable data that follows a bad byte sequence within a given input record, then the extra steps involving "decode('utf8',$string,...)" are the only way to do that, I think.
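To make that concrete, here is a minimal sketch of the kind of extra step I mean. It assumes Encode's documented FB_QUIET behaviour (decode() returns what it managed to decode and leaves the unprocessed remainder in the source variable). The helper name decode_with_report and the choice to substitute U+FFFD for each bad byte are only illustrative, not anything the modules prescribe.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (name is illustrative): decode a byte string as UTF-8,
# recording where each malformed byte sits and keeping the valid data after it.
sub decode_with_report {
    my ($bytes) = @_;            # private copy; decode() will consume it
    my $text    = '';
    my @bad;
    my $offset  = 0;             # byte offset into the original input

    while (length $bytes) {
        my $before = length $bytes;

        # FB_QUIET: return what decoded cleanly and leave the unprocessed
        # remainder (starting at the offending byte) in $bytes.
        $text   .= decode('UTF-8', $bytes, Encode::FB_QUIET);
        $offset += $before - length $bytes;
        last unless length $bytes;

        # Record and drop one offending byte, mark the spot, then resume.
        my $byte = substr($bytes, 0, 1, '');
        push @bad, { offset => $offset, byte => sprintf('\\x%02X', ord $byte) };
        $text .= "\x{FFFD}";     # U+FFFD REPLACEMENT CHARACTER (my choice)
        $offset++;
    }
    return ($text, \@bad);
}

# Example: one invalid byte in the middle of otherwise clean input.
my ($text, $bad) = decode_with_report("abc\xFFxyz");
printf "malformed byte %s at offset %d\n", $_->{byte}, $_->{offset} for @$bad;
```

The point is that the caller gets both the diagnosis (which bytes, at which offsets) and the data after the bad sequence, which a plain input layer does not hand back in that form.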

Re^4: UTF8 related proof of concept exploit released at T-DOSE by Juerd (Abbot) on Oct 15, 2007 at 16:09 UTC
Using warnings takes care of most cases, but indeed, if you want to catch the error and do anything special with it, the extra step is the easiest way. Good point.