in reply to Re: Locale Responsibilities
in thread Locale Responsibilities

If I don't enable the use bytes pragma, I get:

Malformed UTF-8 character (unexpected non-continuation byte 0xf4, 1 byte after start byte 0xf1, expected 4 bytes) in pattern match (m//) at /home/aecoope/code/monotone.ca/mtn-browse/lib/perl/FindFiles.pm line 601.

when searching binary data with a regex.

Tony.

Re^3: Locale Responsibilities
by ikegami (Patriarch) on May 25, 2009 at 17:15 UTC
    No, your incorrect use of _utf8_on (or an equivalent, such as the :utf8 PerlIO layer) is causing that. use bytes only kind of works around your earlier bug.
    $ perl -MEncode=_utf8_on -e'$s = "\xF1\xF4"; _utf8_on($s); "" =~ /$s/'
    Malformed UTF-8 character (unexpected non-continuation byte 0xf4, immediately after start byte 0xf1) in regexp compilation at -e line 1.
    Malformed UTF-8 character (1 byte, need 4, after start byte 0xf4) in regexp compilation at -e line 1.
    Malformed UTF-8 character (unexpected non-continuation byte 0xf4, immediately after start byte 0xf1) in regexp compilation at -e line 1.
    Malformed UTF-8 character (1 byte, need 4, after start byte 0xf4) in regexp compilation at -e line 1.

    $ perl -MEncode=_utf8_on -e'$s = "\xF1\xF4"; "" =~ /$s/'
    $
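
For illustration only, here is a minimal sketch of the distinction drawn above. It is not the code from FindFiles.pm, and the variable names are hypothetical. It assumes two cases: binary data that should stay a plain byte string, and text that should be decoded with validation rather than having its UTF8 flag switched on. The core function utf8::decode is used here as one validating alternative to _utf8_on; Encode's decode would be another.

```perl
use strict;
use warnings;

# Case 1: binary data. Leave it as a byte string (no _utf8_on, no :utf8
# layer) and the regex engine matches byte for byte, without "use bytes".
my $needle   = "\xF1\xF4";            # hypothetical binary pattern
my $haystack = "\x00\xF1\xF4\xFF";    # hypothetical binary data
my $found    = $haystack =~ /\Q$needle\E/ ? 1 : 0;

# Case 2: text. Decode with validation instead of flipping the flag.
# utf8::decode returns false and leaves the string untouched when the
# bytes are not well-formed UTF-8.
my $bad    = "\xF1\xF4";
my $bad_ok = utf8::decode($bad) ? 1 : 0;

my $good    = "\xC3\xA9";             # UTF-8 encoding of U+00E9
my $good_ok = utf8::decode($good) ? 1 : 0;

print "binary match: $found\n";
print "bad decoded: $bad_ok, good decoded: $good_ok\n";
```

The point of the sketch is that the malformed-character warnings should not appear on either path: the byte string is never claimed to be UTF-8, and the invalid input is rejected at decode time instead of reaching the regex engine with the flag set.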