in reply to Re^6: Malformed UTF-8
in thread Malformed UTF-8

Anyway, I updated my test program to something that should now replicate your error:
    use strict;
    use warnings;

    use Devel::Peek 'Dump';

    my $token = "ba\x{f1}o";
    utf8::upgrade($token); # force utf-8 encoding and flag.

    my $term = "ba\303\261o"; # "utf-8 encoded" but no flag.

    warn "Token:\n";
    Dump($token);

    warn "Term:\n";
    Dump($term);

    print "match\n" if $token =~ /^$term/i;
Note that this runs fine (i.e. no match, no error) on my system (5.8.8 built for i686-linux-thread-multi).
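For comparison, here is a minimal sketch of one way to avoid mixing a flagged string with raw bytes: decode the term first, so both sides of the match are character strings. This assumes the term really does hold UTF-8 encoded octets. The names $bytes and $matched are only for illustration, and I can't promise it sidesteps the error on every perl build.

```perl
use strict;
use warnings;

use Encode 'decode';

my $token = "ba\x{f1}o";
utf8::upgrade($token);                  # character string with the UTF8 flag on

my $bytes = "ba\303\261o";              # raw UTF-8 octets, no flag (5 bytes)
my $term  = decode('UTF-8', $bytes);    # assumption: the input is valid UTF-8

# Both operands are now character strings, so the pattern is
# built from characters rather than from undecoded bytes.
my $matched = ($token =~ /^$term/i) ? 1 : 0;

print "bytes: ", length($bytes), ", chars: ", length($term), "\n";
print $matched ? "match\n" : "no match\n";
```

The design choice is simply to decode at the boundary (where the data comes in) instead of relying on perl to reconcile the two representations inside the regex engine.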

Re^8: Malformed UTF-8
by spiros (Beadle) on May 15, 2007 at 17:55 UTC
    bunny:/tmp spiros$ perl test.pl
    Token:
    SV = PV(0x1801460) at 0x180bcf0
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
      PV = 0x30afc0 "ba\303\261o"\0 [UTF8 "ba\x{f1}o"]
      CUR = 5
      LEN = 6
    Term:
    SV = PV(0x1801484) at 0x180bce4
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,POK,pPOK)
      PV = 0x300fa0 "ba\303\261o"\0
      CUR = 5
      LEN = 6
    Malformed UTF-8 character (unexpected non-continuation byte 0x00, immediately after start byte 0xc3) in pattern match (m//) at test.pl line 17.
    Hurray! Thanks once more!
        Ah yeah, I know. Unfortunately, upgrading is not an option so now I will have to look into it more (and try to figure out the binmode issue at some point). Thanks once more!