traveler has asked for the wisdom of the Perl Monks concerning the following question:

I have this code:
use Switch;
use XML::Simple;
...
my $xml = XMLin("file");
my $foo = $xml;    # the xml is <foo>combo</foo>
print "$foo\n";    # prints the correct value
switch ($foo) {
    case /combo/ { print "found a combo\n"; }
}
and I got these errors (the hex values change) when I added the switch/case.
Malformed UTF-8 character (unexpected continuation byte 0x9c, with no preceding start byte) in bitwise and (&) at C:/Perl/lib/Switch.pm line 257.
Malformed UTF-8 character (unexpected continuation byte 0x90, with no preceding start byte) in bitwise and (&) at C:/Perl/lib/Switch.pm line 257.
Malformed UTF-8 character (unexpected continuation byte 0x92, with no preceding start byte) in bitwise and (&) at C:/Perl/lib/Switch.pm line 257.
Malformed UTF-8 character (unexpected continuation byte 0x9d, with no preceding start byte) in bitwise and (&) at C:/Perl/lib/Switch.pm line 257.
Malformed UTF-8 character (unexpected continuation byte 0x90, with no preceding start byte) in bitwise and (&) at C:/Perl/lib/Switch.pm line 257.
I tried turning on and off utf8, but it didn't help. What do I need to do?

Thanks, --traveler

Re: XML::Simple, Switch and UTF8 errors
by diotalevi (Canon) on Feb 02, 2004 at 18:55 UTC
    Here is a reasonable guess. Switch is really a source filter - it attempts to rewrite your program for you. I'm betting that unless you fix the bug in Switch you probably shouldn't use it here. Be sure to file your bug report at http://rt.cpan.org/NoAuth/Bugs.html?Dist=Switch.

    Switch.pm: BUGS
    There are undoubtedly serious bugs lurking somewhere in code this funky :-) Bug reports and other feedback are most welcome.
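If dropping Switch is an option, the same test can be written with ordinary regex matches, which involves no source filtering at all. This is only a minimal sketch: the helper name classify and the second pattern /single/ are made up for illustration, and the literal "combo" stands in for the value read via XMLin.

```perl
use strict;
use warnings;

# Hypothetical helper: plain if/elsif regex tests in place of
# Switch's source-filtered switch/case.
sub classify {
    my ($value) = @_;
    if    ($value =~ /combo/)  { return "found a combo"; }
    elsif ($value =~ /single/) { return "found a single"; }   # illustrative extra case
    else                       { return "no match"; }
}

my $foo = "combo";    # stands in for the value returned by XMLin
print classify($foo), "\n";
```

Because nothing rewrites the program text here, the bitwise number-or-string check inside Switch.pm is never reached.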

Re: XML::Simple, Switch and UTF8 errors
by ysth (Canon) on Feb 02, 2004 at 21:45 UTC
    Can you do a:
    use Data::Dumper;
    $Data::Dumper::Useqq = 1;
    print Dumper $foo;
    and let us know what it shows?
      Yep. I had not included it because it was not "special":
      $VAR1 = "combo";
      It looks that way on the screen, too.

      BTW, I have even tried changing the encoding parameter in the XML. If it matters, this is all under ActiveState 5.8.1 on XP. The XML was created using Notepad...

      --traveler

        Reduced the problem to this:
        $ perl -we'$x = "combo\x{fff}"; chop($x); 0 eq (~$x&$x)'
        Useless use of string eq in void context at -e line 1.
        Malformed UTF-8 character (unexpected continuation byte 0x9c, with no preceding start byte) in bitwise and (&) at -e line 1.
        Malformed UTF-8 character (unexpected continuation byte 0x90, with no preceding start byte) in bitwise and (&) at -e line 1.
        Malformed UTF-8 character (unexpected continuation byte 0x92, with no preceding start byte) in bitwise and (&) at -e line 1.
        Malformed UTF-8 character (unexpected continuation byte 0x9d, with no preceding start byte) in bitwise and (&) at -e line 1.
        Malformed UTF-8 character (unexpected continuation byte 0x90, with no preceding start byte) in bitwise and (&) at -e line 1.
        Switch.pm uses a peculiarity of the ~ and & operators to tell whether the value is a number or a string, and that check fails when the value being switched on has the UTF8 flag set but contains no characters with code points > 255.

        This seems to be a bug in ~: it's returning non-UTF8 with the UTF8 flag set.
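For anyone unfamiliar with the trick, here is a rough sketch of the kind of check involved. The helper name is_num_by_bitops is invented for this post and is not Switch.pm's actual code. It relies on ~ and & acting numerically on numbers and bytewise on strings, so it assumes the "bitwise" feature is not enabled.

```perl
use strict;
use warnings;

# Hypothetical helper: true when Perl treats $val as a number,
# false when it treats it as a string.
sub is_num_by_bitops {
    my ($val) = @_;
    # Number: ~$val & $val is the integer 0, which stringifies to "0".
    # String: the result is a run of "\0" bytes, which is not eq "0".
    return (~$val & $val) eq 0;
}

print is_num_by_bitops(42)      ? "number\n" : "string\n";
print is_num_by_bitops("combo") ? "number\n" : "string\n";
```

On a perl without the bug this should report "number" for 42 and "string" for "combo". The string branch is the one that goes wrong when ~ hands back malformed data.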

        Update: This was fixed just a few days ago (see perlbug #24926).
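On a perl that does not have the fix yet, one possible workaround (a guess, not tested against 5.8.1) is to clear the UTF8 flag before the value reaches switch. utf8::is_utf8 and utf8::downgrade are core functions. The helper name without_utf8_flag is made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $str with the UTF8 flag cleared
# when every character fits in one byte; otherwise return it unchanged.
sub without_utf8_flag {
    my ($str) = @_;
    utf8::downgrade($str, 1);    # true second argument: do not die on failure
    return $str;
}

my $x = "combo\x{fff}";
chop($x);                        # now "combo", but the UTF8 flag stays set
my $plain = without_utf8_flag($x);

print utf8::is_utf8($x)     ? "flagged\n" : "not flagged\n";
print utf8::is_utf8($plain) ? "flagged\n" : "not flagged\n";
```

The first line printed should be "flagged" and the second "not flagged". Whether switching on $plain actually avoids the warnings on an affected build is an assumption.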