nglenn has asked for the wisdom of the Perl Monks concerning the following question:

I am using XML::Simple to parse some UTF-8 XML files. However, UTF-8 files with a BOM cause it to crash and burn with the following error:

Wide character in subroutine entry at C:/Perl/lib/Encode.pm line 174.

The offending code is:

use XML::Simple;
local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';
$xml->XMLin($file,ForceArray => ['map'], KeyAttr =>{},);

The error does NOT occur with UTF-8 files that have no BOM. This seems bizarre, and I have no idea what to do. Any suggestions?

Update:

It has to be the way I'm using it. Here's more context for the code I posted earlier:

In ETTX.pm:

package ETTX;
use strict;
use XML::Simple;
local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';
use XML::SemanticDiff;

sub new(){#scalar file name
    my $class = shift;
    my $self = {
        ettxFile => '',
        ettx => {},
    };
    bless $self, $class;
    load($self,shift) if @_ ==1;
    return $self;
};

sub load(){#scalar file name
    my ($self, $ettxFile) = @_;
    print "loading $ettxFile";
    open(my $fh, '<', $ettxFile);
    binmode($fh);
    my $xml = XML::Simple->new();
    $self->{ettx} = $xml->XMLin($fh,
        ForceArray => ['map'],
        KeyAttr => {},
    ) ->{table};
    $self->{ettxFile} = $ettxFile;
    # print Dumper($self);
    1;
}

The code above was changed to use binmode(), but it still fails in exactly the same way.

Called like so:

use lib 'C:\Texts\Programs'; #wherever ETTX.pm is
use ETTX;
my $ettxFile = 'someXMLfile.ettx';
my $ettx->load($ettxFile);

And here's a sample xml file:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<ettx ver="2">
  <table id="{4fa6cd7a-f7b6-416d-8f59-3acc0eab9bdb}" name="TestFile">
    <level type="V">
      <map sync="Title" src="someText"/>
      <map sync="Title Page" src="someText"/>
    </level>
  </table>
</ettx>

Replies are listed 'Best First'.
Re: Encode throws "Wide character in subroutine entry" when using XML::Simple
by Jim (Curate) on Dec 12, 2010 at 06:19 UTC

    This works for me.

    D:\>file Simple.xml
    Simple.xml: Text file, UTF-8 format

    D:\>cat Simple.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <root>
        <foo>
            bar
        </foo>
    </root>

    D:\>od -h Simple.xml | head -1 | cut -c 1-24
    0000000000 EF BB BF

    D:\>cat Simple.pl
    #!perl
    use strict;
    use warnings;
    use XML::Simple;
    local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';
    my $file = shift @ARGV;
    my $xml = XMLin($file, ForceArray => ['map'], KeyAttr => {});

    D:\>perl Simple.pl Simple.xml

    D:\>perl -v | head -8 | fmt -w 68
    This is perl 5, version 12, subversion 2 (v5.12.2) built for
    MSWin32-x86-multi-thread (with 8 registered patches, see perl -V
    for more detail)
    Copyright 1987-2010, Larry Wall
    Binary build 1202 [293621] provided by ActiveState
    http://www.ActiveState.com Built Sep 6 2010 23:36:03

    D:\>

    Simple.xml is a simple XML document in the UTF-8 character encoding scheme of the Unicode coded character set. It has a UTF-8 byte order mark in it. XML::Simple handles the XML document properly.
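
For readers without od at hand, the same check for a byte order mark can be sketched in Perl. This is a minimal illustration, not code from the thread: the helper name has_utf8_bom and the temporary file are assumptions made for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: true if the file starts with the UTF-8 BOM (EF BB BF).
sub has_utf8_bom {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $read = read($fh, my $head, 3);   # read the first three raw bytes
    close $fh;
    return defined $read && $read == 3 && $head eq "\xEF\xBB\xBF";
}

# Demo on a throwaway file so the sketch is self-contained.
my ($out, $tmp) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "\xEF\xBB\xBF", '<?xml version="1.0" encoding="UTF-8"?><root/>';
close $out;

print has_utf8_bom($tmp) ? "BOM present\n" : "no BOM\n";
```

Opening with the :raw layer matters here: the check compares bytes, so no decoding layer should sit between the file and the comparison.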

      Not directly related, but I had the same error and Google led me here. You get the same error when you decode and specify the wrong encoding: I tried to decode as ISO 8859-1 (Latin-1) when the data was UTF-8.
Re: Encode throws "Wide character in subroutine entry" when using XML::Simple
by Jim (Curate) on Dec 12, 2010 at 00:10 UTC

    Try this (untested):

    use strict;
    use warnings;
    use autodie;
    use XML::Simple;
    local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';
    open my $fh, '<:encoding(UTF-8)', $file;
    $xml->XMLin(
        $fh,
        ForceArray => ['map'],
        KeyAttr => {},
    );

    Just a thought... But see 850187.

      Backwards. That would cause the problem.
      open(my $fh, '<:encoding(UTF-8)', $file);
      should be
      open(my $fh, '<', $file); binmode($fh);
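
One plausible reading of the "Backwards" remark, offered as a sketch rather than a confirmed diagnosis of the original poster's error: an :encoding(UTF-8) layer hands over characters, and decoding characters a second time fails as soon as one of them is above 0xFF. A decoded BOM is U+FEFF, which would also explain why BOM-less ASCII files get away with it. The example uses only Encode's decode; the helper name decodes_ok is an assumption, and the exact wording of the error differs between Encode versions.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes as they sit on disk: UTF-8 BOM, then an ASCII-only document.
my $bytes = "\xEF\xBB\xBF<root/>";

# What an :encoding(UTF-8) layer would hand over: characters, not bytes.
my $chars = decode('UTF-8', $bytes);
my $first = sprintf 'U+%04X', ord $chars;

# Hypothetical helper: try to decode a value once more and report success.
sub decodes_ok {
    my ($data) = @_;
    return eval { decode('UTF-8', $data); 1 } ? 1 : 0;
}

my $bom_again   = decodes_ok($chars);                      # contains U+FEFF
my $ascii_again = decodes_ok(decode('UTF-8', '<root/>'));  # all below 0x80

print "first character after decoding: $first\n";
print "re-decoding text with BOM:    ", ($bom_again   ? "ok" : "dies"), "\n";
print "re-decoding text without BOM: ", ($ascii_again ? "ok" : "dies"), "\n";
```

If this is what happens inside the parser, passing raw bytes (plain '<' plus binmode) avoids the second decode, because the parser then performs the only decoding step itself.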

        That is as counterintuitive as a thing can be. You have a file that is encoded in UTF-8 and has a UTF-8 byte order mark in it, yet to solve a problem with it not being interpreted properly as UTF-8 text by a module, you have to use binmode, not the proper UTF-8 encoding layer :encoding(UTF-8). It just doesn't make sense. Who would intuit that? Obviously, not I. :-(

        Nope. This still throws the same errors:

        open(my $fh, '<', $file);
        binmode($fh);
        my $xml = XML::Simple->new();
        $self->{ettx} = $xml->XMLin($fh,
            ForceArray => ['map'],
            KeyAttr => {},
        ) ->{table};
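
Since binmode alone did not help here, one further thing to try is taking the BOM out of the picture before the parser sees the data. The following is a sketch under assumptions, not a confirmed fix for this thread: slurp_without_bom is a hypothetical helper, the temporary file stands in for the .ettx file, and the XMLin call is left commented out so the example runs without XML::Simple installed. It also checks the result of open, which the code above does not.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a file as raw bytes and drop a leading UTF-8 BOM.
sub slurp_without_bom {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $bytes = do { local $/; <$fh> };   # slurp the whole file
    close $fh;
    $bytes = '' unless defined $bytes;    # empty file
    $bytes =~ s/\A\xEF\xBB\xBF//;         # strip the BOM if present
    return $bytes;
}

# Stand-in for the .ettx file: BOM followed by a tiny document.
my ($out, $tmp) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "\xEF\xBB\xBF<ettx ver=\"2\"><table name=\"TestFile\"/></ettx>";
close $out;

my $xml_bytes = slurp_without_bom($tmp);
print "starts with: ", substr($xml_bytes, 0, 5), "\n";

# Assumed next step (not run here): XMLin accepts a string of XML as well
# as a file name or handle.
# my $data = XML::Simple->new->XMLin($xml_bytes, ForceArray => ['map'], KeyAttr => {});
```

If the error persists even with BOM-free bytes, the BOM is probably not the trigger, and the way the module is loaded and called deserves a closer look.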