in reply to Re: XML::Simple throws decode error in encode.pm
in thread XML::Simple throws decode error in encode.pm

Shoot! I pasted the wrong error. I meant to say that the error printed is:

Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 174.

Okay, I found a way to reproduce it. Place the following lines in a text file and save it as UTF-8:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

<etax id="{e961ee2c-a029-489a-8bf4-3c2ecef7f019}" ettx="TL-Scriptures.ettx">

<sifx>

<LEX st="Kasulatan" id="tl" chrBrk="–—" tSt="s11"/>

<LEX st="Pambungad" id="tl" chrBrk="–—" tSt="p11"/>

<LEX st="Panimula" id="tl" chrBrk="–—" tSt="h11"/>

</sifx>

The following is the module I wrote where the error occurs:

#!/usr/bin/perl -l
package SIFX;
use strict;
use XML::Simple;
use Data::Dumper;

sub new(){#scalar file name optional
    my $class = shift;
    my $self = {
        etaxFile => '',
        SIFX => {},
    };
    bless $self, $class;
    load($self,shift) if @_ ==1;
    return $self;
}

sub load(){
    return -1 if(@_ == 0);
    my ($self, $input) = @_;
    my $sifx;
    if((substr $input, -4, 4) eq '.txt'){#if input was a file name
        $self->{etaxFile} = $input;
        open my $etax, '<:utf8', $input or print "Could not open etax file at __LINE__";
        my $text;
        while($text ne '<sifx>'){#until beginning of SIFX
            chomp($text = <$etax>);
        }
        $sifx= '<sifx>';
        do{#until end of SIFX
            chomp($text = <$etax>);
            $sifx .= $text;
        }while($text !~ m#</sifx>#);
        close $etax;
    }
    else{
        $sifx = $input;
    }
    my $xml = XML::Simple->new();
    $self->{SIFX} = $xml->XMLin($sifx);
    print "SIFX hash is : ";
    print Dumper($self->{SIFX});
}
1;

And you can test it with the following after changing the $testSifx variable to the path of the text file:

my $xml = XML::Simple->new();
my $testSifx = "C:\\Users\\nate\\Desktop\\testSIFX.txt";
my $sifx = SIFX->new($testSifx);
print Dumper($sifx);

I apologize for the original, inadequate, inaccurate post.

Re^3: XML::Simple throws decode error in encode.pm
by almut (Canon) on Aug 04, 2010 at 20:59 UTC

    As already hinted at, XMLin() doesn't like already decoded input; it wants bytes/octets.  IOW, simply open the file as

    open my $etax, '<', $input or ...
    #               ^ no :utf8
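To see why already-decoded input trips the error, here is a minimal sketch that uses only the core Encode module. It does not call XML::Simple at all; it assumes that the parser decodes its input internally, which is what the Encode.pm line in the error message suggests. The variable names are made up for illustration, and the octets are the UTF-8 encoding of the two dashes from the chrBrk attribute.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 octets for U+2013 and U+2014 (the chrBrk dashes), as a raw '<' read would return them.
my $octets = "\xE2\x80\x93\xE2\x80\x94";

# Decoding octets once works and yields a two-character string.
my $chars = decode('UTF-8', $octets);

# Decoding the already-decoded string dies, because it now holds
# code points above 0xFF and can no longer be treated as octets.
my $ok  = eval { decode('UTF-8', $chars); 1 };
my $err = $ok ? '' : $@;

printf "octets: %d, characters: %d\n", length($octets), length($chars);
print  "second decode failed: $err" if $err;
```

Reading the file with the :utf8 layer corresponds to the first decode; handing that result to something that decodes again corresponds to the second, which is presumably what happens inside XMLin().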

      Works like a charm! Thanks!

      Works great now! Thanks!