error returned with XML::Simple or Data::Dumper

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: error returned with XML::Simple or Data::Dumper by roboticus (Chancellor) on Jul 17, 2010 at 18:04 UTC
Perhaps the encoding detection code isn't happy because $_data is undefined? Why not try setting $_data to some known XML and see what it does? ...roboticus
Re^2: error returned with XML::Simple or Data::Dumper by Anonymous Monk on Jul 17, 2010 at 19:32 UTC
Sorry, this is in a subroutine, and $_data is passed to the subroutine, so it is not undefined. I have it write to a debug file, and here is an example of the output (I only changed the private data): `Data Received: " <?xml version = "1.0"?> <response> <status>success</status> <cardnumber>4141414141414141</cardnumber> <balance>1872.39</balance> </response> " XML Parser Parsed it into: $VAR1 = { 'balance' => '1872.39', 'cardnumber' => '4141414141414141', 'status' => 'success' };` So I know it is getting data passed to it. Richard
Re^3: error returned with XML::Simple or Data::Dumper by derby (Abbot) on Jul 17, 2010 at 21:32 UTC
Your $_data contains data in a character encoding that XML::SAX::PurePerl knows nothing about. Where are you getting $_data from? Is it UTF-8? ASCII? Some other charset? Is the charset being mangled somewhere along the line? I couldn't see how to tell XMLin specifically which charset to use. -derby
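One way to answer those questions is to look at the raw bytes before parsing. The following is a minimal sketch, not anything from XML::Simple or XML::SAX::PurePerl: `describe_data` is a hypothetical helper name, and it only assumes that $_data is an ordinary Perl scalar.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any XML module): summarise what the
# scalar actually holds so the encoding question can be narrowed down.
sub describe_data {
    my ($data) = @_;
    return 'undef' unless defined $data;

    my $len  = length $data;
    # utf8::is_utf8 reports whether Perl treats the scalar as characters
    # (flag on) or as raw bytes (flag off).
    my $flag = utf8::is_utf8($data)
        ? 'character string (UTF8 flag on)'
        : 'byte string (UTF8 flag off)';
    # Hex dump of the first 16 bytes/characters, where a BOM or stray
    # leading junk would show up.
    my $head = join ' ', map { sprintf '%02X', ord($_) } split //, substr($data, 0, 16);

    return "length=$len; $flag; first bytes: $head";
}
```

Writing `describe_data($_data)` to the same debug file, just before the XMLin call, should show whether the data starts with something unexpected.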
Re^4: error returned with XML::Simple or Data::Dumper by Anonymous Monk on Jul 17, 2010 at 22:49 UTC
Re^5: error returned with XML::Simple or Data::Dumper by aquarium (Curate) on Jul 18, 2010 at 23:51 UTC
Re: error returned with XML::Simple or Data::Dumper by grantm (Parson) on Jul 19, 2010 at 01:14 UTC
That message is a warning (rather than an error), and it comes from the XML::SAX::PurePerl parser module. The EncodingDetect.pm file contains a routine that guesses which encoding the source document uses. The routine is only invoked if your source document does not start with an XML declaration that declares the encoding, so if you arrange for an encoding declaration to be added when the document is generated, the warning will go away.
The encoding detection routine has very simple logic. It first looks at the first few bytes of the file to see whether it starts with a 'Byte Order Mark' (BOM). If a BOM is present, the encoding is detected automatically. If there is no BOM but the first four bytes are ASCII "<?xm", UTF-8 encoding is assumed. If the first non-whitespace byte is ASCII "<", UTF-8 encoding is also assumed. Finally, a check is done to see whether the bytes look like EBCDIC.
If all these checks fail (as is happening in your case), the warning is emitted, UTF-8 encoding is assumed anyway, and parsing continues. However, it seems very unlikely that you have a valid XML document if none of those checks succeeded. The most likely scenario is that the input XML is either undefined or an empty string. I recommend you go back and throw in a 'print' immediately before the parse to confirm you really do have some XML.
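The order of checks described above can be sketched as follows. This is an illustration only, not the code in EncodingDetect.pm: `guess_encoding_branch` is a hypothetical name, and the BOM and EBCDIC byte values are assumptions taken from the usual XML autodetection table rather than from the module source.

```perl
use strict;
use warnings;

# Hypothetical helper: reports which of the described checks the input
# would satisfy. It does not reproduce the real module's code.
sub guess_encoding_branch {
    my ($xml) = @_;

    # The two cases suggested as most likely for this warning.
    return 'undefined input' unless defined $xml;
    return 'empty input'     if $xml eq '';

    # 1. Byte Order Mark (UTF-8, UTF-16LE, UTF-16BE, UTF-32BE).
    return 'BOM present'
        if $xml =~ /\A(?:\xEF\xBB\xBF|\xFF\xFE|\xFE\xFF|\x00\x00\xFE\xFF)/;

    # 2. First four bytes are ASCII "<?xm".
    return 'starts with <?xm, UTF-8 assumed' if $xml =~ /\A<\?xm/;

    # 3. First non-whitespace byte is ASCII "<".
    return 'first non-whitespace byte is <, UTF-8 assumed' if $xml =~ /\A\s*</;

    # 4. "<?xm" as encoded in EBCDIC.
    return 'looks like EBCDIC' if $xml =~ /\A\x4C\x6F\xA7\x94/;

    # None of the checks matched: the case where the warning appears.
    return 'no check matched';
}
```

Calling `warn guess_encoding_branch($_data)` inside the subroutine, just before XMLin, is one way to do the suggested 'print' and see at the same time which case applies.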