PerlMonks  

XML::Parser does not parse the Symbol

by gopalr (Priest)
on Jul 11, 2013 at 10:49 UTC ( [id://1043689]=perlquestion )

gopalr has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

The following symbol is not parsed by XML::Parser and is not validated, even though I use CDATA.

¿ (¿)

$parser="<symbols><![CDATA[Testing for Symbol ¿]]><symbols>"

$parser = new XML::Parser(Style => 'Tree');

Can you please provide your guidance?

Thanks in Advance!!

Gopal R

Replies are listed 'Best First'.
Re: XML::Parser does not parse the Symbol
by mirod (Canon) on Jul 11, 2013 at 10:57 UTC

    What's the error message? Without it I can only guess...

    ... that maybe the inverted question mark is encoded in extended ASCII (ISO-8859-1). Since you don't specify an encoding in the XML string, it is assumed to be in UTF-8, and you should get an "invalid character" or similar error.
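
    The guess above can be checked with a small sketch using only the core Encode module. It assumes the symbol really arrives as the single ISO-8859-1 byte 0xBF; the helper name is_valid_utf8 is hypothetical and exists only for this illustration.

    ```perl
    use strict;
    use warnings;
    use Encode qw(decode encode);

    # The same character, U+00BF (inverted question mark), in two encodings.
    my $latin1_bytes = encode('ISO-8859-1', "\x{BF}");   # one byte:  BF
    my $utf8_bytes   = encode('UTF-8',      "\x{BF}");   # two bytes: C2 BF

    # Hypothetical helper: true if the byte string decodes cleanly as UTF-8.
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps the
    # input string untouched.
    sub is_valid_utf8 {
        my ($bytes) = @_;
        my $ok = eval {
            decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
            1;
        };
        return $ok ? 1 : 0;
    }

    printf "ISO-8859-1 bytes valid as UTF-8? %s\n",
        is_valid_utf8($latin1_bytes) ? 'yes' : 'no';
    printf "UTF-8 bytes valid as UTF-8?      %s\n",
        is_valid_utf8($utf8_bytes) ? 'yes' : 'no';
    ```

    A lone 0xBF byte is a UTF-8 continuation byte with nothing to continue, which is why a parser that assumes UTF-8 would reject it.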

    If in your real code the string is hard-coded in the program file, then you need to use utf8;.

    If you get the data from a file, you need to either add an XML declaration specifying the encoding, pre-process the data to convert it to utf-8 or use the ProtocolEncoding option when you create the XML::Parser object (I would advise against this last solution though, better to keep the info about the encoding of the data with the data than in the code).
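
    A minimal sketch of the first two options, assuming the data is ISO-8859-1 and that XML::Parser and the core Encode module are available. The variable names are made up for this illustration, and the \xBF escape stands in for the raw byte that would come from a file.

    ```perl
    use strict;
    use warnings;
    use Encode qw(decode encode);
    use XML::Parser;

    # Assumption: ISO-8859-1 input, so the inverted question mark is the byte 0xBF.
    my $latin1_xml = "<symbols><![CDATA[Testing for Symbol \xBF]]></symbols>";

    # Option 1: keep the encoding information with the data by adding an
    # XML declaration.
    my $declared = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n} . $latin1_xml;
    my $tree_declared = XML::Parser->new(Style => 'Tree')->parse($declared);

    # Option 2: pre-process the bytes into UTF-8, the default XML encoding.
    my $utf8_xml = encode('UTF-8', decode('ISO-8859-1', $latin1_xml));
    my $tree_converted = XML::Parser->new(Style => 'Tree')->parse($utf8_xml);

    # The Tree style returns [ tag, [ \%attributes, 0, text, ... ] ].
    my $text_declared  = $tree_declared->[1][2];
    my $text_converted = $tree_converted->[1][2];

    printf "last character, option 1: U+%04X\n", ord(substr($text_declared,  -1));
    printf "last character, option 2: U+%04X\n", ord(substr($text_converted, -1));
    ```

    Either way the parser hands back Perl character strings, so the two results should be identical regardless of how the input was encoded.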

      It is working fine if I use ProtocolEncoding

      $parser = new XML::Parser(Style => 'Tree');
      $xml = $parser->parse($xml, ProtocolEncoding => 'ISO-8859-1')

      I have one more question: if we use ISO-8859-1, will it support UTF-8 as well?

        No.

        Did you really read the part where I advised you NOT to use ProtocolEncoding?

        If you have to deal with XML, please educate yourself about encodings, it will pay off in the very short term.
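
        To illustrate the "no", here is a small sketch with the core Encode module showing what would likely happen if UTF-8 data were read as ISO-8859-1. It only decodes byte strings and does not call XML::Parser; the variable names are made up for the example.

        ```perl
        use strict;
        use warnings;
        use Encode qw(decode encode);

        # U+00BF (inverted question mark) encoded as UTF-8 is two bytes: C2 BF.
        my $utf8_bytes = encode('UTF-8', "\x{BF}");

        # Reading those bytes as ISO-8859-1 turns each byte into its own character.
        my $wrong = decode('ISO-8859-1', $utf8_bytes);

        # Reading them as UTF-8 gives back the single intended character.
        my $right = decode('UTF-8', $utf8_bytes);

        for my $case ([ 'as ISO-8859-1', $wrong ], [ 'as UTF-8', $right ]) {
            my ($label, $string) = @$case;
            printf "%-14s %d character(s): %s\n", $label, length($string),
                join ' ', map { sprintf 'U+%04X', ord } split //, $string;
        }
        ```

        No error is raised in the ISO-8859-1 case, because every byte is a valid ISO-8859-1 character; the data is just silently wrong.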

Re: XML::Parser does not parse the Symbol
by choroba (Cardinal) on Jul 11, 2013 at 10:55 UTC
    Works for me. I had to tweak your code a bit to make it run:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use utf8;

    use XML::Parser;

    my $xml = "<symbols><![CDATA[Testing for Symbol ¿]]></symbols>";
    my $parser = 'XML::Parser'->new;
    $parser->parse($xml);
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
