siva kumar has asked for the wisdom of the Perl Monks concerning the following question:

I have a huge XML file. Below are the relevant part of the data and the problem area of my code.
Part of my XML data looks like this:
<Company_Name>XYZ CMPNY</Company_Name><ArgId>775340</ArgId><First_Name>Carol &amp; Jerry</First_Name>
I need only "Company_Name" and "First_Name". I have written Perl code using XML::Parser; here it is:
use strict;
use XML::Parser;

my $parser = new XML::Parser(ErrorContext => 2);
my $xmlStr = "<data>
<Company_Name>XYZ CMPNY</Company_Name>
<ArgId>775340</ArgId>
<First_Name>Carol &amp; Jerry</First_Name>
</data>";
my $writeDataFlag = 0;

$parser->setHandlers(
    Start => \&start_handler,
    Char  => \&char_handler,
    End   => \&end_handler);
$parser->parse($xmlStr);

sub char_handler {
    my ($p, $data) = @_;
    if($writeDataFlag == 1){
        print "Data - [$data] \n";
    }
}

sub start_handler {
    my ($p, $data) = @_;
    if($data =~ /^(Company_Name|First_Name)$/) {
        $writeDataFlag = 1;
    }
}

sub end_handler {
    my ($p, $data) = @_;
    $writeDataFlag = 0;
}

1;
OUTPUT:
Company_Name ---> [XYZ CMPNY]
First_Name ---> [Carol ] [&] [ Jerry]
I need this OUTPUT:
Company_Name ---> [XYZ CMPNY]
First_Name ---> [Carol & Jerry]
Can you suggest what is wrong with my code?
Thanks, Sivakumar

Re: A Small XML::Parser issue
by davorg (Chancellor) on Feb 16, 2007 at 12:34 UTC

    The documentation for the Char handler says this:

    A single non-markup sequence of characters may generate multiple calls to this handler

    So you need to accumulate the data and only print it in the end handler. Something like this perhaps:

    use strict;
    use XML::Parser;

    my $parser = XML::Parser->new(ErrorContext => 2);
    my $xmlStr = "<data>
    <Company_Name>XYZ CMPNY</Company_Name>
    <ArgId>775340</ArgId>
    <First_Name>Carol &amp; Jerry</First_Name>
    </data>";
    my $writeDataFlag = 0;
    my $text = '';    # declared before parsing so the handlers start with an empty buffer

    $parser->setHandlers(
        Start => \&start_handler,
        Char  => \&char_handler,
        End   => \&end_handler);
    $parser->parse($xmlStr);

    # May be called several times for one run of text, so only accumulate here.
    sub char_handler {
        my ($p, $data) = @_;
        if ($writeDataFlag) {
            $text .= $data;
        }
    }

    sub start_handler {
        my ($p, $data) = @_;
        if($data =~ /^(Company_Name|First_Name)$/) {
            $writeDataFlag = 1;
        }
    }

    # The element is complete here, so print the accumulated text and reset.
    sub end_handler {
        my ($p, $data) = @_;
        if ($writeDataFlag) {
            print "Data - [$text] \n";
            $text = '';
            $writeDataFlag = 0;
        }
    }
      Great, davorg.
      It worked without issues. Thank you so much.
      Sivakumar
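The accumulate-in-Char, emit-in-End approach from davorg's reply can also carry the element name along, which gives the "Company_Name ---> [...]" form the question asked for. What follows is only a sketch under a few assumptions: `extract_fields` and `%wanted` are names made up for this illustration, the handlers are passed through the `Handlers` option of `XML::Parser->new` instead of `setHandlers`, and the wanted elements are assumed not to be nested inside one another.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper (not from the thread): returns a list of
# [element_name, text] pairs for the elements named in @names.
sub extract_fields {
    my ($xml, @names) = @_;
    my %wanted = map { $_ => 1 } @names;
    my ($current, $text, @found);

    my $parser = XML::Parser->new(
        ErrorContext => 2,
        Handlers     => {
            # Remember which wanted element we are inside and reset the buffer.
            Start => sub {
                my ($p, $elem) = @_;
                if ($wanted{$elem}) {
                    $current = $elem;
                    $text    = '';
                }
            },
            # One run of text may arrive in several pieces, so only append.
            Char => sub {
                my ($p, $data) = @_;
                $text .= $data if defined $current;
            },
            # The element is complete: store the joined text.
            End => sub {
                my ($p, $elem) = @_;
                if (defined $current && $elem eq $current) {
                    push @found, [$current, $text];
                    undef $current;
                }
            },
        },
    );
    $parser->parse($xml);
    return @found;
}

my $xmlStr = '<data><Company_Name>XYZ CMPNY</Company_Name>'
           . '<ArgId>775340</ArgId>'
           . '<First_Name>Carol &amp; Jerry</First_Name></data>';

printf "%s ---> [%s]\n", @$_
    for extract_fields($xmlStr, qw(Company_Name First_Name));
```

Keeping the state in lexical variables inside the helper, rather than in file-level globals, means the same function can be called on several documents without one run leaking into the next.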
Re: A Small XML::Parser issue
by fmerges (Chaplain) on Feb 16, 2007 at 12:27 UTC
Re: A Small XML::Parser issue
by Jenda (Abbot) on Feb 16, 2007 at 19:35 UTC

    I know I'm getting annoying

    use XML::Rules;

    my $parser = XML::Rules->new(
        rules => [
            _default => '',
            'Company_Name,First_Name' => sub {
                print "$_[0] ---> [$_[1]->{_content}]\n";
                return;
            },
        ]
    );

    my $xmlStr = "<data>
    <Company_Name>XYZ CMPNY</Company_Name>
    <ArgId>775340</ArgId>
    <First_Name>Carol &amp; Jerry</First_Name>
    </data>";

    $parser->parse($xmlStr);