in reply to Re^3: XML processing taking too much time
in thread XML processing taking too much time
:-))
If you are sure each <KVPair> contains both <Key> and <Value> and is always in <SigData> you can use something as simple as this:
    use XML::Rules;

    my (@keys, @values);

    my $parser = XML::Rules->new(
        rules => {
            _default => '',
            Key   => sub {push @keys, $_[1]->{_content}},
            Value => sub {push @values, $_[1]->{_content}},
        },
    );

    $parser->parse(\*DATA);

    use Data::Dumper;
    print Dumper(\@keys);
    print Dumper(\@values);

    __DATA__
    <root>
      <SigData>
        <KVPair>
          <Key>eb08f9990ae6545f9ea625412c71f24f7bf007ed</Key>
          <Value>c73df5228c35c419f884ba9571310cd7</Value>
        </KVPair>
        <bogus>sdf sdhf nsdfg sdfgh nserg sfgdfgh</bogus>
      </SigData>
      <SigData>
        <KVPair>
          <Key>EB08F9990AE6545F9EA625412C71F24F7BF007ED</Key>
          <Value>C73DF5228C35C419F884BA9571310CD7</Value>
        </KVPair>
      </SigData>
    </root>
If there is more in the XML you may skip some tags and their children by adding

    start_rules => { 'the,list,of,such,tags' => 'skip' },

into the XML::Rules constructor.
If you do not want to use the globals, you may do something like:

    my $parser = XML::Rules->new(
        stripspaces => 3,
        rules => {
            _default => '',
            Key     => 'content',
            Value   => 'content',
            KVPair  => 'pass',
            SigData => sub {return '@keys' => $_[1]->{Key}, '@values' => $_[1]->{Value}},
            root    => 'pass',
        },
    );

    my $data = $parser->parse(\*DATA);

    use Data::Dumper;
    print Dumper($data);

(assuming there is exactly one <KVPair> in each <SigData>! You'd have to add a test if it was optional.)
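If <KVPair> can be missing from some <SigData>, the test could look roughly like the sketch below. It is not tested against your real data; the sample XML and the short k1/v1 values are made up for illustration. The rule simply returns nothing when either part of the pair is absent, so the two arrays stay in step.

```perl
use strict;
use warnings;
use XML::Rules;

my $parser = XML::Rules->new(
    stripspaces => 3,
    rules => {
        _default => '',
        Key      => 'content',
        Value    => 'content',
        KVPair   => 'pass',
        SigData  => sub {
            my ($tag, $attrs) = @_;
            # No complete pair in this <SigData>: add nothing to the parent.
            return unless defined $attrs->{Key} and defined $attrs->{Value};
            return '@keys' => $attrs->{Key}, '@values' => $attrs->{Value};
        },
        root => 'pass',
    },
);

# Made-up sample: the middle <SigData> has no <KVPair>.
my $xml = <<'XML';
<root>
  <SigData>
    <KVPair><Key>k1</Key><Value>v1</Value></KVPair>
  </SigData>
  <SigData>
    <bogus>no pair here</bogus>
  </SigData>
  <SigData>
    <KVPair><Key>k3</Key><Value>v3</Value></KVPair>
  </SigData>
</root>
XML

my $data = $parser->parse($xml);
```

With this in place $data->{keys} and $data->{values} should only contain entries from the <SigData> tags that actually had a pair.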
Actually, are you sure you want to build two interrelated arrays? Wouldn't it make more sense to create a single hash? Or maybe process each pair as soon as you read it instead of keeping them all in memory?
The first would be

    my $parser = XML::Rules->new(
        stripspaces => 3,
        rules => {
            _default => '',
            Key     => 'content',
            Value   => 'content',
            KVPair  => sub {return $_[1]->{Key} => $_[1]->{Value}},
            SigData => 'pass',
            root    => 'pass',
        },
    );

    my $data = $parser->parse(\*DATA);

The other just means that you change the anonymous subroutine specified in the rule for <KVPair> or <SigData> to do the processing and to return nothing. That way you only need memory proportional to the size of the individual keys and values.
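The process-as-you-read variant might look something like this. It is only a sketch: handle_pair() is a hypothetical stand-in for whatever you actually need to do with each pair, the $seen counter is there just to show the handler ran, and the sample XML is made up.

```perl
use strict;
use warnings;
use XML::Rules;

my $seen = 0;

# Hypothetical handler: replace the body with the real per-pair work.
sub handle_pair {
    my ($key, $value) = @_;
    print "$key => $value\n";
    $seen++;
}

my $parser = XML::Rules->new(
    stripspaces => 3,
    rules => {
        _default => '',        # ignore every tag not listed here
        Key      => 'content',
        Value    => 'content',
        KVPair   => sub {
            my ($tag, $attrs) = @_;
            handle_pair($attrs->{Key}, $attrs->{Value});
            return;            # pass nothing up, so nothing accumulates
        },
    },
);

# Made-up sample data.
my $xml = <<'XML';
<root>
  <SigData>
    <KVPair><Key>k1</Key><Value>v1</Value></KVPair>
    <bogus>ignored</bogus>
  </SigData>
  <SigData>
    <KVPair><Key>k2</Key><Value>v2</Value></KVPair>
  </SigData>
</root>
XML

$parser->parse($xml);
```

Since the <KVPair> rule returns nothing, the <SigData> and <root> rules have nothing to collect, which is why they can fall through to _default here.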