I am sticking with XML::LibXML; especially because I need to be validating my input with a local schema prior to the actual parsing.
Then validate the input prior to actual parsing and do not let the choice of validator affect your choice of parser (extractor).
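A minimal sketch of keeping the two steps apart: XML::LibXML::Schema does nothing but validate, and XML::Rules does the extraction afterwards. The inline schema and the `<requests>` sample document are hypothetical stand-ins, since the real admin_server.xsd is not shown in this thread.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::Rules;

# Hypothetical schema standing in for the local admin_server.xsd.
my $xsd = <<'XSD';
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="requests">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="request" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

# Hypothetical input document.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<requests>
  <request>start</request>
  <request>stop</request>
</requests>
XML

# Step 1: validation only. validate() dies if the document is invalid.
my $schema = XML::LibXML::Schema->new( string => $xsd );
my $doc    = XML::LibXML->load_xml( string => $xml );
my $valid  = eval { $schema->validate($doc); 1 };
die "Invalid input: $@" unless $valid;

# Step 2: extraction with a different module, independent of the validator.
my $parser = XML::Rules->new(
    stripspaces => 7,
    rules       => {
        request  => 'content array',
        requests => sub { $_[1]->{request} },
    },
);
my $requests = $parser->parse($xml);
print "$_\n" for @$requests;
```

The input is parsed twice this way, which is the price of decoupling; for large files it may be worth validating from a file instead of a string.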
If you want to do something with the action_request/info_request right away:
use strict;
use XML::Rules;

my $parser = XML::Rules->new(
    stripspaces => 7,
    namespaces  => {
        'http://www.somedomain.tld/market_reg/admin_server/1.0' => '',
        'http://www.w3.org/2001/XMLSchema-instance'             => 'xsi',
    },
    rules => {
        _default                      => 'content',
        instance_information          => 'as is',
        'action_request,info_request' => sub {
            my ($tag, $attr) = @_;
            print $attr->{action}, "\n";
            while ( my ($k, $v) = each %{ $attr->{instance_information} } ) {
                print "  $k: $v\n";
            }
            print "\n";
            return;
        },
    },
);

$parser->parse(\*DATA);

__DATA__
<?xml version="1.0" encoding="UTF-8"?>
<instruction_request xmlns="http://www.somedomain.tld/market_reg/admin_server/1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.somedomain.tld/market_reg/admin_server/1.0 admin_server.xsd">
<request>
...
If you want to extract the data and nothing but the data:
use strict;
use XML::Rules;

my $parser = XML::Rules->new(
    stripspaces => 7,
    namespaces  => {
        'http://www.somedomain.tld/market_reg/admin_server/1.0' => '',
        'http://www.w3.org/2001/XMLSchema-instance'             => 'xsi',
    },
    rules => {
        _default                      => 'content',
        instance_information          => 'pass',
        'action_request,info_request' => 'pass',
        'request'                     => 'as array',
        'instruction_request'         => sub { $_[1]->{request} },
    },
);

my $data = $parser->parse(\*DATA);

use Data::Dumper;
print Dumper($data);

__DATA__
...
If you need to distinguish between action and info requests, replace that rule with:
...
'action_request,info_request' => sub {
    my ($tag, $attr) = @_;
    $attr->{type} = $tag;
    return %{$attr};
},
...
In the first case only the data of one <request> is in memory at any time; in the other two the whole data set ends up in memory, but trimmed down substantially.
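To see the type-tagging rule in context, here is a self-contained sketch. The sample document is hypothetical (the real input is elided above) and the namespaces are left out to keep it short; the element names `host` and `port` are made up for illustration.

```perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical sample input; the structure is assumed, not taken from the real schema.
my $xml = <<'XML';
<instruction_request>
  <request>
    <action_request action="start">
      <instance_information>
        <host>alpha</host>
        <port>8080</port>
      </instance_information>
    </action_request>
  </request>
  <request>
    <info_request action="status">
      <instance_information>
        <host>beta</host>
      </instance_information>
    </info_request>
  </request>
</instruction_request>
XML

my $parser = XML::Rules->new(
    stripspaces => 7,
    rules       => {
        _default             => 'content',
        instance_information => 'pass',
        # Remember which tag the data came from, then merge it into <request>.
        'action_request,info_request' => sub {
            my ($tag, $attr) = @_;
            $attr->{type} = $tag;
            return %{$attr};
        },
        request             => 'as array',
        instruction_request => sub { $_[1]->{request} },
    },
);

my $data = $parser->parse($xml);

# Each element of @$data is one flattened <request> hash.
printf "%s: %s on %s\n", $_->{type}, $_->{action}, $_->{host} for @$data;
```

Each request ends up as one flat hash, so a later loop can branch on `$_->{type}` without caring how deeply the data was nested in the XML.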
Jenda
Enoch was right!
Enjoy the last years of Rome.
In reply to Re: XML::LibXML - parsing question!! by Jenda, in thread XML::LibXML - parsing question!! by MarkovChain