:-))

If you are sure that each <KVPair> contains both <Key> and <Value> and always sits inside a <SigData>, you can use something as simple as this:

use XML::Rules;

my (@keys, @values);

my $parser = XML::Rules->new(
    rules => {
        _default => '',
        Key => sub {push @keys, $_[1]->{_content}},
        Value => sub {push @values, $_[1]->{_content}},
    },
);
$parser->parse(\*DATA);

use Data::Dumper;
print Dumper(\@keys);
print Dumper(\@values);

__DATA__
<root>
<SigData>
<KVPair>
<Key>eb08f9990ae6545f9ea625412c71f24f7bf007ed</Key>
<Value>c73df5228c35c419f884ba9571310cd7</Value>
</KVPair>
<bogus>sdf sdhf nsdfg sdfgh nserg sfgdfgh</bogus>
</SigData>
<SigData>
<KVPair>
<Key>EB08F9990AE6545F9EA625412C71F24F7BF007ED</Key>
<Value>C73DF5228C35C419F884BA9571310CD7</Value>
</KVPair>
</SigData>
</root>

If there is more in the XML, you may skip some tags and their children by adding

  start_rules => {
    'the,list,of,such,tags' => 'skip'
  },

into the XML::Rules constructor.

If you do not want to use the globals, you may do something like:

my $parser = XML::Rules->new(
    stripspaces => 3,
    rules => {
        _default => '',
        Key => 'content',
        Value => 'content',
        KVPair => 'pass',
        SigData => sub {return '@keys' => $_[1]->{Key}, '@values' => $_[1]->{Value}},
        root => 'pass',
    },
);
my $data = $parser->parse(\*DATA);

use Data::Dumper;
print Dumper($data);

(This assumes there is exactly one <KVPair> in each <SigData>! You'd have to add a test if it were optional.)

Actually, are you sure you want to build two interrelated arrays? Wouldn't it make more sense to create a single hash? Or maybe process each pair as soon as you read it instead of keeping them all in memory?

The first would be

my $parser = XML::Rules->new(
    stripspaces => 3,
    rules => {
        _default => '',
        Key => 'content',
        Value => 'content',
        KVPair => sub {return $_[1]->{Key} => $_[1]->{Value}},
        SigData => 'pass',
        root => 'pass',
    },
);
my $data = $parser->parse(\*DATA);

The other just means that you change the anonymous subroutine specified in the rule for <KVPair> or <SigData> so that it does the processing and returns nothing. That way you only need memory proportional to the size of an individual key and value.
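
A minimal sketch of that second approach, in case it helps. The process_pair() subroutine and the $processed counter are hypothetical stand-ins for whatever you actually want to do with each pair, and the XML is passed as a string here only to keep the example self-contained:

```perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical stand-in for the real per-pair work; replace with your own.
my $processed = 0;
sub process_pair {
    my ($key, $value) = @_;
    print "$key => $value\n";
    $processed++;
}

my $parser = XML::Rules->new(
    stripspaces => 3,
    rules => {
        _default => '',        # ignore everything not listed below
        Key      => 'content', # hand the text up to <KVPair> as Key => ...
        Value    => 'content', # hand the text up to <KVPair> as Value => ...
        KVPair   => sub {
            my ($tag, $attrs) = @_;
            process_pair($attrs->{Key}, $attrs->{Value});
            return;            # return nothing, so the pair is not kept
        },
    },
);

my $xml = <<'XML';
<root>
<SigData>
<KVPair>
<Key>eb08f9990ae6545f9ea625412c71f24f7bf007ed</Key>
<Value>c73df5228c35c419f884ba9571310cd7</Value>
</KVPair>
<bogus>sdf sdhf nsdfg sdfgh nserg sfgdfgh</bogus>
</SigData>
<SigData>
<KVPair>
<Key>EB08F9990AE6545F9EA625412C71F24F7BF007ED</Key>
<Value>C73DF5228C35C419F884BA9571310CD7</Value>
</KVPair>
</SigData>
</root>
XML

$parser->parse($xml);
print "processed $processed pairs\n";
```

Since the <KVPair> rule returns an empty list, nothing is passed up to <SigData> or <root>, so they can stay under the _default rule.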

Jenda
Support Denmark!
Defend the free world!

In reply to Re^4: XML processing taking too much time by Jenda
in thread XML processing taking too much time by koti688
