XML::Simple help

Viki@Stag has asked for the wisdom of the Perl Monks concerning the following question:


Re: XML::Simple help
by andreas1234567 (Vicar) on Oct 18, 2007 at 09:29 UTC

Perl-XML Frequently Asked Questions

How to choose a parser module

If speed is critical, you'll find that XML::LibXML is much faster but a bit more 'bleeding edge'.

--
Andreas


Re: XML::Simple help
by j1n3l0 (Friar) on Oct 18, 2007 at 08:58 UTC

XML-Records

XML-Twig


Smoothie, smoothie, one hundred percent natural!


Re: XML::Simple help
by DrHyde (Prior) on Oct 18, 2007 at 10:25 UTC

I'll hazard a guess that XML::Simple is reading the whole file, parsing it, creating a data structure and only then returning. That's always going to take time, and for a large document it'll produce a *huge* data structure which might even make your machine swap - hence the sloooooowness.

You might want to look at a streaming parser instead, which reads the file a bit at a time, generating a series of events for you to handle. It'll be a bit more work but will save an awful lot of memory. Even if it's not any faster overall, it'll at least start giving you data sooner, so it will *appear* to be faster!
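To make the event-driven idea concrete, here is a minimal sketch using XML::Parser, one of several streaming-style parsers on CPAN. The sample document and the element names (library, book, title) are made up for illustration; for a large file you would call parsefile() with its path instead of parse() on a string.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample data, standing in for a large file.
my $xml = <<'XML';
<library>
  <book id="1"><title>Programming Perl</title></book>
  <book id="2"><title>Perl Cookbook</title></book>
</library>
XML

my @titles;          # results collected as the events arrive
my $in_title = 0;    # true while we are inside a <title> element
my $buffer   = '';   # character data may arrive in several chunks

my $parser = XML::Parser->new(
    Handlers => {
        # Called for every opening tag.
        Start => sub {
            my ($expat, $element, %attrs) = @_;
            if ($element eq 'title') {
                $in_title = 1;
                $buffer   = '';
            }
        },
        # Called for text content; only keep it inside <title>.
        Char => sub {
            my ($expat, $text) = @_;
            $buffer .= $text if $in_title;
        },
        # Called for every closing tag.
        End => sub {
            my ($expat, $element) = @_;
            if ($element eq 'title') {
                push @titles, $buffer;
                $in_title = 0;
            }
        },
    },
);

$parser->parse($xml);
print "$_\n" for @titles;
```

Nothing but the current title is held in memory at any point, which is the whole appeal. The price is that you keep track of state (here $in_title and $buffer) yourself.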


Re: XML::Simple help
by Jenda (Abbot) on Oct 18, 2007 at 12:52 UTC

It's hard to give you advice if we do not know what you plan to do with the data from the XML and/or how much of the data you even plan to use!

Apart from the modules others have already suggested, you might try XML::Rules. (Yeah, it's mine; if I don't advertise it, no one will.) It'll allow you to filter the XML as it's being parsed, so you end up with only the stuff you are interested in instead of a huuuuge, deep tree containing mostly stuff you have no use for, which only occupies memory and may even force your computer to start swapping.

You can think of XML::Rules as XML::Simple on steroids. In XML::Simple you can say that you want these tags to be represented as arrays even if there is just one, and to use an attribute as the hash key, but that's about it. XML::Rules will allow you to specify that for this tag you want just the content, for that one just this attribute, that you only want the data in this tag if the attribute foo's value is 'bar', etc. etc. etc.
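For reference, the two XML::Simple knobs mentioned above are the ForceArray and KeyAttr options. The sketch below covers only the XML::Simple side of the comparison; the config/server/port document is invented for the example.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical sample data with a single <server> element.
my $xml = <<'XML';
<config>
  <server name="alpha"><port>8080</port></server>
</config>
XML

my $data = XMLin(
    $xml,
    # Always represent these tags as arrays, even if there is just one.
    ForceArray => [ 'server', 'port' ],
    # Fold the list of <server> elements into a hash keyed on "name".
    KeyAttr    => { server => 'name' },
);

# The lone server is reachable by name, and port is still an array.
print $data->{server}{alpha}{port}[0], "\n";
```

Note that server is listed in ForceArray as well: the KeyAttr folding only applies to arrays, so a lone element that was not forced into one would be left as a plain hash.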

Jenda
Support Denmark!
Defend the free world!


Re: XML::Simple help
by Krambambuli (Curate) on Oct 18, 2007 at 10:10 UTC

...Is there a way to pass only part of XML files to parse using this module

SYNOPSIS
           use XML::Simple;

           my $ref = XMLin([<xml file or string>] [, <options>]);

Re^2: XML::Simple help
by Anonymous Monk on Oct 18, 2007 at 12:02 UTC

Sounds like a good idea. If not, XML::Twig can do just that: see the twig_roots mode.
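A rough sketch of what twig_roots looks like in practice, assuming a made-up catalog document where only the item elements matter. Everything outside the listed roots (the meta element here) is skipped and never built into a tree.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data; use parsefile() for a real file.
my $xml = <<'XML';
<catalog>
  <meta><generated>2007-10-18</generated></meta>
  <item id="a1"><name>Widget</name><price>9.99</price></item>
  <item id="b2"><name>Gadget</name><price>19.50</price></item>
</catalog>
XML

my %price_for;    # item id => price, filled in as items are parsed

my $twig = XML::Twig->new(
    # Only build twigs for <item>; the rest of the document is ignored.
    twig_roots => {
        item => sub {
            my ($t, $item) = @_;
            $price_for{ $item->att('id') } = $item->first_child_text('price');
            $t->purge;    # release the memory used by this item
        },
    },
);

$twig->parse($xml);
print "$_: $price_for{$_}\n" for sort keys %price_for;
```

The handler runs once per item as soon as its closing tag is seen, and the purge call keeps memory use flat no matter how many items the file contains.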


Re: XML::Simple help
by toolic (Bishop) on Oct 18, 2007 at 21:01 UTC

http://www.perl.com/cookbook/perlckbk2/solution.csp?day=2

It comes from Perl Cookbook, section: 22.8. Processing Files Larger Than Available Memory.

There is a small example of how to use XML::Twig to read in only portions of a large XML file.

Note: the URL above will point to something different tomorrow.
