nneul has asked for the wisdom of the Perl Monks concerning the following question:
Snippet of the code ($contacts is a hash ref whose values are XML::LibXML::Node objects, one per contact):
    # other code to retrieve the documents into %contacts
    foreach my $id ( keys %$contacts ) {
        my $entry = $contacts->{$id};
        my $xml   = $entry->toStringC14N();
    }
I can retrieve the data in one of two ways:
Retrieve a large number of small XML documents (with, say, 100 nodes each). This is slow, as it takes lots of separate web requests to retrieve the content.
or
Retrieve a small number of large XML documents (with, say, 10k-25k nodes each). This is fast, as almost all of the content can be retrieved in one or two requests.
The only difference between the two loops is that the nodes in %contacts come either from 1-2 XML::LibXML::Document objects or from 100+ documents. Both end up with the same total set of data.
The problem is that when I use the 'large XML document' approach, each toStringC14N() call is very slow (1-2/sec).
If I use the small chunk approach, retrieving the data takes a lot longer, but the processing of the nodes runs in the 100-200/sec range or higher.
Is there anything I can do to speed things up when using the large document retrieval, or do I have to just pick a balance between the two extremes?
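To make the comparison reproducible without the web requests, here is a rough sketch that builds synthetic contacts in memory and times toStringC14N() on nodes taken from many small documents versus one large document. The <feed>/<contact>/<field> layout, the sizes, and the helper names build_contacts and c14n_rate are invented for this test and are not taken from the real data, so the absolute rates will differ from the ones quoted above.

```perl
use strict;
use warnings;
use XML::LibXML;
use Time::HiRes qw(time);

# Build a hash ref of id => contact node, spread over $n_docs documents
# with $per_doc contacts each. The element names are made up for this test.
sub build_contacts {
    my ($n_docs, $per_doc) = @_;
    my %contacts;
    for my $d (1 .. $n_docs) {
        my $xml = '<feed>';
        for my $c (1 .. $per_doc) {
            $xml .= qq{<contact id="$d-$c">}
                  . join('', map { qq{<field n="$_">value $_</field>} } 1 .. 20)
                  . '</contact>';
        }
        $xml .= '</feed>';
        my $doc = XML::LibXML->load_xml(string => $xml);
        # Each node keeps its owner document alive, so $doc may go out of scope.
        for my $node ($doc->findnodes('/feed/contact')) {
            $contacts{ $node->getAttribute('id') } = $node;
        }
    }
    return \%contacts;
}

# Canonicalize up to $limit nodes and return the observed nodes/sec.
sub c14n_rate {
    my ($contacts, $limit) = @_;
    my @ids = sort keys %$contacts;
    splice(@ids, $limit) if @ids > $limit;
    my $start = time;
    for my $id (@ids) {
        my $xml = $contacts->{$id}->toStringC14N();
    }
    my $elapsed = time - $start;
    return $elapsed > 0 ? scalar(@ids) / $elapsed : 0;
}

# Same total number of contacts, split two different ways.
my $small = build_contacts(200, 1);   # 200 documents, 1 contact each
my $large = build_contacts(1, 200);   # 1 document, 200 contacts

printf "many small documents: %.1f nodes/sec\n", c14n_rate($small, 20);
printf "one large document:   %.1f nodes/sec\n", c14n_rate($large, 20);
```

If the second figure comes out much lower than the first, that would suggest the cost of each call depends on the size of the owning document rather than on the size of the contact node itself.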
Re: How to speed up XML::LibXML toStringC14N - when used on large document?
by Jenda (Abbot) on Feb 15, 2010 at 15:45 UTC