Ah I see, the bottleneck is in Net::S3::Amazon, which appears to be using an XPath approach to get at the needed information. Looks like list_all calls list_bucket_all, which calls list_bucket, which does the XPath dirty work.

If I were in your shoes, I might try to write an alternative sub to list_bucket which uses an approach other than XPath. If you look in:

http://search.cpan.org/src/PAJAS/XML-LibXML-1.65/lib/XML/LibXML/XPathC +ontext.pm

sub find calls new for each node it needs to find:

sub find { my ($self, $xpath, $node) = @_; my ($type, @params) = $self->_guarded_find_call('_find', $xpath, $ +node); if ($type) { return $type->new(@params); } return undef; }

This is where the OO interface of XML::LibXML::XPathContext is your bottleneck. You could probably develop a faster interface using a streaming parser, but how much faster I don't know. You'll need some sort of optimization in there to get a faster result. Sorry I can't be of much more help.

UPDATE - you could also use some of the modules lower level functions, and attempt to parallelize the operation by having each cpu count x percent of the buckets. That's probably easier than speeding up the xml parser, and I think it is probably your best bet to get a two or more times speedup. You could fork off a process that writes the results to a temp file, and then add up all the results at the end. I think that may be your shortest course to victory.


In reply to Re^3: Module uses loads of CPU.. or is it me by redhotpenguin
in thread Module uses loads of CPU.. or is it me by hsinclai

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.