Any real speed gain here would have to come from avoiding a parse of the entire XML document, and I'm not sure you can avoid that.

Since parsing has to happen either way, you can trade speed against memory after that point, but you still pay the up-front cost of the parse.

This is where something like XML::Bare would help you out, as it parses at least an order of magnitude faster than XML::Simple.

I have gone down the road of looking for clever ways to process very large XML files quickly, but I have generally settled on a file management solution instead.

Simply breaking your XML documents into smaller logical pieces will give you more speed gains than a strictly XML-parsing approach. Note, however, that I did this in a case where I needed to process every record in the XML and just wanted faster parsing.

This may mean something as simple as keeping books with titles beginning with a certain letter in individual files. This is not really applicable to your case if you want to be able to filter by any field.

My real advice would be to look at a database solution instead. This is really the only way to return matching records with reliable speed. If you are more comfortable with text files, at least to start, you can look at DBD::CSV as a way to get your foot in the DBI door.
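
To give a feel for the DBD::CSV route, here is a minimal sketch. The books.csv file, its column names, and the sample rows are made up for illustration, and it assumes DBI and DBD::CSV are installed. The file name (minus the extension) serves as the table name, and the first line of the file supplies the column names.

```perl
use strict;
use warnings;
use DBI;
use File::Temp qw(tempdir);

# Throwaway directory holding one CSV file (hypothetical sample data).
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/books.csv" or die "Cannot write books.csv: $!";
print {$fh} "title,author,published\n",
            "Dune,Frank Herbert,1965\n",
            "Emma,Jane Austen,1815\n",
            "Persuasion,Jane Austen,1817\n";
close $fh or die "Cannot close books.csv: $!";

# f_dir is where the CSV files live; f_ext maps table "books" to books.csv.
my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
    f_dir      => $dir,
    f_ext      => '.csv/r',
    RaiseError => 1,
});

# Filter on any field with ordinary SQL and a bind value.
my $rows = $dbh->selectall_arrayref(
    'SELECT title, published FROM books WHERE author = ?',
    undef, 'Jane Austen',
);
print "$_->[0] ($_->[1])\n" for @$rows;

$dbh->disconnect;
```

Since everything goes through DBI, the same query code should carry over with little change if you later switch the connect string to a real database driver.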

If you go with XML::Bare, be aware that in my experience the ForceArray parameter does not produce the expected results. I wrote my own workaround that post-processes the data structure into what XML::Simple with ForceArray would produce. I can pass it along to you if you go this route.
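
A workaround of this kind might look roughly like the sketch below. This is not the exact code mentioned above, just an illustration of the idea: walk the parsed tree and wrap the named elements in array references whenever the parser returned a single hash. The force_array name and the hand-built tree are hypothetical, and the tree only assumes the usual nested-hash layout with text under a value key.

```perl
use strict;
use warnings;

# Recursively walk a parsed tree and wrap the value of every key listed in
# %$force in an array reference, unless it already is one.
sub force_array {
    my ($node, $force) = @_;
    if (ref $node eq 'ARRAY') {
        force_array($_, $force) for @$node;
    }
    elsif (ref $node eq 'HASH') {
        for my $key (keys %$node) {
            my $child = $node->{$key};
            if ($force->{$key} && ref $child ne 'ARRAY') {
                # Single element: store it as a one-item list instead.
                $child = $node->{$key} = [$child];
            }
            force_array($child, $force);
        }
    }
    return $node;
}

# Hand-built stand-in for a parsed document containing a single <book>.
my $tree = { library => { book => { title => { value => 'Dune' } } } };
force_array($tree, { book => 1 });

# The single book is now reachable the same way as a list of books.
print scalar @{ $tree->{library}{book} }, "\n";    # prints 1
```

With that in place, code that loops over @{ $tree->{library}{book} } works whether the file holds one book or many.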


In reply to Re^5: How do I create an array of hashes from an input text file? by Kc12349
in thread How do I create an array of hashes from an input text file? by MrSnrub
