That's not REALLY your XML, is it? (Did you just make stuff up instead of copying and pasting?) Hint: XML uses "/", not "\". On Windows the two are interchangeable as path separators. Not so in Unix or XML.

That said, you have a hierarchical database. It's big. And you want to load, parse, and query it in a subprocess. That sounds like a recipe for slowness.

Instead, I would do the following. First, I would write a naive XML::Twig implementation: load the whole sucker into RAM and keep it available for queries. Then I'd set it up as a daemon, probably with Net::Server. The subprocess that you're currently using would just connect to the daemon and send the query; the daemon would look the answer up in its in-memory copy and return the value. (See Storable for sending data from one process to another, especially if both are on the same machine, which means they should be running the same version of perl.)

This plan assumes that loading and parsing the XML is what takes the longest. Then I'd see if the performance was acceptable. If not, plan B. (Though if it's just a swapping problem, add more RAM.)
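To show the shape of that daemon idea, here is a minimal sketch. It is written in Python with only the standard library, so `xml.etree.ElementTree` and `socketserver` are stand-ins for XML::Twig and Net::Server, not the modules named above. The sample XML, the one-line query protocol, and the names `start_daemon` and `query` are all made up for illustration.

```python
import socket
import socketserver
import threading
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real document would be read from disk once.
SAMPLE_XML = """<config>
  <host name="alpha"><port>8080</port></host>
  <host name="beta"><port>9090</port></host>
</config>"""


class QueryHandler(socketserver.StreamRequestHandler):
    """Answer one query per connection from the already-parsed tree."""

    def handle(self):
        # Assumed protocol: one line in (an ElementTree path), one line out.
        path = self.rfile.readline().decode("utf-8").strip()
        try:
            node = self.server.tree_root.find(path)
        except SyntaxError:
            node = None  # malformed path expression: treat as "not found"
        text = node.text if node is not None and node.text else ""
        self.wfile.write(text.encode("utf-8") + b"\n")


def start_daemon(xml_text, host="127.0.0.1", port=0):
    """Parse the XML once, then serve queries from a background thread."""
    server = socketserver.ThreadingTCPServer((host, port), QueryHandler)
    server.daemon_threads = True
    server.tree_root = ET.fromstring(xml_text)  # the expensive step, done once
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(address, path):
    """What the short-lived subprocess would do: connect, ask, read the reply."""
    with socket.create_connection(address) as conn:
        conn.sendall(path.encode("utf-8") + b"\n")
        with conn.makefile("r", encoding="utf-8") as reply:
            return reply.readline().rstrip("\n")
```

A client would call something like `query(server.server_address, "host[@name='beta']/port")`. The point of the design is that the parse cost is paid once at daemon startup, and each query only pays for a local socket round trip.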

The next option is to hand the entire piece of work over to a more generic hierarchical database. If no hierarchical database is available, you may be able to use a separate program to parse the XML and load it into a relational database. I hear that DB2 has a new "pureXML" ability which allows it to shred XML right into the database and give you an SQL interface (other vendors may have something similar, I don't know). This would be more expensive (unless pureXML is available with their Express-C option; I don't know that, either), but it's likely to work fairly quickly, and probably with a lower RAM requirement than my first option above. The other expensive part is switching your mindset over to an SQL-like method of querying instead of trying to do it all in one process. If this also doesn't give acceptable performance (either relationally or hierarchically), you probably have requirements that are going to be hard to meet on your current hardware setup.
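As a rough illustration of the "separate program that parses the XML and loads a relational database" route, here is a sketch in Python using the standard library's `sqlite3`. This is not DB2 and not pureXML; SQLite is only a stand-in so the example is self-contained. The two-table schema (`node`, `attr`), the sample XML, and the helper names `shred` and `port_for_host` are assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for the real, much larger document.
SAMPLE_XML = """<config>
  <host name="alpha"><port>8080</port></host>
  <host name="beta"><port>9090</port></host>
</config>"""


def shred(xml_text, db):
    """Store every element as a row, and every attribute as a row in attr."""
    db.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
        "tag TEXT, path TEXT, value TEXT)"
    )
    db.execute("CREATE TABLE attr (node INTEGER, name TEXT, value TEXT)")
    db.execute("CREATE INDEX node_path ON node (path)")

    def walk(elem, parent_id, parent_path):
        path = parent_path + "/" + elem.tag
        cur = db.execute(
            "INSERT INTO node (parent, tag, path, value) VALUES (?, ?, ?, ?)",
            (parent_id, elem.tag, path, (elem.text or "").strip()),
        )
        node_id = cur.lastrowid
        db.executemany(
            "INSERT INTO attr VALUES (?, ?, ?)",
            [(node_id, name, value) for name, value in elem.attrib.items()],
        )
        for child in elem:
            walk(child, node_id, path)

    walk(ET.fromstring(xml_text), None, "")
    db.commit()


def port_for_host(db, name):
    """The same lookup as a tree walk, expressed as a SQL join instead."""
    row = db.execute(
        "SELECT port.value FROM node AS port "
        "JOIN attr ON attr.node = port.parent "
        "WHERE port.path = '/config/host/port' "
        "AND attr.name = 'name' AND attr.value = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None
```

The load runs once, ahead of time; after that each query is an indexed SQL lookup, and nothing has to hold the whole tree in RAM. The join in `port_for_host` is also a fair picture of the mindset switch: what was a path through a tree becomes a relationship between rows.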


In reply to Re: XML::Twig questions by Tanktalus
in thread XML::Twig questions by r1_fiend
