in reply to XML::Twig questions
That's not REALLY your XML, is it? (Did you just make stuff up instead of copying and pasting?) Hint: XML closing tags use "/", not "\". On Windows, the two are interchangeable as path separators. Not so on Unix or in XML.
That said, you have a hierarchical database. It's big. And you want to load, parse, and query it in a subprocess. That sounds like a recipe for slowness.
Instead, I would do the following. First, I'd write the naive XML::Twig implementation: load the whole sucker into RAM and keep it available for queries. Then I'd set that up as a daemon, probably with Net::Server. The subprocess that you're currently using would just connect to the daemon and send its query; the daemon would look the answer up in its in-memory cache and return the value (see Storable for sending data from one process to another, especially if both are on the same machine, which means they should be running the same version of perl). This plan assumes that loading and parsing the XML is what takes the longest. Then I'd see whether the performance was acceptable. If not, plan B. (Though if it's just a swapping problem, add more RAM.)
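A minimal sketch of the core of that plan, assuming a made-up document (the `inventory`/`item` layout, the `id` attribute, and the helper names `build_cache` and `answer_query` are all hypothetical, since the real XML isn't shown). It parses once into a plain hash and uses Storable's `nfreeze`/`thaw` for the reply that would cross the socket. The Net::Server wrapper is left out; its `process_request` method is where `answer_query` would be called.

```perl
use strict;
use warnings;
use XML::Twig;
use Storable qw(nfreeze thaw);

# Hypothetical sample data; the real document's structure is unknown.
my $xml = <<'XML';
<inventory>
  <item id="a1"><name>widget</name><price>3.50</price></item>
  <item id="b2"><name>gadget</name><price>12.00</price></item>
</inventory>
XML

# Parse once at daemon startup and keep a plain hash, keyed by id, in RAM.
sub build_cache {
    my ($xml_text) = @_;
    my %cache;
    my $twig = XML::Twig->new(
        twig_handlers => {
            item => sub {
                my ($t, $elt) = @_;
                $cache{ $elt->att('id') } = {
                    name  => $elt->first_child_text('name'),
                    price => $elt->first_child_text('price'),
                };
                $t->purge;    # the hash now holds what we need
            },
        },
    );
    $twig->parse($xml_text);
    return \%cache;
}

# Per-request work in the daemon: look up the key, serialize the answer.
sub answer_query {
    my ($cache, $id) = @_;
    my $found = exists $cache->{$id} ? 1 : 0;
    return nfreeze( { found => $found, record => $cache->{$id} } );
}

# Client side: thaw the bytes that came back from the daemon.
my $cache = build_cache($xml);
my $reply = thaw( answer_query( $cache, 'b2' ) );
print "$reply->{record}{name} costs $reply->{record}{price}\n";
```

The point of the split is that `build_cache` runs once per daemon lifetime, while `answer_query` is a hash lookup per request, so the short-lived client never pays the parsing cost.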
The next option is to hand the entire piece of work over to a more generic hierarchical database. If no hierarchical database is available, you may be able to use a separate program to parse the XML and load it into a relational database, though I hear that DB2 has a new "pureXML" ability which allows it to shred XML right into the database and give you an SQL interface (other vendors may have something similar, I don't know). This would be more expensive (unless pureXML is available with their Express-C option; I don't know that, either), but it's likely to work fairly quickly, and it probably has a lower RAM requirement than my first option above. The other expensive part is switching your mindset over to an SQL-like method of querying instead of trying to do it all in one process. If this also doesn't give acceptable performance (either relationally or hierarchically), you probably have requirements that are going to be hard to meet with your current hardware setup.
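For the "separate program that loads a relational database" variant, here is a rough sketch. It is not pureXML: it swaps in an in-memory SQLite database via DBI and DBD::SQLite purely so the example runs without a server, and the `item` table and sample XML are hypothetical stand-ins for the real data.

```perl
use strict;
use warnings;
use DBI;
use XML::Twig;

# Hypothetical sample data standing in for the real (unseen) XML.
my $xml = <<'XML';
<inventory>
  <item id="a1"><name>widget</name><price>3.50</price></item>
  <item id="b2"><name>gadget</name><price>12.00</price></item>
</inventory>
XML

# Assumption: in-memory SQLite, chosen only so the sketch is self-contained.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1 } );
$dbh->do('CREATE TABLE item (id TEXT PRIMARY KEY, name TEXT, price REAL)');

my $insert =
  $dbh->prepare('INSERT INTO item (id, name, price) VALUES (?, ?, ?)');

# Loader: one row per <item>, purging parsed elements as we go.
XML::Twig->new(
    twig_handlers => {
        item => sub {
            my ( $t, $elt ) = @_;
            $insert->execute(
                $elt->att('id'),
                $elt->first_child_text('name'),
                $elt->first_child_text('price'),
            );
            $t->purge;
        },
    },
)->parse($xml);

# Query side: the short-lived process issues SQL instead of parsing XML.
my ($name) = $dbh->selectrow_array(
    'SELECT name FROM item WHERE price > ? ORDER BY price DESC',
    undef, 5 );
print "most expensive item over 5: $name\n";
```

In a real setup the loader would run once (or whenever the XML changes) against an on-disk database, and only the query half would live in the subprocess.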