Depending on how you are transforming your XML, you could possibly use the language that was designed to transform XML, namely XSLT. There are plenty of tutorials to be found through Google; I've always liked the w3schools.com tutorials. There is also a module on CPAN (XML::XSLT) that would parse the XML file and the XSLT file.
It's not a Perl solution, so if you don't know it you may need to study it a little bit, but it's worth a look.
-Bryan
| [reply] |
I agree with the XSLT point, although I myself stick with XML::LibXML and XML::LibXSLT. They are a little more difficult to get started with, but they work the same way as the DOM parsers and XSL processors in most other languages. And they are the quickest around.
Don Shanks
WhitePages.com, Inc.
| [reply] |
Thanks for the advice! I've never actually used the modules I suggested or these, because I haven't had much use for them yet. I've only read about XSLT, but I would love to get into a project where I can use some of the XML information I've been learning.
-Bryan
| [reply] |
Hi Vanquish,
Basically you will want to parse the original file and export it to the XML you want. There are several XML parsers on CPAN, but I personally like XML::Twig.
Perhaps if we saw a sample record of the original XML doc and what you want to convert it to, we would be better able to point you in the right direction.
Jason L. Froebe
Team Sybase member
No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1
| [reply] |
How are you simplifying it? Removing attributes and/or fields? This could be done most efficiently using a stream, provided you do not need to see the whole tree before you prune the XML. That is to say: if, each time you encounter attribute x, you do not want to put it into the new XML, but you do want to put attribute y, then you might use SAX. This way you could include or exclude elements as you come to them.
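To make the streaming idea concrete, here is a minimal sketch. It is written in Python with only the standard library, purely as an illustration; a Perl SAX filter would follow the same event-driven shape. The names `DropAttributeFilter` and `simplify` are made up for this example, and the attribute names x and y are just the placeholders from the paragraph above.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class DropAttributeFilter(XMLFilterBase):
    """Pass every SAX event through unchanged, minus one named attribute.

    Hypothetical helper for illustration only.
    """

    def __init__(self, parent, unwanted):
        super().__init__(parent)
        self.unwanted = unwanted

    def startElement(self, name, attrs):
        # Decide attribute by attribute, as each element arrives.
        kept = {k: v for k, v in attrs.items() if k != self.unwanted}
        super().startElement(name, AttributesImpl(kept))


def simplify(xml_text, unwanted):
    """Return xml_text re-serialized without the unwanted attribute."""
    out = io.StringIO()
    reader = DropAttributeFilter(xml.sax.make_parser(), unwanted)
    # XMLGenerator writes the filtered events back out as XML text.
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.StringIO(xml_text))
    return out.getvalue()


if __name__ == "__main__":
    print(simplify('<root><item x="1" y="2">text</item></root>', "x"))
```

The filter never holds more than the current event, which is why this approach suits large files where each keep-or-drop decision is local.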
However, if you need to analyze the whole tree before you decide how to simplify it, then you should use DOM.
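A case where the whole tree is needed might look like the sketch below: dropping a field only if it is empty in every record, which cannot be known until all records have been read. Again this is Python's standard library used for illustration, and the function name `drop_always_empty_fields` and the record layout (a root element whose children are records, whose children are fields) are assumptions invented for the example.

```python
from xml.dom import minidom


def drop_always_empty_fields(xml_text):
    """Remove fields that have no content in any record.

    Hypothetical helper: assumes root > record > field nesting.
    """
    doc = minidom.parseString(xml_text)
    records = [n for n in doc.documentElement.childNodes
               if n.nodeType == n.ELEMENT_NODE]

    # Pass 1: walk the full tree and note which field names are ever filled.
    # Any child node counts as content, including whitespace-only text.
    used = set()
    for record in records:
        for field in record.childNodes:
            if field.nodeType == field.ELEMENT_NODE and field.hasChildNodes():
                used.add(field.tagName)

    # Pass 2: prune the fields that were never filled anywhere.
    for record in records:
        for field in list(record.childNodes):
            if field.nodeType == field.ELEMENT_NODE and field.tagName not in used:
                record.removeChild(field)

    return doc.documentElement.toxml()


if __name__ == "__main__":
    sample = ("<people><person><name>Ann</name><fax/></person>"
              "<person><name>Bob</name><fax/></person></people>")
    print(drop_always_empty_fields(sample))
```

The two passes are the point: the pruning decision depends on information from the entire document, so the tree has to be in memory first.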
The difference is that SAX parsers are event based and return items as they find them, whereas DOM parsers read the entire document and then build the tree. You then query the tree.
Both of the links (to SAX and DOM) include links to documentation and sample code.
| [reply] |