Vanquish has asked for the wisdom of the Perl Monks concerning the following question:

What is the best module to use for transforming XML into XML? The XML to be parsed is complicated and needs to be transformed into simple XML so that it can be loaded into the database. Can anyone point me to some sample code? Thanks.

Replies are listed 'Best First'.
Re: XML to XML
by mrborisguy (Hermit) on May 18, 2005 at 22:21 UTC

    Depending on how you are transforming your XML, you can possibly use the language that was designed to transform XML, namely XSLT. There are plenty of tutorials on Google; I've always liked the w3schools.com tutorials. There is also a module on CPAN (XML::XSLT) that would parse the XML file and the XSLT file.

    It's not a Perl solution, so if you don't know it you may need to study it a little bit, but it's worth a look.

    -Bryan

      I agree with the XSLT point, although I myself stick with XML::LibXML and XML::LibXSLT. They are a little more difficult to start with, but they work the same way as the DOM parsers and XSL processors in most other languages. And they are the quickest around.
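      For anyone who wants to try this combination, here is a minimal sketch of an XSLT transform driven from Perl with XML::LibXML and XML::LibXSLT. The input document, the element names (orders, order, row) and the helper transform_xml are made up for illustration, since no sample of the real XML has been posted.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Hypothetical "complicated" input: nested orders with attributes.
my $xml = <<'XML';
<orders>
  <order id="1"><customer><name>Ann</name></customer><total currency="USD">10.50</total></order>
  <order id="2"><customer><name>Bob</name></customer><total currency="USD">7.00</total></order>
</orders>
XML

# Hypothetical stylesheet: flatten each order into one <row> of simple elements.
my $xsl = <<'XSL';
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/orders">
    <rows>
      <xsl:for-each select="order">
        <row>
          <id><xsl:value-of select="@id"/></id>
          <name><xsl:value-of select="customer/name"/></name>
          <total><xsl:value-of select="total"/></total>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>
</xsl:stylesheet>
XSL

# Parse both documents, compile the stylesheet, and return the result as text.
sub transform_xml {
    my ($xml_string, $xsl_string) = @_;
    my $source     = XML::LibXML->load_xml(string => $xml_string);
    my $style_doc  = XML::LibXML->load_xml(string => $xsl_string);
    my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);
    my $result     = $stylesheet->transform($source);
    return $stylesheet->output_as_chars($result);
}

print transform_xml($xml, $xsl);
```

      The stylesheet carries all of the transformation logic, so the Perl side stays the same when the mapping changes.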

      Don Shanks
      WhitePages.com, Inc.

        Thanks for the advice! I've never actually used the modules I suggested or these, because I haven't had much use for them yet. I've only read about XSLT, but I would love to get into a project where I can use some of the XML information I've been learning.

        -Bryan

Re: XML to XML
by jfroebe (Parson) on May 18, 2005 at 22:04 UTC

    Hi Vanquish,

    Basically you will want to parse the original file and export it to the XML you want. There are several XML parsers on CPAN, but I personally like XML::Twig.

    Perhaps if we saw a sample record of the original XML doc and what you want to convert it to, we would be better able to point you in the right direction.
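    Until a sample record is posted, here is a rough sketch of what an XML::Twig approach might look like. The element and attribute names (export, record, audit, checksum, row) and the helper simplify are invented for illustration only.

```perl
use strict;
use warnings;
use XML::Twig;

# Simplify a document: drop <audit> subtrees, drop the checksum attribute,
# and rename <record> to <row>. All names here are hypothetical.
sub simplify {
    my ($xml_string) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # $_ is the element that has just been parsed.
            'audit'  => sub { $_->delete },
            # Called once each <record> has been completely parsed.
            'record' => sub {
                my ($t, $rec) = @_;
                $rec->del_att('checksum');
                $rec->set_tag('row');
            },
        },
    );
    $twig->parse($xml_string);
    return $twig->sprint;
}

my $input = '<export><record id="1" checksum="ab12"><name>Ann</name>'
          . '<audit><by>sys</by></audit></record></export>';
print simplify($input), "\n";
```

    Each handler only sees one element at a time, so the same pattern should scale to more rules by adding entries to twig_handlers.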

    Jason L. Froebe

    Team Sybase member

    No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neill, Stargate SG-1

Re: XML to XML
by scmason (Monk) on May 18, 2005 at 22:13 UTC
    How are you simplifying it? Removing attributes and/or fields? This could be done most efficiently using a stream, if you do not need to see the whole tree before you prune the XML. That is to say: if each time you encounter attribute x you do not want to put it into the new XML, but you do want to keep attribute y, then you might use SAX. This way you could include or exclude elements as you come to them.
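    As a minimal sketch of that streaming idea, assuming the XML::SAX and XML::SAX::Writer distributions are installed: a small filter drops attribute x from every element as it goes by and passes everything else through. The package name DropAttribute and the helper strip_attribute_x are hypothetical.

```perl
use strict;
use warnings;

# A SAX filter that removes the attribute "x" from every element it sees.
package DropAttribute;
use base 'XML::SAX::Base';

sub start_element {
    my ($self, $element) = @_;
    # Attribute keys use the "{NamespaceURI}LocalName" form; "{}x" means
    # an attribute named x with no namespace.
    my %attributes = %{ $element->{Attributes} || {} };
    delete $attributes{'{}x'};
    # Forward a copy of the event with the filtered attributes downstream.
    $self->SUPER::start_element({ %$element, Attributes => \%attributes });
}

package main;
use XML::SAX::ParserFactory;
use XML::SAX::Writer;

# Chain: parser -> DropAttribute filter -> writer collecting into a string.
sub strip_attribute_x {
    my ($xml_string) = @_;
    my $output = '';
    my $writer = XML::SAX::Writer->new(Output => \$output);
    my $filter = DropAttribute->new(Handler => $writer);
    my $parser = XML::SAX::ParserFactory->parser(Handler => $filter);
    $parser->parse_string($xml_string);
    return $output;
}

print strip_attribute_x('<a x="1" y="2"><b x="3"/></a>'), "\n";
```

    Nothing is held in memory beyond the current event, which is why this style suits large files.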

    However, if you need to analyze the whole tree before you decide how to simplify it, then you should use DOM.
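    Here is a small sketch of a case where the whole tree is needed first, using XML::LibXML as the DOM parser. The rule (keep only item elements whose group attribute occurs more than once) and the names item, group and keep_repeated_groups are made up to illustrate the point.

```perl
use strict;
use warnings;
use XML::LibXML;

# Keep only <item> elements whose group value appears at least twice.
# This cannot be decided until every item has been seen.
sub keep_repeated_groups {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);

    # First pass over the whole tree: count items per group.
    my %count;
    for my $item ($doc->findnodes('//item')) {
        my $group = $item->getAttribute('group');
        $group = '' unless defined $group;
        $count{$group}++;
    }

    # Second pass: detach items whose group appeared only once.
    for my $item ($doc->findnodes('//item')) {
        my $group = $item->getAttribute('group');
        $group = '' unless defined $group;
        $item->unbindNode if $count{$group} < 2;
    }
    return $doc->toString;
}

my $items = '<items><item group="a">1</item><item group="b">2</item>'
          . '<item group="a">3</item></items>';
print keep_repeated_groups($items);
```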

    The difference is that SAX parsers are event-based and return items as they find them, whereas DOM parsers read the entire document and then build the tree. You then query the tree.

    Both of the links (to SAX and DOM) include links to documentation and sample code.

Re: XML to XML
by ajt (Prior) on May 19, 2005 at 15:36 UTC