XML Manipulation

LPC2010 has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have a master XML document that I am reading with Perl. I also have a list of XPaths that should be deleted from the document. I have seen XML::Twig, but am not sure how to use it. Basically, I have an XML doc in a scalar and an array of XPaths. I would like to write a new XML document with everything matching those XPaths deleted. Is this a very hard thing to accomplish? I was hoping Twig has a prebuilt function for this.

Re: XML Manipulation by mirod (Canon) on Nov 19, 2010 at 20:03 UTC
How complex are those XPath expressions? If they are simple and do not require backtracking, then XML::Twig is ideal for it: set the `twig_roots` option to a hash `<xpath> => 1`, then use the `twig_print_outside_roots` option to output the rest of the XML. See "Building an XML filter" in the docs for an example of that kind of processing, and "twig_handlers" to see the subset of XPath that you can use in this mode. Alternatively, you can use XML::Twig::XPath: parse the file, then go through your list of XPath expressions, use `findnodes` to get the hits, and delete them. Beyond XML::Twig, you could try XML::LibXML, which also supports XPath. It is faster and better maintained than XML::XPath. Processing would be the same as above.
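For illustration, the "parse, `findnodes`, delete" approach might look roughly like the sketch below with XML::LibXML. The sample document, the two XPath expressions, and the helper name `delete_xpaths` are made up for the example and are not from the original question. It also assumes the expressions select element nodes and that the document uses no namespaces (namespaced documents would need an XML::LibXML::XPathContext with registered prefixes).

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data: the XML held in a scalar, the XPaths in an array.
my $xml = <<'XML';
<config>
  <server name="a"><password>secret</password><port>80</port></server>
  <server name="b"><password>hunter2</password><port>8080</port></server>
  <debug>1</debug>
</config>
XML

my @xpaths = ('/config/server/password', '/config/debug');

# Hypothetical helper: parse the string, detach every node matched by
# each XPath expression, and return the remaining document as a string.
sub delete_xpaths {
    my ($xml_string, @exprs) = @_;

    my $doc = XML::LibXML->load_xml(string => $xml_string);

    for my $expr (@exprs) {
        # findnodes is evaluated against the document as it stands now,
        # so nodes detached by an earlier expression are not matched again.
        for my $node ($doc->findnodes($expr)) {
            $node->unbindNode;    # unlink the node from its parent
        }
    }

    return $doc->toString;
}

print delete_xpaths($xml, @xpaths);
```

This loads the whole document into memory, which is fine for small to medium files. For very large inputs, the streaming `twig_roots` / `twig_print_outside_roots` filter described above is probably the better fit, at the cost of being limited to the XPath subset that XML::Twig handlers accept.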
Re: XML Manipulation by roboticus (Chancellor) on Nov 19, 2010 at 18:16 UTC
LPC2010: You should really learn how to browse around on CPAN for the interesting words in your question. For example, searching for XML and XPATH quickly turns up XML::XPath, which may be more useful to you than XML::Twig (or may not, I don't do much XML processing here). Spend a little time reading the documentation for some of the XML and XPATH modules on CPAN, and you'll be able to get started. ...roboticus