LPC2010 has asked for the wisdom of the Perl Monks concerning the following question:

Hi All- I have a master XML document that I am reading with Perl. I also have a list of XPaths for nodes that should be deleted from the document. I have seen XML::Twig, but am not sure how to use it. Basically I have an XML doc in a scalar, and an array of XPaths. I would like to write a new XML doc with everything matching those XPaths deleted. Is this a very hard thing to accomplish? I was hoping Twig has a prebuilt function for this.

Replies are listed 'Best First'.
Re: XML Manipulation
by mirod (Canon) on Nov 19, 2010 at 20:03 UTC

    How complex are those XPath expressions? If they are simple and do not require backtracking, then XML::Twig is ideal for it: set the twig_roots option to a hash of <xpath> => 1 pairs, then set the twig_print_outside_roots option to output the rest of the XML. See "Building an XML filter" in the docs for an example of that kind of processing, and "twig_handlers" to see the subset of XPath that you can use in this mode.
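
    A minimal sketch of that filter setup, assuming the XML is already in a scalar. The sample document, the two paths, and the in-memory output handle are made up for illustration; only paths within the twig_handlers subset will work here.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data standing in for the master document and XPath list.
my $xml    = '<doc><keep>a</keep><secret>b</secret><item><note>c</note><name>d</name></item></doc>';
my @xpaths = ('/doc/secret', 'item/note');

# Collect the filtered output in a scalar via an in-memory filehandle.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";

my $twig = XML::Twig->new(
    # Every listed path becomes a twig root; roots are built but never printed.
    twig_roots               => { map { $_ => 1 } @xpaths },
    # Everything outside the roots is copied to $fh as it is parsed.
    twig_print_outside_roots => $fh,
);
$twig->parse($xml);
close $fh;

print $out, "\n";
```

    Because the document is streamed rather than fully loaded, this should stay cheap on memory even for large files.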

    Alternatively you can use XML::Twig::XPath, parse the file, then go through your list of XPath expressions, use findnodes to get the hits, and delete them.
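
    That loop might look roughly like the sketch below. The sample document and expressions are hypothetical, and it assumes every expression selects elements and that no hit is nested inside another hit. XML::Twig::XPath also needs XML::XPathEngine or XML::XPath installed.

```perl
use strict;
use warnings;
use XML::Twig::XPath;

# Hypothetical sample data standing in for the master document and XPath list.
my $xml    = '<doc><keep>a</keep><secret>b</secret><item><note>c</note><name>d</name></item></doc>';
my @xpaths = ('//secret', '//item[name="d"]/note');

# Load the whole document so that full XPath can be evaluated against it.
my $twig = XML::Twig::XPath->new;
$twig->parse($xml);

for my $xpath (@xpaths) {
    # Gather the hits first, then remove each one from the tree.
    my @hits = $twig->findnodes($xpath);
    $_->delete for @hits;
}

# Serialize what is left of the document.
my $result = $twig->sprint;
print $result, "\n";
```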

    Beyond XML::Twig, you could try XML::LibXML, which also supports XPath. It is faster and better maintained than XML::XPath. Processing would be the same as above.
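
    For comparison, a sketch of the same find-and-delete loop with XML::LibXML, again on made-up sample data. It assumes the document uses no namespaces; namespaced documents would need prefixes registered through XML::LibXML::XPathContext.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data standing in for the master document and XPath list.
my $xml    = '<doc><keep>a</keep><secret>b</secret><item><note>c</note><name>d</name></item></doc>';
my @xpaths = ('//secret', '//item[name="d"]/note');

# Parse the scalar into a DOM tree.
my $doc = XML::LibXML->load_xml(string => $xml);

for my $xpath (@xpaths) {
    # findnodes returns a list of matching nodes in list context.
    for my $node ($doc->findnodes($xpath)) {
        $node->unbindNode;    # detach the node from the document
    }
}

# Serialize the remaining tree (includes the XML declaration).
my $result = $doc->toString;
print $result;
```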

Re: XML Manipulation
by roboticus (Chancellor) on Nov 19, 2010 at 18:16 UTC

    LPC2010:

    You should really learn how to browse around on CPAN for the interesting words in your question. For example, searching for XML and XPATH quickly turns up XML::XPath, which may be more useful to you than XML::Twig (or may not, I don't do much XML processing here). Spend a little time reading the documentation for some of the XML and XPATH modules on CPAN, and you'll be able to get started.

    ...roboticus