John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

I'm using XML::Twig based on the "Building an XML Filter" example from the docs. It is a good fit, as I only need to modify or remove a small number of nodes, and everything else gets passed through.

However, I need to access some attributes from the root element. If I don't use twig_roots, the whole file has to be loaded, and removing nodes is no longer as trivial as simply not printing them. That rather defeats the main feature of Twig.

Is there a good way to read the attributes from the top-level element, and then configure the twigs (and print outside roots) and callbacks based on what I found? How do I get what was already read to print seamlessly alongside what will then be handled differently?

Another idea is to start processing, then quit after the first start tag. Is there a flag or something that can be set from the callback to "cancel" and return from parse() without reading any more from the file?

—John

2006-05-05 Retitled by GrandFather, as per Monastery guidelines
Original title: 'accessing root tags when XML::Twig and twig_roots'

Re: accessing root tags when using XML::Twig and twig_roots
by mirod (Canon) on May 04, 2006 at 16:53 UTC

    I think that what you are looking for is the start_tag_handlers option, which lets you call a handler as soon as the start tag of an element has been parsed.

    In the handler you can die (provided the call to parse is wrapped in an eval) or do whatever you want, including setting regular handlers.
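
    A minimal sketch of that idea, under some assumptions: the root tag name "doc", its attributes, and the "ROOT_SEEN" marker string are made up for illustration. The handler copies the root attributes and then dies on purpose; the eval around parse catches that, and any other error is re-thrown.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document; the tag and attribute names are invented.
my $xml = '<doc version="2" mode="strict"><item>a</item><item>b</item></doc>';

my %root_atts;
my $probe = XML::Twig->new(
    start_tag_handlers => {
        # Called as soon as the <doc ...> start tag has been parsed:
        # the element has no content yet, but its attributes are set.
        doc => sub {
            my ($twig, $elt) = @_;
            %root_atts = %{ $elt->atts || {} };
            die "ROOT_SEEN\n";    # deliberate abort, caught below
        },
    },
);

# Match only the start of $@, since location text may be appended to it.
eval { $probe->parse($xml); 1 }
    or $@ =~ /^ROOT_SEEN/
    or die $@;

print "$_=$root_atts{$_}\n" for sort keys %root_atts;
```

    With a real file, parsefile would take the place of parse; the rest of the file is never read once the handler dies.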

      If I use a start_tag_handler for the root element, then I can't have "roots" that are a subset of the whole document.

      If I set the "roots" within the first callback, what happens to the printout? At that point the root element is only partly open and nothing has been printed outside of roots yet. Whatever happens, will it be well-behaved?

        Couldn't you do two passes against the file? First create a twig with a start_tag_handlers entry to get whatever it is you need. Then use that information to create a second twig for the actual processing. Would that work, or am I missing something? As mirod mentions, some sample code and data might help clear things up.
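
        The two passes could be sketched roughly like this. Everything specific is an assumption for illustration: the root tag "doc", the elements "drop" and "price", the "currency" attribute, and the helper name root_atts. The second twig is a plain filter with twig_roots and twig_print_outside_roots, so nothing from the first pass needs to be printed.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical input; tag and attribute names are invented.
my $xml = '<doc currency="EUR"><keep>a</keep><drop>b</drop><price>5</price></doc>';

# Pass 1: read the root attributes, then abort with a deliberate die.
# root_atts is a made-up helper, not part of XML::Twig.
sub root_atts {
    my ($source) = @_;
    my $atts = {};
    my $probe = XML::Twig->new(
        start_tag_handlers => {
            doc => sub { $atts = { %{ $_[1]->atts || {} } }; die "ROOT_SEEN\n" },
        },
    );
    eval { $probe->parse($source); 1 } or $@ =~ /^ROOT_SEEN/ or die $@;
    return $atts;
}

my $atts = root_atts($xml);

# Pass 2: an ordinary filter, configured from what pass 1 found.
my $filter = XML::Twig->new(
    twig_print_outside_roots => 1,    # everything outside the roots passes through
    twig_roots               => {
        # Removed node: the handler simply never prints it.
        drop => sub { $_[0]->purge },
        # Modified node: change it, print it, then free the memory.
        price => sub {
            my ($twig, $elt) = @_;
            $elt->set_att(currency => $atts->{currency});
            $elt->print;
            $twig->purge;
        },
    },
);
$filter->parse($xml);
print "\n";
```

        The cost is opening the input twice, but the first pass stops right after the root start tag, so only the second pass reads the whole file.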

        What did you try? I find it easier to get some code to work than to answer questions like these ;--(