Skeeve has asked for the wisdom of the Perl Monks concerning the following question:

I'd like to have a tool that could prune XML files I need to analyze. xmllint already gives me a nicely indented overview, but I'd like to ignore everything below certain elements. Short example:

If the input is

<A>
  <SD>
    <CH>
      <P>
        <ST/>
        <EN/>
        <MA>
          <V/>
        </MA>
      </P>
    </CH>
  </SD>
</A>

I'd like to call this tool like this:

prune -at=P

and the output should become:

<A>
  <SD>
    <CH>
      <P/>
    </CH>
  </SD>
</A>

I think I will have to write that on my own and now I'd like to know which modules you recommend.
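
The pruning asked for here can be sketched without committing to any of the modules discussed in this thread. What follows is only a minimal illustration in Python (not Perl), using nothing but the standard library's xml.etree.ElementTree. The function name `prune` and its `tag` parameter are hypothetical, chosen to mirror the `prune -at=P` call above. It assumes the whole document fits in memory and that matching on the plain element name is sufficient.

```python
import xml.etree.ElementTree as ET


def prune(xml_text, tag):
    """Return xml_text with everything below each <tag> element removed.

    Hypothetical helper for illustration: it parses the whole document
    into a tree, empties every element named `tag`, and serializes the
    result back to a string.
    """
    root = ET.fromstring(xml_text)
    # Collect the matches first so the tree is not modified while
    # it is being traversed.
    for elem in list(root.iter(tag)):
        # Detach all children of the matching element.
        for child in list(elem):
            elem.remove(child)
        # Drop leading text as well, so the element is written as empty.
        elem.text = None
    return ET.tostring(root, encoding="unicode")


sample = "<A> <SD> <CH> <P> <ST/> <EN/> <MA> <V/> </MA> </P> </CH> </SD> </A>"
print(prune(sample, "P"))
```

Run on the one-line example input, this keeps A, SD, CH and an empty P; ElementTree writes the empty element as `<P />`. The same idea (visit each matching element and discard its children) should carry over to a tree-based or handler-based Perl module.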

Re: xml parse and print
by PodMaster (Abbot) on Jul 18, 2003 at 10:51 UTC
    XML::Twig or XML::TokeParser, whichever you feel more comfortable with ;)

      I seem to be unable to install XML::Twig :-(

      I have to install XML::Parser first and can't figure out what I am doing wrong. I keep receiving:

      gcc: language depend not recognized
      gcc: Expat.c: linker input file unused because linking not done
      I have never installed CPAN modules before. Is there a page that might help me resolve this error?
        From the XML::Parser V2.31 README file:

        This is a Perl extension interface to James Clark's XML parser, expat. It requires at least version 5.004 of perl and it requires that you have release 1.95.0 or greater of expat installed. You can download expat from:

           http://sourceforge.net/projects/expat/

        jeffa

        As PodMaster already pointed out, Google is a good source of information.

        It was actually a problem with Solaris' perl 5.6.1. When I compiled my own perl 5.8.0, everything went flawlessly.

Re: xml parse and print
by chanio (Priest) on Jul 19, 2003 at 02:20 UTC
    I think that you should look for a module that does XSL or XSLT. Those should come with filters that make it easy to select what you want to exclude from the parsing. Besides, everything passes through some sort of XSL filter these days.

    Yes, I know that it is XML but XSL's main function is precisely filtering XML elements.

    I found this...

    ''Internally, XML::Schematron::LibXSLT uses the Gnome Project's XSLT processor via XML::LibXSLT and, while this processor is not 100% compliant with the XSLT spec at the time of this writing, it is the best XSLT library available to the Perl World at the moment. It is therefore possible that you might use a completely valid XSLT expression within one of your schema's tests that will cause this module to die unexpectedly.

    For those platforms on which libxslt is not available, please see the documentation for XML::Schematron::Sablotron and XML::Schematron::XPath (also in this distribution) for alternatives.''

      Thanks for the info.

      Right now I have absolutely NO idea what XSLT is, so it's not useful to me at the moment. I will look into it once I've learned more.