PerlMonks  

Re: Re: xml parsers: do I need one?

by inman (Curate)
on Sep 01, 2003 at 09:15 UTC (#288101)


in reply to Re: xml parsers: do I need one?
in thread xml parsers: do I need one?

XSLT is an ideal tool for pre-digesting large pieces of XML so that they can be processed further with any of the Perl modules mentioned above. This technique is useful if the original XML contains a large amount of information that is surplus to requirements.

In the original example, the 9000 tags may contain a large number of child elements, attributes, text, etc. that are not required. The XML can be reduced either to a simpler form of XML that contains only the required data or even to a flat file that can be parsed line by line using standard techniques.
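
As a rough sketch of the flat-file idea, a stylesheet along the following lines would emit one record per line. It assumes that the elements of interest are called <message> (as in the original question) and that they may appear at any depth in the document; adjust the select expression if the real structure differs.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit plain text rather than XML -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Start at the document root and visit every <message>, at any depth -->
  <xsl:template match="/">
    <xsl:for-each select="//message">
      <!-- normalize-space() collapses embedded newlines so each record stays on one line -->
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With an XSLT 1.0 processor such as xsltproc this could be run as `xsltproc messages.xsl input.xml > messages.txt` (the file names here are hypothetical), and the resulting file read line by line in Perl.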

Pre-processing the original XML using an XSLT engine such as Xalan (either directly or via XML::Xalan) is only going to be worthwhile if the source XML is large and contains a high proportion of non-essential information.

Inman

Replies are listed 'Best First'.
Re: Re: Re: xml parsers: do I need one?
by idsfa (Vicar) on Sep 02, 2003 at 19:36 UTC
    {My apologies for not having been logged in earlier when I suggested XSLT}

    As you said: "source XML is large" and contains "non-essential information". His example case was "large" (3.3MB), and he was searching solely for elements of type <message>. Rendering those as HTML takes only a very simple piece of XSLT (fragmentary example, please don't carp about the syntax):

    <ul>
      <xsl:for-each select="message">
        <li><xsl:value-of select="current()" /></li>
      </xsl:for-each>
    </ul>
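
    For anyone who wants to try the HTML fragment, here is one way it might sit inside a complete stylesheet. This is only a sketch: it assumes the <message> elements are direct children of the document element, which the original question does not confirm.

    ```xml
    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="html" indent="yes"/>

      <!-- Match the document element; assumes <message> elements are its direct children -->
      <xsl:template match="/*">
        <ul>
          <xsl:for-each select="message">
            <!-- "." is the current <message>; its text content becomes the list item -->
            <li><xsl:value-of select="."/></li>
          </xsl:for-each>
        </ul>
      </xsl:template>

    </xsl:stylesheet>
    ```

    If the messages are nested more deeply, changing the select to "//message" would pick them up wherever they occur.
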
    XSLT can convert XML directly into XML, HTML, or even Perl:

    @messages = (
      <xsl:for-each select="message">
        "<xsl:value-of select="current()" />",
      </xsl:for-each>
    );

    I wouldn't want to comment further without a better understanding of the actual "processing" to be done, but the simple example presented in the question is practically a textbook case for XSLT.

    Whatever. Use the tools you like.
