Re: XSLT processing huge XMLs
by ajt (Prior) on Feb 07, 2005 at 11:44 UTC
|
I've found XML::LibXML and XML::LibXSLT to be very powerful modules that, when combined, are capable of processing large and complex XML files. They are based on the very good GNOME libxml2 and libxslt libraries.
HTH
* EDIT: Extra Links added *
| [reply] |
|
Is there any alternative to Sablotron? It needs to beat the .NET solution, which transforms the XML in a few seconds, and it has to work exclusively on Win32. For small XML files Sablotron is excellent, but for huge ones it is not.
I don't know, maybe something that parses the XML in chunks, transforms those sequentially (with XSL), and adds them to the final XML output of the XSLT. Are there any hints for that approach?
Thanks for your concern, ajt.
| [reply] |
|
XML::LibXSLT will install okay on Win32: you can build all the bits yourself, use the Cygwin version of Perl, or install a pre-compiled binary. See below for more details:
While you are here you may wish to join the Monastery, which will improve your experience of the place by enabling extra features.
Good luck,
| [reply] |
Re: XSLT processing huge XMLs
by dakkar (Hermit) on Feb 07, 2005 at 18:04 UTC
|
You say:
found many probs with the XML parser, which holds to memory ALL the xml
Well, it's supposed to do that... to be able to use XSLT, you need the entire DOM (Document Object Model) in RAM, since XSLT allows random access to every part of the document.
About memory usage: at the very least, every element occupies the space for its name, the name and value of each attribute, and a couple of pointers (to parent and first child, for example). This means that a properly packed DOM can occupy a bit less space than the file it was parsed from. No implementation I know of packs it this tightly, however, usually in order to be faster. You should anyway see memory usage of less than twice the file size.
About the .NET solution: is it using XSLT, or munging the data directly? XSLT is not really an optimizable language, and implementations tend to be rather slow (even in C).
| [reply] |
|
If a given XSLT program does not require access to the whole tree, e.g. because it only uses information in the current node, it is possible to do streaming processing, where only the relevant part of the XML file is kept in memory and output is generated as soon as possible.
I believe that the original XT did this, and that Xalan can do this for simple cases, but I have not looked into this for some time.
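To illustrate the idea of keeping only the relevant part of the file in memory, here is a minimal sketch. It is written in Python with only the standard library, simply to keep it self-contained; it is not XSLT and not what XT or Xalan do internally. The element names (`catalog`, `item`, `price`) and the tab-separated output are assumptions made up for the example, and it assumes the records are direct children of the root element.

```python
import io
import xml.etree.ElementTree as ET


def stream_records(source, record_tag):
    """Yield each <record_tag> element as soon as it has been fully parsed."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            yield elem
            # The consumer is done with this record: drop it from the tree
            # so that memory use does not grow with the size of the file.
            root.clear()


def transform(source, sink, record_tag="item"):
    """Write one output line per record, using only the current node."""
    count = 0
    for rec in stream_records(source, record_tag):
        name = rec.get("name", "")
        price = rec.findtext("price", default="")
        sink.write(f"{name}\t{price}\n")
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample input, standing in for a huge file on disk.
    sample = (b"<catalog>"
              b"<item name='a'><price>1</price></item>"
              b"<item name='b'><price>2</price></item>"
              b"</catalog>")
    out = io.StringIO()
    transform(io.BytesIO(sample), out)
    print(out.getvalue(), end="")
```

The same record-at-a-time pattern should be available in Perl through a pull or event-based parser; XML::Twig is one module built around it.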
| [reply] |
|
...., or munging the data directly?
OK. I think (because of its speed) that it munges the data directly. How can we do that with Perl?
| [reply] |
Re: XSLT processing huge XMLs
by inman (Curate) on Feb 08, 2005 at 09:59 UTC
|
Can you post a sample of the XML that you are working with and a description of the task that you are trying to achieve? It may be the case that you could divide the XML into smaller chunks before transforming with XSLT. | [reply] |
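As a rough sketch of the "divide into smaller chunks" idea, the following Python code (standard library only) cuts a large document into small standalone documents that could each be handed to an XSLT processor in turn. The `item` record tag, the `chunk` wrapper element and the chunk size are hypothetical, and the approach only applies if the stylesheet treats each record independently; the per-chunk results would still have to be stitched together afterwards.

```python
import io
import xml.etree.ElementTree as ET


def split_into_chunks(source, record_tag, chunk_size, wrapper_tag="chunk"):
    """Yield small XML documents, each holding up to chunk_size records."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # keep a handle on the root so it can be emptied
    wrapper = ET.Element(wrapper_tag)
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            wrapper.append(elem)
            if len(wrapper) == chunk_size:
                yield ET.tostring(wrapper, encoding="unicode")
                wrapper = ET.Element(wrapper_tag)
                root.clear()  # forget the records that were already emitted
    if len(wrapper):
        # Emit the last, possibly shorter, chunk.
        yield ET.tostring(wrapper, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical input: five records under a single root element.
    data = b"<root>" + b"".join(b"<item id='%d'/>" % i for i in range(5)) + b"</root>"
    for chunk in split_into_chunks(io.BytesIO(data), "item", 2):
        print(chunk)
```

Each yielded string is a well-formed document in its own right, so memory use per transformation is bounded by the chunk size rather than by the size of the original file.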