Hello PerlMonks,
I just found perlmonks.org while googling for a solution to a specific problem. I'm not an experienced Perl programmer and English is not my native language, so please have mercy on me...
My problem: I've got large XML files (about 10 MB each) which I'd like to split into smaller chunks and later merge back into big files.
xml_split from the XML::Twig distribution does a good job when used with option -c (condition), but due to the structure of some of my XML files this sometimes gives me thousands of small chunks, which is far too many.
Options -s (chunk size) or -g (group a number of elements per file) seem to be the better choice, but unfortunately all text nodes are lost once xml_split has finished with my data. Only the XML tags are left.
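For reference, these are roughly the calls I'm making (the element name "record" is just a stand-in for the tags in my real data):

    xml_split -c record big.xml   # works, but can produce thousands of chunks
    xml_split -s 1Mb big.xml      # chunk count is fine, but all text is gone
    xml_split -g 100 big.xml      # same problem: only tags survive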
The reason for that seems to be that xml_split does not use XML::Twig at all when options -s or -g are active; instead it drives XML::Parser directly. In that code path, apparently no text (Char) handler and no default handler gets called, so all character data is silently skipped rather than written out. But why?
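To check my understanding of XML::Parser (this is my own little test, not xml_split's actual code), text is only ever delivered if a Char or Default handler is registered; if neither is, the parser just discards it, which would explain what I'm seeing:

    use strict;
    use warnings;
    use XML::Parser;

    # With a Char handler registered, every run of character data
    # is handed to it; with no handler, XML::Parser simply drops it.
    my $p = XML::Parser->new(
        Handlers => {
            Char => sub {
                my ($expat, $text) = @_;
                print "TEXT: $text\n" if $text =~ /\S/;
            },
        },
    );
    $p->parse('<doc><item>hello</item><item>world</item></doc>');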
I can hardly believe that no one else has stumbled upon, let alone solved, this problem before.
So: does anyone know of a solution for this, or of a different XML splitting tool or module I could use? I've been searching for quite a while, but all I've found is xml_split.
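In case it helps to show what I'm after, here is a rough fallback I sketched with XML::Twig itself. The element name "record", the chunk size, and the wrapper root tag are all made up for illustration; they would have to match the real data:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $chunk_size = 500;   # elements per chunk (made-up number)
    my $count      = 0;
    my $file_no    = 0;
    my $out;

    sub open_chunk {
        $file_no++;
        open $out, '>', sprintf('chunk-%03d.xml', $file_no) or die "open: $!";
        print {$out} qq{<?xml version="1.0"?>\n<root>\n};  # placeholder wrapper
    }

    sub close_chunk {
        print {$out} "</root>\n";
        close $out or die "close: $!";
    }

    my $twig = XML::Twig->new(
        twig_handlers => {
            record => sub {                       # stand-in for the real tag
                my ($t, $elt) = @_;
                open_chunk() if $count % $chunk_size == 0;
                print {$out} $elt->sprint, "\n";  # text nodes survive here
                close_chunk() if ++$count % $chunk_size == 0;
                $t->purge;                        # keep memory use flat
            },
        },
    );

    $twig->parsefile('big.xml');
    close_chunk() if $count % $chunk_size;        # last, partially filled chunk

This does keep the text, but of course I'd much rather use a proper, maintained tool than my own ad-hoc splitter.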
Thanks for any help,
donp