in reply to XML::LibXML out of memory
#! /usr/bin/perl
use warnings;
use strict;

use XML::LibXML::Reader;

print "Importing...\n";

my $file = 'my.xml';
my $reader = 'XML::LibXML::Reader'->new(location => $file) or die;
my $entry_pattern = 'XML::LibXML::Pattern'->new('/martif/text/body/termEntry');

while ($reader->nextPatternMatch($entry_pattern)) {
    my $termEntry = $reader->copyCurrentNode(1);
    for my $lang_set ($termEntry->findnodes('langSet')) {
        my $language = $lang_set->getAttribute('xml:lang');
        for my $term_grp ($lang_set->findnodes('./tig')){
            my $term = $term_grp->findvalue('./term');
            print "$language: $term\n";
        }
    }
}

print "Done!\n";
Tested with the following input:
<martif>
  <text>
    <body>
      <termEntry>
        <langSet xml:lang="en">
          <tig><term>English</term></tig>
          <tig><term>Saesneg</term></tig>
        </langSet>
        <langSet xml:lang="cs">
          <tig><term>Czech</term></tig>
          <tig><term>Tsieceg</term></tig>
        </langSet>
        <langSet xml:lang="de">
          <tig><term>German</term></tig>
          <tig><term>Almaeneg</term></tig>
        </langSet>
      </termEntry>
    </body>
  </text>
</martif>
XML::LibXML::Reader is a pull parser, so it doesn't need to load the whole file into memory. While walking the document, you can ask it to inflate just the current node into a full DOM object, which is what copyCurrentNode(1) does.
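To see what the argument to copyCurrentNode changes, here is a small sketch that compares a shallow copy with a deep one. The two-entry sample string and the variable names are made up for illustration, and the nodeType guard is a precaution I added in case the pattern also fires on closing tags; I haven't verified that it does.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML::Reader qw(:types);

# Hypothetical in-memory sample, so the sketch runs without a file.
my $xml = '<martif><text><body>'
    . '<termEntry><langSet xml:lang="en"><tig><term>English</term></tig></langSet></termEntry>'
    . '<termEntry><langSet xml:lang="cs"><tig><term>Czech</term></tig></langSet></termEntry>'
    . '</body></text></martif>';

my $reader  = 'XML::LibXML::Reader'->new(string => $xml) or die "Cannot create reader";
my $pattern = 'XML::LibXML::Pattern'->new('/martif/text/body/termEntry');

my (@terms, @shallow_children);
while ($reader->nextPatternMatch($pattern)) {
    # Precaution: only handle element start events.
    next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;

    # Without the argument, only the element itself is copied.
    my $shallow = $reader->copyCurrentNode;
    push @shallow_children, $shallow->hasChildNodes ? 1 : 0;

    # With a true argument, the whole subtree of this one entry
    # becomes a DOM fragment that XPath can query.
    my $deep = $reader->copyCurrentNode(1);
    for my $lang_set ($deep->findnodes('langSet')) {
        my $language = $lang_set->getAttribute('xml:lang');
        push @terms, "$language: " . $_->findvalue('./term')
            for $lang_set->findnodes('./tig');
    }
}

print "$_\n" for @terms;
print "Shallow copies with children: ", scalar(grep { $_ } @shallow_children), "\n";
```

Only one termEntry subtree is turned into a DOM at a time, so memory use should follow the size of the largest entry rather than the size of the file.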
($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re^2: XML::LibXML out of memory
by Anonymous Monk on Mar 24, 2022 at 11:52 UTC
by choroba (Cardinal) on Mar 24, 2022 at 12:01 UTC
by Anonymous Monk on Mar 24, 2022 at 13:49 UTC