in reply to out of memory! parsing large XML file

What does the xml look like?

Try xml_pp -- for pretty-printing a file it should be about as memory-efficient as you can get with XML::Twig (http://www.xmltwig.org/).

Here's my code if it helps, but it's not really doing anything complicated or interesting.

You should call $rows->purge in row_processing as well as flush, so the rows you have already handled are freed from memory; see http://xmltwig.org/xmltwig/tutorial/yapc_twig_s4.html
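In case it helps, here is a minimal sketch of a row handler that purges as it goes. It is not your code: the helper name process_rows, the callback interface, and taking the file name from the command line are all assumptions of mine, and the element and attribute names (row, cell, rownum, columnname) are taken from the sample posted further down the thread. It calls purge on the twig object rather than on $rows, which I am sure is supported.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not from the original post): stream every <row>
# in $file and pass ($rownum, \%cells) to $callback, where %cells maps
# each cell's columnname attribute to its text content.
sub process_rows {
    my ($file, $callback) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called each time a complete <row> element has been parsed.
            row => sub {
                my ($t, $row) = @_;

                my %cells;
                $cells{ $_->att('columnname') } = $_->text
                    for $row->children('cell');

                $callback->($row->att('rownum'), \%cells);

                # Discard everything parsed so far, so memory use does
                # not grow with the number of rows.
                $t->purge;
            },
        },
    );

    $twig->parsefile($file);
    return;
}

# Usage when run as a script: perl rows.pl report.xml
if (@ARGV) {
    process_rows($ARGV[0], sub {
        my ($rownum, $cells) = @_;
        my $page = defined $cells->{page} ? $cells->{page} : '';
        print "row $rownum: $page\n";
    });
}
```

Since nothing is being written back out here, purge alone should be enough; flush only matters if you also want the processed part of the document printed.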

Re^2: out of memory! parsing large XML file ( xml_pp )
by slugger415 (Monk) on Feb 13, 2014 at 18:12 UTC

    Looks like $rows->purge did the trick! Thanks for the tip; I didn't know you could purge at that level.

    Thanks to all for the tips -- I'm basically just reading the contents and attributes of a bunch of row/cell elements. (Yes, I realize Twig is overkill, but I know how to use it.) And since you asked, here's a snippet:

    <?xml version="1.0" encoding="UTF-8"?>
    <xmlreport title="Enterprise Internet" dates="May 1, 2013 - May 31, 2013">
      <columns>
        <column name="Page" type="dimension">Page</column>
        <column name="Visitors" type="metric">Visitors</column>
        <column name="ABC Visitor %" type="metric">ABC Visitor %</column>
        <column name="New Visitors" type="metric">New Visitors</column>
        <column name="ABC Visitors" type="metric">ABC Visitors</column>
        <column name="Visits per Visitor" type="metric">Visits per Visitor</column>
      </columns>
      <rows>
        <row rownum="1">
          <cell columnname="page" csv="&quot;publib.boulder.ABC.com/infocenter/zvm/v6r2/topic/com.ABC.zvm.v620/zvminfoc03.htm&quot;" db="53008">publib.boulder.ABC.com/infocenter/...ic/com.ABC.zvm.v620/zvminfoc03.htm</cell>
          <cell columnname="cm_visitors" db="407">407</cell>
          <cell columnname="cm_ABCvisitor1" csv="&quot;33.7%&quot;" db="33.700000">33.7%</cell>
          <cell columnname="newvisitors" db="80">80</cell>
          <cell columnname="cm_ABCvisitors" db="137">137</cell>
          <cell columnname="cm_visitspervisitor" db="1.958231">2.0</cell>
        </row>
        <row rownum="2">
          <cell columnname="page" csv="&quot;publib.boulder.ABC.com/infocenter/zvm/v6r2/topic/com.ABC.zvm.v620/whatsin.htm&quot;" db="1136334">publib.boulder.ABC.com/infocenter/...topic/com.ABC.zvm.v620/whatsin.htm</cell>
          <cell columnname="cm_visitors" db="2">2</cell>
          <cell columnname="cm_ABCvisitor1" csv="&quot;0.0%&quot;" db="0.000000">0.0%</cell>
          <cell columnname="newvisitors" db="0">0</cell>
          <cell columnname="cm_ABCvisitors" db="0">0</cell>
          <cell columnname="cm_visitspervisitor" db="1.000000">1.0</cell>
        </row>
      </rows>
    </xmlreport>