zuma53 has asked for the wisdom of the Perl Monks concerning the following question:

Hi--

I have this 3-level HoH that I would like to print using XML-Twig. The hash stores a fair amount of info (about 20MB over 14K rows) and I am able to print it out by using XML-Twig to build the whole tree then printing that tree.

Something like:
foreach key (keys %hash)
    some data manipulation
    build some XML nodes onto tree
    foreach key2 (keys %{$hash{key}})
        some data manipulation
        build some XML nodes onto tree
        foreach key3 (keys %{$hash{key}{key2}})
            parse some prepared XML snippets
            graft onto tree
        end
    end
end
tree->print

This works fine and dandy, but as my hash gets more and more rows, it runs slower and slower. Plus the memory allocated goes through the roof, and lately I've been getting segmentation faults (out of memory is my guess).

Is there a way using XML-Twig to flush(?) the output at the end of a loop step to create the output tree as it is looping through, vs. holding the entire tree in memory before commencing the print?

I've looked around for answers, but most of the Twig tricks are parsing-centric.

Thanks.

Re: XML-Twig: Tree building and printing
by mirod (Canon) on Jun 25, 2012 at 07:01 UTC

    I have never used it this way, but I think you could use flush_up_to at the end of the inner loop to flush the part of the tree before the current element.

Re: XML-Twig: Tree building and printing
by frozenwithjoy (Priest) on Jun 25, 2012 at 05:15 UTC
    Using IO::Handle, you can control how output to a filehandle is flushed. In your situation, you may need to go in and make a small change to the print sub in Twig.pm.

      Using IO::Handle, you can control how output to a filehandle is flushed. In your situation, you may need to go in and make a small change to the print sub in Twig.pm.

      Nonsense. If you don't want to build a giant tree in memory, you don't have to: build one row at a time and print it as soon as it is complete.
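
A minimal sketch of the "print as soon as possible" idea, assuming each top-level key can be emitted as a self-contained element. The element names (rows, row, group), the sample data, and the stream_rows helper are made up for illustration; only XML::Twig::Elt's new, insert_new_elt, parse, paste, sprint and delete come from the module. Each row is built as a standalone element, printed, and released, so only one row's worth of tree should be in memory at a time.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data standing in for the real 3-level hash;
# the innermost values play the role of the prepared XML snippets.
my %hash = (
    row1 => { colA => { x => '<snip>1</snip>' } },
    row2 => { colB => { y => '<snip>2</snip>' } },
);

# Emit one <row> per top-level key and discard it right after printing.
sub stream_rows {
    my ( $data, $fh ) = @_;

    # The wrapper element is written by hand because no whole-document
    # tree is ever built.
    print {$fh} "<rows>\n";

    for my $key ( sort keys %$data ) {
        # Standalone element, not attached to any twig.
        my $row = XML::Twig::Elt->new( row => { id => $key } );

        for my $key2 ( sort keys %{ $data->{$key} } ) {
            my $group = $row->insert_new_elt(
                last_child => group => { name => $key2 } );

            for my $key3 ( sort keys %{ $data->{$key}{$key2} } ) {
                # Parse a prepared snippet and graft it under the group.
                my $snippet =
                    XML::Twig::Elt->parse( $data->{$key}{$key2}{$key3} );
                $snippet->paste( last_child => $group );
            }
        }

        print {$fh} $row->sprint, "\n";    # serialize this row only
        $row->delete;                      # release the subtree
    }

    print {$fh} "</rows>\n";
}

stream_rows( \%hash, \*STDOUT );
```

The trade-off is that anything that needs the whole document at once (global pretty-printing, cross-row edits) is no longer available, since the complete tree never exists.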

        What part is nonsense? When you print to a file, it doesn't automatically write to disk that instant. That would be inefficient. An example: you have a script that prints some log file to keep track of the script's progress. Unless you activate autoflush, you can't be certain that the log file represents the current status of the running script. In fact, sometimes the data destined for the log file might not be written at all until the script is completely finished executing.
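
For what it's worth, the buffering behaviour described above can be seen with a small sketch along these lines. It uses only core modules (IO::Handle, File::Temp); the temporary files stand in for a hypothetical log file, and the variable names are made up. Note that this controls when printed bytes reach the file, not how much tree XML::Twig keeps in memory.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

my $line = "step 1 done\n";

# Handle with default buffering: a small write normally sits in
# Perl's buffer until the buffer fills or the handle is closed.
my ( $buffered_fh, $buffered_path ) = tempfile( UNLINK => 1 );
print {$buffered_fh} $line;
my $buffered_size = -s $buffered_path;

# Handle with autoflush enabled: every print is pushed to the file
# immediately.
my ( $flushed_fh, $flushed_path ) = tempfile( UNLINK => 1 );
$flushed_fh->autoflush(1);
print {$flushed_fh} $line;
my $flushed_size = -s $flushed_path;

print "bytes in file without autoflush: $buffered_size\n";
print "bytes in file with autoflush:    $flushed_size\n";

close $buffered_fh;
close $flushed_fh;
```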