lyang has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a huge XML file like this:

<?xml version="1.0"?>
<bioml label="yeast ORFs">
  <organism label="Saccharomyces cerevisiae (yeast)">
  </organism>
</bioml>

Say I have several dozen chromosomes and 40,000 gene nodes, and each node contains a seemingly endless list of features. I tried using XML::DOM and XML::Twig to build one big tree and print it to a file, but I run out of memory before it finishes. Is there a better way to get around this problem? Thanks, lyang

Replies are listed 'Best First'.
Re: how to write a huge XML file without encountering memory limits
by cyocum (Curate) on Jul 20, 2001 at 01:27 UTC
    Have you tried using a stream-based (SAX-style) parser, e.g. one built on expat, instead of one that holds everything in memory?
Re: how to write a huge XML file without encountering memory limits
by princepawn (Parson) on Jul 20, 2001 at 02:10 UTC
    It wouldn't hurt to look at Boulder and AcePerl. These are Lincoln Stein's contributions to Genomics in Perl and I do know that Boulder supports XML output of its data. Whether it does so without memory constraints, I can't say.
Re: how to write a huge XML file without encountering memory limits
by voyager (Friar) on Jul 20, 2001 at 03:08 UTC
    What a few of the other posts are essentially saying is that you need a technique that doesn't build the whole thing into a memory structure prior to printing.

    So if your source is an OS file or a database, you need to write the info out to your XML file as soon as possible, e.g. read a row of organism info from a db, then write the corresponding few lines to your XML file before reading the next row.
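
    As a rough illustration of that read-a-row, write-a-row idea, here is a minimal sketch in core Perl only. The input layout (tab-separated chromosome, gene and feature text, already sorted by chromosome), the `label` attribute on `<gene>`, and the sub names `xml_escape` and `stream_bioml` are all assumptions made for the example, not anything from the original question.

```perl
use strict;
use warnings;

# Escape the five XML-special characters in text and attribute values.
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
        '"' => '&quot;', "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Read one record at a time from $in and print its XML to $out immediately,
# so memory use stays constant no matter how many genes there are.
# Assumed (hypothetical) input: "chromosome<TAB>gene<TAB>feature" per line,
# sorted by chromosome.
sub stream_bioml {
    my ($in, $out) = @_;
    print {$out} qq{<?xml version="1.0"?>\n};
    print {$out} qq{<bioml label="yeast ORFs">\n};
    print {$out} qq{<organism label="Saccharomyces cerevisiae (yeast)">\n};

    my $current;    # label of the chromosome element currently open
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;
        my ($chrom, $gene, $feature) = split /\t/, $line, 3;

        # Close the previous chromosome and open a new one when the label changes.
        if (!defined $current || $chrom ne $current) {
            print {$out} "</chromosome>\n" if defined $current;
            printf {$out} qq{<chromosome label="%s">\n}, xml_escape($chrom);
            $current = $chrom;
        }
        printf {$out} qq{<gene label="%s">%s</gene>\n},
            xml_escape($gene), xml_escape($feature // '');
    }

    print {$out} "</chromosome>\n" if defined $current;
    print {$out} "</organism>\n</bioml>\n";
    return;
}
```

    Called as `stream_bioml(\*STDIN, \*STDOUT)`, this holds only one input line in memory at a time; the same loop would work with rows fetched from a database handle instead of lines from a file.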

Re: how to write a huge XML file without encountering memory limits
by abstracts (Hermit) on Jul 20, 2001 at 01:14 UTC
    Hello,

    You can use the XML::Writer module to accomplish your task.

    Hope this helps,,,

    Aziz,,,

    NAME
           XML::Writer - Perl extension for writing XML documents.
    
    SYNOPSIS
             use XML::Writer;
             use IO;
    
             my $output = new IO::File(">output.xml");
    
             my $writer = new XML::Writer(OUTPUT => $output);
             $writer->startTag("greeting",
                               "class" => "simple");
             $writer->characters("Hello, world!");
             $writer->endTag("greeting");
             $writer->end();
             $output->close();
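
    To connect the synopsis to the bioml layout in the question, here is a hedged sketch of how XML::Writer might be driven one chromosome and one gene at a time. The iterator callbacks (`$next_chromosome`, `$next_gene`), the `label` attribute on `<gene>`, and the sub name `write_bioml` are hypothetical and stand in for whatever actually supplies the data; only the XML::Writer calls themselves (`new`, `xmlDecl`, `startTag`, `dataElement`, `endTag`, `end`) come from the module.

```perl
use strict;
use warnings;
use XML::Writer;

# Write the document to filehandle $out, pulling data lazily.
# $next_chromosome (hypothetical) returns ($label, $next_gene) per call and an
# empty list when exhausted; $next_gene returns ($gene_label, $text) likewise.
sub write_bioml {
    my ($out, $next_chromosome) = @_;

    # DATA_MODE/DATA_INDENT only affect whitespace between elements.
    my $writer = XML::Writer->new(
        OUTPUT      => $out,
        DATA_MODE   => 1,
        DATA_INDENT => 2,
    );

    $writer->xmlDecl();
    $writer->startTag('bioml', label => 'yeast ORFs');
    $writer->startTag('organism', label => 'Saccharomyces cerevisiae (yeast)');

    while (my ($label, $next_gene) = $next_chromosome->()) {
        $writer->startTag('chromosome', label => $label);
        # Each gene is written and forgotten; nothing accumulates in memory.
        while (my ($gene, $text) = $next_gene->()) {
            $writer->dataElement('gene', $text, label => $gene);
        }
        $writer->endTag('chromosome');
    }

    $writer->endTag('organism');
    $writer->endTag('bioml');
    $writer->end();
    return;
}
```

    Because the writer emits each tag as it is called, memory use depends on the size of a single gene record rather than on the whole document. The writer also checks tag nesting and escapes attribute and character data, which hand-rolled print statements would have to do themselves.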
    
example
by lyang (Initiate) on Jul 20, 2001 at 01:20 UTC
    <?xml version="1.0"?>
    <bioml label="yeast ORFs">
      <organism label="Saccharomyces cerevisiae (yeast)">
        <chromosome label="1">
          <gene...>.....</gene>
        </chromosome>
        ....
        <chromosome label="n">
          ....
        </chromosome>
      </organism>
    </bioml>