Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I am working on a perl script for the uofm.
It parses XML. I get this error.
Out of memory during request for 4084 bytes, total sbrk() is 536545280 bytes!
This is the code:

    $db->do("begin");
    $del = $db->prepare("delete from master where org = '$we'");
    $del->execute;
    my $we = "hh";
    my $response = XMLin("katrina.xml");
    for my $id (keys %{$response->{offer}}) {
        $db2->execute(
            $we, $id,
            $response->{offer}->{$id}->{city},
            $response->{offer}->{$id}->{state},
            $response->{offer}->{$id}->{zip},
            $response->{offer}->{$id}->{link},
            $response->{offer}->{$id}->{slots_available},
            $response->{offer}->{$id}->{description},
        );
    }
    $db->commit();

The XML file is 42 MB. Is there anything I can do?

Re: Katrina Parseing perl script
by Zaxo (Archbishop) on Sep 10, 2005 at 01:19 UTC

    You're running out of memory because the file contains so much data. The data structure your XML module builds is too big for your machine. The solution is to use an XML parser that can handle stream mode and doesn't attempt to build the whole document.

    XML::Twig would be a good one to try. Its pod has an example of working with a huge file.

    After Compline,
    Zaxo

      Can you give me an example of the code above in XML::Twig?
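A minimal sketch of what that might look like with XML::Twig, not a drop-in replacement. It assumes each record is an `<offer id="...">` element whose fields (city, state, zip, and so on) are child elements; the real layout of katrina.xml is not shown in the thread, so adjust the accessors if the fields are attributes instead. The inline sample document and the `@rows` array are hypothetical stand-ins for the real file and for the `$db2->execute(...)` call.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data; the real input would be katrina.xml.
my $xml = <<'XML';
<offers>
  <offer id="101">
    <city>Baton Rouge</city>
    <state>LA</state>
    <zip>70801</zip>
    <link>http://example.org/101</link>
    <slots_available>4</slots_available>
    <description>Spare room</description>
  </offer>
  <offer id="102">
    <city>Houston</city>
    <state>TX</state>
    <zip>77002</zip>
    <link>http://example.org/102</link>
    <slots_available>2</slots_available>
    <description>Guest house</description>
  </offer>
</offers>
XML

my @fields = qw(city state zip link slots_available description);
my @rows;    # stand-in for the database insert

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once per <offer>, as soon as its closing tag has been parsed.
        offer => sub {
            my ($t, $offer) = @_;
            # In the real script this would be $db2->execute($we, ...).
            push @rows, [
                $offer->att('id'),
                map { $offer->first_child_text($_) } @fields,
            ];
            # Release everything parsed so far so memory use stays flat.
            $t->purge;
        },
    },
);

# For the real file, use $twig->parsefile('katrina.xml') instead.
$twig->parse($xml);

print join('|', @$_), "\n" for @rows;
```

The important part is the `purge` call inside the handler: without it the twig still accumulates the whole document, and the memory problem comes back.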
Re: Katrina Parseing perl script
by pg (Canon) on Sep 10, 2005 at 01:19 UTC

    XML::Parser is a good candidate here.

    All you want is to go through the XML document once and process each group of tags one by one. At any given time, you only need a handful of those tags and their content stored in memory.

    Put your database access code in the handlers.

    With XML::Parser, the parser only does the parsing (which is what you need), but the rest is up to you, including whether you want to store the XML file in a data structure, or how big a portion of it should be held in memory at a given time. As I said, in your case, all you want at a given time is the content of a handful of tags.
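The handler approach described above could be sketched roughly as follows. This assumes the same hypothetical layout as the original script implies (`<offer id="...">` elements with one child element per field); the sample document, `%current`, `$field` and `@rows` are illustrative names, and `@rows` stands in for the database call.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample data; the real input would be katrina.xml.
my $xml = <<'XML';
<offers>
  <offer id="101">
    <city>Baton Rouge</city>
    <state>LA</state>
    <zip>70801</zip>
  </offer>
  <offer id="102">
    <city>Houston</city>
    <state>TX</state>
    <zip>77002</zip>
  </offer>
</offers>
XML

my %wanted = map { $_ => 1 }
    qw(city state zip link slots_available description);

my @rows;       # stand-in for the database insert
my %current;    # the one offer currently being parsed
my $field;      # name of the field element we are inside, if any

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $tag, %attr) = @_;
            if ($tag eq 'offer') {
                %current = (id => $attr{id});
            }
            elsif ($wanted{$tag}) {
                $field = $tag;
                $current{$field} = '';
            }
        },
        # Character data may arrive in several chunks, so append.
        Char => sub {
            my ($expat, $text) = @_;
            $current{$field} .= $text if defined $field;
        },
        End => sub {
            my ($expat, $tag) = @_;
            if ($tag eq 'offer') {
                # In the real script, call $db2->execute(...) here.
                push @rows, {%current};
                %current = ();
            }
            elsif (defined $field && $tag eq $field) {
                undef $field;
            }
        },
    },
);

# For the real file, use $parser->parsefile('katrina.xml') instead.
$parser->parse($xml);

print join('|', @{$_}{qw(id city state zip)}), "\n" for @rows;
```

Only `%current` holds parsed data between handler calls, so memory use depends on the size of one offer rather than the size of the file.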

Re: Katrina Parseing perl script
by raptnor2 (Beadle) on Sep 10, 2005 at 01:57 UTC
    XML parsers come in two flavors: tree and streaming. Streaming parsers make better use of memory. This link has more information on the difference if you're interested.

    Cheers,

    John

Re: Katrina Parseing perl script
by Tanktalus (Canon) on Sep 10, 2005 at 01:35 UTC

    While the other posts have some insight into some potential problems, I'll add my most recent out-of-memory error solutions:

    • ulimit - make sure your memory settings are set as high as you need. If you have a box unto yourself, the ulimit could be set to unlimited - if it is already set there, then this isn't your problem.
    • Perl version. You don't specify what perl version you're using, but we had to undergo an emergency upgrade from 5.8.6 to 5.8.7 which solved some of our out-of-memory conditions. Also, when I first started using XML, we were using 5.6.0 even though 5.8.0 was out. Under 5.6, we got constant memory problems, while 5.8.1 (which we eventually moved to for that prototype) had none, with no other changes to the code.
    Even if these don't help you in this situation, others may find your question via Super Search, and I just want these suggestions kept here beside the other good suggestions. ;-)

Re: Katrina Parseing perl script
by davidrw (Prior) on Sep 10, 2005 at 13:18 UTC
    looks like good advice above on the xml/memory issues .. just a couple quick code comments:
    • Instead of $db->do("begin"); the 'AutoCommit' DBI attribute is typically used:
      my $db = DBI->connect($data_source, $username, $auth, { AutoCommit => 0 } );
    • might as well placeholder the DELETE statement, too (and don't need to prepare):
      $del = $db->do("delete from master where org = ?", {}, $we);
    • You can use a slice of the hashref to make the bind variables easier to read/maintain:
      $db2->execute( $we, $id, @{ $response->{offer}->{$id} }{ qw/ city state zip link slots_available description / }, );
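To see what the hash slice in the last point expands to, here is a small stand-alone sketch that needs no database. The `$response` structure is made-up sample data shaped like what the original script reads from `XMLin`, and `@all_binds` is a hypothetical stand-in that collects the argument lists that would be passed to `$db2->execute`.

```perl
use strict;
use warnings;

my $we = "hh";

# Made-up data in the shape the original script expects.
my $response = {
    offer => {
        101 => {
            city            => 'Baton Rouge',
            state           => 'LA',
            zip             => '70801',
            link            => 'http://example.org/101',
            slots_available => 4,
            description     => 'Spare room',
        },
    },
};

my @fields = qw/ city state zip link slots_available description /;
my @all_binds;

for my $id (sort keys %{ $response->{offer} }) {
    # The slice returns the values for @fields, in that order.
    my @bind = ($we, $id, @{ $response->{offer}->{$id} }{@fields});
    push @all_binds, \@bind;    # real code: $db2->execute(@bind);
}

print join('|', @$_), "\n" for @all_binds;
```

Keeping the column names in one list also means the order of the bind values can be checked against the INSERT statement in a single place.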
Re: Katrina Parseing perl script
by holli (Abbot) on Sep 10, 2005 at 14:37 UTC
    Who or what is Katrina? It's not that I'm curious ;-)


    holli, /regexed monk/
Re: Katrina Parseing perl script
by planetscape (Chancellor) on Sep 11, 2005 at 04:28 UTC