Sachin has asked for the wisdom of the Perl Monks concerning the following question:

I get an "Out of memory" error when I run this program. Can anybody tell me what the problem is? When I change the value to "maximumRecords=1" in $login_url, it runs.

    use Data::Dumper;
    use XML::Simple;
    use WWW::Mechanize;

    my $login_url = "http://www.jstor.org/action/doXmlSearch?query=dc.title+%3D+%22test%22&version=1.1&operation=searchRetrieve&recordSchema=info%3Asrw%2Fschema%2F1%2Fdc-v1.1&maximumRecords=10&startRecord=1&a.discipline=&recordPacking=xml";

    my $mech = new WWW::Mechanize;
    $mech->get($login_url);
    my $xml = $mech->content;
    $xml =~ s/\<\?xml version="1.0" encoding="UTF-8"\?\>//ig;
    $xml =~ s/\<\?xml-stylesheet type="text\/xsl" href="\/templates\/xsl\/_jstor\/search\/searchRetrieveResponse.xsl"\?\>//ig;

    my $xs = XML::Simple->new();
    my $ref = $xs->XMLin($xml);
    my $records_data = $ref->{'records'}->{'record'}->{'recordData'}->{'srw_dc:dc'};
    foreach my $rd ($records_data){
        my ($title) = $rd->{'dc:title'};
        print "\n " . $title;
    }

Replies are listed 'Best First'.
Re: Show error "Out of Mermory"!!
by biohisham (Priest) on Oct 12, 2009 at 09:44 UTC
    That may well be caused by a buffer overflow. Try disabling output buffering, as in

        use Data::Dumper;
        use XML::Simple;
        use WWW::Mechanize;
        $|++;    # make the currently selected output handle "hot" (unbuffered)

    or save the response of $mech->get($login_url) directly to a file by adding ':content_file' => 'somefile' to the call.

    Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.

      Hi biohisham, thanks for the reply. I also tried saving the XML to a file, but the "Out of memory" problem was not solved, so please tell me any other way to solve it. Could you also tell me what "Pseudo-hashes are deprecated" means and how to fix it? I am new to Perl, so I run into many problems.

      use Data::Dumper;
      use XML::Simple;
      use WWW::Mechanize;

      my $login_url = "http://www.jstor.org/action/doXmlSearch?query=dc.title+%3D+%22test%22&version=1.1&operation=searchRetrieve&recordSchema=info%3Asrw%2Fschema%2F1%2Fdc-v1.1&maximumRecords=10&startRecord=1&a.discipline=&recordPacking=xml";

      my $mech = new WWW::Mechanize;
      $mech->get($login_url);
      my $xml = $mech->content;

      open (OUT, "> jestor_1.xml") || \ &htmlDie ("Error: Could not open output file!\n");
      my @contents = split '\n', $xml;
      my $line;
      foreach $line (@contents) {
          chomp($line);
          next if ( $line =~ /^\s*$/m );
          print OUT "$line\n";
      }
      close (OUT);

      my $xs = XML::Simple->new(ForceArray => 'srw_dc:dc');
      my $ref = $xs->XMLin("jestor_1.xml");
      my $records_data = $ref->{'records'}->{'record'}->{'recordData'}->{'srw_dc:dc'};
      foreach my $rd ($records_data){
          my ($title) = $rd->{'dc:title'};
          print "\n " . $title;
      }

        You've told XML::Simple to "ForceArray" (i.e. to always create arrays, even if there is only one element).  The thing is that ForceArray expects a boolean value, so with ForceArray => 'srw_dc:dc' you've enabled the feature globally (not only for 'srw_dc:dc', as you might have intended).

        That means you have to access the resulting data structure accordingly, i.e. explicitly specify all array indices:

        my $records_data = $ref->{'records'}->[0]->{'record'}->...
        # or shorter
        my $records_data = $ref->{'records'}[0]{'record'}...

        (note the [0] in between the fields 'records' and 'record')
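
        As a minimal sketch of the explicit indexing described above: the structure below is hand-written to imitate what XMLin() might return with ForceArray enabled everywhere. The element names come from this thread, but the nesting and the sample titles are assumptions, since the real JSTOR response is not shown here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the result of XMLin() with ForceArray enabled:
# every element is wrapped in an array reference, even when it occurs once.
my $ref = {
    'records' => [
        {
            'record' => [
                { 'recordData' => [ { 'srw_dc:dc' => [ { 'dc:title' => ['First test title'] } ] } ] },
                { 'recordData' => [ { 'srw_dc:dc' => [ { 'dc:title' => ['Second test title'] } ] } ] },
            ],
        },
    ],
};

# Walk every <record>, indexing each array level explicitly.
my @titles;
for my $record ( @{ $ref->{'records'}[0]{'record'} } ) {
    my $dc = $record->{'recordData'}[0]{'srw_dc:dc'}[0];
    push @titles, $dc->{'dc:title'}[0];
}

print "$_\n" for @titles;
```

        Note that the loop dereferences the array with @{ ... }. Writing foreach my $rd ($records_data) iterates over a single scalar (the reference itself), not over the records.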

        Update: the thing with "Pseudo-hashes are deprecated" is as follows. In short, a pseudo-hash is an array with a hash in its first element (index 0); that hash is used to look up the array indices for the individual pseudo-hash fields (this applies to Perl versions < 5.10 only, btw).  E.g.

        $pseudohash = [ { key => 5 } ];

        When you, for example, write

        $pseudohash->{key} = 42;

        the resulting data structure would look like

        $VAR1 = [ { 'key' => 5 }, undef, undef, undef, undef, 42 ];

        Two things to note: (1) the pseudo-hash field 'key' maps to array index 5, and (2) the array is automatically resized (as usual, with all indices less than 5 being created as undef).

        Now, XML::Simple (with ForceArray enabled) produces a data structure something like this

        my $data = {
            'records' => [
                {
                    'record' => [
                        {
                            'recordPosition' => [ '1' ],
                        },
                    ],
                },
            ],
        };

        so when you write

        my $records_data = $ref->{'records'}->{'record'}->{'recordData'}

        the hash that holds 'record' is used as the pseudo-hash's index map (as it sits at index 0 of the 'records' => [...] array/pseudo-hash). The value it maps 'record' to is an array reference, not a small integer. Due to what I would call a bug in Perl, using that arrayref as the pseudo-hash-mapped index doesn't produce an error message; instead the reference is interpreted as a (typically huge) number, which in turn leads Perl to auto-resize the array (because of autovivification) until, in your case, it runs out of memory...
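
        One way to catch this class of mistake early, sketched here under the assumption that you control the traversal code, is to check ref() before dereferencing. An array found where a hash was expected then produces a clear error instead of a pseudo-hash lookup. The helper name hash_field is hypothetical and not part of XML::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper: return $node->{$key}, but die with a clear message
# if $node is not a hash reference (for example, an array from ForceArray).
sub hash_field {
    my ($node, $key) = @_;
    my $type = ref($node) || 'plain scalar';
    die "Expected a HASH at '$key' but found $type\n" unless ref($node) eq 'HASH';
    return $node->{$key};
}

my $data = { 'records' => [ { 'record' => [ { 'recordPosition' => ['1'] } ] } ] };

# 'records' holds an array, so asking it for the 'record' key fails loudly.
my $records = hash_field($data, 'records');
my $ok      = eval { hash_field($records, 'record'); 1 };
my $error   = $ok ? '' : $@;
print "Caught: $error" if $error;

# Indexing the array first gives the hash that really holds 'record'.
my $record_list = hash_field($records->[0], 'record');
print "Number of records: ", scalar(@$record_list), "\n";
```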

        Saving to a file might not solve the problem. You haven't said whether you disabled buffering, and your code doesn't show it either. Perl has line buffering and block buffering; writing to a file uses block buffering, since print and write operations are involved, so sometimes you need to force a buffer flush every time the filehandle is written to, in case you are worried that memory issues might come up.

        use IO::Handle;    # provides autoflush
        autoflush OUT 1; print OUT "$line\n";

        This link can help you learn more about buffering. For pseudo-hashes, a quick search will give you an idea: Pseudo-hashes deprecated, pseudo hashes. Perl is a lovable language; just stay committed to it and do what it takes to get there.
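
        A minimal sketch of the autoflush idea with a lexical filehandle, assuming a temporary file in place of jestor_1.xml. It only shows that autoflush controls when printed data reaches the file; the variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides the autoflush method
use File::Temp qw(tempfile);

# Hypothetical output file; a temporary file keeps the sketch self-contained.
my ($out, $filename) = tempfile(UNLINK => 1);

# Turn on autoflush before writing, so each print reaches the file at once.
$out->autoflush(1);
print {$out} "first line\n";

# Read the file back while the write handle is still open.
open(my $in, '<', $filename) or die "Cannot open $filename: $!";
my $seen = <$in>;
close($in);

print "Read back before close: $seen";
close($out);
```

        Without the autoflush call, the line would normally stay in the output buffer until the buffer fills or the handle is closed.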

        Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.
Re: Show error "Out of Mermory"!!
by Anonymous Monk on Oct 12, 2009 at 08:43 UTC
      Thanks for the replies. I used XML::Twig and the problem was solved. Thanks, Sachin
Re: Show error "Out of Mermory"!!
by Jenda (Abbot) on Oct 12, 2009 at 20:51 UTC

    There is no reason to employ WWW::Mechanize. LWP::Simple is enough in this case. Though it doesn't seem to work for me ... I get a login page. So I can't have a look at the XML. There is also no need to remove the <?xml...> stuff.

    As you do not seem to need the whole XML, you might use either XML::Twig or XML::Rules to process the XML in snippets. With XML::Rules you would just specify that the default rule is to forget the value of the tag (_default => '',) and that the rule for 'srw_dc:dc' is to print the 'dc:title' ('srw_dc:dc' => sub {print "$_[1]->{'dc:title'}\n"; return},).
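
    For the XML::Twig route, a rough sketch of processing the document in snippets might look like the following. The sample XML and its namespace URIs are made up for illustration (the real JSTOR response is not shown in this thread), and the handler assumes that each 'srw_dc:dc' element has a 'dc:title' child.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample standing in for the JSTOR response; the namespace
# URIs are placeholders, not the real ones.
my $xml = <<'XML';
<searchRetrieveResponse xmlns:srw_dc="http://example.com/srw_dc" xmlns:dc="http://example.com/dc">
  <records>
    <record>
      <recordData><srw_dc:dc><dc:title>First test title</dc:title></srw_dc:dc></recordData>
    </record>
    <record>
      <recordData><srw_dc:dc><dc:title>Second test title</dc:title></srw_dc:dc></recordData>
    </record>
  </records>
</searchRetrieveResponse>
XML

my @titles;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <srw_dc:dc> element has been parsed.
        'srw_dc:dc' => sub {
            my ($t, $elt) = @_;
            push @titles, $elt->first_child_text('dc:title');
            $t->purge;    # free the parts of the document already handled
        },
    },
);
$twig->parse($xml);

print "$_\n" for @titles;
```

    Because the handler purges the twig after each record, memory use should stay roughly constant no matter how many records the response contains.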

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.