rocketperl has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, please help me with an 'out of memory' issue. My input file <PROTEOME> is around 30 MB, which I assume is why my program is not running as expected. Please advise me on how to improve the script and avoid using memory unnecessarily. My code is below for your reference. Thanks

#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
use diagnostics;

open GENES,'genes.txt' or print "cant open genes\t";
open LOG,'log.txt' or print "cant open log\t";
open START,'start.txt' or print "cant open start\t";
open STOP,'stop.txt' or print "cant open stop\t";
open DIST,'diff.txt' or print "cant open diff";
open O,'>logratio_g0_revised_withnp.txt';

my @genes=<GENES>;
my @start=<START>;
my @stop=<STOP>;
my @log=<LOG>;
my @dist=<DIST>;
chomp @genes;
chomp @start;
chomp @stop;
chomp @log;
chomp @dist;

my $index=0;
my $index1=1;
my $ind=0;
my $in=0;
my $count=1;

do {
    $ind=0;
    $ind1=1;
    $hit=0;
    undef @hold;
    undef @aminoacid;
    if($proteome[$index]=~m/^>/) {
        $val++;
        $proteome[$index] =~ /\|(.+)\|(.+)$/;
        $id=$1;
        $name=$2;
        $index++;
        print O "$name\t$id\t";
        push @hold,$proteome[$index];
        $index++;
        do {
            push @hold,$proteome[$index];
            $index++;
        } until($proteome[$index]=~m/^>/);
        $peptide= join ('', @hold);
        undef @hold;
        @aminoacid=split(//,$peptide);
    }
    if(@aminoacid>0) {
        do {
            if($aminoacid[$ind]=~/^X$/) {
                $ind++;
                $ind1++;
            }
            else {
                if($aminoacid[$ind] eq $aminoacid[$ind1]) {
                    do {
                        $hit++;
                        $ind++;
                        $ind1++;
                    } until($aminoacid[$ind] ne $aminoacid[$ind1] or $ind==@aminoacid);
                    $hit++;
                    if($hit>9) {
                        print O "$aminoacid[$ind]\t$hit\t";
                    }
                    else {
                        $hit=0;
                    }
                    $ind++;
                    $ind1++;
                    $hit=0;
                }
                else {
                    $ind++;
                    $ind1++;
                }
            }
        } until($ind==@aminoacid);
        undef @aminoacid;
        print O "\n";
    }
} until($index==@proteome);
say $val;

Re: warning "Out of memory!"
by tobyink (Canon) on Jan 23, 2014 at 10:20 UTC

    You're not showing us your full code. You don't even show where you're reading the input. So don't expect many especially useful answers.

    At a guess, you're slurping the entire file into memory, when perhaps you'd be better off processing it line by line.

    That said, 30 MB is not a large file by modern standards, and should fit in memory quite easily.
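
The line-by-line approach suggested above could look roughly like the following. This is only a sketch, assuming the proteome file is FASTA-style (header lines starting with `>` followed by sequence lines), which the posted code implies but does not confirm. The helper name `for_each_record` and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: stream a FASTA-style file one record at a time.
# Only the current record's sequence is held in memory.
sub for_each_record {
    my ($fh, $callback) = @_;
    my ($header, $seq);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            # A new header closes the previous record, if there was one.
            $callback->($header, $seq) if defined $header;
            ($header, $seq) = ($line, '');
        }
        elsif (defined $header) {
            $seq .= $line;
        }
    }
    $callback->($header, $seq) if defined $header;    # last record
    return;
}

# Demo on an in-memory filehandle (made-up data);
# a real script would open the proteome file instead.
my $data = ">sp|P1|PROT_A\nMKAAAA\nAAAAAAGG\n>sp|P2|PROT_B\nMKLV\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @seen;
for_each_record($fh, sub {
    my ($header, $seq) = @_;
    push @seen, [$header, length $seq];
    print "$header\t", length($seq), "\n";
});
close $fh;
```

The end-of-file case is handled explicitly after the loop, so the last record does not depend on finding another header.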

Re: warning "Out of memory!"
by vitoco (Hermit) on Jan 23, 2014 at 10:27 UTC

    Did you use strict; and use warnings;?

    What is the starting value of $index? From that piece of code it appears to be undef, which is not a proper index for the @proteome array (Perl will treat it as 0 and, with warnings enabled, complain about an uninitialized value).

    At the other end, it seems that $index is running past the end of the array. Change the condition of the last until to $index >= @proteome, and do the same for the other until conditions.

    Just guessing without the input file specs...

Re: warning "Out of memory!"
by vitoco (Hermit) on Jan 23, 2014 at 12:37 UTC

    Well, the OP updated the code to include some definitions, but the @proteome array is still never defined!

    Again, review the terminating condition of each loop. I think you are running far past the end of the array, until no more memory can be allocated. Do not use == in those conditions: the index variable is incremented more than once inside the loop, some of those times conditionally, without its limit being checked, so it can skip past the exact value you are testing for.

    Also, check what happens if $proteome[$index] does not match some of the regexes. Maybe the @hold array is filling up with undefs.
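
To illustrate the point about terminating conditions: in the posted inner loop, `until($proteome[$index]=~m/^>/)` never sees another header after the last record. Reading past the end of the array yields undef, which does not match, so the loop would keep pushing undef onto @hold, and that is a plausible cause of the memory exhaustion. Below is a minimal sketch of a bounded version. The contents of `@proteome` are made up here, since the posted code never fills that array.

```perl
use strict;
use warnings;

# Hypothetical stand-in for @proteome, which the posted code never fills.
my @proteome = ('>sp|P1|PROT_A', 'MKAA', 'AAGG', '>sp|P2|PROT_B', 'MKLV');

my $index = 0;
my @records;
while ($index < @proteome) {
    my $header = $proteome[$index++];
    next unless $header =~ /^>/;    # skip anything before the first header

    my @hold;
    # Check the bound first, so $proteome[$index] is never read past the end.
    while ($index < @proteome and $proteome[$index] !~ /^>/) {
        push @hold, $proteome[$index++];
    }
    push @records, [$header, join('', @hold)];
}

print "$_->[0]\t$_->[1]\n" for @records;
```

Using `<` rather than `==` means the loops still stop if the index ever steps over the exact end value.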

Re: warning "Out of memory!"
by locked_user sundialsvc4 (Abbot) on Jan 23, 2014 at 14:45 UTC

    Nevertheless, it is easy enough to “eyeball” the practices that are being used here and to guess what the problem is: you are “slurping” probably many megabytes of data into memory and processing it as arrays. Maybe it’s another one of the files that is causing the problem, or something in combination. In any case, a 32-bit machine on a very good day will allow a process an address space of only a little over 2 GB for everything it can address, and that fills up fast. I presume that this machine already has all the RAM physically installed that its motherboard can take. (On a 64-bit system, the theoretical limit is much larger, but virtual-memory paging can still bring you to your knees.)

    If it is simply an input file that is too large, you can consume that file one record at a time. If several of the files are very large, you might have to conceive of a way to process the data in multiple stages: multiple runs through the essential algorithm, with a slice of the inputs each time. Or there are other ways: sorting the data to take advantage of the grouping that this gives, or storing the data in (say) an SQLite database file so that you can, by means of a query, retrieve from an arbitrarily large source file only the data which might match. And so on. The width of the memory address and the installed RAM capacity are absolutely “hard” limitations.

Re: warning "Out of memory!"
by dasgar (Priest) on Jan 23, 2014 at 17:44 UTC

    I agree with others that your posted code definitely looks like it's missing quite a bit.

    In your code, I see lines like my @genes=<GENES>;, which kind of looks like your intent is to store the contents in an array such that each element of the array corresponds to a line in the file.

    Don't know if that is the source of your memory issues or not, but if that really is your intent and you want to reduce the memory being used, I would recommend checking out Tie::File. Tie::File will tie a file to an array and each line of the file becomes an element of the array. Also, Tie::File does not read the entire file into memory. It even has an option that you can set to increase/decrease the amount of memory that it "will consume at any time while managing the file".
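
A minimal sketch of the Tie::File approach described above. The temporary demo file and the cache size are illustrative assumptions; a real script would tie one of its own input files, such as genes.txt.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Demo file with made-up contents; a real script would tie its own input file.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "geneA\ngeneB\ngeneC\n";
close $tmp_fh;

# 'memory' caps the read cache, in bytes (the value here is arbitrary).
tie my @genes, 'Tie::File', $filename, memory => 2_000_000
    or die "Cannot tie $filename: $!";

# Elements are fetched from the file on demand; line endings are
# stripped by default, so no chomp is needed.
my $count  = scalar @genes;
my $second = $genes[1];
print "$count lines, second is $second\n";

untie @genes;
```

Note that the tie is read/write by default, so assigning to an element changes the file on disk.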

Re: warning "Out of memory!"
by mtmcc (Hermit) on Jan 23, 2014 at 21:31 UTC

    At the risk of being repetitive, it's always more difficult to give advice on how to improve a script when you haven't explained what it's supposed to do.

    What is your input, your expected output, and your actual output? And have a look at "How do I post a question effectively?"