out of memory problem

perlbeginner10 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: out of memory problem by GrandFather (Saint) on Mar 15, 2006 at 20:51 UTC
If it is really O(n^n) and n is anything greater than about 10, then you are stuffed.

Describe what you are trying to achieve and show us the guts of the code with a very small sample set. We may be able to advise on a better technique than you are currently using. Consider this thread as an example of how a different approach can help.

DWIM is Perl's answer to Gödel
Re: out of memory problem by Tanktalus (Canon) on Mar 15, 2006 at 20:38 UTC
In order of easiest/cheapest (on developer costs) to most difficult ...

1. Check your ulimit. If you have a limit on memory from that, then you'll get an out of memory error when you reach it. I had this problem - 64MB wasn't enough, so I just told the sysadmins that I needed unlimited memory, and that problem went away.
2. Buy more RAM/increase swap space. If you still don't have enough memory after the above, then maybe it's because there's no more memory to have. My current home computer has 4GB of RAM and 8GB of swap specifically because of this. Of course, RAM is way faster than swap.
3. Try upgrading to 64-bit. That means a 64-bit processor as well as 64-bit OS and 64-bit perl. With sufficient memory. If approximately 3.5GB isn't enough memory to access, then 64-bit will allow you to access anything you can throw at it.
4. Try using a file-system-backed tied data structure. Since I don't use this, nor do I know what data structures you're using, I can't really tell you which one. But the basic idea is to throw all your intermediate results to disk, and read them back in as you need them. A tied structure, such as DBM or something, can radically simplify this. This allows you to use your hard disk as if it were RAM, without actually hitting the limits of your ulimit or CPU or physical RAM/swap.

Hope that helps.
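To make the tied-structure suggestion above concrete, here is a minimal sketch using `SDBM_File`, which ships with core Perl. The reply does not name a module, so this choice is an assumption, as are the temporary path and the `"main;sub"` key layout. SDBM limits each key/value pair to roughly 1 KB, so it only suits small records; another DBM module may fit better for larger ones.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # part of core Perl
use File::Temp qw(tempdir);

# Hypothetical location for the on-disk hash; any writable path would do.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/relations";

# Tie the hash to a file so its entries live on disk instead of in RAM.
tie my %relation, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $path: $!\n";

# Use it like an ordinary hash. Keys and values are flat strings,
# so a pair of names is joined into a single key here (an assumption).
$relation{"cancer;breast cancer"}      = 1;
$relation{"breast cancer;foot cancer"} = 1;
$relation{"cancer;breast cancer"}++;

my $weight = $relation{"cancer;breast cancer"};
my $count  = scalar keys %relation;
print "weight=$weight entries=$count\n";

untie %relation;
```

Every access goes through the file, so this trades speed for memory; it is worth trying only after the cheaper options in the list.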
Re: out of memory problem by izut (Chaplain) on Mar 15, 2006 at 20:31 UTC
Could you post the code you wrote?

Igor 'izut' Sutton
your code, your rules.
Re: out of memory problem by ikegami (Patriarch) on Mar 15, 2006 at 20:32 UTC
We don't know anything about your algorithm or your input. How can we help?
Re^2: out of memory problem by perlbeginner10 (Acolyte) on Mar 15, 2006 at 20:58 UTC
Sorry about that. Here is my code:

    my %fnameof;
    my %valueof;
    my @relation;
    my @second;
    my $mainfile;
    my $subfiles;
    {
        open ($testdataset, "datasetnew.txt") or die "Cannot open file";
        @testdataset = <$testdataset>;
        close ($testdataset);
        open (STDOUT, ">>result.txt");
        $fcount = 1;
        $secondcount = 0;
        @testdataset = grep { $_ ne '' } @testdataset;
        @testdataset = grep /\S/, @testdataset;
        foreach $dataline (@testdataset) {
            ($mainfile, $subfiles) = GetFileName($dataline);
            for ($mainfile) {
                $mainfile =~ s/^\s+//;
                $mainfile =~ s/\s+$//;
            }
            addtoHash($mainfile);
            @subfiles = @keywords = split(/;/, $subfiles);
            @subfiles = grep { $_ ne '' } @subfiles;
            @subfiles = grep /\S/, @subfiles;
            foreach $subfile (@subfiles) {
                $subfile =~ s/^\s+//;
                $subfile =~ s/\s+$//;
                addtoHash($subfile) unless ($_ ne '');
            }
            #defining the relation of mainfile with subfiles. Each mainfile has relation weight = 1 with subfile.
            foreach $subfile (@subfiles) {
                $relation[$valueof{$mainfile}][$valueof{$subfile}] = 1;
                $second[$secondcount] = "$valueof{$mainfile};$valueof{$subfile}";
                $secondcount++;
            }
        }
        #creating transitive relationship. ie: if A->B and B->C, then A->C
        foreach $seconditem (@second) {
            @test = split(/;/, $seconditem);
            $b = $test[0];
            $c = $test[1];
            for ($k = 1; $k<=$secondcount; $k++) {
                if ($relation[$c][$k] gt 0) {
                    $relation[$b][$k] = $relation[$b][$k]+1;
                }
            }
        }
        PrintArray();
    }

    #get mainfile and subfiles
    sub GetFileName{
        my $item = $_[0];
        @datasplit = split(/\t/, $item);
        $mainfile = @datasplit[0];
        $subfiles = @datasplit[1];
        return ($mainfile, $subfiles);
    }

    sub addtoHash{
        my $file = $_[0];
        $exist = 0;
        for ($i = 0; $i < $fcount; $i++) {
            if ($fnameof{$i} eq $file) {
                $exist = $i;
            }
        }
        if ($exist == 0) {
            $fnameof{$fcount}= $file;
            $valueof{$file} = $fcount;
            $fcount++;
        }
    }

    sub PrintArray(){
        for($i=1;$i<$fcount; $i++) {
            for($j=1;$j<$fcount;$j++){
                if (defined ($relation[$i][$j])) {
                    print $fnameof{$i}."-".$relation[$i][$j]."->".$fnameof{$j}."\n";
                }
            }
        }
        print "\n";
    }

And here is a sample dataset (a tab separates each main file from its subfiles):

    cancer	breast cancer; lung cancer; heart cancer; stomach cancer;
    breast cancer	foot cancer;
    foot cancer	some cancer;
    lung cancer	blood cancer; foot cancer;
    heart cancer	foot cancer;
    stomach cancer	foot cancer;
    blood cancer	some cancer;

But this dataset is actually huge. It's about 48MB. I have 1GB of memory in my computer. I ran this program on Windows and Fedora Core, but the result is the same: blank (with the 48MB dataset).

PS: If there are any other points that can improve my code, please let me know.
Re^3: out of memory problem by GrandFather (Saint) on Mar 15, 2006 at 21:14 UTC
First glance:

- add `use strict; use warnings` to your code then clean up the errors and warnings.
- don't use $a or $b as variable names - they are reserved for use by sort
- use the three parameter open
- where does $fcount in addtoHash get a value? Make it explicit by passing the value into the sub rather than relying on a global.
- Don't prototype PrintArray - especially after its first use!
- you probably want `chomp @testdataset;` before `@testdataset = grep { $_ ne '' } @testdataset;`
- `@testdataset = grep { $_ ne '' } @testdataset;` is redundant when followed by `@testdataset = grep /\S/, @testdataset;`
- what does `for ($mainfile) {` achieve?
- You test `if ($exist == 0)`, but $i can == 0 and therefore $exist can == 0 (in addtoHash)
- You could describe the output you expect. Sometimes knowing what is expected of a piece of code helps understand it - sometimes it helps misunderstand it :)

Update: more items added

DWIM is Perl's answer to Gödel
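As one way to act on the addtoHash points above, here is a rough sketch that looks names up in `%valueof` directly instead of scanning every known name on each call. The sub name `id_for` and the switch from the `%fnameof` hash to an array are my own assumptions, not something taken from the original code. It avoids the `$exist == 0` ambiguity and needs no separate `$fcount` counter.

```perl
use strict;
use warnings;

my %valueof;   # file name -> numeric id
my @fnameof;   # numeric id -> file name (slot 0 stays unused so ids start at 1)

# Return the id for $file, assigning the next free id the first time
# the name is seen. The hash lookup replaces the linear scan.
sub id_for {
    my ($file) = @_;
    return $valueof{$file} if exists $valueof{$file};
    my $id = @fnameof ? scalar @fnameof : 1;
    $fnameof[$id]   = $file;
    $valueof{$file} = $id;
    return $id;
}

my @ids = map { id_for($_) } 'cancer', 'breast cancer', 'cancer', 'foot cancer';
print "@ids\n";   # prints "1 2 1 3"
```

With a 48MB input the linear scan runs once per name, so replacing it with a hash lookup should matter for run time, although it does not by itself shrink the `@relation` array.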
Re^4: out of memory problem by perlbeginner10 (Acolyte) on Mar 16, 2006 at 05:33 UTC
Re^5: out of memory problem by GrandFather (Saint) on Mar 16, 2006 at 07:05 UTC
Some notes below your chosen depth have not been shown here
Re^3: out of memory problem by ikegami (Patriarch) on Mar 15, 2006 at 22:47 UTC
I don't have time to look at it personally, at least not now, but the following will help you greatly. Change

    open ($testdataset, "datasetnew.txt") or die "Cannot open file";
    @testdataset = <$testdataset>;
    close ($testdataset);
    @testdataset = grep { $_ ne '' } @testdataset;
    @testdataset = grep /\S/, @testdataset;
    foreach $dataline (@testdataset) {

to

    open (my $testdataset, '<', "datasetnew.txt")
        or die "Cannot open input file: $!\n";
    while (my $dataline = <$testdataset>) {
        next if $dataline =~ /^\s*$/;

You'll have (2 or 3) fewer copies of your file in memory.
Re: out of memory problem by GrandFather (Saint) on Mar 16, 2006 at 01:52 UTC
It's not entirely clear what you are trying to achieve. But on the guess that it is something to do with finding transitive relationships in the data, the following may be of use:

    use strict;
    use warnings;

    my %mappings;

    while (<DATA>) {
        chomp;
        next if ! /\S/;
        s/^\s+//;
        s/\s+$//;
        my ($mainfile, $subfiles) = split /\s*,\s*/;
        my @subfiles = split /\s*;\s*/, $subfiles;
        $mappings{$mainfile} = [grep /\S/, @subfiles];
    }

    # Print transitive relationships. ie: if A->B and B->C, then A->C
    for my $A (sort keys %mappings) {
        for my $B (@{$mappings{$A}}) {
            print " $A - $B -> @{$mappings{$B}}\n" if exists $mappings{$B};
        }
    }

    __DATA__
    cancer,breast cancer; lung cancer; heart cancer; stomach cancer;
    breast cancer,foot cancer;
    foot cancer,some cancer;
    lung cancer,blood cancer; foot cancer;
    heart cancer,foot cancer;
    stomach cancer,foot cancer;
    blood cancer,some cancer;

Prints:

     breast cancer - foot cancer -> some cancer
     cancer - breast cancer -> foot cancer
     cancer - lung cancer -> blood cancer foot cancer
     cancer - heart cancer -> foot cancer
     cancer - stomach cancer -> foot cancer
     heart cancer - foot cancer -> some cancer
     lung cancer - blood cancer -> some cancer
     lung cancer - foot cancer -> some cancer
     stomach cancer - foot cancer -> some cancer

DWIM is Perl's answer to Gödel
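The code above reports two-step chains. If the goal is everything reachable through any number of steps (an assumption, since the expected output was never described), a sketch along the following lines might work on the same `%mappings` shape. The sub name `reachable_from` and the trimmed sample data are hypothetical.

```perl
use strict;
use warnings;

# Same shape as %mappings: each main file maps to a list of its subfiles.
my %mappings = (
    'cancer'        => ['breast cancer', 'lung cancer'],
    'breast cancer' => ['foot cancer'],
    'lung cancer'   => ['blood cancer', 'foot cancer'],
    'foot cancer'   => ['some cancer'],
    'blood cancer'  => ['some cancer'],
);

# Collect every name reachable from $start by following links repeatedly.
# %seen records each name once, so cycles in the data cannot loop forever.
sub reachable_from {
    my ($map, $start) = @_;
    my %seen;
    my @todo = @{ $map->{$start} || [] };
    while (defined(my $name = shift @todo)) {
        next if $seen{$name}++;
        push @todo, @{ $map->{$name} || [] };
    }
    return sort keys %seen;
}

my @reach = reachable_from(\%mappings, 'cancer');
print join(', ', @reach), "\n";
```

Because only links that actually exist are stored, memory grows with the number of relations rather than with the square of the number of names, which is what a two-dimensional array indexed by every pair of ids tends toward.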