Re^4: Out of memory.

Replies are listed 'Best First'.
Re^5: Out of memory. by BrowserUk (Patriarch) on Oct 22, 2008 at 06:17 UTC
Try this and see how you get on: #! perl -slw use strict; sub multiply { my( $ref1, $ref2 ) = @_; my @temp; for my $r1 ( @{ $ref1 } ) { for my $r2 ( @{ $ref2 } ) { push @temp, $r2 . $r1; } } return \@temp; } $/ = ','; my @stack; open POSTFIX, '<', '718318.dat' or die $!; while( my $op = <POSTFIX> ) { chomp $op; chop $op if $op =~ tr[\n][\n]; ## remove any trailing newline if( $op eq 'x' ) { push @stack, multiply( pop( @stack ), pop( @stack ) ); } elsif( $op eq '+' ) { push @stack, [ @{ pop @stack }, @{ pop @stack } ]; } elsif( $op =~ m[^\d+$] ) { push @stack, [ pack 'v', $op ]; } else { die "Bad '$op' at position;" . tell( POSTFIX ); } } print "stacksize: " . @stack; for my $item ( @stack ) { printf "%s", join ',', unpack 'v', @{ $item }[ 0 ]; for my $group ( @{ $item }[ 1 .. $#$item ] ) { printf "+%s", join ',', unpack 'v', $group; } print "\n"; } [download] This does 4 things to try and save memory without unduly hitting performance: Instead of reading the entire postfix expression into memory as a single string and then splitting it into a large array, it sets `$\ = ','` and reads the expression one operand at a time. Rather than storing the intermediate values as strings which have to be split & joined over and over, it stores them as arrays of substrings. Instead of an intermediate value being stored as `1,2,3+4,5,6+7,8,9`, that same data is stored as `[ '1,2,3', '4,5,6', '7,8,9' ]`. This makes the additive process just a case of coombining the two arrays on the top of the stack into one. And the multiply process combines the substrings using concatenation, and pushes them into a new array. Instead of storing the substrings as comma-delimited ASCII, it packs them together as 2-byte binary unsigned integers. This saves 33% memory for 2-digits numbers; 50% for 3-digits; 60% for 4-digits; and 66% for 5-digits. (Including the now unnecessary commas.) Finally, the numbers don't get ASCIIized or the ,s and +s added until output. And this is done in chunks to avoid building the entire string in memory. This could be taken a stage further by unpacking the numbers from the substrings one at a time, but it hasn't seemed necessary. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply] [d/l] [select]
Re^6: Out of memory. by dneedles (Sexton) on Oct 22, 2008 at 12:49 UTC
I ran the updated code and got outofmemory again, but it kept running at 1.2-1.4 gigs until I killed it an hour later. I converted the number vars to characters a-zA-Z and 'x' to '*' to compress the data: Read more... (120 kB)	[reply] [d/l]
Re^7: Out of memory. by dneedles (Sexton) on Oct 22, 2008 at 12:56 UTC
It almost got it all. Here is the rest... Read more... (749 Bytes)	[reply] [d/l]