in reply to Re^4: Out of memory.
in thread Out of memory.

Try this and see how you get on:

#! perl -slw use strict; sub multiply { my( $ref1, $ref2 ) = @_; my @temp; for my $r1 ( @{ $ref1 } ) { for my $r2 ( @{ $ref2 } ) { push @temp, $r2 . $r1; } } return \@temp; } $/ = ','; my @stack; open POSTFIX, '<', '718318.dat' or die $!; while( my $op = <POSTFIX> ) { chomp $op; chop $op if $op =~ tr[\n][\n]; ## remove any trailing newline if( $op eq 'x' ) { push @stack, multiply( pop( @stack ), pop( @stack ) ); } elsif( $op eq '+' ) { push @stack, [ @{ pop @stack }, @{ pop @stack } ]; } elsif( $op =~ m[^\d+$] ) { push @stack, [ pack 'v', $op ]; } else { die "Bad '$op' at position;" . tell( POSTFIX ); } } print "stacksize: " . @stack; for my $item ( @stack ) { printf "%s", join ',', unpack 'v*', @{ $item }[ 0 ]; for my $group ( @{ $item }[ 1 .. $#$item ] ) { printf "+%s", join ',', unpack 'v*', $group; } print "\n"; }

This does 4 things to try and save memory without unduly hitting performance:

  1. Instead of reading the entire postfix expression into memory as a single string and then splitting it into a large array, it sets $\ = ',' and reads the expression one operand at a time.
  2. Rather than storing the intermediate values as strings which have to be split & joined over and over, it stores them as arrays of substrings.

    Instead of an intermediate value being stored as 1,2,3+4,5,6+7,8,9, that same data is stored as [ '1,2,3', '4,5,6', '7,8,9' ].

    This makes the additive process just a case of coombining the two arrays on the top of the stack into one.

    And the multiply process combines the substrings using concatenation, and pushes them into a new array.

  3. Instead of storing the substrings as comma-delimited ASCII, it packs them together as 2-byte binary unsigned integers.

    This saves 33% memory for 2-digits numbers; 50% for 3-digits; 60% for 4-digits; and 66% for 5-digits. (Including the now unnecessary commas.)

  4. Finally, the numbers don't get ASCIIized or the ,s and +s added until output.

    And this is done in chunks to avoid building the entire string in memory. This could be taken a stage further by unpacking the numbers from the substrings one at a time, but it hasn't seemed necessary.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Replies are listed 'Best First'.
Re^6: Out of memory.
by dneedles (Sexton) on Oct 22, 2008 at 12:49 UTC
    I ran the updated code and got outofmemory again, but it kept running at 1.2-1.4 gigs until I killed it an hour later. I converted the number vars to characters a-zA-Z and 'x' to '*' to compress the data:
      It almost got it all. Here is the rest...