in reply to Re: Out of memory.
in thread Out of memory.

Sounds like it did get truncated. The stack at the end was because some operations in the postfix notation got truncated at the end. I ran the program and outputed the array before the join was performed. The output was 500 or so megs. I suspect I might have hit some limit on string size or other limit. The memory usage on the Windows XP box ran close 1.5 gig for the perl process. Is there any cleaner way to do this that will not greatly impact performance? (The join cut the time to 1/60th over string appending.) Or is there a nice package to "page" to file parts of an array before the memory limit is hit?

Replies are listed 'Best First'.
Re^3: Out of memory.
by BrowserUk (Patriarch) on Oct 22, 2008 at 02:01 UTC
      The postfix equation is 113,957 bytes. When it runs the final array before the join that results in the Out of memory error is 513,422,726 bytes. The solution is deterministic so the problem can be recreated from the 113,957 bytes.
      The postfix file that feeds the program is 113,957 bytes. The array right before the join that results in the memory error is 513,422,726 bytes. The program can be run to recreate the problem everytime on Windows XP box with 2gig of RAM running PERL 5.10.0 build 1003 by ActiveState.

        Try this and see how you get on:

        #! perl -slw use strict; sub multiply { my( $ref1, $ref2 ) = @_; my @temp; for my $r1 ( @{ $ref1 } ) { for my $r2 ( @{ $ref2 } ) { push @temp, $r2 . $r1; } } return \@temp; } $/ = ','; my @stack; open POSTFIX, '<', '718318.dat' or die $!; while( my $op = <POSTFIX> ) { chomp $op; chop $op if $op =~ tr[\n][\n]; ## remove any trailing newline if( $op eq 'x' ) { push @stack, multiply( pop( @stack ), pop( @stack ) ); } elsif( $op eq '+' ) { push @stack, [ @{ pop @stack }, @{ pop @stack } ]; } elsif( $op =~ m[^\d+$] ) { push @stack, [ pack 'v', $op ]; } else { die "Bad '$op' at position;" . tell( POSTFIX ); } } print "stacksize: " . @stack; for my $item ( @stack ) { printf "%s", join ',', unpack 'v*', @{ $item }[ 0 ]; for my $group ( @{ $item }[ 1 .. $#$item ] ) { printf "+%s", join ',', unpack 'v*', $group; } print "\n"; }

        This does 4 things to try and save memory without unduly hitting performance:

        1. Instead of reading the entire postfix expression into memory as a single string and then splitting it into a large array, it sets $\ = ',' and reads the expression one operand at a time.
        2. Rather than storing the intermediate values as strings which have to be split & joined over and over, it stores them as arrays of substrings.

          Instead of an intermediate value being stored as 1,2,3+4,5,6+7,8,9, that same data is stored as [ '1,2,3', '4,5,6', '7,8,9' ].

          This makes the additive process just a case of coombining the two arrays on the top of the stack into one.

          And the multiply process combines the substrings using concatenation, and pushes them into a new array.

        3. Instead of storing the substrings as comma-delimited ASCII, it packs them together as 2-byte binary unsigned integers.

          This saves 33% memory for 2-digits numbers; 50% for 3-digits; 60% for 4-digits; and 66% for 5-digits. (Including the now unnecessary commas.)

        4. Finally, the numbers don't get ASCIIized or the ,s and +s added until output.

          And this is done in chunks to avoid building the entire string in memory. This could be taken a stage further by unpacking the numbers from the substrings one at a time, but it hasn't seemed necessary.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.