Dogg has asked for the wisdom of the Perl Monks concerning the following question:

I'm working with a really long array of scalars (~14.4 million entries). I read all the numbers from a file into a single scalar using:
while (<ASA>) { $ASAlist .= $_; } #end while
Then I split that single scalar into a list of numbers:
@ASAList = split(/ /, $ASAlist);
It just seems to fail on the splitting (the first time it segfaulted with a panic: POPSTACK error and the second time it just died). It has worked on a shorter array (102,000 numbers). Is there some length limit on arrays? Is there a better way to split the long scalar into the array? (I'm using perl 5.8.0.)

Thanks.

Replies are listed 'Best First'.
Re: Limit on array size?
by BrowserUk (Patriarch) on Nov 06, 2003 at 17:41 UTC

    Provided your numbers are integers (signed 32-bit), you could reduce your memory requirements for the array from around 440MB to 54MB by using Packed::Array. It will slow the processing down somewhat, but it should easily handle the volume on most modern machines.

    Be sure to heed the advice about pushing to the array as you go, rather than slurping the whole lot and then converting to an array; otherwise you're just chewing up memory to hold the huge scalar for no reason.
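To illustrate the packing idea, here is a minimal sketch using only the built-in pack and unpack. It is not the Packed::Array interface, just the same principle done by hand: each signed 32-bit integer takes 4 bytes in one string instead of a full Perl scalar. The sample lines and the get_int helper are made up for the example.

```perl
use strict;
use warnings;

# One string holds every number, 4 bytes per signed 32-bit integer.
my $packed = '';

# Hypothetical sample input; in the real program these lines would be
# read from the ASA filehandle one at a time.
my @lines = ("27 25 52\n", "1 -2 3\n");

for my $line (@lines) {
    # split ' ' splits on whitespace and discards the trailing newline.
    $packed .= pack 'l*', split ' ', $line;
}

# Hypothetical helper: fetch element $i by unpacking 4 bytes at offset 4 * $i.
sub get_int {
    my ($i) = @_;
    return unpack 'l', substr($packed, 4 * $i, 4);
}

my $count = length($packed) / 4;
print "count=$count third=", get_int(2), " fifth=", get_int(4), "\n";
```

Access through substr and unpack is slower than a plain array lookup, which is the trade-off mentioned above.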


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

Re: Limit on array size?
by ctilmes (Vicar) on Nov 06, 2003 at 17:13 UTC
    split as you go along:
    use strict;
    use warnings;
    use Data::Dumper;

    my @ASAList;
    while (<DATA>) {
        chomp;
        push(@ASAList, split / /);
    }
    print Dumper(\@ASAList);

    __DATA__
    27 25 52 1 2 3
    Output:
    $VAR1 = [ '27', '25', '52', '1', '2', '3' ];
    I added a chomp() as well.

    You'll still be limited by the memory of your machine, but this will make more efficient use of what you've got.

    Might also try pre-extending your array with something like this:

    $ASAList[14000000] = 1; @ASAList = ();
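A related sketch of pre-extension using $#array, the form shown in perldata. The size of one million is arbitrary, and the comment about the storage being kept after the list is emptied is my understanding of how perl behaves rather than something measured here.

```perl
use strict;
use warnings;

my @list;

# Pre-extend: setting the last index makes perl allocate the slots up front.
$#list = 999_999;

# Reset the length to zero. As far as I know, assigning an empty list
# keeps the allocated slots, whereas "undef @list" would free them.
@list = ();

# Later pushes then fill the array without repeated reallocation.
push @list, $_ for 1 .. 10;

print "elements: ", scalar(@list), "\n";
```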

Re: Limit on array size?
by thelenm (Vicar) on Nov 06, 2003 at 17:18 UTC

    The possible size of an array is only limited by how much memory you have. You may be running out of memory here. Instead of creating a huge scalar and a huge array, you might try just creating the array as you read the file:

    while (<ASA>) { push @ASAList, split; }

    This may help enough to get the entire array into memory, but if not, you'll have to use some scheme for keeping the data on the hard disk and only reading what you need. Sounds like a good job for a tied array, maybe something in the Tie namespace on CPAN will help.
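As one possible shape for the tied-array approach, here is a small sketch with Tie::File, which ships with perl 5.8. It assumes the data has been rewritten with one number per line (the file in the question is space-separated, so that is a change), and it uses a temporary file with made-up contents so the example runs on its own.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical data file: one number per line (an assumption for this sketch).
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "$_\n" for 27, 25, 52, 1, 2, 3;
close $fh or die "close failed: $!";

# Each array element maps to one line of the file; lines are read from
# disk on demand instead of all being held in memory.
tie my @numbers, 'Tie::File', $filename
    or die "Cannot tie $filename: $!";

my $count = @numbers;       # number of lines in the file
my $third = $numbers[2];    # fetched from disk, line terminator removed

print "count=$count third=$third\n";

untie @numbers;
```

Element access is much slower than with an in-memory array, so this only pays off when the data really does not fit.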

    -- Mike

    --
    XML::Simpler does not require XML::Parser or a SAX parser. It does require File::Slurp.
    -- grantm, perldoc XML::Simpler

Re: Limit on array size?
by Abigail-II (Bishop) on Nov 06, 2003 at 17:18 UTC
    You have to realize that a value in perl takes quite a number of bytes. For a string, it's something like 24 bytes, plus the size of the string itself. For a value to be an element of an array, there's another 4 bytes of pointer overhead. Multiply that by 14.4 million, and you need over 400 MB just to store empty strings.
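The arithmetic behind that estimate, as a quick sketch. The per-value figures are the rough ones quoted above for a 32-bit build, not measurements.

```perl
use strict;
use warnings;

my $elements      = 14_400_000;  # entries mentioned in the question
my $scalar_bytes  = 24;          # rough overhead of one string value
my $pointer_bytes = 4;           # pointer held in the array per element

# Total overhead before counting any of the actual string contents.
my $total = $elements * ($scalar_bytes + $pointer_bytes);

printf "%d bytes, about %.0f MB\n", $total, $total / 1_000_000;
```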

    So, you might have run out of memory. From the message you got, I'd say you ran out of memory for your stack (the values from the split are first put on the stack). I don't know whether perl has to copy all the values (and there goes another 400 MB), but that's hardly relevant in this case.

    You still shouldn't get this error, it should report running out of memory, but you've hit a limit imposed by the system, not Perl.

    Please check what kind of error you get with 5.8.2. If you still get a panic, be so kind as to submit a bug report.

    Abigail

Re: Limit on array size?
by pg (Canon) on Nov 06, 2003 at 17:19 UTC

    On a 32-bit machine, the highest index you can have is 2 ** 31 -1. After that you get an error msg saying: "Modification of non-creatable array value attempted".

    But you may fail long before you reach that point, because you run out of memory.

    my @a; $a[2 ** 31] = 1;