in reply to How do I pre-allocate an array of hashes?

Something like:
$#myarray = 250000; keys %$_ = 120 for @myarray;

Abigail

Replies are listed 'Best First'.
Re: Re: How do I pre-allocate an array of hashes?
by jaa (Friar) on Feb 21, 2003 at 17:53 UTC
    Thanks, but I'm looking for something a bit more chunky, something that reduces the number of times that Perl calls malloc.

    250,000 calls to

    keys %myhash = 120;
    is not much better than just preallocating the array and allowing the individual hashes to get created one at a time.

      The old C trick I used to see was to calculate the total malloc arena size needed, then malloc one huge chunk of that size at startup and immediately free it. The idea was that the smaller mallocs would then be served from that existing huge arena. Since actual process memory, obtained (on UNIX) via sbrk, grows one way only -- you can't give it back -- malloc maintains freed space in its own internal arena. This approach supposedly traded all those expensive small system calls for one up front.

      In practice, I saw mixed results with this. And I'm sure malloc algorithms have changed so it may mean absolutely nothing today even if you can figure out how to translate that to Perl. Maybe it would be better to use an alternate malloc package when building Perl?

      Well, Perl isn't C. Maybe you should consider using a different datastructure.

      Abigail

        I think you missed my point. I was throwing out the idea that maybe that model would translate to Perl. There are three ways I've seen memory allocation affected in general:

        1. Use a different malloc algorithm. You can do this with Perl by building a different one into the perl binary.
        2. Configure an existing malloc interface if it supports it. Some C libraries had this back when I coded C, but it wasn't portable by any means. What they did was let you "profile" your base allocations (lots of small blocks, a few large blocks, etc.) by passing values to a configuration API that primed the malloc algorithm. There is nothing equivalent to this in Perl that I'm aware of.
        3. Try controlling malloc at the application level. I mentioned that one way I used to see some people try this was to figure out the overall allocation size used by the application, then allocate this as one huge chunk and free it. There really is no direct equivalent in Perl, but I was thinking along these lines for this particular example:
          use strict;

          my $i = 120;
          my $n = 250000;
          my $x = {};
          my @a;
          my $j;

          $#a = $n;

          # Up front, big malloc/free ... Does it make a difference?
          #keys %$x = ($i * $n);
          #$x = {};

          while ($n-- > 0) {
              $x = {};
              keys %$x = $i;
              for ($j = 0; $j < $i; $j++) {
                  $x->{$j} = $j;
              }
              push(@a, $x);
          }
          You'd run this with and without the big malloc/free and see if there's any difference. The big leap here is that these are not straight (flat) memory allocations in Perl -- you're also (I'd think) building up data structures (hashes, arrays). I did not dive into the code so I may be way off base here. But I did try this test just for kicks on Solaris 2.7. The big malloc/free came out a lot worse, so that theory can pretty much be buried ... at least on that platform. Here's a baseline difference. The runs take forever so I didn't get a good sample set. Kind of pointless I think.

          Times with the big malloc/free:

          Total Elapsed Time = 947.1299 Seconds User+System Time = 383.5699 Seconds
          Times without it:
          Total Elapsed Time = 497.0599 Seconds User+System Time = 376.4799 Seconds