in reply to Could we save the memory occupied by "undef" in an array?

Are you sure about that? I recall that there is only one copy of the undef value in memory, and I'm not sure that perl fills all the buckets from 0 to 9999 when you set 10000. Why don't you try just setting 10000 and see how much memory the program takes? If it's not much, you can just forget all about this.

Re^2: Could we save the memory occupied by "undef" in an array?
by JavaFan (Canon) on Nov 23, 2008 at 12:23 UTC
    If perl didn't fill in the "buckets", then the buckets would contain random garbage. Since perl reuses memory, that garbage may well be old array elements. How would perl know which array elements are "valid" and which ones are "not filled in"?

    Note that (internal, C-level) arrays contain pointers to SVs.

      It could mark a range as empty without actually filling in each bucket.

      I just tried it. Setting element 10000 of an array takes 0.35 MB more RAM than setting element 0 on perl 5.8.8 for OS X on Intel. So, there's a cost, although not a large one.

        It could mark a range as empty without actually filling in each bucket.
        Yes, and then you'd need a data structure to keep track of which ranges are empty. Wait, I got it! We'd use a hash to keep track of which elements in the array are 'valid'! But then, you might as well use a hash to begin with.
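To make the "use a hash to begin with" point concrete, here is a minimal sketch, assuming integer indices are simply used as hash keys. The name %sparse is made up for illustration and is not from the thread.

```perl
use strict;
use warnings;

# Hypothetical sparse "array": only the indices actually used get a slot.
my %sparse;
$sparse{10000} = 1;

# Number of elements actually stored, regardless of the highest index.
my $count = keys %sparse;

# An index that was never set has no entry at all; treat it as undef.
my $value = exists $sparse{42} ? $sparse{42} : undef;

print "stored elements: $count\n";
print "index 42 is ", (defined $value ? "set" : "empty"), "\n";
```

The trade-off is that each hash entry costs noticeably more than one array slot, so this only pays off when the data really is sparse.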

        Furthermore, it doesn't save any memory. Even if you don't initialize the first 1000 elements, if you do:

        my @array; $array[1000] = 1;
        you still need to allocate space for 1001 pointers, initialized or not.
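A small sketch of what that looks like from Perl's side, using only the built-ins scalar, exists and defined. It shows the array length and the state of the slots; it does not measure memory, and calling exists on array elements is discouraged in recent perls, so take it as illustration only.

```perl
use strict;
use warnings;

my @array;
$array[1000] = 1;

# The array now spans indices 0..1000, so it has 1001 slots.
my $length = scalar @array;

# Slots below 1000 were never assigned: they neither exist nor are defined.
my $exists_0    = exists  $array[0]    ? 1 : 0;
my $defined_0   = defined $array[0]    ? 1 : 0;
my $exists_1000 = exists  $array[1000] ? 1 : 0;

print "length: $length\n";
print "slot 0 exists: $exists_0, defined: $defined_0\n";
print "slot 1000 exists: $exists_1000\n";
```

So the length accounts for every slot up to the highest index, even though only one of them was ever assigned.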
        I just tried it. Setting element 10000 of an array takes 0.35 MB more RAM than setting element 0 on perl 5.8.8 for OS X on Intel. So, there's a cost, although not a large one.
        That's about 36 bytes/element. More than I expected. An undefined value takes 20 bytes in Perl. A pointer takes 4 bytes (32-bit platform). So, even if undefined values aren't shared, I'd expect less.

        However, I cannot recreate this:

        $ perl -MDevel::Size=:all -wE '$a[1] = 1; say total_size \@a'
        148
        $ perl -MDevel::Size=:all -wE '$a[10000] = 1; say total_size \@a'
        40136
        which is a difference of about 40000 bytes, aka 10000 pointers. Replacing Devel::Size's total_size with `grep VmRSS /proc/$$/status` confirms the difference of about 40000 bytes.
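The arithmetic behind "aka 10000 pointers" can be checked with the two sizes reported above. This is just a sketch that hard-codes those numbers; the variable names are made up, and nothing is measured here.

```perl
use strict;
use warnings;

# Sizes in bytes as reported by total_size in the runs above.
my $size_small = 148;     # after $a[1] = 1
my $size_large = 40136;   # after $a[10000] = 1

# The second array spans 9999 more indices than the first.
my $extra_slots = 10000 - 1;

my $diff     = $size_large - $size_small;   # 39988 bytes
my $per_slot = $diff / $extra_slots;        # close to 4

printf "difference: %d bytes, %.2f bytes per extra slot\n", $diff, $per_slot;
```

That comes out at roughly 4 bytes per extra slot, which matches one pointer per element on a 32-bit platform rather than a full 20-byte undefined value.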