http://qs1969.pair.com?node_id=1175919


in reply to push to array without copying

The down-side of using references is that dereferencing stuff in multiple places can a) get messy; b) actually end up costing more time than simply duplicating the data and avoiding the dereferencing, if the data items are small.

The hash won't be used once the array is created, so I'm not concerned about the hash values being modified when the array values are.

That observation is key here. If you don't need the values in the hash after you've copied them to the array, don't copy them, move them.

That can be achieved by deleting the key/value pair from the hash and assigning the return from delete -- which is the value of the key being deleted -- to the array.

Like this:

my %hash = ('a'=>'test');
my @arr = delete $hash{'a'};
print $arr[0]."\n";

This results in the key 'a' being removed from the hash, and its former value 'test' being transferred directly to $arr[0] without any copying.

I assume that would be faster?

It will be, provided: a) the size of the values is sufficient to make a noticeable difference -- a few hundred bytes would do it -- and b) it doesn't force you to do too much extra work in other places as a result.
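If you want to check whether the saving is worth it for your data, a rough comparison with the core Benchmark module might look like the sketch below. The value size ($size) and the labels 'copy' and 'move' are made up for illustration; the numbers you get will depend on your Perl version, build and value sizes, so treat it as a starting point rather than proof.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical value size in bytes; adjust it to resemble your real data.
my $size = 1_000_000;

# Each case rebuilds the hash so that both do the same setup work;
# only the last statement (copy vs. delete) differs.
my $results = cmpthese( -1, {
    copy => sub {
        my %h = ( a => 'x' x $size );
        my @arr = $h{a};            # ordinary assignment from the hash
    },
    move => sub {
        my %h = ( a => 'x' x $size );
        my @arr = delete $h{a};     # remove the pair and assign delete's return value
    },
} );
```

A negative count tells Benchmark to run each case for at least that many CPU seconds, so this takes a couple of seconds to complete.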

delete also works with hash slices, which makes this very convenient and efficient for doing multiple transfers simultaneously:

%h = qw[ the quick black fiend jumps over the lazy god child ];;
pp \%h;;
{ black => "fiend", god => "child", jumps => "over", the => "lazy" }
print delete @h{ qw[ the black god ] };;
lazy fiend child
pp \%h;;
{ jumps => "over" }
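As a standalone script, moving every value out of a hash in one statement could be sketched like this. The data and the sorted key order are assumptions for the example; pick whatever order your array needs.

```perl
use strict;
use warnings;

my %hash = ( a => 'alpha', b => 'beta', c => 'gamma' );

# Fix the order up front; hash key order is not predictable.
my @keys = sort keys %hash;

# delete on a hash slice returns the removed values
# in the same order as the keys it was given.
my @arr = delete @hash{@keys};

print "@arr\n";                      # prints "alpha beta gamma"
print scalar( keys %hash ), "\n";    # prints "0": the hash is now empty
```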

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: push to array without copying
by chris212 (Scribe) on Nov 15, 2016 at 17:39 UTC
    But the array value doesn't seem to have the same reference address as the hash value. Are you sure that doesn't make a copy?
    my %hash = ('a'=>'test');
    print \$hash{'a'}."\n";
    my @arr = delete $hash{'a'};
    print \$arr[0]."\n";
      Are you sure that doesn't make a copy?

      Yes. Quite sure. I have two independent demonstrations for proof of that.

      1. Using Devel::Peek::Dump():

        This is not a Perl-level 'aliasing' effect; it is a C-level pointer swap, so you're looking at the wrong thing. It isn't the Perl reference values you should be considering, but rather the PV component of the two scalars as shown below.

        Note that whilst the two scalars have different heads and bodies (the two hex values on the first line of each dump), the address of the actual data on the fourth line of each dump is identical in both:

        use Devel::Peek;;
        $h{ XXX } = 'test';;
        Dump $h{ XXX };;
        SV = PV(0x2ab2a0) at 0x3dc99d8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x3e28f98 "test"\0
          CUR = 4
          LEN = 8
        $a[0] = delete $h{ XXX };;
        Dump $a[ 0 ];;
        SV = PV(0x2ab2b0) at 0x3dc9978
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x3e28f98 "test"\0
          CUR = 4
          LEN = 8

        Contrast that with a normal assignment where all three hex values in the scalar Dump()s are different:

        use Devel::Peek;;
        $h{ XXX } = 'test';;
        Dump $h{ XXX };;
        SV = PV(0x11b2a0) at 0x3e199d8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x3e78f98 "test"\0
          CUR = 4
          LEN = 8
        $a[0] = $h{ XXX };;
        Dump $a[ 0 ];;
        SV = PV(0x11b2b0) at 0x3e19978
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x3e78f68 "test"\0
          CUR = 4
          LEN = 8

      2. Using empirical observation.

        In the trace below mem is a function that returns the process memory utilisation in K bytes.

        • When the REPL session has just started, the memory used is just under 10MB.
        • After I create the hash key with a 100e6 byte value, the memory has grown to 107MB.
        • After I transfer that value to the array element, the memory has changed by a few KB, but is still basically the same 107MB.

        C:\test>p1
        Perl> print mem;;
        9,340 K
        Perl> $h{ XXX } = 'X' x 100e6;;
        Perl> print mem;;
        107,248 K
        Perl> $a[ 0 ] = delete $h{ XXX };;
        Perl> print mem;;
        107,308 K
        Perl>

        If the data had been copied, the footprint would have been over 200MB.

        (I cannot Dump() the scalars in this latter case because Dump() would spend a week trying to format a 100e6 byte string nicely, before dumping it to the console, which would take another week (or two:)!)


        Very thorough explanation! Thanks, I appreciate it.