PerlMonks  

Re: Speeding up stalled script

by sundialsvc4 (Abbot)
on Feb 03, 2015 at 17:28 UTC


in reply to Speeding up stalled script

While we’re on this subject, here’s something I’d really like to know:   how memory-hungry is a line like this . . .

foreach my $key (keys %dstrbtn_hash) {

... versus, say, an iterator, e.g. each() applied to the same hash?

Is Perl going to build an in-memory anonymous array of all those keys, in order to subsequently foreach through it?

I’ve no doubt that the root cause of this problem is virtual-memory thrashing.   But, is a statement like this one a “hidden” source of more memory consumption?
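For concreteness, here is a minimal sketch of the two loop styles being compared. The sub names `sum_with_keys` and `sum_with_each` and the sample data are hypothetical, made up for illustration; only `%dstrbtn_hash` comes from the question. The sketch shows the two forms side by side and does not measure memory.

```perl
use strict;
use warnings;

# Hypothetical helper: foreach over keys().
# keys() is evaluated in list context here, so it returns all keys.
sub sum_with_keys {
    my ($h) = @_;
    my $total = 0;
    foreach my $key (keys %$h) {
        $total += $h->{$key};
    }
    return $total;
}

# Hypothetical helper: each() hands back one key/value pair per call.
sub sum_with_each {
    my ($h) = @_;
    my $total = 0;
    keys %$h;    # void context: reset the hash's iterator before starting
    while (my ($key, $value) = each %$h) {
        $total += $value;
    }
    return $total;
}

# Sample data (an assumption, not from the original script).
my %dstrbtn_hash = (a => 11, b => 12, c => 13, d => 14);
print sum_with_keys(\%dstrbtn_hash), "\n";    # prints 50
print sum_with_each(\%dstrbtn_hash), "\n";    # prints 50
```

Both subs visit every entry exactly once as long as the hash is not modified inside the loop.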

Re^2: Speeding up stalled script
by Athanasius (Archbishop) on Feb 05, 2015 at 08:02 UTC

    Hello sundialsvc4,

    Update: My conclusion was wrong, keys does consume more memory than each. See choroba’s comment and my reply, below.


    Is Perl going to build an in-memory anonymous array of all those keys, in order to subsequently foreach through it?

    No, Perl uses the same internal iterator in both cases:

    So long as a given hash is unmodified you may rely on keys, values and each to repeatedly return the same order as each other.
    ...
    Each hash or array has its own internal iterator, accessed by each, keys, and values.
    — quoted from each, but see also keys and values.

    Note that this internal iterator (one per hash) is implicitly reset when successive calls to each (or keys, or values) have exhausted the available hash entries. But it can also be reset explicitly, for which the recommended method is to call keys %hash in void context (see values).
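A minimal sketch of the explicit reset described above, assuming a small four-entry hash. The variable names `$remaining` and `$after_reset` are made up for illustration.

```perl
use strict;
use warnings;

my %hash = (a => 11, b => 12, c => 13, d => 14);

# Take one pair; this advances the hash's internal iterator.
my ($first) = each %hash;

# Without a reset, a new each() loop resumes after $first
# and sees only the three remaining entries.
my $remaining = 0;
while (my ($k) = each %hash) { $remaining++ }

# The loop above exhausted the iterator, which resets it implicitly.
# Advance it by one entry again, then reset it explicitly.
my ($skipped) = each %hash;
keys %hash;    # void context: resets the iterator

my $after_reset = 0;
while (my ($k) = each %hash) { $after_reset++ }

print "remaining=$remaining after_reset=$after_reset\n";
# prints "remaining=3 after_reset=4"
```

Calling `keys %hash` before any `while ... each` loop is a cheap way to guard against an iterator left part-way through by earlier code.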

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      keys does use the iterator, but only to construct the list of keys before the loop starts, not on each iteration of the for loop. Otherwise, the following program wouldn't output
      $VAR1 = { 'x' => 'cabd' };

      The value would be shorter, as there would be fewer keys after the second iteration.

      #!/usr/bin/perl
      use warnings;
      use strict;
      use Data::Dumper;

      my %hash = (
          a => 11,
          b => 12,
          c => 13,
          d => 14,
      );

      for my $key (keys %hash) {
          delete $hash{$_} for qw( a b c d );
          $hash{'x'} .= $key;
      }

      print Dumper \%hash;
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        Hello choroba,

        Yes, you’re right: despite the calls to delete, the for loop list doesn’t change, so it must be constructed before the first loop iteration.

        Whereas using each I get an infinite loop:

        use warnings;
        use strict;
        use Data::Dump qw(dd pp);

        my %hash = (
            a => 11,
            b => 12,
            c => 13,
            d => 14,
        );

        while (my ($key, $value) = each %hash) {
            print "key is $key, value is $value, hash is ", pp(\%hash), "\n";
            delete $hash{$_} for qw( a b c d );
            $hash{x} .= $key;
        }

        dd \%hash;

        Output:

        23:43 >perl 1148_SoPW.pl
        key is c, value is 13, hash is { a => 11, b => 12, c => 13, d => 14 }
        key is x, value is c, hash is { x => "c" }
        key is x, value is cx, hash is { x => "cx" }
        key is x, value is cxx, hash is { x => "cxx" }
        key is x, value is cxxx, hash is { x => "cxxx" }
        key is x, value is cxxxx, hash is { x => "cxxxx" }
        key is x, value is cxxxxx, hash is { x => "cxxxxx" }
        key is x, value is cxxxxxx, hash is { x => "cxxxxxx" }
        key is x, value is cxxxxxxx, hash is { x => "cxxxxxxx" }
        key is x, value is cxxxxxxxx, hash is { x => "cxxxxxxxx" }
        ...

        Which isn’t surprising given the warning in the documentation for each:

        If you add or delete a hash's elements while iterating over it, the effect on the iterator is unspecified; for example, entries may be skipped or duplicated--so don't do that. Exception: It is always safe to delete the item most recently returned by each()...
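The exception mentioned in that warning can be sketched as follows: inside an each loop, only the key that each has just returned is deleted, and nothing is added. The sub name `delete_odd_values` is hypothetical, chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every entry whose value is odd,
# deleting only the key most recently returned by each().
sub delete_odd_values {
    my ($h) = @_;
    keys %$h;    # void context: reset the iterator before starting
    while (my ($key, $value) = each %$h) {
        delete $h->{$key} if $value % 2;
    }
    return $h;
}

my %hash = (a => 11, b => 12, c => 13, d => 14);
delete_odd_values(\%hash);
print join(',', sort keys %hash), "\n";    # prints "b,d"
```

Unlike the loop above, this one terminates, because it never inserts new keys and never deletes a key other than the current one.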

        So it appears that sundialsvc4’s concern over memory is justified: for a large hash, the use of keys in a for loop will consume significantly more memory than the equivalent use of each. It seems strange to me that the documentation doesn’t highlight this — or did I just miss it somewhere?

        Cheers,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,
