Bulk hash population order

JakeIII has asked for the wisdom of the Perl Monks concerning the following question:


Re: Bulk hash population order
by perrin (Chancellor) on Nov 27, 2007 at 14:14 UTC

@bighash{'C', 'D'} = ('some new value', 4);

Re^2: Bulk hash population order

by jbert (Priest) on Nov 27, 2007 at 14:30 UTC

Assuming you have the new values you want in %newbits:

@bighash{keys %newbits} = values %newbits;

$bighash{$_} = $newbits{$_} for keys %newbits;

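The slice form relies on keys and values returning their lists in the same order, which Perl guarantees as long as the hash is not modified in between.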
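
As a small sketch of the slice approach (the sample contents of %bighash and %newbits are made up for illustration), merging one hash into another without touching the other keys might look like this:

```perl
use strict;
use warnings;

# Illustrative data only; not taken from the original question.
my %bighash = ( A => 1, B => 2, C => 3 );
my %newbits = ( C => 'some new value', D => 4 );

# Hash slice assignment: keys and values of an unmodified hash come back
# in matching order, so every key lines up with its own value.
@bighash{ keys %newbits } = values %newbits;

# A and B are untouched, C is overwritten, D is added.
print "$_ => $bighash{$_}\n" for sort keys %bighash;
```

The for-loop form in the post above should give the same result; it trades a little brevity for not depending on the keys/values ordering guarantee at all.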


Re: Bulk hash population order
by Fletch (Bishop) on Nov 27, 2007 at 13:56 UTC

Yes, it will work. A hash in list context evaluates to a list of key/value pairs (in an arbitrary order), and initializing a hash from a list of key/value pairs will overwrite earlier instances of duplicate keys with the last one seen.
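
A minimal sketch of that behaviour, assuming a three-key %bighash invented for the example:

```perl
use strict;
use warnings;

# Illustrative starting contents.
my %bighash = ( A => 1, B => 2, C => 3 );

# %bighash expands to its key/value pairs in some arbitrary order, and the
# two literal pairs come after all of them, so they are the last ones seen.
%bighash = ( %bighash, C => 'some new value', D => 4 );

print "$_ => $bighash{$_}\n" for sort keys %bighash;
```

Whatever order the old pairs come out in, the pair for C written on the right-hand side follows them, so it is the one that survives.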

We're looking for a Perl and Database Developer for Corporate Investments Group.


Re: Bulk hash population order
by halley (Prior) on Nov 27, 2007 at 13:55 UTC

Never rely on the order in which a hash's entries are visited, whether you walk it with each or with for.

The internal structure is well documented for those who hack on the interpreter, but it is really no concern of a Perl script that uses a hash. Perl's interpreter is free to rearrange entries at will for the sake of memory or lookup efficiency. Some versions of perl even hash differently each time the script is run, to thwart attempts to "attack" the interpreter by generating very unbalanced hashes.
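
If a script does need a predictable visiting order, one common way is to impose it explicitly rather than trust what keys happens to return. A short sketch with made-up data:

```perl
use strict;
use warnings;

# Illustrative data, deliberately written out of alphabetical order.
my %bighash = ( C => 3, A => 1, B => 2 );

# keys promises no particular order (and it may differ between runs),
# so sort the keys to get an order the script controls.
my @ordered = sort keys %bighash;

print "$_ => $bighash{$_}\n" for @ordered;
```

The order here comes from sort, not from the hash, so it stays the same no matter how the interpreter arranges the entries internally.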

There is a specialized version of the hash available that keeps a second structure around internally to remember the order in which you inserted things, or how you wanted things ordered (Tie::IxHash on CPAN is one such module). This is a special case, not the default behavior, and useful if your script must have that knowledge or power.

Update: It seems like some people are downvoting, perhaps assuming that I didn't read the question because I'm talking about visitation, and the question talks about bulk insertion. I phrased this intentionally; they're the same thing as far as dealing with a data structure goes.

Don't rely on a quirk of the parser to populate hashes in a particular order, either. The ( x => y ) syntax on the right-hand side is implemented as a list, today's interpreter happens to walk that list in order to populate the hash, and any duplicate key x gets written multiple times. None of that makes relying on these assumptions good coding practice. You're inserting into a hash. If you expect a lot of overwrites, and this is important to you, spell out the order of insertions so that the redundant overwrites are handled deliberately. As a benefit, that makes your code more literate, more self-explanatory and clear.

While it seems unlikely that the interpreter will break this assumption tomorrow, that's more likely because the perl hackers know there are a million poorly written scripts out there that do make this assumption.

--
[ e d @ h a l l e y . c c ]


Re^2: Bulk hash population order

by jdporter (Paladin) on Nov 27, 2007 at 15:23 UTC

Maybe it's good advice to say "never make assumptions about ordering wrt hashes", but in this case the advice is misleading, because the form

%h = ( %h, foo => 2 );

always results in %h containing foo => 2, whatever the previous value of $h{'foo'} was.* See Fletch's explanation.

* That is, unless %h is tied to other behavior, e.g. altering or deleting keys or values according to some function.

A word spoken in Mind will reach its own level, in the objective world, by its own weight


Re^2: Bulk hash population order

by grinder (Bishop) on Nov 27, 2007 at 17:33 UTC

"It seems like some people are downvoting, perhaps assuming that I didn't read the question because I'm talking about visitation, and the question talks about bulk insertion. I phrased this intentionally; they're the same thing as far as dealing with a data structure goes."

It could be you're being dinged because the information is incorrect.

The problem has nothing to do with hash traversal. What is happening is hash-to-list flattening (and back again). The keys and values are being flattened into a series of list pairs, and then additional list pairs are being tacked on the end.

In a subsequent step (the assignment to a hash), the list elements are paired up again and the results assigned to the hash. The values of keys coming later in the list overwrite the values of keys set earlier in the list. There is no voodoo involved; it's the way list iteration works. It's not an assumption; it's the only way it could ever work. There's no sane algorithm that could replace it.

List flattening is deterministic, there's no two ways about it (literally :)
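
To make the flattening step visible, here is a rough sketch that captures the intermediate list in an array before it is turned back into a hash. The names @flat, @tail and %rebuilt and the sample data are invented for the illustration:

```perl
use strict;
use warnings;

# Illustrative data.
my %h = ( foo => 1, bar => 5 );

# Step 1: the hash flattens to a plain list of keys and values, and the
# extra pair is tacked on the end.
my @flat  = ( %h, foo => 2 );
my $count = @flat;             # 6: two original pairs plus the new one
my @tail  = @flat[ -2, -1 ];   # always ('foo', 2), whatever precedes it

# Step 2: assigning the list to a hash pairs the elements up again;
# the later 'foo' replaces the earlier one.
my %rebuilt = @flat;

print "elements: $count, tail: @tail, foo is now $rebuilt{foo}\n";
# prints "elements: 6, tail: foo 2, foo is now 2"
```

Only the position of the original pairs inside @flat is arbitrary; the appended pair is always last, which is why the overwrite is dependable.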

• another intruder with the mooring in the heart of the Perl


Re^2: Bulk hash population order

by JakeIII (Novice) on Nov 27, 2007 at 14:19 UTC

my %lilhash = (
     'A' => 1,
     'A' => 2,
     'A' => 3
);
print $lilhash{'A'};

Re^3: Bulk hash population order

by halley (Prior) on Nov 27, 2007 at 14:30 UTC

We know that the perl interpreter will parse this equivalently to:

my @hiddenlist = (
     'A', 1,
     'A', 2,
     'A', 3
);
my %lilhash;
while (@hiddenlist)
{
     my $hiddenkey = shift @hiddenlist;
     $lilhash{$hiddenkey} = shift @hiddenlist;
}
print $lilhash{'A'};


I just think it's a bad idea when the word "order" and the word "hash" come anywhere near each other to start making such assumptions. Regardless of how safe or well-entrenched the idiom may be, my advice is: the hash is unordered and the list is ordered and if you care about order, be explicit.

--
[ e d @ h a l l e y . c c ]


Re^4: Bulk hash population order

by Fletch (Bishop) on Nov 27, 2007 at 15:16 UTC

Re: Bulk hash population order
by duelafn (Parson) on Nov 27, 2007 at 17:07 UTC

One can even make it pretty with a simple prototyped function:

sub hpush(\%@) {
  my ($h,$k,$v) = (shift);
  $$h{$k} = $v while ($k, $v) = splice(@_, 0, 2);
}

# ... later
my %bighash = ( A => 1, B => 2, C => 3 );
hpush %bighash, B => 3, D => 4;

Good Day,
Dean


Re: Bulk hash population order
by localfilmmaker (Novice) on Nov 28, 2007 at 07:44 UTC

my %defaults = get_defaults();
my %bighash = get_big_hash();

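# Merge the defaults into our bighash. Later pairs win, so on a shared key
# the value from %defaults replaces the one already in %bighash.
%bighash = (%bighash, %defaults);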

Re: Bulk hash population order
by locked_user sundialsvc4 (Abbot) on Nov 28, 2007 at 20:17 UTC

Amen! Make it clear!

If you want to change the values in a hash, use hash notation consistently throughout ... precisely so that Perl in its DWIM-crazed way does not decide that you “meant” something that you didn't even know you were doing.

And then, once you know you have a hash, and that no goofy hash-to-list-to-hash magic is happening behind your back, always assume that the keys will be retrieved “in no particular order.” A hash is intended to be used as a high-speed, but random-access data structure.

Re: Bulk hash population order
by toolic (Bishop) on Nov 27, 2007 at 13:55 UTC

Update

~~It seems simpler just to omit %bighash from the right-hand side of the assignment:~~

%bighash = (
     'C' => 'some new value',
     'D' => 4
);

Re^2: Bulk hash population order

by Fletch (Bishop) on Nov 27, 2007 at 14:01 UTC

Assigning to a whole hash (%hash = LIST) clobbers the existing contents; what he's trying to do is append and/or overwrite new contents. Your code has now clobbered the rest of the contents of %bighash and it only has those two keys rather than replacing just the old pairs for the keys C and D.
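
The difference can be sketched side by side. The hashes %clobbered and %merged and their starting contents are invented for this example:

```perl
use strict;
use warnings;

# Whole-hash assignment: the old contents are thrown away first.
my %clobbered = ( A => 1, B => 2, C => 3 );
%clobbered = ( C => 'some new value', D => 4 );

# Slice assignment: only the named keys are written, the rest survive.
my %merged = ( A => 1, B => 2, C => 3 );
@merged{ 'C', 'D' } = ( 'some new value', 4 );

print "clobbered: ", join( ',', sort keys %clobbered ), "\n";  # clobbered: C,D
print "merged:    ", join( ',', sort keys %merged ),    "\n";  # merged:    A,B,C,D
```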
