Re: Missing something with regards to references
by blokhead (Monsignor) on Nov 01, 2004 at 22:35 UTC
|
Your code is pretty close, but I'm afraid you need another level of referencing to deal with autovivifying. I'll try to explain why... Consider when you have an empty $DATA{3}{h} and call put_element($DATA{3}, 'h', 'i', "worked"). Then your code eventually does:
- $current = $DATA{3}{h}; (this returns undef)
- $current = $$current{i};
That last line autovififies the undef in $current to a fresh anonymous hashref, fetches the key 'i' from the new anonymous hash (which returns undef), and sets $current to this undef... But overwriting the value of $current cleans up the anonymous hash that was just created. It's lost forever. Whoops!
Instead of this, you need to keep another level of references. Now instead of trying to keep the values of $DATA{3}, $DATA{3}{h}, $DATA{3}{h}{i} (which are copied, so don't do the right thing if the hash key doesn't exist), we have to keep references to those slots of the hashes. This way the new hashes can be autovivified in the appropriate slot, and not into your temporary variable.
You also need to use $_[0] instead of shift if the first argument has a hash dereference that may be autovifified (since shift makes a copy, but $_[0] is an alias to the first arg, not a copy).
sub put_element {
my $current = \ $_[0];
shift;
my $value = pop;
$current = \ $$current->{ +shift } while @_;
$$current = $value;
}
use Data::Dumper;
my %DATA;
put_element( $DATA{3}, 'h', 'i', "Worked" ); ## $DATA{3} autovifified
+ too
put_element( $DATA{3}, 'b', 'y', 'e' ); ## $DATA{3} exists
print Dumper \%DATA;
Yes, the autovivification makes this a very sticky situation. But if you can understand this, you are in very good shape with your understanding of references.
| [reply] [d/l] [select] |
|
|
Thank you for the explination.
It sort of make sense to me, but this is entirely new territory for me. I'd had a hunch at one time that the undef keys beneath the one I wanted would be a problem, but I couldn't quite wrap myself around what could be done about it. Using the reference rather than copies of values sort of makes sense too, but I think it needs time to sink in. I think the part about using $_[0] instead of shift makes the least sense to me... since I am always pathing through a reference, I apparently need to make a reference to the reference?
Anyway, thank you. This is very helpful.
| [reply] |
|
|
my %DATA = (3 => {});
put_element($DATA{3}, ...);
But it is needed if the key doesn't exist yet:
my %DATA = (3 => {});
put_element($DATA{42}, ...);
"But why doesn't my $current = \shift work? Why do I have to use my $current = \$_[0]?" Answer: If the key already exists and its value is a hashref, both possibilities work. But if the key does not exist, we have to make sure, that we really get a reference to the entry of %DATA, not the value of the non-existing entry.
sub test1 {
my $val = shift;
$val = 5;
}
sub test2 {
$_[0] = 5;
}
my ($a, $b) = (3, 3);
test1($a); # $a continues to be 3.
test2($b); # $b is 5 now.
But I could have missed something, too. | [reply] [d/l] [select] |
Re: Missing something with regards to references
by dpuu (Chaplain) on Nov 01, 2004 at 22:42 UTC
|
put_element( $DATA{3}, 'h', 'i', 'worked' );
For clarity, lets do the whole hash. You can create a reference to the top level (root) hash using the backslash operator:
put_element( \%DATA, 3, 'h', 'i', 'worked' );
All those quotes get tiresome after a while, so lets get rid of them:
put_element( \%DATA, qw(3 h i worked) );
Perl flattens arg-lists, so that is identical to the previous version.
So now the implementation (I'll omit the error-checks):
sub put_element {
my ($hashref, @path) = @_;
# The leaf node and its value are special
my $value = pop @path;
my $leaf = pop @path;
# walk the tree, create non-existing nodes
for my $node (@path) {
$hashref = $hashref->{$node} ||= {};
}
# Finally, set the value at the leaf
$hashref->{$leaf} = $value;
}
Hopefully the comments are sufficient to explain my thinking.
--Dave Opinions my own; statements of fact may be in error.
| [reply] [d/l] [select] |
|
|
This is very elegant and makes much more sense to me, so thank you dpuu. But given some of the points that blokhead raised, are there any other gotchas with this approach I should be concerned about?
| [reply] |
|
|
| [reply] |
Re: Missing something with regards to references
by diotalevi (Canon) on Nov 01, 2004 at 23:22 UTC
|
Use List::Util::reduce. You have a serious problem if you use the approach you selected. If you have already added "a", "b" = 1, what happens when you say "a", "b", "c" = 2? Using your existing code you will immediately die on symbolic references. Better, solve the problem by putting the child node references into a separate key and store your values in another.
use List::Util 'reduce';
use Carp 'carp';
# a.tree => b.tree => c.tree => ... z.value
sub put_element {
'HASH' eq ref( my $root = shift() )
or carp "put_element( \ %..., [ KEY ], $val ) must be a hash r
+eference";
'ARRAY' eq ref( my $keys = shift() )
or carp "put_element( \ %..., [ KEY ], $val ) must be given an
+ array reference";
my $value = shift;
my $leaf = reduce { $a->{$b}{'tree'} ||= {} } $root, @{$keys}[ 0 .
+. $#$keys - 1 ];
$leaf->{$keys->[-1]}{'value'} = $value;
}
| [reply] [d/l] [select] |
Re: Missing something with regards to references
by borisz (Canon) on Nov 01, 2004 at 23:16 UTC
|
Instead of building the hash so complicated, build another key.
In the old times perl did that for you.
$data{qw/3 h i/} = 'worked';
$data{@array} = 'worked';
this looks much easier to me.
| [reply] [d/l] |
Re: Missing something with regards to references
by cLive ;-) (Prior) on Nov 01, 2004 at 22:27 UTC
|
Err, if you have a sub you call like this:
put_element($DATA{3}, 'h', 'i', 'worked');
Why not just do:
$DATA{3}{h}{i}='worked';
Why write a sub to do it? Or am I missing something?
cLive ;-)
update: I am missing something, but I don't get exactly what on rereading your question ;-) | [reply] [d/l] [select] |
|
|
What you're missing is that this is an example, this is not the actual application or the proper use of the function. It's been simplified in order to be able to directly address the issue I am having without the confusion of the full complexity of the program.
The whole point of a data structure is to be dynamic... and if it's dynamic then I wont know whether I'm assigning to $DATA{3}{h}{i}, or $DATA{3}{h}{i}{j}, or $DATA{3}{h}{i}{j}{k}. So I need a function that adapts to the number of dimensions of the hash dynamically.
| [reply] |
|
|
Right, but my point is, what are you trying to achieve by calling this:
put_element($DATA{3}, 'h', 'i', 'j', 'k', 'value');
rather than this:
$DATA{3}{h}{i}{j}{k} = 'value'
I don't see how the sub optimizes anything, or provides anything that a direct call does.
Unless you're calling it like this:
my @arr = qw(h i j k);
put_element($DATA{3}, @arr, 'value');
Is that how you're using it?
cLive ;-) | [reply] [d/l] [select] |
Re: Missing something with regards to references
by Anonymous Monk on Nov 01, 2004 at 22:52 UTC
|
Note that if you call put_element($DATA{3}, 'h', 'i', 'worked');, and $DATA{3} isn't a reference, the main datastructure will not be changed. I'd write the function like this:
sub put {
die "Usage: put \\hash, key1, ..., keyn, value\n" unless @_ >= 3;
my $hash = shift;
die "Not a HASH reference\n" unless ref($hash) eq "HASH";
my $value = pop;
my $key = pop;
while (@_) {
my $key = shift;
if (!exists($hash->{$key})) {
$hash = $hash->{$key} = { };
next;
}
elsif (ref($hash->{$key}) eq "HASH") {
$hash = $hash->{$key};
next;
}
else {
die "Not a HASH reference\n";
}
}
$hash->{$key} = $value;
}
Make sure you always pass in a reference to the put function. | [reply] [d/l] |