Missing something with regards to references

scottb has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Missing something with regards to references by blokhead (Monsignor) on Nov 01, 2004 at 22:35 UTC
Your code is pretty close, but I'm afraid you need another level of referencing to deal with autovivifying. I'll try to explain why... Consider when you have an empty `$DATA{3}{h}` and call `put_element($DATA{3}, 'h', 'i', "worked")`. Then your code eventually does: `$current = $DATA{3}{h};` (this returns undef) `$current = $$current{i};` That last line autovififies the undef in `$current` to a fresh anonymous hashref, fetches the key 'i' from the new anonymous hash (which returns undef), and sets `$current` to this undef... But overwriting the value of `$current` cleans up the anonymous hash that was just created. It's lost forever. Whoops! Instead of this, you need to keep another level of references. Now instead of trying to keep the values of `$DATA{3}`, `$DATA{3}{h}`, `$DATA{3}{h}{i}` (which are copied, so don't do the right thing if the hash key doesn't exist), we have to keep references to those slots of the hashes. This way the new hashes can be autovivified in the appropriate slot, and not into your temporary variable. You also need to use `$_[0]` instead of shift if the first argument has a hash dereference that may be autovifified (since shift makes a copy, but `$_[0]` is an alias to the first arg, not a copy). `sub put_element { my $current = \ $_[0]; shift; my $value = pop; $current = \ $$current->{ +shift } while @_; $$current = $value; } use Data::Dumper; my %DATA; put_element( $DATA{3}, 'h', 'i', "Worked" ); ## $DATA{3} autovifified + too put_element( $DATA{3}, 'b', 'y', 'e' ); ## $DATA{3} exists print Dumper \%DATA;` [download] Yes, the autovivification makes this a very sticky situation. But if you can understand this, you are in very good shape with your understanding of references. blokhead	[reply] [d/l] [select]
Re^2: Missing something with regards to references by scottb (Scribe) on Nov 01, 2004 at 23:27 UTC
Thank you for the explination. It sort of make sense to me, but this is entirely new territory for me. I'd had a hunch at one time that the undef keys beneath the one I wanted would be a problem, but I couldn't quite wrap myself around what could be done about it. Using the reference rather than copies of values sort of makes sense too, but I think it needs time to sink in. I think the part about using $_[0] instead of shift makes the least sense to me... since I am always pathing through a reference, I apparently need to make a reference to the reference? Anyway, thank you. This is very helpful.	[reply]
Re^3: Missing something with regards to references by iblech (Friar) on Nov 03, 2004 at 10:47 UTC
since I am always pathing through a reference, I apparently need to make a reference to the reference? `\$_[0]` is not needed if the given (first) key is already a hashref. E.g.: `my %DATA = (3 => {}); put_element($DATA{3}, ...);` [download] But it is needed if the key doesn't exist yet: `my %DATA = (3 => {}); put_element($DATA{42}, ...);` [download] "But why doesn't `my $current = \shift` work? Why do I have to use `my $current = \$_[0]`?" Answer: If the key already exists and its value is a hashref, both possibilities work. But if the key does not exist, we have to make sure, that we really get a reference to the entry of `%DATA`, not the value of the non-existing entry. `sub test1 { my $val = shift; $val = 5; } sub test2 { $_[0] = 5; } my ($a, $b) = (3, 3); test1($a); # $a continues to be 3. test2($b); # $b is 5 now.` [download] But I could have missed something, too.	[reply] [d/l] [select]
Re: Missing something with regards to references by dpuu (Chaplain) on Nov 01, 2004 at 22:42 UTC
Sometimes with these "what am I missing" type of questions, the best approach is to explain how I'd do it, and then see if that answer somehow hits the one thing you need to understand. Lets start with the function usage model: You suggest `put_element( $DATA{3}, 'h', 'i', 'worked' );` [download] For clarity, lets do the whole hash. You can create a reference to the top level (root) hash using the backslash operator: `put_element( \%DATA, 3, 'h', 'i', 'worked' );` [download] All those quotes get tiresome after a while, so lets get rid of them: `put_element( \%DATA, qw(3 h i worked) );` [download] Perl flattens arg-lists, so that is identical to the previous version. So now the implementation (I'll omit the error-checks): `sub put_element { my ($hashref, @path) = @_; # The leaf node and its value are special my $value = pop @path; my $leaf = pop @path; # walk the tree, create non-existing nodes for my $node (@path) { $hashref = $hashref->{$node} \|\|= {}; } # Finally, set the value at the leaf $hashref->{$leaf} = $value; }` [download] Hopefully the comments are sufficient to explain my thinking. --Dave Opinions my own; statements of fact may be in error.	[reply] [d/l] [select]
Re^2: Missing something with regards to references by scottb (Scribe) on Nov 02, 2004 at 00:03 UTC
This is very elegant and makes much more sense to me, so thank you dpuu. But given some of the points that blokhead raised, are there any other gotchas with this approach I should be concerned about?	[reply]
Re^3: Missing something with regards to references by dpuu (Chaplain) on Nov 02, 2004 at 00:39 UTC
I don't think so: my explicit default-assignment should avoid any autovivication issues. --Dave Opinions my own; statements of fact may be in error.	[reply]
Re: Missing something with regards to references by diotalevi (Canon) on Nov 01, 2004 at 23:22 UTC
Use List::Util::reduce. You have a serious problem if you use the approach you selected. If you have already added `"a", "b" = 1`, what happens when you say `"a", "b", "c" = 2`? Using your existing code you will immediately die on symbolic references. Better, solve the problem by putting the child node references into a separate key and store your values in another. `use List::Util 'reduce'; use Carp 'carp'; # a.tree => b.tree => c.tree => ... z.value sub put_element { 'HASH' eq ref( my $root = shift() ) or carp "put_element( \ %..., [ KEY ], $val ) must be a hash r +eference"; 'ARRAY' eq ref( my $keys = shift() ) or carp "put_element( \ %..., [ KEY ], $val ) must be given an + array reference"; my $value = shift; my $leaf = reduce { $a->{$b}{'tree'} \|\|= {} } $root, @{$keys}[ 0 . +. $#$keys - 1 ]; $leaf->{$keys->[-1]}{'value'} = $value; }` [download]	[reply] [d/l] [select]
Re: Missing something with regards to references by borisz (Canon) on Nov 01, 2004 at 23:16 UTC
Instead of building the hash so complicated, build another key. In the old times perl did that for you. `$data{qw/3 h i/} = 'worked'; $data{@array} = 'worked';` [download] this looks much easier to me. Boris	[reply] [d/l]
Re: Missing something with regards to references by cLive ;-) (Prior) on Nov 01, 2004 at 22:27 UTC
Err, if you have a sub you call like this: `put_element($DATA{3}, 'h', 'i', 'worked');` [download] Why not just do: `$DATA{3}{h}{i}='worked';` [download] Why write a sub to do it? Or am I missing something? cLive ;-) update: I am missing something, but I don't get exactly what on rereading your question ;-)	[reply] [d/l] [select]
Re^2: Missing something with regards to references by scottb (Scribe) on Nov 01, 2004 at 22:39 UTC
What you're missing is that this is an example, this is not the actual application or the proper use of the function. It's been simplified in order to be able to directly address the issue I am having without the confusion of the full complexity of the program. The whole point of a data structure is to be dynamic... and if it's dynamic then I wont know whether I'm assigning to $DATA{3}{h}{i}, or $DATA{3}{h}{i}{j}, or $DATA{3}{h}{i}{j}{k}. So I need a function that adapts to the number of dimensions of the hash dynamically.	[reply]
Re^3: Missing something with regards to references by cLive ;-) (Prior) on Nov 01, 2004 at 22:49 UTC
Right, but my point is, what are you trying to achieve by calling this: `put_element($DATA{3}, 'h', 'i', 'j', 'k', 'value');` [download] rather than this: `$DATA{3}{h}{i}{j}{k} = 'value'` [download] I don't see how the sub optimizes anything, or provides anything that a direct call does. Unless you're calling it like this: `my @arr = qw(h i j k); put_element($DATA{3}, @arr, 'value');` [download] Is that how you're using it? cLive ;-)	[reply] [d/l] [select]
Re: Missing something with regards to references by Anonymous Monk on Nov 01, 2004 at 22:52 UTC
Note that if you call `put_element($DATA{3}, 'h', 'i', 'worked');`, and `$DATA{3}` isn't a reference, the main datastructure will not be changed. I'd write the function like this: `sub put { die "Usage: put \\hash, key1, ..., keyn, value\n" unless @_ >= 3; my $hash = shift; die "Not a HASH reference\n" unless ref($hash) eq "HASH"; my $value = pop; my $key = pop; while (@_) { my $key = shift; if (!exists($hash->{$key})) { $hash = $hash->{$key} = { }; next; } elsif (ref($hash->{$key}) eq "HASH") { $hash = $hash->{$key}; next; } else { die "Not a HASH reference\n"; } } $hash->{$key} = $value; }` [download] Make sure you always pass in a reference to the `put` function.	[reply] [d/l]