Test if a subhash in a referenced hash exists

Henri has asked for the wisdom of the Perl Monks concerning the following question:

Re: Test if a subhash in a referenced hash exists by moritz (Cardinal) on May 28, 2010 at 08:34 UTC
`if (not exists $coordinates->{$group}{$id}{$stage}{"coords"}) { ... }` (untested)
Re^2: Test if a subhash in a referenced hash exists by DrHyde (Prior) on May 28, 2010 at 10:43 UTC
If $group, $id, $stage etc. don't exist in $coordinates, then that will auto-vivify them. You need to check the existence of each level in the hash if magically creating them merely by looking at them is a problem for your application. Something like this (ignoring your application's logic for the sake of making my example code clearer):

`if (exists($coordinates->{$group})) {
    if (exists($coordinates->{$group}->{$id})) {
        if (exists($coordinates->{$group}->{$id}->{$stage})) {
            ...
        }
    }
}`

or, to generalise (untested, I haven't even tried to compile it):

`my $result = do_stuff_in_hash_without_autovivifying(
    sub {
        my $hash_to_work_on = shift;
        # do stuff here
    },
    $coordinates,                    # initial hash
    $group, $id, $stage, 'coords'    # list of keys to traverse
);

sub do_stuff_in_hash_without_autovivifying {
    my ($sub, $hash, @keys) = @_;
    if (@keys && exists($hash->{$keys[0]})) {
        # the next key exists, so descend one level and carry on
        return do_stuff_in_hash_without_autovivifying(
            $sub, $hash->{$keys[0]}, @keys[1 .. $#keys]
        );
    } else {
        # no keys left, or the next key is missing: stop here
        return $sub->($hash);
    }
}`

If you're really paranoid about auto-vivification, then you can subvert Tie::Hash::Vivify to turn it into a fatal error instead of silent data corruption:

`use Tie::Hash::Vivify;
use Data::Dumper;

my $hash = Tie::Hash::Vivify->new(sub {
    die("No auto-vivifying! Bad programmer! No bikkit!\n" . Dumper(\@_));
});`
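A minimal sketch of the effect described in this reply, assuming a made-up `$coordinates` structure (the key names `AC`, `ZZ` and `YY` are purely illustrative). It shows that a deep `exists` on a missing path leaves the intermediate levels behind, while a level-by-level `&&` chain stops at the first missing key:

```perl
use strict;
use warnings;

# Hypothetical sample data; the key names are made up for illustration.
my $coordinates = { AC => { 132 => { 0 => { coords => { 1 => 'x' } } } } };

# A deep exists() on a missing path creates every intermediate level
# ({ZZ}, {ZZ}{999} and {ZZ}{999}{0}), although the test itself is false.
my $found    = exists $coordinates->{ZZ}{999}{0}{coords};
my $vivified = exists $coordinates->{ZZ} ? 1 : 0;

# Checking one level at a time with && stops at the first missing key,
# so nothing below it is ever touched.
my $safe = exists $coordinates->{YY}
        && exists $coordinates->{YY}{999}
        && exists $coordinates->{YY}{999}{0}
        && exists $coordinates->{YY}{999}{0}{coords};
my $untouched = exists $coordinates->{YY} ? 0 : 1;

print "found=", ($found ? 1 : 0), " vivified=$vivified untouched=$untouched\n";
```

With this data the script should report that the path was not found, that `ZZ` now exists anyway, and that `YY` was left alone.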
Re^3: Test if a subhash in a referenced hash exists by moritz (Cardinal) on May 28, 2010 at 11:22 UTC
Another way to prevent autovivification is to use Data::Diver.
Re^3: Test if a subhash in a referenced hash exists by Henri (Novice) on May 29, 2010 at 13:21 UTC
Thanks for pointing out that autovivified hash parts stay around. Reading about autovivification, I always thought that those parts would appear on the fly and be gone again after they had been looked at. Since I actually might test hashes several times, I will be on the lookout for this situation.
Re^2: Test if a subhash in a referenced hash exists by Henri (Novice) on May 28, 2010 at 10:22 UTC
Moritz, your reply helped me find the problem. After your suggestion did not work either, I rechecked the subroutine and found that I had not passed $coordinates into it. Now that I do, your suggestion and my original line both work. Also, I thought about exists, defined and true, and think I should use exists, because it is a prerequisite to defined and true. Thanks a lot for your help! Henri
Re^3: Test if a subhash in a referenced hash exists by almut (Canon) on May 28, 2010 at 10:37 UTC
...but note that exists does not test what you asked for in the subject line, i.e. whether a subhash exists. It just tests whether the respective hash key exists, or more precisely, whether the respective hash element has ever been initialized (with whatever value, including `undef`).

`$data->{foo} = "";
if ( exists $data->{foo} ) {
    print $data->{foo}{bar};   # 'Can't use string ("") as a HASH ref ...'
}`

or with `undef` as value (in which case the missing hash would be autovivified)

`$data->{foo} = undef;
if ( exists $data->{foo} ) {
    print $data->{foo}{bar};   # 'Use of uninitialized value in print ...'
}`
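To make the difference between the tests concrete, here is a small sketch that applies exists, defined, plain truth and a `ref` check to a handful of values. The `%data` hash and its key names are assumptions chosen for illustration, not taken from the original script:

```perl
use strict;
use warnings;

# Hypothetical values a slot might hold; only the last one is a subhash.
my %data = (
    empty_string => "",
    undefined    => undef,
    zero         => 0,
    subhash      => {},
);

my %report;
for my $key (sort(keys %data), 'missing') {
    $report{$key} = join ' ',
        'exists='  . (exists($data{$key})         ? 1 : 0),
        'defined=' . (defined($data{$key})        ? 1 : 0),
        'true='    . ($data{$key}                 ? 1 : 0),
        'hashref=' . (ref($data{$key}) eq 'HASH'  ? 1 : 0);
    print "$key: $report{$key}\n";
}
```

Only the `subhash` entry passes all four tests; the empty string, `undef` and `0` all exist as keys without being usable as hashes.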
Re^4: Test if a subhash in a referenced hash exists by Henri (Novice) on May 29, 2010 at 13:17 UTC
Re: Test if a subhash in a referenced hash exists by almut (Canon) on May 28, 2010 at 10:09 UTC
"Can't use string ("") as a HASH ref while "strict refs" in use That error results from trying to evaluate an empty string as a hash reference. In other words, the part you have in between `%{...}` most likely evaluates to the empty string, instead of the hashref that would be needed here: `use strict; use warnings; my $data = {}; $data->{foo} = { a => 1}; $data->{bar} = ""; if ( %{ $data->{foo} } ) { ... } # ok, because $data->{foo} is a has +href if ( %{ $data->{bar} } ) { ... } # not ok, because $data->{bar} is e +mpty` [download] Best is probably to directly test whether the value in question is a hashref, i.e. `if ( ref($data->{bar}) eq "HASH" ) { ... } # ok, even if $data->{bar +} is empty/undef` [download] or, adapted to your example: `... next unless ref($coordinates->{$group}{$id}{$stage}{"coords"}) eq +"HASH"; # print individual and coordinate information` [download] See ref.	[reply] [d/l] [select]
Re^2: Test if a subhash in a referenced hash exists by Henri (Novice) on May 29, 2010 at 13:24 UTC
The empty string crept in because I had a legacy line in my code that set `$coordinates = ""` <grmpf>. I am trying to clean up a script that has grown over the past years. I had tried ref, but it had not worked, probably due to my notation confusion. Your %{$data->{foo}} notation is another variant of the ones I just posted.
Re: Test if a subhash in a referenced hash exists by Henri (Novice) on May 29, 2010 at 13:12 UTC
Thanks to all of you, your comments are really helpful to me.

The background of my question is that I am at a point where my scripts are getting more numerous and complex, while I am still just repeating what has worked in the past without necessarily understanding why. To take a step ahead I dearly need to clarify some basic concepts. These seem obvious when I read about them, but get mangled when I try to apply them. I am just not comfortable with (de-)referencing complex data structures and testing for existence.

Part of my confusion seems to boil down to the following. All of you write my hash structures like this:

`if (not exists $coordinates->{$group}{$id}{$stage}{"coords"}) {`

and as I understand it the two following notations are equivalent:

`if (not exists ${$coordinates}{$group}{$id}{$stage}{"coords"}) {
if (not exists ${${${${$coordinates}{$group}}{$id}}{$stage}}{"coords"}) {`

All three work; they dereference the hash reference $coordinates and provide the hash reference of the anonymous subhash that has the coordinate numbers as keys at its top level. The following tests for the `$coordinates->` notation also go through without error:

`if (not defined $coordinates->{$group}{$id}{$stage}{"coords"}) {
if (!($coordinates->{$group}{$id}{$stage}{"coords"})) {`

However, what about `%{$$coordinates{$group}{$id}{$stage}{"coords"}}`, which is used in

`foreach my $coord_no (keys %{$$coordinates{$group}{$id}{$stage}{"coords"}}) {...}`

I always thought this was equivalent and would also give the hash reference of the anonymous subhash? But

`if (not exists %{$$coordinates{$group}{$id}{$stage}{"coords"}}) {
# exists argument is not a HASH or ARRAY element`

Thus, what does it return?
And why is what it returns defined and true, although it does not exist? These both go through just fine:

`if (not defined %{$$coordinates{$group}{$id}{$stage}{"coords"}}) {
if (!(%{$$coordinates{$group}{$id}{$stage}{"coords"}})) {`

To make my confusion complete, tests of a similar hash give different results:

`if (not exists %{$$original_data{$group}{$id}{$stage}}) {
# exists argument is not a HASH or ARRAY element
if (not defined %{$$original_data{$group}{$id}{$stage}}) {
# works
if (!(%{$$original_data{$group}{$id}{$stage}})) {
# Can't use an undefined value as a HASH reference`

Mmh.
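A short sketch of the notations compared in this post, using made-up data. It checks that the three element spellings hand back the very same reference, and that wrapping the element in `%{ ... }` yields the hash itself, which is what `keys` wants:

```perl
use strict;
use warnings;

# Made-up sample data.
my $coordinates = {
    AC => { 132 => { 0 => { coords => { 1 => 'Xgeo', 2 => 'Ygeo' } } } },
};
my ($group, $id, $stage) = ('AC', 132, 0);

# Three spellings of the same element; each yields the subhash reference.
my $arrow  = $coordinates->{$group}{$id}{$stage}{coords};
my $braces = ${$coordinates}{$group}{$id}{$stage}{coords};
my $sigils = $$coordinates{$group}{$id}{$stage}{coords};

# References compare equal numerically when they point at the same hash.
my $same = ($arrow == $braces && $braces == $sigils) ? 1 : 0;

# %{ ... } around the element dereferences it: the result is the hash
# itself, which keys() accepts but exists() does not.
my @coord_numbers = sort keys %{ $coordinates->{$group}{$id}{$stage}{coords} };

print "same=$same keys=@coord_numbers\n";
```

As a side note, `defined` applied to a whole hash was deprecated for a long time and, as far as I know, became a fatal error in Perl 5.22, so the `defined %{...}` form should not be relied on in newer perls.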
Re^2: Test if a subhash in a referenced hash exists by wfsp (Abbot) on May 29, 2010 at 15:52 UTC
Have a look at the very useful References quick reference. Also, the keys doc. The argument for `keys` must be a hash and has a `%` sigil.

`for my $key (keys %hash) {
    #...
}
for my $key (keys %{$hashref}) {   # dereferencing a hash
    #...
}`

The exists doc tells you its argument is a hash element, which will have a `$` sigil. You need to read your error message

`exists argument is not a HASH or ARRAY element`

as

`exists argument is neither a HASH element nor an ARRAY element`

`if (exists $hash{some_key}) {
    #..
}
if (exists $hashref->{some_key}) {   # dereferencing a hash element using an arrow
    #..
}`

An observation.

...what about `%{$$coordinates{$group}{$id}{$stage}{"coords"}}`...

The outer `%{...}` dereferences the subhash. The leading `$$coordinates{$group}` is just another spelling of `$coordinates->{$group}`. I prefer the `->` for dereferencing (many monks don't), and between subscripts it is optional (as in this case). Have a look at the tutorial linked to above. I look at it at least once a week. :-)

update: When I'm having a fight with a convoluted data structure, I try out the syntax on a simplified version and get that working first. And while you're doing that, don't forget the mighty Data::Dumper. Good luck!
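The split between "element" and "whole hash" can be tried out on a simplified structure, roughly like this. The `$hashref` contents are invented for the example:

```perl
use strict;
use warnings;

# Made-up sample data.
my $hashref = { coords => { 1 => 'Xgeo', 2 => 'Ygeo' }, empty => {} };

# exists() takes a hash ELEMENT ($ sigil or arrow): is the key there at all?
my $has_coords = exists($hashref->{coords}) ? 1 : 0;
my $has_other  = exists($hashref->{other})  ? 1 : 0;

# keys() takes a whole HASH (% sigil); in scalar context it returns
# the number of entries in the subhash.
my $coords_count = keys %{ $hashref->{coords} };
my $empty_count  = keys %{ $hashref->{empty} };

print "has_coords=$has_coords has_other=$has_other ",
      "coords_count=$coords_count empty_count=$empty_count\n";
```

So `exists` answers "is there a slot with this key?", while `keys %{...}` answers "how much is inside the subhash that slot points to?".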
Re^3: Test if a subhash in a referenced hash exists by Henri (Novice) on May 30, 2010 at 18:30 UTC
wfsp, your mentioning that exists and keys need different types of input, in combination with Data::Dumper, got me on the right track. I checked the results of the different notations with Data::Dumper and voila! With this a major brain knot got untangled, and some of the error messages are now actually starting to make sense to me.

`print Dumper($coordinates->{"AC"}{"132"}{"0"}{"coords"});

$VAR1 = {
    '1' => { 'value' => '4411478.623', 'name' => 'Xgeo' },
    '2' => { 'value' => '5953375.013', 'name' => 'Ygeo' }
};`

Here a scalar is returned. $VAR1, I think, is the hash reference of the subhash.

`print Dumper(%{$$coordinates{"BD"}{"132"}{"0"}{"coords"}});

$VAR1 = '1';
$VAR2 = { 'value' => '4411478.623', 'name' => 'Xgeo' };
$VAR3 = '2';
$VAR4 = { 'value' => '5953375.013', 'name' => 'Ygeo' };`

This returns the hash contents (keys and the references of the next level subhashes) as a list. From how `exists` and `defined` behave, I take it that `exists` requires a single element, while `defined` is able to handle lists, too. I have to check that. But I take it I should test for the subhash reference ($hashref->) as almut pointed out, and not test the hash list itself (%{$$hashref...}).

The refs quick ref page and the keys page were new to me, thanks for pointing me there. Most recently I have been looking at perlref, perlreftut, perlsub, perlvar, perldata and exists. Generally, most of the examples seem to deal with the topmost layer of a HoH or its final leaves layer, i.e. when you finally access a scalar. Examples for the middle of a HoH, when you have to deal with subhashes and subarrays, are much rarer. Maybe it is just obvious, but to me it's not intuitive from the start.
At the topmost layer you access the hash (reference); the results are from `print`:

`$hashref      # HASH(0x183e2e8)
%{$hashref}   # %{HASH(0x183e2e8)}
%$hashref     # %{HASH(0x183e2e8)}`

and you access the leaves:

`$$hashref{$key1}{$key2}{$key3}   # = scalar value`

In the middle you always need the extra dereference to first get at the subhash reference (a scalar) and then with it access the subhash:

`$$hashref{$key1}{$key2}{$key3}      # HASH(0x18b9df8)
$hashref->{$key1}{$key2}{$key3}     # HASH(0x18b9df8)
%{$$hashref{$key1}{$key2}{$key3}}   # 1HASH(0x18b9e58)2HASH(0x18b9e88)`

The following notations don't work:

`%{$hashref{$key1}{$key2}{$key3}}   # Global symbol %hashref requires explicit package name
%$hashref{$key1}{$key2}{$key3}     # syntax error`

Your comments and suggestions really helped me to cut that knot and understand a bit better what I am doing. Thanks to all of you!
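The scalar-versus-list difference seen in the two Dumper calls can also be checked by counting arguments. This is only a sketch: `count_args` is a hypothetical helper and the data is made up to resemble the output shown in this post:

```perl
use strict;
use warnings;

# Made-up data shaped like the Dumper output in this post.
my $coords = {
    1 => { value => '4411478.623', name => 'Xgeo' },
    2 => { value => '5953375.013', name => 'Ygeo' },
};

# Hypothetical helper that just counts the arguments it receives.
sub count_args { return scalar @_ }

my $as_reference = count_args($coords);       # one scalar: the hashref
my $as_hash      = count_args(%{$coords});    # flattened key/value list

print "reference: $as_reference argument, hash: $as_hash arguments\n";
```

The reference arrives as a single argument, while the dereferenced hash arrives as four (two keys and two values), which matches `$VAR1` versus `$VAR1` to `$VAR4`.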