The idea was triggered by IOrdy's post on searching nested structures.

The module encapsulates the recursive path-listing logic and generates a hash of "path => data reference" pairs. One use of the module is to solve IOrdy's original problem, which now comes down to a grep() or regex match over the paths.

Preliminary testing shows no unwanted autovivification.

Module code:

    package PathLister;

    use strict;
    use warnings;

    sub new {
        my $class = shift;
        my $self  = {};
        return bless $self, $class;
    }

    # Return a hash ref that maps every path string to a reference into $data.
    sub list {
        my ($self, $data) = @_;
        my $pathes = {};
        list_slave($data, "", $pathes);
        return $pathes;
    }

    # Recursive worker. Containers are stored as the original reference,
    # leaves as a reference to the scalar slot, so both can be modified in place.
    sub list_slave {
        my ($data, $path, $pathes) = @_;
        if (ref($data) eq "HASH") {
            foreach my $key (keys(%$data)) {
                my $new_path = $path . '->{' . $key . '}';
                if (ref($data->{$key})) {
                    $pathes->{$new_path} = $data->{$key};
                    list_slave($data->{$key}, $new_path, $pathes);
                }
                else {
                    $pathes->{$new_path} = \$data->{$key};
                }
            }
        }
        elsif (ref($data) eq "ARRAY") {
            foreach my $index (0 .. $#{$data}) {
                my $new_path = $path . '->[' . $index . ']';
                if (ref($data->[$index])) {
                    $pathes->{$new_path} = $data->[$index];
                    list_slave($data->[$index], $new_path, $pathes);
                }
                else {
                    $pathes->{$new_path} = \$data->[$index];
                }
            }
        }
    }

    1;

Testing or demo code:

    use strict;
    use warnings;
    use Data::Dumper;
    use PathLister;

    my $data = {
        "a" => {
            "b" => { "c" => 100, "e" => 101 },
            "k" => {},
        },
        "b" => { "c" => 3, "d" => 4 },
        "c" => [ 1, 2, 3, 4, 5 ],
    };

    my $lister = PathLister->new;
    print Dumper($data);

    my $pathes = $lister->list($data);
    foreach my $key (keys(%$pathes)) {
        print "key = $key\n";
        print Dumper($pathes->{$key});
    }

    # Leaves are stored as scalar references, so writing through them changes $data.
    ${$pathes->{'->{a}->{b}->{c}'}} = 200;
    ${$pathes->{'->{c}->[2]'}} = 10;

    # Containers are stored as the original references.
    my $hash_ref = $pathes->{'->{b}'};
    $hash_ref->{"c"} = 3000;

    print Dumper($data);

Result from the testing code:

    $VAR1 = {
              'c' => [
                       1,
                       2,
                       3,
                       4,
                       5
                     ],
              'a' => {
                       'k' => {},
                       'b' => {
                                'e' => 101,
                                'c' => 100
                              }
                     },
              'b' => {
                       'c' => 3,
                       'd' => 4
                     }
            };
    key = ->{c}->[0]
    $VAR1 = \1;
    key = ->{c}
    $VAR1 = [
              1,
              2,
              3,
              4,
              5
            ];
    key = ->{a}->{b}->{c}
    $VAR1 = \100;
    key = ->{b}->{c}
    $VAR1 = \3;
    key = ->{c}->[4]
    $VAR1 = \5;
    key = ->{c}->[1]
    $VAR1 = \2;
    key = ->{c}->[3]
    $VAR1 = \4;
    key = ->{a}->{b}
    $VAR1 = {
              'e' => 101,
              'c' => 100
            };
    key = ->{b}
    $VAR1 = {
              'c' => 3,
              'd' => 4
            };
    key = ->{c}->[2]
    $VAR1 = \3;
    key = ->{a}
    $VAR1 = {
              'k' => {},
              'b' => {
                       'e' => 101,
                       'c' => 100
                     }
            };
    key = ->{a}->{k}
    $VAR1 = {};
    key = ->{b}->{d}
    $VAR1 = \4;
    key = ->{a}->{b}->{e}
    $VAR1 = \101;
    $VAR1 = {
              'c' => [
                       1,
                       2,
                       10,
                       4,
                       5
                     ],
              'a' => {
                       'k' => {},
                       'b' => {
                                'e' => 101,
                                'c' => 200
                              }
                     },
              'b' => {
                       'c' => 3000,
                       'd' => 4
                     }
            };
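
To illustrate the grep() use mentioned at the top, here is a minimal sketch of searching the path hash. It is only an illustration: list_paths is a made-up stand-in that builds the same kind of "path => reference" hash as the module, included so the snippet runs on its own, and the two searches are arbitrary examples rather than IOrdy's actual query.

```perl
use strict;
use warnings;

# Stand-in for PathLister::list so the example runs on its own
# (the name list_paths is made up for this sketch).
sub list_paths {
    my ($data, $path, $paths) = @_;
    my @steps;    # [ path suffix, reference to the slot ] for each child
    if (ref($data) eq 'HASH') {
        push @steps, [ '->{' . $_ . '}', \$data->{$_} ] for keys %$data;
    }
    elsif (ref($data) eq 'ARRAY') {
        push @steps, [ '->[' . $_ . ']', \$data->[$_] ] for 0 .. $#$data;
    }
    for my $step (@steps) {
        my ($suffix, $slot) = @$step;
        my $new_path = $path . $suffix;
        if (ref $$slot) {
            $paths->{$new_path} = $$slot;     # container: keep the original reference
            list_paths($$slot, $new_path, $paths);
        }
        else {
            $paths->{$new_path} = $slot;      # leaf: keep a reference to the slot
        }
    }
    return $paths;
}

my $data = {
    a => { b => { c => 100, e => 101 }, k => {} },
    b => { c => 3, d => 4 },
    c => [ 1, 2, 3, 4, 5 ],
};

my $paths = list_paths($data, '', {});

# Search by key: every path whose last step is the hash key "c".
my @c_paths = grep { /->\{c\}$/ } sort keys %$paths;
print "$_\n" for @c_paths;
# ->{a}->{b}->{c}
# ->{b}->{c}
# ->{c}

# Search by value: every leaf holding a number greater than 100.
my @big = grep { ref($paths->{$_}) eq 'SCALAR' && ${ $paths->{$_} } > 100 }
          sort keys %$paths;
print "$_\n" for @big;
# ->{a}->{b}->{e}
```

Because each value is a reference into the original structure, a match can be read or modified through the hash without evaluating the path string.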

Re: multi-tier collection path lister
by simonm (Vicar) on Jan 06, 2004 at 05:21 UTC
    FWIW, there's a similar function in Data::DRef called leaf_drefs_and_values(), although it differs in that it only returns extended-key-string/value pairs for the outermost, non-reference items of the structure, and it uses "." as the separator rather than "->".
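
For comparison, the leaf-only, dot-separated shape described in this reply could look roughly like the sketch below. This is a hand-rolled illustration, not Data::DRef's code or API: the helper name leaf_paths and its arguments are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::DRef): collect only the
# non-reference leaves, keyed by a "."-joined path.
sub leaf_paths {
    my ($data, $prefix, $out) = @_;
    my @pairs;    # [ step, value ] for each child
    if (ref($data) eq 'HASH') {
        push @pairs, [ $_, $data->{$_} ] for keys %$data;
    }
    elsif (ref($data) eq 'ARRAY') {
        push @pairs, [ $_, $data->[$_] ] for 0 .. $#$data;
    }
    for my $pair (@pairs) {
        my ($step, $value) = @$pair;
        my $key = defined $prefix ? $prefix . '.' . $step : $step;
        if (ref($value) eq 'HASH' || ref($value) eq 'ARRAY') {
            leaf_paths($value, $key, $out);    # containers get no entry of their own
        }
        else {
            $out->{$key} = $value;             # plain value, not a reference to it
        }
    }
    return $out;
}

my $data = {
    a => { b => { c => 100, e => 101 }, k => {} },
    b => { c => 3, d => 4 },
    c => [ 1, 2, 3, 4, 5 ],
};

my $flat = leaf_paths($data, undef, {});
print "$_ = $flat->{$_}\n" for sort keys %$flat;
```

Two differences from the path lister show up here: intermediate containers such as a.b and the empty hash a.k get no entry, and the values are copies, so assigning to $flat->{'a.b.c'} does not change $data.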
Re: multi-tier collection path lister
by exussum0 (Vicar) on Jan 06, 2004 at 12:37 UTC
    Don't assume your data is acyclic. My advice is to keep a set of the reference addresses you've already visited and not visit them again. Try your code on this kind of structure:
    my $y = {};
    my $x = { a => { b => { c => $y } } };
    $y->{d} = $x->{a};
    $y->{e} = 1;
    Use the set to record where you have already been and to decide whether to descend into a node again. It grows with the number of nodes you have.
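
The visited-set advice above can be sketched as follows. This is one possible reading of the suggestion, not a patch to the posted module: the function name list_paths_once and the extra $seen argument are assumptions for the example, and refaddr comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical cycle-safe variant of the lister. $seen is a hash used
# as a set of container addresses that have already been descended into.
sub list_paths_once {
    my ($data, $path, $paths, $seen) = @_;

    my $type = ref($data);
    return $paths unless $type eq 'HASH' || $type eq 'ARRAY';

    # Stop if this container was visited before; otherwise mark it.
    return $paths if $seen->{ refaddr($data) }++;

    my @steps;    # [ path suffix, reference to the slot ] for each child
    if ($type eq 'HASH') {
        push @steps, [ '->{' . $_ . '}', \$data->{$_} ] for keys %$data;
    }
    else {
        push @steps, [ '->[' . $_ . ']', \$data->[$_] ] for 0 .. $#$data;
    }

    for my $step (@steps) {
        my ($suffix, $slot) = @$step;
        my $new_path = $path . $suffix;
        if (ref $$slot) {
            $paths->{$new_path} = $$slot;
            list_paths_once($$slot, $new_path, $paths, $seen);
        }
        else {
            $paths->{$new_path} = $slot;
        }
    }
    return $paths;
}

# The cyclic structure from the reply above.
my $y = {};
my $x = { a => { b => { c => $y } } };
$y->{d} = $x->{a};
$y->{e} = 1;

my $paths = list_paths_once($x, '', {}, {});
print "$_\n" for sort keys %$paths;
```

In this sketch the path that closes the loop, ->{a}->{b}->{c}->{d}, is still recorded, but the traversal does not descend into it a second time, so the recursion terminates. The $seen hash holds one entry per container visited, which is the memory cost mentioned in the reply.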

    Play that funky music white boy..