flightdm has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, Monastery denizens: I've been struggling with this seemingly simple problem for a while, and would really appreciate some input from brains that are not mine. I have a data structure of arbitrary depth handed to me that is composed of hash/arrayrefs for which I need to create a path-type lookup. This is proving to be a pain, since I have to figure out what the structure is at every level - so I would really rather flatten it to hashrefs all the way down. The main fact here is that every hashref has a 'name' key in it pointing to a text value - so that's how I'd like to "re-key" this thing. Example:
$foo = {
    [
        { name => 'a', type => 1, rockets => 2, leaves => 3 },
        { name => 'b', type => 7, rockets => 4, leaves => 1,
          samples => [
              { name => 'mary', dog => 'fifi' },
              { name => 'john', fish => 'oscar' }
          ]
        }
    ]
};
And here's what I'd like to produce (I don't care whether the { 'name' => 'value' } entry remains in the result or not):
$foo1 = {
    a => { name => 'a', type => 1, rockets => 2, leaves => 3 },
    b => { name => 'b', type => 7, rockets => 4, leaves => 1,
           samples => {
               mary => { name => 'mary', dog => 'fifi' },
               john => { name => 'john', fish => 'oscar' }
           }
    }
};
Thanks in advance for the help!

Re: Recursive data structure munging - arrayrefs to hashrefs
by haukex (Archbishop) on Aug 31, 2017 at 19:35 UTC

    First, the data structure you've shown isn't actually valid: $foo = { [ ... ] }; is missing a hash key, so for now I'm just going to assume $foo = { foo => [ ... ] };

    I need to create a path-type lookup

    You may be interested in Data::Diver. I also showed some custom "diver" type code here and here.
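As a rough illustration of what a "path-type lookup" could look like once the structure is hashrefs all the way down, here is a minimal core-Perl sketch. The helper name lookup_path and the '/'-separated path syntax are assumptions made for this example; this is not the Data::Diver API, and it assumes no key contains a '/'.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Diver): follow a '/'-separated
# path through nested hashrefs. Returns undef in scalar context if any
# step is missing or a non-hash is reached before the path ends.
sub lookup_path {
    my ($root, $path) = @_;
    my $node = $root;
    for my $step (grep { length } split m{/}, $path) {
        return unless ref $node eq 'HASH' && exists $node->{$step};
        $node = $node->{$step};
    }
    return $node;
}

# Sample data in the already re-keyed shape the question asks for.
my $rekeyed = {
    foo => {
        a => { name => 'a', type => 1 },
        b => {
            name    => 'b',
            samples => { mary => { name => 'mary', dog => 'fifi' } },
        },
    },
};

my $dog = lookup_path($rekeyed, 'foo/b/samples/mary/dog');
print "$dog\n";    # prints "fifi"
```

A lookup like this only stays simple because every level is a hash, which is the motivation for re-keying the arrays first.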

    Also, it's unclear to me whether your data structure always alternates between array and hash refs, or whether you can ever have a hash of hashes. If you can't, this code will work on the sample data you showed, but it does make some assumptions about the structure of the input data. If you need to differentiate between arrayrefs and hashrefs, you'll have to add things like if (ref $e eq 'ARRAY') { ... } elsif (ref $e eq 'HASH') { ... }.

    sub rekey {
        my $h = shift;
        for my $k (keys %$h) {
            next unless ref $$h{$k};
            my %n;
            for my $e (@{$$h{$k}}) {
                die "duplicate key '$$e{name}'" if exists $n{$$e{name}};
                $n{$$e{name}} = $e;
                rekey($e);
            }
            $$h{$k} = \%n;
        }
    }
    rekey( $foo );

    Update: You might also want to take a look at Data::DPath or Data::Path. I haven't used these myself, but they sound like they might fit.

    Update 2: Fixed typo in text.

      First, the data structure you've shown isn't actually valid: $foo = { [ ... ] }; is missing a hash key, so for now I'm just going to assume $foo = { foo => [ ... ] };

      Thanks - that was, obviously, incorrect. The structure does not necessarily alternate between @ and % - but I'm fairly certain that there's no AoA component anywhere (and lots of HoH structures.)

      Your code works for $foo, but unfortunately not for the more complex structure - "Not an ARRAY reference" errors, unsurprisingly. I'm pretty sure it's that assumption that the next level down must be an array... as I've said, it's arbitrary depth. I appreciate your effort, though!

        Your code works for $foo, but unfortunately not for the more complex structure

        Well, as I said, easily fixed with an if. More representative sample input data gets you better answers ;-) The following should work on AoH and HoH, but not yet on AoA, and the root must be a hash. Extending this further is left as an exercise to the reader...

        sub rekey {
            my $h = shift;
            KEY: for my $k (keys %$h) {
                if (ref $$h{$k} eq 'ARRAY') {
                    rekey($_) for @{$$h{$k}};
                    my %n;
                    for my $e (@{$$h{$k}}) {
                        next KEY unless exists $$e{name};
                        die "duplicate key '$$e{name}'" if exists $n{$$e{name}};
                        $n{$$e{name}} = $e;
                    }
                    $$h{$k} = \%n;
                }
                elsif (ref $$h{$k} eq 'HASH') { rekey($$h{$k}) }
            }
        }
        rekey( $foo );

        Update: I realized based on the sample data that you posted here that not all of your hashes in the AoHs have "name" keys. So I modified the above to skip transforming such AoHs with next KEY unless exists $$e{name};. Other approaches are of course possible too, if you can identify a useful key.

        Update 2: I applied that change a little too hastily: it wasn't correct, as it was no longer recursing into the full data structure. Fixed.
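One way the "root must be a hash" restriction might be worked around, sketched under the assumption that the in-place rekey shown in this reply is used unchanged: wrap the root in a temporary one-key hash, re-key that, and unwrap. The wrapper name rekey_root and the key 'root' are made up for this example.

```perl
use strict;
use warnings;

# The in-place re-keying from this reply, reproduced so the example runs
# on its own: handles AoH and HoH, not AoA; the root must be a hash.
sub rekey {
    my $h = shift;
    KEY: for my $k (keys %$h) {
        if (ref $$h{$k} eq 'ARRAY') {
            rekey($_) for @{$$h{$k}};
            my %n;
            for my $e (@{$$h{$k}}) {
                next KEY unless exists $$e{name};
                die "duplicate key '$$e{name}'" if exists $n{$$e{name}};
                $n{$$e{name}} = $e;
            }
            $$h{$k} = \%n;
        }
        elsif (ref $$h{$k} eq 'HASH') { rekey($$h{$k}) }
    }
}

# Hypothetical wrapper: accept an arrayref (or hashref) root by hiding it
# inside a temporary hash, then return whatever the root turned into.
sub rekey_root {
    my ($data) = @_;
    my %wrapper = ( root => $data );
    rekey(\%wrapper);
    return $wrapper{root};
}

my $list = [
    { name => 'a', type => 1 },
    { name => 'b', samples => [ { name => 'mary', dog => 'fifi' } ] },
];
my $by_name = rekey_root($list);
print join(',', sort keys %$by_name), "\n";    # prints "a,b"
```

Note that the hashes inside $list are still modified in place (for example, 'samples' becomes a hashref); only the outermost array is replaced by a new hash.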

Re: Recursive data structure munging - arrayrefs to hashrefs
by tybalt89 (Monsignor) on Aug 31, 2017 at 19:41 UTC

    Try this. I had to fix your initial data: the outside braces were wrong.

    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1198452

    use strict;
    use warnings;

    my $foo = [
        { name => 'a', type => 1, rockets => 2, leaves => 3 },
        { name => 'b', type => 7, rockets => 4, leaves => 1,
          samples => [
              { name => 'mary', dog => 'fifi' },
              { name => 'john', fish => 'oscar' }
          ]
        }
    ];

    sub fix {
        my ($obj) = @_;
        if( 'HASH' eq ref $obj ) {
            return $obj->{name} => { map fix($_), %$obj };
        }
        elsif( 'ARRAY' eq ref $obj ) {
            return { map fix($_), @$obj }
        }
        else {
            return $obj;
        }
    }

    use Data::Dump 'pp';

    print "before:\n";
    pp $foo;
    my $newfoo = fix($foo);
    print "after\n";
    pp $newfoo;

    Outputs:

    before:
    [
      { leaves => 3, name => "a", rockets => 2, type => 1 },
      {
        leaves => 1,
        name => "b",
        rockets => 4,
        samples => [
          { dog => "fifi", name => "mary" },
          { fish => "oscar", name => "john" },
        ],
        type => 7,
      },
    ]
    after:
    {
      a => { leaves => 3, name => "a", rockets => 2, type => 1 },
      b => {
        leaves => 1,
        name => "b",
        rockets => 4,
        samples => {
          john => { fish => "oscar", name => "john" },
          mary => { dog => "fifi", name => "mary" },
        },
        type => 7,
      },
    }

      The output ends up looking like this:
      Use of uninitialized value in anonymous hash ({}) at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      Odd number of elements in anonymous hash at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      Use of uninitialized value in anonymous hash ({}) at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      Odd number of elements in anonymous hash at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      Use of uninitialized value in anonymous hash ({}) at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      Odd number of elements in anonymous hash at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      Use of uninitialized value in anonymous hash ({}) at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      Odd number of elements in anonymous hash at /Users/ben/Work/cert/mine/restruct line 33, <$fh> line 1.
      [...]
      {
        "" => { items => {} },
        "2017-08-30T22:47:28.941Z" => "clusters",
        "allHostsConfig" => undef,
        "HASH(0x7f91e29911a8)" => "hosts",
        "HASH(0x7f91e29c8c18)" => "timestamp",
        "HASH(0x7f91e29c8f30)" => "hostTemplates",
        "HASH(0x7f91e29d1720)" => undef,
        "managementService" => "mgmt",
        "peers" => {},
      [...]
      I see what you're trying to do, though - which gives me lots of ideas. Thank you!

        Show a Data::Dump printout of your data before you run it through fix()
        Or at least a small subset that shows the problem you claim.
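Without the real input this is only a guess, but the warnings quoted above are consistent with two things mentioned elsewhere in the thread: hashes nested directly inside hashes, and hashes with no 'name' key. In fix(), a hash always returns a two-element list (name => { ... }); when that hash is itself a hash value, the extra element shifts the key/value pairs, which would produce "Odd number of elements" and stringified HASH(0x...) keys, and an undefined name would produce the "uninitialized value" warning and the "" key. A minimal sketch that avoids both, assuming the desired behaviour is to re-key only those arrays in which every element is a hash with a defined 'name' (the function name rekey_copy and the sample data are made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: returns a re-keyed copy of
# $obj and leaves the input untouched.
sub rekey_copy {
    my ($obj) = @_;
    my $type = ref $obj;

    if ($type eq 'HASH') {
        # Exactly one transformed value per key, so pairs never shift.
        my %copy = map { $_ => rekey_copy($obj->{$_}) } keys %$obj;
        return \%copy;
    }

    if ($type eq 'ARRAY') {
        my @items = map { rekey_copy($_) } @$obj;

        # Count elements that cannot be used for re-keying.
        my $unnamed = grep { ref $_ ne 'HASH' || !defined $_->{name} } @items;
        return \@items if !@items || $unnamed;    # leave such arrays as arrays

        my %by_name;
        for my $item (@items) {
            die "duplicate key '$item->{name}'" if exists $by_name{ $item->{name} };
            $by_name{ $item->{name} } = $item;
        }
        return \%by_name;
    }

    return $obj;    # plain scalars and other reference types pass through
}

# Made-up sample mixing HoH, AoH with names, AoH without names, and an empty array.
my $data = {
    clusters => [ { name => 'c1', hosts => [ { name => 'h1' }, { name => 'h2' } ] } ],
    peers    => [],
    mgmt     => { timestamp => 'now', items => [ { id => 1 }, { id => 2 } ] },
};
my $out = rekey_copy($data);
print join(',', sort keys %{ $out->{clusters}{c1}{hosts} }), "\n";    # prints "h1,h2"
```

Arrays that cannot be re-keyed stay arrays, so a path lookup over the result would still need to handle numeric indices for those branches.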