Re: How to improve introspection of an array of hashes

Nice :).

First:

my $out .= "var ";


What what? And please don't name your hashrefs $array ;-).

Just so we're clear: your idea will work when the keys uniquely identify the associated values, so something like this would be valid Perl, but invalid input:

(Shapes => 
  [ 
    { Type => 'Circle', Diameter => 2, Center => [0,1] }, 
    { Type => 'Square', Side => 3, Pos => [4,8] },
    [ {x => 1, y => 1}, {x => 3, y => 1}, {x => 4, y => 2}, {x => 2, y => 2} ]
  ]
);

The values in the Shapes array can either be hashes describing the shape, or an array of points, and you need to look inside the hashes to know the type of shape and the associated members.
Edit: actually my solution below accepts mixed ARRAY/HASH. It just melds Circle and Square together in a way that might not make much sense.

As can be seen, my crude approach of merging each element into a %giant_hash, while great for data if everything within the array hashes is a hash, falls down when arrays are encountered.

Actually the hash is the issue. In %giant_hash = (%giant_hash, %$array); any key of %$array that is already present in %giant_hash overwrites the existing value. This means that the kids of Marge Keefe will erase the kids of Tony Jones.

A hash in Perl can only hold a single value (a scalar) for each key. That value can be a reference that holds other values, but Perl won't build one on its own when you "merge" hashes, so %giant_hash ends up with only one fname, one last_name, one occupation and one set of kids. So you can't actually merge the hashes that way before iterating over them.
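
To make the overwrite concrete, here is a minimal sketch. The two records are cut-down stand-ins for the thread's data, and %merged_kids is a name made up for the illustration:

```perl
use strict;
use warnings;

# Two records that share the keys 'fname' and 'kids' (illustrative data).
my %tony  = (fname => 'tony',  kids => [ { name => 'jimmy' } ]);
my %marge = (fname => 'Marge', kids => [ { name => 'kate' }, { name => 'kim' } ]);

# List-flattening merge: for duplicate keys, the later pair wins.
my %giant_hash = (%tony, %marge);

print "fname: $giant_hash{fname}\n";
print "kids:  ", join(', ', map { $_->{name} } @{ $giant_hash{kids} }), "\n";

# Keeping both sets of kids takes explicit work, e.g. pushing onto one array.
my %merged_kids;
push @{ $merged_kids{kids} }, @{ $_->{kids} } for \%tony, \%marge;
print "all kids: ", join(', ', map { $_->{name} } @{ $merged_kids{kids} }), "\n";
```

Only Marge's kids survive in %giant_hash, while the explicit push keeps all three.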

The reason you merged the hashes in the first place is that they are of the same type, so they contain similar data. That's also true for the kids (basically they have a name, an age, and might be vaccinated), so you should "merge" them too, and show that "kids", on the contrary, may contain an arbitrary number of elements of type "kid". Which would make your output look like:

var is a HASH with 5 keys
        the keys are 'age', 'fname', 'kids', 'last_name', 'occupation'
        key 'kids' is an ARRAY containing HASHREFs:
                the keys are 'age', 'name', 'vaccinated'
                key 'vaccinated' is a SCALAR
                key 'age' is a SCALAR
                key 'name' is a SCALAR
        key 'fname' is a SCALAR
        key 'age' is a SCALAR
        key 'occupation' is a HASH with 2 keys
                the keys are 'title', 'years_on_job'
                key 'title' is a SCALAR
                key 'years_on_job' is a SCALAR
        key 'last_name' is a SCALAR

Here is my attempt at solving your problem. There are of course many ways to do it: keeping the list of keys down to the current point, rather than a reference to the current level, might be a better way to work (you wouldn't have to provide the output hash as a parameter), but I just went where my fingers took me :D

use v5.14;

use strict;
use warnings;

use Data::Dump qw( pp );
use YAML;

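# Walk $data recursively and record in %$output the types found at each level.
# All elements of an array share one 'ARRAY' entry, so their keys get merged.
sub introspect
{
  my ($data, $output) = @_;
  if (ref $data eq 'ARRAY')
  {
    # Every element is described by the same sub-hash.
    my $sub_out = ($output->{'ARRAY'} //= {});
    introspect($_, $sub_out) for @{ $data };
  }
  elsif (ref $data eq 'HASH')
  {
    my $hash_out = ($output->{'HASH'} //= {});
    for my $key (keys %$data)
    {
      # One entry per key, shared by every hash seen at this level.
      my $sub_out = ($hash_out->{$key} //= {});
      introspect($data->{$key}, $sub_out);
    }
  }
  elsif (ref $data)
  {
    # Any other reference (GLOB, CODE, ...) is recorded by its type name.
    $output->{ref($data).'REF'} = 1;
  }
  else
  {
    $output->{SCALAR} = 1;
  }
}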

my @array = ({fname => 'bob',  last_name => 'smith', foo => [\*main]},

             {fname => 'tony', last_name => 'jones', age => 23,
               kids =>
                 [
                   {first_name   => 'cheryl',
                    middle_name => 'karen',
                    age         => 24        },

                   {name         => 'jimmy',
                    age          => 17       }

                 ],
                                                },
             {fname => 'janet', last_name => 'marcos', foo => {},
               occupation => {
                 title => 'trucker',
                 years_on_job => 12}                              },


             {fname => 'Marge', last_name => 'Keefe',
                kids =>
                  [
                    {name => 'kate', age => 7, vaccinated => 'yes'},
                    {name => 'kim', age => 5}
                  ]
             });
             
my %out;

introspect(\@array, \%out);
say pp \%out;
say YAML::Dump(\%out);

{
  ARRAY => {
    HASH => {
      age => { SCALAR => 1 },
      fname => { SCALAR => 1 },
      foo => { ARRAY => { GLOBREF => 1 }, HASH => {} },
      kids => {
        ARRAY => {
          HASH => {
            age => { SCALAR => 1 },
            first_name => { SCALAR => 1 },
            middle_name => { SCALAR => 1 },
            name => { SCALAR => 1 },
            vaccinated => { SCALAR => 1 },
          },
        },
      },
      last_name => { SCALAR => 1 },
      occupation => {
        HASH => { title => { SCALAR => 1 }, years_on_job => { SCALAR => 1 } },
      },
    },
  },
}
---
ARRAY:
  HASH:
    age:
      SCALAR: 1
    fname:
      SCALAR: 1
    foo:
      ARRAY:
        GLOBREF: 1
      HASH: {}
    kids:
      ARRAY:
        HASH:
          age:
            SCALAR: 1
          first_name:
            SCALAR: 1
          middle_name:
            SCALAR: 1
          name:
            SCALAR: 1
          vaccinated:
            SCALAR: 1
    last_name:
      SCALAR: 1
    occupation:
      HASH:
        title:
          SCALAR: 1
        years_on_job:
          SCALAR: 1

Edit: you can add this case to handle things like \\\\\{};

  elsif (ref $data eq 'REF')
  {
    introspect($$data, ($output->{'REF'} //= {}));
  }
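
For comparison, the alternative mentioned before the code (tracking the list of keys down to the current point instead of a reference into the output) could look roughly like the sketch below. The name introspect_paths, the '/'-joined path format and the '[]' marker for "any array element" are all assumptions made for this illustration, not something from the original question:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hashref mapping each path to the set of
# types seen there. The caller does not have to supply an output hash.
sub introspect_paths {
  my ($data) = @_;
  my %seen;
  my @todo = ([ $data ]);            # each item: [ node, key1, key2, ... ]
  while (my $item = shift @todo) {
    my ($node, @path) = @$item;
    # Non-references count as SCALAR; other refs are named by ref().
    my $type = ref($node) || 'SCALAR';
    $seen{ '/' . join('/', @path) }{$type} = 1;
    if ($type eq 'ARRAY') {
      push @todo, map { [ $_, @path, '[]' ] } @$node;
    }
    elsif ($type eq 'HASH') {
      push @todo, map { [ $node->{$_}, @path, $_ ] } keys %$node;
    }
  }
  return \%seen;
}

# Illustrative data, cut down from the example above.
my @people = (
  { fname => 'tony',  kids => [ { name => 'jimmy', age => 17 } ] },
  { fname => 'Marge', kids => [ { name => 'kate', age => 7, vaccinated => 'yes' } ] },
);
my $paths = introspect_paths(\@people);
printf "%-24s %s\n", $_, join(',', sort keys %{ $paths->{$_} })
  for sort keys %$paths;
```

The result is flat rather than nested, which makes it easy to print or grep, at the cost of losing the tree shape that Data::Dump and YAML display for free. Like the version above, it names blessed references by their class.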

Re^2: How to improve introspection of an array of hashes by nysus (Parson) on Sep 13, 2018 at 14:33 UTC
Perfect! Very elegant. I will study this closely. And nice use of Data::Dump and YAML to do the work of formatting the output. Do you think this might be useful as a CPAN module? I searched CPAN but didn't find anything that did anything quite like this.
Re^3: How to improve introspection of an array of hashes by Eily (Monsignor) on Sep 13, 2018 at 15:08 UTC
YAML is often my go-to module when I want formatted data but I'm too lazy to do it myself :). I was hoping YAML would give more compact output than Data::Dump, though. But the latter keeps `{ SCALAR => 1 }` inline where YAML puts it on a separate line.

Do you think this might be useful as a cpan module?

Maybe? It needs some tinkering though (or a rewrite). Like a wrapper to hide the %output hash. And proper handling of objects: right now inspecting `bless {}, 'Pony'` would be indicated as 'PonyREF' and `bless {}, 'ARRAY'` would try to dereference the hashref as an arrayref. Oops.

I'd still be curious to see what others might have to say about the subject. I wouldn't be surprised if there is already a data traversing module that, rather than do what you want already, lets you do it in two to three lines.
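
A rough sketch of that tinkering, under a few assumptions: the names introspect_data and _introspect and the 'OBJECT <class>' label are invented here, and looking inside objects at all is a design choice rather than a requirement. It relies on reftype and blessed from the core module Scalar::Util to tell the underlying type apart from the class name:

```perl
use v5.14;
use strict;
use warnings;
use Scalar::Util qw( blessed reftype );

# Hypothetical wrapper: the caller passes only the data.
sub introspect_data {
  my ($data) = @_;
  my %out;
  _introspect($data, \%out);
  return \%out;
}

sub _introspect {
  my ($data, $output) = @_;
  # reftype() gives the underlying type ('HASH', 'ARRAY', ...) even for
  # blessed references, and undef for plain scalars.
  my $type  = reftype($data) // '';
  my $class = blessed($data);
  # Objects get an extra level named after their class (label is made up).
  $output = ($output->{"OBJECT $class"} //= {}) if defined $class;
  if ($type eq 'ARRAY') {
    _introspect($_, ($output->{ARRAY} //= {})) for @$data;
  }
  elsif ($type eq 'HASH') {
    my $hash_out = ($output->{HASH} //= {});
    _introspect($data->{$_}, ($hash_out->{$_} //= {})) for keys %$data;
  }
  elsif ($type) {
    $output->{ $type . 'REF' } = 1;
  }
  else {
    $output->{SCALAR} = 1;
  }
}

my $result = introspect_data([ bless({ name => 'x' }, 'Pony'), bless({}, 'ARRAY') ]);
```

With this, a hashref blessed into a class called 'ARRAY' is still walked as a hash, because the branch is chosen from reftype and not from the class name.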
Re^4: How to improve introspection of an array of hashes by nysus (Parson) on Sep 13, 2018 at 16:33 UTC
I'm sure someone has done something like this as well. Just couldn't find it. I took your code for a spin in the real world using Google Contacts API. Here's the output from a json response converted to a Perl data structure using Mojo::JSON::decode_json:

HASH => {
  encoding => {},
  feed => {
    HASH => {
      "author" => {
        ARRAY => {
          HASH => {
            email => { HASH => { "\$t" => {} } },
            name => { HASH => { "\$t" => {} } },
          },
        },
      },
      "category" => { ARRAY => { HASH => { scheme => {}, term => {} } } },
      "entry" => {
        ARRAY => {
          HASH => {
            "app\$edited" => { HASH => { "\$t" => {}, "xmlns\$app" => {} } },
            "category" => { ARRAY => { HASH => { scheme => {}, term => {} } } },
            "content" => { HASH => { "\$t" => {} } },
            "gContact\$birthday" => { HASH => { when => {} } },
            "gContact\$groupMembershipInfo" => { ARRAY => { HASH => { deleted => {}, href => {} } } },
            "gContact\$nickname" => { HASH => { "\$t" => {} } },
            "gContact\$relation" => { ARRAY => { HASH => { "\$t" => {}, "rel" => {} } } },
            "gContact\$userDefinedField" => { ARRAY => { HASH => { key => {}, value => {} } } },
            "gContact\$website" => {
              ARRAY => { HASH => { href => {}, label => {}, primary => {}, rel => {} } },
            },
            "gd\$email" => {
              ARRAY => {
                HASH => { address => {}, label => {}, primary => {}, rel => {} },
              },
            },
            "gd\$etag" => {},
            "gd\$extendedProperty" => { ARRAY => { HASH => { "\$t" => {}, "name" => {} } } },
            "gd\$im" => {
              ARRAY => {
                HASH => { address => {}, label => {}, primary => {}, protocol => {}, rel => {} },
              },
            },
            "gd\$name" => {
              HASH => {
                "gd\$additionalName" => { HASH => { "\$t" => {} } },
                "gd\$familyName" => { HASH => { "\$t" => {}, "yomi" => {} } },
                "gd\$fullName" => { HASH => { "\$t" => {} } },
                "gd\$givenName" => { HASH => { "\$t" => {}, "yomi" => {} } },
                "gd\$namePrefix" => { HASH => { "\$t" => {} } },
                "gd\$nameSuffix" => { HASH => { "\$t" => {} } },
              },
            },
            "gd\$organization" => {
              ARRAY => {
                HASH => {
                  "gd\$orgDepartment" => { HASH => { "\$t" => {} } },
                  "gd\$orgName" => { HASH => { "\$t" => {} } },
                  "gd\$orgTitle" => { HASH => { "\$t" => {} } },
                  "primary" => {},
                  "rel" => {},
                },
              },
            },
            "gd\$phoneNumber" => {
              ARRAY => {
                HASH => { "\$t" => {}, "label" => {}, "primary" => {}, "rel" => {}, "uri" => {} },
              },
            },
            "gd\$structuredPostalAddress" => {
              ARRAY => {
                HASH => {
                  "gd\$city" => { HASH => { "\$t" => {} } },
                  "gd\$country" => { HASH => { "\$t" => {}, "code" => {} } },
                  "gd\$formattedAddress" => { HASH => { "\$t" => {} } },
                  "gd\$postcode" => { HASH => { "\$t" => {} } },
                  "gd\$region" => { HASH => { "\$t" => {} } },
                  "gd\$street" => { HASH => { "\$t" => {} } },
                  "primary" => {},
                  "rel" => {},
                },
              },
            },
            "id" => { HASH => { "\$t" => {} } },
            "link" => {
              ARRAY => {
                HASH => { "gd\$etag" => {}, "href" => {}, "rel" => {}, "type" => {} },
              },
            },
            "title" => { HASH => { "\$t" => {} } },
            "updated" => { HASH => { "\$t" => {} } },
          },
        },
      },
      "gd\$etag" => {},
      "generator" => { HASH => { "\$t" => {}, "uri" => {}, "version" => {} } },
      "id" => { HASH => { "\$t" => {} } },
      "link" => { ARRAY => { HASH => { href => {}, rel => {}, type => {} } } },
      "openSearch\$itemsPerPage" => { HASH => { "\$t" => {} } },
      "openSearch\$startIndex" => { HASH => { "\$t" => {} } },
      "openSearch\$totalResults" => { HASH => { "\$t" => {} } },
      "title" => { HASH => { "\$t" => {} } },
      "updated" => { HASH => { "\$t" => {} } },
      "xmlns" => {},
      "xmlns\$batch" => {},
      "xmlns\$gContact" => {},
      "xmlns\$gd" => {},
      "xmlns\$openSearch" => {},
    },
  },
  version => {},
},

I modified the code slightly to get rid of the "SCALAR" output.
Re^5: How to improve introspection of an array of hashes by Eily (Monsignor) on Sep 13, 2018 at 16:50 UTC
Re^6: How to improve introspection of an array of hashes by nysus (Parson) on Sep 14, 2018 at 00:07 UTC
Re^6: How to improve introspection of an array of hashes by nysus (Parson) on Sep 13, 2018 at 20:24 UTC
Some notes below your chosen depth have not been shown here