rpaskudniak has asked for the wisdom of the Perl Monks concerning the following question:

Hi Y'all. Bless me.. For it has been years since my last post on a forum like this.

I could not think of a way to search this or any other forum/search engine for a question like this.

I have a configuration file that contains fields that are actually the names of "fields" in a hash. (Actually a JSON hash, but that's beside the point.) These will correspond with positional parameters supplied at run-time. For example, here are a couple of lines; each is a set of fields, comma separated:

name,address-virtual-email
movie-role,hoig-boig-loig-shmoig
(I do need to protect the private data of my client.)

The first line requires that I plunk a value into:

o $js->{name} # This is the easy part
o $js->{address}{virtual}{email} # Here's the trouble

The second line tells me to plunk run-time values into:

o $js->{movie}{role} # In this example
o $js->{hoig}{boig}{loig}{shmoig} # both are trouble!

There are dozens of such lines in the config file and there's no telling what they will add to the list. So I'm racking my brains for a way to address the substructures (using the term loosely) in a dynamic fashion. I can certainly build the string $compound = "{hoig}{boig}{loig}{shmoig}" but it won't help me to address $js->{$compound}, because that is just a string and there is no such key in the hash.

Can anyone come up with a way to get at the nested fields in the hash when I can't know in advance which keys (and sub-keys) I will need to address? I don't believe the symbol table will help here; the keys of the hash (and all its sub-hashes) are not in the symbol table (I think).

Thanks! Gracias! Spasiba! Szepem Köszönöm!

-- RP

Re: Dynamic addressing in a hash
by hv (Prior) on Sep 01, 2022 at 00:16 UTC

    You can do this by a) starting with a reference to the top level hash; b) replacing that with a reference in the next level for each additional level of nesting, something like this (untested):

    sub set_config_value {
        my($config, $name, $value) = @_;
        my @nesting = split /-/, $name;
        while (@nesting > 1) {
            my $field = shift @nesting;
            # make sure the next level is initialized to a hashref
            $config->{$field} //= {};
            # now point to it
            $config = $config->{$field};
        }
        # store at the bottom level
        $config->{$nesting[0]} = $value;
    }

    # use like this:
    set_config_value(\%config, "address-virtual-email", "rpaskudniak");
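
To tie this back to the config lines in the question, here is a rough sketch of how one comma-separated line might be paired with positional values. The helper name fill_from_line, the sample values, and the rule that the counts must match are assumptions made for illustration; they are not from the original posts.

```perl
use strict;
use warnings;

# hv's routine from above, repeated so this sketch runs on its own.
sub set_config_value {
    my ($config, $name, $value) = @_;
    my @nesting = split /-/, $name;
    while (@nesting > 1) {
        my $field = shift @nesting;
        $config->{$field} //= {};        # create the next level if missing
        $config = $config->{$field};     # descend into it
    }
    $config->{ $nesting[0] } = $value;   # store at the bottom level
}

# Hypothetical helper: pair each comma-separated field with one value.
sub fill_from_line {
    my ($config, $line, @values) = @_;
    my @fields = split /,/, $line;
    die "expected " . @fields . " values, got " . @values . "\n"
        unless @fields == @values;
    set_config_value($config, $fields[$_], $values[$_]) for 0 .. $#fields;
    return $config;
}

my %js;
fill_from_line(\%js, 'name,address-virtual-email', 'RP', 'rp@example.com');
print "$js{name} <$js{address}{virtual}{email}>\n";
```

Each field name is split on hyphens only inside set_config_value, so the caller never needs to know how deep a given field goes.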
      Hi hv.

      I tried something similar but it ended up not referencing the actual element within the original hash. I see you have added an extra "next" operation to walk down the structure. Other responses have had a similar idea.

      One question, hv: The // operator you use here:

      $config->{$field} //= {};
      The usual cheat-sheets list this only as similar to the logical || "or" operator. Please explain how you are using it here. The Camel book (4th Ed, p. 27) gives this example, though I've never seen it used:
      $val ||= "2"; # Set $val to 2 if it isn't already "true".
      which makes as much sense to me as *your* use of the // operator.

      Please enlighten..

      Thanks. (I waive consecutive translation. :-)

      -- RP

        One question, hv: The // operator you use here: $config->{$field} //= {}; ...

        It's a fair question, I glossed over some stuff there.

        The difference between || and // lies in what they are equivalent to: $a ||= $b is equivalent to $a = $b unless $a, while $a //= $b is equivalent to $a = $b unless defined $a. In both cases the equivalence is exact, except that the expression for $a is evaluated only once.
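
A small side-by-side sketch of that difference, for anyone who wants to try it. The hash names and sample values are made up for the demonstration, and it needs Perl 5.10 or later for //=.

```perl
use strict;
use warnings;

# Three existing values; the key 'missing' is deliberately absent (undef).
my %by_or  = (zero => 0, empty => '', text => 'kept');
my %by_dor = %by_or;

for my $key (qw(zero empty text missing)) {
    $by_or{$key}  ||= 'default';   # replaces any false value
    $by_dor{$key} //= 'default';   # replaces only an undefined value
}

printf "%-8s ||= %-10s //= %s\n", $_, "'$by_or{$_}'", "'$by_dor{$_}'"
    for qw(zero empty text missing);
```

With ||= the 0 and the empty string are both replaced by 'default'; with //= they survive, and only the missing key picks up the default.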

        So that particular line says "initialize this value to an empty hashref if there is no defined value already there". But in essence, the line was actually a lazy shorthand for "insert your preferred error-handling behaviour here".

        The assumption is a) that there will never be a mismatch - you'll never have a string stored at $config{address} before trying to read or write $config{address}{virtual} - and b) that if for some reason a mismatch occurs, a fatal error is a good enough way to signal it.
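
For the curious, this is roughly what such a mismatch looks like in practice. The sketch repeats the routine so it runs on its own, plants a plain string where a sub-hash is expected, and traps the resulting error with eval. The sample key and value are invented for the demonstration.

```perl
use strict;
use warnings;

# hv's routine, repeated so the sketch runs on its own.
sub set_config_value {
    my ($config, $name, $value) = @_;
    my @nesting = split /-/, $name;
    while (@nesting > 1) {
        my $field = shift @nesting;
        $config->{$field} //= {};
        $config = $config->{$field};
    }
    $config->{ $nesting[0] } = $value;
}

# Deliberate mismatch: a plain string sits where a sub-hash is expected.
my %config = (address => 'home');

my $ok  = eval { set_config_value(\%config, 'address-virtual', 'x'); 1 };
my $err = $ok ? '' : $@;
print $ok ? "stored\n" : "died: $err";
```

Under use strict, Perl refuses to treat the string as a hash reference, so the call dies instead of quietly storing the value somewhere unexpected, and the original string is left in place.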

        A tighter (and clearer) version might say something like:

        use Scalar::Util qw{ reftype };
        $config->{$field} //= {};
        die "invalid field '$field' at '$name'"
            unless reftype($config->{$field}) eq 'HASH';

        A looser, "just get it done" sort of approach might simply overwrite it:

        # reftype() returns undef for non-references, hence the // ''
        $config->{$field} = {}
            unless (reftype($config->{$field}) // '') eq 'HASH';

        Using $config->{$field} ||= {} would instead give you mixed behaviour - it will silently overwrite any value that's false in perl, like (0, 0.00, ""), but leave others to cause an error. For something like configuration values, it seemed unlikely to me that you would want to have it endemically treat those values differently from others.

        As AnomalousMonk says, this "defined-or" or "defaulting" operator // has been available since v5.10; since then, I find I use it far more often than I use ||.

        $config->{$field} //= {};

        // is the Logical Defined-Or operator (see perlop), added with Perl version 5.10. So the example from the Camel book
            $val ||= "2"; # Set $val to 2 if it isn't already "true"
        has this defined-or counterpart:
            $val //= "2"; # Set $val to "2" if $val isn't defined


        Give a man a fish:  <%-{-{-{-<

      Hi again, HV.

      I just thought I owed you this confirmation: Your solution worked beautifully! As promised, I mentioned your moniker as a "credit" to the function, though that hardly exposes you to publicity.

      Again THANKS much!

      -- RP

Re: Dynamic addressing in a hash
by GrandFather (Saint) on Sep 01, 2022 at 02:20 UTC

    Data::Diver may fit the bill:

    use strict;
    use warnings;
    use Data::Diver qw(Dive);

    my $inFile = <<STR;
    name,address-virtual-email
    movie-role,hoig-boig-loig-shmoig
    STR

    my $js = {
        name    => "Poo Bear",
        address => {virtual => {email => 'the.woods@erewhon.com'}},
        movie   => {role => "Bear"},
        hoig    => {boig => {loig => {shmoig => "doig"}}}
    };

    open my $fIn, '<', \$inFile;

    for my $line (<$fIn>) {
        chomp $line;
        my @parts = split ',', $line;

        for my $part (@parts) {
            my @keys = split '-', $part;

            print "@keys: ", Dive($js, @keys), "\n";
        }
    }

    Prints:

    name: Poo Bear
    address virtual email: the.woods@erewhon.com
    movie role: Bear
    hoig boig loig shmoig: doig

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: Dynamic addressing in a hash
by AnomalousMonk (Archbishop) on Sep 01, 2022 at 01:40 UTC

    Another approach is to leave the string of concatenated key fields as it is and use it alone as the hash key:

    Win8 Strawberry 5.8.9.5 (32)  Wed 08/31/2022 21:20:10
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;

    use Data::Dump qw(dd);

    my %config;

    sub set_config_value {
        my ($hr_config,  # hash ref.: configuration hash to fill
            $key_seq,    # string: multi-key sequence, hyphen-separated
            $value,      # string: value of multi-key sequence
            ) = @_;

        # dd '---', '$key_seq', $key_seq, '$value', $value;  # for debug

        push @{ $hr_config->{$key_seq} }, $value;
        }

    sub get_config_value {
        my ($hr_config,  # hash ref.: configuration hash to fill
            @key_seq,    # list: multi-key sequence to get value of
            ) = @_;

        # dd '===', '$key_seq', \@key_seq;  # for debug

        my $multi_key = join '-', @key_seq;

        die "key sequence (@key_seq) not in config hash"
            if not exists $hr_config->{$multi_key};

        return $hr_config->{$multi_key};
        }

    for my $ar_add (
        [ 'address-virtual-email',     'rpaskudniak', ],
        [ 'address-virtual-email',     'rasputin',    ],
        [ 'address-virtual-email-foo', 'fred',        ],
        [ '',                          'emptystring', ],
        [ 'bag-end',                   'frodo',       ],
        ) {
        my ($key_sequence, $val) = @$ar_add;
        set_config_value(\%config, $key_sequence, $val);
        }

    dd \%config;

    dd get_config_value(\%config, qw(address virtual email foo));
    dd get_config_value(\%config, '');
    dd get_config_value(\%config);
    ^Z
    {
      "" => ["emptystring"],
      "address-virtual-email" => ["rpaskudniak", "rasputin"],
      "address-virtual-email-foo" => ["fred"],
      "bag-end" => ["frodo"],
    }
    ["fred"]
    ["emptystring"]
    ["emptystring"]

    An elaboration of this would be to make the set_config_value() function accept a variety of separator sequences by adding a step like (untested):
        $key_seq =~ s{ \s* (?: => | [-,=]) \s* }{$;}xmsg;
    (get_config_value() then has to be changed to join with $; - see perlvar.)
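
A minimal sketch of that separator idea, in case it helps. The helper name flat_key is made up, and unlike the functions above it stores plain scalars rather than pushing onto an array, just to keep the demonstration short. It relies on the documented behaviour that Perl joins a comma-separated list of hash subscripts with $; (see perlvar).

```perl
use strict;
use warnings;

my %config;

# Hypothetical helper: turn any supported separator into $; (see perlvar).
sub flat_key {
    my ($key_seq) = @_;
    my $sep = $;;    # copy, so the substitution below is unambiguous
    $key_seq =~ s{ \s* (?: => | [-,=] ) \s* }{$sep}xmsg;
    return $key_seq;
}

$config{ flat_key('address-virtual-email') } = 'rpaskudniak';
$config{ flat_key('bag => end') }            = 'frodo';

# Perl joins a comma-separated list of subscripts with $; as well,
# so the same entries can be read back like this:
print $config{ 'address', 'virtual', 'email' }, "\n";
print $config{ 'bag', 'end' }, "\n";
```

The hash stays flat, so there is no nesting to walk, but a key part that itself contains the $; character would collide with the separator.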


    Give a man a fish:  <%-{-{-{-<

Re: Dynamic addressing in a hash
by LanX (Saint) on Sep 01, 2022 at 01:33 UTC
    basically loop.

    little demo in perl -de0

    DB<23> $js = {}
    DB<24> $js->{hoig}{boig}{loig}{shmoig} = 666
    DB<25> $walk = $js
    DB<26> $walk = $walk->{$_} for split /-/,"hoig-boig-loig-shmoig"
    DB<27> say $walk
    666

    but be careful about unwanted autovivification if keys are unknown.
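
One way to look up a path without creating anything is to check each level before stepping into it. The sketch below assumes a made-up helper called peek and contrasts it with a chained exists test, which is the classic case where intermediate levels spring into existence.

```perl
use strict;
use warnings;

# Hypothetical helper: follow @keys from $root, returning nothing as soon as
# a level is missing or is not a hash, and never creating any entries.
sub peek {
    my ($root, @keys) = @_;
    my $walk = $root;
    for my $key (@keys) {
        return unless ref($walk) eq 'HASH' && exists $walk->{$key};
        $walk = $walk->{$key};
    }
    return $walk;
}

my $js = { hoig => { boig => { loig => { shmoig => 666 } } } };

my $hit  = peek($js, split /-/, 'hoig-boig-loig-shmoig');
my $miss = peek($js, split /-/, 'no-such-path');
print "hit: $hit, miss: ", defined $miss ? $miss : 'undef', "\n";
print "after peek: ", join(',', sort keys %$js), "\n";

# For contrast, a chained lookup creates the intermediate levels.
my $found = exists $js->{no}{such}{path};
print "after chained exists: ", join(',', sort keys %$js), "\n";
```

After the peek calls the top level still holds only hoig; after the chained exists it also holds no, even though nothing was ever assigned there.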

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      In case you also want to be able to set dynamically

      DB<44> sub js :lvalue { my $last = pop; my $walk = $js; $walk = $walk->{$_} for @_; $walk->{$last} }
      DB<45> say js(split /-/,"hoig-boig-loig-shmoig")
      666
      DB<46> say js(split /-/,"hoig-boig-loig-shmoig") = 42
      42
      DB<47> say js(split /-/,"hoig-boig-loig-shmoig")
      42
      DB<48> x $js
      0  HASH(0x337b528)
         'hoig' => HASH(0x337b768)
            'boig' => HASH(0x337b498)
               'loig' => HASH(0x3379e08)
                  'shmoig' => 42
      DB<49>

      see also Data::Diver for more elaborate handling of edge cases

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery