zerohero has asked for the wisdom of the Perl Monks concerning the following question:

Wise monks,

Recently, I asked a question about functions analogous to Perl's map and grep, but for hashes. I now realize it would be useful to change the spec a bit: I want a map function that operates on hashes and is composable. This would probably be trivial in a language that supports "functional programming", but I believe it can be done in Perl as well.

The following script uses the version of hmap suggested previously by BrowserUK. I learned a lot by taking this little gem apart (it's humbling to me that some people can do so much in so few lines):

use Data::Dumper;

my $h = {
    1 => { nick => 'rbush',  phone => '5551212', bday => '3/12/1965', },
    3 => { nick => 'ernest', phone => '5553300', bday => '3/12/1971', },
    5 => { nick => 'fred',   phone => '1112300', bday => '5/01/1972', },
};

sub hmap (&%) {
    my $code = shift;
    my @i = @_;
    local @_;
    my @rv;
    push @rv, $code->(@_=(shift @i,shift @i)) while @i;
    @rv;
}

my %new = hmap {
    $_[1]{nick} =~ /rbush|fred/ ? ($_[0], $_[1]) : ()
} %$h;

my %new2 = hmap {
    $_[1]{nick} =~ /rbush|fred/ ? ($_[0], hmap { $_[0], $_[1] } %{$_[1]}) : ()
} %$h;

print "new = " . Dumper (\%new) . "\n";
print "new2 = " . Dumper (\%new2) . "\n";

Now we run it:

owncloselady-lm:scratch rbush$ perl hmap_ex1.pl
new = $VAR1 = {
  '1' => {
    'nick' => 'rbush',
    'bday' => '3/12/1965',
    'phone' => '5551212'
  },
  '5' => {
    'nick' => 'fred',
    'bday' => '5/01/1972',
    'phone' => '1112300'
  }
};

new2 = $VAR1 = {
  'rbush' => 'bday',
  '1' => 'nick',
  '3/12/1965' => 'phone',
  'nick' => 'fred',
  'phone' => '1112300',
  'bday' => '5/01/1972',
  '5551212' => '5'
};

Note that we have a multi-level hash (a hash of hashes). Our first example with "%new" works as expected. We are simply selecting each subhash, based on the nickname. The second test with "%new2" is an attempt to put a "do nothing" hmap "inside the loop" so to speak (composition). This is a simple test to see if composition will work. The expectation is I will get the same result as %new, since I'm just copying inputs to outputs.

However, it doesn't preserve the structure: it removes one level of hash nesting and mixes up the keys and values. It's possible I'm invoking this wrong, or that this is intentional (several Perl programmers mentioned they just throw away the structure and flatten things out).

The next script shows my implementation of hmap which allows composition (and preserves structure). The ultimate task is to select records (i.e. with nick = rbush|fred), and then return only the "nick" and "phone" attributes (the next level of the hashes).

use Data::Dumper;

my $h = {
    1 => { nick => 'rbush',  phone => '5551212', bday => '3/12/1965', },
    3 => { nick => 'ernest', phone => '5553300', bday => '3/12/1971', },
    5 => { nick => 'fred',   phone => '1112300', bday => '5/01/1972', },
};

sub hmap {
    my ($h, $code) = @_;
    return undef unless defined $h;
    my $rv = {};
    while (my ($k, $v) = each %$h) {
        my $x = &$code($k, $v);
        $rv->{$k} = $x if (defined $x);
    }
    return $rv;
}

my $new = hmap $h, sub {
    $_[1]->{nick} =~ /rbush|fred/ ? $_[1] : undef
};

my $new2 = hmap $h, sub {
    return hmap $_[1]->{nick} =~ /rbush|fred/ ? $_[1] : undef,
        sub { $_[0] =~ /nick|phone/ ? $_[1] : undef }
};

print "new = " . Dumper ($new) . "\n";
print "new2 = " . Dumper ($new2) . "\n";

Now we run this version:

owncloselady-lm:scratch rbush$ perl hmap_ex2.pl
new = $VAR1 = {
  '1' => {
    'nick' => 'rbush',
    'bday' => '3/12/1965',
    'phone' => '5551212'
  },
  '5' => {
    'nick' => 'fred',
    'bday' => '5/01/1972',
    'phone' => '1112300'
  }
};

new2 = $VAR1 = {
  '1' => {
    'nick' => 'rbush',
    'phone' => '5551212'
  },
  '5' => {
    'nick' => 'fred',
    'phone' => '1112300'
  }
};

Note the preserved structure for new2. However, it still feels clunky. For example, I have to write "sub" explicitly. Any suggestions on how I can make this better? It matters less that the implementation of hmap be short and elegant than that the many _uses_ of hmap be short and elegant. Anything that reduces the number of characters needed in an expression, or that increases robustness or flexibility, is helpful.

Re: hmap revisited
by ikegami (Patriarch) on Feb 01, 2009 at 00:13 UTC

    Change

    my %new2 = hmap { $_[1]{nick} =~ /rbush|fred/ ? ($_[0], hmap { $_[0], $_[1] } %{$_[1]}) : () } %$h;

    to

    my %new2 = hmap { $_[1]{nick} =~ /rbush|fred/ ? $_[0] => { hmap { $_[0], $_[1] } %{$_[1]} } : () } %$h;

    The relevant change is the curlies around the inner call to hmap. %{$_[1]} creates a list of the hash's content from the hash reference. You have to mirror that on the output side by creating a hash reference from the list.
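
    To make the list-versus-reference point concrete, here is a small sketch. It is not taken from the scripts above: %rec is an illustrative record, and hmap is the list-returning version with the redundant local @_ dropped.

    ```perl
    use strict;
    use warnings;

    # List-in, list-out hmap, as in the script being discussed.
    sub hmap (&%) {
        my $code = shift;
        my @i = @_;
        my @rv;
        push @rv, $code->(shift @i, shift @i) while @i;
        return @rv;
    }

    # Illustrative record; not part of the original data set.
    my %rec = (nick => 'rbush', phone => '5551212');

    # Without curlies, the inner hmap's four elements are interpolated
    # into the surrounding list: the outer key is followed by four scalars.
    my @flat = (1, hmap { $_[0], $_[1] } %rec);

    # With curlies, the inner list becomes a single hash reference.
    my @nested = (1, { hmap { $_[0], $_[1] } %rec });

    print scalar(@flat), " vs ", scalar(@nested), "\n";
    ```

    This prints "5 vs 2": the first list can no longer be read back as one key with one value, which is the flattening seen in %new2.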

    You can't have it both ways. hmap can either return a list that can be used to initialize a hash or a hash reference.

    An alternative would be to just pass hash references instead of the contents of hashes.

    sub hmap (&$) {
        my $code = shift;
        my @i = %{ shift() };
        my @rv;
        push @rv, $code->(shift @i,shift @i) while @i;
        return { @rv };
    }

    my $new2 = hmap {
        $_[1]{nick} =~ /rbush|fred/ ? $_[0] => hmap { $_[0], $_[1] } $_[1] : ()
    } $h;

    (Removed the useless local @_ and @_=. What do you think calling a sub does...)

    Since we have an actual hash to work with instead of a list of its content, we can use each instead of flattening the hash into @i.
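
    A minimal sketch of that each-based variant, assuming the callback still returns a key/value list (or an empty list to drop the pair). The sample data is trimmed, and the inner filter on /nick/ is only for illustration:

    ```perl
    use strict;
    use warnings;

    # Sketch only: takes a hash reference and returns a new hash reference.
    sub hmap (&$) {
        my ($code, $href) = @_;
        my %rv;
        keys %$href;    # reset the hash's iterator before walking it
        while (my ($k, $v) = each %$href) {
            # The callback returns key/value pairs, or () to drop the entry.
            my @pairs = $code->($k, $v);
            while (my ($nk, $nv) = splice @pairs, 0, 2) {
                $rv{$nk} = $nv;
            }
        }
        return \%rv;
    }

    my $h = {
        1 => { nick => 'rbush',  phone => '5551212' },
        3 => { nick => 'ernest', phone => '5553300' },
    };

    # Select records by nick, then keep only the nick field of each.
    my $new = hmap {
        my ($id, $rec) = @_;
        $rec->{nick} =~ /rbush|fred/
            ? ( $id => hmap { $_[0] =~ /nick/ ? @_ : () } $rec )
            : ();
    } $h;
    ```

    Nesting works here because the inner call iterates over a different hash (the record) than the outer call. Calling each on the same hash from inside the callback would disturb the outer iteration.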

      Is there a way to initialize local variables (e.g. $k for 'key', and $v for 'value') and then have these loaded into the scope of the called code? e.g. like sort does with "a" and "b".

      How do I take a hash reference, and then turn it into a list without an intermediate variable to hold it?

        Is there a way to initialize local variables (e.g. $k for 'key', and $v for 'value') and then have these loaded into the scope of the called code? e.g. like sort does with "a" and "b".

        Yes and no. Yes, you can set variables in the caller's package (like sort does with $a and $b).

        sub hmap(&@) {
            my $cb = shift;
            my $caller = caller();
            my $kgr = do { no strict 'refs'; \*{$caller.'::k'} };
            my $vgr = do { no strict 'refs'; \*{$caller.'::v'} };
            local *$kgr = \my $k;
            local *$vgr = \my $v;
            for (...) {
                $k = ...;
                $v = ...;
                $cb->();
            }
        }

        But while $a and $b are exempt from strict, $k and $v aren't. You'd have to use something like

        ... = hmap { our($k,$v); ... } ...;

        So there isn't really an advantage over using arguments as you are doing now.

        ... = hmap { my($k,$v)=@_; ... } ...;

        One possible workaround is to use $a and $b instead of $k and $v. That's what List::Util does for reduce.
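
        A rough sketch of the $a/$b approach, assuming hmap takes a hash reference. The name hmap_ab is made up for illustration, and this is not a copy of how List::Util does it:

        ```perl
        use strict;
        use warnings;

        # Sketch only: hmap_ab is a hypothetical name, not an existing API.
        sub hmap_ab (&$) {
            my ($code, $href) = @_;
            my $caller = caller;
            my ($key, $value);

            no strict 'refs';
            # Alias the caller's package variables $a and $b to our lexicals
            # for the duration of this call. Localizing the globs also hides
            # the caller's @a, %a, etc. while the callback runs.
            local *{"${caller}::a"} = \$key;
            local *{"${caller}::b"} = \$value;

            my %rv;
            for my $k (keys %$href) {
                ($key, $value) = ($k, $href->{$k});
                my @pairs = $code->();    # callback reads $a and $b
                while (my ($nk, $nv) = splice @pairs, 0, 2) {
                    $rv{$nk} = $nv;
                }
            }
            return \%rv;
        }

        my $h = { 1 => { nick => 'rbush' }, 3 => { nick => 'ernest' } };

        # $a is the key and $b the value; no "our" declaration is needed.
        my $picked = hmap_ab { $b->{nick} =~ /rbush|fred/ ? ($a => $b) : () } $h;
        ```

        The trade-off is readability: $a and $b say nothing about which is the key and which is the value, and they must not be shadowed by lexicals named $a or $b in the calling code.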

        Another possible workaround is to export $k and $v from the module that provides hmap. Imported variables are exempt from strict.

        package Hash::Map;

        use Exporter qw( import );
        our @EXPORT = qw( $k $v hmap hgrep );

        sub hmap(&@) { ... }
        sub hgrep(&@) { ... }

        1;

      I copied and pasted the code sample from above, but it failed to compile. I tried a few things but couldn't get it to work (the error is in the dense part, where I don't fully understand the syntax rules). The offending line seems to be:

      snippet:

      my %new2 = hmap { $_[1]{nick} =~ /rbush|fred/ ? $_[0] => { hmap { $_[0], $_[1] } %{$_[1]} } : () } %$h;

      Here's the compilation error:

      bash-3.2$ perl hmap_good2.pl
      syntax error at hmap_good2.pl line 39, near "] =>"
      syntax error at hmap_good2.pl line 40, near ":"
      Execution of hmap_good2.pl aborted due to compilation errors.

      Possibly I'm missing something simple? Full script pasted below:

      use Data::Dumper;

      my $h = {
          1 => { nick => 'rbush',  phone => '5551212', bday => '3/12/1965', },
          3 => { nick => 'ernest', phone => '5553300', bday => '3/12/1971', },
          5 => { nick => 'fred',   phone => '1112300', bday => '5/01/1972', },
      };

      sub hmap (&$) {
          my $code = shift;
          my @i = %{ shift() };
          my @rv;
          push @rv, $code->(shift @i,shift @i) while @i;
          return { @rv };
      }

      my %new2 = hmap {
          $_[1]{nick} =~ /rbush|fred/ ? $_[0] => { hmap { $_[0], $_[1] } %{$_[1]} }
          : ()
      } %$h;

      print Dumper ($new2), "\n";

      Note that the second version (the hash ref one) has the same compile error, but I could get that to work by surrounding the returned hash in "()", i.e.:

      my $new2 = hmap { $_[1]{nick} =~ /rbush|fred/ ? ($_[0] => hmap { $_[0], $_[1] } $_[1]) : () } $h;
        Just a precedence issue. Add parens.
        my %new2 = hmap { $_[1]{nick} =~ /rbush|fred/ ? ( $_[0] => { hmap { $_[0], $_[1] } %{$_[1]} } ) : () } %$h;
        my $new2 = hmap { $_[1]{nick} =~ /rbush|fred/ ? ( $_[0] => hmap { $_[0], $_[1] } $_[1] ) : () } $h;
Re: hmap revisited
by shmem (Chancellor) on Feb 01, 2009 at 20:09 UTC

    <nit>Er.. that piece of code

    sub hmap (&%) {
        my $code = shift;
        my @i = @_;
        local @_;
        my @rv;
        push @rv, $code->(@_=(shift @i,shift @i)) while @i;
        @rv;
    }

    wasn't from BrowserUk, but yours truly, well corrected by ikegami along the lines:

    sub hmap (&%) {
        my $code = shift;
        my @i = @_;
        my @rv;
        push @rv, $code->(shift @i,shift @i) while @i;
        @rv;
    }

    </nit>