gnu@perl has asked for the wisdom of the Perl Monks concerning the following question:

I have a little program that iterates through a HoH and builds a new hash from data found in the original. The program works, but it is pretty ugly, and I would like input from others on ways to clean it up. The one thing I really don't like is the 4th line:
@holdval = keys %{$$switch_hash{$key}};

There is only one key in that level of the hash I am working on, and that will never change. So I feel 'safe' in assuming that 'keys' will only return one key, but this still bothers me. Any suggestions?

Anyway, following is the original hash that is passed into my program, the program itself, the original hash after it has been changed and the hash that was created from data in the original hash.

The innermost value of the original hash, if populated, contains number:text. This is hours:reason. NOTE: I always 'use strict' and -w.

my %reasons;
my ($key, $val, @holdval, $holdkey);
while (($key, $val) = (each %{$switch_hash})){
    @holdval = keys %{$$switch_hash{$key}};
    for (values%{$val}){
        while ( (my $tmpkey, my $tmpval) = each%{$_}){
            my $newkey = $key.":".$tmpkey;
            my $newval =$tmpval;
            $tmpval =~ s/^\d+://;
            $reasons{$newkey} = $tmpval if ($tmpval ne '');
            $newval =~ s/:.*$//;
            $$switch_hash{$key}{$holdval[0]}{$tmpkey} = $newval;
        }
    }
}

Original Hash
$VAR1 = {
  'Washington' => { 'uslecwas5e1' => { '01-AUG-2002' => '' } },
  'Charleston' => { 'uslecchst5e1' => { '01-AUG-2002' => '' } },
  'Richmond' => { 'uslecric5e1' => { '01-AUG-2002' => '' } },
  'West Palm Beach' => { 'uslecwpb5e1' => { '01-AUG-2002' => '' } },
  'Atlanta' => { 'uslecatl5e1' => { '14-AUG-2002' => '10:reason', '15-AUG-2002' => '11:new reason' } },
  'Fort Myers' => { 'uslecftm5e1' => { '01-AUG-2002' => '' } },
  'Mobile' => { 'uslecmob5e1' => { '01-AUG-2002' => '' } },
  'Nashville' => { 'uslecnas5e1' => { '01-AUG-2002' => '' } },
  'Orlando' => { 'uslecorl5e1' => { '12-AUG-2002' => '2:different reason' } },
  'Charlotte' => { 'uslechar5e1' => { '01-AUG-2002' => '' } },
  'Louisville' => { 'usleclou5e1' => { '01-AUG-2002' => '' } },
  'Memphis' => { 'uslecmem5e1' => { '01-AUG-2002' => '' } },
  'Philadelphia' => { 'uslecphi5e1' => { '01-AUG-2002' => '' } },
  'Chattanooga' => { 'uslecchat5e1' => { '01-AUG-2002' => '' } },
  'Birmingham' => { 'uslecbir5e1' => { '01-AUG-2002' => '23:junk' } },
  'Greensboro' => { 'uslcgb5e2sm' => { '01-AUG-2002' => '' } },
  'New Orleans' => { 'uslecnew5e1' => { '01-AUG-2002' => '22:fake' } },
  'Jacksonville' => { 'uslecjac5e1' => { '01-AUG-2002' => '' } },
  'Norfolk' => { 'uslecnor5e1' => { '01-AUG-2002' => '' } },
  'DEX' => { 'chrdex' => { '13-AUG-2002' => '1:some stuff', '19-AUG-2002' => '1:reason again' } },
  'Atlanta II' => { 'uslecat25e1' => { '01-AUG-2002' => '' } },
  'Baltimore' => { 'uslecbal5e1' => { '01-AUG-2002' => '' } },
  'Raleigh' => { 'uslecral5e1' => { '01-AUG-2002' => '' } },
  'Pittsburgh' => { 'uslecpit5e1' => { '01-AUG-2002' => '' } },
  'Miami' => { 'uslecmia5e1' => { '01-AUG-2002' => '' } },
  'Knoxville' => { 'uslecknxv5e' => { '01-AUG-2002' => '' } },
  'Tampa' => { 'uslectam5e1' => { '01-AUG-2002' => '' } }
};

New Hash
$VAR1 = {
  'DEX:19-AUG-2002' => 'reason again',
  'Atlanta:14-AUG-2002' => 'reason',
  'Orlando:12-AUG-2002' => 'different reason',
  'Birmingham:01-AUG-2002' => 'junk',
  'DEX:13-AUG-2002' => 'some stuff',
  'Atlanta:15-AUG-2002' => 'new reason',
  'New Orleans:01-AUG-2002' => 'fake'
};

Modified Original Hash
$VAR1 = {
  'Washington' => { 'uslecwas5e1' => { '01-AUG-2002' => '' } },
  'Charleston' => { 'uslecchst5e1' => { '01-AUG-2002' => '' } },
  'Richmond' => { 'uslecric5e1' => { '01-AUG-2002' => '' } },
  'West Palm Beach' => { 'uslecwpb5e1' => { '01-AUG-2002' => '' } },
  'Atlanta' => { 'uslecatl5e1' => { '14-AUG-2002' => '10', '15-AUG-2002' => '11' } },
  'Fort Myers' => { 'uslecftm5e1' => { '01-AUG-2002' => '' } },
  'Mobile' => { 'uslecmob5e1' => { '01-AUG-2002' => '' } },
  'Nashville' => { 'uslecnas5e1' => { '01-AUG-2002' => '' } },
  'Orlando' => { 'uslecorl5e1' => { '12-AUG-2002' => '2' } },
  'Charlotte' => { 'uslechar5e1' => { '01-AUG-2002' => '' } },
  'Louisville' => { 'usleclou5e1' => { '01-AUG-2002' => '' } },
  'Memphis' => { 'uslecmem5e1' => { '01-AUG-2002' => '' } },
  'Philadelphia' => { 'uslecphi5e1' => { '01-AUG-2002' => '' } },
  'Chattanooga' => { 'uslecchat5e1' => { '01-AUG-2002' => '' } },
  'Birmingham' => { 'uslecbir5e1' => { '01-AUG-2002' => '23' } },
  'Greensboro' => { 'uslcgb5e2sm' => { '01-AUG-2002' => '' } },
  'New Orleans' => { 'uslecnew5e1' => { '01-AUG-2002' => '22' } },
  'Jacksonville' => { 'uslecjac5e1' => { '01-AUG-2002' => '' } },
  'Norfolk' => { 'uslecnor5e1' => { '01-AUG-2002' => '' } },
  'DEX' => { 'chrdex' => { '13-AUG-2002' => '1', '19-AUG-2002' => '1' } },
  'Atlanta II' => { 'uslecat25e1' => { '01-AUG-2002' => '' } },
  'Baltimore' => { 'uslecbal5e1' => { '01-AUG-2002' => '' } },
  'Raleigh' => { 'uslecral5e1' => { '01-AUG-2002' => '' } },
  'Pittsburgh' => { 'uslecpit5e1' => { '01-AUG-2002' => '' } },
  'Miami' => { 'uslecmia5e1' => { '01-AUG-2002' => '' } },
  'Knoxville' => { 'uslecknxv5e' => { '01-AUG-2002' => '' } },
  'Tampa' => { 'uslectam5e1' => { '01-AUG-2002' => '' } }
};

Edited: ~Wed Sep 11 23:05:05 2002 (GMT) by footpad: Added <readmore> tag and removed extraneous \n's to improve readability.

Re: newbie question on HoH manipulation
by Aristotle (Chancellor) on Sep 11, 2002 at 18:06 UTC
    You're using each, but throwing away the benefit of getting the value along with the key, and looking it up in the original hash instead, several times. There are several temporary variables you can get rid of, and unless you need them elsewhere (which I'd be wary of), the scope for your essential variables can be kept much narrower. Personal preference of mine: use fewer brackets and more spaces instead.
    my %reasons;
    while (my($city, $code_info) = each %$switch_hash) {
        my ($code, $date_info) = %$code_info;
        while(my ($date, $reason) = each %$date_info) {
            next unless $reason =~ /^(\d+):(.*)/;
            $date_info->{$date} = $1;
            $reasons{"$city:$date"} = $2;
        }
    }

    If you're sure the hash will only have one entry, you can store the code along with the name in the toplevel hash key and save yourself a lot of dereferencing:
    my $switch_hash = {
        'Washington:uslecwas5e1' => { '01-AUG-2002' => '' },
        'Charleston:uslecchst5e1' => { '01-AUG-2002' => '' },
        'Richmond:uslecric5e1' => { '01-AUG-2002' => '' },
        # ...

    You could also store that code in an altogether separate hash, also keyed on the same city names. Or you could store it in a special key of the subhash that doesn't contain a valid date, like
    my $switch_hash = {
        'Washington' => { CODE => 'uslecwas5e1', '01-AUG-2002' => '' },
        # ...

    and write something like
    my %reasons;
    while (my($city, $info) = each %$switch_hash) {
        while(my ($date, $reason) = each %$info) {
            next unless $reason =~ /^(\d+):(.*)/ and $date ne 'CODE';
            $info->{$date} = $1;
            $reasons{"$city:$date"} = $2;
        }
    }
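
    The separate-hash alternative mentioned above could look roughly like the following. This is only a sketch: the names %code_for and %dates_for and the two-city sample data are made up for illustration, not taken from the original program.

```perl
use strict;
use warnings;

# Hypothetical layout: switch codes live in their own hash, keyed by city,
# so the per-city date hash no longer needs a middle level.
my %code_for = (
    'Atlanta'    => 'uslecatl5e1',
    'Washington' => 'uslecwas5e1',
);
my %dates_for = (
    'Atlanta'    => { '14-AUG-2002' => '10:reason', '15-AUG-2002' => '11:new reason' },
    'Washington' => { '01-AUG-2002' => '' },
);

my %reasons;
while ( my ($city, $info) = each %dates_for ) {
    while ( my ($date, $entry) = each %$info ) {
        # Skip empty entries; only "hours:reason" values are split up.
        next unless $entry =~ /^(\d+):(.*)/;
        $info->{$date}           = $1;   # keep only the hours in place
        $reasons{"$city:$date"}  = $2;   # move the reason to the new hash
    }
}

print "$_ => $reasons{$_}\n" for sort keys %reasons;
```

    With this layout the loop never has to know about the code at all, and $code_for{$city} is still a single lookup when it is needed.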

    Makeshifts last the longest.

Re: newbie question on HoH manipulation
by fglock (Vicar) on Sep 11, 2002 at 16:50 UTC

    You could name your temporary variables $city, $code, $hour, $reason. Or $tmp_key_city if you like. I think $key and $val don't say enough when you try to understand the program for the first time.

    About " @holdval = keys %{$$switch_hash{$key}}" this might work:

    @holdval = keys %{$val};
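
    If the worry is the assumption that there is exactly one key, one option is to take that key with a list assignment and check the count at the same time. A minimal sketch, using a made-up one-city sample and a hypothetical %code_for hash:

```perl
use strict;
use warnings;

# Hypothetical one-city sample in the same shape as the original data.
my $switch_hash = {
    'Atlanta' => { 'uslecatl5e1' => { '14-AUG-2002' => '10:reason' } },
};

my %code_for;
while ( my ($city, $val) = each %$switch_hash ) {
    # A list assignment evaluated in scalar context returns the number of
    # elements on its right-hand side, so $count is the number of keys.
    my $count = ( my ($code) = keys %$val );
    die "expected exactly one code for $city, found $count\n"
        unless $count == 1;
    $code_for{$city} = $code;
}

print "$_ => $code_for{$_}\n" for sort keys %code_for;
```

    That way the program dies loudly if a second key ever shows up, instead of silently picking whichever one keys happens to return first.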
Re: newbie question on HoH manipulation
by Util (Priest) on Sep 11, 2002 at 19:19 UTC

    Issue: You have an almost-regular data structure (HoHoH), with a redundant key in the middle that you want to omit from the derived hash (%reasons).

    Solution: Traverse the HoHoH regularly with ( while(each){while(each){while(each){}}} ), just as if the middle key is not static. Omit the middle key when forming the key for %reasons, but check for duplicates as a help to future maintenance programmers. Alternately, you could make reasons an AoA.

    #!/usr/bin/perl -w
    use strict;

    my $switch_hash = {
        'Washington'      => { uslecwas5e1  => { '01-AUG-2002' => '' } },
        'Charleston'      => { uslecchst5e1 => { '01-AUG-2002' => '' } },
        'Richmond'        => { uslecric5e1  => { '01-AUG-2002' => '' } },
        'West Palm Beach' => { uslecwpb5e1  => { '01-AUG-2002' => '' } },
        'Atlanta'         => { uslecatl5e1  => { '14-AUG-2002' => '10:reason',
                                                 '15-AUG-2002' => '11:new reason' } },
        'Fort Myers'      => { uslecftm5e1  => { '01-AUG-2002' => '' } },
        'Mobile'          => { uslecmob5e1  => { '01-AUG-2002' => '' } },
        'Nashville'       => { uslecnas5e1  => { '01-AUG-2002' => '' } },
        'Orlando'         => { uslecorl5e1  => { '12-AUG-2002' => '2:different reason' } },
        'Charlotte'       => { uslechar5e1  => { '01-AUG-2002' => '' } },
        'Louisville'      => { usleclou5e1  => { '01-AUG-2002' => '' } },
        'Memphis'         => { uslecmem5e1  => { '01-AUG-2002' => '' } },
        'Philadelphia'    => { uslecphi5e1  => { '01-AUG-2002' => '' } },
        'Chattanooga'     => { uslecchat5e1 => { '01-AUG-2002' => '' } },
        'Birmingham'      => { uslecbir5e1  => { '01-AUG-2002' => '23:junk' } },
        'Greensboro'      => { uslcgb5e2sm  => { '01-AUG-2002' => '' } },
        'New Orleans'     => { uslecnew5e1  => { '01-AUG-2002' => '22:fake' } },
        'Jacksonville'    => { uslecjac5e1  => { '01-AUG-2002' => '' } },
        'Norfolk'         => { uslecnor5e1  => { '01-AUG-2002' => '' } },
        'DEX'             => { chrdex       => { '13-AUG-2002' => '1:some stuff',
                                                 '19-AUG-2002' => '1:reason again' } },
        'Atlanta II'      => { uslecat25e1  => { '01-AUG-2002' => '' } },
        'Baltimore'       => { uslecbal5e1  => { '01-AUG-2002' => '' } },
        'Raleigh'         => { uslecral5e1  => { '01-AUG-2002' => '' } },
        'Pittsburgh'      => { uslecpit5e1  => { '01-AUG-2002' => '' } },
        'Miami'           => { uslecmia5e1  => { '01-AUG-2002' => '' } },
        'Knoxville'       => { uslecknxv5e  => { '01-AUG-2002' => '' } },
        'Tampa'           => { uslectam5e1  => { '01-AUG-2002' => '' } },
    };

    my %reasons;
    while ( my ($city, $v1) = each %$switch_hash ) {
        while ( my ($usl, $v2) = each %$v1 ) {
            while ( my ($date, $v3) = each %$v2 ) {
                next unless $v3;
                my ($hours, $reason) = split ':', $v3, 2;
                warn if exists $reasons{"$city:$date"};
                $reasons{"$city:$date"} = $reason;
                # push @reasons, [$city, $date, $reason ];
                $v2->{$date} = $hours;   # Can't use $v3 here; it is a copy!
            }
        }
    }

    use Data::Dumper;
    $Data::Dumper::Useqq = 1;
    print Data::Dumper->Dump(
        [ \%reasons,  $switch_hash ],
        [qw( *reasons  switch_hash )]
    );
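
    For the AoA alternative mentioned above, a rough sketch might look like this. The two-city sample is made up for illustration, and sorted keys are used here instead of each (an assumption on my part) so that the rows come out in a repeatable order.

```perl
use strict;
use warnings;

# Hypothetical two-city sample in the same HoHoH shape.
my $switch_hash = {
    'Atlanta' => { uslecatl5e1 => { '14-AUG-2002' => '10:reason',
                                    '15-AUG-2002' => '11:new reason' } },
    'Tampa'   => { uslectam5e1 => { '01-AUG-2002' => '' } },
};

# Each row is [ city, date, reason ]; no composite "city:date" key is
# needed, so no duplicate check is needed either.
my @reasons;
for my $city ( sort keys %$switch_hash ) {
    for my $usl ( sort keys %{ $switch_hash->{$city} } ) {
        my $dates = $switch_hash->{$city}{$usl};
        for my $date ( sort keys %$dates ) {
            next unless $dates->{$date};
            my ($hours, $reason) = split ':', $dates->{$date}, 2;
            push @reasons, [ $city, $date, $reason ];
            $dates->{$date} = $hours;   # keep only the hours in the original
        }
    }
}

print join( ' | ', @$_ ), "\n" for @reasons;
```

    The trade-off is that an AoA has no direct lookup by city and date; it suits reporting better than random access.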