in reply to Print hash keys and lookup the keys for values in another file
Make sure to follow the wise advice you have already received. Be sure to understand everything haukex said, and read the docs he linked.
Pay attention to the final data structure you want to fill: under the Wood key you are talking about a HashOfHashes like $hoh{Wood}{CLAYCOUNTY} with values 119736, 448094,.. I think, as haukex already suggested, that a HashOfArrays in the form $hoh{Wood}{CLAYCOUNTY} = [119736, 448094,..] is more appropriate.
In fact, when you arrange your data into the most appropriate data structure, everything runs more smoothly.
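To make the suggested shape concrete, here is a minimal sketch of filling and reading such a structure by hand. The hash name %hoh and the two values come from the post; the variable names @ids and $count are only illustrative.

```perl
use strict;
use warnings;

my %hoh;

# Create the county subkey under 'Wood', holding an empty anonymous array.
$hoh{Wood}{CLAYCOUNTY} = [];

# Append values to that array as they are found.
push @{ $hoh{Wood}{CLAYCOUNTY} }, 119736, 448094;

# Dereference the array to read the values back.
my @ids   = @{ $hoh{Wood}{CLAYCOUNTY} };
my $count = scalar @ids;
print "CLAYCOUNTY holds $count values: @ids\n";
```

Because every county key holds an array reference, adding one more value is always a single push, whatever the number of values already stored.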
In principle, the basis of all programming is a plain description of the solution. This is also in the signature of one of our esteemed brothers (his name escapes me at the moment).
Your plain solution, or pseudocode, looks like:
    parse file-1
    for every line:
        remove newline
        split it using ';' filling an array
        check if the element 1 (array start from 0) of such array is 'Wood'
        if yes create a sub key using element zero of the above array and assign as value an anonymous, empty array []
    close file-1

    parse file-2
    for every line:
        remove newline
        split it using ';' filling an array
        if the element 2 is an already present key in the hash
        push the element 0 into the values of such key (we have defined it as an anonymous array [] above)
    close file-2

    print out the datastructure in the way you like
These are the basics; you can make it as complex as you like. Then YOU must translate it into Perl code: this heavily depends on your level. Go slowly, translating into Perl using well-known best practices and idioms. Ask only about the parts you do not know how to translate or that give you unexpected results.
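If you want something to compare your own attempt against, one possible translation of the pseudocode is sketched below. It is not the only way to do it. The sample rows are made up: the column layout is inferred from the pseudocode, the middle column of the second input is a placeholder, and OTHERCOUNTY/Brick is a hypothetical row added only to show the filtering. In-memory filehandles stand in for file-1.txt and file-2.txt so the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for file-1.txt and file-2.txt.
my $file1 = "CLAYCOUNTY;Wood\nNASSAUCOUNTY;Wood\nOTHERCOUNTY;Brick\n";
my $file2 = "119736;x;CLAYCOUNTY\n448094;x;CLAYCOUNTY\n"
          . "920232;x;NASSAUCOUNTY\n111111;x;OTHERCOUNTY\n";

my %hoh;

# First pass: create one empty array per county whose type is 'Wood'.
open my $in1, '<', \$file1 or die "Cannot open first input: $!";
while ( my $line = <$in1> ) {
    chomp $line;
    my @fields = split /;/, $line;
    next unless defined $fields[1] and $fields[1] eq 'Wood';
    $hoh{Wood}{ $fields[0] } = [];
}
close $in1;

# Second pass: push element 0 only when element 2 is an already known county.
open my $in2, '<', \$file2 or die "Cannot open second input: $!";
while ( my $line = <$in2> ) {
    chomp $line;
    my @fields = split /;/, $line;
    next unless defined $fields[2] and exists $hoh{Wood}{ $fields[2] };
    push @{ $hoh{Wood}{ $fields[2] } }, $fields[0];
}
close $in2;

# Print the data structure, one county per line.
for my $county ( sort keys %{ $hoh{Wood} } ) {
    print "$county: @{ $hoh{Wood}{$county} }\n";
}
```

With real files, replace \$file1 and \$file2 with the file names in the two open calls.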
For your information (but mainly for my own amusement!) you should be aware that Perl is powerful enough to get that job done in a few keystrokes. perlrun describes the switches used below, but the code below is tricky and uses some advanced techniques like the BEGIN and END blocks, the ternary operator ' ? : ' and the eof trick (setting $f=0, where $f means $first) to parse the second file in a different way from the first one... If you want to learn more you are welcome!
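The eof trick is easier to see in isolation. The sketch below is a small standalone script, not the oneliner itself: it writes two throwaway files with made-up contents (via the core module File::Temp), reads them through <> as if they had been given on the command line, and flips a $first flag when eof reports the end of the first file. The array names @from_first and @from_rest are only illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two temporary files with made-up contents, only for this demonstration.
my ( $fh1, $name1 ) = tempfile( UNLINK => 1 );
print {$fh1} "a\nb\n";
close $fh1;
my ( $fh2, $name2 ) = tempfile( UNLINK => 1 );
print {$fh2} "c\nd\n";
close $fh2;

# Pretend both names were passed on the command line.
@ARGV = ( $name1, $name2 );

my $first = 1;    # true while we are still inside the first file
my ( @from_first, @from_rest );

while ( my $line = <> ) {
    chomp $line;
    if ($first) { push @from_first, $line }
    else        { push @from_rest,  $line }

    # eof without parentheses is true at the end of each file read via <>,
    # so the flag is cleared right after the last line of the first file.
    $first = 0 if eof;
}

print "first file: @from_first\n";
print "other files: @from_rest\n";
```

Note that eof() with empty parentheses behaves differently: it is true only at the end of the very last file, so it would not work for this purpose.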
Given file-1.txt and file-2.txt, the two files you describe, the following oneliner does the trick:
    # pay attention to windows doublequote around the oneliner, on Linux is perl -e '...' not perl -e ".."
    perl -MData::Dumper -F";" -lane "BEGIN{$f=1};$f?$hoh{$F[1]}{$F[0]}=[]:push @{$hoh{Wood}{$F[2]}},$F[0];$f=0 if eof; END {print Dumper \%hoh}" file-1.txt file-2.txt

    $VAR1 = {
      'Wood' => {
        'NASSAUCOUNTY' => [
          '920232',
          '727659',
          '471817',
          '983043',
          '578286'
        ],
        'SUWANNEECOUNTY' => [
          '640802',
          '403866',
          '828788',
          '751490',
          '972562',
          '367541',
          '481360'
        ],
        'CLAYCOUNTY' => [
          '119736',
          '448094',
          '206893',
          '333743',
          '172534',
          '785275',
          '995932',
          '223488',
          '433512'
        ]
      }
    };
With such a cryptic oneliner you can enjoy the ability of the core module B::Deparse to help your understanding: as per the synopsis of that module, you just need to call the oneliner adding -MO=Deparse right after perl to see it in a somewhat more readable form:
    # spacing and comments added
    perl -MO=Deparse -MData::Dumper -F";" -lane "BEGIN{$f=1};$f?$hoh{$F[1]}{$F[0]}=[]:push @{$hoh{Wood}{$F[2]}},$F[0]; $f=0 if eof; END{print Dumper \%hoh}" file-1.txt file-2.txt

    # B::Deparse translates the above into the below:

    # the following block is added by the -l switch
    # automatically handling newlines
    BEGIN { $/ = "\n"; $\ = "\n"; }
    use Data::Dumper;
    # this while block is added by the -n switch (see also -p for completeness)
    LINE: while (defined($_ = <ARGV>)) {
        chomp $_;
        # the special @F array (see perlvar) is called into play by the -a (autosplit) switch
        # (UPDATE) the -F";" switch states that we are going to automatically split using ';' instead of the default (space)
        our(@F) = split(/;/, $_, 0);
        # this is our BEGIN block, explicitly put in our oneliner, setting $f = 1 before doing anything
        # please note that BEGIN blocks are executed as soon as they are seen by the compiler, so even if put inside the while
        # it is executed only once, when it is seen
        sub BEGIN {
            $f = 1;
        }
        # our part: the ternary '? :' is ' IF ? THEN : ELSE'
        # IF $f is true (ie we are processing the first file)
        $f ?
        # THEN use @F elements to create a subkey assigning to it an empty anonymous array []
        $hoh{$F[1]}{$F[0]} = [] :
        # ELSE (we are processing the second file)
        # push into the subkey (that holds an array) the value of the first element
        push(@{$hoh{'Wood'}{$F[2]};}, $F[0]);
        # the trick: eof is true when we reach the end of a file: so at the end of file-1 it happens to be true: we trap it and
        # we change $f to 0 ie we are stating we are processing, from now on, the second (or third) file
        $f = 0 if eof;
        # our END block, executed only once at the end of the program, is used to dump the datastructure
        # it can be concealed using the 'eskimo operator' trick; I prefer Data::Dump over Data::Dumper
        sub END {
            print Dumper(\%hoh);
        }
        ;
    }
    # B::Deparse tells us the syntax is OK
    -e syntax OK
L*