in reply to Trouble iterating through a hash
First, some scattered suggestions about your code style: use vars qw($ID $sire... is not useful here. use vars has been superseded by our (see vars and use vars), and in any case you just need my ($ID, $sire ... to declare lexically scoped variables.
Second, your open (FILE1, "<wholepedigree_F.txt") or .. is rather old-fashioned: always use the three-argument open, like open my $filehandle, '<', $path or ... and use a lexical filehandle instead of the bareword form. The low-precedence or is there so that the parentheses are not needed.
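A minimal sketch of both suggestions together. The sample rows and the variable names ($id, $sire, $dam) are made up for illustration, and an in-memory string stands in for the file so the sketch runs on its own; in a real script the third argument to open would be a path.

```perl
use strict;
use warnings;

# Made-up sample data; a real script would pass a file path to open instead.
my $data = "3162 501093 0\n3163 2958 0\n";

# Three-argument open with a lexical filehandle. The low-precedence "or"
# means no parentheses are needed around open's arguments.
open my $fh, '<', \$data or die "Cannot open input: $!";

my @ids;
while (my $line = <$fh>) {
    chomp $line;
    # Lexically scoped variables declared with my; no "use vars" required.
    my ($id, $sire, $dam) = split ' ', $line;
    push @ids, $id;
}
close $fh;

print "@ids\n";
```

The filehandle $fh goes out of scope like any other lexical, which is one reason to prefer it over a bareword such as FILE1.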
Now: you already got good solutions and many wise suggestions, but since you say: I want the output to match column 4 of damF.txt and wholepedigree_F.txt and print columns 3 and 5 of whole_pedigree and columns 1,2,3 of damF.txt
I'd answer with a one-liner (pay attention to the MSWin32 double quotes; on a Unix shell use single quotes instead). PS: this assumes I understand your needs correctly as they are stated.
perl -F"\s+" -lane "$sec?(push $h{$F[0]},@F[2,4]):($h{$F[3]}=[@F[0..2]]);$sec++ if eof;END{print map{qq($_ = @{$h{$_}}\n)}sort keys %h}" mergedhash01.txt mergedhash02.txt

3162 = 501093 0 0 501093 0
3163 = 2958 0 0 2958 0
3164 = 1895 0 0 1895 0
3165 = 1382 0 0 1382 0
3166 = 2869 0 0 2869 0
See perlrun for the -lane command-line options and for -F too. The END block is executed after the implicit while loop created by the -n switch.
In brief, -a is autosplit mode and populates the special variable @F ('F' for fields, see perlvar for this). I have specified, with the -F option, that I want the current line split on \s+. The default is split ' ', which also splits on runs of whitespace but additionally skips leading whitespace.
Then -l takes care of line endings for us (no need to chomp, and print appends a newline), and -n puts an implicit while loop around all the code, which is then executed for every line of the files passed as arguments.
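To make the switches above concrete, here is a small sketch of what -l and -a do to a single line, comparing the default split ' ' with the /\s+/ pattern given to -F. The sample line is made up; the leading spaces are there on purpose to show where the two patterns differ.

```perl
use strict;
use warnings;

# Made-up input line with leading whitespace and mixed separators.
my $line = "  3162 501093\t0   0\n";

chomp $line;                        # roughly what -l does to each input line

my @default = split ' ',   $line;   # what -a does when no -F is given
my @custom  = split /\s+/, $line;   # what -a does with -F"\s+"

# split ' ' skips leading whitespace; /\s+/ yields an empty first field.
print scalar(@default), " fields with split ' '\n";
print scalar(@custom),  " fields with split /\\s+/\n";
```

If the input lines never start with whitespace, both forms produce the same @F.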
The $sec++ if eof is the tricky part: it creates the variable $sec (for 'second') and sets it to 1. That is, while processing the first file, when perl meets the end of the file (see eof) it sets this switch-like scalar to 1 (well, the variable becomes 2 at the end of the second file, but by then we do not need it anymore).
Having this switch lets us know when we are processing the second file: in fact the core of the one-liner is an IF ? THEN : ELSE ternary operator based on the value of the $sec variable. If it is false (we are processing the first file) we populate a hash entry $h{$F[3]} with an anonymous array containing fields 1, 2 and 3 ( @F[0..2] ).
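If $sec is true we are processing the second file, so we push fields 3 and 5 ( @F[2,4] ) onto the already created anonymous array. Note that push on a scalar holding an array reference only works on perl 5.14 through 5.22; on other versions write push @{$h{$F[0]}},@F[2,4] instead.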
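The same logic can be sketched as a plain script. This is not the one-liner itself: it uses two explicit loops instead of the $sec/eof switch, and the two in-memory strings are made-up sample rows standing in for the two input files.

```perl
use strict;
use warnings;

# Made-up sample rows; real code would open the two files by path.
my $first  = "501093 0 0 3162\n2958 0 0 3163\n";
my $second = "3162 F 501093 X 0\n3163 F 2958 X 0\n";

my %h;

# First file: key on column 4, store columns 1..3 in an anonymous array.
open my $in1, '<', \$first or die "Cannot open first input: $!";
while (my $line = <$in1>) {
    my @F = split ' ', $line;
    $h{ $F[3] } = [ @F[0 .. 2] ];
}
close $in1;

# Second file: key on column 1, append columns 3 and 5 to the stored array.
# The explicit @{ ... } dereference works on every perl version.
open my $in2, '<', \$second or die "Cannot open second input: $!";
while (my $line = <$in2>) {
    my @F = split ' ', $line;
    push @{ $h{ $F[0] } }, @F[2, 4];
}
close $in2;

# Build one report line per key, sorted as in the one-liner.
my @report = map { "$_ = @{$h{$_}}" } sort keys %h;
print "$_\n" for @report;
```

Like the one-liner, this autovivifies an entry for any key that appears only in the second input; add an exists check before the push if such rows should be skipped.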
If you add -MO=Deparse before the other options you'll see the oneliner expanded a bit:
perl -MO=Deparse -F"\s+" -lane "$sec?(push $h{$F[0]},@F[2,4]):($h{$F[3]}=[@F[0..2]]);$sec++ if eof;END{print map{qq($_ = @{$h{$_}}\n)}sort keys %h}" mergedhash01.txt mergedhash02.txt

BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
    chomp $_;
    our(@F) = split(/\s+/, $_, 0);
    $sec ? push($h{$F[0]}, @F[2, 4]) : ($h{$F[3]} = [@F[0..2]]);
    ++$sec if eof;
    sub END {
        print map({"$_ = @{$h{$_};}\n";} sort(keys %h));
    }
    ;
}
-e syntax OK
HtH and have fun!
L*