Re: Trouble iterating through a hash -- oneliner explained

Hello e9292 and welcome to the monastery!

First, a few suggestions about your code style: use vars qw($ID $sire... is not useful here. The vars pragma is obsolete and has been superseded by our (see vars and use vars); in any case you just need my ($ID, $sire ... to declare lexically scoped variables.
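As a minimal sketch of that suggestion (the values are invented for illustration; only the names $ID and $sire come from your script):

```perl
use strict;
use warnings;

# Lexically scoped variables declared with my: they are visible only
# inside the enclosing block or file, unlike the package globals that
# "use vars" creates.
my ($ID, $sire) = (3162, 501093);    # sample values, invented for this sketch

print "ID=$ID sire=$sire\n";
```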

Second, your open (FILE1, "<wholepedigree_F.txt") or .. is rather old-fashioned: always use the three-argument open, like open my $filehandle, '<', $path or ... with a lexical filehandle instead of the bareword form. Note also that the low-precedence or is what makes the parentheses unnecessary.
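A small runnable sketch of the three-argument form, assuming a throwaway temporary file in place of your real path (File::Temp is used only so that the example runs anywhere):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file so the sketch runs anywhere
# (the real script would use its own path instead).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\nsecond line\n";
close $tmp_fh or die "Cannot close $path: $!";

# Three-argument open with a lexical filehandle; the low-precedence
# "or" makes parentheses around open's arguments unnecessary.
open my $filehandle, '<', $path or die "Cannot open $path: $!";
my @lines = <$filehandle>;
close $filehandle;

chomp @lines;
print scalar(@lines), " lines read\n";
```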

Now: you got good solutions and many wise suggestions, but since you say "I want the output to match column 4 of damF.txt and wholepedigree_F.txt and print columns 3 and 5 of whole_pedigree and columns 1,2,3 of damF.txt",

I'd answer with a oneliner (pay attention to the MSWin32 double quotes; on a Unix-like shell use single quotes around the code instead). PS: this assumes I understand your needs correctly as they are stated.

perl -F"\s+" -lane "$sec?(push $h{$F[0]},@F[2,4]):($h{$F[3]}=[@F[0..2]]);$sec++ if eof;END{print map{qq($_ = @{$h{$_}}\n)}sort keys %h}" mergedhash01.txt mergedhash02.txt


3162 = 501093 0 0 501093 0
3163 = 2958 0 0 2958 0
3164 = 1895 0 0 1895 0
3165 = 1382 0 0 1382 0
3166 = 2869 0 0 2869 0

See perlrun for the -l, -a, -n and -e command-line options, and for -F too. The END block is executed after the implicit while loop created by the -n switch.

In brief, -a is autosplit mode: it populates the special variable @F ('F' for fields, see perlvar for this). With the -F option I have specified that I want the current line split on \s+ instead of the default, the special ' ' pattern (which splits on runs of whitespace and skips leading whitespace).
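A small sketch of how the two split patterns differ; the sample line is invented, and the default shown is the split ' ' that perlrun documents for -a:

```perl
use strict;
use warnings;

my $line = "  3162   501093 0";    # invented sample line with leading blanks

# What -F"\s+" asks for: a regex split, which yields an empty
# leading field when the line starts with whitespace.
my @by_regex = split /\s+/, $line;

# What plain -a does: the special ' ' pattern, which skips leading
# whitespace and then splits on runs of whitespace.
my @by_default = split ' ', $line;

print scalar(@by_regex), " fields vs ", scalar(@by_default), " fields\n";
# prints: 4 fields vs 3 fields
```

So the two forms only differ for lines that start with whitespace, in which case the field indexes shift by one with \s+.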

Then -l takes care of line endings for us (no need to chomp), and -n puts an implicit while loop around all the code, which is therefore executed for every line of the files passed as arguments.

The $sec++ if eof is tricky: this part creates the variable $sec (for 'second') and sets it to 1. That is, while processing the first file, when perl meets the end of the file (see eof) it sets this switch-like scalar to 1 (well, the variable is set to 2 at the end of the second file, but by then we do not need it anymore).
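The flag trick can be tried in isolation with the sketch below. It assumes two throwaway temporary files in place of the real inputs; @seen is a hypothetical helper array that records the value of $sec seen by each line:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two throwaway files stand in for the two input files.
my @paths;
for my $content ("a\nb\n", "c\nd\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    push @paths, $path;
}

my $sec = 0;    # false while reading the first file
my @seen;       # records "flag:line" for every line read

{
    local @ARGV = @paths;    # the files that -n would loop over
    while (my $line = <>) {
        chomp $line;
        push @seen, "$sec:$line";
        $sec++ if eof;       # true after the last line of each file
    }
}

print "@seen\n";    # prints: 0:a 0:b 1:c 1:d
```

The last line of the first file is still handled with $sec false, because the increment comes after the work on that line.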

Having this switch lets us know when we are processing the second file: in fact the core of the oneliner is an IF ? THEN : ELSE ternary operator based on the value of the $sec variable. If it is false (we are processing the first file) we populate a hash entry $h{$F[3]} with an anonymous array containing fields 1, 2 and 3 ( @F[0..2] ).

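If $sec is true we are processing the second file, so we push fields 3 and 5 ( @F[2,4] ) onto the already created anonymous array. Note that push on an array reference, as written here, only works on Perl 5.14 to 5.22; on other versions write push @{$h{$F[0]}}, @F[2,4] instead.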
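The same logic can be sketched as an ordinary script. This is only an illustration under stated assumptions: merge_rows is a hypothetical helper, the sample rows are invented rather than taken from the real files, and the lines are passed in as arrays so that the example runs without any input file:

```perl
use strict;
use warnings;

# merge_rows is a hypothetical helper for this sketch: it takes the
# lines of the first and of the second file and returns the merged hash.
sub merge_rows {
    my ($first_lines, $second_lines) = @_;
    my %h;

    # First file: key on field 4, store fields 1, 2 and 3.
    for my $line (@$first_lines) {
        my @F = split ' ', $line;
        $h{ $F[3] } = [ @F[0 .. 2] ];
    }

    # Second file: key on field 1, append fields 3 and 5.
    # The explicit @{ ... } dereference does not rely on push
    # accepting an array reference.
    for my $line (@$second_lines) {
        my @F = split ' ', $line;
        push @{ $h{ $F[0] } }, @F[2, 4];
    }

    return \%h;
}

# Invented sample rows, not taken from the real data files.
my $merged = merge_rows(
    [ 'a1 a2 a3 K1', 'b1 b2 b3 K2' ],
    [ 'K1 x y z w',  'K2 p q r s' ],
);

print "$_ = @{ $merged->{$_} }\n" for sort keys %$merged;
# K1 = a1 a2 a3 y w
# K2 = b1 b2 b3 q s
```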

If you add -MO=Deparse before the other options you'll see the oneliner expanded a bit:

perl -MO=Deparse -F"\s+" -lane "$sec?(push $h{$F[0]},@F[2,4]):($h{$F[3]}=[@F[0..2]]);$sec++ if eof;END{print map{qq($_ = @{$h{$_}}\n)}sort keys %h}" mergedhash01.txt mergedhash02.txt

BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
    chomp $_;
    our(@F) = split(/\s+/, $_, 0);
    $sec ? push($h{$F[0]}, @F[2, 4]) : ($h{$F[3]} = [@F[0..2]]);
    ++$sec if eof;
    sub END {
        print map({"$_ = @{$h{$_};}\n";} sort(keys %h));
    }
    ;
}
-e syntax OK

HtH and have fun!

There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
