The very heart of the code is the creation of the needed data structure:
we create each key of the hash %r as a stringified join of fields 0..3 of the autosplit @F array (see -F"\s+" and -a in perlrun). This gives us the uniqueness of the first four fields, used as a key. The value for that key is treated as an array, and onto this array we push another, anonymous array, [@F[4,5]], containing the last two fields. One such array is pushed every time the key is found again across the files read:

    push @{ $r{ join ' ' x 8, @F[0..3] } }, [ @F[4,5] ];
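For reference, here is a minimal standalone sketch of this building step, with two hypothetical input lines inlined via __DATA__ standing in for the real files:

    use strict;
    use warnings;

    my %r;
    while (<DATA>) {
        my @F = split /\s+/;    # what -F"\s+" -a does for every input line
        # fields 0..3, joined, form the key; the last two fields are
        # pushed as an anonymous two-element array onto that key's value
        push @{ $r{ join ' ' x 8, @F[0..3] } }, [ @F[4,5] ];
    }

    __DATA__
    I 33 C C 0.5 2
    I 33 C C 1 2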
Using Data::Dump's dd function as the first thing in the END block, you'll see the data structure:
( "I 33 C C", [[0.5, 2], [1, 2]], "I 21 B A", [[1, 6], [1, 6]], "I 40 D D", [[1, 2], [1, 5]], "I 56 A E", [[1, 2]], "I 9 A B", [[0.25, 6], ["0.30", 8]], )
When all files have been processed, the END block comes into play: for each key of the %r hash we use map to process all the arrays stored as that key's value. Every first element is added to $x (these come from all the $F[4] values!) and every second element is added to $y (coming from all the $F[5] values). The variables $x and $y are declared with my, so they are reset for every key of %r that we process.
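In isolation, and on a tiny hand-built %r copied from the dump above, the accumulation step looks like this (a sketch, not the original code verbatim):

    my %r = ( 'I 33 C C' => [ [0.5, 2], [1, 2] ] );
    for my $k ( keys %r ) {
        my ( $x, $y );    # reset for every key
        map { $x += $_->[0]; $y += $_->[1] } @{ $r{$k} };
        # here $x is 1.5 (sum of the $F[4] column) and $y is 4 (sum of $F[5])
    }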
Now that everything is ready, and while we are still processing the current key of %r, we print the key, a tab, then $x divided by how many values we used (scalar @{$r{$k}}, i.e. the number of elements in the array held in $r{$k}), which is the average you asked for. Then we print the total value of $y, and we are done.
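Putting the pieces together, a hedged reconstruction of the whole one-liner could read as follows (the exact original may differ in spacing and option order; file1 and file2 are placeholder file names):

    perl -MData::Dump -F"\s+" -lane '
        push @{ $r{ join " " x 8, @F[0..3] } }, [ @F[4,5] ];
        END {
            dd %r;    # inspect the data structure
            for my $k ( sort keys %r ) {
                my ( $x, $y );    # reset for every key
                map { $x += $_->[0]; $y += $_->[1] } @{ $r{$k} };
                print join "\t", $k, $x / scalar @{ $r{$k} }, $y;
            }
        }' file1 file2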
L*