Anas has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone, I am facing a problem with this file:

position: 1 neighbors: 5

xxxx A 1 G L -135.185 178.150 179.885 0.000 0.000 0.000 1 0.000 P 1 F 1
xxxx A 2 S L -49.668 -31.158 177.985 0.000 0.000 0.000 1 0.000 P 1 F 1
xxxx A 3 M L -84.632 -45.215 177.518 0.000 0.000 0.000 1 0.000 P 1 F 1
xxxx A 1 G L -115.240 -69.349 -171.360 0.000 0.000 0.000 1 0.000 P 1 F 2
xxxx A 2 S L -84.776 -143.809 173.303 0.000 0.000 0.000 1 0.000 P 1 F 2
xxxx A 3 M L -50.674 158.544 177.747 0.000 0.000 0.000 1 0.000 P 1 F 2
xxxx A 1 G L -106.682 126.232 177.885 0.000 0.000 0.000 1 0.000 P 1 F 3
xxxx A 2 S L -20.124 24.502 -179.585 0.000 0.000 0.000 1 0.000 P 1 F 3
xxxx A 3 M L -60.092 2.018 -178.549 0.000 0.000 0.000 1 0.000 P 1 F 3
xxxx A 1 G L -115.172 -16.017 -179.179 0.000 0.000 0.000 1 0.000 P 1 F 4
xxxx A 2 S L -143.860 148.913 -179.781 0.000 0.000 0.000 1 0.000 P 1 F 4
xxxx A 3 M L -81.755 -23.354 -174.564 0.000 0.000 0.000 1 0.000 P 1 F 4
xxxx A 1 G L -71.907 147.690 -178.617 0.000 0.000 0.000 1 0.000 P 1 F 5
xxxx A 2 S L -81.417 52.986 179.983 0.000 0.000 0.000 1 0.000 P 1 F 5
xxxx A 3 M L -83.126 -56.051 178.109 0.000 0.000 0.000 1 0.000 P 1 F 5

I need to average the numeric values in columns 6 and 7 (I did not count the spaces) so that I get three lines. Line 1 is the average of the first line of each of the five groups, line 2 is the average of the second line of each of the five groups, and so on. Please help me; I am new to Perl and I have been struggling with this file for some time.

Many thanks

2018-04-04 Athanasius added code tags

Re: averaging a group of lines with numeric value in a text file
by kcott (Archbishop) on Apr 03, 2018 at 09:02 UTC

    G'day Anas,

    Welcome to the Monastery.

    It's difficult to discern what's descriptive text and what's data. Please always put your data within <code>...</code> tags; do the same for code and any output which appears to you in a fixed-width font (e.g. error messages). This allows us to see exactly what you see and to download a verbatim copy (e.g. for fixing, testing, etc.).

    As already pointed out, if you don't show us what you tried, we can't really provide much in the way of improvements, fixes, and so on. Please post your code and explain what part of it you are having difficulties with. Here's a couple of hints, but it's purely guesswork on my part and may, in fact, be no help at all.

    Your file looks like it contains CSV data with space separators. Text::CSV has been specifically written to handle this type of data: I recommend you use it. If you also have Text::CSV_XS installed, it will run faster.

    You can probably use $., the input line number, in your averaging calculations.
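As a rough illustration of the `$.` hint, here is a minimal sketch, not a tested solution for the real file. It assumes the input holds only the data rows (no header or blank lines) and that every group is exactly three consecutive lines; the sub name `average_by_position` is made up for this example, and the sample rows are the original data trimmed to their first seven fields.

```perl
use strict;
use warnings;

# Hypothetical helper: averages fields 5 and 6 (counting from 0) for each
# line position within a group of $group_size consecutive lines.
# Assumption: the handle yields data rows only, so $. maps cleanly onto
# "first line of a group", "second line of a group", and so on.
sub average_by_position {
    my ($fh, $group_size) = @_;
    my (@sum, @count);
    while (my $line = <$fh>) {
        my @field = split ' ', $line;
        my $pos = ($. - 1) % $group_size;    # 0, 1, 2, 0, 1, 2, ...
        $sum[$pos][0] += $field[5];
        $sum[$pos][1] += $field[6];
        $count[$pos]++;
    }
    return map { [ $sum[$_][0] / $count[$_], $sum[$_][1] / $count[$_] ] }
           0 .. $#sum;
}

# The rows from the question, trimmed to the first seven fields.
my $sample = <<'END';
xxxx A 1 G L -135.185 178.150
xxxx A 2 S L -49.668 -31.158
xxxx A 3 M L -84.632 -45.215
xxxx A 1 G L -115.240 -69.349
xxxx A 2 S L -84.776 -143.809
xxxx A 3 M L -50.674 158.544
xxxx A 1 G L -106.682 126.232
xxxx A 2 S L -20.124 24.502
xxxx A 3 M L -60.092 2.018
xxxx A 1 G L -115.172 -16.017
xxxx A 2 S L -143.860 148.913
xxxx A 3 M L -81.755 -23.354
xxxx A 1 G L -71.907 147.690
xxxx A 2 S L -81.417 52.986
xxxx A 3 M L -83.126 -56.051
END

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @avg = average_by_position($fh, 3);
close $fh;

printf "line %d: %.4f %.4f\n", $_ + 1, @{ $avg[$_] } for 0 .. $#avg;
```

If the real file keeps its "position: ..." header or blank lines, `$.` would no longer line up with the group positions, so those lines would need to be skipped and counted separately.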

    Update: Some minor typos corrected. No substantive change to original information.

    — Ken

Re: averaging a group of lines with numeric value in a text file
by gannett (Novice) on Apr 03, 2018 at 09:35 UTC
    Hi. This part of the task description is unclear: "first line of each of the five groups". Which five groups do you mean? After putting the lines in a file called dat.txt, you can get started by pulling out the columns with the autosplit switch -a.

    perl -ane '{print $F[5]," ",$F[6],"\n";}' < dat.txt
    -135.185 178.150
    -49.668 -31.158
    ... snip
    -81.417 52.986
    -83.126 -56.051

    Now do a simple average:

    perl -ane '{$t5+=$F[5]; $t6+=$F[6]; $t+=1} END{print $t5/$t," ",$t6/$t,"\n";}' < dat.txt
    -85.6206666666667 30.2721333333333

    I am just guessing at what is meant by "5 groups". This example uses the contents of field 3 (counting from 0) to do the grouping:

    perl -ane '{$t5{$F[3]}+=$F[5]; $t6{$F[3]}+=$F[6]; $t{$F[3]}+=1} END{foreach $i (keys %t5) {print "Group $i ",$t5{$i}/$t{$i}," ",$t6{$i}/$t{$i},"\n";}} ' < dat.txt
    Group S -75.969 10.2868
    Group G -108.8372 73.3412
    Group M -72.0558 7.1884
Re: averaging a group of lines with numeric value in a text file
by roboticus (Chancellor) on Apr 03, 2018 at 13:41 UTC

    Anas:

    OK, so you have some columns in your report that you're using to create groups, and some columns to collect data for. You don't specify everything, so I'll guess:

    my @columns_to_group_on = (2, 3);
    my @columns_to_average = (5, 6);

    Then, as you read the file, you'll want to split the data into columns. Then from those columns, you'll want to figure out what the group key will be, and the data you're capturing:

    my $group = join("/", @columns[@columns_to_group_on]);
    my @row = @columns[@columns_to_average];

    And then update your data for the group:

    ++$data{$group}{ROWCNT};
    $data{$group}{SUMS}[$_] += $row[$_] for 0 .. $#row;

    Once you've collected all the data from the file, you need only build the report, so you compute your averages from the data you've collected, and print it:

    # Plain list assignment: wrapping the map in [ ] would store a single array reference in @avg
    my @avg = map { $_/$data{$group}{ROWCNT} } @{$data{$group}{SUMS}};
    print "GROUP $group AVGS [", join(", ", @avg), "]\n";

    Add in your loops and control logic to get something like this:

    $ perl pm_121222.pl
    Collecting the data into groups:
    1: xxxx A 1 G L -135.185 178.150 179.885 0.000 0.000 0.000 1 0.000 P 1 F 1
    2: xxxx A 2 S L -49.668 -31.158 177.985 0.000 0.000 0.000 1 0.000 P 1 F 1
    3: xxxx A 3 M L -84.632 -45.215 177.518 0.000 0.000 0.000 1 0.000 P 1 F 1
    4: xxxx A 1 G L -115.240 -69.349 -171.360 0.000 0.000 0.000 1 0.000 P 1 F 2
    5: xxxx A 2 S L -84.776 -143.809 173.303 0.000 0.000 0.000 1 0.000 P 1 F 2
    6: xxxx A 3 M L -50.674 158.544 177.747 0.000 0.000 0.000 1 0.000 P 1 F 2
    7: xxxx A 1 G L -106.682 126.232 177.885 0.000 0.000 0.000 1 0.000 P 1 F 3
    8: xxxx A 2 S L -20.124 24.502 -179.585 0.000 0.000 0.000 1 0.000 P 1 F 3
    9: xxxx A 3 M L -60.092 2.018 -178.549 0.000 0.000 0.000 1 0.000 P 1 F 3
    10: xxxx A 1 G L -115.172 -16.017 -179.179 0.000 0.000 0.000 1 0.000 P 1 F 4
    11: xxxx A 2 S L -143.860 148.913 -179.781 0.000 0.000 0.000 1 0.000 P 1 F 4
    12: xxxx A 3 M L -81.755 -23.354 -174.564 0.000 0.000 0.000 1 0.000 P 1 F 4
    13: xxxx A 1 G L -71.907 147.690 -178.617 0.000 0.000 0.000 1 0.000 P 1 F 5
    14: xxxx A 2 S L -81.417 52.986 179.983 0.000 0.000 0.000 1 0.000 P 1 F 5
    15: xxxx A 3 M L -83.126 -56.051 178.109 0.000 0.000 0.000 1 0.000 P 1 F 5
    Generating the report:
    GROUP 1/G AVGS [-108.8372, 73.3412]
    GROUP 2/S AVGS [-75.969, 10.2868]
    GROUP 3/M AVGS [-72.0558, 7.1884]
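For readers who want to see the fragments above joined up, here is one possible way to add the loops and control logic. It is a sketch under assumptions, not the actual pm_121222.pl: the sub names `collect_groups` and `group_averages` are invented for this example, lines with too few fields (such as the "position: ..." header) are simply skipped, and the sample rows are the original data trimmed to their first seven fields.

```perl
use strict;
use warnings;

my @columns_to_group_on = (2, 3);
my @columns_to_average  = (5, 6);

# Hypothetical helper: read rows from a filehandle and accumulate, per group,
# a row count and a running sum for each averaged column.
sub collect_groups {
    my ($fh) = @_;
    my %data;
    while (my $line = <$fh>) {
        my @columns = split ' ', $line;
        # Assumption: anything too short to hold the averaged columns is
        # a header or blank line and can be ignored.
        next unless @columns > $columns_to_average[-1];
        my $group = join("/", @columns[@columns_to_group_on]);
        my @row   = @columns[@columns_to_average];
        ++$data{$group}{ROWCNT};
        $data{$group}{SUMS}[$_] += $row[$_] for 0 .. $#row;
    }
    return \%data;
}

# Hypothetical helper: turn the collected sums into per-group averages.
sub group_averages {
    my ($data) = @_;
    my %avg;
    for my $group (keys %$data) {
        $avg{$group} = [ map { $_ / $data->{$group}{ROWCNT} }
                         @{ $data->{$group}{SUMS} } ];
    }
    return \%avg;
}

# Header plus the rows from the question, trimmed to the first seven fields.
my $input = <<'END';
position: 1 neighbors: 5

xxxx A 1 G L -135.185 178.150
xxxx A 2 S L -49.668 -31.158
xxxx A 3 M L -84.632 -45.215
xxxx A 1 G L -115.240 -69.349
xxxx A 2 S L -84.776 -143.809
xxxx A 3 M L -50.674 158.544
xxxx A 1 G L -106.682 126.232
xxxx A 2 S L -20.124 24.502
xxxx A 3 M L -60.092 2.018
xxxx A 1 G L -115.172 -16.017
xxxx A 2 S L -143.860 148.913
xxxx A 3 M L -81.755 -23.354
xxxx A 1 G L -71.907 147.690
xxxx A 2 S L -81.417 52.986
xxxx A 3 M L -83.126 -56.051
END

open my $in, '<', \$input or die "Cannot open in-memory file: $!";
my $averages = group_averages(collect_groups($in));
close $in;

# Sort the keys so the report order is stable; hash order is not.
for my $group (sort keys %$averages) {
    print "GROUP $group AVGS [", join(", ", @{ $averages->{$group} }), "]\n";
}
```

Grouping on columns 2 and 3 is the same guess made earlier in this reply; changing `@columns_to_group_on` is all that is needed to try a different grouping.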

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: averaging a group of lines with numeric value in a text file
by Anonymous Monk on Apr 03, 2018 at 06:55 UTC
    "have been struggling with this file for some time"

    Show what you tried.