PerlMonks  

combining two csv files by using math operations

by ng0177 (Acolyte)
on May 16, 2021 at 11:28 UTC ([id://11132653])

ng0177 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I am trying to combine two .csv files into one result file that holds three average values and one difference per row for plotting, but I cannot get it to work. The problem differs from multiplying a column of data in a csv file by a factor. Appreciate your help!
#!/usr/bin/perl -w
use File::Basename;

# filename
my ($nameA,$pathA,$suffixA) = fileparse($ARGV[0],'\.[^\.]*');
my ($nameB,$pathB,$suffixB) = fileparse($ARGV[1],'\.[^\.]*');
my $nameC = $nameA.'dif';

# read data
open(dataA, $nameA) or die " cannot open/read file:$!\n";
my @headerA; push @headerA, $_ = <dataA> for 1 .. 2;
my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for <dataA>;
pop @multi_arrayA; pop @multi_arrayA; # trash header
close dataA;

open(dataB, $nameB) or die " cannot open/read file:$!\n";
my @headerB; push @headerB, $_ = <dataB> for 1 .. 2;
my @multi_arrayB; push @multi_arrayB, [split(' ', $_)] for <dataB>;
pop @multi_arrayB; pop @multi_arrayB; # trash header
close dataB;

# modify data
my @multi_arrayC;
my ($avgX,$avgY,$avgZ,$difF);
#u_cor_T pi_T_ts TFP eta_is_T_ts
for my $i (0..$#multi_arrayA) {
    $avgX = ( $multi_arrayA[$i]->[0] + $multi_arrayB[$i]->[0] ) / 2.;
    $avgY = ( $multi_arrayA[$i]->[1] + $multi_arrayB[$i]->[1] ) / 2.;
    $avgZ = ( $multi_arrayA[$i]->[2] + $multi_arrayB[$i]->[2] ) / 2.;
    $difF = $multi_arrayA[$i]->[3] - $multi_arrayB[$i]->[3];

    push( $multi_arrayC[$i]->[0], $avgX );
    push( $multi_arrayC[$i]->[1], $avgY );
    push( $multi_arrayC[$i]->[2], $avgZ );
    push( $multi_arrayC[$i]->[3], $difF );
}

# write data
open(dataC, ">".$nameC) or die " cannot open/read file:$!\n";
print dataC join(",", map { sprintf "%E", $_ } @{$_}),"\n" for (@multi_arrayC);
close dataC;

# input
__dataA__
variables
units
1.0 1.0 1.0 4.0 9.99999
2.0 2.0 2.0 4.0 9.99999
3.0 3.0 3.0 4.0 9.99999

__dataB__
variables
units
3.0 3.0 3.0 5.0 9.99999
2.0 2.0 2.0 5.0 9.99999
1.0 1.0 1.0 5.0 9.99999

# output (expected
__dataC__
variables
units
2.00000E0 2.00000E0 2.00000E0 1.00000E0
2.00000E0 2.00000E0 2.00000E0 1.00000E0
2.00000E0 2.00000E0 2.00000E0 1.00000E0

Replies are listed 'Best First'.
Re: combining two csv files by using math operations
by tybalt89 (Monsignor) on May 16, 2021 at 14:44 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11132653
    use warnings;
    #use File::Basename;

    my ($nameA, $nameB) = qw( data.A data.B ); # FIXME just for testing

    ## filename
    #my ($nameA,$pathA,$suffixA) = fileparse($ARGV[0],'\.[^\.]*');
    #my ($nameB,$pathB,$suffixB) = fileparse($ARGV[1],'\.[^\.]*');
    my $nameC = $nameA =~ s/\.\K.*/dif/r;

    # read data
    open(dataA, $nameA) or die " cannot open/read file $nameA:$!\n";
    my @headerA; push @headerA, $_ = <dataA> for 1 .. 2;
    my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for <dataA>;
    #pop @multi_arrayA; pop @multi_arrayA; # trash header
    close dataA;

    open(dataB, $nameB) or die " cannot open/read file $nameB:$!\n";
    my @headerB; push @headerB, $_ = <dataB> for 1 .. 2;
    my @multi_arrayB; push @multi_arrayB, [split(' ', $_)] for <dataB>;
    #pop @multi_arrayB; pop @multi_arrayB; # trash header
    close dataB;

    # modify data
    my @multi_arrayC;
    my ($avgX,$avgY,$avgZ,$difF);
    #u_cor_T pi_T_ts TFP eta_is_T_ts
    for my $i (0..$#multi_arrayA) {
        $avgX = ( $multi_arrayA[$i]->[0] + $multi_arrayB[$i]->[0] ) / 2.;
        $avgY = ( $multi_arrayA[$i]->[1] + $multi_arrayB[$i]->[1] ) / 2.;
        $avgZ = ( $multi_arrayA[$i]->[2] + $multi_arrayB[$i]->[2] ) / 2.;
        $difF = $multi_arrayA[$i]->[3] - $multi_arrayB[$i]->[3];

    #    push( $multi_arrayC[$i]->[0], $avgX );
    #    push( $multi_arrayC[$i]->[1], $avgY );
    #    push( $multi_arrayC[$i]->[2], $avgZ );
    #    push( $multi_arrayC[$i]->[3], $difF );
        $multi_arrayC[$i]->[0] = $avgX;
        $multi_arrayC[$i]->[1] = $avgY;
        $multi_arrayC[$i]->[2] = $avgZ;
        $multi_arrayC[$i]->[3] = $difF;
    }
    use Data::Dump 'dd'; dd \@multi_arrayA, \@multi_arrayB, $nameC, \@multi_arrayC;

    # write data
    open(dataC, ">".$nameC) or die " cannot open/read file $nameC:$!\n";
    print dataC join(" ", map { sprintf "%E", $_ } @$_),"\n" for @multi_arrayC;
    close dataC;

    system "echo following is output file $nameC; cat $nameC"; # FIXME testing

    ## input
    #__dataA__
    #variables
    #units
    #1.0 1.0 1.0 4.0 9.99999
    #2.0 2.0 2.0 4.0 9.99999
    #3.0 3.0 3.0 4.0 9.99999
    #
    #__dataB__
    #variables
    #units
    #3.0 3.0 3.0 5.0 9.99999
    #2.0 2.0 2.0 5.0 9.99999
    #1.0 1.0 1.0 5.0 9.99999
    #
    ## output (expected
    #__dataC__
    #variables
    #units
    #2.00000E0 2.00000E0 2.00000E0 1.00000E0
    #2.00000E0 2.00000E0 2.00000E0 1.00000E0
    #2.00000E0 2.00000E0 2.00000E0 1.00000E0

    Outputs:

    (
      [
        ["1.0", "1.0", "1.0", "4.0", 9.99999],
        ["2.0", "2.0", "2.0", "4.0", 9.99999],
        ["3.0", "3.0", "3.0", "4.0", 9.99999],
      ],
      [
        ["3.0", "3.0", "3.0", "5.0", 9.99999],
        ["2.0", "2.0", "2.0", "5.0", 9.99999],
        ["1.0", "1.0", "1.0", "5.0", 9.99999],
      ],
      "data.dif",
      [[2, 2, 2, -1], [2, 2, 2, -1], [2, 2, 2, -1]],
    )
    following is output file data.dif
    2.000000E+00 2.000000E+00 2.000000E+00 -1.000000E+00
    2.000000E+00 2.000000E+00 2.000000E+00 -1.000000E+00
    2.000000E+00 2.000000E+00 2.000000E+00 -1.000000E+00

      I confirm this code works fine. This wisdom will be used in all derivatives of that code. I also see the advantages of using Data::Dump and of highlighting temporary code with # FIXME comments. Thank you very much! Any ideas on how to handle an empty line that a data file may have at its end?

        replace

        my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for <dataA>;

        with:

        my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for grep /\S/, <dataA>;

        in both places, changing A to B in the second one, of course.
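For anyone who wants to try the blank-line filter without touching the real data files, here is a minimal sketch. It assumes the sample rows from the question plus one trailing empty line, and it reads them from an in-memory filehandle; the names `$text`, `$fh`, and `@rows` are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed sample input: two header lines, three data rows, and a trailing
# blank line like the one asked about.
my $text = join '', map { "$_\n" }
    'variables', 'units',
    '1.0 1.0 1.0 4.0 9.99999',
    '2.0 2.0 2.0 4.0 9.99999',
    '3.0 3.0 3.0 4.0 9.99999',
    '';

# Open the string as an in-memory file so the sketch needs nothing on disk.
open my $fh, '<', \$text or die "cannot open in-memory data: $!\n";

# Skip the two header lines first, as the script in this thread does.
my @header;
push @header, scalar(<$fh>) for 1 .. 2;

# grep /\S/ keeps only lines with at least one non-whitespace character,
# so empty or whitespace-only lines never become rows.
my @rows;
push @rows, [ split ' ', $_ ] for grep { /\S/ } <$fh>;
close $fh;

print scalar(@rows), " data rows\n";
print join(' ', @$_), "\n" for @rows;
```

Because the filter tests every line, it also drops blank lines in the middle of a file, not only at the end.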

Re: combining two csv files by using math operations
by BillKSmith (Monsignor) on May 17, 2021 at 20:09 UTC
    I know that you already have a solution. I believe that a one-pass solution with fewer temporary variables is easier to understand and maintain.
    # copy headers to output
    my $headerA = <$dataA> . <$dataA>;
    my $headerB = <$dataB> . <$dataB>;
    print $dataC "__dataC__\n$headerA";

    while (1) {
        last if (eof($dataA) or eof($dataB));
        my @A = (split( /\s+/, <$dataA>))[0..3];
        my @B = (split( /\s+/, <$dataB>))[0..3];
        my $avgX = ($A[0] + $B[0])/2;
        my $avgY = ($A[1] + $B[1])/2;
        my $avgZ = ($A[2] + $B[2])/2;
        my $difF = ($A[3] - $B[3]);
        printf $dataC "%E %E %E %E\n", $avgX, $avgY, $avgZ, $difF;
    }
    Bill
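The fragment above relies on `$dataA`, `$dataB`, and `$dataC` already being open lexical filehandles. A self-contained sketch of the same one-pass idea follows. It assumes the sample data from the question and uses in-memory filehandles in place of real files; `$textA`, `$textB`, and `$out` are names invented for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data from the question, held in strings (an assumption of this
# sketch; a real script would open the two input files instead).
my $textA = join '', map { "$_\n" } 'variables', 'units',
    '1.0 1.0 1.0 4.0 9.99999', '2.0 2.0 2.0 4.0 9.99999', '3.0 3.0 3.0 4.0 9.99999';
my $textB = join '', map { "$_\n" } 'variables', 'units',
    '3.0 3.0 3.0 5.0 9.99999', '2.0 2.0 2.0 5.0 9.99999', '1.0 1.0 1.0 5.0 9.99999';
my $out = '';

open my $dataA, '<', \$textA or die "cannot open A: $!\n";
open my $dataB, '<', \$textB or die "cannot open B: $!\n";
open my $dataC, '>', \$out   or die "cannot open C: $!\n";

# Copy the two header lines of A to the output; read and discard those of B.
my $headerA = <$dataA> . <$dataA>;
my $headerB = <$dataB> . <$dataB>;
print {$dataC} "__dataC__\n$headerA";

# Process one row of each file at a time until either file runs out.
until (eof($dataA) or eof($dataB)) {
    my @A = (split ' ', scalar(<$dataA>))[0 .. 3];
    my @B = (split ' ', scalar(<$dataB>))[0 .. 3];
    printf {$dataC} "%E %E %E %E\n",
        ($A[0] + $B[0]) / 2,    # average of column 1
        ($A[1] + $B[1]) / 2,    # average of column 2
        ($A[2] + $B[2]) / 2,    # average of column 3
        $A[3] - $B[3];          # difference of column 4
}
close $_ for $dataA, $dataB, $dataC;

print $out;
```

Nothing is stored per row here, so memory use stays constant regardless of file length. A blank trailing line would still need the kind of check discussed earlier in the thread.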
Re: combining two csv files by using math operations
by Anonymous Monk on May 16, 2021 at 13:45 UTC

    I recommend three things:

    1. Fix your compilation errors
    2. Fix your runtime errors
    3. Fix your logic errors

    I could give more specific help if you would give a more specific problem statement than "I do not seem to accomplish that."

    When I try to run your code under Perl 5.32.1 I get

    Experimental push on scalar is now forbidden at fubar.new line 32, near "$avgX )"
    Experimental push on scalar is now forbidden at fubar.new line 33, near "$avgY )"
    Experimental push on scalar is now forbidden at fubar.new line 34, near "$avgZ )"
    Experimental push on scalar is now forbidden at fubar.new line 35, near "$difF )"
    Execution of fubar.new aborted due to compilation errors.
    Is this what you are seeing? A different version of Perl might actually compile this, in which case it is on to the next problem.
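The message appears because `push` expects an array as its first argument, while `$multi_arrayC[$i]->[0]` is a single scalar element. A minimal sketch of two ways to fill a row, assuming the intent was to store four values per row; the values below are placeholders, not computed from the data files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @multi_arrayC;
my ($avgX, $avgY, $avgZ, $difF) = (2, 2, 2, -1);    # placeholder values

# Option 1: dereference the whole row and push onto it. The row does not
# exist yet; Perl autovivifies the array reference on first use.
push @{ $multi_arrayC[0] }, $avgX, $avgY, $avgZ, $difF;

# Option 2: assign an anonymous array to the row in one step.
$multi_arrayC[1] = [ $avgX, $avgY, $avgZ, $difF ];

print join(' ', map { sprintf '%E', $_ } @$_), "\n" for @multi_arrayC;
```

Both options leave each row as a reference to a four-element array, which is the shape the write loop in the question expects.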

Re: combining two csv files by using math operations
by 1nickt (Canon) on May 16, 2021 at 11:39 UTC

    Hi,

    Please provide some sample data that you are feeding to your program. See also sprintf?


    The way forward always starts with a minimal test.
      Thanks. The sample data is listed under __dataA__ and __dataB__. I am not sure how best to avoid opening files when the data already exists in the post. The input format is floats; the output is supposed to be in scientific format.

        Hi again,

        Forgive me but I don't know what "scientific format" is. Please show the expected output from the sample data.

        $ perl -Mstrict -wE 'my $avg = (1.10000E0 + 3.10000E0) / 2.; say sprintf("%E", $avg)'
        2.100000E+00
        ?
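The expected output in the question shows values such as 2.00000E0, while plain `%E` prints six digits after the decimal point and a signed, zero-padded exponent. A minimal sketch follows, assuming five decimals and a compact exponent are what was wanted. `compact_exponent` is a hypothetical helper written for this illustration, not a Perl built-in, and the number of exponent digits `sprintf` emits may vary by platform.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $avg = (1.10000E0 + 3.10000E0) / 2;

my $six  = sprintf '%E',   $avg;    # default precision: six digits after the point
my $five = sprintf '%.5E', $avg;    # explicit precision: five digits after the point

# Hypothetical helper: drop a '+' sign and leading zeros from the exponent,
# keeping a '-' sign when the exponent is negative.
sub compact_exponent {
    my ($s) = @_;
    $s =~ s/E([+-])0*(\d+)\z/'E' . ($1 eq '-' ? '-' : '') . $2/e;
    return $s;
}

print "$six\n";
print "$five\n";
print compact_exponent($five), "\n";
```

Most plotting tools read the standard `%E` form directly, so the extra helper is only worth it if the compact form is a hard requirement.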


