combining two csv files by using math operations

ng0177 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I am trying to combine two .csv files into one result file with three averaged columns and one difference column for plotting, but I cannot get it to work. This problem differs from multiplying a column of data in a csv file by a factor. I appreciate your help!

#!/usr/bin/perl -w

use File::Basename;

# filename
my ($nameA,$pathA,$suffixA) = fileparse($ARGV[0],'\.[^\.]*');
my ($nameB,$pathB,$suffixB) = fileparse($ARGV[1],'\.[^\.]*');
my $nameC = $nameA.'dif';

# read data
open(dataA, $nameA)   or die " cannot open/read file:$!\n";
my @headerA; push @headerA, $_ = <dataA> for 1 .. 2;
my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for <dataA>;
pop @multi_arrayA; pop @multi_arrayA; # trash header
close dataA;
open(dataB, $nameB)   or die " cannot open/read file:$!\n";
my @headerB; push @headerB, $_ = <dataB> for 1 .. 2;
my @multi_arrayB; push @multi_arrayB, [split(' ', $_)] for <dataB>;
pop @multi_arrayB; pop @multi_arrayB; # trash header
close dataB;

# modify data
my @multi_arrayC;
my ($avgX,$avgY,$avgZ,$difF); #u_cor_T pi_T_ts TFP eta_is_T_ts
for my $i (0..$#multi_arrayA) {

    $avgX = ( $multi_arrayA[$i]->[0] + $multi_arrayB[$i]->[0] ) / 2.;
    $avgY = ( $multi_arrayA[$i]->[1] + $multi_arrayB[$i]->[1] ) / 2.;
    $avgZ = ( $multi_arrayA[$i]->[2] + $multi_arrayB[$i]->[2] ) / 2.;
    $difF = $multi_arrayA[$i]->[3] - $multi_arrayB[$i]->[3];

    push( $multi_arrayC[$i]->[0], $avgX );
    push( $multi_arrayC[$i]->[1], $avgY );
    push( $multi_arrayC[$i]->[2], $avgZ );
    push( $multi_arrayC[$i]->[3], $difF );
    
}

# write data
open(dataC, ">".$nameC)   or die " cannot open/read file:$!\n";
print dataC join(",", map { sprintf "%E", $_ } @{$_}),"\n" for (@multi_arrayC);
close dataC;

# input
__dataA__
variables
units
1.0 1.0 1.0 4.0 9.99999
2.0 2.0 2.0 4.0 9.99999
3.0 3.0 3.0 4.0 9.99999

__dataB__
variables
units
3.0 3.0 3.0 5.0 9.99999
2.0 2.0 2.0 5.0 9.99999
1.0 1.0 1.0 5.0 9.99999

# output (expected
__dataC__
variables
units
2.00000E0 2.00000E0 2.00000E0 1.00000E0
2.00000E0 2.00000E0 2.00000E0 1.00000E0
2.00000E0 2.00000E0 2.00000E0 1.00000E0

Replies are listed 'Best First'.
Re: combining two csv files by using math operations by tybalt89 (Monsignor) on May 16, 2021 at 14:44 UTC
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=11132653
use warnings;

#use File::Basename;

my ($nameA, $nameB) = qw( data.A data.B ); # FIXME just for testing

## filename
#my ($nameA,$pathA,$suffixA) = fileparse($ARGV[0],'\.[^\.]*');
#my ($nameB,$pathB,$suffixB) = fileparse($ARGV[1],'\.[^\.]*');
my $nameC = $nameA =~ s/\.\K.*/dif/r;

# read data
open(dataA, $nameA) or die " cannot open/read file $nameA:$!\n";
my @headerA; push @headerA, $_ = <dataA> for 1 .. 2;
my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for <dataA>;
#pop @multi_arrayA; pop @multi_arrayA; # trash header
close dataA;
open(dataB, $nameB) or die " cannot open/read file $nameB:$!\n";
my @headerB; push @headerB, $_ = <dataB> for 1 .. 2;
my @multi_arrayB; push @multi_arrayB, [split(' ', $_)] for <dataB>;
#pop @multi_arrayB; pop @multi_arrayB; # trash header
close dataB;

# modify data
my @multi_arrayC;
my ($avgX,$avgY,$avgZ,$difF); #u_cor_T pi_T_ts TFP eta_is_T_ts
for my $i (0..$#multi_arrayA) {

    $avgX = ( $multi_arrayA[$i]->[0] + $multi_arrayB[$i]->[0] ) / 2.;
    $avgY = ( $multi_arrayA[$i]->[1] + $multi_arrayB[$i]->[1] ) / 2.;
    $avgZ = ( $multi_arrayA[$i]->[2] + $multi_arrayB[$i]->[2] ) / 2.;
    $difF = $multi_arrayA[$i]->[3] - $multi_arrayB[$i]->[3];

#    push( $multi_arrayC[$i]->[0], $avgX );
#    push( $multi_arrayC[$i]->[1], $avgY );
#    push( $multi_arrayC[$i]->[2], $avgZ );
#    push( $multi_arrayC[$i]->[3], $difF );
    $multi_arrayC[$i]->[0] = $avgX;
    $multi_arrayC[$i]->[1] = $avgY;
    $multi_arrayC[$i]->[2] = $avgZ;
    $multi_arrayC[$i]->[3] = $difF;

}

use Data::Dump 'dd'; dd \@multi_arrayA, \@multi_arrayB, $nameC, \@multi_arrayC;

# write data
open(dataC, ">". $nameC) or die " cannot open/read file $nameC:$!\n";
print dataC join(" ", map { sprintf "%E", $_ } @$_),"\n" for @multi_arrayC;
close dataC;

system "echo following is output file $nameC; cat $nameC"; # FIXME testing

## input
#__dataA__
#variables
#units
#1.0 1.0 1.0 4.0 9.99999
#2.0 2.0 2.0 4.0 9.99999
#3.0 3.0 3.0 4.0 9.99999
#
#__dataB__
#variables
#units
#3.0 3.0 3.0 5.0 9.99999
#2.0 2.0 2.0 5.0 9.99999
#1.0 1.0 1.0 5.0 9.99999
#
## output (expected
#__dataC__
#variables
#units
#2.00000E0 2.00000E0 2.00000E0 1.00000E0
#2.00000E0 2.00000E0 2.00000E0 1.00000E0
#2.00000E0 2.00000E0 2.00000E0 1.00000E0

Outputs:

(
  [
    ["1.0", "1.0", "1.0", "4.0", 9.99999],
    ["2.0", "2.0", "2.0", "4.0", 9.99999],
    ["3.0", "3.0", "3.0", "4.0", 9.99999],
  ],
  [
    ["3.0", "3.0", "3.0", "5.0", 9.99999],
    ["2.0", "2.0", "2.0", "5.0", 9.99999],
    ["1.0", "1.0", "1.0", "5.0", 9.99999],
  ],
  "data.dif",
  [[2, 2, 2, -1], [2, 2, 2, -1], [2, 2, 2, -1]],
)
following is output file data.dif
2.000000E+00 2.000000E+00 2.000000E+00 -1.000000E+00
2.000000E+00 2.000000E+00 2.000000E+00 -1.000000E+00
2.000000E+00 2.000000E+00 2.000000E+00 -1.000000E+00
Re^2: combining two csv files by using math operations by ng0177 (Acolyte) on May 16, 2021 at 15:29 UTC
I confirm that this code works fine. This wisdom will be used in all derivatives of that code. I also see the advantages of using Data::Dump, and I noted how the # FIXME comments highlight test-only lines. Thank you very much! Any ideas on how to trap an empty line that a data file may have at its end?
Re^3: combining two csv files by using math operations by tybalt89 (Monsignor) on May 16, 2021 at 16:21 UTC
Replace

my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for <dataA>;

with:

my @multi_arrayA; push @multi_arrayA, [split(' ', $_)] for grep /\S/, <dataA>;

in both places, changing A to B in the second one, of course.
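To see what the grep /\S/ filter does in isolation, here is a minimal sketch. The helper name parse_rows and the sample lines are assumptions made for illustration; they are not part of the thread's code. The filter keeps only lines containing at least one non-whitespace character, so empty and whitespace-only lines never become rows.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split every non-blank line
# into whitespace-separated fields and return one array ref per line.
sub parse_rows {
    my @lines = @_;
    return map { [ split ' ', $_ ] } grep { /\S/ } @lines;
}

# Sample input ending with an empty line and a whitespace-only line.
my @rows = parse_rows(
    "1.0 1.0 1.0 4.0\n",
    "2.0 2.0 2.0 4.0\n",
    "\n",
    "   \n",
);

print scalar(@rows), " data rows kept\n";   # prints "2 data rows kept"
```

Without the filter, each blank line would produce an empty row, and the later arithmetic would run on undefined values.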
Re^4: combining two csv files by using math operations by ng0177 (Acolyte) on May 16, 2021 at 16:33 UTC
Re: combining two csv files by using math operations by BillKSmith (Monsignor) on May 17, 2021 at 20:09 UTC
I know that you already have a solution. I believe that a one-pass solution with fewer temporary variables is easier to understand and maintain.

# copy headers to output
my $headerA = <$dataA> . <$dataA>;
my $headerB = <$dataB> . <$dataB>;
print $dataC "__dataC__\n$headerA";

while (1) {
    last if (eof($dataA) or eof($dataB));
    my @A = (split( /\s+/, <$dataA>))[0..3];
    my @B = (split( /\s+/, <$dataB>))[0..3];
    my $avgX = ($A[0] + $B[0])/2;
    my $avgY = ($A[1] + $B[1])/2;
    my $avgZ = ($A[2] + $B[2])/2;
    my $difF = ($A[3] - $B[3]);
    printf $dataC "%E %E %E %E\n", $avgX, $avgY, $avgZ, $difF;
}

Bill
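The fragment above assumes that $dataA, $dataB and $dataC are already open filehandles. A self-contained sketch of the same one-pass idea might look like the following. The sub name combine, the in-memory filehandles, and the choice to stop at the first blank line are assumptions for illustration, not part of Bill's post.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-pass loop: reads two open input
# handles line by line and prints the combined rows to an output handle.
sub combine {
    my ($inA, $inB, $out) = @_;

    # Copy the two header lines of input A; read and ignore those of input B.
    my $headerA = <$inA> . <$inA>;
    my $headerB = <$inB> . <$inB>;
    print {$out} $headerA;

    while (defined(my $lineA = <$inA>) and defined(my $lineB = <$inB>)) {
        # Assumption: a blank line marks the end of the data.
        last unless $lineA =~ /\S/ && $lineB =~ /\S/;

        my @A = (split ' ', $lineA)[0 .. 3];
        my @B = (split ' ', $lineB)[0 .. 3];

        # Three averages and one difference, written in scientific notation.
        printf {$out} "%E %E %E %E\n",
            ($A[0] + $B[0]) / 2,
            ($A[1] + $B[1]) / 2,
            ($A[2] + $B[2]) / 2,
            $A[3] - $B[3];
    }
    return;
}

# Usage with in-memory filehandles so the sketch runs without data files.
my $textA  = "variables\nunits\n1.0 1.0 1.0 4.0 9.99999\n3.0 3.0 3.0 4.0 9.99999\n\n";
my $textB  = "variables\nunits\n3.0 3.0 3.0 5.0 9.99999\n1.0 1.0 1.0 5.0 9.99999\n";
my $result = '';

open my $inA, '<', \$textA  or die "cannot open in-memory input A: $!\n";
open my $inB, '<', \$textB  or die "cannot open in-memory input B: $!\n";
open my $out, '>', \$result or die "cannot open in-memory output: $!\n";
combine($inA, $inB, $out);
close $_ for $inA, $inB, $out;

print $result;
```

Passing filehandles into the sub keeps the arithmetic separate from the question of where the data comes from, so the same loop works for real files opened with the three-argument form of open.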
Re: combining two csv files by using math operations by Anonymous Monk on May 16, 2021 at 13:45 UTC
I recommend three things:

1. Fix your compilation errors
2. Fix your runtime errors
3. Fix your logic errors

I could give more specific help if you would give a more specific problem statement than "I do not seem to accomplish that." When I try to run your code under Perl 5.32.1 I get

Experimental push on scalar is now forbidden at fubar.new line 32, near "$avgX )"
Experimental push on scalar is now forbidden at fubar.new line 33, near "$avgY )"
Experimental push on scalar is now forbidden at fubar.new line 34, near "$avgZ )"
Experimental push on scalar is now forbidden at fubar.new line 35, near "$difF )"
Execution of fubar.new aborted due to compilation errors.

Is this what you are seeing? A different version of Perl might actually compile this, in which case it is on to the next problem.
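The error arises because push expects an array as its first argument, while $multi_arrayC[$i]->[0] is a single scalar element. A minimal sketch of two ways around it follows. The sample values and the second array @rows are assumptions for illustration only.

```perl
use strict;
use warnings;

my ($avgX, $avgY, $avgZ, $difF) = (2, 2, 2, -1);   # sample values only
my $i = 0;

# Variant 1: assign each cell directly; the row is created on first use.
my @multi_arrayC;
$multi_arrayC[$i]->[0] = $avgX;
$multi_arrayC[$i]->[1] = $avgY;
$multi_arrayC[$i]->[2] = $avgZ;
$multi_arrayC[$i]->[3] = $difF;

# Variant 2: push onto the dereferenced row; @{ ... } turns the row
# reference into the array that push requires.
my @rows;
push @{ $rows[$i] }, $avgX, $avgY, $avgZ, $difF;

print "@{ $multi_arrayC[$i] }\n";   # prints "2 2 2 -1"
print "@{ $rows[$i] }\n";           # prints "2 2 2 -1"
```

Both variants build the same row. The second is shorter when all four values are ready at once.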
Re: combining two csv files by using math operations by 1nickt (Canon) on May 16, 2021 at 11:39 UTC
Hi, please provide some sample data that you are feeding to your program. See also sprintf?

The way forward always starts with a minimal test.
Re^2: combining two csv files by using math operations by ng0177 (Acolyte) on May 16, 2021 at 11:51 UTC
Thanks. The sample data is under __dataA__ and __dataB__. I am not sure how best to avoid opening files when the data already exists. The input format is floats; the output is supposed to be in scientific format.
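One possible way to test with sample data without creating files, offered as a sketch on the assumption that this is what was meant: Perl can open a filehandle on a reference to a scalar, so the sample text can stay inside the script. The variable names $dataA, $fhA, @headerA and @rowsA are chosen for illustration.

```perl
use strict;
use warnings;

# Sample data kept in a string (copied from the __dataA__ block of the question).
my $dataA = <<'END';
variables
units
1.0 1.0 1.0 4.0 9.99999
2.0 2.0 2.0 4.0 9.99999
3.0 3.0 3.0 4.0 9.99999
END

# Opening a reference to a scalar yields an in-memory filehandle.
open my $fhA, '<', \$dataA or die "cannot open in-memory data: $!\n";

# The first two lines are headers; the rest are whitespace-separated rows.
my @headerA = (scalar <$fhA>, scalar <$fhA>);
my @rowsA   = map { [ split ' ', $_ ] } grep { /\S/ } <$fhA>;
close $fhA;

print scalar(@rowsA), " rows, first value $rowsA[0][0]\n";   # prints "3 rows, first value 1.0"
```

The rest of the program can then read from $fhA exactly as it would from a handle opened on a real file.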
Re^3: combining two csv files by using math operations by 1nickt (Canon) on May 16, 2021 at 12:06 UTC
Hi again, forgive me, but I don't know what "scientific format" is. Please show the expected output from the sample data.

$ perl -Mstrict -wE 'my $avg = (1.10000E0 + 3.10000E0) / 2.; say sprintf("%E", $avg)'
2.100000E+00

?

The way forward always starts with a minimal test.
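The expected output in the question shows five digits after the decimal point and an exponent written as E0, while plain %E prints six digits and a signed exponent of at least two digits. A hedged sketch of how the two could be reconciled: the precision is set with %.5E, and the substitution that shortens the exponent is an assumption about the desired format, not something sprintf offers directly.

```perl
use strict;
use warnings;

my $avg = (1.10000E0 + 3.10000E0) / 2;

my $default = sprintf "%E",   $avg;   # six digits after the decimal point
my $five    = sprintf "%.5E", $avg;   # five digits after the decimal point

# Hypothetical post-processing: drop a '+' sign and leading zeros from
# the exponent, keeping a '-' sign if there is one.
(my $compact = $five) =~ s/E\+?(-?)0*(\d+)$/E$1$2/;

print "$default\n";
print "$five\n";
print "$compact\n";   # prints "2.10000E0"
```

Most plotting tools read the standard E+00 form, so the last step is needed only if the compact form is a hard requirement.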
Re^4: combining two csv files by using math operations by ng0177 (Acolyte) on May 16, 2021 at 12:37 UTC
Re^5: combining two csv files by using math operations by 1nickt (Canon) on May 16, 2021 at 13:22 UTC

