Adding an array to an existing csv file as a new column

c.con has asked for the wisdom of the Perl Monks concerning the following question:

I am looking to add an array to a CSV file as a column. I have two CSV files, each containing a column of numbers. These files have hundreds of thousands of lines. Example:

Time1 
1  
2  
3  
 
Time 2
4
5
6

What I want to do is take a column from each of the two CSV files, subtract the 2nd from the 1st, and then add a new column called "difference" to the 2nd file holding the result. So far I have got as far as creating an array containing the differences, but I am having trouble putting this array back into the 2nd file as a column. Does anyone have any suggestions? I would like to keep the code I've written so far if possible. I was able to append the array of values to the end of the CSV file, but not as an extra field on each line.

use 5.10.0;
use warnings;

my $output = "results.txt";
my ($fh1, $fh2, $fh3, $fh4, $fh5);
my ($file1, $file2) = @ARGV;
my (@col1, @col2, @col3);
my ($lines, $lines2, $lines3);
my (@array, @array2) = ();
my @diff = ();

my $csv = "csv.csv";

###Open files for read and write###
open ($fh1, '<', $file1) or die $!;
open ($fh2, '+<', $file2) or die $!;
open ($fh3, '>', $output) or die $!;


#open my $out, ">", "out.csv" or die $!;

###Reads lines from file into an array###
      while ($lines = <$fh1>){
      chomp $lines;#Removes the new line
      @col = split "," , $lines; #Array where each index holds the data of each column
      push @array, $col[5]; #adding element to the array containing slack values for file
      }

shift @array;#Removes the header line from the array -i.e column name

###Reads lines from file into an array###
      while ($lines2 = <$fh2>){
      print $fh4 $lines2;
      chomp $lines2;
      #push @list, $lines2;
      @col2 = split "," , $lines2;
      push @arr, @col2;
      push @array2, $col2[5];

      
      }

shift @array2;

###Iterates through the array and finds the difference in slack###
      foreach  my $i (@array){
    $diffs  = $array2[$i] - $array[$i];#Variable showing difference
    push @diff, $diffs;#Adding element to an array containing the difference values
      }  
###Prints out the 2 different slacks from each run and shows the difference between them.

#print $fh3 "Difference :", $array[$_] ,"-", $array2[$_], " = ", $diff[$_], "\n" for 1 .. $#array;
unshift @diff, "Difference";

2018-07-14 Athanasius fixed closing code tag and added code tags around data

Replies are listed 'Best First'.

Re: Adding an array to an existing csv file as a new column
by haukex (Archbishop) on Jul 13, 2018 at 14:12 UTC

Please fix your <code> tags; see How do I post a question effectively? and Markup in the Monastery.

But based on your problem description, here is a template that reads two CSV files of the same length in parallel and writes to a third (which you can then use to replace the second file, if you like) - I think this is what you're after? You can adjust the logic for creating the new rows to fit your needs. It uses Text::CSV, and you should also install Text::CSV_XS for speed.

use warnings;
use strict;
use Data::Dumper; # Debug
$Data::Dumper::Useqq=1;
use Text::CSV;

my $file1 = 'in1.txt';
my $file2 = 'in2.txt';
my $outfile = 'out.txt';

my $csv = Text::CSV->new({binary=>1, auto_diag=>2, eol=>$/});

open my $ifh1, '<', $file1 or die "$file1: $!";
open my $ifh2, '<', $file2 or die "$file2: $!";
open my $ofh, '>', $outfile or die "$outfile: $!";

$csv->getline($ifh1); # read and discard header
$csv->getline($ifh2); # read and discard header
$csv->print($ofh, ['col1','col2']); # new header

while ( my $row1 = $csv->getline($ifh1) ) {
    my $row2 = $csv->getline($ifh2);
    die "file1 has more lines than file2" unless $row2;
    
    # do whatever you like to create the output row here
    my $orow = [@$row1, @$row2]; # example: join the two rows
    
    print Dumper($row1, $row2, $orow); # Debug
    $csv->print($ofh, $orow);
}
die "file2 has more lines than file1" unless eof($ifh2);

close $ifh1;
close $ifh2;
close $ofh;

$csv->eof or $csv->error_diag;

Minor edits.
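
To connect the template above to the original question, here is a minimal sketch of the "create the output row" step. It assumes the value of interest is the sixth field (index 5), as in the question's code, and that the result should be file 1 minus file 2, as the question's text describes (the question's code subtracts the other way round, so swap the operands if that is what is wanted). The helper name row_with_difference is hypothetical and not part of Text::CSV; the two sample rows stand in for what $csv->getline would return.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::CSV): given the two rows read in
# parallel, return a copy of the second file's row with one extra field,
# the value from file 1 minus the value from file 2 in column $col.
sub row_with_difference {
    my ($row1, $row2, $col) = @_;
    return [ @$row2, $row1->[$col] - $row2->[$col] ];
}

# In-memory stand-ins for one line of each file, as array references;
# the sixth field (index 5) holds the value to compare.
my $row1 = [ 'a', 'b', 'c', 'd', 'e', 7 ];
my $row2 = [ 'a', 'b', 'c', 'd', 'e', 4 ];

my $orow = row_with_difference($row1, $row2, 5);
print join(',', @$orow), "\n";    # prints a,b,c,d,e,4,3
```

In the template, this would probably mean keeping the second file's header instead of discarding it (my $head2 = $csv->getline($ifh2);), printing [@$head2, 'Difference'] as the new header, and building $orow with the helper inside the while loop.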


Re: Adding an array to an existing csv file as a new column
by Athanasius (Archbishop) on Jul 15, 2018 at 03:40 UTC

Hello c.con, and welcome to the Monastery!

haukex has shown you the correct way to process CSV files using Text::CSV. I just want to comment on your style of variable declarations.

(1) For Perl versions 5.12.0 and above, use VERSION; enables use strict; — but unfortunately your use 5.10.0; does not. The absence of an explicit use strict at the head of your script makes it difficult to catch typos. In particular, the line push @arr, @col2; references the variable @arr which is not declared or used elsewhere in the code. This is likely a typo for push @array, @col2; — an error which use strict; would have caught for you.

(2) It is good practice to declare each variable as near as possible to its point of first use. So, instead of:

my ($fh1, $fh2, $fh3, $fh4, $fh5);
...
open ($fh1, '<', $file1) or die $!;

just declare $fh1 where it is first used:

open (my $fh1, '<', $file1) or die $!;

If you edit your code according to this principle you will see that the variables $fh5, @col1, @col3, and $lines3 are never used. More importantly, you will immediately see that between the declaration and first use of $fh4 this filehandle is never, in fact, opened for writing!
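
As a rough illustration of both points, here is a sketch of the question's read-and-subtract logic with use strict enabled and every variable declared where it is first used. The helper names read_column and column_difference are hypothetical, the column index 5 is taken from the question's code, and the plain split on commas assumes simple data with no quoted fields (for real CSV, Text::CSV remains the safer choice). The difference loops over indices rather than over the array's values, which seems to be what the question's foreach loop intended.

```perl
use strict;
use warnings;

# Hypothetical helper: read one column from a simple comma-separated file
# (no quoted fields or embedded commas) and drop the header entry.
sub read_column {
    my ($file, $col) = @_;
    open(my $fh, '<', $file) or die "$file: $!";    # declared at first use
    my @values;
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /,/, $line;
        push @values, $fields[$col];
    }
    close $fh;
    shift @values;    # remove the column name taken from the header line
    return @values;
}

# Element-wise difference (first minus second), indexed by position.
sub column_difference {
    my ($first, $second) = @_;
    return map { $first->[$_] - $second->[$_] } 0 .. $#{$first};
}

# Only runs when two file names are given on the command line.
if (@ARGV == 2) {
    my @first  = read_column($ARGV[0], 5);
    my @second = read_column($ARGV[1], 5);
    print "$_\n" for column_difference(\@first, \@second);
}
```

With strict in force, a stray name such as @arr stops compilation with a "Global symbol ... requires explicit package name" error instead of silently creating a new array.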

Programming is hard enough already; why code in a style that makes it harder than it needs to be? ;-)

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,
