c.con has asked for the wisdom of the Perl Monks concerning the following question:

I am looking to add an array to a CSV file as a column. I basically have two CSV files, each containing a column of numbers. These CSV files have hundreds of thousands of lines. Example:

Time1
1
2
3

Time2
4
5
6

What I want to do is take columns from 2 different CSV files, subtract the 2nd from the 1st, and then add a new column called difference, holding the result, to the 2nd file. So far, I have got as far as creating an array with the differences in it. I am having trouble actually putting this array back into the 2nd file as a column. Does anyone have any suggestions? I want to keep the code I've written thus far if possible, though. I was able to add the array of values to the end of the CSV file, but not after each line.

use 5.10.0;
use warnings;

my $output = "results.txt";
my ($fh1, $fh2, $fh3, $fh4, $fh5);
my ($file1, $file2) = @ARGV;
my (@col1, @col2, @col3);
my ($lines, $lines2, $lines3);
my (@array, @array2) = ();
my @diff = ();
my $csv = "csv.csv";

###Open files for read and write###
open ($fh1, '<', $file1) or die $!;
open ($fh2, '+<', $file2) or die $!;
open ($fh3, '>', $output) or die $!;
#open my $out, ">", "out.csv" or die $!;

###Reads lines from file into an array###
while ($lines = <$fh1>){
    chomp $lines;#Removes the new line
    @col = split "," , $lines; #Array where each index holds the data of each column
    push @array, $col[5]; #adding element to the array containing slack values for file
}
shift @array;#Removes the header line from the array -i.e column name

###Reads lines from file into an array###
while ($lines2 = <$fh2>){
    print $fh4 $lines2;
    chomp $lines2;
    #push @list, $lines2;
    @col2 = split "," , $lines2;
    push @arr, @col2;
    push @array2, $col2[5];
}
shift @array2;

###Iterates through the array and finds the difference in slack###
foreach my $i (@array){
    $diffs = $array2[$i] - $array[$i];#Variable showing difference
    push @diff, $diffs;#Adding element to an array containing the difference values
}

###Prints out the 2 differenct slacks from each run and shows the difference between them.
#print $fh3 "Difference :", $array[$_] ,"-", $array2[$_], " = ", $diff[$_], "\n" for 1 .. $#array;
unshift @diff, "Difference";

2018-07-14 Athanasius fixed closing code tag and added code tags around data

Replies are listed 'Best First'.
Re: Adding an array to an existing csv file as a new column
by haukex (Archbishop) on Jul 13, 2018 at 14:12 UTC

    Please fix your <code> tags; see How do I post a question effectively? and Markup in the Monastery.

    But based on your problem description, here is a template that reads two CSV files of the same length in parallel and writes to a third (which you can then use to replace the second file, if you like) - I think this is what you're after? You can adjust the logic for creating the new rows to fit your needs. It uses Text::CSV, and you should also install Text::CSV_XS for speed.

    use warnings;
    use strict;
    use Data::Dumper; # Debug
    $Data::Dumper::Useqq=1;
    use Text::CSV;

    my $file1 = 'in1.txt';
    my $file2 = 'in2.txt';
    my $outfile = 'out.txt';

    my $csv = Text::CSV->new({binary=>1, auto_diag=>2, eol=>$/});
    open my $ifh1, '<', $file1 or die "$file1: $!";
    open my $ifh2, '<', $file2 or die "$file2: $!";
    open my $ofh, '>', $outfile or die "$outfile: $!";
    $csv->getline($ifh1); # read and discard header
    $csv->getline($ifh2); # read and discard header
    $csv->print($ofh, ['col1','col2']); # new header
    while ( my $row1 = $csv->getline($ifh1) ) {
        my $row2 = $csv->getline($ifh2);
        die "file1 has more lines than file2" unless $row2;
        # do whatever you like to create the output row here
        my $orow = [@$row1, @$row2]; # example: join the two rows
        print Dumper($row1, $row2, $orow); # Debug
        $csv->print($ofh, $orow);
    }
    die "file2 has more lines than file1" unless eof($ifh2);
    close $ifh1;
    close $ifh2;
    close $ofh;
    $csv->eof or $csv->error_diag;

    Minor edits.
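
To make the "adjust the logic" step concrete, here is a minimal sketch of how the template above might be adapted to the original question. It is an illustration, not haukex's code: the sub name add_difference_column and its parameters are made up for this example, and the subtraction follows the question's wording (first file minus second), whereas the posted script computes second minus first, so swap the operands if that is what you need. It assumes Text::CSV is installed and that both files have the same number of data rows.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper (name and parameters invented for this sketch):
# copy every row of $in2 to $out with an extra "Difference" column.
# $col is the zero-based index of the column to compare.
sub add_difference_column {
    my ($in1, $in2, $out, $col) = @_;
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 2, eol => "\n" });

    open my $ifh1, '<', $in1 or die "$in1: $!";
    open my $ifh2, '<', $in2 or die "$in2: $!";
    open my $ofh,  '>', $out or die "$out: $!";

    $csv->getline($ifh1);                              # discard header of file 1
    my $header2 = $csv->getline($ifh2) or die "$in2 is empty";
    $csv->print($ofh, [@$header2, 'Difference']);      # header of file 2 plus new column

    while ( my $row1 = $csv->getline($ifh1) ) {
        my $row2 = $csv->getline($ifh2);
        die "file1 has more lines than file2" unless $row2;
        # First file minus second file; swap the operands for the opposite sign.
        my $difference = $row1->[$col] - $row2->[$col];
        $csv->print($ofh, [@$row2, $difference]);
    }
    die "file2 has more lines than file1" unless eof($ifh2);

    close $ifh1;
    close $ifh2;
    close $ofh or die "$out: $!";
    return;
}
```

Called as add_difference_column('in1.txt', 'in2.txt', 'out.txt', 5), where 5 is the index used as $col[5] in the question, it writes each row of the second file followed by the new value to out.txt. Once you have checked the result, out.txt can be renamed over the second file. Writing to a separate file also avoids modifying a file while it is still being read.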

Re: Adding an array to an existing csv file as a new column
by Athanasius (Archbishop) on Jul 15, 2018 at 03:40 UTC

    Hello c.con, and welcome to the Monastery!

    haukex has shown you the correct way to process CSV files using Text::CSV. I just want to comment on your style of variable declarations.

    (1) Requesting Perl 5.12.0 or above with use VERSION; implicitly enables use strict, but unfortunately your use 5.10.0; does not. The absence of an explicit use strict; at the head of your script makes it difficult to catch typos. In particular, the line push @arr, @col2; references the variable @arr, which is not declared or used anywhere else in the code. This is likely a typo for push @array, @col2; and it is an error that use strict; would have caught for you.

    (2) It is good practice to declare each variable as near as possible to its point of first use. So, instead of:

    my ($fh1, $fh2, $fh3, $fh4, $fh5);
    ...
    open ($fh1, '<', $file1) or die $!;

    just declare $fh1 where it is first used:

    open (my $fh1, '<', $file1) or die $!;

    If you edit your code according to this principle you will see that the variables $fh5, @col1, @col3, and $lines3 are never used. More importantly, you will immediately see that between the declaration and first use of $fh4 this filehandle is never, in fact, opened for writing!
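
Both points can be tried out in a few lines. What follows is only a sketch under stated assumptions: read_column is a hypothetical helper invented for this example (it is not part of the original script), and it keeps the naive comma split from the question, which will not cope with quoted fields the way Text::CSV does. The last few lines compile the misspelled statement in a string eval so that the complaint from strict can be printed instead of aborting the script.

```perl
use strict;
use warnings;

# Hypothetical helper: return one column of a CSV file as a list,
# skipping the header line. Every variable is declared at its point
# of first use, so an unused or unopened one has nowhere to hide.
sub read_column {
    my ($file, $index) = @_;
    open my $fh, '<', $file or die "$file: $!";
    my @values;
    while ( my $line = <$fh> ) {
        chomp $line;
        my @fields = split /,/, $line;   # naive split, as in the question
        push @values, $fields[$index];
    }
    close $fh;
    shift @values;                       # drop the header entry
    return @values;
}

# Point (1): under strict, the undeclared @arr is a compile-time error.
# A string eval is compiled under the enclosing pragmas, so the error
# ends up in $@ rather than stopping this script.
my @array;
my @col2 = (1, 2, 3);
my $ok    = eval q{ push @arr, @col2; 1 };
my $error = $ok ? '' : $@;
print "strict refused to compile the typo: $error" unless $ok;
```

In a real script there would be no eval: the same error would simply stop compilation at the misspelled line, which is the early warning point (1) describes.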

    Programming is hard enough already; why code in a style that makes it harder than it needs to be? ;-)

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,