Hello ray15, and welcome to the Monastery!
Here’s a solution using Text::CSV_XS:
File “1.csv”
    fragment,id,index
    accb,10,A
    bbc,11,B
    ccd,12,C
File “2.csv”
    fragment,id,index
    bbc,14,E
    ccd,15,D
    llk,11,B
    kks,12,C
Script in file “main.pl”
    #!perl
    use strict;
    use warnings;
    use List::MoreUtils 'uniq';
    use Text::CSV_XS;

    my %files = (file1 => '1.csv', file2 => '2.csv');
    my %hashes;
    my $csv = Text::CSV_XS->new( { binary => 1 } );

    for my $file (keys %files)
    {
        open(my $in, '<', $files{$file})
            or die "Cannot open file '$files{$file}' for reading: $!";

        <$in>;    # Discard column headings

        while (my $row = $csv->getline($in))
        {
            my $key = shift @$row;
            $hashes{$file}{$key} = [ @$row ];
        }

        close $in or die "Cannot close file '$files{$file}': $!";
    }

    separator_line();
    print join("\t", qw(frag id1 file1 id2 file2)), "\n";
    separator_line();

    my @keys;
    push @keys, keys %$_ for values %hashes;
    @keys = uniq @keys;

    for my $fragment (sort @keys)
    {
        my $f1 = exists $hashes{file1}{$fragment} ? 1 : 0;
        my $f2 = exists $hashes{file2}{$fragment} ? 1 : 0;

        printf "%s\t%s\t%s\t%s\t%s\n",
            $fragment,
            $f1 ? $hashes{file1}{$fragment}->[0] : '', $f1,
            $f2 ? $hashes{file2}{$fragment}->[0] : '', $f2,
    }

    separator_line();

    sub separator_line
    {
        print '-' x 37, "\n";
    }
Output:
    13:06 >perl main.pl
    -------------------------------------
    frag    id1     file1   id2     file2
    -------------------------------------
    accb    10      1               0
    bbc     11      1       14      1
    ccd     12      1       15      1
    kks             0       12      1
    llk             0       11      1
    -------------------------------------

    13:07 >
Note: I do not try to access $hashes{file1}{$fragment}->[0] until I have confirmed that $hashes{file1}{$fragment} already exists in the hash. This is to avoid autovivification, which is a great Perl feature but is not wanted in this case. (See e.g. Uri Guttman’s tutorial for the gory details.)
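To make the point about autovivification concrete, here is a minimal sketch. It assumes a hand-built %hashes with the same shape as the one in the script (the data is made up for illustration, not read from the CSV files) and compares an unguarded read with a read guarded by exists:

```perl
#!perl
use strict;
use warnings;

# Illustrative data only: same shape as %hashes in the script,
# but built by hand rather than read from 1.csv.
my %hashes = (file1 => { accb => [10, 'A'] });

# Unguarded read: dereferencing the missing 'kks' entry with ->[0]
# autovivifies $hashes{file1}{kks} as an empty array reference.
my $unguarded = $hashes{file1}{kks}->[0];
print "kks after unguarded read: ",
      (exists $hashes{file1}{kks} ? "exists" : "missing"), "\n";

# Guarded read: exists is checked first, so the missing 'llk' entry
# is never dereferenced and the hash is left unchanged.
my $guarded = exists $hashes{file1}{llk} ? $hashes{file1}{llk}->[0] : '';
print "llk after guarded read:   ",
      (exists $hashes{file1}{llk} ? "exists" : "missing"), "\n";
```

After the unguarded read the key kks is present in the hash even though nothing was ever stored under it, so a later exists test would wrongly report the fragment as found. The guarded read leaves llk absent.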
Hope that helps,
Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,
In reply to Re: comparing csv files in perl
by Athanasius
in thread comparing csv files in perl
by ray15