Re: comparing csv files in perl

Your question is barely readable, your code does not compile (you still have file3 all over the code), and you load the Text::CSV module without using it anywhere in the code. This has homework written all over it.

Still, for someone with a similar problem, a solution could help. Let's define the problem: we have a few files organized in columns. The first column is the ID, the second is an item name, and the rest are not interesting from our point of view. We make two assumptions about the input: an ID is never 0 (the report uses 0 to mean "not present"), and an item name appears on at most one line in any given file.
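If you are not sure your files satisfy those two assumptions, it may be worth checking before building the report. Here is a minimal sketch of such a check. The name check_assumptions and the message format are my own invention for illustration, and the sub takes the lines of one file as a list rather than a filename.

```perl
use v5.14;
use warnings;

# Hypothetical helper, not part of the report script: takes the lines of
# one input file and returns a list of problems found. An empty list
# means both assumptions hold for that file.
sub check_assumptions {
    my @lines = @_;
    my (%seen, @problems);
    for my $i (0 .. $#lines) {
        # split ' ' ignores leading whitespace and the trailing newline
        my ($id, $name) = split ' ', $lines[$i];
        next unless defined $name;      # skip blank or one-field lines
        my $lineno = $i + 1;
        push @problems, "line $lineno: ID is 0"
            if $id =~ /^0+$/;
        push @problems, "line $lineno: duplicate item '$name'"
            if $seen{$name}++;
    }
    return @problems;
}
```

You could feed it a file with something like check_assumptions(<$fh>) and refuse to continue if the returned list is not empty.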

The input:

 
### test1.csv ###
1  aaa  ignored_field
3  ccc ignored_field
4  ddd ignored_field

### test2.csv ###
11 aaa ignored_field
22 bbb ignored_field
44 ddd   ignored_field

### test3.csv ###
333 ccc ignored_field
555 eee ignored_field
666  fff   ignored_field
777  ggg   ignored_field

We want to produce a report that lists every item together with the ID under which it is stored in each file. If a file does not contain the item, a 0 is shown instead of the ID.

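#!/usr/bin/perl
use v5.14;      # enables strict, say and state
use warnings;

my %data;       # item name => tab-terminated string of IDs, one per file

# Number of tab-separated columns already stored for an item.
sub cnt_fields {
    my @fields = split /\t/, $_[0] // '';
    return scalar @fields;
}

# With arguments: read one file and append its IDs as a new column.
# Without arguments: return the number of files processed so far.
sub gather_file {
    state $count = 0;
    return $count unless @_;

    my ($filename, $d) = @_;
    open my $fh, "<", $filename
        or die "Cannot open $filename: $!";
    while (<$fh>) {
        chomp;
        my @field = split;
        next if @field < 2;     # skip blank or malformed lines
        # Pad with zeros for earlier files that did not contain this item.
        my $offset = $count - cnt_fields($d->{$field[1]});
        $d->{$field[1]} .= "0\t" x $offset;
        $d->{$field[1]} .= "$field[0]\t";
    }
    close $fh;

    # Items missing from this file get a 0 in the new column.
    for my $key (keys %$d) {
        if (cnt_fields($d->{$key}) <= $count) {
            $d->{$key} .= "0\t";
        }
    }
    $count++;
}

for (1..3) {
    gather_file("test$_.csv", \%data);
}

# Header: one "fN" column per processed file.
my $header = "Key\t";
for (1..gather_file()) {
    $header .= "f$_\t";
}
say $header;

for my $key (sort keys %data) {
    say "$key\t$data{$key}";
}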

The output:

$ ./report.pl 
Key    f1    f2    f3
aaa    1    11    0    
bbb    0    22    0    
ccc    3    0    333    
ddd    4    44    0    
eee    0    0    555    
fff    0    0    666    
ggg    0    0    777
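The script above keeps each row as a tab-terminated string and counts fields to work out the padding. For comparison, here is a sketch of an alternative that stores the IDs in a hash of arrays, one slot per file, and fills the gaps with 0 only when printing. This is not what the script above does. The name build_report is made up for illustration, and the sub takes the file contents as an array of arrays of lines instead of opening files itself.

```perl
use v5.14;
use warnings;

# Hypothetical helper: $files is an array ref of array refs, each holding
# the lines of one input file. Returns the report as a list of
# tab-separated lines, header first.
sub build_report {
    my ($files) = @_;
    my $nfiles = @$files;
    my %ids;                            # item name => [ ID per file ]
    for my $col (0 .. $nfiles - 1) {
        for my $line (@{ $files->[$col] }) {
            my ($id, $name) = split ' ', $line;
            next unless defined $name;  # skip blank or one-field lines
            $ids{$name}[$col] = $id;
        }
    }
    my @report = join "\t", 'Key', map "f$_", 1 .. $nfiles;
    for my $name (sort keys %ids) {
        # Slots that were never assigned are undef; print them as 0.
        push @report, join "\t", $name,
            map { $ids{$name}[$_] // 0 } 0 .. $nfiles - 1;
    }
    return @report;
}
```

Printing the result is then just say for build_report(\@files); where @files (also hypothetical) holds one array ref of lines per input file. Unlike the script above, the lines come out without a trailing tab, and the same two input assumptions still apply.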

regards,
Luke Jefferson
