in reply to comparing csv files in perl
Your question is barely readable, your code does not compile (you still have file3 all over the code), and you load the Text::CSV module without using it anywhere in the code. This has homework written all over it.
Still, for someone with a similar problem, a solution could help. Let's define the problem: we have a few files, organized in columns. The first column is the ID, the second is an item name, and the rest are not interesting from our point of view. We make two assumptions about the input: no ID is equal to 0 (0 is reserved in the report to mean "not present"), and each item name appears on at most one line per file.
The input:
    ### test1.csv ###
    1    aaa  ignored_field
    3    ccc  ignored_field
    4    ddd  ignored_field

    ### test2.csv ###
    11   aaa  ignored_field
    22   bbb  ignored_field
    44   ddd  ignored_field

    ### test3.csv ###
    333  ccc  ignored_field
    555  eee  ignored_field
    666  fff  ignored_field
    777  ggg  ignored_field
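The two assumptions stated above can be checked before building the report. What follows is only a minimal sketch, assuming the same whitespace-separated layout as the input; `check_lines` is a hypothetical helper, not part of the script in this post, and it takes the lines of one file as a list so that it can be tried without any files on disk.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original script):
# returns a list of problems found in the lines of one input file.
sub check_lines {
    my @lines = @_;
    my (%seen, @problems);
    for my $line (@lines) {
        my ($id, $name) = split ' ', $line;
        next unless defined $name;    # skip blank or too-short lines
        push @problems, "ID 0 used for '$name'"  if $id eq '0';
        push @problems, "duplicate item '$name'" if $seen{$name}++;
    }
    return @problems;
}

# Prints:
#   ID 0 used for 'bbb'
#   duplicate item 'aaa'
print "$_\n" for check_lines("1 aaa x", "0 bbb x", "3 aaa x");
```

In scalar context the sub returns the number of problems, so something like `die "bad input" if check_lines(@lines);` would be enough to reject a file.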
What we want to accomplish is to produce a report with one row per item and one column per file, clearly stating the ID under which the item is stored in that file. If a file does not contain the item, a 0 is shown instead of the ID.
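    #!/usr/bin/perl
    use v5.14;    # enables strict, say and state
    use warnings;

    my %data;

    # Number of tab-separated IDs collected so far for one item.
    sub cnt_fields {
        my @fields = split "\t", $_[0] // '';
        return scalar @fields;
    }

    # With (filename, hashref): adds one column of IDs to the report data.
    # With no arguments: returns the number of files processed so far.
    sub gather_file {
        state $count = 0;
        return $count unless @_;
        my ($filename, $d) = @_;
        open my $fh, "<", $filename or die "Cannot open $filename: $!";
        while (<$fh>) {
            chomp;
            my @field = split;          # $field[0] is the ID, $field[1] the item name
            next unless @field >= 2;    # skip blank or malformed lines
            # Item not seen in earlier files: pad the missing columns with 0.
            my $offset = $count - cnt_fields($d->{$field[1]});
            $d->{$field[1]} .= "0\t" x $offset;
            $d->{$field[1]} .= "$field[0]\t";
        }
        close $fh;
        # Items known from earlier files but absent from this one get a 0.
        for my $key (keys %$d) {
            if (cnt_fields($d->{$key}) <= $count) {
                $d->{$key} .= "0\t";
            }
        }
        $count++;
    }

    for (1..3) {
        gather_file("test$_.csv", \%data);
    }

    my $header = "Key\t";
    for (1..gather_file()) {
        $header .= "f$_\t";
    }
    say $header;

    for my $key (sort keys %data) {
        say "$key\t$data{$key}";
    }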
The output:
    $ ./report.pl
    Key  f1  f2  f3
    aaa  1   11  0
    bbb  0   22  0
    ccc  3   0   333
    ddd  4   44  0
    eee  0   0   555
    fff  0   0   666
    ggg  0   0   777
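For comparison, the same report can probably be built without padding strings, by keeping the IDs in a nested hash and filling in the 0 only when printing. This is a sketch of an alternative, not the script above: `build_report` and the `%ids` layout are hypothetical names, and the sub takes array refs of lines instead of filenames so that it stays self-contained.

```perl
use strict;
use warnings;

# Hypothetical alternative (an assumption, not the script from this post):
# store IDs as $ids{item}{file_index} and substitute 0 at output time.
sub build_report {
    my @files = @_;    # each element: array ref of "ID item ..." lines
    my %ids;
    for my $i (0 .. $#files) {
        for my $line (@{ $files[$i] }) {
            my ($id, $name) = split ' ', $line;
            next unless defined $name;    # skip blank or too-short lines
            $ids{$name}{$i} = $id;
        }
    }
    # Header row, then one tab-separated row per item, sorted by item name.
    my @rows = (join "\t", 'Key', map { 'f' . $_ } 1 .. @files);
    for my $name (sort keys %ids) {
        push @rows, join "\t", $name, map { $ids{$name}{$_} // 0 } 0 .. $#files;
    }
    return @rows;
}

print "$_\n" for build_report(
    [ '1 aaa x',   '3 ccc x',   '4 ddd x' ],
    [ '11 aaa x',  '22 bbb x',  '44 ddd x' ],
    [ '333 ccc x', '555 eee x', '666 fff x', '777 ggg x' ],
);
```

Because the placeholder is chosen only at print time, swapping the `0` after `//` for something like `'-'` would lift the restriction on IDs equal to 0.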
regards,
Luke Jefferson