in reply to Matching two files based on one column common in each
In the original problem statement, there was a need to check whether some id exists in file2 but not in file1. That is why %file1 was created.
If you look at the code you posted, there are 3 main steps: (1) build the hash %file1 (ids in file1), (2) build %file2 (ids in file2), (3) process the keys (all unique ids) of %file1. Step (4), processing all unique ids in %file2, is not there anymore, so the data structure that supported it is not needed either.
So the %file1 hash is not needed. The idea is to combine step (1) and step (3) into a new step (3) and get rid of step (1) altogether.
Take out the step (1) code. Then modify step (3): instead of foreach my $id1 (keys %file1){...}, just use the first part of what was step (1):
    while (my $row = $csv->getline($FILE1)) {    # $row is a reference to a row
        my @fields = @$row;                      # this explicitly de-references
        my $id1    = $fields[1];
        if (exists $file2{$id1}) {
            # print() takes an array reference as its second argument
            $csv->print($FILE3, [ "HL", @fields ]);        # both files
        }
        else {
            $csv->print($FILE3, [ "NOT_HK", @fields ]);    # file1 only
        }
    }

I didn't test this, but it should give you a repeated line if an id in file1 repeats on a different line.
I do not know why you added "chomp $row;". That's not needed. $row is a reference to an array that the csv module creates when it reads the line from the file. The program won't bomb, but this line doesn't do anything useful.
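For what it's worth, here is a minimal sketch of the same idea with the CSV handling stripped out, so it runs with core Perl only. The sample rows are made up for illustration. Plain array references stand in for what $csv->getline returns, and join stands in for $csv->print. It only shows step (2) followed by the new step (3), with the id assumed to be in the second column as above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two CSV files.
# Each row is an array reference; the id is in column index 1.
my @file1_rows = (
    [ 'a', 'id1', 'x' ],
    [ 'b', 'id2', 'y' ],
    [ 'c', 'id1', 'z' ],    # id1 repeats on a different line
);
my @file2_rows = (
    [ 'p', 'id1' ],
    [ 'q', 'id3' ],
);

# Step (2): remember every id seen in file2.
my %file2;
$file2{ $_->[1] } = 1 for @file2_rows;

# New step (3): walk file1 row by row; no %file1 hash is needed.
my @out;
for my $row (@file1_rows) {
    my @fields = @$row;
    my $id1    = $fields[1];
    my $tag    = exists $file2{$id1} ? 'HL' : 'NOT_HK';
    push @out, join ',', $tag, @fields;
}

print "$_\n" for @out;
```

Because file1 is processed line by line instead of through the keys of a hash, the repeated id1 produces two output lines, one for each row it appears on.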
Re^2: Matching two files based on one column common in each
by bluray (Sexton) on Sep 28, 2011 at 21:09 UTC |