in reply to Comparing records in file and reporting stats - Scenario 2
First of all: do you really need the structure to be tied to disk? 35MB as a worst case does not seem too large for the amount of RAM commonly available nowadays. Why don't you slurp the input files into memory, process them, then write your report?
My DS would be almost the same as yours:
    $legacy_data{ unique field value across both records } = {
        'goodField'   => 'I am good!',
        'firstField'  => 1,
        'secondField' => 3,
    };
    $new_data{ unique field value across both records } = {
        'firstField'  => 11,
        'secondField' => 33,
        'goodField'   => 'I am good!',
    };
    $differences = [ 'firstField', 'secondField' ];
That is, I don't strictly see the need for all data to be in ONE data structure.
Regarding ALG, just populate both data structures by reading all files in memory at once, then process the records with something like:

    foreach my $record_id (keys %legacy_data) {
        my $legacy_data = $legacy_data{$record_id};
        my $new_data    = $new_data{$record_id};
        ....
        # here go all the tests you said
        ....
    }

And then emit your report.
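To make the idea a bit more concrete, here is a rough sketch of the compare-and-report step. It assumes both files have already been parsed into hashes keyed by the unique field. The sample records, the differing_fields helper, and the %differences and @missing_in_new names are made up for illustration; your real tests would go where the helper is called.

```perl
use strict;
use warnings;

# Hypothetical sample data; in practice both hashes would be filled
# by reading the legacy and new files, keyed by the unique field.
my %legacy_data = (
    rec1 => { goodField => 'I am good!', firstField => 1, secondField => 3 },
    rec2 => { goodField => 'same',       firstField => 5, secondField => 6 },
    rec3 => { goodField => 'only in legacy' },
);
my %new_data = (
    rec1 => { goodField => 'I am good!', firstField => 11, secondField => 33 },
    rec2 => { goodField => 'same',       firstField => 5,  secondField => 6 },
);

# Return the names of the fields (in sorted order) that are missing from
# one record or hold different values. Assumes all values are defined strings.
sub differing_fields {
    my ($legacy_record, $new_record) = @_;
    my %seen = map { $_ => 1 } keys %$legacy_record, keys %$new_record;
    my @diff = grep {
           !exists $legacy_record->{$_}
        || !exists $new_record->{$_}
        || $legacy_record->{$_} ne $new_record->{$_}
    } sort keys %seen;
    return @diff;
}

my (%differences, @missing_in_new);
foreach my $record_id (sort keys %legacy_data) {
    my $legacy_record = $legacy_data{$record_id};
    my $new_record    = $new_data{$record_id};

    # Record exists only in the legacy data: note it and move on.
    if (!defined $new_record) {
        push @missing_in_new, $record_id;
        next;
    }

    my @diff = differing_fields($legacy_record, $new_record);
    $differences{$record_id} = \@diff if @diff;
}

# Emit the report.
foreach my $record_id (sort keys %differences) {
    print "$record_id differs in: @{ $differences{$record_id} }\n";
}
print "Missing from new data: @missing_in_new\n" if @missing_in_new;
```

With the sample data this prints one line for rec1 (firstField and secondField differ) and lists rec3 as missing from the new data. Keeping the differences in their own hash, rather than inside the record hashes, is just one option; it keeps the two input structures untouched.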