First of all: do you really need the structure to be tied to disk? A 35MB worst case is not a huge file for the amount of RAM machines commonly have nowadays. Why don't you slurp the input files into memory, process them, and then write your report?

My DS would be almost the same as yours:

    $legacy_data{ unique field value across both records } = {
        'goodField'   => 'I am good!',
        'firstField'  => 1,
        'secondField' => 3,
    };
    $new_data{ unique field value across both records } = {
        'firstField'  => 11,
        'secondField' => 33,
        'goodField'   => 'I am good!',
    };
    $differences = [ 'firstField', 'secondField' ];

That is, I don't strictly see the need for all data to be in ONE data structure.
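To make the structure above concrete, here is a minimal sketch of filling one of those hashes from a file. The pipe-delimited layout with a header line, the key field name `id`, and the helper `load_records` are all assumptions made for illustration; your real record format will differ.

```perl
use strict;
use warnings;

# Hypothetical helper: reads a pipe-delimited file whose first line names the
# fields, and returns a hash ref of records keyed by $key_field.
# The file layout is an assumption, not something stated in the thread.
sub load_records {
    my ($fh, $key_field) = @_;
    chomp(my $header = <$fh>);
    my @fields = split /\|/, $header;
    my %records;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;          # skip blank lines
        my %rec;
        @rec{@fields} = split /\|/, $line, -1;
        $records{ $rec{$key_field} } = \%rec;
    }
    return \%records;
}

# In-memory stand-in for a real file, so the sketch runs as-is.
my $legacy_text = "id|firstField|secondField|goodField\nA1|1|3|I am good!\n";
open my $fh, '<', \$legacy_text or die "Cannot open in-memory file: $!";
my %legacy_data = %{ load_records($fh, 'id') };
close $fh;

print "$legacy_data{A1}{goodField}\n";
```

Calling the same helper on the new file would give you `%new_data` with an identical shape, which is all the comparison step needs.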

Regarding ALG, just populate both data structures by reading all the files into memory at once, then process the records with something like:

    foreach my $record_id (keys %legacy_data) {
        my $legacy_data = $legacy_data{$record_id};
        my $new_data    = $new_data{$record_id};
        ....
        # here go all the tests you said
        ....
    }
And then emit your report.
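As a rough, self-contained sketch of that loop plus a report: the sample records, the record ids `A1` and `B2`, the plain string comparison, and the report wording are all assumptions, since the actual tests depend on your requirements. It needs Perl 5.10 or later for the `//` operator.

```perl
use strict;
use warnings;

# Sample data in the shape described above (values are made up).
my %legacy_data = (
    A1 => { goodField => 'I am good!', firstField => 1, secondField => 3 },
    B2 => { goodField => 'Same',       firstField => 5, secondField => 6 },
);
my %new_data = (
    A1 => { goodField => 'I am good!', firstField => 11, secondField => 33 },
    B2 => { goodField => 'Same',       firstField => 5,  secondField => 6 },
);

my (%differences, %field_mismatches, @missing);
for my $record_id (sort keys %legacy_data) {
    my $legacy_rec = $legacy_data{$record_id};
    my $new_rec    = $new_data{$record_id};
    if (!$new_rec) {                       # record absent from the new data
        push @missing, $record_id;
        next;
    }
    # Assumed test: a field differs when its string values are not equal.
    my @diff = grep { ($legacy_rec->{$_} // '') ne ($new_rec->{$_} // '') }
               sort keys %$legacy_rec;
    next unless @diff;
    $differences{$record_id} = \@diff;
    $field_mismatches{$_}++ for @diff;     # per-field statistics
}

# Emit the report.
for my $record_id (sort keys %differences) {
    printf "%s differs in: %s\n", $record_id,
        join(', ', @{ $differences{$record_id} });
}
printf "%s mismatched in %d record(s)\n", $_, $field_mismatches{$_}
    for sort keys %field_mismatches;
print "Missing from new data: @missing\n" if @missing;
```

Keeping the per-record differences and the per-field counters in separate hashes means the report can be sorted or summarized either way without rereading the input.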

In reply to Re: Comparing records in file and reporting stats - Scenario 2 by jorgegv
in thread Comparing records in file and reporting stats - Scenario 2 by PoorLuzer
