Seems to me that this is an inefficient way to do this. Wouldn't it be better to build a hash keyed on an MD5 digest of each record in the second set, then, for each record in the first set, compute its MD5 key and check whether that key exists in the hash? Given a large number of records to match (not to mention a large number of fields in each record), this should speed things up significantly. And since you only have to store the keys in memory, it will also be much easier on memory usage, provided you read the records one at a time as you hash them.

Code to follow...

use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex md5_base64);

my (@data, %data, $key, @record);

# Create nested array of data for testing purposes.
for (<DATA>) {
    chomp;
    push @data, [split / /];
}

# Join each record and create hash key from contents.
# Note: You have to include field separators (in this
# case tabs), or you could end up with a situation
# where non-identical records match.
for (@data) {
    $key = md5 join "\t", @$_;
    $data{$key} = 1;
}

# Now you can check any record you want by creating
# a key and seeing if it exists in the hash.
# Only records NOT found in the hash are printed.
@record = qw/aa aa aa aa aa aa aa aa aa/;
$key = md5 join "\t", @record;
print join(" ", @record), "\n" if !$data{$key};

@record = qw/tt ii mm ee tt hh ee rr ee/;
$key = md5 join "\t", @record;
print join(" ", @record), "\n" if !$data{$key};

# You'll still need to match up the field names,
# and you will of course be looping through the
# second set of records instead of doing one at a
# time, but this should serve as an example of
# how to use hashes to drastically cut down on
# the number of comparisons.

__DATA__
oo nn cc ee uu pp oo nn aa
tt ii mm ee tt hh ee rr ee
ww aa ss aa gg oo bb ll ii
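For the two-set case described at the top, here is a minimal sketch of how the loop might look. It assumes both sets are already in memory as arrays of array refs with their fields in the same order. The helper names record_key and unique_records are made up for illustration and are not from the original thread. MD5 collisions are possible in theory, so treat a key match as "almost certainly identical" rather than proof.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: one digest per record. The tab separator keeps
# ("ab", "c") and ("a", "bc") from collapsing into the same key.
sub record_key {
    my @fields = @_;
    return md5_hex(join "\t", @fields);
}

# Hypothetical helper: return the records in $first whose key does not
# appear among the keys of $second. Both arguments are references to
# arrays of array refs.
sub unique_records {
    my ($first, $second) = @_;
    my %seen;
    $seen{ record_key(@$_) } = 1 for @$second;
    return grep { !$seen{ record_key(@$_) } } @$first;
}

# Made-up sample data, three fields per record.
my @first  = ([qw/aa aa aa/], [qw/tt ii mm/]);
my @second = ([qw/tt ii mm/], [qw/ww aa ss/]);

# Print every record from the first set that is missing from the second.
print join(" ", @$_), "\n" for unique_records(\@first, \@second);
```

Each record is hashed once, so the work grows with the total number of records rather than with the product of the two set sizes. If the second set is too big to hold in memory, the same %seen hash can be filled while reading it one record at a time.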

In reply to Re: Printing the values of unique database records from comparing arrays of records by TedPride
in thread Printing the values of unique database records from comparing arrays of records by Anonymous Monk
