I think I get it: you are trying to keep a one-to-one correspondence between the final combined array and the lookup hash of its lines, in other words maintaining the integrity between the two. I was actually thinking of treating the lookup hash as a temporary throwaway.

my @rows = <$file_for_rows>;
{
    my %seen = map { $_ => 1 } @rows;
    while (my $rawData = <$file_for_data>) {
        push @rows, $rawData if (!$seen{$rawData});
    }
}

I figured this way the larger array is read into @rows right away (assuming there is no need to check for duplicates within it), and then what would have been the @data array is read and checked one line at a time, with each line added to @rows only if it isn't already there. It may be faster to read it all into an array first, but I figured reading line by line saves a little temporary memory. I added the extra bare block to narrow the scope of everything except @rows, assuming that once the block is done @rows will have all the lines from both arrays with no duplicates and the memory taken up by the lookup hash is freed up again (I'm under the impression that's an advantage of the added scoping anyway).
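For anyone who wants to try this, here is a minimal runnable sketch of the same idea. The sub name merge_unique and the in-memory filehandles are my own stand-ins for illustration (they are not from the thread), and it keeps the assumption that neither input contains duplicates within itself.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from the first handle, then append
# lines from the second handle only if they were not in the first.
sub merge_unique {
    my ($fh_rows, $fh_data) = @_;
    my @rows = <$fh_rows>;
    {
        # Throwaway lookup hash, visible only inside this bare block.
        my %seen = map { $_ => 1 } @rows;
        while (my $line = <$fh_data>) {
            push @rows, $line unless $seen{$line};
        }
    }
    return @rows;
}

# In-memory filehandles stand in for real files (an assumption for the demo).
my $rows_text = "a\nb\nc\n";
my $data_text = "b\nd\n";
open my $fh_rows, '<', \$rows_text or die "open failed: $!";
open my $fh_data, '<', \$data_text or die "open failed: $!";

my @merged = merge_unique($fh_rows, $fh_data);
print @merged;    # prints a, b, c, d, one per line
```

Two caveats, as far as I understand them: lines are compared with their newlines attached, so a final line with no trailing newline will not match the same text followed by one; and when %seen goes out of scope Perl can reuse that memory within the process, but it generally does not hand it back to the operating system.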

Again, I was assuming there was no need to check for duplicates inside of each individual array, and that the lookup hash is just a temporarily created throwaway that isn't needed after the merge is finished.
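If that assumption turned out not to hold, one possible variation is to update the lookup hash as lines are accepted, so repeats inside either input get dropped too. This is only a sketch; merge_unique_all is a hypothetical name of mine and the in-memory filehandles are again stand-ins for real files.

```perl
use strict;
use warnings;

# Hypothetical helper: merge any number of filehandles, keeping only the
# first occurrence of each line, including repeats within a single input.
sub merge_unique_all {
    my (@handles) = @_;
    my (%seen, @rows);
    for my $fh (@handles) {
        while (my $line = <$fh>) {
            # Post-increment returns the old count: 0 (false) the first time.
            push @rows, $line unless $seen{$line}++;
        }
    }
    return @rows;    # %seen is discarded when the sub returns
}

my $first_text  = "a\nb\na\n";
my $second_text = "b\nc\nc\n";
open my $fh_first,  '<', \$first_text  or die "open failed: $!";
open my $fh_second, '<', \$second_text or die "open failed: $!";

print merge_unique_all($fh_first, $fh_second);    # prints a, b, c, one per line
```

The trade-off is that the big file also goes through the hash check line by line instead of being slurped straight into @rows, so I would expect it to be a bit slower when the inputs are already known to be clean.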

Sorry, I like asking little questions and debating minutiae like this because I don't have nearly as much experience with Perl or coding in general as a lot of the Monks on here (definitely nowhere near as much as you), so asking nit-picky little questions like these helps me learn, and hopefully helps others like me learn too when they read it. It's why I've quickly grown to like this place so much.

I love it when things get difficult; after all, difficult pays the mortgage. - Dr. Keith Whites
I hate it when things get difficult, so I'll just sell my house and rent cheap instead. - perldigious

In reply to Re^4: Fastest way to merge (and de-dup) two large arrays? by perldigious
in thread Fastest way to merge (and de-dup) two large arrays? by technojosh
