in reply to Re^3: Fastest way to merge (and de-dup) two large arrays?
in thread Fastest way to merge (and de-dup) two large arrays?

I think perldigious is going with the idea that building the hash from the larger array initially and adding the smaller array to the larger one would be faster, as the initial hash build doesn't run any if checks (saving potentially millions of cycles)

It would need testing, but on the face of it this seems to be a reasonable idea. Given that all of the records will eventually end up in the hash, if this doesn't fit in memory, then the entire solution should be reworked to find a completely different method

my %seen = map {$_ => 1} @rows; foreach my $rawData (@data) { push(@rows,$rawData) unless $seen{$rawData}++; }

Replies are listed 'Best First'.
Re^5: Fastest way to merge (and de-dup) two large arrays?
by perldigious (Priest) on Aug 12, 2016 at 13:52 UTC

    Well SimonPratt I wish I had been considering that now that you mention it :-), good catch.

    I was actually just wondering about the difference in efficiency between checking for the existence of a key in a hash of 60k elements 600k times vs. checking for the existence of a key in a hash of 600k elements 60k times. I'm guessing the answer is "it depends," and I'm further guessing that most of that depends on the nature of the keys themselves.

    I asked Corion if he could give me any insight on the efficiency of such things because I'm estimating he has about three orders of magnitude more Perl and coding experience than I do (as do a lot of the other Monks here). Since I have done pretty much this exact same thing before, and do similar things rather frequently, it seemed worth asking these sorts of questions. As it is I usually just try it multiple ways and test to figure out which is best if run time is a major consideration, and maybe that's the only real way to feel certain about anything anyway.

    I love it when things get difficult; after all, difficult pays the mortgage. - Dr. Keith Whites
    I hate it when things get difficult, so I'll just sell my house and rent cheap instead. - perldigious