devan999 has asked for the wisdom of the Perl Monks concerning the following question:

Hi there, I'm relatively new to Perl and have a predicament: I have an array1 with 200k recs, array2 with 50K recs. Each array has the same amount of elements (170) and the zero element in each array is a unique identifier. I need to update array1 with certain elements of array2. So: match on the zero element, then update elements 155, 158, 159, etc. I've used nested for loops and it's taking an eternity. What would be the best approach to speed this process up? Thanks for any help or advice.

Re: matching large arrays then updating
by ikegami (Patriarch) on Oct 21, 2011 at 18:38 UTC

    From the sounds of it, you have two AoAs like the following:

    my @array1 = (
       [ id1, ... 169 other fields ... ],
       [ id2, ... 169 other fields ... ],
       ... ~200k other records ...
    );
    my @array2 = (
       [ id2, ... 169 other fields ... ],
       [ id5, ... 169 other fields ... ],
       ... ~50k other records ...
    );

    Start by making it easy to find records in array2.

    my %array2 = map { $_->[0] => $_ } @array2;

    Then the problem becomes trivial.

    for my $array1_rec (@array1) {
       my $array2_rec = $array2{ $array1_rec->[0] }
          or next;

       ... Change @$array1_rec based on values from @$array2_rec ...
    }
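
For a concrete picture of the loop above, here is a minimal sketch on toy data. The four-field records and the @copy_idx list are made up for illustration; the real records have 170 fields and the indexes would be 155, 158, 159 and so on. Nested loops can need up to 200,000 x 50,000 comparisons, whereas the hash needs one pass over each array.

```perl
use strict;
use warnings;

# Hypothetical miniature data: 4 fields per record instead of 170.
my @array1 = (
    [ 'id1', 'a', 'b', 'c' ],
    [ 'id2', 'd', 'e', 'f' ],
    [ 'id3', 'g', 'h', 'i' ],
);
my @array2 = (
    [ 'id2', 'D', 'E', 'F' ],
    [ 'id5', 'X', 'Y', 'Z' ],
);

# Indexes to copy (stand-ins for 155, 158, 159, ...).
my @copy_idx = (2, 3);

# Index @array2 by its unique identifier (element 0).
my %array2 = map { $_->[0] => $_ } @array2;

for my $array1_rec (@array1) {
    my $array2_rec = $array2{ $array1_rec->[0] }
        or next;    # no matching record in @array2

    # Array slices copy all the selected fields in one assignment.
    @{$array1_rec}[@copy_idx] = @{$array2_rec}[@copy_idx];
}

print join(',', @$_), "\n" for @array1;
```

Only the id2 record changes here; records without a match in %array2 are left untouched.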

    You don't actually need to create @array1 or @array2.
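
One way to read that remark, sketched under assumptions the thread doesn't confirm: build %array2 while reading the smaller file, then process the larger file one line at a time. The tab-separated format, the in-memory filehandles standing in for the real files, and @copy_idx are all hypothetical.

```perl
use strict;
use warnings;

# Assumption: each record is one tab-separated line with the id first.
# In-memory filehandles stand in for the two real files.
my $file1 = "id1\ta\tb\tc\nid2\td\te\tf\nid3\tg\th\ti\n";
my $file2 = "id2\tD\tE\tF\nid5\tX\tY\tZ\n";
my @copy_idx = (2, 3);    # stand-ins for 155, 158, 159, ...

# Build the lookup hash straight from the smaller file.
my %array2;
open(my $in2, '<', \$file2) or die "Can't open file2: $!";
while (my $line = <$in2>) {
    chomp $line;
    my @rec = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    $array2{ $rec[0] } = \@rec;
}
close $in2;

# Stream the larger file one record at a time, updating as we go.
my $out = '';
open(my $in1, '<', \$file1) or die "Can't open file1: $!";
while (my $line = <$in1>) {
    chomp $line;
    my @rec = split /\t/, $line, -1;
    if (my $src = $array2{ $rec[0] }) {
        @rec[@copy_idx] = @{$src}[@copy_idx];
    }
    $out .= join("\t", @rec) . "\n";
}
close $in1;

print $out;
```

With this shape only the 50k records of the smaller file are ever held in memory at once.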

      -ikegami You illustrated my problem perfectly and the solution I was looking for! Thanks for the help!
      -ikegami I've followed your logic but I'm having the following issue:
      my %array2 = map { $_->[0] => $_ } @array2;
      produces the following problem:
      The %array2 contains a single line: key="" and %array{$key} returns one record from array2

      The initial array from the file contains 200k records but they are not being added to the hash...
      I've tried many ways to map the array but I've been unsuccessful so far...
      Thanks again for your help.

        The initial array from the file contains 200k records but they are not being added to the hash...

        Then your data isn't arranged as you said it is.

        The %array2 contains a single line: key=""

        Hashes have elements, not lines.

        and %array{$key} returns one record from array2

        %array{$key} is a syntax error, there is no variable named "%array" in this discussion, and "array2" could refer to the array @array2 or the hash %array2.

        The initial array from the file contains 200k records but they are not being added to the hash...

        My code doesn't even try to add the 200k records of @array1 to the hash. It adds the 50k records of @array2.
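
A possible cause of the symptom described above, offered as a guess since the thread never shows how @array2 was filled: if @array2 holds the raw lines of the file rather than array references, then without "use strict" each $_->[0] evaluates to undef, every record lands under the key "", and only the last one survives. The sketch below assumes comma-separated lines (a hypothetical format) and shows strict catching the mistake, followed by the split that turns each line into a record.

```perl
use strict;
use warnings;

# Hypothetical input: lines as they might come from a file, comma-separated.
my @lines = ( "id2,D,E,F\n", "id5,X,Y,Z\n" );

# Without "use strict", this map would treat each string as a symbolic
# reference, produce undef for every key, and leave a single element under
# the key "". With strict it dies instead of failing silently.
my $ok = eval { my %bad = map { $_->[0] => $_ } @lines; 1 };
print "strict caught it: $@" unless $ok;

# Split each line into an array reference first, then build the hash.
my @array2 = map { chomp(my $l = $_); [ split /,/, $l, -1 ] } @lines;
my %array2 = map { $_->[0] => $_ } @array2;

printf "%d keys: %s\n", scalar(keys %array2), join(' ', sort keys %array2);
```

If the hash still ends up with a single empty key after this step, element 0 of the records is probably empty, for example because the lines start with the delimiter.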

Re: matching large arrays then updating
by davido (Cardinal) on Oct 21, 2011 at 18:26 UTC

    I have an array1 with 200k recs, array2 with 50K recs. Each array has the same amount of elements (170)...

    You surely know what that means, but to me it doesn't seem to add up. Could you think that part of the problem over again and explain it in terms that someone who isn't already familiar with the problem would understand?

    Once you've come up with an explanation that makes sense of that portion of your question, move on to the third sentence and find a way of clarifying what you mean there too.


    Dave

      Judging by the sentence that follows those you quoted,

      $OP =~ s{
         Each array has the same amount of elements (170)
      }{
         Each rec is an array of 170 elements
      }

Re: matching large arrays then updating
by locked_user sundialsvc4 (Abbot) on Oct 22, 2011 at 13:25 UTC

    Also consider whether it would be advantageous to put the data into an SQLite database file. As long as you wrap your updates in transactions (otherwise SQLite commits every statement to disk individually, which is slow), the tool works splendidly. Plus, it runs on virtually every platform, requires no server, and is not specific to Perl or any other programming language. Honestly, it’s good enough to make you swear off “flat files” forever.