in reply to Match speed of R in array procesing

Please show your code. Maybe there's a way to speed it up, but we can't know that if we can't see your code.

Re^2: Match speed of R in array procesing
by Anonymous Monk on Mar 28, 2012 at 15:40 UTC

    I haven't tried it yet, but I would usually do it like this:

    foreach $i (@Array_1) {
        # collect every index of @Array_2 whose element is string-equal to $i
        push(@m, grep { $Array_2[$_] eq $i } 0 .. $#Array_2);
    }

    Once I have the indexes, I just delete all the elements with those indexes in Array 2 and 3. I haven't used Perl for a while, but as far as I remember I was shocked at how fast R does that sort of thing. That may just be me not knowing Perl well enough.

      One problem with your code is that it scales as O(m * n), where m == scalar @Array_1 and n == scalar @Array_2, because the grep scans all of @Array_2 once for every element of @Array_1.

      Here's a solution that runs in O(m + n) instead, which should be much faster for large arrays:

      use strict;
      use warnings;
      use 5.010; # only needed for say()

      my @Array_1 = ("a1","a2","a3","a4","a5","a6");
      my @Array_2 = ("a1","b2","c3","a4","f5","a6");
      my @Array_3 = ("1","2","3","4","5","6");

      my %seen;
      @seen{@Array_2} = undef;
      my @idx = grep exists $seen{$Array_1[$_]}, 0..$#Array_1;

      @Array_1 = @Array_1[@idx];
      @Array_3 = @Array_3[@idx];

      say "@Array_1";
      say "@Array_3";

      There are several other ways to write that same algorithm (for example you could use splice to delete array elements one by one in-place, or push onto two new arrays in parallel), but only a benchmark shows which one is fastest.
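      To make the benchmarking suggestion concrete, here is a rough sketch that uses the core Benchmark module to compare the slice-based version above, a push-onto-two-arrays variant, and a nested-scan version. The test data, the array sizes, and the sub names (by_slice, by_push, by_nested) are made up for illustration, so treat the timings as indicative only:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data; the sizes and values are assumptions for illustration.
my @a1 = map { "a$_" } 1 .. 500;
my @a2 = map { "a" . ( $_ * 2 ) } 1 .. 500;
my @a3 = ( 1 .. 500 );

# Variant 1: hash lookup plus array slices, O(m + n).
sub by_slice {
    my %seen;
    @seen{@a2} = ();
    my @idx = grep { exists $seen{ $a1[$_] } } 0 .. $#a1;
    return ( [ @a1[@idx] ], [ @a3[@idx] ] );
}

# Variant 2: hash lookup, pushing onto two new arrays in parallel, O(m + n).
sub by_push {
    my %seen;
    @seen{@a2} = ();
    my ( @keep1, @keep3 );
    for my $i ( 0 .. $#a1 ) {
        next unless exists $seen{ $a1[$i] };
        push @keep1, $a1[$i];
        push @keep3, $a3[$i];
    }
    return ( \@keep1, \@keep3 );
}

# Variant 3: scan all of @a2 for every element of @a1, O(m * n).
sub by_nested {
    my ( @keep1, @keep3 );
    for my $i ( 0 .. $#a1 ) {
        next unless grep { $_ eq $a1[$i] } @a2;
        push @keep1, $a1[$i];
        push @keep3, $a3[$i];
    }
    return ( \@keep1, \@keep3 );
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    slice  => \&by_slice,
    push   => \&by_push,
    nested => \&by_nested,
} );
```

      The numbers depend heavily on array sizes and on how many elements match, so it is worth rerunning this with data shaped like your real input before picking a variant.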