RE (tilly) 3: Removing certain elements from an array

Your post illustrates the need to benchmark code carefully! The map solutions are better algorithmically: as you increase the size of the array and the number of elements removed, btrott's splice-based solution will slow down much more than they do. But in the real world the algorithmically worse solution may still turn out to be faster.

In this example, clearly that is the case. If you want to play around with it, here is some sample code to mess around with:

use Benchmark;
my $n = 10000;
my $m = 100;
my @array = 0..$n;
my @idxsToRemove = map { int( rand( $n ) ) } 1..$m;

my %tests = (
  grepping => sub {
      my @subarray = @array;
      my %toRemoveIdx;
      @toRemoveIdx{@idxsToRemove} = (1)x@idxsToRemove;
      @subarray = @subarray[ grep !$toRemoveIdx{$_}, 0..$#subarray ];
    },
  mapping => sub {
      my @subarray = @array;
      my %toRemoveIdx;
      @toRemoveIdx{@idxsToRemove} = (1)x@idxsToRemove;
      @subarray = map { $toRemoveIdx{$_} ? () : $subarray[$_] } 0..$#subarray;
    },
  sorting => sub {
      my @subarray = @array;
      splice @subarray, $_, 1 for sort { $b <=> $a } @idxsToRemove;
      @subarray;
    },
);

timethese( shift @ARGV || -5, \%tests );

Play with it. As you increase $m and $n you will find the first two solutions scaling better. But unless your numbers are truly impressive, the sort solution by btrott will be faster.

(Yes, I know the test could be improved a lot...)

NOTE: The sorting solution is both faster and incorrect. If your list of indices to remove has duplicates, it splices at the same index more than once, so a neighboring element that should have stayed gets removed as well! This illustrates my real reason for liking map: I find it easier to figure out and validate all possible gotchas using it. Even if it is sometimes a lot slower. :-)
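
To make the duplicate problem concrete, here is a small sketch. The sample data (@array, @idxs) and the deduplication step using %seen are my own illustration, not part of the benchmark above or of btrott's code:

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my @array = ('a' .. 'f');
my @idxs  = (1, 3, 3);    # index 3 appears twice

# Splice approach as in the benchmark: the second splice at index 3
# removes whatever has shifted into that slot ('e'), not 'd' again.
my @spliced = @array;
splice @spliced, $_, 1 for sort { $b <=> $a } @idxs;

# Hash-based grep approach: duplicates collapse into a single hash key.
my %remove;
@remove{@idxs} = (1) x @idxs;
my @grepped = @array[ grep { !$remove{$_} } 0 .. $#array ];

# Deduplicating the indices first makes the splice approach agree with grep.
my %seen;
my @unique = grep { !$seen{$_}++ } @idxs;
my @fixed  = @array;
splice @fixed, $_, 1 for sort { $b <=> $a } @unique;

print "spliced: @spliced\n";    # a c f
print "grepped: @grepped\n";    # a c e f
print "fixed:   @fixed\n";      # a c e f
```

The extra pass to deduplicate costs something, so if you add it to the sorting test the timings may shift a little.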

EDIT
Fixed a typo. I noticed that I had changed the code without updating the description of what the code did. Oops.
