Re: Removing certain elements from an array
by btrott (Parson) on Aug 24, 2000 at 20:53 UTC
|
I think this'll do it for you:
splice @array, $_, 1
for sort { $b <=> $a } @indices;
The reason for the sort is that it starts dropping elements
from the end of the array. If you start from the beginning,
you won't drop the elements that you want to drop. Say
you want to drop the elements at indices 0 and 2; you
drop the element at index 0, but then the other elements
squish towards the front of the array. So element 2 is now
element 1. So you'll drop the wrong elements. That's why
you should start from the end.
Update: tilly pointed out a bug where duplicate
indices could lead to dropping incorrect elements in the
array. Here's a fix:
my %u;
splice @array, $_, 1
for grep !$u{$_}++, sort { $b <=> $a } @indices;
This is just getting silly in terms of the number of times
we're looping over the data; but it does still seem to be quite
fast. Faster than I would have expected, in fact. | [reply] [d/l] [select] |
|
| [reply] |
|
Cool, that was the problem I was having. I knew about the wonders of splice but just didn't think of using sort to prevent the indexes from becoming invalidated after doing the first splice.
Thanks a lot!
| [reply] |
RE (tilly) 1: Removing certain elements from an array
by tilly (Archbishop) on Aug 24, 2000 at 21:30 UTC
|
TMTOWTDI, here is a map solution:
my %remove;
++$remove{$_} foreach @remindexes;
@array = map ((exists $remove{$_} ? () : $array[$_]), 0..$#array);
Basically build up a hash lookup for what you don't want,
and map the indexes of the array into nothing or the
array element as desired.
BTW recent conversations with tye have reminded me that
entering and leaving blocks has a significant overhead.
Therefore I have not written this map with a block. | [reply] [d/l] |
RE: Removing certain elements from an array
by Boogman (Scribe) on Aug 24, 2000 at 20:50 UTC
|
Use the splice function. From perlfunc:
splice ARRAY,OFFSET,LENGTH,LIST
Removes the elements designated by OFFSET and LENGTH from
an array, and replaces them with the elements of LIST, if
any.
If you don't specify LIST, it just removes the specified elements from the array and "scrunches" the rest of the array together.
Update:
Oh and if you want have a list of them you wanted to remove,
you would probably want to do something like
splice( @array, $_, 1 ) foreach ( @elementsToDelete );
And @elementsToDelete would have to be sorted in descending order as btrott mentions. | [reply] [d/l] [select] |
Re: Removing certain elements from an array
by tye (Sage) on Aug 24, 2000 at 21:31 UTC
|
my @array;
getreallyhugelist( \@array );
my @idxsToRemove= getnotashugelist();
my %toRemoveIdx;
@toRemoveIdx{@idxsToRemove}= (1)x@idxsToRemove;
@array= map { $toRemoveIdx{$_} ? () : $array[$_] } 0..$#array;
That last line copies the whole array once (puts it on the stack) and then only has to shift each non-removed element once. So it will probably be faster than repeated calls to splice() (each one having to shift the whole end of the array). I think this speed-up will more than compensate for the cost of building the hash (on the preceding line) if you are removing more than a few elements.
-
tye
(but my friends call me "Tye") | [reply] [d/l] |
|
#!/os/dist/bin/perl -w
use Benchmark;
my @array = 0..10000;
my @idxsToRemove = ();
push @idxsToRemove, int( rand( 10000 ) ) for ( 0 .. 100 );
my %toRemoveIdx;
@toRemoveIdx{@idxsToRemove} = (1)x@idxsToRemove;
sub Tye {
my @subarray = @array;
@subarray = map { $toRemoveIdx{$_} ? () : $subarray[$_] } 0..$#sub
+array;
}
sub btrott {
my @subarray = @array;
splice @subarray, $_, 1 for sort { $b <=> $a } @idxsToRemove;
}
imethese( shift @ARGV || -5,
{
'Tye' => \&Tye,
'btrott' => \&btrott,
}
);
Generated these results:
Benchmark: running Tye, btrott, mikfire, each for at least 5 CPU seco
+nds...
Tye: 7 wallclock secs ( 5.98 usr + 0.00 sys = 5.98 CPU) @ 7
+.19/s (n=43)
btrott: 6 wallclock secs ( 5.38 usr + 0.00 sys = 5.38 CPU) @ 63
+.01/s (n=339)
mikfire | [reply] [d/l] [select] |
|
Note that Perl arrays are most definitely not linked lists. They are neat, however.
What you may be remembering is that pulling things off of the front of an array is nearly as efficient as popping them off of the back of an array. This is because Perl doesn't bother to shift all of the elements down until the gap in front gets big enough. But removing items from the middle of an array requires that one end (or the other?) be shifted.
The drastic difference in benchmark times confuses me. Update: No, "sub btrott" isn't broken (my debugging was). Hmmm....
Update 2: Never underestimate the slowness of Perl's opcode machinery. The "map" version loses because it executes 10_000 statements while the "splice" version only executes 100. While each "splice" statement has to copy a big chunk of the 10_000-element array, Perl's overhead swamps that. With the number of elements to be removed raised to 5_000, the "map" version is faster.
-
tye
(but my friends call me "Tye")
| [reply] |
|
| [reply] |
|
|
| [reply] |
|
use Benchmark;
my $n = 10000;
my $m = 100;
my @array = 0..$n;
my @idxsToRemove = map {int( rand( 10000 ) ) } 1..$m;
my %tests = (
grepping => sub {
my @subarray = @array;
my %rem;
@toRemoveIdx{@idxsToRemove} = (1)x@idxsToRemove;
@subarray = @subarray[ grep !$toRemoveIdx{$_}, 0..$#subarray ];
},
mapping => sub {
my @subarray = @array;
my %rem;
@toRemoveIdx{@idxsToRemove} = (1)x@idxsToRemove;
@subarray = map { $toRemoveIdx{$_} ? () : $subarray[$_] } 0..$#s
+ubarray;
},
sorting => sub {
my @subarray = @array;
splice @subarray, $_, 1 for sort { $b <=> $a } @idxsToRemove;
@subarray;
},
);
timethese( shift @ARGV || -5, \%tests );
Play with it. As you move $m and $n up you
will find the first two solutions scaling better. But
unless you have some truly impressive numbers, the sort
solution by btrott will be faster.
(Yes, I know the test could be improved a lot...)
NOTE: The sorting solution is both faster
and incorrect. If your list of elements to remove has
duplicates, you incorrectly remove the same element
multiple times! This illustrates my real reason for
liking map, I find it easier for me to figure out
and validate all possible gotchas using it. Even if
it is sometimes a lot slower. :-)
EDIT
A typo change, noticed that I changed code without
touching the description of what the code did. Oops. | [reply] [d/l] |
Re: Removing certain elements from an array
by chromatic (Archbishop) on Aug 24, 2000 at 21:53 UTC
|
I like the hash and array slices, but I'm not thrilled with the map. Here's a version that needs a sort, which may not gain you anything, but at least it's a different approach.
my %h;
@h{ 0 .. $#array }++;
delete @h{@remindexes};
@array = @array[ sort { $a <=> $b } keys %h ];
| [reply] [d/l] |
Re: Removing certain elements from an array
by xdb19 (Sexton) on Aug 24, 2000 at 23:39 UTC
|
This is a one liner that works well for me.
splice( @array, $_ - $ind++, 1 ) foreach ( sort @remindexes );
Have Fun - XDB19 | [reply] [d/l] |
|
Attack from the rear:
splice( @array, $_, 1) for (reverse sort @remindexes);
This way, you start with the highest array index and move down, so you don't have to keep count of how many elements you have removed.
| [reply] [d/l] |
Re: Removing certain elements from an array
by isotope (Deacon) on Aug 24, 2000 at 20:52 UTC
|
Try something like this (untested):
foreach(@remindexes) {
splice(@array, $_, 1);
}
Update: This is going to squeeze the array down as it goes, so the indices in @remindexes will be inaccurate. See btrott's post below for the right way to do it.
HTH,
--isotope
| [reply] [d/l] |
Re: Removing certain elements from an array
by le (Friar) on Aug 24, 2000 at 20:58 UTC
|
My solution (clumsy as usual):
my @array = qw( ...);
my @remindexes = qw( ... );
my %remindexes;
my @newarray;
for (@remindexes) {
$remindexes{$_}++;
}
for (my $i = 0; $i <= $#array; $i++) {
push @newarray, $array[$i] if !$remindexes{$i};
}
Of course, if you would setup the indexes to remove into a hash from the beginning, you could omit the first step. | [reply] [d/l] |
RE: Removing certain elements from an array
by BlaisePascal (Monk) on Aug 24, 2000 at 23:00 UTC
|
@hugearray = ( ... );
@toremove = ( ... );
# remove from @hugearray elements indexed in @toremove
@hugearray[@toremove] = ("RemoveMe") x @toremove;
push @hugearray, "SentinelFlag";
for (my $_;;) {
$_ = shift @hugearray;
last if $_ eq "SentinelFlag";
next if $_ eq "RemoveMe";
push @hugearray, $temp;
}
This works by flagging the elements you want to delete, then using @hugearray as a circular shift register, shift-pushing each element in the array in turn, until it gets back to the beginning. As it goes, it doesn't push back in the elements to be deleted.
How does it fare efficiency-wise? This routine touches each data element once, and likely does at most a single implicit array-copy. I think it is probably an O(n+m) operation (n=@hugearray, m=@toremove). The splice solution is O(m log m + m*O(splice(n)). Since perl arrays are not linked lists, I think splice is an O(n) operation, meaning that the splice solutions are O(m log m + m*n). Which is faster? My guess would be the shift-register, but I don't know.
| [reply] [d/l] |
Re: Removing certain elements from an array
by tye (Sage) on Aug 24, 2000 at 22:50 UTC
|
FYI, Benchmark.pm has proved every guess of mine wrong in this thread:
- Using blocks in map didn't make anything slower (I didn't investigate why).
- The "map" solution is much slower than the "splice" solution unless nearly half of the elements are removed (due to overhead in executing so many Perl statements).
- chromatic's solution is slower still (which I thought would be close in speed to the "splice" solution, if not faster). I think this is because the "sort" has to be done on so many more elements.
But then that is why Benchmark was written.
-
tye
(but my friends call me "Tye")
| [reply] [d/l] |
Re: Removing elements from array (use grep!)
by Russ (Deacon) on Aug 24, 2000 at 22:31 UTC
|
Hmmm... Haven't seen a solution using the tailor-made
grep.
my @array = (0,1,2,3,4,5,6,7,8,9); # The source array
my %DropList = map {$_ => 1} (2,4,6,8); # A hash of the indices to rem
+ove
my @newarray = @array[grep {not exists $DropList{$_}} (0..$#array)];
print "@newarray\n";
Notes:
- I am still using 5.005_57 to match a production server, so if you prefer to use exists on arrays in 5.6...
Enjoy.
Russ
Brainbench 'Most Valuable Professional' for Perl | [reply] [d/l] |