Re: Fastest way to "pick without replacement"

haukex the test you are proposing removes only 1 item from @numbers. If the test removed more than one numbers (or all) one after the other (after a shuffle) could be more realistic (and my hash-based approach could get a chance too :)):

use warnings;
use strict;
use Benchmark qw/cmpthese/;

use List::Util qw/shuffle/;

my @numbers = ( 0..20 );
my @indices = shuffle 0..$#numbers;

my @expect = ( 0..2,4..20 );
use constant TEST => 0;

cmpthese(-2, {
    splice => sub {
        my @output = @numbers;
        for my $index (@indices){
            splice @output, 0, 1;
        }
        join("\0", @output) eq join("\0", @expect) or die if TEST;
    },
    grep => sub { # https://www.perlmonks.org/?node_id=11123877
        my @output;
        for my $index (@indices){
            @output = @numbers[ grep $_ != $index, 0 .. $#numbers ];
        }
        join("\0", @output) eq join("\0", @expect) or die if TEST;
    },
    hash => sub {
        my %ha; @ha{0..$#numbers} = @numbers;
        for my $index (@indices){
            delete $ha{$index};
        }
        my @output = map { $ha{$_} } sort keys %ha;
        join("\0", @output) eq join("\0", @expect) or die if TEST;
    },
});

__END__

           Rate   grep   hash splice
grep    30916/s     --   -85%   -96%
hash   205775/s   566%     --   -71%
splice 707918/s  2190%   244%     --
[download]

With the above, grep is disadvantaged and splice it just a guess by me (it always removes the first item).

The bottom line: in a loop of removals, hash-based approach can be a contestant too although splice is by far the best but it messes the indices.

bw, bliako

Comment on Re: Fastest way to "pick without replacement" Select or Download Code

Replies are listed 'Best First'.
Re^2: Fastest way to "pick without replacement" by choroba (Cardinal) on Nov 21, 2020 at 00:31 UTC
Have you tried setting TEST to 1? The implementation of grep is wrong, it overwrites the result in each iteration of the for loop. Also, removing just half of the indices shows most of the code needs to be fixed. #!/usr/bin/perl use warnings; use strict; use Benchmark qw{ cmpthese }; use List::Util qw{ shuffle }; my @numbers = 0 .. 20; my @indices = shuffle(grep $_ % 2, 0 .. $#numbers); my @expect = (0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20); use constant TEST => 1; cmpthese(-2, { splice => sub { my @output = @numbers; for my $index (sort { $b <=> $a } @indices){ splice @output, $index, 1; } join("\0", @output) eq join("\0", @expect) or die "@output\n@e +xpect" if TEST; }, grep => sub { my @output = @numbers[ grep { my $number = $_; ! grep $number == $_, @indices } 0 .. $#numbers ]; join("\0", @output) eq join("\0", @expect) or die "@output\n@e +xpect" if TEST; }, hash => sub { my %ha; @ha{0..$#numbers} = (); for my $index (@indices){ delete $ha{$index}; } my @output = @numbers[ sort { $a <=> $b } keys %ha ]; join("\0", @output) eq join("\0", @expect) or die "@output\n@e +xpect" if TEST; }, hash_slice => sub { my %indices; @indices{0 .. $#numbers} = (); delete @indices{@indices}; my @output = @numbers[sort { $a <=> $b } keys %indices]; join("\0", @output) eq join("\0", @expect) or die "@output\n@e +xpect" if TEST; } }); [download] The order hasn't changed, but the ratios have (result with TEST set back to 0): `Rate grep hash hash_slice splice grep 65163/s -- -42% -45% -89% hash 112778/s 73% -- -5% -81% hash_slice 118725/s 82% 5% -- -80% splice 580859/s 791% 415% 389% --` [download] (swl's solution included slightly improved.) Update: But using a nested loop (grep in grep) is inefficient and is probably not how I would have changed the original code removing a single element. I'd rather use something like `grep => sub { my %indices; @indices{@indices} = (); my @output = @numbers[ grep ! exists $indices{$_}, 0 .. $#numbers ]; join("\0", @output) eq join("\0", @expect) or die "@output\n@e +xpect" if TEST; },` [download] which leads to (TEST = 0) `Rate hash hash_slice grep splice hash 113323/s -- -6% -41% -80% hash_slice 120876/s 7% -- -37% -79% grep 190934/s 68% 58% -- -66% splice 568237/s 401% 370% 198% --` [download] `map{substr$_->[0],$_->[1]\|\|0,1}[\\|\|{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`	[reply] [d/l] [select]
Re^2: Fastest way to "pick without replacement" by swl (Prior) on Nov 20, 2020 at 21:54 UTC
Using a slice in the delete call will sidestep the for-loop. The map can also be replaced with a slice. `hash_slice => sub { my %ha; @ha{0..$#numbers} = @numbers; delete @ha{@indices}; my @output = @ha{sort keys %ha}; join("\0", @output) eq join("\0", @expect) or die if TEST; },` [download]	[reply] [d/l]
Re^3: Fastest way to "pick without replacement" by swl (Prior) on Nov 21, 2020 at 03:06 UTC
The sort should also be numeric. `hash_slice => sub { my %ha; @ha{0..$#numbers} = @numbers; delete @ha{@indices}; my @output = @ha{sort {$a <=> $b} keys %ha}; join("\0", @output) eq join("\0", @expect) or die if TEST; },` [download] Update: Although I now see Choroba has already done that in 11123952.	[reply] [d/l]
Re^3: Fastest way to "pick without replacement" by bliako (Abbot) on Nov 20, 2020 at 23:53 UTC
coolz!	[reply]