Re^2: find combinations

Unfortunately the benchmark is misleading. You are timing how long it takes tye's solution to return an iterator that can give you the full set, not the amount of time it takes that iterator to return the full set like you are for your solution and my solution. Anyway, heres a modified benchmark that compares apples to apples. (Although I admit I tweaked my code slightly to not pass so much on the stack.)

Also, I normalized and/or simplified some of the other code involved, and changed the way the %mechs has gets passed into your code. Bottom subbing has the disadvantage that you get accidental globals, which can cause trouble when you convert things to a module. Its good practice to top-sub for this reason.

use strict;
use warnings;

use Math::Combinatorics;
use Algorithm::Loops qw( NestedLoops );
use Data::Dumper;
use List::Util qw(max);
use Benchmark qw(:all) ;

sub find_lances_holli
{
    my $mechs    = shift;
    my $players  = shift;
    my $weight   = shift;
    my $treshold = shift;
    
    
    my @all_lances = Math::Combinatorics::combine ($players, @_);
    
    my @valid_lances;
    
    for my $lance ( @all_lances )
    {
        my $sum;
 
        $sum += $mechs->{$_}{weight}
            for @{$lance};
        
        push @valid_lances, { names=>$lance, weight=>$sum }
            if ($weight - $sum <= $treshold) && ($weight - $sum) >= 0 
+;
    }
    
    return @valid_lances;
}


sub find_lances_tye
{
    my( $hvMechs, $members, $maxWeight, $treshold )= @_;
    my @mechs= keys %$hvMechs;
    
    my $iter= NestedLoops
    (
        [ [0..$#mechs], (sub{[$_+1..$#mechs]})x($members-1) ],
        {
            OnlyWhen => sub 
            {
                return   if  @_ != $members;
                my $weight= 0;
                $weight += $_->{weight}  for  @$hvMechs{@mechs[@_]};
                #return $weight == $maxWeight;
                return 1                        
                    if ($maxWeight - $weight <= $treshold) && ($maxWei
+ght - $weight) >= 0 ;
            },
        }
    );
    
    my $getit= sub 
    {
        my @idx= $iter->();
        my $w = 0;
        
        return unless @idx;
        
        $w+=$hvMechs->{$_}->{weight} for @mechs[ @idx ];
        
        return { weight=>$w, names => [@mechs[ @idx ]] }
            if @idx;
    };
    my (@ret,$item);
    push @ret,$item while $item=$getit->();
    return @ret;
}

# make_choose_iter( $choose, @list )
# ------------------------------------------------------
# Returns an anonymous subroutine to act as an iterator
# over the set of combinations, returning each in turn
# every time it is called. If the iter is called with an
# argument or the list has been exhausted on a previous run
# it resets itself and returns undef.
#

sub make_choose_iter {
    my ( $choose, $list )= @_;

    die "Can't do ".(0+@$list)." choose ".($choose||0)."\n"
        if $choose>@$list or !@$list or !$choose;

    $choose--;  # 1 based number passed in, but need it 0 based

    # done is used as a flag, and @idx is the first n positions
    my ( $done, @idx )= ( 0, 0 .. $choose );

    return sub {
        if ( $done || @_ ) {
            # if we finished the list last call, or they are
            # forcing a reset via an argument
            # then we reset everthing and return undef/empty list
            ($done , @idx)= ( 0, 0 .. $choose );
            return;
        }
        # we get the apropriate elements of the list for the
        # current state of @idx. this is what we will return
        my @ret= @{$list}[@idx];

        # assume that this is the last element.
        $done= 1;
        # Loop over the elements of @idx from the right
        # looking for the correct element:
        # We are simulating the behaviour of N
        # nested loops. This means we increment the
        # rightmost element whose increment does not
        # overflow and whose final value is sufficiently
        # low that the elements following it can be filled
        # with its successors without also overflowing
        for ( my $i= $choose ; $i >= 0 ; $i-- ) {
            if ( #$idx[$i] < $#$list
                 #&& 
                 (my $end= $idx[$i] + 1 + $choose - $i) < @$list )
            {
                $idx[$i]++;
                @idx[ $i + 1 .. $choose ]= ( $idx[$i] + 1 .. $end );
                $done= 0;
                last;
            }
        }
        return wantarray ? @ret : \@ret;
    };
}

sub find_lances_demerphq 
{
    my ($mechs,$count,$weight,$treshold)=@_;

    #print Dumper ($mechs);<STDIN>;
    my $iter= make_choose_iter( $count, [values %$mechs] );
    my @ret;
    while (my $items=$iter->()) 
    {
        my $sum=0;
        
        #print Dumper ($items);<STDIN>;
        
        $sum+=$_->{weight} for @$items;
        if (($weight - $sum <= $treshold) && ($weight - $sum) >= 0) {
            my $combo = { weight => $sum, names => [ map { $_->{name} 
+} @{$items} ] }; 
        
            push @ret, $combo
        }
    }

    return @ret;
}

  

my %mechs;
my @holli;
my @tye;
my @demerphq;
my $iter;


while (<DATA>)
{
    chomp;
    my ($mech, $weight, $faction) = split /\s+/;
    $mechs{$mech} = { name => $mech, weight => $weight, faction => $fa
+ction };
}
my $candidates={
        'tye'   => sub { @tye  = find_lances_tye   ( \%mechs, 5, 240, 
+10 ) },
        'holli' => sub { @holli = find_lances_holli ( \%mechs, 5, 240,
+ 10, keys %mechs ) },
        'demq'  => sub { @demerphq = find_lances_demerphq ( \%mechs, 5
+, 240, 10 ) },
    };
    
timethese(-10, $candidates);    
cmpthese(-10, $candidates);


for my $array (\@holli,\@tye,\@demerphq) {
    @$array= sort map { 
                join (",", sort @{$_->{names}}) . "(" . $_->{weight} .
+ " tons)\n" 
             } @$array;
             
}


my $ok = join ("", @holli) eq join ("", @tye) && 
         join ("", @demerphq) eq join ("", @tye) ? "" : " not";

print "output$ok identical\n";         


__DATA__
Flea 20 clan
Commando 25 clan
UrbanMech 30 clan
Hollander 35 clan
Jenner 35 clan
Raven 35 clan
Wolfhound 35 clan
BlackHawk 50 clan
Hunchback 50 clan
Rifleman 60 clan
Catapult 65 clan
Loki 65 clan
Thor 70 clan
MadCat 75 clan
Gargoyle 80 clan
Victor 80 clan
Zeus 80 clan
Longbow 85 clan
Warhawk 85 clan
Mauler 90 clan
Atlas 100 clan
[download]

And it shows my code beating tyes by just a tiny bit. Repeated attempts show the difference is pretty consistant so I dont think this is noise.

Benchmark: running demq, holli, tye for at least 10 CPU seconds...
      demq: 17 wallclock secs (10.31 usr +  0.02 sys = 10.33 CPU) @  2
+.71/s (n=28)
     holli: 19 wallclock secs (11.73 usr +  0.00 sys = 11.73 CPU) @  0
+.34/s (n=4)
       tye: 15 wallclock secs (10.14 usr +  0.03 sys = 10.17 CPU) @  2
+.06/s (n=21)
         Rate holli   tye  demq
holli 0.335/s    --  -85%  -87%
tye    2.18/s  550%    --  -18%
demq   2.66/s  695%   22%    --
output identical
[download]

Anyway, let this be a lesson to you. When the disparity of the benchamrks is so high look for a mistake in the code being benchmarked, especially when you get warnings about insufficient iterations. (And its generally better to use time for the benchmark and not iterations.)

---
$world=~s/war/peace/g

Comment on Re^2: find combinations Select or Download Code

Replies are listed 'Best First'.
Re^3: find combinations by holli (Abbot) on Sep 07, 2005 at 12:25 UTC
I stand corrected. I changed my benchmark code to timethese (-10, { 'tye' => sub { $iter = find_lances_tye ( \%mechs, 5, 240, 10 ); while ( my $lance = $iter->() ) { push @tye, join (",", sort @{$lance->{names}}) . "(" . + $lance->{weight} . " tons)\n" ; } @tye = sort @tye; }, 'holli' => sub { @holli = find_lances_holli (5, 240, 10, keys %mechs ); @holli = map { join (",", sort @{$_->{names}}) . "(" . $_- +>{weight} . " tons)\n" } @holli; @holli = sort @holli; }, 'demerphq' => sub { @demerphq = find_lances_demerphq ( \%mechs, 5, 240, 10 ); @demerphq = map { join (",", sort @{$_->{names}}) . "(" . +$_->{weight} . " tons)\n" } @demerphq; @demerphq = sort @demerphq; }, } ); [download] so it constructs the same data-structure in all cases and then I get `C:\>benchmark.pl Benchmark: running demerphq, holli, tye for at least 10 CPU seconds... demerphq: 10 wallclock secs (10.41 usr + 0.02 sys = 10.42 CPU) @ 1 +.44/s (n=15) holli: 12 wallclock secs (12.23 usr + 0.00 sys = 12.23 CPU) @ 0 +.33/s (n=4) tye: 10 wallclock secs (10.33 usr + 0.02 sys = 10.34 CPU) @ 1 +.84/s (n=19)` [download] Pretty similar to your outcome. Thanks for the clarification. holli, /regexed monk/	[reply] [d/l] [select]
Re^4: find combinations by demerphq (Chancellor) on Sep 07, 2005 at 13:56 UTC
Unfortunately your benchmark is still not what I would consider to be "fair". tyes solution _still_ returns an iterator which you then unpack in the benchmark sub. Wheras your solution and my solution have to return the list on the stack twice. This has an observable cost, especially for large lists like the ones we are dealing with. IMO the three variants of find_lances_XXX() should have identical API's, i.e. they should take the same arguments, and return the same type of list. Anythign else means that you are benchmarking different behaviour and unconsciously weighting some of the solutions different from the others. Another problem with your reformulation is that you have put the list normalization routine inside the benchmark. IMO this is undesirable as it adds the cost of normalizing the list to the runtime of the solutions, when the normalization code is not actually needed for the routine to meet the specification given. Also, sorting is very sensitive to intitial conditions, so this may also subtly favour one of the solutions. Also if you are going to benchmark the code, please use my updated code as the original posting contains some unnecessary stuff, and uses the stack for simplicity when for speed you would normally use pass by reference. --- $world=~s/war/peace/g	[reply]