http://qs1969.pair.com?node_id=191902

ezekiel has asked for the wisdom of the Perl Monks concerning the following question:

I have an array of arbitrary length. I want to get all possible sub-arrays of a certain length. For example,

@my_array = (0, 1, 2, 3); $my_sub_array_length = 2; # possible combinations are (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)

A search of the site lead me to this module which looks like it might do what I want but the links therein to pod and other documentation seem to be no longer active.

Does anyone know of new links to the documentation for the module? or does anyone have any other suggestions for doing these combinatorics?

Thanks.

Replies are listed 'Best First'.
(tye)Re: Combinatorics
by tye (Sage) on Aug 22, 2002 at 03:10 UTC
    sub genFixedSubsets { my( $size, @set )= @_; my @idx= reverse 0..$size-1; return sub { return if $size < @idx; my @ret= @set[@idx]; my $i= 0; $i++ until ++$idx[$i] < @set-$i || $size < $i; $idx[$i]= 1+$idx[1+$i] while 0 <= --$i; return @ret; }; } my $gen= genFixedSubsets( $ARGV[0] || 3, 1..($ARGV[1]||5) ); my @subset; while( @subset= $gen->() ) { print "@subset\n"; }

    For example:

    $ subsets 3 5 3 2 1 4 2 1 5 2 1 4 3 1 5 3 1 5 4 1 4 3 2 5 3 2 5 4 2 5 4 3
            - tye (but my friends call me "Tye")
Re: Combinatorics
by Aristotle (Chancellor) on Aug 22, 2002 at 03:23 UTC
    #!/usr/bin/perl -w use strict; use Data::Dumper; sub unshift_many { my $scalar = shift; unshift @$_, $scalar for @_; @_; } sub combinations { my ( $array, $len, $start ) = @_; $start ||= 0; return unless $len > 0; return $len == 1 ? map [ $_ ], @{ $array }[ $start .. $#$array ] : map unshift_many( $array->[$_], combinations( $array, $len-1 +, $_+1 ) ), $start .. $#$array; } $Data::Dumper::Indent=0; print Dumper( [ combinations [ qw( 1 2 3 4 5 ) ], 3 ] ), "\n";

    Makeshifts last the longest.

Re: Combinatorics
by dpuu (Chaplain) on Aug 22, 2002 at 02:43 UTC
    I don't know about the module, but its a fairly simple (recursive) subroutine:
    use Data::Dumper; my @a = (1..4); my $len = 4; print Dumper(combinations($len, @a)); sub combinations { my ($size, @elements) = @_; return [] if $size < 1; my @result = (); foreach my $elem (@elements) { push @result, map { [$elem, @$_] } combinations($size-1, grep { $_ != $elem } @elements) } return @result; }
    --Dave

    Update: for greater generality, replace foreach loop with

    my @seen = (); while (@elements) { my $elem = shift @elements; push @result, map { [$elem, @$_] } combinations($size-1, @seen, @elements); push @seen, $elem; }

      This looks great! except that it gives permutations rather than combinations. For example, it produces both (1, 2, 3) and (1, 3, 2) whereas, for my purposes, these are the same thing i.e., order is not important. It gives me a starting point though - thanks!

Re: Combinatorics
by jryan (Vicar) on Aug 22, 2002 at 03:05 UTC
    Set "$set_group_size" to the size you want the chunks to be.
    use Data::Dumper; my $set_group_size = 2; my @list = 1..5; print Dumper ([ sumList($set_group_size,@list) ]); sub sumList { my @sumlist; my $size = shift; while ( @_ ) { my @current = splice @_, 0, $size-1; foreach my $item (@_) { push ( @sumlist, [@current,$item] ); } unshift (@_, @current[1..$#current]); } return @sumlist; }
Re: Combinatorics
by DamnDirtyApe (Curate) on Aug 22, 2002 at 05:30 UTC
    #! /usr/bin/perl -w use strict ; use Data::Dumper ; $|++ ; my @my_array = (0, 1, 2, 3) ; my $my_sub_array_length = 2 ; my @subsets = () ; for ( 3 .. 2 ** @my_array ) { my @digits = reverse split // => sprintf "%b", $_ ; if ( ( grep { $_ } @digits ) == $my_sub_array_length ) { my @sub_arr = grep { $digits[$_] } @my_array ; push @subsets, \@sub_arr ; } } print Dumper( \@subsets ) ; exit ; __END__

    _______________
    DamnDirtyApe
    Those who know that they are profound strive for clarity. Those who
    would like to seem profound to the crowd strive for obscurity.
                --Friedrich Nietzsche
Re: Combinatorics
by bart (Canon) on Aug 22, 2002 at 09:12 UTC

    Ah, a twist. The common question is about permutations.

    Nevertheless, I'd still use recursion. Say you need $n elements. Is the first element included? If yes, combine it with all combinations of $n-1 elements from the list of all of the following elements. If no, get all combinations of $n elements of the same sublist. Of course, you need to include every possible solution, which means walking every possible path.

    Code!

    sub combinations { my $n = shift; return [@_] if $n == @_; return () if $n > @_ or $n < 0; my $first = shift; my @r = ((map [ $first, @$_ ], combinations($n-1, @_)), combinations($n, @_)); return @r; } use Data::Dumper; print Dumper [ combinations(2, (0, 1, 2, 3)) ];

    It appears to be working well.

    Update: I'm pretty sure inserting

    return [] if $n == 0;
    at the appropriate place, i.e. among the other return statements near the top, will improve efficiency quite a bit. It avoids doing a lot of useless recursion if all you want is the empty list as a singleton.
      Yes, it will. In fact you can cut down on the amount of useless processing further by replacing all those returns with return map [$_], @_ if $n == 1; and then you have the equivalent to the code I wrote, except passing around full lists rather than just an arrayref.

      Makeshifts last the longest.

Re: Combinatorics
by BrowserUk (Patriarch) on Aug 22, 2002 at 19:09 UTC

    Update:Having heard back from the author of the original C-source of this algorithm, he asked me to change the email address to his new one.

    Not as concise as most of the others, but it seems fairly efficient for both memory and speed.

    It probably could be golfed some more, but it defeated my attempts so far.

    If your application calls for sub-setting different sets, but with the same size of both subset and set, you can generate the subsets of indices to the sets, rather that the subsets themselves and reuse the indices.

    #! perl -w use strict; =pod Bit-twiddling transpositional combination generator in Perl, © 2002,BrowserUK / perlmonks.com Based upon a C implementation by Doug Moore (unkadoug@yahoo.com). Source:http://www.caam.rice.edu/~dougm/twiddle/yargbitcomb.c =cut sub Lshift1 { use integer; my $i = shift; my $ii = $i>>1; return 1 << $ii << ($i - $ii); } sub yargFirstComb # Returns the inverse gray code (yarg) of the first combination of k i +tems (i.e. {0,1,..,k-1}) { use integer; my $kk = Lshift1($_[0])-1; return $kk ^ $kk/3; }; sub leastItem # Returns the least item in a combination (i. e. leastItem({2,4,5}) == + {2} { use integer; return $_[0] & -$_[0]; }; sub yargLastComb # Returns the yarg of the last combination of k items from n (i.e. {n- +k,..,n-1}) { use integer; my ($nn, $kk) = ( Lshift1($_[0])-1, Lshift1($_[1])-1); return ($_[1]) ? $nn ^ ($kk/3) : 0; }; # Returns the yarg of the next combination after yarg input sub yargNextComb { use integer; my $comb = shift; my $grey = ($comb << 1) ^ $comb; my $i = 2; my $candidateBits; do { my $y = ($comb & ~($i - 1)) + $i; my $j = leastItem( $y ) << 1; my $h = !!($y & $j); $candidateBits = (($j - $h) ^ $grey) & ( $j - $i ); $i = $j; } while (!$candidateBits); return $comb + leastItem($candidateBits); } sub factorial { no integer; my ($f,$n) = (1,shift); $f *= $n-- while( +$n ); return $f; } sub subsets { use integer; my @AoAoCombs; my ($k, $n, $combs) = (shift, shift, 0); { no integer; $combs = factorial($n)/(factorial($k)*factorial($n-$k)); print "Generating $combs subsets of $k from a set of $n\n"; $#AoAoCombs = $combs-1; #pre-extend t +he array of array refs to its final size } die "Usage: subsets k, n\nGenerate subsets of k-elements from a se +t of n-elements where k < n.\n" unless $n and $k and $k < $n; my $comb = yargFirstComb($k); my $lastcomb = yargLastComb( $n, $k); while(1) { my $member = 0; #!! my $c = $comb ^ ($comb >> 1); # 'push' anon array ref & pre-extend anon. array space $AoAoCombs[--$combs] = []; $#{$AoAoCombs[$combs]} = $k-1; ($c & Lshift1($_)) and @{$AoAoCombs[$combs]}[$member++] = $_ f +or 0 .. $n-1; # 'unshift' last if $comb == $lastcomb; $comb = yargNextComb($comb); } return \@AoAoCombs; } my $AoAoCombs = subsets 2, 4; # Generate combinations of indices my @data1 = qw( just another perl hacker ); local $,=' '; print "Applying combined indices to @data1\n\n"; print @data1[ @{$AoAoCombs->[$_]} ], $/ for (0 .. $#$AoAoCombs); my @data2= (1,2,3,4); print "\nApplying combined indices to @data2\n\n"; print @data2[ @{$AoAoCombs->[$_]} ], $/ for (0 .. $#$AoAoCombs); # App +ly the indices to as many sets as you like print $/; no integer; my @data3 = (1..31); my @times = (times); my $start = $times[0] + $times[1]; $AoAoCombs = subsets 26, ~~@data3; @times = times; my $end = $times[0]+$times[1]; print "Generating " . @{$AoAoCombs} . " combinations of 26 from 31 too +k ", $end-$start, " seconds of cpu P-II\@233MHz\n", "including generating 169911 x 26 element anonymous arrays to stor +e the results.\n"; #print "\nApplying combined indices to @data1\n\n"; #print @data1[@{$^AoAoCombs[$_]}], $/ for (0..$#{$AoAoCombs}); + # Apply the indices __END__ # Output C:\test>191902 Generating 6 subsets of 2 from a set of 4 Applying combined indices to just another perl hacker just hacker another hacker perl hacker just perl another perl just another Applying combined indices to 1 2 3 4 1 4 2 4 3 4 1 3 2 3 1 2 Generating 169911 subsets of 26 from a set of 31 Generating 169911 combinations of 26 from 31 took 251.51 seconds of +cpu P-II@233MHz including generating 169911 x 26 element anonymous arrays to store th +e results. C:\test>

    What's this about a "crooked mitre"? I'm good at woodwork!
Re: Combinatorics
by blakem (Monsignor) on Aug 22, 2002 at 21:43 UTC
    A little afternoon golf produced this scary looking solution... It assumes the elements in the array are unique (i.e. a set) and none of them contain the comma character.
    #!/usr/bin/perl -wT use strict; my @a = 1..5; my $l = 3; local $" = ','; my @combos = grep!$;{"@$_"}++,map[sortsplit','], grep!/([^,]+).*,\1,/,glob"{@a},"x$l; print "@$_\n" for @combos; __END__ 1,2,3 1,2,4 1,2,5 1,3,4 1,3,5 1,4,5 2,3,4 2,3,5 2,4,5 3,4,5
    Update: There is a subtle bug in the second line of the golfed statement... grep!/([^,]+).*,\1,/ is supposed to filter out all permutations that have doubled elements, but its a bit faulty. It doesn't affect the final value of @combos though. Anyone want to take a guess at it? Whats the bug, and why doesn't it matter in the end?

    -Blake

Re: Combinatorics
by I0 (Priest) on Aug 23, 2002 at 05:03 UTC
    use Data::Dumper; my @my_array = (1..7); my $my_sub_array_length = 4; print Dumper(combinations($my_sub_array_length, @my_array)); sub combinations { my($len,@a)=@_; return map{ my $c=$_<<1; [grep{($c>>=1)&1}@a]} &{sub{ my @ret = (); my $x; for($_=(1<<shift)-1; ($x=$_)<1<<$_[0]; $x&=~$x>>1,$x&=-$x,$_+=$x--,($x&=$_)?($_-=$x,$_+=$x/($ +x&-$x)):0 ){ push @ret,$_ } @ret; }}($len,0+@a); }
A non-recursive solution
by Thelonius (Priest) on Aug 23, 2002 at 15:43 UTC
    #!perl -w use strict; # kenhirsch at myself.com 2002-08-23 my $r = shift or die "usage: combinations r a b c d e ...\n"; my @out = combinations($r, \@ARGV); for (@out) { print join(" ", @{$_}), "\n"; } # From Algorith L in Knuth Vol. 4 Sec 7.2.1.3 (not yet published) sub combinations { my ($t, $arrayref) = @_; my @c = 0 .. $t-1; my @range = reverse @c; my $j; my @result; $c[$t] = scalar(@{$arrayref}); $c[$t + 1] = 0; do { push @result, [@{$arrayref}[@c[@range]]]; for ($j=0; $c[$j] + 1 == $c[$j+1]; $j++) { $c[$j] = $j; } $c[$j]++; } while ($j < $t); return @result; }
Re: Combinatorics
by jackdied (Monk) on Aug 25, 2002 at 07:51 UTC
    Consider this a plug,

    http://probstat.sourceforge.net Is my combination/permutation/cartesian back-of-the-envelope algos written in C with python bindings. I'm currently in the process of upgrading the algos and adding more python functionality (slices).

    I've been meaning to add PerlXS bindings, but haven't had the time to learn anything complicated in XS. If someone can write an XS interface for one of the objects, I can fake it for the rest.

    The license is GPL, oddly enough I'm working on it at this moment (pulling from http://sources.redhat.com/gsl/ the Gnu Scientific library for better C algos where I can). If you want to see what this perl code I wrote Name Me! MixMatch? looks like translated from C to perl, check it out.

      If someone can write an XS interface for one of the objects, I can fake it for the rest.

      Don't. Take a look at Inline::C or Inline::C++. You'll find that it's a lot easier than you think.

      Greetings, Christian