Efficient Unique Nested Combinations

FFRANK has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Efficient Unique Nested Combinations by Roy Johnson (Monsignor) on Jun 25, 2007 at 20:30 UTC
You might be looking for a technique I discussed in NestedLoops and the Odometer Model: Now imagine that each time one of the dials changes, all the dials to the right of it can be remade. For example, each dial could exclude any numbers that exist on any dial to its left. That is, instead of looping over the static AoA, you can replace the arrayrefs with codrefs that filter out any elements that have already been used by outer loops. New update: In Re: Efficient Unique Nested Combinations, FFRANK points out that my "I think you want" is not the correct algorithm. A corrected solution is posted in reply thereto. Update: I think you just want to ensure that each element is greater-than-or-equal-to the previous one, so: `#!perl use warnings; use strict; use Algorithm::Loops 'NestedLoops'; my @symbol = ( [ 'a', 'b', 'c' ], [ 'a', 'b', 'c' ], [ 'a', 'b', 'c' ], [ 'c', 'd', 'e' ], ); my $combos = NestedLoops([ $symbol[0], map { my $_hold = $_; sub { [grep {$_ ge $_[$#_]} @{$symbol[$_hold]}] } } 1..$#symbol ]); my @result; print "@result\n" while @result = $combos->();` [download] Caution: Contents may have been coded under pressure.	[reply] [d/l]
Re^2: Efficient Unique Nested Combinations by Limbic~Region (Chancellor) on Jun 26, 2007 at 00:05 UTC
Roy Johnson, I think you just want to ensure that each element is greater-than-or-equal-to the previous one, so: That's exactly one the shortcuts I used in Challenge: Number of unique ways to reach target sum. I am going to update Arbitrarily Nested Loops to mention NestedLoops and the Odometer Model fwiw. Everytime I refer to someone to one of them, I have to look up the other. Cheers - L~R	[reply]
Re: Efficient Unique Nested Combinations by clinton (Priest) on Jun 25, 2007 at 19:49 UTC
I don't know anything about Algorithm::Loops, but given your code above, there is no need for the intermediate array `@list`. Instead, you can work out uniqueness as you go by using the keys of a hash: `#!/usr/bin/perl -w use strict; use Algorithm::Loops qw(NestedLoops); my @symbol = ( [ 'a', 'b', 'c' ], [ 'a', 'b', 'c' ], [ 'a', 'b', 'c' ], [ 'c', 'd', 'e' ], ); my $iter = NestedLoops(\@symbol); my %seen; while (my @list = $iter->()) { my $listString = join ('',sort @list); $seen{$listString}++ } my @sortedUniq = sort (keys %seen); foreach my $sortedUniq (@sortedUniq) { print $sortedUniq,"\n"; }` [download] Clint UPDATE - Benchmarks added I compared the array method to the hash method, both for data that was all the same, and for data that was completely different, the results were: `Rate array_diff array_same hash_diff hash_same array_diff 916/s -- -10% -16% -25% array_same 1012/s 11% -- -7% -18% hash_diff 1092/s 19% 8% -- -11% hash_same 1229/s 34% 21% 13% --` [download] Code for benchmarks: Read more... (2 kB)	[reply] [d/l] [select]
Re: Efficient Unique Nested Combinations by FFRANK (Beadle) on Jun 25, 2007 at 22:52 UTC
Thank you all very much for helpful comments; I hope I was explicit enough. I meant to say that 'aab' is the same as 'baa', so to output 'aab' or 'baa' or 'aba', but not all, and to do this the most efficient way. Hereafter, "symbols" are replaced by short strings, and the problem remains the same. I believe so far the suggestion from Roy Johnson is the best: Read more... (2 kB)	[reply] [d/l]
Re: Efficient Unique Nested Combinations by GrandFather (Saint) on Jun 25, 2007 at 21:50 UTC
`use strict; use warnings; use List::Compare; my @symbol = ( [ 'a', 'b', 'c' ], [ 'a', 'b', 'c' ], [ 'a', 'b', 'c' ], [ 'c', 'd', 'e' ], ); my $lCmp = List::Compare->new(@symbol); my @unique = $lCmp->get_symmetric_difference (); my @union =$lCmp->get_union (); print "Unique: @unique\n"; print "Union: @union\n";` [download] Prints: `Unique: d e Union: a b c d e` [download] Update: completely missed the point! DWIM is Perl's answer to Gödel	[reply] [d/l] [select]
Re: Efficient Unique Nested Combinations by Moron (Curate) on Jun 26, 2007 at 11:15 UTC
Algorithm::NestedLoops is a more general effort, whereas Math::Combinatorics is focused specifically on generating combinations and permutations - I would expect it to derive some performance from that specialisation, especially in the case of combinations which are a bit more difficult to iterate with their need to treat e.g. aab and aba as duplicate. __________________________________________________________________________________ ^M Free your mind!	[reply]
Re^2: Efficient Unique Nested Combinations by Roy Johnson (Monsignor) on Jun 26, 2007 at 15:06 UTC
True, except that M::C doesn't support the operation being described: combinations generated by taking one member from each of N (possibly) distinct sets. Caution: Contents may have been coded under pressure.	[reply]
Re^3: Efficient Unique Nested Combinations by Moron (Curate) on Jun 27, 2007 at 09:32 UTC
Au contraire, M::C supports nCk and the described operation can be expressed as a two-deep explicit nested loop of nCk next-combination calls, the lower layer being nC1. Update: because selecting a single member from a set is the semi-degenerate nC1 case of an nCr. __________________________________________________________________________________ ^M Free your mind!	[reply]
Re^4: Efficient Unique Nested Combinations by Roy Johnson (Monsignor) on Jun 27, 2007 at 15:17 UTC
Re^5: Efficient Unique Nested Combinations by Moron (Curate) on Jun 28, 2007 at 11:29 UTC
Re: Efficient Unique Nested Combinations by FFRANK (Beadle) on Jun 26, 2007 at 14:43 UTC
When the "symbols" are not ordered in the initial AoA, it skips some results. `my @symbols=( ['c','b','e'], ['a','f','d'], ); hash(\@symbols); ab ac ae bd bf cd cf de ef combo(\@symbols); c f c d b f b d e f` [download] The "greater than" solution thus assumes an order that is not neccessarily there in the initial AoA. Cheers	[reply] [d/l]
Re^2: Efficient Unique Nested Combinations by Roy Johnson (Monsignor) on Jun 26, 2007 at 15:25 UTC
I resorted to using a hash to catch what had been seen. The model still offers some short-circuiting, and is moderately faster than "hash" in my benchmarking (the more duplicates there are to avoid, the better "combo" does). Note also that you didn't change my code to use the passed-in array ref. I have done so here. `sub combo { my $symbols = shift; my %seen; my $combos = NestedLoops([ $symbols->[0], map { my $_hold = $_; sub { [grep {!$seen{join(':', sort @_, $_)}++} @{$symbols->[$_hold +]}] } } 1..$#$symbols ]); while (my @result = $combos->()) { print "@result\n"; } print "\n"; }` [download] Caution: Contents may have been coded under pressure.	[reply] [d/l]