Given the sets: ABCD ABCE ABD #### ABCD ABE ABC EF Using the "store everything in the powerset approach" (or, at least allocating space for it), we need to store 2^6 elements (A,B,C,D,E,F) or 64 bytes of data. We're going to assume that we know in advance that there are only 6 possible elements. For sake of argument, let's assume a more compact efficient approach along the lines of the original approaches using hash tables. We'd build up a structure like this, in some arbitrary order: #first set ABCD BCD ACD ABD ABC AB AC BC AD BD CD A B C D #second set ABE AE BE E #third set #generates nothing #fourth set EF F This only requires 22 bytes of data, a 65% memory savings. #### #first set generates normally ABCD #ABCD is the first set, store internally BCD ACD ABD ABC AB AC BC AD BD CD A B C D #second set is ABE a subset of ABCD? NO -> print #ABE is the first set, store internally is AB a subset of ABCD? YES -> skip is AE a subset of ABCD? NO -> print is BE a subset of ABCD? NO -> print is A a subset of ABCD? YES -> skip is B a subset of ABCD? YES -> skip is E a subset of ABCD? YES -> skip #third set is ABC a subset of ABCD? YES -> skip #generates nothing, stores nothing #fourth set is EF a subset of ABCD? NO -> is EF a subset of ABE? NO -> print is E a subset of ABCD? NO -> is EF a subset of ABE? YES -> skip is F a subset of ABCD? NO -> is F a subset of ABE? NO -> print internally, we only need to store ABCD, ABE, and EF for 3 bytes of data stored - a 95.3% storage savings. #### So if we added on ABF to the list, it would check as follows: is ABF a subset of EF? NO -> ABE < ABF, therefore ABF cannot be a subset of anything. Stop now and print ABE #### So instead of: my @previous = qw(EF ABE ABCD) I'd have: my @previous = ( [qw(ABE ABCD)], #A slot [qw(ABE ABCD)], #B slot [qw(ABC)], #C slot [qw(ABCD)], #D slot [qw(EF ABE)], #E slot [qw(EF)] #F slot ); #### [qw(ABE ABCD)], #A slot [qw(ABE ABCD)], #B slot [qw(EF)] #F slot #### (total number of sets) * (average length of sets) bytes of data. Let's say we have 10,000,000 sets with an average set length of 24. That's only going to require 240,000,000 bytes to store (240 megabytes), which is slightly less than 1% of the total ram of the "store it all" approach. #### To demonstrate the optimized version: #first set generates normally ABCD #ABCD is the first set, store internally BCD ACD ABD ABC AB AC BC AD BD CD A B C D #second set ABE -> look at slots A, B, E. All are the same, so arbitrarily check against "A" (ABCD) is ABE a subset of ABCD? NO -> print #ABE is the first set, store internally is AB a subset of ABCD? YES -> skip is AE a subset of ABCD? NO -> print is BE a subset of ABCD? NO -> print is A a subset of ABCD? YES -> skip is B a subset of ABCD? YES -> skip is E a subset of ABCD? YES -> skip #third set is ABC a subset of ABCD? YES -> skip #generates nothing, stores nothing #fourth set -> look at slots "E", "F". "F" is the smallest (empty), so we need to do -no- checks. EF E F We store 9 bytes and do 10 checks (determining "no checks" counts as a check). Note - the original powerset version would also require 10 checks, so we're optimal here.