blackmanao has asked for the wisdom of the Perl Monks concerning the following question:

I hope some kindly monk or three can give me some guidance. I'm trying to decompose an arbitrary sum into a set of summands (or addends or terms, whatever you like to call them), such that there are always at least two summands in the set, and every summand is greater than or equal to 2.

Example; the sum 7 should be broken down into the sets (and only the sets) {5,2}, {4,3} and [3,2,2}. I don't need the permutations i.e. {2,2,3} or {2,3,2} since that would stiffle the speed of another calculation where if one works, all will work. And this needs to be very fast since in use it's going to have to handle sums of up to 10^6.

This is as far as I've gotten, but I just can't seem to get the looping to work quite right:

#!/usr/bin/perl -w # decompose sums to sets of summands use strict; use warnings; use vars qw($sum $size @temp @summands $prev $next); $sum = 10; $prev = 2; $next = $sum - $prev; while ($prev <= $next) { break(); } $size += scalar(@summands); for (my $i = 0 ; $i < $size ; $i++) { @temp = split(",",$summands[$i]); $sum = pop(@temp); $prev = 2; $next = $sum - $prev; while ($prev <= $next) { break(); } } for (@summands) { print "$_\n"; } sub break { push(@temp, $prev); push(@temp, $next); push(@summands, join(",",@temp)); @temp = (); $prev++; $next = $sum - $prev; }

Replies are listed 'Best First'.
Re: Decomposing sum to unique sets of summands
by Narveson (Chaplain) on Oct 28, 2008 at 14:10 UTC

    The term for what you are doing is partitioning. To keep your Google search from returning office partitions and disk partitions and wallboard suppliers, say "partitions (number theory)". That's the subject heading in Wikipedia.

Re: Decomposing sum to unique sets of summands
by blokhead (Monsignor) on Oct 28, 2008 at 20:26 UTC
    Here is a slightly simpler iterator than jethro's, but surely it's doing the same thing. It avoids the recursion blowup of JavaFan.
    sub iterator { return unless @_; my @p = ($_[0]); return sub { ## collect all the trailing 2s, and last trailing 3 if present my $temp = 0; $temp += pop @p while @p and $p[-1] == 2; $temp += pop @p if @p and $p[-1] == 3; ## updated return if ! @p; ## reduce the last guy by 1, avoid total of 1 leftover $p[-1]--, $temp++; $p[-1]--, $temp++ if $temp == 1; ## redistribute collected amount, as large as possible ## (largest increments can be $p[-1]) if ($temp % $p[-1] == 0) { push @p, ($p[-1]) x ($temp/$p[-1]); ## special case to avoid 1 leftover } elsif ( $temp % $p[-1] == 1 ) { my $m = int ($temp/$p[-1]) -1; push @p, ($p[-1]) x $m, $p[-1]-1, 2; } else { my $m = int ($temp/$p[-1]) ; push @p, ($p[-1]) x $m, ($temp - $p[-1]*$m); } @p; } } my $iter = iterator(shift || 50); while (my @part = $iter->()) { print "@part\n"; }
    Like jethro's, there could be some efficiencies gained by keeping track of some pointers, but the main problem is that there are just so many partitions. So I think the problem statement should be clarified, especially if 10^6 is involved.

    Update: updated the marked line according to BrowserUk's suggestion. (added "@p and").

    blokhead

Re: Decomposing sum to unique sets of summands
by jethro (Monsignor) on Oct 28, 2008 at 19:25 UTC

    I don't even get what algorithm you try to implement here. But that splitting and joining tells me you are using the wrong data structure if speed is any concern. Also the best (to understand) solution would use recursion but sadly that is not the speediest solution.

    Below I use an array to store the latest set and iteratively calculate the next set from that one

    The interesting part is how to calculate the next step. The script is rather complicated because I put in some optimizations (a further optimization would be to remember where in the array the row of 2s begin)

    #!/usr/bin/perl # decompose sums to sets of summands use strict; use warnings; my $sum=80; my @sets; my @latest=($sum); my $higherthan3=0; #the position of the last number higher than 3 my $sumof3sand2s=0; #all the 3s and 2s summed up; while (1) { # @latest is always sorted from highest to lowest number #calculate the next set #1: special case 3 3 and a string of 2s my $i= $#latest; $i-- while ($i>0 and $latest[$i]==2); if ($i>0 and $latest[$i]==3 and $latest[$i-1]==3) { $latest[$i-1]=2; $latest[$i]=2; push @latest, 2; } else { #2: get the last number higher than 3. If there is none, stop $i= $higherthan3; last if ($i==-1); #3: if that number is also the last number at all, produce a 2 aft +er it if ($i==$#latest) { $latest[$i]-= 2; push (@latest, 2); $sumof3sand2s+=2; if ($latest[$i]<=3) { $higherthan3--; $sumof3sand2s+=$latest[$i]; } } #4: otherwise cut off at this point, decrease it and put the rest # to the right of it with the highest possible numbers else { $#latest=$i; $latest[$i]--; my $x= $latest[$i]; my $leftover= $sumof3sand2s+1; $sumof3sand2s=0; my $foundpoint=0; #is true after the higherthan3 point was fou +nd if ($x<=3) { $sumof3sand2s=$x+$leftover; $foundpoint++; $higherthan3= $i-1; } while($leftover>$x) { $i++; if ($leftover>$x+1) { push(@latest,$x); $leftover-= $x; } else { push @latest, $x-1; $leftover-= ($x-1); if ($x==4) { $sumof3sand2s=3+$leftover; $foundpoint++; $higherthan3= $i-1; } } } if ($leftover>1) { push(@latest, $leftover); if ($leftover>3) { $higherthan3=$#latest; } elsif (not $foundpoint) { $higherthan3= $#latest-1; $sumof3sand2s+= $leftover; } } die "Internal Error" if ($leftover==1); } } #push @sets, join(',',@latest); #print join(',',@latest),"\n"; } print join("\n",@sets),"\n"; print @sets+0,"\n"; #------------------

    The script is around 2.5 times faster than JavaFans version, but above $sum==70 memory becomes a problem. That can be avoided by doing further processing of a set in place instead of collecting all sets into an array. But above $sum==100 time will be a constraint too. The numbers (without storing all sets into an array) on my machine:

    n sets time ------------------------- 50 30700 0.37s 70 500k 4.5s 75 1M 6.5s 80 2M 12s 100 21M 2min
Re: Decomposing sum to unique sets of summands
by swampyankee (Parson) on Oct 28, 2008 at 15:10 UTC

    It's called partitioning. Look here.


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

Re: Decomposing sum to unique sets of summands
by JavaFan (Canon) on Oct 28, 2008 at 14:15 UTC
    sub perms { my ($n, $min) = @_; $min //= 2; return [] if $n <= 0; return [$n] if $n <= $min; my @r; foreach my $i ($min .. $n) { my $left = $n - $i; next if $left < $min; my @p = perms $left, $i; foreach my $p (@p) { push @r, [@$p, $i]; } } @r; } say "[@$_"] for perms 7; __END__ [2 3 2] [3 4] [2 5]
    It's not fast, but that's because of the enormous amount of ways to decompose numbers.

      The post says:

      And this needs to be very fast since in use it's going to have to handle sums of up to 10^6.

      I don’t think your program can be used for this; it only works for small integers.

      The so-called partition function p(n) represents the number of possible partitions of a natural number n (distinct and order independent). Unfortunately p(n) grows rapidly, see Partition. Try to run the program on n=1000.

        I don’t think your program can be used for this; it only works for small integers.
        It seems unfair to call this a problem with JavaFan's solution. As you point out, the bottleneck is mathematical—there are so many partitions—rather than programmatic, so any program that does what the poster seems to want will be slow on large numbers. Maybe the poster only needs a few partitions, or knows something about the data-sets, so that some optimisations can be applied?
        Yes, I don't claim it's fast. And I don't think you can do it fast in Perl - you'll have to use some C or other fast language. And even then, you'd still have to convert all the numbers back into Perl numbers to process them.

        For the interested, the number of such partitions is A083751 in the OEIS.

Re: Decomposing sum to unique sets of summands
by swampyankee (Parson) on Oct 30, 2008 at 16:35 UTC

    I found a couple of algorithms (expressed in Fortran, but it should be easy to convert1 them to Perl ;-)) on Netlib, specifically 403 and 448. The algortithms were published in ACM's Collected Algorithms and Transactions on Mathematical Software in the early 1970's; both CALGO and TOMS are refereed publications, so at least one person other than the author has vetted the code.


    1 I'll try to do that tonight.


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc