in reply to Divide array of integers into most similar value halves

Note: This only works with positive integers!

Try this. It uses a semi-random approach with a specifiable limit to bound the attempts it makes. It seems to do a pretty good job of finding the optimium solution most of the time. It will occasionally miss, but when it does, it still delivers a close to optimium result:

#! perl -slw use strict; use List::Util qw[ sum shuffle ]; sub partition { my( $limit, $aRef ) = @_; my @in = sort{ $a <=> $b } @$aRef; my $target = sum( @in ) >> 1; my( $best, @best ) = 9e99; my $soFar = 0; my @half; for( 1 .. $limit ) { #print "$soFar : [@half] [@in] [@best]"; <>; $soFar += $in[ 0 ], push @half, shift @in while $soFar < $targ +et; return( \@half, \@in ) if $soFar == $target; my $diff = abs( $soFar - $target ); ( $best, @best ) = ( $diff, @half ) if $diff < $best; $soFar -= $half[ 0 ], push @in, shift @half while $soFar > $ta +rget; return( \@half, \@in ) if $soFar == $target; $diff = abs( $soFar - $target ); ( $best, @best ) = ( $diff, @half ) if $diff < $best; @in = shuffle @in; } my %seen; $seen{ $_ }++ for @best; ## return \@best, [ grep !$seen{ $_ }--, @$aRef ]; ## Fix duplicate +s bug return \@best, [ grep{ !exists $seen{ $_ } or !$seen{ $_ }-- } @$a +Ref ]; } for ( [ 62, 83, 121, 281, 486, 734, 771, 854, 885, 1003 ], [ 7,10,12,15,40 ], [ 8, 14, 32, 29 ], [ 41, 37, 37, 43 ], [ 99, (1)x99 ], [ 1 .. 99 ], [ map $_*2+1, 1 .. 50 ], [ map int( rand 1000 ), 1 .. 100 ], ) { my( $a1, $a2 ) = partition( 1e2, $_ ); my( $t1, $t2 ) = ( sum( @$a1), sum( @$a2 ) ); print "\n@$a1 := ", $t1; print "@$a2 := ", $t2; print "Diff: ", abs( $t1 - $t2 ); }

The limit is specified as an input parameter. Above I'm using a hardcoded limit of 100, but that might be better specified in terms of the number of input values. Say, @input * 10 might be a starting point. The longer you're prepared to wait, the better solution it will generally find. It will return quickly if a perfect solution is found.

Testing is limited to what you see above. Anyone know of any particularly hard cases?

c:\test>708290.pl 734 771 854 281 := 2640 121 62 83 885 486 1003 := 2640 Diff: 0 7 10 12 15 := 44 40 := 40 Diff: 4 8 32 := 40 14 29 := 43 Diff: 3 37 41 := 78 37 43 := 80 Diff: 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 +1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 := 99 99 := 99 Diff: 0 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 +30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 +53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 := 2475 94 1 72 99 2 81 96 86 79 74 91 85 92 89 76 71 98 88 93 97 77 84 78 90 +73 75 95 3 4 87 80 83 82 := 2475 Diff: 0 63 65 67 69 71 73 101 79 99 11 75 91 81 87 95 5 37 13 41 77 := 1300 27 29 3 33 21 53 85 61 23 17 47 49 45 51 35 89 7 55 43 57 59 15 25 31 +39 9 93 83 97 19 := 1300 Diff: 0 248 28 269 882 847 389 242 192 484 519 123 259 528 363 410 722 110 34 +501 922 695 622 831 48 464 234 548 815 702 725 403 629 29 579 365 81 +294 47 927 879 639 470 902 318 313 736 228 230 750 38 620 77 702 426 +582 := 25020 923 457 738 7 847 234 696 334 200 989 231 744 528 402 619 485 791 272 +261 533 525 298 724 615 253 71 957 836 904 105 575 929 537 802 791 94 +1 437 538 740 165 229 528 867 609 753 := 25020 Diff: 0

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Replies are listed 'Best First'.
Re^2: Divide array of integers into most similar value halves
by Pepe (Sexton) on Sep 03, 2008 at 15:19 UTC
    Sorry to disappoint you BrowserUk,
    your sub does not work if a number is repeated in the array.
    Try @array = (33,33,37);
    The solution is good in every other case, though. Only need that to be fixed.
    Thanks a lot for your effort, anyways.
    Pepe

      Thanks. That is indeed a bug. The (lightly tested) fix is to change the last line of the sub to:

      return \@best, [ grep{ !exists $seen{ $_ } or !$seen{ $_ }-- } @$a +Ref ];

      Basically, because the sub only stores one best partition, when it runs out of iterations, it needs to filter the input array to generate teh other half. The filter was a sloppy variation on a theme that works for other purposes but not this. I think the above corrects it, but I will need to give more thought once my brain has awoken.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        BrowserUk,
        Your algorithm is the best I found so far, but it still has the bug with repeated numbers.
        I came out with a suboptimal solution, but in more than 100.000 arrays your code wins in every case the answer are different and there are no repeated elements.
        Now it does it only with more than 2 repetitions as in @array = (57,57,57,43,32). Would there be any way of solving that bug for any array no matter how many repeated elements it contains?
        I've tried but I don't seem to be able to do it. Your code is really tight and it's difficult for me to modify it.
        Sorry for the bugging. I really appreciate your efforts.
        Pepe
        I'll try it for you and report on it when I'm done.
        Maybe tomorrow.
        Thanks again fro your effort