PWC 244 task 2 in linear time

Disclaimer: it's clickbait. The plot is curved, solution isn't linear, despite lack of nested loops, -- but fast.

Task 2: Group Hero
Submitted by: Mohammad S Anwar

You are given an array of integers representing the strength.

Write a script to return the sum of the powers of all possible 
combinations; power is defined as the square of the largest number 
in a sequence, multiplied by the smallest.

Example 1

Input: @nums = (2, 1, 4)
Output: 141

Group 1: (2) => square(max(2)) * min(2) => 4 * 2 => 8
Group 2: (1) => square(max(1)) * min(1) => 1 * 1 => 1
Group 3: (4) => square(max(4)) * min(4) => 16 * 4 => 64
Group 4: (2,1) => square(max(2,1)) * min(2,1) => 4 * 1 => 4
Group 5: (2,4) => square(max(2,4)) * min(2,4) => 16 * 2 => 32
Group 6: (1,4) => square(max(1,4)) * min(1,4) => 16 * 1 => 16
Group 7: (2,1,4) => square(max(2,1,4)) * min(2,1,4) => 16 * 1 => 16

Sum: 8 + 1 + 64 + 4 + 32 + 16 + 16 => 141

I waited for a recap to be published to write this. Most solutions were built around Algoritm::Combinatorics or similar. The one I have (accidentally) chosen, below, uses different module, I didn't investigate if A::C would be faster. They are all prohibitively slow for array of a couple dozen items or so. Some solutions avoid CPAN modules, but employ, technically, the same combinatorics rolled out manually. They may be ten-fold faster, but scale exponentially all the same.

Later in the week, significantly more optimal, no-combinatorics, solutions were added; I think there were 2-3 of them; one is benchmarked below and is self-described to be "quadratic". It easily can process arrays of hundreds of items -- with help of bigint, of course. With input as such, the only interest is in speed itself, I don't think any practical application would calculate "strength" as big integer (bigint is required for arrays of relatively small integers, starting from approx. 50 items).

Subrotines code to compare to was taken from PWC GH repository; I only changed sub names and localized bigint scope to sub body. Section titles could also be more imaginative and descriptive, sorry about that. The benchmarking sub is a bit rough and ad-hoc.

The result:

*******************************
*** Math::Combinatorics *******
*******************************

+------------+------------+
| Array size | Time, sec. |
+------------+------------+
| 10         | 0.010      |
| 11         | 0.031      |
| 12         | 0.063      |
| 13         | 0.135      |
| 14         | 0.276      |
| 15         | 0.578      |
| 16         | 1.198      |
| 17         | 2.474      |
| 18         | 5.104      |
+------------+------------+
+
       +----------------------------------------------------------+
       |    +     +     +     +      +     +     +     +     +    |
    10 |-+                                                      +-|
       |+                                                        +|
       |+                                                    *   +|
       |+                                                    *   +|
       |                                                     *    |
       |+                                              *     *   +|
       |+                                              *     *   +|
       |                                               *     *    |
       |                                         *     *     *    |
     1 |-+                                       *     *     *  +-|
       |+                                        *     *     *   +|
       |+                                  *     *     *     *   +|
       |+                                  *     *     *     *   +|
       |+                                  *     *     *     *   +|
       |+                            *     *     *     *     *   +|
       |+                            *     *     *     *     *   +|
       |                             *     *     *     *     *    |
       |                      *      *     *     *     *     *    |
   0.1 |-+                    *      *     *     *     *     *  +-|
       |+                     *      *     *     *     *     *   +|
       |+               *     *      *     *     *     *     *   +|
       |+               *     *      *     *     *     *     *   +|
       |+               *     *      *     *     *     *     *   +|
       |+         *     *     *      *     *     *     *     *   +|
       |+         *     *     *      *     *     *     *     *   +|
       |          *     *     *      *     *     *     *     *    |
       |          *     *     *      *     *     *     *     *    |
  0.01 |-+  *     *     *     *      *     *     *     *     *  +-|
       |+   *     *     *     *      *     *     *     *     *   +|
       +----------------------------------------------------------+
            10    11    12    13     14    15    16    17    18



*******************************
*** Combinatorics (manual) ****
*******************************

+------------+------------+
| Array size | Time, sec. |
+------------+------------+
| 13         | 0.016      |
| 14         | 0.021      |
| 15         | 0.052      |
| 16         | 0.104      |
| 17         | 0.224      |
| 18         | 0.464      |
| 19         | 0.990      |
| 20         | 2.031      |
| 21         | 4.219      |
+------------+------------+
+
       +----------------------------------------------------------+
       |    +     +     +     +      +     +     +     +     +    |
    10 |-+                                                      +-|
       |+                                                        +|
       |+                                                        +|
       |+                                                    *   +|
       |                                                     *    |
       |+                                                    *   +|
       |+                                              *     *   +|
       |                                               *     *    |
       |                                               *     *    |
     1 |-+                                       *     *     *  +-|
       |+                                        *     *     *   +|
       |+                                        *     *     *   +|
       |+                                  *     *     *     *   +|
       |+                                  *     *     *     *   +|
       |+                                  *     *     *     *   +|
       |+                            *     *     *     *     *   +|
       |                             *     *     *     *     *    |
       |                             *     *     *     *     *    |
   0.1 |-+                    *      *     *     *     *     *  +-|
       |+                     *      *     *     *     *     *   +|
       |+                     *      *     *     *     *     *   +|
       |+               *     *      *     *     *     *     *   +|
       |+               *     *      *     *     *     *     *   +|
       |+               *     *      *     *     *     *     *   +|
       |+         *     *     *      *     *     *     *     *   +|
       |    *     *     *     *      *     *     *     *     *    |
       |    *     *     *     *      *     *     *     *     *    |
  0.01 |-+  *     *     *     *      *     *     *     *     *  +-|
       |+   *     *     *     *      *     *     *     *     *   +|
       +----------------------------------------------------------+
            13    14    15    16     17    18    19    20    21



*******************************
*** Min-max (quadratic) *******
*******************************

+------------+------------+
| Array size | Time, sec. |
+------------+------------+
| 20         | 0.036      |
| 40         | 0.156      |
| 60         | 0.354      |
| 80         | 0.635      |
| 100        | 1.010      |
| 120        | 1.474      |
| 140        | 2.011      |
| 160        | 2.651      |
| 180        | 3.380      |
+------------+------------+
+
      +-----------------------------------------------------------+
      |    +     +      +     +     +     +      +     +     +    |
  3.5 |-+                                                       +-|
      |                                                      *    |
      |                                                      *    |
      |                                                      *    |
    3 |-+                                                    *  +-|
      |                                                      *    |
      |                                                *     *    |
  2.5 |-+                                              *     *  +-|
      |                                                *     *    |
      |                                                *     *    |
      |                                                *     *    |
    2 |-+                                        *     *     *  +-|
      |                                          *     *     *    |
      |                                          *     *     *    |
      |                                          *     *     *    |
  1.5 |-+                                 *      *     *     *  +-|
      |                                   *      *     *     *    |
      |                                   *      *     *     *    |
      |                                   *      *     *     *    |
    1 |-+                           *     *      *     *     *  +-|
      |                             *     *      *     *     *    |
      |                             *     *      *     *     *    |
      |                       *     *     *      *     *     *    |
  0.5 |-+                     *     *     *      *     *     *  +-|
      |                 *     *     *     *      *     *     *    |
      |          *      *     *     *     *      *     *     *    |
      |    *     *      *     *     *     *      *     *     *    |
    0 |-+  *     *      *     *     *     *      *     *     *  +-|
      |    +     +      +     +     +     +      +     +     +    |
      +-----------------------------------------------------------+
           20    40     60    80   100   120    140   160   180



*******************************
*** Faster (linear?) **********
*******************************

+------------+------------+
| Array size | Time, sec. |
+------------+------------+
| 800        | 0.297      |
| 1200       | 0.484      |
| 1600       | 0.698      |
| 2000       | 0.948      |
| 2400       | 1.245      |
| 2800       | 1.599      |
| 3200       | 2.021      |
| 3600       | 2.516      |
| 4000       | 3.083      |
+------------+------------+
+
      +-----------------------------------------------------------+
      |       +       +       +       +      +       +       +    |
      |                                                           |
    3 |-+                                                    *  +-|
      |                                                      *    |
      |                                                      *    |
      |                                                      *    |
      |                                                      *    |
  2.5 |-+                                              *     *  +-|
      |                                                *     *    |
      |                                                *     *    |
      |                                                *     *    |
      |                                          *     *     *    |
    2 |-+                                        *     *     *  +-|
      |                                          *     *     *    |
      |                                          *     *     *    |
      |                                   *      *     *     *    |
  1.5 |-+                                 *      *     *     *  +-|
      |                                   *      *     *     *    |
      |                                   *      *     *     *    |
      |                             *     *      *     *     *    |
      |                             *     *      *     *     *    |
    1 |-+                     *     *     *      *     *     *  +-|
      |                       *     *     *      *     *     *    |
      |                       *     *     *      *     *     *    |
      |                 *     *     *     *      *     *     *    |
      |                 *     *     *     *      *     *     *    |
  0.5 |-+        *      *     *     *     *      *     *     *  +-|
      |    *     *      *     *     *     *      *     *     *    |
      |    *     *      *     *     *     *      *     *     *    |
      |    *  +  *    + *     *     * +   *  +   *   + *     *    |
      +-----------------------------------------------------------+
     500     1000    1500    2000    2500   3000    3500    4000
[download]

Code:

use strict;
use warnings;
use experimental qw{ say postderef signatures state };

use Benchmark qw/ timeit :hireswallclock /;
use POSIX qw/ floor ceil log10 /;
use File::Spec::Functions 'catfile';
use File::Basename 'dirname';

use List::Util qw{ min max };
use Math::Combinatorics;

use Term::Table;
use Chart::Gnuplot;

my $GP = catfile dirname( $^X ), '../../c/bin/gnuplot.exe';
die unless -x $GP;              # assume Strawberry Perl

say '
*******************************
*** Math::Combinatorics *******
*******************************
';
benchmark( \&group_hero_dave_jacoby, [ 10 .. 18 ], 'logscale' );
say '
*******************************
*** Combinatorics (manual) ****
*******************************
';
benchmark( \&group_hero_e_choroba, [ 13 .. 21 ], 'logscale' );
say '
*******************************
*** Min-max (quadratic) *******
*******************************
';
benchmark( \&group_hero_jo_37, [ map $_ * 20, 1 .. 9 ], '' );
say '
*******************************
*** Faster (linear?) **********
*******************************
';
benchmark( \&group_hero_mine, [ map $_ * 400, 2 .. 10 ], '' );

sub benchmark( $coderef, $x, $logscale ) {
    my @xdata = @$x;
    my @ydata;
    for my $size ( @xdata ) {
        my $t = timeit 3, sub { $coderef-> ( 1 .. $size ) };
        push @ydata, $t-> [ 1 ] / $t-> [ -1 ];
    }

    my $table = Term::Table-> new(
        header => [ 'Array size', 'Time, sec.' ],
        rows   => [ 
            map [ $xdata[ $_ ], sprintf '%.3f', $ydata[ $_ ]], 
                0 .. $#xdata
        ],
    );
    say for $table-> render;

    my $chart = Chart::Gnuplot-> new( 
        gnuplot  => $GP,
        terminal => 'dumb size 70, 35',
    );
    my $dataset = Chart::Gnuplot::DataSet-> new(
        xdata => \@xdata,
        ydata => \@ydata,
        style => 'impulses',
    );
    my $dx = .1 * ( max( @xdata ) - min( @xdata ));
    my $dy = .1 * ( max( @ydata ) - min( @ydata ));
    $chart-> set( 
        xrange => [ -$dx + min( @xdata ), $dx + max( @xdata )],
        yrange => [ -$dy + min( @ydata ), $dy + max( @ydata )],
    );
    if ( $logscale ) {
        $chart-> set( 
            logscale => 'y', 
            yrange   => [ 
                10 ** ( -.2 + floor log10 min( @ydata )),
                10 ** (  .2 + ceil  log10 max( @ydata ))
            ],
        )
    }
    $chart-> plot2d( $dataset )
}

sub group_hero_dave_jacoby (@input) {
    my $output = 0;
    for my $c ( 1 .. scalar @input ) {
        my $comb = Math::Combinatorics->new( count => $c, data => [@in
+put], );
        while ( my @combo = $comb->next_combination ) {
            my $min = min @combo;
            my $max = max @combo;
            my $str = $max**2 * $min;
            $output += $str;
        }
    }
    return $output;
}

sub group_hero_e_choroba(@nums) {
    my @indicator = (0) x @nums;
    $indicator[-1] = 1;
    my $sum = 0;
    while (1) {
        my @group = @nums[grep $indicator[$_], 0 .. $#nums];
        $sum += max(@group) ** 2 * min(@group);

        my $i = $#indicator;
        $indicator[$i--] = 0 while $indicator[$i];
        ++$indicator[$i];
        last if $i < 0;
    }
    return $sum
}

sub group_hero_jo_37 {
    use bigint;
    
    my @s = sort {$a <=> $b} @_;
    my $power;
    while (defined (my $min = $s[0])) {
        while (my ($offs, $max) = each @s) {
            $power += $min * $max**2 * ($offs ? 2**($offs - 1) : 1);
        }
    } continue {
        shift @s;
    }

    $power;
}

sub group_hero_mine {
    use bigint;
    
    my @nums = sort { $a <=> $b } @_;
    
    my $big = 0;
    for my $i ( 0 .. $#nums - 1 ) {
        $big += $nums[ $i ] * 2 ** ( $#nums - $i- 1 )
    }
    my $sum = 0;
    for my $i ( reverse 0 .. $#nums ) {
        $sum += ( $big + $nums[ $i ]) * $nums[ $i ] ** 2;
        $big -= $nums[ $i - 1 ];
        $big /= 2
    }
    return $sum
}
[download]

How? Perhaps it's better to illustrate with a picture, drawn with pen and paper. Number of combinations is 2**n. For 5-items sorted array, we are looking for sum of these terms:

f(4,0)*2**3  f(3,0)*2**2  f(2,0)*2**1  f(1,0)*2**0  f(0,0)*2**0
f(4,1)*2**2  f(3,1)*2**1  f(2,1)*2**0  f(1,1)*2**0
f(4,2)*2**1  f(3,2)*2**0  f(2,2)*2**0
f(4,3)*2**0  f(3,3)*2**0
f(4,4)*2**0
[download]

where f(i,j) is function of items indexed i, j. Direct translation to code:

sub func { $_[0] ** 2 * $_[1] }
sub group_hero_blunt {
    use bigint;
    
    my @nums = sort { $a <=> $b } @_;
    
    my $sum = 0;
    for my $i ( 0 .. $#nums ) {
        for my $j ( 0 .. $i - 1 ) {
            $sum += func( $nums[ $i ], $nums[ $j ]) *
                2 ** ( $i - $j - 1 )
        }
        $sum += func( $nums[ $i ], $nums[ $i ])
    }
    return $sum
}
[download]

With function inlined and square factored out, it's ~2x faster:

sub group_hero_better {
    use bigint;
    
    my @nums = sort { $a <=> $b } @_;
    
    my $sum = 0;
    for my $i ( 0 .. $#nums ) {
        my $s = 0;
        for my $j ( 0 .. $i - 1 ) {
            $s += $nums[ $j ] * 2 ** ( $i - $j - 1 )
        }
        $sum += ( $s + $nums[ $i ]) * $nums[ $i ] ** 2
    }
    return $sum
}
[download]

But looking at picture above, it's not difficult to note that nested loops aren't required. Pre-calculate one big number, and then chip away small (or not so "small") pieces in one loop. Which leads to further ~10x speed improvement, and is what has been benchmarked with 4000 items array.

Comment on PWC 244 task 2 in linear time Select or Download Code

Replies are listed 'Best First'.
Re: PWC 244 task 2 in linear time by LanX (Saint) on Nov 28, 2023 at 15:58 UTC
> powers of all possible combinations; power is defined as the square of the largest number in a sequence My problem with this challenge definition is that a sequence is an ordered set, much like Perl arrays. But it seems to me the solutions presented are based on unordered subsets. For clarification: `[2,3,4,5]` may be a subset of `[0 ..10]` and there is only one subset of length 4 possible with 2 minimal and 5 maximal. But 2 sequences for length 4 � `[2,3,4,5]` `[2,4,3,5]` Am I mistaken and is this a case of TL;DR? Thanks for clarification. Cheers Rolf _{(addicted to the Perl Programming Language :) see Wikisyntax for the Monastery} PS: Mohammed's example doesn't help because you need longer arrays to see the difference. Update �) And generally n! for length n+2	[reply] [d/l] [select]
Re^2: PWC 244 task 2 in linear time by jo37 (Curate) on Nov 28, 2023 at 21:06 UTC
There is a big charm in the ambiguities in the PWC tasks, IMHO. Don't take the task description too serious. Just find an interpretation of your own. It's fun! Greetings, -jo `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply]
Re^3: PWC 244 task 2 in linear time by LanX (Saint) on Nov 28, 2023 at 21:18 UTC
"Interpretation of your own" becomes particular "charming" when np-complete problems are solved in polynomial time. ;) Anyway, my question wasn't answered, which problem was solved here? Seeing terms of 2**n let's me think it's not for sub sequences.(?) Cheers Rolf _{(addicted to the Perl Programming Language :) see Wikisyntax for the Monastery}	[reply]
Re: PWC 244 task 2 in linear time by jo37 (Curate) on Nov 28, 2023 at 20:42 UTC
The linear time solution was not submitted to PWC yet. What a pity! Greetings, -jo `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply]
Re: PWC 244 task 2 in linear time by jo37 (Curate) on Nov 28, 2023 at 09:27 UTC
Cool! Greetings, -jo `$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$`	[reply]

Update