Update: Greatly simplified the else case.

Try this:

#! perl -slw
use strict;
use Data::Dump qw[ pp ];
use List::Util qw[ sum ];

sub rankSums {
    my( $aRef, $bRef ) = @_;
    my( $aSum, $bSum ) = (0) x 2;
    my( $a, $b ) = (0) x 2;
    my $rank = 1;
    while( $a < @$aRef && $b < @$bRef ) {
        if( $aRef->[ $a ] < $bRef->[ $b ] ) {
            $aSum += $rank++;
            ++$a;
        }
        elsif( $aRef->[ $a ] > $bRef->[ $b ] ) {
            $bSum += $rank++;
            ++$b
        }
        else {
            $aSum += ( $rank * 2 + 1 ) / 2;
            $bSum += ( $rank * 2 + 1 ) / 2;
            $rank += 2;
            ++$a, ++$b;
        }
    }
    $aSum += $rank++ while $a++ < @{ $aRef };
    $bSum += $rank++ while $b++ < @{ $bRef };
    return $aSum, $bSum;
}

my @a = split ' ', <DATA>;
my @b = split ' ', <DATA>;
my( $aSum, $bSum ) = rankSums( \@a, \@b );
print "asum:$aSum bSum:$bSum";

#__DATA__
#1 3 5 7 9
#2 4 6 8 10
#__DATA__
#1 2 3 4 5 6 7 8 9 10
#3.14 4.25 5.36 6.47 7.58
__DATA__
1 2 3 4 5
3 3.14 4 4

It makes a single pass over the data (both lists must already be sorted in ascending order) and does no copying, sorting, or memory allocation, so it should be considerably faster than the current method. Fully testing and benchmarking it is your task :)
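As a starting point for that testing, here is a minimal sketch of a slow, sort-based cross-check. The name rankSumsRef is hypothetical and not part of the code above; it assumes tied values should each receive the average of the ranks they occupy (midranks). On the last sample data set it returns 22.5 for both lists, whereas tracing rankSums by hand on the same data gives 22 and 23, because its tie branch pairs only one element from each list. That difference is worth checking against your requirements.

```perl
use strict;
use warnings;

# Hypothetical reference helper (not from the original post): ranks the pooled
# values by sorting, gives each run of equal values its average rank, and
# returns the rank sum of each input list.
sub rankSumsRef {
    my( $aRef, $bRef ) = @_;

    # Tag each value with the index of the list it came from (0 = a, 1 = b).
    my @pool = sort { $a->[0] <=> $b->[0] }
        ( map { [ $_, 0 ] } @$aRef ), ( map { [ $_, 1 ] } @$bRef );

    my @sums = ( 0, 0 );
    my $i = 0;
    while( $i < @pool ) {
        # Find the last index of the run of values equal to $pool[$i].
        my $j = $i;
        ++$j while $j < $#pool && $pool[ $j + 1 ][0] == $pool[ $i ][0];

        # The 1-based ranks $i+1 .. $j+1 average to ( $i + $j + 2 ) / 2.
        my $mid = ( $i + $j + 2 ) / 2;
        $sums[ $pool[ $_ ][1] ] += $mid for $i .. $j;
        $i = $j + 1;
    }
    return @sums;
}

my( $aSum, $bSum ) = rankSumsRef( [ 1, 2, 3, 4, 5 ], [ 3, 3.14, 4, 4 ] );
print "asum:$aSum bSum:$bSum\n";    # asum:22.5 bSum:22.5
```

Because it sorts and copies, this is only meant for validating the fast version on small or random inputs, not for production use.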

A further simplified version of the function that runs a tad quicker and has been tested with 1000 runs of 1e6 x 1e6 random integers:

sub rankSums3 {
    my( $aRef, $bRef ) = @_;
    my( $aSum, $bSum ) = (0) x 2;
    my( $a, $b ) = (0) x 2;
    my $rank = 1;
    while( $a < @$aRef && $b < @$bRef ) {
        $aSum += $rank++, ++$a, next if $aRef->[ $a ] < $bRef->[ $b ];
        $bSum += $rank++, ++$b, next if $aRef->[ $a ] > $bRef->[ $b ];
        $aSum += ( $rank * 2 + 1 ) / 2;
        $bSum += ( $rank * 2 + 1 ) / 2;
        $rank += 2;
        ++$a, ++$b;
    }
    $aSum += $rank++ while $a++ < @{ $aRef };
    $bSum += $rank++ while $b++ < @{ $bRef };
    return $aSum, $bSum;
}
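The equal-values branch above consumes exactly one element from each list per tie. If your data can contain longer runs of equal values (duplicates within one list, or three or more equal values across both), the following is a hedged sketch of a variant that keeps the single merge pass but gives every run its average rank. The name rankSumsTies is hypothetical, it assumes midranks are what you want, and it has not been benchmarked against rankSums3.

```perl
use strict;
use warnings;

# Hypothetical variant (not from the original post): one merge pass over two
# ascending-sorted lists that gives every run of equal values, however it is
# split between the lists, the average of the ranks it occupies.
sub rankSumsTies {
    my( $aRef, $bRef ) = @_;
    my( $aSum, $bSum ) = ( 0, 0 );
    my( $i, $j ) = ( 0, 0 );
    my $rank = 1;

    while( $i < @$aRef || $j < @$bRef ) {
        # Pick the smallest unconsumed value across both lists.
        my $v;
        if(    $i >= @$aRef ) { $v = $bRef->[ $j ] }
        elsif( $j >= @$bRef ) { $v = $aRef->[ $i ] }
        else {
            $v = $aRef->[ $i ] < $bRef->[ $j ] ? $aRef->[ $i ] : $bRef->[ $j ];
        }

        # Consume and count every copy of that value in each list.
        my( $na, $nb ) = ( 0, 0 );
        ++$i, ++$na while $i < @$aRef && $aRef->[ $i ] == $v;
        ++$j, ++$nb while $j < @$bRef && $bRef->[ $j ] == $v;

        # The run occupies ranks $rank .. $rank + $n - 1; share their average.
        my $n   = $na + $nb;
        my $mid = $rank + ( $n - 1 ) / 2;
        $aSum += $na * $mid;
        $bSum += $nb * $mid;
        $rank += $n;
    }
    return $aSum, $bSum;
}

my( $aSum, $bSum ) = rankSumsTies( [ 1, 2, 3, 4, 5 ], [ 3, 3.14, 4, 4 ] );
print "asum:$aSum bSum:$bSum\n";    # asum:22.5 bSum:22.5
```

When no value repeats, every run has length one and the result matches the simple rank-by-rank merge, so the extra bookkeeping only matters for tied data.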

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re: Seeking a fast sum_of_ranks_between function (1e6 x 1e6 in 1/5th sec.) by BrowserUk
in thread Seeking a fast sum_of_ranks_between function by msh210
