in reply to Re: Calculated % of data greater than $x in an array
in thread Calculated % of data greater than $x in an array

Comparison-based sorting is at least O(n log n); checking each element is O(n). You don't save anything here by sorting.

Corion: why are you trying to avoid looping? Even if you write some code to solve this problem that doesn't look like it's looping, trust me, it is.
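
A minimal sketch of the single pass, with made-up @data and cutoff $x standing in for whatever the real inputs are:

  use strict;
  use warnings;

  my @data = map { int rand 100 } 1 .. 1_000;   # made-up sample data
  my $x    = 50;                                # made-up cutoff

  # grep visits every element; the loop is just hidden inside it
  my $over = grep { $_ > $x } @data;
  my $pct  = @data ? $over / @data * 100 : 0;
  printf "%.1f%% of %d values are greater than %d\n", $pct, scalar @data, $x;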

Re^3: Calculated % of data greater than $x in an array
by ikegami (Patriarch) on Mar 05, 2010 at 18:29 UTC
    O() only speaks of scalability. It doesn't show that nothing was saved. It wouldn't surprise me to find that Corion's approach is faster, even for long lists. Besides, speed is not the only factor that affects the choice of tools. That said, I don't see the point of sorting here.
      for our $n ( map 10**$_, 1, 3, 6 ) {
          print "Benchmarking $n values";
          our @n = map int( rand $n ), 1 .. $n;
          cmpthese -log $n, {
              sort => q[
                  my $t = $n >> 1;
                  my @m = sort{ $a<=>$b } @n;
                  my $c = 0;
                  $c++ while $m[ $c ] <= $t;
                  my $pct = ( @m - $c ) / @m * 100;
              ],
              grep => q[
                  my $t = $n >> 1;
                  my $pct = grep({ $_ > $t } @n ) / @n * 100;
              ],
          };
      };;

      Benchmarking 10 values
                Rate sort grep
      sort  285546/s   -- -62%
      grep  751484/s 163%   --
      Benchmarking 1000 values
               Rate sort grep
      sort   3525/s   -- -71%
      grep  12328/s 250%   --
      Benchmarking 1000000 values
             Rate sort grep
      sort  1.13/s   -- -90%
      grep  11.4/s 911%   --
        Seeing the amount of Perl code involved outside the sort, that makes sense. Thanks. Note that sorting is still faster than the original:
        use strict;
        use warnings;
        use Benchmark qw( cmpthese );

        for our $n ( map 10**$_, 1, 3, 6 ) {
            print "Benchmarking $n values\n";
            our @n = map int( rand $n ), 1 .. $n;
            cmpthese -log $n, {
                orig => q[
                    my $t = $n >> 1;
                    my $c = 0;
                    for my $elem (@n) {
                        if ($elem > $t) { $c++; }
                    }
                    # percentage greater than the threshold
                    my $pct = $c / @n * 100;
                ],
                sort => q[
                    my $t = $n >> 1;
                    @n = sort{ $a<=>$b } @n;
                    my $c = 0;
                    $c++ while $n[ $c ] <= $t;
                    my $pct = ( @n - $c ) / @n * 100;
                ],
                grep => q[
                    my $t = $n >> 1;
                    my $pct = grep({ $_ > $t } @n ) / @n * 100;
                ],
            };
        }
        Benchmarking 10 values
                    Rate orig sort grep
        orig    272566/s   -- -11% -41%
        sort    306129/s  12%   -- -34%
        grep    462968/s  70%  51%   --
        Benchmarking 1000 values
                  Rate orig grep sort
        orig    4332/s   -- -42% -44%
        grep    7459/s  72%   --  -4%
        sort    7795/s  80%   5%   --
        Benchmarking 1000000 values
        ^C

        There is a lot of variation in the results due to the random input. The first test with orig that I ran actually showed sort as the fastest, although it usually isn't.
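
        One way to take the random input out of the picture (not something the benchmark above does) is to seed the generator so every run sees the same @n:

          srand 42;                               # fixed seed: identical @n on every run
          our @n = map int( rand $n ), 1 .. $n;   # same line as in the benchmark above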

Re^3: Calculated % of data greater than $x in an array
by Corion (Patriarch) on Mar 05, 2010 at 18:23 UTC

    I know that you don't save much by sorting instead of looping through the whole array, but if you want other ranks/n-tiles besides just this one, operating on a sorted list is easier.
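
    For example, something along these lines pulls both the "greater than $x" percentage and a few n-tiles out of one sorted copy. It is only a sketch: @data, $x, and the nearest-rank percentile rule are stand-ins, not anything from the thread.

      use strict;
      use warnings;
      use List::Util qw( first );

      my @data   = map { int rand 1000 } 1 .. 10_000;   # made-up sample data
      my $x      = 500;                                 # made-up threshold
      my @sorted = sort { $a <=> $b } @data;

      # index of the first value greater than $x (undef if there is none)
      my $idx = first { $sorted[$_] > $x } 0 .. $#sorted;
      my $pct = defined $idx ? ( @sorted - $idx ) / @sorted * 100 : 0;
      printf "%.1f%% of the values are greater than %d\n", $pct, $x;

      # once the list is sorted, other n-tiles are a lookup away
      # (nearest-rank rule; other percentile definitions exist)
      for my $p ( 25, 50, 75, 90 ) {
          my $i = int( $p / 100 * $#sorted + 0.5 );
          printf "%2d%%-ile: %d\n", $p, $sorted[$i];
      }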