in reply to Re^4: Data range detection?
in thread Data range detection?

The "expected" data set is the linear fit. The following script uses linear regression and the R^2 metric to calculate a measure of fit for each of your three data sets:

use strict;
use warnings;
use Statistics::LineFit;

sub fit {
    my $fit = Statistics::LineFit->new();
    $fit->setData( @_ );
    return $fit->rSquared();
}

my @data = (
    [ qw( 5 5 34 44 114 169 177 184 270 339 361 364 442 511 530
          554 555 587 709 709 735 778 791 859 871 899 903 926 933 952 ) ],
    [ 0.5, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095, 8191,
      16383, 32767, 65535, 131071, 262143, 524287, 1048575, 2097151,
      4194303, 8388607, 16777215, 33554431, 67108863, 134217727,
      268435455, 536870911, 1073741823 ],
    [ 1.713125e-005, 1.748086e-006, 2.101463e-006, 1.977405e-006,
      3.597675e-006, 3.725492e-006, 3.924736e-006, 2.902199e-006,
      3.988645e-006, 8.210367e-006, 3.360837e-006, 5.202907e-006,
      7.082570e-006, 8.778026e-006, 7.079562e-005, 9.100576e-005,
      5.258545e-005, 9.292677e-005, 1.789815e-004, 2.113948e-003,
      7.229146e-004, 1.428995e-003, 2.742045e-003, 5.552746e-003,
      1.822390e-002, 2.220999e-002, 4.316067e-002, 8.876963e-002,
      1.751072e-001, 3.494051e-001, 7.155960e-001, 1.347822e+000 ]
);

print "    linear  loglinear     loglog\n";
for my $d (@data) {
    my @x = 1..@$d;
    my @logx = map log, @x;
    my @logd = map log, @$d;
    printf "%10.2f %10.2f %10.2f\n", fit( \@x, $d), fit( \@x, \@logd), fit( \@logx, \@logd );
}

The result is

    linear  loglinear     loglog
      0.99       0.69       0.95
      0.26       1.00       0.86
      0.26       0.90       0.58

which shows that the first data set is best described by a linear relationship, while the other two fit a log-linear (i.e. exponential) model better; in each row, the largest R^2 wins. If you have a stats package at hand (or even just Excel), you can do the same thing and visualize the results.
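
For anyone without Statistics::LineFit installed, here is a minimal sketch of the same "largest R^2 wins" comparison using only core Perl. It relies on the textbook fact that, for a straight-line fit with one predictor, R^2 is the squared Pearson correlation between x and y. The names r_squared and best_model are made up for this illustration and are not part of any module, and I have not checked the numbers against the module's output.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# R^2 of the least-squares line y = a + b*x, computed as the squared
# Pearson correlation of x and y (valid for a single predictor).
sub r_squared {
    my ($x, $y) = @_;
    my $n  = @$x;
    my $mx = sum(@$x) / $n;
    my $my = sum(@$y) / $n;
    my ($sxx, $syy, $sxy) = (0, 0, 0);
    for my $i (0 .. $n - 1) {
        my ($dx, $dy) = ($x->[$i] - $mx, $y->[$i] - $my);
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
        $sxy += $dx * $dy;
    }
    return 0 if $sxx == 0 or $syy == 0;    # constant input: no fit to speak of
    return $sxy**2 / ($sxx * $syy);
}

# Hypothetical helper: name of the transform with the largest R^2.
# Assumes evenly spaced x = 1..N and strictly positive data (because of log).
sub best_model {
    my ($y) = @_;
    my @x    = 1 .. @$y;
    my @logx = map { log $_ } @x;
    my @logy = map { log $_ } @$y;
    my %r2 = (
        linear    => r_squared(\@x,    $y),
        loglinear => r_squared(\@x,    \@logy),
        loglog    => r_squared(\@logx, \@logy),
    );
    my ($best) = sort { $r2{$b} <=> $r2{$a} } keys %r2;
    return ($best, \%r2);
}

# Synthetic exponential data, so the log-linear model should come out on top.
my ($name, $r2) = best_model([ map { 2**$_ } 1 .. 20 ]);
printf "%s wins with R^2 = %.2f\n", $name, $r2->{$name};
```

Note that a high R^2 only says which of the three candidate shapes fits best, not that any of them is a good model for the data.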

Re^6: Data range detection?
by BrowserUk (Patriarch) on Apr 13, 2015 at 19:12 UTC

    Sorry, but unless my eyes are deceiving me (quite possible), you don't appear to be fitting the data at all:

    my @x = 1..@$d;            ### Takes the values 1..30, 1..31, and 1..32
    my @logx = map log, @x;    ### is the logs of those sequential ranges
    my @logd = map log, @$d;   ### the loglogs of those sequential ranges.
    printf "%10.2f %10.2f %10.2f\n", fit( \@x, $d), fit( \@x, \@logd), fit( \@logx, \@logd );

    The actual data is never passed to the fit sub?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

      $d is a reference to the array containing your data sets. No comment on your eyesight....

        $d is a reference to the array containing your data sets

        I know that. But...

        @data = map int( rand 100 ), 1 ..30;;    ## THE DATA
        $d = \@data;;                            ## A REFERENCE TO IT
        print 1.. @$d;;                          ## AN UNRELATED SEQUENCE OF INTEGERS!
        1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

        Look again!


Re^6: Data range detection?
by BrowserUk (Patriarch) on Apr 14, 2015 at 03:59 UTC

    Sorry hdb, it seems it was more than just my eyes giving me trouble last night. And given it was you, I should have known better.



      No worries, I also realized that what I proposed is based on a lot of assumptions that I did not spell out. Must be a professional blind spot.
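
To spell out the main unstated assumption: the script treats each data set as y values measured at evenly spaced positions x = 1..N, so the integer sequence is the x axis and the data itself goes in as the second argument to fit. A small sketch of that pairing, using only the first five values of the first data set for brevity:

```perl
use strict;
use warnings;

# First few values of the first data set, held by reference as in the loop.
my $d = [ 5, 5, 34, 44, 114 ];

# As an operand of the range operator, @$d is evaluated in scalar context,
# i.e. as the number of elements, so this yields (1, 2, 3, 4, 5).
my @x = 1 .. @$d;

# Each assumed x position is paired with the data value at the same index;
# these pairs are what fit( \@x, $d ) regresses on.
for my $i (0 .. $#x) {
    printf "x = %d  y = %s\n", $x[$i], $d->[$i];
}
```

If the real measurements were not taken at evenly spaced positions, the actual x values would have to be supplied in place of 1..N.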