
think I'd try treating the axes separately

I don't see any other way than to treat each set of points independently?

find the scale that would most evenly distribute the data points along the axis.

I don't know how to assess "most evenly distributed"?

The input values may be clumped or unevenly distributed; and whatever scaling you apply, the output values will, mathematically, be proportionally the same.

I'm just not seeing how to tackle this at all.



Re^3: Data range detection?
by roboticus (Chancellor) on Apr 13, 2015 at 19:07 UTC

    BrowserUk:

    By "evenly distributed", I meant choosing the axis scale that spreads the points out most evenly. Exponential data on a linear axis, for example, will bunch everything up to the left. Working out what "evenly distributed" means in general would be a headache, so I hacked something together this morning that worked to select between linear and logarithmic for the number series you provided. To pick the most "evenly distributed" version, I simply counted the number of points to the left of the midpoint, compared that to half the number of points provided, and selected the scale where the difference was the smallest.

    From memory, it went something like:

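    # minmax() is assumed to come from List::MoreUtils (or a helper like it)
    use List::MoreUtils qw(minmax);

    sub check_list {
        my $r = shift;
        my ($min, $max) = minmax(@$r);
        my $ctr_lin = ($min + $max) / 2;
        # Midpoint in log space; requires every value to be > 0
        my $ctr_log = (log($min) + log($max)) / 2;
        my ($cnt_lin, $cnt_log) = (0, 0);
        for (@$r) {
            ++$cnt_lin if $_ < $ctr_lin;
            ++$cnt_log if log($_) < $ctr_log;
        }
        # Error: distance of the below-midpoint count from half the points
        my $error_lin = abs($cnt_lin - @$r/2);
        my $error_log = abs($cnt_log - @$r/2);
        return $error_lin < $error_log ? "linear" : "log";
    }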

    Update: I mentioned treating the axes separately, because some people were mentioning curve fitting (IIRC) which implied (to me) using both axes at the same time.
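
    A minimal sketch of the bunching effect, for illustration only (the helper name relative_positions and the sample data are made up, not taken from the thread). It maps each value to its relative position between the minimum and the maximum. A linear rescaling leaves those positions unchanged, while taking logs spreads exponential-looking data out evenly:

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: map each value to its relative position (0..1)
# between the smallest and largest value in the list.
sub relative_positions {
    my @v = @_;
    my ($lo, $hi) = (min(@v), max(@v));
    return map { sprintf '%.3f', ($_ - $lo) / ($hi - $lo) } @v;
}

my @data   = (1, 10, 100, 1000);         # made-up sample data
my @scaled = map { 3 * $_ + 7 } @data;   # an arbitrary linear rescaling
my @logged = map { log($_) } @data;      # logarithmic transform

print "raw:    @{[ relative_positions(@data) ]}\n";
print "linear: @{[ relative_positions(@scaled) ]}\n";
print "log:    @{[ relative_positions(@logged) ]}\n";
```

    The raw and linearly rescaled rows come out identical (0.000 0.009 0.099 1.000), with three of the four points crowded into the left tenth of the axis; the log row is 0.000 0.333 0.667 1.000.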

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Just for completeness, here's the one I coded up yesterday morning:

      $ cat choose_axes.pl
      #!/usr/bin/env perl
      use strict;
      use warnings;

      my @series = (
          [qw(
              5 5 34 44 114 169 177 184 270 339 361 364 442 511 530
              554 555 587 709 709 735 778 791 859 871 899 903 926 933 952
          )],
          [
              1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095, 8191,
              16383, 32767, 65535, 131071, 262143, 524287, 1048575, 2097151,
              4194303, 8388607, 16777215, 33554431, 67108863, 134217727,
              268435455, 536870911, 1073741823
          ],
          [
              1.713125e-005, 1.748086e-006, 2.101463e-006, 1.977405e-006,
              3.597675e-006, 3.725492e-006, 3.924736e-006, 2.902199e-006,
              3.988645e-006, 8.210367e-006, 3.360837e-006, 5.202907e-006,
              7.082570e-006, 8.778026e-006, 7.079562e-005, 9.100576e-005,
              5.258545e-005, 9.292677e-005, 1.789815e-004, 2.113948e-003,
              7.229146e-004, 1.428995e-003, 2.742045e-003, 5.552746e-003,
              1.822390e-002, 2.220999e-002, 4.316067e-002, 8.876963e-002,
              1.751072e-001, 3.494051e-001, 7.155960e-001, 1.347822e+000
          ],
      );

      for my $ar (@series) {
          # For a 'log' result, $min and $max are reported in log space
          my ($type, $min, $max) = choose_axis_params($ar);
          print "($min .. $max) $type\n";
      }

      sub minmax {
          my $min = my $max = shift;
          while (@_) {
              my $t = shift;
              $min = $t<$min ? $t : $min;
              $max = $t>$max ? $t : $max;
          }
          return $min, $max;
      }

      sub check_axis {
          my $name = shift;
          my @points = @_;
          my ($min, $max) = minmax(@points);
          my $midpoint = ($min+$max)/2;
          my $cnt = 0;
          for my $t (@points) {
              ++$cnt if $t > $midpoint;
          }
          # How far the count above the midpoint is from half the points
          my $err = abs(@points/2 - $cnt);
          return $name, $min, $max, $midpoint, $cnt, $err;
      }

      sub choose_axis_params {
          my $r = shift;
          my ($min,$max) = minmax(@$r);
          $r = [ sort { $a <=> $b } @$r ];   # numeric sort; the default sort compares as strings
          my @axes;
          push @axes, [ check_axis('linear',@$r) ];
          push @axes, [ check_axis('log', map { log($_) } @$r) ];
          # Smallest error first
          @axes = sort { $a->[-1] <=> $b->[-1] } @axes;
          #for my $r (@axes) {
          #    printf "%-8.8s (%s .. %s) %s, %s, %s\n", @$r;
          #}
          return @{$axes[0]};
      }

      $ perl choose_axes.pl
      (5 .. 952) linear
      (0 .. 20.794415415867) log
      (-13.2569890828565 .. 0.298489956293335) log

      There's nothing special about it: it just picks whichever scale splits the points more evenly between the two halves of the interval. So it'll probably choose poorly on the vertical axis of a half-wave rectified sine wave or similar. (I'm guessing that it would choose a log axis instead of linear in that case...)
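
      One more blind spot of a two-halves test can be sketched as follows. This is an illustration under stated assumptions, not part of the script above: midpoint_error is a hypothetical helper that mirrors the error calculation in check_axis, and the data is made up. Two tight clumps at opposite ends of the range split perfectly across the midpoint, so the error is zero on both scales even though the points are far from evenly spread:

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper mirroring check_axis: how far is the count of
# points above the midpoint from half the number of points?
# 0 means the two halves of the interval hold equally many points.
sub midpoint_error {
    my @points = @_;
    my $midpoint = (min(@points) + max(@points)) / 2;
    my $above = grep { $_ > $midpoint } @points;   # count in scalar context
    return abs(@points / 2 - $above);
}

# Made-up data: two tight clumps at opposite ends of the range.
my @clumped = (1 .. 5, 996 .. 1000);

printf "linear error: %s\n", midpoint_error(@clumped);
printf "log error:    %s\n", midpoint_error(map { log($_) } @clumped);
```

      Both lines print an error of 0, so the heuristic sees a tie and cannot tell the two scales apart for data like this.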

      ...roboticus
