In RE: Data Normalization, lonewolf28 asked about normalising some data, and I posted a couple of subroutines that do linear and log scaling.
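For context, linear and log scaling here mean roughly the following. This is only a minimal Python sketch, not the subroutines from that thread: the names `scale_linear` and `scale_log` are made up for illustration, both map onto the range 0 to 1, and the log version assumes every value is strictly positive.

```python
import math

def scale_linear(values):
    """Map values onto [0, 1] in proportion to their distance from the minimum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case (assumption): all values equal, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def scale_log(values):
    """Map values onto [0, 1] in proportion to the logarithm of each value.

    Assumes every value is strictly positive; the base of the log does not
    matter because it cancels out in the final linear rescaling.
    """
    return scale_linear([math.log(v) for v in values])
```

With the log version, 1, 10 and 100 land at roughly 0, 0.5 and 1, whereas the linear version puts 10 at about 0.09.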
Thinking about this some more, I wondered if there was any way to programmatically decide, from a given set of data, the most appropriate method of scaling to use.
Often a human being can do this by inspection. E.g., this is fairly obviously linear:
5 5 34 44 114 169 177 184 270 339 361 364 442 511 530 554 555 587 709 709 735 778 791 859 871 899 903 926 933 952
This is log2:
0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095, 8191, 16383, 32767, 65535, 131071, 262143, 524287, 1048575, 2097151, 4194303, 8388607, 16777215, 33554431, 67108863, 134217727, 268435455, 536870911, 1073741823
And this log10:
1.713125e-005, 1.748086e-006, 2.101463e-006, 1.977405e-006, 3.597675e-006, 3.725492e-006, 3.924736e-006, 2.902199e-006, 3.988645e-006, 8.210367e-006, 3.360837e-006, 5.202907e-006, 7.082570e-006, 8.778026e-006, 7.079562e-005, 9.100576e-005, 5.258545e-005, 9.292677e-005, 1.789815e-004, 2.113948e-003, 7.229146e-004, 1.428995e-003, 2.742045e-003, 5.552746e-003, 1.822390e-002, 2.220999e-002, 4.316067e-002, 8.876963e-002, 1.751072e-001, 3.494051e-001, 7.155960e-001, 1.347822e+000
But how could a program decide that?
In reply to Data range detection? by BrowserUk