Sorry for the delay in responding. Like I said in my message to you, I wanted some time to write a proper reply. It's been a while since I've dealt with floats. Before I address your question, I want to address your snippet.

I claim that better_average is worse than bad_average, and here follows my supporting example.

You haven't shown that at all. The difference in error between the two functions is 0.000000000000003 of the input, which makes them equally good or equally bad for that data set.

Even if it did show bad_average to be better, your data sample is so horrible that no conclusion can be drawn from the comparison. In a discussion about minimizing the errors of dealing with floating point numbers, you used integers.

Finally, you also haven't shown better_average is bad. Quite the opposite, an error of 0.000000000000003 of the input is rather good.

If this were handed to any of us as a homework problem in refactoring, would we not pull the division back out again?

When it comes to numerical methods, the order in which operations are done can have an impact on the results. For example, a*(b+c) is equal to a*b+a*c in the mathematical world, but it's not necessarily the case in the practical world. You'd be wrong to refactor such code without considering the real-world implications.
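
A minimal sketch of that point, in Python purely for illustration. The function name and the values are mine, picked so the rounding can be followed by hand, and the comments assume IEEE 754 doubles.

```python
def distributive_pair(a, b, c):
    """Return (a*(b+c), a*b + a*c) as computed in floating point."""
    return a * (b + c), a * b + a * c

# Illustrative values: 2**53 is where doubles stop representing every integer.
a, b, c = 3.0, 2.0 ** 53, 1.0
factored, expanded = distributive_pair(a, b, c)

# b + c rounds back to 2**53, so the factored form loses c entirely,
# while the expanded form keeps a*c (rounded to the nearest representable value).
print(factored == expanded)   # False
print(expanded - factored)    # 4.0
```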

Please supply a rationale for better_average.

Because moving the division changes the magnitude of all the numbers being added by the same amount, better_average is no better than bad_average. It was a bad example. A better average might sort the numbers so that the small ones are processed first. (Beware of the signs...)
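
A rough sketch of the "small ones first" idea, again in Python for illustration only. The helper names and the data are hypothetical. The explicit loop is deliberate, since a built-in sum may apply its own error compensation and hide the effect.

```python
def naive_sum(values):
    """Add values left to right with plain float addition."""
    total = 0.0
    for v in values:
        total += v
    return total

def sum_small_first(values):
    """Add values in order of increasing magnitude (abs handles the signs)."""
    return naive_sum(sorted(values, key=abs))

# Illustrative data: one huge value followed by a thousand small ones.
values = [1e16] + [1.0] * 1000

print(naive_sum(values) == 1e16)                 # True: each 1.0 is rounded away
print(sum_small_first(values) == 1e16 + 1000.0)  # True: small values accumulate first
```

Sorting by magnitude does not cure cancellation between large values of opposite sign, which is why the signs need watching.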

There's been some speculation as to what I was thinking. What follows explains it.

I said "better_average is no better than bad_average", but that would be more precise if you added "for floating point numbers". For fixed point numbers, bad_average would fail miserably.

I had been working with fixed point numbers because the target machine had a very limited amount of memory. Using fixed point saved memory since the exponent wasn't stored along with the number.

Adding lots of numbers together can result in an overflow when dealing with fixed point numbers. For example, consider a register that can hold 5 decimal digits. (Roughly 17 bits, but decimal is easier to visualize.)

    40123 (/1000)
  + 50456 (/1000)
  + 20789 (/1000)
    -------------
    overflow
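
The same overflow can be imitated in a few lines of Python. This is only a sketch: REGISTER_MAX and add_fixed are names I made up to model the 5-digit register, with the values stored as integers in thousandths.

```python
REGISTER_MAX = 99_999  # assumed: largest value a 5-decimal-digit register holds

def add_fixed(a, b):
    """Add two fixed point values stored as integers; fail instead of wrapping."""
    total = a + b
    if total > REGISTER_MAX:
        raise OverflowError(f"{total} does not fit in 5 digits")
    return total

# The three values from the example, stored as thousandths.
partial = add_fixed(40123, 50456)   # 90579, still fits
try:
    add_fixed(partial, 20789)       # 111368 needs 6 digits
except OverflowError as err:
    print("overflow:", err)
```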

You could convert them to a different scaling first, but that would result in a loss of precision.

    40123 (/1000)  =>    04012 (/100)
    50456 (/1000)  =>  + 05046 (/100)
    20789 (/1000)  =>  + 02079 (/100)
                         ------------
                         11137 (/100)     (111.370 instead of 111.368)

And if you were adding 1000 numbers,

    40123 (/1000)  =>    00040 (/1)
    50456 (/1000)  =>  + 00050 (/1)
    20789 (/1000)  =>  + 00021 (/1)
    ...                  ...
                         ----------
                         00111 (/1)       (111.000 instead of 111.368)
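
Both rescalings can be reproduced with integer arithmetic. A small sketch, assuming round-half-up when digits are dropped (the rounding the figures above appear to use); rescale is a hypothetical helper.

```python
def rescale(value, factor):
    """Divide a fixed point integer by factor, rounding half up."""
    return (value + factor // 2) // factor

values = [40123, 50456, 20789]   # thousandths, from the example

hundredths = sum(rescale(v, 10) for v in values)    # stored as /100
units = sum(rescale(v, 1000) for v in values)       # stored as /1

print(hundredths)  # 11137, i.e. 111.37 instead of 111.368
print(units)       # 111, i.e. 111 instead of 111.368
```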

By dividing first, the running total never outgrows the register, so you avoid the need to rescale the numbers and the loss of precision that comes with it.

So to answer your question, I mistakenly mixed fixed point and floating point techniques.

In reply to Re^3: Floating Point Errors: How and Why to Avoid Adding and Subtracting Varying Numbers by ikegami
in thread Floating Point Errors by Anonymous Monk