comment on

Wow! Your convergence criteria are a lot more stringent than mine ever have been, if you're trying to get below a 1bit shift for a 53-bit precision number.

They aren't really my criteria; I'm trying to match the results produced by the code I am optimising, prior to my optimisations.

I've got it to the point where over 25% of the runtime is now down to exactly 2 opcodes, both 64-bit double divides in deeply nested loops.

Luckily, the divisor in both cases is invariant in the inner loop, so going the classical route of replacing the division by multiplication by the reciprocal of the divisor reduces the cost of both of those opcodes by just under 95%. (For my test case, from 1.17s to 0.06s.)

The problem is that some 20% (~90,000 of 450,000) of the resultant values differ from the unoptimised code. The maximum individual difference is less than 2e-16; but cumulatively they could have an impact on the numbers derived from the dataset. And my simple testcase is 450,000 datapoints; the real ones that got me started on this quest have upwards of 10 million datapoints. But that is only half of the story.

I'm just discovering that a big part of the reason for the dramatic effect, in this case, of replacing x /= n; with x *= 1/n (where 1/n has been pre-calculated outside of the inner loop) is that it avoids the generation of denormal numbers. And denormal numbers are known to have a huge performance affect on FPUs.

I've only learnt this in the last 10 hrs or so. My OP was seeking to adjust the value produced by 1/n, so as to minimise the accumulated difference in the values produced by the pre- & post-optimised code. But, from a few samples of the 90,000 I've inspected so far, the differences (between x /= n and x *= 1/n) only rise above 0.5 ulp when x is a denormal value. I've never had to deal with denormals before; so this is all new ground for me; and I'm finding it particularly taxing; requiring lots of research in areas of math that go way over my head.

At least the papers do. In many cases -- as so often -- when you cut through the jargon, notation and 'math-paper-ese' that seems to be obligatory, the actual code required -- which is *never* included -- is relatively simple. Eg. Pick a pdf; and then realise that it all comes down to something as simple as:

sub NRreciprocal {
    my( $n, $g ) = @_;

    my $x0 = $g;
    while( 1 ) {
        my $x1 = $x0 + $x0 * ( 1 - $n * $x0 );
        last if  ( $x0 - $x1 ) == 0.0;
        $x0 = $x1;
    }
    return $x0;
}
[download]

Why, oh why, do they insist on making everything so damn complicated?

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)

In the absence of evidence, opinion is indistinguishable from prejudice. Not understood.

In reply to Re^4: Determining the minimum representable increment/decrement possible? by BrowserUk
in thread Determining the minimum representable increment/decrement possible? by BrowserUk

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.