BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:
Given a size, S, that is some multiple of P=4096; and record size R; find the largest common multiple of R & P < S.
Eg. S = 2GB = 2147483648; R = 12; LCM of 12 & 4096 < 2147483648 = 2147475456.
Easy to find iteratively:
$R = 12; $S = 2*1024**3; $c -= 4096 while $c % $r; print $c;; 2147475456
But it can be calculated, right? (Long day; brain not cooperating.)
Note: P is always 4096; R can be larger or smaller than 4096; S will always be a multiple of P, and usually a multiple of 1GB.
Eg2. S = 3GB = 3221225472; R = 5007; LCM of 5007 & 4096 < 3221225472 = 3219861504.
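For reference, the iterative search described above can be written as a self-contained sketch. The sub name `largest_common_multiple_iter` is hypothetical, and starting from S - P (so the result is strictly below S) and stopping at zero are assumptions about the intent, not part of the original one-liner:

```perl
use strict;
use warnings;

# Walk down from the largest multiple of $p strictly below $s until $r
# divides the candidate. Returns undef if no common multiple below $s
# exists, so the loop cannot run past zero.
sub largest_common_multiple_iter {
    my ($r, $s, $p) = @_;
    my $c = $s - $p;                     # assumes $s is a multiple of $p
    $c -= $p while $c > 0 && $c % $r;
    return $c > 0 ? $c : undef;
}

print largest_common_multiple_iter(12,   2 * 1024**3, 4096), "\n";   # 2147475456
print largest_common_multiple_iter(5007, 3 * 1024**3, 4096), "\n";   # 3219861504
```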
Replies are listed 'Best First'.
Re: Simple arithmetic?
by oiskuu (Hermit) on Mar 08, 2015 at 11:15 UTC | |
Using bog standard LCM:
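The code from this post is not preserved here. As a stand-in only, a textbook LCM built on Euclid's gcd might look like the following in Perl; the sub names `gcd` and `lcm` are illustrative and this is not necessarily what was originally posted:

```perl
use strict;
use warnings;

# Euclid's algorithm: replace (x, y) with (y, x mod y) until y is zero.
sub gcd {
    my ($x, $y) = @_;
    ($x, $y) = ($y, $x % $y) while $y;
    return $x;
}

# lcm(x, y) = x / gcd(x, y) * y; dividing first keeps the intermediate small.
sub lcm {
    my ($x, $y) = @_;
    return $x / gcd($x, $y) * $y;
}

print lcm(12,   4096), "\n";   # 12288
print lcm(5007, 4096), "\n";   # 20508672
```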
by BrowserUk (Patriarch) on Mar 08, 2015 at 12:29 UTC | |
"Using bog standard LCM": That's a very clean and totally generic implementation. Thank you.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
Re: Simple arithmetic?
by Laurent_R (Canon) on Mar 07, 2015 at 22:27 UTC | |
Je suis Charlie.
by hdb (Monsignor) on Mar 07, 2015 at 23:51 UTC | |
Exactly, and one can also utilize the fact that 4096 is a power of 2:
UPDATE: 7 hours of sleep later, here is what I wanted to say, but Anonymous Monk has posted it already in node 1119206.
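The snippet for this post is missing from this copy. One way the power-of-two observation could be used, as a sketch under the assumption that P is fixed at 4096 = 2**12: gcd(R, 4096) is then just the lowest set bit of R, capped at 4096, so no general gcd is needed. The sub names are hypothetical:

```perl
use strict;
use warnings;

# gcd(R, 4096): double $g until it lines up with a set bit of $r,
# or until it reaches 4096 (R is then a multiple of 4096).
sub gcd_with_4096 {
    my ($r) = @_;
    my $g = 1;
    $g <<= 1 while $g < 4096 && !($r & $g);
    return $g;
}

sub lcm_with_4096 {
    my ($r) = @_;
    return $r / gcd_with_4096($r) * 4096;
}

print lcm_with_4096(12),   "\n";   # 12288
print lcm_with_4096(5007), "\n";   # 20508672
print lcm_with_4096(8192), "\n";   # 8192
```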
by BrowserUk (Patriarch) on Mar 08, 2015 at 12:20 UTC | |
"and one can also utilize the fact that 4096 is a power of 2": That's a really useful insight. It makes deriving the LCM substantially more efficient, especially for large values of S combined with a prime R. Thank you.
by BrowserUk (Patriarch) on Mar 08, 2015 at 12:18 UTC | |
"I would first determine the least common multiple (LCM) of R and P, which is almost entry level algorithmic. Then divide S by the LCM, and multiply the LCM by the integer part of the result of that division."
That's effectively what I was doing, but in my sleep-deprived state I was convinced that I could avoid the iteration and calculate the result directly, even though I couldn't see how. Hence my question. Thank you.
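The divide-and-multiply step quoted above can be sketched as follows. The sub name `largest_common_multiple` is hypothetical, and dividing S - 1 rather than S is an assumption made here so that the result stays strictly below S even when S is itself a common multiple:

```perl
use strict;
use warnings;

sub gcd {
    my ($x, $y) = @_;
    ($x, $y) = ($y, $x % $y) while $y;
    return $x;
}

# Largest common multiple of $r and $p that is strictly less than $s.
# Returns 0 when the LCM itself is not below $s.
sub largest_common_multiple {
    my ($r, $s, $p) = @_;
    my $lcm = $r / gcd($r, $p) * $p;
    return int(($s - 1) / $lcm) * $lcm;   # integer part of the division, times the LCM
}

print largest_common_multiple(12,   2 * 1024**3, 4096), "\n";   # 2147475456
print largest_common_multiple(5007, 3 * 1024**3, 4096), "\n";   # 3219861504
```

The `int(... / ...)` form relies on floating-point division, which is exact enough for sizes in the gigabyte range used in the question; for values approaching 2**64 a remainder-based form would be the safer choice.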
Re: Simple arithmetic?
by Anonymous Monk on Mar 08, 2015 at 04:43 UTC | |
output:
by BrowserUk (Patriarch) on Mar 08, 2015 at 12:27 UTC | |
"Inspired by previous answers.": I like your way of incorporating hdb's powers-of-two observation into the GCD() calculation.
"the absence of bugs is totally not guaranteed": Understood. It's my responsibility to test code I use. It stands up to all the likely scenarios (for my application) that I've tested. Thank you.
by hdb (Monsignor) on Mar 08, 2015 at 12:47 UTC | |
Of course, my post came later, so the observation is this Anonymous Monk's own original observation. If, in addition, as you comment above, R is prime (different from 2) or merely odd, then the LCM is always 4096*R, so the whole discussion is redundant...
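The odd-R shortcut is easy to check by brute force. A small sketch (the range of R values tested is arbitrary):

```perl
use strict;
use warnings;

sub gcd {
    my ($x, $y) = @_;
    ($x, $y) = ($y, $x % $y) while $y;
    return $x;
}

# An odd R shares no factor of 2 with 4096 = 2**12, so gcd(R, 4096) is 1
# and the LCM collapses to R * 4096.
for my $r (grep { $_ % 2 } 1 .. 9999) {
    my $lcm = $r / gcd($r, 4096) * 4096;
    die "unexpected LCM for R=$r\n" unless $lcm == $r * 4096;
}
print "LCM(R, 4096) == 4096 * R for every odd R below 10000\n";
```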
by Anonymous Monk on Mar 08, 2015 at 13:03 UTC | |
Re: Simple arithmetic? (And the winner is ... )
by BrowserUk (Patriarch) on Mar 08, 2015 at 17:44 UTC | |
By an almost unbelievable margin (300 million times faster), oiskuu's implementation beats Anonymonk's implementation:
(And yes, I've verified the results. Otherwise I would have posted this hours ago.) It all comes down to the extreme optimisability of the tail-recursive Euclidean gcd algorithm, which converges very, very fast.
The benchmark code (uncomment the comments for verification):
by oiskuu (Hermit) on Mar 09, 2015 at 11:13 UTC | |
I don't know how MSVC handles things, but generally there are other ways to prevent unwanted benchmark optimizations.
Using volatile is almost never necessary, and very often has different implications to what was intended. Just one example, quoting gcc.info:
Usually, a barrier of some sort is indicated. With gcc/icc/clang, the following optimization barrier should work:
(Here the volatile may be due to some compiler bugs.)
The ways to prevent a compiler from being too smart:
Now, getting back to the original topic. Lcm with 4096 means simply to have twelve zeroes at the end. I'd code it like this:
(Counting trailing zeroes can also be optimized. Some x86 (Haswell) have an instruction for that. Wikipedia links to Bit Twiddling Hacks; there are other sites and books on the topic.)
Update: added the 4th clause to above list.
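The original code for the "twelve zeroes at the end" idea is not preserved in this copy. A Perl sketch of the same idea, with the hypothetical sub name `lcm_4096`: count the trailing zero bits of R (at most 12), then shift R left by however many are still missing.

```perl
use strict;
use warnings;

# LCM of $r and 4096: make sure the result ends in at least twelve zero bits.
sub lcm_4096 {
    my ($r) = @_;
    my $tz = 0;                                      # trailing zero bits seen so far
    $tz++ while $tz < 12 && !(($r >> $tz) & 1);
    return $r << (12 - $tz);                         # add only the missing zeroes
}

print lcm_4096(12),   "\n";   # 12288
print lcm_4096(5007), "\n";   # 20508672
print lcm_4096(8192), "\n";   # 8192
```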
by BrowserUk (Patriarch) on Mar 09, 2015 at 12:06 UTC | |
"So, probably it's for the best to use both noinline attribute and an optimization barrier."
The problem is that nothing I do inside the function, whether inlined or not, will prevent the compiler optimising the loop away. The decision appears to be that, because the variable to which the result of the function is assigned is never used outside the loop, and the function has no side effects, the loop is redundant. (Though I haven't worked out why the second loop isn't optimised away for similar reasons.)
The use of volatile on that variable works because the compiler cannot decide that the value is never used. Whilst volatile can have other effects upon the code generated, these mostly relate to multi-threaded code, which is not in play here. Also, the MS compiler has extended guarantees with regard to volatile (on non-ARM systems): /volatile:ms
Conversely, the read/write/ReadWrite/Barrier intrinsics are now deprecated:
"The _ReadBarrier, _WriteBarrier, and _ReadWriteBarrier compiler intrinsics and the MemoryBarrier macro are all deprecated and should not be used. For inter-thread communication, use mechanisms such as atomic_thread_fence and std::atomic<T>, which are defined in the C++ Standard Library Reference. For hardware access, use the /volatile:iso compiler option together with the volatile (C++) keyword."
The volatile keyword also seems to impose lesser, but sufficient, constraints. (That's an interpretation, rather than an MS stated fact.)
"Now, getting back to the original topic. Lcm with 4096 means simply to have twelve zeroes at the end. I'd code it like this:"
Update:
Ignore the above, I left the lcm = n * 40096; in, where (I assume) you meant to replace:
With:
Which works, but takes twice as long as the original version:
/Update:
"(Counting trailing zeroes can also be optimized.)"
I thought about that last night and tried using the _BitScanForward64() intrinsic:
Which looked like it should be more efficient, compiling to this:
Rather than this:
But the reality turns out to be disappointingly about 50% slower:
by BrowserUk (Patriarch) on Mar 09, 2015 at 13:51 UTC | |
FWIW: Of the many things I've tried to improve on Anonymous Monk's gcm(), the only thing that has worked is this:
Incrementing seems to be fractionally faster than shifting:
But the difference, 1/4 second over more than 1 billion calls, is so minimal (0.2 nanoseconds per call), and the shifting better captures the semantics, that I've stuck with the original.
by BrowserUk (Patriarch) on Mar 09, 2015 at 14:36 UTC | |
No sooner have I said that than I think of a way to use _BitScanForward64() that improves upon the original by 70+%:
Pre-masking the scanned var avoids the later subtraction being conditional:
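The C code for this post is not preserved here, so the following Perl sketch is only a guess at what "pre-masking" means: OR-ing bit 12 into the value before scanning, so the lowest set bit is never above position 12 and the shift amount 12 - tz needs no bounds check. The counting loop is a plain stand-in for a bit-scan instruction, and the sub name `lcm_4096_premasked` is hypothetical:

```perl
use strict;
use warnings;

sub lcm_4096_premasked {
    my ($r) = @_;
    my $m  = $r | 4096;              # guarantees a set bit at position 12 or lower
    my $tz = 0;
    $tz++ until ($m >> $tz) & 1;     # index of the lowest set bit of $m (0..12)
    return $r << (12 - $tz);         # unconditional: 12 - $tz is never negative
}

print lcm_4096_premasked(12),   "\n";   # 12288
print lcm_4096_premasked(5007), "\n";   # 20508672
print lcm_4096_premasked(8192), "\n";   # 8192
```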
by oiskuu (Hermit) on Mar 09, 2015 at 21:52 UTC | |
by Anonymous Monk on Mar 08, 2015 at 20:46 UTC | |
by BrowserUk (Patriarch) on Mar 08, 2015 at 23:00 UTC | |
Anonymonk's implementation by an order of magnitude over oiskuu's:
Thanks for the sanity check AnonyMonk! And the difference between this benchmark and the previous (unbelievable) one? One keyword:
by Anonymous Monk on Mar 08, 2015 at 23:49 UTC | |
by BrowserUk (Patriarch) on Mar 09, 2015 at 01:44 UTC | |
by BrowserUk (Patriarch) on Mar 08, 2015 at 22:47 UTC | |
"Well that's a weird result BUK :) So I actually compiled your code (with some changes to make it suitable for gcc on linux). I was getting strange results too. 1500 nanoseconds for anonM algorithm, no matter what s was (at -O2). I'm curious why that happened. With some checksums to verify results the execution time became more realistic (with 1:10 bias towards bitshift stuff,"
I understand your reaction. I had the same. I spent 4 hours trying to find something amiss. I failed. If you look at the commented out verification code in what I posted above (with extra comments):
Of course, the additional code slows things down and changes the timing completely, but not a single mismatch is found. I can't think of a more comprehensive way to verify the algorithms than to directly compare their results for a full range of inputs.
"but perhaps your compiler recognizes Euclid's algorithm?"
I'll admit that I do not quite understand what the optimiser has done to the code. The Euclidean gcd() function is pretty easy to recognise in the assembler output:
Realisation dawns! Without the verification code, the compiler recognises that the value calculated within the loop is never used outside of it, and simply optimises the entire loop away:
In other words, the times output are the time it takes to make two calls to _rdtsc().
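The cross-check described above (comparing two implementations directly over a range of inputs) can be sketched in Perl. This is not the C verification code from the thread, which is not preserved here; the sub names and the range of R values are illustrative:

```perl
use strict;
use warnings;

# LCM of $r and 4096 via Euclid's gcd.
sub lcm_euclid {
    my ($r) = @_;
    my ($x, $y) = ($r, 4096);
    ($x, $y) = ($y, $x % $y) while $y;      # $x ends up as gcd($r, 4096)
    return $r / $x * 4096;
}

# LCM of $r and 4096 via trailing-zero counting and a shift.
sub lcm_shift {
    my ($r) = @_;
    my $tz = 0;
    $tz++ while $tz < 12 && !(($r >> $tz) & 1);
    return $r << (12 - $tz);
}

my $mismatches = 0;
for my $r (1 .. 100_000) {
    $mismatches++ if lcm_euclid($r) != lcm_shift($r);
}
print "mismatches: $mismatches\n";          # mismatches: 0
```

In a compiled benchmark, consuming the results like this (for example by accumulating them into a value that is printed afterwards) also keeps the computation observable, which is the property that was missing when the loop was optimised away.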
Re: Simple arithmetic?
by cheako (Beadle) on Mar 07, 2015 at 22:17 UTC | |
We are talking about integer factorisation?! Can't be calculated without a quantum computer. Plus your code could endless loop. One good solution is to use a lookup table.
Further optimizations can be made with a better search pattern:
by marto (Cardinal) on Mar 08, 2015 at 13:23 UTC | |
It would be a good idea to take the time to read and understand PerlMonks for the Absolute Beginner and How do I post a question effectively?. Please mark updates to posts: you've made radical changes/additions to various posts without doing this, perhaps as a way of having the first response before actually having a valid reply. Perhaps this stems back to how you've interpreted some of the responses to PM Leveling Guide. I'm still not clear on why you feel the need to accelerate leveling. The indirect responses to your leveling thread are worth further consideration on this subject.
by LanX (Saint) on Mar 08, 2015 at 14:48 UTC | |
by marto (Cardinal) on Mar 08, 2015 at 19:00 UTC | |
by LanX (Saint) on Mar 08, 2015 at 19:07 UTC | |
by BrowserUk (Patriarch) on Mar 08, 2015 at 09:34 UTC | |
I downvoted this post last night when you posted the first three lines of the current post:
"We are talking about integer factorisation?"
This has (almost) nothing to do with integer factorisation. That is to say, whilst there might be an approach to the problem using integer factorisation, it would be like using calculus to tally your bar bill.
"Can't be calculated without a quantum computer."
Whilst generalised integer factorisation is known to be hard, for the size of integers involved here (< 2^64) there are simple, efficient methods available.
"Plus your code could endless loop"
You're right, it could; but only if the product $r * 4096 > $c, which will never be the case; and the 'cure' (omitted for clarity in the description of the question) is trivial.
So now we come to your belatedendum:
"One good solution is to use a lookup table."
Apart from: what does $__intfactor{%r} mean? (which I'll assume is a typo); and where did $r and $c come from in that subroutine? (which I'll assume is just laziness).
Offering a solution that caches to disk the results of the iterative method I posted doesn't begin to answer the question I asked. It's like answering the question "How do we solve world hunger?" with a proposal for setting up food warehouses and suggesting that when people are hungry, they simply go to the warehouse and collect some food.
Finally: this is the third time in the last couple of days where you've immediately posted something fairly meaningless when a SoPW first appears, and then silently expanded/modified it without attribution later. Apparently taking Ambrus' "advice" to heart. Fair warning: continue to do so in threads I start, or those I am interested in, and you and I will have a problem.
by hdb (Monsignor) on Mar 08, 2015 at 13:07 UTC | |
I find your response far too harsh and not really appropriate, generally and for this forum in particular. Especially in light of the fact that your posted snippet is far from exact:
You use both upper- and lower-case $r, and $c is not initialized, so the while loop will terminate immediately.
by Laurent_R (Canon) on Mar 08, 2015 at 14:06 UTC | |
by BrowserUk (Patriarch) on Mar 08, 2015 at 14:39 UTC | |
by LanX (Saint) on Mar 08, 2015 at 14:50 UTC | |
by BrowserUk (Patriarch) on Mar 08, 2015 at 21:19 UTC | |