Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Re^3: Algorithm for cancelling common factors between two lists of multiplicands

by sk (Curate)
on Aug 10, 2005 at 07:56 UTC ( [id://482536]=note: print w/replies, xml ) Need Help??


in reply to Re^2: Algorithm for cancelling common factors between two lists of multiplicands
in thread Algorithm for cancelling common factors between two lists of multiplicands

BrowserUk,

I would be curious to see the results if you ever benchmark the GCD approach with the prime factor approach on the problem size you were mentioning

The GCD Approach is O(m*n)*O(GCDalgo) but I am not sure how the complexity reduces for the prime factorization approach.

You need to calculate the prime factor each of the product value right? i.e. if you have to do 100*101*102*103*104/57*58*59 then you need factor for each of the values correct?

The GCD approach will loop (5 * 3)*O(GCDalgo) times. But prime factor approach needs to calc factors for 8 values and that could potentially loop a lot more per value (max upto the value) and the complexity will be more than the GCD approach.

I ran the GCD approach and took about 9 minutes (not a formal benchmark) just to cancel out terms. Don't have any math package to do big multiplications/divisions.

I even tried to reduce the inner loop by switching @a and @b but did not see a huge impact

Also, i am confused whether the Haskell code actually computed the example (40000+!) because the execution speed was just unbelievable. Did it loose any precision when you tested it? I guess the approach was similar in canceling out terms. It makes me wonder why Perl canceling out takes so much longer than Haskell implementation

will be happy to hear on how your final implementation look.

Thanks,

SK

Replies are listed 'Best First'.
Re^4: Algorithm for cancelling common factors between two lists of multiplicands
by BrowserUk (Patriarch) on Aug 10, 2005 at 10:35 UTC

    These are the results I have currently for the inputs [989 9400 43300 2400]:

    tmoertel's haskell implementation of the cancelling code ( in 2.6 seconds):

    [11:20:43.54] C:\ghc\ghc-6.4\code>FET < ex1.dat +8.070604647867604097576877675243109941729476950993498765160880e-7030 [11:20:46.34] C:\ghc\ghc-6.4\code>

    Using Math::Pari (in 26 ms):

    P:\test>MP-FET.pl 8.070604647867604097576877668E-7030 1 trial of _default ( 24.817ms total), 24.817ms/trial

    Using Math::BigInt (Yes. That 4 hrs 38 minutes!)

    P:\test>MBF-FET.pl 8.070604647867604097576877675243109941729e-7030 1 trial of _default ( 16,699s total), 16,699s/trial

    The whole point of the cancelling code is that you do not have to calculate the huge factorials in infinite precison (slow) math, because you reduce the factorials to products of sets of much smaller, prime factors and then cancel most of those.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
      Thanks for the update

      I cannot read Haskell so I was not sure how the factoring was done. Going through the example it seems like tmoertel was just factoring like terms. If the code does further factoring (school days division) then I am surprised that it can loop through that fast!

      If you take a look at my scratchpad I tried ikegami's logic on gcd. I added a condition to check for values == 1 to save an extra GCD. I also did not have the Math module he was using so picked up another one.

      This one takes about 9 minutes. It would be interesting to see what will happen if we apply BigInt from here onwards!

      I am also curious to see the runtime of the prime factor approach. I shall try hv's code to see how fast it runs. If I remember correctly Euclidean algo for GCD is O(n+log n) and prime factorization is NP-complete. For small values, prime fact is a breeze but the complexity does not change so i shall let you know when i get around testing the runtime for it!

      cheers SK

        I cannot read Haskell so I was not sure how the factoring was done. Going through the example it seems like tmoertel was just factoring like terms.
        My implementation doesn't actually do any factoring. Rather, it efficiently cancels all like terms before doing any multiplications or divisions. The idea is to replace expensive mathematical operations with inexpensive comparisons until I have picked all of the low-hanging fruit. If I pick enough fruit, there isn't enough left on the tree to make factoring worthwhile, and so I can just go ahead a multiply out the rest. (Although it might be interesting to emit the tree after my low-handing-fruit pass and process it with some of the other bits of code in this thread.)

        The following shows how my code might compute 7!/(3!4!):

        7! ----- 3! 4! { expand fac into series of multiplications } P(2,3,4,5,6,7) ----------------- P(2,3) * P(2,3,4) { merge multiplied products } P(2,3,4,5,6,7) ----------------- P(2,2,3,3,4) { cancel like terms across division boundary } P(/,/,/,5,6,7) ----------------- P(/,2,/,3,/) { multiply and divide remaining terms } 5 * 6 * 7 --------- 2 * 3 { result } 35
        In reality, the code does all of these steps in a parallel pipeline, which means I don't have to pay the price for large intermediate terms: bits of big terms are plucked off and used as they are produced lazily. This is an essentially free benefit of using Haskell.

        Cheers,
        Tom

      Can you post your Math::BigInt code and test case? I would like to play around with some other ideas.

      Update: Ooops, I see it now, thanks.

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

        I already did. It's the second example in 482072.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
      The last two digits of the Math::Pari implementation's result seem to be off:
      +8.070604647867604097576877675243109941729476950993498765160880e-7030 8.070604647867604097576877675243109941729e-7030 8.070604647867604097576877668E-7030 <-- Math::Pari ^^
      Can you increase the implementation's precision? Or is that the limit?

      Cheers,
      Tom

        The precision is setable, but I had left at the default.

        Increasing the precision to the point where it match or suppassed the accuracy of your Haskell code (a setting of 58), doubled the time it took to a gnat's under 50 ms:

        [ 6:46:43.79] P:\test>MP-FET.pl 8.070604647867604097576877675243109941729476950993498765160880911582E- +7030 1 trial of _default ( 49.865ms total), 49.865ms/trial

        BTW: Could you explain something for me? In this:

        cancel (x:xs) (y:ys) | x == y = cancel xs ys | x < y = let (xs', ys') = cancel xs (y:ys) in (x:xs', ys') | otherwise = let (xs', ys') = cancel (x:xs) ys in (xs', y:ys') cancel xs ys = (xs, ys)

        I understand how it cancels between the lists, but I am confused about when it would pattern match against the second definition?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
Re^4: Algorithm for cancelling common factors between two lists of multiplicands
by QM (Parson) on Aug 10, 2005 at 18:36 UTC
    Looking at my copy of The Art of Computer Programming: Seminumerical Methods, 3E, Knuth gives the worst case as:
    O(GCD) ~= 4.785*log(N) + 0.6723
    where N is the smaller argument to Euclid's algorithm. The average is much better:
    Avg(GCD) ~= 0.1775*N
    Update: hv pointed out that the approximation for the mean didn't look right. I went back to the source, and see that I pulled out an intermediate result (I have to read more closely in the future). So it's not the mean.

    The mean is given as

    Avg(GCD) ~= 1.9405*log10(N)
    where N is the smaller argument to Euclid's algorithm, and ignoring a subtracted correction term based on the divisors of N.

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://482536]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (5)
As of 2024-03-28 16:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found