in reply to Re^3: Avoid keeping larger lists in Memory
in thread Avoid keeping larger lists in Memory

Thank you Laurel for your suggestion on further Improving the code . I have tried the steps you have suggested partially (Still working on a sloution using the Square root of the target Number)

Here is my code and the O/p yields right results and execution time is higher compared to ur solution . I will think about adding the Square root portion and update this thread

#!/usr/bin/perl use warnings; use strict; ######## sub prime_factors { my $num = shift; my $large_prime_found = 2; my @primes = (); my $odd = 1; my $i = 0; for ( my $y = $large_prime_found; $y <= $num;) { $i++; #print "Iteration : $i, y => $y, Num => $num LP => $large_prime_f +ound, Arr Primes: @primes, Odd => $odd\n"; if ($num % 2 == 0) { $num /= $large_prime_found; $large_prime_found = 2; push @primes, $large_prime_found; next; } else { $odd += 2; #if ($odd <= sqrt($num)) { if ($num % $odd == 0 ) { $large_prime_found = $odd; $num /= $large_prime_found; push @primes, $large_prime_found; next; } #} } } return @primes; } $ time ./primes.pl Number: 20 , Prime_Factors: => 2 2 5 Number: 48 , Prime_Factors: => 2 2 2 2 3 Number: 96 , Prime_Factors: => 2 2 2 2 2 3 Number: 7 , Prime_Factors: => 7 Number: 69 , Prime_Factors: => 3 23 Number: 33 , Prime_Factors: => 3 11 Number: 13195 , Prime_Factors: => 5 7 13 29 Number: 600851475143 , Prime_Factors: => 71 839 1471 6857 real 0m0.011s user 0m0.005s sys 0m0.004s

Replies are listed 'Best First'.
Re^5: Avoid keeping larger lists in Memory
by Laurent_R (Canon) on Jul 04, 2017 at 19:14 UTC
    Hi pr33,

    the main point in my post was that if you add to your input list a relatively large prime, or the product of a relatively large with some other factors, your performance will fall dramatically. As an example, I took 2**31 - 1 (i.e. 2147483647), a number 280 times smaller than the large number (600851475143) that you use, but much less favorable because it is prime, your code will take not a split second, but several minutes to run (about five minutes on my laptop). Try it for your self to see the problem.

    In such a case, the optimization consisting in running the for loop until the square root of the target will provide you with a huge improvement. This optimization, together with the other two that I described, made my code about 5,000 times faster for the same input data.

    As a starting point you could just change this line:

    for ( my $y = $large_prime_found; $y <= $num;) {
    to this:
    for ( my $y = $large_prime_found; $y <= $num ** .5;) {
    This should already improve considerably the timings. But this calculates the square root of $num many many times.

    It would be better to compute the square root of $num (into, say, $max) outside of the for loop header, i.e. at the beginning of your subroutine and at the place where you modify the value of $num within the for loop, and to use $max within the header of the for loop:

    for ( my $y = $large_prime_found; $y <= $max;) {
    Check that everything still works OK with your new version of the subroutine, I haven't tested these improvements with it, and there may be one or two small things to adjust somewhere else in the code.

      shouldn't you really recalculate the sqrt every time you change x? Because the maximum remaining factor(s) will be less than or equal to the root of the new x, and depending on the size of the y, it could reduce the max significantly enough to overcome the cost of the root. As long as the sqrt is taken after the redo, not in the loop condition, the cost shouldn't be too bad for the payoff. (easily benchmarked)

        Hi pryrt

        in the latest version of pr33's code, the variable name is $num. So yes, I think the sqrt should be recalculated each time $num changes. In the code I had tested and bench marked earlier based on the previous version of the code proposed by pr33, the variable name was $x. And, yes, I recalculated the sqrt of $x each time $x was recalculated, but I avoided to do it within the header of the for loop, to avoid recalculating it at each iteration of the loop. In other words, I was sort of caching the value of the sqrt. This made the code a bit more complicated, but also quite faster

        The latest version of pr33's code does not have a redo and is changing the loop variables within the loop, which might possibly work but is probably a bit dangerous with possible side effects. Restarting the loop with a redo is certainly less error prone.