Counting Primes

It took me some time, on and off, to get the OpenMP demonstrations to perform similarly to Perl MCE + Inline::C. When only counting primes, primes1.c now performs like algorithm3.pl. Likewise, primes3.c and primes4.codon perform like the primesieve binary and primesieve.pl.

Testing was done on a 32-core machine.

# Algorithm3

$ ./bin/algorithm3.pl 1e12
Primes found: 37607912018
Seconds: 14.711

$ ./demos/primes1.gcc 1e12
Primes found: 37607912018
Seconds: 14.499

$ ./demos/primes1.clang 1e12
Primes found: 37607912018
Seconds: 14.587

$ ./demos/primes1.nvc 1e12
Primes found: 37607912018
Seconds: 14.858

$ ./demos/primes2 1e12
Primes found: 37607912018
Seconds: 20.204

# Primesieve

$ /usr/local/bin/primesieve 1e12
Sieve size = 256 KiB
Threads = 64
100%
Seconds: 5.597
Primes: 37607912018

$ ./bin/primesieve.pl 1e12
Primes found: 37607912018
Seconds: 5.707

$ ./demos/primes3.gcc 1e12
Primes found: 37607912018
Seconds: 5.696

$ ./demos/primes3.clang 1e12
Primes found: 37607912018
Seconds: 5.767

$ ./demos/primes3.nvc 1e12
Primes found: 37607912018
Seconds: 5.841

$ ./demos/primes4 1e12
Primes found: 37607912018
Seconds: 5.719

Printing Primes

Printing prime numbers is another story. MCE workers write their output to /dev/shm in parallel and pass each chunk_id to the manager process, which emits the chunks in order. This is very fast. The C and Codon demonstrations write directly to STDOUT in order, so threads must wait their turn.
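The worker/manager hand-off described above can be sketched roughly as follows. This is only an illustration, not the MCE implementation: Python threads stand in for MCE worker processes, trial division stands in for the real sieve, and the names `write_chunk`, `print_primes`, and `tmp_root` are made up for this example. Passing `tmp_root="/dev/shm"` on Linux would mimic writing to shared memory.

```python
import os
import shutil
import sys
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def primes_in_range(lo, hi):
    """Primes in [lo, hi) by trial division; a stand-in for the real sieve."""
    return [n for n in range(max(lo, 2), hi)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]


def write_chunk(chunk_id, lo, hi, tmpdir):
    """Worker: write one chunk to its own file, then report its chunk_id."""
    with open(os.path.join(tmpdir, f"{chunk_id}.txt"), "w") as fh:
        fh.writelines(f"{p}\n" for p in primes_in_range(lo, hi))
    return chunk_id


def print_primes(limit, chunk_size=1000, out=sys.stdout, tmp_root=None):
    """Manager: copy chunk files to `out` strictly in chunk_id order."""
    starts = range(0, limit + 1, chunk_size)
    with tempfile.TemporaryDirectory(dir=tmp_root) as tmpdir, \
            ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(write_chunk, i, lo, min(lo + chunk_size, limit + 1), tmpdir)
            for i, lo in enumerate(starts)
        ]
        finished, next_id = set(), 0
        for fut in as_completed(futures):
            finished.add(fut.result())
            # Emit every chunk that is now contiguous with what was already written.
            while next_id in finished:
                path = os.path.join(tmpdir, f"{next_id}.txt")
                with open(path) as fh:
                    shutil.copyfileobj(fh, out)
                os.remove(path)
                next_id += 1


if __name__ == "__main__":
    print_primes(100, chunk_size=10)
```

Workers never block on each other here; only the manager cares about order, and it simply holds back chunks that finished early until their predecessors arrive.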

The saddest moment was seeing OpenMP burn power on threads that are merely waiting. I filed issue tickets with LLVM OpenMP and NVIDIA HPC OpenMP. IMHO, only GCC OpenMP passes in this regard, which is why the GCC builds ran faster than the Clang and NVIDIA nvc builds.
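For contrast, here is a minimal sketch of turn-taking where waiting threads sleep instead of spinning. It is not OpenMP and says nothing about how any OpenMP runtime is implemented; it only illustrates the "wait your turn without burning CPU" idea using Python's `threading.Condition`. The names `TurnGate` and `ordered_output` are hypothetical.

```python
import sys
import threading


class TurnGate:
    """Lets threads write one at a time, in increasing ticket order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._turn = 0

    def write_in_order(self, ticket, out, text):
        with self._cond:
            # wait_for() blocks until notified and the predicate holds;
            # the waiting thread sleeps rather than polling in a loop.
            self._cond.wait_for(lambda: self._turn == ticket)
            out.write(text)
            self._turn += 1
            self._cond.notify_all()


def ordered_output(chunks, out=sys.stdout, workers=4):
    """Thread t handles tickets t, t + workers, ... and writes on its turn."""
    gate = TurnGate()

    def run(first):
        for ticket in range(first, len(chunks), workers):
            text = chunks[ticket]  # real code would sieve its chunk here
            gate.write_in_order(ticket, out, text)

    threads = [threading.Thread(target=run, args=(t,)) for t in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()


if __name__ == "__main__":
    ordered_output([f"chunk {i}\n" for i in range(8)])
```

The output order is fixed by the ticket numbers, so the result is the same regardless of which thread finishes its work first.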

The output for 1e10 is 4.6 GB, so be sure to pipe it to a command (e.g. cksum) or redirect it to /dev/null.

# Algorithm3

$ ./bin/algorithm3.pl 1e10 -p >/dev/null
Seconds: 0.743

$ ./demos/primes1.gcc 1e10 -p >/dev/null
Seconds: 10.249

$ ./demos/primes1.clang 1e10 -p >/dev/null
Seconds: 12.696

$ ./demos/primes1.nvc 1e10 -p >/dev/null
Seconds: 14.326

$ ./demos/primes2 1e10 -p >/dev/null
Seconds: 12.369

# Primesieve

# the primesieve binary uses one core when -p is given
$ time /usr/local/bin/primesieve 1e10 -p >/dev/null
Seconds: 14.379

$ ./bin/primesieve.pl 1e10 -p >/dev/null
Seconds: 0.680

$ ./demos/primes3.gcc 1e10 -p >/dev/null
Seconds: 7.145

$ ./demos/primes3.clang 1e10 -p >/dev/null
Seconds: 8.826

$ ./demos/primes3.nvc 1e10 -p >/dev/null
Seconds: 11.249

$ ./demos/primes4 1e10 -p >/dev/null
Seconds: 8.597

In reply to Re: MCE Sandbox 2023-08 by marioroy
in thread MCE Sandbox 2023-08 by marioroy
