in reply to Computing pi to multiple precision
Back in 2018, I came across the article "Parallel GMP-Chudnovsky using OpenMP with factorization" and thought how cool it would be to do something similar with Perl, MCE, and Inline::C. The examples demonstrate nested parallelization using OpenMP and pthreads.
https://github.com/marioroy/Chudnovsky-Pi
The code supports GMP and MPIR. The repository also includes an extra folder containing my attempt at making GMP/MPIR's mpn_get_str parallel via divide-and-conquer. I used prime_test.cpp by Anthony Hay for testing.
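To illustrate the divide-and-conquer idea, here is a minimal Python sketch. It is not the code from the repository: the names split, convert, and to_decimal and the levels parameter are made up for this example, and Python's built-in integers stand in for GMP limbs. The number is split at a power of ten into a high and a low half, and each half can then be converted independently, so the amount of parallel work doubles at every level (1, 2, 4, then 8 tasks with levels=3).

```python
from concurrent.futures import ThreadPoolExecutor


def split(chunk):
    """Split (n, digits) at a power of ten into a high and a low chunk."""
    n, digits = chunk
    low_digits = digits // 2
    high, low = divmod(n, 10 ** low_digits)
    return (high, digits - low_digits), (low, low_digits)


def convert(chunk):
    """Convert one chunk to exactly `digits` characters, zero-padded."""
    n, digits = chunk
    return str(n).zfill(digits) if digits else ""


def to_decimal(n, levels=3):
    """Decimal string of n >= 0 using 2**levels independently converted chunks."""
    # Upper bound on the digit count; 0.30103 is slightly above log10(2).
    digits = int(n.bit_length() * 0.30103) + 1
    chunks = [(n, digits)]
    with ThreadPoolExecutor(max_workers=2 ** levels) as pool:
        for _ in range(levels):
            # Every chunk at this level is split as a separate task.
            pairs = pool.map(split, chunks)
            chunks = [half for pair in pairs for half in pair]
        pieces = list(pool.map(convert, chunks))
    # Padding can leave leading zeros on the most significant chunk.
    return "".join(pieces).lstrip("0") or "0"


if __name__ == "__main__":
    m = 2 ** 4423 - 1
    print(to_decimal(m) == str(m))
```

In CPython the threads mainly show the task structure, since the global interpreter lock prevents a real speedup; in C with OpenMP the same splitting lets each half run on its own core.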
mpn_get_str
$ cd Chudnovsky-Pi/src/extra
$ g++ -O3 -DTEST6 -fopenmp prime_test.cpp -l gmp -o prime6_test -Wno-attributes
$ time ./prime6_test
Calculating 2^n-1 for n=1257787...
Calculating 2^n-1 for n=1398269...
Calculating 2^n-1 for n=2976221...
Calculating 2^n-1 for n=3021377...
Calculating 2^n-1 for n=6972593...
Calculating 2^n-1 for n=13466917...
Calculating 2^n-1 for n=20996011...
Calculating 2^n-1 for n=24036583...
Calculating 2^n-1 for n=25964951...
Calculating 2^n-1 for n=30402457...
Calculating 2^n-1 for n=32582657...
Calculating 2^n-1 for n=37156667...
Calculating 2^n-1 for n=42643801...
Calculating 2^n-1 for n=43112609...
Calculating 2^n-1 for n=57885161...
total failures 0

real 0m11.722s
user 0m11.626s
sys 0m0.087s
parallel mpn_get_str
Run top in another window and have it refresh every 0.1 seconds. You will see the parallel demonstration use several threads, up to a maximum of 8.
$ g++ -O3 -DPARALLEL -DTEST6 -fopenmp prime_test.cpp -l gmp -o prime6_test -Wno-attributes
$ time ./prime6_test
Calculating 2^n-1 for n=1257787...
Calculating 2^n-1 for n=1398269...
Calculating 2^n-1 for n=2976221...
Calculating 2^n-1 for n=3021377...
Calculating 2^n-1 for n=6972593...
Calculating 2^n-1 for n=13466917...
Calculating 2^n-1 for n=20996011...
Calculating 2^n-1 for n=24036583...
Calculating 2^n-1 for n=25964951...
Calculating 2^n-1 for n=30402457...
Calculating 2^n-1 for n=32582657...
Calculating 2^n-1 for n=37156667...
Calculating 2^n-1 for n=42643801...
Calculating 2^n-1 for n=43112609...
Calculating 2^n-1 for n=57885161...
total failures 0

real 0m4.368s
user 0m11.966s
sys 0m0.134s
Chudnovsky Pi demonstration
$ cd Chudnovsky-Pi/src
$ make
$ make pi-gmp # requires GMP
$ make pi-mpir # requires MPIR
$ cd ../bin
pi-hobo.pl
Once the computation completes (when the "end date" line appears), mpf_get_str is called. It runs on 1 thread for a while, then 2, then 4, up to a maximum of 8 threads.
$ perl pi-hobo.pl 100000000 1 auto | md5sum
# start date = Thu Jul 28 17:50:15 2022
# terms = 7051366, depth = 24, threads = 64, logical cores = 64
  sieve    cputime =  0.27s  wallclock =  0.27s  factor =  1.0
  bs       cputime = 54.74s  wallclock =  1.00s  factor = 54.5
  sum      cputime = 29.47s  wallclock =  3.89s  factor =  7.6
  div/sqrt cputime =  8.21s  wallclock =  5.02s  factor =  1.6
  mul      cputime =  2.09s  wallclock =  2.09s  factor =  1.0
  total    cputime = 94.78s  wallclock = 12.26s  factor =  7.7
                     1.58m               0.20m
# P size = 158218296 digits (1.582183)
# Q size = 158218289 digits (1.582183)
# end date = Thu Jul 28 17:50:27 2022
969bfe295b67da45b68086eb05a8b031 -
pi-gmp
$ ./pi-gmp 100000000 1 auto | md5sum
# start date = Thu Jul 28 17:54:00 2022
# terms = 7051366, depth = 24, threads = 64, logical cores = 64
  sieve    cputime =  0.28s  wallclock =  0.28s  factor =  1.0
  bs       cputime = 55.93s  wallclock =  0.95s  factor = 58.7
  sum      cputime = 28.54s  wallclock =  3.87s  factor =  7.4
  div/sqrt cputime =  8.18s  wallclock =  5.03s  factor =  1.6
  mul      cputime =  2.13s  wallclock =  2.13s  factor =  1.0
  total    cputime = 95.07s  wallclock = 12.27s  factor =  7.8
                     1.58m               0.20m
# P size = 158218296 digits (1.582183)
# Q size = 158218289 digits (1.582183)
# end date = Thu Jul 28 17:54:12 2022
969bfe295b67da45b68086eb05a8b031 -
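For reading the timing lines, the factor column looks like cputime divided by wallclock, i.e. the average number of cores kept busy during that stage. That is an inference from the printed numbers, not something confirmed from the source code. A small Python sketch (the helper name parallel_factor is made up here) reproduces the ratio from the pi-hobo.pl figures:

```python
def parallel_factor(cputime, wallclock):
    """Average number of busy cores: CPU seconds per wall-clock second."""
    return cputime / wallclock


# (cputime, wallclock) in seconds, copied from the pi-hobo.pl run above.
stages = {
    "sieve": (0.27, 0.27),
    "bs": (54.74, 1.00),
    "sum": (29.47, 3.89),
    "div/sqrt": (8.21, 5.02),
    "mul": (2.09, 2.09),
    "total": (94.78, 12.26),
}

for name, (cpu, wall) in stages.items():
    print(f"{name:9s} factor = {parallel_factor(cpu, wall):5.1f}")
```

The bs stage comes out slightly different from the printed 54.5, presumably because the program divides unrounded timings while the values above are rounded to two decimals.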