in reply to Re^4: XS module in ithreads Perl much slower in threads::join after adding SvOBJECT_off
in thread XS module in ithreads Perl much slower in threads::join after adding SvOBJECT_off

I searched the web and came across this page where the developer tries SvOBJECT_off. Stas states, "So we get all kind of problems when automatically dereferencing it."

    SvOK_off(sv);
    SvIVX(sv) = 0;
    SvOBJECT_off(sv);

It is heartening (given how frustrating it can be at times to make things thread-safe) to read, "A working solution is needed to make mp2 API perl-ithreads-safe as it's not at the moment, ...".

Stas settled on the following instead.

    SV *sv = SvRV(obj);
    if (sv) {
        /* detach from the C struct and invalidate */
        mg_free(sv);      /* remove any magic */
        SvFLAGS(sv) = 0;  /* invalidate the sv */
    }

I tried replacing SvOBJECT_off with mg_free in PDL 2.076.

    /* Clear the sv field so that there will be no dangling ptrs */
    if (it->sv) {
        // SvOBJECT_off((SV *)it->sv); /* problematic, issue #385 */
        mg_free(it->sv); /* remove any magic instead */
        sv_setiv(it->sv,0x4242);
        it->sv = NULL;
    }

etj, will that work? The Strassen demonstrations work fine, and I see no adverse effects during global destruction.

Re^6: XS module in ithreads Perl much slower in threads::join after adding SvOBJECT_off
by etj (Priest) on Mar 01, 2022 at 21:30 UTC
    I have implemented this on the current git master branch, and intend to release it very soon, after I have made a couple more tweaks to the demos system, which has finally been overhauled. Thanks for the amazing research!

      Thank you for the enlightenment on using PDL::LinearAlgebra::Real. I updated the examples.

      Passing a flag to the script will attempt to load PDL::LinearAlgebra::Real.
      If available, PDL::LinearAlgebra::Real computes faster via LAPACK/OpenBLAS.
      Use PDL 2.077 or later for best results. Also check that OpenBLAS is OpenMP-enabled:
      pkg-config --variable=openblas_config openblas | grep -c USE_OPENMP
      
      perl matmult_base.pl  4096        # 54.685s built-in matrix multiply
      perl matmult_base.pl  4096 1      #  6.706s LAPACK/OpenBLAS 1 thread
      perl matmult_base.pl  4096 4      #  1.727s LAPACK/OpenBLAS 4 threads
      
      perl matmult_mce_d.pl 4096 4      # 12.468s built-in matrix multiply
      perl matmult_mce_d.pl 4096 4 1    #  1.915s LAPACK/OpenBLAS 4 threads
      
      perl matmult_mce_f.pl 4096 4      # 11.950s built-in matrix multiply
      perl matmult_mce_f.pl 4096 4 1    #  1.836s LAPACK/OpenBLAS 4 threads
      
      perl matmult_mce_t.pl 4096 4      # 12.245s built-in matrix multiply
      perl matmult_mce_t.pl 4096 4 1    #  1.856s LAPACK/OpenBLAS 4 threads
      
      perl matmult_simd.pl  4096 4      # 16.136s built-in matrix multiply
      perl matmult_simd.pl  4096 4 1    #  1.763s LAPACK/OpenBLAS 4 threads
      
      perl strassen_07_f.pl 4096        #  3.516s built-in matrix multiply
      perl strassen_07_f.pl 4096 1      #  1.915s LAPACK/OpenBLAS 7 threads
      
      perl strassen_07_t.pl 4096        #  3.658s built-in matrix multiply
      perl strassen_07_t.pl 4096 1      #  2.072s LAPACK/OpenBLAS 7 threads
      

      Look at matmult_base.pl go :) This is possible with OpenMP-enabled LAPACK/OpenBLAS libs.
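
      For anyone who wants to script the pkg-config check mentioned above before running the benchmarks, here is a minimal sketch. The function name has_openmp is my own (hypothetical), and it does the same plain substring search as the grep command, so it does not inspect the value after USE_OPENMP. The config string is printed so you can verify it by eye if your build always emits the key.

```shell
#!/bin/sh
# Hypothetical helper: succeed (return 0) if the given OpenBLAS config
# string contains the substring USE_OPENMP, mirroring the grep check.
has_openmp() {
    case "$1" in
        *USE_OPENMP*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Query pkg-config; the result is empty if pkg-config or openblas.pc is missing.
openblas_cfg=$(pkg-config --variable=openblas_config openblas 2>/dev/null || true)

if has_openmp "$openblas_cfg"; then
    echo "OpenBLAS config mentions USE_OPENMP: $openblas_cfg"
else
    echo "USE_OPENMP not found (OpenBLAS missing or built without OpenMP?)"
fi
```

      If the check fails, the LAPACK/OpenBLAS runs may still work but will likely not scale with the thread count the way the timings above do.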