in reply to Externally managed threads using embedded Perl

The answer varies slightly depending upon which version of perl you are using as a base, but the basic answer to question 1. is "No", which answers the other two questions.

With perl releases greater than 5.8.0, perl itself uses ithreads (1 thread == 1 interpreter) internally to implement its threading support. There are no internal mechanisms to prevent the inevitable internal corruption that will ensue from calling one interpreter concurrently from more than one thread.

Prior to 5.8.0, back as far as 5.005, perl supported pthreads, where multiple perl-internal threads use a single copy of the interpreter. That was never designed to be used with external threads calling into the interpreter concurrently, and it is most unlikely that you would achieve good results by trying.
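To make the "1 thread == 1 interpreter" rule concrete, here is a minimal sketch of a wrapper that remembers which thread created an interpreter and refuses calls from any other thread. It uses no real Perl API: `FakeInterpreter`, `OwnedInterpreter`, `foreign_call_is_rejected` and the sub name `process_message` are all hypothetical stand-ins, so treat it as an illustration of the discipline rather than embedding code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for a PerlInterpreter; no real Perl API is used here.
struct FakeInterpreter {
    int calls = 0;
};

// Binds one interpreter to the thread that created it and rejects
// calls that arrive from any other thread.
class OwnedInterpreter {
public:
    OwnedInterpreter() : owner_(std::this_thread::get_id()) {}

    // Pretends to call a sub; throws if the caller is not the owning thread.
    // Returns the number of calls made so far.
    int call(const std::string& sub_name) {
        if (std::this_thread::get_id() != owner_) {
            throw std::logic_error(
                "interpreter entered from a foreign thread: " + sub_name);
        }
        return ++interp_.calls;
    }

private:
    std::thread::id owner_;
    FakeInterpreter interp_;
};

// Returns true if a call made from a second thread is rejected.
bool foreign_call_is_rejected(OwnedInterpreter& interp) {
    bool rejected = false;
    std::thread other([&] {
        try {
            interp.call("process_message");
        } catch (const std::logic_error&) {
            rejected = true;
        }
    });
    other.join();
    return rejected;
}
```

A real embedding gets no such check from perl itself, which is why the caller has to give each thread an interpreter of its own.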


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!

Re: Re: Externally managed threads using embedded Perl
by Anonymous Monk on Jan 08, 2004 at 03:58 UTC

    OK, thanks for that. I am happy to run with Perl 5.8 or later as a requirement.

    So it seems I do this by having a single interpreter that I perl_clone(my_perl, CLONEf_CLONE_HOST) for each thread, throwing away the clone when the thread ends and/or pulling from a pool of pre-existing clones?

    A good analogy for what this is doing is a basic web server (note: this is not a web server - I'm not that stupid), i.e. short-lived requests serviced over a network connection. Is perl_clone going to be fast, or do I need to do some sort of thread pooling (which I may do anyway) to get it to perform? (Or should I stop being so damn lazy and just test the performance myself ;-))

    Or, do I want to learn Perl and try to make it all happen with Perl threads (which will involve learning Perl - I didn't write the script we want to execute)?

    Thanks again,

    Phil

      Reading between the lines of what you've told us, you have a pre-written perl script that you want to be able to run on behalf of networked users on a single machine, with concurrent access, but no sharing of data between the instances? And you aren't a perl programmer :)

      It really will depend on how the pre-existing perl script runs. I'm assuming that the script returns its results via stdout?

      If this is the case, cloning an interpreter for each request, or building a pool of clones, would probably work ok. I haven't done enough with embedding (nothing beyond the simple examples in perlembed) to be able to predict the performance. Pre-cloning a pool, and returning a "busy...try again" message if the pool is fully utilised, ought to be fast enough if the loading isn't too extreme.
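      The pre-cloned pool with a "busy...try again" fallback could be sketched roughly as below. This is only an illustration of the pool bookkeeping: `FakeInterpreter` is a hypothetical stand-in for a pre-cloned interpreter, and `InterpreterPool` and `handle_request` are names made up for the example; no Perl API is called.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a pre-cloned PerlInterpreter.
struct FakeInterpreter {
    int id;
};

// Fixed-size pool: interpreters are created up front and handed out one at a time.
class InterpreterPool {
public:
    explicit InterpreterPool(std::size_t size) {
        for (std::size_t i = 0; i < size; ++i) {
            idle_.push_back(std::unique_ptr<FakeInterpreter>(
                new FakeInterpreter{static_cast<int>(i)}));
        }
    }

    // Returns an idle interpreter, or nullptr when the pool is fully utilised.
    std::unique_ptr<FakeInterpreter> try_acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (idle_.empty()) return nullptr;
        std::unique_ptr<FakeInterpreter> interp = std::move(idle_.back());
        idle_.pop_back();
        return interp;
    }

    // Puts an interpreter back so another request can use it.
    void release(std::unique_ptr<FakeInterpreter> interp) {
        std::lock_guard<std::mutex> lock(mutex_);
        idle_.push_back(std::move(interp));
    }

    std::size_t idle_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return idle_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<FakeInterpreter>> idle_;
};

// Services one request, or reports "busy" instead of blocking.
std::string handle_request(InterpreterPool& pool) {
    std::unique_ptr<FakeInterpreter> interp = pool.try_acquire();
    if (!interp) return "busy...try again";
    // ... the real work with the interpreter would happen here ...
    pool.release(std::move(interp));
    return "ok";
}
```

      Returning immediately when the pool is empty keeps the request thread from queueing up behind slow requests; whether that is acceptable depends on the load, which would need measuring.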

      Personally, I would probably use a thread-pool design using threads, or maybe a pre-forking design written in perl using perl's win32 pseudo-fork support, as I find perl so much more productive than C/C++.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
      Hooray!

        Yep, that's pretty much it.

        The existing script is SpamAssassin's spamd. It is designed to run on *nix and uses fork and signals and other stuff that don't work too well on Win32. It reads and writes from/to a socket.

        I find C++ much more productive than Perl - I guess it's just a question of what you know <g>.

        I'll do some work with cloning and see what happens.

        BTW, if I:
        my $spamtest = SpamAssassin::Mail::SpamAssassin->new(...) : shared;
        will $spamtest be available in the clone if I perl_clone after this Perl code is executed?

        You've been most helpful. Thanks.

        I'll second the suggestion to drop C/C++ here unless you have a compelling reason.

        You may also want to drop the entire service wrapper, unless you specifically need this program to be controlled like a Windows service. If you just need it to start when the machine does, and you're using Windows 2000 or later, you can just make it a scheduled task that runs "on system startup". The program you launch at that point can use pthreads, pseudo-fork, or whatever else you'd like. Depending on load, it might be interesting to look at the inetd available in Cygwin also.

        --
        Spring: Forces, Coiled Again!
        I'm back, having done a lot of experimental work. I have more questions that I would like answered. First, some code:

            EXTERN_C void xs_init (pTHX);

            using namespace std;

            CPerlEngine::CPerlEngine(char* pScriptFile)
                : mInterpreter(NULL), mScriptFile(pScriptFile)
            {
                this->mInterpreter = ::perl_alloc();
                assert (this->mInterpreter != NULL);
                PERL_SET_CONTEXT(this->mInterpreter);
                ::perl_construct(this->mInterpreter);
                char* theArguments[] = {"-x", "-S", "-s", pScriptFile};
                ::perl_parse(this->mInterpreter, &xs_init, 4, theArguments, NULL);
                PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
                ::perl_run(this->mInterpreter);
            }

            CPerlEngine::~CPerlEngine(void)
            {
                ::perl_destruct(this->mInterpreter);
                ::perl_free(this->mInterpreter);
            }

            void CPerlEngine::invoke(const char* pFunctionName, vector<string> pParameters)
            {
                assert (NULL != this->mInterpreter);
                PERL_SET_CONTEXT(this->mInterpreter);

                // Pick up all the stack info in *this* threads local storage
                dTHX;

            #ifdef PERL_CLONE_WORKS
                PerlInterpreter* newInterpreter = ::perl_clone(this->mInterpreter,
                    CLONEf_COPY_STACKS | CLONEf_KEEP_PTR_TABLE | CLONEf_CLONE_HOST);
                // PerlInterpreter* newInterpreter = ::perl_clone(this->mInterpreter, CLONEf_CLONE_HOST);
                // PerlInterpreter* newInterpreter = ::perl_clone(this->mInterpreter, NULL);
            #else
                PerlInterpreter* newInterpreter = this->mInterpreter;
            #endif
                assert (NULL != newInterpreter);
                ::perl_run(newInterpreter);

                dSP;
                ENTER;
                SAVETMPS;
                PUSHMARK(SP);
                for (vector<string>::iterator theIterator = pParameters.begin();
                     theIterator != pParameters.end();
                     theIterator++)
                {
                    if (theIterator->length() > 0)
                    {
                        XPUSHs(::newSVpv(theIterator->c_str(), theIterator->length()));
                    }
                }
                PUTBACK;
                ::call_pv(pFunctionName, G_DISCARD);
                FREETMPS;
                LEAVE;

            #ifdef PERL_CLONE_WORKS
                ::perl_free(newInterpreter);
            #endif
            }

        This is my C++ class wrapping the Perl interpreter(s). It's pretty simple. You instantiate it, it loads the specified script, and runs all of the global bits (please excuse any incorrect terminology). So far so good. The global bits are essentially a whole heap of "use blah" type statements. I think there's value in this because it results in a Perl interpreter with script loaded and references loaded - ready to be cloned and executed at will.

        The idea, then, is to call invoke(...) passing the name of the Perl sub to call and some arbitrary number of string elements. This is all pretty cool and seems to work in multiple threads with a basic Perl script.

        If I don't define PERL_CLONE_WORKS, the whole thing works like a bought one (actually better than many) but only in a single thread (obviously).

        Now imagine I define PERL_CLONE_WORKS, instantiate CPerlEngine in one thread, and call invoke(...) on a separate thread. The new thread comes along, clones the existing interpreter, and then calls my sub. This all works perfectly with a basic script BUT with a more complex script (my cut-down spamd) I get a runtime crash out of the Perl Engine in VMem::Free(void* pMem) where it says

        Perl_warn(aTHX_ "Free to wrong pool %p not %p",this,ptr->owner);
        Note that this is not the global cleanup error and it's happening a long time before the end of the script and a reasonable distance into the script.
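        Going only by the text of that warning (it compares `this` with `ptr->owner`), the check behind it can be pictured as a per-interpreter allocator that stamps each block with the pool that allocated it and complains when a different pool is asked to free it. The sketch below is a guess at the shape of such a check, not a copy of perl's VMem code; `Pool` and `BlockHeader` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

class Pool;

// Header stored in front of every block so the allocating pool can be
// identified at free time. Aligned so the user data after it stays aligned.
struct alignas(std::max_align_t) BlockHeader {
    Pool* owner;
};

// Hypothetical per-interpreter memory pool.
class Pool {
public:
    // Allocates a block and records this pool as its owner.
    void* allocate(std::size_t bytes) {
        void* raw = std::malloc(sizeof(BlockHeader) + bytes);
        if (!raw) throw std::bad_alloc();
        BlockHeader* header = static_cast<BlockHeader*>(raw);
        header->owner = this;
        return header + 1;  // user data starts just after the header
    }

    // Frees the block only if this pool allocated it. Returns false,
    // freeing nothing, in the "free to wrong pool" case.
    bool release(void* user_ptr) {
        BlockHeader* header = static_cast<BlockHeader*>(user_ptr) - 1;
        if (header->owner != this) return false;
        std::free(header);
        return true;
    }
};
```

        In this picture, the warning would fire when memory allocated under one interpreter ends up being released while a different interpreter (for example a clone) is the current one.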

        I think you (BrowserUk) have seen this before in different circumstances.

        My fallback plan is to create a whole new PerlInterpreter using ::perl_alloc and load absolutely everything from scratch each time, or to have my pool of interpreters (as previously discussed).

        The value in being able to clone on the fly like this is that it won't need to have 50 interpreters lying around consuming vast amounts of memory waiting for something to come in.

        Now to my questions:

        1. Is my class doing everything it should do? If not, what else should it be doing?
        2. I've tried all combinations of flags on ::perl_clone and none work. For future reference, what flags should I be using?
        3. What is "Free to wrong pool" telling me?

        We're getting closer at least. Another couple of weeks of this sort of questioning and I might get there <g>.

        Phil