I suspect this will be a case of "too many words" and maybe, to some degree (as I have been accused of before), of the "blind leading the blind".

I am not an 'expert' in threads (nor perl). I am just someone who has:

As such, as with everything I post, read what I have written, but make up your own mind regarding its utility and accuracy.

The following is an attempt to answer this question:

When you say that I Implicitly shares objects across threads, how is this possible?

To try to explain this, I'll use the snippet of code from your original post.

    #!/usr/bin/perl -w
    use strict;
    use threads;
    use IO::String;
    ## Point A
    my $var;
    my $io = IO::String->new($var);
    ## Point B
    my $th=threads->new(\&Update); ## Statement C
    $th->join();
    sub Update{};

By the time you reach Point A in this code, perl has already loaded all the code from the modules 'strict' (+ sub-dependencies), 'threads' (+ sub-dependencies) and 'IO::String' (+ sub-dependencies). Because they are loaded with use, all of that code has been loaded, and any package global variables created by those modules have been allocated, during the BEGIN{} phase of running your code.

By the time you reach Point B, $var & $io have also been allocated. In the case of the former, this is a simple scalar. In the case of the latter, the new() method of the IO::String class has also been run, and any variables it creates have been allocated. Finally, the scalar $io has been blessed. What that means is that as well as $io being a reference to some storage that holds the state of this instance of the IO::String class, $io now also carries (behind the scenes) pointer(s) to the byte code that implements the class methods.

When your program does something with $io, its value tells perl which instance (storage; state) of IO::String you are operating on. The hidden pointer(s) (often called 'magic') tell it where to find the code for the methods that can operate upon that instance.
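A small sketch of what blessing records. Counter is a hypothetical stand-in class (it is not in the original code) so that this runs without IO::String installed. Strictly speaking, what perl stores is the package the referent was blessed into, and method calls are looked up through that package; can() returns the code reference a call would resolve to.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Counter is a hypothetical stand-in for IO::String.
package Counter;
sub new  { my ($class) = @_; return bless { count => 0 }, $class }
sub bump { my ($self)  = @_; return ++$self->{count} }

package main;

my $plain = { count => 0 };   # an ordinary hash reference
my $obj   = Counter->new;     # same kind of storage, but blessed

print "plain: ", ( blessed($plain) // 'not blessed' ), "\n";
print "obj:   ", blessed($obj), "\n";

# The package recorded by bless is how perl finds the method code.
my $method = $obj->can('bump');
print "bump resolves to Counter::bump\n" if $method == \&Counter::bump;
print "count is now ", $obj->$method(), "\n";
```

The reference value says which state to operate on; the package says which code to run.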

When Statement C is executed, what happens (simplistically stated) is that a new copy of the interpreter is created and everything that constitutes your program in memory up to this point--i.e. everything above--is duplicated into memory allocated by that new interpreter.

In effect, this is somewhat similar to what you would get if you had forked your program at that point or, stated another way, if you had run a second copy of your program and stopped it at that same point. The difference is that unlike two separate processes, which would not be able to access the memory of the other copy, the two copies created by spawning the thread can. That is the major advantage of threads--they can communicate with each other through direct memory access rather than serialisation through pipes etc.

As stated, this is a simplification. Some of the memory allocated by the first thread is not duplicated into the second thread. The non-duplicated elements are "process global". This includes such things as file handles, some of Perl's "Special Vars", and some internal state used by Perl itself.

This duplication is not an effect of threads per se. It is the implementation chosen by the iThreads implementers. The advantage is that your simple variable $var, is now two simple variables--one for each thread. Each thread can now manipulate its copy of $var without needing to concern itself with synchronisation.
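A minimal sketch of that duplication, assuming a perl built with ithreads support. The variable contents are made up for illustration. The thread assigns to $var, but only its own copy changes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;    # needs a perl built with ithreads support

my $var = 'set by the main thread';

# The new thread starts life with its own copy of $var.
my $thr = threads->create(sub {
    $var = 'changed inside the thread';    # touches the thread's copy only
    return $var;
});
my $seen_in_thread = $thr->join;

print "thread saw: $seen_in_thread\n";
print "main sees:  $var\n";
```

The main thread still reports its original value after the join, with no locking needed anywhere.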

Equally, the object $io, has also been duplicated. The problem is that not only has the instance storage been duplicated, so has the method code. When one thread uses $io to invoke a method, the hidden pointer (magic) tells it where to look to find the code, and the reference value itself tells it what state to manipulate, and each copy of $io not only points to a different copy of the state, but its associated magic also points to a different place.
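The same effect can be seen with an object. This is only a sketch: Counter is a hypothetical class used in place of IO::String so that the example is self-contained. The thread calls methods on its clone of $obj, and the state held by the main thread's $obj does not move.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;    # needs a perl built with ithreads support

# Counter is a hypothetical stand-in for IO::String.
package Counter;
sub new  { my ($class) = @_; return bless { count => 0 }, $class }
sub bump { my ($self)  = @_; return ++$self->{count} }

package main;

my $obj = Counter->new;
$obj->bump;    # count is 1 in the main thread

my $thr = threads->create(sub {
    $obj->bump for 1 .. 5;    # operates on the thread's private clone
    return $obj->{count};
});
my $count_in_thread = $thr->join;

print "count in thread: $count_in_thread\n";
print "count in main:   $obj->{count}\n";
```

The thread ends up with a count of 6 while the main thread still sees 1: two instances, not one shared instance.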

Now, to share the simple variable $var between threads, which is after all one of the main reasons for using threads, you must designate that it is to be shared, using the my $var : shared; syntax provided by the threads::shared module.

What happens then (and please, don't take my description too literally!) is that the two copies of $var are tied. That is to say, each has a hidden pointer (magic) applied to it so that when your code modifies one copy on one thread, behind the covers, that update is also applied to the other copy. The exact mechanism by which this happens is irrelevant in that, as far as your program is concerned, you only have one copy, which all threads that can see that copy can manipulate.
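A short sketch of a shared scalar in use, assuming a perl built with ithreads support. The names $total and worker are made up for the example. Once a variable really is shared, the threads have to serialise their updates themselves, which is what lock() is for.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;            # needs a perl built with ithreads support
use threads::shared;    # provides the :shared attribute and lock()

my $total :shared = 0;

sub worker {
    for ( 1 .. 1000 ) {
        lock($total);    # held until the end of this loop body
        $total++;
    }
    return;
}

my @workers;
push @workers, threads->create( \&worker ) for 1 .. 4;
$_->join for @workers;

print "total: $total\n";
```

With the lock in place every increment lands and the total comes to 4000. Without it, increments from different threads can overwrite one another.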

The problem comes with trying to share objects. If you applied the shared attribute to $io (which you can't, because perl won't let you), then not only would the state of $io have to be replicated to every other thread's copy each time it changed, but the value of the magic would also need to be replicated. And that's impossible.
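One way to see perl refusing, sketched with a hypothetical Counter class in place of IO::String. The assumption here is that an ordinary (unshared) object is assigned to a shared scalar; threads::shared is expected to die with an "Invalid value for shared scalar" error, which the eval catches.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;            # needs a perl built with ithreads support
use threads::shared;

# Counter is a hypothetical stand-in for IO::String.
package Counter;
sub new { my ($class) = @_; return bless { count => 0 }, $class }

package main;

my $slot :shared;

# Try to store an ordinary, unshared object in the shared scalar.
my $ok  = eval { $slot = Counter->new; 1 };
my $err = $@;

print $ok ? "stored\n" : "refused: $err";
```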

Understanding why you can't share the code (methods) that implement a class between two threads is complex, but here is a simplified example. Say you had a class that had a settable separator or terminator (think $/ for an IO class or ',' for a CSV class). You create an instance on one thread and set the separator to ','. On another thread you set it to '|'. If you could share an instance of this class between the two threads, you would have a conflict.

This could notionally be alleviated by always storing a copy of the CLASS data in every instance, but then each time you modified the CLASS data, the class would have to seek out each of its instances and update that CLASS value. If you store the class data with the class, then when you tried to use an instance created on one thread with an instance created on another, you would get the conflict. Is this a comma separated instance or a pipe separated instance? The problems run much deeper than this.

However, that doesn't mean that you can't use threads and objects. It just means that you must not share objects between threads.
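A sketch of that pattern, under the assumption that each thread only needs to hand a result back: the object is created, used and discarded inside the thread, and only plain data crosses the thread boundary via join. Counter and worker are hypothetical names for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;    # needs a perl built with ithreads support

# Counter is a hypothetical stand-in for whatever class you need.
package Counter;
sub new  { my ($class) = @_; return bless { count => 0 }, $class }
sub bump { my ($self)  = @_; return ++$self->{count} }

package main;

sub worker {
    my ($times) = @_;
    my $counter = Counter->new;    # lives and dies inside this thread
    $counter->bump for 1 .. $times;
    return $counter->{count};      # hand back plain data, not the object
}

my @threads;
push @threads, threads->create( \&worker, $_ ) for 1 .. 3;
my @counts = map { $_->join } @threads;

print "counts: @counts\n";
```

Each thread owns its object outright, so nothing needs sharing or locking.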

It also means that using require to load modules only in those threads where the module will be used will save memory compared with use-ing them, as it avoids them being duplicated into threads that don't need them. I'll try to offer a solution to your actual problem in a separate reply.
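The point about require can be checked with %INC, which records the modules an interpreter has loaded. This is a sketch only; Time::Local is picked purely as an example of a core module that nothing else in the script loads.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;    # needs a perl built with ithreads support

my $thr = threads->create(sub {
    require Time::Local;    # loaded at run time, in this thread only
    return exists $INC{'Time/Local.pm'} ? 'loaded' : 'not loaded';
});
my $in_thread = $thr->join;

# The main thread never loaded the module, so it never paid for it.
my $in_main = exists $INC{'Time/Local.pm'} ? 'loaded' : 'not loaded';

print "in thread: $in_thread\n";
print "in main:   $in_main\n";
```

Had the script said use Time::Local; at the top instead, the module would have been loaded before any thread was created and cloned into every one of them.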


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algoritm, algorithm on the code side." - tachyon

In reply to Re^3: Problem with using threads with modules. by BrowserUk
in thread Problem with using threads with modules. by tele2mag
