Problem with using threads with modules.

tele2mag has asked for the wisdom of the Perl Monks concerning the following question:

Re: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 21, 2004 at 22:20 UTC
The problem arises because you are (implicitly) trying to share objects across threads, which isn't possible. However, that doesn't stop you from doing what you are trying to do. It just means that you must adapt your programming style to account for it.

As I understand it, you want to write to a 'file', represented by an IO::String object, on one thread and then access the 'output', the string, in another thread. That can be done this way:

```perl
#! perl -slw
use strict;
use threads;
use threads::shared;

my $var : shared;

sub Update{
    require 'IO/String.pm';
    my $io = IO::String->new( $var );
    $io->print( 'Hello world' );
    $io->print( $_ ) for 1 .. 10;
};

my $th=threads->new( \&Update );
$th->join();

print "'$var'";

__END__
P:\test>368516
'Hello world
1
2
3
4
5
6
7
8
9
10
'
```

There are a couple of things to note here:

- The IO::String object is not shared between threads. The scalar `$var` is shared.
- The IO::String object `$io` is created inside the thread that will use it and is only called from within that thread.
- I have `require`d the module into the thread that will use it rather than `use`-ing it. This is an optimisation: it saves all the code from IO::String being duplicated into every thread created by the program. It isn't strictly necessary. You could `use IO::String;` in the normal way, provided any IO::String objects you create are only called from the thread in which they were created.

That said, if you have a choice, there are better ways of sharing data between threads. If you need a filehandle because you are passing it to some other module for it to write to, then you might be better off using a socket rather than an in-memory file this way.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
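As an illustration of one of the "better ways of sharing data between threads" mentioned above, here is a minimal sketch that passes lines from one thread to another through a queue. It assumes the core Thread::Queue module and a perl built with thread support. The helper name `collect_via_queue` and the use of `undef` as an end-of-data marker are choices made for this example, not something taken from the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper: a producer thread sends each item through a queue
# and the calling thread collects them in order.
sub collect_via_queue {
    my @items = @_;
    my $queue = Thread::Queue->new;

    my $producer = threads->create( sub {
        $queue->enqueue( $_ ) for @items;
        $queue->enqueue( undef );    # sentinel chosen for this sketch: no more data
    } );

    my @received;
    # dequeue blocks until an item is available; undef ends the loop
    while ( defined( my $item = $queue->dequeue ) ) {
        push @received, $item;
    }

    $producer->join;
    return @received;
}

print "$_\n" for collect_via_queue( 'Hello world', 1 .. 10 );
```

Only plain data crosses the thread boundary here. Any object that does the real work (an IO::String, a Telnet session) would still be created and used inside a single thread, as in the code above.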
Re^2: Problem with using threads with modules. by tele2mag (Initiate) on Jun 22, 2004 at 14:17 UTC
First of all, I want to thank you for your help :) I have been struggling with this for a while now.

When you say that I implicitly share objects across threads, how is this possible? If I don't pass an object as an argument or use an external global object in the thread, how can it share them? Are all objects that are in the same scope (i.e. subroutines, class) automatically shared without me using them? :\

On to the program. I have included my original program and a working one that is not what I first had in mind, and I don't know why it works; that frightens me a bit ;). Your code is close to what I wanted, but it will not work in my case. The Net::Telnet socket is the one that writes to `$io`, so I can't access it because of the object-sharing issue you described. Instead, the `Update()` thread uses `$var` to update the screen, and the Telnet socket output is connected to `$io` in the main thread. With the small test program it was impossible to know how the program was composed, so I have included my code, stripped of all special features for easy reading.

This is how it is supposed to work. This class is called from another Perl module, like `run("192.168.0.1","23")`. It uses Telnet to extract information from various devices, shows it on screen, and logs it to disk with `FileIO::save_log("traffic_log.txt",$var)`, using `$var` as intermediate storage. The Net::Telnet socket is connected to `$var` through `$sock->input_log($io)`. Later on, the program is going to have several instances of `run(...)` working in parallel using threads. Each instance has an `Update()` thread that updates the screen with its traffic. For all the devices I am going to have several `program1()` subroutines. Because of all the instances of `run(...)` threads, I want as few class variables as possible, and I declare them inside the `run(...)` subroutine.

This is what the original program looked like. (The code, about 3 kB, is collapsed in the original post and not shown here.)

Now to the working one.
The changes are:

- The IO::String is moved to `program1()`, because having it in `run(...)` (with the thread creation) caused my warnings.
- As a result, I had to move `$var` out of `run(...)` to be a class variable.
- `Update()` can now read `$var` directly without it being passed as an argument.

Here is the working code. (The code, about 3 kB, is collapsed in the original post and not shown here.)

This code works, but it is not so good if I want to create multiple threads of this class. On the other hand, maybe I am not doing it the right way... As usual, all input is highly appreciated. Thanks again for the help! ;)
Re^3: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 23, 2004 at 00:19 UTC
I suspect this will be a case of "too many words", and maybe to some degree (as I have been accused of before) the "blind leading the blind". I am not an 'expert' in threads (nor Perl). I am just someone who:

- Has used threads outside of Perl and understands them at the OS level (to some degree).
- Has followed the progress of iThreads fairly closely.
- Hasn't written them off, as so many have done.
- Chooses to use an OS where fork is not supported natively, and where using Perl's pseudo-fork emulation of that idiom (implemented with ithreads) gives few if any of the benefits that forking has on OSs that support it natively and, in doing so, discards the main benefits of threads.
- Is attempting to "do his bit" by exploring what is and is not achievable with iThreads.

As such, as with everything I post, read what I have written, but make up your own mind regarding its utility and accuracy.

The following is an attempt to answer this question:

"When you say that I implicitly share objects across threads, how is this possible?"

To try to explain this, I'll use the snippet of code from your original post.

```perl
#!/usr/bin/perl -w
use strict;
use threads;
use IO::String;

## Point A
my $var;
my $io = IO::String->new($var);

## Point B
my $th=threads->new(\&Update);   ## Statement C
$th->join();

sub Update{};
```

By the time you reach Point A in this code, perl has already loaded all the code from the modules 'strict', 'threads' and 'IO::String' (each with its sub-dependencies). With `use`, all this code is loaded, and any package global variables created by those modules are allocated, during the BEGIN{} phase of running your code.

By the time you reach Point B, $var and $io have also been allocated. In the case of the former, this is a simple scalar. In the case of the latter, the `new()` method of the IO::String class has also been run, and any variables it creates have been allocated. Finally, the scalar $io has been blessed.
What that means is that, as well as $io being a reference to some storage that holds the state of this instance of the IO::String class, $io now also carries (behind the scenes) pointer(s) to the byte code that implements the class methods. When your program does something with $io, its value tells perl which instance (storage; state) of IO::String you are operating on. The hidden pointer(s) (often called 'magic') tell it where to find the code for the methods that can operate upon that instance.

When Statement C is executed, what happens (simplistically stated) is that a new copy of the interpreter is created, and everything that constitutes your program in memory up to this point (i.e. everything above) is duplicated into memory allocated by that new interpreter.

In effect, this is somewhat similar to having forked your program at that point or, stated another way, to having run a second copy of your program and stopped it at that same point. The difference is that, unlike two separate processes, which would not be able to access each other's memory, the two copies created by spawning the thread can. That is the major advantage of threads: they can communicate with each other through direct memory access rather than through serialisation over pipes etc.

As stated, this is a simplification. Some of the memory allocated by the first thread is not duplicated into the second thread. The non-duplicated elements are "process global". This includes such things as file handles, some of Perl's "Special Vars", and some internal state used by Perl itself.

This duplication is not an effect of threads per se. It is the implementation chosen by the iThreads implementers. The advantage is that your simple variable $var is now two simple variables, one for each thread. Each thread can now manipulate its copy of $var without needing to concern itself with synchronisation. Equally, the object $io has also been duplicated.
The problem is that not only has the instance storage been duplicated, so has the method code. When one thread uses $io to invoke a method, the hidden pointer (magic) tells it where to look to find the code, and the reference value itself tells it what state to manipulate. Each copy of $io not only points to a different copy of the state, but its associated magic also points to a different place.

Now, to share the simple variable $var, which is after all one of the main reasons for using threads, you must designate that it is to be shared using the `my $var : shared;` nomenclature. What happens then (and please, don't take my description too literally!) is that the two copies of $var are tied. That is to say, each has a hidden pointer (magic) applied to it so that when your code modifies one copy on one thread, behind the covers that update is also applied to the other copy. The exact mechanism by which this happens is irrelevant in that, as far as your program is concerned, you only have one copy, which all threads that can see it can manipulate.

The problem comes with trying to share objects. If you applied the shared attribute to $io (which you can't, because it won't let you), then not only would the state of $io have to be replicated to every other thread's copy each time it changed, but the value of the magic would also need to be replicated. And that's impossible.

Why you can't share the code (methods) that implement a class between two threads is complex to explain, but here is a simplified example. Say you had a class that had a settable separator or terminator (think $/ for an IO class or ',' for a CSV class). You create an instance on one thread and set the separator to ','. On another thread you set it to '|'. If you could share an instance of this class between the two threads, you would have a conflict.
This could notionally be alleviated by always storing a copy of the CLASS data in every instance, but then each time you modified the CLASS data, the class would have to search out each of its instances and update that CLASS value. If you store the class data with the class, then when you try to use an instance created on one thread together with an instance created on another, you get the conflict. Is this a comma-separated instance or a pipe-separated instance?

The problems run much deeper than this. However, that doesn't mean that you can't use threads and objects. It just means that you must not share objects between threads. It also means that using `require` to load modules only into those threads where the module will be used will save memory over `use`-ing them, as it avoids them being duplicated into threads that don't need them.

I'll try to offer a solution to your actual problem in a separate reply.
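The difference between a duplicated variable and a shared one, as described in this reply, can be observed with a small experiment. This is only a sketch: the sub name `copy_vs_shared` and the variable names are invented for the demonstration, and it assumes a perl built with thread support.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical demo: a thread assigns to both variables, then the parent
# reports what it sees after the thread has finished.
sub copy_vs_shared {
    my $private = 'original';            # duplicated into the new thread
    my $shared : shared = 'original';    # one value visible to every thread

    threads->create( sub {
        $private = 'changed in child';   # touches only the child's copy
        $shared  = 'changed in child';   # visible to the parent as well
    } )->join;

    return ( $private, $shared );
}

my ( $seen_private, $seen_shared ) = copy_vs_shared();
print "private: '$seen_private'\n";
print "shared:  '$seen_shared'\n";
```

The parent should still see `'original'` in the unshared variable, because the child only ever modified its own copy, while the shared variable reflects the child's assignment.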
Re^4: Problem with using threads with modules. by tele2mag (Initiate) on Jun 23, 2004 at 19:02 UTC
Re^5: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 23, 2004 at 20:45 UTC
Re^3: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 23, 2004 at 03:34 UTC
There are a couple of problems I see with what you are doing.

1. Your `Update()` routine splits the contents of $var into lines and then prints them to STDOUT, but never clears that variable. This means that each time `Update()` loops, it will be re-processing all the data it has previously processed along with any new data that has been added.
2. You're starting a new thread running `Update()` for each `run()` thread you are creating. Each of these threads is accessing the same copy of $var and outputting to the same screen. Each copy of `Update()` will therefore be processing (and re-processing) the same data, and repetitiously outputting it to the same screen.

If, as is suggested by your code, you simply want all the logging output from all your Telnet sessions to be logged to STDOUT, why not just do `$sock->input_log( \*STDOUT );`? That way, the module will log directly to STDOUT and you don't need all the extra threads or code.

If you need to do some pre-processing of the logged data prior to output, then your use of IO::String may make some sense, but you still don't need one `Update()` thread per `run()` thread. Just a single `Update()` thread is sufficient. You would still need to correct problem 1 above.

That said, if you are using 5.8.x for this (and you should be, as threads were less than complete prior to 5.8.3), then it may well be better to use the "in-memory files" ability of Perl's `open` to avoid the need for this module at all. (Note: I haven't tried this in conjunction with threads yet!)

I would (probably) use the original (main) thread for this purpose, though since your code implies that this is all embedded in a module, it is unclear to me how the module would be used, so I reserve judgment on that.

Finally, you are still loading Net::Telnet with `use`. As previously explained, this means that every thread in your application will load a copy of it, including those that have no need for it.
If you reduce the number of `Update()` threads to one, and your application doesn't create any other threads besides the `run()` threads and the one `Update()` thread, this probably wouldn't make a big difference to the size of your app. But as coded, half of all your threads will carry a copy of Net::Telnet but never make use of it.

As always, with a clearer understanding of what the overall aim of your app is, it would be easier to make suggestions.
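A rough sketch of how the two problems described in this reply might be addressed together: a single `update()` thread that takes the accumulated text and empties the shared buffer in one locked step, so that nothing is processed twice. Everything here is an assumption made for illustration. `$buffer` stands in for the shared `$var`, the subs `append_log`, `finish` and `update` and the `$done` flag are hypothetical, and the writer threads simply append fixed strings where the real program would have Telnet sessions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

my $buffer : shared = '';   # stands in for the shared log variable
my $done   : shared = 0;    # hypothetical flag: set once all writers have finished

# Append text to the buffer and wake the update thread.
sub append_log {
    my ($text) = @_;
    lock( $buffer );
    $buffer .= $text;
    cond_signal( $buffer );
}

# Tell the update thread that no more data will arrive.
sub finish {
    lock( $buffer );
    $done = 1;
    cond_signal( $buffer );
}

# Single consumer: take what has accumulated, clear the buffer while it is
# still locked, then process the chunk outside the lock.
sub update {
    my $lines = 0;
    while (1) {
        my ( $chunk, $finished );
        {
            lock( $buffer );
            cond_wait( $buffer ) until length( $buffer ) || $done;
            $chunk    = $buffer;
            $buffer   = '';          # cleared, so nothing is re-processed
            $finished = $done;
        }
        for my $line ( split /\n/, $chunk ) {
            print "log: $line\n";
            $lines++;
        }
        return $lines if $finished;
    }
}

my $updater = threads->create( \&update );
my @writers = map {
    my $id = $_;
    threads->create( sub { append_log( "writer $id line $_\n" ) for 1 .. 3 } );
} 1 .. 2;

$_->join for @writers;
finish();
my $total = $updater->join;
print "processed $total lines\n";
```

Because each writer appends whole lines while holding the lock, every chunk the update thread takes ends on a line boundary, and each line is printed exactly once regardless of how the threads interleave.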
Re^4: Problem with using threads with modules. by tele2mag (Initiate) on Jun 23, 2004 at 21:41 UTC
Re^5: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 24, 2004 at 08:08 UTC