Re: Problem with using threads with modules.

Re^2: Problem with using threads with modules. by tele2mag (Initiate) on Jun 22, 2004 at 14:17 UTC
First of all, I want to thank you for your help :) I have been struggling with this for a while now.

When you say that I implicitly share objects across threads, how is this possible? If I don't pass an object as an argument or use an external global object in the thread, how can it share them? Are all objects that are in the same scope (i.e. subroutines, class) automatically shared without me using them? :\

To the program. I have included my original program and a working one that is not what I first had in mind, and I don't know why it works, which frightens me a bit ;).

Your code is close to what I wanted, but it will not work in my case. The Net::Telnet socket is the one that writes to `$io`, so I can't access it because of the issue of sharing objects you told me about. Instead, the `Update()` thread uses `$var` to update the screen, and the Telnet socket output is connected to `$io` in the main thread. With the small test program it was impossible to know how the program was composed, so I have included my code and stripped all special features for easy reading.

This is how it is supposed to work. This class is called from another Perl module like `run("192.168.0.1","23")`. It uses Telnet to extract information from various devices, shows it on screen and logs it to disk with `FileIO::save_log("traffic_log.txt",$var)`, using `$var` as intermediate storage. The Net::Telnet socket is connected to `$var` through `$sock->input_log($io)`. Later on, the program is going to have several instances of `run(...)` working in parallel by using threads. Each instance has an `Update()` thread that updates the screen with its traffic. For all the devices I am going to have several `program1()` subroutines. Because of all the instances of `run(...)` threads, I want as few class variables as possible, and I declare them inside the `run(...)` subroutine.

This is what the original program looked like. Read more... (3 kB)

Now to the working one. The changes are:

- The IO::String is moved to `program1()`, because having it in `run(...)` (with the thread creation) caused my warnings.
- As an effect, I had to move `$var` out of `run(...)` to be a class variable.
- `Update()` can now read `$var` directly without it being passed as an argument.

Here is the working code: Read more... (3 kB)

This code works, but it is not so good if I want to create multiple threads of this class. On the other hand, maybe I am not doing it the right way... As usual, all input is highly appreciated. Thanks again for the help! ;)
Re^3: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 23, 2004 at 00:19 UTC
I suspect this will be a case of "too many words", and maybe to some degree (as I have been accused of before), the "blind leading the blind". I am not an 'expert' in threads (nor Perl). I am just someone who:

- Has used threads outside of Perl and understands them at the OS level (to some degree).
- Has followed the progress of iThreads fairly closely.
- Hasn't written them off as so many have done.
- Chooses to use an OS where fork is not supported natively, and where using Perl's pseudo-fork emulation of that idiom (implemented with iThreads) gives few if any of the benefits that forking has on OSs that support it natively and, in doing so, discards the main benefits of threads.
- Is attempting to "do his bit" by exploring what is and is not achievable with iThreads.

As such, as with everything I post, read what I have written, but make up your own mind regarding its utility and accuracy.

The following is an attempt to answer this question: "When you say that I Implicitly shares objects across threads, how is this possible?"

To try to explain this, I'll use the snippet of code from your original post.

```perl
#!/usr/bin/perl -w
use strict;
use threads;
use IO::String;
## Point A
my $var;
my $io = IO::String->new($var);
## Point B
my $th=threads->new(\&Update); ## Statement C
$th->join();
sub Update{};
```

By the time you reach Point A in this code, perl has already loaded all the code from the modules 'strict' (+ sub-dependencies), 'threads' (+ sub-dependencies) and 'IO::String' (+ sub-dependencies). Because they are loaded with `use`, all this code is loaded, and any package global variables created by those modules are allocated, during the BEGIN{} phase of running your code.

By the time you reach Point B, $var and $io have also been allocated. In the case of the former, this is a simple scalar. In the case of the latter, the `new()` method of the IO::String class has also been run, and any variables it creates have been allocated. Finally, the scalar $io has been blessed.
What that means is that as well as $io being a reference to some storage that holds the state of this instance of the IO::String class, $io now also carries (behind the scenes) pointer(s) to the byte code that implements the class methods. When your program does something with $io, its value tells perl which instance (storage; state) of IO::String you are operating on. The hidden pointer(s) (often called 'magic') tell it where to find the code for the methods that can operate upon that instance.

When Statement C is executed, what happens (simplistically stated) is that a new copy of the interpreter is created, and everything that constitutes your program in memory up to this point--i.e. everything above--is duplicated into memory allocated by that new interpreter. In effect, this is somewhat similar to having forked your program at that point or, stated another way, to having run a second copy of your program and stopped it at that same point. The difference is that, unlike two separate processes, which would not be able to access each other's memory, the two copies created by spawning the thread can. That is the major advantage of threads--they can communicate with each other through direct memory access rather than serialisation through pipes etc.

As stated, this is a simplification. Some of the memory allocated by the first thread is not duplicated into the second thread. The non-duplicated elements are "process global". This includes such things as file handles, some of Perl's "Special Vars", and some internal state used by Perl itself.

This duplication is not an effect of threads per se. It is the implementation chosen by the iThreads implementers. The advantage is that your simple variable $var is now two simple variables--one for each thread. Each thread can now manipulate its copy of $var without needing to concern itself with synchronisation. Equally, the object $io has also been duplicated.
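To make the duplication of a simple variable concrete, here is a minimal sketch. It assumes a perl built with ithreads support; the variable names and the strings are invented for illustration and are not taken from the original program. The thread changes its own copy of `$var`, and the main thread's copy is left alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;

# A plain (unshared) lexical, allocated before the thread is spawned.
my $var = 'set by main';

# The new thread receives its own copy of $var and changes only that copy.
my $thr = threads->create(sub {
    $var = 'changed in child';
    return $var;
});
my $seen_in_child = $thr->join();

print "child saw:  $seen_in_child\n";
print "main still: $var\n";
```

Without the `: shared` attribute, the assignment inside the thread never reaches the main thread's `$var`.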
The problem is that not only has the instance storage been duplicated, so has the method code. When one thread uses $io to invoke a method, the hidden pointer (magic) tells it where to look to find the code, and the reference value itself tells it what state to manipulate. Each copy of $io not only points to a different copy of the state, but its associated magic also points to a different place.

Now, to share the copy of the simple variable $var, which is after all one of the main reasons for using threads, you must designate that it is to be shared using the `my $var : shared;` nomenclature. What happens then (and please, don't take my description too literally!) is that the two copies of $var are tied. That is to say, each has a hidden pointer (magic) applied to it so that when your code modifies one copy on one thread, behind the covers, that update is also applied to the other copy. The exact mechanism by which this happens is irrelevant in that, as far as your program is concerned, you only have one copy, which all threads that can see that copy can manipulate.

The problem comes with trying to share objects. If you applied the shared attribute to $io (which you can't, because it won't let you), then not only would the state of $io have to be replicated to every other thread's copy each time it changed, the value of the magic would also need to be replicated. And that's impossible.

Understanding why you can't share the code (methods) that implement a class between two threads is complex, but here is a simplified example. Say you had a class that had a settable separator or terminator (think $/ for an IO class or ',' for a CSV class). You create an instance on one thread and set the separator to ','. On another thread you set it to '|'. If you could share an instance of this class between the two threads, you would have a conflict.
This could notionally be alleviated by always storing a copy of the CLASS data in every instance, but then each time you modified the CLASS data, the class would have to search out each of its instances and update that CLASS value. If you store the class data with the class, then when you try to use an instance created on one thread with an instance created on another, you get the conflict. Is this a comma-separated instance or a pipe-separated instance?

The problems run much deeper than this. However, that doesn't mean that you can't use threads and objects. It just means that you must not share objects between threads. It also means that using `require` to load modules only into those threads where the module will be used will save memory over `use`ing them, as it avoids them being duplicated into threads that don't need them.

I'll try to offer a solution to your actual problem in a separate reply.

Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
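The two practical points above (share plain data rather than objects, and `require` a module only inside the thread that needs it) can be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: `worker`, `$log` and the file paths are invented, and File::Basename merely stands in for a heavier module such as Net::Telnet.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

# A simple scalar marked as shared: every thread sees the same value.
my $log : shared;
$log = '';

sub worker {
    my ($id) = @_;
    # Loaded at run time, inside the thread that actually uses it.
    # File::Basename is only a stand-in for a larger module here.
    require File::Basename;
    my $name = File::Basename::basename("/tmp/device$id.log");
    lock($log);                       # serialise updates to the shared scalar
    $log .= "worker $id: $name\n";
    return;
}

my @workers = map { threads->create(\&worker, $_) } 1 .. 3;
$_->join() for @workers;
print $log;
```

The order of the three lines in `$log` depends on thread scheduling, so it may differ between runs.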
Re^4: Problem with using threads with modules. by tele2mag (Initiate) on Jun 23, 2004 at 19:02 UTC
To the statement about being "too many words", I just want to say that I prefer that to the brief cryptic answer one might otherwise get. You gave me a chance to understand the mechanism around the issue, not only the coding part of it. I thank you for that. English is not my native language (it is my 2nd), so I apologise for any grammar or typing errors in my posts.

I think I understand your post completely, but something is still bothering me. For example, when starting a new thread, you said that almost all memory is copied to the new thread, just like starting a new instance of the program at that momentary interpreter position. So unless I explicitly share my variables (which is ok) or objects (not ok), the two threads would not know of each other. So why does the Perl debugger bother me with the message I presented in my first post? If I run the program with the thread starting the empty shell method `Update()`, nothing is shared and it should not be a problem.

If you think I'm a lost cause in this issue, you can supply me with some good URLs so I can expand my thread-wisdom further, as the monks would put it. I have some knowledge of assembly programming and context switching with processes, but not so much about thread implementations.

We can leave it at that; you have already surpassed my greatest expectations of getting help ;) I don't know how to repay you, but I can at least spread my newly gained knowledge to others.
Re^5: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 23, 2004 at 20:45 UTC
Re^3: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 23, 2004 at 03:34 UTC
There are a couple of problems I see with what you are doing.

1. Your `Update()` routine splits the contents of $var into lines and then prints them to STDOUT, but never clears that variable. This means that each time `Update()` loops, it will be re-processing all the data it has previously processed along with any new data that has been added.

2. You're starting a new thread running `Update()` for each `run()` thread you are creating. Each of these threads is accessing the same copy of $var and outputting to the same screen. Each copy of `Update()` will therefore be processing (and re-processing) the same data--and repetitiously outputting it to the same screen.

3. If, as is suggested by your code, you simply want all the logging output from all your Telnet sessions to be logged to STDOUT, why not just do `$sock->input_log( \*STDOUT );`? That way, the module will log directly to STDOUT and you don't need all the extra threads or code.

   If you need to do some pre-processing of the logged data prior to output, then your use of IO::String may make some sense, but you still don't need one `Update()` thread per `run()` thread. Just a single `Update()` thread is sufficient. You would still need to correct problem 1 above.

   That said, if you are using 5.8.x for this, and you should be, as threads were less than complete prior to 5.8.3, then it may well be better to use the "in-memory files" ability of Perl's `open` to avoid the need for this module at all. (Note: I haven't tried this in conjunction with threads yet!)

   I would (probably) use the original (main) thread for this purpose, though as your code implies that this is all embedded in a module, it is unclear to me how the module would be used, so I reserve judgment on that.

4. Finally, you are still loading Net::Telnet with `use`. As previously explained, this means that every thread in your application will load a copy of it, including those that have no need for it.
   If you reduce the number of `Update()` threads to one, and your application doesn't create any other threads besides the `run()` threads and the one `Update()` thread, this probably wouldn't make a big difference to the size of your app. But as coded, half of all your threads will carry a copy of Net::Telnet but never make use of it.

As always, with a clearer understanding of what the overall aim of your app is, it would be easier to make suggestions.
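For reference, the "in-memory files" feature of `open` mentioned above looks roughly like this in isolation (Perl 5.8 or later). The sketch uses no threads and no Net::Telnet; whether Net::Telnet's `input_log` works with such a handle, and how it behaves across threads, is not verified here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# $var receives everything written to the handle, much as it would
# with IO::String->new($var).
my $var = '';
open(my $io, '>', \$var) or die "Cannot open in-memory file: $!";

# Stand-in for the traffic that a logger would write to the handle.
print {$io} "line one\n";
print {$io} "line two\n";
close($io);

print $var;
```

Passing a reference to a scalar as the third argument to `open` is what selects the in-memory behaviour, so no extra module is needed.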
Re^4: Problem with using threads with modules. by tele2mag (Initiate) on Jun 23, 2004 at 21:41 UTC
Background: To help you understand why I did what I did, a small explanation of the surrounding application is required. As you correctly pointed out, this is a module for a bigger application that uses Telnet to extract information and execute commands on some hardware devices (in this case routers). This application is not time-critical due to its usage, but I am planning some optimization.

Main app: The main application has a GUI using the Tk module. The main thread handles all user input, while all the other threads handle Telnet sessions that are connected to hardware devices. Every thread has its presentation in its own window in the main GUI (not always visible though). So the application helps the user to command several devices at the same time. You might think, why not use fork? Well, I thought threads would work fine and had more potential. Anyway, all logging from the devices is presented in a text widget. The Telnet threads should also notify the user if timeouts occur, and of their progress. That isn't implemented yet. Tk and threads do not go well together, so I am going to supply the threads with callback references to update methods in my GUI app.

Now to answer a few questions you had. The `Update()` method is not my final version, just a quick one I wrote to test my app. It had to deliver the Telnet communication log (stored in `$var`) in a line-by-line style, so it looks like the output you'd get on a terminal if you did this manually. As I mentioned above, it delivers those lines to a callback method in my GUI. Then the callback function inserts the lines into my text widget. I wrote to STDOUT to be able to test it without the GUI present, so that is not my intention at all.

You said that I should clear `$var`, but I can't, because I want to save the whole comm-log to file when execution finishes. One solution is to extract substrings of `$var` and use those instead. Just so you know, the re-processing of lines didn't feel right to me either. It will have little or no effect on the general performance of the program, but I would never have left it there anyway.

Using a single `Update()` thread sounds good; I think I will take that advice. I did not know that `open` could be used for "in-memory files", so I'll go for that too. Using the main thread to call `open()` was also my intention, so that is what I'm going to do.

Instead of `use`, as you pointed out, a `require` in `get_connection()` is better. I suppose it stays in memory (as you explained earlier), so I don't need to repeat it in other methods.

Again, thanks for all the help.
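The substring idea above (keep the whole log in `$var` for saving later, but hand each line to the display only once) might be sketched like this. `new_lines` and `$offset` are hypothetical names, the sample log text is invented, and in the threaded program `$var` and `$offset` would presumably need to be shared and locked, which this single-threaded sketch leaves out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $var    = '';   # whole session log, kept intact so it can be saved later
my $offset = 0;    # how much of $var has already been handed to the display

# Return only the complete lines appended since the previous call.
sub new_lines {
    my $end = rindex($var, "\n");      # position of the last complete line end
    return () if $end < $offset;       # nothing new and complete yet
    my $chunk = substr($var, $offset, $end - $offset + 1);
    $offset = $end + 1;                # a trailing partial line waits for next time
    return split /\n/, $chunk;
}

$var .= "login ok\nshow ver";          # second line is still incomplete
my @first = new_lines();

$var .= "sion\nuptime 3 days\n";       # the rest arrives later
my @second = new_lines();

print "first batch:  @first\n";
print "second batch: ", join(' | ', @second), "\n";
```

Because `$var` is never truncated, it can still be written out in full when the session ends.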
Re^5: Problem with using threads with modules. by BrowserUk (Patriarch) on Jun 24, 2004 at 08:08 UTC
Re^6: Problem with using threads with modules. by tele2mag (Initiate) on Jun 28, 2004 at 14:27 UTC