in reply to Passing globs between threads
This works fine for me.
#! perl -slw
use strict;
use IO::File;
use threads;

sub thread {
    my $fh = shift;
    print for <$fh>;
    return;
}

my $io = new IO::File( 'junk', 'r' ) or die $!;

my $thread = threads->create( \&thread, $io );
$thread->join;
$io->close;

__END__
P:\test>395373
This is junk
and some more junk
and yet more junk
this is line 4 of junk
Re^2: Passing globs between threads
by conrad (Beadle) on Sep 30, 2004 at 21:28 UTC

by BrowserUk (Patriarch) on Oct 01, 2004 at 00:21 UTC
The first thing to realise is that raw filehandles and sockets are process-global. So you don't need, and indeed cannot, share them between threads in the threads::shared sense, as they are effectively already shared by all threads in the process.

Note: I'm talking about the OS and C-runtime concept of filehandles and sockets, not anything in the IO::* group of modules. These are peculiar beasts in that they are at some level like ordinary Perl objects, which makes them difficult, if not impossible, to use safely across threads, but they do not behave entirely like ordinary objects.

So, the problem is how to transfer an open handle between threads (which at a higher level seems like a bad design to me, but I'll get back to that), given that threads::shared won't let you share a GLOB, nor even a 5.8.x lexical scalar that is currently being used as a GLOB-like entity.

After a little kicking around, I found a way. I'm not yet sure that it is a good thing to do, but I'll show you how anyway and hope that if someone else out there knows why it should not be done, they'll speak up. The trick is to pass the fileno of the open GLOB to the thread, and then use the special form of open

    open FH, "<&=$fileno" or die $!;

to re-open the handle within the thread before using it.
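A minimal, self-contained sketch of the trick, reusing the 'junk' file from the example above. The read-only "<&=" mode here is an assumption; it must match however the handle was originally opened:

use strict;
use warnings;
use threads;

# Worker: attach a new Perl handle to an already-open OS file descriptor.
# The "<&=" form re-opens the fd directly rather than dup()ing it.
sub worker {
    my $fileno = shift;
    open my $fh, "<&=$fileno" or die "re-open fd $fileno: $!";
    print while <$fh>;
    return;
}

open my $in, '<', 'junk' or die "open junk: $!";
threads->create( \&worker, fileno $in )->join;
close $in;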
However, this is not a complete solution. Currently, probably because of a bug I think, but possibly by design, if you actually read from the handle in the thread where you opened it, then pass the fileno to another thread, re-open it, and attempt to read some more from it, it fails. No errors, no warnings. Just no input. I've read and re-read perlfunc:open and perlopentut:Re-Opening-Files-(dups), and I believe that the "<&=$fileno" syntax should allow this... but it doesn't. If you need to be able to do this, you'll need to take it up with the p5p guys and get their wisdom.

Now, getting back to design. Whilst moving objects around between threads by freezing and thawing them is possible, it feels awfully hokey to me. Basically, if your design is done correctly, there should be no need to create duplicates of objects in different threads. Doing so is opening yourself up to a world of grief. These duplicate objects will be uncoordinated. Once you freeze an object, it is exactly that: frozen. Any subsequent changes you make will not be reflected in the copy that you reconstruct elsewhere.

If it was trivial, or even possible with a reasonable degree of difficulty, to coordinate and synchronise objects across threads, Perl would do that for you. The fact that the clever guys that got iThreads this far did not step up to the plate and do this already almost certainly means that you should not be trying to do it either.

The fact is that I appear to have done as much coding of (i)threads as anyone I am aware of, and I have yet to see a usefully iThreadable problem I couldn't (almost trivially) solve. And so far, I have never needed to share either objects or IO handles between threads. But don't take my word for it. My knowledge only extends as far as I have tried to go. It would be good to see someone else pushing the boundaries of what's possible. If you could describe the problem that you are trying to solve at the application level, I'd be most interested to take a look.
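A minimal sketch of that frozen-snapshot behaviour; the hash-based "object" and the Thread::Queue transport are illustrative assumptions, not anything from the posts above:

use strict;
use warnings;
use threads;
use Thread::Queue;
use Storable qw( freeze thaw );

my $q = Thread::Queue->new;

# Freeze takes a snapshot; later changes to $obj are NOT reflected
# in the copy thawed elsewhere.
my $obj = { count => 0 };
$q->enqueue( freeze( $obj ) );
$obj->{count} = 42;              # the worker will never see this

my $t = threads->create( sub {
    my $copy = thaw( $q->dequeue );
    print "worker sees count = $copy->{count}\n";   # prints 0, not 42
} );
$t->join;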
by conrad (Beadle) on Oct 01, 2004 at 10:36 UTC
What I'm trying to do... A service which will do "stuff" (it's a generic architecture to be customised to particular applications, so I really do mean "stuff") to bundles of streams. Essentially, a client will tell it "here's a bunch of input streams, and a corresponding bunch of output streams; process 'em using some-mechanism-or-other as you pump data from the inputs to the outputs". Each group of inputs and outputs is collated in some way for purposes of error correction, and the service will process multiple bundles simultaneously. Little example diagram here, with two bundles: the first (A->B) is unidirectional with three inputs and only two outputs, while the second (C<->D) is bidirectional with one stream at one end and three at the other (the actual stuff that multiplexes/demultiplexes doesn't really matter for this question).
I'm using Perl 'cos it facilitates rapid prototyping (this is research work), it has good network handling, and it is easy to integrate with CGI, web services, and third-party applications (all of which are desirable in this case). The architecture I've gone with, due to the relatively heavy weight of Perl's threads, is to set up a single worker thread which processes all of the bundles in a select() loop, and then have the main thread obtain, validate and submit control directives (such as adding new bundles or modifying existing ones).

In an ideal world, I'd just turn each control directive into a closure and hand it off to the worker thread to deal with in its own good time. In another language that might be easy while all the other stuff I've mentioned might be hard; in Perl, the other stuff is easy and this, while not hard to hack (I'm already doing it by passing filenos across), seems hard to do nicely. The reason for separating the control invocation from the worker thread is that I envisage a variety of different control styles, e.g. web page, web service interface, and straight program API. I'd rather not get all that munged up with the I/O and processing work...

So I escape your wise warning re: freezing objects, because the control thread destroys its copies of them as soon as they're frozen and passed off to the worker. It's clearly not efficient, but my purpose for the time being is clarity and flexibility, which whole-object transmission gives me. Passing instructions through as unblessed shared string/list/hash structures was more error-prone.

Now on to the file descriptor messaging. At the moment I do one of two things: either I pass across coordinates (such as "hostname:port") and the worker initiates a socket connection itself, or I pass across the file descriptor of a socket/pipe/filehandle and the worker tries to reconstruct the original object using IO::xxx->new_from_fd(). What would be nice would be to be able to just pass across IO::xxx objects; since that's not possible without extending Storable (I am tempted; it would make everything tidy again), the necessary thing seems to be to deconstruct the objects into class (necessary since I can usefully half-close socket connections but not files, for example) and file descriptor, as in the sketch below. Your open("<&=") trick is analogous to the new_from_fd one, and difficult because it too needs to know the opening mode of the original object (you have to specify "<&=", ">>&=", etc.).

So I guess my ultimate questions are:
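A minimal sketch of the class + file-descriptor deconstruction described above. The localhost:8080 address, the '+<' mode, and showing both sides in one script are illustrative assumptions; it also assumes something is listening on that port:

use strict;
use warnings;
use IO::Socket::INET;

# "Control thread" side: reduce the IO object to ( class, fileno, mode ).
my $sock = IO::Socket::INET->new( PeerAddr => 'localhost:8080' )
    or die "connect: $!";
my @message = ( ref $sock, fileno $sock, '+<' );

# "Worker thread" side: rebuild an object of the same class on the same fd.
# new_from_fd() is inherited from IO::Handle by all the IO::* classes.
my ( $class, $fd, $mode ) = @message;
my $copy = $class->new_from_fd( $fd, $mode )
    or die "new_from_fd: $!";
print "rebuilt a ", ref $copy, " on fd ", fileno $copy, "\n";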
by BrowserUk (Patriarch) on Oct 02, 2004 at 06:17 UTC

by Anonymous Monk on Oct 01, 2004 at 20:09 UTC
Yes, it's easy to synchronize objects (at least Storable ones) across threads. RFC 677 (yes, the one from the IETF) tells you how.
by BrowserUk (Patriarch) on Oct 01, 2004 at 22:22 UTC

by Anonymous Monk on Oct 02, 2004 at 00:27 UTC