in reply to Re: Thoughts on how to devise a queryable win32 service
in thread Thoughts on how to devise a queryable win32 service

Your Monkness,

I apparently don't understand a scoping or access issue with trying to access my main program's variable from the thread subroutine. I have:

    use strict;
    use Thread;

    my $time : shared;
    my $t = Thread->new(\&listener, \$time);

    while (1) {
        $time = localtime;
        print $time, "\n";
        sleep 1;
    }

    sub listener {
        my $time_ref = shift;
        while (1) {
            {
                lock($$time_ref);
                print "\tThread time: [$time]\n";
                print "\tThread time: [$main::time]\n";
                print "\tThread time: [$$time_ref]\n";
            }
            sleep 3;
        }
    }

Which prints out:

    Wed Feb 9 08:06:02 2005
            Thread time: []
            Thread time: []
            Thread time: []
    Wed Feb 9 08:06:03 2005
    Wed Feb 9 08:06:04 2005
    Wed Feb 9 08:06:05 2005
            Thread time: []
            Thread time: []
            Thread time: []
    Wed Feb 9 08:06:06 2005

This indicates to me that the thread only sees the initial value, as if it has its own copy. Is that right? How do I access the main thread's $time var? Hmmmmm.

Re^3: Thoughts on how to devise a queryable win32 service
by BrowserUk (Patriarch) on Feb 09, 2005 at 16:18 UTC

    You had me going there for a few seconds--everything in your program seemed legit at first glance.

    Problems:

    1. You are using "Thread": Don't!

      The Thread module relates to an early attempt at threading in Perl (called perl5005threads). These were deemed a failure and are "going away" in the next release.

      It has been superseded by ithreads via the threads module.

      ithreads require at least 5.7.?, but you should not try to use them with any version < 5.8.4 -- get 5.8.6 if you can.

    2. You also need to be using threads::shared if you wish to share variables between threads.
    3. You don't need to pass a reference to $time into your thread sub. Once it is shared, it can be seen from any thread (provided it was declared before the thread sub is defined).
    4. For correctness, you should be locking your shared variables in both (all) threads.

      In (my) reality, if you are running on a single-CPU machine, reads do not appear to need locks, as only one thread can be running at a time; but if you move the code without locks to a multiprocessor machine, you would likely get bitten.

      In theory, even on a single-processor machine, it is possible that one thread could be in the process of writing to a shared variable and not have completed the update when it gets suspended. If it did not apply locks, or another thread reading the variable does not apply them (and thereby get suspended until the write is completed), then the other thread could read a partially updated variable that is not in an internally coherent state, and get bad data or even segfault.

      To date, try as hard as I might, even running long running, backtracking regexes on huge shared strings, I have never been able to make this happen.

      ( For the pedantic: The above statement is only true: on my single processor, win32 machine; in my house; whilst I've been watching; etc. etc. etc.....Yet! ;)

        use strict;
        use threads;
        use threads::shared;

        # Declared before listener(), so the sub can see it directly.
        my $time : shared;

        sub listener {
            my $array_ref = shift;
            while (1) {
                {
                    lock $time;
                    print "\tThread time: [$time]\n";
                }
                {
                    # lock() follows the reference one level, so this locks the array.
                    lock($array_ref);
                    print "\tThread time: [@$array_ref]\n";
                }
                sleep 3;
            }
        }

        # Declared after listener(), so it has to be passed in by reference.
        my @array : shared;
        my $t = threads->new( \&listener, \@array );

        while (1) {
            {
                lock $time;
                $time = localtime;
            }
            print $time, "\n";
            {
                lock @array;
                push @array, $time;
            }
            sleep 1;
        }

        __END__
        [16:09:17.53] P:\test>429405
        Wed Feb 9 16:09:31 2005
                Thread time: [Wed Feb 9 16:09:31 2005]
                Thread time: []
        Wed Feb 9 16:09:32 2005
        Wed Feb 9 16:09:33 2005
                Thread time: [Wed Feb 9 16:09:33 2005]
                Thread time: [Wed Feb 9 16:09:31 2005 Wed Feb 9 16:09:32 2005 Wed Feb 9 16:09:33 2005]
        Wed Feb 9 16:09:34 2005
        Wed Feb 9 16:09:35 2005
        Wed Feb 9 16:09:36 2005
        Terminating on signal SIGINT(2)

    I switched your code around a bit not because there was anything wrong with the ordering, but to allow me to demonstrate a couple of points.

    I've stopped passing the scalar ref to the sub and am passing a reference to an array that is also updated in the main thread.

    Because $time is visible to your listener() sub, it is accessible via closure in the normal way, without being explicitly passed to the thread.

    The array, however, had not yet been declared when the sub was defined, so it must be passed explicitly.

    Basically, all the normal perl scoping rules apply--once you are using the correct modules :) I should have mentioned that before. Sorry!


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      I grovel. Must you be so simply clear? LOL Indeed, I had already figured out that I must not use 'Thread' but I didn't know why. Thanks for pointing it out.

      I had shifted to 'threads' and my code looked almost exactly like yours as I bagged the references and employed locking in both threads.

      Thanks also for pointing out that the array would not be known yet and must be 'blessed' so to speak. I have it working now, on to the next lesson.

      Seems the more I learn, the less I know.....grrrr!

      I thought I read somewhere that Perl internally made sure any shared data accessed by threads would be in a consistent state, whether or not you lock the shared variable. In fact, I seem to recall it said it was quite safe to have one thread push stuff onto a shared array (i.e., a queue) and another shift stuff off the other end of it, without any locking. Are you saying that on an SMP box the en-queue side might end up pushing garbage data onto the shared array, or that the de-queue side might read garbage data? Obviously it's okay if the reader side doesn't see anything yet, so long as it just thinks that the queue is undefined (empty), sleeps a while, and checks again later on...
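For the push-on-one-end, shift-off-the-other pattern described above, one way to avoid deciding the locking question by hand is the core Thread::Queue module, which does its own locking internally. What follows is only a minimal sketch: the job strings and the use of undef as an end-of-work marker are assumptions made for illustration, not anything from the posts in this thread.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $queue = Thread::Queue->new();

# Consumer thread: dequeue() blocks until an item is available,
# so there is no need for a sleep-and-check-again loop.
my $consumer = threads->create(sub {
    my @seen;
    while (defined(my $item = $queue->dequeue())) {
        push @seen, $item;
    }
    return scalar @seen;    # number of items received
});

# Producer side (main thread). The job names are hypothetical.
$queue->enqueue("job $_") for 1 .. 5;
$queue->enqueue(undef);     # assumption: undef marks "no more work"

my $received = $consumer->join();
print "consumer received $received items\n";
```

Because the queue object handles the locking, neither side touches a shared array directly, which sidesteps the question of whether an unlocked push or shift is safe on an SMP box.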

        My first question is, where did you read that, because I would like to read that too. Indeed, this is an open invitation to anyone to /msg me links to any Perl + threads articles and discussions anywhere they see them. Really. Please send me any links you have. I will even commit to producing and maintaining a list of such links, including adding some assessment rating of the information they contain.


        Are you saying that ...

        Whoa! I'm not saying anything that contradicts any better information you can find, or knowledge you have.

        Anything I say relates only to that experience I have personally acquired--plus some speculations based upon them; other experience; and logic.

        Now, getting back to what I did say :), how I arrived at that statement, and its implications:

        • In a multi-processor machine, 2 or more threads can be running simultaneously.
        • With multi-tasking, Perl cannot control the point at which it is interrupted by the scheduler.
        • Perl's datastructures are "fat". Each one contains various attributes and flags that need to be kept in an internally consistent state.

          For example, when you append stuff to a Perl scalar:

          1. the length of the scalar must be updated to reflect the addition;
          2. if the memory previously allocated to that scalar is insufficient to hold the addition,
          3. then a new, extended allocation must be made,
          4. and the original contents copied into it,
          5. and the addition must be copied across,
          6. and the original memory must be returned to the freespace chains;
          7. internal pointers, flags etc. must all be updated.

          This must all appear as an atomic operation from the user program's perspective.

        • The scheduler can interrupt any of those operations at any time.
        • In a multi-CPU machine, another thread could be attempting to read, or even write, that same datastructure at the same time.
        • If user locking is employed pervasively, then the chances of one thread gaining access to a datastructure that is in an internally inconsistent state are nil.

        Perl may use internal locking when modifying its own internal-use datastructures, such that it prevents multiple, concurrent accesses even if the user does not employ user locks correctly. This appears to be the case based on my experience, but I have yet to read a good description of what the internal mechanisms of ithreads and sharing do, so I can only base my understanding upon my experiences and what little I can glean from reading the sources--and that is very little! Perl's sources are scantily documented and very complicated.

        The upshot of all of that is that in my experiments with iThreads (since circa 5.8.3, on my single-CPU machine), I have found it impossible to cause Perl to segfault by performing multi-threaded accesses to shared data elements, even without user locks being employed. Nor have I been able to detect any inconsistency of state.

        This implies that Perl does employ internal locking to prevent internal inconsistencies, which makes sense, but I do not recall seeing this written down anywhere.

        But I have not used Perl on an SMP box, so there is a chance that, for all my empirical testing, I've only been "getting away with" not employing user locks consistently, because I'm only really able to run one thread at a time.

        It makes sense (to me) that Perl would employ internal locking to ensure the correct state of its own internal structures. That would mean that users only need to employ user locks if they are maintaining state across multiple Perl variables.

        Example: If I am using a Perl array to represent a stack and, rather than having Perl remove elements "popped" from the stack, I am maintaining a separate "stack pointer" in a scalar, then when a new value is pushed onto the stack, two operations are required:

          • One to add the value at stacktop.
          • One to increment the stackpointer.

        Perl has no knowledge of the association between these two variables and so cannot perform locking to ensure their consistency, so this would require a user lock: one lock variable used to serialise access to both the array and the scalar.
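The stack example above might be sketched roughly as follows. This is only an illustration under stated assumptions: the names @stack, $sp, $stack_lock, push_item and pop_item are hypothetical, and four worker threads are used just to exercise the lock.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical names, chosen for this sketch only.
my @stack      : shared;
my $sp         : shared = 0;   # index of the next free slot
my $stack_lock : shared;       # one lock variable guarding both @stack and $sp

sub push_item {
    my ($value) = @_;
    lock($stack_lock);         # held until the end of this sub
    $stack[$sp] = $value;      # operation 1: add the value at stacktop
    $sp++;                     # operation 2: increment the stack pointer
    return;
}

sub pop_item {
    lock($stack_lock);
    return if $sp == 0;        # empty stack
    $sp--;                     # elements are not removed, only the pointer moves
    return $stack[$sp];
}

# Four threads each push 100 values.
my @workers = map {
    my $id = $_;
    threads->create(sub { push_item("t${id}-$_") for 1 .. 100 });
} 1 .. 4;
$_->join for @workers;

print "stack pointer: $sp, elements stored: ", scalar(@stack), "\n";
```

With the single lock held across both operations, no thread can see the array updated but the pointer not yet advanced, so the two always agree.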


        Speculation warning!!

        Based upon my experiments, my supposition is that the user need never apply locks to individual shared variables to ensure their internal consistency. Perl appears to do this internally.

        They only need to employ user locks if they wish to ensure consistency of state between 2 or more Perl variables.


        This makes perfect sense, and fits with my experience of using iThreads--but nowhere have I seen this written down!

        Indeed, all the documentation relating to sharing and locking shows individual locks being applied to individual variables. All of it! But this doesn't make sense if Perl is already employing internal locking to protect its internal consistency! It achieves nothing apart from slowing the program down.

        My assumption above fits the data from my experiments, but the complete lack of engagement from anyone who really understands what is going on internally means that my experiments, limited as they are to my single-CPU machine, are the only information I have upon which to draw conclusions.

        This is why I have resolutely resisted writing any tutorials, or making any meditations, on the subject of iThreads. The total lack of engagement or writings by those (few) with real knowledge of what goes on inside iThreads (as opposed to a few others like me who played with them and drew conclusions based upon their experiments) means that anything I (or they) write is basically guesswork.

        Contrary to the popular opinion that threads are unreliable, memory hungry or otherwise dangerous, it is the current, woeful state of what little documentation does exist that is the strongest argument for avoiding threads in a production environment.

        Whilst I am happy to share snippets of code, opinion and advice with individuals on individual problems related to iThreads, I am not going to make the same mistake as others, and commit to writing something that gets transmitted around the world, and then referenced, quoted and re-quoted--by people who do not understand and have only second-hand knowledge and no direct experience--as authoritative.

        So, if you have an application for iThreads, and would like a little help in developing it--and especially if you are developing for a non-win32 platform (pref.linux), and especially, especially if you are using a multi-cpu machine--then I am more than happy to try and help you get it going. If you will provide me with feedback and maybe try a few things out along the way, that would be even better.

        But if you want me to commit to saying anything authoritative about iThreads, beyond that which I have tried and tested within my own limited environment, you are out of luck :)

        Now, what was that you were saying about queues?


Re^3: Thoughts on how to devise a queryable win32 service
by noslenj123 (Scribe) on Feb 09, 2005 at 15:49 UTC
    In my humility I guess I don't know the difference between 'Thread' and 'threads'. After shifting the code to use 'threads' and 'threads::shared', I am able to access my variable. What the heck is the 'Thread' module for then?