BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

The following program run as test somefile , will (on my system) reliably copy the contents of the named file to standard out and then terminate.

But if you run as test -TRACE=1 somefile it will almost always hang, deadlocked.

#! perl -slw use strict; use IO::File; use threads; use threads::shared; BEGIN { $| = 1; our $TRACE ||= 0; print "TRACE=$TRACE"; *CORE::GLOBAL::warn = sub {} unless $TRACE; } sub processData { printf @_; } sub getDataT { my ( $handle, $sharedDataRef, $doneRef ) = @_; while( !$handle->eof ) { warn "t-Locking" . $/; lock $$sharedDataRef; warn 't-Waiting' . $/; cond_wait( $$sharedDataRef ) while $$sharedDataRef; warn 't-Setting' . $/; $$sharedDataRef = $handle->getline; warn 't-Signalling' . $/; cond_signal( $$sharedDataRef ); } $$doneRef = 1; return; } my $handle = IO::File->new( $ARGV[ 0 ], 'r' ); my $sharedData :shared; my $done :shared = 0; threads->create( \&getDataT, $handle, \$sharedData, \$done ); while( !$done ) { warn 'm-locking' . $/; lock $sharedData; warn 'm-waiting' . $/; cond_wait $sharedData until $sharedData; warn 'm-Copying' . $/; my $localCopy = $sharedData; warn 'm-undefing' . $/; undef( $sharedData ); warn 'm-signalling' . $/; cond_signal $sharedData; warn 'm-processing' . $/; processData( $localCopy ); }

All the -TRACE=1 option does is cause it to print some debug lines to STDERR.

3 questions:

  1. Does it behave the same on your system?

    I'd be particularly interested in hearing about it's behaviour on multi-cpu systems.

  2. Can you modify my use of the lock() & cond_*() calls to prevent the deadlock?
  3. Can anyone exlain why it deadlocks when the tracing is enabled?

Many thanks.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.
Rule 1 has a caveat! -- Who broke the cabal?

Replies are listed 'Best First'.
Re: Printing to STDERR causes deadlocks.
by Thelonius (Priest) on Apr 26, 2005 at 19:20 UTC
    You have a race condition. Anything affecting the timing, like printing to STDERR, can cause a problem. It's not a deadlock, though. Solution, change:
    cond_wait $sharedData until $sharedData;
    to
    cond_wait $sharedData until $sharedData || $done;

      Thanks, but on my system that modification appears to make the odds of the hang more rather than less frequent?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco.
      Rule 1 has a caveat! -- Who broke the cabal?
        I'm familiar with C's pthreads so YMMV, but it looks like you are setting $$doneRef = 1 outside of the lock, after the signal is sent, so perhaps the signal is being received before the thread has a chance to set $$doneRef. Generally you'll want to set/check shared variables only when you have a lock held.
Re: Printing to STDERR causes deadlocks.
by ghenry (Vicar) on Apr 26, 2005 at 18:43 UTC

    Hi,

    It deadlocks on my system either way (just prints out the file and then sits and waits). With the TRACE option, the last lines I get are:

    m-locking m-waiting

    Walking the road to enlightenment... I found a penguin and a camel on the way.....
    Fancy a yourname@perl.me.uk? Just ask!!!
Re: Printing to STDERR causes deadlocks.
by bmann (Priest) on Apr 26, 2005 at 20:52 UTC
    It hangs almost every time for me, with or without TRACE (1.8G Pentium IV, Win2K, Perl 5.8.4).

    It looks like a race condition. If the the GetData thread sets $$doneRef in the space after while ( !$done ) is evaluated, the lock on $sharedData will block and wait for a signal indefinitely.

    Since these threads are in lock-step, it seems to me a simple threads->yield near the end of the main thread's while loop (or even a short sleep before the loop starts) would force the GetData thread to start first, thereby preventing the race condition from happening. Should the code get more complex and longer running, all bets would be off - it would be very difficult to prove there wouldn't be a race condition.

    Have you considered using threads::shared::semaphore instead?

    Update: I added sleep 1; before the loop in GetDataT, results are the loop hangs every time. Added threads->yield; before the closing brace of the main thread's while loop, succeeded every time - even with the sleep. Even though it appears to work, yield is only a suggestion to the OS, not a guarantee.

      Yes. I've been through a gamut of variations using yield() and sleeps. The problem, besides that using either renders the communications so slow as to totally invalidate any benefits from the threading, is that as soon as you move the code to a multi-processor system, you are into a completely different set of dynamics and need to start again.

      This should not be necessary. It certainly isn't needed if you use the native threading facilities provided by my OS.

      Have you considered using threads::shared::semaphore instead?

      Have you looked at what Thread::Semaphore actually does?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco.
      Rule 1 has a caveat! -- Who broke the cabal?
        I agree that yield and sleep won't solve it reliably - actually, I indicated that in my previous node.

        The problem is the space between signaling the main thread and setting eof and the space between testing whether eof is true and waiting on the shared variable. One of these actions needs to be atomic - you don't want to wait for more data if eof is true.

        How about this - replace getDataT with this, it'll set $done then signal $sharedData is ready, removing the race condition:

        sub getDataT { my ( $handle, $sharedDataRef, $doneRef ) = @_; my $temp; while( !$$doneRef ) { warn "t-Locking" . $/; lock $$sharedDataRef; warn 't-Waiting' . $/; cond_wait( $$sharedDataRef ) while $$sharedDataRef; warn 't-Setting' . $/; $$sharedDataRef = $handle->getline; # set $done before handing the data over to the main thread $$doneRef = 1 if $handle->eof; warn 't-Signalling' . $/; cond_signal( $$sharedDataRef ); } return; }
        I would expect that to scale gracefully.

        Have you looked at what Thread::Semaphore actually does?
        Just the docs. Now I have read the source... point taken ;)
Re: Printing to STDERR causes deadlocks.
by graff (Chancellor) on Apr 27, 2005 at 02:30 UTC
    FWIW -- "This is perl, v5.8.1-RC3 built for darwin-thread-multi-2level":

    After fixing the shebang line as needed, the results are mysterious.

    It never seems to hang (after a dozen or so trials with and without TRACE=1). It seems to randomly alternate between two behaviors, regardless of TRACE value:

    • Sometimes it exits without printing the contents of the given file; in these cases, the output is simply "TRACE=0" or "TRACE=1" followed by:
      Name "CORE::GLOBAL::warn" used only once: possible typo at ./451692.pl + line 11. t-Locking t-Waiting t-Setting t-Signalling
    • Other times (these seem to be in the minority), it prints the file contents before exiting; when TRACE=1, the output includes:
      Name "CORE::GLOBAL::warn" used only once: possible typo at ./451692.pl + line 11. m-locking m-waiting t-Locking t-Waiting t-Setting t-Signalling m-Copying m-undefing m-signalling m-processing

    Since I'm not getting any deadlocks, I'm not sure I can answer your other questions.

Cross-platform testers please.
by BrowserUk (Patriarch) on Apr 27, 2005 at 10:05 UTC

    Below is a version of the code in the OP that (on my system) works reliably and apparently quite efficiently regardless of tracing, filesize, system load, whatever.

    I would be most grateful if people running a multithreaded perl on non-Win platforms, and especially anyone with a multi-cpu machine, could try this code and report back their findings.

    #!/usr/bin/perl -slw use strict; use IO::File; use threads; use threads::shared; BEGIN { $| = 1; our $TRACE ||= 0; print "TRACE=$TRACE"; *CORE::GLOBAL::warn = sub {} unless $TRACE; } sub processData { printf @_; } sub getDataT { my ( $handle, $sharedDataRef, $doneRef ) = @_; while( !$handle->eof ) { warn 't-Waiting' . $/; select undef,undef, undef, 0.0001 while $$sharedDataRef; warn "t-Locking" . $/; lock $$sharedDataRef; warn 't-Setting' . $/; $$sharedDataRef = $handle->getline; } $$doneRef = 1; return; } my $handle = IO::File->new( $ARGV[ 0 ], 'r' ); my $sharedData :shared = undef; my $done :shared = 0; threads->create( \&getDataT, $handle, \$sharedData, \$done ); while( !$done ) { warn 'm-waiting' . $/; select undef, undef, undef, 0.0001 until $sharedData; warn 'm-locking' . $/; lock $sharedData; warn 'm-Copying' . $/; my $localCopy = $sharedData; warn 'm-undefing' . $/; undef( $sharedData ); warn 'm-processing' . $/; processData( $localCopy ); }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco.
    Rule 1 has a caveat! -- Who broke the cabal?

      Tried it on a Dual 2.8Ghz Xeon 1GB Dell 1800 PE with perl 5.8.3

      Fine with/without tracing on SUSE 9.1

      HTH.

      Walking the road to enlightenment... I found a penguin and a camel on the way.....
      Fancy a yourname@perl.me.uk? Just ask!!!

        Many thanks. Can you tell if the two threads were running on separate cpus?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco.
        Rule 1 has a caveat! -- Who broke the cabal?
Re: Printing to STDERR causes deadlocks.
by Anonymous Monk on Apr 26, 2005 at 20:06 UTC
      I'm still getting used to this whole anonymous monk thing

      Why remain anonymous?

      Anyway, you're kinda making my point for me. Again.

      The point being that it is extremely difficult to do even very, very simple things using the pthreads API--at least, as exposed to Perl through threads. It's too late for Perl 5, but Perl 6 needs something much better.

      It's a little like the difference between the raw sockets calls that perl exposes, and using IO::Socket::INET, except that with for a threading api to be useful within Perl, it need to be architected into the language, declaratively, not bolted through as an after thought.

      I also will contend that native threads alone are not enough, but the arguments supporting that are complex and are taking me some time and a lot of research to put together. Unfortunately, the inadaquacies of threads and the restrictions the implementations impose upon it's use, mean that it is hard to come up with convicing demonstrations of what is possible, and as we've seen, arguments alone are not enough.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco.
      Rule 1 has a caveat! -- Who broke the cabal?