mhearse has asked for the wisdom of the Perl Monks concerning the following question:

This is a follow up to this node. I have a program which uses threads and Thread::Queue to process output from the inotifywait program. The program does work as intended. There is a bug in the following method:
################################################################# sub inotifywait { ################################################################# my $self = shift; ### Forces a flush after every write to this pipe. $|++; ### Make a list of exclude patterns using contents of skip file. my $exclude = join ' --exclude=', '', @{$self->{patterns}}; ### Put inotifywait command together. my $cmd = '/usr/local/bin/inotifywait -mrqc ' . $exclude . ' --fromfile=' . $self->{bdir} . '/list -e close_write,delete |'; $self->debug(qq{$cmd\n}); $pid = open INOTIFY, $cmd or die $!; while (my $output = <INOTIFY>) { chomp $output; $q->enqueue($output); $self->debug(qq{IN inotifywait thread.\n}); $self->debug("Items in queue: " . scalar $q->pending() . "\n") +; } }
I have another method which processes items in the queue and then sleeps 3 seconds. The problem (I'm assuming) is that sometines the inotifywait method won't release the lock on the queue. Debugging output states "IN inotifywait thread." This occasionally leads to a large backup of items in the queue. Is there anything I can do to make this work better?

Replies are listed 'Best First'.
Re: Thread::Queue locking question (autoflush)
by ikegami (Patriarch) on Apr 21, 2008 at 17:16 UTC

    This isn't in response to your question, but your incorrect use of $|++.

    $|++; turns on autoflushing for the currently selected file handle, not for your pipe. Both of the following snippets turn on autoflushing:

    ### Forces a flush after every write to this pipe. my $old_fh = select(PIPE); $|=1; select($old_fh);
    use IO::Handle qw( ); ### Forces a flush after every write to this pipe. PIPE->autoflush(1);

    But the thing is, INOTIFY is a reader pipe. You never write to it, so it's useless to turn on autoflushing for it.

      Noted. I guess I misunderstood $|.
Re: Thread::Queue locking question
by BrowserUk (Patriarch) on Apr 21, 2008 at 17:44 UTC
    I have another method which processes items in the queue and then sleeps 3 seconds. The problem (I'm assuming) is that sometines the inotifywait method won't release the lock on the queue.

    What is the purpose of the 3 second sleep?

    It's much more likely that the problem is that your 3 second sleep is preventing the read thread from processing its end of the queue in a timely manner. You can write a huge amount of data to a queue in 3 seconds.

    There should be no reason to have sleeps in the reader end of the queue. If you code it so that it just blocks on the dequeue(), it won't consume any cpu until there is something for it to do, but will be able to respond immediately (subject to getting its next timeslice) as soon as there is something to do.

    If you are using dequeue_nb() in order to allow that thread to also check for other conditions, for example a "terminate now" command, and you want to prevent that thread running away with the cpu, then you should use a much shorter sleep. A sleep of <timeslice> milliseconds is theoretically most efficient, but any reasonable value will work well. Generally, start with say 50 milliseconds (usleep 50;) and vary it it up or down to strike a balance between idle thread cpu consumption and responsiveness.

    As a safe guard to prevent the writing thread running away with memory, it can relinquish its timeslice if the size of the queue gets bigger than some limit: sleep 0 if $Q->pending > 100;. Again, if that thread is not blocking on pipe read, then using a timeslice-sized sleep instead of zero will prevent the thread running away with the cpu.

    It would be easier to make suggestions if you posted the reader thread code also.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Thread::Queue locking question
by dave_the_m (Monsignor) on Apr 21, 2008 at 17:42 UTC
    This has nothing to do with locking. The thing about a queue is that the consumer will automatically block trying to dequeue if there is nothing in the queue, while there is nothing to stop the producer pushing more stuff. Thus, depending on the relative speeds of the producer and consumer, the queue can grow without limit.

    Your sleep(3) in the consumer is (probably) pointless; what you want is for the producer to pause up once the queue reaches a certain size, eg

    while ($q->pending > 100) { sleep 1; } $q->enqueue(...);

    Dave.

Re: Thread::Queue locking question
by mscharrer (Hermit) on Apr 21, 2008 at 17:27 UTC
    Not sure if this is your bug but you should set the patterns in single quotes for the case they include spaces.
    You could change your line:
    my $exclude = join ' --exclude=', '', @{$self->{patterns}};
    to:
    my $exclude; $exclude .= sprintf q{ --exclude='%s'}, $_ foreach @{$self->{patterns} +};
    or:
    my $exclude = join '', (map " --exclude='$_'", @{$self->{patterns}});
Re: Thread::Queue locking question
by mhearse (Chaplain) on Apr 21, 2008 at 18:59 UTC
    I applied all suggested changes at once. The problem seems to be resolved. I've been beating the system up with changes and the program keeps right up. I believe it was adding the usleep 50, but I'm not sure. I want to thank you all for the input. I learned a lot from this post.
Re: Thread::Queue locking question
by ikegami (Patriarch) on Apr 21, 2008 at 17:30 UTC

    The problem (I'm assuming) is that sometines the inotifywait method won't release the lock on the queue.

    How did you arrive to that conclusion? You're claiming that what should be a reliable module has a bug in it, yet you haven't shown any evidence of that. You haven't even shown us how you use the queue! As far as we know, you never call dequeue.

    By the way, could you please fix the link in your node to avoid using a specific PerlMonks domain? (For example, by using <a href="http://perlmonks.org/?node_id=672857">node</a> or [id://672857|node].)

      I do not claim to have found a bug with any of the perl threading modules. I'm sure the bug is in my usage of it (which is limited at best). Here is the method which processes output:

        I do not claim to have found a bug with any of the perl threading modules.

        I suppose you could have meant that you thought were misusing the module. But that doesn't change anything. The question I asked ("How did you arrive to that conclusion?") still applied. You specified a very specific problem (the lock on the queue isn't being released) without showing any evidence of that. You didn't even mention where the code blocks! It's hard to debug a problem without knowing anything about it.

        In fact, from looking at the code you have since posted, I suspect the opposite problem. It's not one of your threads blocking itself to oblivion, it's one of your threads not waiting at all.

        Unless there's already something in the queue, that function will exit immediately after being called since $q->dequeue_nb will return undef. The function will also exit as soon as the queue is empties, even if more items will later be added to it.

        That's not necessarily a problem. I don't know how process_output is used. But if I were to venture a guess at what the rest of the code looks like, $q->dequeue_nb should be $q->dequeue.

        There's a second issue. If you put an empty string or the number zero in the queue, the loop will exit prematurely.
        while ( my $line = $q->dequeue_nb() ) or
        while ( my $line = $q->dequeue() )
        should be
        while ( defined( my $line = $q->dequeue_nb() ) ) or
        while ( defined( my $line = $q->dequeue() ) )
        since those methods could return false value when they aren't

        Finally, it's really weird to see use Text::CSV; inside a function. Do you understand that the statement will execute when process_output is compiled (even if it's never called), and not when process_output is called? It would be better if you put it at the top of your program.

        Update: Readability enhancements.