comment on

... take a peek and let me know if it looks like I'm making any _obvious_ mistakes?

Without having read the rest of the code, the problem is identified right here:

sub worker
{
   my $work_queue = shift;
   my $tid = threads->tid;

   local $/;

   WORKER: while ( !$thread_term )
   {
      # signal to the thread poolq that we are ready to work
      $pool_queue->enqueue( $tid );

      # wait for some filename to be put into my work queue
      my $file = $work_queue->dequeue;
...
[download]

Imagine a post office where there is bunch of counters and a locked door with a guard.

When a counter clerk is free, they shout to the guard at the door, he unlocks the door, let's one person in, tells them which counter to go to and locks the door behind them. Of course, quite often, two or more of the counters become free simultaneously, and the respective clerks both yell at the guard for another customer; but he can only service one at a time, so the others have to wait. And the guard tends to service the person that shouts loudest, which means old Miss Primenproper at the far end with small voice and proprietary nature, often finds herself with nothing to do for long periods....

The point is, you've created both a bottleneck -- the guard/door (poolQ/workQ) -- and designed in a four-way, inter-thread, handshake (synchronisation; lock-step). That is, when a thread finishes processing one file -- which perhaps took 1/10th of its designated timeslice -- in order to proceed it has to:

post a "ready to work" message on the poolQ and then relinquish the rest of this timeslice by going into a wait state until the guard sees that message and replies.
At least two thread switches are required. 1) for the guard to see the message and respond; another for the waiting thread to receive the work item.
And if the guard (poolQ) is busy, it may take many thread-swaps before he sees this workers request; and many more before the clerk (worker thread) gets a timeslice and gets the message back.

And in that design, you've reduced your queues to single element at a time buffers; which defeats their purpose.

The solution is to desynchronize your threads:

The guard (feeder thread) only concerns itself with ensuring that the 'internal queue' (workQ) doesn't get overly full. It has some threshold -- say N where N is the number of worker threads -- and when the workQ falls below that number it allows another N people in to join the internal queue (workQ).
The clerks (work threads) all get new customers (filenames) from that same single internal queue (workQ), which means that if they are capable of processing 2 (or N dozen files) in a single timeslice, they do not have to enter wait-states to do so.

Hopefully that analogy will allow you to see where/what to change in your code. If not, ask questions.

Update: Note: I'm quite uneasy by this assertion:

You'd be right to ask why I'd want to make this multi-threaded when it has to do with IO. The answer is quite simply that after you've eliminated obvious non-dupes, you have to start comparing the files with a real means of differentiation, namely calculating file digests--a cpu-intensive task that comes after you're done with the IO.

This implies that your are reading the entire file and digesting it for every file -- regardless of whether there is another file already seen that has the same date/time/first/last/middle bytes.

Again without having digested your entire program in detail, it seem likely that your architecture is letting you down here.

My initial thoughts for an architecture are:

One thread scans the filesystem -- there is nothing to be gained from multiple threads thrashing the directories unless you a) have multiple physical spindles; and can b) confine each scanner to a single spindle. (This is feasible on NTFS systems but not so easy on unified filesystems I think).

Only if two files with the same size/date/checkbytes are found, do those two files get digested by worker threads. If one of them has already been digested; only the one need be done.

Anyway, that's an ill-thought through notion, but it might trigger your synapses.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re: Threaded Code Not Faster Than Non-Threaded -- Why? by BrowserUk
in thread Threaded Code Not Faster Than Non-Threaded -- Why? by Tommy

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


Don't ask to ask, just ask
	PerlMonks