http://qs1969.pair.com?node_id=1069353


in reply to Re: Threaded Code Not Faster Than Non-Threaded -- Why?
in thread Threaded Code Not Faster Than Non-Threaded -- Why?

BrowserUk, I am so glad you answered and took the time to look over my code. Without sounding sycophantic, I've long admired your posts, specifically those about threads. They've helped me understand the concepts and avoid common pitfalls. So, Hmmmm...

The solution is to desynchronize your threads:

  • The guard (feeder thread) only concerns itself with ensuring that the 'internal queue' (workQ) doesn't get overly full. It has some threshold -- say N where N is the number of worker threads -- and when the workQ falls below that number it allows another N people in to join the internal queue (workQ).
  • The clerks (work threads) all get new customers (filenames) from that same single internal queue (workQ), which means that if they are capable of processing 2 (or N dozen files) in a single timeslice, they do not have to enter wait-states to do so.

...It's gonna take me a few minutes to wrap my head around how this would be implemented in code. I'm not sure how to do it and avoid the memory issues mentioned in the documentation for threads. Incidentally, the basis of my code was straight from the documentation for threads and threads::shared. Who knew I would be so off base?

Are you saying that the "guard" should start polling a single queue and stuffing things into it on demand? How long and how often would I have to usleep to avoid an underrun on one hand and excessive interrupts on the other? That in and of itself seems like it could vary wildly from one environment/server/workstation to another. I'm not sure I understand how to go about it correctly.
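For what it's worth, a sketch of one way Thread::Queue may already sidestep the usleep question: dequeue() blocks until an item is available, so the workers never poll; and with Thread::Queue 3.01 or later the feeder can set a limit, which makes enqueue() itself block while the queue is full — no sleep interval to tune on either side. Everything below (the file names, the 2*N limit, the shared counter standing in for the real digest work) is assumed for illustration:

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue 3.01;    # 3.01+ provides ->limit

my $N     = 4;
my $done  :shared = 0;     # stands in for the real per-file digest work
my $workQ = Thread::Queue->new;
$workQ->limit = 2 * $N;    # enqueue() now blocks while the queue is full

my @workers = map {
    threads->create(sub {
        # dequeue() blocks until an item (or end-marker) arrives: no usleep
        while (defined(my $file = $workQ->dequeue)) {
            # digest_file($file) would go here
            lock $done; $done++;
        }
    });
} 1 .. $N;

# The feeder never polls either: enqueue() blocks until there is room.
$workQ->enqueue($_) for map { "file$_" } 1 .. 100;   # stand-in file list

$workQ->enqueue((undef) x $N);   # one end-of-queue marker per worker
$_->join for @workers;

print "$done files handled\n";   # prints "100 files handled"
```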

I actually did consider that the one-at-a-time thread management/locking/queueing was killing performance. This is why I had the "guard" stuff $opts->{qsize} items into the workers' queues at a time (default: 30). I saw a noticeable improvement.
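That improvement is consistent with how Thread::Queue works: enqueue() takes the queue's internal lock once per call, so passing a whole list amortizes the locking cost over the batch. A minimal sketch of the batching idea (the %opts hash and file list here are assumed stand-ins):

```perl
use strict;
use warnings;
use Thread::Queue;

my $workQ = Thread::Queue->new;
my %opts  = (qsize => 30);                  # assumed option hash
my @files = map { "file$_" } 1 .. 100;      # stand-in file list

# One lock acquisition per batch of qsize items, not one per item.
while (@files) {
    $workQ->enqueue( splice @files, 0, $opts{qsize} );
}

print $workQ->pending, " items queued\n";   # prints "100 items queued"
```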

What to do... Could you steer me toward an implementation like the one you suggest? Google seems more interested in "Coro vs Threads" wars and other silliness that doesn't help me.
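In case it helps, one possible shape for the guard-plus-single-queue design described above: the feeder checks pending() and tops the shared queue up by N whenever it falls below N, while every worker blocks on that same queue. All names are placeholders, and the 1 ms nap in the feeder loop is only a guess, not a tuned value:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $N     = 4;                           # number of worker threads
my $workQ = Thread::Queue->new;          # the single shared internal queue
my $doneQ = Thread::Queue->new;          # results back to the main thread

# Clerks: all pull from the same queue, so a fast worker can take
# several files in one timeslice without entering a wait-state.
my @workers = map {
    threads->create(sub {
        while (defined(my $file = $workQ->dequeue)) {
            # digest_file($file) would go here; we just echo the name
            $doneQ->enqueue($file);
        }
    });
} 1 .. $N;

# Guard: when fewer than N items are pending, let another N in.
my @files = map { "file$_" } 1 .. 20;    # stand-in for the real file list
while (@files) {
    if ($workQ->pending < $N) {
        $workQ->enqueue( splice @files, 0, $N );
    }
    else {
        select undef, undef, undef, 0.001;   # ~1 ms nap; tune to taste
    }
}

$workQ->enqueue((undef) x $N);           # end-of-queue markers
$_->join for @workers;

my @done;
push @done, $_ while defined( $_ = $doneQ->dequeue_nb );
print scalar(@done), " files processed\n";   # prints "20 files processed"
```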

UPDATE Re digesting:

This implies that you are reading the entire file and digesting it for every file -- regardless of whether there is another file already seen that has the same date/time/first/last/middle bytes.

The code:

  1. Traverses the filesystem
  2. Groups same-size files, tossing out the rest (this part is not threaded)
  3. Takes each group and reads the first few bytes of each file, creating sub-groups based on the bytes read. Then it removes sub-groups with a single element, thereby "throwing out" the non-similar files from the parent group
  4. Makes a second pass at the above, but on the end of each file (the efficiency of this second pass is debatable, but it shows good results)
  5. Tallies the final number N of files to be processed in a :shared variable
  6. Creates a thread pool of worker threads, shoves 30 files at a time into their queues, and waits until the threads have incremented the count of files they've processed to equal N
    • The threads digest the files in their queues in their entirety (this is bad?)
  7. Main thread signals to the threads that they are done by ending their queues and finally joins them
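As a cross-check on steps 2-4, here is a hedged, non-threaded sketch of that winnowing. The probe size and helper names are assumptions, not the original code: group by size, sub-group by head bytes, then make the second pass on tail bytes, dropping singleton sub-groups at every stage; only the survivors would go on to the threaded full digest of step 6.

```perl
use strict;
use warnings;

my $CHUNK = 4096;    # assumed probe size for the head/tail reads

# Read the first (or, with $from_end set, the last) $CHUNK bytes of a file.
sub probe {
    my ($file, $from_end) = @_;
    open my $fh, '<:raw', $file or return '';
    seek $fh, -$CHUNK, 2 if $from_end && -s $file > $CHUNK;
    read $fh, my $buf, $CHUNK;
    return $buf // '';
}

# Split a group into sub-groups keyed by $key->($file); drop singletons,
# thereby "throwing out" the non-similar files from the parent group.
sub subgroup {
    my ($key, @files) = @_;
    my %sub;
    push @{ $sub{ $key->($_) } }, $_ for @files;
    return grep { @$_ > 1 } values %sub;
}

# Steps 2-4: size groups, head-byte sub-groups, then the tail-byte pass.
sub candidate_groups {
    my @files = grep { -f } @_;
    my %by_size;
    push @{ $by_size{ -s $_ } }, $_ for @files;
    my @out;
    for my $g ( grep { @$_ > 1 } values %by_size ) {
        for my $h ( subgroup( sub { probe($_[0], 0) }, @$g ) ) {
            push @out, subgroup( sub { probe($_[0], 1) }, @$h );
        }
    }
    return @out;    # groups still worth a full digest (step 6)
}
```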
Tommy
A mistake can be valuable or costly, depending on how faithfully you pursue correction