in reply to IO::Uncompress::Gunzip thread safe?

Is IO::Uncompress::Gunzip thread safe?

I'm afraid the simple answer is no. And it would take some effort to make it so.

That said, as your code is constructed, it gains nothing from threading: each thread creation is followed immediately by a call to join, which blocks until that thread ends, so the threads are forced to run serially.
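To make the point about join concrete, here is a minimal sketch (not the original poster's code; the work sub, its half-second sleep, and the sub names are made up for illustration). It contrasts joining each thread right after creating it with creating all the threads first and joining them afterwards.

```perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time sleep);

# Hypothetical worker: pretends to be busy for a while, then returns its id.
sub work {
    my ($id) = @_;
    sleep 0.5;
    return $id;
}

# Serial in effect: join blocks until the thread ends, so the next
# thread is not even created until the previous one has finished.
sub run_serial {
    my @results;
    for my $id (@_) {
        my $thr = threads->create({ context => 'scalar' }, \&work, $id);
        push @results, $thr->join;
    }
    return @results;
}

# Concurrent: start every thread first, then collect the results.
sub run_parallel {
    my @threads = map { threads->create({ context => 'scalar' }, \&work, $_) } @_;
    return map { $_->join } @threads;
}

for my $mode ([serial => \&run_serial], [parallel => \&run_parallel]) {
    my ($name, $run) = @$mode;
    my $start = time;
    my @got   = $run->(1 .. 4);
    printf "%-8s results=%s elapsed=%.1fs\n", $name, "@got", time - $start;
}
```

Both versions return the same results in the same order; only the elapsed time should differ, with the serial version taking roughly the sum of the workers' run times.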

And as your posted code doesn't show what you are hoping to achieve by using threads -- just printing lines to the screen will never benefit from threading -- it is impossible to recommend a better approach.



Re^2: IO::Uncompress::Gunzip thread safe?
by chris212 (Scribe) on Nov 21, 2016 at 20:23 UTC

    It was just the simplest way I could reproduce the problem. The actual script will start a thread for each chunk, enqueue the thread handle, then continue reading, start another thread, queue the thread handle, etc. The number of threads is limited with a semaphore, down'ed with each thread creation. Each thread performs some work on each record in the chunk it was sent. A separate output thread will dequeue the thread handle, join it, receiving the processed output as a returned array reference, up the semaphore, and write the output to a file in the same order it was read. It works quite well with uncompressed input.
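The pipeline described above might look roughly like the following sketch. It is a reconstruction under assumptions, not the actual script: the limit of 4 workers, the process_chunk sub (which just upper-cases records), and the run_pipeline name are all hypothetical. It also queues thread ids and looks the thread up with threads->object, which is a substitution for queueing the handles themselves.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Thread::Semaphore;

my $MAX_WORKERS = 4;    # assumed limit, not taken from the post

# Hypothetical per-chunk work: upper-case every record, return an array ref.
sub process_chunk {
    my @records = @_;
    return [ map { uc } @records ];
}

# Takes a list of chunks (array refs) and returns the processed records
# in the same order the chunks were supplied.
sub run_pipeline {
    my @chunks  = @_;
    my $slots   = Thread::Semaphore->new($MAX_WORKERS);
    my $pending = Thread::Queue->new;    # worker thread ids, in input order

    # Output thread: joins the workers in the order they were started.
    my $writer = threads->create({ context => 'scalar' }, sub {
        my @out;
        while (defined(my $tid = $pending->dequeue)) {
            my $worker = threads->object($tid);
            push @out, @{ $worker->join };
            $slots->up;                  # free a slot for the input side
        }
        return \@out;
    });

    # Input side: one worker per chunk, throttled by the semaphore.
    for my $chunk (@chunks) {
        $slots->down;
        my $thr = threads->create({ context => 'scalar' }, \&process_chunk, @$chunk);
        $pending->enqueue($thr->tid);
    }
    $pending->enqueue(undef);            # sentinel: no more chunks

    return @{ $writer->join };
}

my @result = run_pipeline([qw(a b)], [qw(c d)], [qw(e)]);
print "@result\n";
```

Because the output thread joins in queue order, a slow early chunk holds back later ones that have already finished, which is what keeps the output in input order.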

    As a workaround for the compression, I had started a thread before loading the compression libraries, then used a queue to send the data to that thread from the input thread, but it was MUCH SLOWER.

    Unfortunately I had to disable support for compressed input. Compressed output still works since that thread doesn't start any other threads.

      If you can make use of multiple CPUs, it might be easier to handle the decompression through an external process, at the cost of more inter-process IO:

      # List-form pipe open: no shell is involved, so $file is passed to gzip as-is.
      open my $fh, '-|', 'gzip', '-cd', $file
          or die "Couldn't read from '$file': $! / $?";
      binmode $fh;
      while (<$fh>) {    # or whatever loop mechanism is appropriate
          ...
      }

      That way you lose some finer-grained control over error states: for zero-byte files, for example, gzip might just exit without outputting anything, and your program might think everything is OK.
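Some of that control can be recovered by checking the result of close on the pipe and by checking whether anything was read at all. The following is a minimal sketch, assuming a gzip binary on the PATH; the read_gzip_lines name is hypothetical, and treating empty output as an error is a policy choice made here for illustration.

```perl
use strict;
use warnings;

# Reads all lines of a gzip file through an external gzip process.
# Dies if gzip cannot be started, exits non-zero, or produces no data.
sub read_gzip_lines {
    my ($file) = @_;

    # List-form pipe open: no shell is involved, so $file needs no quoting.
    open my $fh, '-|', 'gzip', '-cd', $file
        or die "Couldn't start gzip for '$file': $!";
    binmode $fh;

    my @lines = <$fh>;

    # On a pipe, close waits for the child and returns false if it failed.
    # If the only problem was a non-zero exit, $! is 0 and $? holds the
    # wait status (exit code in the high byte).
    unless (close $fh) {
        die "Error closing gzip pipe for '$file': $!" if $!;
        die "gzip exited with status " . ($? >> 8) . " for '$file'";
    }

    # Assumed policy: empty output counts as an error.
    die "No data decompressed from '$file'" unless @lines;

    return @lines;
}
```

Wrapping the call in eval lets the caller decide whether a failed or empty file is fatal or just worth a warning.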

        That works pretty well on Linux. It won't work on Windows, but better than nothing. Thanks!