Multi threading a

by dnamonk (Acolyte)
on Sep 02, 2021 at 19:43 UTC ( [id://11136379] )

dnamonk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

The program below is very slow at processing a file of a few GB. I was thinking of using multi-threading to make it faster. Any clues on how I can do that?

Btw, I am reading from and writing to a gzip file.

Thanks in advance.
#!/usr/bin/perl
use strict;
use warnings;

open( my $fh, '<', $ARGV[0] ) or die "Cannot open $ARGV[0]: $!";
#open( my $fh, '-|', "gunzip -c $ARGV[0]" ) or die "gunzip $ARGV[0]: $!";

while (<$fh>) {
    # some regex code
}

Replies are listed 'Best First'.
Re: Multi threading a
by Corion (Patriarch) on Sep 02, 2021 at 20:20 UTC

    First, you should compare the processing time of decompressing and recompressing the file without Perl in between:

    time (gunzip -c "$file" | gzip > /tmp/newfile.gz)

    If that is already slow, you might gain some time by using the pigz tool, which uses multiple cores for decompressing and compressing.

    If that is still fast, the processing you do in Perl is the bottleneck and you will need to find ways to make your Perl code faster.
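
    If pigz does help, one way to keep the regex work in Perl is to read and write through pigz pipes. A minimal sketch only (it assumes pigz is on the PATH; the output name is made up):

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $in = $ARGV[0] or die "Usage: $0 file.gz\n";

        # pigz decompresses and recompresses on multiple cores;
        # Perl only does the per-line work in between.
        open my $rfh, '-|', "pigz -dc \Q$in\E"  or die "pigz -dc $in: $!";
        open my $wfh, '|-', "pigz -c > out.gz"  or die "pigz -c: $!";

        while ( my $line = <$rfh> ) {
            # ... regex work on $line ...
            print {$wfh} $line;
        }

        close $rfh or die "close reader: $!";
        close $wfh or die "close writer: $!";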

      Yeah, that's true. The issue is with the zipping module; otherwise the program is 20x faster.
Re: Multi threading a
by choroba (Cardinal) on Sep 02, 2021 at 19:52 UTC
    What exactly do you want to do in the threads? There's no heavy computation to parallelise in the code you've shown. Reading from a single file in multiple threads tends to be slower than reading it in a single thread (though I'm not sure about SSDs).

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Hello wiser choroba,

      > Reading from a single file in multiple threads tends to be slower

       Are you sure? Isn't that what MCE::Grep is for?

       ## File path, glob ref, IO::All::{ File, Pipe, STDIO } obj, or scalar ref
       ## Workers read directly and not involve the manager process
       my @e = mce_grep_f { /pattern/ } "/path/to/file";   # efficient

      Also: MCE::Grep#PARSING-HUGE-FILES
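
       A fuller version of that call, for reference (a sketch only: MCE::Grep must be installed, the worker count is arbitrary, and a gzipped input would need to be decompressed to a plain file first, since the workers open the path directly):

          use strict;
          use warnings;
          use MCE::Grep;

          # Example setting only; tune for your machine.
          MCE::Grep->init( max_workers => 4 );

          # Workers open and read the file themselves instead of having
          # the manager process feed them the input.
          my @matches = mce_grep_f { /pattern/ } $ARGV[0];

          print scalar(@matches), " matching lines\n";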

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
        Oh, so in each thread, you read a large chunk from the file and then process it line by line in memory! This might work if you're sure your pattern can't be split between two chunks.
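
        If the regex work really is strictly per line (MCE chunks on record boundaries, so individual lines aren't split), a chunked version of the original loop could look roughly like this with MCE::Loop. A sketch under assumptions: MCE is installed, the worker count is arbitrary, and the input has already been decompressed to a plain text file, e.g. by pigz:

            use strict;
            use warnings;
            use MCE::Loop;

            # Example setting only.
            MCE::Loop->init( max_workers => 4 );

            my $file = shift @ARGV or die "Usage: $0 input.txt\n";

            mce_loop_f {
                my ( $mce, $chunk_ref, $chunk_id ) = @_;
                my $out = '';
                for my $line ( @{$chunk_ref} ) {    # the lines of this chunk
                    # ... per-line regex work on $line ...
                    $out .= $line;
                }
                MCE->print($out);                   # serialized write to STDOUT
            } $file;

        Output goes to STDOUT here, so it could be piped through pigz again to recompress. Chunk order is not guaranteed; MCE's ordered-gather helpers (e.g. MCE::Candy) would be needed if order matters.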

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Multi threading a
by perlfan (Vicar) on Sep 09, 2021 at 00:49 UTC
    pigz might be just what you want.
