Oh Monks of wisdom, I call upon you to help me on my quest.
I'm doing a project which involves parsing a large file that can be simplified as follows:
A1
B2
C1
C2
A2
B1
As I'm parsing the file, I'm looking for pairs, i.e. A1:A2, B1:B2, etc. Each member of a pair can appear anywhere in the file. When I find a pair, I need to test it by running it through a subroutine. This subroutine can run completely independently; however (this is the kicker), it needs to add its results back to a hash that I then process after the file has been parsed.
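To make the pairing concrete, here is a minimal sketch of how the pairs could be detected while reading. It assumes each line is a single letter followed by 1 or 2, as in the simplified sample above; the names %seen and @pairs are made up for illustration and the real file format is more involved.

```perl
use strict;
use warnings;

# Simplified stand-in for the lines of the real file (assumption).
my @lines = qw(A1 B2 C1 C2 A2 B1);

my %seen;     # letter => { 1 => line, 2 => line } for members seen so far
my @pairs;    # completed pairs, in the order they were completed

for my $line (@lines) {
    # Skip anything that is not "letter followed by 1 or 2".
    my ($key, $member) = $line =~ /^([A-Z])([12])$/ or next;
    $seen{$key}{$member} = $line;

    # Both members present: the pair is complete and could be tested now.
    if (exists $seen{$key}{1} && exists $seen{$key}{2}) {
        push @pairs, "$seen{$key}{1}:$seen{$key}{2}";
        delete $seen{$key};
    }
}

print join(",", @pairs), "\n";    # prints C1:C2,A1:A2,B1:B2
```

Note that the pairs complete in the order their second member turns up, not in alphabetical order.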
Currently, parsing has to wait until the subroutine finishes before it can continue, and as the sub takes a while to complete, this slows down my script.
What I'd like is for the user to specify the number of threads or CPUs available and, once a pair is found, to split the sub off into a separate process (ensuring that the maximum number is not exceeded) so that parsing of the file can continue.
In terms of providing code, it's a complicated sub but I can simplify the whole thing as follows:
#!/usr/bin/perl
use strict;

# this is the hash I want to keep results in
my %count = ();

# there are way more lines than this but anyway...
for my $i (1 .. 100){
    # this just represents that I've found a pair and want to run the sub
    if ($i % 50 == 0){
        # run the sub
        inc(\%count);
    }
}

# I need access to the hash that's updated in the sub
print $count{'count'}."\n";

sub inc {
    my $c = shift;
    # make a change to the hash
    $c->{'count'}++;
    # pausing for effect!
    # this simulates that the sub takes a while to run
    sleep(10);
}
This code will simply run the sub twice, sleeping for 10 seconds each time, so it takes about 20 seconds in total. If I have multiple processors available, is there a way of making this code use those resources so that it
1) still outputs "2", but
2) runs for only about 10 seconds?
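For what it's worth, here is a rough sketch of the kind of thing I have in mind, using only fork, pipe and wait from core Perl. It is not a definitive solution. It assumes the sub can return its result instead of writing to the hash directly (a forked child only changes its own copy of %count), and that each result is a single short line that fits in the pipe buffer. The names $max_procs, slow_test and collect_one are made up for illustration, and the sleep is shortened to 1 second.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $max_procs = 2;    # hypothetical user-supplied limit on child processes
my %count;            # results are merged here, in the parent only
my %pipe_for;         # child pid => read end of that child's pipe

# Stand-in for the real sub: returns its result instead of touching %count.
sub slow_test {
    my ($pair) = @_;
    sleep 1;          # simulates a slow test
    return 1;
}

# Wait for any one child to exit, then merge its result into %count.
sub collect_one {
    my $pid = wait();
    die "no child processes left to wait for" if $pid == -1;
    my $fh = delete $pipe_for{$pid} or return;
    my $result = <$fh>;
    close $fh;
    return unless defined $result;
    chomp $result;
    $count{count} += $result;
}

for my $i (1 .. 100) {
    next unless $i % 50 == 0;    # represents "a pair was found"

    # Block here only when the maximum number of children is already running.
    collect_one() while keys(%pipe_for) >= $max_procs;

    pipe(my $read, my $write) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: run the test, send the result to the parent, and exit.
        close $read;
        print {$write} slow_test($i), "\n";
        close $write;
        exit 0;
    }

    # Parent: keep the read end and carry on parsing.
    close $write;
    $pipe_for{$pid} = $read;
}

# Parsing is finished: collect whatever children are still running.
collect_one() while %pipe_for;

print "$count{count}\n";    # prints 2
```

With two children allowed, both tests run at the same time, so the total run time is roughly that of one test rather than two. Larger or structured results would need to be serialized and read from the pipe before waiting on the child.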
In reply to Is this a job for a fork? by richardwfrancis