Try this:
#! perl -slw
use strict;
use threads;
use threads::Q;
use threads::shared;
use LWP::Simple;

## Writes results to $fname in id order, waiting for each id to appear in the shared hash.
sub outputter {
    my( $fname, $href, $n ) = @_;
    open my $O, '>:utf8', $fname or die $!;
    for my $id ( 1 .. $n ) {
        sleep 1 until exists $href->{ $id };
        lock %$href;
        print $O "$id\t::", delete $href->{ $id };
    }
    close $O;
}

## Pulls "id $; machine" items off the queue, fetches the page and stores it under its id.
sub getter {
    my $tid = threads->tid;
    my( $Q, $href ) = @_;
    while( $_ = $Q->dq ) {
        my( $id, $mac ) = split $;, $_;
        my $content = get( "http://$mac/" );
        lock %$href;
        $href->{ $id } = $content // "Nothing from $id:$mac\n";
    }
}

our $T //= 8;    ## number of getter threads; override with -T=nn

my $iFile = $ARGV[0] or die "No input filename";
my $machines = ( split ' ', `wc -l $iFile` )[0];

my %res :shared;
my $Q = threads::Q->new( 128 );

my $outputter = threads->create( \&outputter, '1021943.log', \%res, $machines ) or die $!;
threads->create( \&getter, $Q, \%res )->detach for 1 .. $T;

open I, '<', $iFile or die $!;
my $n = 0;
chomp(), $Q->nq( join $;, ++$n, $_ ) while <I>;
close I;

## One undef per getter; the parens are needed, as undef x $T is a single (empty) string.
$Q->nq( ( undef ) x $T );
$outputter->join;
The command to run it is: 1021943 -T=16 url.fil. The output will be in a file called 1021943.log in the current directory. (For simplicity, I've assumed utf8 for the content; you'll need to check headers and stuff.)
The basic mechanism is to use a single outputter thread and shared hash to coordinate the output.
The multiple getter threads read urls prefixed with an id (the input file sequence number) from a size-limiting queue (you can download it from Re^5: dynamic number of threads based on CPU utilization) and get the content. When they have it, they lock the shared hash and add the content (or an error message) as the value, keyed by the id.
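If you don't have threads::Q to hand, the size-limiting idea can be sketched with nothing but threads::shared. This is not the threads::Q implementation, just a rough stand-in under my own assumptions: the nq/dq names are borrowed from the code above, while @queue, $LIMIT and $max_depth are invented for the illustration.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @queue     :shared;          # the bounded buffer (hypothetical name)
my $max_depth :shared = 0;      # deepest the queue ever got, for checking
my $LIMIT = 4;                  # assumed cap; the script above uses 128

# Enqueue each item, blocking while the queue is full.
sub nq {
    lock @queue;
    for my $item ( @_ ) {
        cond_wait( @queue ) while @queue >= $LIMIT;
        push @queue, $item;
        $max_depth = @queue if @queue > $max_depth;
        cond_broadcast( @queue );           # wake any waiting consumer
    }
}

# Dequeue one item, blocking while the queue is empty.
sub dq {
    lock @queue;
    cond_wait( @queue ) until @queue;
    my $item = shift @queue;
    cond_broadcast( @queue );               # wake a producer waiting for room
    return $item;
}

# One consumer that counts items until it sees the undef terminator.
my $consumer = threads->create( sub {
    my $count = 0;
    ++$count while defined( my $item = dq() );
    return $count;
} );

nq( $_ ) for 1 .. 20;
nq( undef );

my $seen = $consumer->join;
print "consumed $seen items, max queue depth $max_depth\n";
```

The producer simply stalls inside nq() whenever the consumers fall behind, which is what keeps memory bounded no matter how long the input file is.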
The outputter thread monitors this hash, waiting for the appearance of the next id in sequence. When it appears, it locks the hash, writes the content to the file and then deletes the entry.
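The same reordering can be done without the one-second polling by having the writers signal the hash. The following is only a sketch of that variation, not what the script above does: store_result and drain_in_order are names made up for the example, and the staggered sleeps merely simulate getters finishing out of order.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %res :shared;

# Hypothetical helper: a getter stores its result and wakes the outputter.
sub store_result {
    my( $href, $id, $content ) = @_;
    lock %$href;
    $href->{ $id } = $content;
    cond_broadcast( %$href );
}

# Hypothetical helper: collect ids 1 .. $n strictly in sequence,
# sleeping on the hash (not polling) until the next id turns up.
sub drain_in_order {
    my( $href, $n ) = @_;
    my @out;
    for my $id ( 1 .. $n ) {
        lock %$href;
        cond_wait( %$href ) until exists $href->{ $id };
        push @out, "$id\t::" . delete $href->{ $id };
    }
    return @out;
}

# Simulated getters: id 3 finishes first, id 1 last.
my @workers = map {
    my $id = $_;
    threads->create( sub {
        select undef, undef, undef, ( 4 - $id ) * 0.05;
        store_result( \%res, $id, "content $id" );
    } );
} 1 .. 3;

my @ordered = drain_in_order( \%res, 3 );
$_->join for @workers;
print "$_\n" for @ordered;
```

However the results arrive, they come out as 1, 2, 3, and the hash only ever holds the entries that are ahead of the next id to be written.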
Once the main thread has started the outputter and getter threads, it reads the input file and feeds the urls to the queue. The self-limiting queue prevents memory runaway. Once the entire list has been fed to the queue, it queues one undef per thread to terminate the getter threads and then waits for (joins) the outputter thread before terminating.
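The one-undef-per-thread shutdown is easy to try in isolation. Here is a minimal sketch using the core Thread::Queue module in place of threads::Q (so enqueue/dequeue rather than nq/dq), with joinable workers instead of detached ones so that the counts can be collected; the job strings and the count of 10 are made up.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $T = 4;                          # number of worker threads
my $Q = Thread::Queue->new;

# Each worker handles items until it dequeues an undef, then returns its tally.
my @workers = map {
    threads->create( sub {
        my $handled = 0;
        ++$handled while defined( my $item = $Q->dequeue );
        return $handled;
    } );
} 1 .. $T;

$Q->enqueue( "job $_" ) for 1 .. 10;

# ( undef ) x $T is a list of $T undefs: one terminator per worker.
$Q->enqueue( ( undef ) x $T );

my $total = 0;
$total += $_->join for @workers;
print "handled $total jobs with $T workers\n";
```

Each worker consumes exactly one terminator and stops, so all of them exit cleanly once the real work ahead of the undefs has been handled.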
I've also printed a crude header before each lot of content to verify the ordering.
In reply to Re: Program Design Around Threads by BrowserUk, in thread Program Design Around Threads by aeaton1843