There are, IMO, 3 core models of parallel processing in perl. Each has upsides and downsides, so I'll outline them with examples:

Threads

Threads in perl are not lightweight, so put aside any understanding of threads you have from other languages. This misconception is why threads are often considered 'bad'. It's not appropriate to spawn lots and lots of threads - that's very inefficient - but for a queue and workers model, it works really well.

It's quite well suited for IPC, meaning that passing things around to parallelise is easy. So for example, we can use Thread::Queue to go with threads:

#!/usr/bin/env perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use IO::Socket;

my $nthreads = 20;
my $in_file2 = 'rang.txt';

my $work_q   = Thread::Queue->new;
my $result_q = Thread::Queue->new;

# Worker: pull IPs off the work queue, pass on those with port 80 open.
sub ip_checker {
    while ( my $ip = $work_q->dequeue ) {
        chomp($ip);
        my $host = IO::Socket::INET->new(
            PeerAddr => $ip,
            PeerPort => 80,
            Proto    => 'tcp',
            Timeout  => 1
        );
        if ( defined $host ) {
            $result_q->enqueue($ip);
        }
    }
}

# Single writer thread, so only one thread ever touches the output file.
sub file_writer {
    open( my $output_fh, ">>", "port.txt" ) or die $!;
    while ( my $ip = $result_q->dequeue ) {
        print {$output_fh} "$ip\n";
    }
    close($output_fh);
}

my @workers;
for ( 1 .. $nthreads ) {
    push( @workers, threads->create( \&ip_checker ) );
}
my $writer = threads->create( \&file_writer );

open( my $dat, "<", $in_file2 ) or die $!;
$work_q->enqueue(<$dat>);
close($dat);
$work_q->end;    # no more work: dequeue returns undef once the queue drains

foreach my $thr (@workers) {
    $thr->join();
}
$result_q->end;
$writer->join();

This uses a queue to feed a set of (20) worker threads with an IP list. They work their way through it, and the results are collated and printed by the `writer` thread.

Forking

The second approach to parallelism in perl is via forking. Forks are more efficient if you're planning on spinning up lots of 'disposable' subprocesses, but IPC between them is harder than it is with threads.

You can just use fork() on its own, but I think it's far better to use the excellent Parallel::ForkManager library.

Doing approximately the same as the above:

#!/usr/bin/env perl
use strict;
use warnings;
use Fcntl qw ( :flock );
use IO::Socket;
use Parallel::ForkManager;

my $in_file2 = 'rang.txt';
open( my $input,  "<", $in_file2 )  or die $!;
open( my $output, ">", "port.txt" ) or die $!;

my $manager = Parallel::ForkManager->new(20);    # at most 20 children at once
foreach my $ip (<$input>) {
    $manager->start and next;    # parent moves on; child runs the code below

    chomp($ip);
    my $host = IO::Socket::INET->new(
        PeerAddr => $ip,
        PeerPort => 80,
        Proto    => 'tcp',
        Timeout  => 1
    );
    if ( defined $host ) {
        flock( $output, LOCK_EX );    #exclusive or write lock
        print {$output} $ip, "\n";
        flock( $output, LOCK_UN );    #unlock
    }
    $manager->finish;
}
$manager->wait_all_children;
close($output);
close($input);

You need to be particularly careful with file IO when multiprocessing, because the whole point is that your execution sequence is no longer well defined. So it's very easy to end up with one thread or process clobbering a file that another has open, but hasn't flushed to disk.

But in both multiprocessing paradigms I outlined above (there are others, these are the most common) you still have to deal with the file IO serialisation. Note that your 'results' will be in a random order in both, because it'll very much depend on when the task completes. If that's important to you, then you'll need to collate and sort after your threads or forks complete.
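If ordering matters, one way to collate is to have each child hand its result back to the parent and sort there. The following is only a sketch of that idea using Parallel::ForkManager's run_on_finish callback and the data reference passed to finish(). The check_ip() sub and the 192.0.2.x addresses are made-up stand-ins (not from the code above) so that it runs without touching the network; swap in the real socket check as needed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical stand-in for the real port check, so the sketch runs offline:
# it simply treats addresses with an even last octet as 'open'.
sub check_ip {
    my ($ip) = @_;
    my ($last) = $ip =~ /(\d+)$/;
    return $last % 2 == 0;
}

my @ips = map { "192.0.2.$_" } 1 .. 8;    # made-up input list
my %result_for;                            # only ever written by the parent

my $manager = Parallel::ForkManager->new(4);

# Runs in the parent each time a child exits. The last argument is the
# reference the child passed as the second argument to finish().
$manager->run_on_finish(
    sub {
        my ( $pid, $exit_code, $ident, $signal, $core_dump, $data ) = @_;
        $result_for{ $data->{index} } = $data if defined $data;
    }
);

for my $index ( 0 .. $#ips ) {
    $manager->start and next;
    my $ip = $ips[$index];
    $manager->finish( 0,
        { index => $index, ip => $ip, open => check_ip($ip) ? 1 : 0 } );
}
$manager->wait_all_children;

# Put results back into input order, then keep only the 'open' hosts.
my @open_ips = map { $_->{ip} }
    grep { $_->{open} }
    map  { $result_for{$_} }
    sort { $a <=> $b } keys %result_for;

print "$_\n" for @open_ips;
```

Because only the parent writes anything, this also sidesteps the file locking entirely, at the cost of holding all the results in memory until the end.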

Non Blocking IO

More a subset of parallel - but you can use something like IO::Select and/or IPC::Run, which let you open subprocesses and read/write to them asynchronously. That's sometimes 'parallel enough' for your use case. IMO it's often the IO element that you want to run more efficiently, and that generally doesn't benefit from parallelism particularly anyway - you don't get more IO from a socket just by hammering it 5x in parallel; if anything, you slow it down. If that's of particular interest, I'll try and mash up some example code for you, but the port check I used in the examples above isn't a particularly good fit for it.
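To give a rough idea of the shape, here is a minimal sketch using only core modules: a forking open ('-|') starts each subprocess, and IO::Select tells the parent which handles have something to read. The job names and what each child prints are purely illustrative assumptions; a real child would exec a command or do actual work.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Select;

my @jobs   = qw( alpha beta gamma );    # made-up job names
my $select = IO::Select->new;
my %job_for;                            # stringified filehandle -> job name

for my $job (@jobs) {
    # Forking open: the parent gets a read handle on the child's STDOUT.
    my $pid = open( my $reader, '-|' );
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        # Child: stand-in for real work.
        print "$job done by pid $$\n";
        exit 0;
    }
    $job_for{$reader} = $job;
    $select->add($reader);
}

my %output_for;
while ( $select->count ) {
    # Blocks until at least one child has output (or has closed its end).
    for my $fh ( $select->can_read ) {
        my $bytes = sysread( $fh, my $chunk, 4096 );
        if ($bytes) {
            $output_for{ $job_for{$fh} } .= $chunk;
        }
        else {
            # EOF or read error: stop watching this handle and reap the child.
            $select->remove($fh);
            close($fh);
        }
    }
}

print "$_: $output_for{$_}" for sort keys %output_for;
```

The parent never blocks on any one child; it just services whichever handle is ready. Note the use of sysread rather than <$fh>, since mixing buffered reads with select tends to cause confusion.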


In reply to Re: Perl Multi Processing. by Preceptor
in thread Perl Multi Processing. by pritesh_ugrankar
