Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I've never used fork but I read about it a few times in the past because I always knew I'd run into something where I'd like to use it.

My PC runs Windows but my web host is Unix, so I'm hoping there's a method that lets me kill two birds with one stone.

Taking this very cut-down script, can someone show me a very simple way to add fork() to it? Everything inside the loop is what I'm aiming at: I'd like the page crawling to run as multiple processes.

I do know this isn't good for the server, but in the end I'm not going to use this script for anything except learning how to fork.

#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple;

my $site = "http://ftw.generation.no/?n=";
my $cnt  = 0;

for (1 .. 4000) {
    $cnt++;
    my $page = get("$site$cnt");

    # Skip pages that failed to load or have no image, so $1 is never stale
    next unless defined $page && $page =~ m#src="img/(.+)"\s+alt#;

    getstore("http://ftw.generation.no/img/$1", $1);
    print "$1 saved-- $cnt/2584\n";
}

Replies are listed 'Best First'.
Re: How to use FORK
by blazar (Canon) on Jun 01, 2007 at 16:03 UTC
    My PC runs Windows but my web host is Unix, so I'm hoping there's a method that lets me kill two birds with one stone.

    Yep, fork, which incidentally is fork() and not FORK(), is now emulated under Windows.

    C:\temp>perl -le "print fork"
    -5720
    0

    Taking this very cut-down script, can someone show me a very simple way to add fork() to it? Everything inside the loop is what I'm aiming at: I'd like the page crawling to run as multiple processes.

    I would either use piped opens as a cheap solution in that case or, rather than reinventing the wheel and risking doing so the wrong way, use a dedicated module like Parallel::ForkManager or the even more specific LWP::Parallel (and related modules).
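For reference, here is a minimal sketch of what forking the loop by hand might look like, using only core Perl. It is not taken from the thread: process_page is a hypothetical stand-in for the get/getstore work, and the worker count and page count are assumptions kept small for illustration. A module such as Parallel::ForkManager wraps this same bookkeeping for you.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the real per-page work (get + getstore).
sub process_page {
    my ($n) = @_;
    return "page $n handled by pid $$";
}

my $workers = 4;     # assumed number of child processes
my $pages   = 20;    # assumed page count, kept small for the sketch
my @pids;

for my $w (0 .. $workers - 1) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: take every $workers-th page, starting at $w + 1.
        for (my $n = $w + 1; $n <= $pages; $n += $workers) {
            process_page($n);
        }
        exit 0;    # the child must not fall through into the parent's loop
    }

    push @pids, $pid;    # parent: remember the child and keep spawning
}

# Parent: reap every child and count the ones that exited cleanly.
my $ok = 0;
for my $pid (@pids) {
    waitpid($pid, 0);
    $ok++ if $? == 0;
}
print "$ok of $workers workers finished cleanly\n";
```

fork() returns undef on failure, 0 in the child, and the child's process ID in the parent (a negative pseudo-process ID under the Windows emulation, as in the output above). That is why the child branch tests for exactly 0 rather than for a positive value.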

Re: How to use FORK
by ikegami (Patriarch) on Jun 01, 2007 at 15:57 UTC
Re: How to use fork
by betterworld (Curate) on Jun 02, 2007 at 15:11 UTC

    You should think carefully about where the bottleneck is. You seem to be retrieving several images from the same server. In that case, chances are that running several downloads in parallel won't be any faster (unless, for instance, the server has many CPUs and generating each image is CPU-intensive). A single persistent connection (using HTTP's Keep-Alive) might be more efficient and less aggressive.
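A rough sketch of the single-connection approach, assuming LWP::UserAgent is installed alongside LWP::Simple. The helper names make_agent and save_image and the 30-second timeout are made up for illustration, and whether this particular server actually keeps the connection open is not something the thread establishes.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# keep_alive => 1 gives the agent a connection cache that can hold one
# idle connection open for reuse between requests.
sub make_agent {
    return LWP::UserAgent->new(keep_alive => 1, timeout => 30);
}

# Fetch page number $n, pull the image name out of it, and save the image
# under that name. Returns the file name on success, nothing on failure.
sub save_image {
    my ($ua, $n) = @_;
    my $base = "http://ftw.generation.no";

    my $res = $ua->get("$base/?n=$n");
    return unless $res->is_success;

    my ($img) = $res->decoded_content =~ m#src="img/([^"]+)"\s+alt#
        or return;

    # ':content_file' writes the response body straight to the named file.
    my $saved = $ua->get("$base/img/$img", ':content_file' => $img);
    return $saved->is_success ? $img : ();
}
```

Create one agent with make_agent() and pass it to save_image() for every page number in the loop, so all requests go through the same agent and its cached connection.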

    I agree with blazar that fork is spelled in lowercase letters.