in reply to My 2004 Perlish Wish

At some point, I'll be adding multithreading to my site map builder/link checker (currently, it's just the former). I think all decent link checkers have that ability, and I want to be competitive. I know nothing about doing this, except for something I vaguely recall reading about it dealing with sockets, which I have no idea how to use. I'm not really researching this right now, but if your "powerful combination of multi-threading and OO" wish would help me achieve this, then I wish for it too.

Re^2: My 2004 Perlish Wish
by jeffa (Bishop) on Jan 02, 2004 at 05:33 UTC
    If you happen to be using an operating system that implements fork, then you don't need threads to achieve parallelism (in fact, some programmers spell threads F-O-R-K). I would look into LWP::Parallel (well, the ParallelUserAgent distro, to be exact). In the meantime, here is an extremely trivial, bare-bones script with issues that uses HTML::LinkExtractor (think HTML::LinkExtor::Simple) and fork to check links in parallel (Quiz: what limitation is allowing us to achieve parallelism even though only one processor may be all that's available to us? ;))
    # be careful with this ... a fork is executed for every link found
    use LWP::Simple;
    use HTML::LinkExtractor;

    my $link = HTML::LinkExtractor->new;
    $link->parse(\*DATA);
    my @href = map $_->{href}, grep { $_->{tag} eq 'a' } @{$link->links};

    for (@href) {
        next if fork;
        my $valid = head($_) ? 'good' : 'bad';
        warn "$_ is $valid\n";
        exit;
    }

    __DATA__
    <ul>
      <li><a href="http://www.perlmonks.org">Perlmonks</a></li>
      <li><a href="http://www.yahoo.com">Yahoo</a></li>
      <li><a href="http://bad.link.number.one">Bad #1</a></li>
      <li><a href="file://not.there"></a>Bad #2</li>
    </ul>
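Two of the "issues" in the script above are that it forks once per link with no upper bound, and that the parent never waits for its children. Here is a rough sketch of one way to address both, not a definitive fix. The name check_links, the limit of 5 and the pluggable $check callback are made up for illustration; only fork, wait, POSIX::_exit and LWP::Simple's head are real.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not from the original script): check each URL in a
# forked child, keeping at most $max children alive at any one time.
# $check is a code ref that returns true for a good link; by default it
# falls back to LWP::Simple's head(), loaded only when actually needed.
sub check_links {
    my ($urls, $max, $check) = @_;
    $max   ||= 5;
    $check ||= sub { require LWP::Simple; LWP::Simple::head($_[0]) };

    my (%url_for, %result);

    # Block until some child exits, then record its verdict.
    my $reap = sub {
        my $pid = wait;
        if ($pid == -1) { %url_for = (); return }   # no children left
        my $url = delete $url_for{$pid};
        $result{$url} = ($? == 0 ? 'good' : 'bad') if defined $url;
    };

    for my $url (@$urls) {
        # Throttle: do not fork again until a slot frees up.
        $reap->() while keys(%url_for) >= $max;

        my $pid = fork;
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: report the result through the exit status only.
            my $ok = eval { $check->($url) };
            POSIX::_exit($ok ? 0 : 1);
        }
        $url_for{$pid} = $url;                      # parent remembers the child
    }

    $reap->() while %url_for;                       # wait for the stragglers
    return \%result;
}
```

With the @href list from the script above, something like `my $status = check_links(\@href, 5);` followed by `warn "$_ is $status->{$_}\n" for sort keys %$status;` would print one line per link. Because each child reports back only through its exit status, the parent gets the results without any pipes or sockets. A CPAN module such as Parallel::ForkManager packages the same pattern if you would rather not roll your own.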
    Mr. Peabody Explains Fork

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)