Thanks, Perrin.
For the multiple-process route, I have tracked down two basic approaches, apart from LWP::Parallel, which you didn't like.
- Thread::Queue -- see "Re: What is the fastest way to download a bunch of web pages?" (thanks BrowserUK)
- Parallel::ForkManager (suggested by jasonk above, and also mentioned in the "fastest way to download" thread)
Do you think one way has any advantages over the other? Or are these ways essentially the same under the hood?
FWIW, I'm on Linux now (new job -- yay! now I get Perl in its native habitat :) ), which seems relevant when forking comes into play. (Forking works better on Linux than on Windows, where Perl has to emulate it.)
Also, to give a bit more context, I'll be downloading potentially tens of thousands of websites, but no more than 100 from any one particular domain.