While I see many style/design issues with the code (shared globals, looping when you are only going to do things once, the indentation issue I pointed out, etc.), the only issues that I see which could cause things to fail horribly are these: your call to wait on the children takes place within the child code, so it is never actually being called; and you are running on NT. (I should point out that the second issue is not simple OS bigotry. Windows NT does not support a native fork, and the emulation has issues.) An alternative method for starting parallel processes on NT which has worked for me is IPC::Open3. See Run commands in parallel for a demonstration of how to do that. It is less efficient than forking on Unix, but it is portable. (NT is, by design, much less friendly than Unix to having multiple active processes trying to do work at the same time. NT would prefer one process with multiple threads, which Perl does not support very well.)
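The shape of that IPC::Open3 approach looks roughly like the following sketch. The commands here are placeholder echo commands, not your real download commands; passing a false error handle merges each child's STDERR into its STDOUT so we only have one handle per child to read.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open3;

# Hypothetical commands -- substitute your real download commands.
my @commands = ('echo one', 'echo two', 'echo three');

my @children;
for my $cmd (@commands) {
    # open3 returns as soon as the command is launched, so after this
    # loop all of the commands are running at the same time.  A false
    # error handle merges the child's STDERR into $out.
    my $pid = open3(my $in, my $out, my $err, $cmd);
    close $in;    # nothing to send to the child
    push @children, { pid => $pid, out => $out };
}

# Collect the output; reading blocks until each child is done.
my $all_output = '';
for my $child (@children) {
    $all_output .= do { local $/; readline($child->{out}) };
    waitpid($child->{pid}, 0);    # reap the child
}
print $all_output;
```

Note that this collects each child's output in order; the children all do their work in parallel, the parent just reads the results one at a time.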

An incidental conceptual misunderstanding that I see is that you are assuming that DOCUMENT_RETRIEVER will have a useful return in the parent. It won't, but since you don't use that return it shouldn't be causing problems that you can see (yet). However, what this means is that children and parents will need to figure out how to communicate, and the odds are pretty good that it will be through external files.
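A minimal sketch of that communication pattern, which also shows the wait happening where it belongs (in the parent, not the child). The work each child does here is a stand-in for DOCUMENT_RETRIEVER, and the filenames are made up for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my @urls = ('url1', 'url2');    # hypothetical work items

my %file_for;
for my $i (0 .. $#urls) {
    my $file = "$dir/result.$i";
    my $pid  = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: do the work (a stand-in for DOCUMENT_RETRIEVER here)
        # and leave the result where the parent can find it.
        open my $fh, '>', $file or die "open $file: $!";
        print $fh "fetched $urls[$i]\n";
        close $fh;
        exit 0;    # the child must exit, never fall back into the loop
    }
    $file_for{$pid} = $file;    # parent records which file belongs to whom
}

# The wait happens here, in the parent, once per child.
my @results;
while ((my $pid = wait) != -1) {
    open my $fh, '<', $file_for{$pid} or die "open $file_for{$pid}: $!";
    push @results, <$fh>;
    close $fh;
}
print @results;
```

The child's return value never reaches the parent; only what it writes to the file does.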

And an incidental note. Most people who like to be called things like "Perl guru" aren't. In general I have found that people who think of themselves as being really good do so because they have never been in the larger pond of good people. But without that experience they have had to invent things themselves, which means that they may be better than their friends, but they are not going to be very good next to a random person who has absorbed "standard good advice".

And a final note. Parallel processing like this with many processes works best when the bottleneck is I/O. If you are doing computationally intensive work, then it is preferable to run only as many processes as you have CPUs. Because of this I would suggest that you rethink your design. It probably makes sense to have one loop where you download your files in parallel, and then another loop where you do the complex processing serially.
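The two-loop design might look like this sketch, using plain fork/wait (the same structure works with Parallel::ForkManager's start/finish/wait_all_children). The URLs, filenames, and the "download" and "processing" steps are all placeholders:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my @urls = ('url1', 'url2', 'url3');    # hypothetical

# Phase 1: fetch everything in parallel.  The bottleneck is I/O,
# so many simultaneous processes are fine here.
for my $i (0 .. $#urls) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        open my $fh, '>', "$dir/doc.$i" or die "open: $!";
        print $fh "contents of $urls[$i]\n";  # stand-in for the real download
        close $fh;
        exit 0;
    }
}
1 while wait != -1;    # block until every download has finished

# Phase 2: process serially.  CPU-bound work gains nothing from
# running more processes than you have CPUs.
my @processed;
for my $i (0 .. $#urls) {
    open my $fh, '<', "$dir/doc.$i" or die "open: $!";
    push @processed, uc join '', <$fh>;    # stand-in for the heavy processing
    close $fh;
}
print @processed;
```

Splitting the phases also makes each one easier to debug: you can inspect the downloaded files on disk before any processing happens.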


In reply to Re (tilly) 7: Parallel Downloads using Parallel::ForkManager or whatever works!!! by tilly
in thread Parallel Downloads using Parallel::ForkManager or whatever works!!! by jamesluc
