There are (at least) three problems with your program:

  1. You are using fork and pipe for no good reason.

    This: my $content = $response->content(); fetches the entire web page as a single lump and returns it to you in $content.

    Which you then write to your pipe as a single lump: print WRITER $content;.

    In the forked child, you then read that back from the pipe line by line and write it out to your file. That's nonsensical.

    Why not skip the fork and write directly to the file? This works:

        use strict;
        use warnings;
        use LWP::UserAgent;
        use HTTP::Request;
        use HTTP::Response;
        use HTTP::Message;

        my $ua             = LWP::UserAgent->new();
        my $iteration      = 1;
        my $some_directory = 'C:/test/junk';

        #open(URLS, $some_directory.'urls.txt');
        while (<DATA>) {
            my $url = $_;
            chomp $url;
            get_url($url, $iteration);
            print "($iteration) $url\n";
            $iteration++;
        }

        sub get_url {
            my ($url, $iteration) = @_;

            # Lexical filehandle, three-argument open, and an error check.
            open(my $fh, '>', $some_directory . $iteration)
                or die "Cannot open $some_directory$iteration: $!";
            binmode $fh;    # write the bytes unchanged (matters on Windows)

            my $request  = HTTP::Request->new("GET", $url);
            my $response = $ua->request($request);
            my $content  = $response->content();
            print $fh $content;
            close $fh;
            return;
        }

        __DATA__
        ...

    It also let me remove that arbitrary 15-second delay, which wastes 13 seconds whenever a page takes only 2 seconds to download.

  2. You're working far too hard for what you are doing.

    (Or you are doing a disservice to the people you are asking, by concealing the real requirements of your code. Oversimplifying the problem gets you answers to the wrong question.)

    This also works (without leaks), and is far simpler:

        use strict;
        use warnings;
        use LWP::Simple;

        my $iteration      = 1;
        my $some_directory = 'C:/test/junk';

        #open(URLS, $some_directory.'urls.txt');
        while (<DATA>) {
            my $url = $_;
            chomp $url;
            getstore( $url, $some_directory . $iteration );
            print "($iteration) $url\n";
            $iteration++;
        }

        __DATA__
        ...

  3. You are using fork on Windows.

    This is rarely used and barely tested. It is quite likely that it leaks over time, and that those leaks are internal to Perl, so fixing them would require a bug report and the release of a new version.

    If there is a need to do the downloads asynchronously (and on the basis of what you've presented here, you don't), then threads are a far simpler and better tested option.
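
If you ever do need the downloads to run concurrently, a threaded version might look roughly like the sketch below. It is only an illustration, not something I have run against your URLs: the name download_all, the worker count, and the $fetch callback are all made up for the example. It assumes a perl built with thread support and uses the standard threads and Thread::Queue modules together with getstore and is_success from LWP::Simple.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use LWP::Simple qw(getstore is_success);

# Hypothetical helper: run $worker_count threads over a list of jobs.
# $jobs is an array ref of [ $url, $file ] pairs. $fetch is a code ref
# called as $fetch->($url, $file) that returns an HTTP status code,
# which is exactly what getstore() does.
# Returns the URLs whose fetch did not succeed.
sub download_all {
    my ($jobs, $worker_count, $fetch) = @_;

    # Queue the job indices before any worker starts, so an empty
    # queue always means "no work left".
    my $queue = Thread::Queue->new(0 .. $#$jobs);

    my @workers = map {
        threads->create({ context => 'list' }, sub {
            my @failed;
            while (defined(my $i = $queue->dequeue_nb())) {
                my ($url, $file) = @{ $jobs->[$i] };
                my $rc = $fetch->($url, $file);
                push @failed, $url unless is_success($rc);
            }
            return @failed;
        });
    } 1 .. $worker_count;

    # join() waits for each worker and collects its list of failures.
    return map { $_->join() } @workers;
}

# Possible usage (assumed paths, same naming scheme as above):
#   my @jobs   = map { [ $urls[$_], $some_directory . ($_ + 1) ] } 0 .. $#urls;
#   my @failed = download_all(\@jobs, 4, \&getstore);
#   warn "Failed: $_\n" for @failed;
```

Passing the fetch routine in as a code ref keeps the sketch testable without a network connection, and each worker simply exits when the queue is empty, so no sleep or fixed delay is needed.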


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

In reply to Re: Memory Leak Caused by Forking? by BrowserUk
in thread Memory Leak Caused by Forking? by nwboy74
