anancontigger has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I need to download tar files that are > 300M from a secure website. The following code stops after getting 78M. Would you please show me what needs to be changed in order to get all 300M? Thanks in advance!!
$browser->agent("lwp-download " . $browser->agent);
my $req = HTTP::Request->new(GET => $ueTar);
$req->authorization_basic("$p4idpw", "$p4idpw");
my $ret = $browser->request($req);
my $ueFH = new FileHandle;
open $ueFH, "> my.tgz";
print $ueFH $ret->content;
close $ueFH;
============12/3/08===============
Hi,
Thank you everyone for your suggestions. After some investigation, it turns out my original code works fine after all. The downloaded file was still in gzip format, so it was much smaller than the uncompressed size.

Re: need to download large .tgz file from secure site
by ikegami (Patriarch) on Dec 03, 2008 at 03:40 UTC

    my $ueFH = new FileHandle; open $ueFH, "> my.tgz";
    should be
    open my $ueFH, "> my.tgz";
    Or better yet,
    open my $ueFH, ">", "my.tgz";
    But that's not the problem.

    On a Windows system, you want binmode($ueFH) in there. It doesn't hurt on other systems either. But that's not the problem if your Perl uses PerlIO, and Perl ≥ 5.8 is compiled to use PerlIO by default.

    The fact that 78 MB got printed means the print was reached, which means LWP thought the request was complete. So either

    1. LWP is ignoring an error,
    2. LWP was told the download was complete when it wasn't,
    3. LWP is reporting an error but you're not checking for it, or
    4. The download was successful and there was an error saving the data to the file.

    Check the status code of LWP's response, and check whether print $ueFH or close $ueFH returns an error, as in the sketch below. If that doesn't help, you'll probably have to look at the network traffic.
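
    A minimal sketch of those checks, assuming the $browser and $req variables from the original post:

        # Check LWP's response status first.
        my $ret = $browser->request($req);
        die "Request failed: ", $ret->status_line unless $ret->is_success;

        # Three-argument open, with binmode for Windows; check every I/O call.
        open my $ueFH, ">", "my.tgz" or die "Can't open my.tgz: $!";
        binmode $ueFH;
        print {$ueFH} $ret->content or die "print failed: $!";
        close $ueFH                 or die "close failed: $!";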

Re: need to download large .tgz file from secure site
by almut (Canon) on Dec 03, 2008 at 03:22 UTC

    Maybe you could try the lwp-download script instead... because

    "The lwp-download program is implemented using the libwww-perl library. It is better suited to download big files than the lwp-request program because it does not store the file in memory. Another benefit is that it will keep you updated about its progress and that you don't have much options to worry about."

    If that doesn't work either, there are also other (non-Perl) tools, such as curl.

    (In cases like these, just trying different tools often helps to narrow down the problem. For example, if curl also stops at 78 MB, the problem likely lies outside either of those two tools.)
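
    A minimal sketch of driving lwp-download from Perl (the URL, target filename, and credentials are placeholders; embedding user:pass in the URL is one way to supply basic-auth credentials, though it's worth verifying against your LWP version):

        # lwp-download takes a URL and an optional local filename, and
        # streams the body to disk instead of buffering it in memory.
        system('lwp-download', 'https://user:pass@example.com/big.tgz', 'my.tgz') == 0
            or die "lwp-download failed: $?";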

Re: need to download large .tgz file from secure site
by trwww (Priest) on Dec 03, 2008 at 12:09 UTC

    Hello,

    I've used something like the following in my code to download big files:

    use LWP::Simple qw(getstore);
    getstore( $url, '/tmp/local.tgz' );
    

    See the LWP::Simple docs for more info.
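
    Note that getstore has no authentication parameters of its own, but LWP::Simple exposes its underlying LWP::UserAgent object as $ua, so credentials can be registered before the call. A minimal sketch (host, port, realm, and credentials are placeholders):

        use LWP::Simple qw($ua getstore is_success);

        # Register basic-auth credentials with the user agent getstore uses.
        $ua->credentials('example.com:443', 'Some Realm', 'user', 'password');

        my $rc = getstore('https://example.com/big.tgz', '/tmp/local.tgz');
        die "getstore failed with status $rc" unless is_success($rc);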

    Hope this helps,

    trwww

Re: need to download large .tgz file from secure site
by zentara (Cardinal) on Dec 03, 2008 at 13:54 UTC
    I always use Wget for downloading big files, mostly because if a connection fails or times out, it will automatically try to reconnect and continue where it left off. It can also pick up and resume partially downloaded files. It's very reliable, and I've yet to see it fail to get a big file. You can easily use it from a Perl script with system or exec. It will also automatically store the file in a directory named after the server, saving you the hassle of specifying a file location.

    One caveat: wget must be built with OpenSSL support to fetch from a secure server, but such builds are standard nowadays.
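
    A minimal sketch of calling wget from Perl with retry and resume enabled (the URL and credentials are placeholders; --user/--password require a reasonably recent wget):

        # --continue resumes a partial download; --tries=0 retries indefinitely.
        my @cmd = ('wget', '--continue', '--tries=0',
                   '--user', 'user', '--password', 'password',
                   'https://example.com/big.tgz');
        system(@cmd) == 0 or die "wget failed: $?";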

