kaweh has asked for the wisdom of the Perl Monks concerning the following question:

Hi. If I know the URL of a PDF file (its MIME type is application/pdf), say http://www.db.ucsd.edu/publications/xkeyword.pdf, how can I write a Perl script to download it? Since I want to download many PDF files, I need to do this automatically.

Replies are listed 'Best First'.
Re: Question about downloading a pdf file
by Kanji (Parson) on Sep 27, 2003 at 04:49 UTC

    Take a look at LWP::Simple's getstore or mirror functions.

    use LWP::Simple;

    my $url  = 'http://www.db.ucsd.edu/publications/xkeyword.pdf';
    my $file = 'xkeyword.pdf';

    unless ( is_success( mirror( $url, $file ) ) ) {
        warn "Couldn't download $url\n";
    }
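
    Since you say you have many of them, you can just run the same call in a loop over a list of URLs. A rough sketch (the @urls list is a placeholder for however you collect your links, and the local filename is taken naively from the last part of each URL):

    use LWP::Simple;

    my @urls = (
        'http://www.db.ucsd.edu/publications/xkeyword.pdf',
        # ... the rest of your URLs ...
    );

    for my $url (@urls) {
        ( my $file = $url ) =~ s{.*/}{};    # keep only the part after the last slash
        unless ( is_success( mirror( $url, $file ) ) ) {
            warn "Couldn't download $url\n";
        }
    }

    Note that mirror() only refetches a file if the copy on the server is newer than the one already on disk, so it is safe to rerun the loop after an interrupted batch.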

        --k.


Re: Question about downloading a pdf file
by ronzomckelvey (Acolyte) on Sep 27, 2003 at 05:19 UTC
    A non-Perl answer: if you're running Linux, you could use the wget command.

    wget -r   www.db.ucsd.edu/publications/

Re: Question about downloading a pdf file
by kaweh (Initiate) on Sep 27, 2003 at 05:13 UTC
    Thanks a lot. I also found the mirror function in the Perl Cookbook. Here I have a concern: since I want to download a huge number of files, will the mirror function be a bit slow? Is there a faster way to do it? Xuehua

      The bottleneck is the bandwidth between your site and the target, NOT the code that does the download, which essentially just sets up a socket, sends a GET /publications/some.pdf HTTP/1.1 request line, and then grabs the results off the socket. What you use will make no real difference to speed. But as noted, I would tend to just use wget unless I had specific reasons to do it in Perl.
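
      Just to illustrate how little is going on under the hood, here is a bare-bones sketch of that socket conversation (it uses HTTP/1.0 with Connection: close so we don't have to deal with chunked responses, and skips most error handling; in real code you would let LWP do all of this for you):

      use strict;
      use IO::Socket::INET;

      my $host = 'www.db.ucsd.edu';
      my $path = '/publications/xkeyword.pdf';

      # Open a TCP connection to port 80 and send a minimal HTTP request.
      my $sock = IO::Socket::INET->new(
          PeerAddr => $host,
          PeerPort => 80,
          Proto    => 'tcp',
      ) or die "Can't connect to $host: $!\n";

      print $sock "GET $path HTTP/1.0\r\n",
                  "Host: $host\r\n",
                  "Connection: close\r\n\r\n";

      # Slurp the whole response, then split the headers from the PDF body.
      binmode $sock;
      my $response = do { local $/; <$sock> };
      close $sock;

      my ( undef, $body ) = split /\r\n\r\n/, $response, 2;

      open my $fh, '>', 'xkeyword.pdf' or die "Can't write file: $!\n";
      binmode $fh;
      print $fh $body;
      close $fh;

      Nearly all the time here is spent waiting on the network, which is the point: LWP::Simple, LWP::UserAgent, or wget all do essentially this, so pick whichever is most convenient.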

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print