bshade has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a Perl script that converts various documents, and even websites, to text and emails the result back to the sender. I have been successful with URLs that point to Office documents (e.g. http://www.somesite.com/worddocument.doc), but not with URLs that point to PDF files. I know I can convert a PDF to text once it is on my hard drive, but how can I download a PDF file from a website (e.g. http://www.somesite.com/MYfile.pdf)? Any help would be greatly appreciated. Thanks in advance. Brian

Replies are listed 'Best First'.
Re: Download a file from the web
by chip (Curate) on Dec 08, 2001 at 01:31 UTC
    I should think LWP::Simple would suit your needs perfectly:

    use LWP::Simple;
    my $pdf_content = get("http://some.site/foo.pdf");   # undef on failure
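
    If the goal is to get the PDF onto disk for a converter that already works on local files, LWP::Simple's getstore may be a closer fit than get, since it writes the response body straight to a file. A minimal sketch, not a definitive implementation: the helper names local_name_for and fetch_to_file, and the fallback name download.pdf, are made up for illustration.

```perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Hypothetical helper: derive a local filename from the last path segment
# of a URL, falling back to a placeholder when there is no usable segment.
sub local_name_for {
    my ($url) = @_;
    (my $path = $url) =~ s/[?#].*//s;             # drop query string and fragment
    $path =~ s{^[a-z][a-z0-9+.-]*://[^/]*}{}i;    # drop scheme and host
    my ($name) = $path =~ m{([^/]+)\z};           # last path segment, if any
    return 'download.pdf' unless defined $name;
    $name =~ s/[^A-Za-z0-9._-]/_/g;               # keep the filename tame
    return $name;
}

# Hypothetical helper: fetch $url into $file (or a derived name) and
# return the filename; dies if the HTTP status is not a 2xx success.
sub fetch_to_file {
    my ($url, $file) = @_;
    $file = local_name_for($url) unless defined $file;
    my $status = getstore($url, $file);           # returns the HTTP status code
    die "GET $url failed with status $status\n" unless is_success($status);
    return $file;
}

# Example call (needs network access):
#   my $saved = fetch_to_file('http://some.site/foo.pdf');
```

    Checking the status code matters here because a failed request would otherwise go unnoticed until the converter chokes on a missing or empty file.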

        -- Chip Salzenberg, Free-Floating Agent of Chaos

Re: Download a file from the web
by blakem (Monsignor) on Dec 08, 2001 at 01:25 UTC
    CPAN has quite a few PDF modules. I bet one (if not several) will do exactly what you want.

    Never mind... LWP::Simple is what you want. Though it's unclear how a method for downloading .doc URLs would fail to work on .pdf URLs...

    -Blake

      I don't think PDF-specific modules are particularly helpful in this case; the original poster just wants to fetch the PDFs, and the rest of the processing he can handle already.

          -- Chip Salzenberg, Free-Floating Agent of Chaos

Re: Download a file from the web
by ehdonhon (Curate) on Dec 08, 2001 at 04:46 UTC

    As others have said, LWP::Simple seems to be the best way to go here. But in the interest of TIMTOWTDI, you could do this if you were on a Unix system:

    my $pdf_file = `lynx -source $pdf_url`;

      Yes, you could do that. However, this would require heavier data validation to ensure that you're not opening up a whopping security hole by passing things to the shell. I wouldn't do it if there is a reasonable alternative.
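
      One way to keep the lynx route while sidestepping the shell is the list form of a piped open, which hands each argument straight to the program. A rough sketch, assuming lynx is installed and on the PATH; the helper names and the http/https-only check are illustrative choices of mine, not anything from the posts above.

```perl
use strict;
use warnings;

# Hypothetical check: accept only plain http(s) URLs with no whitespace.
# The scheme must come first, so a value starting with "-" can never pass
# and be mistaken for a lynx option.
sub looks_like_http_url {
    my ($url) = @_;
    return defined $url && $url =~ m{\Ahttps?://[^\s/]+(?:/\S*)?\z}i ? 1 : 0;
}

# Hypothetical helper: return the raw bytes lynx fetches for $url.
sub lynx_source {
    my ($url) = @_;
    die "refusing suspicious URL\n" unless looks_like_http_url($url);

    # List-form open: no shell is involved, so metacharacters in $url are inert.
    open(my $fh, '-|', 'lynx', '-source', $url)
        or die "cannot run lynx: $!\n";
    binmode $fh;                                  # PDFs are binary data
    my $content = do { local $/; <$fh> };         # slurp everything lynx prints
    close($fh) or die "lynx failed (wait status $?)\n";
    return $content;
}

# Example call (needs lynx and network access):
#   my $pdf = lynx_source('http://some.site/foo.pdf');
```

      The validation is deliberately strict; even without a shell in the picture, it seems wise to decide up front which URLs the script will fetch on a stranger's behalf.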

      Cheers,
      Ovid
