in reply to Identifying PDF from URLs

tinita is steering you rightly. You can see from this snippet, though, that the page is not a PDF. Even the page it redirects to in a browser is not a PDF but an HTML page with an embedded PDF viewer. Getting the PDF out of that scheme might not end up being trivial. :(

perl -MLWP::Simple=head -le 'print [ head(+shift) ]->[0]' "http://ccdl.libraries.claremont.edu/u?/stc,87"
text/html

Re^2: Identifying PDF from URLs
by tinita (Parson) on May 25, 2010 at 00:19 UTC
    Getting the PDF from that scheme might not end up being trivial.
    Indeed. In this case it seems HEAD requests are blocked. I tried to fetch the direct link to the PDF with the HEAD script, and it returned text/html and "Content-Disposition: filename=404.txt". So here it is probably necessary to make a GET request with LWP::UserAgent and read the HTTP headers from that response :-/
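A minimal sketch of the GET-based check suggested above, in case it helps. The sub names looks_like_pdf and url_is_pdf are made up for illustration, and the Range header is only a hint that asks for the first kilobyte; a server may ignore it and send the whole body. Besides the Content-Type header, the sketch also looks for the "%PDF-" marker that PDF files start with, since the header may be unreliable.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: decide whether a response looks like a PDF from
# its Content-Type value and the first bytes of the body.
sub looks_like_pdf {
    my ($content_type, $head_bytes) = @_;
    return 1 if defined $content_type
             && $content_type =~ m{^application/pdf\b}i;
    # PDF files begin with the "%PDF-" magic marker.
    return 1 if defined $head_bytes && $head_bytes =~ /\A%PDF-/;
    return 0;
}

# Hypothetical helper: issue a GET (not HEAD) and inspect the final
# response after any redirects have been followed.
sub url_is_pdf {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 15);
    # Ask for the first KB only; the server is free to ignore this.
    my $res = $ua->get($url, Range => 'bytes=0-1023');
    return 0 unless $res->is_success;
    return looks_like_pdf(scalar $res->header('Content-Type'),
                          substr($res->content, 0, 8));
}

# Only touches the network when a URL is given on the command line.
if (@ARGV) {
    print url_is_pdf($ARGV[0]) ? "pdf\n" : "not pdf\n";
}
```

Saved as, say, is_pdf.pl (the file name is an assumption), it would be run with the URL as its only argument. For the viewer page discussed in this thread it should report "not pdf", because what comes back is the HTML wrapper rather than the document itself.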
      Hi all,

      Thanks for your replies.

      I will try these suggestions later tonight and get back to you.

      Andy