smilingsagar has asked for the wisdom of the Perl Monks concerning the following question:

I am a new bee in Perl. I am trying to download a PDF file from the UN website using WWW::Mechanize: http://documents-dds-ny.un.org/doc/UNDOC/GEN/N08/588/39/pdf/N0858839.pdf?OpenElement Please let me know more about handling HTTP response headers with Mechanize.
use strict;
use warnings;
use LWP::Simple;
use WWW::Mechanize;
use Compress::Zlib;

# Create a new instance of WWW::Mechanize
my $mechanize = WWW::Mechanize->new(autocheck => 1);
$mechanize->cookie_jar(HTTP::Cookies->new);

my $url  = 'http://documents-dds-ny.un.org/doc/UNDOC/GEN/N08/588/39/pdf/N0858839.pdf?OpenElement';
my $file = 'test.pdf';
#getstore($url,$file);

$mechanize->cookie_jar(HTTP::Cookies->new);
$mechanize->get($url);

my $response = $mechanize->response();
for my $key ( $response->header_field_names() ) {
    print $key, " : ", $response->header( $key ), "\n";
}

my $dest = $mechanize->response->content;
if($mechanize->response->header("Content-Encoding") eq "gzip") {
    $dest = Compress::Zlib::memGunzip($dest);
    $mechanize->update_html($dest);
}

$mechanize->get($url,":content_file" => $file);
print $mechanize->content();

Re: Perl Mechanize - Header Help required.
by Anonymous Monk on Jan 30, 2010 at 22:44 UTC

    sin 1) you did not ask a question

    sin 2) you pretend to be a new bee , can you tell us about older bees ? also , if you're a bee how come you can write ? are you bees that smart ?

    sin 3) it's 100% that you're working on this project , why exactly did you disclose where you're taking the data from ? and how do you respect us monks by asking us to do your work for you ? you bee !

      ++ Anonymous Monk Pity you weren't logged in to get your reward for such first class detective work.

Re: Perl Mechanize - Header Help required.
by Anonymous Monk on Jan 30, 2010 at 18:29 UTC
    Ask a question :)
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use WWW::Mechanize 1.60;

    my $ua   = WWW::Mechanize->new(autocheck => 1);
    my $url  = 'http://documents-dds-ny.un.org/doc/UNDOC/GEN/N08/588/39/pdf/N0858839.pdf?OpenElement';
    my $file = 'test.pdf';

    $ua->get($url,":content_file" => $file);
    $ua->dump_headers;

    if( $ua->is_html() ){
        print qq!\nBUMMER, NOT PDF (is_html), DELETING "$file"\n\n!;
        unlink $file or die "DELETING FAILED $!";
    }
    __END__
    Cache-Control: no-cache
    Date: Sat, 30 Jan 2010 18:27:37 GMT
    Server: Lotus-Domino
    Content-Encoding: gzip
    Content-Length: 1483
    Content-Type: text/html; charset=UTF-8
    Expires: Tue, 01 Jan 1980 06:00:00 GMT
    Client-Date: Sat, 30 Jan 2010 18:27:30 GMT
    Client-Peer: 157.150.195.130:80
    Client-Response-Num: 1
    NtCoent-Length: 3406

    BUMMER, NOT PDF (is_html), DELETING "test.pdf"
      I got these headers too. But how do I handle them to get the PDF out?
        What pdf? The page returned is an error page, there is no pdf there.