chimni has asked for the wisdom of the Perl Monks concerning the following question:


Hi monks,
I am using WWW::Mechanize to fetch a webpage.
I have so far done the following:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new();
    my $url  = "https://xyz.com/svi/app?login=true";
    $mech->get($url);
    $mech->success or die "Can't fetch the Requested page";
    print $mech->ct();
    print $mech->content();

The output I get is not HTML but compressed gzip data,
all nice and squiggly, like !²D¶f¯U±x&ä9þu4ÂøÀìQeÅJªbœñme
I had the get() write to a file;
`file filename` told me it is in gzip format.
I renamed the file to file.gz and ran gzip -d file.gz, which gave me the HTML.
My problem is that I want to enter a username and password and submit this application's login page.
How do I get the site to return plain HTML so I can enter data and submit the form?
Is there a special header I can send?
Does get() provide anything to decompress the response?
If I save it to a file as above, how do I load the page back after decompressing it?
Thanks,
chimni.

Replies are listed 'Best First'.
Re: WWW::Mechanize get problem
by castaway (Parson) on Mar 03, 2004 at 12:42 UTC
    I already answered this in the CB earlier. If you can control the HTTP headers that are being sent, you need to set 'Accept-Encoding:' with no value. If this header is not present, the server is allowed to assume that you will accept gzipped and compressed pages.

    Update: It would appear from the WWW::Mechanize docs that you need to do $mech->add_header('Accept-encoding', '');
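
A minimal sketch of that suggestion, not a tested fix for this particular site. It sends 'identity' (the HTTP content-coding name for "no compression") instead of an empty value, because some later WWW::Mechanize releases appear to treat an empty Accept-Encoding as unset and fill in their own default. The helper name plain_mech and the form field names are hypothetical; the login URL is taken from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Build a mech object that asks the server for an unencoded body.
# plain_mech is a hypothetical helper name, not part of WWW::Mechanize.
sub plain_mech {
    my $mech = WWW::Mechanize->new();
    $mech->add_header( 'Accept-Encoding' => 'identity' );
    return $mech;
}

# Only touch the network when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    my $mech = plain_mech();
    $mech->get($url);
    $mech->success or die "Can't fetch the requested page";

    # Assumed field names: inspect the real login form before using them.
    $mech->submit_form(
        form_number => 1,
        fields      => { username => 'me', password => 'secret' },
    );
    print $mech->content();
}
```

If the server still answers with gzip after this, it is ignoring the header and the body has to be decompressed on the client side.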

    C.

Re: WWW::Mechanize get problem
by Corion (Patriarch) on Mar 03, 2004 at 12:20 UTC

    The problem is with the website, as WWW::Mechanize does not indicate of itself that it is willing to accept gzipped content.

    You will either have to notify the webmaster that he is always sending compressed content, even to clients that did not ask for it, or use Compress::Zlib to decompress the content manually.
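
A rough sketch of the manual route with Compress::Zlib, assuming the body really is gzip (as `file` reported in the question). The helper name maybe_gunzip is made up for illustration, and a locally gzipped string stands in for what $mech->content() would return.

```perl
use strict;
use warnings;
use Compress::Zlib ();    # provides memGzip and memGunzip

# Return the body as-is unless it starts with the gzip magic bytes,
# in which case decompress it in memory.
sub maybe_gunzip {
    my ($body) = @_;
    return $body unless defined $body && $body =~ /\A\x1f\x8b/;
    my $plain = Compress::Zlib::memGunzip($body);
    die "gunzip failed\n" unless defined $plain;
    return $plain;
}

# Stand-in for $mech->content(): a page gzipped locally for the demo.
my $page       = '<html><body><form>login</form></body></html>';
my $compressed = Compress::Zlib::memGzip($page);

print maybe_gunzip($compressed), "\n";
```

To carry on with the login form afterwards, WWW::Mechanize documents an update_html method that replaces the HTML the object is working with, so something like $mech->update_html( maybe_gunzip( $mech->content ) ) should let the form methods see the decompressed page. Whether that method exists depends on the installed version, so check the local docs first.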

    Possibly, you can also send an Accept-Encoding: identity header (identity is the content-coding name for "no encoding"; text/plain is a media type and does not belong in this header), but as WWW::Mechanize sends no encoding headers of its own, I believe that the website is misbehaved.
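
If the server ignores the request header altogether, another option is to let the response object undo the encoding. This sketch assumes a libwww-perl recent enough to provide decoded_content on HTTP::Response, which honours the Content-Encoding header. The response is built by hand here so the example runs without network access; with a live fetch the object would come from $mech->response().

```perl
use strict;
use warnings;
use HTTP::Response;
use Compress::Zlib ();

# Hand-built response that mimics what the misbehaving server sends:
# a gzipped body labelled with Content-Encoding: gzip.
my $page = '<html><body><form>login</form></body></html>';
my $res  = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html', 'Content-Encoding' => 'gzip' ],
    Compress::Zlib::memGzip($page),
);

# charset => 'none' undoes only the content encoding and returns raw bytes.
my $html = $res->decoded_content( charset => 'none' );
die "could not decode response body\n" unless defined $html;
print $html, "\n";
```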