GZip compression in HTTP depends on the client sending an "Accept-Encoding" header in the request, which has to contain the string "gzip". (Other compression schemes like bzip2 are also possible.)

If the server supports gzip and the client has requested it, the server *may* decide to send the BODY of the response compressed as a gzip stream (depending on things like whether the content is compressible and whether the server wants to spend CPU resources to reduce network load at this point in time). To do this, it adds a "Content-Encoding" header to the response with the value set to "gzip".
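To make the server side of that negotiation a bit more concrete, here is a minimal sketch using only core modules. It is not how any particular web server does it: encode_body() is a made-up helper, the 100 byte threshold is an arbitrary stand-in for "is this worth compressing", and q-values in the Accept-Encoding header are ignored.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical helper (not from any real server): decide whether to gzip
# a response body. Returns a hashref of extra response headers and the
# body as it would go over the wire.
sub encode_body {
    my ($accept_encoding, $body) = @_;
    my %headers;

    # Simplified check: a real server would also honour q-values ("gzip;q=0")
    my $client_accepts_gzip = defined $accept_encoding
        && $accept_encoding =~ /\bgzip\b/i;

    # Arbitrary assumption: tiny bodies are not worth the CPU time
    if ($client_accepts_gzip && length($body) > 100) {
        my $compressed;
        gzip(\$body => \$compressed)
            or die "gzip failed: $GzipError";
        $headers{'Content-Encoding'} = 'gzip';
        return (\%headers, $compressed);
    }

    # Otherwise send the body unchanged, without a Content-Encoding header
    return (\%headers, $body);
}
```

Calling encode_body('gzip, deflate', $html) on a large enough body gives back a gzip stream plus the Content-Encoding header; calling it with a header that doesn't mention gzip gives the body back untouched.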

From what I remember, ye olde WWW::Mechanize doesn't send any Accept-Encoding header, which is what gets it into trouble sometimes. Let me quote from RFC7231, page 41, section "5.3.4 Accept-Encoding", item 1:

If no Accept-Encoding field is in the request, any content-coding is considered acceptable by the user agent.

Here is the link: https://tools.ietf.org/html/rfc7231#page-41

This is what can get WWW::Mechanize in trouble, because the server MAY decide to use gzip, bzip2 or whatever in the reply. If you use WWW::Mechanize::GZip, which *does* send the correct header, the server is only allowed to send either uncompressed or gzip-compressed content, and WWW::Mechanize::GZip understands both, as far as I remember. It's just the more reliable option.

BTW, when we are talking about this kind of encoding (strictly speaking it's the Content-Encoding header, not the separate Transfer-Encoding one), this isn't the same as "file format". So you won't download a .gz file and unzip it. Instead, the content just gets gzipped on the server side for sending over the network, then it gets automatically decompressed by the client library before it gets handed (uncompressed) to the client. This is just to speed up transfer. In practice, your script should not even notice (or care) that this compression magic is going on in the background to save network bandwidth and speed up data transfer.
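Here is a small round-trip sketch of that "invisible" decompression, again with core modules only and no network involved. decode_body() is a made-up stand-in for what a client library does internally, not the actual code of WWW::Mechanize::GZip or LWP, and the HTML string is just sample data.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: undo the content encoding before the body is
# handed to the calling script.
sub decode_body {
    my ($content_encoding, $raw) = @_;

    # No (or some other) Content-Encoding: hand the bytes over unchanged
    return $raw
        unless defined $content_encoding && lc($content_encoding) eq 'gzip';

    my $plain;
    gunzip(\$raw => \$plain)
        or die "gunzip failed: $GunzipError";
    return $plain;
}

# Simulate the server side: compress some sample HTML for the wire
my $html = "<html><body>Hello, Monks!</body></html>\n" x 50;
my $wire;
gzip(\$html => \$wire) or die "gzip failed: $GzipError";

# Simulate the client side: the script only ever sees the decoded body
my $seen = decode_body('gzip', $wire);
printf "on the wire: %d bytes, in the script: %d bytes\n",
    length($wire), length($seen);
print "script sees the original HTML\n" if $seen eq $html;
```

The point is that the gzip stream only exists between the two ends; what the script gets is byte-for-byte the original HTML.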

perl -e 'use Crypt::Digest::SHA256 qw[sha256_hex]; print substr(sha256_hex("the Answer To Life, The Universe And Everything"), 6, 2), "\n";'

In reply to Re: Problem while using WWW::Mechanize module for getting html by cavac
in thread Problem while using WWW::Mechanize module for getting html by yujong_lee
