Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

Re^2: Why this file fetch fails with WWW::Mechanize?

by ZJ.Mike.2009 (Scribe)
on Aug 04, 2010 at 15:11 UTC ( #852892=note: print w/replies, xml ) Need Help??

in reply to Re: Why this file fetch fails with WWW::Mechanize?
in thread Why this file fetch fails with WWW::Mechanize?

@Corion, thanks for the suggestion. I've now recorded the headers sent to the server by Firefox using Live HTTP headers. They are something like:
Host: User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv: +1) Gecko/20100701 Firefox/3.5.11 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0. +8 Accept-Language: zh-cn,zh;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive HTTP/1.0 200 OK Content-Type: video/x-flv Content-Length: 21293994 Connection: close
As you suggested, I've also tried to send the same headers in Mechanize:
use WWW::Mechanize; use strict; use warnings; my $browser = WWW::Mechanize->new(); $browser->cookie_jar(HTTP::Cookies->new()); $browser->add_header('User-Agent' => 'Mozilla/5.0 (Windows; U; Windows + NT 5.1; zh-CN; rv: Gecko/20100701 Firefox/3.5.11'); $browser->add_header('Accept' => 'text/xml,application/xml,application +/xhtml+xml;q=0.9,*/*;q=0.8'); $browser->add_header('Accept-Language' => 'zh-cn,zh;q=0.5'); $browser->add_header('Accept-Encoding' => 'gzip,deflate'); $browser->add_header('Accept-Charset' => 'GB2312,utf-8;q=0.7,*;q=0.7') +; $browser->add_header('Keep-Alive' => 300); $browser->add_header('Connection' => 'keep-alive'); my $url = ' +167.217.206/%d3%e9%c0%d6%b0%d9%b7%d6%b0%d9-100803-%d0%a1%d6%ed%bd%dc% +c2%d7%b7%d6%d7%e9%d0%e3%c7%f2%bc%bc.mp4/segno=0%26&rid=A8F1F5DFEB1B11 +F1D90B40AD1BB75D69&filelength=21293994&blocksize=2097152&blocknum=11& +blockmd5=E210862B3F92935D0883E00AA2A38F08@D793599727C6DA4ACDB1CBF2235 +004AC@D5E9C9245C9A1BB63BC5EDA862A32604@51B5FDF91356B2B4E943EF72648EB0 +AD@6F2400488B04EBF66A60336B795EA142@8E51B8DCF87A7A02B84A2CAA5FFCA3CF@ +89080D683268481694DBA6D1E22A2EFF@8F56225C76854A434385A09C319BF9C3@9AB +0A3F199183F479F8887D1C3341B1B@845FE0D711086CC2D086546CD26B35C1@9D93A9 +BE1D2EDE216AA9EBF26BF414BE'; $browser->get($url);
But I'm receiving the same error as follows:
Error GETing +9.167.217. 206/%d3%e9%c0%d6%b0%d9%b7%d6%b0%d9-100803-%d0%a1%d6%ed%bd%dc%c2%d7%b7% +d6%d7%e9%d 0%e3%c7%f2%bc%bc.mp4/segno=0%26&rid=A8F1F5DFEB1B11F1D90B40AD1BB75D69&f +ilelength= 21293994&blocksize=2097152&blocknum=11&blockmd5=E210862B3F92935D0883E0 +0AA2A38F08 @D793599727C6DA4ACDB1CBF2235004AC@D5E9C9245C9A1BB63BC5EDA862A32604@51B +5FDF91356B 2B4E943EF72648EB0AD@6F2400488B04EBF66A60336B795EA142@8E51B8DCF87A7A02B +84A2CAA5FF CA3CF@89080D683268481694DBA6D1E22A2EFF@8F56225C76854A434385A09C319BF9C +3@9AB0A3F1 99183F479F8887D1C3341B1B@845FE0D711086CC2D086546CD26B35C1@9D93A9BE1D2E +DE216AA9EB F26BF414BE: Internal Server Error at E:\ line 17
Is there anything else I can try? Thanks :)

Replies are listed 'Best First'.
Re^3: Why this file fetch fails with WWW::Mechanize?
by Corion (Patriarch) on Aug 05, 2010 at 08:41 UTC
    Internal Server Error

    This means something goes wrong on the server side.

    Really use a network sniffer, to not only think you're sending the same data but to make sure you actually do send the same data.

    The server cannot decide that you are not using a browser unless you do something different from how a browser behaves. You just need to find out where the differences lies.

      @Corion, That makes a lot of sense. I'll try to dig a little deeper and see what's causing the problem. Thanks for the pointer :)

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://852892]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (4)
As of 2023-02-05 20:07 GMT
Find Nodes?
    Voting Booth?
    I prefer not to run the latest version of Perl because:

    Results (32 votes). Check out past polls.