PerlMonks  

Re: Why this file fetch fails with WWW::Mechanize?

by Corion (Patriarch)
on Aug 04, 2010 at 13:15 UTC ( [id://852873]=note )


in reply to Why this file fetch fails with WWW::Mechanize?

Maybe the server sends something different when it does (not) detect Firefox or Internet Explorer. You don't tell us how it fails and also don't show us how you inspect the retrieved content. I recommend looking at the headers that go over the wire (using wireshark for example) and eliminating the differences one by one.
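If a sniffer is not at hand, a rough first look at what the script sends can be had from inside Perl. The following is a minimal sketch, not part of the original post. It assumes the add_handler method that WWW::Mechanize inherits from LWP::UserAgent; traced_mech is a hypothetical helper name. The handler sees the request before it reaches the protocol layer, which may still add or alter headers such as Host or Connection, so this complements a capture off the wire rather than replacing it.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: returns a Mechanize object that prints the header
# block of every request and response to the given filehandle
# (STDERR by default).
sub traced_mech {
    my ($out) = @_;
    $out ||= \*STDERR;

    # autocheck => 0 makes get() return the response instead of dying
    # on an error status such as 500.
    my $mech = WWW::Mechanize->new( autocheck => 0 );

    # Runs just before the request is handed to the protocol layer.
    $mech->add_handler( request_send => sub {
        my ($request) = @_;
        print {$out} $request->method, ' ', $request->uri, "\n",
                     $request->headers->as_string, "\n";
        return;    # returning nothing lets the request go out normally
    } );

    # Runs once the complete response has been received.
    $mech->add_handler( response_done => sub {
        my ($response) = @_;
        print {$out} $response->status_line, "\n",
                     $response->headers->as_string, "\n";
        return;
    } );

    return $mech;
}
```

Calling traced_mech()->get($url) would then print both header blocks, which can be compared line by line with what the browser reports.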


Re^2: Why this file fetch fails with WWW::Mechanize?
by ZJ.Mike.2009 (Scribe) on Aug 04, 2010 at 15:11 UTC
    @Corion, thanks for the suggestion. I've now recorded the headers sent to the server by Firefox using Live HTTP headers. They are something like:
    Host: 119.167.217.206:19765
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1.11) Gecko/20100701 Firefox/3.5.11
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: zh-cn,zh;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive

    HTTP/1.0 200 OK
    Content-Type: video/x-flv
    Content-Length: 21293994
    Connection: close
    As you suggested, I've also tried to send the same headers in Mechanize:
    use WWW::Mechanize;
    use strict;
    use warnings;
    my $browser = WWW::Mechanize->new();
    $browser->cookie_jar(HTTP::Cookies->new());
    $browser->add_header('User-Agent' => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1.11) Gecko/20100701 Firefox/3.5.11');
    $browser->add_header('Accept' => 'text/xml,application/xml,application/xhtml+xml;q=0.9,*/*;q=0.8');
    $browser->add_header('Accept-Language' => 'zh-cn,zh;q=0.5');
    $browser->add_header('Accept-Encoding' => 'gzip,deflate');
    $browser->add_header('Accept-Charset' => 'GB2312,utf-8;q=0.7,*;q=0.7');
    $browser->add_header('Keep-Alive' => 300);
    $browser->add_header('Connection' => 'keep-alive');
    my $url = 'http://119.167.217.206:19765/ppvaplaybyopen?url=http://119.167.217.206/%d3%e9%c0%d6%b0%d9%b7%d6%b0%d9-100803-%d0%a1%d6%ed%bd%dc%c2%d7%b7%d6%d7%e9%d0%e3%c7%f2%bc%bc.mp4/segno=0%26&rid=A8F1F5DFEB1B11F1D90B40AD1BB75D69&filelength=21293994&blocksize=2097152&blocknum=11&blockmd5=E210862B3F92935D0883E00AA2A38F08@D793599727C6DA4ACDB1CBF2235004AC@D5E9C9245C9A1BB63BC5EDA862A32604@51B5FDF91356B2B4E943EF72648EB0AD@6F2400488B04EBF66A60336B795EA142@8E51B8DCF87A7A02B84A2CAA5FFCA3CF@89080D683268481694DBA6D1E22A2EFF@8F56225C76854A434385A09C319BF9C3@9AB0A3F199183F479F8887D1C3341B1B@845FE0D711086CC2D086546CD26B35C1@9D93A9BE1D2EDE216AA9EBF26BF414BE';
    $browser->get($url);
    But I'm receiving the same error as follows:
    Error GETing http://119.167.217.206:19765/ppvaplaybyopen?url=http://119.167.217.206/%d3%e9%c0%d6%b0%d9%b7%d6%b0%d9-100803-%d0%a1%d6%ed%bd%dc%c2%d7%b7%d6%d7%e9%d0%e3%c7%f2%bc%bc.mp4/segno=0%26&rid=A8F1F5DFEB1B11F1D90B40AD1BB75D69&filelength=21293994&blocksize=2097152&blocknum=11&blockmd5=E210862B3F92935D0883E00AA2A38F08@D793599727C6DA4ACDB1CBF2235004AC@D5E9C9245C9A1BB63BC5EDA862A32604@51B5FDF91356B2B4E943EF72648EB0AD@6F2400488B04EBF66A60336B795EA142@8E51B8DCF87A7A02B84A2CAA5FFCA3CF@89080D683268481694DBA6D1E22A2EFF@8F56225C76854A434385A09C319BF9C3@9AB0A3F199183F479F8887D1C3341B1B@845FE0D711086CC2D086546CD26B35C1@9D93A9BE1D2EDE216AA9EBF26BF414BE: Internal Server Error at E:\pp2.pl line 17
    Is there anything else I can try? Thanks :)
      Internal Server Error

      This means something goes wrong on the server side.

      Really do use a network sniffer: don't just assume you're sending the same data, verify that you actually are.

      The server cannot decide that you are not using a browser unless you do something different from how a browser behaves. You just need to find out where the differences lie.
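Hunting for the differences can be partly mechanised. The sketch below is an illustration only: parse_headers and diff_headers are hypothetical helpers, and the two header blocks are copied from the values quoted in this thread (the Firefox capture and the arguments passed to add_header). It compares what the script asks Mechanize to send, which is not necessarily what ends up on the wire, so a sniffer capture of the script's request would be the better input for the second block.

```perl
use strict;
use warnings;

# Parse "Name: value" lines into a hash keyed by lower-cased header name.
sub parse_headers {
    my ($text) = @_;
    my %headers;
    for my $line ( split /\r?\n/, $text ) {
        next unless $line =~ /^\s*([^:\s]+)\s*:\s*(.*?)\s*$/;
        $headers{ lc $1 } = $2;
    }
    return \%headers;
}

# Return one report entry per header that is missing on one side or differs.
sub diff_headers {
    my ( $browser, $script ) = @_;
    my %seen = map { $_ => 1 } keys %$browser, keys %$script;
    my @report;
    for my $name ( sort keys %seen ) {
        my ( $want, $got ) = ( $browser->{$name}, $script->{$name} );
        if ( !defined $got ) {
            push @report, "missing in script: $name: $want";
        }
        elsif ( !defined $want ) {
            push @report, "extra in script: $name: $got";
        }
        elsif ( $want ne $got ) {
            push @report, "differs: $name\n  browser: $want\n  script:  $got";
        }
    }
    return @report;
}

# Host is left out on purpose: the script never sets it via add_header,
# because LWP supplies it itself.
my $firefox = parse_headers(<<'END');
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1.11) Gecko/20100701 Firefox/3.5.11
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
END

my $script = parse_headers(<<'END');
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1.11) Gecko/20100701 Firefox/3.5.11
Accept: text/xml,application/xml,application/xhtml+xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
END

print "$_\n" for diff_headers( $firefox, $script );
```

With the values quoted in this thread, the only entry reported is the Accept header, whose value in the script does not match the one Firefox sent. Whether that is what triggers the server error is not established here; it is simply the first difference worth eliminating.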

        @Corion, That makes a lot of sense. I'll try to dig a little deeper and see what's causing the problem. Thanks for the pointer :)
