in reply to Re: •Re: Chaining proxies with LWP::UserAgent
in thread Chaining proxies with LWP::UserAgent

As the documentation for LWP says, the best place to discuss exact details for complex stuff is libwww@perl.org. However, from browsing the documentation and code, it looks like there is an assumption built into the code that all requests will be mediated through only one level of proxy, and that which proxy to use will be a function of the communication scheme used. (In other words, it assumes a configuration which matches any browser you have ever seen.)

But if you use the full interface, your LWP::UserAgent object has methods named send_request(), simple_request() and just request(), which send off requests with various levels of preparation and munging first. The important thing is that the requests themselves can be anything that you want.

So I would suggest configuring no proxy, then sending your own CONNECT requests, and seeing whether you can get it to send the correct sequence to chain levels of proxying.

PS I have only seen multiple levels of proxying used by people who were attempting to use various open proxies to anonymize themselves. Having looked at the traffic that passed through one such proxy, the users paid attention to how much the server reported to others, and didn't seem to realize that the server they are proxying off of can keep logs including the information not passed on, and do things like hand it over to law enforcement... (Yes, I do know of at least one case where law enforcement took full advantage of this.)

Re: Re: Re: •Re: Chaining proxies with LWP::UserAgent
by Anonymous Monk on Jun 28, 2003 at 21:25 UTC
    But if you use the full interface, your LWP::UserAgent object has methods named send_request(), simple_request() and just request(), which send off requests with various levels of preparation and munging first. The important thing is that the requests themselves can be anything that you want.
    If only that were the case.

    The syntax of CONNECT is: CONNECT host:port HTTP/1.0.
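As a minimal sketch of that syntax, the request can be built as a plain string. The helper name `connect_request` and the port number are illustrative assumptions, not part of LWP or any other module.

```perl
use strict;
use warnings;

# Build the raw bytes of a CONNECT request for a given target.
# HTTP lines end in CRLF; an empty line terminates the header block.
sub connect_request {
    my ($host, $port) = @_;
    my $crlf = "\015\012";
    return "CONNECT $host:$port HTTP/1.0$crlf$crlf";
}

# Hypothetical target, for illustration only.
print connect_request("proxy2_host", 8080);
```

Note that the request target is the bare host:port pair, with no leading slash and no scheme.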

    LWP overloads the URI field to contain both a) the host and port to connect to (to send the request to) and b) the URI to send in the request. I do HTTP::Request->new("CONNECT", "http://proxy1_host:proxy1_port/proxy2_host:proxy2_port", ...) and LWP sends CONNECT /proxy2_host:proxy2_port to proxy1_host:proxy1_port. So I hack LWP::Protocol::http to not send the slash.

    Now comes the tunnelling. I'm supposed to be able to communicate through the tunnel once the connection is established and the tunnel sends HTTP/1.0 200 Connection established. I thought this would be as simple as sending an HTTP request in the content field:

    $req = HTTP::Request->new("CONNECT", "http://proxy1_host:proxy1_port/proxy2_host:proxy2_port", HTTP::Headers->new(), "GET http://final_destination.example.com/ HTTP/1.0\cM\cJ\cM\cJ");

    Not so. The content comes before the response (which is expected, I guess):

    CONNECT proxy2_host:proxy2_port HTTP/1.1
    TE: deflate,gzip;q=0.3
    Connection: TE, close
    Host: proxy2_host:proxy2_port
    User-Agent: libwww-perl/5.68
    Content-Length: 39

    GET http://final_destination.example.com/ HTTP/1.0

    HTTP/1.0 200 Connection established

    So I can't get there from here, through LWP, as far as I can see.
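For comparison, here is a rough sketch of the ordering a tunnel needs when speaking HTTP by hand on a raw socket: send CONNECT, read the whole reply, and only then write the next request. The function names (`status_code`, `open_tunnel`, `open_chain`) and the hosts and ports in the usage comment are hypothetical, and this is not how LWP works internally.

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Extract the numeric status code from a response status line,
# e.g. "HTTP/1.0 200 Connection established". Returns undef if malformed.
sub status_code {
    my ($line) = @_;
    return $line =~ m{^HTTP/\d+\.\d+\s+(\d{3})\b} ? $1 : undef;
}

# Ask the proxy on the far end of $sock to tunnel to $target ("host:port").
# Nothing else is written until the proxy's reply has been read in full.
sub open_tunnel {
    my ($sock, $target) = @_;
    print {$sock} "CONNECT $target HTTP/1.0$CRLF$CRLF";

    my $status = readline($sock);
    die "no reply to CONNECT $target\n" unless defined $status;
    my $code = status_code($status);
    die "CONNECT $target failed: $status"
        unless defined($code) && $code =~ /^2/;

    # Discard the remaining response headers, up to the blank line.
    while (defined(my $line = readline($sock))) {
        last if $line =~ /^\015?\012\z/;
    }
    return $sock;
}

# Tunnel through each hop in turn; the last entry is the final destination.
sub open_chain {
    my ($sock, @targets) = @_;
    open_tunnel($sock, $_) for @targets;
    return $sock;
}

# Usage (hypothetical hosts and ports):
#   use IO::Socket::INET;
#   my $sock = IO::Socket::INET->new(PeerAddr => "proxy1_host:8080") or die $!;
#   $sock->autoflush(1);
#   open_chain($sock, "proxy2_host:8080", "final_destination.example.com:80");
#   print {$sock} "GET / HTTP/1.0${CRLF}Host: final_destination.example.com$CRLF$CRLF";
#   print while <$sock>;
```

The socket must be unbuffered (or flushed) on the write side, otherwise the CONNECT may sit in the output buffer while the code waits for a reply.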

    HTTP::Lite suffers from the same disease.

    request ( $url, $data_callback, $cbargs )
    Initiates a request to the specified URL.

    The host to connect to and the request to send are stuffed inside the $url parameter. I don't want to use URIs more than necessary, I just want to speak HTTP.

    HTTP::MHTTP also crams the host to connect to and the request into a URI, passed to http_call.

    Net::HTTPTunnel can do HTTP tunnelling through arbitrary TCP services. The only drawback is that I'll have to write my own HTTP, unless somehow one of these HTTP modules can be instructed to communicate on a given socket created by Net::HTTPTunnel. That's a reasonable trade-off, I suppose.

    PS I have only seen multiple levels of proxying used by people who were attempting to use various open proxies to anonymize themselves.
    RFC 2817 seems to disagree:
    It may be the case that the proxy itself can only reach the requested origin server through another proxy. In this case, the first proxy SHOULD make a CONNECT request of that next proxy, requesting a tunnel to the authority.
    I try to code for all relevant cases, and since I'm writing a program mainly for proxies this is relevant.

    It's definitely possible that one will use an open proxy, but my program will also accept internal proxies; no distinction is made. Perhaps I could do a DNSRBL lookup, although such a check would slow down execution.

    Having looked at the traffic that passed through one such proxy, the users paid attention to how much the server reported to others, and didn't seem to realize that the server they are proxying off of can keep logs including the information not passed on, and do things like hand it over to law enforcement... (Yes, I do know of at least one case where law enforcement took full advantage of this.)
    Thanks for the heads-up -- I plan to take full advantage of this, also; but that's another thread.
      Having looked at the LWP::UserAgent and LWP::Protocol::http code, I think it should be doable to edit them to add support for the proxy being either an array ref or a string of proxies chained with some convenient divider (e.g. |). The idea is that wherever you see a reference to a proxy, you just loop through the proxies and do for each of them what is currently done for one. Having scanned it, adding chained proxy support for HTTP looks doable (just a handful of lines in a couple of modules; grep for "proxy", and run "perldoc -l LWP::UserAgent" to find where it is on your system) and probably a lot easier than writing significant new code. LWP is distributed under the same terms as Perl, which are pretty generous, so legal issues are likely not a problem for you.
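To make the idea concrete, here is a small sketch of normalising either form into a list of hops. The helper name `proxy_chain` and the proxy URLs are hypothetical; nothing like this exists in stock LWP, and the divider handling (a "|" or whitespace) is only one possible choice.

```perl
use strict;
use warnings;

# Normalise a proxy specification into a list of proxy URLs.
# Accepts an array ref, or a string with proxies separated by "|" or whitespace.
sub proxy_chain {
    my ($spec) = @_;
    return () unless defined $spec;
    my @hops = ref($spec) eq 'ARRAY' ? @$spec : split /\s*\|\s*|\s+/, $spec;
    return grep { defined($_) && length($_) } @hops;
}

# Every place that handled one proxy would then loop over the chain.
for my $proxy (proxy_chain("http://proxy1_host:8080/ | http://proxy2_host:8080/")) {
    print "hop: $proxy\n";
}
```

Returning a plain list keeps the single-proxy case working unchanged, since a lone URL simply yields a chain of length one.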

      If you don't think your Perl is up to it, ask on the appropriate list and you will likely find someone else who can do it. (The speed with which they add features for you might be affected by financial encouragement...)

      If you then contribute that back, a lot of people will suddenly be able to easily use chained proxies in Perl if they need it. :-)

      PS My comment about chained proxies was not a statement that they aren't useful, just a comment about where I happened to have seen them before.

        Having looked at the LWP::UserAgent and LWP::Protocol::http code, I think it should be doable to edit them to add support for the proxy being either an array ref or a string of proxies chained with some convenient divider (e.g. |).

        If you then contribute that back, a lot of people will suddenly be able to easily use chained proxies in Perl if they need it. :-)

        Okay, I did it.

        http.pm.diff and UserAgent.pm.diff

        Those patches allow the second argument of the proxy() method to be either an arrayref, or a whitespace-separated list of proxies.

        I sent the patches off to Gisle; hopefully he will accept them for the next version. If so, proxy chaining via LWP will be available to all! :)