cmac has asked for the wisdom of the Perl Monks concerning the following question:

I've gotten involved with a couple of Perl modules that deal with Proxy Auto Configuration (PAC). The basic JavaScript function FindProxyForURL returns a proxy address that includes a port (e.g., 800) but not a scheme (e.g., http). Yet LWP::UserAgent seems to want a scheme in the second argument of its $ua->proxy() call. My modules typically sit between these two pieces of software.
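To make the mismatch concrete, here is a minimal sketch of pulling the host:port entries out of a FindProxyForURL result. It assumes the conventional PAC return format of semicolon-separated entries such as "PROXY host:port" and "DIRECT". The function name parse_pac_result and the proxy.example.com host are made up for illustration and do not come from either module.

```perl
use strict;
use warnings;

# Split a FindProxyForURL result string into its entries.
# Returns a list of [ $type, $hostport ] pairs; $hostport is undef
# for entries such as DIRECT that carry no address.
# (parse_pac_result is a hypothetical helper, not part of any module.)
sub parse_pac_result {
    my ($result) = @_;
    my @entries;
    for my $entry (split /\s*;\s*/, $result) {
        $entry =~ s/^\s+|\s+$//g;          # trim stray whitespace
        next unless length $entry;
        my ($type, $hostport) = split ' ', $entry, 2;
        push @entries, [ uc $type, $hostport ];
    }
    return @entries;
}

# Example input; the host name and port are placeholders.
my @entries = parse_pac_result('PROXY proxy.example.com:8080; DIRECT');
for my $e (@entries) {
    my ($type, $hostport) = @$e;
    print $type, (defined $hostport ? " $hostport" : ''), "\n";
}
```

Note that nothing in the parsed entry says which scheme to put in front of the host:port pair, which is exactly the gap the module has to fill.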

I've read some Web pages that recommend that proxy servers not listen on any of the standard ports. That means my module would have a hard time deriving a scheme from the port number returned by FindProxyForURL.

The choices for my module are:
1. Always return http in the scheme part of the URI object it returns,
2. Return the scheme from the URL with which it was called, or
3. Return a URI object without a scheme.
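The three choices can be sketched side by side as follows. This is only an illustration under stated assumptions: it returns plain strings rather than URI objects, the function name proxy_string is hypothetical, and the fallback to http when the target URL has no recognizable scheme is my own assumption.

```perl
use strict;
use warnings;

# $hostport: what the PAC function gave us, e.g. 'proxy.example.com:8080'
# $target:   the URL being fetched
# $choice:   1, 2, or 3, matching the numbered list above
# (proxy_string is a hypothetical helper used only for illustration.)
sub proxy_string {
    my ($hostport, $target, $choice) = @_;

    # Choice 1: always use http.
    return "http://$hostport" if $choice == 1;

    # Choice 2: copy the scheme of the target URL.
    if ($choice == 2) {
        my ($scheme) = $target =~ m{^([A-Za-z][A-Za-z0-9+.\-]*):};
        $scheme = 'http' unless defined $scheme;   # assumed fallback
        return lc($scheme) . "://$hostport";
    }

    # Choice 3: no scheme at all.
    return $hostport;
}

my $target = 'ftp://ftp.example.org/pub/';
print "choice $_: ", proxy_string('proxy.example.com:8080', $target, $_), "\n"
    for 1 .. 3;
# choice 1: http://proxy.example.com:8080
# choice 2: ftp://proxy.example.com:8080
# choice 3: proxy.example.com:8080
```

One thing to watch with choice 3: the generic URI syntax lets a scheme contain letters, digits, "+", "-" and ".", so a URI parser can read the proxy.example.com: prefix of a bare host:port string as the scheme.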

Advice will be much appreciated. I should go find a network-oriented forum to ask this on, but Perl Monks know everything!

Thanks,
cmac

Re: scheme for proxy?
by Anonymous Monk on Mar 11, 2010 at 07:38 UTC

      Those are in fact the two modules for which I am now working on new versions. (Both original authors have agreed to make me co-maintainer.)

      Your comment sounds like you're recommending my choice 2 (copy the scheme of the target URL). That's what ProxyPAC does now. ProxyAutoConfig simply returns what FindProxyForURL returns, namely a URL without a scheme.

      I'm trying to resolve an HTTP::ProxyPAC bug report against returning the target scheme, in which the reporter states that one can't have more than one scheme/protocol on a given port. That's true for an endpoint server, but I'm not sure it's true for a proxy.

      But consider what the next-downstream software entity (like LWP::UserAgent) needs to do. It needs to send its request/original URL to the given port on the proxy, which is identified by either its name or its IP address. One purpose of a scheme is to identify a default port (like http->80), but we already have an explicit port. The other purpose of a scheme is to identify the protocol that should be spoken/exchanged.
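The point about default ports can be sketched like this: an explicit port always wins, and the scheme only matters for the port when none is given. The function name connect_endpoint and the small default-port table are assumptions for illustration; this is not how LWP::UserAgent does it, and the pattern does not handle IPv6 literals.

```perl
use strict;
use warnings;

# Well-known default ports for a few schemes.
my %default_port = (http => 80, https => 443, ftp => 21);

# Return ($host, $port) for 'scheme://host[:port]' or bare 'host:port'.
# An explicit port is used as-is; the scheme is consulted only when
# the port is missing. (connect_endpoint is a hypothetical helper.)
sub connect_endpoint {
    my ($proxy) = @_;
    my ($scheme, $host, $port) =
        $proxy =~ m{^(?:([A-Za-z][A-Za-z0-9+.\-]*)://)?([^/:]+)(?::(\d+))?/?$}
        or return;
    $port = $default_port{lc $scheme}
        if !defined $port && defined $scheme;
    return ($host, $port);
}

for my $proxy ('proxy.example.com:8080',
               'ftp://proxy.example.com:8080',
               'http://proxy.example.com') {
    my ($host, $port) = connect_endpoint($proxy);
    print "$proxy => connect to $host port $port\n";
}
```

In the first two cases the connection target is the same, so the scheme contributes nothing to where the request goes; what remains is the second purpose, naming the protocol to speak.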

      I think I will test how LWP::UserAgent reacts when I give it a proxy without a scheme, and let the result decide whether to go with choice 3. Then I will experiment with the proxy at work to refine the choice further, unless anyone else has further words of wisdom.
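A minimal sketch of that experiment might look like the following. It only checks whether $ua->proxy() accepts each form without dying; it does not send a request, so "accepted" says nothing about whether a fetch through the proxy would succeed. The host names and addresses are placeholders, probe_proxy_forms is a hypothetical helper, and the outcome may well differ between LWP versions, so no expected output is shown.

```perl
use strict;
use warnings;

# Candidate forms of the second argument to $ua->proxy().
my @candidates = (
    'http://proxy.example.com:8080',   # choice 1 or 2: explicit scheme
    'proxy.example.com:8080',          # choice 3: bare host:port
    '192.0.2.10:8080',                 # choice 3 with an IP address
);

# Report, for each form, whether $ua->proxy() accepted it or died.
sub probe_proxy_forms {
    my @forms = @_;

    # LWP::UserAgent comes from CPAN, so load it at run time and
    # report a skip instead of failing when it is not installed.
    unless (eval { require LWP::UserAgent; 1 }) {
        return { map { $_ => 'skipped: LWP::UserAgent not installed' } @forms };
    }

    my %result;
    for my $form (@forms) {
        my $ua = LWP::UserAgent->new;
        my $ok = eval { $ua->proxy(http => $form); 1 };
        $result{$form} = $ok ? 'accepted' : "rejected: $@";
    }
    return \%result;
}

my $result = probe_proxy_forms(@candidates);
for my $form (@candidates) {
    (my $outcome = $result->{$form}) =~ s/\s+\z//;   # drop trailing newline
    print "$form => $outcome\n";
}
```

Trying both a host name and an IP address is deliberate, since a name followed by a colon can look like a scheme to a URI parser while a leading digit cannot.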

      Thanks for replying,
      cmac