motobói has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I'm asking for your advice.

My company's proxy breaks the HTTP protocol: it insists on delivering the full content instead of the byte range that was requested.

That has been tolerable so far; the problem comes when it's time to yum install anything.

Yum asks for package headers with a Range request to a server that supports it, and the proxy returns the full file. Yum expects a header and receives a full RPM. The result is a wrong checksum, and the install fails.

Having some experience with HTTP::Proxy, I am thinking of writing a BodyFilter to throw away the content outside the requested range. Do you, dear monks, think it is wise to use HTTP::Proxy for this? Is there any other tool capable of doing it?

Sample, untested, non-functional code follows.

The new Proxy Class:

    {
        package HTTP::Proxy::ContentRange;
        use strict;
        use base 'HTTP::Proxy';
        use HTTP::Proxy::HeaderFilter::contentrange;
        use HTTP::Proxy::BodyFilter::complete;
        use HTTP::Proxy::BodyFilter::contentRange;

        sub new {
            my $class = shift;
            my $self  = $class->SUPER::new(@_);

            # We depend on HTTP::Proxy::BodyFilter::complete.
            # mime => undef: filters only match text/* by default, and RPMs are not text.
            $self->push_filter(
                mime     => undef,
                response => HTTP::Proxy::HeaderFilter::contentrange->new,
                response => HTTP::Proxy::BodyFilter::complete->new,
                response => HTTP::Proxy::BodyFilter::contentRange->new,
            );
            return $self;
        }
        1;
    }
The Header Filter:
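    {
        package HTTP::Proxy::HeaderFilter::contentrange;
        use strict;
        use HTTP::Proxy;
        use base 'HTTP::Proxy::HeaderFilter';

        sub filter {
            my ( $self, $headers, $message ) = @_;
            return unless $message->isa('HTTP::Response') and $message->code == 200;

            # The client sends a Range request header ("bytes=first-last");
            # Content-Range only appears in responses. Other forms are left alone.
            my $range = $message->request->header('Range') or return;
            my ( $first, $last ) = $range =~ /^bytes=(\d+)-(\d+)$/ or return;

            # Let's fix that nasty behaviour! ;-)
            $message->code(206);

            # XXX: assumes the upstream Content-Length is the full entity size
            # and that $last lies inside it; '*' means "total length unknown".
            my $total = $headers->header('Content-Length');
            $total = '*' unless defined $total;
            $headers->header( 'Content-Range'  => "bytes $first-$last/$total" );
            $headers->header( 'Content-Length' => $last - $first + 1 );

            # Mark this for body processing!
            $self->proxy->stash( $message->request->uri->as_string => 1 );
        }
        1;
    }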
The Body Filter:
    {
        package HTTP::Proxy::BodyFilter::contentRange;
        use strict;
        use HTTP::Proxy;
        use base 'HTTP::Proxy::BodyFilter';

        sub filter {
            my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
            my $key = $message->request->uri->as_string;

            # Was this response marked for processing?
            if ( $self->proxy->stash($key) ) {
                ######
                # TO IMPLEMENT
                # Parse Content-Range and select data to send.
                ######
                # XXX: Wouldn't it be wise to save the content to a temp file and
                # avoid filling up memory? Maybe use HTTP::Proxy::BodyFilter::save?

                # Delete entry from process table
                delete $self->proxy->stash->{$key};
            }
        }
        1;
    }
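A minimal sketch of what the "TO IMPLEMENT" step could look like once the complete body is in memory. It is plain Perl, untested against HTTP::Proxy itself; slice_range is a hypothetical helper name, and it assumes the body is a byte string and that only single ranges matter.

```perl
use strict;
use warnings;

# Hypothetical helper: given the full body and the request's Range header,
# return the requested bytes plus a matching Content-Range value.
# Handles only the single-range forms "bytes=A-B", "bytes=A-" and "bytes=-N".
sub slice_range {
    my ( $body, $range ) = @_;
    my $total = length $body;
    return unless defined $range && $total > 0;

    my ( $first, $last );
    if ( $range =~ /^bytes=(\d+)-(\d*)$/ ) {
        ( $first, $last ) = ( $1, $2 eq '' ? $total - 1 : $2 );
    }
    elsif ( $range =~ /^bytes=-(\d+)$/ ) {
        # Suffix form: the last N bytes of the entity.
        return if $1 == 0;
        $first = $1 >= $total ? 0 : $total - $1;
        $last  = $total - 1;
    }
    else {
        return;    # multi-range or malformed: leave the response alone
    }

    $last = $total - 1 if $last > $total - 1;    # clamp to the entity size
    return if $first > $last;                    # unsatisfiable range

    return ( substr( $body, $first, $last - $first + 1 ),
        "bytes $first-$last/$total" );
}

my ( $part, $content_range ) = slice_range( 'abcdefghij', 'bytes=2-5' );
print "$part\n$content_range\n";
```

Inside the body filter, the returned slice would replace $$dataref; an empty return list means the response should be passed through untouched.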
Thank you in advance for any commentary.

Replies are listed 'Best First'.
Re: Is wise to use HTTP::Proxy to enforce correct Content-Range responses?
by flipper (Beadle) on Apr 15, 2008 at 21:41 UTC
    I think it would be better to persuade yum to interpret a 200 response to mean that the server could not complete the range request (in this case because it did not receive it), and has returned the complete content. It is common for content filtering proxies to strip range headers from requests, otherwise they could be effectively bypassed by a series of sufficiently small range requests.
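
    A rough sketch of the client-side tolerance described above, not yum's actual code. requested_bytes is a hypothetical helper; the status and body are assumed to come from whatever HTTP client is in use, and $first/$last are the byte offsets that were asked for.

```perl
use strict;
use warnings;

# Hypothetical client-side helper: accept either a 206 or a 200 answer
# to a request for bytes $first..$last.
sub requested_bytes {
    my ( $status, $body, $first, $last ) = @_;

    # 206: the range was honoured, so the body already is the slice.
    return $body if $status == 206;

    # 200: the range was ignored and the whole entity came back, so cut
    # the requested window out locally instead of failing.
    if ( $status == 200 ) {
        return if $first >= length $body;
        return substr( $body, $first, $last - $first + 1 );
    }

    return;    # anything else is treated as a failure
}

print requested_bytes( 200, 'abcdefghij', 2, 5 ), "\n";
print requested_bytes( 206, 'cdef',       2, 5 ), "\n";
```

    The cost is bandwidth: the whole file still crosses the wire on a 200, which is exactly what the range request was meant to avoid.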

    Section 14.16 of RFC 2616 does not allow for proxies behaving in this way - if they are to function (and I think they are here to stay), the behaviour you are reporting is arguably the best available to them.

    I'm sure there are enough people behind similar (possibly transparent) proxies to make accommodating this in yum worthwhile.
      Yeah, it would be nice if one could persuade yum to do that. But, as written at http://wiki.linux.duke.edu/YumFaq, question 5:
      Q. 5: I get an "Errno -1 Header is not complete." error from yum - what the heck is going on?
      A. It's probably a proxy somewhere between you and the repository. You may not think that a proxy is in the way even though it really is.
      
      (...)
      
      The solutions to this problem are:
      
      - Get your proxy software/firmware updated so that it properly implements HTTP 1.1
      - Use an FTP repository, where byte ranges are more commonly supported by the proxy
      - Create a local mirror with rsync and then point your yum.conf to that local mirror
      - Don't use yum
      
        I think this is similar to the fuss about ECN in the Linux kernel - in an ideal world, everyone would follow the spec; in the real world, the yum developers are making life difficult for people behind these proxies, over which they may well have no control. See also Postel's prescription, TCP window scaling and standards mode - in this case I think that these proxies are not going to go away, and it's churlish not to work with them.
Re: Is wise to use HTTP::Proxy to enforce correct Content-Range responses?
by motobói (Beadle) on Apr 23, 2008 at 22:29 UTC
    Fellow monks.

    Following up on this story.

    Since the Yum people don't want to modify it, and my network admin won't change the gateway either, I tried writing this little piece of crap:

    Please pay attention to HTTP::Proxy::ContentRange::Body::filter.

    Actually, it works pretty well as long as the file isn't big enough to cause a timeout while Yum waits for the end of the transmission, since I keep receiving and discarding data from the upstream server.

    What I need is to stop the communication with the client, i.e. yum, when the end of the requested range is found.

    Something like issuing some sort of EOF on the client socket, closing it, and also closing the LWP::UserAgent upstream connection.
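
    Detecting the end of the range chunk by chunk is the easy half, and it can be sketched in plain Perl. make_range_tracker is a hypothetical helper; how to actually tear down the client and upstream connections from inside HTTP::Proxy once the flag is set is not shown here and remains the open question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that is fed body chunks in order
# and hands back only the bytes inside [$first, $last], plus a "done" flag.
sub make_range_tracker {
    my ( $first, $last ) = @_;
    my $offset = 0;    # absolute offset of the next chunk's first byte

    return sub {
        my ($chunk) = @_;
        my $start = $offset;
        my $end   = $offset + length($chunk) - 1;
        $offset += length $chunk;

        my $out = '';
        if ( $end >= $first && $start <= $last ) {
            # Translate the absolute window into offsets within this chunk.
            my $from = $first > $start ? $first - $start : 0;
            my $to   = $last < $end    ? $last - $start  : $end - $start;
            $out = substr( $chunk, $from, $to - $from + 1 );
        }

        # Done once every byte up to $last has been seen.
        return ( $out, $offset > $last ? 1 : 0 );
    };
}

my $track = make_range_tracker( 3, 6 );
for my $chunk ( 'abcd', 'efgh', 'ijkl' ) {
    my ( $keep, $done ) = $track->($chunk);
    print "keep='$keep' done=$done\n";
    last if $done;    # the third chunk is never needed
}
```

    Because nothing outside the window is kept, memory use stays at one chunk regardless of the file size.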

    Anyway, I've begun to feel that subclassing HTTP::Proxy::BodyFilter isn't the way to make this work, at least not in a beautiful way. Maybe subclass HTTP::Proxy?

    I keep asking: is this the best approach to the original problem?

    I'm willing to receive criticism of my code style, and idiomatic or semantic tips too.

    Thank you very much.