Eyck has asked for the wisdom of the Perl Monks concerning the following question:

Hello noble monks, I'm trying to edit HTTP requests on the fly. For that I've got a proxy that intercepts requests, edits them, and then sends them along. This works fine until you start POSTing things using some binary encoding. Then it appears that my messing with those bytes removes the '^M' characters and replaces them with Unix newlines, i.e. the incoming request looks like this:
POST /something/fileupload.html^M
HTTP/1.1^M
Accept: image/pjpeg, */*^M
Accept-Language: pl^M
Content-Type: multipart/form-data;
boundary=---------------------------7d42228760176^M
and outgoing looks like this:
POST /something/fileupload.html
HTTP/1.1
Accept: image/pjpeg, */*
Accept-Language: pl
Content-Type: multipart/form-data;
boundary=---------------------------7d42228760176
After receiving such a request, webservers hang, waiting for the client to finish; entering '^M' anywhere makes them continue. The question is: how do I go about this? My code looks roughly like this:
@indata=split(/^M/,$data); foreach (@indata) { $dataout.=$_; };
... I also tried this:
@indata=split(/^M/,$data); foreach (@indata) { push @outdata,$_; }; $outdata=join(/^M/,@outdata);
But it seems the only way to put a "^M" character in the outgoing string is to do this:
$outdata.="^MHelloWorld^M";
Is there some way to make Perl use "^M" in newlines? I already tried setting $/="\r\n";$\="\r\n", but that doesn't change this behaviour.

Replies are listed 'Best First'.
Re: Web and newlines, aka perl vs ^M
by matija (Priest) on Mar 12, 2004 at 11:42 UTC
    I think that you will have more luck if you indicate ^M as \r, and thus
    split(/\r/,...)
    and join('\r',...).

    Or you could use the \xnn form and give its hex code.

    Update: thanks to Corion: change the '\r' in the join up there to "\r". I had a braindead moment; of course the single quotes don't interpolate the backslash notation.
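
The quoting difference mentioned in the update can be seen in a small sketch. The sample string in `$data` is made up for illustration, and `\x0D` is assumed to be the hex code meant by the "\xnn form" (true on ASCII platforms).

```perl
use strict;
use warnings;

# Made-up sample: three lines separated by a carriage return (^M, \r, \x0D).
my $data = "line one\rline two\rline three";

# Split on the carriage return using the escape, not a literal caret and M.
my @indata = split /\r/, $data;

# Double quotes interpolate "\r" into a single CR character...
my $good = join "\r", @indata;

# ...while single quotes leave a backslash followed by the letter r.
my $bad = join '\r', @indata;

# The hex escape names the same character as "\r" on ASCII platforms.
my $hex = join "\x0D", @indata;

print "round trip ok\n"        if $good eq $data;
print "hex form identical\n"   if $hex eq $good;
print "single quotes differ\n" if $bad ne $data;
```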

Re: Web and newlines, aka perl vs ^M
by bart (Canon) on Mar 12, 2004 at 11:59 UTC
    It smells to me like you're on Windows, and that you need to use binmode on both the input and the output of that proxy, so that it becomes more transparent, meaning it will no longer change anything. That way you'll get the wretched "\cM" characters everywhere, even in your headers, so you might have to change the code a little to accommodate them.
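
A minimal sketch of the binmode suggestion, assuming the proxy reads from one handle and writes to another. The helper name `relay_raw` and the sample request are hypothetical, and in-memory handles stand in for the two sockets so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: copy everything from one handle to another without
# any newline translation. In a real proxy $in and $out would be sockets.
sub relay_raw {
    my ($in, $out) = @_;
    binmode $in;     # turn off CRLF <-> LF translation on input
    binmode $out;    # and on output
    local $/ = \4096;    # read fixed-size chunks rather than "lines"
    while (defined(my $chunk = <$in>)) {
        print {$out} $chunk;
    }
    return;
}

# Made-up request used only for the demonstration.
my $request = "POST /something/fileupload.html HTTP/1.1\015\012"
            . "Accept-Language: pl\015\012\015\012";

open my $in, '<', \$request or die "open: $!";
my $copy = '';
open my $out, '>', \$copy or die "open: $!";
relay_raw($in, $out);
close $out;

print "bytes preserved\n" if $copy eq $request;
```

On a Unix-like system the copy would most likely be identical even without binmode; the calls matter on platforms where Perl's default I/O layer translates line endings.
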
Re: Web and newlines, aka perl vs ^M
by Eyck (Priest) on Mar 12, 2004 at 11:38 UTC
    join("^M",@outdata); instead of join(/^M/,@outdata); seems to be part of the solution...
Re: Web and newlines, aka perl vs ^M
by iburrell (Chaplain) on Mar 12, 2004 at 22:00 UTC
    I will quote the HTTP spec: "HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all protocol elements except the entity-body".

    This means the request line and headers should use CRLF. Clients are sloppy about this and servers are forgiving, but you should still be careful about the end-of-line characters. And don't use "\r\n", which is interpreted differently on different platforms; use "\015\012" instead.

    Also, the body should not have the line-endings changed. It should be treated like a binary block. One advantage is that you don't need to examine the media type and worry about images being corrupted.
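
One way to follow this advice is to split the request at the first empty line, edit only the header lines, and pass the body through as an opaque block. This is a sketch under assumptions: the helper name `edit_request`, the callback interface, and the sample request are all made up for illustration.

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# Hypothetical helper: rewrite the header lines of a raw request with the
# supplied callback and leave the body bytes exactly as received.
sub edit_request {
    my ($raw, $edit) = @_;

    # The head ends at the first empty line; the limit of 2 keeps any
    # CRLF CRLF sequences inside the body from being split.
    my ($head, $body) = split /\015\012\015\012/, $raw, 2;
    $body = '' unless defined $body;

    my @lines = map { $edit->($_) } split /\015\012/, $head;

    return join($CRLF, @lines) . $CRLF . $CRLF . $body;
}

# Made-up request; the body contains a bare CR, a bare LF and CRLF pairs
# on purpose, to show that none of them are altered.
my $body = "bin\015ary\012data\015\012\015\012more";
my $raw  = join($CRLF,
    'POST /something/fileupload.html HTTP/1.1',
    'Accept-Language: pl',
    'Content-Length: ' . length($body),
) . $CRLF . $CRLF . $body;

# Example edit: change the Accept-Language header only.
my $out = edit_request($raw, sub {
    my ($line) = @_;
    $line =~ s/^Accept-Language: pl$/Accept-Language: en/;
    return $line;
});

print "body untouched\n" if substr($out, -length($body)) eq $body;
```

Because the body is never split or rejoined, its length stays the same and the Content-Length header remains valid without recalculation.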