
CGI: newlines, write exactly "\r\n" to end the headers, then turn off binmode

by 7stud (Deacon)
on Mar 09, 2018 at 09:26 UTC (perlquestion, id://1210553)

7stud has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

In a CGI script, I want to end the headers. All the scripts I see use print "\n\n";, but as far as I can tell that is incorrect on Unix because "\n" gets written untranslated as "\n", yet the HTTP 1.1 spec requires that the two-character sequence CRLF be used to end the headers. So I want to write exactly "\r\n" to STDOUT and avoid any newline translations. I can do that with binmode(STDOUT), but then how do I turn off binmode() for STDOUT, so that I can then print regular text in the response body?

Is turning off binmode() documented anywhere? Should I be using syswrite() instead?

Replies are listed 'Best First'.
Re: CGI: write exactly "\r\n" to end the headers, then turn off binmode
by choroba (Cardinal) on Mar 09, 2018 at 09:33 UTC
    As documented, binmode can take a LAYER argument, which can be even :crlf or whatever you like.

    In a CGI script, you usually don't need to care about ending the headers. Just use the $cgi->header(...) method correctly.

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

      Thanks for the responses.

      In a CGI script, you usually don't need to care about ending the headers. Just use the $cgi->header(...) method correctly.

      In my case, I can't do that. I'm trying to read some simple json data in the body of a post request, but $cgi->{POSTDATA} gives me inconsistent results depending on my server. With an apache server, POSTDATA successfully contains the json. However, I tried my perl script on another server, and POSTDATA returns undef. To solve that issue, I'm reading from STDIN directly. I looked at the source code for CGI.pm, and I don't understand why CGI.pm fails to read the json while executing on my non-apache server, because I can get the json when I read from STDIN directly doing this:

      (The cgi spec requires that a script not try to read more than Content-Length from STDIN.)

      #!/usr/bin/env perl
      use strict;
      use warnings;
      use 5.020;
      use autodie;
      use Data::Dumper;
      use JSON;

      if (my $content_len = $ENV{CONTENT_LENGTH}) {
          read(STDIN, my $json, $content_len);  #<===HERE ************
          my $href = decode_json($json);
          my $a = $href->{a};
          my $b = $href->{b};

          print 'Content-type: text/html';
          print "\r\n\r\n";
          print "<div>a=$a</div>";
          print "<div>b=$b</div>";
      }
      else {
          my $error = "Could not read json: No Content-Length header in request.";
          print 'Content-type: text/html';
          print "\r\n\r\n";
          print "<div>$error</div>";
      }

      And, once I read from STDIN directly, then none of perl's CGI functionality works.
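
The direct read above can be made a little more defensive. The following is only a sketch, assuming the body is JSON and that the handle behaves like STDIN under CGI; `read_body` is a hypothetical helper name, and JSON::PP (a core module) stands in for the JSON module used in the post. It validates the length, puts the handle in binary mode, and refuses a short body instead of passing truncated data to the decoder.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: read exactly $len bytes of request body from $fh.
sub read_body {
    my ($fh, $len) = @_;
    die "missing or invalid Content-Length\n"
        unless defined $len && $len =~ /\A[0-9]+\z/;
    binmode $fh;                        # the body is bytes, not text
    my $body = '';
    my $got  = read($fh, $body, $len);  # never asks for more than $len
    die "read failed: $!\n" unless defined $got;
    die "short body: expected $len bytes, got $got\n" if $got != $len;
    return $body;
}

# Demonstration with an in-memory handle standing in for STDIN.
my $payload = '{"a":1,"b":"two"}';
open my $in, '<', \$payload or die "cannot open in-memory handle: $!";
my $data = decode_json(read_body($in, length $payload));
print "a=$data->{a} b=$data->{b}\n";
```

In a real script the call would presumably be `read_body(\*STDIN, $ENV{CONTENT_LENGTH})`.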
        What server? What version of CGI.pm? You can create CGI objects without having them read from STDIN.
Re: CGI: newlines, write exactly "\r\n" to end the headers, then turn off binmode
by Corion (Patriarch) on Mar 09, 2018 at 09:53 UTC

    You can still print regular (ASCII) text to STDOUT after you've used binmode(). Actually, that is what I would recommend, or alternatively, telling Perl what you actually intend to write (like binmode STDOUT, ':encoding(UTF-8)' if you're sending that).

    Personally, I'm more a fan of explicitly using encode from Encode though, and leaving STDOUT in raw mode (binmode STDOUT; without a layer argument, which has the same effect as binmode STDOUT, ':raw';).
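
The approach Corion describes could look roughly like this. It is a minimal sketch, not code from the thread: `build_response` is a made-up helper, and the header and body are arbitrary sample values. The whole response is assembled as bytes, with explicit CR LF after the header and the body run through `encode`, and STDOUT is left in binary mode so that nothing is translated on the way out.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: build a complete CGI response as a byte string.
sub build_response {
    my ($html) = @_;
    my $crlf = "\x0D\x0A";                  # explicit CR LF bytes
    my $body = encode('UTF-8', $html);      # characters -> UTF-8 bytes
    my $head = "Content-Type: text/html; charset=UTF-8" . $crlf;
    return $head . $crlf . $body;           # blank line ends the headers
}

my $response = build_response("<div>\x{FC}</div>\n");

binmode STDOUT;     # no layer argument: bytes go out untranslated
print $response;
```

Since every byte is decided before printing, there is no need to switch layers between the headers and the body.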

Re: CGI: newlines, write exactly "\r\n" to end the headers, then turn off binmode
by haukex (Archbishop) on Mar 10, 2018 at 11:40 UTC

    First, I agree with the others that the best way to go is CGI.pm's header function.

    I can do that with binmode(STDOUT), but then how do I turn off binmode() for STDOUT, so that I can then print regular text in the response body?

    I don't think this part of the question has been answered, so yes, you can call binmode on a handle as often as you like. See the following example, in which I've shown how I would do it if I wanted to be sure Perl was sending CR+LF to STDOUT (or in this case, the currently selected handle, normally STDOUT).

    $ cat test.pl
    binmode select, ':crlf';
    print "\x{FC}\n";
    binmode select, ':raw';
    print "\x0D\x0A";
    binmode select, ':encoding(UTF-8)';
    print "\x{FC}\n";
    $ perl -wMstrict test.pl | hexdump -C
    00000000  fc 0d 0a 0d 0a c3 bc 0a                           |........|
    00000008

Re: CGI: newlines, write exactly "\r\n" to end the headers, then turn off binmode
by ikegami (Patriarch) on Mar 09, 2018 at 17:39 UTC

    You're not sending an HTTP response, though. You're sending a CGI response. What server are you using? Apache accepts LF. (I think it also accepts CRLF.)

      You're not sending an HTTP response, though. You're sending a CGI response. What server are you using? Apache accepts LF.

      Interesting. I sent a curl request to my Apache server, and I sniffed the response using Wireshark, and Apache performs conversions on the characters that a cgi script uses to terminate the headers. Here are the conversions that I observed:

      1. "\n\n" => "\r\n\r\n"
      2. "\r\n\r\n" => "\r\n\r\n"
      3. "\r\r" => Internal server error
      4. "\n\r\n\r" => "\r\n\r\n" plus a \r to start the body of the response

      I think Apache is following the dictum: "Be permissive in what you allow, but be strict in what you do." Likewise, I am going to be strict in what I do, and I'm not going to rely on a server to convert newlines to the HTTP 1.1 spec.

      (I think it also accepts CRLF.)

      It seems obvious to me that any http server would accept the header termination characters in the HTTP 1.1 spec? Why the hesitation? Were you speculating that "\r\n\r\n" might get converted to "\r\r\n\r\r\n"?

        Likewise, I am going to be strict in what I do, and I'm not going to rely on a server to convert newlines to the HTTP 1.1 spec.
        Why do you feel the need to output a response compliant to the HTTP spec when you are not returning an HTTP response? You're returning a CGI response, thus you should be concerned with following the CGI spec, not the HTTP spec. So what does the CGI spec say?
        6.2.  Response Types
        
           The response comprises a message-header and a message-body, separated
           by a blank line.  The message-header contains one or more header
           fields.  The body may be NULL.
        
              generic-response = 1*header-field NL [ response-body ]
        
        It says headers are ended by a blank line, as delimited with "NL". Hmm, NL. Obviously means "new line", but what ASCII character sequence might that be?
        6.3.4.  Protocol-Specific Header Fields
        
           The script MAY return any other header fields that relate to the
           response message defined by the specification for the SERVER_PROTOCOL
           (HTTP/1.0 [1] or HTTP/1.1 [4]).  The server MUST translate the header
           data from the CGI header syntax to the HTTP header syntax if these
           differ.  For example, the character sequence for newline (such as
           UNIX's US-ASCII LF) used by CGI scripts may not be the same as that
           used by HTTP (US-ASCII CR followed by LF).
        
        Apache isn't capriciously translating \n to \r\n just for the sake of being permissive. The CGI spec says that CGI-supporting web servers "MUST" do this, and it explicitly allows CGI scripts to use different line endings than those prescribed by the HTTP spec, even giving Unix-style \n-only line endings as an example of a possible alternative.

        You can rely on servers to do this conversion because any server which doesn't is not compliant with the CGI spec.
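
To make the required translation concrete, here is a toy model of it. This is an illustration only, not Apache's implementation: `cgi_to_http` is an invented name, and a real server also adds a status line and headers of its own. It accepts LF or CRLF after each header line, rewrites them all to CRLF, and passes the body through untouched.

```perl
use strict;
use warnings;

# Toy model of the newline translation the CGI spec requires of a server.
# Not Apache's actual code; cgi_to_http is a made-up name.
sub cgi_to_http {
    my ($cgi_output) = @_;
    # Split at the first blank line; each line may end in LF or CRLF.
    my ($head, $body) = $cgi_output =~ /\A(.*?\r?\n)\r?\n(.*)\z/s
        or die "no blank line terminating the headers\n";
    $head =~ s/\r?\n/\r\n/g;          # normalize every header line ending
    return $head . "\r\n" . $body;    # body bytes are left alone
}

# Show the result for two script outputs, with CR and LF made visible.
for my $out ("Content-Type: text/html\n\nhi", "Content-Type: text/html\r\n\r\nhi") {
    (my $shown = cgi_to_http($out)) =~ s/\r/\\r/g;
    $shown =~ s/\n/\\n/g;
    print "$shown\n";
}
```

Under this model, script output ending its headers with "\n\n" or with "\r\n\r\n" produces the same header bytes, "\r\r" is rejected because no blank line is found, and "\n\r\n\r" leaves a stray \r at the start of the body, which is consistent with the four cases observed earlier in the thread.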

        Please use the [reply] link to the right of the post you are replying to.

        I'm not going to rely on a server to convert newlines to the HTTP 1.1 spec.

        wut. Why would you not rely on Apache sending a valid response?

        It seems obvious to me that any http server would accept the header termination characters in the HTTP 1.1 spec?

        Again, we're talking about a CGI response, not an HTTP response.

Re: CGI: newlines, write exactly "\r\n" to end the headers, then turn off binmode (use CGI.pm header)
by Anonymous Monk on Mar 09, 2018 at 11:00 UTC

    All the scripts I see use print "\n\n";,

    That manual approach has been obsolete for some three decades; for headers, use CGI's header function.

Re: CGI: newlines, write exactly "\r\n" to end the headers, then turn off binmode
by rizzo (Curate) on Mar 09, 2018 at 12:19 UTC
    "\n\n" is the correct way to end the header.
    If your script fails due to an incorrect header declaration, you get an error like: "premature end of script headers".

      "\n\n" is the correct way to end the header.

      How do you reconcile that statement with:

      1. The HTTP 1.1 spec which requires the last header be followed by two CRLF sequences.
      2. The fact that $cgi->header; returns the string: "Content-Type: text/html; charset=ISO-8859-1" terminated by "\x0D\x0A\x0D\x0A"?
        The fact that $cgi->header; returns the string: "Content-Type: text/html; charset=ISO-8859-1" terminated by "\x0D\x0A\x0D\x0A"?

        UPDATE: In addition:
        What is the correct form of response from a CGI script?

        ... The CGI specification allows any of the usual newline representations in the CGI response (it's the server's job to create an accurate HTTP response based on it). So "\n" written in text mode is technically correct, and recommended. NPH scripts are more tricky: they must put out a complete and accurate set of HTTP transaction response headers; the HTTP specification calls for records to be terminated with carriage-return and line-feed, i.e ASCII \015\012 written in binary mode.

        Using CGI.pm gives excellent platform independence, including EBCDIC systems. CGI.pm selects an appropriate newline representation ($CGI::CRLF) and sets binmode as appropriate.

        perlfaq9 - Networking

        /Update


        This article sheds some light on the topic:
        The End-of-Line Story
        Few people today are aware of the EOL issue, because systems generally (but not always!) make it transparent. For example, the RFC Editor stores the official RFC archive on a Unix system whose native EOL is a single LF. When you click on a link for an RFC from the RFC Editor Web page, your browser uses an FTP client to retrieve the ASCII text. The RFC's FTP server translates the LF in each text line into CR LF for transmission across the Internet, and your FTP client in turn translates each CR LF into whatever the EOL convention of your system.

        The HTTP 1.1 spec which requires the last header be followed by two CRLF sequences.

        Jep. From RFC 2616
        Request (section 5) and Response (section 6) messages use the generic message format of RFC 822 [9] for transferring entities (the payload of the message). Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.

            generic-message = start-line
                              *(message-header CRLF)
                              CRLF
                              [ message-body ]
            start-line      = Request-Line | Status-Line

        In short, the header is terminated by an empty line. Maybe I'm missing the point, but that's exactly what I get with:

        print "$lastlineofheader\n\n";
