PiEquals3 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to use the CGI module to write scripts that produce web pages. The header() function is beginning to give me an aneurism.

What I do:

print header();

All I want:

Content-type: text/html
..This is strictly according to the CGI.pm manpage. I expect it to do that.

What I also get:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
...which is fine, I suppose (I trust the module to do the best and newest and most proper thing), except that the !DOCTYPE form doesn't allow my script to properly read input using sysread(). (It hangs until timeout).

Deep breath..

So: What's going on? What is !DOCTYPE for, and why does it appear to break sysread()?

This isn't an urgent problem so much as a very frustrating and confusing one and I'd bet I'm not the only casualty.

Oh yeah: ActiveState Perl build 6xx, NT/IIS 4.0, IE5

..and thanks

(arturo) Re: CGI::header() & What's !DOCTYPE..?
by arturo (Vicar) on Mar 15, 2001 at 21:31 UTC

    <!DOCTYPE ... is a bit of SGML markup that identifies the language and DTD to which the document belongs. It's required to generate W3C standards-compliant HTML, for example.

    If you don't want it, then just issue a manual print "Content-type: text/html\n\n";

    But I don't get this sysread problem ...if you're trying to get user input, CGI.pm handles that for you transparently. Why else would you be calling sysread on the server side?

    HTH

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor

Re: CGI::header() & What's !DOCTYPE..?
by chromatic (Archbishop) on Mar 15, 2001 at 21:36 UTC
    The Doctype is an SGML-type declaration that tells interested user agents which Document Type Definition to use to validate and to interpret the following markup.

    This is important in XML-land so that browsers and other programs don't have to do as much guesswork to interpret a document.

    That said, I find it hard to believe that printing a header will break sysread. In fact, if you're already using CGI.pm, why not go the extra step and use the param() function to get at your CGI input? Unless I'm way off the mark, you're attempting to read POSTed data from STDIN with sysread (as one would do without CGI.pm).

    Depending on how IIS caches information, CGI.pm may have already read the data in, so you'll block until the user agent sends more, which is, effectively, forever until timeout.

Re: CGI::header() & What's !DOCTYPE..?
by davorg (Chancellor) on Mar 15, 2001 at 21:33 UTC

    DOCTYPE is the line that defines which version of the HTML spec your page is supposed to conform to. No HTML document can be valid without it. Which means that the vast majority of the HTML pages out there are invalid as most browsers don't insist on seeing it.

    As for why you're having problems, I'm sure it's nothing to do with the DOCTYPE declaration. Can you post a cut-down version of the code that demonstrates the problem?

    --
    <http://www.dave.org.uk>

    "Perl makes the fun jobs fun
    and the boring jobs bearable" - me

Re: CGI::header() & What's !DOCTYPE..?
by TGI (Parson) on Mar 15, 2001 at 23:19 UTC

    Interestingly enough, if you save your CGI-generated code as a file and have the W3C HTML validator take a look at it, it doesn't like the doctype declaration. To get usable validation I've always had to hand-edit that line of the file to specify a newer version of HTML; I guess the generated one specifies HTML 1.0 or 2.0. BTW, I really recommend validating your generated code. It's good to comply with standards, and I've found that it helps solve rendering problems, especially when using stylesheets.

    Looking at the w3c site, I found this in the HTML 4.01 spec:

    HTML 4.01 specifies three DTDs, so authors must include one of the following document type declarations in their documents. The DTDs vary in the elements they support.

    • The HTML 4.01 Strict DTD includes all elements and attributes that have not been deprecated or do not appear in frameset documents. For documents that use this DTD, use this document type declaration: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    • The HTML 4.01 Transitional DTD includes everything in the strict DTD plus deprecated elements and attributes (most of which concern visual presentation). For documents that use this DTD, use this document type declaration: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
    • The HTML 4.01 Frameset DTD includes everything in the transitional DTD plus frames as well. For documents that use this DTD, use this document type declaration: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
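
    Rather than hand-editing the saved file, it may be possible to ask CGI.pm for a specific declaration. The following is only a sketch: it assumes a CGI.pm version whose start_html() accepts the -dtd argument and the -no_xhtml pragma, and the page title is made up for illustration.

```perl
use strict;
use warnings;
use CGI qw(-no_xhtml);    # ask for plain HTML rather than XHTML output

# An empty hash ref gives an object with no parameters, so nothing is
# read from the environment or from STDIN.
my $q = CGI->new({});

# Pass the public identifier and the DTD URL quoted in the HTML 4.01 spec.
my $html = $q->start_html(
    -title => 'Confirmation',    # hypothetical title
    -dtd   => [
        '-//W3C//DTD HTML 4.01 Transitional//EN',
        'http://www.w3.org/TR/html4/loose.dtd',
    ],
);

print $html, "\n";
```

    The output should begin with a DOCTYPE line naming the Transitional DTD, which is what the validator looks at first.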

    TGI says moo

Re: CGI::header() & What's !DOCTYPE..?
by jeroenes (Priest) on Mar 15, 2001 at 21:32 UTC
    !DOCTYPE is a w3 spec. It tells the browser which slang to expect. However, I see no reason for that to block sysread. Show us the code!

    Jeroen
    "We are not alone"(FZ)

The code, a solution, and many many thanks
by PiEquals3 (Acolyte) on Mar 16, 2001 at 02:46 UTC
    Thanks for all the help on DTDs.. enlightening.

    Here's a fairly bare-bones version of my script that I have confirmed still exhibits the same behavior:

        # This script receives valid data POSTed to it from an HTML form.
        use CGI qw/:standard/;
        use CGI::Pretty;

        ### HERE IS THE PROBLEM: ######################################################
        ## Un-comment one or the other of these next two lines to reproduce the problem
        #
        #print "Content-type: text/html\n\n";   # Using this line works beautifully.
        #
        #print header();                        # Using this line stalls the browser.
        #
        ###############################################################################

        print "<HTML><HEAD><TITLE>Confirmation</TITLE></HEAD><BODY BGCOLOR=\"3355ff\">";
        print h1('Header');
        print h4('Trying sysread..');
        $data = try_sysread();
        print $data;
        print h4('Sysread done..');
        print end_html;
        exit;

        ##########
        sub try_sysread(){
            sysread(STDIN,$d,$ENV{CONTENT_LENGTH});
            return $d;
        };

    param() was a good idea, but then I tried Vars() and it works beautifully with any decent header, once I noticed that you have to import the :cgi-lib functions from CGI (use CGI qw/:standard :cgi-lib/;).
    Assuming I hear of nothing dangerous or evil about Vars(), I'll use it.

    I still don't know why this sysread thing happened, though.. any ideas would still be appreciated, if only for the joy of abstract knowledge (especially for the joy of abstract knowledge.. practical knowledge hurts when you don't have it.)

    ==========

    Can an atheist be insured against acts of God?

      Looking through the code of CGI.pm, when you call header(), it calls self_or_default(). That particular function is a rather ugly way of allowing both a procedural and an object-oriented interface.

      Anyhow, self_or_default() winds up calling CGI::new() if you're using the procedural interface. This is important, because new() calls init(). As I predicted earlier, that's what calls sysread. It slurps up $ENV{'CONTENT_LENGTH'} worth of bytes from STDIN.

      When you then attempt to read from STDIN yourself, there's nothing left: CGI.pm has already consumed the POSTed data. So Perl helpfully waits for something -- anything -- which never arrives.
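
      The effect can be imitated without a web server. This is only a sketch under some assumptions: a pipe stands in for the server's connection to the script's STDIN, the form data is made up, and IO::Select is used to ask whether a second read would find anything instead of actually hanging. It assumes a Unix-like system, since select() on pipes is not reliable on Windows.

```perl
use strict;
use warnings;
use IO::Select;

# Stand-in for the connection between the web server and the script.
pipe(my $reader, my $writer) or die "pipe failed: $!";

# Hypothetical POSTed body. The write end is left open afterwards,
# as a server may keep the script's STDIN open once the body is sent.
my $body = 'name=PiEquals3&answer=3';
defined(syswrite($writer, $body)) or die "write failed: $!";

# First read: roughly what CGI.pm does when it initialises itself,
# taking CONTENT_LENGTH bytes.
my $got = sysread($reader, my $first, length $body);

# Second read: what the script's own sysread would attempt next.
# Poll with a zero timeout rather than blocking.
my @ready = IO::Select->new($reader)->can_read(0);

print "first read got $got bytes: $first\n";
print @ready
    ? "more data is waiting\n"
    : "nothing left; a second sysread would block\n";
```

      With the manual Content-type print nothing touches STDIN before the script's own sysread, which would explain why that variant works.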

      As for Vars() versus param(), Vars() is an old, deprecated, backwards-compatible function that bridges the gap between Perl 4 functionality and Perl 5. param() is the wave of the future! It's worth learning; it's flexible and powerful.
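
      For comparison, a small sketch of both ways of getting at the form data through CGI.pm rather than reading STDIN by hand. The field names are invented, and a literal query string is passed to the constructor so the example runs from the command line; in a real CGI script you would call CGI->new with no arguments and let the module read the request.

```perl
use strict;
use warnings;
use CGI;

# Hypothetical form fields supplied as a query string for illustration.
my $q = CGI->new('name=PiEquals3&answer=3');

# param() fetches one field at a time.
my $name   = $q->param('name');
my $answer = $q->param('answer');

# Vars() returns every field at once as a hash.
my %form = $q->Vars;

print $q->header('text/html');
print "name via param(): $name\n";
print "answer via param(): $answer\n";
print "name via Vars(): $form{name}\n";
```

      Either way the module does the reading exactly once, so there is no second read to block on.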