maestro_ba has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I have a ".cgi" web page (a huge Perl script) that, amongst other things, prints LOTS of HTML code to the browser. The whole process is rather slow, and the HTML page shows up as the script runs. What I want to do is, AT THE END OF THE SCRIPT, include a line that saves into a variable ALL THE HTML CODE that the script printed to the browser. In fact, I want the same code I would get by doing "View -> Source" in Internet Explorer. How can I do that? Thanks!

Replies are listed 'Best First'.
Re: Get HTML source code
by tirwhan (Abbot) on Nov 04, 2005 at 16:55 UTC

    Hmm, care to explain a little more why you'd like to do this? I have a feeling you may be trying to cache your page for further requests, in which case there are better solutions for this problem.

    Update: For example, take a look at CGI::Cache, which internally does pretty much what you describe.


    Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
      First of all, thanks for the quick answers. I haven't tried any of them yet, but I'll report back once I've solved my problem.

      I'll explain what I need:
      This HTML page the script produces is a very large page, with lots of images and forms.
      I often need to send the information this script produces by email, using either:
      - "Edit -> Select All", "Edit -> Copy", and then "Edit -> Paste" in Outlook, or
      - "File -> Send -> Page by Email"

      So I want to strip the HTML of all the forms and other information I don't need. If I can get the HTML code into a variable, I can delete everything between "<form>...</form>" and so produce a similar page without the forms.

      If anyone can offer a better suggestion, I'll be most grateful.
      Thanks
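
For illustration, the form-stripping step described here might look like the sketch below. The helper name strip_forms and the sample page are hypothetical, and a regex is only a rough tool for this job: it assumes forms are not nested and that "</form>" never appears inside a comment or script.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every <form>...</form> block from an HTML string.
# /s lets "." match newlines, /i ignores tag case, and ".*?" stops at the
# first closing </form> tag rather than the last one.
sub strip_forms {
    my ($html) = @_;
    $html =~ s{<form\b[^>]*>.*?</form\s*>}{}gis;
    return $html;
}

# Made-up sample page, standing in for the captured output of the CGI script.
my $page = <<'HTML';
<html><body>
<h1>Report</h1>
<form action="/search"><input name="q"></form>
<p>Totals for the week.</p>
</body></html>
HTML

print strip_forms($page);
```

For anything more complicated than whole-form removal, a real HTML parser is likely to be more robust than a substitution.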

        Oh! I completely misread your original question. At the expense of adding more programming logic to your script, couldn't you simply pass in a CGI parameter (as_email) and exclude certain portions of the HTML page if it is present? Here is where tools such as HTML::Template and Template Toolkit are a big win:

        use strict;
        use warnings;
        use CGI;
        use HTML::Template;

        my $cgi  = CGI->new;
        my $tmpl = HTML::Template->new(
            filehandle => \*DATA,
            associate  => $cgi,
        );

        print $cgi->header, $tmpl->output;

        __DATA__
        <html>
        <head>
        <title>insert generic title</title>
        </head>
        <body>
        <h1>Hello World.</h1>
        <tmpl_unless as_email>
        <p>This only appears when as_email is not present.</p>
        </tmpl_unless>
        </body>
        </html>

        Try calling this CGI script with the parameter as_email set to a true value.

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)
        
Re: Get HTML source code
by jeffa (Bishop) on Nov 04, 2005 at 16:58 UTC

    Seems kind of silly to me to recreate the functionality that the browser provides for free, but here is a quick hack:

    use strict;
    use warnings;
    use CGI qw(:standard);

    my $html = do { local $/; <DATA> };

    if (param('view_source')) {
        $html = pre(escapeHTML($html));
    }

    print header, $html;

    __DATA__
    <html>
    <head>
    <title>view source example</title>
    </head>
    <body>
    <h1>view source example</h1>
    <a href="?view_source=1">view source</a>
    </body>
    </html>

    jeffa


      Actually ... I think I like this one better:

      use strict;
      use warnings;
      use CGI qw(:standard);

      my $html = do { local $/; <DATA> };
      my $mime = param('view_source') ? 'plain' : 'html';

      print header("text/$mime"), $html;

      __DATA__
      <html>
      <head>
      <title>view source example</title>
      </head>
      <body>
      <h1>view source example</h1>
      <a href="?view_source=1">view source</a>
      </body>
      </html>

      Let the browser handle the data as a plain text file, instead of having to escape HTML tags and wrap the results in pre tags.

      jeffa

Re: Get HTML source code
by ickyb0d (Monk) on Nov 04, 2005 at 16:58 UTC

    For each print statement, you could add a second statement that prints to a file or appends the same string to a variable. That should give you all the HTML that your CGI script prints.

    You could also visit your own web page using something like WWW::Mechanize and print out the content from that. But that would probably have to be done from another script, not from the CGI page itself.

Re: Get HTML source code
by Spidy (Chaplain) on Nov 04, 2005 at 17:01 UTC
    One way that I would do it is instead of printing out the HTML while the script is running, store it all into a variable. Then, at the very end of the script, print out the variable, and do whatever else you wanted to with it. That way, the variable will have all your HTML source code in it.
      The solutions by ickyb0d and Spidy require that I add a line EVERY TIME I have a print!
      They were the first solutions that came to me, but I wanted to find out a way to avoid doing them...
        Well, you'd just replace your print lines with
        $codeVariable .= qq ~ ~; # whatever was going to be printed goes inside the quotes
        And then at the very end of the script, you'd have:
        print $codeVariable;
        It's just one extra line added.
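
If editing every print is the sticking point, one alternative (not proposed in the replies above) is to point print's default output handle at an in-memory scalar with select. This is a minimal sketch, assuming Perl 5.8 or later and assuming the script uses plain print without naming STDOUT explicitly; the variable names are made up.

```perl
use strict;
use warnings;

# Open a write handle onto an in-memory scalar (supported since Perl 5.8).
my $captured = '';
open my $buffer, '>', \$captured
    or die "Can't open in-memory handle: $!";

# select() makes $buffer the default handle for print and returns the old one.
my $previous = select $buffer;

# Existing print statements stay exactly as they are.
print "Content-Type: text/html\n\n";
print "<html><body><h1>Hello</h1></body></html>\n";

# At the very end: restore the real output handle and send everything at once.
select $previous;
close $buffer;
print $captured;
```

The trade-off is the same as with the accumulate-in-a-variable approach: the browser sees nothing until the final print, and any statement written as print STDOUT ... bypasses the capture.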
Re: Get HTML source code
by radiantmatrix (Parson) on Nov 07, 2005 at 16:10 UTC

    In the realm of CGI, printing to the browser is really just printing to the handle STDOUT. This means you can simply redirect STDOUT to write to both a file of your choice and the usual STDOUT.

    # near the top, before you print anything
    # (a single '>&' dup can only point STDOUT at one destination, so this
    # pipes through the system's tee command, assumed to be on the PATH)
    open STDOUT, '|-', 'tee', 'output.txt'
        or die "Can't redirect STDOUT through tee: $!";

    Remember to close STDOUT before your script ends, so that Perl waits for tee to finish writing. (The script ending should close it anyway, but it is better to do so explicitly.)
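
The "write to both places" idea can also be sketched in pure Perl with a tied handle, so the page still streams to the browser while a copy accumulates in a variable. The TeeHandle package below is a hypothetical name, not an existing module, and it only covers print and printf.

```perl
use strict;
use warnings;

{
    # Hypothetical minimal tee class: forwards print/printf to every handle given.
    package TeeHandle;
    sub TIEHANDLE { my ($class, @handles) = @_; return bless [@handles], $class }
    sub PRINT     { my $self = shift; print {$_} @_ for @$self; return 1 }
    sub PRINTF    { my $self = shift; my $fmt = shift;
                    print {$_} sprintf($fmt, @_) for @$self; return 1 }
}

# Keep a duplicate of the real STDOUT so the browser still receives the output.
open my $real, '>&', \*STDOUT or die "Can't dup STDOUT: $!";

# The second destination is an in-memory scalar.
my $captured = '';
open my $copy, '>', \$captured or die "Can't open in-memory handle: $!";

tie *STDOUT, 'TeeHandle', $real, $copy;

# Existing prints are unchanged and now go to both destinations.
print "Content-Type: text/html\n\n";
print "<html><body><h1>Hello</h1></body></html>\n";

# At the end of the script: stop teeing; $captured now holds the whole page.
untie *STDOUT;
close $copy;
```

After the untie, $captured can be post-processed (for example, to drop the forms) and mailed or saved.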

    <-radiant.matrix->
    A collection of thoughts and links from the minds of geeks
    The Code that can be seen is not the true Code
    "In any sufficiently large group of people, most are idiots" - Kaa's Law