Get HTML source code

maestro_ba has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Get HTML source code by tirwhan (Abbot) on Nov 04, 2005 at 16:55 UTC
Hmm, care to explain a little more why you'd like to do this? I have a feeling you may be trying to cache your page for further requests, in which case there are better solutions to this problem. Update: For example, take a look at CGI::Cache, which internally does pretty much what you describe.
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
Re^2: Get HTML source code by maestro_ba (Initiate) on Nov 04, 2005 at 17:20 UTC
First of all, thanks for the quick answers. I haven't tried any of them yet, but I'll report back when I solve my problem. Here is what I need: the HTML page the script produces is very large, with lots of images and forms. I often need to send the information it produces by email, either by:
- "Edit -> Select All", "Edit -> Copy", then "Edit -> Paste" in Outlook, or
- "File -> Send -> Page by Email".
So I want to strip the HTML of all the forms and information I don't need. If I can get the HTML code into a variable, I can delete everything between "<form>...</form>" and so produce a similar page without the forms. If anyone can give me a better suggestion, I'll be most grateful. Thanks
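As a rough illustration of the "delete everything between <form>...</form>" idea, here is a minimal sketch. The helper name `strip_forms` and the sample page are made up for the example, and a regex like this is only an approximation: it assumes reasonably well-formed markup, and a real HTML parser would be more robust.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every <form>...</form> block from an HTML string.
sub strip_forms {
    my ($html) = @_;
    # /s lets . match newlines, /i ignores tag case, /g removes every form;
    # .*? is non-greedy, so each match stops at the nearest closing tag.
    $html =~ s{<form\b[^>]*>.*?</form\s*>}{}gis;
    return $html;
}

# Made-up sample page standing in for the script's real output.
my $page = <<'HTML';
<html><body>
<h1>Report</h1>
<form action="/search"><input name="q"></form>
<p>Totals: 42</p>
<FORM method="post">
<input type="submit">
</FORM>
</body></html>
HTML

my $clean = strip_forms($page);
print $clean;
```

This only helps once the whole page is available in a single variable, which is what the rest of the thread is about.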
Re^3: Get HTML source code by jeffa (Bishop) on Nov 04, 2005 at 17:45 UTC
Oh! I completely read your original question wrong. At the expense of adding more programming logic to your script, couldn't you simply pass in a CGI parameter (as_email) and exclude certain portions of the HTML page if it is present? Here is where tools such as HTML::Template and Template Toolkit are a big win: `use strict; use warnings; use CGI; use HTML::Template; my $cgi = CGI->new; my $tmpl = HTML::Template->new( filehandle => \*DATA, associate => $cgi, ); print $cgi->header, $tmpl->output; __DATA__ <html> <head> <title>insert generic title</title> </head> <body> <h1>Hello World.</h1> <tmpl_unless as_email> <p>This only appears when as_email is not present.</p> </tmpl_unless> </body> </html>` Try calling this CGI script with the parameter as_email set to a true value.
jeffa L-LL-L--L-LL-L--L-LL-L-- -R--R-RR-R--R-RR-R--R-RR B--B--B--B--B--B--B--B-- H---H---H---H---H---H--- (the triplet paradiddle with high-hat)
Re: Get HTML source code by jeffa (Bishop) on Nov 04, 2005 at 16:58 UTC
Seems kind of silly to me to recreate the functionality that the browser provides for free, but here is a quick hack: `use strict; use warnings; use CGI qw(:standard); my $html = do {local $/;<DATA>}; if (param('view_source')) { $html = pre(escapeHTML($html)); } print header, $html; __DATA__ <html> <head> <title>view source example</title> </head> <body> <h1>view source example</h1> <a href="?view_source=1">view source</a> </body> </html>`
Re^2: Get HTML source code by jeffa (Bishop) on Nov 04, 2005 at 17:06 UTC
Actually ... I think I like this one better: `use strict; use warnings; use CGI qw(:standard); my $html = do {local $/;<DATA>}; my $mime = param('view_source') ? 'plain' : 'html'; print header("text/$mime"), $html; __DATA__ <html> <head> <title>view source example</title> </head> <body> <h1>view source example</h1> <a href="?view_source=1">view source</a> </body> </html>` This lets the browser handle the data as a plain text file, instead of having to escape the HTML tags and wrap the result in pre tags.
Re: Get HTML source code by ickyb0d (Monk) on Nov 04, 2005 at 16:58 UTC
For each print statement, you could add a second statement that either prints the same string to a file or appends it to a variable. That should give you all the HTML that your CGI script prints. You could also visit your own web page using something like WWW::Mechanize and print out the content it returns, but that would probably have to happen in another script, not in your CGI page.
Re: Get HTML source code by Spidy (Chaplain) on Nov 04, 2005 at 17:01 UTC
One way I would do it: instead of printing the HTML while the script is running, store it all in a variable. Then, at the very end of the script, print the variable and do whatever else you want with it. That way, the variable holds all of your HTML source code.
Re^2: Get HTML source code by maestro_ba (Initiate) on Nov 04, 2005 at 17:34 UTC
The solutions by ickyb0d and Spidy require that I add a line EVERY TIME I have a print! They were the first solutions that came to me too, but I wanted to find a way to avoid them...
Re^3: Get HTML source code by Spidy (Chaplain) on Nov 04, 2005 at 17:43 UTC
Well, you'd just replace your print lines with `$codeVariable .= qq~ ~; # whatever was going to be printed goes inside the quotes` And then at the very end of the script, you'd have: `print $codeVariable;` It's just one extra line added.
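To make the accumulate-then-print idea concrete, here is a minimal sketch. The helper `out` and the variable `$page` are hypothetical names chosen for the example. The idea is that each existing `print` becomes a call to `out`, and the only genuinely new statement is the final `print`.

```perl
use strict;
use warnings;

my $page = '';    # accumulates the whole response

# Hypothetical helper: append to the buffer instead of printing.
sub out { $page .= join '', @_; return }

# Each former print statement becomes a call to out().
out("Content-Type: text/html\n\n");
out("<html><body>\n");
out("<p>Row $_</p>\n") for 1 .. 3;
out("</body></html>\n");

# At the very end, $page holds everything and can be reused
# (for example, filtered for email) before or after it is sent.
print $page;
```

The cost is a one-time search and replace of `print` with `out`, after which the complete HTML is available in one place.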
Re^4: Get HTML source code by maestro_ba (Initiate) on Nov 04, 2005 at 17:50 UTC
Re: Get HTML source code by radiantmatrix (Parson) on Nov 07, 2005 at 16:10 UTC
In the realm of CGI, printing to the browser is really just printing to the handle `STDOUT`. This means you can redirect STDOUT so that everything you print goes both to a file of your choice and to the usual STDOUT. A single `open` mode string cannot dup one handle onto two destinations, so on a Unix-like system one way to do it is to pipe STDOUT through the external `tee` utility: `open STDOUT, '|-', 'tee', 'output.txt' or die "Can't pipe STDOUT through tee: $!"; # near the top, before you print` Remember to `close STDOUT` before your script ends so that `tee` can finish writing (the handle is closed when the script exits anyway, but it is better to do so explicitly).
<-radiant.matrix-> A collection of thoughts and links from the minds of geeks The Code that can be seen is not the true Code "In any sufficiently large group of people, most are idiots" - Kaa's Law
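Since the original goal was to get the HTML into a variable without touching every print, a related technique is an in-memory filehandle combined with `select`. The following is a minimal sketch, not something proposed in the thread. The names `$html`, `$capture`, and `$old_fh` are made up, and it assumes the existing code calls `print` without an explicit filehandle (a `print STDOUT ...` would bypass the capture).

```perl
use strict;
use warnings;

my $html = '';    # will receive everything the script prints

# Open a write handle onto the scalar and make it the default for print.
open my $capture, '>', \$html or die "Can't open in-memory handle: $!";
my $old_fh = select $capture;

# --- existing page-generating code, unchanged ---
print "Content-Type: text/html\n\n";
print "<html><body><h1>Report</h1></body></html>\n";
# -------------------------------------------------

# Restore the previous default handle and finish the capture.
select $old_fh;
close $capture or die "Can't close in-memory handle: $!";

# $html now holds the whole response; send it on and reuse it as needed.
print $html;
```

Only a few lines at the top and bottom of the script change, and the captured string can then be filtered (for instance, to drop the forms) before it is sent or saved.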