rostiguy has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

Hello, I come from a Computer Science background have have done a small amount of programming in my time (Mostly C, Pascal, Fortran, ADA, etc..) A couple of years ago I took a Perl class from Sun to help with some utilities I was writing and have loved it ever since!

I recently read Randal L. Schwartz's article in Linux Magazine regarding a "repackaging of data" challenge that was put forth by the folks here at the Perl Monestary. I took Randal's code and modified it to throw the information into a Java "Ticker tape" applet.

I have realized over the last couple years that there are quite a few different ways to do things in perl (to say the least). I was wondering if there were any kind souls in attendence that would like to look over my code and offer some positive feedback / criticizm. I would like to pick up on good technique and possibly learn some new tricks in the process. Here's the Code
#!/usr/bin/perl -w use strict; $|++; use LWP::Simple qw(mirror); use CGI qw(:all -no_debug col thead tbody); use HTML_TAGS; use Getopt::Long; my $CNN_URL = "http://headlinenews.cnn.com/QUICKNEWS/virtual/swf.he +adline.txt"; my $CNN_CACHE = "contest.cnn-cache"; # flat file my $DB_MEMORY = "contest.memory"; # dbmopen { my $s = mirror($CNN_URL, $CNN_CACHE); last if $s == 200; # we got new data last if $s == 304; # no new data, but we have to expire things die "status is $s, aborting\n"; } GetOptions( "refresh=i" => \ (my $REFRESH = 10), # meta refresh ti +me in minutes "output=s" => \ (my $OUTPUT = "/usr/local/apache/htdocs/ +headlines.html"), # output file "expire=i"=> \(my$EXPIRE= 15), # expire time in +minutes (Originally 1440) "clear!" => \ (my $CLEAR = 0), # clear the cache "<>" => sub { $Getopt::Long::error++; warn "Unknown arg: + $_[0]\n" }, ) or die "see code for usage\n"; dbmopen(my %DB, $DB_MEMORY, 0644) or die "Cannot dbmopen $DB_MEMORY: + $!"; open STDIN, $CNN_CACHE or die "Cannot open $CNN_CACHE: $!"; open STDOUT, ">$OUTPUT" or die "Cannot create $OUTPUT~: $!"; %DB=()if $CLEAR; $CGI::Q = CGI->new(\*STDIN) or die "Cannot parse $CNN_CACHE\n"; for (my $i = 1; my $headline = param("headline$i"); $i++) { my $state = param("state$i"); my $key = "$state\n$headline"; if (defined $DB{$key}) { # just update modtime $DB{$key} =~ s/\s\d+/" " . time/e; } else { # add the entry $DB{$key} = time . " " . time; } } for my $key (keys %DB) { delete $DB{$key} if $DB{$key} =~/\s(\d+)/ and $1 < time - $EXPIRE +* 60; } print start_html(-title => "Thunderheart -- Brought to you in Ronovis +ion!", head => meta({-http_equiv => 'refresh', -content => $REFRESH*60})); print <<'HTML'; <BODY BGCOLOR=#FFFFFF> <TABLE CELLPADDING=3 width="100%" BGCOLOR=4040A0> <TR><TH class="tit"> <FONT SIZE="+2" class="tit" COLOR=D0F0FF>Thunderheart</FONT> </TH></TR></TABLE> <CENTER> <CENTER><FONT SIZE=+1 COLOR=000000>Current Headlines from <A HREF=http://www.cnn.com>CNN.com</A></FONT></CENTER> <TD> <applet codebase="./" code="aTicker.class" archive="aTicker.jar" width +=880 height=20 MAYSCRIPT> <param name="_file" value="./mess.txt"> <param name="file" value="s"> <param name="cSep" value=";"> <param name="speed" value="2"> <param name="delay" value="30"> <param name="local" value="true"> <param name="bgcolor" value="13693183"> <param name="Font1" value="Verdana, 14, 0, 0"> <param name="Font2" value="Verdana, 14, 1, 1220"> HTML my $count = 0; while ((my $key, my $value) = each(%DB)) { (my $title, my $headline) = split(/\n/,$key); print "<param name=\"s$count\" value=\"...$headline...\ ; http://w +ww.cnn.com ; _load\">\n"; $count++; } print <<'APPLET'; </applet> APPLET print end_html; close STDOUT; exit 0; sub escapeHTMLnobreak { local $_ = escapeHTML("@_"); s/ / /g; $_; }
Any "positive" criticism would be greatly appreciated.

Thanks,
Rostiguy@ostiguynet.com

Replies are listed 'Best First'.
Re: Code Critique?
by Masem (Monsignor) on Jun 15, 2001 at 17:44 UTC
    One thing that I would consider doing is avoid mixing here-docs with the CGI.pm HTML output functions; you should try to use one or the other, and IMO, the CGI.pm HTML is a better approach, since you avoid known problems with formatting the code correctly for here-docs. While my first thought is that you'd have to still code the APPLET portion of the HTML code in a here-doc, I think that if you use some of the special features of CGI.pm that can generate HTML tags that it doesn't specifically support, you can have every bit of HTML be done via CGI.pm rather than here-doc.

    If anything, if you have a static bit of text, which your applet code is, I've found that using q() and friends are better than here-doc-ing those.


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
      Why do you say that one should avoid mixing CGI.pm's HTML functions with here documents?

      I often do something similar to the following:

      use strict; use CGI; #tested on 5.005/UNIX my $q = new CGI; print $q->header(), $q->start_html(-title=>'CDB Form'); print <<END_of_Multiline_Text; HTML text here END_of_Multiline_Text print $q->end_html();
      This enables me to reuse existing HTML code, and I do not have to remember the correct http headers to post. Moreover, I can even interpolate Perl variables in my here document if I want to.
        Well, it's not that this doesn't work. As with most of perl, TIMTOWTDI. But I was more advocating mixing here-docs with CGI's functions, (as opposed to mixing CGI.pm with here-docs)

        I've lately becoming less enthralled with here-docs due to how they ususally mess up the formating (both of what you put in and what your editor puts in) of a perl script. Plus, this appears to be one of the few points in perl where extra whitespace can screw you up. I've had several bad experiences with here-docs as to avoid using them myself and to try to get others to do so as well.

        It's probably better to use either CGI.pm functions completely, or at least get the header and other page information out with CGI and then jump to a template solution for the rest of the HTML body. Both methods will produce much nicer and easier-to-read code than using here-docs.


        Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: Code Critique?
by Aighearach (Initiate) on Jun 15, 2001 at 17:47 UTC

    Lets see, you're using strict, you have warnings on, you're using Getopt::Long... so far so good :)

    CGI.pm is good for handling the query (the actual code it uses is only a couple lines, much less bloat to just cut and paste them into a local query fetching sub :) but using it to generate the HTML is nearly imposible to maintain. I always use HTML::Template. It has worked pretty well. Other than a manager saving the browser HTML output and copying it over the template file, I've never seen any difficulties with updating the templates, but when I've had people trying to change my scripts everything breaks.

      I am so happy that I just discovered CGI::Pretty for all my HTML generating needs. I LIKE CGI::PRETTY. HAPPY TILLY?

      So I guess it is <root-node poster's name here> 's decision on what (s)he wants to use. :)

      Tiptoeing up to a Perl hacker.
      Dave AKA damian

        Hmmm...

        CGI::Pretty inherits from CGI so it loads CGI.pm. No improvement there. It goes through extra work to maintain nicely formatted HTML, so no improvement there. It sends lots of extra whitespace down the pipe, which can be a huge slowdown as pages get big and complex. (As a slight reprieve the excess compresses well for users on modems.)

        There is one reason and one reason only for using CGI::Pretty. And that is that you want to be able to read and understand the output HTML. (Which is not a bad reason, all things considered.) But performance is made worse, not better.

         CGI.pm is good for handling the query ...
         but using it to generate the HTML is nearly 
         impossible to maintain.
      
      This may be true for you, but it's just your opinion, not a hard fact. I personally find HTML generated by the CGI module to be easier to maintain. After all, my editor can catch mismatched parens and braces a lot easier than mismatched HTML tags.

      It's all just a matter of personal preference, and probably depends somewhat on what sort of HTML code you're generating.

      buckaduck

        It is a fact in any project with more than one user. Either it is open source, and there will be lots of maintainers, or it is a commercial project and will also have other maintainers. So, rather than being an opinion, it is a general known fact in programming that mixing languages in a single file is less maintainable than having seperate files for seperate technologies.

        The part that is opinion is, if it is important enough to make it so that it can last. Maybe it is a page that won't be around long anyway. Or maybe it is a personal project. Or a project for a perl-centric website where there will always be people who can maintain it. But it is still good to acknowledge the real life difficulties of each choice.

Re: Code Critique?
by rostiguy (Novice) on Jun 20, 2001 at 18:27 UTC
    Folks,

    I just wanted to take a moment to thank all of your for your input! I code things the way I was taught (wich is ok) and sometimes I see things coded very differently at perl monks. I just want to make sure I cultivate good habits.

    Thanks again for all your help.
    Ron Ostiguy