draxil42 has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I have some code here that I am taking to an interview as an example piece of code, so I thought I would post it on here and let the monks give it a look over first.

If you want to actually run it I recommend getting it from here because it comes with example configs and things.

The function of this thing is to get comics off the internet (using http::lite) and display them in a Tk window.

Basically. What could I have done better? How would you critisise my code? What do you think?

#!/usr/bin/perl -w # getcomic - gets my comics # Joe Higton <joe@draxil.uklinux.net> # (Release) V0.4.1 ######################################################### # Config file -- (UNIX)alikes ~/.getcomic # (WIN) whatever $ENV{APPDATA} is on your machine (shou +ld be under you documents and settings dir under appdata as long as +your on NT # Config file takes format # comicname comicpageurl somethingtomatch # # somethingtomatch is something identifiable in your comics IMG tag. N +ormally the alt will do but if there is no ALT then something else, l +ike a non-changing dir name or the image name it's self if this doen' +t change.. # An example config should come with this file, if not goto: # http://www.draxil.uklinux.net/hip/index.pl?page=get_comics and get a + proper one. # if your using crazy/old screen resolutions you may wish to alter %de +faultsize # remember, if this script kills your system it's not my fault. After +all you can read the code: use Tk; use Tk::Image; use Tk::Photo; use Tk::Pane; use Tk::Wm; use MIME::Base64; use strict; use HTTP::Lite; use English; #global mys my($http) = new HTTP::Lite; my($window) = MainWindow->new(); my($comicpane) = $window->Scrolled('Pane',-scrollbars=>"oe"); my(%agoodsize)=( '-width'=>0, '-height'=>0 ); #what we use to decide h +ow big our inital pane will be my(%defaultsize)=( '-width'=>800, '-height'=>600 ); #don't let users resize horizontally, we set the width on the basis of + the comic. $window->resizable(0,1); ### # open our config file based on OS.. # NOTE: Currently we assume (windows || unix). # If I can test this on other platforms I will add other exceptions. if ($OSNAME =~/win/i){ # Windows and therefore, no home dir open(CONF,"$ENV{APPDATA}\\getcomic") or die "can't open config fil +e$ENV{APPDATA}\\getcomic" ; $RS="\n"; # I am told this is needed for activestate } else{ # Yay! Tastes like UNIX, so try using a unix dotfile for the confi +g. open(CONF,"$ENV{HOME}/.getcomic") or die "can't open config $ENV{H +OME}/.getcomic"; } # for each entry in the config file.. while(<CONF>){ chomp; if(/^(.+)\s(http:\/\/\S+)\s*(.+)$/){ my($NAME)=$1; # The name of the comic. my($URL)=$2; # The URL with the page that the comic image is in my($ALT)=$3; # The ALT tag text OR image fragment ## Retrieve the document. $http->reset; my($req)=$http->request($URL); if ($req==200){ ## Check for match text within an img tag. # This is probably in the alt text or a fragment of the SRC ur +l.. if( $http->body()=~/<\s*img[^>]*src\s*=\s*[\'\"]([^\'\"]+)[\'\" +][^>]*$ALT/i || $http->body()=~/<\s*img[^>]*src\s*=\s*[\'\"]([^\'\"]*$ALT[^ +\'\"]*)[\'\"]/i || $http->body()=~/<\s*img[^>]*$ALT[^>]*src\s*=\s*[\'\"]([^\'\ +"]+)[\'\"]/i ) { my($IMGURL)=$1; if ($IMGURL!~/^http:\/\//){ #if we got a relative address try and attach the domain we + got from the page URL $URL=~/(http:\/\/\w+\.\w+\.\w+)/; $IMGURL=$1.$IMGURL; } ##retreive image $http->reset; if(my($req)=$http->request($IMGURL)){ my($raw_im, $lb_im); ##Ok.. We have got an image so add everything to the GUI. $comicpane->Label( -text=>$NAME, -font=>'*-24')->pack(-anc +hor=>'w'); $raw_im=encode_base64($http->body()); $comicpane->Photo($NAME,-data=>$raw_im,-format=>'gif'); $lb_im=$comicpane->Label(-image=>$NAME); $lb_im->pack(-anchor=>'w'); #keep track of our biggest comic $agoodsize{'-width'}=$lb_im->width if($lb_im->width > $ago +odsize{'-width'}); #... and height of our comic stack $agoodsize{'-height'}=$lb_im->height+$lb_im; } } } else{ print STDERR "Could not retrieve $2 ($1)\n"; } } else{ print STDERR "We seem to have an invalid config entry:\n $_" if $_ +; } } exit(0) if $agoodsize{'-height'}==0; # we got nothing, we go no furthe +r.. $comicpane->pack(-fill=>'both',-anchor=>'w',-expand=>1); ##figure out what size our comic pane should be. my ($dimention); for $dimention ('-width','-height'){ if($agoodsize{$dimention}>$defaultsize{$dimention}) { $comicpane->configure($dimention=>$defaultsize{$dimention}); } else { $comicpane->configure($dimention=>$agoodsize{$dimention}); } } ##leave the user in Tk's capable hands! $window->MainLoop;

update (broquaint): added <readmore> tag

(update) Clarification! By the way, I am not just getting you guys to perfect my work, which would obviously be quite dishonest, but I am instead trying to prepare myself for the kind of questions and critisisms I may recieve..

Replies are listed 'Best First'.
Re: Script Review...
by Old_Gray_Bear (Bishop) on Oct 25, 2003 at 22:34 UTC
    The obligatory line -- "use strict; use warnings;". You are presenting this as an example of your work. You want them to know that you are aware of the kind of errors that the pragmas address, and you have taken steps to remove them.

    Stylistic commentary:

    * Re-format everything with an 80 character line. It makes everyone's life miserable if they have to keep scrolling back and forth to read the end the text. In particular, don't be so stingy with the '#' characters. Multiline comments are simply ducky.

    * DO you expect the script to "kill your system"? If so, fix the bug. If not, think about *not* including that line. It makes me think more than twice about using your code.

    * You might try commenting the regexes you use to identify the comic to be pulled. Either with the 'x' option inline with the regex, or as comment block before each one. This is provides the next person who updates the code a leg up on what to expect from the regex, and it gives you a cross check on whether the regex actually is doing what you just said it does.

    * You might want to consider adding a POD stanza to provide external documentation. Give the potential User a paragraph or so description of the Code and What It Does, and they don't have to wade through the code. It's the same principal as a car-seller handing out printed specifications; it keeps the potential buyer from getting down on their knees to look at the under-carriage.

    Good luck on the interview.

    OGB

Re: Script Review...
by graff (Chancellor) on Oct 25, 2003 at 23:08 UTC
    Just a few stylistic points:

    I'll assume that the identing looks goofy because you pasted the code in with a combination of space and tab characters (with indent level set to 4), instead of just spaces -- no big deal here, just make sure the copy you present to the interviewer shows proper indentation.

    The nesting of if..else... blocks, and the distance between a short else block from the condition that triggers it, are not severely bad here, but they are unnecessary impediments to readability and maintainability. Your handling of the config file could go like this:

    while (<CONF>) { next unless ( /^(.+)\s(http:\/\/\S+)\s+(.+)$/ ); # just skip unusa +ble lines my ( $NAME, $URL, $ALT ) = ( $1, $2, $3 ); $http->reset; if ( $http->request( $URL ) != 200 ) { warn "cannot read from $URL ($NAME)\n"; next; } if ( http->body() =~ /.../ ... ) { my $IMGURL = $1; ... $http->reset; if ( $http->request( $IMGURL ) != 200 ) { warn "drat! tried to get $IMGURL (for $NAME) but couldn't\ +n"; next; } # OK, got the image... and so on... } }
    If you really want to report on the status of reading the config file (e.g. report unusable lines), just read the file in a short loop, report the results after closing it, then do a loop over the usable lines -- eg:
    my %name; my %alt; my %bad; while (<CONF>) { if ( /^(.+)\s(http:\/\/\S+)\s*(.+)$/ ) { $name{$2} = $1; $alt{$2} = $3; } else { $bad{$.} = $_; } } close CONF; if ( keys %bad ) { warn "unusable config data at line $_:\n $bad{$_}" for ( sort { $a <=> $b } keys %bad; } warn "Loading ",scalar( keys %alt )," comic sites\n"; for ( keys %alt ) { # ... and so on }
    Note that a "my ... " statement only needs parens when you're doing a list of variables. (update: or when you're declaring a single scalar and assigning a value from something that returns a list, e.g.   my ($first) = sort @array;)

    You weren't doing a consistent amount of error-checking and problem-reporting with your different $http->request calls.

    You should add pod that describes the app and gives the necessary detail for creating a config file from scratch (or provides the content of a sample config file).

    Okay, this is for comics, so I guess the spelling of "$dimention" is supposed to be a joke, right?

      Okay, this is for comics, so I guess the spelling of "$dimention" is supposed to be a joke, right?

      Maybe not. :) Look at this:

      (WIN) whatever $ENV{APPDATA} is on your machine (should be under you documents and settings dir under appdata as long as your on NT (emphasis added)

      Often attention to detail gets you the job. I've dealt with a lot of applicants and have discounted minor errors in interviews, but spelling and grammar errors in any part of the written application look really bad (excellent idea to get it checked out at the Monastery, BTW). One way of avoiding this particular problem is to never write a contraction: it is (not it's), they are (not they're), or worse---confusing "their/there/they're". It's not code, but it counts.

      --
      Allolex

      Perl and Linguistics
      http://world.std.com/~swmcd/steven/perl/linguistics.html
      http://www.linuxjournal.com/article.php?sid=3394
      http://www.wall.org/~larry/keynote/keynote.html

Re: Script Review...
by runrig (Abbot) on Oct 25, 2003 at 22:35 UTC
    Mostly just style choices but in lines like this:
    if(/^(.+)\s(http:\/\/\S+)\s*(.+)$/){
    I would use an alternate regex delimiter, maybe "#":
    if ( m#^(.+)\s(http://\S+)\s*(.+)$# ) {
    and in this:
    $http->body()=~/<\s*img[^>]*src\s*=\s*[\'\"]([^\'\"]+)[\'\" +][^>]*$ALT/i
    You don't need to backwhack the quotes inside the character classes. In fact, I think the only character you need to backslash inside the '[]' brackets is a backslash.
    my ($dimention); for $dimention ('-width','-height'){
    the above is more perlish (and dimention is more correctly spelled) as:
    for my $dimension ( qw(-width -height) ) {
    You have my ($var)= ... all over when you really mean my $var = .... The difference is whether the expression on right side of the '=' is in list or scalar context. It often doesn't matter, but someday you may be bitten by this.

    Last, there doesn't seem to be any reason not to Use strict and warnings.

Re: Script Review...
by sauoq (Abbot) on Oct 26, 2003 at 02:13 UTC

    Perltidy is your friend. It'll help with the inconsistent indentation and long lines. It'll also help with your inconsistent and sometimes cramped spacing between elements. For instance,

    my($IMGURL)=$1;
    will become
    my ($IMGURL) = $1;
    and
    if(my($req)=$http->request($IMGURL)){
    will become
    if ( my ($req) = $http->request($IMGURL) ) {
    Such things go a long way toward helping readability.

    -sauoq
    "My two cents aren't worth a dime.";
    
Re: Script Review...
by pg (Canon) on Oct 25, 2003 at 22:19 UTC
    • use strict, use warnings all the time.
    • define constant for things like window size etc., don't spread literals every where.
Re: Script Review...
by Anonymous Monk on Oct 26, 2003 at 01:01 UTC
    I am not sure how I would feel about this whole scenario if I were the one interviewing you. On the one hand, peer review is a good thing. On the other, you are submitting this as an example of your code.

    If you were honest up front that you had had some or all of your code portfolio reviewed, it might be acceptable, but then I would still want to see some of your own work. Were I to find out later you would not working for me.

      Anonymous Monk,

      I am not sure that your remark is totally fair.

      Nobody learns in a vacumn. If this monk submits his work to us asking for a review then we have to admire his courage! If you look through the monastery you know why I say that - some monks can be pretty scathing in their comments.

      For something like 30 years as an engineer and scientist I figured that every piece of work I did was subjected to peer review. As a young engineer I even had to review work of my seniors, sometimes by boss or his boss. I caught the ocassional error and was very proud when I did. Most of all I learned a huge amount from the process. I hope this monk learns form teh process as well, and should he ever have the ocassion to present himself and his code at an interview it may be a selling point to say that he learns a lot by having others review and comment on his code.

      If he faced me for an interview he would get good marks for saying that! He might get even better marks for showing the code as originally presented and the code as cleaned up - showing just what he learned.

      jdtoronto

        But what is the motivation for peer review in this case? Sure they want to present the best possible code to give the best possible impression and hopefully get the job. But the fact is, the code is being cleaned up in a last minute review process and may not be an accurate reflection of the capabilities of the interviewee. If I were the interviewer and this fact wasn't made clear up front, and I later logged in to perlmonks to see this thread, that person would be out of the running.

      Peer reviews can often be filled with dread for the author. Especially if their peers are particularly critical of the material. It doesn't help if you've never had your work reviewed by a group of peers before, either.

      I think it's a grand idea to have people review your code -- one to find errors (although that's not the objective here, as noted by the Update). But I think it's more important to be able to be comfortable in a peer review. If I make a mistake in my code, I want to be objective enough to be able to see it. That means not taking offense when someone points something out. For some people that comes naturally, for others it takes practice.

      It's just the same as going to mock interviews. It's just the idea of getting used to the process.

Re: Script Review...
by William G. Davis (Friar) on Oct 26, 2003 at 09:42 UTC

    I wholeheartedly echo the strict and warnings sentiments of the other Monks.

    As for your style in general, well, everyone has their own, and asking for comment on this subject is bound to yield many contradictory answers. However, there are at least two things your style must be: consistent and logical.

    For example, why is it that you put parentheses around all my declarations even if you only declare one variable? The parentheses are needles unless you're declaring multiple variables with the same declaration. Granted, it's consistent, but not very logical. Notice, you do the exact opposite with print(); all calls to print() are completely devoid of parentheses, even if you supply multiple arguments to it. Why is that?

    Try to come up some well-thought-out rules and use them constantly. For example, one I subscribe to is, when using built-in functions (which are thus prototyped), only use parentheses with named list operators and and only when your supplying more than one argument. E.g.:

    # see: open(NAME, "< file") || die $!; print("one", "two", "three"); # and see: chop $string; carp "Some error."; # but see: print "a string"; my @stuff = split /\s/;

    You get the idea. The thing is, you can only adopt and apply such a rule if you're reasonably familiar with Perl's built-in functions and what arguments they're prototyped to expect. If you're not, see perlfunc. Misunderstanding of built-ins and their precedence and prototypes is often the cause of inconsistent/needless usage of parentheses in beginner to intermediate Perl programmers' scripts.

    One other thing I noticed was the comments. Throughout the script you switch between using single #, double ##, and even triple ###, and even then you mix things up by having some comments with a leading space between the last # and the text, and others with the text right after the last #.

    Again, pick a style and stick with it.

    The consensus in this area is, try to use only one # and have a leading space between the # and comment text. For example:

    # a comment. # a multi- # line comment.
    is typically easier to read than:
    #a comment. #a multi- #line comment.

    There are other things too, like how some assignment statements have spaces on both sides of = and others have no space between = and the lvalue and rvalue. Just try editing the script with logic and consistency in mind. And it wouldn't hurt to at least compare any new style rules you devise with the "official" ones in perlstyle, and then try to prove to yourself that your rules are better than perlstyle's; and if you can't, abandon your's and stick with perlstyle's.