georgecarlin:

Before I go into one of my long, meandering list of things, I'd like to say:

On a more serious note: By putting your code up here for review, I think you already have the right attitude. Always wanting to improve how you do things is a great skill for a programmer to have.

OK, then, on to the bitching! I've put it in <readmore> tags so people used to my occasional long and meandering nodes can skip it. A brief summary for them: Naming, whitespace, short subroutines! Possibly a bit more, I don't remember what I wrote, and I hate this keyboard.

Random notes as I read your code.

Chunk 1

There's too much "decoration". A common beginner habit is to spend a lot of time on formatting large comment blocks and such. It's not a terrible thing, it's just something I see far too often. Some decoration can be nice and help organization. But it's easy to go overboard.

End blocks are pointless. When you have a large heading on each section, it seems pointless to have an equally large end marker--especially since the next item will either be another large heading or the end of the file. Both will be good visual cues to the end of the previous section.

Names should be clear and meaningful. printversion, printtmpl: Neither of them actually print anything. Each simply returns a text string. A name should be suggestive: that's the most efficient method of documenting your code. You don't need to write many comments when your code clearly describes itself.

Digression 1: Many people will tell you to spell out everything in full. I think they're nuts. A proper set of common abbreviations go a long way to making the code simpler to read. Examples: FName means file name, FH means file handle, rpt means report, db means database, etc. If you're consistent--and you should be--the code can be easier to read.

Neither of your printversion, printtmpl functions actually print anything. Each simply returns a string. Rather than having a function for each string you need to retrieve, I'd suggest putting your text strings into a hash at the beginning, and provide a function to retrieve the one you want:

my %Templates = ( header=>q{ #### HEADER #### # yadda yadda yadda ################ }, config=>q{ #### CONFIG #### # Sets the frobnitz #DEFINE FROBNITZ=Blevin ################ }, . . . ); sub get_template { my $tpl_name = shift; die "Template '$tpl_name' doesn't exist!" unless exists $Templates{$tpl_name}; return $Templates{$tpl_name}; } # Now you can use it like: print get_template('header');
Chunk 2

In your help function, breaking your text up into multiple print statements complicates things more than necessary. Try using a single "here document":

sub help { my $NAME = $_[0]; . . . $NAME mangling . . . print <<HELP_TEXT; SECTION-USAGE: $0 blah blah blah ... as much text as you'd like. When you're done, just make sure you end it with your here-document terminator: HELP_TEXT }

Speaking of your help function, it doesn't look like your help function returns a useful value, so I simply removed it.

Your sani^Hatize function brings up a pet peeve: It's spelled incorrectly. When you name something that's used outside of your function, you *really* should spell it correctly. When a system grows enough, changing the names can be difficult to coordinate, so frequently they don't get corrected.

Then peepul wil haff to contsanly tyep teh saem mispelings evere tim yoo yoose it. And they will curse your name. And make fun of you.

The other problem with the sani^Hatize function is that it doesn't suggest what it *actually* does. Sure the word made sense when you wrote it. But a few months from now when you're tracking down a weird bug and you call sani^Hatize, you'll have to stop and look at the code to figure out what sani^Hatize means in this context.

Looking at the function, it simply performs a set of replacements with a funky hardcoded bit. I can't think of a good name for the function, which makes me think that you may be approaching the problem sideways.

Except for the hardcoded bit, I'd suggest translate, replace or similar for the function. But you can't use those as the basis for a name with that hardcoded bit unless you start adding adjectives to your function name: translate_with_funky_bit(). I'd suggest splitting out the general-purpose multiple_string_replace function from the do_funky_escape_code() bit. Then you could reuse the replacer if/when necessary. You could write a function that takes your list of things to replace and transform them into the list of things used for the replacer function.

Imagine my surprise when I started looking at the taint function: It's basically the multiple_string_replace function.

Next, gettime. For some reason, you don't mind a simple operation to convert the year to the correct value, but you use an array to fix up the month. Also, you're creating new variables instead of using the ones you have. If I were going to write it (instead of using a module or strftime), I'd do it more like:

sub gettime { my ($sec,$min,$hr,$day,$mon,$yr,$dayOfWeek) = localtime(); # Adjust the values to the values humans expect: $yr+=1900; $mon++; return "$weekDays[$dayOfWeek] $mon/$day/$yr $hr:$min:$sec"; }

I abbreviated the variable names because they're used in a very small area, and the abbreviations are obvious in context. I ignored the unused return values from localtime, since they aren't used. I also reused $yr and $mon to hold the "adjusted" values.

Your parsetempdb function does too much in one place, reducing any potential reuse. (I already mentioned naming, so I won't beat that horse any further.) The function has several interesting operations: Parsing the DB file may be useful by itself, so I'd split that out into a subroutine. Generating a report from the data structure could also be useful, so I'd split that out, too.

So you could wind up with something like:

sub print_tempDB_report_from_file { my ($DB_FName, $RPT_FName) = @_; my $tempDB = parse_db_file($DB_FName); my $report = generate_db_report($tempDB); open my $OFH, '>', $RPT_FName or die "meaningless death!"; print $OFH $report; } sub parse_db_file { my $DB_FName = shift; . . . code to parse DB File into data structure . . . return $tmpDB; } sub generate_db_report { my $tmpDB = shift; my $text; . . . code to turn data structure into a string . . . return $text; }

The same can be said for sendmail: It does more than the name implies. Splitting it into smaller components would be useful. One of those components should be a bit of code to send mail--you could call it sendmail! (Sorry, I couldn't resist.)

The Main Code

I'm running out of steam, so I'll just note a couple of things:

You don't use enough whitespace. Your indentation style is fine (it's clean and consistent), you just don't use it often enough. Line breaks are good too. It can be difficult to read a "wall of code", so use whitespace to break things up into single "thoughts". As an example, the 'perform various sanity checks' block is too tightly packed, so it's harder to read than it must be. Put in a few line breaks and indents.

Your main code block is all coded straight through. You can break it into subroutines (even if they're used only once!) to "outline the code", making it easier for the reader to understand. It also helps limit the amount of code you need to read to fix an error.

# Startup process_command_line(); perform_various_sanity_checks(); my ($user,$passwd) = get_login_data($CONFIGFILE); set_static_configuration(); # Do the work

Ah ... you *did* do this a bit later in the code. Good.

OK, I've run out of gas. So I'll leave with one last bit: Your functions in this chunk are just *too big*. I expect that it's more of the same: You're trying to do too much in a single subroutine.

You'll find things easier to maintain and test if you keep your subroutines short and focused. Each individual component can be tested easily. But if you make a large do-all function, it's a *lot* harder to test, and if you make a change, you have to update more tests to ensure you hit all the possibilities.

Thanks for sharing. And please don't be offended at my random writings. After all, I'm just a nitpicky bastard anyway! ;^)

...roboticus

P.S. I don't remember what prompted me to write this, and it's far away from where it was. I didn't want to throw it away, so here ya go, an added bonus:

Roboticus' rule #1: If you always have to do something, you should *never* have to do it.

Many languages foist this stuff on us. Perl is one of the better ones. One example is the switch statement in C:

switch (c) { case 'A': do_this(); break; case 'B': do_that(); break; case 'C': do_this() && do_that(); }

Here, you *always* have to put in the break statement, or C will fall through to the next case. You should only have to do something explicit for the *NON*default case.


In reply to Re: RFC: beginner level script improvement by roboticus
in thread RFC: beginner level script improvement by georgecarlin

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.