In Part I, I discussed the meta-coding aspects of maintainable code: how to lay out and structure your code so that you get the most out of it. In this part, I want to discuss some of the Perlisms that can help you improve your code.

There are hundreds of situations where the exact same logic can be written in a number of ways. This is TMTOWTDI, one of the foundations of Perl. However, some ways are usually better than others. This part discusses some of the better ways when it comes to maintainability. Now, some of these may be worse performance-wise. However, for a normal application, "fast enough" is a very important concept. If you need more speed, add RAM. If you need still more, add CPU. Only if you need even more speed should you optimize your code. Your time coding is the most expensive part of an application. Remember that. Now, on to some examples.

Let's say that you have a $typeNum, which is one of 1, 2, or 3. There's also a $typeName for each $typeNum, and you need to be able to convert between the two. We have a function that does it. But, what should the function do?

sub getTypeName {
    my $typeNum = shift;
    if ($typeNum == 1) {
        return "A NAME";
    }
    elsif ($typeNum == 2) {
        return "B NAME";
    }
    elsif ($typeNum == 3) {
        return "C NAME";
    }
    else {
        die "$typeNum not valid\n";
    }
}

That's the most obvious, brute-force method, and I must say it works perfectly fine in this instance. However, what if there are 26 choices? The canonical answer is to create a hash or array that maps each $typeNum to its $typeName, and that works perfectly fine too. But what if all your values follow a regular pattern, like this?

sub getTypeName {
    my $typeNum = shift;
    die "$typeNum not valid\n" unless 1 <= $typeNum && $typeNum <= 3;
    my $typeName = ('A' .. 'C')[$typeNum - 1] . " NAME";
    return $typeName;
}

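For comparison, the hash-based answer mentioned above might look something like the following. This is only a sketch: the names %TYPE_NAME_FOR, %TYPE_NUM_FOR and getTypeNum are made up for illustration, and it assumes the type names are unique so that the hash can be reversed safely.

```perl
use strict;
use warnings;

# Hypothetical lookup table; the values mirror the three names used above.
my %TYPE_NAME_FOR = (
    1 => 'A NAME',
    2 => 'B NAME',
    3 => 'C NAME',
);

# Build the reverse mapping once, so the two tables cannot drift apart.
# This assumes every name is unique.
my %TYPE_NUM_FOR = reverse %TYPE_NAME_FOR;

sub getTypeName {
    my $typeNum = shift;
    die "$typeNum not valid\n" unless exists $TYPE_NAME_FOR{$typeNum};
    return $TYPE_NAME_FOR{$typeNum};
}

# Hypothetical counterpart for converting in the other direction.
sub getTypeNum {
    my $typeName = shift;
    die "$typeName not valid\n" unless exists $TYPE_NUM_FOR{$typeName};
    return $TYPE_NUM_FOR{$typeName};
}

print getTypeName(2), "\n";         # B NAME
print getTypeNum('C NAME'), "\n";   # 3
```

Adding a 27th type is then a one-line change to the table, and both directions pick it up.
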
Now, what if you have a string you want to break up? If it's nicely delimited, you can use split. But what if it's positional? The obvious answer is to use substr a number of times, sorta like this:
my $first  = substr $line, 0, 2;
my $second = substr $line, 2, 4;
my $third  = substr $line, 10, 4;

Again, this doesn't scale very well beyond, say, three items. Even then, it's ugly. So, the first thing most people do is turn to unpack. That would look something like
my ($first, $second, $junk, $third) = unpack "A2A4A4A4", $line;
That $junk in there isn't very aesthetically pleasing. So, how about we do something like
my ($first, $second, $third) = (unpack "A2A4A4A4", $line)[0,1,3];

# Or, you could do ...
my ($first, $second, $third) = ($line =~ /(.{2})(.{4})(.{4})(.{4})/)[0,1,3];

This is using the concepts of lists and slices. unpack returns a list. Instead of assigning that list immediately, you can index into that list, either as a straight access or a slice.
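To make the difference between a straight access and a slice concrete, here is a small sketch. The sample record in $line is made up for illustration; the unpack template is the same one used above.

```perl
use strict;
use warnings;

my $line = 'AABBBBCCCCDDDD';   # made-up 14-character record

# Straight access: one index gives back one scalar.
my $second = (unpack 'A2A4A4A4', $line)[1];

# Slice: several indices give back a list. Negative indices count from the end.
my ($first, $last) = (unpack 'A2A4A4A4', $line)[0, -1];

print "$second $first $last\n";   # BBBB AA DDDD
```

The parentheses around the unpack call are what turn its return value into a list that can be indexed.
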

Now, I'm still very unhappy about the fact that we have three variables. We could easily assign the result to an array and call it @data, but I don't like having to use numeric indices. I'd much rather use a hash. Maybe something like

my @colNames = qw(first second third);
my %hash;
@hash{@colNames} = (unpack "A2A4A4A4", $line)[0,1,3];

Now, that's more like it!
  1. Easy (and readable!) parsing of a fixed-length record
  2. Easily modifiable code (the column names and template could even come from a configuration file)
  3. Taking only what I want to take, thus not cluttering up the assignment
  4. Assigning it to an easy-to-read-and-use hash
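
Point 2 could be taken a step further by describing the record layout as data. What follows is only a sketch under a few assumptions: @layout and parseRecord are hypothetical names, the layout is hard-coded here where a real program might read it from a configuration file, and each template piece is assumed to produce exactly one field.

```perl
use strict;
use warnings;

# Hypothetical layout: [ column name, unpack template piece ].
# An undef name marks a field that should be skipped.
my @layout = (
    [ first  => 'A2' ],
    [ second => 'A4' ],
    [ undef,    'A4' ],
    [ third  => 'A4' ],
);

# Derive the template, the wanted indices and the column names from the layout.
my $template = join '', map { $_->[1] } @layout;
my @wanted   = grep { defined $layout[$_][0] } 0 .. $#layout;
my @colNames = map  { $layout[$_][0] } @wanted;

sub parseRecord {
    my $line = shift;
    my %record;
    # Hash slice on the left, list slice on the right.
    @record{@colNames} = (unpack $template, $line)[@wanted];
    return \%record;
}

my $rec = parseRecord('AABBBBCCCCDDDD');   # made-up sample record
print "$rec->{first} $rec->{second} $rec->{third}\n";   # AA BBBB DDDD
```

Changing the record format then means editing @layout and nothing else.
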
And, there are a number of other applications for these techniques. Have fun!

Update: davorg's comment about the template definitions in unpack is right on the money. I'm not very familiar with it, primarily because I never use pack or unpack for parsing fixed-length records. I will usually do a split //, $line and work with the resultant array, either with splice, shift, or foreach. The idea was to demonstrate that slicing is a useful direction to go in many cases.
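
For the curious, the split-and-splice style mentioned in the update might look roughly like this. It is a guess at the general shape rather than a transcription of any real code, and the sample record is made up.

```perl
use strict;
use warnings;

my $line  = 'AABBBBCCCCDDDD';   # made-up record
my @chars = split //, $line;    # one element per character

# Each splice removes and returns the leading N characters.
my $first  = join '', splice @chars, 0, 2;
my $second = join '', splice @chars, 0, 4;
splice @chars, 0, 4;            # discard the unwanted field
my $third  = join '', splice @chars, 0, 4;

print "$first $second $third\n";   # AA BBBB DDDD
```
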

------
We are the carpenters and bricklayers of the Information Age.

Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.


In reply to Maintainable code is the best code Part II by dragonchild
