Re: To sub or not to sub, that is the question?

Another way to look at this issue is that a subroutine is a logical entity representing one conceptual block. This means that if a statement (or group of statements) is conceptually one action, it should be a sub. If it is conceptually two actions, it should be two subs. And, so on.

Thus, for example, you often see print_header() and print_footer() in CGI scripts. What this does is group together the print statements that make up the header. The average reader doesn't care what goes into the header when s/he is reading the program flow. If s/he cares, then s/he can go into that function and concentrate on the action of printing the header, without getting distracted by anything else.

Now, one thing I would be careful of is over-compartmentalizing. One thing I hate about reading other people's OO code is that I have to trace 10 function calls just to find out who's doing what. Usually, this is in 10 different files, to boot! I would very much reccomend thinking about generic functions. For example:

sub get {
    my $self = shift;
    my ($attr_name) = @_;

    if(exists $self->{$attr_name}) {
        return $self->{$attr_name};
    } else {
        return undef;
    }
}
[download]

is much better and easier to read than:

sub getAlpha {
    my $self = shift;

    return $self->{Alpha};
}

sub getBeta {
    my $self = shift;

    return $self->{Beta};
}

sub getGamma {
    my $self = shift;

    return $self->{Gamma};
}
[download]

You have one function that does everything in a very understandable manner. (For those who complain that generic functions are slower, I wonder why you're using OO Perl...) If you want further generic stuff (in this case, boundary checking and the like), you can use a dispatch table to hand off to handlers in the cases where a generic function doesn't encapsulate enough. Or, you could just do your checking outside, either in main code or in another function. The basic idea is that there is one function, at the heart, that does one thing.

Now, I'm contradicting myself, and I know this. But, as the other posters have made clear, there is no hard-and-fast rule as to when a sub is good or not. It very quickly becomes a religious debate. You have to come to your own conclusions. I know my level, and I suspect many of the monks know their own levels, too. Where's yours? That's a matter of experience, which comes only from trial and error.

Comment on Re: To sub or not to sub, that is the question? Select or Download Code

Replies are listed 'Best First'.
Re (tilly) 2: To sub or not to sub, that is the question? by tilly (Archbishop) on Jun 26, 2001 at 15:58 UTC
One suggestion. I would prefer to have your get() routine barf if you tried to get a non-existent attribute instead of proceeding. This allows you to catch typos in names at run-time. However beyond that using subroutines gets you into the following paradox. Suppose that working with heavily factored and organized code is 5 times harder per page of code. Suppose that it does the work of 20 pages of unfactored and unorganized code. If you assume that the issue of tracking down the same bugs in unfactored code is similar to the issue of changes made for one reason affecting things all over the place, then that is a gain in productivity of a factor of 4. However whenever picking up that code I can guarantee you that you will feel that factor of 5 issue. As for the assumption, that is where experience comes in. It is easy to break things into a million subroutines. What you learn over time is how to break things into a million subroutines that organize into clumps which present loosely coupled interfaces to the rest of the world. After you do that you just do not modify existing public interfaces lightly. Add a parameter, sure. But don't lighly assume that nothing depends on old (even old and stupid) behaviour. Also another good idea - the test suite - makes this easier by giving you a good idea what behaviour someone else is depending on somewhere which you didn't remember.	[reply]
Re: Re (tilly) 2: To sub or not to sub, that is the question? by dragonchild (Archbishop) on Jun 26, 2001 at 17:39 UTC
One suggestion. I would prefer to have your get() routine barf if you tried to get a non-existent attribute instead of proceeding. This allows you to catch typos in names at run-time. Actually, in the original code, it printed out an error, then continued. And, returning undef does barf, if you're bothering to check your return values for error before continuing. If you're not, then you deserve all problems you get. (Hint, hint!) To continue on with tilly's million subroutines, it's all about interfaces. A subroutine, in a very basic way, is just an API between you, the programmer, and some (hopefully!) well-defined and bug-free behavior. What the subroutine does under the hood should be something you don't have to worry about. That's one of the reasons to create a subroutine. "Code once and forget." (Also, there's "Code once, use twice" for the other reason to create subroutines.)	[reply]
Re (tilly) 4: To sub or not to sub, that is the question? by tilly (Archbishop) on Jun 26, 2001 at 18:46 UTC
Two questions for you. Why should your user code make assumptions for you about whether or not undef is a valid value? That implies a knowledge of your internals that I don't like. Compare two coding standards. One says that it is the responsibility of calling code to test for possible error returns and the other says it is the responsibility of the code being called to test and throw exceptions. Which is more likely to happen without error? In which can you put better error messages? Feel free to convince me otherwise, but my current thinking is quite clearly that it is better to throw exceptions early. User behaviour that didn't want that can trap it with eval BLOCK. (Similarly I consider it the responsibility of functions to do something useful in both array and scalar context.) BTW the shortcomings of C have a lot to do with my attitudes on throwing exceptions versus making it the responsibility of the called code to process possible error returns. For a particularly bad example, consider EAGAIN. It is well-documented. A ton of system calls can return it. But in practice it almost never gets returned, that aspect of code never gets tested, and so that aspect of code tends to be particularly buggy. The result? A Unix system runs fine for months on end. Then you put it under unusual load, once, and key long-running processes wind up with messed up states. Right when you least wanted that. And it isn't just one or two badly written programs that do that. There are lots of them out there. Now there are reasons why EAGAIN exists, legitimate reasons. Given what a Unix kernel has to do, I am not going to argue that it shouldn't exist. But that is a reliability flaw that I don't think should be introduced lightly.	[reply]
Re: Re (tilly) 4: To sub or not to sub, that is the question? by dragonchild (Archbishop) on Jun 26, 2001 at 21:15 UTC
Re (tilly) 6: To sub or not to sub, that is the question? by tilly (Archbishop) on Jun 26, 2001 at 22:30 UTC
Some notes below your chosen depth have not been shown here
Re: Re: To sub or not to sub, that is the question? by jplindstrom (Monsignor) on Jun 26, 2001 at 14:39 UTC
Now, one thing I would be careful of is over-compartmentalizing. One thing I hate about reading other people's OO code is that I have to trace 10 function calls just to find out who's doing what. I think this is in part related to the way OO works with inheritance. It's basically a good thing, because code is located at the "correct" level if you do your OO design right. But like you say, it's a burden on the programmer to see what's actually happening where. And like I mentioned in my previous post, one solution is Class Browsers that provide a coherent view of what methods and properties are available (i.e. the interface) in a class, regardless of where they are physically/inheritance-wise implemented. For example, in the Eiffel programming environment (uh, correction, when I did Eiffel in school five years ago :), there is the option to provide a "flattened" class, with all the inherited code in one single source file. Actually, I thought that was such a good ide that I created a module Pod::Class::Flattened to flatten POD comments (I use POD a lot myself) for a class with parent classes. It's not yet publicly available, but maybe if this sounds like someting that would be interesting to more people... /J	[reply]


Just another Perl shrine
	PerlMonks