convenientstore has asked for the wisdom of the Perl Monks concerning the following question:

I have been coding Perl for the past 4 months and I am still very much at the beginner stage.

As I try to read and write longer programs and leave the beginner stage behind, I have run into one question that I cannot find an answer to anywhere.
Is there a hard rule regarding when to use a subroutine? How does one know if using a subroutine is better than not using one?
Can you guys share (or at least point me in the right direction or to a URL) your rules, or where you learned when the best time to use a subroutine is?
Any help would be greatly appreciated.

Re: when to use subroutine
by graff (Chancellor) on Oct 28, 2007 at 17:07 UTC
    For me, there are just two major points:

    When you find yourself copying two or more adjacent lines of code, and changing just a few items for each copy that you make, then those lines should be a subroutine, and the things that change from one copy to the next should be parameters to the subroutine.

    When you find yourself going into a fairly deep flow-control structure of nested loops and/or conditionals, and a lot of the nested blocks involve lots of complicated code (making it hard to see what the flow-control is actually doing), then at least some of those nested blocks might be better off as subroutines, to make the "main" (outer-most) portion of the flow control more compact so that it's easier to read and comprehend.

    Both of those should be used in moderation, but the first is more of a general rule that I would apply in most (if not all) cases where it's possible -- try to avoid writing the same thing more than once in a given script -- whereas the second needs to be balanced against a tendency to have too many layers of subroutines calling other subroutines that call other subroutines that ..., which can make it just as hard to keep track of what's really going on.

    In any case, it helps to start with pseudo-code, laying out the algorithms and flow-control in a concise and coherent manner, and taking the time to note where a given functionality is needed at different points (i.e. should be a subroutine). And of course, when subroutine and variable names are well chosen, this helps a great deal -- the code starts to become "self documenting".
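
A minimal sketch of the first point above, with made-up names (format_total, @sales and @refunds are hypothetical, not from the thread). The copied lines appear as a comment; the parts that differed between the copies become the parameters:

```perl
use strict;
use warnings;

# Before: the same lines copied, with only the label and the data changed.
#   my $sales_total = 0;  $sales_total  += $_ for @sales;
#   print "Sales: $sales_total\n";
#   my $refund_total = 0; $refund_total += $_ for @refunds;
#   print "Refunds: $refund_total\n";

# After: one subroutine; the label and the list are its parameters.
sub format_total {
    my ($label, @amounts) = @_;
    my $total = 0;
    $total += $_ for @amounts;
    return sprintf '%s: %d', $label, $total;
}

my @sales   = (10, 20, 30);
my @refunds = (5, 5);

print format_total('Sales',   @sales),   "\n";
print format_total('Refunds', @refunds), "\n";
```

If the format of the report line ever changes, there is now one place to edit instead of one per copy.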

      I really like that first "point". Only thing I'd add is that if the single line of code is complex enough -- maybe it deserves a subroutine of its own.

      As many have said, the purpose of "subs" is to make things more clear. But why does adding subs make things more clear? Obviously you can't (normally) just make every line a sub and have that make it clear.

      I'd think of subroutines as ways of mentally helping you (and others if they read your code) to break your code up into small, functional chunks. Each subroutine can be a concept. Once you have the subroutine 'built', you should be able to use it anywhere (assuming it's well designed) like a "custom addition" to perl for your specific program.

      For instance, suppose you want to do something similar to the "index()" function, which can find a single character embedded in a string. However, instead of "single characters", suppose you have "symbolic names of characters" (ex.: "<Up>", "<Down>", "<Left>", "<Right>"). Now you want to find your "character"'s position in the array of characters. Suppose you are trying to find the character named "<Up>" (equivalent to the "up arrow" key that is not on the numeric pad). You need to search for it in an array with the symbolic names from above (<Up>, <Down>...).

      Your normal index function takes the string first and the thing to look for second:
      "index $string, $mykey"
      So now you create a function that does the same for your symbolic names, called "my_string_index":

      sub my_string_index($\@) {
          my ($symkey, $symkeys) = @_;
          my $index = -1;
          for ($index = 0; $index <= $#$symkeys; ++$index) {
              if ($symkeys->[$index] eq $symkey) {
                  return $index;
              }
          }
          return -1;
      }

      Now you can use your new subroutine to perform the same function for your symbolic keys as the regular index does for single-letter keys:
      "my_string_index $key_to_find, @symbolic_keynames;"

      You have "added" your own "string_index" function that finds your symbolic-keys in an array of symbolic keynames. Ex:

      @cursor_keys = qw(<Up> <Down> <Left> <Right>);
      $keyname_to_find = '<Down>';
      my_string_index $keyname_to_find, @cursor_keys;

      You no longer have to think about the implementation -- it's just there, in your program. Later, if you don't like the performance of "my_string_index" using the 'for' loop, you can substitute other perl code to speed up the routine. If you've made your function 'stand-alone' (doesn't change anything outside the function), then you won't have to change the code in multiple places. You just have to change your 1 function, "my_string_index", and then every place that uses that function will gain in performance.
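
As a rough illustration of swapping the body while leaving every caller alone, here is one possible replacement that uses first from the core List::Util module. This is only a sketch: it keeps the same name, prototype and return value as the version above, and it is not claimed to be faster; the point is that the callers do not change.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Same interface as the for-loop version: returns the position of
# $symkey in the array, or -1 if it is not there.
sub my_string_index($\@) {
    my ($symkey, $symkeys) = @_;
    my $found = first { $symkeys->[$_] eq $symkey } 0 .. $#$symkeys;
    return defined $found ? $found : -1;
}

my @cursor_keys = qw(<Up> <Down> <Left> <Right>);
print my_string_index('<Down>', @cursor_keys), "\n";
```

The defined check matters because position 0 is a valid answer but is false in a boolean test.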

      Another place to use functions -- as "place holders". If I want to write a file update program and I want to focus on the update code, first, I can start with a dummy skeleton:

      # skeleton file read & write functions...
      sub readfile {
          my ($array_p) = @_;
          @$array_p = <>;
      }
      sub save_file {
          my ($array_p) = @_;
          foreach (@$array_p) {
              print $_;
          }
      }
      # (above dummy routines will be changed later)

      ### main program ###
      # first open file
      readfile(\@cur_file);    # will put file into specified array

      # update code
      # delete blank comment lines and blank lines
      @mynewfile = grep( !/^#?\s*$/, @cur_file );

      # save results back in file
      save_file(\@mynewfile);  # will save array back to file

      Using the above type of skeleton, you can start with your filter simply reading and writing to the terminal for testing. Later you can add "real" code to read and write the file you want (instead of standard input and output), but for now, you can play with and develop just the filter code.
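
One way the "real" versions might look later on. This is a sketch under an assumption that is not in the post above: each routine takes a file name as a second argument. The array-reference argument is unchanged, so the filter code in the middle does not need to know that anything happened.

```perl
use strict;
use warnings;

# Read every line of the file at $path into the caller's array.
sub readfile {
    my ($array_p, $path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    @$array_p = <$in>;
    close $in;
    return;
}

# Write the lines in the caller's array to the file at $path.
sub save_file {
    my ($array_p, $path) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @$array_p;
    close $out or die "Cannot close $path: $!";
    return;
}
```

The main program would then call readfile(\@cur_file, $name) and save_file(\@mynewfile, $name), with $name coming from wherever the program gets its file name.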

      Hope this gives you some ideas....

      Linda

Re: when to use subroutine
by brian_d_foy (Abbot) on Oct 28, 2007 at 17:56 UTC

    There isn't any hard rule (meaning something that everyone will agree on and everyone uses across all situations). Often the answer depends on real-world factors such as "do I really care since this is a one page script" and "What's on TV tonight?". For anything that is going to be part of something that lives beyond next week, I generally start with subroutines instead of switching to them later.

    Subroutines give names to logical tasks. Instead of typing out a bunch of code "inline", you use the name instead. You don't really care about how the work gets done as long as it gets done. When the implementation hides behind the subroutine name, it doesn't clutter the flow of your program, and if you decide to change it later, the program doesn't really care (unless you're changing the argument order or return values somehow). When people read your program, they see the concepts and ideas rather than the mechanics:

    ###########
    # Here's what this program does:
    my $data    = read_data();
    my @records = extract_records( $data );
    my @answers = do_calculations( @records );
    print_results( @answers );

    ###########
    # Here's how this program does it:
    sub read_data       { ... lots of distracting code ... }
    sub extract_records { ... more distracting code ... }
    sub do_calculations { ... this stuff is complicated ... }

    Some people will tell you to "refactor" code into a subroutine when you start using it in multiple places in your program. That's fine, and you'll see some of this in How a script becomes a module. You code up a little bit of code to do something, and realize that you need to do the same thing somewhere else, so you copy and paste. Then you need it somewhere else, so you move it into a subroutine. That works okay, but after a while you figure out that you do that so often that you might as well just start with a subroutine.

    Another rule is to keep any section of code short; not as short as possible (this isn't golf), but not longer than it needs to be. Like the example before, the idea is to show the concept and not the mechanics. When your while loop starts to spill off the screen, move some of its code into a subroutine. The loop takes up fewer lines and you get a shorter block of code. That's easier to read and understand. Inside your subroutines you do that same thing: when they get too long, break them up.

    When you break up code into subroutines, you want to "decouple" that code. Subroutines only know what you tell them by passing them data. They don't look inside global variables or try to figure out the state of the larger task. They do their bit and give a result. Since you've decoupled and isolated that code, you don't have weird "action at a distance" bugs, and it's a lot easier to test your subroutines (another benefit of everything in subs).
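
A small sketch of what "decoupled" can look like in practice. The names here (average, @scores) are invented for illustration. The coupled variant is shown only as a comment; the decoupled one takes everything it needs as arguments and hands back its result:

```perl
use strict;
use warnings;

# Coupled (for comparison only): reads and writes variables that live
# outside the sub, so it cannot be used or tested on its own.
#   our (@scores, $result);
#   sub compute_average { $result = 0; $result += $_ for @scores; $result /= @scores }

# Decoupled: input comes in through @_, output goes out through return.
sub average {
    my @numbers = @_;
    return unless @numbers;     # nothing to average
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;     # @numbers in numeric context is the count
}

my @scores = (2, 4, 6);
print average(@scores), "\n";
```

Because average touches nothing outside itself, it can be called from anywhere and checked with a couple of known inputs.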

    Now, inside the subroutines, you really only want to do one thing. I'm not going to say what that one thing is, but the idea is that subroutines should be short and well defined. The one thing it does is a logical task, and maybe that logical task is really several implementation steps. Those implementation steps are sub-tasks which probably belong in their own subroutines. Now, despite your question about knowing where the line is, you'll get used to where the line between inlining and subroutining is by simply programming for a bit. You don't want a couple giant subroutines, but you don't want 10,000 subroutines all doing trivial things.

    Good luck :)

    Oh, and some books that might give you a better feel for programming, even if they don't directly answer this question: The Pragmatic Programmer, The Practice of Programming, and Refactoring. There are probably plenty of other books too, but looking at my shelf right now that's what I see.

    --
    brian d foy <brian@stonehenge.com>
    Subscribe to The Perl Review
Re: when to use subroutine
by Joost (Canon) on Oct 28, 2007 at 17:20 UTC
    If you're asking that question, you are probably using far fewer subroutines than you should. :-)

    Some suggestions:

    • Try to put everything in your code that does something you can name as a distinct action in its own sub. Subroutines should ideally not be bigger than a paragraph of text. In other words, a single sub should be easy to read and do something that's easily described.

      Don't do:

      sub get_me_a_drink {
          my ($drink_name) = @_;
          if ($drink_name eq 'soda' or $drink_name eq 'beer') {
              # go to fridge
              # open fridge door
              # get drink
              return $drink
          }
          elsif ($drink_name eq 'coffee' or $drink_name eq 'tea') {
              # get water
              # boil water
              # mix ingredients
              return $drink
          }
          die "Can't make this drink";
      }

      Do this:
      sub get_me_a_drink {
          my ($name) = @_;
          if (is_cold($name)) {
              return get_from_fridge($name);
          }
          if (is_hot($name)) {
              return brew_hot_drink($name);
          }
          die "Can't make luke-warm drinks";
      }

    • Use fewer variables and more function calls (don't count the variables that hold your subroutine arguments). This will make your code look less like a list of steps and more like a sentence. In other words, don't do:
      sub brew_hot_drink {
          my $ingredients = shift;
          my $water = get_water();
          my $hot   = boil($water);
          my $drink = mix($hot, $ingredients);
          return $drink;
      }
      but do
      sub brew_hot_drink {
          my $ingredients = shift;
          return mix( boil(get_water()), $ingredients );
      }
    • When you're feeling comfortable with this, take a look at the things you can do with anonymous subroutines (i.e. the flexibility you can get when you can pass bits of code around instead of just bits of data). This is a large topic, and more advanced, but you should learn at least the basics as soon as you're getting comfortable. merlyn's done a column on this topic. Also, see Why Closures?.
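
To give a taste of the last suggestion, here is a minimal sketch of passing code around. The names count_matching and longer_than are hypothetical, chosen only for this example. The first sub accepts a piece of code as an argument; the second builds and returns a new anonymous sub (a closure) that remembers its own $limit:

```perl
use strict;
use warnings;

# Takes a code reference plus a list, and counts the items for which
# the code returns true. The caller decides what the test is.
sub count_matching {
    my ($test, @items) = @_;
    my $count = 0;
    for my $item (@items) {
        $count++ if $test->($item);
    }
    return $count;
}

# Returns an anonymous sub that checks whether a string is longer
# than $limit. Each returned sub keeps its own copy of $limit.
sub longer_than {
    my ($limit) = @_;
    return sub { length($_[0]) > $limit };
}

my @drinks = qw(tea coffee soda beer);
print count_matching(longer_than(3), @drinks), "\n";          # 3
print count_matching(sub { $_[0] =~ /^s/ }, @drinks), "\n";   # 1
```

count_matching never has to change when a new kind of test is needed; only the code handed to it does.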
      Don't do:
      ...
      Do this:
      ...
      i second that! i'm working on a very old script which has, for example, 130 lines of nested if-else blocks, and it goes down 6 levels. as if this wasn't complicated enough, it uses global variables.

      always imagine yourself sitting in front of your code six months later trying to understand what it does. having to scroll up and down the whole time is exhausting.
      there's the rule of 24 lines for a subroutine as a maximum. originally because that's the standard terminal height. some might say, nowadays you have big screens and big windows, but that doesn't help much, because then still your eyes have to move up and down, so that 24-rule (with exceptions of course) is still a good rule.
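
A tiny sketch of how nested if-else blocks can shrink once each decision gets a name. The shipping-fee scenario and every name in it are invented for illustration. The nested form is kept as a comment; the subs below give the same answers without the nesting and without globals:

```perl
use strict;
use warnings;

# Nested version, for comparison:
#   if ($order{total} >= 50) {
#       if ($order{express}) { $fee = 15 } else { $fee = 0 }
#   } else {
#       if ($order{express}) { $fee = 15 } else { $fee = 5 }
#   }

# One named question per sub.
sub is_free_shipping {
    my (%order) = @_;
    return $order{total} >= 50 && !$order{express};
}

# Early returns replace the nesting; each line is one case.
sub shipping_fee {
    my (%order) = @_;
    return 0  if is_free_shipping(%order);
    return 15 if $order{express};
    return 5;
}

print shipping_fee(total => 60, express => 0), "\n";   # 0
```

Six levels of nesting will not collapse this neatly in real code, but the same move (name the condition, return early) usually removes a level or two at a time.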
Re: when to use subroutine
by chromatic (Archbishop) on Oct 28, 2007 at 17:16 UTC

    When you can point to a chunk of code that could stand on its own and give it a meaningful name, you've found a subroutine. Make it a subroutine.

      I personally believe that a nice example can be found in some code (link @ GG) I posted in clpmisc some time ago.

      (Minimal) premise:

      Code:

      #!/usr/bin/perl

      use strict;
      use warnings;

      use List::Util 'sum';
      use constant TESTS => 20;
      use Test::More tests => TESTS;

      sub naive {
          my @arr = map +($_) x $_[$_], 0..$#_;
          @arr % 2 ?
              @arr[(@arr-1)/2] :
              (@arr[@arr/2 - 1] + @arr[@arr/2])/2;
      }

      sub findidx {
          my $i=shift;
          ($i -= $_[$_])<0 and return $_ for 0..$#_;
      }

      sub smart {
          my $t=sum @_;
          $t%2 ?
              findidx +($t-1)/2, @_ :
              (findidx($t/2-1, @_) + findidx($t/2, @_))/2;
      }

      for (1..TESTS) {
          my @a=map int rand 10, 0..5;
          is smart(@a), naive(@a), "Test @a";
      }

      __END__

      Here, if I were not to have findidx() as a separate sub but inline its code/logic into smart(), I would probably do it like this:

      sub smart {
          my $t=sum @_;
          if ($t%2) {
              my $i=($t-1)/2;
              ($i -= $_[$_])<0 and return $_ for 0..$#_;
          } else {
              my $i=$t/2-1;
              my ($found1, $found2);
              for (0..$#_) {
                  ($found1=$_), last if ($i -= $_[$_])<0;
              }
              my $j=$t/2;
              for (0..$#_) {
                  ($found2=$_), last if ($j -= $_[$_])<0;
              }
              return ($found1+$found2)/2;
          }
      }

      Of course there are other WTDI as usual, however one cannot but notice that in this particular case not only did factoring code away in a sub reduce duplication, but it even gave one the possibility of applying some syntactic sugar that makes code overall more terse and clear.

      Update: switched from spoiler tags around the minimal description to readmore at Argel's request.

Re: when to use subroutine
by GrandFather (Saint) on Oct 28, 2007 at 20:12 UTC

    If it makes sense to give a chunk of code a name that describes what the code does in a succinct fashion then the code should probably be in a sub.

    If the same sequence of steps is performed in multiple places, they should be in a sub.

    Breaking code up into subs does a few things for you. It:

    1. makes program flow clearer
    2. documents expected behavior
    3. makes testing easier
    4. reduces development time
    5. makes debugging easier

    There is a rule of thumb that says "a sub should fit on a page (of whatever media you use)" which is a slightly useful guide in that anything much bigger than a page you probably can't understand easily as a unit, so it should be broken up into smaller pieces (subs). It's ok as a guide, but the more important driver is that it makes sense for the code in the sub to be treated as an entity. It's a little like breaking a web page into a number of nested pages where you don't have to follow all the links to make sense of the top page, but if you want the detail, follow the links.


    Perl is environmentally friendly - it saves trees
      makes testing easier

      I was waiting for someone to say that. If you put a chunk of code in a sub, and then move the sub to a module, you can write automated tests that verify that that sub does what you think that it does (and they can be run again at any time to verify that it hasn't broken for some reason since you last looked at it).

      Sharing code and improving clarity are important, but making sure the code works is important also: if you can break any bit of functionality out into a sub, it becomes possible to test it to make sure it really does function.
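
A minimal sketch of such a test, using the core Test::More module. The sub is_blank_or_comment is a hypothetical name that wraps the filter pattern from the skeleton example earlier in the thread. In a real project the sub would live in a module and the checks in a separate test file; they are combined here only so the example runs as one script:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# True for blank lines and for comment lines with no text after the '#'.
sub is_blank_or_comment {
    my ($line) = @_;
    return $line =~ /^#?\s*$/ ? 1 : 0;
}

ok(  is_blank_or_comment("\n"),       'empty line is dropped' );
ok(  is_blank_or_comment("#   \n"),   'bare comment marker is dropped' );
ok( !is_blank_or_comment("# note\n"), 'comment with text is kept' );
```

Once the pattern sits behind a name like this, a later change to it can be checked by rerunning the same three lines.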

      (It might be better if you started reversing the sense of the question: "Is there any reason I shouldn't put this piece of code into a sub?")

Re: when to use subroutine
by tuxz0r (Pilgrim) on Oct 28, 2007 at 18:38 UTC
    This can be one of those divisive topics around the water cooler at work; and, as I'm seeing here, everyone has their own experience which leads to some similar answers and some different opinions on when to use subroutines. I'll throw my two-cents in as well.

    For me, a subroutine is one of a number of solutions to help provide one of the following:

    • Modularization - compartmentalizing specific functionality in one location to make it easier to change, debug and understand.
    • Removal of duplicated code - if you do a common task more than once, write it once instead of copying/pasting the code throughout your program.
    All of these things help to streamline your processing flow by reducing the amount of code someone (you or whoever might have to maintain it) has to read through to get a good understanding of what the code is doing.

    Subroutines aren't the only way to provide this, and Perl, like many languages, offers a number of ways to provide modularization and code reuse. Subroutines can be moved into a separate file; a file of related subroutines can be made into a module, and even turned into a reusable object.
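
A rough sketch of the step from loose subroutines to a module. The package name Drinks is hypothetical, and the two subs borrow the is_cold/is_hot idea from an earlier reply. Normally the package would sit in its own file (Drinks.pm) and be loaded with "use Drinks;"; it is kept in the same file here so the example runs on its own:

```perl
use strict;
use warnings;

# Related subs grouped under one package name.
package Drinks;

sub is_cold {
    my ($name) = @_;
    return $name eq 'soda' || $name eq 'beer';
}

sub is_hot {
    my ($name) = @_;
    return $name eq 'coffee' || $name eq 'tea';
}

# Back in the main program, the subs are reached through the package name.
package main;

for my $drink (qw(tea beer)) {
    print "$drink: ", ( Drinks::is_hot($drink) ? "brew it" : "check the fridge" ), "\n";
}
```

The package name acts as a label for the whole group, in the same way that a sub name acts as a label for a chunk of code.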

    All the commenters make good points, but I think you'll just have to take a look at your code while keeping these ideas in mind and see what makes sense.

    ---
    echo S 1 [ Y V U | perl -ane 'print reverse map { $_ = chr(ord($_)-1) } @F;'

      thank you all,
      I think I definitely need to read up more on how to design code and think like a programmer in general before going into more complicated stuff in Perl (mainly the OO subject).
      Perl is an awesome language for me because it allows me to do amazing stuff at work that I wouldn't even have dreamed of getting done without it. Now that I have taken it upon myself to write a bit more complicated code at work, I see that my lack of a general computer science background is coming back to haunt me.

      It is something I feel I would like to study further before digging deeper into Perl (as this is really my first computer language). I just ordered "Pascal, an introduction to the art and science of programming" per someone's recommendation on the forum, and I would also like to look into the book "The Pragmatic Programmer, The Practice of Programming, and Refactoring" Brian recommended (but it wasn't on Amazon nor in my libraries' collections, so I will have to dig deeper).

      My biggest problem is that I also try to learn by reading other people's code, but sometimes I wonder why such a simple operation (those one-line subroutines) would require its own function. "How a script becomes a module" was excellent reading for me and gave me more insight into others' thoughts.
      Thank you again and back to the reading..

        The Pragmatic Programmer, The Practice of Programming, and Refactoring are three separate books. All three are worth reading (and they should be easier to find now).

Re: when to use subroutine
by mwah (Hermit) on Oct 28, 2007 at 17:52 UTC
    convenientstore
    Is there a hard rule regarding when to use subroutine?

    No. There hasn't been one and there won't be one anytime soon.

    How does one know if using subroutine is better than not using it?

    This depends entirely on the knowledge of the person in question. There are aspects of efficiency, comprehensibility, maintainability etc. A subroutine is just one of the means used to transfer a concept into a technical thing, a code or program.

    Can you guys share(or at least point me to right direction or URL) your rules or where you learned when the best time to use subroutine?

    I learned to use subroutines just naturally, by *not* using them in the beginning. If your knowledge evolves over time, your conceptual view of the optimal structuring and transferability into code starts to include "subroutines", "records" (structured data), "modules" and all these things. Don't try to imitate things which you didn't really internalize. Just don't use any subroutine in your own code until it gets crystal clear for you to do so for a reason you like ...

    Regards

    mwa