Over the last few days I have twice seen answers to questions on this site advocating a foreach loop construct which has made me scratch my head somewhat. These answers have been in no way wrong, but, to my mind, promote code constructs which may lead to less maintainable code.

The foreach loop construct in question may be paraphrased thus:

foreach (1..5) { if ( $object->method( "arg$_" ) ) { # ... do stuff } }

The foreach loop iterates through a range of numbers, specified with either the range operator or a C-style for construct, using this loop index for construction of a parameter name or value through concatenation. That is, there is no use of the loop index as a storage index or reference, the loop using the index only for the creation of string values through concatenation.

This approach, to my mind, is very C-like and, in my humble opinion, could be better performed in the following manner:

foreach ( qw/ arg1 arg2 arg3 arg4 arg5 / ) { if ( $object->method( $_ ) ) { # ... do stuff } }

The difference between these foreach loop constructs is minor from a syntatical standpoint, but for foreach loop constructs where loop indices are being used only for the creation of string values through concatenation, this latter approach is advantageous primarily for reasons of on-going code maintenance.

That is, this latter approach:

It is for these first two reasons alone that I would advocate this latter approach to foreach loop constructs to improve the maintainability and usefulness of developed code.

I would welcome your comments and observations.

 

perl -e 'print+unpack("N",pack("B32","00000000000000000000000111001111")),"\n"'

Replies are listed 'Best First'.
•Re: On Foreach Loops and Maintainability
by merlyn (Sage) on Oct 11, 2002 at 14:00 UTC
    I prefer the first. At a glance, I can see that it's every element of the sequence, and that they're extremely regular. When you list them all explictly, I have to stare at each one to see if it's one higher than the previous and doesn't have some typo.

    "Capture regularities in code, irregularities in data", is the koan of the senior programmer. You've bucked that for a microoptimization.

    But yes, I understand your maintenance concern. Suppose besides "arg1".."arg5" (which is another way you could have written that and didn't, which made me wonder), you eventually might have had "arg19" and "gee37". You could have written the original loop as:

    for (map("arg$_", 1..5)) { .. }
    And then the maintenance person can add:
    for (map("arg$_", 1..5, 19), "gee37") { .. }
    That'd probably be the most flexible, and the most clear.

    -- Randal L. Schwartz, Perl hacker

      You could have written the original loop as:
      for (map("arg$_", 1..5)) { .. }
      That was my first though, but on further reflection, it looks like configuration information (i.e., which form fields we're interested in) is getting mixed into the code. Is this a bad thing? Perhaps. If the fact that we're interested into "arg1" .. "arg5" (and might later be interested in "arg19" or "gee37") appears elsewhere in the code, then lifting that information out into a configuration section would make maintenance safer.   my @formArgs = map { "arg$_" } 1..5; and then, later in the code
      foreach ( @formArgs ) { .. }

      <aol>me too</aol>

      And to take this a bit further, if there's no else, I'd write this like so:

      for (map("arg$_", 1..5, 19), "gee37") { $object->method($_) or next; # ... }

      Fewer indentation levels are generally easier to read. Finding this at the bottom of a function always makes me scroll up and down a couple times:

      # ... } } } # play "guess the block this belongs to" here $_ .= "something"; } } }

      Makeshifts last the longest.

        Is there a reason that the idea merlyn mentioned (that I think is a excellent use of perl) was comprehensively glossed over?

        I'd also try and do away with the if block but I think cond or next; reads a little unintuatively and would probably use
        next unless cond;

        for ( 'arg1' .. 'arg5', "arg19", 'gee13' .. 'gee37' ) { next unless $object->method( $_ ); # do stuff }

        Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
        "Capture regularities in code, irregularities in data", is the koan of the senior programmer. You've bucked that for a microoptimization.

      To me, it seems that your approach is working against this maxim. You are embedding the irregularities in the code. Sure, they are data in the code, but you have to work out that fact because you build up a list of 1-5 and 19 and attach gee37 to the end of it.

      I think that this:

      @obj_args = qw/ arg1 arg2 arg3 arg4 arg5 arg19 gee37 /; for ( @obj_args ) { ... }

      Is clearer. In this example, you've clearly separated the irregularities into a data structure and used that to drive the code.

      This kind of separation has all kinds of maintenance advantages, not just clarity on first reading. Say the maintenance programmer is confronted with a case where the arguments arg3 and gee37 have strange behavior. It's easy to change the code for this and you're doing it in such a way as to make it very clear if someone else working with the maintenance programmer would immediately grasp the intent.

      I much prefer:

      # @objargs = qw/ arg1 arg2 arg3 arg4 arg5 gee37 /; # # test just arg3 and gee37 @objargs = qw/ arg3 gee37 /; for ( @objargs ) { ... }

      To this:

      #for (map("arg$_", 1..5, 19), "gee37") { # # test just arg3 and gee37 for ( qw/ arg3 gee37 / ) { ... }

      This shows how the irregularity of the data is more clearly separated from the code. When you use something like:

      (map("arg$_", 1..5, 19), "gee37")

      You are using code to represent the irregularity of the data.

Re: On Foreach Loops and Maintainability
by Tanalis (Curate) on Oct 11, 2002 at 14:04 UTC
    I think .. I'd agree in principle. Your latter syntax is evidently clearer to read and maximises flexibility if the arg format changes as code evolves.

    Consider, however, a case where there are hundreds or even thousands of argNs - in that case, it's clearly easier to read and maintain a foreach ( 0 .. 1000 ) than a qw// containing a huge number of words. (This isn't so rare - consider iterating through a fixed portion of a client database, for example).

    I think both approaches have their relative pros and cons - and it's a case of choosing whichever is most suitable for the code being written, taking readability and maintenance into account at design-time. After all, most Perl programmers would be able to quickly understand either syntax, especially if it's familiar code or territory.

    Just some quick thoughts...
    --Foxcub

Re: On Foreach Loops and Maintainability
by perrin (Chancellor) on Oct 11, 2002 at 14:11 UTC
    What happens when the name "arg" changes to "param"? Now I have to change it five times.
      But it's only in 1 place that you have to look to change the code. What if you use the construct "arg$_" multiple times in the body of the loop? It will be slightly harder to change that, and make sure you have gotten them all.

      Counting the keystrokes in my favorite editor to change arg to param in any of the above examples: ctrl-shift-5 a r g \n p a r a m \n !

      12 keystrokes, easy enough. If you are worried about replacing other things, then it's +1 stroke per interactive search/replace....

      Like FoxCub said, I think if you have both methods in your toolbox, and can use them when appropriate, then you are a better programmer for it.

        That's true, but it sets off my "once and only once" radar. I don't like repeating things.
Re: On Foreach Loops and Maintainability
by ignatz (Vicar) on Oct 11, 2002 at 17:40 UTC
    The whole premise seems ugly. arg1, arg2, arg3 bleach:-P

    For me the most important thing is the process: Do arg1, then arg2, then arg3, etc. The code needs to clearly define what the process is:

    @process = qw/ arg1 arg2 arg3 arg4 arg5 /; foreach ( @process ) { if ( $object->method( $_ ) ) { # ... do stuff } }
    Even better:
    foreach ( $object->getProcess() ) { if ( $object->method( $_ ) ) { # ... do stuff } }
    so that the object controls what the process is and as the object changes so can the process. (ADDED: I could see distilling this down even more and in better ways.)

    I'd bet that this is the slowest way to do things, but at least for me it seems the clearest. I don't know what posts you are refering to so I could be completely talking out of my butt.

    ()-()
     \"/
      `                                                     
    

      I too have a little problem with the names. Those ascending numbers lead me to conclude that the problem domain is being poorly mapped into Perl. merlyn says that you should capture regularities in code. Yes indeed, but that still doesn't validate this map { "arg$_" } approach as being a good one. A completely different approach is required. Which is why I like ignatz's idea.

      Can someone post references to the thread(s) that rob_au is talking about? I seem to have missed them, and I'm not sure if I'm taking the example too literally.

      One thing that I can see in favour of rob_au's initial approach is that it allows you grep the source code to find out where arg4 is used. Never underestimate the power of static analysis. If you're creating those names on the fly then you don't have that luxury.

      If the project is big enough, then it may even be worth going as far as pulling those names out of the source code and storing them in a data file. That gives you a different degree of flexibility and control. You can write a Makefile to ensure that you can run a program against the data file to make sure it is stays coherent.


      print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'