in reply to Re^2: Refactoring nested if block
in thread Refactoring nested if block

But, assuming I would still need both the index and the binary chunk, is there a better way to name the variables?

Well, I can't see the rest of the code, but to my mind that's the point. One of the great advantages of partitioning your code into self-contained subroutines and using locally scoped variables is that you do not have to consider the whole application when naming your variables, as you would if using large scopes or global vars.

In the case of your two loop iterators, at the level of this subroutine, that is all they are: Loop interators. In fact at this level, the use of the word 'index' is a misnomer as you index into arrays, but you key into hashes. So I suppose that $major_key and $minor_key might make sense, but then it is fairly obvious from their use that they are keys. And their scope only covers a dozen or so lines, so it is unlikely that you, (or anyone working on that subroutine), will forget or loose sight of that.

Outside of that subroutine, the keys those loop interators transiently hold, may have associations with "frames", (whatever they are :), and it might make sense to name variables that serve a similar purpose, in other parts of the code to reflect that, but within this subroutine they just serve to iterate the keys of the hash so referencing "frame" in the names serves little purpose. Even the references to "major" and "minor" are not especially relevant at this level, though the implication that one is subordinate to the other is accurate and fits their use, and is no worse than $key1 & $key2 or even $k1 & $k2. Though I would say that they are no better than those either.

I think where you are coming from, is the school of thought that says you use a single, consistant name for any given entity throughout your program. I've seen this proposed in several books, and was even taught it a long time ago on a Fortran77 course. I cannot remember the scoping rules for Fortran77, but I recall having to use a lot of global vars, so consistant naming was a requirement. Such naming conventions were also a necessity in (old) COBOL programs with their huge DATA-SECTIONs at the top of the programs.

If you are going to do this is Perl, especially in larger, more complex programs, you are throwing away one of the main benefits of local scoping. You (almost) might as well stick with global vars and have done with it.

I guess if your local naming conventions dictate this practice, or if this is a large, existing application that uses similar conventions consistantly throughout the rest of the code, then sticking with it when refactoring this sub might make sense. In the long term, recognising that regardless of what the entities that these loop variables hold, are in the wider context, within the scope of this subroutine, and these variable's lives, they are simply loop / hash iterators, will allow you to write more concise lines of code and so concentrate soley upon what these variables are doing within this subroutine and treat the data coming into the subroutine in a generic fashion.

In the end, that's a personal view and your decisions will be influenced by the wider context of your codebase and your personal or corporate preferences.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^4: Refactoring nested if block
by Argel (Prior) on Mar 28, 2006 at 00:51 UTC
    Well, I can't see the rest of the code, but to my mind that's the point. One of the great advantages of partitioning your code into self-contained subroutines and using locally scoped variables is that you do not have to consider the whole application when naming your variables, as you would if using large scopes or global vars.

    Forgive me for disagreeing, but lets say I use $network_device_polling_data in five subroutines and $ndpdata in the sixth all to represent the same thing then that just introduces confusion and increase the liklihood of introducing a bug in the future. Subroutines and the contents of subroutines do not exist within a vacuum.

      Forgive me for disagreeing

      No need to :)

      lets say I use $network_device_polling_data in five subroutines and $ndpdata in the sixth all to represent the same thing

      I recall saying something about " if your local naming conventions dictate this practice, or if this is a large, existing application that uses similar conventions consistently throughout the rest of the code, then sticking with it when refactoring this sub might make sense.", which I think covers the situation you describe.

      Subroutines and the contents of subroutines do not exist within a vacuum.

      Actually, as far as makes sense within the context of your application/library, they should. That is the purpose and benefit of abstraction and encapsulation.

      Let's take Perl's sort routine as a contrived example that supports my premise. Internal to the implementation, it doesn't care what the data it is sorting is. It could be an array, a list, hash keys, hash values; it may consist of numbers, integers, reals, characters, strings; and any of those could represent people, animals, dogs, Chihuahua's, cats, files, directories, picture files, volumes of water, distances to towns, sizes of feet etc. etc.

      All these attributes of the data to be sorted are completely irrelevant to the function for which the sort subroutine is written. It only needs to know a) how many there are; b) what collating sequence should be used to compare two of them; c) whether an ascending or descending ordering is required. Even then, 2 of those 3 pieces of information can be abstracted out of the sort code via a callback.

      Of course, you may say that is a "special case" as most application subroutines cannot be abstracted in such a general way--and you're right, but the purpose of abstraction and encapsulation is to simplify the code at any given point by reducing the amount of information that the programmer needs to carry in his head when reading or writing it. Computers are very good at remembering everything--the weak link is the wetware instructing them.

      ...say I use $network_device_polling_data...

      This is the part that I would ask you to think about. I pick up your code and need to modify one of these subroutines and I encounter the variable $network_device_polling_data. What does that variable tell me about the data it contains?

      Well, obviously it has something to do with polling a network device, but is it

      • Data received from polling a network device?
      • Data used to poll a network device?
      • Data transmitted when polling a network device?
      • Data that summarises the actions taken, or responses received, or both, from polling one, or more, network devices?
      • Is it applicable to a specific network device?
      • Or all network devices?
      • Or a subset of network devices or device type or...

      Of course, it could be data that relates to all device polling by any device host anywhere in the network. And again, it might be statistics gathered about that process, or control values used by that process, or just a boolean flag that say "Don't do any device polling on this network", or "All devices are polled on this network".

      It could even mean all of these thing at different places in the program.

      And the point is that it is simply impossible to come up with variable names that completely and transparently encapsulate the state and purpose of the data they contain, no matter how long you make them. They always need to be viewed and understood in the context of their scope and use.

      $network_device_polling_data may make sense at the outer level of it's scope, but within a subroutine that is called to normalise that data by (say) lowercasing everything, escaping control characters, and trimming whitespace, it is just a string that needs to be manipulated in a particular way. It may be that the manipulations are unusual and specific to network device polling data, in which case the subroutine name should reflect that fact and be called (something like) normalise_network_polling_data(), but then within that subroutine, it becomes unnecessary and unhelpful to use the full nomenclature to refer to the data. Compare

      sub normalise{ my $network_device_polling_data = shift; $network_device_polling_data =~ tr/.../.../; $network_device_polling_data =~ s/..../..../; $network_device_polling_data = quotemeta( $network_device_polling_ +data ); ...; return $network_device_polling_data; }

      with

      sub normalise_network_device_polling_data { my $data = shift; $data =~ tr/.../.../; $data =~ s/..../..../; $data = quotemeta( $data ); ...; return $data; }

      I might take that one step further and make it

      sub normalise_network_device_polling_data { die "This is a procedure not a function" if defined wantarray; for ( @_ ) { tr/.../.../; s/.../.../; $_ = quotemeta; } }

      In a book I just read, there is a line in the last Chapter that reads

      Henry raises a hand in greeting, then gestures to the young registrar to accompany him to the lightbox where Baxter's scans are on display.

      There is no need for the author to elucidate that Baxter is a 20 something yob with whom Henry had a slight fender-bender earlier in the day; and whom was going to exact his revenge upon Henry by having his lackeys physically beat him; until Henry, a neurosurgeon noticed several tell tale signs that Baxter was suffering the early symptoms of Huntington's disease and that he (Henry), had used this knowledge to embarrasses Baxter in front of his lackeys and so escaped the beating; and that several hours later, Baxter had forced his way into Henry's house with the intention of exacting his revenge upon Henry and his family and as a result of a struggle that ensued had been thrown down the stairs, receiving a severe blow to head on a marble step; and that now Henry was about to save his life by performing an operation to correct a depressed midline fracture and remove extradural and subdural heamatomas.

      All of that information is known through context.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.