in reply to Re: Refactoring nested if block
in thread Refactoring nested if block
I really appreciate all of your input, especially BrowserUK. I'll try those changes, and see how it works. I agree - testing is critical.
One thing I figure I should describe is my naming convention. Yes, I know $major_frame_index and $minor_frame_index are long, but my reasoning was that $major or even $major_frame didn't seem descriptive enough for me. See, this is part of a larger app, in which I parse a large chunk of binary data (major frame), which is in turn made up of smaller chunks of data (minor frames). This resulted in two types of data for each type of frame, a numerical index and a chunk of binary data. The indices I use to reference their locations I called $major_frame_index and $minor_frame_index, and the chunks of data I have as scalar variables, $minor_frame and $major_frame. But writing this now makes me rethink my need to have store the binary chunks in variables... I don't have access to the rest of the code now, so can't say for sure.
But, assuming I would still need both the index and the binary chunk, is there a better way to name the variables?
-- Burvil
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^3: Refactoring nested if block
by xdg (Monsignor) on Mar 25, 2006 at 13:30 UTC | |
If the major and minor frame indices are integers, are they sequential and could they be stored in array references rather than hash references? This might allow you to shorten/tighten things a bit -- for example, eliminating all the numerical sorting of keys and integer checks. I'd suggest pulling all your hard-coded numbers into variables that explain them. (E.g. why just minor frame data for 3 .. 17, why all the special logic around minor frame index 3 being 165.) Also, while "golfing" (reducing characters) can often make things more confusing, in this case, I think trying to strip out some of the "line-noise" of Perl so that the logic is clearer might help. That would include changing %{ $ref_data } to just %$ref_data and $ref_data->{$filename}->{$major_frame_index}->{$minor_frame_index} to $ref_data->{$filename}{$major_frame_index}{$minor_frame_index} if not just $ref_data->{$filename}{$i}{$j} (simple variables for loop iterators). You could also break those up into temporary variables for each loop so you're not cluttering up the screen with the long names. You might consider moving the logic when you're at row length or you hit your special frame data 3 equals 165 actions to separate subroutines with descriptive names. That would strip out lines to make the logic clearer (at the cost of some efficiency, true) as it brings your alternate conditions closer together visually. Something that jumps out is that you're pushing an array slice onto @cur_spin but checking to see if the length of @cur_spin ever hits the row length exactly. What's implied is that the row length is a multiple of the slice length but that's not immediately obvious. And what happens if you get to the end of a major frame and came up short? It doesn't look like anything would be written. You also don't seem to sort the keys of the minor frames when you loop through them. Is that a bug? Since just about every hash you have you wind up wanting sorted, you might just want to use a sorted hash in the first place. (Though I haven't used it, CPAN search found Tie::Sorted::Hash.) Having looked through your code a few times now, it looks like your special logic for @cur_spin equals zero is trying to pick off the name for the row and you need that logic since you reuse @cur_spin as your accumulator and don't want to try to generate a name each time through. That makes me think you should consider just assembling all of the minor frames up front and just splice them off a row at at time rather than looping over them. For example, I'd approach it more like this:
This kind of assumes that you can generate a row name in a subroutine from the row itself other than with all those indexes. If you really want/need the indexes, then you'll have to do something more like what you've got with all the indices and looping instead of the way I've done it with map. I don't really have enough context for your application to say. -xdg Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk. | [reply] [d/l] [select] |
|
Re^3: Refactoring nested if block
by BrowserUk (Patriarch) on Mar 25, 2006 at 10:16 UTC | |
But, assuming I would still need both the index and the binary chunk, is there a better way to name the variables? Well, I can't see the rest of the code, but to my mind that's the point. One of the great advantages of partitioning your code into self-contained subroutines and using locally scoped variables is that you do not have to consider the whole application when naming your variables, as you would if using large scopes or global vars. In the case of your two loop iterators, at the level of this subroutine, that is all they are: Loop interators. In fact at this level, the use of the word 'index' is a misnomer as you index into arrays, but you key into hashes. So I suppose that $major_key and $minor_key might make sense, but then it is fairly obvious from their use that they are keys. And their scope only covers a dozen or so lines, so it is unlikely that you, (or anyone working on that subroutine), will forget or loose sight of that. Outside of that subroutine, the keys those loop interators transiently hold, may have associations with "frames", (whatever they are :), and it might make sense to name variables that serve a similar purpose, in other parts of the code to reflect that, but within this subroutine they just serve to iterate the keys of the hash so referencing "frame" in the names serves little purpose. Even the references to "major" and "minor" are not especially relevant at this level, though the implication that one is subordinate to the other is accurate and fits their use, and is no worse than $key1 & $key2 or even $k1 & $k2. Though I would say that they are no better than those either. I think where you are coming from, is the school of thought that says you use a single, consistant name for any given entity throughout your program. I've seen this proposed in several books, and was even taught it a long time ago on a Fortran77 course. I cannot remember the scoping rules for Fortran77, but I recall having to use a lot of global vars, so consistant naming was a requirement. Such naming conventions were also a necessity in (old) COBOL programs with their huge DATA-SECTIONs at the top of the programs. If you are going to do this is Perl, especially in larger, more complex programs, you are throwing away one of the main benefits of local scoping. You (almost) might as well stick with global vars and have done with it. I guess if your local naming conventions dictate this practice, or if this is a large, existing application that uses similar conventions consistantly throughout the rest of the code, then sticking with it when refactoring this sub might make sense. In the long term, recognising that regardless of what the entities that these loop variables hold, are in the wider context, within the scope of this subroutine, and these variable's lives, they are simply loop / hash iterators, will allow you to write more concise lines of code and so concentrate soley upon what these variables are doing within this subroutine and treat the data coming into the subroutine in a generic fashion. In the end, that's a personal view and your decisions will be influenced by the wider context of your codebase and your personal or corporate preferences. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |
by Argel (Prior) on Mar 28, 2006 at 00:51 UTC | |
Well, I can't see the rest of the code, but to my mind that's the point. One of the great advantages of partitioning your code into self-contained subroutines and using locally scoped variables is that you do not have to consider the whole application when naming your variables, as you would if using large scopes or global vars. Forgive me for disagreeing, but lets say I use $network_device_polling_data in five subroutines and $ndpdata in the sixth all to represent the same thing then that just introduces confusion and increase the liklihood of introducing a bug in the future. Subroutines and the contents of subroutines do not exist within a vacuum. | [reply] |
by BrowserUk (Patriarch) on Mar 28, 2006 at 09:04 UTC | |
Forgive me for disagreeing No need to :) lets say I use $network_device_polling_data in five subroutines and $ndpdata in the sixth all to represent the same thing I recall saying something about " if your local naming conventions dictate this practice, or if this is a large, existing application that uses similar conventions consistently throughout the rest of the code, then sticking with it when refactoring this sub might make sense.", which I think covers the situation you describe. Subroutines and the contents of subroutines do not exist within a vacuum. Actually, as far as makes sense within the context of your application/library, they should. That is the purpose and benefit of abstraction and encapsulation. Let's take Perl's sort routine as a contrived example that supports my premise. Internal to the implementation, it doesn't care what the data it is sorting is. It could be an array, a list, hash keys, hash values; it may consist of numbers, integers, reals, characters, strings; and any of those could represent people, animals, dogs, Chihuahua's, cats, files, directories, picture files, volumes of water, distances to towns, sizes of feet etc. etc. All these attributes of the data to be sorted are completely irrelevant to the function for which the sort subroutine is written. It only needs to know a) how many there are; b) what collating sequence should be used to compare two of them; c) whether an ascending or descending ordering is required. Even then, 2 of those 3 pieces of information can be abstracted out of the sort code via a callback. Of course, you may say that is a "special case" as most application subroutines cannot be abstracted in such a general way--and you're right, but the purpose of abstraction and encapsulation is to simplify the code at any given point by reducing the amount of information that the programmer needs to carry in his head when reading or writing it. Computers are very good at remembering everything--the weak link is the wetware instructing them. ...say I use $network_device_polling_data... This is the part that I would ask you to think about. I pick up your code and need to modify one of these subroutines and I encounter the variable $network_device_polling_data. What does that variable tell me about the data it contains? Well, obviously it has something to do with polling a network device, but is it Of course, it could be data that relates to all device polling by any device host anywhere in the network. And again, it might be statistics gathered about that process, or control values used by that process, or just a boolean flag that say "Don't do any device polling on this network", or "All devices are polled on this network". It could even mean all of these thing at different places in the program. And the point is that it is simply impossible to come up with variable names that completely and transparently encapsulate the state and purpose of the data they contain, no matter how long you make them. They always need to be viewed and understood in the context of their scope and use. $network_device_polling_data may make sense at the outer level of it's scope, but within a subroutine that is called to normalise that data by (say) lowercasing everything, escaping control characters, and trimming whitespace, it is just a string that needs to be manipulated in a particular way. It may be that the manipulations are unusual and specific to network device polling data, in which case the subroutine name should reflect that fact and be called (something like) normalise_network_polling_data(), but then within that subroutine, it becomes unnecessary and unhelpful to use the full nomenclature to refer to the data. Compare
with
I might take that one step further and make it
In a book I just read, there is a line in the last Chapter that reads Henry raises a hand in greeting, then gestures to the young registrar to accompany him to the lightbox where Baxter's scans are on display. There is no need for the author to elucidate that Baxter is a 20 something yob with whom Henry had a slight fender-bender earlier in the day; and whom was going to exact his revenge upon Henry by having his lackeys physically beat him; until Henry, a neurosurgeon noticed several tell tale signs that Baxter was suffering the early symptoms of Huntington's disease and that he (Henry), had used this knowledge to embarrasses Baxter in front of his lackeys and so escaped the beating; and that several hours later, Baxter had forced his way into Henry's house with the intention of exacting his revenge upon Henry and his family and as a result of a struggle that ensued had been thrown down the stairs, receiving a severe blow to head on a marble step; and that now Henry was about to save his life by performing an operation to correct a depressed midline fracture and remove extradural and subdural heamatomas. All of that information is known through context. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |