in reply to Refactoring gone wrong (was: Cannot read in multiple lines)
in thread Cannot read in multiple lines

Let's look at that definition with a different emphasis:

Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing it's external observable behaviour.

Cosmetic changes such as renaming variables or eliminating the explicit use of the default variable where it will be used implicitly are not refactorings. They don't affect the structure of the code.

Add use strict and warnings
Okay. This is actually not a refactoring in itself, but it's the first part of the common perl big refactoring 'Make Everything Compile with use strict and warnings'.

The Addition of warnings or strict checking is definitely not a refactoring of any sort. Doing so changes the "external observable behaviour" of the code. I'm not saying it shouldn't be done. I agree that it should.

Split uses $_ by default, so don't need to state it
Doesn't alter behaviour. You could even consider this the latter half of Remove Parameter if you look at it a certain way. Increases clarity. Valid refactoring.

Split uses $_ whether you state it or not. It isn't at all like Remove Parameter because the parameter is used regardless of whether it do explicitly. It is not an "unused parameter." For an inexperienced Perl programmer, reducing it is likely to reduce clarity. For an experienced programmer it should be just as clear either way. In any case, it doesn't change the structure of the code and is not a refactoring.

Rename $id to $id_to_fetch since it's more descriptive
Doesn't alter behaviour. Increases clarity. Valid refactoring.

This does not alter the structure of the code. It is not a refactoring.

Since we're only interested in the school and name we can just extract those from the split ....
Doesn't alter behaviour. Does increase clarity. Valid refactoring.

Whether or not this is a refactoring at all is debatable. The one significant change resulting from this is that some unnecessary values may not be stored. The original did keep unnecessary values in the array when the split resulted in more than 2 elements. If the split always resulted in exactly 2 elements, I would argue that this would not be a refactoring but I don't know that to be the case and suspect it isn't.

We can then simplify the name return to 'return $name if $id eq $fetch_id;'
Doesn't alter behaviour. For me, this does increase clarity. Valid refactoring.

Besides the fact that this isn't exactly "simplified" from the original return $schoolname;, it doesn't really alter the structure of the code.

Although you didn't point it out yourself, you did drop the unnecessary temporary variable, $schoolname, and I suppose that is a minor refactoring.

On the filehandle issue, localizing a typeglob and relying on perl to close your filehandle for you is not, IMHO, good practice. Keep in mind that a localized typeglob localizes all the variables with that name, not just the filehandle. Consider:

perl -le '$A = "a"; {local *A; $A = "b" } print $A'
To get around that, you would need to use IO::Handle or allow perl to autovivify the handle for you (but only in 5.6.1 and above.) Doing so is often worth it but not for the sole purpose of obtaining a filehandle you don't have to explicitly close. In fact, closing the filehandle explicitly is generally considered a good practice.

Here's another quote from Refactoring: Improving the Design of Existing Code, this from the foreword:

So, what's the problem? Simply this: Refactoring is risky. It requires changes to working code that can introduce subtle bugs. Refactoring, if not done properly, can set you back days, even weeks. And refactoring becomes riskier when practiced informally or ad hoc. You start digging in the code. Soon you discover new opportunities for change, and you dig deeper. The more you dig, the more stuff you turn up. . .and the more changes you make. Eventually you dig yourself into a hole you can't get out of. To avoid digging your own grave, refactoring must be done systematically.
Refactoring is risky. It shouldn't be done informally. It requires a disciplined systematic approach. Here at the monastery I've noticed that the term "refactoring" is both overused and often incorrectly used altogether. Refactoring is not the same as editing or rewriting and we shouldn't further the impression that is.

Refactoring is the practice of restructuring a program by changing its basic components. It is worthwhile to remember the mathematical roots of the word. One might accurately construct the number 24 with the "program" 2**2 * 3 + 6 + 2 * 3 * 1 but refactoring that to 2**3 * 3 results in a program which is shorter, clearer, and more efficient.

-sauoq
"My two cents aren't worth a dime.";

Replies are listed 'Best First'.
Re: What is refactoring?
by adrianh (Chancellor) on Nov 10, 2002 at 00:23 UTC
    Cosmetic changes such as renaming variables or eliminating the explicit use of the default variable where it will be used implicitly are not refactorings. They don't affect the structure of the code.

    Unless I'm misunderstanding, this means that you would not accept Rename Method as a refactoring either. Correct?

    I think renaming methods (and variables) can make code "easier to understand and cheaper to modify". I don't understand what advantage making a distinction between lexical changes (renaming, removing implicitly passed parameters, etc.) and other more structural changes (extracting classes, etc.) gives you?

    Since Fowler is happy to call Rename Method a refactoring, I'm happy to call renaming a variable a refactoring :-)

    We can, of course, disagree whether a particular renaming helps or hinders - but that is a separate issue.


    The Addition of warnings or strict checking is definitely not a refactoring of any sort. Doing so changes the "external observable behaviour" of the code.

    Not necessarily.

    For my sins I have, in the past, had to take masses of non-documented, poorly written perl4 scripts and turn them into something sane.

    When given a task like this one of the first things I do (after creating a test suite) is make everything compile okay with strict & warnings. The process will go something like this:

    1. Run the test suite.
    2. Do perl -Mstrict -Mwarnings whatever.pl
    3. Fix one of the error messages.
    4. Run the test suite again.
    5. Tests failing? Undo last change and do it properly.
    6. Repeat steps 1-5 until no errors
    7. Add use strict; use warnings; to whatever.pl

    Lots of nice small steps. Test suite runs properly after each small step. I interpret this as no change in the "external observable behaviour" of the code.


    Split uses $_ whether you state it or not. It isn't at all like Remove Parameter because the parameter is used regardless of whether it do explicitly. It is not an "unused parameter." For an inexperienced Perl programmer, reducing it is likely to reduce clarity. For an experienced programmer it should be just as clear either way.

    Personally I have found that if the $_ is mentioned explicitly then the clarity of the code is improved by either:

    • removing it (if it's obvious it doesn't need to be mentioned - it's just visual clutter)
    • having an explicitly named variable (if it's not obvious it needs to be named)

    However, these sort of issues are influenced a great deal by personal taste and team experience so I'm happy to agree to disagree over the clarity issue.


    I said:

    Since we're only interested in the school and name we can just extract those from the split ....

    You said:

    Whether or not this is a refactoring at all is debatable. The one significant change resulting from this is that some unnecessary values may not be stored. The original did keep unnecessary values in the array when the split resulted in more than 2 elements. If the split always resulted in exactly 2 elements, I would argue that this would not be a refactoring but I don't know that to be the case and suspect it isn't.

    I'm not sure I follow this. How does the number of items pulled off the split affect the behaviour of the function (returning the ID)?

    To me the significant change is that we explicitly name what we were interested in (the $id and $name). The reader of the code no longer has to translate $split_record[0] to id and $split_record[1] to school name. This increases clarity and maintainability


    On the filehandle issue, localizing a typeglob and relying on perl to close your filehandle for you is not, IMHO, good practice. Keep in mind that a localized typeglob localizes all the variables with that name, not just the filehandle.

    You are quite correct, but that wasn't an issue in this case (and is very rarely an issue at all in my personal experience - but can lead to some evil bugs when it is).

    I know many like having an explicit close. I tend to code so that the scope will close the filehandle for me since it mixes better with exception based code. If you don't have the scope close the filehandle you have to put a catch block around everything that opens a filehandle - which can be a pain.

    This is an arguable point I freely admit (and the fact that I forgot the local in the original does make a strong case for keeping the close in from the defensive coding point of view :-).


    Refactoring is risky. It shouldn't be done informally. It requires a disciplined systematic approach. Here at the monastery I've noticed that the term "refactoring" is both overused and often incorrectly used altogether. Refactoring is not the same as editing or rewriting and we shouldn't further the impression that is.

    Amen.

    I think my failure to keep my refactoring and debugging hats separate when removing that close demonstrates quite clearly what happens when you let that discipline slip.

    However, this doesn't mean that all editing and rewriting is not refactoring. I class renaming methods and variables, along with similar changes that increase clarity without affecting the code behaviour, as refactoring :-)

      Localizing a typeglob to get a scope-bound filehandle is not necessarily a good idea for various other reasons as well. Why not simply: open my($fh), "<", $filename or die "open $filename: $!$/"; If that's an issue due to old Perl version, there's still Symbol::gensym, and a short wrapper function will make its use nearly as concise as the above snippet.

      Makeshifts last the longest.

        Localizing a typeglob to get a scope-bound filehandle is not necessarily a good idea for various other reasons as well.

        Oh. Didn't know there were other possible problems. What are they?

        I know of the gensym solution for early perls - but since 99% of the time overriding the non-IO bits of the glob hasn't been an issue for me I stuck with the local method when I couldn't autovivify the filehandle.

        Curious to know what the other possible pitfalls are.