Making cosmetic changes to code is not "refactoring."

I beg to differ :-)

Making cosmetic changes to code is almost the essence of refactoring. According to Fowler (my emphasis):

Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing it's external observable behaviour.

People sometimes only consider structural refactorings like Replace Type Code with State/Strategy and ignore the smaller refactorings like Inline Temp that can be just as important.

Obviously 'easier to understand' is a subjective term (people can argue forever about Inline Temp vs Introduce Explaining Variable) - so take all the 'increases clarity' that follow as 'increases clarity for me'.

First off, the ones that I definitely consider to be refactorings:

Add use strict and warnings
Okay. This is actually not a refactoring in itself, but it's the first part of the common perl big refactoring 'Make Everything Compile with use strict and warnings'. The end result of this refactoring shouldn't alter behaviour and makes things easier to understand and cheaper to modify in the future.
Split uses $_ by default, so don't need to state it
Doesn't alter behaviour. You could even consider this the latter half of Remove Parameter if you look at it a certain way. Increases clarity. Valid refactoring.
Rename $id to $id_to_fetch since it's more descriptive
Doesn't alter behaviour. Increases clarity. Valid refactoring.
Since we're only interested in the school and name we can just extract those from the split ....
Doesn't alter behaviour. Does increase clarity. Valid refactoring.
We can then simplify the name return to 'return $name if $id eq $fetch_id;'
Doesn't alter behaviour. For me, this does increase clarity. Valid refactoring.

Next a bad refactoring:

Since we're not looking at the end of the line we don't need the chomp at all ...
You're quite correct and this only applies if each line has more than two fields. I should have made that assumption explicit in the code or left the chomp in.

Finally bug-fixes gone wrong. I said:

We don't need the close - the filehandle will automatically be closed when it falls out of scope.

And you said.

that's terrible advice. The filehandle should absolutely be closed. It's a global (package) variable and probably won't go out of scope until the program ends. Leaving it open is an undocumented and probably undesired side-effect.

Quite correct. I'm stupid. It was late. Foolish Adrian.

It's instructive that I made this error because of a classic mistake. Let's look at the original code again:

sub get_schoolname( $ ) { my $id = $_[0]; my $schoolname; #get groupname from current database open(SCHOOLDATA,"school_data.pdt") or die "Can't open school data +file: $!"; while ( <SCHOOLDATA> ) { chomp($_); @split_record = split( /\|/ , $_ ); print $split_record[0]; if( $split_record[0] eq $id ) { $schoolname = $split_record[1]; } } close( SCHOOLDATA ); return( $schoolname ); };

Now, when I looked at this I actually did notice that it stomped all over the global SCHOOLDATA :-)

What I should have done at this point is write a failing test.

However, due to a little too much hubris some variant of "The code is so simple! Who needs a test to demonstrate the bug" went through my head. Bad Adrian.

Later on I coding an extra close into the loop so that the file handle would be closed after the early exit in my version. As soon as I'd finished I thought - "Well that's pointless. Once you've localised SCHOOLDATA it will be closed when it falls out of scope" and removed both close statements.

I then wrote up my nice little decision to remove the close - and promptly forgot to go add the local that would make it valid!

So, instead of removing one bug I added a new one. Very bad Adrian.

Now, if I'd been a good little developer and written some failing tests:

use Test::More tests => 4; { local *SCHOOLDATA; open(SCHOOLDATA, "school_data.pdt") or die; is(get_schoolname(2), 'bunny werks', 'existing id found'); is(tell(*SCHOOLDATA{IO}), 0, 'filehandle still open'); }; { local *SCHOOLDATA; open(SCHOOLDATA, "school_data.pdt") or die; is(get_schoolname(-1), undef, 'bad id rejected'); is(tell(*SCHOOLDATA{IO}), 0, 'filehandle still open'); };

I wouldn't have humiliated myself in front of my peers like this :-)

So the moral of the story is:

By the way, IMHO, his variable names were just fine for the task at hand. I would have chosen them differently and probably closer to the names you picked but that's largely a matter of style. Style is something we all develop for ourselves. It takes time and we find ourselves making niggling little decisions like whether to use $id_to_fetch or $fetch_id as you seem to have struggled with. Personally, I would have stuck with $fetch_id because I think prepositions in variable names are ugly and cumbersome but it's still a style decision.

Didn't have to struggle with the name change. Took longer to type than it did to think about.

While I agree that it is a style decision I think the fact that both of us would have chosen something other than '$id' is a pointer to the fact that the original choice wasn't as clear as it could be. Not wrong - just not as descriptive as it could be.

Rewriting code just because you can is rarely a good idea. You should always have a good reason for rewriting something.

For me increasing the clarity of the code (for some definition of clarity) is a good enough reason.

Even with good reason, you should be careful to keep the functionality equivalent unless you're sure it doesn't work.

Quite correct. Fouled up with the close & chomp changes.

In other words, if it ain't broke...

I tend to disagree. This approach can turn the code base of a large project into a Big Ball Of Mud in a surprisingly small period of time. Over the years I'm spending more and more time Refactoring Mercilessly and finding it works very well.

<Adrian wanders off to write out 'always write a failing test for a bug' 1000 times>


In reply to Refactoring gone wrong (was: Cannot read in multiple lines) by adrianh
in thread Cannot read in multiple lines by Angel

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.