in reply to Cannot read in multiple lines
Works fine for me (once the missing trailing braces were added).
Just for fun, we'll refactor it into more idiomatic perl...
This gives us...
use strict; use warnings; sub get_schoolname { my $id_to_fetch = $_[0]; local *SCHOOLDATA; open(SCHOOLDATA,"school_data.pdt") or die "Can't open school data +file: $!"; while ( <SCHOOLDATA> ) { my ($id, $name) = split /\|/; return $name if $id eq $id_to_fetch; }; };
update: localised SCHOOLDATA so that that darn filehandle is actually closed. Thanks to sauoq and tachyon for pointing out my error.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Re: Cannot read in multiple lines
by sauoq (Abbot) on Nov 04, 2002 at 08:34 UTC | |
Just for fun, we'll refactor it into more idiomatic perl... Making cosmetic changes to code is not "refactoring." The only item you listed that actually makes a bit of real difference is: and that's terrible advice. The filehandle should absolutely be closed. It's a global (package) variable and probably won't go out of scope until the program ends. Leaving it open is an undocumented and probably undesired side-effect. Oh wait... You also made this subtle yet significant change: The problem with that is that you made an assumption based on a small bit of sample input. What if the input allows some lines to consist of only two fields? If it does, then you might well want the split to remove a newline from the school name. Granted, it will probably work without the split but recommending its removal without more information is irresponsible. By the way, IMHO, his variable names were just fine for the task at hand. I would have chosen them differently and probably closer to the names you picked but that's largely a matter of style. Style is something we all develop for ourselves. It takes time and we find ourselves making niggling little decisions like whether to use $id_to_fetch or $fetch_id as you seem to have struggled with. Personally, I would have stuck with $fetch_id because I think prepositions in variable names are ugly and cumbersome but it's still a style decision.
Rewriting code just because you can is rarely a good idea. You should always have a good reason for rewriting something. Even with good reason, you should be careful to keep the functionality equivalent unless you're sure it doesn't work. In other words, if it ain't broke... </rant> -sauoq "My two cents aren't worth a dime."; | [reply] |
by adrianh (Chancellor) on Nov 07, 2002 at 03:01 UTC | |
Making cosmetic changes to code is not "refactoring." I beg to differ :-) Making cosmetic changes to code is almost the essence of refactoring. According to Fowler (my emphasis): Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing it's external observable behaviour. People sometimes only consider structural refactorings like Replace Type Code with State/Strategy and ignore the smaller refactorings like Inline Temp that can be just as important. Obviously 'easier to understand' is a subjective term (people can argue forever about Inline Temp vs Introduce Explaining Variable) - so take all the 'increases clarity' that follow as 'increases clarity for me'. First off, the ones that I definitely consider to be refactorings: Next a bad refactoring: Finally bug-fixes gone wrong. I said: We don't need the close - the filehandle will automatically be closed when it falls out of scope. And you said. that's terrible advice. The filehandle should absolutely be closed. It's a global (package) variable and probably won't go out of scope until the program ends. Leaving it open is an undocumented and probably undesired side-effect. Quite correct. I'm stupid. It was late. Foolish Adrian. It's instructive that I made this error because of a classic mistake. Let's look at the original code again:
Now, when I looked at this I actually did notice that it stomped all over the global SCHOOLDATA :-) What I should have done at this point is write a failing test. However, due to a little too much hubris some variant of "The code is so simple! Who needs a test to demonstrate the bug" went through my head. Bad Adrian. Later on I coding an extra close into the loop so that the file handle would be closed after the early exit in my version. As soon as I'd finished I thought - "Well that's pointless. Once you've localised SCHOOLDATA it will be closed when it falls out of scope" and removed both close statements. I then wrote up my nice little decision to remove the close - and promptly forgot to go add the local that would make it valid! So, instead of removing one bug I added a new one. Very bad Adrian. Now, if I'd been a good little developer and written some failing tests:
I wouldn't have humiliated myself in front of my peers like this :-) So the moral of the story is: By the way, IMHO, his variable names were just fine for the task at hand. I would have chosen them differently and probably closer to the names you picked but that's largely a matter of style. Style is something we all develop for ourselves. It takes time and we find ourselves making niggling little decisions like whether to use $id_to_fetch or $fetch_id as you seem to have struggled with. Personally, I would have stuck with $fetch_id because I think prepositions in variable names are ugly and cumbersome but it's still a style decision. Didn't have to struggle with the name change. Took longer to type than it did to think about. While I agree that it is a style decision I think the fact that both of us would have chosen something other than '$id' is a pointer to the fact that the original choice wasn't as clear as it could be. Not wrong - just not as descriptive as it could be. Rewriting code just because you can is rarely a good idea. You should always have a good reason for rewriting something. For me increasing the clarity of the code (for some definition of clarity) is a good enough reason. Even with good reason, you should be careful to keep the functionality equivalent unless you're sure it doesn't work. Quite correct. Fouled up with the close & chomp changes. In other words, if it ain't broke... I tend to disagree. This approach can turn the code base of a large project into a Big Ball Of Mud in a surprisingly small period of time. Over the years I'm spending more and more time Refactoring Mercilessly and finding it works very well. <Adrian wanders off to write out 'always write a failing test for a bug' 1000 times> | [reply] [d/l] [select] |
by sauoq (Abbot) on Nov 07, 2002 at 23:24 UTC | |
Let's look at that definition with a different emphasis: Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing it's external observable behaviour. Cosmetic changes such as renaming variables or eliminating the explicit use of the default variable where it will be used implicitly are not refactorings. They don't affect the structure of the code. The Addition of warnings or strict checking is definitely not a refactoring of any sort. Doing so changes the "external observable behaviour" of the code. I'm not saying it shouldn't be done. I agree that it should. Split uses $_ whether you state it or not. It isn't at all like Remove Parameter because the parameter is used regardless of whether it do explicitly. It is not an "unused parameter." For an inexperienced Perl programmer, reducing it is likely to reduce clarity. For an experienced programmer it should be just as clear either way. In any case, it doesn't change the structure of the code and is not a refactoring. This does not alter the structure of the code. It is not a refactoring. Whether or not this is a refactoring at all is debatable. The one significant change resulting from this is that some unnecessary values may not be stored. The original did keep unnecessary values in the array when the split resulted in more than 2 elements. If the split always resulted in exactly 2 elements, I would argue that this would not be a refactoring but I don't know that to be the case and suspect it isn't. Besides the fact that this isn't exactly "simplified" from the original return $schoolname;, it doesn't really alter the structure of the code. Although you didn't point it out yourself, you did drop the unnecessary temporary variable, $schoolname, and I suppose that is a minor refactoring. On the filehandle issue, localizing a typeglob and relying on perl to close your filehandle for you is not, IMHO, good practice. Keep in mind that a localized typeglob localizes all the variables with that name, not just the filehandle. Consider: To get around that, you would need to use IO::Handle or allow perl to autovivify the handle for you (but only in 5.6.1 and above.) Doing so is often worth it but not for the sole purpose of obtaining a filehandle you don't have to explicitly close. In fact, closing the filehandle explicitly is generally considered a good practice. Here's another quote from Refactoring: Improving the Design of Existing Code, this from the foreword: So, what's the problem? Simply this: Refactoring is risky. It requires changes to working code that can introduce subtle bugs. Refactoring, if not done properly, can set you back days, even weeks. And refactoring becomes riskier when practiced informally or ad hoc. You start digging in the code. Soon you discover new opportunities for change, and you dig deeper. The more you dig, the more stuff you turn up. . .and the more changes you make. Eventually you dig yourself into a hole you can't get out of. To avoid digging your own grave, refactoring must be done systematically.Refactoring is risky. It shouldn't be done informally. It requires a disciplined systematic approach. Here at the monastery I've noticed that the term "refactoring" is both overused and often incorrectly used altogether. Refactoring is not the same as editing or rewriting and we shouldn't further the impression that is. Refactoring is the practice of restructuring a program by changing its basic components. It is worthwhile to remember the mathematical roots of the word. One might accurately construct the number 24 with the "program" 2**2 * 3 + 6 + 2 * 3 * 1 but refactoring that to 2**3 * 3 results in a program which is shorter, clearer, and more efficient. -sauoq "My two cents aren't worth a dime."; | [reply] [d/l] [select] |
by adrianh (Chancellor) on Nov 10, 2002 at 00:23 UTC | |
by Aristotle (Chancellor) on Nov 10, 2002 at 15:21 UTC | |
| |
|
Re: Re: Cannot read in multiple lines
by tachyon (Chancellor) on Nov 04, 2002 at 10:19 UTC | |
We don't need the close - the filehandle will automatically be closed when it falls out of scope. Actually that is wrong. FILE globs are not locally scoped.
If you really feel that not explicitly closing your filehandles is good programming practice and want to depend on Perl doing it for you you need to use
From memory this is only valid for Perl 5.6+ cheers tachyon s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print | [reply] [d/l] [select] |
by adrianh (Chancellor) on Nov 07, 2002 at 03:04 UTC | |
Quite correct. Me very very foolish. Bad Adrian. | [reply] |
|
Re: Re: Cannot read in multiple lines
by Angel (Friar) on Nov 04, 2002 at 00:01 UTC | |
| [reply] |
by Enlil (Parson) on Nov 04, 2002 at 00:09 UTC | |
This will keep Update: As dingus pointed out. I did mean $/ as opposed to $\, which is the output record separator. So I have changed all instances in the code and response above. Thanks dingus -enlil | [reply] [d/l] [select] |
by dingus (Friar) on Nov 04, 2002 at 13:06 UTC | |
I would guess that at some point you assigned $\ a new value, or perhaps undefined it. If you need this you should put it in a block and localize the assignment, like so: ... Surely you mean $/ not $\. $/ is INPUT_RECORD_SEPARATOR while $\ is OUTPUT_RECORD_SEPARATOR according to perlvar Other than that - it looks like excellent advice. One other possibility is that the input file has been produced by a computer with different line characteristsics. E.g. A windows PC reading a file creted by a Unix program will also do precisely what you are experiencing. In which case setting $/ to be the very specific chr(13) woule be a good idea. Dingus | [reply] |