in reply to Re (tilly) 2 (disagree): Another commenting question, in thread Another commenting question,
In my own defense, I would like to make it clear that I would never actually use $x as a variable name in a program. That was just for a quickie example. Any good programmer would do as you say with variable/subroutine names, no globals, and good structure. I did not make that sufficiently clear in my post. In no way can commenting, however good, make up for bad code. Good code always comes first, and good comments come second.
However, I still contend that the good comments must, in fact, come. I, too, have been stuck doing a lot of maintenance work, and never have I come across any code so clear that it negates the need for comments. Your assertion that code can stand as its own comment seems unrealistic. Perhaps you just write much, much better code than anyone I've ever encountered :)
Yet in my experience, the code is never sufficient. Just a few days ago, I came across some code which confused me. I even wrote to perlmonks about it. As it turns out, my understanding of the code was flawless, but I still didn't know what it did, because the author had given no indication of why he did it.
I also took the time to peruse your example, and I must say that it seems a far better proof of my point than yours. When you read it, do you suppose that if you were not the author that you would instantly say, "Oh, of course! This code reads text data separated by an arbitrary character!"? No. But by simply reading the abstract which you so thoughtfully included, the reasoning behind the code is immediately revealed. Now, I don't know for sure if that code works as written, but I can guarantee that if I had to test or debug it, it would take at least twice as long to do so without those comments.
Also note that it was those style of comments that were what I was attempting to advocate earlier, as opposed to play-by-play commentary on the mechanics. I also would like to apologize if there was any hostility creeping into my tone. I've simply wasted way too much time shoveling through the crap of otherwise good programmers who proclaimed their code self-explanatory.
Some people drink from the fountain of knowledge, others just gargle.
Re (tilly) 4 (disagree): Another commenting question,
by tilly (Archbishop) on Mar 21, 2001 at 02:17 UTC
|
If you peruse my example and walked away from it thinking
that it stands as an example of why to comment every line,
use pretty formatting of comments, put comments at the
end of lines where they have to be adjusted by anyone
making changes to the code, etc, etc, etc - then you
missed the point.
My last paragraph of what you were responding to said:
Also note that I don't actually write uncommented code.
But my comments are about different aspects of the code
than the mere mechanics of what each line is doing...
Stare at that carefully.
The sample commenting examples that you gave resulted in
comments whose formatting would be a maintainance issue
that so closely duplicate what the code says that there is
little added value. Secondly the very verbosity of the
comments led you to use a useless variable and a confusing
comment. You claim you wouldn't use such a variable name
in practice, but I only have what you wrote to judge by.
And in what you wrote you used a useless variable name and
a useless - confusing - comment.
The sample code I pointed you at is commented. But it is
commented in a completely different way and style. Plus I
can tell you how that style would change in a slightly
different circumstance.
I comment. Yes. I comment intent. I highlight key points.
I do not explain my algorithms. I do not
explain my variable names. I do not produce a
separate document for the human and the computer. And
above all I have a commenting style that is not
going to be a lot of work to maintain.
So if you need to change the code and make it do something
else, or if you need to understand the mechanics so you can
do something similar, then you will need to read the code.
Humans should be able to figure out how the code does
what it says it does.
Now you claim I misunderstood you and you never meant to
advocate play by play commentary on the mechanics. But
the examples that you gave were play by play commentary
on the mechanics. The statement you made (that I cannot
disagree strongly enough with) is that there is no such
thing as overcommenting. The formatting of comments that
you advocated poses a serious maintainance problem. You
claim that overcommenting only accounts for perhaps
0.1% of comment related problems out there.
I am sorry, but I cannot reconcile these two points of
view. What you are saying now does not fit what you
showed as an example of what you think people should do.
I have dealt with problems due to overcommenting, poorly
maintained comments, comments that only make sense if you
are in the original author's point of view, comments that
cover for poorly thought out code, comments whose formatting
makes code hard to tinker with...and most of these problems
would have either not been there or been lessened if the
original programmers understood that not all comments are
good.
In fact for every one of those items I either saw them
in your examples of how to comment, or I saw you doing
things that have lead to these errors in code that I have
worked with.
Now you claim that the style of commenting that I
demonstrated is what you were attempting to advocate.
Well it certainly isn't what you describe. So let
me explicitly explain the style that I follow:
- Every function that is part of the public API gets a
brief description of input, output, and action. If the
function is not documented, it is not public. If the
function name starts with an _, it is probably not even
(usefully) callable.
- The overall package gets a brief description of
what it does.
- If the package is somewhat private (ie for me and
people I work closely with), documentation goes inline
in front of the described function.
- If it is to be more generally used, documentation
goes into pod.
- Key conceptual points may be highlighted with comments.
- Function names describe what they do. If that means
long function names, so be it.
- Variable names describe what they do. For instance
flags should be questions which the current value is
the answer to. If the meaning of a variable is unclear,
then it needs a better name, not a comment.
- Variables are not documented unless they are
part of the public API. Documentation of variables is
to inform others what may be used, not to explain what
the variable does.
- The formatting and naming scheme is consistent.
- The formatting of comments is chosen to not cause any
issues upon maintainance of code. This generally means
being on separate lines from code, and means a minimum
of fancy formatting.
- The mechanics of the code are not documented
unless it uses a non-obvious API. In that case the
other API is commented about inline.
- Whenever reasonable, large functions are broken into a
series of smaller functions so that the body of the
function reads as a series of hopefully self-descriptive
function calls.
- Wherever I spot a significant opportunity for error,
I place an error check with an informative error message
in case things go wrong. The error message is intended
to serve as a useful comment both upon reading the
source and during any failure condition.
- Functions are grouped into modules. Within a module
functions appear in alphabetical order. (This is a
personal style settled on after finding that initial
attempts to divide by what the functions did inevitably
broke down when at different points in the project I
thought differently about the code.)
I describe this as being, as much as is feasible, the code
standing as its own comment. What cannot be done by
making the code clear is commented. Whatever can be done
by making the code clear, is.
A better way to put this might be that I do not comment on
the code. I comment on the usage of the code and I comment
on other code. I comment on desired changes, I comment
about what is critical, and what the external
dependencies are. But the code at hand should be
readable and should stand as its own comment.
Now with all that in mind, please go back and read my
example again and compare to this detailed summary. Does
my code not fit my description? Does my code need detailed
comments about how it works? Do I, despite that, comment
in a useful way while requiring a minimum of work to
maintain the comments as the code changes? | [reply] |
|
My last paragraph of what you were responding to said:
Also note that I don't actually write uncommented code. But my comments are about different aspects of the code than the mere mechanics of what each line is doing...
Stare at that carefully.
Then please do me the same favor. In my first post, I wrote quite clearly:
The "why not what" model of commenting is a really good idea
I had already expressed my agreement with the previously stated idea that comments should focus on areas other than what most people comment on. The reason I wrote what I did was that some people's advice "don't comment badly" was being expressed as "don't comment." I merely wanted to reassert the importance of comments for future developers.
Secondly the very verbosity of the comments led you to use a useless variable and a confusing comment. You claim you wouldn't use such a variable name in practice, but I only have what you wrote to judge by. And in what you wrote you used a useless variable name and a useless - confusing - comment.
Really? I also wrote this:
###########################################################
#
# Here are some comments on the following code.
# It does stuff.
# Its cool.
#
###########################################################
Do you suppose that is an accurate reflection of the type of comments I would put in real code? If you're going to use what I write to judge by, use all of it. I wasn't attempting to give examples of "real" comments and their content, because like I said that had already been expressed, and I agreed with it. I was trying to point out that A. comments are necessary and B. you can find ways to stick them in clearly and readably wherever they may be necessary.
Now you claim that the style of commenting that I demonstrated is what you were attempting to advocate
After reading your subsequent list of style points, I see that we are farther apart than I had thought from viewing one small piece of code. I would like to say that I still agree with the content that you chose to include in your comments. However, there were a few areas that I took issue with. Out of politeness and benefit of the doubt, I had assumed that these were merely oversights, but after reading your style-guide I can see that these were conscious decisions, that I would like to comment on (no pun intended).
If the function is not documented, it is not public
Are you saying that you never document a function that is not part of a public interface? If so, that is severely negligent. The purpose of comments is not solely, or even primarily, to describe the interface. Comments exist for the benefit of people who might need to maintain or modify the code. There is no reason to assume that because a function is not part of the public interface that it is magically immune to the need for update.
If the function name starts with an _, it is probably not even (usefully) callable
OOC, I've never seen this before, is it commonly used? If not, then how is that useful for other people?
Function names describe what they do
Really, so get_row and _get_row do what, the same thing?
Variable names describe what they do
Hmm, let's see. In one short example I find the following variables: $lookup, @data, @allowed, $q_sep, $match_sep, $expected, $piece, $req, $opt, $default, @res, $bad, and $sep. So in some 200 lines you managed to declare 13 variables that give no indication of what they do or why they were declared.
Now, I have to say that I more or less agree with the other points you mentioned. And I really would like to reiterate that I'm not trying to be a jerk, or to attack you personally (I say this because I know I have a very aggressive argument style, which doesn't reflect my emotional opinions of the person I'm arguing with--really!). What I'm trying to point out is that every programmer's own code always makes more sense to him than it will to anyone else. The variable and function names that make oh so perfect sense to you now mean nothing to the next guy who comes to read your code six months later.
And all of the points that you cautioned against are valid and accurate. It is important to make sure that comments are not so specific that they need to be updated continually with every minor code change. It is important that every comment should be written to add something useful to the reader's understanding. "There is no such thing as too many comments" was obvious (or so I thought) hyperbole, attempting to make the point that every programmer since first we did away with punch-cards has always thought he needed less comments than he did.
You are obviously a good programmer, and obviously a better one than me. However, that is part of my point. SE4's write the programs that SE1's get stuck trying to maintain. The threshold of what qualifies as "non-obvious" varies greatly. You seem to code under the impression that whatever makes perfect sense to you will make just as much sense to everyone else.
Some people drink from the fountain of knowledge, others just gargle. | [reply] [d/l] |
|
I unfortunately don't have the energy to give this thread
the full attention it clearly deserves. However I have
to say that there is a fundamental point I made. You
complain that it would be great if good programmers did
everything they do and also put in everything
that you would like to see. You looked at that code and
then you said I was a good programmer. I am trying to
describe what has made me a better programmer - and
the kind of things that can make for more good
programmers...
Unfortunately programmers are people. And as people
there are tradeoffs they have to make. I believe that the
tradeoffs I am making with my style are worthwhile. The
purpose of those guidelines is to make it possible
for me to code like I do, and possible to keep the
code clean after I am done with it and while I am
maintaining it. Yes, it takes work for others to figure
out what the code does. However it is my honest belief
that with those stylistic guidelines I can get more done,
and it will be less work for someone to pick up than if I
coded in a different style.
Besides which, given the choice I would prefer to work at
a higher level and bring co-workers up to that level
rather than bringing myself down to the lowest common
denominator. My decisions, while they may seem hard to
learn when you see them on the net, are actually designed
based in large part upon what I have and have not found
that I could teach.
Page by page you might find a different style easier. But
I am aiming to get more done in less pages and wind up with
better organized code which is more easily tested, more
easily proven correct, and more easily modified. I prove
that it is easily modified by constantly modifying my own
code. If I have trouble modifying and cleaning it up then
there is a problem. I don't write gems. I write production
code that I have to live with.
When you look at that code bear in mind that when I started
I had a general idea of what I wanted to do, and a
general idea what I wanted the API to look like. But
there were going to be details to be filled in. In fact
I added and removed functions, added, named, renamed, and
destroyed variables and parameters. In short that is code
that has been reconceptualized and rearranged before you
ever saw it.
While I am writing it I am constantly refactoring it. Yes,
it would be nice to have a beautiful collection of
documentation on how it works, how to think about it, and
what you want to look at to make changes. But if I tried
to provide that commentary then I would be stuck
with my first bad approach to everything because it is far
too much work to change the documentation on how it is
supposed to go. And if it is that much work for me to nail
down my assumptions now, what will change when someone with
a need makes the API a wrapper around a better module that
can handle a series of related formats? I have no clue,
but I know that change is not an unreasonable one to make,
and I would far prefer that more formats were supported in
that way than by cutting and pasting the code to 50 places.
What, you think that code is perfect and will only be built
on? Uh, uh. If I have a need instead of duplicating that
code elsewhere I am going to figure out how to generalize
the idea, scoop out the body into something more general,
and then leave a shell. And in that process the detailed
comments and variables are going to go..where? I can't
predict. I know I can't predict. I couldn't predict what
would change while I was writing and I sure as heck can't
predict my mind in 6 months.
So I am not bogging it down with clutter that will limit
my options. If in context a variable is unclear then I
will give it a better name. But if its scope is small
and it is clear within the scope, then I won't. Let's
take the 13 variables that I had that you didn't like:
- $lookup Defined on line 24, scope lasts to
line 39. (I am only counting lines of code in scope,
the closing brace is actually line 40. Big whoop.) Only
used to line 32. You are right that it is poorly named.
It is a reference to a hash and should be named $field_pos
after the property of the object it is pulled from.
But in fact I wrote extract before I had actually
written bind_fields and when I settled on a name for the
fields I already was using it and didn't rename it.
- @data Defined on line 26. Used to end of scope
on line 39. This appears in a function called extract()
that is documented to extract data out of the row for
specified fields. This array holds that data. Longer
names could be chosen but I don't find this one to be
inappropriate to its scope.
- @allowed Defined on line 32. Scope closes on
line 35. Considering that the second and final time it
appears it is part of this string: "Valid fields are:
(@allowed)\n" I find its meaning abundantly clear
and highly doubt any good Perl programmer who understood
English would have trouble...
- $q_sep Defined on line 61. Used to line 63.
Would you prefer to see quotemetaed_sep as a name
instead?
- $match_sep Defined on line 62, used on line 74.
Through the whole program there is a field called sep,
that appears in the documentation which is the one
character separator between fields. That is a compiled
RE that will, wonder of wonders actually match whatever
sep currently is! The second place it appears may be
read, "Try to match sep, and if you fail then...". If you
have dealt with much of my code that will not appear to be
a strange name.
- $expected Defined on line 79. Used on line 80.
Those 2 lines generate and spit out a warning message.
That variables holds (duh) what I expected to see. Given
the context that should not be a problem to understand.
Let me say more about this.
The message would be a fatal error and would read something
like: Expected ',' at foo.csv, line 25, char 23 which
would be followed by a complete stack backtrace. If you
opened foo.csv, went to line 25, and went to char 23 you
would find, wonder of wonders, that a quoted fields had just
ended on character 22, but you didn't have the expected
divider right after that.
Plus if you practice reading that style you will see that it
is a pretty good comment. If on line 74 you fail to match
$match_sep, and on line 75 you find you were not at the end
of the line, then you failed to match $sep. Well then guess
what $match_sep is supposed to match??
- $piece On line 88, in a function that is supposed
to match a quoted field, I define $piece and then append
stuff to it and return it. Would you guess that, just
perhaps, $piece is the piece of the quoted field that we
have discovered so far? (It is only a piece because we
could go over a line break and need to fetch a new line,
which is what we are promising that we will do that is not
done by Text::CSV.)
- $req The comment says to see node 43323. That
would be Handling named function arguments. $req is an anonymous array of
required fields. See also the comment on line 110.
- $opt An anonymous array of optional fields.
See the same places as for $req.
- $default An anonymous hash of defaults for
fields. See the same places as for $req.
- @res The resulting values, in order.
- $sep The one character separator. As
documented on line 235.
Now you claim these give no indication of what they do or why
they were declared? I think we will need external opinions
here because I emphatically disagree. If I am building
an error message and I put what I expected to see in a
variable named $expected, well reading that a day later, a
week later, a month later, or a year later I would expect to
see that variable contain information on what I expected
to see! Likewise if I document a parameter to a function
that I call "sep", and I have a function called "set_sep"
which is also documented, and I have a field in the object
which is likewise called "sep", then probably pretty much
anything called "sep" is going to refer to whatever that
documentation says.
If that isn't reasonable, what is?
Likewise I claim that my function names are supposed to
indicate what things do, and leading underscores often
indicate that they cannot be usefully called externally.
You claim to have not heard of the underscore rule and ask
if it is widely used. FYI it is widely used in the Perl
community. For the first instance I turned up in the Perl
documentation see perltoot and look for "underscore".
The first module I found it in was LWP::UserAgent,
which calls _need_proxy and _elem, defines the first itself
and the other is from a parent module called
LWP::MemberMixin.
And about get_row vs _get_row, you cynically ask if they do
the same thing. Well what would you guess that they should
do given the names? How about get a row? Astoundingly that
is exactly what both of them do!
So why the difference? Why is one public and one private?
Well _get_row does not manage the objects state as desired,
and depends on a variety of initializations having happened.
Therefore _get_row will get a row but you can't call it
directly. By contrast get_row gets a row but does not
depend on actions having taken place.
If would be natural to guess about now that get_row is
probably just implemented as a wrapper around _get_row that
sets things up so that _get_row can be called and will work
right. Amazingly enough this is how it actually works.
Well you may wonder, why have a separate function? Well
there are two reasons.
The first is that you do have this reasonably clear
separation between the necessary initializations and the
work of parsing a row from the data. When I spot such
separations I like to create new functions.
The second is that, based on past experience in writing
little parse engines, I know full well that it is a
common and frustrating error to have accidental exit
points in the parse loop that you didn't anticipate.
Therefore I test that. There are two ways to test it.
One is to introduce a flag. The other is to move the
logic into a function from which you exit in the middle,
and falling out of the loop is a fatal error. I find the
second easier to understand...
This is stuff I have learned programming. There is more
of it there - a lot more. Can I really be expected to
document all of that? How much do you want me to say?
I can probably fill a book, but no matter what I say I
won't have covered all of the likely questions that a
maintainance programmer would be likely to come up with.
I would succeed in making it hard for me to fix and
modify the ossified corpse of the code...
Now if you want to see a longer code example, take a look
at Math::Fleximal. I wrote that today. It is conceptually
more complex. Now the interesting thing about it is that
I have never before tried to write anything like it. I
just started knowing how to do arithmetic and tried to
make it all work. I won't lie, my initial conception is
not what I ended up with. The API is kind of rough, but
not as rough as what I started coding to. The first time
I went to write times I still thought that I was going to
intertwine base conversions with multiplication. That
didn't work, and my attempt wound up as part of set_value.
Likewise I vaguely thought that dup and new were going to
be the same thing. Nope. Not even. And I never dreamed
how many times the phrase (shift)->dup() would appear.
That wasn't planned, it just happened.
Now that is a rough draft. If you want to complain, go
ahead. There are a lot of things that need fixing. I
know that. I can probably produce a longer list of
fixes needed than you can...
| [reply] |
|
|