On comments

by pemungkah (Priest)
on Dec 19, 2010 at 23:16 UTC

Someone commented recently that comments != documentation.

I'd like to point out that there are actually three kinds of documentation in every Perl program, each of which serves different purposes.

In essence, all programs are communicating simultaneously with three audiences:

  1. The computer, which only cares that it receives an unambiguous set of instructions in some manner: whether that's elegantly-documented programming language translated by a compiler or interpreter or binary files makes no difference to the computer. For our purposes, we're using Perl, and beyond that the computer only cares that we have a Perl interpreter; it makes no judgements beyond "this program can read that one and tell me what to do" (anthropomorphizing wildly).
  2. The user of the code, who needs to know what the program does (or is supposed to do), how to make it work, and what the program can say about errors it knows how to detect. I prefer to use POD for this.
  3. Programmers: both the original author and anyone else who'll be reading the code to try to understand what was intended and how that was intended to be done. This is the point where there tends to be a lot of argument.
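
For the second audience, here is a minimal sketch of what POD next to the code might look like. The module name `Temperature::Convert` and the function `c_to_f` are hypothetical, made up purely for illustration; the point is only that the POD tells the user what the code does, how to call it, and which error it knows how to report.

```perl
package Temperature::Convert;    # hypothetical module name, for illustration only
use strict;
use warnings;

=head1 NAME

Temperature::Convert - convert Celsius readings to Fahrenheit

=head1 SYNOPSIS

    my $f = Temperature::Convert::c_to_f(100);    # 212

=head1 DIAGNOSTICS

=over 4

=item C<Temperature below absolute zero>

Thrown when the argument is less than -273.15.

=back

=cut

# Convert a Celsius value to Fahrenheit, rejecting impossible inputs.
sub c_to_f {
    my ($celsius) = @_;
    die "Temperature below absolute zero\n" if $celsius < -273.15;
    return $celsius * 9 / 5 + 32;
}

1;
```

Running `perldoc` on a file like this shows only the POD, which is what the user needs; the programmer reading the source sees both.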

One thing that programmers tend to lose sight of at times is that what they do for a living is write things that can be easily read by others and are descriptions of what they intend the machine to do. Computer languages do not confer higher efficiency to what the computer does, but are used to communicate in a pidgin that humans can understand, and that the computer can turn into runnable sets of instructions which it can execute.

The code should make the implementation of your algorithm as clear as it possibly can on its own. If for some reason (efficiency, compactness of expression, or another important reason) the code is not going to be easily read by the intended audience, then it's necessary to add the third kind of documentation, comments, to do a good job. If your intended audience is very good Perl programmers, then maybe you don't need comments.

If it's meant to be read and understood by a different audience, then you need to make it so that it is understandable to them.

All modern code is documentation of the work to be accomplished; it's simply that we have a processor that makes a portion of that documentation executable by the computer. No one faults Knuth for writing dense mathematical analysis of algorithms; if you're reading his books, that at least in part is why you are reading them. They are, however, not meant for casual programmers or beginners with little math background; those folks would be better served with a different book - which would communicate different things, yes, but which would make sense to its target audience.

"WHALE SINKS BOAT, EVERYONE DEAD" vs. Moby Dick. Yes, they communicate the same information, but the novel transmits more useful peripheral information. Comments do the same, and sometimes I find that the peripheral information - why did I decide to do this? why did I want to do this? - can be just as important as what I actually decided to do.

Depriving yourself of this important avenue of communication to both others and yourself because you think "comments are useless" is misguided.

The argument I most often hear against using comments at all is "but the comments get out of sync with the code!" - they don't "get" out of sync. Comments are part of the code. If you are responsible for maintaining the code then you are responsible for maintaining the comments as well. I completely agree that bad or misleading comments are useless. It's simply that it's your job to fix them.

  • If the code does not match the comments, fix the comments -- assuming the code is passing its tests.
  • If the code is not working, the comments may be one of the few things that tell you anything about what was intended.
  • If you make a set of comments obsolete, fix them - or if you're sure that the changes stand alone, then delete them - but only as a last resort.
  • If the comments contain anything at all about the intent of the code, you should try to correct and preserve them; a comment is a signpost of possible complexity. Unless you've mitigated that complexity in a completely transparent way for all the intended readers, keep the comments.

Replies are listed 'Best First'.
Re: On comments
by mr_mischief (Monsignor) on Dec 20, 2010 at 04:21 UTC

    There's been a lot of comparing software development to building erection, machine shops, and other physical construction types of tasks lately. Let me draw some parallels, then, to certain products for those tasks. A furnace, water heater, circular saw, drill press, or welder has more than one type of documentation. Those are located in different places for different audiences and for different purposes. There's the user's manual. That tells people how to use the device day-to-day and where to get optional attachments and expendable replacement parts. Then there's the maintenance manual that covers how to fix broken devices and where to get replacements for the permanent parts of the device the end user shouldn't be bothered to replace. Then there are warning stickers near places where electricity, hot surfaces, moving parts, lasers, or sharp edges cause particular dangers. These are for anyone: the end-user, the maintenance worker or repair shop, or a curious visitor. They draw attention to certain areas to avoid and warn people to be careful around certain parts no matter what their level of intimacy with the device.

    A big part of the issue here is that the user and the maintenance programmer can often find the same type of documentation about the program, even though it is for different audiences. That type is prose about how to use features of the program to do things which is found separately from the code. The only difference between this type for the end user and for the programmer is which features are covered. Think of these as the user manual and the maintenance manual for your plumbing and heating fixtures or power tools. For the end user, features of the user interface and how to operate the program are covered. For the maintenance programmer, features of the programmer interface (APIs, data formats, extension mechanisms, callback hooks, and more) and how the parts work together are the topics. Those are definitely features, and they can form what is clearly an interface. Many programmers, especially for a large and complex program, want this sort of documentation every bit as much as the end user does. They want the developer interfaces covered with as much care and detail as the end-user interface.

    Comments can be documentation, too. They should not be the sort of documentation mentioned above, though. A comment placed locally in a particular section of code should be used only to convey information germane to that local section of code. If there's for example a line that seems superfluous but works around a serious security risk and therefore shouldn't be removed for the sake of speed or efficiency*, it is my opinion there should be a comment about that line right there. If there's some reason a section of code looks crufty because it's working around a bug in a lower layer of software, you probably want to document that right in the code with a comment, too. Think of these comments as "attention" or "warning" labels near the hot, sharp, blinding, or crushing parts of the device.
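
    As a rough sketch of such a "warning label" in Perl (the helper name `build_listing_command` is made up for illustration, and the scenario is an assumption, not taken from any real code base):

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: returns the command, in list form, that lists a directory.
    sub build_listing_command {
        my ($dir) = @_;

        # WARNING: do not "simplify" this into a single string like "ls -l $dir".
        # Passed to system() as a list, the command bypasses the shell, so
        # metacharacters in $dir (for example "; rm -rf ~") reach ls as a
        # literal argument instead of being executed. The "--" stops a name
        # that begins with "-" from being read as an option.
        return ('ls', '-l', '--', $dir);
    }

    # Typical use: system(build_listing_command($user_supplied_dir));
    ```

    Without the label, the "--" and the list form both look like noise that a tidy-minded maintainer might remove.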

    The comments really shouldn't be the place you document everything a programmer needs to know about APIs and data formats. The POD or external documentation in some other format really shouldn't be the only place an important security, speed, or bugfix hack in the code is mentioned. You might want to mention it there as well, but comments immediately above or to the side of the code the next programmer might try to edit is the safest place to put such "warning label" documentation.

    * This actually happened in part of the Linux kernel somewhat recently. A line of assembly code for the AMD and Intel x86_64 processors (which zero-filled a register) was removed, which led to a local root exploit. To someone without proper knowledge, it looked like a wasted CPU operation. The same bug had been fixed in 2007 (CVE-2007-4573) in Linux and was reintroduced at some point. It was rediscovered as a vulnerability in 2010 (CVE-2010-3301). Had there been a clear enough comment close enough to that line in the code, perhaps the vulnerability would have stayed closed.

      Excellent thoughts.

      Another, often-overlooked source of “documentation” is the change-history in the version control system, and any defect-reporting system records that you may have.   (And you do have them and you do use them, don't you?...   Ahh, yes.   Of course you do.   Of course.)

      When you’re fixing a bug that was reported in the defect tracking system, record the defect-number in the comment.   When you check-in the change, update the defect record to show it.   This lets you perceive exactly what has changed in a module over time, and why.

      This is a “dynamic” form of information, analogous to the “running log” that was routinely kept in the engine-room, or the “captain’s log” (“Stardate 4523.3 ...”) that was kept on the bridge.

      Bonus Question:   what happened on that stardate?   Extra Vulcan-points if you don’t have to Google it...

        As it happens, in addition to being a Perl geek, I'm a sailboat captain (currently heading for the Bahamas along the US East Coast, chasing the warm weather.) One of the first things that crew aboard my boat learn is that any actions other than the obvious and expected ones must be logged - and anyone either coming on watch or about to do any kind of work must review the log. In addition, anything requiring general notice (e.g., work in process) is specially flagged - e.g., if someone's down in the engine compartment and changing the alternator belt, then not only is the starting battery switched out of the circuit, but the starting key is taken out of its slot and secured to its hook with a red Velcro strap. Otherwise, there's nothing to stop someone from thinking "oh, they forgot to flip the battery switch to 'Start'!" and flipping it on, then cranking the engine.

        Shipboard communication, much like the language of science, has been developed over a period of centuries and for much the same reason (plus the extra bit of awareness that you're likely to maim or kill someone, or yourself, if you get it wrong.) That's why stories like that kernel bug being restored make me itch; that's precisely the kind of miscommunication that shipboard procedures are intended to prevent.

        "Language shapes the way we think, and determines what we can think about."
        -- B. L. Whorf
Re: On comments (Commenting and Documentation References)
by eyepopslikeamosquito (Archbishop) on Dec 20, 2010 at 01:49 UTC
Re: On comments
by sundialsvc4 (Abbot) on Dec 20, 2010 at 02:39 UTC

    I have made quite a bit of money, over the past fifteen(!) years, from roughly 350,000+/- lines of (non-Perl...) source code written by “a complete stranger.”


    Quite a bit of that code is (if I may say so myself...) brilliant.   ;-)

    But, today, I don’t remember a single word of it.   Any time I have to re-approach any of it, I have to learn it all over again.   (Because, if I don’t, I know I’m gonna <!> it up... as would, of course, be true, no matter who originally wrote it.)

    And so, this is why I am very grateful to “the previous programmer” for the careful attention that he placed upon his work at the time.   This is what makes it ever-so-much easier for me, today, to decipher what he did ... and what he was thinking when he did it.   (Because today, sincere truth be told, I have no idea!!)

    It makes not the slightest bit of difference, after all, that “the previous programmer” was, in fact, yours truly.   (That was, in fact, fifteen years ago...)

    Most of us, in this business, have the luxury of blaming our sorrows upon “the previous programmer.”   But my personal experience over the past fifteen years quietly confirms just how much information is lost forever unless “the previous programmer” took care to pay very-careful attention at the time . . .

      I totally agree!

      In the past many people were praising my "good" documentation¹.

      The truth is that I regularly walk to another room just to stand there asking myself what I actually wanted.

      Commenting is pure egoism: this way I free up resources normally bound to memorizing for more creative processes.

      Cheers Rolf

      1) actually I suspect most were just counting the words instead of reading them...

Re: On comments
by BrowserUk (Patriarch) on Dec 19, 2010 at 23:52 UTC

    (Most) comments are worse than useless. They are a drag on both the original and future programmers' time and effort.

    Show me some of your comments + the associated code, and I'll re-write the code to make the comments superfluous(*).

    (*)It doesn't always work. Some comments are useful. But rarely.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      I both agree and disagree. I notice that the more experienced a programmer is, the more useful the comments in the code are.

      A good programmer codes in such a way that how the current code works doesn't need any comments, nor do the variable names or their purpose(s).

      What does need comments is the reason why the current code uses a specific algorithm or - maybe twice as important - why a certain other, maybe more obvious, algorithm was not used.

      Those are the most valuable types of comments in code IMHO. Even if the original reason turns out not to be true after 15 years of hardware development or different compiler optimizations.
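
      A small sketch of what I mean, with an invented rule table and an invented sub name (`decision_for`); the comment says nothing about how the loop works and everything about why the obvious alternative was rejected:

      ```perl
      use strict;
      use warnings;

      # Hypothetical rule table: the first matching prefix wins.
      my @rules = (
          [ '/admin/public' => 'allow' ],
          [ '/admin'        => 'deny'  ],
          [ '/'             => 'allow' ],
      );

      sub decision_for {
          my ($path) = @_;

          # Why a linear scan over an array rather than the more obvious hash
          # lookup: rule order is significant (first match wins) and a hash
          # would discard it. The table is assumed to stay small, so the O(n)
          # scan is not a concern.
          for my $rule (@rules) {
              my ($prefix, $verdict) = @$rule;
              return $verdict if index($path, $prefix) == 0;
          }
          return 'deny';    # no rule matched
      }
      ```

      If the table ever grows large, the stated assumption is exactly what the next maintainer needs in order to decide whether the choice still holds.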

      Enjoy, Have FUN! H.Merijn

        If you look above, I was very careful to say "(Most)"; and "Some comments are useful.".

        My contention is that they are far rarer than most people believe. Please also see Programming *is* much more than "just writing code".

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

      I disagree with you here.

      Comments have one big advantage over the code: they are written to be read by humans. And they are usually on a higher level of abstraction.

      I sometimes have to read through thousands of lines of code in search of an error. And that is usually code I have never seen before, but where I only have a rough idea what should happen. So I pass through the code, and focus on reading the comments - till I find possible locations of the error. Only then do I look at the code of these locations in detail. Typically, I then compare the code of the "probable error location" with older versions of the same file - and also focus on the comments first - to get the idea why something was changed.

      The approach above works well most of the time.

      Of course, that depends on the quality of the comments. They need to give you an idea what the coder wanted to do. There are many types of comments that are useless ... possibly you, BrowserUk, are referring to things like

      if ($i > 0) # check if $i is greater than zero
      which I agree to be harmful: they waste time and give no benefit - and even worse(!): they create the mentality of "comments are useless" in the mind of some developers. I hate reading their code ;-/

      Sometimes code and comments are not aligned or even contradicting. In my experience, those places are more likely to contain errors than others: it is a sign that a developer made some change in a hurry - and if he had no time for adding a comment, he probably hadn't time to think deeply enough about the side effects of the change (and most probably the code was not reviewed either ...). So even outdated comments are (in a way) helpful for finding errors...

      All the best, Rata (who thinks comments are a really useful "tool")

        I disagree with you here.

        Of course, you, and the OP, have every right to your opinion and conclusions, just as I do. I've had this debate many times down the years and no one has yet convinced me to change my mind.

        I laid out my reasoning here in a meditation: Programming *is* much more than "just writing code"., so I won't repeat it in this thread.

        I have on occasion succeeded in changing the opinions of others. Whilst those occasions have been rare, there is a common theme to those occasions of my success. And that is the challenge I made above.

        Every pro-comments advocate I ever encountered has agreed that there are "good comments" and "bad comments". But what they will rarely ever do is back up their conceptual support for good comments with real life examples that they are prepared to back.

        On those rare occasions when I have persuaded people to offer up (real life) examples for discussion, per my challenge to the OP, they have often been persuaded by my re-writes of the examples that the comments are redundant. But the examples do have to be real-life, not contrived.

        As I said above, it doesn't always work. We can all come up with a (usually contrived) example of a comment that cannot be better handled with code.

        Personally, I think that people become enamoured with the idea of "good comments", but few if any ever live up to their own ideals of what is actually a "good comment". Hence my challenge.

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

      I have to confess the most frequent comment I use is:

      #Note2Ray: FIX ME!



        Yes. I too have left a bunch of similar annotations in source code scattered around Europe down the years. More in hope than expectation, like forlorn messages in bottles. My wife says I'm unfixable.

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
      Please head on over to, and fix that up so it doesn't need comments. I'll wait. :)

      That was a big project - about a year - and all I did was comment the code. Why? Because nobody other than a very few people understood the debugger. Reading that code with no comments is sort of like hearing half a phone conversation, as the debugger is really just the Perl side of the execution-time hooks that make debugging possible. The comments clarify the things that are happening elsewhere that are important to the debugger code that's just been called from someplace else. That contextual info is important - and it can't be in the code; it has to be in the comments. Yes, you could switch back and forth between the debugger source and the Perl source, but you'd have to understand the Perl core pretty well. With comments, you can just read and understand in much less time with much less effort.

      Compare the 5.004 debugger to the 5.8.8 one to see the difference between the uncommented source and the commented. I think it was worth doing, as (among other things) Devel::ebug and the beginnings of the debugger tests in the Perl core tests came out of that. I firmly believe it was because the stuff you should "just get from the code" was finally laid out, explained, and made possible to follow.

      I also believe that nearly anyone who's basically familiar with Perl can now read the debugger source and actually get something out of it other than a headache, even if it's a resolution to never ever write code like that.

      Comments, as I mentioned, are not meant to tell you what the code can tell you itself; they are meant to tell you what decisions went into putting that code there in the first place, and the context in which these decisions made sense. As in the kernel bug noted, the reason for the decision is sometimes more important and considerably more complex than the code itself is. When a few characters make the difference between a root exploit and secure code, put in the comment!

      Writing comments is writing; if the comments are good, then they communicate an arc of logic from start to finish. The combination of the comments and the code should tell you more than either could alone. If they don't, then the comments are wasted space - but should be condensed, corrected, and made to communicate, not simply discarded!

      The comments and code conflict? There are several possible reasons, but one important possibility is that there really was a disconnect between the desire and the implementation, and that the code is actually wrong. It may be functioning and passing its tests, but it may nonetheless not be what was actually intended. It may be very hard to find out what was really wanted - and the code and comments don't help - but the fact that they conflict communicates something important that the code alone would not and cannot. If you simply strip the comments and then look at the code, you've thrown away data. Never a good idea when you're looking at code you don't know well (meaning that you know what was wanted, what it's doing, and why).

      I honestly find an approach that says "never" comment or "always" delete comments suspect; it's as if the programmer had decided that they didn't like else clauses. Sure, you can write code that's functionally equivalent without them, but you've made a choice that deliberately makes the code's primary job - communicating to the next programmer - more difficult, and I think that's a bad choice.

      Comments are an integral part of the language, and if you've decided never to use them, you really should think about that, and be sure it's not (for instance) compensating for feelings of insecurity. "I'm so good I don't need comments! Really I am!"

      I prefer to know that I need comments some places, and I make sure I put them in at those places. I don't think I'm not as good because I comment; I think I'm more considerate and better cognizant of my abilities by doing so. Maybe the next programmer really will be so good he or she doesn't need comments. Let's hope the one after the decommenter is as good as the decommenter thinks he or she is.

      Know your audience, and be honest and pragmatic. Not every programmer is going to be as brilliant as every other. You're not going to be as brilliant every day. Make sure the average programmer (and you) will be able to read the code with the comments and get as much out of it as the brilliant one can without the comments. This is eminently possible, and worth it, in my opinion.

        Sorry, but bad comments do not make up for bad code.

        All of this:

        Can be replaced by:

        {
            local( $trace, $single, $^D );
            ($evalarg) = $evalarg =~ /(.*)/s;    # Untaint
            # $usercontext built in DB::DB
            @res = eval "$usercontext $evalarg;\n";    # '\n' for nice recursive debug
        }

        Not only is it shorter and clearer, it is safer. As tchrist points out in the above code:

        # 'my' would make it visible from user code
        #    but so does local! --tchrist

        You've gone to great verbosity to explain how the variables: $otrace, $osingle, $od, are used to prevent the user code from messing with the debugger's internal state. Completely missing the fact that by storing copies in those localised globals, it is just as likely that the user code will mess with those variables as the originals. And if they do, the code will be restoring the messed with values over the top of the untouched original values.

        Whereas if you simply localise the original globals, they'll be restored when the block exits regardless of what the user does. And the need for all the code-concealing and confusing comments just goes away.
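
        A minimal sketch of the difference, using made-up names: `$trace` and `$otrace` here are stand-in globals, and `user_code` simulates careless or hostile user code. None of this is the debugger's actual source.

        ```perl
        use strict;
        use warnings;

        our $trace = 1;    # stands in for one of the debugger's package globals
        our $otrace;       # stands in for the "saved copy" global

        # Simulated user code that touches every global it can see.
        sub user_code { $trace = 'clobbered'; $otrace = 'tampered' }

        sub with_local {
            local $trace;    # Perl itself holds the old value, out of reach
            user_code();     # can only change the temporary value
            return;          # leaving the sub restores the saved value
        }

        sub with_saved_copy {
            local $otrace = $trace;    # copy sits in a global user code can see
            user_code();
            $trace = $otrace;          # "restores" whatever was left in the copy
            return;
        }

        with_local();
        print "after local: $trace\n";         # prints "after local: 1"

        with_saved_copy();
        print "after saved copy: $trace\n";    # prints "after saved copy: tampered"
        ```

        The first version needs no explanation beyond knowing what local does; the second needs a paragraph of comments and is still wrong.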

        I can see from what you've done to that we'll never agree. That's fine. Some people like marmite, some don't.

        But, the last four paras of your post above show me that you've either not read Programming *is* much more than "just writing code"., or you have and still think you can justify your position by concluding that those with the opposed viewpoint are either too lazy or too arrogant to comment. And that is just plain not the case.

        I rarely comment because I've found over the years that the vast majority of comments tell me nothing more than the code does. But worse, as above, the comments attempt to persuade me of stuff that is just flat out wrong.

        Even your point that "Not every programmer is going to be as brilliant as every other." totally misreads that reasoning. For example, any programmer unfamiliar with local--which is a large portion of modern Perl programmers--will, from reading your comments above, get entirely the wrong idea about what it does and how it works. And will go away with entirely the wrong impression because of those bad comments.

        If however, there were no comments, they would have had to have gone to the documentation, and read up about the local keyword and learnt what it really does.

        Like I said. Bad comments are worse than no comments. And most comments are bad. Even those by people who think they write good ones.

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: On comments
by ambrus (Abbot) on Dec 22, 2010 at 20:57 UTC

    Ok, so if everyone agrees that the

    movq $5, %rax ! set value of rax register to 5
    style comments are unnecessary, then why do people recommend YAPE::Regex::Explain so much?

      A part of it: It's a dynamic/current, programmatic explanation. It isn't lying to you, misdirecting you, or offering an opinion. It's also mostly recommended to novices. A verbose (and perfectly accurate) comment on every line of code would be useful for a beginner.

Re: On comments
by hsmyers (Canon) on Dec 24, 2010 at 04:46 UTC
    Much of the last 10 or 15 years I've been brought in as a hired gun to 'maintain' (read fix) existing code. I follow a standard procedure. First I strip all of the comments. Then I reformat the code. I do the first step because I don't care what someone else thinks the code is going to do or should do--- all I care about is what the code does do. I make the second change purely for comfort. When I use one or the other of the various code formatters, I match my own standard which increases the code's clarity at least for me. With these two tasks accomplished I have at least a chance to succeed!


    "Never try to teach a pig to sing...it wastes your time and it annoys the pig."
      I recently wrote a few modules for our company where the ratio of comments vs code was about 5:1. And the code wasn't complicated. Mostly finding some maximums and value switches in sequences, followed by changing some numbers.

      Still, I consider the comments to be very important. Sure, it's easy to see from the code that some numbers at the end of the sequence are changed - but without the comments, you will have no idea why it happens.

      If the code is changing ownership to you, these are fine steps, though I wonder what good it would do you to remove comments that explain the reasoning behind a choice of construct or flow.

      If however the code is maintained by you just this time but is expected to be maintained by someone else in the near future, changing the style/layout would - for me - be a reason to fire you. Companies have (or at least should have) a company standard/policy. You - as a maintainer - should follow that strictly.

      Same for open source. Depending on how strict or how loose rules are set up, you should follow them. For me this is a reason to almost never contribute to GNU projects, as they want patches/code-chunks in their style, and as I think their style sucks, I don't write code for them.

      Patches that people give me for my projects in their own style are reviewed and reformatted to match my style, because these are my projects. People not complying with my style is my major reason not to hand out commit bits.

      Enjoy, Have FUN! H.Merijn
        Since all of this happens on a copy of the original I fail to understand anything of your justification to 'fire' me. Did you fail to note the word 'fix' in parens? This means it doesn't work. If it doesn't work, comments are at best suspect and often completely misleading. Again, working on a copy allows me to format the code in any way, shape, or form I desire. I don't work for these companies, I'm brought in from the outside to put out a specific fire. Those things that help do that are good. Now at some point the error or errors will be found. That information, not quite a patch but close, will be turned over to those in charge of the code in question. So as I walk out the door, the only thing that will have changed will be that which didn't work. Note that if it needs comments as negotiated with those I turn the code over to, then I will be pleased to comment according to any standard they happen to choose. I've nothing against comments, I just don't want them in my way...


        "Never try to teach a pig to sing...it wastes your time and it annoys the pig."