Commenting Showing Intent [CSI]
Version: 1.2.1
Last Updated: 2019-08-26
The CSI (Commenting Showing Intent) Commenting Standard refers to a style of code commenting which allows for the complete rewriting of a program in any language, given only the comments. It is also a backhanded reference to Crime Scene Investigation: putting the clues together to recreate the whole event.
Purpose and Theory
In many languages, it is considered “bad practice” to not comment code, yet so many programmers fail to do it. Many more write comments to themselves, but these comments are often cryptic at best. Still others write comments that restate the functionality of the code.
Because of these problems, several practices exist which condemn comments in most, or all, situations. This leads to code which becomes inexorably separated from its specification.
The CSI Commenting Standard offers a non-language-specific standard for writing comments.
Note
Intent-Commenting is discussed in detail in an article by Lead Developer Jason C. McDonald entitled “To Comment Or Not To Comment”.
Advantages
We’re asking you to type twice, if not three times, as much as you do now. What’s the advantage to that?
1. CSI comments become a living specification, wherein the expected behavior of the program and the actual functionality live in close proximity to one another, and can be more easily kept in sync and up-to-date.
2. When you come back to your code after a long time away, you will be able to find your footing much faster.
3. When other developers (or non-programmers) read your code or API documentation, they will be able to understand it more thoroughly and much more quickly. This is vital to efficient code reviews and third-party debugging.
4. You significantly reduce the entry learning curve for new contributors. Because learning the code is much easier, individuals are able to start making meaningful contributions more quickly.
5. You and your code reviewers will be able to desk-check, debug, and track complicated logical thought patterns much more quickly. Mismatches between intent and actual behavior become evident.
6. When you are dealing with a complex logical process that you are trying to write code for, you can start by writing your entire function in CSI-compliant comments. Then, using these comments as a sort of pseudocode, you can write the code under each comment.
7. You can translate your program into another computer language much more quickly. Again, by using the CSI-compliant comments, you can rewrite each line of code to work in the target language.
8. The code becomes collaterally useful for demonstrating the language syntax and features themselves. Armed with an understanding of the goal, the less experienced developer can more quickly surmise how the code works. (See #4.)
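As a rough illustration of advantage 6, the sketch below shows a function whose comments were written first as an outline, with the code filled in under each comment afterward. It is in Python, and the function `average_score` and its rule about skipping missing scores are hypothetical, invented only for this example.

```python
def average_score(scores):
    # Players who never finished a round have no score recorded (None).
    # Leave them out, so they don't drag the average down.
    recorded = [score for score in scores if score is not None]

    # If nobody finished a round, there is nothing meaningful to report.
    if not recorded:
        return None

    # Otherwise, report the typical score among those who played.
    return sum(recorded) / len(recorded)
```

Read without the code, the three comments still describe the whole function, which is the property the standard is after.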
CSI vs. Self-Commenting Code
“Self-Commenting Code” is a practice wherein a code’s functionality is self-evident. Through naming, structure, and various other techniques, the immediate purpose becomes obvious to the reader. This is beneficial in any language.
However, “Self-Commenting Code” is seldom capable of expressing the entire intent, or “why”, of the code.
It is nearly impossible to express the intended behavior of the code; only the actual behavior is evident, which can conceal logic errors.
Code cannot imply the reason the current approach was taken over another.
Code can seldom self-express its purpose in its larger context. Even attempting to do so can lead to impractically long function and class names.
The CSI Standard should exist alongside Self-Commenting Code practices, not instead of them.
| | CSI | Self-Commenting |
|---|---|---|
| Topic | Intended behavior. | Actual behavior. |
| Question | WHY did we write this code? | WHAT does the code do? |
| Expresses | Language-agnostic specification. | Language-specific functionality. |
CSI vs. Documentation
The CSI Commenting Standard should not be confused with documentation.
| | CSI | Documentation |
|---|---|---|
| Audience | Maintainers and Developers | End-Users and End-Developers |
| Topic | Intent and design of code. | How to use the software/API. |
| Question | WHY did we write this code? | WHAT does the code do and HOW do we use it? |
However, these standards may be merged with API documentation standards, to help produce better autodocs. The important distinction is that CSI comments state WHY, and doc comments state WHAT and HOW.
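To make the WHY versus WHAT/HOW split concrete, here is a minimal Python sketch in which a docstring carries the documentation and line comments carry the intent. The function `shipping_cost`, its fee values, and the billing rule are all hypothetical, chosen only to give the comments something to explain.

```python
import math


def shipping_cost(weight_kg):
    """Return the shipping cost in dollars for a parcel.

    Pass the parcel weight in kilograms. (WHAT and HOW, for API users.)
    """
    # Couriers bill by the whole kilogram, so a 1.2 kg parcel costs the
    # same as a 2 kg one. Round up, so that we never undercharge. (WHY)
    billable_kg = math.ceil(weight_kg)

    # The flat fee covers packaging; the rest scales with billable weight.
    return 4.00 + 1.50 * billable_kg
```

An autodoc tool would pick up only the docstring, while a maintainer reading the source sees both layers side by side.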
Keeping Up-To-Date
Common arguments against comments are that “comments become outdated too quickly”, and that “maintaining comments takes extra work”. However, proper application of the CSI Commenting Standard avoids both of these problems.
When developers are in the habit of using this standard, the first step in modifying code is to update the intent-comment.
Developers spend considerably less time trying to recreate the intent of the previous developer(s), including their past selves. Only a portion of this time must be used to update the comments.
Format
ALGOL-Based Languages
For readability, CSI comments should use single-line comments only for single-line statements. If more than one line is required, multi-line comments should be used (if available in the language).
It is recommended that each line in a multi-line comment start with an aligned asterisk, as this improves the readability of the comment.
    /* This is a multi-line
     * comment with the
     * preceding asterisk.
     */
In any language, we strongly recommend leaving an extra space between the comment token and the comment text, to aid in readability.
Python
CSI comments should not be confused with docstrings (see CSI vs. Documentation). Line comments should be used for CSI. Placing the comment above the code in question is recommended. Inline comments are prone to causing an overrun of PEP 8’s line length limits.
    # This is a CSI comment, describing intent.
    doSomething()
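The standard does not spell out how to write a longer intent-comment in Python, which has no block comment syntax. One reasonable reading, assumed here, is to stack consecutive line comments above the code. The function `retry_delay` and its numbers are hypothetical.

```python
def retry_delay(attempt):
    # Wait longer after each failed attempt, so that a struggling
    # server gets room to recover instead of being hammered.
    delay = 2 ** attempt

    # Never wait more than a minute; the user is watching.
    return min(delay, 60)
```

Keeping each comment on its own lines above the statement also keeps every line comfortably short.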
Commenting Style
Again, many of these principles can be applied to documentation comments as well. The distinction is that CSI comments state WHY.
Note
I have intentionally oversimplified the code examples to make them easy to quickly understand. Most real code is far less obvious in its intention at first glance.
Tone
Comments should be written in a conversational tone, in the same manner that the code might be explained to a newcomer. They should be free of language-specific syntax as much as practical. This enables non-programmers (and programmers from other languages) to understand the code more easily.
BAD
    // set items_per_box to equal the floor of items and 17
    int items_per_box = floor(items/17);
This merely restates the code in a generic way, and it is entirely redundant when paired with self-commented code. It also depends on the language term “floor” - if a reader is unfamiliar with this term, they will have to look it up just to understand the comment - a situation that we should avoid as much as possible.
BETTER
    // Find how many times 17 goes into the number of items, without a remainder.
    int items_per_box = floor(items/17);
Now we know what the code is doing, in a language-agnostic fashion. As a side benefit, the reader can also surmise what “floor” does, if he or she were unfamiliar with the term.
However, this comment is still not true CSI, as it is only stating WHAT, and not WHY. Furthermore, the self-commented code makes this redundant to an average C++ developer.
BEST
    /* Divide our items among 17 boxes.
     * We'll deal with the leftovers later.
     */
    int items_per_box = floor(items/17);
Now we know WHY the code is here - we’re dividing our items among the boxes. We also know that this line isn’t intended to handle the extra items (thus why we are using floor()).
If you imagine a lone maintainer looking to change this code to divide the items among any number of boxes, the comment would make his change obvious, even with a minimal understanding of the code…
    /* Divide our items among the specified number of boxes.
     * We'll deal with the leftovers later.
     */
    int items_per_box = floor(items/boxes);
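For comparison, here is a Python sketch of the same step that also shows where the promised leftovers might be handled. The function name `pack` and the choice to return the leftovers alongside the per-box count are assumptions made for this example, not part of the standard.

```python
def pack(items, boxes):
    # Divide our items among the specified number of boxes.
    # We'll deal with the leftovers in the next step.
    items_per_box = items // boxes

    # Whatever didn't divide evenly still has to ship somewhere,
    # so report it rather than silently dropping it.
    leftovers = items % boxes

    return items_per_box, leftovers
```

With 53 items and 17 boxes, each box gets 3 items and 2 are left over. Note that the first comment survived the change of language untouched.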
Avoiding Vagueness
CSI comments should specifically outline the programmer’s logic and reasoning. The more left unsaid and undefined, the less effective the comment.
BAD
    // This tells us how much we can handle.
    int maximum_range = 27;
This is too vague, and redundant given the variable name. (I’m assuming this isn’t being clarified by immediately prior comments.)
BETTER
    // This tells us the maximum workable integer
    int maximum_range = 27;
This is still vague. If we didn’t know exactly what “maximum workable integer” meant in this context, we’d still be confused. (Again, assuming no context.)
BEST
    // Anything larger than this integer causes the algorithm to return 0.
    int maximum_range = 27;
Ahh, so the algorithm has a specific limitation! All becomes clear…
Humor
Humor should not be suppressed, so long as it does not detract from clarity. It makes the documentation a lot easier to read, because who likes dry documentation?
The first rule of humor is applicable here, though: don’t force it. If you try to be funny, you won’t be. The only point is to not force yourself to be totally serious.
That said, don’t be crass for crass’ sake, as it may drive away others, detracting from the whole point of this standard.
ACCEPTABLE
    /* We return -1 instead of 0 to avoid a
     * math error in the upcoming division.
     */
    return -1;
BETTER
    /* We return -1 instead of 0 to keep the
     * math gremlins happy in the upcoming division.
     */
    return -1;
Context
Context is very useful in comments. Since we’re aiming for a conversational tone, it is okay for one comment to help explain the comment immediately following. However, we do not want to become too reliant on context, as it is yet one more thing the reader must keep track of.
The following would be good in a short function.
EXAMPLE
    /* count tracks the number of times the word “Bah”
     * appears in the given text.
     */
    // We encountered a “Bah”, increment the count.
    // Return the count.
The following would be better in a very large function.
EXAMPLE
    /* count tracks the number of times the word “Bah”
     * appears in the given text.
     */
    // We encountered a “Bah”, increment the count.
    // Return the count of “Bah” instances.
Length
Obviously, the above practices will result in longer comments. This isn’t a bad thing, as it seriously increases the code’s readability, and speeds up debugging. Appropriate brevity comes with practice.
Bear this in mind: a single comment should state the purpose of a line or block of code in plain English.
ACCEPTABLE
    /* Search through the list of integers we got from the user
     * and find the number of integers that are divisible by
     * both 5 and 7. Then, return the sum of those numbers.
     */
    int sum = 0;
    for(int i = 0; i < len; ++i)
    {
        if(!(nums[i]%5) && !(nums[i]%7))
        {
            sum += nums[i];
        }
    }
    return sum;
This attempts to pack entirely too much information into one comment, which slows us down. We now have to stop and determine what sum += nums[i] is doing, based on the big comment. It is also lengthier than it needs to be.
BEST
    // Store the running sum.
    int sum = 0;
    // Search through the list of integers...
    for(int i = 0; i < len; ++i)
    {
        // If the number is divisible by both 5 and 7...
        if(!(nums[i]%5) && !(nums[i]%7))
        {
            // Add it to our sum.
            sum += nums[i];
        }
    }
    // Return the final sum.
    return sum;
By spreading out the comments, we can see the intention behind each piece of code. sum += nums[i] is obviously adding the number we found to our running sum.
Spreading out comments also helps to ensure they are kept up-to-date. One of the reasons programmers neglect to update comments is that they are not in the immediate vicinity of their other changes.
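Spread-out comments also pay off when translating code (advantage 7). The Python sketch below reuses the comments from the BEST example word for word and only rewrites the code beneath each one. The function name `sum_divisible_by_5_and_7` is hypothetical, and `total` stands in for `sum` only because `sum` is a Python built-in.

```python
def sum_divisible_by_5_and_7(nums):
    # Store the running sum.
    total = 0

    # Search through the list of integers...
    for num in nums:
        # If the number is divisible by both 5 and 7...
        if num % 5 == 0 and num % 7 == 0:
            # Add it to our sum.
            total += num

    # Return the final sum.
    return total
```

The single large comment from the ACCEPTABLE example would have given far less guidance about which Python line corresponds to which step.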
Frequency and Necessity
The core standard is this: comment everything at first. Each logical step should have an explanation. Yes, it doubles the size of your document, but you (and other people) will be able to better read the code and documentation later.
In a nutshell, aim to comment more lines of code, not to pack more into one comment.
There may be a rare occasion where a line of code is so entirely obvious and ordinary, a CSI comment would be redundant. However, before drawing this conclusion in a given instance, ask yourself whether someone entirely unfamiliar with the syntax and program would immediately know what the intent was.
OBVIOUS
    # Greet the user.
    print(welcome_message + username + ".")
This line of Python code is so obvious, we could choose to omit the comment and still be CSI-compliant.
MOSTLY-OBVIOUS
    # Display the status or error code from the rendering engine.
    print(get_status(render_engine))
This line is a little harder to parse, unless you know that our theoretical function get_status() queries the object’s status, and returns it as a string. Even if we surmised that much, we might not know that error codes are returned here as well (perhaps we’re looking for that line!)
NON-OBVIOUS
    # Display the result of the final step of calculation.
    print(str(foo % bar * baz))
We need the comment here to specify that we are actually completing the last step of a calculation within our print statement.
Trimming Contents
Commenting WHY instead of WHAT can be difficult, especially when you’re familiar with the code. It may be tempting to write vague comments, or even remove them, as you work.
However, the purpose of the CSI standard is to inform the programmer who is not presently familiar with the code. Therefore, we recommend the following:
1. Comment every logical statement while working. No exceptions!
2. Have someone unfamiliar with the code review the comments and suggest improvements. You may be able to do this yourself, if you leave the comments AND code alone for a couple of weeks first.
3. Using the insight from Step 2, rewrite WHAT comments to WHY, and eliminate entirely unnecessary comments.
Types of Comments
Declarations
CSI-compliant source code should specify the purpose and intent of variables and functions. As previously mentioned, this can be merged with documentation standards, especially because the resulting autodocs will be far more usable.
Note
If the name of a variable or function fully explains its intent, you may omit the comment as your documentation standard permits.
In these examples, we’ll demonstrate combining CSI with a Doxygen-compatible doc comment. To that aim, the comments below contain the names of the items in question, in anticipation of the resultant autodocs.
VARIABLE/CONSTANT
    /** The SILVER_INTEREST_RATE constant stores the
     * monthly interest rate for Silver savings accounts.
     */
    const double SILVER_INTEREST_RATE = 1.06;
Preceding a variable or constant (especially the latter), we should state its intent - its purpose for existing. While a good variable or constant name tells us what it is, the comment should state why it exists.
FUNCTION
    /** The countBah function determines how many times
     * “BAH” appears in a given string.
     * \param inputText the string to count "bah" in.
     * \return the number of times "bah" appeared.
     */
    int countBah(string inputText);
Immediately preceding a function declaration, its purpose should be stated, as well as the purpose of the input values, in plain English.
Special Comments
Using TODO, NOTE, and FIXME comments is common practice in many languages, and many tools exist that generate lists from these. The CSI standard recommends that these types of comments be used, and follow the same tone as other comments when possible.
    // TODO: Expand the whatchamacallit to make whozits.
    // NOTE: Is there a faster way to produce thingamabobs?
    // FIXME: This math always seems to produce the result "2".
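To give a feel for how a list might be generated from such comments, here is a small Python sketch. It is not any particular tool's implementation; the function `list_special_comments` is hypothetical, and it assumes the tag is followed by a colon and sits in a `//` or `#` line comment.

```python
import re

# We only care about comments that open with one of the special tags.
# Assumes C++-style "//" or Python-style "#" comment tokens.
SPECIAL_COMMENT = re.compile(r"(?://|#)\s*(TODO|NOTE|FIXME):\s*(.*)")


def list_special_comments(source):
    # Collect (line number, tag, text) so a reader can jump to each spot.
    found = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = SPECIAL_COMMENT.search(line)
        if match:
            found.append((line_number, match.group(1), match.group(2).strip()))
    return found
```

Writing the tags consistently, as in the examples above, is what makes this kind of simple scan workable.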
Entry Points
Major features should have entry points, which indicate where one should start reading the code if they want to follow the entire call stack for a particular function or feature. For example, if a game engine has a long process for generating an animated character on the screen, the beginning of this process - such as the function that initializes it - should have the comment…
    // ENTRY: Generate Animated Character
From this comment, the reader can follow each class, object, and function through to the end to see the entire process.
In order for this to work, the call stack commenting should not have any “gaps” (such as a virtual function) that do not have some comment to indicate where the call stack continues in the code.
Entry points are not always practical, but where they are used, it will be much easier for a developer who is unfamiliar with the code to find “where to start”.
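A minimal Python sketch of an entry point, including a comment that bridges a would-be gap at an overridden method, might look like this. The classes `Renderer` and `ConsoleRenderer`, the function `generate_animated_character`, and the wording of the "call stack continues" comment are all hypothetical.

```python
class Renderer:
    def draw(self, character):
        # Every platform draws differently, so subclasses do the real work.
        # The call stack continues in ConsoleRenderer.draw.
        raise NotImplementedError


class ConsoleRenderer(Renderer):
    def draw(self, character):
        # A console can only show text, so the character's name stands in
        # for the animated sprite.
        return "[" + character + "]"


# ENTRY: Generate Animated Character
def generate_animated_character(name, renderer):
    # Hand the character to whichever renderer the platform provided.
    return renderer.draw(name)
```

Without the comment in `Renderer.draw`, a reader following the call stack from the entry point would hit a method that only raises an error, with no hint of where to look next.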
Commenting Out Code
It can be very easy to confuse a regular comment and commented out code. There are two ways to clarify this action.
EXPLANATION METHOD
    // It would seem that float is better for this task.
    //int foo = 187;
    float foo = 187;
    // Just testing if we really need this function call at all.
    //refreshEverything();
Here, we add a preceding comment to explain why the code was commented out. The benefit to this is that it helps you and other programmers recognize and follow changes in program logic.
This method is ideal in languages where double-commenting (below) is not possible.
DOUBLE COMMENT METHOD
    ////refreshEverything();
We can “double-comment” out the code. This is probably ideal in situations where the commenting-out is temporary, and you don’t want to have to write an explanation.
COMBINATION METHOD
    // Just testing if we really need this function call at all.
    ////refreshEverything();
By combining the two methods, you can see what code was commented out, while stating the reasons behind it.
This method is ideal in languages where double-commenting is possible.
In any case, you should ultimately aim to remove commented-out code as soon as possible.
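In Python, the same ideas might be sketched as follows. Treating `##` as the Python counterpart of `////` is an assumption on our part, and the names `load_settings` and `refresh_cache` are hypothetical.

```python
def load_settings():
    # It would seem that a float is better for this task.
    #scale = 187
    scale = 187.0

    # Just testing if we really need this cache refresh at all.
    ##refresh_cache()

    return scale
```

The intent-comments keep the usual space after the comment token, while the commented-out code has none, which gives the reader a second visual cue.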
Top of Document
On the top of the document, the programmer should ideally list the project name and version, module/class name and description, date last updated, and authors (optionally). This may be adjusted to comply with documentation needs and individual standards.
    /* Dohickey Class [Some Epic Project]
     * Version: 1.0
     *
     * This performs jazz on input data to produce whatzit.
     *
     * Last Updated: November 25, 2014
     * Author: Bob D. Example
     */
Immediately following in a separate multi-line comment, include copyright and licensing terms. Because many licenses are extremely long, placing the license comment separate from the main top-of-document comment allows for the license to be collapsed in most code-folding-capable IDEs.
    /* LICENSE
     * Copyright (C) My Really Cool Software Company.
     * Licensing yada yada goes here.
     */