Lies, damn lies, and “I’ll document it later”

One of the most frequent lies programmers tell themselves is “I’ll document it later”. Programmers are incredibly busy people (often on multiple projects) and rarely have an opportunity to go back and document code. Additionally, the more code you write, the larger your backlog becomes, and the likelihood of going back and documenting your code – and especially documenting it well – steadily decreases, eventually to zero.

The major problem with this is that I have never believed in, or even seen, self-documenting code (that was not completely trivial). I’m a big proponent of descriptive function names and long variable names, but I do not believe they tell the whole story. Recently, a fellow developer and I got into a discussion about self-documenting code; he argued that all code should be self-documenting and I argued that only trivial code could be self-documenting. So I took a nontrivial piece of code that had passed code review with flying colors and removed all of the documentation from it; then I showed it to him. He could tell it was doing something with percentages but beyond that was at a loss. When I added the comments back in he groaned and said, “Well of course!” which I thought proved my point rather nicely.

The problem: what is supposedly self-documenting to the original programmer when they wrote it may not be self-documenting to a completely different programmer. The original programmer has no way of knowing the background and experience level of the programmer who will be maintaining the code in the future. So while the following might be completely decent documentation for the original author:
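As a hypothetical sketch of that kind of documentation (written in Python; the class name `Gizmo` and everything else in it are invented for illustration), consider a property whose comment only makes sense to someone who already knows the domain:

```python
class Gizmo:
    """A gizmo."""

    def __init__(self, dingle_fart: int) -> None:
        # The dingle-fart of this gizmo.
        self.dingle_fart = dingle_fart
```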

The above code and documentation are probably completely useless to you or me. What is a “dingle-fart”? Why does this class have one? What are the valid values for the integer and what effect does it have on the class? Where can I read about the concepts? For the really lazy, is there at least a Wikipedia article URL that could be included in the code that contains relevant domain information? (Don’t copy the Wikipedia article’s text – that takes the information out of context, bloats the code, and doesn’t get updated by the Wikipedia authors.)

Of course, this is only one example of completely poor documentation. Another is:
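A hypothetical illustration, sketched in Python with invented names: the comment below restates the arithmetic in words and adds nothing the code does not already say.

```python
def calc(p: float, r: float, n: int) -> float:
    # Multiply p by r, then divide by one minus (one plus r) to the power of minus n.
    return p * r / (1 - (1 + r) ** -n)
```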

The documentation on this example provides absolutely no further insight into the code than the code does itself. I could’ve written this documentation without having the faintest clue about what was going on. Much better would be at least the name of the equation being used, or, again, a link to the relevant Wikipedia page, to internal company documentation, or to the use case (if the use cases are stored in some well-indexed persistent location).
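As a sketch of what that could look like, assume the code in question happens to compute a fixed monthly loan payment. The function name, the parameters, and the choice of formula are all assumptions made for illustration, here in Python:

```python
def monthly_payment(principal: float, monthly_rate: float, num_payments: int) -> float:
    """Return the fixed monthly payment that pays off a loan in num_payments months.

    Uses the standard loan amortization (annuity) formula:
        payment = P * r / (1 - (1 + r) ** -n)
    Background reading: the Wikipedia article "Amortization calculator".

    monthly_rate is a fraction per month (0.01 means 1%) and must be greater
    than zero; a zero rate would divide by zero.
    """
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -num_payments)
```

The docstring names the equation, says where to read about it, and states the valid range of an input, none of which can be recovered from the arithmetic alone.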

Moreover, even the original author may not understand the code two or three years down the road (or less if the programmer is incredibly busy and writing lots and lots of code while not getting very much sleep, like if you have a brand new baby girl in the house :-)). Personally, I have written lots of code that I have no memory of writing years later (I have 4 kids, all 5 years old or younger – that’s my excuse anyway), so I view good documentation as a message to my future self about what I was trying to do at the time and why I was doing it. Then, when I have had some sleep, if I find a place where the code doesn’t seem to match the documentation, I know I have found something that needs to be investigated and may be the bug I’m currently looking for.

I certainly do NOT view it as a liability when the code and documentation don’t seem to match! It just means that something needs to be changed (either the code or the documentation) and I need to figure out which. Without the documentation, I would have absolutely no way of knowing that the programmer’s original intent differed from what was actually coded. Therefore, having the documentation is still very valuable, contrary to what the “self-documenting crowd” would have you believe. Additionally, the documentation is in English and should be written conversationally (not stilted and formal), so it’s much more likely to represent what the developer was actually trying to accomplish with the code.
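A small hypothetical case, sketched in Python with invented names, shows how that plays out. Both docstrings state the same intent (the end month is included), but the first function quietly drops it; the mismatch is the clue that either the code or the documentation is wrong.

```python
def months_covered_buggy(start_month: int, end_month: int) -> list[int]:
    """Return every month number the policy covers, including end_month."""
    # range() stops before end_month, so this contradicts the docstring.
    return list(range(start_month, end_month))


def months_covered(start_month: int, end_month: int) -> list[int]:
    """Return every month number the policy covers, including end_month."""
    # The + 1 makes the code match the documented intent.
    return list(range(start_month, end_month + 1))
```

Without the docstring, the first version would look like a perfectly deliberate half-open range.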

Thus, code reviews should also examine the documentation to make sure it properly describes the intent of the code.  Of course if it doesn’t, the developer should update it!

The goal

Good documentation should communicate why the programmer coded the algorithm the way they did and what they hoped to accomplish. The how is of course right there in the code; however, even then it is often useful to provide higher-level descriptions. With the code example above I would like to know what the property actually is in layman’s terms, or in terms that you could reasonably expect somebody in the (appropriate) industry to understand.

In “The Pragmatic Programmer”, Andrew Hunt and David Thomas suggest, based on the Don’t Repeat Yourself (DRY) principle:

“The DRY principle tells us to keep the low-level knowledge in the code where it belongs, and reserve the comments for other, high-level explanations.”

So one of the major problems with “documenting later” is that the “why?” and the “what?” are not as clear in the programmer’s mind as they were at the time they wrote the code. Of course the problem gets worse the longer we wait to document the code. Eventually the “why?” and “what?” may be lost completely, and even if we do eventually go back to the code and try to add documentation, we will be attempting to reconstruct them from the “how?”
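To make the distinction between the “how?” and the “why?” concrete, here is a hypothetical Python fragment (the scenario and all names are invented for illustration). The first comment merely repeats the how; the second records the why, which the code itself cannot express.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    # How (redundant): multiply the price by (100 - percent), integer-divide by 100.
    # Why (useful): prices are kept in whole cents and rounded down so that a
    # customer is never charged a fraction more than the advertised discount.
    return price_cents * (100 - percent) // 100
```

Six months later the how can still be read straight off the expression, but the reason for choosing integer division over ordinary rounding exists only in the comment.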

Finally

Note that I want to see documentation on classes, on methods and properties, and even on private member variables, and especially inside methods! Here’s an example snippet from some real production code I wrote a few years ago inside a method:

I grabbed this code basically at random from my application, and even without seeing all of the code in the method or the documentation on the method, I believe that this code is reasonably easy to understand. By happy coincidence this is actually some of the most complex code I’ve written. The green comments become visual separators within the code, helping the reader to “chunk” the related pieces of code together (which is also why white space is very important!). You might be able to understand the code without the comments, but with them the code is much easier to read.
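The chunking effect can be sketched with a short hypothetical example. This is not the production code referred to above; it is an invented Python function, with assumed names, in which each comment opens a visually separate step:

```python
from datetime import date


def months_in_policy(start: date, end: date) -> list[tuple[int, int]]:
    """Return (year, month) for every calendar month the policy touches."""

    # Begin at the month in which the policy starts.
    year, month = start.year, start.month
    months = []

    # Loop over every month that the person has this insurance policy.
    while (year, month) <= (end.year, end.month):
        months.append((year, month))

        # Advance one month, rolling over into January of the next year.
        month += 1
        if month > 12:
            year, month = year + 1, 1

    return months
```

For instance, `months_in_policy(date(2023, 11, 15), date(2024, 2, 1))` walks from November 2023 through February 2024, and the comments let a reader find the setup, the loop, and the rollover at a glance.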

4 thoughts on “Lies, damn lies, and “I’ll document it later””

  1. Excellent points, Robert. Now, one thing I’d like to warn people about is when, e.g. in your last example, the code changes in such a way that the comments are no longer valid. If the programmer does not update the comments, it can put the unsuspecting fellow programmer entirely on the wrong foot when he starts to maintain that code. So, you could say “document while developing” AND “update documentation while maintaining”.

    1. Agreed! The documentation becomes a valuable resource – just like the code – that needs to be maintained.

  2. This is an excellent post. I want to share it with my team, but the code on the page is weirdly garbled (examples below). Do you mind fixing it for easier consumption? Thanks

    // Loop over every month that the person has this insurance policyvar startDate = this.StartDate;var endDate = this.EndDate;

    ∃c( WellWrittenCode(c) ∧ ∃p( Programmer(p) ∧ ¬Understands(c,p) ⇾ ¬SelfDocumenting(c) ) )
