News for August 2003

A Taxonomy of Comments

Christian Sepulveda writes about comments in source code:

Not all comments are bad. But they are generally deodorant; they cover up mistakes in the code. Each time a comment is written to explain what the code is doing, the code should be re-written to be more clean and self explanatory.

This statement is provocative and interesting, but wrong. There are more good uses for comments than bad ones. (The rest of Sepulveda’s posting is more nuanced, and much of what I write here expands on points he makes.)

Comments are an escape hatch for expressing everything about a program that the programming language can’t. Comments therefore don’t fall into a single natural category.

Rather than stating a single purpose or use for comments, one can start by stating the purpose of the non-comment portions of the source code. Comments are used for everything else, which is at least the following:

The Coding Compromise

Non-comment source code is a compromise among meeting the needs of two classes of consumers: compilers and runtimes on one hand; and human developers on the other1. The priority given to each of these consumers depends on the context of the program’s development and deployment. An example at the level of programming language selection is that I use C++ for performance-critical applications or deployment to resource-limited platforms, but for more readable programs I use Python. However, even within a specific context, the compromise is always present.

1 There are of course also tradeoffs within each of these audiences: for example, size versus speed for program execution; readability by domain experts versus programming language experts on the human side.

Comments don’t have to meet this compromise. A line of code may legitimately be biased towards program execution (the compiler audience), but the line of code plus its comment can meet the needs of the human reader as well.

As an example, let’s look at the inner loop of the scan converter for a graphics library that I worked on. This loop is responsible for drawing spans within a polygon. A span is the portion of a scan line from the x position of an edge on the left side of the polygon’s interior to the x position of the next leftmost edge on its right. The definition of “interior” depends on the fill mode of the polygon. Since (for source size and code size reasons) a single function implements both fill modes, the way that this function computes the interior of the polygon is conditional on the fill mode.

If I’m implementing the computation for human readability, especially by someone who isn’t familiar with low-level graphics programming, the initialization code, and the code to handle an edge transition, might look like this:

// Initialization
bool inside = false;
int windingCount = 0;
// At each edge transition
if (windingType == kEvenOddFill) {
  inside = !inside;
} else {
  windingCount += edge->direction;
  inside = windingCount != 0;
}

(If I cared more about abstraction and flexibility, I might even turn windingType into an instance of a class with an update function:
windingType.updateState(edge);
inside = windingType.isInside();)

The same functionality coded for runtime efficiency might instead look like this:

// Initialization
int windingMask = (windingType == kEvenOddFill) ? 1 : -1;
bool inside = false;
int windingCount = 0;
// At each edge transition
windingCount += edge->direction;
inside = windingCount & windingMask;

The efficient code uses integer/bitfield and integer/boolean puns which wouldn’t even compile in a strictly typed language, but which take advantage of the implementation of values of these types in C++ to optimize the performance characteristics of the program. (It also uses a clever hack that I was proud of at the time.) These tricks turn the process of reading the source, however, into an exercise in reverse engineering.

Comments ameliorate this reverse engineering process, by speaking sotto voce to the human audience:

// windingMask tells which bits of the windingCount to test.
// For even-odd fill, test the low bit, to tell whether the number is odd.
// Otherwise, test all the bits, to tell whether it's non-zero.
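To check that the masked version really computes the same thing as the readable one, here is a minimal sketch that runs both over a list of edge directions. The enum, the function names, and the array-of-directions interface are assumptions made for illustration, not the library's actual code. It also assumes that every edge direction is +1 or -1 and that integers are two's complement, so that -1 has all bits set.

```cpp
#include <cassert>

// Hypothetical fill-mode names; kNonZeroFill is an assumed counterpart
// to the kEvenOddFill constant used in the essay.
enum WindingType { kEvenOddFill, kNonZeroFill };

// Readable version: branch on the fill mode at every edge transition.
bool insideReadable(WindingType windingType, const int* directions, int n) {
    bool inside = false;
    int windingCount = 0;
    for (int i = 0; i < n; ++i) {
        if (windingType == kEvenOddFill) {
            inside = !inside;
        } else {
            windingCount += directions[i];
            inside = windingCount != 0;
        }
    }
    return inside;
}

// Masked version: choose the mask once, then test bits at every edge.
bool insideMasked(WindingType windingType, const int* directions, int n) {
    // Low bit only for even-odd; all bits (two's complement -1) otherwise.
    const int windingMask = (windingType == kEvenOddFill) ? 1 : -1;
    bool inside = false;
    int windingCount = 0;
    for (int i = 0; i < n; ++i) {
        // The parity trick only works if each edge changes the count by one.
        assert(directions[i] == 1 || directions[i] == -1);
        windingCount += directions[i];
        inside = (windingCount & windingMask) != 0;
    }
    return inside;
}
```

The even-odd case works because each edge changes the count by exactly one, so the low bit of the count flips at every edge, just as the boolean does in the readable version.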

The Limits of Expression

Another reason for a comment is to compensate for the limits of the programming language at hand. There is often no way to express the design of the program within the syntax of a particular programming language. Languages in mainstream use have become moderately good at abstracting over data structures; they’re less good at abstracting over types, control structures, definition patterns, or patterns of composition, and they generally lack means of describing design patterns or architecture2.

A comment can add information about a program’s design, such as that the program implements the Pipes-and-Filters architecture, that a class implements the Flyweight Design Pattern, that a declared float represents feet, or that a declared String may be null. There are languages which can express each of these facts directly, in which case the comment is superfluous, but chances are you aren’t using one of them.
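For the last two examples in that list, here is a sketch of what it looks like when the language can carry the fact itself. It uses C++17's std::optional, which postdates the compilers I had in mind when writing the above. The Feet and Meters wrappers and the displayName function are hypothetical names invented for this illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical unit wrappers: the type name replaces a comment such as
// "float length;  // in feet".
struct Feet {
    explicit Feet(float v) : value(v) {}
    float value;
};

struct Meters {
    explicit Meters(float v) : value(v) {}
    float value;
};

// The signature now says which unit goes in and which comes out.
inline Meters toMeters(Feet f) {
    return Meters(f.value * 0.3048f);
}

// std::optional in the signature replaces a comment such as
// "nickname may be null".
inline std::string displayName(const std::optional<std::string>& nickname) {
    return nickname.value_or("(anonymous)");
}
```

Passing a bare float where a Feet is expected no longer compiles, because the constructor is explicit. Where the language can't say this, the comment is still the only place to put it.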

2 Haskell is good at expressing abstractions over types; C++ is better than Java. Languages with either a lightweight syntax for closures (aka blocks), such as Smalltalk and Ruby, or with structural macros such as JSE, or with both such as Dylan and Scheme, can express abstractions over control structures. Languages with definition-level macros or modifiers such as Bigwig, Dylan, Lisp, or Elide can abstract over definition patterns. Architecture languages such as ArchJava are still in the research stage.

Levels of Abstraction

Source code is typically written only at one level of abstraction. It may combine different units of structure or composition, such as methods, classes, and packages, but the source code typically doesn’t contain both statements that express something in a high-level or coarse-grained way, and the same thing in a low-level or fine-grained way. (An exception is invariants in Eiffel, which express the abstract what as well as the concrete how.) After all, this would be redundant, from the perspective of the compiler.

The human reader, on the other hand, would like to be able to understand the purpose of a code block, function, or package without reading its implementation. Different audiences care about different levels of detail, and at different times. An API user needs a description of a method’s external behavior. Someone working on the implementation of a method needs a high-level understanding of its algorithms as a roadmap, maybe an overview of what each block does, and a line-by-line understanding of a particular block only at a particular time. Source code is either there or not, and often can’t be easily skimmed; comments let the reader turn the dial to positions between no information and too much.

Program as Palimpsest

A comment is a note from the past to the future.

I write a comment wherever I’m afraid someone might change the program for the worse. One reason this might happen, discussed above, is because the program’s design isn’t evident in the source. Another is that the reason for an implementation decision comes from data that isn’t present in the design. Corner cases and performance metrics are two kinds of data that aren’t present in a program’s design. Changes made in response to these data frequently result in code that’s suboptimal from a readability perspective, and that in the absence of that data would appear to be redundant. In the absence of information to the contrary, this code should be optimized out, for performance and code size reasons as well as from a readability perspective. Comments are a protection against this form of regression.

For example, I’ve written code like this:

if (shape.bounds.contains(point) && shape.contains(point)) ...

First this code tests if the bounding box of the shape contains point; if it does, it tests if the shape itself contains point. The first test is redundant. It’s extra source text to read, extra code to maintain (if I rename “shape” or “point”, there’s one more subexpression to edit), extra code to deploy, and it breaks the Law of Demeter — it increases the coupling between this code and the implementation of the class of shape. As it stands, in my book, this is bad code.

Here’s a non-executable line that, prepended to the example above, changes it into correct code:
// On a 400MHz P5, testing shape.bounds speeds text tracking by 10%

Without this line, there’s nothing to indicate whether the code is the result of premature and possibly misguided optimization, or whether it actually provides some benefit to compensate for its readability and maintainability shortcomings. With it, the source code records its own history; if anyone decides to remove the hard-won optimization, it won’t be out of ignorance.
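Here is a minimal, self-contained sketch of the pattern. The Point, Rect, and Circle types are hypothetical stand-ins, and a circle's exact test is so cheap that the pre-test wouldn't pay for itself here. Picture a shape whose contains() is expensive, such as a text outline.

```cpp
#include <cassert>

struct Point { double x, y; };

struct Rect {
    double left, top, right, bottom;
    bool contains(Point p) const {
        return p.x >= left && p.x <= right && p.y >= top && p.y <= bottom;
    }
};

// Hypothetical shape: a circle that caches its bounding box.
struct Circle {
    Point center;
    double radius;
    Rect bounds;

    Circle(Point c, double r)
        : center(c), radius(r), bounds{c.x - r, c.y - r, c.x + r, c.y + r} {
        assert(r >= 0);
    }

    bool contains(Point p) const {
        const double dx = p.x - center.x;
        const double dy = p.y - center.y;
        return dx * dx + dy * dy <= radius * radius;
    }
};

bool hitTest(const Circle& shape, Point point) {
    // The bounds test is logically redundant: it never changes the result.
    // Record the measurement that justifies it here (machine, workload,
    // speedup), so that nobody removes it out of ignorance.
    return shape.bounds.contains(point) && shape.contains(point);
}
```

A point near the corner of the bounding box passes the first test and fails the second, which is the only case where both tests do work.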

Joel Spolsky has written an essay about corner cases. I’ll just add that corner case code looks like optimization code: without a comment it’s hard to tell whether it’s contingency or clutter.

Time Bomb or Time Capsule

Comments are also a place to put wishlist items and to-dos. Some of these may be “mistakes in the code”, but an item can be a mistake in the code without being a mistake in the software development process. The point of software engineering is to make tradeoffs between implementation speed, implementation resources, deployment resources, and code quality. Sometimes putting all the emphasis on code quality or feature completeness is the wrong way to make the tradeoff — for example, when bytes count, when days count, when the code is throwaway and the corner condition will never be met, or when the enhancement isn’t yet useful.

When the developer knows that a corner case is present or that an enhancement is possible, comments add a choice between implementing it immediately on one hand, and ignoring it altogether on the other. The first alternative is expensive immediately, because of its effect on the schedule and its distraction from work implementing the more important use cases. The second is expensive later, because it loses information that’s available now and that may have to be recreated then.

A comment at the implementation site can document the cause of the error in implementation terms, or collect notes about how the missing condition could be handled. The comment captures knowledge from construction time, when it’s available, and saves it for maintenance or enhancement time, when it’s useful.

The Path to Deployment

Conventional programming languages define program execution, but don’t express information about activities that occur before the deployed program is executed. Activities other than execution require other artifacts. Development requires examples. Testing requires tests and contracts. Deployment requires packaging information.

These artifacts can be placed in separate files. Often they’re ephemeral, such as a test case that is typed into the command line and then discarded. But it’s convenient to keep these artifacts close to the program units that they describe, so that they can be updated and reused as the program evolves. Many of them are also useful to the human reader trying to understand the program: for example, the examples and tests that drive and verify development make useful API documentation too. Capturing these artifacts in comments is a way to meet these needs.

(All of the other uses of comments are a special case of this point. Design notes and explanations support construction, maintenance, and white-box testing, which are all non-execution activities.)

The Third Consumer

I wrote at the beginning of this entry that there were two consumers for source text: humans, and the compiler. I lied. There’s a third class of consumers: comment processors, which treat comments as source code in a supplementary language — a language that fills gaps in the coverage of the source code’s primary language.

Tim Peters’s doctest treats comments as unit tests. iContract reads comments for pre- and post-conditions. XDoclet is an extensible processor, preconfigured to read packaging and deployment descriptions. These tools address the non-execution activities of software development.

SPARK is a generic parser that reads BNF productions from the comments of the methods that implement their semantic actions. It patches the failure of Python to abstract over defining forms.

All of these functions of comments as metadata could be implemented by extending the language instead. (Contracts are from Eiffel, for example, which supports them in its core syntax.) However, the use of structured comments as metadata lets a third-party tool address a gap in the language in a way that leaves the source structure intact for compilers and other source processors (such as editors, source debuggers, and refactoring tools) that wouldn’t recognize an extended grammar. Just as comments are an escape hatch for human readers, they’re an escape hatch for development tools as well.
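As a toy illustration of a comment processor, the sketch below pulls tagged comment lines out of source text that a compiler would simply skip. The extractTagged function and the "// @pre" tag are made up for this example; they are not the syntax of iContract, XDoclet, or any other tool mentioned here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the text that follows `marker` on every line that contains it.
// To the compiler these lines are ordinary comments; to this function
// they are the whole program.
inline std::vector<std::string> extractTagged(const std::string& source,
                                              const std::string& marker) {
    std::vector<std::string> found;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        const std::string::size_type at = line.find(marker);
        if (at == std::string::npos) {
            continue;  // not a tagged line
        }
        // Skip the whitespace between the marker and its payload.
        const std::string::size_type start =
            line.find_first_not_of(" \t", at + marker.size());
        if (start != std::string::npos) {
            found.push_back(line.substr(start));
        }
    }
    return found;
}
```

Given a source file that contains the line "// @pre x >= 0" above a function declaration, extractTagged(source, "// @pre") returns the condition "x >= 0", which a real tool would then turn into a runtime check or a test.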

And these two kinds of comments — natural language and structured metadata — are related. They both add information that can’t be expressed in the primary programming language. And they’re both useful to humans too.

Posted: August 31st, 2003
Categories: Essays, Programming

Test versus Type

A lightweight language such as Python or JavaScript fits a lot of program design into a small amount of source text. A heavyweight language such as C++ or Java uses more tokens to express the same design.

Many people (Guido van Rossum, Bruce Eckel, Steve Ferg) report that working in a lightweight language is about five times more productive than working in a heavyweight language. This matches my personal experience. Lutz Prechelt1 reproduces this difference within a research setting.

It’s easy to make up reasons why it’s easier to work with shorter sources. There’s less data entry, fewer places to make mistakes, and it’s easier to read the programs. A program in a heavyweight language has a lot of boilerplate that doesn’t add information about the program’s design or intent. Eliminating the boilerplate seems like a clear win.

But what about source text that does add information about the program — in particular, what about types? Removing type declarations from a program also makes it shorter, but it removes information that the compiler and runtime can use to verify the program. Doesn’t this make the program less reliable? Or, given a reliability goal, doesn’t it make it take longer to reach that goal?

From the anecdotes and studies above, it looks like this isn’t actually the case. I’d like to look at why.

1 An empirical comparison of C, C++, Java, Perl, Python, Rexx, and Tcl for a search/string-processing program, Lutz Prechelt. See especially figure 16.

Two Types of Types

An Explicitly-Typed Language (ETL) is one in which functions and variables must be declared with type declarations. (These type declarations are conventionally called manifest types.) Java and C++ are ETLs.

An Implicitly-Typed Language (ITL) is one in which type declarations are not necessary. Some ITLs, such as JavaScript 1.5 and Python, lack type declarations altogether. In other ITLs, such as Dylan, Haskell, JavaScript 2.0, and JScript.NET, type declarations are optional, and can be added at any point during program development.

An ITL isn’t necessarily a weakly typed language. (In a weakly typed language, type mismatches are never detected. C++ is weakly typed where it is used with static casts.) Python is an example of latent typing: types are always checked, just not at compile time.

Neither is an ITL necessarily a dynamically typed language. Haskell and ML are examples of statically typed languages where the type declarations are often optional, but the types are known at compile time. The compiler infers variable and function types from the source, just as a Java or C++ compiler infers the types of individual subexpressions.
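C++ itself has since grown a small amount of this. The sketch below uses auto and return type deduction (C++14, so it needs a newer compiler than the ones this essay was written against) to show types that are fixed at compile time without being written down. The average function is a made-up example.

```cpp
#include <type_traits>
#include <vector>

// No result type is written anywhere in the body or the signature;
// the compiler infers it from the expressions.
inline auto average(const std::vector<double>& xs) {
    auto sum = 0.0;              // deduced as double
    for (const auto& x : xs) {   // deduced as const double&
        sum += x;
    }
    return xs.empty() ? 0.0 : sum / xs.size();  // return type deduced as double
}

// The inferred type is still a static type: this is checked at compile time.
static_assert(
    std::is_same<decltype(average(std::vector<double>())), double>::value,
    "average() is statically typed even though its type is never declared");
```

This is far short of the inference in Haskell or ML, since the parameter type still has to be declared, but it shows that "implicit" and "dynamic" are separate axes.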

(This discussion is condensed from the appendix to Steve Ferg’s excellent article comparing Python to Java.)

No Pain No Gain?

It’s common knowledge that it takes longer to get a program in an Explicitly-Typed Language to first clean compile, because it takes time to get the type declarations right. Initial program development consists of editing type declarations as well as non-declaration text (the expressions, control flow, and program structure), followed by a block of time to get the program to first clean compile. The white area below represents work on the non-declaration portion of a program; the gray stripes represent work on type declarations.

In order to show the division between time spent on type declarations and time spent on other program text more clearly, I’m going to slide the type declarations to the right:

This picture appears to show that it takes more time to develop a program that contains type declarations. The received wisdom among ETL fans is that what you get for this extra time is error checking. ETL development takes longer, but gives you a more robust product. The equivalent ITL development effort (which isn’t shown here) would be even longer, because it would include debugging time to catch the problems that the compiler didn’t. In fact, some of these problems might not show up until the product is in the field.

Tests versus Types

Test-Driven Development (TDD) is a development methodology where the test cases are written first, and code is added to the program proper only as necessary to implement correct behavior for test cases.

TDD advocates have noted that if you write your test case first, you also get error checking. In fact, you get better error checking than you do with explicit types, because the tests are tailored to the code paths, the edge cases, and the specific problem domain. After all, the only kinds of errors that explicit typing catches are those of the kind where you used a string in integer context. Programmers who have used implicitly-typed languages can attest that these aren’t the kinds of errors that are hard to find. In fact, a lot of the work to make an ETL program compile is work that’s only necessary to get around how little the compiler knows about the runtime types.

The comparison of an ETL, where the extra work goes into typing explicit types, and an ITL with TDD, where the extra work goes into test cases, looks like this:

Let’s say you could start with a program without type declarations. Would you rather add type declarations, which catch a certain class of problems at compile time? Or would you rather add test cases, which in a strongly typed language catch this same class of problems at runtime, and catch problems that the type system can’t express as well?
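Here is a small sketch of a problem that the type system can’t express. The isLeapYear function and its test are invented for this illustration. Every plausible wrong version of the function has exactly the same type, int to bool, so only a test can tell them apart.

```cpp
#include <cassert>

// The compiler is equally happy with a version that only checks
// year % 4 == 0; the signature is identical.
inline bool isLeapYear(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Test cases tailored to the edge cases of the problem domain.
inline void testIsLeapYear() {
    assert(isLeapYear(2004));
    assert(!isLeapYear(2003));
    assert(!isLeapYear(1900));  // the "divisible by 4" version fails here
    assert(isLeapYear(2000));   // the "skip all centuries" version fails here
}
```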

But why not use TDD with an ETL?

Tests Plus Types

Here’s what happens when you use TDD with an ETL. You’re taking that first picture, and multiplying it by two: once for the test cases, and again for the program itself.

Robert Martin compares TDD to dual-entry bookkeeping. Explicit types are another form of dual-entry bookkeeping. Using both explicit type declarations and test cases to validate is triple-entry bookkeeping. (And the time spent writing explicit types could have been spent on more test cases, or more of the program.)

But it’s actually worse than triple-entry bookkeeping, because an ETL requires that you use explicit type declarations in your test code too. Not only does the program itself cost more to write, but so do its test cases. It’s quadruple-entry bookkeeping: one of the three entries has two entries itself.

This is why even software written in an ETL is often tested with an ITL test harness. If you can afford to work in two languages, and if the test harness and the debugger are well-enough integrated that you can write the test cases first and use them for TDD, this works. (It brings the cost of using an ETL back down to triple-entry bookkeeping.) But there’s one more gotcha.

The Type Tax

The problem is that once type declarations are part of the source text, they’ve got to be maintained along with the rest of the code. The earlier the type declarations are added, the greater their contribution to the maintenance cost.

In a waterfall model of code construction, where all the code is written, brought to clean compile, and works the first time, this doesn’t matter. In incremental development, where a compilable program is debugged or extended to handle additional features or test cases, the cost can be high.

Types are similar to comments, test cases, and end-user documentation in this way, except that types are coupled more closely to each line of source text. Factor out a variable from an expression, and the method comment and end-user documentation can stay the same, but a new type declaration is required. This tight coupling makes the maintenance tax on types even higher than it is for other program annotations, and the tradeoff for when to start adding them, if at all, is different.
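To make the coupling concrete, here is a sketch of that refactoring in the C++ of this essay’s era. The countBefore and countAfter functions are hypothetical. They behave identically and would share the same method comment, but factoring out the variable means writing, and from then on maintaining, one more type declaration.

```cpp
#include <map>
#include <string>

// Before: the lookup appears twice inside one expression.
inline int countBefore(const std::map<std::string, int>& counts,
                       const std::string& key) {
    return counts.find(key) != counts.end() ? counts.find(key)->second : 0;
}

// After: the lookup is factored out into a variable, and the variable
// needs a declared type that must track the type of `counts`.
inline int countAfter(const std::map<std::string, int>& counts,
                      const std::string& key) {
    std::map<std::string, int>::const_iterator it = counts.find(key);
    return it != counts.end() ? it->second : 0;
}
```

If counts later changes to some other container, the second version has one more line to edit than the first.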

Since the cost/benefit ratio of type declarations can be different at different points in program maturity, one style would be to add them late in program development (for performance reasons, or for their documentation benefits to future maintainers). This style of development is possible in a language with optional type declarations, such as Dylan or Haskell. That’s what the first picture on this page shows. ETLs such as C++ and Java simply don’t allow this.

Updated 9/25/2003 to rename “write types” to “edit types”, and to add a picture showing type declarations interleaved on a finer grain.

Posted: August 20th, 2003
Categories: Software

Dot Numbers

Dot numbers are a new notation for numbers that makes integer addition look like rational multiplication. They may be useful in primary school math education. The idea is that once you understand integers and addition, you can learn another way to look at it that sets you up to understand fractions and multiplication.

I made up dot numbers a few years ago to try to explain negative numbers to my then-four-year-old son.

Basics

A dot number is a way of writing a number. A dot number is represented as a number of dots above a line. This is the number 3, as a dot number:

A negative dot number is a number of dots below the line. This is the dot number -3:

Addition

To add two dot numbers, combine the dots above the line, and the dots below the line. This is 3 plus 2:

This is -3 plus -2:

And this is 3 plus -2:

Cancelling

A dot number is in normal form if all its dots (if any) are on the same side of the line. A dot number that isn’t in normal form can be “normalized”, or transformed into another dot number that represents the same integer and is in normal form. As long as there’s a dot above the line and a dot below the line, cross the dots out. The pairs of dots “cancel” each other.

The preceding number (3 plus -2) can be normalized.

To subtract a number, flip it upside down and add that. This is 3 minus 2:

Dots and Factors

Dot numbers are interesting because they’re isomorphic to fractions. Dot numbers are to addition (and subtraction, and the unit) as fractions are to multiplication (and division, and factors).

Just as the multiplicative inverse (the reciprocal) of a fraction can be obtained by turning it upside down, the additive inverse (the negative) of a dot number is obtained by flipping it upside down. (The inverse in both cases is the “vertical inverse”).

Just as fractions are multiplied by multiplying the tops and bottoms (numerators and denominators) separately, dot numbers are added by adding their tops and bottoms.

And just as any number can be represented by multiple fractions (1/2, 2/4, 4/8), each of which can be normalized (“reduced”) by dividing the top and bottom by a common integral factor (4/8 = (4×1)/(4×2) = 1/2, cancelling the 4s), so an integer (3) can be represented by multiple dot numbers (3-0, 4-1, 5-2), each of which can be reduced by subtracting units from its top and bottom (5-2 = (1+1+1+1+1)-(1+1) = (1+1+1)-0 = 3-0, cancelling two units from each side).
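For readers who think in code, here is a minimal sketch of dot numbers as a data structure. The DotNumber struct and the function names are my own choices for this illustration. A dot number is a pair of counts, and each operation described above becomes one line.

```cpp
#include <algorithm>
#include <cassert>

// A dot number: some dots above the line and some below it.
struct DotNumber {
    int above;
    int below;
};

// An integer in normal form: all of its dots on one side of the line.
inline DotNumber fromInt(int n) {
    return n >= 0 ? DotNumber{n, 0} : DotNumber{0, -n};
}

// Addition: combine the dots above the line, and the dots below it.
inline DotNumber add(DotNumber a, DotNumber b) {
    return {a.above + b.above, a.below + b.below};
}

// Negation: flip the number upside down.
inline DotNumber flip(DotNumber a) {
    return {a.below, a.above};
}

// Subtraction: flip the second number and add.
inline DotNumber subtract(DotNumber a, DotNumber b) {
    return add(a, flip(b));
}

// Cancelling: cross out pairs of dots until one side is empty.
inline DotNumber normalize(DotNumber a) {
    assert(a.above >= 0 && a.below >= 0);
    const int pairs = std::min(a.above, a.below);
    return {a.above - pairs, a.below - pairs};
}

// The integer that a dot number represents.
inline int toInt(DotNumber a) {
    return a.above - a.below;
}
```

Replace the sums with products, min with the greatest common divisor, and subtraction of pairs with division, and the same struct describes fractions, which is the isomorphism in a nutshell.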

Posted: August 14th, 2003
Categories: Illustrations, Math Education, Visualizations

Laszlo Blogging Widget

Last month I did a code sprint on a blogging aggregator written in LZX. You can see the results at myLaszlo.com. There’s a screenshot on this page.


Posted: August 8th, 2003
Categories: OpenLaszlo, Projects

Rethinking MVC

In the Model-View-Controller architecture, the Model is decoupled from information about the user interface. In a Data-Driven Presentation, the data contains all the information necessary to assemble the user-interface elements. These design patterns appear at first to be mutually exclusive: either the data contains presentation information, or it doesn’t. This apparent conflict is because of a confusion between the Model of MVC, and the Data in DDP.

Posted: August 4th, 2003
Categories: Essays, OpenLaszlo

myLaszlo.com

Check out myLaszlo.com. As well as showing off some examples, this is a way of getting free hosting for Laszlo applications written using the Laszlo Presentation Server’s Developer Edition — which is also free.

Posted: August 1st, 2003
Categories: OpenLaszlo