Synchronizing Client Models 6

Posted by Oliver on February 27, 2008

You’re implementing a client-server application. The client is in JavaScript. It contains a model class, Person. The model is backed by a server-side Person model, and a REST controller at /person. Periodically, the client updates the server’s model, but there can be client-side instances that don’t yet exist on the server, such as when a model is first created and the server hasn’t yet gotten the message.

I’ve written this code a few times now, in JavaScript, and in ActionScript. if If you write it the obvious way, you run into an interesting set of race conditions. Here’s the code, and the race conditions, and some ad-hoc solutions. In the next post, I’ll introduce a metaobject pattern, queue ball, that I’ve used to solve these race conditions in a more principled and re-usable fashion.

Note: As of 2008-02-28, none of this code has been tested. It’s all extracted from code that’s like the code here, but I haven’t copied and pasted these specific examples into an execution environment, which probably means they fail.

Getting Personal

Here’s the model, with some server proxy mojo mixed in:1

// creates a client-only instance
function Person(attributes) {
  this.attributes = attributes||{};
  // if a server mirror exists, this.id is set to its id 
}
 
// creates a client instance that is mirrored by a new server instance
Person.create = function(attributes) {
  var person = new Person(attributes);
  person.create();
  return person;
}
 
Person.prototype = {
  // creates a server instance for this client instance
  create: function() {
    jQuery.post('/person/create', this.attributes, function(data) {
      this.id = data.id;
    }.bind(this)); 
  },
 
  //  updates attributes of this instance, and, if it exists, its server mirror
  update: function(attributes) {
    Hash.merge(this.attributes, attributes);
    this.id && jQuery.post('/person/update/' + this.id, attributes);
  },
 
  // deletes this instance's server mirror
  remove: function() {
    this.id && jQuery.post('/person/delete', {id:this.id});
    delete this.id;
  }
}

This implementation uses jQuery for transport, and assumes a Hash.merge method from some collection library (say, Prototype’s). It creates a class by setting prototype directly, and it doesn’t detect or recover from XHR errors. All these choices are just to have something concrete to write about; they don’t affect the substance of this article.

A Day at the Races

Do you see the race conditions? There’s at least three: create+update, create+delete, and update+update.

Race Condition 1: Create then Update

function createThenUpdate() {
  var aPerson = Person.create();
  aPerson.update({name:'Edgar Dijkstra'});
}

The problem with createThenUpdate is that aPerson won’t have an id by the time update is called, so update won’t send the new values to the server. The call to create is synchronous, but the communication with the server, and therefore the call to the callback (that sets aPerson.id) is asynchronous, and therefore won’t occur until Person.create returns.

In detail:

  1. createUpdate calls Person.create
  2. Person.create calls new Person
  3. aPerson.create calls jQuery.post
  4. jQuery.post calls XMLHttpRequest.send (not shown)
  5. XMLHTTPRequest.send, jQuery.post, and aPerson.create return
  6. createUpdate calls aPerson.update
  7. [time passes]
  8. Client sends HTTP Request to server
  9. [more time passes]
  10. Client receives HTTP Response
  11. Callback in aPerson.create sets aPerson.id

Solution 1: Explicit Callbacks

One solution to this problem is to thread the code through callbacks (in effect, performing CPS conversion by hand). aPerson.create calls a callback function once it’s internal callback function is called, so Person.create takes a callback parameter too, and so on up the call chain. (In this case, the buck stops here.)

Let’s add a callback parameter to Person.create, that is called once the HTTP response to /person/create is received.

Person.create = function(attributes, callback) {
  var person = new Person(attributes);
  person.create(callback);
  return person;
}
 
Person.prototype = {
  // creates a server instance for this client instance
  create: function(callback) {
    jQuery.post('/person/create', this.attributes, function(data) {
      this.id = data.id;
      callback && callback();
    }.bind(this)); 
  }
}

Then we can rewrite createThenUpdate thus:

function createThenUpdate() {
  var aPerson = Person.create({}, function() {
    aPerson.update({name:'Edgar Dijkstra'});
  });
}

Adding the UI

It was easy to spot the race condition in createThenUpdate — and easy to fix it — because the calls to create and the update were in consecutive statements, within the same function. In the real world, they’re at the bottom of different call chains, as in this jQuery code that binds some model actions to an HTML view:2

$('#person create-button').click(function() {
  $(this).disable(); // avoid double-creation
  $('#person update-button').enable();
  gCurrentModel.create();
});
$('#person update-button').click(function() {
  gCurrentModel.update($('#person').serialize());
});

Click “create“, edit a field, and then click “update“. Sometimes the update will hit the server, sometimes it won’t: it depends on whether the response to the /person/create request has returned by the time you click the second button. We’ve just created an AJAX version of the 500-mile bug.

Let’s thread the callbacks through this code, in order to avoid enabling the “update” button until the callback is called:

$('#person create-button').click(function() {
  $(this).disable(); // avoid double-creation
  gCurrentModel.create({}, function() { $('#person update-button').enable() });
});
$('#person update-button').click(/* unchanged */);

This is awful! First, it requires you to weave callbacks through both your view and your model code.3 But worse, it’s a leaky abstraction. The view layer has to know about an arbitrary (from the outside) limitation — that you can’t call update until create has called its callback — of the model layer.

Solution 2: Implicit Callbacks

Another solution is to use a library such as Narrative JavaScript or JavaScript Strands, that does the CPS conversion (adds the callbacks) for you. I like this approach a lot, but I do a lot of work in contexts where those compilers aren’t applicable4, and many folks (often including, for these reasons and others, me) prefer to work in pure JavaScript. I therefore won’t go further down that path here.

Solution 3: Action Queue

Finally, we can add a queue to the model. With the modification below, calling update while the model is waiting for an id no longer drops server updates; it simply queues them for playback once the response to /person/create is received.

Person.prototype = {
  _updateQueue: null,
 
  create: function() {
    this._updateQueue = [];
    jQuery.post('/person/create', this.attributes, function(data) {
      this.id = data.id;
      while (this._updateQueue.length)
        this._sendUpdate(this._updateQueue.shift());
      delete this._updateQueue;
    }.bind(this));
  },
 
  // the caller must treat `attributes` as deep-frozen once
  // this method has been called
  update: function(attributes) {
    Hash.update(this.attributes, attributes);
    if (this.id)
      this._sendUpdate(attributes)
    else if (this._updateQueue)
      this._updateQueue.push(attributes);
  },
 
  _sendUpdate: function(attributes) {
    jQuery.post('/person/update/' + this.id, attributes);
  }
}

We can use a “method algebra” to optimize this a bit: It doesn’t matter how many times update is called while waiting for the create response — it only needs to send an update once. (The algebra is that there’s an operation +: update × updateupdate that can combine consecutive updates update1 + update2 = update3.)

Person.prototype = {
  _pendingUpdates: null,
 
  create: function() {
    this._pendingUpdates = {};
    jQuery.post('/person/create', this.attributes, function(data) {
      this.id = data.id;
      if (this._pendingUpdates) {
        this._sendUpdate(this. _pendingUpdates);
        delete this. _pendingUpdates;
      }
    }.bind(this));
  },
 
  update: function(attributes) {
    Hash.update(this.attributes, attributes);
    if (this.id)
      this._sendUpdate(attributes)
    else if (this._pendingUpdates)
      Hash.merge(this._pendingUpdates, attributes);
  },
 
  _sendUpdate: function(attributes) {
    jQuery.post('/person/update/' + this.id, attributes);
  }
}

I’m going to back off from this optimization, though. The reason is that it only works if the two calls to update are consecutive — when there are no intervening calls that also send messages that operate on the same instance. With a more full-featured API (with more actions that send messages to the server), this won’t generally be true.

For example, let’s extend Person with a setPermissions method. If we could ignore race conditions, this method might look like this:

Person.prototype = {
  _pendingUpdates: null,
 
  setPermissions: function(permissions) {
    this.permissions = permissions;
    this.id && jQuery.post('/person/set_permissions', {id:this.id, permissions:permissions});
  }
}

This naive implementation is vulnerable to a create+setPermissions race condition analogous to the create+update race condition that we just fixed, though. We can fix them both by generalizing the post-create queue, so that it can contain arbitrary actions, not just update records:

Person.prototype = {
  _pendingActions: null,
 
  create: function() {
    this._pendingActions = {};
    jQuery.post('/person/create', this.attributes, function(data) {
      this.id = data.id;
      while (this._pendingActions.length) {
        var action = this._pendingActions.shift();
        this[action.methodName].apply(this, action.arguments);
      }
      delete this._pendingActions;
    }.bind(this));
  },
 
  update: function(attributes) {
    Hash.update(this.attributes, attributes);
    if (this.id)
      this._sendUpdate(attributes);
    else if (this._pendingActions)
      this.pendingUpdates.push({methodName:'_sendUpdate', arguments:[attributes]);
  },
 
  _sendUpdate: function(attributes) {
    jQuery.post('/person/update/' + this.id, attributes);
  },
 
  setPermissions: function(permissions) {
    this.permissions = permissions;
    if (this.id)
      this._sendSetPermissions(permissions);
    else if (this._pendingActions)
      this.pendingUpdates.push({methodName:'_sendSetPermissions', arguments:[permissions]);
  },
 
  _sendSetPermissions: function(permissions) {
    jQuery.post('/person/set_permissions', {id:this.id, permissions:permissions});
  }
}

Race Condition 2: Create then Delete

function createThenDelete() {
  var aPerson = Person.create();
  aPerson.delete();
}

By now, you should be able to spot the problem here. The reasoning is exactly the same as for update: when delete is called, aPerson won’t yet have an id.

We could fix this with a callback:

function createThenDelete() {
  var aPerson = Person.create({}, function() {
    aPerson.delete();
  });
}

This has the attendant disadvantages of having to bake knowledge about the client-server protocol into Person’s clients, and having to thread callbacks through the UI. After all, it’s rare that we would create a Person simply to delete it; the more common case is that the creation and deletion would be at the bottom of different call chains — often initiated from outside the application, in response to user actions — such that it’s difficult to thread the first as a callback of the second. And note that, as with create+update, we can’t simply ignore the delete unless the server creation has responded: if we do this, we’ll occasionally drop a delete on the floor, because it was called after the create was sent, but before the response.

The best local solution is to build on the action queue solution above — by simply adding another method to the queue.

Person.prototype = {
  delete: function() {
    if (this._pendingActions)
      this.pendingUpdates.push({methodName:'_sendDelete');
    else
      delete this.id;
  },
 
  _sendDelete: function() {
    jQuery.post('/person/delete', {id:this.id});
    delete this.id;
  }
}

This works, but it should make you uncomfortable. We’re adding (almost) the same conditional to every single method.

Race Condition 3: Overlapping Updates

function updateThenUpdate(aPerson) {
  aPerson.update({name:'Edgar Djikstra'});
  aPerson.update({name:'Edgar Dijkstra'});
}

From looking at updateThenUpdate, it looks like the first call to update will occur before the second. And it does! (Duh.) And it looks like the misspelled name in the first call will be replaced by the correct name in the second call. And it will! (Well…on the client…read on.) Because: the first call to XMLHttpRequest.send (with the misspelled name) occurs before the second call to XMLHttpRequest.send (with the correction), and the client therefore sends the message with the misspelled name before it sends the message with the correction. But our run of good luck stops here. There is, unfortunately no guarantee about the order in which the server will receive these messages. Generally, the first message will be received before the second. Sometimes, they will arrive in the other order, and the misspelling will overwrite the correction.

There are two ways to fix this problem: by sequencing messages, or by holding outgoing messages (holding each outgoing message until the previous one returns). Sequencing messages is the higher-performance solution (it doesn’t hold up messages), but requires more work and involves switching both the client and the server from a straight REST API, which may not be possible5.

For simplicity, we’ll look at the second solution: holding outgoing messages. This solution has the advantage that the general-purpose solution to the other race conditions (presented in the next article) happens to implement it too. (In this article, we’ll implement with an explicit Serialized object instead.) Message sequencing doesn’t help with those other cases at all: the problem with them is that the second message is never sent, not that it’s sent out of order.

Here’s a quick-and-dirty implementation of the hold outgoing messages solution. The following code defines Serialized.post as a drop-in replacement for jQuery.post, that refuses to post data until the previous post has completed (successfully, or with an error).6

var Serialized = {
  queue: [], // arguments for pending 
  defer: false,
  post: function(url, data, callback, type) {
    if (this.defer) {
      this.queue.push(Array.prototype.slice.call(arguments, 0));
      return;
    }
    this.defer = true;
    jQuery.ajax({url:url, type:'POST', data:data, success:success, complete:complete.bind(this)});
    function complete() {
      if (this.queue.length)
        this.post.apply(this, this.queue.shift();
      this.defer = false;
    }
  }
}

Next Up: Queue Ball

I’d like to factor all those conditionals out of the Person methods. Then I’d like to extract the queue code from create, so that I can use it on update (to solve the update+update problem). Finally, there are some general-purpose techniques here, so I’d like to extract the whole mess from Person, where I can apply it to any model (or to code that has some of the same concerns, even if it’s not synchronized model code). But this post is already long enough, so I’ll just close with the promise to write that up, so that I have to do it.


1 Would you rather have code with a cleaner separation of concerns? Here it is. You’ll find that it doesn’t make the race conditions go away, but that it doesn’t change the set of techniques for solving them. (It does make the “explicit callbacks” solution even worse.) I’ve therefore stuck with the double-duty Person implementation in the body of this article, to make the code easier to follow.

function Person(attributes) {
  this.attributes = attributes || {};
  this.proxy = null;
}
 
Person.prototype = {
  create: function() {
    this.proxy = new PersonProxy();
    this.proxy.create(this.attributes);
  },
 
  update: function(attributes) {
    Hash.merge(this.attributes, attributes);
    this.proxy && this.proxy.update(attributes);
  },
 
  remove: function() {
    this.proxy.remove();
    delete this.proxy;
  }
}
 
function PersonProxy() {
  this.id = null;
}
 
PersonProxy.prototype = {
  create: function(attributes) {
    jQuery.post('/person/create', attributes, function() { this.id = id }.bind(this)); 
  },
 
  update: function(attributes) {
    this.id && jQuery.post('/person/update/' + this.id, this.attributes);
  },
 
  remove: function() {
    this.id && jQuery.post('/person/delete', {id:this.id});
    delete this.id;
  }
}

2 This implementation somewhat mixes the model with the view. It’s not the clearest code. It would be cleaner if it used listeners and reactive programming techniques — but the fact that it’s so explicit makes it easier to follow what’s going on.

3 I’ve used this approach, and it wipes the floor with using listeners or delegates or other unthreaded callbacks, where you have to store state in objects in order to match listeners with their context, but it’s still a pain to maintain.

4 CPS conversion introduces a lot of function allocations and invocations. I’ve been scared to try a system that introduces them globally, instead of letting me judiciously thread a few callbacks in by hand, when developing for a slow ECMAScript implementation such as Flash < 9 or MSIE. (I even use my own libraries sparingly in such a situation.)

5 XMPP preserves message order, by sending all the messages over a single stream. One could also add a sequence number to each message. The receiver (in this case, the server) should buffer messages that arrive out of order, so that it can process them in the order in which they occur. This is how a streaming protocol such as TCP is implemented: by adding sequence numbers and buffering on top of an unordered protocol such as IP. HTTP is implemented on top of TCP, but only uses TCP to preserve the order of packets within a message, so multiple HTTP requests (and responses) can get out of order again. It seems that keepalive might fix the problem, and that load balancers might re-introduce it, and that affinity might fix it again, but only if you can guarantee that your load balancer is properly configured. But I’m getting out of my depth here.

6 This code assumes that a request will never take longer than the client timeout setting to reach the server. Otherwise, complete could be called before the server receives the first message, the client would send the next message, and the server would process them out of order. That’s one reason I called this implementation quick-and-dirty.

More Monads on the Cheap: Inlined fromMaybe 8

Posted by Oliver on February 26, 2008

This article is about how to deal with null values. It follows up on this one. It’s intended for code stylists: people who care a lot about the difference between one line of code and two, or keeping control statements and temporary variables to a minimum. (A code stylist is kind of like the dual of a software architect, although one person can be both.) It’s not about code golf — although you might learn some strokes to use on that — but about keeping the structure of your code, even at the expression level, close to the way you think about the problem, if you think like me.

If you’re not a code stylist — and I’m not saying that being a code stylist, any more than being a prose stylist, is either a good or a bad thing — you might find it baffling that someone would put so much time into such simple topic. I won’t try to convince you otherwise. In that case, you might want to check back next week, when I’ll move back up to the bigger picture. (Specifically, some fun stuff involving how to use meta-object programming to solve race conditions in client-server models.)


A nullable or optional, type is one that might have a value of a certain basis type, but might be null. For example, a nullable array is either an array or null. Even if you don’t use a language with type declarations, you probably use a language with types. If a variable or field (JavaScript property) is expected to hold only arrays, it has type array; if it sometimes ends up holding null as well, it has type nullable array instead.

Haskell has a function fromMaybe that turns a nullable type into a non-nullable type, but replacing it with a default value when it’s null. What would this look like in a more conventional language, and where would you use it?

I’m using JavaScript as an example language here, but the techniques here apply to Ruby and, to a lesser extent, Python as well.

The First Problem Set

Here’s your assignment. It has three parts. In all of them, products is a list of products . In JavaScript, this list is represented by an instance of Array.

First, if products is non-empty, display its first item; otherwise, do nothing. This is easy enough:

if (products.length) {display(products[0])}

Or, for a more Lisp- or Ruby-like style, with the advantage that it can be nested in an expression:

products.length && display(products[0])

Second, apply a preload() function to each item in products. This is easy too:1

products.forEach(preload)

Finally, extract the id from each product, and pass the list of ids to a function preloadAll.2

preloadAll(products.pluck('id'))

Raising the Bar

Let’s make this problem harder. This time, products might be an array, but it might be null.

“Hey!” you (ought to) protest. “That’s a stupid design. You’re giving me poorly typed data, and this introduces complexity and its attendant costs (development time, code size, test cases) to deal with it.”

Well, yes. But this is the real world. Maybe you’re reading an attribute from a deserialized XML element. XML schemas allow for this kind of abbreviation, and using it makes documents more concise (and therefore both lower bandwidth and easier to inspect for debugging), so you’ll probably see this at some point. Maybe you’re reading or a property from a JSON object, where the server omits null lists (for the same reasons — message size and debuggability — as for XML). Or maybe you’re reading products from a library that represents empty lists by null — for performance reasons (to avoiding making empty lists), or for backwards compatibility, or just out of laziness. I’ve seen all of the cases, a number of times.

Or maybe you used the technique in Monads on the Cheap to write something like (order||{}).products. Now that you’ve propogated a null order into a null products — to avoid wrapping an if statement around the code that dealt with order — you’ve got to pay the piper. You followed my advice and I dug you into a hole; now I’d better toss you a rope ladder.

Solution 1: Fixing the input on entry

You could fix products before you use it: insert products = products || [] at the top of your function to change a nullable array into a non-null array, by replacing null by a default value. If products is a local variable (as opposed to a function parameter), you could even do this where it’s initialized: replace var products = order.products, say, by var products = order.products || [].3 So the three solutions above become simply:

// products may be null
products = products || [];
// products instanceof Array
if (products.length) {display(products[0])}
 
products = products || [];
products.forEach(preload)
 
products = products || [];
preloadAll(products.pluck('id'))

Raising the Bar Again

Where products is a local variable, “fixing the input” really is the best solution. However, it’s not the most general solution (for reasons I’ll get to). So I’ll move the bar again.

This time, instead of the variable products, it’s the expression offer.products that evaluates to the nullable array. What’s the smallest change required to adapt our code to null values, in this scenario?

Solution 2: Introducing a temporary variable

This one looks absurdly easy too. Changing a line of code to accommodate nullable arrays involves a simple program transformation. Replace offer.products by products, and insert var products = offer.products || [] on the line before. Here are the before cases, where offer.products is not allowed to be null:

// requires products instanceof Array
if (products.length) {display(offer.products[0])}
offer.products.forEach(preload)
preloadAll(offer.products.pluck('id'))

And here are the after cases, where offer.products is allowed to be null:

// accepts null products
var products = offer.products || [];
if (products.length) {display(products[0])}
 
var products = offer.products || [];
products.forEach(preload)
 
var products = offer.products || [];
preloadAll(products.pluck('id'))

Non-local Transformations

There’s something funny about the “temporary variable” program transformation. offer.products is an expression — you can nest it in another expression: as the argument to a function, before a property accessor, or as part of a conditional. var products = offer.products||[]; ...; ...(products)... is a statement sequence. In fact, it’s a statement sequence with a hole — it doesn’t strictly embed the original code, but it isn’t strictly embedded by it, either; instead, it’s woven in.

These differences — that this transformation changes the syntactic type of the code that you’re applying it to (from an expression to a statement), and that you have to weave it into the existing code — make it non-local.4 Here’s what I mean by this:

To apply this transformation — to change code that expects an array into code that accepts a null — we look for an occurrence of offer.products; we replace it by products; and then we travel up the syntax tree (we look at the expression that contains offer.products, and then the expression that contains that) until we find a statement. Finally we insert var products = offer.products||[] before that statement.

Admittedly, there hasn’t been much to this in the statements so far. We’ve simply replaced the first snippet below (with one line of code) by the second snippet (with two lines). And the lines are adjacent, so it’s easy enough to read them as a unit.

// requires products instanceof Array
preloadAll(offer.products.pluck('id'))
// accepts null products
var products = offer.products || [];
preloadAll(offer.products.pluck('id'))

It gets worse, though. Let’s make offer nullable too, and add some more computation. (I’m trying to get offer.products far enough inside of an expression that we can get a feel for where the problems arise.)

In the first block below (which doesn’t deal with nullable arrays), offer is either an object with a products property (which is always an Array), or null. If it’s not null, we load its products. We then set loaded to true if there was an offer, and if any of its products were already loaded. (preloadAll returns true in this case.) Simple enough:

// accepts null offer; requires products instanceof Array
var loaded = offer && preloadAll(products.pluck('id'));

Now, how do we change this if not only offer, but offer.products, are nullable? We apply the transformation above, inserting the statement var products = ... and changing offer.products to products. But where do we insert the statement? It has to go before the call to preloadAll, but after the test for whether offer is null.5 But in the listing above, there isn’t any such location!

To create one, we have to split the initialization expression in half, in order to get the test for offer and the use of offer.products into separate statements, so that there will be room for a statement (not added yet) between them:

// accepts null offer; requires products instanceof Array
var loaded = false;
if (offer)
  loaded = preloadAll(offer.products.pluck('id'));

And now we can hoist offer.products out of the second new statement, without moving it before the first:

// accepts null offer, offer.products
var loaded = false;
if (offer) {
  var products = offer.products || [];
  if (preloadAll(products.pluck('id'))
    loaded = true;
}

This is awful! Not only did it go from one line to five, but loaded changed from a non-mutable variable with an initializer into a mutable variable with two different assignments, such that its initialization is split across the inside and the outside of a conditional. This is the kind of code that, if I let it take over 5% of my program, takes up 50% or my debugging time.

You might think these problems are because of the distinction between statement and expression in Algol-style languages (C, C++, Java, JavaScript). This is partly right, but it’s only somewhat better in Lisp-style languages (Smalltalk, Lisp itself, Ruby). Here’s a straight translation of the JavaScript code into Ruby:

// before: accepts offer == nil; requires offer.products.is_an? Array
loaded = offer && preloadAll(products.map &:id));
# after: accepts offer == nil, offer.products == nil
loaded = false
if offer
  products = offer.products || []
  loaded = preloadAll(producs.map &:id)
end

Now let’s use the fact that Ruby statements are expressions to re-write the second case:

# also accepts offer == nil, offer.products == nil
loaded = offer && preloadAll(begin products = offer.products || []; products.map &:id));

Sure, this is back down to one line. And it avoids the cascading rewrite of the first transformation (where changing the innermost expression into a statement required changing the expression that contains it into a statement). However it’s far from idiomatic Ruby.

Worse, like the JavaScript snippet (and this is another problem with temporary variables), it introduces a products into the surrounding environment, or overwrites the existing value of products if there’s already a variable with that name there — a subtle source of bugs, especially when you apply this transformation more than once.

Solution 2: Ifs and Ands

You could use conditional statements to transform the original solutions from these:

// requires non-null product
if (offer.products.length) {display(offer.products[0])}
offer.products.forEach(preload)
preloadAll(offer.products.pluck('id'))

into these:

// accepts null product
if (offer.products && offer.products.length) {display(offer.products[0])}
if (offer.products) offer.products.forEach(preload)
if (offer.products) preloadAll(offer.products.pluck('id'))

The first line (if (products) {...}) already had a conditional, so we stuck the new test into the existing conditional. The other two lines didn’t, so we wrapped the original code in new conditionals to hold the new test.

Scalability

The “ifs and ands” solution works, but it doesn’t scale. (”Doesn’t scale” is Enterprise for “I don’t like it.” In this case, I’ll rationalize define “scale” as “grows linearly and compositionally with the number of variables and/or the complexity of the syntactic context”.)

First, like the “temporary variable” solution, it’s non-local — it involves changing an expression into a statement, and therefore the expression that contains that expression into a statement, and so on up the line.

It’s also non-linear (in the sense of linear logic6, not linear algebra). Where an expression occurs once in the original code, it occurs twice in the new code. It evaluates offer.products three times instead of twice in the first case, and twice instead of once in the other two.

To see why this is bad, imagine if instead of offer.products it were offer.getProducts, or customer.offer.products, or ((customer||{}).offer||{}).products. Or imagine if it were an expression that had side effects — then the first example wouldn’t have worked anyway, but we would have just broken the other two.

To get another taste of how the expressions replicate with this strategy, take a look at what happens when here’s more than one of them. What if there are two such properties, offer.products and offer.merchants, and we only want to execute our code if they’re both non-empty? Here’s the case for non-nullable arrays:

// offer.products and offer.merchants are non-null
if (offer.products.length && offer.merchants.length) {...}

This code transforms into this:

// offer.products and offer.merchants may each be null
if (offer.products && offer.products.length &&
    offer.merchants && offer.merchants.length) {...}

Or let’s say we wanted to sum the lengths of two properties, offer.ordered and offer.saved. The code for the non-nullary case is simply offer.ordered.length + offer.saved.length. The nullary case balloons into a statement sequence:

// offer.products and offer.merchants may each be null
var count = 0;
if (offer.ordered) count += offer.ordered.length;
if (offer.saved) count += offer.ordered.length;

Or we could use the ternary operator, but still at the cost of typing (and evaluating) each nullable subexpression twice:

// offer.ordered and offer.saved can each be null
(offer.ordered ? offer.ordered.length : 0) : (offer.saved ? offer.saved.length : 0)

The problem with all of these is that we’ve had to write out each variable name twice, inviting defects. In fact, I made a paste-o in one of the examples above. I could fix it, but I bet I’d make it again if I later changed the code to include offer.wishlist in the count.

Solution 3: Inlined fromMaybe

Here’s an alternative. To change code that expects a non-nullable array to a nullable array, change array to array||[]. This is a local transformation: it changes an expression into an expression (not a statement), so you don’t need to re-write the code that contains it. It’s also a linear transformation (again, in the sense of linear logic, not linear algebra): an expression that only occurs once before the transformation, only occurs once after it.

The original solutions transform thus:

// offer.products can be null
if ((offer.products||[]).length) {...}
(offer.products||[]).forEach(...)
if ((offer.products||[]).length || (offer.saved|[]).length) {...}

Note that each transformation is local: no new control structures are introduced, so there’s no cascade of expression -> statement transformations up the syntax tree. We can see that by the fact that the troublesome loaded case remains largely intact.

// offer and offer.products can each be null
var loaded = offer && preloadAll(offer.products||[]).length);

Here’s the summation code:

// offer.ordered and offer.saved can each be null
count = (offer.products||[]).length || (offer.saved||[]).length;

Beyond Arrays and JavaScript

This technique works in any language where arbitrary values can be used in a boolean context (that is, practically every language except Java) and where null is considered false, and for any type whose values test true. This includes Object and Array in JavaScript, additionally Number and String in Ruby (since 0 and “” are considered true in that language), and — well, the moral equivalent of Object types in Python, since Python collections implement nonzero() or len.

But actually we can go ahead and use the technique even with types that contain a false value, where we want to replace that false value by a default anyway (either the same false value, or a different value that tests as true). For example, even though JavaScript "" tests false, we can use inputString || "" to coerce a nullable string to a non-null string, since it will null and "" are the only two values that it will change to ""

Here are some examples that go beyond arrays. First, using the ternary operator. (Which isn’t so bad here, since the expression is in a variable already — bear with me and pretend the expressions are more complex):

var count = products ? products.length : 0; // the original example: an array
var value = inputString ? parseInt(inputString, 10) : 0; // string
var option = options ? options.key : 'default'; // Object used as dictionary
var result = fn ? fn(argument) : defaultValue; // Function
sprite.moveTo(x ? x : 0, y ? y : 0); // number

And now, using the inlined fromMaybe technique, in JavaScript, Ruby, and Python:

// JavaScript
var count = (products || []).length;
var value = parseInt(inputString || '0', 10);
var option = ({key:'default'}.key;
var result = (fn || Function.K(defaultValue))(argument);
sprite.moveTo(x || 0, y || 0);
# Ruby
count = (products || []).length
value = (inputString || '0').to_i
option = (options || {:key => 'default'})[:key]
sprite.move_to(x || 0, y || 0)
# Python
count = (products or []).length
value = int(inputString or '0')
option = (options or {'key': 'default'})['key']
sprite.moveTo(x or 0, y or 0)

The Real Thing

For reference, here’s how these examples look in Haskell.

let count = length (fromMaybe [] products)
let value = read (fromMaybe "0" inputString)
let option = lookup (fromMaybe [["key", "default"]] options) "key"
moveTo sprite (fromMaybe 0 x) (fromMaybe 0 y)

This is fairly unidiomatic Haskell. You can do a lot better, by modifying the functions instead of the values:

let count = maybe 0 length products

Scala also has a nullable type (Option), with a getOrElse method.

val count = (products getOrElse List()).length

Although, as with Haskell, you’d write this differently in idiomatic Scala:

val count = products.map(_.length) getOrElse 0


1 forEach was added in JavaScript 1.6, and works in Firefox. You can get cross-browser implementations from Dean Edwards or the Mozilla Developer Center; or with frameworks such as the JQuery (in the jQuery collection plugin), Prototype (where it’s called each), or MochiKit (where it’s a top-level function).

2 anArray.pluck is from Prototype. In pure JavaScript 1.6 (or another library that extends JavaScript with the 1.6 collection functions), this would be products.map(function(product) { return product.id }). Or in Functional, it’s map('_.id', products)

3 In conjunction with monads on the cheap, in the scenario where products might be null because order might be null, the code looks like this: var products = (order||{}).products || []. In fact, this is simply an extension of monads, where the default value is the empty array, instead of {}.

4 This disruption is in addition to the fact that now you’ve got to come up with a variable name (usually easy), and make sure that if you do this to two different pieces of code in the same scope you use two different variables (harder), and hold a larger set of variable names in your head when you’re reading this code a year later (hardest).

5 In this particular case, we could instead use the cheap monads idiom (offer||{}).products. But not every embedding expression is a test for nullity.

6 Linear logic is just a system where you can’t replace once occurrence of an expression by two. I didn’t link to the wikipedia page in the text because it’s written for logicians, not programmers, and makes it look scary-complicated, but here it is. Failing linearity is what goes wrong in C macros.

Adding the Easy Piece; or, The Metaphor of the Rock

Posted by Oliver on February 02, 2008

The novice project manager cares about a program’s size. The experienced manager cares when it gets big.

Big programs are, from a developer’s perspective, slow. Slow not to run, but to develop: effortful to maintain, expensive to change. Half the job of a project manager is to keep programs small by keeping their requirements small (and half the job of an architect is to keep them small when the requirements are large); this is about the case when this isn’t enough.

Developing a program is like pushing a rock around a room. (The rock is called “code base”. The room is an irregular shape called “design space”, with “requirements” marked off along the wall.) Big programs are big, heavy rocks. They require more push, to get less far.

Sometimes you can get several people to push one rock. This works if the rock is the right shape – if it’s straight and wide, say, so that many people can push it the same direction, at the same time. Otherwise they will push at different angles, and the greater part of their efforts will cancel.


Sometimes, you can break the rock into smaller pieces – so that different people can push them, or so that you can push them separately, or so that you can abandon some pieces and use instead pieces that are sited where you’re going. (The advantage of a pre-sited piece’s location can outweigh the fact that it’s not quite the right shape.)

Sometimes you can polish the floor, or insert rollers, or employ a cart (use languages, platforms, practices, tools). Sometimes, though, you spend more time polishing the floor than you would have spent moving the rock; and sometimes the rollers end up aimed differently from where it turns out you need the rock. But even when these tricks work, they only increase the size of rock that’s counts as “big”. Some rocks are still big rocks.

This perspective – the size of the rock – is the static view of code size. It’s concerned with how to move rocks of a fixed size. The static> view is the right view for maintenance programming. It’s the right view for (most of) version three of a product, where most of the program is unchanged from version two. Often it’s the right view for version two as well.

For a greenfield project – a project where you’re (mostly) not modifying an existing system, but (mostly) starting from scratch – the static view is too limited. In a greenfield project, you’re not just moving the rock. You’re building it too.

(How do you build a rock? By fastening other rocks to it. Or sometimes by coating chicken wire with papier mâché, if you just need to know how it looks.)

Greenfield development sounds simpler. Instead of pushing the rock where it’s needed, you just build it there. There’s some work (pushing) that you don’t have to do.

But oh, did I mention that you won’t know where the rock needs to go until you’ve already built some of it? (Funny how often that isn’t mentioned.) In fact, you may have even less of an idea where to site the rock that hasn’t been built than one that has, because the shape of the rock helps determine the site, And if a rock takes long enough to build, the room may change too.

So, you’ve got to build the rock, and you’ve got to push it, and you could do these in any order (build some, push some, build some more, push some more), except that you won’t know where to push it until you’ve got some built. And just to make this harder – and more realistic – there’s some parts you can’t build where you start, but only where you push it to.

This is the dynamic view of code size: that code size changes over the course of a project. The fundamental question about dynamic code size is: Does it matter what order you build the rock? If you know you’ve got to add a piece, does it matter whether you add it at the beginning or end of the project?

And the answer is yes, it matters. Sometimes greatly. Sometimes enough to determine whether you can build, and push, your rock at all. The reason is because building the rock makes it harder to push.

Often, on a project, there’s a few pieces that you know at the outset need to be there. They’re standard parts, that need to be attached to every project. You know how to attach them. It will be comforting to do so.

(Sometimes these pieces have names like “localization”, “optimization”, and “scalability”; or “graphic design complete”; or “windows compatibility”; or “undo”.)

Some advantages of attaching these pieces early are: an easy sense of accomplishment, and a feature checked off. They simulate great and early progress. The disadvantage is that they increase project drag. Attaching them makes pushing the rock harder. Make it harder sooner, and you’ve made your trajectory like the orange line, and not the green one, below.

It’s not that you should never add these pieces first. Building and pushing rocks is engineering; and engineering is the art of the trade-off.

Here are some reasons to add a piece sooner. You don’t know if you’ll be able to add it later, or if you’ll have time if you wait, or how long it will take. (You want to front-load uncertainty and risk.) Or: you need to show the rock to somebody, and they can’t imagine what it will look like with the missing piece. Or: you can’t imagine this either. Or: knowing how to add a piece gnaws at you – it’s a distraction – and you can’t concentrate on harder work until you’ve added the easy piece. Or: the rock is useful only halfway built, and the piece is the useful one. Or: you’ll have to tear the rock apart to attach a piece later, and this is more work than pushing a bigger rock now.

Engineering is the art of the informed trade-off. These reasons to make a rock bigger sooner are (can be) good reasons. Weigh them against the cost of pushing a bigger rock.

Monads on the Cheap I: The Maybe Monad 14

Posted by Oliver on December 16, 2007

Don’t worry, this isn’t YAMT (Yet Another Monad Tutorial). This is a practical post about a code smell that afflicts everyday code, and about an idiom that eliminates that smell. It just so happens that this idiom corresponds to one of the uses for monads in Haskell, but that’s just theory behind the practice, and I’ve saved the theory for the end.

This post is about style: implementation choices at the level of the expression and the statement. Style doesn’t matter much in a small program, or a write-only program (one that nobody will read later). It isn’t necessary to make a program run: by definition, it doesn’t make a functional difference. Style makes a difference to how easy or pleasant a program is to read; this can make a difference to whether it gets worked on (by its author, or somebody else) later.

The Problem

The types are Product, Offering, Merchant, and Name. A Product might have an Offering; an Offering might have a Merchant; and a Merchant might have a Name.

The code displays the name of the merchant that is offering the selected product. The input is named product. This value might be an instance of Product, or it might be null. The specification is this: If product is a product that has an offering that has a merchant has a name, then display that name; else do nothing.

[This problem is adapted from an eCommerce application I'm writing.]

This code isn’t difficult to write. The challenge isn’t in make it run; it’s in making it readable.

The Obvious Implementation

Here’s a naive solution. This particular code is JavaScript, but the structure comes out the same in Python, or Ruby, or C++ or Java, or Lisp.

var product = ...;
if (product) {
  var offering = product.offering;
  if (offering) {
    var merchant = offering.merchant;
    if (merchant) {
      var merchantName = merchant.name;
      if (merchantName)
        displayMerchantName(merchantName);
    }
  }
}

Let’s stop for a moment and identify the smells. There are two of them: I call the first one Narrative Mismatch, and the second Baton Carriers.

The first smell has to do with those nested conditionals. There’s a narrative here, that flows from the beginning of the spec (”the input is named product“) to the end (”display that name”). The code that corresponds to this narrative starts with product=..., and ends displayMerchantName(...). However, the line of this narrative isn’t linear in this implementation; it’s a series of Russian dolls. The climax to the story is buried in a subplot, displayed as though it were some exceptional case. Narrative Mismatch is when the dramatic structure (the specification) of the program doesn’t match its control structure (in the implementation).

The other problem is the temporary variables. These are the variables such as offering and merchant. These variables are “bit players”: each variable enters the stage (the local scope) only to perform a trivial bit of business, and then hangs around as clutter. Or, to use a racing metaphor, this implementation turns what could be a sprint into a relay race, where a series of runners hands off the data baton.

Can we do better?

If the code did nothing but display the merchant name and then return, we could invert each of the conditionals, and bail out early. This changes the control structure to match the plot: the beginning and end of the narrative are both at the same (top) level of the control structure.

var product = ...;
if (!product) return;
var offering = product.offering;
if (!offering) return;
var merchant = offering.merchant;
if (!merchant) return;
var merchantName = merchant.name;
if (!merchantName) return;
displayMerchantName(merchantName);

This implementation retains two of the disadvantages from the first, and it introduces a third. First, that’s still an awful lot of control flow (four lines) obfuscating a tiny amount of data flow (four lines). Second, we’ve still got those baton variables, offering, merchant, etc. Finally, the new implementation works only if the fragment is an entire function: if displayMerchantName is the last statement in the function1.

How about if we eliminate the batons altogether? Then we can combine all the tests into a single conditional:

var product = ...;
if (product
    && product.offering
    && product.offering.merchant
    && product.offering.merchant.name)
  displayMerchantName(product.offering.merchant.name);

This implementation has fewer lines, at least, but there’s still a lot of repetition2 — the text product.offering occurs four times.

It’s also inefficient. The previous implementations computed the value of product.offering only once, but this one computes it four times, and contains a total of nine member accesses to the alternatives’ three. These multiple accesses may not matter for this particular example: the case where all nine of those accesses are executed is the same case that ends with a a function call to a display routine. But consider the case where the accessor is product.offering() instead; or, in Ruby or ECMAScript 4, the case where product.offering is a getter.

We can re-introduce the temporary variables, this time within the conditional, to keep the control flow simple:

var product = ...,
    offering, merchant, merchantName;
if (product
    && (offering = product.offering)
    && (merchant = offering.merchant)
    && (merchantName = merchant.name))
  displayMerchantName(merchantName);

…but now we’ve brought back the batons.

(You may also have a stylistic object to the assignments inside of conditionals in this latest attempt. I’m not completely against the construct, but I avoid it unless it makes the code clearer, and it’s not clear that it does that, here.)

Stepping Back

I’d like to be able to write something that matches the language of the specification:

var product = ...,
    merchantName = product.offering.merchant.merchantName;
if (merchantName)
  displayMerchantName(merchantName);

and have the system just know that product.offering evaluates to null if product is null; and that product.offering.merchant evaluates to null if product.offering is null, etc. Now, we might not like null.property to evaluate to null everywhere (that might move bug symptoms too far away from their defect sites), but we’d like it here.

This property — that member access to a property of null is itself null — could be called “null contagion”. Many languages have “float contagion”, where an arithmetic operation produces a float if either of its arguments is a float. (i.e. a+b is equivalent to float(a)+float(b) if either of a or b is a float.) “Null contagion” is similar to float contagion except that it only applies to the member access (.) operator, and only from left to right.

You may be used to null contagion from other contexts — for example, from the declarative languages such as CSS and XPath and SQL, that adjoin the mainstream languages. It’s just not present in any of the procedural languages that I listed at the top of this post. For example, if product were XML and we were using E4X, we could write this as product.offering.merchant.name as desired. Or if we could apply CSS to a JavaScript object, we might write something like $('offering > merchant > name', product). Or with XPath (and an Hpricot-like syntax, this time): product/'offering/merchant/name/text()'.

So one solution would be to write a function that applies CSS or XPath paths to JavaScript objects. This works and there’s even Java libraries that do this, but here I’m looking for something less heavyweight than a library, and something that doesn’t require parsing strings at runtime. I’m looking for an idiom that approximates null contagion within the language.

Without further ado, here’s how to write an expression that evaluates to object’s property property if object is not null, and to null otherwise3:

(object||{}).property

And this can be chained as follows:

((object||{}).property1||{}).property2

So let’s apply this, in two steps, to the problem above. First, get rid of all the intermediate conditionals, in order to match the control flow to the narrative:

var product = ...,
    offering = (product||{}).offering,
    merchant = (offering||{}).merchant,
    merchantName = (merchant||{}).name;
if (merchantName)
  displayMerchantName(merchantName);

And now that each intermediate variable is used only once, we can inline its value and eliminate its name:

var product = ...,
    merchantName = (((product||{}).offering||{}).merchant||{}).name;
if (merchantName)
  displayMerchantName(merchantName);

This isn’t as clean as the E4X example, but it’s relatively concise: no batons, and no non-narrative control flow.

It’s also, in the case where most products have offerings that have merchants that have names, potentially more performant. A || (which short circuits) is comparable to an if statement, and the number of conditionals of either form (|| or if) is conserved across all the implementations in this post. The “null contagion” version defines {} literals, but these are only interpreted to create objects when the property target is null — here, assumed to be the exceptional case. And since this attempt has fewer instances of variable name lookup than in the others, it should be more efficient than the others except when in the presence of more optimization that most of the scripting languages provide right now.

What about the case where most products don’t have offerings, or the offerings don’t have merchants, vel cetera? If this were a frequent case, and this were inner-loop code, I might introduce a global empty object in order to reduce the object instantiation overhead (which happens right away) and the GC load (which shows up over time). E.g. define var $N={}, and use:

var product = ...,
    merchantName = (((product||$N).offering||$N).merchant||$N).name;
if (merchantName)
  displayMerchantName(merchantName);

(The Ruby equivalent of {} is {} for a Hash, with array-style access; and OpenStruct.new for an Object, with property getter syntax. You definitely want a reusable singleton for the latter.)

On the minus side, even this version always performs all four tests and all three property accesses. That’s three extra tests (beyond the attempts at the top of this post) when product is null, and two extra when product.offering is null, and so on. So it might still slower be than the solutions that don’t use null contagion — but on the other hand, it still avoids the temporary variables of those solutions. In other words, there’s no substitute for actually measuring the difference, within the program and on each platform that you’re optimizing for. Oh well.

What Does this Have to Do With Monads?

An expression such as product.offering.merchant.name is a pipe. It can be read as “get the value of product; and then get that value’s offering; and then get that value’s merchant; and then get that value’s name“. The dot says two things: (1) evaluate the expression (product) to its left, and then (2) use that as target for the projection function (['offering']) immediately to its right.

We’ve made up a new construct, (object||{}).property, which is like object.property except that if you put a null in (as the value of object), you get null out (as the value of (object||{}).property). In effect, we’ve replaced dot’s interpretation of “and then”. The interpretation of object.property’s “and then” includes “and if object is null, then error”. The interpretation of the new “and then” adds “and if object is null, evaluate to null”.

This construct, (object||{}).property, isn’t a monad. It isn’t a monad because it isn’t associative; and it isn’t associative because the property accessor (dot) isn’t either [dot isn't the morphism of a category]. (A.b).c isn’t the same as A.(b.c) — in fact, the latter isn’t even well formed.

However, the class of property projection functions — just the .b part — does form a category. If you consider just the .b.c.d part of A.b.c.d, it doesn’t make any difference whether you read it as “apply the ‘b’ projection, and then apply the application of the ‘d’ projection to the ‘c’ projection, to that”; or “apply the ‘c’ projection to the ‘b’ projection, and then apply the ‘d’ projection to that”.

You won’t find “null contagion” proposed as an implementation technique on the web. What you’ll find is the Maybe Monad.


1 It would be sad if this were an entire function. That would indicate that the function that contains displayMerchantName is just a wrapper for displayMerchantName, to protect it from being called when there isn’t a Product with an Offering with a Merchant. Having to define a new function to wrap each instance of a common pattern is like having to define a new word to name the reference of each noun phrase. It’s reminiscent of the boilerplate getters and setters in logorrheic enterprise code.

2 “Repetition” is defined as “a host site for a defect infection”.

3 JavaScript has more than one null object. This code takes advantage of the fact that an Object (even one with no properties) is never null, while undefined — the value of an undefined property access — is null.

Overloading Semicolon, or, monads from 10,000 Feet 11

Posted by Oliver on December 02, 2007

amichail on reddit asks about understanding monads in one minute. My thoughts ran longer than a comment and more than a minute, so I’ve placed them here.

The main message of this posting is that you already use monads, just without the labels. The complexity in most explanations comes from factoring out the different pieces of what you already know, and from the mathematical exposition in terms of category theory and monad laws. (I like the math, but you won’t find any of it here.) This posting trades away accuracy for ease; I hope it’s a helpful start.


A monad, as used in Haskell, is a rule that defines how to get from one statement in a program to the next. For example, there’s an implicit monad in the (JavaScript) sequence var x = 1; var y = x+1. It describes how to get from var x = 1 to var y = x+1.

A monad describes the flow of control, and the flow of data. Typically, the flow of control is something like “execute the first statement once, and then execute the next statement once”, and the flow of data is something like “the first statement computes a value, and makes it available to the next statement” (through a variable binding, say).

Sometimes the rules are more complicated. For example, if one statement is throw or raise, the statements that follow it are skipped. If one statement sets a global variable, the statements that follow it can get additional data by reading that variable. And if one statement changes the world outside the program (say, that statement creates a file), this will affect what happens when the following statements peek at the world, maybe even in other ways (say, by reading the free space on that volume).

Also, “following statements” is a dynamic notion, not a static one. If function f calls g, then all the statements in f that follow f’s call to g, follow all the statements in g (at least, up until g’s return).

You’ve used monads. You’ve used the State monad (which manages global variables), the Error monad (which enables exceptions), and the IO monad (which handles interactions with the file system, and other resources outside the program). You may not have thought much about these properties, because they come “for free”: in most languages, you don’t need to do anything special to get them.

In Haskell, you do need to do something special. All you get by default is the “typical” case from above: one statement computes a value; the next statement can read it. If you want additional behavior (State, or Error, or IO), you have to say so. You can say so by declaring the type of your statement block. Just like every variable in a statically-typed language such as C or Java has a compile-time type (int if its values are integers, or String if its values are strings), every statement in Haskell has a compile-time type too: Error Int if it might raise an exception; IO Int it it might interact with the world.

Why introduce this complexity into statement sequencing? The complexity comes with two benefits: a check on dynamism, and the ability to extend it.

The first benefit is that, if the only statements that might change the world have IO in their type, you can assert to the compiler that some compound statement or function not only doesn’t change the world within its body, but doesn’t call any functions that contain statements that do. And so on for other properties (”raises an exception”, “accesses global data”). This turns out to be useful.

The other benefit is that you can define your own sequencing rule. Remember that part about “execute the first statement once, and then execute the next statement”, and “the first statement computes a value, which the next statement can use”? You can replace these with “execute the first statement, but only execute the next statement if the value so far isn’t null”. This is the Maybe monad; it turns out to be useful too.

Or, you could replace the rule with “the first statement computes a list of values, and the second statement runs once using each of them“. This is the List monad; it’s — yes, you’re ahead of me here — it’s useful too.

I made an analogy between statements and variables, above. Java and C++ have typed variables, while Haskell adds typed statements. Let’s extend that analogy. Operators, such as plus and times, combine values. Some languages let you overload operators: Integer+Integer does one thing; you can define String+String to do another (hopefully string concatenation), and Vector+Vector to do a third (hopefully vector addition).

You can think of the semicolon as an operator that combines two statements. A definition for the semicolon operator is a monad: it defines the meaning of a compound statement composed of two simpler ones. Haskell lets you overload semicolon.

I hope this helps. If it did, now go read a monad tutorial and see how it works for real. If didn’t, go read a monad tutorial to see if it’s easier with actual examples, and the details and syntax filled in.

Some Lies

Here’s some of what I said above, that just isn’t true:

A monad isn’t just a rule. It’s a structure that includes a rule. Its data are a set (or type) of statements, and an operation that combines them — the “rule” above. (There’s more accurate and technical definitions, but that gets into the heavy math.)

Furthermore, the rule part of the monad isn’t just any rule. It has to have certain properties: the monad laws. However, you can perfectly well understand how to use a monad without being able to enumerate the monad laws1, just like you can use numbers without being able to enumerate the properties of a field or ring.

The descriptions of the State and IO monads above are particularly oversimplified. I think they’re useful for coming from procedural programming, but you’ll want to refine them in order to get to functional programming. Again, follow any monad tutorial.

In Haskell, you don’t even get statements by default. (I said above that you did.) Everything is an expression. As soon as you start using statements (which are just monad-valued expressions), you’re in monad land (even if it’s just the identity monad). The step from expressions to statements is the big one, and the differences between different monads aren’t as big a (syntactic) deal.

In Java, you do have to declare the types of some collections of statements. The collections are functions, and the type that you have to declare is the type of a statement that throws a checked exception. This fact about Java is widely despised, but it does give you a taste of Haskell.

Finally, thinking about monads as defining (or overloading) semicolon is a start, but sequencing is more general than that. Even in a language where every two lexical statements are separated by a semicolon, one statement can dynamically follow another without any particular relation between their sources, such as when one is in a function, that is called by the other.

1 The monad laws just say that (1) there’s a way to turn any expression (that computes a value) into a monad (that passes that value on); and (2) a sequence of statements acts the way you think it should.

One-Line JavaScript Memoization 19

Posted by Oliver on April 16, 2006

Computing the length of a Bezier curve is expensive, but the length of a given Bezier doesn’t change over time. In my JavaScript Bezier implementation, I wanted to compute the length only the first time it’s need, and save this result in order to return instantly thereafter.

This is a special case of memoization. There are well-known strategies for implementing memoization. But getLength is a nullary function, and there’s a trick for implementing memoization of nullary methods in a dynamic language such as JavaScript (or Python or Ruby). In these languages, you can memoize a nullary method by adding one line to it, without any support libraries. This line replaces the method by a constant function, that returns the computed value. This memoization strategy is also more efficient than the general-purpose solution that non-nullary methods require.

Memoizing by Hand

The naive way to implement memoization is to store the value in a property the first time it’s computed, and test for the presence of the property thereafter:

Bezier.prototype.getLength = function() {
  var length = this._length;
  if (length === undefined) {
    var length = ... // expensive computation
    this._length = length;
  }
  return length;
}

or:

Bezier.prototype.getLength = function() {
  if ('_length' in this) return this._length;
  var length = ... // expensive computation
  this._length = length;
  return length;
}

There are some problems with these solutions. First, they’re verbose. Worse, the domain logic (the line labelled “expensive computation”) and the memoization logic (the code that accesses length and this._length) are tangled up with each other.

These implementations also require a private property for each memoized method. And then there’s a (minor) performance hit too: it takes a few instructions, each time, just to discover that the return value has already been computed.

The Big Gun

The standard solution to the first two of the problems above (the length and structure of the implementation) is to write a higher-order-function, that creates a memoizing function from a non-memoizing one.

Function.prototype.memoize = function () {
  var fn = this;
  var cacheName = ... // create a unique property name
  return function() {
    var key = serialize(arguments);
    var cache = this[cacheName] || this[cacheName] = {};
    return key in cache ? cache[key] :
      cache[key] = fn.apply(this, arguments);
  }
}

(Keith Gaughan has a complete implementation here.) Then you can use it thus:

Bezier.prototype.getLength = function() {
  var length = ... // expensive compuation
  return length;
}.memoize();

This is the best general-purpose solution, and it’s good for a framework, or for a complete application. But in a small standalone code library such as this or this, I don’t want to include a copy of Function.prototype.memoize in each standalone file (and then worry about version skew); and I don’t want to make each file depend on a utility file (and then worry about file dependencies).

The One-Liner

A nullary function such as getLength is a special case of functions in general, and there’s a particular technique1 for memoizing it, that fits on a single line.

Here’s the first variant.  This is less source code that the version above, and uses fewer instructions to retrieve a memoized value. (In fact, it uses the fewest possible number of instructions, if the API for retrieving a value requires a function call at all.) It’s an extra line (the assignment to this.getLength) that you can stick in a nullary method, that memoizes the method’s value on a per-instance basis.

Bezier.prototype.getLength = function() {
  var length = ... // expensive computation
  this.getLength = function(){return length};
  return length;
}

The second variant moves the return and the assignment to the same line. Think of this line as a “memoizing return statement”. It comes at the cost of more opaque code, and an extra function call. (The overhead of a function call is probably not a problem if the computation is expensive enough to be worth memoizing anyway.) And although it’s not a style I would use for general-purpose programming, I find it acceptable in an idiom2.

Bezier.prototype.getLength = function() {
  var length = ... // expensive computation
  return (this.getLength = function(){return length})();
}

Another Use

You can use a similar technique to prevent a function from being called twice.à This is even simpler, since you donââ¬â¢t need to keep track of the value.à For example (from gradients.js):

OSGradients.initialize = {
  OSGradients.initialize = function() {};
  ... // initialization
}

Avoiding Memory Leaks

The inner function captures the variables from the outer function. In the method below, temp will never be deallocated, until the instance that getLength has been applied to (and that the constant version of getLength is therefore now attached to) has been deallocated.

Bezier.prototype.getLength = function() {
  var temp = new Array(10000);
  var length = ... // expensive computation
  // the following line captures length and temp:
  return (this.getLength = function(){return length})();
}

If there are large temporaries, or pointers to DOM elements that may be deleted later, you should either detach them before you exit the outer function body:

Bezier.prototype.getLength = function() {
  var temp = new Array(10000);
  var length = ... // expensive computation
  temp = null; // so the value won't be captured
  // the following line captures length and work:
  return (this.getLength = function(){return length})();
}

or use a helper function that closes over just the return value:

// Function.K(value) is a constant-valued function that returns
// value.
Function.K = function(value) {return function(){return value}};
Bezier.prototype.getLength = function() {
  var temp = ...
  var length = ...
  // the following line only captures length:
  return (this.getLength = Function.K(length))();
}

Thunks for the Memories

Although I didn’t realize it until later (when I was writing my Bezier library, coming at this more from the “how can I cache this value” perspective than from a “theory of memoization and programming language implementation” perspective), this is nothing more than an implementation of thunks for JavaScript.

In a lazy (or call-by-need) language such as Haskell, memoization comes for free. A variable with an initializer has the same syntax and semantics as a function; the initializer is evaluated once, the first time the variable is referenced, and then cached.

A referentially transparent nullary function has the syntax of a function, but the semantics of a value. (In Haskell, the syntax is the same too.) By memoizing it, we’re making this value call-by-need. One technique for implementing call-by-need (or “lazy”) values is to create a thunk that replaces its computation by its value the first time it’s evaluated. In JavaScript, we can implement these semantics but have to stick with the function-call syntax.

Beyond Thunks

There’s three directions to go from here, but all of them involve giving up the single-line solution.

First, it’s a bother that a method has to know its own name. This means you can’t paste the same line of code into each memoized function. This is the only way, however, that the method can replace itself with the constant function3. You can eliminate the need to type the name twice, if give up both the no-helpers requirement and the optimization that a memoized function doesn’t test any conditions the second time it is called. (The higher-order function in the section “The Big Gun” does this.)

Second, you can memoize non-nullary functions by getting out the big gun too. In this case the replace-yourself optimization is useless too, since each invocation needs to check whether the value has already been computed for these arguments, regardless of how many times the function has been invoked before.

A third extension lets you hang onto the optimization, but takes more than a line of code (and therefore, realistically, depends on a support function). It’s for the case where the return value depends upon the object state, and that state is mutable. In this case, you need a way to reset the memoized function to its initial state, so that it will perform the computation the next time it’s called. For example, setRadians below resets getDegrees so that the multiplication happens again upon the first call getDegrees after each time setRadians is called. (This example isn’t a realistic candidate for memoization, but pretend multiplication is really really really slow.)

function Angle(radians) {this.setRadians(radians)}
Angle.prototype.setRadians = function(radians) {
  this.radians = radians;
  this.getDegrees.reset();
};
Angle.prototype.getDegrees = function() {
  return this.radians * 180 / Math.PI;
}
memoizeConstantMethod(Angle.prototype, 'getDegrees');

Here’s a long but marginally readable implementation of memoizeConstantMethod:

function memoizeConstantMethod(object, property) {
  var f = object[property];
  var mf = function() {
    var value = f.call(this);
    var kf = function(){return value};
    kf.reset = reset;
    object[property] = kf;
    return value;
  }
  var reset = function() {
    object[property] = mf;
  }
  mf.reset = reset;
  reset();
}

And here’s a shorter version, that sacrifices the remaining readability for source code size4.

function memoizeConstantMethod(o, p) {
  var f = o[p], mf, value;
  var s = function(v) {return o[p]=v||mf};
  ((mf = function() {
    (s(function(){return value})).reset = mf.reset;
    return value = f.call(this);
  }).reset = s)();
}

Update: Peter Michaux has a nicely written description of a similar technique here.


1 Memoization in general requires a finite map (or “hash”) from argument lists to result values. In the special case where the function is nullary, you don’t need a map, because the empty list is the only key.

2 Just as I don’t need to understand what the tabs are in an English idiom such as “keep tabs on”, I don’t need to easily read a programming language idiom compositionally each time I see it.

3 You can’t search the object and its prototype chain for a function that’s === to the one being replaced, because the same function might be present at different property names.

4 This is written in might be called the “_why combinator” style of programming.

“Stretch” Languages, or, 28 years of programming 11

Posted by Oliver on February 05, 2006

Recently I reviewed the programming languages I’ve used over the 28 years1 of my programming career. The result is shown in the chart below. (Click on the image to see it full size.)

There are some obvious trends here2. The languages are mostly getting higher level. There are a few “survivors”: languages that I’ve used over the the course of a decade, although discontiguously: C/C++, Common Lisp, and Java. Java has replaced C (except for a stint around 2000 where I went back to low-level graphics programming at Alphamask), and the scripting languages have taken over from Common Lisp — they’re slower, but they’re terse, have better libraries, and are easier to deploy.

It didn’t surprise me that there are so many languages. For one thing, I like languages, and I enjoy learning how to program by learning new ones. For another thing, times changes and programming languages get better. You’d have to be crazy to program in 1970’s BASIC these days. (That’s BASIC, not Visual Basic. The language I cut my teeth on didn’t have functions or structures.) And there just isn’t much call for Z80 assembly any more.

But why so many languages at one time?

The reason is that different languages are good at different things. When I first learned to program, I used BASIC for everything. Then I started using assembly for code that had to be fast, but I stuck with BASIC for doing my homework. (In tenth grade I wrote programs to multiply and invert matrices, so I wouldn’t be bored silly doing it by hand. It took longer, but I had more fun.)

This is what the second chart shows. Again, click on the thumbnail to see the full size version. And beware! — the colors don’t mean the same thing as in the chart above. (Sorry.)

I use four kinds of languages: utility languages, for writing tools and solving problems; application languages, for writing distributable integrated applications; system programming languages, for implementing operating systems and low-level libraries; and a fourth category of languages, which I’ll call “stretch languages”, for learning how to program. (I also dabble with bash for shell programming and R for statistics, but I don’t really think of the way I use them as programming.)

A utility language is useful for solving small problems and for exploring problem domains. It’s what you write your ten-minute program in. If you can use your utility language as your file manipulation language and your build language, all the better. (For a while I did a lot of programming in C and then Java, and I tried to use make and ant for these. Now that I’m using Ruby as my utility language, I’m a lot happier using Rake instead.)

An application language is for building larger, more full-feature programs that other people can use. (This is my own use of the term. Think “desktop application”, or “web application”.) An application language often has requirements for speed, OS integration, and deployability, that a utility language does not.

(”Application” for me used to mean desktop applications. Recently I’ve only written web applications — although I did write a MacOS app a few months ago, in order to learn about the MacOS toolbox. That’s why the “Desktop Application” category in the second chart trails off, to be replaced by languages for writing web applications, and browser-based clients that connect to them.)

Lastly, the point of a stretch language is to teach me new ways to think about programming. This is the opposite of a utility language, although a stretch language can turn into a utility language after I absorb its concepts. A utility language makes it easy to express programs that I have in mind; it gets out of the way, so I only have to think about the problem domain, and not how to program. A stretch language can make it difficult to do even simple things, because I’m still learning the concepts that are necessary to use it idiomatically or, sometimes, to use it at all.

Sometimes a stretch language becomes a utility language or an application language. This was the case with Common Lisp, which took me a while to understand, but for a while was my most productive language.

More often, the concepts that I learn from a stretch language are helpful in using other languages, even if I have to express these concepts by writing the equivalent of macro-expanded code or a VM. For example, I’ve written a continuation-based pattern matcher in Common Lisp and a logic program in Java, even though neither language supports those respective features. Designing the programs as though the features existed made a hard problem tractable.

And finally, learning concepts from one language makes it easier for me to recognize them and start using them when they’re available in a more mainstream language or one with better deployment characteristics. Those of us who were familiar with advise and the CLOS MOP had a leg up on understanding how to use AOP when it came around to Java. The modern round of scripting languages (Python 2.x and Ruby) have MOPs — if you’re familiar with them from Smalltalk, you know how to take advantage of them in Python and Ruby too. And if I ever have to use C++ again, at least I’ll know from Haskell to look for template implementations of combinator libraries.

Here are the languages that I’ve learned from over the years, and what I’ve learned from each one3. Some of the languages that aren’t in my list would be perfectly respectable stretch languages for someone else — I just happened to be familiar with their concepts by the time I got to them. For example, Python is probably a fine way to learn about OO (and my son learned a lot of OO concepts from Logo Microworlds), and I could have learned metaobject programming from Ruby if I hadn’t already learned it from Smalltalk and Common Lisp.

  • BASIC: basic programming
  • Z80: ADTs; recursion
  • Prolog: logic variables; abstracting over control flow
  • CLU: exception handling; coroutines; parameterized data types
  • Smalltalk: OOP
  • Common Lisp: functional objects; meta-object programming
  • Java: concurrency
  • Eiffel: design by contract
  • Haskell: lazy evaluation; type classes
  • LZX (OpenLaszlo): constraint-based programming

Right now my stretch language is Haskell. When I write in Haskell, I feel like a beginning programmer again. I can’t use it when I’m in a hurry, and I don’t use it when I’m more interested in the problem domain than in the craft of programming. Reading Haskell is like reading poetry (I’m very slow at it, and I can’t skim because the new-concept density is very high), and writing Haskell is like writing poetry (something else that takes me forever). As opposed to Python and Ruby, which are more prosaic, and Enterprise Java, which is more like a tax form.


1 I first learned program in seventh grade, from the TRS-80 Level I Basic reference. That makes me both (1) almost 30, and (2) innumerate.

2 In this entire essay, I’m quantifying over the languages I use and the way I use them. YMMV.

3 This list includes more languages than the charts do because it includes languages that I’ve never used commercially. I’ve only ever used Prolog and CLU for college projects, for example, but I learned a lot from them anyway.

Grief 2

Posted by Oliver on December 09, 2005

One warm Monday morning last August my father died. The previous Wednesday he had been planning to see March of the Penguins, a movie he probably would have discussed with his grandchildren over the phone and video chat. Instead, that night he was taken to the hospital, after falling down his apartment stairs. Early Monday I leaned way over him in the ICU and held him as tightly as I could, and felt on my cheek his last, familiar, breath.

I know it’s callous, but when I hear about a man in his eighties dying, I picture someone whose life is done. It doesn’t evoke in me the automatic sorrow, the rage against mortality, that comes from an encounter with the death of a twenty year old, or a teen, or a child. I’m less than half of eighty now, and yet I’m older than most people have lived to for most of time: older than the life expectancies of many countries; older than my friends when we were young and promising; older than Mozart, older than Keats. Despite the extended American adolescence, by the time a man is thirty he’s had time to make his mark. Anything after that is bonus time.

But it’s one thing to read “eighty” on the obituary page. Reality is stepping into the place of someone who just stepped out from it, and looking around, and understanding where he’d stood.

For a few days after his death, I was my father. I lived in his house, I slept in his bed. I sat at his desk and used his phone to call his old friends, the ones that I had known as a child. (They aren’t any older now. Seventy to a forty-year-old looks the same way forty did to a boy of ten.) I learned a little bit about the strands of his life, after the fact, in the process of raveling them.

Even at eighty, my father led a more active life than I do now. There were letters on his desk from students, writers, colleagues. One journalist was writing a book about him. Another was waiting to talk to him for a book of interviews about George Plimpton, with whom he co-founded the Paris Review. He had been planning to co-teach a writing course at UNC-CH again this fall; he had been looking forward to teaching again at Bennington. His computer held some words towards an unfinished book.

It wasn’t an old life that I’d stepped into. It didn’t feel like a life that had been winding down. It’s funny to say it about a man in his eighties, but aside from the inconveniences of his eight-year-body, he had been in his prime. I didn’t just miss him as a father and grandfather then. I wanted to see what he’d do next with his life, what he’d write, and who he’d teach. He was a storyteller and a teacher; there are more stories and students, that only he could have taught and told.

Writers and teachers have friends who are writers too. Having a writer and teacher as a father means that his friends can express what I want the eloquence to say. Max’s lifelong friend Daphne Athas wrote me after he died (and I quote without permission, and hope that she will forgive me):

When Bland [Simpson] called the sun had just set, I had just arrived at the Pension room that I stay in, the phone rang, I sat on my bed and looked out at Sphakteria, the island across Pylos Bay from my window where the Athenians beat the Spartans in the 7th book of Thucydides. So do we lose our fathers. But it was a triumph for the Athenians, and Max would certainly appreciate that.

So do we lose our dads.

Three Lefts Make a Right: The Type Declaration Paradox 2

Posted by Oliver on January 03, 2005

A few days ago I argued that even though type declarations aren’t the best possible solution for any particular problem, they can be the right solution for solving several problems at once. I baffled even smart people. If I had longer I’d write a clarification. As it is, I’ll just give an example.

Type Declarations as Documentation

Let’s say that I’m writing a function f() that takes two arguments:

function gcd(a, b) {...}

I know as I’m writing the function what the range of parameter values is intended to be. They’re both non-negative integers. I could do at least four things to document this: add a comment, use descriptive variable names, add assertions, or insert type declarations.

To make this interesting, let’s assume this is in a language that doesn’t have a “non-negative integer” type and doesn’t have a mechanism to create restrictive types [1].

Comment encoding:

// a and b are non-negative integers
function gcd(a, b) {...}

Structured comment:

// @param a : Integer
// @param b : Integer
function gcd(a, b) {...}

Variable name encodings:

// Smalltalk style (loses information):
function gcd(anInteger, anotherInteger) {...}
// Extended Smalltalk:
function gcd(aNonNegativeInteger, anotherNonNegativeInteger) {...}
// Hungarian notation:
function gcd(nA, nB) {...}

Assertions:

function gcd(a, b) {
  assert a isinstance Integer && a >= 0
  assert b isinstance Integer && b >= 0
  ...
}

Type declarations:

function gcd(a: Integer, b: Integer) {...}

Note that the type declaration, from the perspective of how much information it captures, is the worst of these: it loses the information that a@ and @b are non-negative. (It’s tied with the short form of the Smalltalk convention here.)

This is because the type declaration is only being used for one purpose: documentation. And comments are better at documentation.

Let’s summarize the benefits of these solutions numerically. I’m pulling the numbers from a hat (with some attempt to make the rank match my perception); what’s important is that, according to this metric, type declarations aren’t the best solution for documentation. More verbose mechanisms, and mechanisms that violate DRY, get docked in the cost column.

mechanismcostbenefit
comment14
name32
assertion24
type declaration13

Type Declarations as Metadata

Now add another requirement: this function should be callable from a statically typed language, e.g. C. (You can substitute your own requirement: compile-time checking, database bridging, etc.) There are a couple of ways to do this too:

Metadata:

@cexport(bool gcd(int, int))
function gcd(a, b) {...}

Type declarations:

function gcd(a: Integer, b: Integer) : Boolean {...}

Again, the type declaration is more concise, but it’s not as general a solution to the problem of exposing a function in one language to code in another. The metadata solution allows me to choose an export name for the function, and perhaps do a better job of selecting equivalent types than a library or tool driven solely by the types in this language might perform. It does so at a significantly greater cost: in verbosity, and in the duplication of structure and names which now have to be maintained in parallel. Numerically:

mechanismcostbenefit
metadata24
type declaration12

Mow, for any particular use of metadata, the cost-benefit ratio may be different. For example, a tool driven by the metadata may not do any better a job than one driven by the type declarations, in which case the cost-benefit ratio may look more like this:

mechanismcostbenefit
metadata22
type declaration12

Adding Apples and Oranges

So far type declarations are 0-for-2. They’re worse than comments at documentation, and they’re worse than domain-specific metadata at bridging languages. But type declarations are stronger when they take on both challengers at once.

Let’s consider a function definition that satisfies both documentation and metadata requirements. I’ll consider two solutions: (1) a “swat team” solution composed of the best documentation solution (comments) together with the best metadata solution (a domain-specific annotation); and (2) type declarations:

Swat team:

// a and b are non-negative integers
@cexport(bool myGCD(int, int))
function gcd(a, b) {...}

Type declarations:

function gcd(a: Integer, b: Integer) : Boolean {...}

Numerically:

mechanismcostdocumentation benefittooling benefit
comment14
metadata22
comment+metadata342
type declaration132

Again, don’t pay two much attention to the actual numbers; the point to note is that even though a comment is better at documentation, and metadata is better (or in this case as good) at tooling, the comment+benefit solution has a worse cost-benefit ratio (3:6=1:2) than the type declaration solution (1:5). This is because the cost for type declarations stays constant while, the more requirements type declarations meet (even if poorly), the more the benefits rise. This as as opposed to the swat team approach, where the cost rises with the number of requirements (and therefore solutions).

Setting aside the math, it’s obvious at a glance that the swat team example is longer, and more complex, and therefore should be expected to have more overhead. A simple change, such as adding a parameter to a function, becomes a lot more expensive, because you’ve got to make each change so many places.

Notes

1 In fact, this language could even be Java. Consider a gcd that worked on BigDecimals. There isn’t a “non-negative BigDecimal” type, so a type declaration can’t encode the domain of the function.

The Type Declaration Compromise 11

Posted by Oliver on December 31, 2004

A vice grip is the wrong tool for every job. — anonymous

Type declarations aren’t the best tool for any particular purpose, but they’re a passable tool for a lot of different purposes, and therefore they’re often the best tool for meeting several purposes at once. There are better ways to comment a program, to add metadata for tools and libraries, or to verify program correctness; but type declarations, in many languages, are a passable way to do all of these jobs at once.

The situation is similar to that of a convergence device such as a camera phone. The voice quality on my camera phone isn’t that great (the speaker shares a tiny clamshell lid with the camera optics and CCD), and the camera doesn’t take great pictures (the lens is smaller than a halfpenny), but I’d rather carry one substandard device than two separate best-of-breeds — even if I have to go get my “real” camera when it’s worth taking a good photograph. You can make the same argument for a PDA phone versus a laptop. And I believe the same holds for type declarations, versus all the mechanisms that, considered separately, are superior to them.

This is why my ideal language has optional type declarations (unlike Java, in which they’re mandatory, and Python, in which they’re absent). I’d like to have the syntax available, for when I want to use a compromise that meets several of these requirements at once, but optional, so I can instead use features that are better tailored for each individual requirement, without redundancy.

Some of the uses for type declarations are:

  • Documentation. The type of a variable is more informative than just its name. For example, f(a: String, b: Integer): Boolean is a better starting point for understanding the intent of f@ than @f(a, b) alone.
  • Code Generation. An IDE can use the declared types to drive context-sensitive help such as docstring tooltips (”intellisense”). Foreign function generators such as jnih and SWIG can use the type declarations to create function interfaces that stub or bridge to other languages or libraries.
  • Runtime Introspection. Foreign function libraries can introspect variable types and function declarations to make foreign function calls at runtime. Database bridges can introspect class definitions to generate CREATE TABLE statements, and to bridge RDBM tables and instance hierarchies.
  • Compiler Optimization. A compiler can use type declarations to choose efficient unboxed representations, or (in a dynamic language) to elide runtime type checks.
  • Program verification. Type mismatches are detected closer to the source location of the error (when a String is passed to a function that expects a Float), instead of further down the call chain (at the point where a primitive operation such as @/@ is applied to a String). With compile-time type checking, type mismatches are detected even in the absence of a test case or code coverage.

There are better solutions for each of these requirements, when the requirement is taken on its own:

  • Documentation: Comments are generally superior to type declarations, because natural languages are more expressive and extensible than type systems. A simple type system can’t say very much, and a complex type system can be hard to read.
    For example, I want to know (and declare) that table is a Set of String, not just that it’s a Set. If my language (say, Java 1.4) can’t express this, I’m going to write table: Set<String> in a comment anyway. In this case, why write table: Set as well?
    Another example holds in languages without type synonyms (such as C’s typedef), such as Java up through Java 5. I generally want to know an object’s abstract type (Centimeters), not its concrete type (float or double).
  • Code Generation and Runtime Introspection: An extensible metadata system is a more general solution, and will work in cases where type declarations fall short. Again, the type system for one language doesn’t generally describe the information that is necessary to convert types into another language. Most languages have type systems that can’t distinguish between required and optional values (a String, versus either a String or null). This means, for instance, that you can’t infer a RDBM type from a declared type, because you don’t know if the value is nullable.
  • Compiler optimization: Type inference and optional type declarations are often better solutions than mandatory explicit type declarations, where this feature is needed at all. The weak point in many programs these days is the feature set, quality, or development time, not the execution space and speed. Even when optimization matters, explicit type declarations are often both too much and too little:
    Too much, because they often aren’t necessary: even a C++ or Java compiler infers the types of expressions, and it’s some arbitrary to turn this inference off at the function level. Sure, type declarations are useful at compilation unit boundaries to avoid whole-program analysis, but declaring the type of every variable and private function is a throwback to the 90s 80s 70s 60s.
    Too little, because program analysis is necessary anyway in order to infer much of what an optimizing compiler or modern memory manager needs to know: flow, lifetime, extent, aliasing, mutability.
  • Program verification: Unit tests, assertions, and contracts are better suited for this purpose alone. I rarely see code without a test case that actually works; the clean compile (even with type declarations) just means the easy part is done. Usually even the value-related errors are division by zero or invalid use of null, not something the type system would have caught; and the types of bugs that I see once I get to a clean compile are similar in Java (with mandatory type declarations) and Python (without type declarations at all).

Given the inadequacies of type declarations for each of their intended purposes, then, why use them at all?

Compromise

One reason is that a type declaration doesn’t just a poor job of one of these things; it does a poor job of each of them, and that means it does a good job of doing them all at once. If adding “: Integer” to a program produces an incremental improvement in both documentation and IDE support, and SQL and foreign function bindings that work some of the time, well, maybe that’s worth it, even if the documentation value alone wouldn’t have been justified doing this in addition to (or instead of) writing a comment.

I argued above that it wasn’t worth writing table: Set, since Set doesn’t say much of what I know about the value of table — but if writing “table: Set” also enables a compiler optimization and some early error detection, then maybe I’ll write table: Set instead of /* table: Set<String> */, in order to reduce the (partial) duplication in my source code that would be present if I wrote the ideal comment and the ideal (within the constraints of the type system) type declaration separately. Eliminating the comment in this case is arguably an application of DRY.

Removing the extra information in the comment hurts the quality of my documentation, by using a lowest-common-denominator (the type system) instead of a tool (natural language) that is tailored for the job. The fact that the type system is a compromise among several uses means that my use of it is a compromise too.

Concision

There are two other advantages of using type declarations. One is that type declarations are typically concise. The other is that they are integrated into the grammar. (The type systems in common use are also declarative and decidable, but once a type system is made powerful enough to express real-world program constraints, it becomes Turing complete.)

Common Lisp has a syntax for extensible declarations, that can be used for type declarations as well as other purposes. Fortunately, it’s optional. Unfortunately, it’s so verbose that I only use it as a last resort — only when I need the program to run faster, not just when I want to record my intent.

Many of the other alternatives to type declarations are similarly verbose. Preconditions (part of Design by Contract) are more expressive than type declarations, and can function as a superset of type declarations too, but saying something simple like “this is an integer” takes more typing as a precondition. As with comments, a type declaration does a little of what preconditions do, and in the cases where either type declaration or a precondition will work, the type declaration is more concise.

(Note that this is a property of the languages that have type declarations, not anything inherent about type declarations and preconditions themselves. Still, that’s what we have to work with today.)

Another example is unit tests. Above, I blithely stated that unit testing was superior to type declarations — and, if you’re able to easily write and run them, I believe this is true. The verbosity of a unit test can be a significant impediment to doing this, though. I’ve noticed that in a language (or library) with integrated unit tests I can use unit testing to find all the errors that type declarations would find (and more). But in a language where unit tests are verbose, I don’t write nearly so many. Regardless of the inherent merits of type declarations and unit test testing, the type declaration you write is more valuable than the unit test you don’t.

Granularity

The reason type declarations are concise is that they’re conventionally integrated into the host language’s grammar. This leads to another advantage: in principle, a type declaration can apply to any term in the grammar — like a comment, but unlike, say, the latest breed of metadata annotation mechanisms. (C# attributes, Java metadata annotations, and Python decorators are all constrained to the class and member level; they can’t be used to annotate a variable or expression.) In practice, most languages attach type declarations to variables instead of values (Common Lisp and Haskell are two exceptions), but this still gives you a finer annotation grain size than you get with metadata. This means, for instance, that a metadata approach to annotating Java or C# method parameters would have to include a new syntax for annotating the parameters of a function, inside an annotation that was attached to the function itself — and this would be more verbose, more complex, and require the maintenance of two duplicate structures (the annotation, and the signature itself). And this means that a problem (such as annotating the SQL types of a function’s parameters) that might better be solved conceptually by metadata, will often have a more widely applicable (to documentation and other requirements) and a more concise solution, and therefore a better practical solution, that uses type declarations instead.

Credits

Some insightful comments on the equally insightful Patrick Logan’s posting The ROI of Optional Type Annotations got me thinking about this.

Addendum

I’ve added a follow-up here.