Have you ever written a function that looks like this?
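function requestProductDetails(id, k) {
  var value = gProductDetailsCache[id];
  if (value)
    k(value);  // cached: the callback runs immediately
  else
    ajax.get('/product/' + id, function(data) {
      gProductDetailsCache[id] = data;
      k(data);  // uncached: the callback runs after the response arrives
    });
}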
requestProductDetails calls its callback with the product details, which are stored in a cache. Since it might need to request this information from the server, it has to “return” it by passing it to a callback; in order to present a uniform API whether or not the product is cached, it “returns” the data this way whether it came from the cache or not.
requestProductDetails is intended to be used this way:
requestProductDetails(id, function(details) {
  infoPanel.setDetails(id, details);
});
infoPanel.setName(id, gProductNames[id]);
(I gave infoPanel a somewhat silly API in order to demonstrate a point. The general pattern is that there’s some computation in the callback, and some other computation after the call.)
There’s a subtle problem in this code, which is that two different code paths run through it. In the cached case, infoPanel.setDetails is called before infoPanel.setName. In the uncached case (the first time through), it’s the other way around. If there’s a bug that causes setDetails to work only after setName has been called, you may well miss it during casual testing, because it will only trigger the second time you trigger the code — and once it does trigger, it will appear intermittently (especially if you have a more sophisticated cache), and be darned difficult to find.
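To make the two code paths concrete, here is a minimal sketch that records the order of the calls instead of touching a real panel. The ajax object, the cache contents, and the traceCalls helper are hypothetical stand-ins invented for this example; only requestProductDetails is taken from the listing above.

```javascript
// Hypothetical stand-ins so the example runs on its own; none of these
// definitions come from a real application.
var gProductDetailsCache = { 1: { price: 10 } };  // product 1 is already cached
var ajax = {
  get: function (url, callback) {
    // Simulate a server round trip that completes on a later tick.
    setTimeout(function () { callback({ price: 20 }); }, 5);
  }
};

// The first version of requestProductDetails.
function requestProductDetails(id, k) {
  var value = gProductDetailsCache[id];
  if (value)
    k(value);
  else
    ajax.get('/product/' + id, function (data) {
      gProductDetailsCache[id] = data;
      k(data);
    });
}

// Record the order in which the setDetails and setName steps would run.
function traceCalls(id) {
  var calls = [];
  requestProductDetails(id, function (details) { calls.push('setDetails'); });
  calls.push('setName');
  return calls;  // the array keeps filling in if the callback fires later
}

var cachedOrder = traceCalls(1);    // product 1 is in the cache
var uncachedOrder = traceCalls(2);  // product 2 has to be fetched

setTimeout(function () {
  console.log(cachedOrder.join(', '));    // setDetails, setName
  console.log(uncachedOrder.join(', '));  // setName, setDetails
}, 50);
```

The same calling code produces two different orders, depending only on what happens to be in the cache.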
I recommend this implementation of requestProductDetails instead. It makes the inside of the function more complex, and the setTimeout looks gratuitous, but it makes the outside simpler: requestProductDetails's callers are much easier to debug.
function requestProductDetails(id, k) {
  var value = gProductDetailsCache[id];
  if (value)
    // Defer the callback so it never runs before the caller's next statement.
    setTimeout(function() { k(value); }, 10);
  else
    ajax.get('/product/' + id, function(data) {
      gProductDetailsCache[id] = data;
      k(data);
    });
}
The general principle here is this: if a function sometimes has to call its callback asynchronously, it should always call it asynchronously. That minimizes the number of possible code paths through the application.
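As a rough check of the principle, the sketch below runs the setTimeout version against both a cached and an uncached product and records the order of the calls. The ajax object, the cache contents, and the traceCalls helper are hypothetical stand-ins made up for this example, not part of any real application.

```javascript
// Hypothetical stand-ins so the sketch runs on its own.
var gProductDetailsCache = { 1: { price: 10 } };  // product 1 is already cached
var ajax = {
  get: function (url, callback) {
    // Simulate a server round trip that completes on a later tick.
    setTimeout(function () { callback({ price: 20 }); }, 5);
  }
};

// The version that always defers its callback.
function requestProductDetails(id, k) {
  var value = gProductDetailsCache[id];
  if (value)
    setTimeout(function () { k(value); }, 10);
  else
    ajax.get('/product/' + id, function (data) {
      gProductDetailsCache[id] = data;
      k(data);
    });
}

// Record the order in which the setDetails and setName steps would run.
function traceCalls(id) {
  var calls = [];
  requestProductDetails(id, function (details) { calls.push('setDetails'); });
  calls.push('setName');
  return calls;  // the array keeps filling in when the callback fires later
}

var cachedOrder = traceCalls(1);    // served from the cache
var uncachedOrder = traceCalls(2);  // served by the simulated request

setTimeout(function () {
  console.log(cachedOrder.join(', '));    // setName, setDetails
  console.log(uncachedOrder.join(', '));  // setName, setDetails
}, 50);
```

Both paths now produce the same order, so a bug that depends on that order shows up on the first run or not at all.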
Comments
I've spent a lot of time in the last year working in GWT and Silverlight 2, both of which involve programming in a strictly typed OO language that uses XHR for communications, so you have the same issues with concurrency that you talk about in Javascript. I've been writing a series of articles on patterns for writing correct RIAs despite the fact that callbacks can run in a random order:
http://gen5.info/q/category/asynchronous-communications/
While it is a good thing (tm) to make sure that code works the way the programmer expects (that the async callback gets executed after the next statement), one should remember that the reality of async is that you cannot guarantee order of execution, and thus should instead modify the design of the program so that it does not rely on the order of execution. There is no other way to fix this "problem".
The modified function actually does guarantee the order of execution: that the code that follows the call to requestProductDetails will always execute prior to the invocation of the continuation parameter.
Making each part of the program not rely on the order of execution may seem like a good thing, but it increases the number of required test cases exponentially, if nothing else. Some indeterminacy is inherent in distributed processing; the rest can be determinized.
The inside of the function doesn't have to look complex if you abstract out the gunk:
requestProductDetails = asyncMemoize(function (id, k) {
  ajax.get('/product/' + id, k);
});

Here's a definition of asyncMemoize. (My Javascript is rusty, so forgive any bad grammar.)

function asyncMemoize (f) {
  var cache = {};
  return function (x, k) {
    var v = cache[x];
    if (v) {
      setTimeout(function() { k(v); }, 1);
    } else {
      f(x, function (v) {
        cache[x] = v;
        k(v);
      });
    }
  }
}

(A better definition could use varargs, but whatever.)
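For the curious, one possible shape of a varargs version is sketched below. It is only an illustration under some loudly stated assumptions: the continuation is always the last argument, the other arguments can be serialized with JSON.stringify to form a cache key, and the wrapped function always calls its callback asynchronously. The slowAdd function is a hypothetical example, not something from the post.

```javascript
// A varargs sketch of asyncMemoize. Assumes the last argument is the
// continuation and the remaining arguments are JSON-serializable.
function asyncMemoize(f) {
  var cache = {};
  return function () {
    var args = Array.prototype.slice.call(arguments);
    var k = args.pop();              // the continuation
    var key = JSON.stringify(args);  // cache key built from the other arguments
    if (Object.prototype.hasOwnProperty.call(cache, key)) {
      var cached = cache[key];
      // Defer, so a cache hit is still delivered asynchronously.
      setTimeout(function () { k.apply(null, cached); }, 0);
    } else {
      // Assumes f itself always invokes its callback asynchronously.
      f.apply(this, args.concat(function () {
        var results = Array.prototype.slice.call(arguments);
        cache[key] = results;
        k.apply(null, results);
      }));
    }
  };
}

// Hypothetical usage: a slow two-argument function.
var calls = 0;
var slowAdd = asyncMemoize(function (a, b, k) {
  calls += 1;
  setTimeout(function () { k(a + b); }, 5);
});

var results = [];
slowAdd(1, 2, function (sum) {
  results.push(sum);
  // The second request is a cache hit, but its callback is still deferred.
  slowAdd(1, 2, function (sumAgain) { results.push(sumAgain); });
  results.push('after second call');
});
```

Using hasOwnProperty for the cache check is a design choice in this sketch: it lets falsy results such as 0 be cached, which the `if (v)` test in the shorter version would treat as a miss.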
I agree that a given function should always or never invoke the callback asynchronously. Not because I want to minimize code paths, but because that's part of the interface. Your first version of requestProductDetails essentially has undefined behavior!
I would go further and suggest a naming convention, either as part of the function name or the name of the continuation argument, that indicates that it's asynchronous. (I've been steeped in Objective-C for the last year, and I'm loving the Smalltalk-style descriptive method/argument naming.)