LZX is a prototype-based language: any attribute that can be attached to a class definition can be attached to an instance of that class instead. This is handy in UI programming, where there are many objects with one-off behaviors. It’s also handy in prototyping and incremental program development, where it makes a novel kind of refactoring possible.
Posted: March 28th, 2004
Categories: OpenLaszlo, Programming, Software Development
Comments: 3
The recent LPS 2.0 release includes KRANK, a feature for optimizing application startup performance.
Posted: March 27th, 2004
Categories: OpenLaszlo
Comments: none
Mickey Kaus writes:
It’s worth noting that, in the event, not only did successor Arnold Schwarzenegger get more votes (3,744,132) than Davis (3,562,487), he also got more votes than Davis got in November, 2002 (3,469,025) when Davis won reelection.
But comparing these yes-Schwarzenegger votes to the no-recall votes ignores those who voted not to recall Davis but also voted for Schwarzenegger as their choice for governor if Davis were recalled (since they couldn’t vote for Davis). That is, the straight-up Schwarzenegger versus Davis comparison counts everyone whose first choice was Davis and whose second choice was Schwarzenegger as a yes-Schwarzenegger vote (as well as a no-recall vote), artificially inflating the Schwarzenegger count. And it counts everyone who preferred a third candidate (say, Bustamante) to Davis and Davis to Schwarzenegger as a vote for neither Davis nor Schwarzenegger, artificially deflating the Davis-versus-Schwarzenegger count.
The appropriate comparison is between voters who voted not to recall Davis, and voters who voted to recall Davis and for Schwarzenegger as the replacement candidate. Presumably this is knowable (if the ballots haven’t been destroyed), but it’s not on the Secretary of State’s summary page, at least.
How could you estimate this? One (crude) estimate is that everyone who voted for Bustamante would have voted for Davis if Bustamante hadn’t been on the ballot. This overcounts (it counts voters who preferred Bustamante to Schwarzenegger and Schwarzenegger to Davis as voting for Davis), and it undercounts too (it fails to count voters who preferred some fourth candidate to Davis and Davis to Schwarzenegger as voting for Davis over Schwarzenegger). It’s not obviously worse than the no-recall to yes-Schwarzenegger comparison, though. It gives 3,744,132 votes to Schwarzenegger and 6,172,557 votes to Davis.
Even if Davis would have only split the Democratic vote with Bustamante (that is, if Davis and Bustamante were equally popular, and if none of the Schwarzenegger votes were from crossover Democrats), Davis would have got 4,920,316 votes to Schwarzenegger’s 3,744,132 votes.
And as for the rest of the comparison:
Almost a million more people (4,416,280) voted to recall Davis than voted to reelect him last year.
8,203,005 people voted in 2003. 7,318,618 people voted in 2002. (These figures are from dividing the subtotals by the percentages; the web sites don’t list the totals directly.) If 8,203,005 people had voted in 2002 with the same distribution, 4,314,780 would have voted against Davis, so he actually picked up about 100,000 votes.
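The scaling step can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes that "voted against Davis" in 2002 means every ballot that wasn't cast for Davis, which is my reading rather than something stated above, although it does reproduce the quoted figure. The constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Totals quoted in the post.
const double kTurnout2002 = 7318618.0;
const double kTurnout2003 = 8203005.0;
const double kDavis2002 = 3469025.0;
const double kYesRecall2003 = 4416280.0;

// Assumption: every 2002 ballot not cast for Davis counts as "against".
// Scale that count up to the 2003 turnout, keeping the same distribution.
double scaledAgainstDavis2002() {
    const double against2002 = kTurnout2002 - kDavis2002;
    return against2002 * (kTurnout2003 / kTurnout2002);
}

// Gap between the 2003 yes-recall count and the scaled 2002 figure.
double gapVersusRecall() {
    return kYesRecall2003 - scaledAgainstDavis2002();
}
```

Under that assumption the scaled figure comes out at roughly 4,314,780, and the gap to the 4,416,280 yes-recall votes is roughly 100,000.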
Just something to think about.
Posted: October 13th, 2003
Categories: Math Education
Comments: 3
Over the past thirty years, the American Computer Corporation (ACC) has introduced personal computers, and in particular the bitmap display and the desktop metaphor, into the economy.
The ACC is a virtual company, composed of the following departments. The heyday for each department is given in parentheses after the name of the real company that implements that virtual department.
- Development: Apple (late ’70s/’80s)
- Marketing: Microsoft (late ’80s/early ’90s)
- Distribution: Dell (’90s/’00s)
As a company’s market matures, it shifts its focus from research and development, to marketing, sales, and finally operations (including distribution). You can see this shift in the dates above.
Posted: October 4th, 2003
Categories: Technology
Comments: none
The animated book list at the bottom of my home page was written in Laszlo. It’s 42 lines of code, and was written entirely in a text editor (emacs) — all the images are courtesy of Amazon.
Posted: October 3rd, 2003
Categories: OpenLaszlo, Projects
Comments: 1
Do you have an old teletype with a 5-bit serial interface sitting around that you’ve been itching to hook up to the Internet? If so, this article at LinuxDevices.com is just what you’ve been looking for. — anonymous on Slashdot

Henry Minsky has written in LinuxDevices about his Internet Teletype. This is a vintage teletype that he purchased on eBay and connected through a Mini-ITX system and a web server to an email account. The email account is in turn connected to his Yahoo calendar.
The upshot of this is that Henry has a Model 28 Teletype sitting in the middle of his living room (right, with Kiki’s Delivery Service on floor), which chatters out reminders for company meetings and other events.
While this project required few of the many capabilities of the Mini-ITX system, the other projects on my list are all great candidates for using this board; high quality video and audio, as well as low power consumption, low noise, and most importantly low price will make this my platform of choice for future midnight engineering projects.
Henry has additional pictures here.
Posted: September 2nd, 2003
Categories: Inventions, Technology
Comments: 1
Christian Sepulveda writes about comments in source code:
Not all comments are bad. But they are generally deodorant; they cover up mistakes in the code. Each time a comment is written to explain what the code is doing, the code should be re-written to be more clean and self explanatory.
This statement is provocative and interesting, but wrong. There are more good uses for comments than bad ones. (The rest of Sepulveda’s posting is more nuanced, and much of what I write here expands on points he makes.)
Comments are an escape hatch for expressing everything about a program that the programming language can’t. Comments therefore don’t fall into a single natural category.
Rather than stating a single purpose or use for comments, one can start by stating the purpose of the non-comment portions of the source code. Comments are used for everything else, which is at least the following:
The Coding Compromise
Non-comment source code is a compromise between the needs of two classes of consumers: compilers and runtimes on one hand, and human developers on the other. The priority given to each of these consumers depends on the context of the program’s development and deployment. An example at the level of programming language selection is that I use C++ for performance-critical applications or deployment to resource-limited platforms, but for more readable programs I use Python. However, even within a specific context, the compromise is always present.
Comments don’t have to meet this compromise. A line of code may legitimately be biased towards program execution (the compiler audience), but the line of code plus its comment can meet the needs of the human reader as well.
As an example, let’s look at the inner loop of the scan converter for a graphics library that I worked on. This loop is responsible for drawing spans within a polygon. A span is the portion of a scan line from the x position of an edge on the left side of the polygon’s interior to the x position of the next leftmost edge on its right. The definition of “interior” depends on the fill mode of the polygon. Since (for source size and code size reasons) a single function implements both fill modes, the way that this function computes the interior of the polygon is conditional on the fill mode.
If I’m implementing the computation for human readability, especially by someone who isn’t familiar with low-level graphics programming, the initialization code, and the code to handle an edge transition, might look like this:
// Initialization
bool inside = false;
int windingCount = 0;

// On each edge transition
if (windingType == kEvenOddFill) {
    inside = !inside;
} else {
    windingCount += edge->direction;
    inside = windingCount != 0;
}
(If I cared more about abstraction and flexibility, I might even turn windingType into an instance of a class with an update function:
windingType.updateState(edge);
inside = windingType.isInside();)
The same functionality coded for runtime efficiency might instead look like this:
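int windingMask = (windingType == kEvenOddFill) ? 1 : -1;
bool inside = false;
int windingCount = 0;

// On each edge transition (update step assumed, matching the readable version)
windingCount += edge->direction;
inside = windingCount & windingMask;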
The efficient code uses integer/bitfield and integer/boolean puns which wouldn’t even compile in a strictly typed language, but which take advantage of the implementation of values of these types in C++ to optimize the performance characteristics of the program. (It also uses a clever hack that I was proud of at the time.) These tricks turn the process of reading the source, however, into an exercise in reverse engineering.
Comments ameliorate this reverse engineering process, by speaking sotto voce to the human audience:
// windingMask tells which bits of the windingCount to test.
// For even-odd fill, test the low bit, to tell whether the number is odd.
// Otherwise, test all the bits, to tell whether it's non-zero.
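To check that the masked version really computes the same thing as the readable one, here is a small self-contained sketch that runs both over a list of edge directions. It isn't the library's code: the kNonZeroFill name, the two function names, and the convention that each edge direction is +1 or -1 are assumptions I've made for illustration. It also relies on two's-complement integers, so that -1 has every bit set.

```cpp
#include <cassert>

// kEvenOddFill appears in the snippets above; kNonZeroFill is a made-up name.
enum FillMode { kEvenOddFill, kNonZeroFill };

// Readable version: branch on the fill mode at every edge transition.
bool insideReadable(FillMode windingType, const int* directions, int n) {
    bool inside = false;
    int windingCount = 0;
    for (int i = 0; i < n; ++i) {
        if (windingType == kEvenOddFill) {
            inside = !inside;
        } else {
            windingCount += directions[i];
            inside = windingCount != 0;
        }
    }
    return inside;
}

// Masked version: always accumulate the count, then test the selected bits.
// Mask 1 tests the low bit (is the count odd?); mask -1 tests every bit
// (is the count non-zero?).
bool insideMasked(FillMode windingType, const int* directions, int n) {
    const int windingMask = (windingType == kEvenOddFill) ? 1 : -1;
    bool inside = false;
    int windingCount = 0;
    for (int i = 0; i < n; ++i) {
        windingCount += directions[i];
        inside = (windingCount & windingMask) != 0;
    }
    return inside;
}
```

For the directions +1, +1, -1, -1 the two functions agree after every edge. After the second edge the count is 2, so the point is outside under even-odd fill and inside under non-zero fill.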
The Limits of Expression
Another reason for a comment is to compensate for the limits of the programming language at hand. There is often no way to express the design of the program within the syntax of a particular programming language. Languages in mainstream use have become moderately good at abstracting over data structures; they’re less good at abstracting over types, control structures, definition patterns, or patterns of composition, and they generally lack means of describing design patterns or architecture.
A comment can add information about a program’s design, such as that the program implements the Pipes-and-Filters architecture, that a class implements the Flyweight Design Pattern, that a declared float represents feet, or that a declared String may be null. There are languages which can express each of these facts directly, in which case the comment is superfluous, but chances are you aren’t using one of them.
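As a rough illustration of the "float represents feet" case, the sketch below puts the two approaches side by side: a bare float whose unit lives only in a comment, and a pair of wrapper types that move the unit into the type system. The names (ceilingHeight, Feet, Meters, toMeters) are hypothetical, and this is only one of several ways to encode units in C++.

```cpp
#include <cassert>

// Comment-only version: the compiler sees a float; only the reader sees feet.
float ceilingHeight = 8.0f;  // in feet

// Type-level version: the unit is part of the type, so passing meters where
// feet are expected is a compile error and the comment becomes superfluous.
struct Feet { float value; };
struct Meters { float value; };

// One foot is 0.3048 meters.
Meters toMeters(Feet f) {
    return Meters{f.value * 0.3048f};
}
```

The wrapper costs extra declarations at every use, which is one reason the comment is often the practical choice.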
Levels of Abstraction
Source code is typically written only at one level of abstraction. It may combine different units of structure or composition, such as methods, classes, and packages, but the source code typically doesn’t contain both statements that express something in a high-level or coarse-grained way, and the same thing in a low-level or fine-grained way. (An exception is invariants in Eiffel, which express the abstract what as well as the concrete how.) After all, this would be redundant, from the perspective of the compiler.
The human reader, on the other hand, would like to be able to understand the purpose of a code block, function, or package without reading its implementation. Different audiences care about different levels of detail, and at different times. An API user needs a description of a method’s external behavior. Someone working on the implementation of a method needs a high-level understanding of its implementation or algorithms as a roadmap, maybe an overview of what each block does, and a line-by-line understanding of a particular block only at a particular time. Source code is either there or not, and often can’t be easily skimmed; comments let the reader turn the dial to positions between no information and too much.
Program as Palimpsest
A comment is a note from the past to the future.
I write a comment wherever I’m afraid someone might change the program for the worse. One reason this might happen, discussed above, is because the program’s design isn’t evident in the source. Another is that the reason for an implementation decision comes from data that isn’t present in the design. Corner cases and performance metrics are two kinds of data that aren’t present in a program’s design. Changes made in response to these data frequently result in code that’s suboptimal from a readability perspective, and that in the absence of that data would appear to be redundant. In the absence of information to the contrary, this code should be optimized out, for performance and code size reasons as well as from a readability perspective. Comments are a protection against this form of regression.
For example, I’ve written code like this:
if (shape.bounds.contains(point) && shape.contains(point)) ...
First this code tests if the bounding box of the shape contains point; if it does, it tests if the shape itself contains point. The first test is redundant. It’s extra source text to read, extra code to maintain (if I rename “shape” or “point”, there’s one more subexpression to edit), extra code to deploy, and it breaks the Law of Demeter — it increases the coupling between this code and the implementation of the class of shape. As it stands, in my book, this is bad code.
Here’s a non-executable line that, prepended to the example above, changes it into correct code:
// On a 400MHz P5, testing shape.bounds speeds text tracking by 10%
Without this line, there’s nothing to indicate whether the code is the result of premature and possibly misguided optimization, or whether it actually provides some benefit to compensate for its readability and maintainability shortcomings. With it, the source code records its own history; if anyone decides to remove the hard-won optimization, it won’t be out of ignorance.
Joel Spolsky has written an essay about corner cases. I’ll just add that corner case code looks like optimization code: without a comment it’s hard to tell whether it’s contingency or clutter.
Time Bomb or Time Capsule
Comments are also a place to put wishlist items and to-dos. Some of these may be “mistakes in the code”, but an item can be a mistake in the code without being a mistake in the software development process. The point of software engineering is to make tradeoffs between implementation speed, implementation resources, deployment resources, and code quality. Sometimes putting all the emphasis on code quality or feature completeness is the wrong way to make the tradeoff: for example, when bytes count, when days count, when the code is throwaway and the corner condition will never be met, or when the enhancement isn’t yet useful.
When the developer knows that a corner case is present or that an enhancement is possible, comments add a choice between implementing it immediately on one hand, and ignoring it altogether on the other. The first alternative is expensive immediately, because of its effect on the schedule and its distraction from work implementing the more important use cases. The second is expensive later, because it loses information that’s available now and that may have to be recreated then.
A comment at the implementation site can document the cause of the error in implementation terms, or collect notes about how the missing condition could be handled. The comment captures knowledge from construction time, when it’s available, and saves it for maintenance or enhancement time, when it’s useful.
The Path to Deployment
Conventional programming languages define program execution, but don’t express information about activities that occur before the deployed program is executed. Other activities than execution require other artifacts. Development requires examples. Testing requires tests and contracts. Deployment requires packaging information.
These artifacts can be placed in separate files. Often they’re ephemeral, such as a test case that is typed into the command line and then discarded. But it’s convenient to keep these artifacts close to the program units that they describe, so that they can be updated and reused as the program evolves. Many of them are also useful to the human reader trying to understand the program: for example, the examples and tests that drive and verify development make useful API documentation too. Capturing these artifacts in comments is a way to meet these needs.
(All of the other uses of comments are a special case of this point. Design notes and explanations support construction, maintenance, and white-box testing, which are all non-execution activities.)
The Third Consumer
I wrote at the beginning of this entry that there were two consumers for source text: humans, and the compiler. I lied. There’s a third class of consumers: comment processors, which treat comments as source code in a supplementary language, one that fills gaps in the coverage of the source code’s primary language.
Tim Peters’s doctest treats comments as unit tests. iContract reads comments for pre- and post-conditions. XDoclet is an extensible processor, preconfigured to read packaging and deployment descriptions. These tools address the non-execution activities of software development.
SPARK is a generic parser that reads BNF productions from the comments of the methods that implement their semantic actions. It patches the failure of Python to abstract over defining forms.
All of these functions of comments as metadata could be implemented by extending the language instead. (Contracts are from Eiffel, for example, which supports them in its core syntax.) However, the use of structured comments as metadata lets a third-party tool address a gap in the language in a way that leaves the source structure intact for compilers and other source processors (such as editors, source debuggers, and refactoring tools) that wouldn’t recognize an extended grammar. Just as comments are an escape hatch for human readers, they’re an escape hatch for development tools as well.
And these two kinds of comments — natural language and structured metadata — are related. They both add information that can’t be expressed in the primary programming language. And they’re both useful to humans too.
Posted: August 31st, 2003
Categories: Essays, Programming
Comments: 5
A lightweight language such as Python or JavaScript fits a lot of program design into a small amount of source text. A heavyweight language such as C++ or Java uses more tokens to express the same design.
Many people (Guido van Rossum, Bruce Eckel, Steve Ferg) report that working in a lightweight language is about five times more productive than working in a heavyweight language. This matches my personal experience. Lutz Prechelt reproduces this difference within a research setting.
It’s easy to make up reasons why it’s easier to work with shorter sources. There’s less data entry, fewer places to make mistakes, and it’s easier to read the programs. A program in a heavyweight language has a lot of boilerplate that doesn’t add information about the program’s design or intent. Eliminating the boilerplate seems like a clear win.
But what about source text that does add information about the program — in particular, what about types? Removing type declarations from a program also makes it shorter, but it removes information that the compiler and runtime can use to verify the program. Doesn’t this make the program less reliable? Or, given a reliability goal, doesn’t it make it take longer to reach that goal?
From the anecdotes and studies above, it looks like this isn’t actually the case. I’d like to look at why.
Two Types of Types
An Explicitly-Typed Language (ETL) is one in which functions and variables must be declared with type declarations. (These type declarations are conventionally called manifest types.) Java and C++ are ETLs.
An Implicitly-Typed Language (ITL) is one in which type declarations are not necessary. Some ITLs, such as JavaScript 1.5 and Python, lack type declarations altogether. In other ITLs, such as Dylan, Haskell, JavaScript 2.0, and JScript.NET, type declarations are optional, and can be added at any point during program development.
An ITL isn’t necessarily a weakly typed language. (In a weakly typed language, type mismatches are never detected. C++ is weakly typed where it uses static casts.) Python is an example of latent typing: types are always checked, just not at compile time.
Neither is an ITL necessarily a dynamically typed language. Haskell and ML are examples of statically typed languages where the type declarations are often optional, but the types are known at compile time. The compiler infers variable and function types from the source, just as a Java or C++ compiler infers the types of individual subexpressions.
(This discussion is condensed from the appendix to Steve Ferg’s excellent article comparing Python to Java.)
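The point about subexpressions can be made concrete. In the sketch below the only declared types are the parameter and return types; the type of the expression miles / hours is never written down, and the compiler works it out. The decltype checks (a later C++11 feature, used here purely to expose what the compiler inferred) and the averageSpeed name are my own illustration, not something from the article.

```cpp
#include <cassert>
#include <type_traits>

// Only the parameters and the result carry declared types. The type of the
// subexpression (miles / hours) is inferred by the compiler: int / double
// is promoted to double.
double averageSpeed(int miles, double hours) {
    return miles / hours;
}

// Compile-time checks of the types the compiler assigns to subexpressions.
static_assert(std::is_same<decltype(3 / 2.0), double>::value,
              "int / double is inferred to be double");
static_assert(std::is_same<decltype(3 / 2), int>::value,
              "int / int is inferred to be int");
```

A statically typed ITL applies the same kind of inference to whole variables and functions rather than stopping at subexpressions.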
No Pain No Gain?
It’s common knowledge that it takes longer to get a program in an Explicitly-Typed Language to first clean compile, because it takes time to get the type declarations right. Initial program development consists of editing type declarations as well as non-declaration text (the expressions, control flow, and program structure), followed by a block of time to get the program to first clean compile. The white area below represents work on the non-declaration portion of a program; the gray stripes represent work on type declarations.

In order to show the division between time spent on type declarations and time spent on other program text more clearly, I’m going to slide the type declarations to the right:

This picture appears to show that it takes more time to develop a program that contains type declarations. The received wisdom among ETL fans is that what you get for this extra time is error checking. ETL development takes longer, but gives you a more robust product. The equivalent ITL development effort (which isn’t shown here) would be even longer, because it would include debugging time to catch the problems that the compiler didn’t. In fact, some of these problems might not show up until the product is in the field.
Tests versus Types
Test-Driven Development (TDD) is a development methodology where the test cases are written first, and code is added to the program proper only as necessary to implement correct behavior for test cases.
TDD advocates have noted that if you write your test case first, you also get error checking. In fact, you get better error checking than you do with explicit types, because the tests are tailored to the code paths, the edge cases, and the specific problem domain. After all, the only kinds of errors that explicit typing catches are those of the kind where you used a string in integer context. Programmers who have used implicitly-typed languages can attest that these aren’t the kinds of errors that are hard to find. In fact, a lot of the work to make an ETL program compile is work that’s only necessary to get around how little the compiler knows about the runtime types.
The comparison of an ETL, where the extra work goes into typing explicit types, and an ITL with TDD, where the extra work goes into test cases, looks like this:

Let’s say you could start with a program without type declarations. Would you rather add type declarations, which catch a certain class of problems at compile time? Or would you rather add test cases, which in a strongly typed language catch this same class of problems at runtime, and catch problems that the type system can’t express as well?
But why not use TDD with an ETL?
Tests Plus Types
Here’s what happens when you use TDD with an ETL. You’re taking that first picture, and multiplying it by two: once for the test cases, and again for the program itself.

Robert Martin compares TDD to dual-entry bookkeeping. Explicit types are another form of dual-entry bookkeeping. Using both explicit type declarations and test cases to validate a program is triple-entry bookkeeping. (And the time spent writing explicit types could have been spent on more test cases, or more of the program.)
But it’s actually worse than triple-entry bookkeeping, because an ETL requires that you use explicit type declarations in your test code too. Not only does the program itself cost more to write, but so do its test cases. It’s quadruple-entry bookkeeping: one of the three entries has two entries itself.
This is why even software written in an ETL is often tested with an ITL test harness. If you can afford to work in two languages, and if the test harness and the debugger are well-enough integrated that you can write the test cases first and use them for TDD, this works. (It brings the cost of using an ETL back down to triple-entry bookkeeping.) But there’s one more gotcha.
The Type Tax
The problem is that once type declarations are part of the source text, they’ve got to be maintained along with the rest of the code. The earlier the type declarations are added, the greater their contribution to the maintenance cost.
In a waterfall model of code construction, where all the code is written, brought to clean compile, and works the first time, this doesn’t matter. In incremental development, where a compilable program is debugged or extended to handle additional features or test cases, the cost can be high.

Types are similar to comments, test cases, and end-user documentation in this way, except that types are coupled more closely to each line of source text. Factor out a variable from an expression, and the method comment and end-user documentation can stay the same, but a new type declaration is required. This tight coupling makes the maintenance tax on types even higher than it is for other program annotations, and the tradeoff for when to start adding them, if at all, is different.
Since the cost/benefit ratio of type declarations can be different at different points in program maturity, one style would be to add them late in program development (for performance reasons, or for their documentation benefits to future maintainers). This style of development is possible in a language with optional type declarations, such as Dylan or Haskell. That’s what the first picture on this page shows. ETLs such as C++ and Java simply don’t allow this.
Updated 9/25/2003 to rename “write types” to “edit types”, and to add a picture showing type declarations interleaved on a finer grain.
Posted: August 20th, 2003
Categories: Software
Comments: 22
Dot numbers are a new notation for numbers that makes integer addition look like rational multiplication. They may be useful in primary school math education. The idea is that once you understand integers and addition, you can learn another way to look at it that sets you up to understand fractions and multiplication.
I made up dot numbers a few years ago to try to explain negative numbers to my then-four-year-old son.
Basics
A dot number is a way of writing a number. A dot number is represented as a number of dots above a line. This is the number 3, as a dot number:

A negative dot number is a number of dots below the line. This is the dot number -3:
Addition
To add two dot numbers, combine the dots above the line, and the dots below the line. This is 3 plus 2:

This is -3 plus -2:

And this is 3 plus -2:

Cancelling
A dot number is in normal form if all its dots (if any) are on the same side of the line. A dot number that isn’t in normal form can be “normalized”, or transformed into another dot number that represents the same integer and is in normal form. As long as there’s a dot above the line and a dot below the line, cross the dots out. The pairs of dots “cancel” each other.
The preceding number (3 plus -2) can be normalized.

To subtract a number, flip it upside down and add that. This is 3 minus 2:

Dots and Factors
Dot numbers are interesting because they’re isomorphic to fractions. Dot numbers are to addition (and subtraction, and the unit) as fractions are to multiplication (and division, and factors).
Just as the multiplicative inverse (the reciprocal) of a fraction can be obtained by turning it upside down, the additive inverse (the negative) of a dot number is obtained by flipping it upside down. (The inverse in both cases is the “vertical inverse”).
Just as fractions are multiplied by multiplying the tops and bottoms (numerators and denominators) separately, dot numbers are added by adding their tops and bottoms.
And just as any number can be represented by multiple fractions (1/2, 2/4, 4/8), each of which can be normalized (“reduced”) by dividing the top and bottom by a common integral factor (4/8 = (4×1)/(4×2) = 1/2), so an integer (3) can be represented by multiple dot numbers (3-0, 4-1, 5-2), each of which can be reduced by subtracting units from its top and bottom (5-2 = (1+1+1+1+1)-(1+1) = (1+1+1+1)-(1) = (1+1+1)-0 = 3-0).
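The rules above are small enough to write down as code. This is a minimal sketch, assuming a dot number is stored as two counts (dots above the line and dots below it); the DotNumber struct and the function names are invented for this example.

```cpp
#include <cassert>

// A dot number: a count of dots above the line and a count below it.
struct DotNumber {
    unsigned above;
    unsigned below;
};

// Addition: combine the dots above the line, and the dots below the line.
DotNumber add(DotNumber a, DotNumber b) {
    return DotNumber{a.above + b.above, a.below + b.below};
}

// Additive inverse: flip the number upside down.
DotNumber flip(DotNumber a) {
    return DotNumber{a.below, a.above};
}

// Subtraction: flip the second number and add it.
DotNumber subtract(DotNumber a, DotNumber b) {
    return add(a, flip(b));
}

// Normalization: cancel one dot above against one dot below until all the
// remaining dots are on the same side of the line.
DotNumber normalize(DotNumber a) {
    while (a.above > 0 && a.below > 0) {
        --a.above;
        --a.below;
    }
    return a;
}

// The integer that a dot number represents.
int toInt(DotNumber a) {
    return static_cast<int>(a.above) - static_cast<int>(a.below);
}
```

For example, adding 3-0 and 0-2 gives 3-2, which normalizes to 1-0. Swapping addition for multiplication and cancelling units for dividing out common factors turns the same structure into fraction arithmetic.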
Posted: August 14th, 2003
Categories: Illustrations, Math Education, Visualizations
Comments: 1
Last month I did a code sprint on a blogging aggregator written in LZX. You can see the results at myLaszlo.com. There’s a screenshot on this page. 
Posted: August 8th, 2003
Categories: OpenLaszlo, Projects
Comments: 1