Latin Agreement and Case

Posted by Oliver on May 25, 2008



Fortunately

Posted by Oliver on January 31, 2006

Jim Grandy wrote:

From: jgrandy

Subject: stupid Google game

Date: January 7, 2006 6:17:58 PM EST

Google for "unfortunately, yournamehere":

Lots of fun hits for "unfortunately, jim":

  • unfortunately Jim’s orange dry suit made him look like a carrot
  • Unfortunately Jim is no longer with us as he died of a brain tumor in 1993.
  • Unfortunately, Jim did not respond. He disbelieved that it was an angel.
  • Unfortunately, Jim is only one person with a limited amount of time available to
    help Jane find answers to her questions.

I’ve turned this into a web page here.

I prototyped it with a screen scraper for Google, but I didn’t want to deploy a screen scraper.

Fortunately, Google has a Search API.

Unfortunately, Google’s API uses SOAP.

Fortunately, Ruby has a SOAP library.

Unfortunately, the Ruby SOAP library doesn’t work on Dreamhost.

Fortunately, the Yahoo Web Search API uses REST.

Unfortunately, Yahoo’s summaries don’t include enough right-hand context, so it’s harder to extract decent sentences from them.

Maybe I’ll go back to screen-scraping after all.
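
The right-hand context problem can be sketched in a few lines of Python. This is only an illustration, not the code behind the page: the function name `extract_sentence` is made up, and treating the first `.`, `!`, or `?` as the end of a sentence is a simplifying assumption that breaks on abbreviations such as "Mr.".

```python
import re

def extract_sentence(snippet, name):
    """Return the first sentence starting with "unfortunately, <name>", or None.

    Hypothetical helper. A sentence is assumed to end at the first '.', '!'
    or '?' that is not the start of an ellipsis ("...").
    """
    pattern = re.compile(
        r"\bunfortunately,?\s+" + re.escape(name) + r"\b[^.!?]*[.!?](?!\.)",
        re.IGNORECASE,
    )
    match = pattern.search(snippet)
    return match.group(0) if match else None
```

A summary that is cut off before the end of the sentence yields `None`, which is why short summaries produce fewer usable hits.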

Update: Jim tells me he got the idea from Jorg Brown at Google.

Aargh!

Posted by Oliver on December 24, 2005

“Aargh!” But how do you spell it?

(Click here to skip straight to the visualization.)

In the late nineties, I tried using internet search as a spelling corrector. (I think I was using AltaVista at the time. It was the latest and greatest search engine, supplanting — was it Lycos?)

At the time, for the words I tried, the page counts of a misspelling and the correct word differed by about two orders of magnitude. Spelling variants, such as “color” and “colour”, typically differed by less than one order of magnitude.

In 2002 I used Google to figure out the most common spelling for “closable”, for use in the OpenLaszlo API. It had been “closeable”; why use a spelling that most people would guess wrong the first time, I figured. [Update: This paragraph originally said the word was "resizeable", which is a straightforward misspelling.]

Here’s what this looks like today. First, a common misspelling:

compatible    170M
compatable    2M      1.3%

And a couple of spelling variants:

closable      137K
closeable     101K    73%
sizable       8.3M
sizeable      6.8M    81%

(The percentage is the ratio of a spelling’s page count to the page count of the most common variant, which is the form listed directly above it.)
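
As a rough sketch of the arithmetic, the ratio and the gap in orders of magnitude can be computed like this. The helper names `parse_count` and `compare` are invented for the example, and the inputs are the rounded counts as shown, so the results can differ slightly from the percentages listed.

```python
import math

def parse_count(text):
    """Convert a rounded page count such as '170M' or '137K' to a number."""
    suffixes = {"K": 1e3, "M": 1e6}
    if text[-1] in suffixes:
        return float(text[:-1]) * suffixes[text[-1]]
    return float(text)

def compare(common, variant):
    """Return (ratio, orders of magnitude) of a variant against the common form."""
    ratio = parse_count(variant) / parse_count(common)
    return ratio, -math.log10(ratio)

# Example with counts from the tables above.
for common, variant in [("170M", "2M"), ("137K", "101K")]:
    ratio, orders = compare(common, variant)
    print(f"{variant}/{common}: {ratio:.1%}, {orders:.2f} orders of magnitude")
```

By this measure the misspelling sits nearly two orders of magnitude below the correct form, while the spelling variant is well within one.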

Some other misspellings:

commit        73.9M
comit         0.8M    1%
resizable     1.74M
resizeable    0.18M   10%
misspell      466K
mispell       55K     12%

And some other acceptable variants:

color         434M
colour        63.0M   16%
gray          125M
grey          73M     59%
judgment      77M
judgement     24M     32%

(What’s the difference between an acceptable variant and a misspelling? An interesting topic for another posting. Maybe.)

What got me thinking about this again was, of all things, wondering how to spell “aargh!” One ‘a’, two, three…? And how many ‘r’s?

This is an interesting problem, first, because so many repetition counts are attested. There’s not just “mispelling” (1s) and “misspelling” (2s), but “argh”, “aargh”, “aaargh”, etc. And second, because the space is two-dimensional: not just “argh”, “aargh”, “aaargh”, …, but also “argh”, “arrgh”, “arrrgh”, … — and the product, with “aarrgh”, “aaarrrgh”, etc.
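
The two-dimensional space is easy to enumerate. Here is a minimal Python sketch, assuming only the ‘a’ and ‘r’ counts vary (the ending stays “gh”) and with arbitrary upper limits; the function name `aargh_variants` is made up for the example.

```python
def aargh_variants(max_a=3, max_r=3):
    """Build a grid of spellings: one row per 'a' count, one column per 'r' count."""
    return [
        ["a" * a + "r" * r + "gh" for r in range(1, max_r + 1)]
        for a in range(1, max_a + 1)
    ]

# Print the grid, one row per number of 'a's.
for row in aargh_variants():
    print("  ".join(row))
```

Each cell of the grid is one candidate spelling to look up a page count for.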

It’s clear that a wide range of spellings are acceptable. What’s the most common?

Without further ado, I created this page to help me find the answer.

There They’re

Posted by Oliver on September 05, 2004

(For Miles.)

Possessives     Places    Contractions    Verbs
our             .         .               are
.               here      .               hear
.               where     we’re           were
their           there     they’re         .
its             .         it’s            .
your            .         you’re          .
his, her, my    .         .               .

Read across the rows to see words that are easily confused with each other. Read down the columns to see the patterns.

Things to note:
* All of the contractions have apostrophes (’).
* Nothing in this table that isn’t a contraction has an apostrophe. “Its roof is red” doesn’t have an apostrophe even though “its” is a possessive. This makes “its” different from noun possessives (“The house’s roof is red”), but the same as “his”, “her”, and “their” (“Her hair is red”).
* The place words all have “here” in them: “here”, “there”, and “where”.
* Nothing that isn’t a place word has “here” in it: “Their hands are wet”, “They’re leaving”.

If you notice you’re using a word in this table, think about which kind of word it is (Possessive, Place, Contraction, or Verb), that is, which column it fits into. If you’re using “its”, would “his” also work? Or would a contraction such as “we’re” fit? In that case you should be using “it’s” instead.

Once again, I’m sorry about English spelling — I would have done it differently. It’s not my fault.

PyWordNet 2.0

Posted by Oliver on April 19, 2004

After a spate of requests and a contribution from Wei-Hao Lin, I’ve finally gotten around to releasing an update of PyWordNet that works with the WordNet 2.0 database files. (WordNet 2.0 adds lexical links for derivational morphology and topical classification. This broke the PyWordNet 1.4 dictionary file parser.)


Ingrediants and Isochems

Posted by Oliver on April 04, 2004

Now that product ingredient lists have become marketing bullets, here are two terms that I’ve found useful for thinking about them:


Ingrediant (with an ‘a’)

An item added to an ingredient list purely for its marketing effect, as opposed to any material qualities that it adds to the product. For example, the vitamins that are added to shampoos. (By analogy with surfactant, flavorant, colorant.)

Isochem

A more appealing synonym for a common ingredient. For example, “evaporated cane juice” for “sugar”, or “hydrolyzed vegetable protein” for “monosodium glutamate (MSG)”.

Semiotics of Weddings

Posted by Oliver on July 10, 2003

A wedding is a coercion operator from a state to an event that marks the beginning of the state. (The English word “marriage” denotes either.) The advantage of an event over a state is that it can be used as a reference for other events, symbolizing happiness, community, fertility, etc., by placing these other events at the same time and location. This is analogous to time binding in linguistics.

Nobody’s On That Train

Posted by Oliver on June 23, 2003

I used to be an unrepentant nineteenth-century grammarian. I never used a preposition to end a sentence; I took care never to split an infinitive; I knew “who” from “whom”; and I never, ever, used “they” for “he or she”.