Monday, December 6, 2010

A seriously deep word count

This article in the NY Times discusses a project to feed an enormous amount of data into a computer, "the titles of every British book published in English in and around the 19th century," 1,681,161 of them to be exact, to see which words the Victorians used most often as a guide to what they were thinking. A concordance covering a century or so of material is possible nowadays because computers can process vast amounts of data, although, of course, the results are open to much discussion, as the article explains. Early results of the project are interesting, but as one scholar cited in the article cautions, one must be careful. She searched for the words "syntax" and "prosody" for a technical analysis of poetry and found a sudden "explosion" of the two words in 1832. "But it turned out that Dr. Syntax and Prosody were the names of two racehorses. You find 200 titles with 'Syntax,' and you think there must be a big grammar debate that year, but it was just that Syntax was winning." Interesting stuff.

