An analysis of frequency counts for letters in English, with applications to the game of Scrabble. Also presents a distillation of the Google Books Ngram data, broken out by time periods.
- Peter Norvig
Bro, I love your nerdiness so much. Not sure what's up with the crazy font. I think this data is *very* useful for contestants on Wheel of Fortune and people who play Hangman (which is my favorite iPhone game to play when I don't have connectivity).
- Laura Norvig
"They all appear in published text that is published in English. You may have preconceptions about what counts as an "English," but the simplest approach is to include everything."
- Peter Norvig
"Yes -- maybe next month I'll go back and generate the data grouped by time period. Probably not by year, but maybe by 20-year buckets."
- Peter Norvig
"The major difference is that when he reports a count of 0, you can't tell if that means 1 in 100 thousand, or 1 in 100 billion. With my results, you can differentiate these cases pretty well. The ngrams with high counts (like the top 50 bigrams) remain fairly consistent."
- Peter Norvig
"I think that means that your connection was broken mid-way and you didn't get the whole file. I just tried and it works for me. The file size is 46,302,906 bytes."
- Peter Norvig
A frequently-asked-question list for the 2012 United States Presidential Election; all questions are answered with facts that are as objective as I can make them, except for the last question, in which I endorse President Obama.
- Peter Norvig