Over a million developers have joined DZone.

Rare Letter Combinations and Key Chords

· Big Data Zone

Read this eGuide to discover the fundamental differences between iPaaS and dPaaS and how the innovative approach of dPaaS gets to the heart of today’s most pressing integration problems, brought to you in partnership with Liaison.

A bigram is a pair of letters. For various reasons—word games, cryptography, user interface development, etc.—people are interested in knowing which bigrams occur most often, and so such information is easy to find. But sometimes you might want to know which bigrams occur least often, and that’s harder to find. My interest is finding safe key-chord combinations for Emacs.

Peter Norvig calculated frequencies for all pairs of letters based on the corpus Google extracted from millions of books. He gives a table that will show you the frequency of a bigram when you mouse over it. I scraped his HTML page to create a simple CSV version of the data. My file lists bigrams, frequencies to three decimal places, and the raw counts: bigram_frequencies.csv. The file is sorted in decreasing order of frequency.

The Emacs key-chord module lets you bind pairs of letters to Emacs commands. For example, if you map a command to jk, that command will execute whenever you type j and k in quick succession. In that case if you want the literal sequence “jk” to appear in a file, pause between typing the j and the k. This may sound like a bad idea, but I haven’t run into any problems using it. It allows you to execute your most frequently used commands very quickly. Also, there’s no danger of conflict since neither basic Emacs nor any of its common packages use key chords.

The table below gives bigrams whose percentage frequency rounds to zero keeping three decimal places. See the data file for details.

Since Q is always followed by U in native English words, it’s safe to combine Q with any other letter. (If you need to type Qatar, just pause a little after typing the Q.) It’s also safe to use any consonant after J and most consonants after Z. (It’s rare for a consonant to follow Z, but not quite rare enough to round to zero. ZH and ZL occur with 0.001% frequency, ZY 0.002% and ZZ 0.003%.)

Double letters make especially convenient key chords since they’re easy to type quickly. JJ, KK, QQ, VV, WW, and YY all have frequency rounding to zero. HH and UU have frequency 0.001% and AA, XX, and ZZ have frequency 0.003%.

Note that the discussion above does not distinguish upper and lower case letters in counting frequencies, but Emacs key chords are case-sensitive. You could make a key chord out of any pair of capital letters unless you like to shout in online discussions, use a lot of acronyms, or write old-school FORTRAN.

Update (2 Feb 2015):

This post only considered ordered bigrams. But Emacs key chords are unordered, combinations of keys pressed at or near the same time. This means, for example, that qe would not be a good keychord because although QE is a rare bigram, EQ is not (0.057%). The file unordered_bigram_frequencies.csv gives the combined frequencies of bigrams and their reverse (except for double letters, in which case it simply gives the frequency).

Combinations of J and a consonant are still mostly good key chords except for JB (0.023%), JN (0.011%), and JD (0.005%).

Combinations of Q and a consonant are also good key chords except for QS (0.007%), QN (0.006%), and QC (0.005%). And although O is a vowel, QO still makes a good key chord (0.001%).

Discover the unprecedented possibilities and challenges, created by today’s fast paced data climate and why your current integration solution is not enough, brought to you in partnership with Liaison

Topics:

Published at DZone with permission of John Cook, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}