American science fiction writer Harlan Ellison1 has published over 1.700 short stories, novellas, screenplays, comic book scripts, teleplays, essays, and a wide range of criticism covering literature, film, television, and print media. Although he wrote using a one-finger-on-each-hand style of typing, he reportedly was doing this very fast.

Harlan Ellison
Harlan Jay Ellison (Cleveland (Ohio), 27 mei 1934 – Los Angeles, 28 juni 2018)

We ask ourselves the following question: which words would have been the easiest for Harlan to type? Although one can model this question in several ways, it seems clear that characters should alternate between hands and the linear distance traveled by the fingers of each hand should be minimized.

Since the amount of offset between keyboard rows does not seem to be standard, we simplify and assume the keys are arranged on a rectangular grid with unit spacing. A keyboard with QWERTY layout may then look like:

keyboard
Keyboard with QWERTY layout, with keys arranged in a rectangular grid. This ignores the small offset between keyboard rows.

The two fingers being used are allowed to roam over the whole keyboard (for example, the right hand is not required to stay on the right side), but crossing hands makes typing difficult, so we insist that each character typed by the right hand be in a column to the right of the preceding and succeeding characters typed by the left hand. Words that meet this condition are called Ellison words.

As indicated in the example above, we number the rows of a rectangular keyboard from top to bottom, and the columns from left to right, starting from zero. As such, the position of a key can be represented as a tuple $$(r, k)$$, where $$r$$ and $$k$$ indicate the number of the row and the column where the key is located. We define the linear distance $$d(a_1, a_2)$$ between two keys $$a_1$$ and $$a_2$$ on positions $$(r_1, k_1)$$ and $$(r_2, k_2)$$ as \[ d(a_1, a_2) = \sqrt{(r_1 - r_2)^2 + (k_1 - k_2)^2} \] For an Ellison words $$w$$ that is a sequence of $$n$$ characters $$a_1a_2\ldots a_n$$, we compute the score \[ s(w) = \frac{1}{n-1}\displaystyle\sum_{i=1}^{n-2}d(a_{i}, a_{i+2})\] This score is the average distance traveled by the fingers in moving to the next character while typing the word. The lower the score, the easier the word can be typed using a one-finger-on-each-hand style of typing. Based on this score, these are the easiest words for 4 through 13 characters we found in an English dictionary:

# letters words score
4 dodo, papa, tutu 0.00
5 dodos, ninon 0.33
6 banana 0.25
7 austere 1.08
8 terebene 0.71
# letters words score
9 abatement 1.44
10 maharajas 1.10
11 prohibitory 1.40
12 monotonicity 1.43
13 mononucleus 1.24

Ellison could easily have used most of these in a story about an infectious disease outbreak in India. But I guess that might have looked lazy.

Assignment

We represent the layout of a rectangular keyboard as a string (str), with each key represented by its unique character. Keys on the same row are listed from left to right, and the individual rows are listed from top to bottom and separated by vertical bars (|). We assume that the vertical bar does not occur as a character on a key of the keyboard. Any valid keyboard layout meets the following conditions:

Your task:

None of these functions may make a distinction between uppercase and lowercase letters. Except for the function is_keyboard, all other functions also have an optional parameter keyboard that takes the layout of a rectangular keyboard (str) that determines the positions of the keys. By default, the functions use the QWERTY keyboard layout from the introduction if no argument is explicitly passed to the parameter keyboard. If the given keyboard layout is not valid, an AssertionError must be raised with the message invalid keyboard.

Example

>>> QWERTY = 'QWERTYUIOP[|ASDFGHJKL:\'|ZXCVBNM,./\\'
>>> AZERTY = 'AZERTYUIOP^|QSDFGHJKLM%|WXCVBN?./+='

>>> is_keyboard(QWERTY)
True
>>> is_keyboard(AZERTY)
True
>>> is_keyboard(42)
False
>>> is_keyboard('ABC|DEF|ABC')
False
>>> is_keyboard('ABC|DEF|GHIJ')
False

>>> finger_position('W')
(0, 1)
>>> finger_position('w', keyboard=AZERTY)
(2, 0)
>>> finger_position('ß')
Traceback (most recent call last):
AssertionError: unknown character

>>> finger_positions('abatement')
[(1, 0), (2, 4), (1, 0), (0, 4), (0, 2), (2, 6), (0, 2), (2, 5), (0, 4)]
>>> finger_positions('monotonicity')
[(2, 6), (0, 8), (2, 5), (0, 8), (0, 4), (0, 8), (2, 5), (0, 7), (2, 2), (0, 7), (0, 4), (0, 5)]
>>> finger_positions('monotonicity', keyboard=AZERTY)
[(1, 9), (0, 8), (2, 5), (0, 8), (0, 4), (0, 8), (2, 5), (0, 7), (2, 2), (0, 7), (0, 4), (0, 5)]
>>> finger_positions('Ellison')
[(0, 2), (1, 8), (1, 8), (0, 7), (1, 1), (0, 8), (2, 5)]

>>> movements('abatement')
'RLRLRLRL'
>>> movements('monotonicity')
'RLRLRLRLRLR'
>>> movements('monotonicity', keyboard=AZERTY)
'LLRLRLRLRLR'
>>> movements('Ellison')
'RSLLRL'

>>> is_ellison_word('abatement')
True
>>> is_ellison_word('monotonicity')
True
>>> is_ellison_word('monotonicity', keyboard=AZERTY)
False
>>> is_ellison_word('Ellison')
False

>>> score('dodo')
0.0
>>> score('dodos')
0.3333333333333333
>>> score('banana')
0.25
>>> score('austere')
1.082842712474619
>>> score('terebene')
0.7060113295832983
>>> score('abatement')
1.4377850146065685
>>> score('maharajahs')
1.101569900005158
>>> score('prohibitory', keyboard=AZERTY)
1.405586837763654
>>> score('monotonicity')
1.430056307974577
>>> score('mononucleosis')
1.2409346854429897

Resources