In 1605, the English philosopher, statesman, scientist, jurist, orator, and author Francis Bacon devised a method to conceal secret messages inside ordinary-looking text. Bacon's cipher works in two successive steps.
To encode a secret message, each letter of the message is first replaced by a unique group of five symbols taken from a two-symbol alphabet. The above image shows an example of the key using a two-letter alphabet (a and b), as used by Francis Bacon in his De Augmentis Scientiarum (English: The Advancement of Learning, 1605). Because that work was written in Latin, the key follows the common practice of equating the letters i and j, and the letters u and v.
However, because this is a binary code that uses two symbols in five positions, it can encode a total of $$2^5 = 32$$ different characters. Bacon's cipher can therefore be applied with a key that uniquely encodes each of the 26 letters of our alphabet and still has room for 6 additional characters (for example, a space and some punctuation marks).
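As a minimal sketch of such a 26-letter key, the snippet below assigns each letter the five-bit binary form of its position in the alphabet and writes 0 as a and 1 as b. This numbering is an assumption for illustration; it is not the only possible key, and the name `key` is simply chosen here.

```python
import string

# Assumption: the n-th letter (A=0, B=1, ...) gets its 5-bit binary index,
# written with 'a' for 0 and 'b' for 1.
to_ab = str.maketrans("01", "ab")
key = {
    letter: format(index, "05b").translate(to_ab)
    for index, letter in enumerate(string.ascii_uppercase)
}

# Five positions with two symbols give 2**5 codes; 26 of them are used.
unused = 2 ** 5 - len(key)

print(key["A"], key["L"], key["Z"])  # aaaaa ababb bbaab
print(unused)                        # 6
```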
As a second step, we take an arbitrary piece of text and write it using two different fonts. One font corresponds, for example, to the symbol a, and the other font to the symbol b from the previous step. It therefore does not matter which letters are used in the text, as we rely only on the font to encode the secret message. The image above shows the fonts used by Francis Bacon. To conceal that the arbitrary text contains a secret message, he used two fonts that differ only subtly from each other. After all, in handwriting, an encoding that uses two almost identical fonts hardly catches the eye.
Instead of using fonts, we could also represent the symbol a by lowercase letters and the symbol b by uppercase letters. Let's apply this strategy in an example where we want to encode the word ALICE. As a first step, we replace each letter by a sequence of five a's or b's:
A     L     I     C     E
aaaaa ababb abaaa aaaba aabaa
Then we encode this secret message using lowercase letters (representing the symbol a) and uppercase letters (representing the symbol b) into the sentence Draco Dormiens Numquam Titillandus:
aaaaa ababb abaaa aaaba aabaa
draco dOrMI eNsnu mquAm tiTil
In order to further illustrate the principle, we have highlighted all uppercase letters and the corresponding symbols b in boldface.
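The second step of this example can be reproduced in a few lines of Python. This is only an illustrative sketch: the five codes are copied by hand from the example, and the variable names (`codes`, `carrier`, `hidden`, `groups`) are chosen here for illustration.

```python
# Codes for A, L, I, C and E, copied from the worked example.
codes = ["aaaaa", "ababb", "abaaa", "aaaba", "aabaa"]
carrier = "Draco Dormiens Numquam Titillandus"

# Only the letters of the carrier text matter; spaces are skipped.
letters = [c for c in carrier if c.isalpha()]
pattern = "".join(codes)

# Symbol a -> lowercase letter, symbol b -> uppercase letter.
# zip stops after the 25 symbols of the pattern.
hidden = "".join(
    letter.upper() if symbol == "b" else letter.lower()
    for letter, symbol in zip(letters, pattern)
)

# Split into groups of five so the result lines up with the codes.
groups = [hidden[i:i + 5] for i in range(0, len(hidden), 5)]
print(" ".join(groups))  # draco dOrMI eNsnu mquAm tiTil
```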
To apply Bacon's cipher, we make use of text files that define a fixed mapping between each of the 26 letters of the alphabet and their corresponding unique code of 5 binary symbols. Each line of such a text file starts with an uppercase letter, followed by a space, the unique code of 5 symbols from the alphabet {a, b}, another space, and the same unique code that now uses 5 symbols from the alphabet {0, 1}. Below, you can see an example of such a text file (restricted to the first few lines):
A aaaaa 00000
B aaaab 00001
C aaaba 00010
D aaabb 00011
E aabaa 00100
F aabab 00101
…
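Parsing this format could look as follows. The sketch is one possible approach, not a reference solution: it splits each line on whitespace, keeps the letter and the {a, b} code, and skips any line that does not have exactly three fields. The temporary file and the names `sample` and `sample_key` exist only for this demonstration.

```python
import os
import tempfile


def readKey(location):
    """Map each letter in the key file onto its code over {a, b}."""
    key = {}
    with open(location, encoding="utf-8") as lines:
        for line in lines:
            fields = line.split()
            if len(fields) == 3:  # letter, a/b code, 0/1 code
                letter, code_ab, _code_01 = fields
                key[letter] = code_ab
    return key


# Demonstration on the six lines shown above, written to a temporary file.
sample = (
    "A aaaaa 00000\nB aaaab 00001\nC aaaba 00010\n"
    "D aaabb 00011\nE aabaa 00100\nF aabab 00101\n"
)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample_key.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(sample)
    sample_key = readKey(path)

print(sample_key["C"])  # aaaba
```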
Your task:
Write a function readKey that takes the location of a text file. The given text file must contain a mapping of all letters of the alphabet onto their corresponding code of 5 symbols from a binary alphabet, in the format described above. The function must return a dictionary that maps each uppercase letter of our alphabet onto its unique code over the alphabet {a, b}, as defined in the given text file.
Write a function encode that takes two arguments: i) a dictionary that maps all uppercase letters of our alphabet onto a unique code that uses 5 symbols from the alphabet {a, b} (cf. the dictionaries returned by the function readKey) and ii) a message (string) that needs to be encoded. The function also has an optional third parameter that takes a piece of text (string). The function must use Bacon's cipher to encode the given message into the given text, using a binary code of lowercase letters (symbol a) and uppercase letters (symbol b), and must return the result. In doing so, the function must only take into account the letters in the given text and the given message (all characters that are not letters must be ignored) and must make no distinction between uppercase and lowercase letters in the given message. In case the given text is too short to encode the given message, the text wraps around: its last letter is followed again by its first letter. In case no piece of text is passed to the optional third parameter, or in case the given text does not contain a single letter, the function must encode the given message into a randomly generated string of letters.
Write a function decode that takes two arguments: i) a dictionary that maps all uppercase letters of our alphabet onto a unique code that uses 5 symbols from the alphabet {a, b} (cf. the dictionaries returned by the function readKey) and ii) a message (string) that has been encoded using the function encode (based on the given dictionary). The function must return the letters of the original version of the message, using uppercase letters only.
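The functions encode and decode described above could be approached along the following lines. This is a hedged sketch rather than a reference solution. It assumes a key in which each letter gets the five-bit binary form of its alphabet position (built inline here as a stand-in for readKey), restricts "letters" to the 26 ASCII letters, and lets decode silently ignore an incomplete trailing group. The helper `random_letters` is a name introduced only for this sketch.

```python
import random
import string
from itertools import cycle

# Assumed stand-in for readKey: the n-th letter gets its 5-bit binary index.
to_ab = str.maketrans("01", "ab")
key = {
    letter: format(index, "05b").translate(to_ab)
    for index, letter in enumerate(string.ascii_uppercase)
}


def random_letters():
    """Endless stream of random lowercase letters (fallback carrier)."""
    while True:
        yield random.choice(string.ascii_lowercase)


def encode(key, message, text=None):
    # Keep only the letters of the carrier text.
    carrier = [c for c in (text or "") if c in string.ascii_letters]
    # cycle() wraps around when the text is too short; without any
    # letters, fall back to randomly generated letters.
    letters = cycle(carrier) if carrier else random_letters()
    # Concatenate the a/b codes of all letters in the message.
    pattern = "".join(
        key[c.upper()] for c in message if c in string.ascii_letters
    )
    encoded = []
    for symbol in pattern:
        letter = next(letters)
        encoded.append(letter.upper() if symbol == "b" else letter.lower())
    return "".join(encoded)


def decode(key, encoded):
    # Invert the key so that each a/b code points back to its letter.
    inverse = {code: letter for letter, code in key.items()}
    symbols = "".join(
        "b" if c.isupper() else "a"
        for c in encoded
        if c in string.ascii_letters
    )
    # Read the symbols in complete groups of five.
    return "".join(
        inverse[symbols[i:i + 5]] for i in range(0, len(symbols) - 4, 5)
    )


print(encode(key, "ALICE", "ora et labora"))  # oraetlAbORaOraetlaBoraOra
print(decode(key, encode(key, "Alice!")))     # ALICE
```

Because the carrier is reduced to its letters before the case pattern is applied, the original capitalisation and spacing of the text are not preserved in the result.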
In the following interactive session, we assume the text file key.txt to be located in the current directory.
>>> key = readKey('key.txt')
>>> key['A']
'aaaaa'
>>> key['Z']
'bbaab'
>>> encode(key, 'ALICE', 'Draco Dormiens Numquam Titillandus')
'dracodOrMIeNsnumquAmtiTil'
>>> encode(key, 'ALICE', 'ora et labora')
'oraetlAbORaOraetlaBoraOra'
>>> encode(key, 'ALICE')
'obthizZhBQwAatlmjrChikWwe'
>>> encode(key, 'ALICE')
'pwsychLwVIeRqiuxtwKcxqLtn'
>>> decode(key, 'dracodOrMIeNsnumquAmtiTil')
'ALICE'
>>> decode(key, 'oraetlAbORaOraetlaBoraOra')
'ALICE'
>>> decode(key, 'obthizZhBQwAatlmjrChikWwe')
'ALICE'
>>> decode(key, 'pwsychLwVIeRqiuxtwKcxqLtn')
'ALICE'