Braille is a tactile writing system used by people who are visually impaired. It is named after its creator, Louis Braille (1809–1852), a Frenchman who lost his sight as a result of a childhood accident. In 1824, at the age of fifteen, he developed a code for the French alphabet as an improvement on night writing. He published his system, which subsequently included musical notation, in 1829. The second revision, published in 1837, was the first small binary form of writing developed in the modern era.
Braille is traditionally written on embossed paper. Its characters are rectangular blocks called cells that have tiny bumps called raised dots, which can be read by running one's fingers along the words. A full braille cell includes six raised dots arranged in two columns of three dots each. The dot positions are identified by numbers from one to six.
From the six dots that make up a cell, 64 different configurations can be created. These 64 braille characters are included in the Unicode character set (and can therefore be stored in a UTF-8 encoded file), as can be seen in the chart below. Note that the first character in the chart (top left corner) is the braille character that includes no dots at all (making it invisible).
⠀ () | ⠁ (1) | ⠂ (2) | ⠃ (12) | ⠄ (3) | ⠅ (13) | ⠆ (23) | ⠇ (123)
⠈ (4) | ⠉ (14) | ⠊ (24) | ⠋ (124) | ⠌ (34) | ⠍ (134) | ⠎ (234) | ⠏ (1234)
⠐ (5) | ⠑ (15) | ⠒ (25) | ⠓ (125) | ⠔ (35) | ⠕ (135) | ⠖ (235) | ⠗ (1235)
⠘ (45) | ⠙ (145) | ⠚ (245) | ⠛ (1245) | ⠜ (345) | ⠝ (1345) | ⠞ (2345) | ⠟ (12345)
⠠ (6) | ⠡ (16) | ⠢ (26) | ⠣ (126) | ⠤ (36) | ⠥ (136) | ⠦ (236) | ⠧ (1236)
⠨ (46) | ⠩ (146) | ⠪ (246) | ⠫ (1246) | ⠬ (346) | ⠭ (1346) | ⠮ (2346) | ⠯ (12346)
⠰ (56) | ⠱ (156) | ⠲ (256) | ⠳ (1256) | ⠴ (356) | ⠵ (1356) | ⠶ (2356) | ⠷ (12356)
⠸ (456) | ⠹ (1456) | ⠺ (2456) | ⠻ (12456) | ⠼ (3456) | ⠽ (13456) | ⠾ (23456) | ⠿ (123456)
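The chart follows a regular pattern in Unicode: the 64 six-dot cells occupy the code points U+2800 through U+283F, and dot n contributes bit n - 1 to the offset from U+2800. The following is a minimal sketch of that relation. The helper names dots_to_cell and cell_to_dots are made up for illustration and are not part of the assignment.

```python
BRAILLE_BASE = 0x2800  # code point of the braille cell without any dots

def dots_to_cell(dots):
    """Return the braille cell for a string of dot numbers such as '14'."""
    offset = 0
    for dot in dots:
        offset |= 1 << (int(dot) - 1)  # dot n sets bit n - 1
    return chr(BRAILLE_BASE + offset)

def cell_to_dots(cell):
    """Return the dot numbers of a six-dot braille cell as a string."""
    offset = ord(cell) - BRAILLE_BASE
    return ''.join(str(n) for n in range(1, 7) if offset & (1 << (n - 1)))

print(dots_to_cell('14'))   # ⠉
print(cell_to_dots('⠼'))    # 3456
```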
When Braille was adapted to languages other than French, many variants of Braille script were created. In a Braille script each character is represented by a unique combination of one or more cells. For example, the lowercase letter c is represented by the cell ⠉, the uppercase letter C is represented by the combination of two cells ⠠⠉ and the digit 3 is represented by the combination of two cells ⠼⠉. Note that all three characters share ⠉ as the last cell of their representation.
If a character is represented by a combination of two or more cells, no single prefix of that combination in itself constitutes the representation of a character. From the examples above we can therefore derive that the cell ⠠ (first cell of the combination representing the letter C) and the cell ⠼ (first cell of the combination representing the digit 3) by themselves do not represent any character in the Braille script. However, it is possible that they are used as a prefix in other combinations. For example, the combination ⠠⠙ could represent the uppercase letter D and the combination ⠼⠙ could represent the digit 4.
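This prefix property can be checked mechanically. The sketch below assumes the combinations are available as a list of strings; the function name is_prefix_free is hypothetical. It relies on the fact that, once the combinations are sorted, a combination that is a prefix of any other one is also a prefix of its immediate successor.

```python
def is_prefix_free(combinations):
    """Check that no combination is a proper prefix of another one."""
    combos = sorted(set(combinations))
    # after sorting, a prefix is directly followed by a string it starts
    return not any(b.startswith(a) for a, b in zip(combos, combos[1:]))

sample = ['⠉', '⠠⠉', '⠼⠉', '⠠⠙', '⠼⠙']
print(is_prefix_free(sample))          # True
print(is_prefix_free(sample + ['⠠']))  # False: ⠠ is a prefix of ⠠⠉
```

A script with this property can be decoded unambiguously from left to right, because a combination that matches a character can never be the start of a longer one.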
A Braille script is stored in a text file that uses UTF-8 encoding. Each line contains an ASCII character, immediately followed by one or more braille characters that represent the combination of cells encoding the ASCII character in Braille.
 ⠀
a⠁
b⠃
c⠉
d⠙
e⠑
…
(⠐⠣
)⠐⠜
/⠸⠌
\⠸⠡
-⠤
Note that the first line of the above text file consists of a space followed by the braille character for the cell that includes no dots at all. Both characters are invisible, but they are present nonetheless.
When opening a text file somefile.txt, the character encoding used by the file can be passed to the parameter encoding of the built-in function open:
>>> open('somefile.txt', 'r', encoding='utf-8')
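Given the file layout described above, each line can be split into its ASCII character (the first character) and its combination of cells (the rest of the line without the newline). The following sketch assumes that every non-empty line holds exactly one entry. The helper name read_script and the small sample file are made up for illustration.

```python
import os
import tempfile

def read_script(location):
    """Return a list of (character, cells) pairs read from a script file."""
    pairs = []
    with open(location, 'r', encoding='utf-8') as script:
        for line in script:
            line = line.rstrip('\n')  # keep a leading space: it is an entry
            if line:                  # skip empty lines, if any
                pairs.append((line[0], line[1:]))
    return pairs

# write a tiny sample script (space, a and C) to a temporary file
sample = ' \u2800\na\u2801\nC\u2820\u2809\n'
with tempfile.NamedTemporaryFile('w', encoding='utf-8', suffix='.txt',
                                 delete=False) as handle:
    handle.write(sample)

pairs = read_script(handle.name)
os.remove(handle.name)
print(pairs)
```

Only the newline is stripped here. Calling strip() without arguments would also remove the space that starts the first line of the script.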
Your task:
Write a function char2braille that takes the location (str) of a text file containing a Braille script. The function must return a dictionary (dict) that maps each ASCII character (str) from the file onto the combination of Braille cells (str) encoding the ASCII character in the script.
Write a function braille2char that takes the location (str) of a text file containing a Braille script. The function must return a dictionary (dict) that maps each combination of Braille cells (str) from the script onto the corresponding ASCII character (str) it encodes.
Write a function encode that takes two arguments: i) a string $$s$$ (str) and ii) a dictionary (dict) for a Braille script as returned by the function char2braille. The function may assume that $$s$$ only contains ASCII characters that occur in the script and must return the representation of $$s$$ in Braille (str).
Write a function decode that takes two arguments: i) a string $$s$$ (str) that represents a message in Braille and ii) a dictionary $$d$$ (dict) for a Braille script as returned by the function braille2char. The function may assume that the string $$s$$ uses the script that is represented by dictionary $$d$$ and must return the original message (str).
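One possible way to approach encode and decode is sketched below, using a small hand-written dictionary instead of one read from a file. Decoding relies on the prefix property of a Braille script: when cells are accumulated from left to right, the first combination that occurs in the dictionary must be a complete character. This is an illustration under those assumptions, not a reference solution.

```python
def encode(s, char2cells):
    """Translate every character of s into its combination of cells."""
    return ''.join(char2cells[char] for char in s)

def decode(s, cells2char):
    """Translate a message in Braille back into the original characters."""
    message, combination = [], ''
    for cell in s:
        combination += cell
        if combination in cells2char:  # a complete character was read
            message.append(cells2char[combination])
            combination = ''
    return ''.join(message)

# hand-written excerpt of a script; a real one would be read from a file
c2b = {'c': '⠉', 'm': '⠍', '1': '⠼⠁', '0': '⠼⠚'}
b2c = {cells: char for char, cells in c2b.items()}

braille = encode('100cm', c2b)
print(braille)               # ⠼⠁⠼⠚⠼⠚⠉⠍
print(decode(braille, b2c))  # 100cm
```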
In the following interactive session we assume the text file braille.txt to be located in the current directory.
>>> c2b = char2braille('braille.txt')
>>> c2b['c']
'⠉'
>>> c2b['C']
'⠠⠉'
>>> c2b['3']
'⠼⠉'
>>> c2b['"']
'⠘⠦'
>>> b2k = braille2char('braille.txt')
>>> b2k['⠉']
'c'
>>> b2k['⠠⠉']
'C'
>>> b2k['⠼⠉']
'3'
>>> b2k['⠘⠦']
'"'
>>> c2b = char2braille('braille.txt')
>>> encode('braille', c2b)
'⠃⠗⠁⠊⠇⠇⠑'
>>> encode('Louis Braille', c2b)
'⠠⠇⠕⠥⠊⠎⠀⠠⠃⠗⠁⠊⠇⠇⠑'
>>> encode('100cm', c2b)
'⠼⠁⠼⠚⠼⠚⠉⠍'
>>> encode('6\'10"', c2b)
'⠼⠋⠄⠼⠁⠼⠚⠘⠦'
>>> b2k = braille2char('braille.txt')
>>> decode('⠃⠗⠁⠊⠇⠇⠑', b2k)
'braille'
>>> decode('⠠⠇⠕⠥⠊⠎⠀⠠⠃⠗⠁⠊⠇⠇⠑', b2k)
'Louis Braille'
>>> decode('⠼⠁⠼⠚⠼⠚⠉⠍', b2k)
'100cm'
>>> decode('⠼⠋⠄⠼⠁⠼⠚⠘⠦', b2k)
'6\'10"'