In 1954, James Watson (PRO) and Francis Crick (TYR) formed the RNA Tie Club — a scientific gentleman's club whose mission was to solve the riddle of the RNA structure and to understand how it built proteins. The club had 20 members, each of whom was designated by an amino acid (the building blocks of proteins).
| member | training | tie designation |
|---|---|---|
| George Gamow | physicist | ALA |
| Alexander Rich | biochemist | ARG |
| Paul Doty | physical chemist | ASP |
| Robert Ledley | mathematical biophysicist | ASN |
| Martynas Ycas | biochemist | CYS |
| Robley Williams | electron microscopist | GLU |
| Alexander Dounce | biochemist | GLN |
| Richard Feynman | theoretical physicist | GLY |
| Melvin Calvin | chemist | HIS |
| Norman Simmons | biochemist | ISO |
| Edward Teller | physicist | LEU |
| Erwin Chargaff | biochemist | LYS |
| Nicholas Metropolis | physicist, mathematician | MET |
| Gunther Stent | physical chemist | PHE |
| James Watson | biologist | PRO |
| Harold Gordon | biologist | SER |
| Leslie Orgel | theoretical chemist | THR |
| Max Delbrück | theoretical physicist | TRY |
| Francis Crick | biologist | TYR |
| Sydney Brenner | biologist | VAL |
In his memoirs, George Gamow (ALA) recalled:
We were just drinking California wine and we got the idea.
Each member was given a black woolen necktie with an RNA helix embroidered in green and yellow (photograph, left to right: Francis Crick (TYR), Alexander Rich (ARG), Leslie Orgel (THR) and James Watson (PRO)).
Each member also received a gold tiepin with the three-letter abbreviation of his amino acid, which led several people to ask George Gamow (ALA) why his pin bore the wrong monogram.
Adopting the motto "Do or die, or don't try", they met twice a year to share ideas, cigars and alcohol. Several members of the RNA Tie Club went on to become Nobel Prize laureates, but it fell to Marshall Nirenberg, a non-member, to finally decipher the genetic code that forms the link between nucleic acids and amino acids.
We work with a secret code that the members of a club can use to exchange messages that are gibberish to non-members. The club members are registered in a comma-separated values (CSV) file whose first column contains the name of a club member and whose third column contains a designation that includes one or more uppercase letters. Each club member has a unique designation.
A secret message is represented by a sequence (list or tuple) of codes, where each code is a string (str) that contains a designation $$d$$ (one or more uppercase letters) that corresponds to a club member, followed by a position $$p$$ (one or more digits). Possible codes are GLU3, GLY2 or ALA10. To decode the secret message, each code must be replaced by the $$p$$-th letter in the name of the club member that corresponds to designation $$d$$. Positions start at 1 and count letters only, so spaces in a name are skipped. All letters in the decoded message are uppercase. For example, for the members of the RNA Tie Club, the code GLU3 corresponds to the letter B (third letter of Robley Williams), the code GLY2 to the letter I (second letter of Richard Feynman), and ALA10 to the letter O (tenth letter of George Gamow).
Your task:
Write a function read_designations that takes the location (str) of a CSV file. The given CSV file must contain the registered club members in the format described above. The function must return a dictionary (dict) that maps all designations (str) from the given CSV file onto the names (str) of the corresponding club members. The names of the club members should be reduced to letters only, and converted to uppercase.
Write a function split_code that takes a code (str) containing a designation (one or more uppercase letters) followed by a position (one or more digits). The function must return a tuple containing the designation (str) and the position (int) as separate elements. If the argument does not represent a valid code, the function must raise an AssertionError with the message invalid code.
Write a function decode that takes two arguments: i) a secret message (list or tuple of codes) and ii) the dictionary (dict; as returned by the function read_designations) containing the list of club members that was used to encode the secret message. The function must return the decoded message (str).
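One possible way to approach split_code and decode is sketched below. It assumes that a valid code consists only of ASCII uppercase letters followed by ASCII digits, and that positions count from 1. The use of the re module and the small hand-written members dictionary are choices made for this illustration, not part of the assignment.

```python
import re


def split_code(code):
    """Split a code such as 'ALA10' into the tuple ('ALA', 10)."""
    # The whole string must be uppercase letters followed by digits.
    match = re.fullmatch(r'([A-Z]+)([0-9]+)', code)
    assert match is not None, 'invalid code'
    return match.group(1), int(match.group(2))


def decode(message, designations):
    """Decode a sequence of codes using a designation -> name dictionary."""
    letters = []
    for code in message:
        designation, position = split_code(code)
        name = designations[designation]
        # Positions in a code are 1-based, Python indices are 0-based.
        letters.append(name[position - 1])
    return ''.join(letters)


# Hand-written excerpt of the dictionary, for illustration only.
members = {
    'GLU': 'ROBLEYWILLIAMS',
    'GLY': 'RICHARDFEYNMAN',
    'ALA': 'GEORGEGAMOW',
}
print(decode(['GLU3', 'GLY2', 'ALA10'], members))  # BIO
```

The task text does not say what should happen for an unknown designation or for a position that falls outside the name, so this sketch does no extra validation for those cases.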
In the following interactive session, we assume the CSV file RnaTieClub.csv to be located in the current directory.
>>> designation = read_designations('RnaTieClub.csv')
>>> designation['GLU']
'ROBLEYWILLIAMS'
>>> designation['GLY']
'RICHARDFEYNMAN'
>>> designation['ALA']
'GEORGEGAMOW'
>>> split_code('GLU3')
('GLU', 3)
>>> split_code('GLY2')
('GLY', 2)
>>> split_code('ALA10')
('ALA', 10)
>>> split_code('R2D2')
Traceback (most recent call last):
AssertionError: invalid code
>>> decode(['GLU3', 'GLY2', 'ALA10', 'ASP4', 'ASP6', 'THR9', 'HIS11', 'PHE8', 'PHE4'], designation)
'BIOLOGIST'
>>> decode(('MET14', 'SER1', 'CYS5', 'PRO9', 'LYS4', 'HIS7', 'GLU11', 'GLU14', 'PHE4'), designation)
'PHYSICIST'
>>> decode(['CYS10', 'MET4', 'ARG8', 'ISO4', 'GLU8', 'MET18', 'PHE12'], designation)
'CHEMIST'
>>> decode(['THR9', 'PHE6', 'THR7', 'LEU9', 'GLU2', 'ALA1', 'TYR6', 'GLU14', 'ASP7'], designation)
'GEOLOGIST'
>>> decode(['THR9', 'ARG3', 'MET5', 'ALA5', 'ASN5', 'CYS2', 'MET14', 'GLY4', 'ASN11', 'ASN1'], designation)
'GEOGRAPHER'
>>> decode(['LYS11', 'MET8', 'ASP7', 'PHE7', 'ALA10', 'ISO11', 'ASN2', 'MET9', 'MET10', 'LEU5'], designation)
'ASTRONOMER'
>>> decode(['CYS12', 'PRO8', 'LYS8', 'CYS4', 'MET17', 'ISO12', 'PHE12', 'MET17', 'LYS6', 'MET17', 'PRO2', 'HIS6'], designation)
'STATISTICIAN'
>>> decode(['VAL7', 'ARG11', 'ALA3', 'MET3', 'ARG13', 'ASN11', 'HIS1', 'HIS5', 'PHE8', 'MET11'], designation)
'BIOCHEMIST'
>>> decode(['GLY12', 'ISO5', 'PHE9', 'GLY4', 'ASN8', 'ISO9', 'SER2', 'LEU7', 'LYS4', 'CYS10', 'TYR10', 'ALA8', 'GLN13'], designation)
'MATHEMATICIAN'
>>> decode(['ARG12', 'SER8', 'GLU13', 'ASP1', 'TRY9', 'PHE9', 'THR10', 'LEU12', 'MET8', 'GLN14', 'ISO8', 'ALA6', 'TYR4', 'LEU7', 'HIS5', 'CYS8', 'LEU7'], designation)
'COMPUTERSCIENTIST'
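The read_designations call that opens the session above could be sketched as follows. This is only an illustration under several assumptions that the task text does not confirm: the file has no header row, it is encoded as UTF-8, and fields that contain a comma (such as "physicist, mathematician") are enclosed in double quotes so that the csv module can parse them.

```python
import csv


def read_designations(location):
    """Map each designation in the CSV file to an uppercase, letters-only name."""
    designations = {}
    # Assumption: UTF-8 encoded file without a header row.
    with open(location, newline='', encoding='utf-8') as handle:
        for row in csv.reader(handle):
            if len(row) < 3:
                continue  # skip blank or incomplete lines
            name, designation = row[0], row[2].strip()
            # Keep only the letters of the name and convert them to uppercase.
            letters = ''.join(char for char in name if char.isalpha())
            designations[designation] = letters.upper()
    return designations
```

Using csv.reader rather than splitting each line on commas means that a quoted second column containing a comma does not shift the designation out of the third column.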