You have set up a laboratory experiment in which you use a particle accelerator to bombard bismuth atoms with zinc. The atomic collisions generated by the experiment produce a large number of chemical elements. Since all of these elements were unknown until today, this discovery will undoubtedly earn you the Nobel Prize in Chemistry. To publish your results, you still need to come up with a name for each of the new elements.

Because you want to assign names that are in line with the names of existing chemical elements, you decide to proceed in the following way. The name of a chemical element is written as a capital letter followed by two or more lowercase letters, and you stick to this usage of uppercase and lowercase letters. To be able to recognize where element names end, you (temporarily) append an underscore (_) to the names of the existing chemical elements. Then, you proceed as follows:

  1. Randomly pick a name of an existing chemical element, and take the first three letters of that name as the initial letters of the new name you are going to construct.

  2. Take the last two letters of the provisional new name, and search the names of the existing chemical elements for all possible characters (lowercase letters or underscores) that follow this bigram. Randomly pick a character from the list of candidates, and append it to the provisional new name.

  3. If you have chosen an underscore during step 2, the new name is considered to be complete. Of course, in that case the underscore is not added to the name. Otherwise, keep repeating step 2 until an underscore is chosen.

If the above procedure yields a name that was already assigned to an existing chemical element, you simply repeat the whole process until it produces a new name.
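
The procedure above can be sketched as a small Python function. This is only an illustration under stated assumptions, not the reference solution: the function name generate_name and its parameters are hypothetical, the prefixes are assumed to be given as a set of three-letter strings, the followers of each bigram as a dictionary that maps the bigram onto a set of characters, and every candidate is assumed to be equally likely.

```python
import random

def generate_name(prefixes, triples, existing, rng=random):
    """Generate a name via steps 1-3, retrying while the result already exists.

    All names and parameters here are hypothetical illustrations.
    """
    while True:
        # step 1: start from a randomly picked three-letter prefix
        name = rng.choice(sorted(prefixes))
        while True:
            # step 2: pick a random follower of the last two letters
            char = rng.choice(sorted(triples[name[-2:]]))
            # step 3: an underscore completes the name and is not appended
            if char == '_':
                break
            name += char
        # repeat the whole process if the name is already taken
        if name not in existing:
            return name

# data derived by hand from the two names Osmium_ and Bismuth_
prefixes = {'Osm', 'Bis'}
triples = {
    'sm': {'u', 'i'}, 'mi': {'u'}, 'iu': {'m'}, 'um': {'_'},
    'is': {'m'}, 'mu': {'t'}, 'ut': {'h'}, 'th': {'_'},
}
existing = {'Osmium', 'Bismuth'}

# with this data only 'Osmuth' and 'Bismium' can be returned
print(generate_name(prefixes, triples, existing))
```

The candidates are sorted before picking so that a seeded random.Random object passed as rng gives reproducible results; the choice itself remains uniform.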

Assignment

Define a class NameGenerator that can be used to generate new names based on a sequence of example names, following the procedure outlined in the introduction. As the example below illustrates, the objects of this class should at least have the properties prefixes and triples, and the methods add_name, add_names and name.

When implementing these methods, make sure to reuse the methods that have already been implemented wherever possible.

Example

In the following interactive session we assume that the file shortlist_elements.txt is located in the current directory.

>>> chemGen = NameGenerator()

>>> chemGen.add_name('Osmium')
>>> chemGen.prefixes
{'Osm'}
>>> chemGen.triples
{'sm': {'i'}, 'mi': {'u'}, 'iu': {'m'}, 'um': {'_'}}

>>> chemGen.add_name('bismuth')
Traceback (most recent call last):
AssertionError: invalid name
>>> chemGen.add_name('zINC')
Traceback (most recent call last):
AssertionError: invalid name
>>> chemGen.add_name('pH')
Traceback (most recent call last):
AssertionError: invalid name

>>> chemGen.add_name('Bismuth')
>>> chemGen.prefixes
{'Osm', 'Bis'}
>>> chemGen.triples
{'sm': {'u', 'i'}, 'mi': {'u'}, 'iu': {'m'}, 'um': {'_'}, 'is': {'m'}, 'mu': {'t'}, 'ut': {'h'}, 'th': {'_'}}

>>> chemGen.add_names('shortlist_elements.txt')
>>> chemGen.prefixes
{'Tha', 'Tel', 'Lan', 'Rut', 'Plu', 'Unu', 'Osm', 'Bis'}
>>> chemGen.triples
{'sm': {'u', 'i'}, 'mi': {'u'}, 'iu': {'m'}, 'um': {'_'}, 'is': {'m'}, 'mu': {'t'}, 'ut': {'o', 'h'}, 'th': {'e', '_', 'a'}, 'he': {'n', 'r', 'x'}, 'en': {'i'}, 'ni': {'u'}, 'an': {'u', 't'}, 'nt': {'h'}, 'ha': {'n', 'l'}, 'nu': {'n', 'm'}, 'al': {'l'}, 'll': {'u', 'i'}, 'li': {'u'}, 'el': {'l'}, 'lu': {'r', 't'}, 'ur': {'i'}, 'ri': {'u'}, 'to': {'n'}, 'on': {'i'}, 'er': {'f'}, 'rf': {'o'}, 'fo': {'r'}, 'or': {'d'}, 'rd': {'i'}, 'di': {'u'}, 'un': {'h'}, 'nh': {'e'}, 'ex': {'i'}, 'xi': {'u'}}

>>> chemGen.name()
'Osmuthalluthexium'
>>> chemGen.name()
'Ruthanthanium'
>>> chemGen.name()
'Lantherfordium'
>>> chemGen.name()
'Thanthenium'
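
The values of prefixes and triples in the session above are consistent with the following way of processing a single name. This is a minimal sketch, not the required implementation: the helper name analyse is hypothetical, and the regular expression assumes that names consist of plain ASCII letters only.

```python
import re

def analyse(name, prefixes, triples):
    """Record the prefix and the bigram followers of a single name.

    Hypothetical helper; updates the given set and dictionary in place.
    """
    # a valid name is one uppercase letter followed by two or more lowercase letters
    assert re.fullmatch(r'[A-Z][a-z]{2,}', name), 'invalid name'
    prefixes.add(name[:3])
    # temporarily append an underscore to mark the end of the name
    marked = name + '_'
    # start at index 1: the first bigram of interest ends the three-letter prefix
    for i in range(1, len(marked) - 2):
        triples.setdefault(marked[i:i + 2], set()).add(marked[i + 2])

prefixes, triples = set(), {}
analyse('Osmium', prefixes, triples)
analyse('Bismuth', prefixes, triples)
print(prefixes)
print(triples)
```

Starting the loop at the second letter means the bigram formed by the capital letter and its successor is never stored, which matches the dictionaries shown in the example.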

Resources