In the periodic table each chemical element is assigned a symbolic name consisting of one, two or three letters. If a word can be written as a sequence of those symbolic names, we call it a chemical word. For example, catalysis is a chemical word, because it can be written as C-At-Al-Y-Si-S, the sequence of symbolic names for the elements carbon, astatine, aluminum, yttrium, silicon and sulphur. However, catalyst is not a chemical word. Below are a few examples of chemical words.

scientific term    chemical word
basic              Ba-Si-C
halogens           H-Al-O-Ge-N-S
catalysis          C-At-Al-Y-Si-S
interferon         In-Te-Rf-Er-O-N
psychogenesis      P-S-Y-C-Ho-Ge-Ne-Si-S
sepsis             Se-P-Si-S
syphilis           S-Y-P-H-I-Li-S

Assignment

Given is a text file periodic_system.txt that contains a list with information about the elements of the periodic table. Each line, except the first (which is the header), contains the following information about an element: i) atomic number, ii) symbolic representation, iii) English name, iv) Dutch name, and v) atomic mass. The information fields are separated by tabs. The assignment consists of determining whether a given word is a chemical word or not. To do this, follow these steps:

  1. Write a function readSymbols that takes the location of a text file as its argument. The content of this text file must be formatted like the file periodic_system.txt. The function must return the list of symbolic names that occur in the second column of the given text file.

  2. Write a function longestPrefix that takes two parameters. The first parameter word is a string, and the second parameter symbols is a list of strings. The function should return the longest prefix of word that is also a symbolic name in the list symbols. For example, the symbolic names B and Be are both prefixes of the word beach. The function should, however, return the longest prefix, which in this case is the string Be. If none of the symbolic names in the list symbols is a prefix of the given word, the function should return an empty string. The comparison between a symbolic name and the start of the word should ignore case. However, the returned prefix must keep the spelling of the symbolic name: Be and not BE or be.
    Terminology: A prefix consists of one or more letters at the start of a word. For example, chem is a prefix of the word chemistry.

  3. Write a function chemicalWord to which two parameters must be passed. The first parameter word is a string, and the second parameter symbols is a list of strings. If the word is a chemical word, then the function should return the word as a sequence of symbolic names for chemical elements. The symbolic names should be separated by a hyphen (-). If the word is not a chemical word, then the function should return an empty string.
    The procedure for determining whether a word is a chemical word, and at the same time generating its notation as a sequence of symbolic names, is illustrated in the table below for the word catalysis. As long as the word has not been reduced to the empty string, determine the longest prefix of the word that is also a symbolic name for an element (using the longestPrefix function). If the prefix found is not an empty string, remove it from the front of the word and append it to the sequence of symbolic names (taking into account the placement of hyphens). As soon as the longest prefix is the empty string, you may conclude that the word is not a chemical word, and the function returns an empty string. However, if you can repeat this procedure until the original word has been reduced to the empty string, you may conclude that the word is a chemical word, and you have also found its notation as a sequence of symbolic names.

    word         longest prefix    chemical word
    catalysis    C
    atalysis     At                C
    alysis       Al                C-At
    ysis         Y                 C-At-Al
    sis          Si                C-At-Al-Y
    s            S                 C-At-Al-Y-Si
                                   C-At-Al-Y-Si-S
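
The three steps above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: it assumes the file is UTF-8 encoded, it skips lines that have fewer than two tab-separated fields, and it follows the greedy procedure literally, so it never backtracks when a long prefix leads to a dead end. The small symbol list in the demo at the bottom is hand-written for illustration and is not read from periodic_system.txt.

```python
def readSymbols(filename):
    """Return the symbolic names found in the second column of a
    tab-separated file whose first line is a header."""
    symbols = []
    # Assumption: the file is UTF-8 encoded.
    with open(filename, encoding='utf-8') as handle:
        next(handle, None)  # skip the header line
        for line in handle:
            fields = line.rstrip('\n').split('\t')
            # Assumption: lines without a second field are ignored.
            if len(fields) >= 2 and fields[1].strip():
                symbols.append(fields[1].strip())
    return symbols


def longestPrefix(word, symbols):
    """Return the longest symbol that is a case-insensitive prefix of
    word, spelled as in symbols, or '' if there is none."""
    lowered = word.lower()
    best = ''
    for symbol in symbols:
        if len(symbol) > len(best) and lowered.startswith(symbol.lower()):
            best = symbol
    return best


def chemicalWord(word, symbols):
    """Return word as hyphen-separated symbols, or '' if the greedy
    longest-prefix procedure gets stuck."""
    parts = []
    rest = word
    while rest:
        prefix = longestPrefix(rest, symbols)
        if not prefix:
            return ''  # no symbol matches the remaining letters
        parts.append(prefix)
        rest = rest[len(prefix):]  # drop the matched prefix
    return '-'.join(parts)


if __name__ == '__main__':
    # Hypothetical, hand-written symbol list used only for this demo.
    demo = ['B', 'Be', 'Ba', 'Si', 'C', 'S']
    print(longestPrefix('beach', demo))    # Be
    print(longestPrefix('density', demo))  # prints an empty line
    print(chemicalWord('basic', demo))     # Ba-Si-C
```

With the real data, the list would instead come from readSymbols('periodic_system.txt'). Keeping longestPrefix separate from chemicalWord mirrors the steps of the assignment and makes each function easy to check on its own.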

Example

In the following interactive Python session we assume that the text file periodic_system.txt is located in the current directory.

>>> symbols = readSymbols('periodic_system.txt')

>>> longestPrefix('ages', symbols)
'Ag'
>>> longestPrefix('shop', symbols)
'S'
>>> longestPrefix('density', symbols)
''

>>> chemicalWord('catalysis', symbols)
'C-At-Al-Y-Si-S'
>>> chemicalWord('basic', symbols)
'Ba-Si-C'
>>> chemicalWord('leisure', symbols)
''