In the periodic table, each chemical element is assigned a symbolic name consisting of one, two or three letters. If a word can be written as a sequence of those symbolic names, then we say that this word is a chemical word. For example, catalysis is a chemical word, because it can be written as C-At-Al-Y-Si-S, the sequence of symbolic names for the elements carbon, astatine, aluminum, yttrium, silicon and sulphur. However, catalyst is not a chemical word. Below are a few examples of chemical words.
scientific term | chemical word |
---|---|
basic | Ba-Si-C |
halogens | H-Al-O-Ge-N-S |
catalysis | C-At-Al-Y-Si-S |
interferon | In-Te-Rf-Er-O-N |
psychogenesis | P-S-Y-C-Ho-Ge-Ne-Si-S |
sepsis | Se-P-Si-S |
syphilis | S-Y-P-H-I-Li-S |
You are given a text file periodic_system.txt that contains a list with information about the elements from the periodic table. Each line, except the first (the header), contains the following information about an element: i) atomic number, ii) symbolic representation, iii) English name, iv) Dutch name, and v) atomic mass. The fields are separated by tabs. The assignment consists of determining whether a given word is a chemical word or not. To do this, follow these steps:
Write a function readSymbols that takes the location of a text file as its argument. The content of this text file must be formatted like the file periodic_system.txt. The function must return the list of symbolic names that occur in the second column of the given text file.
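As an illustration of the file format described above, here is a minimal sketch of readSymbols. It assumes a UTF-8 encoded file and silently skips lines with fewer than two fields; neither detail is specified by the assignment. The header names in the sample data are hypothetical, and the demo writes its own two-element sample file rather than relying on periodic_system.txt.

```python
import os
import tempfile


def readSymbols(path):
    """Return the symbolic names found in the second tab-separated column."""
    symbols = []
    with open(path, encoding='utf-8') as infile:  # encoding is an assumption
        next(infile, None)  # skip the header line
        for line in infile:
            fields = line.rstrip('\n').split('\t')
            if len(fields) >= 2:  # ignore blank or malformed lines
                symbols.append(fields[1])
    return symbols


# Demo on a tiny sample file; the header names are made up for illustration.
sample = (
    "number\tsymbol\tenglish\tdutch\tmass\n"
    "1\tH\tHydrogen\tWaterstof\t1.008\n"
    "2\tHe\tHelium\tHelium\t4.0026\n"
)
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, 'sample.txt')
    with open(sample_path, 'w', encoding='utf-8') as outfile:
        outfile.write(sample)
    demo_symbols = readSymbols(sample_path)

print(demo_symbols)  # ['H', 'He']
```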
Write a function longestPrefix to which two parameters must be passed. The first parameter word is a string, and the second parameter symbols is a list of strings. The function should return the longest prefix of word that is also a symbolic name that occurs in the list symbols. For example, the symbolic names B and Be are both prefixes of the word beach. The function should, however, return the longest prefix, so in this case that would be the string Be. If none of the symbolic names from the list symbols is a prefix of the given word, then the function should return an empty string. When determining whether a symbolic name is a prefix of a word, case should be ignored. However, the returned prefix must keep the spelling of the symbolic name: Be, and not BE or be.
Terminology: A prefix consists of one or more letters at the start of a word. For example, chem is a prefix of the word chemistry.
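The prefix rule above could be sketched as follows. This is one possible approach, not the only one: it scans every symbol, compares in lowercase, and keeps the longest match in its original spelling. The short symbols list is hand-picked for illustration.

```python
def longestPrefix(word, symbols):
    """Return the longest symbol that is a case-insensitive prefix of word."""
    lowered = word.lower()
    best = ''
    for symbol in symbols:
        # Compare in lowercase, but remember the symbol as it was spelled.
        if lowered.startswith(symbol.lower()) and len(symbol) > len(best):
            best = symbol
    return best


# Hand-picked symbols for illustration only, not the full periodic table.
symbols = ['B', 'Be', 'Ag', 'S', 'Si']

print(longestPrefix('beach', symbols))          # Be
print(longestPrefix('AGES', symbols))           # Ag
print(repr(longestPrefix('density', symbols)))  # ''
```

Scanning the whole list keeps the code short; for a list of roughly a hundred symbols this is fast enough that no index is needed.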
Write a function chemicalWord to which two parameters must be passed. The first parameter word is a string, and the second parameter symbols is a list of strings. If the word is a chemical word, then the function should return the word as a sequence of symbolic names for chemical elements. The symbolic names should be separated by a hyphen (-). If the word is not a chemical word, then the function should return an empty string.
The procedure for determining whether a word is a chemical word, and for generating its notation as a sequence of symbolic names at the same time, is illustrated in the table below for the word catalysis. As long as the word has not been reduced to the empty string, determine the longest prefix of the word that is also a symbolic name for an element (using the longestPrefix function). If the prefix found is not an empty string, remove it from the front of the word and append it to the sequence of symbolic names (taking into account the placement of hyphens). As soon as the longest prefix is an empty string, you may conclude that the word is not a chemical word, and the function should return an empty string. However, if you can repeat the above procedure until the original word has been reduced to the empty string, you may conclude that the word is a chemical word, and you have also found its notation as a sequence of symbolic names.
word | longest prefix | chemical word |
---|---|---|
catalysis | C | |
atalysis | At | C |
alysis | Al | C-At |
ysis | Y | C-At-Al |
sis | Si | C-At-Al-Y |
s | S | C-At-Al-Y-Si |
(empty string) | | C-At-Al-Y-Si-S |
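The procedure traced in the table can be sketched as a loop that repeatedly strips the longest matching prefix. This is a minimal sketch under stated assumptions: longestPrefix is repeated here so the block runs on its own, and the symbols list is a small hand-picked subset chosen for illustration, not the contents of periodic_system.txt.

```python
def longestPrefix(word, symbols):
    """Return the longest symbol that is a case-insensitive prefix of word."""
    lowered = word.lower()
    best = ''
    for symbol in symbols:
        if lowered.startswith(symbol.lower()) and len(symbol) > len(best):
            best = symbol
    return best


def chemicalWord(word, symbols):
    """Return word as hyphen-separated symbols, or '' if that is not possible."""
    parts = []
    rest = word
    while rest:
        prefix = longestPrefix(rest, symbols)
        if not prefix:
            return ''  # no symbol matches the remaining letters
        parts.append(prefix)
        rest = rest[len(prefix):]  # remove the matched prefix from the front
    return '-'.join(parts)


# Hand-picked subset for illustration only (an assumption, not the full table).
symbols = ['C', 'At', 'Al', 'Y', 'Si', 'S', 'Ba', 'B', 'Be']

print(chemicalWord('catalysis', symbols))      # C-At-Al-Y-Si-S
print(chemicalWord('basic', symbols))          # Ba-Si-C
print(repr(chemicalWord('leisure', symbols)))  # ''
```

Because each step commits to the longest prefix, the outcome depends on which symbols the list contains. With a list that also includes Ca, for instance, the first step for catalysis would select Ca instead of C.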
In the following interactive Python session we assume that the text file periodic_system.txt is located in the current directory.
>>> symbols = readSymbols('periodic_system.txt')
>>> longestPrefix('ages', symbols)
'Ag'
>>> longestPrefix('shop', symbols)
'S'
>>> longestPrefix('density', symbols)
''
>>> chemicalWord('catalysis', symbols)
'C-At-Al-Y-Si-S'
>>> chemicalWord('basic', symbols)
'Ba-Si-C'
>>> chemicalWord('leisure', symbols)
''