Esophagographers are members of the medical staff who carry out esophagography, defined in Stedman's Medical Dictionary (6th edition, 2008) as radiography of the esophagus using swallowed or injected radiopaque contrast media. The esophagus — commonly known as the foodpipe or gullet — is an organ in vertebrates which consists of a fibromuscular tube through which food passes, aided by peristaltic contractions from the pharynx to the stomach.
In this exercise, however, we are interested in the word esophagographers for a non-medical reason. Esophagographers, sixteen letters long, is the longest English word in which each letter occurs exactly twice. A fourteen-letter word with this property is scintillescent. Twelve-letter words with this property include happenchance and shanghaiings. Ten-letter words with this property include arraigning, concisions, intestines, and horseshoer.
Words in which every letter occurs the same number of times are called repetition words. You are asked to:
Write a function occurrences that takes a word as its argument. The function must return a dictionary that maps each letter in the word onto the number of occurrences of that letter in the word. The function must process given words in a case insensitive way and the returned dictionary must only use lowercase letters as keys.
Use the function occurrences to write a function isRepetitionWord that takes a word as its argument. The function also has a second optional parameter minimal_repetition (default value 1). The function must return a Boolean value that indicates whether or not the given word is a repetition word. In determining whether the given word is a repetition word, the function must treat the letters in a case insensitive way and must ignore all characters that are not letters. To be a repetition word, the number of occurrences of each letter must also be at least as high as the value passed to the parameter minimal_repetition.
Use the function isRepetitionWord to write a function repetitionWords that takes the location of a text file as its argument. The text file must contain a list of words, each on a separate line. The function also has two optional parameters minimal_repetition and minimal_length, both having 1 as their default value. The function must return the set of repetition words contained in the given file whose length is at least the value passed to the parameter minimal_length. In determining whether or not a word is a repetition word, the function must make use of the function isRepetitionWord. In doing so, the parameter minimal_repetition has the same meaning for the function repetitionWords as it has for the function isRepetitionWord.
In the following interactive session, we assume that the text file words.txt is located in the current directory.
>>> occurrences('CHACHACHA')
{'a': 3, 'h': 3, 'c': 3}
>>> occurrences('Esophagographers')
{'a': 2, 'e': 2, 'g': 2, 'h': 2, 'o': 2, 'p': 2, 's': 2, 'r': 2}
>>> occurrences('happenchance')
{'a': 2, 'c': 2, 'e': 2, 'h': 2, 'n': 2, 'p': 2}
>>> isRepetitionWord('CHACHACHA')
True
>>> isRepetitionWord('Esophagographers')
True
>>> isRepetitionWord('happenchance', minimal_repetition=3)
False
>>> repetitionWords('words.txt', minimal_repetition=2, minimal_length=10)
{'horseshoer', 'intestines'}
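One possible way to approach the three functions is sketched below. It is a minimal sketch, not a reference solution, and it rests on two assumptions that the description leaves open: occurrences itself skips characters that are not letters (so that isRepetitionWord can rely on it directly), and an input without any letters is not considered a repetition word. The length of a word is taken here as the length of the stripped line from the file.

```python
def occurrences(word):
    """Map each letter of word (lowercased) onto its number of occurrences."""
    counts = {}
    for char in word.lower():
        # Assumption: characters that are not letters are skipped here already.
        if char.isalpha():
            counts[char] = counts.get(char, 0) + 1
    return counts


def isRepetitionWord(word, minimal_repetition=1):
    """Check whether all letters of word occur equally often."""
    distinct_counts = set(occurrences(word).values())
    # Exactly one distinct count means every letter occurs equally often.
    # Assumption: a word without letters yields an empty set and is rejected.
    return (
        len(distinct_counts) == 1
        and distinct_counts.pop() >= minimal_repetition
    )


def repetitionWords(location, minimal_repetition=1, minimal_length=1):
    """Return the set of repetition words found in the file at location."""
    words = set()
    with open(location) as lines:
        for line in lines:
            word = line.strip()  # drop the trailing newline and whitespace
            if (
                len(word) >= minimal_length
                and isRepetitionWord(word, minimal_repetition)
            ):
                words.add(word)
    return words
```

Collecting the counts into a set keeps the test for equal occurrences short: the word qualifies only if that set holds a single value, which is then compared against minimal_repetition.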