The International Statistical Classification of Diseases and Related Health Problems (ICD) is a list maintained by the World Health Organization (WHO) that contains more than 10,000 diseases and maladies that patients might present with. In addition to a description, each disease or malady is also given a unique ICD-code that the medical community uses for record-keeping. For example, a patient admitted to the hospital with whooping cough would be logged in the database with ICD-code A37.
Some of the stranger complaints that can be found on the list:
ICD-code | description |
---|---|
A82.1 | Urban rabies |
A92.1 | O'nyong-nyong fever |
K08.10 | Complete loss of teeth, unspecified cause |
M26.0 | Major anomalies of jaw size |
N36.42 | Intrinsic sphincter deficiency (ISD) |
Q71.63 | Lobster-claw hand, bilateral |
T40.5X6 | Underdosing of cocaine |
V96.15 | Hang glider explosion injuring occupant |
W17.0 | Fall into well |
W61.43 | Pecked by turkey |
X15.1 | Contact with hot toaster |
X52 | Prolonged stay in weightless environment |
However, without any doubt our favorite so far is "Burn due to water-skis on fire" (V91.07). It's a dangerous world. Be safe out there.
The Hippocrates code is a cryptographic technique that makes use of ICD-codes. A message is encrypted by replacing each character with a Hippocrates-code of the form
<ICD-code>.$$n$$ ($$n \in \mathbb{N}$$)
The Hippocrates-code is chosen so that the $$n$$-th position in the description of the ICD-code is occupied by the encoded character. The positions of the characters are numbered from zero. The Hippocrates-codes are separated by a single space in the ciphertext. This way, the message
Eat wise, drop a size
is encrypted as
A75.0.0 K50.119.59 T43.2.75 V14.1.21 V42.1.54 E10.32.53 T21.70.5 T21.06.12 S59.239.62 M85.812.56 M85.812.34 Q95.3.27 V64.0.22 T37.3.61 S00.261.34 T49.4X6.76 F44.6.32 A02.24.23 V59.9.62 J10.89.41 O91.1.13
We see, for example, that position 0 (the first character) in the description of ICD-code A75.0 (Epidemic louse-borne typhus fever due to Rickettsia prowazekii) is occupied by the uppercase letter E and that the lowercase letter a occurs at position 59 in the description of ICD-code K50.119 (Crohn's disease of large intestine with unspecified complications). When encoding, uppercase and lowercase letters are considered different characters, and spaces and punctuation marks are also replaced by a corresponding Hippocrates-code.
However, the encoding is not unique because the same character may occur at multiple positions and in multiple descriptions. As such, the same sentence may also be encoded as
W92.XXXD.0 I97.611.84 I97.611.38 S59.099.31 M90.832.38 V42.1.93 M61.25.29 I67.83.43 T37.92.67 W93.38 O22.23 V59.9.48 F04.23 C34.2.13 F44.6.43 A00.0.28 M85.812.5 A02.24.23 Z96.621.21 J10.89.41 N07.9.31
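To make the position numbering concrete, here is a minimal Python sketch that resolves a single Hippocrates-code against the two descriptions quoted above. The helper name `lookup` and the in-memory dictionary `descriptions` are assumptions made for illustration only; they are not part of the assignment.

```python
# Two descriptions quoted in the text above; a real ICD file holds many more.
descriptions = {
    "A75.0": "Epidemic louse-borne typhus fever due to Rickettsia prowazekii",
    "K50.119": "Crohn's disease of large intestine with unspecified complications",
}


def lookup(hippocrates_code, descriptions):
    """Return the character a single Hippocrates-code stands for (hypothetical helper)."""
    # The ICD-code may itself contain dots, so split on the LAST dot only.
    icd_code, position = hippocrates_code.rsplit(".", 1)
    return descriptions[icd_code][int(position)]


first_two = lookup("A75.0.0", descriptions) + lookup("K50.119.59", descriptions)
print(first_two)  # Ea
```

Splitting on the last dot matters because a code such as `A75.0.0` consists of the ICD-code `A75.0` followed by position `0`, not the ICD-code `A75` followed by something else.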
To encode and decode messages according to the Hippocrates code we use text files whose lines contain an ICD-code and the corresponding description, separated by a single space. These are the ICD-codes that are used for encoding and decoding. An ICD-code never contains spaces (dots are allowed). Your task:
Write a function code2character that takes the location (str) of a text file containing ICD-codes. The function must return a dictionary (dict) that maps all possible Hippocrates-codes (str) for the given text file onto their corresponding character (str).
Write a function decode that takes two arguments: i) a ciphertext (str) whose encoding is based on the Hippocrates code and ii) a dictionary (dict) as returned by the function code2character. The function must return the plaintext (str) corresponding to the given ciphertext. If the first argument is not a valid ciphertext for the given Hippocrates code (second argument), a ValueError must be raised with the message invalid ciphertext.
Write a function character2codes that takes the location (str) of a text file containing ICD-codes. The function must return a dictionary (dict) that maps each character (str) that occurs in at least one of the descriptions in the given text file onto the set of all Hippocrates-codes (str) that can be used to replace the character when encoding with the Hippocrates code.
Write a function encode that takes two arguments: i) a plaintext (str) and ii) a dictionary (dict) as returned by the function character2codes. The function must return a ciphertext (str) for the given plaintext, where encoding is based on the Hippocrates code. Each character of the plaintext must be replaced by randomly choosing among all possible Hippocrates-codes that encode the character. If the given plaintext contains characters that cannot be encoded by a Hippocrates-code, a ValueError must be raised with the message invalid plaintext.
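One possible way to approach the four functions is sketched below. It is a sketch under stated assumptions, not a reference solution: the helper `read_descriptions` is hypothetical, the file is assumed to be UTF-8 encoded, and cases the text does not specify (for example an empty ciphertext) are handled arbitrarily.

```python
import random


def read_descriptions(location):
    """Yield (ICD-code, description) pairs from the file (hypothetical helper)."""
    with open(location, encoding="utf-8") as lines:  # encoding is an assumption
        for line in lines:
            line = line.rstrip("\n")
            if line:
                # An ICD-code never contains spaces, so the first space
                # separates it from the description.
                icd_code, _, description = line.partition(" ")
                yield icd_code, description


def code2character(location):
    """Map every Hippocrates-code '<ICD-code>.<n>' onto its character."""
    return {
        f"{icd_code}.{position}": character
        for icd_code, description in read_descriptions(location)
        for position, character in enumerate(description)
    }


def character2codes(location):
    """Map every character onto the set of Hippocrates-codes that encode it."""
    codes = {}
    for code, character in code2character(location).items():
        codes.setdefault(character, set()).add(code)
    return codes


def decode(ciphertext, code_to_character):
    """Translate space-separated Hippocrates-codes back into plaintext."""
    try:
        return "".join(code_to_character[code] for code in ciphertext.split(" "))
    except KeyError:
        raise ValueError("invalid ciphertext") from None


def encode(plaintext, character_to_codes):
    """Replace each character by a randomly chosen Hippocrates-code."""
    try:
        # random.choice needs a sequence, so the set is sorted into a list first.
        return " ".join(
            random.choice(sorted(character_to_codes[character]))
            for character in plaintext
        )
    except KeyError:
        raise ValueError("invalid plaintext") from None
```

Because the code is chosen at random, two calls to `encode` with the same plaintext will usually return different ciphertexts, yet `decode` maps each of them back to the same plaintext.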
In the following interactive session we assume the text file ICD.txt to be located in the current directory.
>>> c2k = code2character('ICD.txt')
>>> c2k['A00.0']
'C'
>>> c2k['A00.2']
'o'
>>> c2k['A00.0.33']
','
>>> c2k['A00.2.33']
Traceback (most recent call last):
KeyError: 'A00.2.33'
>>> decode('A75.0.0 K50.119.59 T43.2.75 V14.1.21 V42.1.54 E10.32.53 T21.70.5 T21.06.12 S59.239.62 M85.812.56 M85.812.34 Q95.3.27 V64.0.22 T37.3.61 S00.261.34 T49.4X6.76 F44.6.32 A02.24.23 V59.9.62 J10.89.41 O91.1.13', c2k)
'Eat wise, drop a size'
>>> decode('W92.XXXD.0 I97.611.84 I97.611.38 S59.099.31 M90.832.38 V42.1.93 M61.25.29 I67.83.43 T37.92.67 W93.38 O22.23 V59.9.48 F04.23 C34.2.13 F44.6.43 A00.0.28 M85.812.5 A02.24.23 Z96.621.21 J10.89.41 N07.9.31', c2k)
'Eat wise, drop a size'
>>> decode('X66.6 Q99.99 Z12.34', c2k)
Traceback (most recent call last):
ValueError: invalid ciphertext
>>> k2c = character2codes('ICD.txt')
>>> k2c['V']
{'O22.0', 'A00.1.15', 'I83.218.0', 'A00.0.15'}
>>> k2c['v']
{'I67.83.12', 'V59.9.50', 'V64.0.76', 'H59.121.12', 'M50.022.3', 'S44.8X9.19', 'T37.92.46', 'S44.8X9.51', 'V42.1.68', 'S66.911.68', 'T37.3.16', 'T37.1X.16', 'V04.92.67', 'M67.971.28', 'B08.04.4', 'Q95.3.53', 'V49.60.69', 'V64.0.3', 'T43.2.16', 'V64.0.26', 'E10.32.49', 'B57.30.40', 'A00.0.38', 'V64.0.13', 'I89.9.10', 'I83.218.9', 'V04.92.54', 'F44.6.3', 'I89.9.35', 'W24.0.23', 'M67.232.4', 'B57.30.43', 'S90.464.16', 'N13.9', 'V52.1.88', 'S14.9.25', 'V52.1.30', 'S78.922.54', 'V92.16.45', 'V14.1.51', 'S00.261.16', 'B57.30.28', 'V59.9.13', 'W93.19', 'W92.XXXD.19', 'V64.0.63', 'M50.022.34', 'J10.89.44', 'A00.1.38', 'A75.0.30', 'V14.1.64', 'T37.1X5.2'}
>>> k2c['.']
{'Z68.35.29', 'Z68.35.24', 'Z68.42.29', 'Z68.36.29', 'Z68.42.24', 'Z68.36.24', 'Z68.43.27'}
>>> k2c['xxx']
Traceback (most recent call last):
KeyError: 'xxx'
>>> k2c['!']
Traceback (most recent call last):
KeyError: '!'
>>> encode('Eat wise, drop a size', k2c)
'A75.0.0 K50.119.59 T43.2.75 V14.1.21 V42.1.54 E10.32.53 T21.70.5 T21.06.12 S59.239.62 M85.812.56 M85.812.34 Q95.3.27 V64.0.22 T37.3.61 S00.261.34 T49.4X6.76 F44.6.32 A02.24.23 V59.9.62 J10.89.41 O91.1.13'
>>> encode('Eat wise, drop a size', k2c)
'W92.XXXD.0 I97.611.84 I97.611.38 S59.099.31 M90.832.38 V42.1.93 M61.25.29 I67.83.43 T37.92.67 W93.38 O22.23 V59.9.48 F04.23 C34.2.13 F44.6.43 A00.0.28 M85.812.5 A02.24.23 Z96.621.21 J10.89.41 N07.9.31'
>>> encode('Eat wise, drop a size!', k2c)
Traceback (most recent call last):
ValueError: invalid plaintext
Each year the Occupational Safety and Health Administration (OSHA), an agency of the United States Department of Labor, publishes a list of workplace deaths, with a brief description of each incident. You do not want to die in any of these ways:
Worker died when postal truck became partially submerged in lake.
Worker was caught between rotating drum and loading hopper of a ready-mix truck.
Worker fatally engulfed in dry cement when steel storage silo collapsed.
Worker on ladder struck and killed by lightning.
Worker was pulled into a tree chipper machine.
Worker was caught between two trucks and crushed.
Worker died when his head was impaled by metal from the drive section of a Ferris wheel. The employee slipped after acknowledging he was clear and the wheel began to turn, trapping his head.
Worker was draining a tank; one of the employees climbed to the top of the tank and lit a cigarette and waved it over the opening in the tank. The tank exploded, killing the worker.
Worker was kicked by an elephant.
Sheriff Deputy was walking through the woods, working a cold case, and fell 161 feet into a sink hole.
It's hard to pick the worst one, but here's an attempt.
Worker was operating a skid-steer cleaning out a dairy cattle barn near an outdoor manure slurry pit. The skid-steer and the worker fell off the end of the push-off platform into the manure slurry pit, trapping the worker in the vehicle. Worker died of suffocation due to inhalation of manure.