The International Statistical Classification of Diseases and Related Health Problems (ICD) is a list maintained by the World Health Organization (WHO) that contains more than 10,000 diseases and maladies that patients might present with. In addition to a description, each disease or malady is given a unique ICD-code that the medical community uses for record-keeping. For example, a patient admitted to the hospital with whooping cough would be logged in the database with ICD-code A37.

The Hippocrates code
Guantanamo Bay, Cuba (July 3, 2004) — U.S. Naval Hospital, Guantanamo Bay (Gitmo) staff participates in an emergency room training exercise to ensure immediate and accurate treatment during an actual casualty. U.S. Navy photo by Photographer's Mate Airman Randall Damm.

Some of the stranger complaints that can be found on the list:

ICD-code   description
A82.1      Urban rabies
A92.1      O'nyong-nyong fever
K08.10     Complete loss of teeth, unspecified cause
M26.0      Major anomalies of jaw size
N36.42     Intrinsic sphincter deficiency (ISD)
Q71.63     Lobster-claw hand, bilateral
T40.5X6    Underdosing of cocaine
V96.15     Hang glider explosion injuring occupant
W17.0      Fall into well
W61.43     Pecked by turkey
X15.1      Contact with hot toaster
X52        Prolonged stay in weightless environment

However, without any doubt our favorite so far is "Burn due to water-skis on fire" (V91.07). It's a dangerous world. Be safe out there.

Assignment

The Hippocrates code is a cryptographic technique that makes use of ICD-codes. A message is encrypted by replacing each character with a Hippocrates-code of the form

<ICD-code>.$$n$$    ($$n \in \mathbb{N}$$)

The Hippocrates-code is chosen so that the $$n$$-th position in the description of the ICD-code is occupied by the encoded character. The positions of the characters are numbered from zero. The Hippocrates-codes are separated by a single space in the ciphertext. This way, the message

Eat wise, drop a size

is encrypted as

A75.0.0 K50.119.59 T43.2.75 V14.1.21 V42.1.54 E10.32.53 T21.70.5 T21.06.12 S59.239.62 M85.812.56 M85.812.34 Q95.3.27 V64.0.22 T37.3.61 S00.261.34 T49.4X6.76 F44.6.32 A02.24.23 V59.9.62 J10.89.41 O91.1.13

We see, for example, that the first letter (position 0) in the description of ICD-code A75.0 (Epidemic louse-borne typhus fever due to Rickettsia prowazekii) is occupied by the uppercase letter E and that a lowercase letter a occurs at position 59 in the description of ICD-code K50.119 (Crohn's disease of large intestine with unspecified complications). When encoding, uppercase and lowercase letters are considered different characters, and the spaces and punctuation marks are also replaced by a corresponding Hippocrates-code.
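
The lookup just described can be sketched in a few lines. This is a minimal sketch, assuming that the position is whatever follows the last dot of a Hippocrates-code (ICD-codes themselves may contain dots). The helper name lookup and the two-entry table DESCRIPTIONS are illustrative only; the descriptions are copied from the examples above.

```python
# Illustrative table: two ICD-codes with the descriptions quoted above.
DESCRIPTIONS = {
    'A75.0': 'Epidemic louse-borne typhus fever due to Rickettsia prowazekii',
    'K50.119': "Crohn's disease of large intestine with unspecified complications",
}


def lookup(hippocrates_code, descriptions=DESCRIPTIONS):
    """Return the character a Hippocrates-code stands for (hypothetical helper)."""
    # Split at the LAST dot: the part before it is the ICD-code,
    # the part after it is the zero-based position n.
    icd_code, _, position = hippocrates_code.rpartition('.')
    return descriptions[icd_code][int(position)]


print(lookup('A75.0.0'))     # E
print(lookup('K50.119.59'))  # a
```

The sketch does no validation: an unknown ICD-code or a position past the end of the description surfaces as a KeyError or IndexError.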

However, the encoding is not unique because the same character may occur at multiple positions and in multiple descriptions. As such, the same sentence may also be encoded as
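
One way to exploit this freedom is to collect, for every character, all Hippocrates-codes that encode it, and to pick one of them at random while encoding. The sketch below assumes that approach; the names build_index and encode (as a plain function) and the two-entry table are illustrative, with descriptions taken from the examples in this assignment.

```python
import random
from collections import defaultdict

# Illustrative table; a real run would use all codes from the text file.
DESCRIPTIONS = {
    'A00': 'Cholera',
    'A00.0': 'Cholera due to Vibrio cholerae 01, biovar cholerae',
}


def build_index(descriptions):
    """Map each character to the set of Hippocrates-codes that encode it."""
    index = defaultdict(set)
    for icd_code, description in descriptions.items():
        for position, character in enumerate(description):
            index[character].add(f'{icd_code}.{position}')
    return index


def encode(plaintext, index):
    """Replace every character by one of its codes, chosen at random."""
    # sorted() turns each set into a list, which random.choice requires.
    return ' '.join(random.choice(sorted(index[ch])) for ch in plaintext)


index = build_index(DESCRIPTIONS)
print(sorted(index['V']))        # ['A00.0.15']
print(encode('Cholera', index))  # varies between runs
```

A character that occurs in no description has an empty set of codes, so this sketch fails with an IndexError on such input; a complete solution would report an invalid plaintext instead.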

W92.XXXD.0 I97.611.84 I97.611.38 S59.099.31 M90.832.38 V42.1.93 M61.25.29 I67.83.43 T37.92.67 W93.38 O22.23 V59.9.48 F04.23 C34.2.13 F44.6.43 A00.0.28 M85.812.5 A02.24.23 Z96.621.21 J10.89.41 N07.9.31

Your task is to define a class Hippocrates that can be used to encode and decode messages using the Hippocrates code. When creating objects of the class Hippocrates, the location (str) of a text file must be passed. Each line of the text file contains an ICD-code and the corresponding description, separated by a single space. These are the ICD-codes that are used for encoding and decoding. An ICD-code never contains spaces (dots are allowed). Make sure the class Hippocrates supports at least the following methods:

  • description: takes an ICD-code (str) and returns its description (str). An ICD-code that does not occur in the text file raises a ValueError with the message invalid ICD-code.

  • character: takes a Hippocrates-code (str) and returns the character (str) it encodes. An invalid Hippocrates-code raises a ValueError with the message invalid Hippocrates-code.

  • codes: takes a character (str) and returns the set of all Hippocrates-codes that encode it. The set is empty if no Hippocrates-code encodes the argument.

  • encode: takes a plaintext (str) and returns a ciphertext (str) in which each character is replaced by one of its Hippocrates-codes; repeated calls may return different ciphertexts. A plaintext containing a character that cannot be encoded raises a ValueError with the message invalid plaintext.

  • decode: takes a ciphertext (str) and returns the plaintext (str). A ciphertext containing an invalid Hippocrates-code raises a ValueError with the message invalid ciphertext.

Example

In the following interactive session we assume the text file ICD.txt to be located in the current directory.

>>> codec = Hippocrates('ICD.txt')
>>> codec.description('A00')
'Cholera'
>>> codec.description('A00.0')
'Cholera due to Vibrio cholerae 01, biovar cholerae'
>>> codec.description('A00.2')
Traceback (most recent call last):
ValueError: invalid ICD-code
>>> codec.character('A00.0')
'C'
>>> codec.character('A00.2')
'o'
>>> codec.character('A00.0.33')
','
>>> codec.character('A00.2.33')
Traceback (most recent call last):
ValueError: invalid Hippocrates-code
>>> codec.codes('V')
{'O22.0', 'A00.1.15', 'I83.218.0', 'A00.0.15'}
>>> codec.codes('v')
{'I67.83.12', 'V59.9.50', 'V64.0.76', 'H59.121.12', 'M50.022.3', 'S44.8X9.19', 'T37.92.46', 'S44.8X9.51', 'V42.1.68', 'S66.911.68', 'T37.3.16', 'T37.1X.16', 'V04.92.67', 'M67.971.28', 'B08.04.4', 'Q95.3.53', 'V49.60.69', 'V64.0.3', 'T43.2.16', 'V64.0.26', 'E10.32.49', 'B57.30.40', 'A00.0.38', 'V64.0.13', 'I89.9.10', 'I83.218.9', 'V04.92.54', 'F44.6.3', 'I89.9.35', 'W24.0.23', 'M67.232.4', 'B57.30.43', 'S90.464.16', 'N13.9', 'V52.1.88', 'S14.9.25', 'V52.1.30', 'S78.922.54', 'V92.16.45', 'V14.1.51', 'S00.261.16', 'B57.30.28', 'V59.9.13', 'W93.19', 'W92.XXXD.19', 'V64.0.63', 'M50.022.34', 'J10.89.44', 'A00.1.38', 'A75.0.30', 'V14.1.64', 'T37.1X5.2'}
>>> codec.codes('.')
{'Z68.35.29', 'Z68.35.24', 'Z68.42.29', 'Z68.36.29', 'Z68.42.24', 'Z68.36.24', 'Z68.43.27'}
>>> codec.codes('xxx')
set()
>>> codec.codes('!')
set()
>>> codec.encode('Eat wise, drop a size')
'A75.0.0 K50.119.59 T43.2.75 V14.1.21 V42.1.54 E10.32.53 T21.70.5 T21.06.12 S59.239.62 M85.812.56 M85.812.34 Q95.3.27 V64.0.22 T37.3.61 S00.261.34 T49.4X6.76 F44.6.32 A02.24.23 V59.9.62 J10.89.41 O91.1.13'
>>> codec.encode('Eat wise, drop a size')
'W92.XXXD.0 I97.611.84 I97.611.38 S59.099.31 M90.832.38 V42.1.93 M61.25.29 I67.83.43 T37.92.67 W93.38 O22.23 V59.9.48 F04.23 C34.2.13 F44.6.43 A00.0.28 M85.812.5 A02.24.23 Z96.621.21 J10.89.41 N07.9.31'
>>> codec.encode('Eat wise, drop a size!')
Traceback (most recent call last):
ValueError: invalid plaintext
>>> codec.decode('A75.0.0 K50.119.59 T43.2.75 V14.1.21 V42.1.54 E10.32.53 T21.70.5 T21.06.12 S59.239.62 M85.812.56 M85.812.34 Q95.3.27 V64.0.22 T37.3.61 S00.261.34 T49.4X6.76 F44.6.32 A02.24.23 V59.9.62 J10.89.41 O91.1.13')
'Eat wise, drop a size'
>>> codec.decode('W92.XXXD.0 I97.611.84 I97.611.38 S59.099.31 M90.832.38 V42.1.93 M61.25.29 I67.83.43 T37.92.67 W93.38 O22.23 V59.9.48 F04.23 C34.2.13 F44.6.43 A00.0.28 M85.812.5 A02.24.23 Z96.621.21 J10.89.41 N07.9.31')
'Eat wise, drop a size'
>>> codec.decode('X6.6.6 Q99.9.9 Z12.34')
Traceback (most recent call last):
ValueError: invalid ciphertext
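
The loading and error handling shown in this session could be approached as follows. This is a partial sketch, not a reference solution: it covers only the constructor, description, character and decode, it assumes the file is UTF-8 encoded, and it assumes a Hippocrates-code is split at its last dot. The error messages are those from the session; the demo file with two lines is made up for illustration.

```python
import os
import tempfile


class Hippocrates:
    """Partial sketch: loading, description, character and decode only."""

    def __init__(self, location):
        self.descriptions = {}
        # Assumption: the text file is UTF-8 encoded.
        with open(location, encoding='utf-8') as lines:
            for line in lines:
                line = line.rstrip('\n')
                if line:
                    # The first space separates ICD-code and description.
                    icd_code, _, description = line.partition(' ')
                    self.descriptions[icd_code] = description

    def description(self, icd_code):
        if icd_code not in self.descriptions:
            raise ValueError('invalid ICD-code')
        return self.descriptions[icd_code]

    def character(self, code):
        # Split at the last dot into ICD-code and zero-based position.
        icd_code, _, position = code.rpartition('.')
        if icd_code not in self.descriptions or not position.isdecimal():
            raise ValueError('invalid Hippocrates-code')
        description = self.descriptions[icd_code]
        if int(position) >= len(description):
            raise ValueError('invalid Hippocrates-code')
        return description[int(position)]

    def decode(self, ciphertext):
        try:
            return ''.join(self.character(c) for c in ciphertext.split(' '))
        except ValueError:
            # Report the problem at the level of the whole ciphertext.
            raise ValueError('invalid ciphertext') from None


# Demo on a two-line file in the format described in the assignment.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'ICD.txt')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('A00 Cholera\n'
                     'A00.0 Cholera due to Vibrio cholerae 01, biovar cholerae\n')
    codec = Hippocrates(path)

print(codec.description('A00'))            # Cholera
print(codec.character('A00.0.33'))         # ,
print(codec.decode('A00.0 A00.1 A00.2'))   # Cho
```

Reading the whole file into a dictionary once keeps every later lookup a constant-time operation, at the cost of holding all descriptions in memory.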

Epilogue

Each year the Occupational Safety and Health Administration (OSHA), an agency of the United States Department of Labor, publishes a list of workplace deaths, with a brief description of each incident. You do not want to die in any of these ways:

  • Worker died when postal truck became partially submerged in lake.

  • Worker was caught between rotating drum and loading hopper of a ready-mix truck.

  • Worker fatally engulfed in dry cement when steel storage silo collapsed.

  • Worker on ladder struck and killed by lightning.

  • Worker was pulled into a tree chipper machine.

  • Worker was caught between two trucks and crushed.

  • Worker died when his head was impaled by metal from the drive section of a Ferris wheel. The employee slipped after acknowledging he was clear and the wheel began to turn, trapping his head.

  • Worker was draining a tank; one of the employees climbed to the top of the tank and lit a cigarette and waved it over the opening in the tank. The tank exploded, killing the worker.

  • Worker was kicked by an elephant.

  • Sheriff Deputy was walking through the woods, working a cold case, and fell 161 feet into a sink hole.

It's hard to pick the worst one, but here's an attempt.

Worker was operating a skid-steer cleaning out a dairy cattle barn near an outdoor manure slurry pit. The skid-steer and the worker fell off the end of the push-off platform into the manure slurry pit, trapping the worker in the vehicle. Worker died of suffocation due to inhalation of manure.