It's Boxing Day1 and the manager of a band promises to give the 10 musicians a huge bonus if they succeed in the following challenge.

The manager has randomly placed the 10 different instruments of the musicians in 10 large concealed boxes, which he has put in a windowless, soundproof practice room. He has also randomly marked the boxes with pictures of the 10 different instruments, but the pictures on the boxes do not necessarily correspond to the instruments inside. One by one, each musician has to enter the practice room and search for his own instrument. While he's in the practice room, he can look inside any five boxes. If he has found his own instrument, he must close the boxes before leaving the practice room using another door. The musicians can't in any way communicate to each other what they find and they only succeed if each one of them has found his own instrument.

boxes
Ten boxes contain ten different instruments. The boxes have also been marked with the pictures of the ten different instruments, but the pictures do not necessarily correspond to the contents of the boxes.

Each musician only has a 50% chance of finding his instrument by picking five random boxes. And the chances that all ten will succeed are even lower: just \[\left(\frac{1}{2}\right)^{10} = \frac{1}{1024}\] That's less than 0.1%.

The band however comes up with a strategy that has a better than 35% chance of working. Every musician first opens the box with the picture of his own instrument. If his instrument is inside the box, he's done and succeeds for his part of the challenge. Otherwise he looks at whatever instrument is in the box, and then opens the box with that picture on it.

boxes
The guitar player enters the room and opens the box marked with the picture of the guitar. As he finds a conga in the box, the next box he will open is the box marked with the picture of the conga.

He keeps going that way until he has either opened five boxes without finding his instrument (which fails the challenge) or finds his instrument (which makes him succeed for his part of the challenge).

boxes
The guitar player opens the box marked with the picture of the trumpet and finds his guitar. This makes him succeed in his part of the challenge.

This strategy works because each musician follows a linked sequence that starts with the box whose outside matches his instrument and ends with the box actually containing it. Note that if he kept going, that would lead him back to the start, making the linked sequence of boxes a loop. In graph theory2, such a loop is called a cycle3.

Following the linked sequence works much better than random guessing. For an even number of boxes $$m = 2n$$ it has been proven that the chance that all cycles will be of length $$n$$ or less can be computed as \[ 1 - \left( \frac{1}{n+1} + \frac{1}{n+2} + \frac{1}{n+3} + \cdots + \frac{1}{2n} \right)\] Plug in $$m = 10$$ boxes, and we get odds of about 35%. As $$n$$ increases, the odds approach 30%. Not a guarantee, but with a bit of musician's luck, it's far from hopeless they can claim their bonus.

Assignment

There are $$m$$ different items that are described by a unique string. None of the item descriptions contains a comma. The items are placed inside $$m$$ boxes that have also been marked with pictures of the items, but the pictures on the boxes do not necessarily correspond to the items inside.

The pictures on the boxes and the items inside are described by a text file. Each line of the file describes a box using two item descriptions that are separated by a comma (,). The first description corresponds to the picture on the box and the second description corresponds to the item inside the box. Such a text file may then look like

guitar,conga
saxophone,banjo
conga,maracas
drum set,saxophone
trumpet,guitar
bass guitar,trombone
synthesizer,trumpet
banjo,drum set
trombone,bass guitar
maracas,synthesizer

Here's a graphical representation of the pictures on the boxes and the items inside the boxes, as described by the text file.

boxes
Graphical representation of the pictures on the boxes and the items inside the boxes, as described in the text file.

Your task:

Example

In the following interactive session we assume the text files boxes01.txt4, boxes02.txt5, boxes03.txt6 and boxes04.txt7 to be located in the current directory.

>>> boxes = read_boxes('boxes01.txt')
>>> boxes
{'guitar': 'conga', 'saxophone': 'banjo', 'conga': 'maracas', 'drum set': 'saxophone', 'trumpet': 'guitar', 'bass guitar': 'trombone', 'synthesizer': 'trumpet', 'banjo': 'drum set', 'trombone': 'bass guitar', 'maracas': 'synthesizer'}

>>> cycle('guitar', boxes)
('guitar', 'conga', 'maracas', 'synthesizer', 'trumpet')
>>> cycle('saxophone', boxes)
('saxophone', 'banjo', 'drum set')
>>> cycle('bass guitar', boxes)
('bass guitar', 'trombone')
>>> cycle('clarinet', boxes)
Traceback (most recent call last):
AssertionError: invalid item

>>> longest_cycle(boxes)
5

>>> read_boxes('boxes02.txt')  # 2x same label
Traceback (most recent call last):
AssertionError: invalid boxes
>>> read_boxes('boxes03.txt')  # 2x same content
Traceback (most recent call last):
AssertionError: invalid boxes
>>> read_boxes('boxes04.txt')  # labels differ from contents
Traceback (most recent call last):
AssertionError: invalid boxes

Epilogue

Resources