Do you know American actor Kevin Bacon?

Bacon and Purefoy
Kevin Bacon (right) with James Purefoy (left), who has a Bacon number of 2: Purefoy appeared in Women Talking Dirty with Helena Bonham Carter, and Bonham Carter appeared in Novocaine with Bacon.

He's the proverbial spider in the web of a parlor game called Six Degrees of Kevin Bacon, in which two or more players pick an arbitrary actor and connect them to another actor via a movie that both appeared in, repeating this process to find the shortest path that leads to Kevin Bacon. The game rests on the assumption that anyone involved in the Hollywood film industry can be linked to Bacon through their movie roles within six steps. Its name is a reference to six degrees of separation, a hypothesis which posits that any two people on Earth are six or fewer acquaintance links apart. A great deal of research has been done to support this hypothesis both empirically and theoretically.

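An actor's Bacon number is the minimal number of steps they are away from Kevin Bacon as defined by the game. Kevin Bacon himself has a Bacon number of 0, and actors who appeared in a movie together with him have a Bacon number of 1.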

The higher the Bacon number, the more intermediate steps it takes to get to Kevin Bacon in the game.

Assignment

Two actors are co-stars if there's at least one movie in which they appear together. The distance between two actors is the minimal number of co-star steps that connect them together. Therefore, the distance between two co-stars is 1 and an actor's Bacon number is the distance between the actor and Kevin Bacon.
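As a minimal sketch of the co-star definition, assume each actor's movies are already available as a set: two actors are then co-stars exactly when those sets overlap. The dictionary `movies_of` and the function name `share_a_movie` are illustrative names, not part of the assignment, and the data is copied by hand from the sample file shown below.

```python
# Hypothetical lookup: actor -> set of movies (hand-copied from the sample data).
movies_of = {
    'Kevin Bacon': {'In the Cut (2003)', 'Apollo 13 (1995)'},
    'Tom Hanks': {'Apollo 13 (1995)', 'Sleepless in Seattle (1993)',
                  "You've Got Mail (1998)"},
    'Parker Posey': {"You've Got Mail (1998)", 'Clockwatchers (1997)'},
}


def share_a_movie(movies_of, first, second):
    """Return True if the two actors appear together in at least one movie."""
    # Unknown actors are treated as having no movies at all (an assumption).
    return bool(movies_of.get(first, set()) & movies_of.get(second, set()))


print(share_a_movie(movies_of, 'Kevin Bacon', 'Tom Hanks'))     # True
print(share_a_movie(movies_of, 'Kevin Bacon', 'Parker Posey'))  # False
```

A set intersection keeps the check short and does not care in which order the two names are given.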

We use a text file containing movie appearances to determine whether two actors are co-stars and the distance between them. Each line of such a file contains the name of a movie, followed by a vertical bar (|) and the name of an actor who appears in the movie. This sample text file contains the appearances of seven actors in six movies (cast.txt):

Sleepless in Seattle (1993)|Meg Ryan
In the Cut (2003)|Kevin Bacon
In the Cut (2003)|Meg Ryan
Clockwatchers (1997)|Lisa Kudrow
Six of a Kind (1934)|W.C. Fields
Apollo 13 (1995)|Tom Hanks
You've Got Mail (1998)|Parker Posey
Six of a Kind (1934)|George Burns
Apollo 13 (1995)|Kevin Bacon
You've Got Mail (1998)|Meg Ryan
Sleepless in Seattle (1993)|Tom Hanks
Clockwatchers (1997)|Parker Posey
You've Got Mail (1998)|Tom Hanks
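One possible way to turn lines in this format into lookups is sketched below. It is an illustration rather than a prescribed solution: the function name `parse_appearances` and the two dictionaries it returns are assumptions, and the sketch splits each line on the last vertical bar, which assumes that actor names never contain one. The sample data is embedded as a string so the block runs without a file.

```python
from collections import defaultdict

SAMPLE = """\
Sleepless in Seattle (1993)|Meg Ryan
In the Cut (2003)|Kevin Bacon
In the Cut (2003)|Meg Ryan
Clockwatchers (1997)|Lisa Kudrow
Six of a Kind (1934)|W.C. Fields
Apollo 13 (1995)|Tom Hanks
You've Got Mail (1998)|Parker Posey
Six of a Kind (1934)|George Burns
Apollo 13 (1995)|Kevin Bacon
You've Got Mail (1998)|Meg Ryan
Sleepless in Seattle (1993)|Tom Hanks
Clockwatchers (1997)|Parker Posey
You've Got Mail (1998)|Tom Hanks
"""


def parse_appearances(lines):
    """Build movie -> actors and actor -> movies lookups from 'movie|actor' lines."""
    cast = defaultdict(set)    # movie -> set of actors
    films = defaultdict(set)   # actor -> set of movies
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last bar (assumes actor names contain no '|').
        movie, actor = line.rsplit('|', 1)
        cast[movie].add(actor)
        films[actor].add(movie)
    return cast, films


cast, films = parse_appearances(SAMPLE.splitlines())
print(len(cast), len(films))          # 6 7
print(sorted(films['Kevin Bacon']))   # ['Apollo 13 (1995)', 'In the Cut (2003)']
```

Keeping both directions of the mapping makes it cheap to go from an actor to their movies and from a movie to its cast.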

All co-stars among these seven actors (black) are connected with blue lines in the figure below, with the name of the movie in which they appear together as a label for the line (also in blue). The yellow rings — numbered from zero inside out — demonstrate how the distance between two actors $$a_1$$ and $$a_2$$ can be determined, with Kevin Bacon ($$a_2$$) as an example. Ring $$n$$ contains all actors at distance $$n$$ from actor $$a_2$$ ($$n = 0, 1, 2, \ldots$$).

Bacon number
All co-stars among these seven actors (black) are connected with blue lines, with the name of the movie in which they appear together as a label for the line (also in blue). The yellow rings — numbered from zero inside out — demonstrate how the distance between two actors $$a_1$$ and $$a_2$$ can be determined, with Kevin Bacon ($$a_2$$) as an example. Ring $$n$$ contains all actors at distance $$n$$ from actor $$a_2$$ ($$n = 0, 1, 2, \ldots$$).

We start at ring 0, which only contains actor $$a_2$$ (Kevin Bacon). Then we determine the actors in successive rings (inside out): ring $$n + 1$$ contains all co-stars of the actors in ring $$n$$ that do not appear in rings $$0, 1, 2, \ldots, n$$ ($$n = 0, 1, 2, \ldots$$). We repeat this process until we get a ring that contains actor $$a_1$$ or until we get an empty ring. In the first case, the number of the ring containing actor $$a_1$$ is the distance between actors $$a_1$$ and $$a_2$$. For example, the figure shows that the distance between American actress Lisa Kudrow ($$a_1$$) and Kevin Bacon is equal to 3 (her Bacon number). In the second case, the two actors cannot be connected. For the casts in the given file, this is the case for George Burns ($$a_1$$) and Kevin Bacon, as the figure also shows.
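The ring procedure above can be sketched as a breadth-first expansion over sets. The following is a minimal illustration under stated assumptions: the co-star relation is given as a dictionary `neighbours` (a hypothetical name) typed in by hand from the figure, and `ring_distance` is an illustrative function, not the required `distance` method. It returns -1 when an empty ring is reached, matching the example session further down.

```python
# Hypothetical co-star lookup for the sample data: actor -> set of co-stars.
neighbours = {
    'Kevin Bacon': {'Meg Ryan', 'Tom Hanks'},
    'Meg Ryan': {'Kevin Bacon', 'Tom Hanks', 'Parker Posey'},
    'Tom Hanks': {'Kevin Bacon', 'Meg Ryan', 'Parker Posey'},
    'Parker Posey': {'Meg Ryan', 'Tom Hanks', 'Lisa Kudrow'},
    'Lisa Kudrow': {'Parker Posey'},
    'George Burns': {'W.C. Fields'},
    'W.C. Fields': {'George Burns'},
}


def ring_distance(neighbours, a1, a2='Kevin Bacon'):
    """Return the number of the ring around a2 that contains a1, or -1."""
    ring = {a2}   # ring 0 only contains a2
    seen = {a2}   # all actors in rings 0..n
    n = 0
    while ring:
        if a1 in ring:
            return n
        # Ring n + 1: co-stars of ring n that are not in any earlier ring.
        next_ring = set()
        for actor in ring:
            next_ring |= neighbours.get(actor, set())
        ring = next_ring - seen
        seen |= ring
        n += 1
    return -1     # an empty ring was reached: the actors are not connected


print(ring_distance(neighbours, 'Lisa Kudrow'))                  # 3
print(ring_distance(neighbours, 'George Burns'))                 # -1
print(ring_distance(neighbours, 'George Burns', 'W.C. Fields'))  # 1
```

Tracking `seen` separately from the current ring is what guarantees that each actor is placed in exactly one ring, so the loop always terminates.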

Define a class Roles that can be used to determine co-stardom and distance between actors based on movie appearances in a given file. The location (str) of the text file containing the appearances must be passed when creating a new collection of roles (Roles). A collection of roles $$r$$ (Roles) must support at least the methods used in the example session below: movies, actors, are_costars, costars and distance.

Tip

The text files in this assignment use the UTF-8 character encoding. This is now the default encoding on most computer systems (including Dodona). If you encounter problems when reading files, you can explicitly specify the encoding when opening the files: encoding='utf-8'.
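As a small illustration of this tip, the sketch below passes the encoding explicitly to Python's built-in `open`. The helper name `read_appearances` and the temporary demo file are assumptions made for the example. The demo line contains a non-ASCII character and is not part of cast.txt.

```python
import os
import tempfile


def read_appearances(location):
    """Return the non-empty lines of a UTF-8 encoded text file."""
    # Passing the encoding explicitly avoids depending on the platform default.
    with open(location, encoding='utf-8') as handle:
        return [line.rstrip('\n') for line in handle if line.strip()]


# Demonstration with a temporary file (hypothetical content, not from cast.txt).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('Amélie (2001)|Audrey Tautou\n')
    demo_lines = read_appearances(path)

print(demo_lines)
```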

Example

In the following interactive session we assume the text file cast.txt to be located in the current directory.

>>> roles = Roles('cast.txt')
>>> roles.movies('Kevin Bacon')
{'Apollo 13 (1995)', 'In the Cut (2003)'}
>>> roles.actors('Apollo 13 (1995)')
{'Kevin Bacon', 'Tom Hanks'}
>>> roles.are_costars('Kevin Bacon', 'Tom Hanks')
True
>>> roles.are_costars('Kevin Bacon', 'Parker Posey')
False
>>> roles.costars(['Kevin Bacon'])
{'Kevin Bacon', 'Tom Hanks', 'Meg Ryan'}
>>> roles.costars(('Parker Posey', 'George Burns'))
{'George Burns', 'W.C. Fields', 'Tom Hanks', 'Lisa Kudrow', 'Parker Posey', 'Meg Ryan'}
>>> roles.distance('Kevin Bacon')
0
>>> roles.distance('Tom Hanks')
1
>>> roles.distance('Meg Ryan')
1
>>> roles.distance('Parker Posey')
2
>>> roles.distance('Lisa Kudrow')
3
>>> roles.distance('James Purefoy')
-1
>>> roles.distance('George Burns')
-1
>>> roles.distance('W.C. Fields')
-1
>>> roles.distance('George Burns', 'W.C. Fields')
1

Epilogue: starstruck

Maarten Baes, an astronomy professor at Ghent University, Belgium, has a Bacon number of 5.

Maarten Baes (Bacon number)
Maarten Baes is five steps away from Kevin Bacon.

Baes plays the male lead as Belgian astronomer "Koen" in the movie Above Us All (2014), which also stars actress Pearl Davern. Davern plays a minor role in the movie Resistance (1992) with Lorna Lesley as the lead actress, who also plays a lead role in the movie Just Out of Reach (1979) alongside actor Sam Neill. Neill in turn can be seen as "Théo" in the movie Angel (2007), featuring Michael Fassbender as "Esmé". Fassbender himself has a Bacon number of 1, due to his role of "Erik Lehnsherr" in X-Men: First Class (2011), in which Kevin Bacon plays the role of "Sebastian Shaw".