Do you know the American actor Kevin Bacon?
He's the proverbial spider in the web of a parlor game called six degrees of Kevin Bacon, in which two or more players team up to choose an arbitrary actor and connect them to another actor via a movie in which both appeared, repeating this process to find the shortest path that ultimately leads to Kevin Bacon. The game rests on the assumption that anyone involved in the Hollywood film industry can be linked to Bacon through their movie roles within six steps. Its name is a reference to six degrees of separation, a hypothesis which posits that any two people on Earth are six or fewer acquaintance links apart. A great deal of research has been done to support this hypothesis both empirically and theoretically.
An actor's Bacon number is the minimal number of steps they are away from Kevin Bacon as defined by the game:
Kevin Bacon himself has a Bacon number of 0
all co-stars of Kevin Bacon have a Bacon number of 1
if the lowest Bacon number of all co-stars of actor $$\mathcal{X}$$ is $$n$$, then actor $$\mathcal{X}$$ has a Bacon number of $$n+1$$
The higher the Bacon number, the more intermediate steps it takes to get to Kevin Bacon in the game.
Two actors are co-stars if there's at least one movie in which they appear together. The distance between two actors is the minimal number of co-star steps that connect them together. Therefore, the distance between two co-stars is 1 and an actor's Bacon number is the distance between the actor and Kevin Bacon.
We use a text file containing movie appearances to determine whether two actors are co-stars and the distance between them. Each line of such a file contains the name of a movie, followed by a vertical bar (|) and the name of an actor who appears in the movie. This sample text file contains the appearances of seven actors in six movies (cast.txt):
Sleepless in Seattle (1993)|Meg Ryan
In the Cut (2003)|Kevin Bacon
In the Cut (2003)|Meg Ryan
Clockwatchers (1997)|Lisa Kudrow
Six of a Kind (1934)|W.C. Fields
Apollo 13 (1995)|Tom Hanks
You've Got Mail (1998)|Parker Posey
Six of a Kind (1934)|George Burns
Apollo 13 (1995)|Kevin Bacon
You've Got Mail (1998)|Meg Ryan
Sleepless in Seattle (1993)|Tom Hanks
Clockwatchers (1997)|Parker Posey
You've Got Mail (1998)|Tom Hanks
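As a minimal sketch of how lines in this format could be turned into lookups, the snippet below builds two dictionaries: one from movies to actors and one from actors to movies. The function name parse_roles and both dictionary names are illustrative assumptions, not part of the assignment, and the sketch assumes that actor names never contain a vertical bar.

```python
from collections import defaultdict

def parse_roles(lines):
    """Build movie -> actors and actor -> movies lookups from 'movie|actor' lines."""
    movie_actors = defaultdict(set)
    actor_movies = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last bar (assumption: actor names contain no '|').
        movie, actor = line.rsplit('|', 1)
        movie_actors[movie].add(actor)
        actor_movies[actor].add(movie)
    return dict(movie_actors), dict(actor_movies)

# Four lines taken from the sample file.
sample = [
    'In the Cut (2003)|Kevin Bacon',
    'In the Cut (2003)|Meg Ryan',
    'Apollo 13 (1995)|Tom Hanks',
    'Apollo 13 (1995)|Kevin Bacon',
]
movie_actors, actor_movies = parse_roles(sample)
print(sorted(actor_movies['Kevin Bacon']))
# ['Apollo 13 (1995)', 'In the Cut (2003)']
```

Keeping both directions of the mapping makes it cheap to go from an actor to their movies and from a movie to its cast, which is what the co-star relation needs.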
All co-stars among these seven actors (black) are connected with blue lines in the figure below, with the name of the movie in which they appear together as a label for the line (also in blue). The yellow rings — numbered from zero inside out — demonstrate how the distance between two actors $$a_1$$ and $$a_2$$ can be determined, with Kevin Bacon ($$a_2$$) as an example. Ring $$n$$ contains all actors at distance $$n$$ from actor $$a_2$$ ($$n = 0, 1, 2, \ldots$$).
We start at ring 0 which only contains actor $$a_2$$ (Kevin Bacon). Then we determine the actors in successive rings (inside out): ring $$n + 1$$ contains all co-stars of the actors in ring $$n$$ that do not appear in rings $$0, 1, 2, \ldots, n$$ ($$n = 0, 1, 2, \ldots$$). We repeat this process until we get a ring that contains actor $$a_1$$ or until we get an empty ring. In the first case, the number of the ring containing actor $$a_1$$ is the distance between actors $$a_1$$ and $$a_2$$. For example, we can derive from the figure that the distance between American actress Lisa Kudrow ($$a_1$$) and Kevin Bacon is equal to 3 (her Bacon number). In the second case, the two actors cannot be connected. For the casts in the given file, this is for example the case for George Burns ($$a_1$$) and Kevin Bacon, as we can also see from the figure.
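The ring procedure can be sketched as a loop that keeps the current ring and the set of all actors already placed in a ring. This is only one possible way to write it: the function name ring_distance and the dictionary costars_of (written out by hand from the sample file) are assumptions made for illustration. An actor missing from the dictionary is treated as having no co-stars, and the sketch returns 0 when both names are equal, a case the text does not address for unknown actors.

```python
def ring_distance(costars_of, a1, a2='Kevin Bacon'):
    """Expand rings around a2 until a1 shows up; return -1 on an empty ring."""
    ring = {a2}   # ring 0 holds only the starting actor
    seen = {a2}   # every actor placed in rings 0..n so far
    n = 0
    while ring:
        if a1 in ring:
            return n
        # Ring n + 1: co-stars of ring n that are not in any earlier ring.
        next_ring = set()
        for actor in ring:
            next_ring |= costars_of.get(actor, set())
        next_ring -= seen
        seen |= next_ring
        ring = next_ring
        n += 1
    return -1

# Hypothetical co-star lookup, derived by hand from the sample file.
costars_of = {
    'Kevin Bacon': {'Meg Ryan', 'Tom Hanks'},
    'Meg Ryan': {'Kevin Bacon', 'Tom Hanks', 'Parker Posey'},
    'Tom Hanks': {'Kevin Bacon', 'Meg Ryan', 'Parker Posey'},
    'Parker Posey': {'Meg Ryan', 'Tom Hanks', 'Lisa Kudrow'},
    'Lisa Kudrow': {'Parker Posey'},
    'George Burns': {'W.C. Fields'},
    'W.C. Fields': {'George Burns'},
}

print(ring_distance(costars_of, 'Lisa Kudrow'))                  # 3
print(ring_distance(costars_of, 'George Burns'))                 # -1
print(ring_distance(costars_of, 'George Burns', 'W.C. Fields'))  # 1
```

Because each actor is added to seen the moment they enter a ring, nobody is visited twice, and the loop stops as soon as a ring comes up empty.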
Define a class Roles that can be used to determine co-stardom and distance between actors based on movie appearances in a given file. The location (str) of the text file containing the appearances must be passed when creating a new collection of roles (Roles). A collection of roles $$r$$ (Roles) must support at least the following methods:
A method movies that takes an actor name (str). The method must return a set (set) containing the names (str) of all movies in which the actor appears according to roles $$r$$.
A method actors that takes a movie name (str). The method must return a set (set) containing the names (str) of all actors that appear in the movie according to roles $$r$$.
A method are_costars that takes two actor names $$a_1$$ and $$a_2$$ (str). The method must return a Boolean value (bool) that indicates whether $$a_1$$ and $$a_2$$ are co-stars according to roles $$r$$.
A method costars that takes a collection $$\mathcal{C}$$ (list, tuple or set) of actor names (str). The method must return a set (set) containing the names (str) of all actors that appear in at least one movie in which one of the actors of collection $$\mathcal{C}$$ appears according to roles $$r$$. So, actors from collection $$\mathcal{C}$$ are automatically in the returned set, unless they do not appear in any movie according to roles $$r$$.
A method distance that takes an actor name $$a_1$$ (str). The method also has a second optional parameter that takes a second actor name $$a_2$$ (str; default value: Kevin Bacon). The method must return value -1 (int) if $$a_1$$ and $$a_2$$ cannot be connected according to roles $$r$$. Otherwise, the method must return the distance (int) between both actors according to roles $$r$$.
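To illustrate the co-star semantics described for are_costars and costars, here is a hedged sketch written as plain functions over two lookups rather than as methods of Roles. The names actor_movies and movie_actors and the extra parameters are assumptions for this example only. How are_costars should treat an actor compared with themselves is not stated in the text; this sketch simply checks for a shared movie.

```python
from collections import defaultdict

# A few appearances from the sample file, as (movie, actor) pairs.
appearances = [
    ('In the Cut (2003)', 'Kevin Bacon'),
    ('In the Cut (2003)', 'Meg Ryan'),
    ('Apollo 13 (1995)', 'Tom Hanks'),
    ('Apollo 13 (1995)', 'Kevin Bacon'),
    ("You've Got Mail (1998)", 'Parker Posey'),
    ("You've Got Mail (1998)", 'Meg Ryan'),
]
actor_movies = defaultdict(set)
movie_actors = defaultdict(set)
for movie, actor in appearances:
    actor_movies[actor].add(movie)
    movie_actors[movie].add(actor)

def are_costars(a1, a2, actor_movies):
    """True if both actors share at least one movie (assumed reading)."""
    return bool(actor_movies.get(a1, set()) & actor_movies.get(a2, set()))

def costars(collection, actor_movies, movie_actors):
    """All actors appearing in any movie of any actor in the collection."""
    result = set()
    for actor in collection:
        for movie in actor_movies.get(actor, set()):
            result |= movie_actors[movie]
    return result

print(are_costars('Kevin Bacon', 'Tom Hanks', actor_movies))     # True
print(are_costars('Kevin Bacon', 'Parker Posey', actor_movies))  # False
print(costars(['James Purefoy'], actor_movies, movie_actors))    # set()
```

An actor from the collection ends up in the result only because they are part of the cast of their own movies, which is why an actor without any appearance contributes nothing.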
The text files in this assignment use the UTF-8 character encoding. This is now the default encoding on most computer systems (including Dodona). If you encounter problems when reading files, you can explicitly specify the encoding when opening the files: encoding='utf-8'.
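A small sketch of what passing the encoding explicitly looks like with the built-in open function. The file name cast_demo.txt and its single line are made up for the demonstration and are not part of cast.txt; the file is written to a temporary directory so the example leaves nothing behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'cast_demo.txt')  # hypothetical demo file

    # Write one appearance containing a non-ASCII character.
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('Amélie (2001)|Audrey Tautou\n')

    # Read it back, again naming the encoding explicitly.
    with open(path, encoding='utf-8') as handle:
        lines = [line.rstrip('\n') for line in handle]

print(lines)
# ['Amélie (2001)|Audrey Tautou']
```

Naming the encoding on both the write and the read avoids relying on the platform default, which is where mismatches with accented names usually come from.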
In the following interactive session we assume the text file cast.txt to be located in the current directory.
>>> roles = Roles('cast.txt')
>>> roles.movies('Kevin Bacon')
{'Apollo 13 (1995)', 'In the Cut (2003)'}
>>> roles.actors('Apollo 13 (1995)')
{'Kevin Bacon', 'Tom Hanks'}
>>> roles.are_costars('Kevin Bacon', 'Tom Hanks')
True
>>> roles.are_costars('Kevin Bacon', 'Parker Posey')
False
>>> roles.costars(['Kevin Bacon'])
{'Kevin Bacon', 'Tom Hanks', 'Meg Ryan'}
>>> roles.costars(('Parker Posey', 'George Burns'))
{'George Burns', 'W.C. Fields', 'Tom Hanks', 'Lisa Kudrow', 'Parker Posey', 'Meg Ryan'}
>>> roles.distance('Kevin Bacon')
0
>>> roles.distance('Tom Hanks')
1
>>> roles.distance('Meg Ryan')
1
>>> roles.distance('Parker Posey')
2
>>> roles.distance('Lisa Kudrow')
3
>>> roles.distance('James Purefoy')
-1
>>> roles.distance('George Burns')
-1
>>> roles.distance('W.C. Fields')
-1
>>> roles.distance('George Burns', 'W.C. Fields')
1
Maarten Baes, an astronomy professor at Ghent University, Belgium, has a Bacon number of 5.
Baes plays the male lead as Belgian astronomer "Koen" in the movie Above Us All (2014), which also stars actress Pearl Davern. Davern plays a minor role in the movie Resistance (1992) with Lorna Lesley as the lead actress, who also plays a lead role in the movie Just Out of Reach (1979) alongside actor Sam Neill. Neill in turn can be seen as "Théo" in the movie Angel (2007), featuring Michael Fassbender as "Esmé". Fassbender himself has a Bacon number of 1, due to his role of "Erik Lensherr" in X-Men: First Class (2011) in which Kevin Bacon plays the role of "Sebastian Shaw".