EECS 370

Machine Problem 1, Spring 2000

The Six Degrees of Kevin Bacon

Due: Thursday, January 27, 2000 at 11:59 pm

The Six Degrees of Kevin Bacon is a game that was invented by three college students while watching the movie Footloose starring Kevin Bacon. The three students noticed that while Kevin Bacon rarely played the starring role, he often played a supporting role to some big name actors and actresses and he appeared in a number of ensemble movies. They theorized that any actor or actress could be traced to Kevin Bacon in six steps or fewer using co-stars of movies each actor or actress appeared in.

For example, let us take Harrison Ford. He starred in Raiders of the Lost Ark with Karen Allen. She starred in Animal House with Kevin Bacon. Thus Harrison Ford could be traced to Kevin Bacon in two steps. The creators of the game called this two "degrees" instead of steps. If we did not know that Karen Allen could be used as a link, we could use the following links. Harrison Ford starred in Patriot Games with James Earl Jones (or you could have used any of the first three Star Wars movies). James Earl Jones starred in Conan the Barbarian with Arnold Schwarzenegger. Arnold starred in Twins with Danny DeVito. Danny DeVito starred in One Flew Over the Cuckoo's Nest with Jack Nicholson. Jack Nicholson starred in A Few Good Men with Kevin Bacon. Here we connect Harrison Ford and Kevin Bacon with 5 steps or degrees. When this game is played at a party, the object is to connect the actor or actress with Kevin Bacon in the fewest steps or degrees.

For this programming assignment, you are to write a C or C++ program that will find the fewest number of steps between an actor or actress and Kevin Bacon when given information about who starred in different movies. Note: for the remainder of the write-up, the term "actor" is meant to imply both actor and actress. Thus Meryl Streep is an "actor". The program is to use text input to get commands from a user or a file. Each command will be given on its own line in the input and can be in either upper or lower case. The command will be the first non-white space character on the line. If the first non-white space character is not a command, an error message should be produced. Each command (except for the # command) is to produce a message stating that the command occurred. The commands are:
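The rule for locating the command on a line can be sketched as follows. This is only an illustration, not a required design: the function name parse_command and the constant kCommands are hypothetical, and the sketch assumes the caller reports the error message when '\0' is returned. It also treats a line containing only white space as "no command", which the assignment does not specify.

```cpp
#include <cctype>
#include <string>

// Returns the command letter in lower case, or '\0' when the line holds
// only white space or its first non-white-space character is not one of
// the command letters. kCommands is a hypothetical name; its letters are
// taken from the command list in this handout.
char parse_command(const std::string& line) {
    static const std::string kCommands = "qhm#afslgvtoc";
    for (unsigned char ch : line) {
        if (ch == ' ' || ch == '\t') continue;  // skip leading white space
        char lower = static_cast<char>(std::tolower(ch));
        return kCommands.find(lower) != std::string::npos ? lower : '\0';
    }
    return '\0';  // nothing but white space on the line
}
```

With this sketch, parse_command("  M") yields 'm', so upper and lower case commands can share one switch statement in the caller.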

q
Quit the program.
h
Display help information for the program. This information must include all commands and a short description on how to use each command.
m
Get a list of actors that starred in a particular movie. On the first line after the command will be the name of the movie. Following the name of the movie will be the names of the actors who starred in the movie. Each actor name will be listed on its own line. The line following the last actor will be blank. This will indicate the end of the actor list. Each name (whether movie name or actor name) will be eighty characters or less in length.
#
Do nothing. This indicates the command is a comment. This is primarily to be used to document the commands given in a file. The comment will be given on the same line as the # character.
a
Get a list of movies in which a particular actor starred. On the first line after the command will be the name of the actor. Following the name of the actor will be the names of the movies in which the actor starred. Each movie name will be listed on its own line. The line following the last movie name will be blank. This will indicate the end of the movie list. Each name (whether actor name or movie name) will be eighty characters or less in length.
f
Get commands from the specified file. The filename will be given on the following line. The filename will be eighty characters or less in length. The commands in the file can be any command, including the f command to get commands from yet another file. This would cause a recursive situation that your program must be able to handle. When you get to the end of the file, your program is to return to the previous place where you were getting commands from.

We created a set of sample data files. These do NOT test all possible input to your program, so you will have to create your own tests. Also, these files are not the only ones that will be used when grading your program.

s
Search for the shortest list of steps (or degrees) that will go from the given actor to Kevin Bacon. The actor's name will be given on the next line of input and will be eighty characters or less in length. The list must clearly show which two actors costarred in which movie together.
l
For a given actor's name, list the names of all the movies that actor has starred in (that are known to the program) in lexicographical order. The actor's name will be given on the following line and will be eighty characters or less in length.
g
For a given movie name, get the names of all actors that starred in the movie (that are known to the program) in lexicographical order. The movie name will be given on the following line and will be eighty characters or less in length.
v
List the names of all movies known to the program in lexicographical order.
t
List the names of all actors known to the program in lexicographical order.
o
Write out the entire knowledge of the program to a file. The filename will be given on the following line. The filename will be eighty characters or less in length. If the file already exists, you are to append to the end of the file. The knowledge of the program is to be written in "movie major" format. This means that for each movie, you are to list the names of all actors that starred in the movie. The movies are to be listed in lexicographical order and the actor list for each movie is also to be listed in lexicographical order. The output must be written such that it can be read back into the program using the f command. This means that the information written to the file will be a series of properly formatted m commands.
c
Write out the entire knowledge of the program to a file. The filename will be given on the following line. The filename will be eighty characters or less in length. If the file already exists, you are to append to the end of the file. The knowledge of the program is to be written in "actor major" format. This means that for each actor, you are to list the names of all movies that the actor starred in. The actor names are to be listed in lexicographical order and the movie list for each actor is also to be listed in lexicographical order. The output must be written such that it can be read back into the program using the f command. This means that the information written to the file will be a series of properly formatted a commands.
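To make the "movie major" format concrete, here is a minimal sketch of a writer that emits a series of m commands. The names MovieTable, write_movie_major and movie_major_text are hypothetical, and the sorted std::map / std::set containers are an assumption made to keep the example short; they are not the hash table your look-ups must use. The sketch also assumes names are already stored in their canonical form.

```cpp
#include <map>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical in-memory form: movie name -> names of its actors.
// std::map and std::set iterate in sorted order, which supplies the
// lexicographical ordering the o command asks for.
using MovieTable = std::map<std::string, std::set<std::string>>;

// Writes each movie as one m command: the letter m, the movie name,
// one actor per line, then the blank line that ends the actor list.
void write_movie_major(std::ostream& out, const MovieTable& movies) {
    for (const auto& entry : movies) {
        out << "m\n" << entry.first << '\n';
        for (const std::string& actor : entry.second) out << actor << '\n';
        out << '\n';
    }
}

// Convenience wrapper that returns the same text as a string.
std::string movie_major_text(const MovieTable& movies) {
    std::ostringstream out;
    write_movie_major(out, movies);
    return out.str();
}
```

Because the writer takes any std::ostream, the same routine could be handed a std::ofstream opened with std::ios::app to satisfy the append requirement. An "actor major" writer would follow the same pattern with an actor-to-movies table and the letter a.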

Lexicographical order is the normal sorting order used for strings. Names are not case sensitive, thus the name "Kevin Bacon" is the same as "keVIn bAcoN"; however, when storing names in your program the first letter of each word must be in upper case, while all remaining letters in the name are in lower case. Thus the name given as "chARlie McCarthy" would be stored as "Charlie Mccarthy". Your program is responsible for any needed case conversions.
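The case rule above could be implemented along these lines. This is a sketch under the assumption that words are separated only by blanks and horizontal tabs; the function name canonical_case is hypothetical.

```cpp
#include <cctype>
#include <string>

// Upper-cases the first letter of each word and lower-cases the rest.
// A word starts after a blank or a horizontal tab (or at the beginning
// of the string). White space is copied through unchanged.
std::string canonical_case(const std::string& name) {
    std::string result;
    bool start_of_word = true;
    for (unsigned char ch : name) {
        if (ch == ' ' || ch == '\t') {
            start_of_word = true;
            result += static_cast<char>(ch);
        } else {
            int converted = start_of_word ? std::toupper(ch) : std::tolower(ch);
            result += static_cast<char>(converted);
            start_of_word = false;
        }
    }
    return result;
}
```

Once every name passes through a routine like this before it is stored or looked up, an ordinary string comparison is enough to treat "Kevin Bacon" and "keVIn bAcoN" as the same name.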

Also, all names are to be stored with no leading or trailing white space characters (white space characters are blanks/spaces, horizontal tabs, vertical tabs and newline characters - in this program, since all names will be given on a single line, only the blank/space and horizontal tabs will be considered as white space) and with only a single space character separating words. Thus if a name has leading white space, multiple white space between words or trailing white space, your program will have to modify the name to deal with these.
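One way to apply the white space rule is a single pass that remembers whether a separator is pending. The function name collapse_whitespace is hypothetical and the code is only a sketch of the idea.

```cpp
#include <string>

// Removes leading and trailing blanks/tabs and replaces every run of
// blanks/tabs between words with exactly one space character.
std::string collapse_whitespace(const std::string& name) {
    std::string result;
    bool pending_space = false;
    for (char ch : name) {
        if (ch == ' ' || ch == '\t') {
            // A separator only matters once at least one word was copied.
            pending_space = !result.empty();
        } else {
            if (pending_space) result += ' ';
            pending_space = false;
            result += ch;
        }
    }
    return result;  // a trailing pending space is never written
}
```

For example, collapse_whitespace("  Kevin \t Bacon  ") produces "Kevin Bacon".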

Your program does not have to be smart enough to match the same actor using multiple names. For example, "Robert Deniro", "Bobby Deniro" and "Bob Deniro" would be three different actors. Also, the program does not have to recognize multiple forms of the same name, i.e. "Robert Deniro" and "Deniro, Robert" are different names, also "A Few Good Men" is different from "Few Good Men" or "Few Good Men, A".

You should think of this program as a large graph program. Finding the fewest steps connecting an actor to Kevin Bacon can be done using a breadth-first search from the actor's vertex in the graph to Kevin Bacon's vertex in the graph. Since you will have to look up both actor and movie names, you must use some fast look-up algorithm, such as a hash table, to quickly find the name's location in the graph.
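The breadth-first search described above might look like the following sketch. Everything here is an assumption made for illustration: the Graph and Step structures, the function name shortest_path, and the use of std::unordered_map as the hash table. Your own data structures may differ, and a hand-written hash table would serve the same purpose.

```cpp
#include <algorithm>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical graph layout: two hash tables of adjacency lists.
struct Graph {
    std::unordered_map<std::string, std::vector<std::string>> movies_of;  // actor -> movies
    std::unordered_map<std::string, std::vector<std::string>> actors_in;  // movie -> actors

    void add(const std::string& movie, const std::string& actor) {
        movies_of[actor].push_back(movie);
        actors_in[movie].push_back(actor);
    }
};

// One degree: `from` and `to` costarred in `movie`.
struct Step {
    std::string from, movie, to;
};

// Breadth-first search over actors. Returns the steps from `start` to
// `target`, or an empty vector when they are the same actor or when no
// connection is known.
std::vector<Step> shortest_path(const Graph& g, const std::string& start,
                                const std::string& target) {
    std::unordered_map<std::string, Step> reached_by;  // actor -> step that first reached it
    std::queue<std::string> frontier;
    frontier.push(start);
    reached_by[start] = Step();  // marks the start actor as visited
    while (!frontier.empty()) {
        std::string actor = frontier.front();
        frontier.pop();
        if (actor == target) break;
        auto movies = g.movies_of.find(actor);
        if (movies == g.movies_of.end()) continue;  // unknown actor
        for (const std::string& movie : movies->second) {
            for (const std::string& costar : g.actors_in.at(movie)) {
                if (reached_by.count(costar)) continue;  // already visited
                reached_by[costar] = Step{actor, movie, costar};
                frontier.push(costar);
            }
        }
    }
    std::vector<Step> path;
    if (!reached_by.count(target)) return path;
    // Walk backwards from the target, then reverse into start-to-target order.
    for (std::string at = target; at != start; at = reached_by[at].from) {
        path.push_back(reached_by[at]);
    }
    std::reverse(path.begin(), path.end());
    return path;
}
```

Because breadth-first search visits actors in order of increasing distance, the first time Kevin Bacon is reached the recorded chain is a shortest one. Each Step carries the movie name, which is what the s command needs in order to show which two actors costarred in which movie.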

This program will require the use of multiple source code files, separate compilation and the use of a makefile. The division of the subroutines between the multiple source code files must be logical. One suggestion would be to have the graph routines in one source code file and the command interface routines in another source code file. This suggestion only specifies two source code files; you may wish to have more than two source code files for your program. Note: a "source code file" is not the same as a "header file". Source code files will have a file extension of .c (or .cpp, .C, .CC, etc.), while a header file will have a file extension of .h. Your program must separately compile the source code files and then link them together. Using a #include statement with a source code file (a .c file) will NOT satisfy this requirement.

This program will also require a 1-2 page write-up of the data structures used in the program and the logical division of your program into multiple source code files (i.e. which routines are where). Remember that this write-up is to be written in ASCII format and is to be electronically turned in with your program. The name of this file should be "readme.txt". Also recall that your program will be given to another student to write a critique. Therefore, it is suggested that you do not include your Social Security Number in your program. Instead use your name and your EECS User ID to identify yourself.

Your program must be written in good programming style. This includes (but is not limited to) meaningful identifier names, a file header at the beginning of each source code file, a function header at the beginning of each function, proper use of blank lines and indentation to aid in the reading of your code, explanatory "value-added" in-line comments, etc.

The work you turn in must be 100% your own. You are not allowed to share code with any other person (inside this class or not). You may discuss the project with other persons; however, you may not show any code you write to another person nor may you look at any other person's written code.

You are to submit this project using the turnin command on the EECS Department's UNIX machines. The project name for this assignment is mp1. Be sure to submit all source code files, header files and the makefile, as well as your program description. Failure to turn in all required pieces will result in a lower grade for the assignment.