Comparing top movie ratings to money going in vs. out¶

Motivation:¶

Problem¶

For the highest-rated movies, is the box office revenue always greater than the production budget? I expect both the budget and the box office revenue to be high for popular, highly rated movies.

Solution¶

I found a dataset containing the top 250 movies according to IMDB. Ideally, comparing the money spent with the money earned will show whether the budgets of highly rated movies were really "worth it".

Impact¶

If successful, this analysis could show which movies succeeded not only in ratings but also financially. It's great if people like a movie, but it's hard on the producers if the movie doesn't make financial sense.

Dataset:¶

Detail¶

I will use a Kaggle Dataset of IMDB Movies to observe the following features for each movie:

  • rank - Rank of the movie
  • name - Name of the movie
  • year - Release year
  • rating - Rating of the movie
  • genre - Genre of the movie
  • certificate - Certificate of the movie
  • run_time - Total movie run time
  • tagline - Tagline of the movie
  • budget - Budget of the movie
  • box_office - Total box office collection across the world
  • casts - All cast members of the movie
  • directors - Director(s) of the movie
  • writers - Writer(s) of the movie

Potential Problems¶

A potential problem is that the budget may depend on the type of movie. For example, a sci-fi or action movie with a lot of props might automatically require more money even if that doesn't increase the rating of the movie. Another factor that could affect the budget is the number of actors and how popular they are.
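One way to see how strongly budget varies with genre is to split the comma-separated `genre` column and compare the median budget per genre. The sketch below is only an illustration: it uses the first five rows of the dataset as inline sample data, `median_budget_by_genre` is a hypothetical helper name, and the `pd.to_numeric(..., errors="coerce")` step assumes that some budget values in the full file may not be clean numbers.

```python
import pandas as pd

# First five rows of the dataset, typed in by hand so the example runs on its own.
sample = pd.DataFrame({
    "name": ["The Shawshank Redemption", "The Godfather", "The Dark Knight",
             "The Godfather Part II", "12 Angry Men"],
    "genre": ["Drama", "Crime,Drama", "Action,Crime,Drama",
              "Crime,Drama", "Crime,Drama"],
    "budget": [25000000, 6000000, 185000000, 13000000, 350000],
})


def median_budget_by_genre(df):
    """Median budget per genre; a movie with several genres counts once in each.

    Hypothetical helper, not part of the original notebook.
    """
    rows = df.assign(
        # Assumption: non-numeric budget entries should become NaN and be ignored.
        budget=pd.to_numeric(df["budget"], errors="coerce"),
        genre=df["genre"].str.split(","),
    ).explode("genre")
    rows["genre"] = rows["genre"].str.strip()
    return rows.groupby("genre")["budget"].median().sort_values(ascending=False)


by_genre = median_budget_by_genre(sample)
print(by_genre)
```

With only five rows this says little, but on the full 250 movies a large gap between genres would suggest comparing budgets within a genre rather than across all movies.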

In [1]:
# First few lines of the dataset

import pandas as pd

df = pd.read_csv('IMDB Top 250 Movies.csv')

df.head()
Out[1]:
| | rank | name | year | rating | genre | certificate | run_time | tagline | budget | box_office | casts | directors | writers |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | The Shawshank Redemption | 1994 | 9.3 | Drama | R | 2h 22m | Fear can hold you prisoner. Hope can set you f... | 25000000 | 28884504 | Tim Robbins,Morgan Freeman,Bob Gunton,William ... | Frank Darabont | Stephen King,Frank Darabont |
| 1 | 2 | The Godfather | 1972 | 9.2 | Crime,Drama | R | 2h 55m | An offer you can't refuse. | 6000000 | 250341816 | Marlon Brando,Al Pacino,James Caan,Diane Keato... | Francis Ford Coppola | Mario Puzo,Francis Ford Coppola |
| 2 | 3 | The Dark Knight | 2008 | 9.0 | Action,Crime,Drama | PG-13 | 2h 32m | Why So Serious? | 185000000 | 1006234167 | Christian Bale,Heath Ledger,Aaron Eckhart,Mich... | Christopher Nolan | Jonathan Nolan,Christopher Nolan,David S. Goyer |
| 3 | 4 | The Godfather Part II | 1974 | 9.0 | Crime,Drama | R | 3h 22m | All the power on earth can't change destiny. | 13000000 | 47961919 | Al Pacino,Robert De Niro,Robert Duvall,Diane K... | Francis Ford Coppola | Francis Ford Coppola,Mario Puzo |
| 4 | 5 | 12 Angry Men | 1957 | 9.0 | Crime,Drama | Approved | 1h 36m | Life Is In Their Hands -- Death Is On Their Mi... | 350000 | 955 | Henry Fonda,Lee J. Cobb,Martin Balsam,John Fie... | Sidney Lumet | Reginald Rose |
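With these columns, the question from the Problem section can be checked directly by subtracting `budget` from `box_office`. The following is a minimal sketch on the five preview rows only. It assumes both columns are reported in the same units (the dataset preview does not state a currency), and `add_profit_columns`, `profit`, and `earned_back` are hypothetical names introduced for illustration.

```python
import pandas as pd

# The five rows from the preview, re-entered by hand so the example is self-contained.
sample = pd.DataFrame({
    "name": ["The Shawshank Redemption", "The Godfather", "The Dark Knight",
             "The Godfather Part II", "12 Angry Men"],
    "budget": [25000000, 6000000, 185000000, 13000000, 350000],
    "box_office": [28884504, 250341816, 1006234167, 47961919, 955],
})


def add_profit_columns(df):
    """Return a copy with 'profit' and 'earned_back' columns (hypothetical helper)."""
    out = df.copy()
    # Assumption: entries that cannot be parsed as numbers are treated as missing.
    out["budget"] = pd.to_numeric(out["budget"], errors="coerce")
    out["box_office"] = pd.to_numeric(out["box_office"], errors="coerce")
    # Drop rows with a missing amount so they are not counted as losses.
    out = out.dropna(subset=["budget", "box_office"])
    out["profit"] = out["box_office"] - out["budget"]
    out["earned_back"] = out["box_office"] > out["budget"]
    return out


result = add_profit_columns(sample)
print(result[["name", "profit", "earned_back"]])
print("Every movie earned back its budget:", result["earned_back"].all())
```

Even in this small preview, 12 Angry Men has a recorded box office far below its budget, so unusually small values are worth checking against the source before drawing conclusions.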

Method:¶

I would like to create a graph to visualize this problem. I would make a line graph with the rating on the x-axis and two lines on the y-axis: one for box office income and one for budget. I think that comparing the two lines could be interesting.
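A rough sketch of that plot is shown here, again on the five preview rows rather than the full file. Two choices in it are assumptions rather than part of the plan above: because several movies share the same rating, the amounts are averaged per rating so that each line has one point per x value, and the y-axis uses a log scale because the amounts span several orders of magnitude. The helper names `mean_money_by_rating` and `plot_money_by_rating` are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# The five preview rows, entered by hand; replace with the full DataFrame in practice.
sample = pd.DataFrame({
    "rating": [9.3, 9.2, 9.0, 9.0, 9.0],
    "budget": [25000000, 6000000, 185000000, 13000000, 350000],
    "box_office": [28884504, 250341816, 1006234167, 47961919, 955],
})


def mean_money_by_rating(df):
    """Average budget and box office for each distinct rating (hypothetical helper)."""
    # Assumption: entries that cannot be parsed as numbers are treated as missing.
    money = df[["budget", "box_office"]].apply(pd.to_numeric, errors="coerce")
    money["rating"] = df["rating"]
    return money.groupby("rating")[["budget", "box_office"]].mean().sort_index()


def plot_money_by_rating(summary):
    """Draw one line for budget and one for box office against rating."""
    fig, ax = plt.subplots()
    ax.plot(summary.index, summary["budget"], marker="o", label="budget")
    ax.plot(summary.index, summary["box_office"], marker="o", label="box office")
    ax.set_xlabel("rating")
    ax.set_ylabel("mean amount (units as given in the dataset)")
    ax.set_yscale("log")  # design choice: amounts differ by orders of magnitude
    ax.legend()
    return ax


summary = mean_money_by_rating(sample)
ax = plot_money_by_rating(summary)
```

If averaging hides too much, a scatter plot of individual movies (one point per movie, colored by budget vs. box office) would be an alternative worth trying.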
