Have Movies Gotten Longer?¶

  1. Describes and motivates a real-world problem where data science may provide helpful insights. Your description should be easily understood by a casual reader and include citations to motivating sources or relevant information (e.g. news articles, further reading links … Wikipedia makes for a poor reference but the links it cites are usually promising).

Proposal: Have movies gotten longer over time?

Many movie goers in the last few years have felt that movies have gotten longer. "Watching a movie in theaters these days can feel like a feat of endurance"(Kaur Why it feels like movies are getting longer). Is this really the case though? "It may not be scientifically proven, but sometimes it feels like movies are, indeed, getting longer"(Rubin Why are movies so long now?). Through data science I'm going to try and see if that is the case.

citations:

Kaur, Harmeet. “Why It Feels like Movies Are Getting Longer.” CNN, Cable News Network, 6 Feb. 2022, https://www.cnn.com/2022/02/06/entertainment/movie-runtimes-longer-mcu-batman-oscar-bait-cec/index.html. Rubin, Rebecca. “Why Are Movies so Long Now?” Variety, Variety, 23 Feb. 2022, https://variety.com/2022/film/features/batman-spider-man-long-movie-runtimes-1235187797/.

  1. Explicitly load and show your dataset. Provide a data dictionary which explains the meaning of each feature present. Demonstrate that this data is sufficient to make progress on your real-world problem described above.
In [ ]:
import pandas as pd
In [12]:
data = pd.read_csv("movies.csv")
In [13]:
data
Out[13]:
name rating genre year released score votes director writer star country budget gross company runtime
0 The Shining R Drama 1980 June 13, 1980 (United States) 8.4 927000.0 Stanley Kubrick Stephen King Jack Nicholson United Kingdom 19000000.0 46998772.0 Warner Bros. 146.0
1 The Blue Lagoon R Adventure 1980 July 2, 1980 (United States) 5.8 65000.0 Randal Kleiser Henry De Vere Stacpoole Brooke Shields United States 4500000.0 58853106.0 Columbia Pictures 104.0
2 Star Wars: Episode V - The Empire Strikes Back PG Action 1980 June 20, 1980 (United States) 8.7 1200000.0 Irvin Kershner Leigh Brackett Mark Hamill United States 18000000.0 538375067.0 Lucasfilm 124.0
3 Airplane! PG Comedy 1980 July 2, 1980 (United States) 7.7 221000.0 Jim Abrahams Jim Abrahams Robert Hays United States 3500000.0 83453539.0 Paramount Pictures 88.0
4 Caddyshack R Comedy 1980 July 25, 1980 (United States) 7.3 108000.0 Harold Ramis Brian Doyle-Murray Chevy Chase United States 6000000.0 39846344.0 Orion Pictures 98.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
7663 More to Life NaN Drama 2020 October 23, 2020 (United States) 3.1 18.0 Joseph Ebanks Joseph Ebanks Shannon Bond United States 7000.0 NaN NaN 90.0
7664 Dream Round NaN Comedy 2020 February 7, 2020 (United States) 4.7 36.0 Dusty Dukatz Lisa Huston Michael Saquella United States NaN NaN Cactus Blue Entertainment 90.0
7665 Saving Mbango NaN Drama 2020 April 27, 2020 (Cameroon) 5.7 29.0 Nkanya Nkwai Lynno Lovert Onyama Laura United States 58750.0 NaN Embi Productions NaN
7666 It's Just Us NaN Drama 2020 October 1, 2020 (United States) NaN NaN James Randall James Randall Christina Roz United States 15000.0 NaN NaN 120.0
7667 Tee em el NaN Horror 2020 August 19, 2020 (United States) 5.7 7.0 Pereko Mosia Pereko Mosia Siyabonga Mabaso South Africa NaN NaN PK 65 Films 102.0

7668 rows × 15 columns

data dictionary¶

name: name of movie

rating: rating of movie

genre: genre of movie

year: year that movie was released

released: specific date movie was released

score: average IMBD score

votes: number of people who rated on IMBD

director: director of movie

writer: writer of movie

star: movie start of the movie

country: country of origin

budget: budget for the movie

gross: how much the movie grossed

compaany: company that produced the movie

runtime: runtime of the movie (how long the movie is)

This dataset has the year that each movie was released and its runtime so it is sufficient for solving whether or not movies have gotten longer over time.

  1. Write one or two sentences about how the data will be used to solve the problem.

First count how may movies came out in each year, and add up the total runtime of movies that came out in each year. Then use these two values to find the average runtime of movies in every year from 1980 to 2020 (the years contained in the dataset I'm using). From there you can see how the average runtime of movies has changed (or not changed).

In [ ]: