You might expect that the more hours you work, the more productive you are and the more money you make. However, some argue that there is a trade-off among these factors, because overwork diminishes motivation and productivity. There has been a recent push for a 4-day workweek, in place of the traditional 5-day workweek, to boost productivity and wealth. This project studies the relationships among annual working hours, productivity, and GDP per capita.
The Four-Day Workweek Merits Consideration - Forbes
The evidence overwhelmingly says 4-day work weeks are good for everyone
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# import datasets
df_hours_and_GDP = pd.read_csv('annual-working-hours-vs-gdp-per-capita-pwt.csv')
df_productivity_and_hours = pd.read_csv('productivity-vs-annual-hours-worked.csv')
# merge dataframes
df_merged = pd.merge(df_hours_and_GDP, df_productivity_and_hours, how='left', on=['Entity', 'Year'])
# drop columns
df_merged_clean = df_merged.drop(columns=['Code_x', 'Code_y', 'Annual working hours per worker_y', 'Population (historical estimates)_y', 'Continent_y', 'Population (historical estimates)_x'])
# rename columns
df_merged_clean_renamed = df_merged_clean.rename(columns={'Entity': 'Country',
'Annual working hours per worker_x': 'Annual Working Hours per Worker',
'GDP per capita (output, multiple price benchmarks)':'GDP per Capita',
'Productivity: output per hour worked':'Productivity',
'Continent_x':'Continent'})
# drop NaN (from certain columns)
df_all = df_merged_clean_renamed.dropna(subset=['Country', 'Year', 'Annual Working Hours per Worker', 'GDP per Capita', 'Productivity'])
# reset indexes
df_all = df_all.reset_index(drop=True)
# present data
df_all
 | Country | Year | Annual Working Hours per Worker | GDP per Capita | Continent | Productivity |
---|---|---|---|---|---|---|
0 | Argentina | 1950 | 2034.0000 | 2931.7388 | NaN | 3.727674 |
1 | Argentina | 1951 | 2037.8667 | 2940.7954 | NaN | 3.752668 |
2 | Argentina | 1952 | 2041.7408 | 2629.9502 | NaN | 3.365233 |
3 | Argentina | 1953 | 2045.6223 | 2747.4377 | NaN | 3.522695 |
4 | Argentina | 1954 | 2049.5112 | 2821.9634 | NaN | 3.623416 |
... | ... | ... | ... | ... | ... | ... |
3487 | Vietnam | 2015 | 2191.3704 | 6180.3580 | Asia | 4.946606 |
3488 | Vietnam | 2016 | 2169.5515 | 6368.6510 | NaN | 5.156925 |
3489 | Vietnam | 2017 | 2131.9683 | 6841.6543 | NaN | 5.652919 |
3490 | Vietnam | 2018 | 2131.9683 | 7217.9240 | NaN | 5.982665 |
3491 | Vietnam | 2019 | 2131.9683 | 7506.8170 | NaN | 6.739149 |
3492 rows × 6 columns
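The preview above shows `Continent` filled in for only some rows (for example Vietnam in 2015), which matters if the data is later grouped by continent. A minimal sketch of one way to check and fill this, assuming each country carries at most one continent label; `df_demo` is a hypothetical miniature of `df_all`, not the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of df_all: Continent is present on only one row.
df_demo = pd.DataFrame({
    'Country': ['Vietnam', 'Vietnam', 'Vietnam', 'Argentina', 'Argentina'],
    'Year': [2014, 2015, 2016, 1950, 1951],
    'Continent': [np.nan, 'Asia', np.nan, np.nan, np.nan],
})

# Count how many rows lack a continent label before filling.
missing_before = df_demo['Continent'].isna().sum()

# Broadcast each country's first non-null continent to all of its rows.
# Countries with no label at all (Argentina in this toy frame) stay missing.
df_demo['Continent'] = df_demo.groupby('Country')['Continent'].transform('first')

missing_after = df_demo['Continent'].isna().sum()
print(f'missing before: {missing_before}, missing after: {missing_after}')
```

Rows that remain missing after this step would need a label from another source before any per-continent analysis.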
x = df_all['Annual Working Hours per Worker']
y = df_all['GDP per Capita']
# plot GDP data
plt.scatter(x, y, c='#DE6666', s=1, alpha=0.6)
plt.xlabel('Annual Working Hours per Worker')
plt.ylabel('GDP per Capita')
plt.title('GDP per Capita vs. Annual Working Hours per Worker Across the World (1950 - 2019)')
# fit a cubic (degree-3) polynomial trend curve
model = np.poly1d(np.polyfit(x, y, 3))
polyline = np.linspace(1250, 3000)
plt.plot(polyline, model(polyline), color='#853e3e')
# plot productivity data
y = df_all['Productivity']
plt.scatter(x, y, c='#4772D1', label='Productivity', s=1, alpha=0.6)
plt.xlabel('Annual Working Hours per Worker')
plt.ylabel('Productivity')
plt.title('Productivity vs. Annual Working Hours per Worker Across the World (1950 - 2019)')
# fit a cubic (degree-3) polynomial trend curve
model = np.poly1d(np.polyfit(x, y, 3))
polyline = np.linspace(1250, 3000)
plt.plot(polyline, model(polyline), color='#12398c')
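The fitted curves are drawn but their quality is not measured. One possible check is the coefficient of determination, R² = 1 − SS_res / SS_tot, computed for the same degree-3 `np.polyfit` model. The sketch below uses synthetic stand-in data (the `hours` and `productivity` arrays are invented for illustration), so the printed value says nothing about the real dataset:

```python
import numpy as np

# Synthetic stand-in for the hours and productivity columns (not the real data).
rng = np.random.default_rng(0)
hours = np.linspace(1300, 2900, 200)
productivity = 60 - 0.02 * hours + rng.normal(0, 2, hours.size)

# Same fitting call as in the notebook: a degree-3 polynomial.
model = np.poly1d(np.polyfit(hours, productivity, 3))

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
residuals = productivity - model(hours)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((productivity - productivity.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f'R^2 of cubic fit: {r_squared:.3f}')
```

With the real data, `hours` and `productivity` would be replaced by `df_all['Annual Working Hours per Worker']` and `df_all['Productivity']` (or `df_all['GDP per Capita']`).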
Country: country in which the data was sourced
Year: year in which the data was sourced
Annual Working Hours per Worker: average working hours per worker over an entire year
GDP per Capita: GDP per person in the country, adjusted for differences in the cost of living between countries and for inflation
Continent: continent in which the data was sourced
Productivity: output per hour worked, measured as GDP per hour of work
Based on the size of the dataset (3,492 country-year observations) and the relatively strong correlations shown in the plots, this data is sufficient to make progress on the problem at hand.
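The strength of these correlations can be put into numbers rather than read off the plots. A minimal sketch, assuming Pearson's r (linear association) and Spearman's rho (monotonic association, computed here as Pearson's r on ranks) are acceptable summaries; the values in `df_demo` are hypothetical and only illustrate the calls:

```python
import pandas as pd

# Hypothetical sample rows; the real analysis would use df_all instead.
df_demo = pd.DataFrame({
    'Annual Working Hours per Worker': [1400, 1600, 1800, 2000, 2200, 2400],
    'GDP per Capita': [52000, 41000, 30000, 15000, 8000, 6000],
    'Productivity': [70.0, 48.0, 30.0, 12.0, 6.0, 4.0],
})

hours = df_demo['Annual Working Hours per Worker']
results = {}
for col in ['GDP per Capita', 'Productivity']:
    pearson = hours.corr(df_demo[col])                 # linear association
    spearman = hours.rank().corr(df_demo[col].rank())  # monotonic association
    results[col] = (pearson, spearman)
    print(f'{col}: Pearson r = {pearson:.2f}, Spearman rho = {spearman:.2f}')
```

Because the scatter plots are fitted with a cubic rather than a straight line, the rank-based measure may be the more informative of the two.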
This project focuses on the correlations between GDP, annual hours worked, and productivity across the world in order to find the best balance between time at work and leisure time to maximize productivity and therefore GDP. In future analyses, we will isolate the data by country and/or continent and find countries with the greatest GDP and lowest time spent at work. We will use this information to further investigate how these regions structure their workforce in order to apply their methods to other countries, specifically those that have very low GDP compared to the number of hours worked.
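As a starting point for that future analysis, countries could be ranked by how much GDP per capita they achieve relative to hours worked. The sketch below is one possible approach, not a settled method: the ratio `gdp_per_hour_worked` is an ad hoc metric introduced here for illustration (it is not the dataset's Productivity column), and the countries in `df_demo` are placeholders:

```python
import pandas as pd

# Placeholder data with the same column names as df_all.
df_demo = pd.DataFrame({
    'Country': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Year': [2018, 2019, 2018, 2019, 2018, 2019],
    'Annual Working Hours per Worker': [1400.0, 1390.0, 2100.0, 2120.0, 1800.0, 1790.0],
    'GDP per Capita': [50000.0, 51000.0, 7000.0, 7500.0, 30000.0, 30500.0],
})

# Keep each country's most recent observation.
latest = df_demo.sort_values('Year').groupby('Country').tail(1)

# Ad hoc ranking metric (an assumption): GDP per capita per annual hour worked.
latest = latest.assign(
    gdp_per_hour_worked=latest['GDP per Capita']
    / latest['Annual Working Hours per Worker']
)

# Highest ratio first: high GDP per capita achieved with relatively few hours.
ranked = latest.sort_values('gdp_per_hour_worked', ascending=False)
print(ranked[['Country', 'Year', 'gdp_per_hour_worked']])
```

Countries at the bottom of such a ranking would be the ones with low GDP relative to hours worked that the project aims to study further.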