Healthcare Cost

Part 1: The Real World Problem

Healthcare insurance charges have increased dramatically, especially since COVID-19. A few known factors influence how much insurance costs, such as whether an individual receives insurance through their employer or relies on government healthcare programs such as Medicare. But are other individual factors to blame for higher insurance prices? Things we may have little control over, such as age, sex, and BMI, may be the reason you pay more for healthcare insurance in the future than your fellow classmates.

There is existing evidence that such factors can influence insurance charges. It makes sense that medical charges increase with age, as the severity of many medical conditions worsens with time and requires more intensive care. Additionally, a higher BMI is associated with obesity, which can lead to more health complications and higher medical bills. However, exactly how influential are these factors on insurance prices, and is their influence reasonable?

Part 2: Loading the Dataset

In [5]:
import pandas as pd

# Load the insurance dataset from the CSV file into a DataFrame
df_insurance = pd.read_csv('insurance.csv')
df_insurance
Out[5]:
      index  age     sex     bmi  children  smoker     region      charges
0         0   19  female  27.900         0     yes  southwest  16884.92400
1         1   18    male  33.770         1      no  southeast   1725.55230
2         2   28    male  33.000         3      no  southeast   4449.46200
3         3   33    male  22.705         0      no  northwest  21984.47061
4         4   32    male  28.880         0      no  northwest   3866.85520
...     ...  ...     ...     ...       ...     ...        ...          ...
1333   1333   50    male  30.970         3      no  northwest  10600.54830
1334   1334   18  female  31.920         0      no  northeast   2205.98080
1335   1335   18  female  36.850         0      no  southeast   1629.83350
1336   1336   21  female  25.800         0      no  southwest   2007.94500
1337   1337   61  female  29.070         0     yes  northwest  29141.36030

1338 rows × 8 columns

In [10]:
# Data dictionary: a short description of each column in the dataset
insurance_dict = {
    'age': 'age of patient',
    'sex': 'gender of patient',
    'bmi': 'body mass index of patient',
    'children': 'number of children patient has',
    'smoker': 'indicates whether patient is a smoker',
    'region': 'where in the United States a patient lives',
    'charges': 'amount of money insurance is charging patient for their services',
}
In [7]:
insurance_dict
Out[7]:
{'age': 'age of patient',
 'sex': 'gender of patient',
 'bmi': 'body mass index of patient',
 'children': 'number of children patient has',
 'smoker': 'indicates whether patient is a smoker',
 'region': 'where in the United States a patient lives',
 'charges': 'amount of money insurance is charging patient for their services'}

Part 3: Explanation of Project Methods

To measure the influence of individual factors on healthcare insurance prices, we will group the data by each factor and analyze the charges associated with each group. Doing so lets us identify which individual factors are associated with the largest differences in charges and then compare these trends across all factors.
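
As a rough illustration of the grouping step, the sketch below summarizes charges by one categorical factor (smoker) and one continuous factor (bmi). It is only a sketch under stated assumptions, not the project's final analysis: the helper name summarize_charges is hypothetical, the ten rows are the ones visible in the DataFrame preview from Part 2 (the real analysis would pass df_insurance instead), and the BMI cut points of 18.5, 25, and 30 are an assumed binning based on the conventional adult BMI categories.

```python
import pandas as pd

# Ten rows copied from the DataFrame preview in Part 2, used here only so the
# example runs on its own. The real analysis would use df_insurance.
sample = pd.DataFrame({
    "age": [19, 18, 28, 33, 32, 50, 18, 18, 21, 61],
    "sex": ["female", "male", "male", "male", "male",
            "male", "female", "female", "female", "female"],
    "bmi": [27.900, 33.770, 33.000, 22.705, 28.880,
            30.970, 31.920, 36.850, 25.800, 29.070],
    "children": [0, 1, 3, 0, 0, 3, 0, 0, 0, 0],
    "smoker": ["yes", "no", "no", "no", "no", "no", "no", "no", "no", "yes"],
    "region": ["southwest", "southeast", "southeast", "northwest", "northwest",
               "northwest", "northeast", "southeast", "southwest", "northwest"],
    "charges": [16884.92400, 1725.55230, 4449.46200, 21984.47061, 3866.85520,
                10600.54830, 2205.98080, 1629.83350, 2007.94500, 29141.36030],
})


def summarize_charges(df, factor):
    """Hypothetical helper: count, mean, and median of charges per group of `factor`."""
    return (
        df.groupby(factor, observed=True)["charges"]
        .agg(["count", "mean", "median"])
        .sort_values("mean", ascending=False)
    )


# Categorical factors such as smoker, sex, or region can be grouped directly.
by_smoker = summarize_charges(sample, "smoker")

# Continuous factors such as bmi or age need to be binned first.
# These cut points are an assumption, not something the dataset specifies.
bmi_bins = [0, 18.5, 25, 30, float("inf")]
bmi_labels = ["underweight", "normal", "overweight", "obese"]
binned = sample.assign(
    bmi_group=pd.cut(sample["bmi"], bins=bmi_bins, labels=bmi_labels, right=False)
)
by_bmi = summarize_charges(binned, "bmi_group")

print(by_smoker)
print(by_bmi)
```

Sorting each summary by mean charge makes it easy to see which group within a factor pays the most. Reporting the median alongside the mean is a deliberate choice, since a few very large charges can pull the mean upward. Ten rows are far too few to draw conclusions from; the same calls on the full 1338-row dataset would be the actual analysis.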

In the grand scheme of things, one could use the findings of this project to help create a predictor of healthcare prices based on the factors presented. This could be extremely useful for both patients and doctors when deciding on the best course of action for someone in need of medical care.
