I want to evaluate food insecurity in the United States and how it correlates with diabetes and obesity. According to the USDA, food insecurity is the "limited or uncertain availability of nutritionally adequate and safe foods, or limited or uncertain ability to acquire acceptable foods in socially acceptable ways." Many Americans face food insecurity for reasons such as income, employment, and disability. You can read more about it here: https://www.feedingamerica.org/hunger-in-america.
#food insecurity data
import pandas as pd
df_food_insecurity = pd.read_csv('Feeding_America_Food_Insecurity_2018.csv')
df_food_insecurity.head()
| | OBJECTID_1 | FIPS | STATE_FIPS | CNTY_FIPS | NAME | STATE_NAME | POPULATION | State | County__State | Food_Insecurity_Rate_2018 | ... | Pct__FI_Btwn_Thresholds | Pct__FI___High_Threshold | ChildFoodInsecurityRate_2018 | Food_Insecure_Children_2018 | FoodInsecureChildrenHH_1_2018 | FoodInsecureChildrenHH_2_2018 | Cost_Per_Meal_2018 | W_AnnFoodBudgetShortfall_2018 | Shape__Area | Shape__Length |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 1001.0 | 1 | 1 | Autauga | Alabama | 56903 | AL | Autauga County, Alabama | 0.156 | ... | 0.132 | 0.405 | 0.214 | 2870.0 | 0.81 | 0.19 | 3.33 | 4857000.0 | 1.565568e+09 | 213533.767840 |
1 | 2 | 1003.0 | 1 | 3 | Baldwin | Alabama | 214651 | AL | Baldwin County, Alabama | 0.129 | ... | 0.187 | 0.442 | 0.169 | 7710.0 | 0.84 | 0.16 | 3.58 | 16274000.0 | 4.248941e+09 | 635766.429870 |
2 | 3 | 1005.0 | 1 | 5 | Barbour | Alabama | 26585 | AL | Barbour County, Alabama | 0.219 | ... | 0.108 | 0.241 | 0.320 | 1740.0 | 0.94 | 0.06 | 3.12 | 2988000.0 | 2.342322e+09 | 257811.107201 |
3 | 4 | 1007.0 | 1 | 7 | Bibb | Alabama | 23003 | AL | Bibb County, Alabama | 0.151 | ... | 0.212 | 0.319 | 0.209 | 970.0 | 1.00 | 0.00 | 2.94 | 1690000.0 | 1.621299e+09 | 191315.886193 |
4 | 5 | 1009.0 | 1 | 9 | Blount | Alabama | 57971 | AL | Blount County, Alabama | 0.136 | ... | 0.279 | 0.292 | 0.191 | 2580.0 | 1.00 | 0.00 | 3.14 | 4149000.0 | 1.684210e+09 | 238847.459010 |
5 rows × 26 columns
I'm contacting Feeding America to request 2020 data so that the year lines up with the diabetes (2020) and obesity (2021) data.
#obesity data
df_obesity = pd.read_csv('percentage-of-obese-us-adults-by-state-2021.csv')
df_obesity.head()
| | State | Obesity Percentage | Unnamed: 2 |
---|---|---|---|
0 | West Virginia | 40.6 | in % |
1 | Kentucky | 40.3 | in % |
2 | Alabama | 39.9 | in % |
3 | Oklahoma | 39.4 | in % |
4 | Mississippi | 39.1 | in % |
#diabetes data
df_diabetes = pd.read_csv('percentage-of-us-adults-with-diabetes-as-of-2020-by-state.csv')
df_diabetes.head()
| | State | Diabetes Percentage | Unnamed: 2 |
---|---|---|---|
0 | Mississippi | 17.2 | in % |
1 | West Virginia | 17.1 | in % |
2 | Alabama | 16.7 | in % |
3 | Louisiana | 16.0 | in % |
4 | Arkansas | 15.6 | in % |
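The food insecurity table is county-level, while the obesity and diabetes tables are state-level, so the three cannot be compared until they share a key. The sketch below shows one possible way to line them up. It assumes the column names shown in the outputs above, and the population-weighted mean is my own assumption about how to roll counties up to a state, not a Feeding America method. The helper names `state_food_insecurity` and `merge_state_tables` are hypothetical, and the inline rows are copied from the `head()` outputs above, so the Alabama figure covers only five counties.

```python
import pandas as pd

# County rows copied from the food insecurity head() output above (Alabama only).
counties = pd.DataFrame({
    "STATE_NAME": ["Alabama"] * 5,
    "NAME": ["Autauga", "Baldwin", "Barbour", "Bibb", "Blount"],
    "POPULATION": [56903, 214651, 26585, 23003, 57971],
    "Food_Insecurity_Rate_2018": [0.156, 0.129, 0.219, 0.151, 0.136],
})

# State rows copied from the obesity and diabetes head() outputs above.
obesity = pd.DataFrame({
    "State": ["West Virginia", "Kentucky", "Alabama", "Oklahoma", "Mississippi"],
    "Obesity Percentage": [40.6, 40.3, 39.9, 39.4, 39.1],
    "Unnamed: 2": ["in %"] * 5,
})
diabetes = pd.DataFrame({
    "State": ["Mississippi", "West Virginia", "Alabama", "Louisiana", "Arkansas"],
    "Diabetes Percentage": [17.2, 17.1, 16.7, 16.0, 15.6],
    "Unnamed: 2": ["in %"] * 5,
})


def state_food_insecurity(df):
    """Population-weighted mean of county rates per state (assumed aggregation)."""
    weighted = df["Food_Insecurity_Rate_2018"] * df["POPULATION"]
    totals = (
        df.assign(_weighted=weighted)
        .groupby("STATE_NAME")[["_weighted", "POPULATION"]]
        .sum()
    )
    rate = totals["_weighted"] / totals["POPULATION"]
    # Rename so the key column matches the "State" column of the other tables.
    return rate.rename("Food_Insecurity_Rate_2018").rename_axis("State").reset_index()


def merge_state_tables(fi, ob, di):
    """Drop the units-only column and inner-join the three tables on state name."""
    ob = ob.drop(columns="Unnamed: 2")
    di = di.drop(columns="Unnamed: 2")
    return fi.merge(ob, on="State", how="inner").merge(di, on="State", how="inner")


state_fi = state_food_insecurity(counties)
merged = merge_state_tables(state_fi, obesity, diabetes)
print(merged)
```

With the full CSV files, the same two functions could be called on `df_food_insecurity`, `df_obesity`, and `df_diabetes` directly. An inner join keeps only states present in all three tables, so it is worth checking how many rows survive the merge.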
Links for Data
I'm thinking of clustering states and regions by their food insecurity data and evaluating whether food insecurity is directly correlated with diabetes and obesity. I also want to see whether states with severe food insecurity are concentrated in certain regions of the US.
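The plan above could be sketched roughly as follows, assuming a state-level table that already has one row per state and the three measures as columns. The values and the state labels ("State A" through "State F") are made-up placeholders, not real data, and the function names are hypothetical. The clustering step is a small hand-written k-means in NumPy for illustration only; a library implementation such as scikit-learn's `KMeans` would be the more usual choice.

```python
import numpy as np
import pandas as pd

# Placeholder values for illustration only; these are NOT real state figures.
state_df = pd.DataFrame({
    "State": ["State A", "State B", "State C", "State D", "State E", "State F"],
    "Food_Insecurity_Rate_2018": [0.10, 0.11, 0.12, 0.17, 0.18, 0.19],
    "Obesity Percentage": [28.0, 29.5, 30.0, 38.5, 39.0, 40.5],
    "Diabetes Percentage": [9.0, 9.5, 10.5, 15.0, 16.0, 17.0],
})
features = ["Food_Insecurity_Rate_2018", "Obesity Percentage", "Diabetes Percentage"]

# Pairwise Pearson correlations between the three measures.
corr = state_df[features].corr()


def standardize(df):
    """Rescale each column to mean 0 and standard deviation 1."""
    return (df - df.mean()) / df.std(ddof=0)


def kmeans(points, k, n_iter=50, seed=0):
    """Minimal Lloyd's k-means; returns one cluster label per row of `points`."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Distance from every point to every center, then nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Standardize first so the percentage columns do not dominate the rate column.
points = standardize(state_df[features]).to_numpy()
state_df["cluster"] = kmeans(points, k=2)

print(corr.round(2))
print(state_df[["State", "cluster"]])
```

To check the regional question, a region column (for example, a state-to-Census-region lookup, which is not part of the data loaded here) could be added and compared against `cluster` with `pd.crosstab`. A correlation across states describes an association only; it does not by itself show that food insecurity causes diabetes or obesity.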