The real-world problem that I believe data science could help with significantly is public food assistance in the US: specifically, whether there will be a future need for more food banks and other public food institutions. Homelessness and hunger are major problems in the US today. According to Feeding America, food insecurity exists in 100% of US counties, and it is said to have risen sharply because of the COVID-19 pandemic. Furthermore, in 2021, 53 million people used food banks and other programs. According to this source, the US government spends around $111 billion annually on the Supplemental Nutrition Assistance Program (SNAP) and other food assistance programs. Data science would be useful for tracking the trends behind this problem and for judging whether more money should be budgeted for food assistance and whether more food banks and community programs should be set up across the country.
import pandas as pd
df_food = pd.read_csv('SNAP_history_1969_2019.csv')
df_food.head(5)
| | Fiscal Year | Average Participation | Average Benefit Per Person | Total Benefits(M) | Other Costs | Total Costs(M) |
|---|---|---|---|---|---|---|
| 0 | 1969 | 2,878 | 6.63 | 228.80 | 21.70 | 250.50 |
| 1 | 1970 | 4,340 | 10.55 | 549.70 | 27.20 | 576.90 |
| 2 | 1971 | 9,368 | 13.55 | 1,522.70 | 53.20 | 1,575.90 |
| 3 | 1972 | 11,109 | 13.48 | 1,797.30 | 69.40 | 1,866.70 |
| 4 | 1973 | 12,166 | 14.60 | 2,131.40 | 76.00 | 2,207.40 |
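The thousands separators in the preview (for example `2,878`) suggest that some columns may have been read as strings rather than numbers, which would block any arithmetic on them. A minimal sketch of one way to convert them, assuming the column names match the preview; `df_sample` and `to_number` are hypothetical names, and the small frame stands in for `df_food`:

```python
import pandas as pd

# Small stand-in for df_food, with values copied from the preview above.
df_sample = pd.DataFrame({
    "Fiscal Year": [1969, 1970, 1971],
    "Average Participation": ["2,878", "4,340", "9,368"],
    "Total Costs(M)": ["250.50", "576.90", "1,575.90"],
})

def to_number(column: pd.Series) -> pd.Series:
    """Strip thousands separators and convert the column to a numeric dtype."""
    return pd.to_numeric(column.astype(str).str.replace(",", "", regex=False))

# Convert only the columns that hold comma-formatted text.
for name in ["Average Participation", "Total Costs(M)"]:
    df_sample[name] = to_number(df_sample[name])

print(df_sample.dtypes)
```

An alternative is to handle this at load time by passing `thousands=","` to `pd.read_csv`.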
data_dict = {
    "Fiscal Year": "the fiscal year in which these statistics were collected",
    "Average Participation": "the average number of people, in thousands, who made use of food assistance programs in the given fiscal year",
    "Average Benefit Per Person": "the average monthly benefit, in USD, that a person would receive",
    "Total Benefits": "the total benefits in a given year in millions of USD",
    "Other Costs": "the cost to run the food programs, not including the cost for food itself, in millions of USD",
    "Total Costs": "the total cost to run the food programs in millions of USD",
}
data_dict
{'Fiscal Year': 'the fiscal year in which these statistics were collected', 'Average Participation': 'the average number of people, in thousands, who made use of food assistance programs in the given fiscal year', 'Average Benefit Per Person': 'the average monthly benefit, in USD, that a person would receive', 'Total Benefits': 'the total benefits in a given year in millions of USD', 'Other Costs': 'the cost to run the food programs, not including the cost for food itself, in millions of USD', 'Total Costs': 'the total cost to run the food programs in millions of USD'}
This data set would allow us to see trends in the use of food assistance programs across the US, specifically SNAP and WIC. We could take the average participation and adjust the total benefits and costs for inflation by expressing every year's figures in the dollars of a single base year, and then build a graph showing the average costs and benefits per person per year. Based on this, we could also predict future values for each column and conclude whether more food programs or food banks are needed across the country.
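The inflation adjustment and prediction steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the price-index values are placeholders rather than an official series, `BASE_YEAR` is an arbitrary choice, the new column names are hypothetical, and a straight-line fit is just one simple stand-in for a forecasting method.

```python
import numpy as np
import pandas as pd

# First five rows from the preview, already converted to numbers.
df = pd.DataFrame({
    "Fiscal Year": [1969, 1970, 1971, 1972, 1973],
    "Average Participation": [2878, 4340, 9368, 11109, 12166],      # thousands of people
    "Total Costs(M)": [250.50, 576.90, 1575.90, 1866.70, 2207.40],  # millions of USD
})

# ASSUMPTION: placeholder price-index values for illustration only.
# Replace them with an official series before drawing any conclusions.
price_index = {1969: 36.7, 1970: 38.8, 1971: 40.5, 1972: 41.8, 1973: 44.4}
BASE_YEAR = 1973  # hypothetical base year; all costs are restated in its dollars

# Deflator = index in the base year divided by index in the row's own year.
deflator = price_index[BASE_YEAR] / df["Fiscal Year"].map(price_index)
df["Real Total Costs(M)"] = df["Total Costs(M)"] * deflator

# Millions of USD divided by thousands of people gives thousands of USD per
# person, so multiply by 1,000 to get USD per participant per year.
df["Real Cost Per Person"] = (
    df["Real Total Costs(M)"] / df["Average Participation"] * 1000
)

# Fit a straight line to participation and extrapolate one year ahead.
slope, intercept = np.polyfit(df["Fiscal Year"], df["Average Participation"], deg=1)
forecast_1974 = slope * 1974 + intercept

print(df[["Fiscal Year", "Real Total Costs(M)", "Real Cost Per Person"]])
print(f"Linear-trend participation estimate for 1974 (thousands): {forecast_1974:.0f}")
```

The resulting columns can be plotted with `df.plot(x="Fiscal Year", y="Real Cost Per Person")`. With only five rows, the fitted line says little about the future; running the same steps on the full 1969 to 2019 history would be the meaningful version.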