Car Accidents and Laws

Real-World Problem:

According to the article How Many People Die From Car Accidents Each Year, car accidents cause about 1.3 million deaths worldwide each year. The article also states that in the United States, 32 people die per day in drunk-driving crashes. In my home country, the public is so used to these events that even many political figures have been among the perpetrators.

I have been wondering why car accidents still happen so regularly even though there are so many related laws. Which kind of policy addresses the problem best? It seems these answers could be found by comparing accidents with the policies adopted in different regions, and I hope to find a combination of policies that minimizes the likelihood of car accidents.

In [1]:
import pandas as pd
In [2]:
df_traffic_death = pd.read_csv('traffic_death.csv')

traffic_death_feature = {"Entity": "Country", "Year": "year, from 1990 to 2019",
                         "Deaths": "number of traffic deaths"}

df_traffic_death.head(5)
Out[2]:
Entity Year Deaths
0 Afghanistan 1990 4154
1 Afghanistan 1991 4472
2 Afghanistan 1992 5106
3 Afghanistan 1993 5681
4 Afghanistan 1994 6001
In [3]:
df_death_type = pd.read_csv('death_type.csv')

death_type_feature = {"Country": "Country",
                      "Drivers/passengers of 4-wheeled vehicles": "% of road traffic deaths by type",
                      "Drivers/passengers of motorized 2- or 3-wheelers": "% of road traffic deaths by type",
                      "Cyclists": "% of road traffic deaths by type",
                      "Pedestrians": "% of road traffic deaths by type",
                      "Other/unspecified road users": "% of road traffic deaths by type"}

df_death_type.head(5)
Out[3]:
Country Drivers/passengers of 4-wheeled vehicles Drivers/passengers of motorized 2- or 3-wheelers Cyclists Pedestrians Other/unspecified road users
0 Albania 39.4 11.9 7.8 38.7 2.2
1 Andorra NaN 50.0 NaN 50.0 NaN
2 Angola 59.5 NaN NaN 40.5 0.0
3 Antigua and Barbuda 62.5 0.0 12.5 25.0 0.0
4 Argentina 47.2 22.2 2.4 8.2 20.0
In [4]:
df_drunk_rules = pd.read_csv('drunk_rule.csv')

drunk_rules_feature = {"Country": "Country",
                       "Definition of drink-driving by BAC": "whether or not the country has a clear BAC threshold to identify drink-driving",
                       "Existence of a national drink-driving law": "whether or not the country has a law about drink-driving",
                       "Attribution of road traffic deaths to alcohol (%)": "% of traffic deaths caused by alcohol"}

df_drunk_rules.head(5)
Out[4]:
Country Definition of drink-driving by BAC Existence of a national drink-driving law Attribution of road traffic deaths to alcohol (%)
0 Afghanistan No Yes –
1 Albania Yes Yes 5.2
2 Angola Yes Yes –
3 Antigua and Barbuda No Yes 0.9
4 Argentina Yes Yes 17
In [5]:
df_bac_limit = pd.read_csv('bac_limit.csv')

bac_limit_feature = {"Country": "Country",
                     "BAC limit for general population": "the blood alcohol concentration standard for most people",
                     "BAC limit for novice drivers": "the blood alcohol concentration standard for young or new drivers"}

df_bac_limit.head(5)
Out[5]:
Country BAC limit for general population BAC limit for novice drivers
0 Afghanistan - -
1 Albania <=0.05 g/dl <=0.05 g/dl
2 Angola <= 0.06 g/dl <= 0.06 g/dl
3 Antigua and Barbuda - -
4 Argentina <=0.05 g/dl <=0.05 g/dl
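The BAC limits above are stored as text (for example `<=0.05 g/dl`), with `-` where no limit is given, so they need to be turned into numbers before any comparison. A minimal sketch of one way to do that follows; the helper name `parse_bac_limit` is hypothetical and not part of the original notebook, and it assumes every limit is expressed in g/dl as in the rows shown.

```python
import re

def parse_bac_limit(text):
    """Return the numeric BAC limit in g/dl, or None when no limit is given.

    Hypothetical helper: assumes values look like '<=0.05 g/dl' or '-'.
    """
    match = re.search(r"\d+(?:\.\d+)?", str(text))
    return float(match.group()) if match else None

# Sample strings copied from the table output above.
for raw in ["<=0.05 g/dl", "<= 0.06 g/dl", "-"]:
    print(repr(raw), "->", parse_bac_limit(raw))
```

In the notebook this could be applied to a column with something like `df_bac_limit["BAC limit for general population"].map(parse_bac_limit)`.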
In [6]:
df_speed_limit = pd.read_csv('speed_limit.csv')

speed_limit_feature = {"Country": "Country",
                       "Maximum speed limits (Urban)": "maximum speed limit in urban areas",
                       "Maximum speed limits (Rural)": "maximum speed limit in rural areas"}

df_speed_limit.head(5)
Out[6]:
Country Maximum speed limits (Urban) Maximum speed limits (Rural)
0 Afghanistan 90 90
1 Albania 40 80
2 Angola 60 90
3 Antigua and Barbuda 32 64
4 Argentina 60 110

I think these datasets could help me address the problem by comparing the deaths in each country and seeing whether there is a relationship between these rules and the deaths. We can also see how policies on maximum speed limits and drink-driving affect the number of each type of traffic death.
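To compare deaths against rules, the tables first have to be joined on the country name, which is called `Entity` in the deaths data and `Country` elsewhere. The sketch below shows one possible way to do this on toy stand-ins for `df_traffic_death` and `df_speed_limit`. The Albania death counts are made-up placeholders, and keeping only each country's latest year is an assumption, not a choice made in the original notebook.

```python
import pandas as pd

# Toy stand-ins for df_traffic_death and df_speed_limit. The Albania death
# counts are made-up placeholders, not values from the real CSV files.
deaths = pd.DataFrame({
    "Entity": ["Afghanistan", "Afghanistan", "Albania", "Albania"],
    "Year": [1993, 1994, 1993, 1994],
    "Deaths": [5681, 6001, 300, 310],
})
speed = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Angola"],
    "Maximum speed limits (Urban)": [90, 40, 60],
    "Maximum speed limits (Rural)": [90, 80, 90],
})

# Keep each country's most recent year so there is one row per country,
# then rename Entity so both tables share the same key column.
latest = (
    deaths.sort_values("Year")
    .groupby("Entity", as_index=False)
    .last()
    .rename(columns={"Entity": "Country"})
)

# Inner join: countries missing from either table (Angola here) are dropped.
merged = latest.merge(speed, on="Country", how="inner")
print(merged)
```

An inner join silently drops countries whose names are spelled differently in the two files, so checking the row count after the merge is worthwhile.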

We can build models over combinations of different maximum speed limits and different blood alcohol concentration limits, predict the resulting deaths, and finally determine whether there is a best combination that minimizes traffic deaths.
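As a rough illustration of that idea, the sketch below fits an ordinary least squares model on a handful of made-up rows and then scores a grid of candidate policy combinations. Everything here is an assumption for illustration: the numbers are invented, the target is taken to be deaths per 100,000 people, and a linear model is only one possible choice.

```python
import itertools
import numpy as np

# Made-up rows: urban speed limit (km/h), BAC limit (g/dl), deaths per 100k.
# These are illustrative values, not taken from the datasets above.
data = np.array([
    [40, 0.05, 8.0],
    [50, 0.05, 9.5],
    [60, 0.08, 14.0],
    [90, 0.08, 20.0],
    [50, 0.02, 7.0],
    [60, 0.05, 11.0],
])

# Design matrix with an intercept column plus the two policy features.
X = np.column_stack([np.ones(len(data)), data[:, 0], data[:, 1]])
y = data[:, 2]

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(speed, bac):
    """Predicted deaths per 100k for one (speed limit, BAC limit) pair."""
    return coef[0] + coef[1] * speed + coef[2] * bac

# Score every candidate combination and keep the lowest prediction.
candidates = list(itertools.product([30, 40, 50, 60], [0.0, 0.02, 0.05, 0.08]))
best = min(candidates, key=lambda c: predict(*c))
print("best combination:", best, "predicted:", predict(*best))
```

With a purely linear model the minimum always falls on an edge of the candidate grid, so a model that can capture interactions or diminishing returns would likely be needed before reading much into the "best" combination.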