The High Speed Rail!!

This project looks at the potential viability of a hypothetical cross-country high speed rail system. The proposed rail would take a passenger from Los Angeles to New York in under 10 hours and could connect most major US cities, forming a sustainable, fast method of transportation for a wide variety of people. It could also ease the strain on air travel, which pollutes the air and often has major delays.

We will use the following map of the hypothetical rail. This one was made by First Cultural at UC Berkeley; since the plans aren't official, there isn't an official map to draw from, and this is just one idea that connects many of the major cities.

To investigate this topic further, check out Vox's article about it and its new popularity, or this paper from the Congressional Research Service about its issues and recent events.

First we look at US air travel to identify how many trips could be substituted by hypothetical trains. We will look at flights that run between cities on the hypothetical rail and their passenger counts, and use analysis of the rail system in Europe to predict how many of those plane passengers could be rail passengers.

You can find the following data set on Kaggle. We will be using Destination_city, Origin_city, and Passengers.

In [9]:

import pandas as pd

df_usa = pd.read_csv("Airports2.csv")
df_usa.head(6)

Out[9]:

	Origin_airport	Destination_airport	Origin_city	Destination_city	Passengers	Seats	Flights	Distance	Fly_date	Origin_population	Destination_population	Org_airport_lat	Org_airport_long	Dest_airport_lat	Dest_airport_long
0	MHK	AMW	Manhattan, KS	Ames, IA	21	30	1	254	2008-10-01	122049	86219	39.140999	-96.670799	NaN	NaN
1	EUG	RDM	Eugene, OR	Bend, OR	41	396	22	103	1990-11-01	284093	76034	44.124599	-123.211998	44.254101	-121.150002
2	EUG	RDM	Eugene, OR	Bend, OR	88	342	19	103	1990-12-01	284093	76034	44.124599	-123.211998	44.254101	-121.150002
3	EUG	RDM	Eugene, OR	Bend, OR	11	72	4	103	1990-10-01	284093	76034	44.124599	-123.211998	44.254101	-121.150002
4	MFR	RDM	Medford, OR	Bend, OR	0	18	1	156	1990-02-01	147300	76034	42.374199	-122.873001	44.254101	-121.150002
5	MFR	RDM	Medford, OR	Bend, OR	11	18	1	156	1990-03-01	147300	76034	42.374199	-122.873001	44.254101	-121.150002
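The filtering step described above (keeping only flights whose two endpoints are both on the rail, then adding up their passengers) could be sketched as follows. This is a minimal illustration rather than the project's final method: `RAIL_CITIES` is a placeholder set (the real list would have to be read off the rail map), `rail_corridor_passengers` is a hypothetical helper name, and the small `sample` frame reuses a few rows from the preview in place of the full `Airports2.csv`.

```python
import pandas as pd

# Placeholder set of rail cities, chosen only because they appear in the
# preview rows; the real list would come from the rail map.
RAIL_CITIES = {"Eugene, OR", "Bend, OR"}


def rail_corridor_passengers(flights: pd.DataFrame, rail_cities: set) -> pd.DataFrame:
    """Sum passengers per (origin, destination) pair with both cities on the rail."""
    on_rail = (
        flights["Origin_city"].isin(rail_cities)
        & flights["Destination_city"].isin(rail_cities)
    )
    return (
        flights.loc[on_rail]
        .groupby(["Origin_city", "Destination_city"], as_index=False)["Passengers"]
        .sum()
    )


# A few rows copied from the preview, standing in for the full data set.
sample = pd.DataFrame({
    "Origin_city": ["Manhattan, KS", "Eugene, OR", "Eugene, OR", "Eugene, OR", "Medford, OR"],
    "Destination_city": ["Ames, IA", "Bend, OR", "Bend, OR", "Bend, OR", "Bend, OR"],
    "Passengers": [21, 41, 88, 11, 0],
})

corridor_totals = rail_corridor_passengers(sample, RAIL_CITIES)
print(corridor_totals)
```

On the full data the same call would be `rail_corridor_passengers(df_usa, RAIL_CITIES)`. Note that this sketch treats A to B and B to A as separate pairs; whether to combine them is a modelling choice left open here.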

The machine learning aspect I propose is to look at transportation data in Europe and predict how it could translate to the US. Europe has a great rail system, and while China's is bigger, I believe Europe is a better model for the US geographically and culturally. While I don't have a super solid understanding of ML yet, I think we can take the following data on train and plane use, along with the corresponding population of each country, to get a picture of how Europeans travel, and then develop a model of what such travel would look like in the US if we had a high speed rail.

The following dataset covers European plane travel. We will use country and passengers; measure specifies what type of travel each row counts. In our case we will only use PAS_BRD, which is commercial passengers. You can also find it on Kaggle.

In [10]:

df_eur_planes = pd.read_csv("Passengers_Year_Transit.csv")
df_eur_planes.head(6)

Out[10]:

	country	measure	year	passengers
0	AUT	PAS_BRD	2021	11187400.0
1	BEL	PAS_BRD	2021	13516263.0
2	BGR	PAS_BRD	2021	5146280.0
3	CHE	PAS_BRD	2021	19293409.0
4	CYP	PAS_BRD	2021	4993689.0
5	CZE	PAS_BRD	2021	4796559.0
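Since only the PAS_BRD rows are needed, a filter along these lines could come next. It is only a sketch: the helper name `boarded_passengers` is made up for this example, and the `OTHER` measure code and its passenger value in the sample are placeholders that show a row being dropped, not real entries from the dataset.

```python
import pandas as pd


def boarded_passengers(planes: pd.DataFrame) -> pd.DataFrame:
    """Keep only PAS_BRD rows and drop the now-constant measure column."""
    boarded = planes[planes["measure"] == "PAS_BRD"]
    return boarded.drop(columns="measure").reset_index(drop=True)


# The first two rows come from the preview; the third is a placeholder row.
sample = pd.DataFrame({
    "country": ["AUT", "BEL", "AUT"],
    "measure": ["PAS_BRD", "PAS_BRD", "OTHER"],
    "year": [2021, 2021, 2021],
    "passengers": [11187400.0, 13516263.0, 1.0],
})

boarded = boarded_passengers(sample)
print(boarded)
```

On the full file this would be called as `boarded_passengers(df_eur_planes)`.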

There is also data about train passengers for every country in the world (it will be cleaned down to just Europe), but it is too big for read_csv. The relevant columns are Country Name and the passengers served each year since 1961. You can find it on The World Bank's website.

This is population data for countries in Europe. We will be using country_name and population. It can also be found on Kaggle.
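Returning to the rail passenger file for a moment: if the obstacle is memory rather than the file format, one option worth trying is to read only the needed columns, in chunks, using the `usecols` and `chunksize` parameters of `pd.read_csv`. The sketch below runs on a tiny in-memory stand-in. The column layout, the `Notes` column, and every number in `fake_csv` are made up for illustration, and `load_selected_columns` is a hypothetical helper name. The real file may need other options as well.

```python
import io

import pandas as pd


def load_selected_columns(source, columns, chunksize=1000):
    """Read a CSV chunk by chunk, keeping only `columns`, and combine the result."""
    reader = pd.read_csv(source, usecols=columns, chunksize=chunksize)
    return pd.concat([chunk for chunk in reader], ignore_index=True)


# Tiny stand-in for the rail file; layout and values are placeholders.
fake_csv = io.StringIO(
    "Country Name,Notes,1961,1962\n"
    "Austria,x,10,11\n"
    "Belgium,y,20,21\n"
    "Brazil,z,30,31\n"
)

rail = load_selected_columns(fake_csv, ["Country Name", "1961", "1962"], chunksize=2)

# "Cleaned to just Europe": keep countries that appear in a list of European names.
european_names = {"Austria", "Belgium"}  # assumption: would come from the population table
rail_europe = rail[rail["Country Name"].isin(european_names)].reset_index(drop=True)
print(rail_europe)
```

With a real file, `source` would be the path to the downloaded CSV instead of the `io.StringIO` object.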

In [12]:

# European population data; we will use the country_name and population columns
df_pops = pd.read_csv("europe populations.csv")
df_pops.head(6)

Out[12]:

	Unnamed: 0	country_name	Continent	region	local_name	capital	area	population	population_per_sq_km	male_life_expectancy	female_life_expectancy	birth_rate	death_rate
0	0	Austria	Europe	Western Europe	Österreich	Vienna	83,879 km²	8,917,000	106.3	78.9	83.6	9.4	10.3
1	1	Belgium	Europe	Western Europe	België / Belgique	Brussels	30,530 km²	11,544,000	378.1	78.6	83.1	9.9	11.0
2	2	France	Europe	Western Europe	France	Paris	549,087 km²	67,380,000	122.7	79.2	85.3	10.9	9.9
3	3	Germany	Europe	Western Europe	Deutschland	Berlin	357,580 km²	83,161,000	232.6	78.6	83.4	9.3	11.9
4	4	Liechtenstein	Europe	Western Europe	Liechtenstein	Vaduz	161 km²	38,137	237.6	80.1	83.6	9.1	8.2
5	5	Luxembourg	Europe	Western Europe	Luxembourg/Lëtzebuerg	Luxembourg	2,590 km²	630,419	243.4	79.4	84.2	10.2	7.3
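Two things in this preview matter for the later modelling. The population values contain thousands separators, so they are stored as text, and the countries are named in full while the plane data uses three-letter codes. A possible way to handle both is sketched below. The helper name `clean_population` and the small `ISO3_TO_NAME` mapping are assumptions for this example (the codes themselves are standard ISO 3166-1 alpha-3), the sample rows are copied from the two previews, and the population figures may not refer to the same year as the 2021 plane counts.

```python
import pandas as pd


def clean_population(pops: pd.DataFrame) -> pd.DataFrame:
    """Keep country_name and population, converting '8,917,000' style text to int."""
    out = pops[["country_name", "population"]].copy()
    out["population"] = out["population"].str.replace(",", "", regex=False).astype(int)
    return out


# Rows copied from the population preview.
pops_sample = pd.DataFrame({
    "country_name": ["Austria", "Belgium", "Liechtenstein"],
    "population": ["8,917,000", "11,544,000", "38,137"],
})
pops = clean_population(pops_sample)

# Rows copied from the plane preview (PAS_BRD, 2021).
planes = pd.DataFrame({
    "country": ["AUT", "BEL"],
    "passengers": [11187400.0, 13516263.0],
})

# Illustrative mapping; a full version would cover every country in the data.
ISO3_TO_NAME = {"AUT": "Austria", "BEL": "Belgium"}
planes["country_name"] = planes["country"].map(ISO3_TO_NAME)

# Join on the full name, then compute boarded air passengers per resident.
merged = planes.merge(pops, on="country_name", how="inner")
merged["air_trips_per_person"] = merged["passengers"] / merged["population"]
print(merged[["country_name", "air_trips_per_person"]])
```

An alternative to the string cleanup is passing `thousands=","` to `pd.read_csv` when loading the file, which parses such columns as numbers directly.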