The High Speed Rail!!

This project looks at the potential viability of a hypothetical cross-country high-speed rail system. The proposed rail would take a passenger from Los Angeles to New York in under 10 hours and could connect most major US cities, forming a sustainable, fast method of transportation for a wide variety of people. It could also ease the strain on air travel, which pollutes the air and often has major delays.

We will use the following map of the hypothetical rail. This one was made by First Cultural at UC Berkeley; because the plans aren't official, there is no official map to draw from, so this is just an idea that connects many of the major cities.

[picture: map of the hypothetical rail]

To investigate this topic more, check out Vox's article about it and its new popularity, or this paper from the Congressional Research Service about its issues and recent events.

First, we look at US air travel to identify how many trips could be substituted by hypothetical trains. We will look at flights that run between cities on the hypothetical rail, along with their passenger counts, and use an analysis of the rail system in Europe to predict how many of those plane passengers could become rail passengers.

The following data set is available on Kaggle. We will be using the Destination_city, Origin_city, and Passengers columns.

In [9]:
import pandas as pd

df_usa = pd.read_csv("Airports2.csv")
df_usa.head(6)
Out[9]:
   Origin_airport  Destination_airport  Origin_city    Destination_city  Passengers  Seats  Flights  Distance  Fly_date    Origin_population  Destination_population  Org_airport_lat  Org_airport_long  Dest_airport_lat  Dest_airport_long
0  MHK             AMW                  Manhattan, KS  Ames, IA          21          30     1        254       2008-10-01  122049             86219                   39.140999        -96.670799        NaN               NaN
1  EUG             RDM                  Eugene, OR     Bend, OR          41          396    22       103       1990-11-01  284093             76034                   44.124599        -123.211998       44.254101         -121.150002
2  EUG             RDM                  Eugene, OR     Bend, OR          88          342    19       103       1990-12-01  284093             76034                   44.124599        -123.211998       44.254101         -121.150002
3  EUG             RDM                  Eugene, OR     Bend, OR          11          72     4        103       1990-10-01  284093             76034                   44.124599        -123.211998       44.254101         -121.150002
4  MFR             RDM                  Medford, OR    Bend, OR          0           18     1        156       1990-02-01  147300             76034                   42.374199        -122.873001       44.254101         -121.150002
5  MFR             RDM                  Medford, OR    Bend, OR          11          18     1        156       1990-03-01  147300             76034                   42.374199        -122.873001       44.254101         -121.150002
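
The filtering step described earlier (keeping only flights between cities on the hypothetical rail and totaling their passengers) could look roughly like the sketch below. The RAIL_CITIES set, the rail_substitutable helper, and the toy rows are all assumptions made for illustration: the real city list would come from the map, the city spellings would need to match the dataset, and the passenger counts here are made up.

```python
import pandas as pd

# Hypothetical subset of cities on the proposed rail map (an assumption,
# not an official list; spellings must match the dataset's "City, ST" format).
RAIL_CITIES = {"Los Angeles, CA", "Chicago, IL", "New York, NY"}

# Toy rows in the same shape as Airports2.csv; passenger counts are made up.
flights = pd.DataFrame(
    {
        "Origin_city": ["Los Angeles, CA", "Chicago, IL", "Eugene, OR", "New York, NY"],
        "Destination_city": ["New York, NY", "New York, NY", "Bend, OR", "Chicago, IL"],
        "Passengers": [120, 80, 41, 60],
    }
)


def rail_substitutable(df, cities):
    """Keep flights whose origin and destination are both rail cities,
    then total the passengers for each directed city pair."""
    on_rail = df["Origin_city"].isin(cities) & df["Destination_city"].isin(cities)
    return (
        df.loc[on_rail]
        .groupby(["Origin_city", "Destination_city"], as_index=False)["Passengers"]
        .sum()
    )


route_totals = rail_substitutable(flights, RAIL_CITIES)
print(route_totals)
```

On the real data the same function would be called as rail_substitutable(df_usa, RAIL_CITIES). Directed pairs are kept separate here; whether to combine both directions of a route is a design choice left open.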

The machine learning aspect I propose is to look at transportation data in Europe to predict how it could translate to the US. Europe has a great rail system, and while China's is bigger, I believe Europe is a better model for the US geographically and culturally. While I don't have a super solid understanding of ML yet, I think we can take the following data on train and plane use, together with the corresponding population of each country, to get a picture of how Europeans travel, and then develop a model of what such travel would look like in the US if we had a high-speed rail.

The following dataset covers European plane travel. We use the country and passengers columns; the measure column specifies the type of travel. In our case we will only be using PAS_BRD, which is commercial passengers. It can also be found on Kaggle.

In [10]:
df_eur_planes = pd.read_csv("Passengers_Year_Transit.csv")
df_eur_planes.head(6)
Out[10]:
   country  measure  year  passengers
0  AUT      PAS_BRD  2021  11187400.0
1  BEL      PAS_BRD  2021  13516263.0
2  BGR      PAS_BRD  2021  5146280.0
3  CHE      PAS_BRD  2021  19293409.0
4  CYP      PAS_BRD  2021  4993689.0
5  CZE      PAS_BRD  2021  4796559.0
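
Since only the PAS_BRD rows are needed, a minimal sketch of that filter is shown below. The first three rows are copied from the output above; the fourth row and its "OTHER" measure code are made up purely to show what the filter drops.

```python
import pandas as pd

# Three rows copied from the output above, plus one made-up row with a
# placeholder measure code ("OTHER") to show what the filter removes.
df = pd.DataFrame(
    {
        "country": ["AUT", "BEL", "BGR", "AUT"],
        "measure": ["PAS_BRD", "PAS_BRD", "PAS_BRD", "OTHER"],
        "year": [2021, 2021, 2021, 2021],
        "passengers": [11187400.0, 13516263.0, 5146280.0, 1.0],
    }
)

# Keep only commercial passengers, then total them per country and year.
boarded = df[df["measure"] == "PAS_BRD"]
air_by_country = boarded.groupby(["country", "year"])["passengers"].sum()
print(air_by_country)
```

On the real data, df would be replaced by df_eur_planes.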

There is also data on train passengers for every country in the world (it will be cleaned down to just Europe), but the file is too big for read_csv. The relevant columns are Country Name and the passengers served in each year since 1961. You can find it on The World Bank's website.

This is population data for countries in Europe; we will be using country_name and population. It can also be found on Kaggle.
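
For the train passenger file mentioned above, one possible workaround, assuming the problem really is memory, is to read it in chunks and keep only the needed rows and columns. This is a sketch, not a tested fix for that file: the wide layout (one row per country, one column per year), the EUROPE set, the read_filtered helper, and the placeholder values are all assumptions, and a tiny in-memory CSV stands in for the real download.

```python
import io

import pandas as pd

# Stand-in for the large file: a tiny CSV with an assumed wide layout
# (one row per country, one column per year). Values are placeholders.
csv_text = """Country Name,1961,1962,2021
Austria,1,2,3
Belgium,4,5,6
Brazil,7,8,9
"""

EUROPE = {"Austria", "Belgium"}  # hypothetical short list for the demo


def read_filtered(source, keep_countries, usecols=None, chunksize=2):
    """Read a CSV in chunks and keep only rows for the wanted countries,
    so the whole file never has to sit in memory at once."""
    kept = []
    for chunk in pd.read_csv(source, usecols=usecols, chunksize=chunksize):
        kept.append(chunk[chunk["Country Name"].isin(keep_countries)])
    return pd.concat(kept, ignore_index=True)


df_rail = read_filtered(
    io.StringIO(csv_text), EUROPE, usecols=["Country Name", "2021"]
)
print(df_rail)
```

With the real file, a path would be passed in place of the StringIO object and chunksize would be much larger.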

In [12]:
df_pops = pd.read_csv("europe populations.csv")
df_pops.head(6)
# European population data, will be using the country_name and population columns
Out[12]:
   Unnamed: 0  country_name   Continent  region          local_name             capital     area         population  population_per_sq_km  male_life_expectancy  female_life_expectancy  birth_rate  death_rate
0  0           Austria        Europe     Western Europe  Österreich             Vienna      83,879 km²   8,917,000   106.3                 78.9                  83.6                    9.4         10.3
1  1           Belgium        Europe     Western Europe  België / Belgique      Brussels    30,530 km²   11,544,000  378.1                 78.6                  83.1                    9.9         11.0
2  2           France         Europe     Western Europe  France                 Paris       549,087 km²  67,380,000  122.7                 79.2                  85.3                    10.9        9.9
3  3           Germany        Europe     Western Europe  Deutschland            Berlin      357,580 km²  83,161,000  232.6                 78.6                  83.4                    9.3         11.9
4  4           Liechtenstein  Europe     Western Europe  Liechtenstein          Vaduz       161 km²      38,137      237.6                 80.1                  83.6                    9.1         8.2
5  5           Luxembourg     Europe     Western Europe  Luxembourg/Lëtzebuerg  Luxembourg  2,590 km²    630,419     243.4                 79.4                  84.2                    10.2        7.3
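
Two things stand in the way of combining this table with the plane data: population is stored as text with thousands separators, and countries are identified by name here but by three-letter code in the plane data. The sketch below shows one way to handle both, using rows copied from the outputs above. The NAME_TO_CODE lookup and the air_passengers_per_capita column are my own additions for illustration, and the population figures may not be from the same year as the 2021 passenger counts.

```python
import pandas as pd

# Rows copied from the two outputs above.
df_pops = pd.DataFrame(
    {
        "country_name": ["Austria", "Belgium"],
        "population": ["8,917,000", "11,544,000"],
    }
)
df_air = pd.DataFrame(
    {
        "country": ["AUT", "BEL"],
        "passengers": [11187400.0, 13516263.0],
    }
)

# Hand-written lookup from country name to three-letter code (an assumption;
# a full version would need an entry for every country in the table).
NAME_TO_CODE = {"Austria": "AUT", "Belgium": "BEL"}

# Strip the thousands separators so population becomes a number.
df_pops["population"] = (
    df_pops["population"].str.replace(",", "", regex=False).astype(int)
)
df_pops["country"] = df_pops["country_name"].map(NAME_TO_CODE)

# Join on the shared code and compute boarded passengers per resident.
merged = df_pops.merge(df_air, on="country", how="inner")
merged["air_passengers_per_capita"] = merged["passengers"] / merged["population"]
print(merged)
```

A per-capita figure like this, computed for both air and rail, is one candidate feature for the model of how European travel might translate to the US.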