Uber, the ride-sharing tech giant, aims to provide the best customer experience possible. Yet the price of a ride is never fixed: for the same starting point and destination, customers see the fare change from one request to the next. Uber's prices are affected by the supply and demand of rides at a given time, so the price of a ride can depend on many factors, from the weather to a large number of people requesting rides at the same time. When the demand for rides increases, the price of a ride increases, which, according to Uber, ensures that people who need a ride can get one. The intention is that people who need a ride "more" than others will pay the extra surge price, whereas those who are in less of a rush can wait for a lower price. This is a problem for riders, because they can end up paying far more for an Uber ride than they would outside a surge. Even though the datasets below also contain data for Lyft, we will focus on Uber.
References
Delaney, D. (2016, May 29). Surge pricing: Why a rainy day ride will cost you more. Tennessean.com. https://www.tennessean.com/story/money/2016/05/29/surge-pricing-why-rainy-day-ride-cost-you-more/84922758/
Helling, B. (2023). Surge Pricing: What It Is & How It Works For Riders & Drivers. Ridester.com. https://www.ridester.com/surge-pricing/
How Surge Pricing Works. (n.d.). Uber.com. Retrieved February 27, 2023, from https://www.uber.com/us/en/drive/driver-app/how-surge-works/
import pandas as pd
# Load the ride records (Uber and Lyft) and preview the first five rows
rides_df = pd.read_csv('cab_rides.csv')
rides_df.head()
| | distance | cab_type | time_stamp | destination | source | price | surge_multiplier | id | product_id | name |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0.44 | Lyft | 1544952607890 | North Station | Haymarket Square | 5.0 | 1.0 | 424553bb-7174-41ea-aeb4-fe06d4f4b9d7 | lyft_line | Shared |
1 | 0.44 | Lyft | 1543284023677 | North Station | Haymarket Square | 11.0 | 1.0 | 4bd23055-6827-41c6-b23b-3c491f24e74d | lyft_premier | Lux |
2 | 0.44 | Lyft | 1543366822198 | North Station | Haymarket Square | 7.0 | 1.0 | 981a3613-77af-4620-a42a-0c0866077d1e | lyft | Lyft |
3 | 0.44 | Lyft | 1543553582749 | North Station | Haymarket Square | 26.0 | 1.0 | c2d88af2-d278-4bfd-a8d0-29ca77cc5512 | lyft_luxsuv | Lux Black XL |
4 | 0.44 | Lyft | 1543463360223 | North Station | Haymarket Square | 9.0 | 1.0 | e0126e1f-8ca9-4f2e-82b3-50505a09db9a | lyft_plus | Lyft XL |
# Load the weather observations and preview the first five rows
weather_df = pd.read_csv('weather.csv')
weather_df.head()
| | temp | location | clouds | pressure | rain | time_stamp | humidity | wind |
---|---|---|---|---|---|---|---|---|
0 | 42.42 | Back Bay | 1.0 | 1012.14 | 0.1228 | 1545003901 | 0.77 | 11.25 |
1 | 42.43 | Beacon Hill | 1.0 | 1012.15 | 0.1846 | 1545003901 | 0.76 | 11.32 |
2 | 42.50 | Boston University | 1.0 | 1012.15 | 0.1089 | 1545003901 | 0.76 | 11.07 |
3 | 42.11 | Fenway | 1.0 | 1012.13 | 0.0969 | 1545003901 | 0.77 | 11.09 |
4 | 43.13 | Financial District | 1.0 | 1012.14 | 0.1786 | 1545003901 | 0.75 | 11.49 |
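One detail worth checking before the two tables are combined: the `time_stamp` values in `cab_rides.csv` have 13 digits, while those in `weather.csv` have 10. That pattern is consistent with epoch milliseconds and epoch seconds respectively, but this is an assumption drawn from the previews above, not something the data files document. A minimal sketch of converting both to datetimes under that assumption, using a few values copied from the previews so it runs without the CSV files:

```python
import pandas as pd

# Values copied from the head() previews above.
rides_sample = pd.DataFrame({
    "source": ["Haymarket Square", "Haymarket Square"],
    "time_stamp": [1544952607890, 1543284023677],  # 13 digits: assumed milliseconds
})
weather_sample = pd.DataFrame({
    "location": ["Back Bay", "Beacon Hill"],
    "time_stamp": [1545003901, 1545003901],  # 10 digits: assumed seconds
})

# Convert each column with the unit assumed for that file.
rides_sample["datetime"] = pd.to_datetime(
    rides_sample["time_stamp"], unit="ms", utc=True
)
weather_sample["datetime"] = pd.to_datetime(
    weather_sample["time_stamp"], unit="s", utc=True
)

print(rides_sample[["time_stamp", "datetime"]])
print(weather_sample[["time_stamp", "datetime"]])
```

If the assumed units are right, both columns should land in the same period of late 2018; dates decades apart would signal that one of the units is wrong.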
These two datasets will be used to address the problem by building a machine learning model that predicts the surge multiplier, specifically from the weather conditions. The type of model that seems to be the best fit is ordinal regression/classification, which treats the surge multiplier as a set of ordered levels. This project will allow users to look at an area and see whether there is currently a surge. This will benefit users, as they can choose to avoid the surge price either by waiting to request the Uber until the surge goes down or by walking a bit to a starting position that is not in a surge zone.
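As a rough sketch of how the two tables could be prepared for such a model, the helpers below attach the most recent weather reading to each ride and turn the surge multiplier into ordered integer classes. The function names `attach_weather` and `encode_surge_levels`, the one-hour tolerance, and the choice to match a ride's `source` against the weather `location` are all assumptions made for illustration. The sketch also assumes ride timestamps are in milliseconds and weather timestamps in seconds, as their digit counts suggest.

```python
import pandas as pd

def attach_weather(rides, weather, tolerance_sec=3600):
    """Attach the latest weather reading at each ride's source location.

    Assumes rides["time_stamp"] is in epoch milliseconds and
    weather["time_stamp"] is in epoch seconds.
    """
    rides = rides.copy()
    weather = weather.copy()

    # Bring both timestamps to whole seconds so they are comparable.
    rides["ts_sec"] = rides["time_stamp"] // 1000
    weather["ts_sec"] = weather["time_stamp"]

    # Match weather to the pickup point (an assumption about the join key).
    weather = weather.drop(columns="time_stamp").rename(
        columns={"location": "source"}
    )

    # merge_asof requires both frames to be sorted by the "on" key.
    rides = rides.sort_values("ts_sec")
    weather = weather.sort_values("ts_sec")

    # For each ride, take the last weather row at or before the ride time,
    # within the tolerance; rides with no such reading get NaN weather values.
    return pd.merge_asof(
        rides,
        weather,
        on="ts_sec",
        by="source",
        direction="backward",
        tolerance=tolerance_sec,
    )

def encode_surge_levels(surge):
    """Map each distinct surge multiplier to an integer rank (0 = lowest)."""
    levels = sorted(surge.dropna().unique())
    return pd.Categorical(surge, categories=levels, ordered=True).codes
```

With the DataFrames loaded above, a call such as `attach_weather(rides_df[rides_df['cab_type'] == 'Uber'], weather_df)` would restrict the join to Uber rides. Rows with missing weather values after the join would need to be dropped or imputed before fitting any model.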