Formula 1 Adjusted Driver Qualifying Comparison

DS 2500 Project Proposal - Page Lootsma

Description of Problem

The purpose of this project is to create a means of quantifiably comparing the qualifying times of Formula 1 drivers who did not drive in the same machinery at the same time.

It is notoriously difficult to compare drivers from different teams and different eras in Formula 1, as each team (consisting of only two drivers) has a different car that changes on a race-to-race basis. Thus, a driver can only be reasonably compared to their teammate, as this is the only individual in the same machinery and the same set of circumstances. This creates the need for a data-driven method of quantifiably comparing drivers from different teams and different eras.

Typically, two major skills are considered when assessing a driver's ability--their racecraft and their pace. Racecraft refers to a driver's ability to maneuver and overtake other drivers over the course of an entire race. Quantifying a driver's racecraft in a manner that can be compared to other drivers is extremely difficult, as every driver experiences a unique set of circumstances throughout a race that cannot be repeated or emulated.

A more effective basis for quantifiably comparing drivers is their pace. Pace is often regarded as the antithesis of racecraft; rather than observing a driver's ability to 'work their way through the pack', pace refers to a driver's raw speed over the course of a single lap, uninhibited by other drivers and pushing their machinery to the absolute limit.

Unlike racecraft, pace is much more feasibly quantified. In order to determine their position at the start of the race, Formula 1 drivers participate in a qualifying session. The rules for qualifying are complicated, but essentially the drivers are attempting to complete the fastest single lap they can, as the starting order of the race will be determined by their qualifying time. This means that drivers are going flat-out under essentially the same track conditions without taking strategic elements such as tyre wear or traffic into consideration. As a result, drivers' qualifying times provide an invaluable window into a driver's pure pace, which can be directly compared to that of their teammate.

A link to the Kaggle dataset can be found here.

Some Sample Code

Import pandas and read in the CSV file.

In [1]:

import pandas as pd

df_quali = pd.read_csv('qualifying.csv')
df_quali.tail()

Out[1]:

	Car	Detail	Driver	DriverCode	Grand Prix	Laps	No	Pos	Q1	Q2	Q3	Time	Year
17231	Haas Ferrari	Qualifying	Kevin Magnussen	MAG	Abu Dhabi	9.0	20	16	1:25.834	NaN	NaN	NaN	2022
17232	AlphaTauri RBPT	Qualifying	Pierre Gasly	GAS	Abu Dhabi	9.0	10	17	1:25.859	NaN	NaN	NaN	2022
17233	Alfa Romeo Ferrari	Qualifying	Valtteri Bottas	BOT	Abu Dhabi	6.0	77	18	1:25.892	NaN	NaN	NaN	2022
17234	Williams Mercedes	Qualifying	Alexander Albon	ALB	Abu Dhabi	9.0	23	19	1:26.028	NaN	NaN	NaN	2022
17235	Williams Mercedes	Qualifying	Nicholas Latifi	LAT	Abu Dhabi	9.0	6	20	1:26.054	NaN	NaN	NaN	2022

Create a dictionary of all drivers, with each key being a specific driver's name and each value being a dataframe of that driver's qualifying results.

In [2]:

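# every distinct driver name in the dataset
driver_list = df_quali['Driver'].unique()

# maps driver name -> dataframe of that driver's qualifying rows
all_drivers = {}

for driver in driver_list:
    # boolean mask selecting only this driver's rows
    bool_driver = df_quali['Driver'] == driver
    df_driver = df_quali.loc[bool_driver, :]
    all_drivers[driver] = df_driver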

Showcase how different drivers can be accessed through the dictionary.

In [3]:

all_drivers["Max Verstappen"]

Out[3]:

	Car	Detail	Driver	DriverCode	Grand Prix	Laps	No	Pos	Q1	Q2	Q3	Time	Year
13625	STR Renault	Qualifying	Max Verstappen	VER	Australia	15.0	33	12	1:29.248	1:28.868	NaN	NaN	2015
13637	STR Renault	Qualifying	Max Verstappen	VER	Malaysia	16.0	33	6	1:40.793	1:41.430	1:51.981	NaN	2015
13663	STR Renault	Qualifying	Max Verstappen	VER	China	14.0	33	13	1:38.387	1:38.393	NaN	NaN	2015
13685	STR Renault	Qualifying	Max Verstappen	VER	Bahrain	14.0	33	15	1:35.611	1:35.103	NaN	NaN	2015
13696	STR Renault	Qualifying	Max Verstappen	VER	Spain	20.0	33	6	1:27.393	1:26.441	1:26.249	NaN	2015
...	...	...	...	...	...	...	...	...	...	...	...	...	...
17136	Red Bull Racing RBPT	Qualifying	Max Verstappen	VER	Japan	13.0	1	1	1:30.224	1:30.346	1:29.304	NaN	2022
17158	Red Bull Racing RBPT	Qualifying	Max Verstappen	VER	United States	15.0	1	3	1:35.864	1:35.294	1:34.448	NaN	2022
17176	Red Bull Racing RBPT	Qualifying	Max Verstappen	VER	Mexico	16.0	1	1	1:19.222	1:18.566	1:17.775	NaN	2022
17197	Red Bull Racing RBPT	Qualifying	Max Verstappen	VER	Brazil	23.0	1	2	1:13.625	1:10.881	1:11.877	NaN	2022
17216	Red Bull Racing RBPT	Qualifying	Max Verstappen	VER	Abu Dhabi	17.0	1	1	1:24.754	1:24.622	1:23.824	NaN	2022

179 rows × 13 columns

In [4]:

all_drivers["Ayrton Senna"]

Out[4]:

	Car	Detail	Driver	DriverCode	Grand Prix	Laps	No	Pos	Q1	Q2	Q3	Time	Year
807	Toleman Hart	Qualifying	Ayrton Senna	SEN	Brazil	NaN	19	16	NaN	NaN	NaN	1:33.525	1984
828	Toleman Hart	Qualifying	Ayrton Senna	SEN	South Africa	NaN	19	13	NaN	NaN	NaN	1:06.981	1984
859	Toleman Hart	Qualifying	Ayrton Senna	SEN	Belgium	NaN	19	19	NaN	NaN	NaN	1:18.876	1984
891	Toleman Hart	Qualifying	Ayrton Senna	SEN	San Marino	NaN	19	26	NaN	NaN	NaN	1:41.585	1984
904	Toleman Hart	Qualifying	Ayrton Senna	SEN	France	NaN	19	13	NaN	NaN	NaN	1:05.744	1984
...	...	...	...	...	...	...	...	...	...	...	...	...	...
5448	McLaren Ford	Qualifying	Ayrton Senna	SEN	Japan	NaN	8	2	NaN	NaN	NaN	1:37.284	1993
5471	McLaren Ford	Qualifying	Ayrton Senna	SEN	Australia	NaN	8	1	NaN	NaN	NaN	1:13.371	1993
5495	Williams Renault	Qualifying	Ayrton Senna	SEN	Brazil	22.0	2	1	NaN	NaN	NaN	1:15.962	1994
5522	Williams Renault	Qualifying	Ayrton Senna	SEN	Pacific	15.0	2	1	NaN	NaN	NaN	1:10.218	1994
5550	Williams Renault	Qualifying	Ayrton Senna	SEN	San Marino	10.0	2	1	NaN	NaN	NaN	1:21.548	1994

162 rows × 13 columns

From here, the same approach can be extended to link drivers to specific races.
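Before drivers can be linked race by race, the lap times need to be numeric. In the outputs above they are strings such as 1:25.834, stored in the Q1, Q2 and Q3 columns for recent seasons and in the Time column for older ones. The following is a minimal sketch of one way to handle that; the helper names `lap_to_seconds` and `best_lap_seconds` are hypothetical, and taking the fastest time across all four columns is an assumption rather than a settled design choice.

```python
import math


def lap_to_seconds(value):
    """Convert a lap time such as '1:25.834' to seconds; return None if missing."""
    if not isinstance(value, str) or not value.strip():
        return None  # pandas represents missing cells as float NaN
    minutes, _, seconds = value.strip().rpartition(":")
    try:
        total = (int(minutes) if minutes else 0) * 60 + float(seconds)
    except ValueError:
        return None  # text that is not a lap time
    return total if math.isfinite(total) else None


def best_lap_seconds(row):
    """Fastest parsed time across the Q1, Q2, Q3 and Time columns of one row."""
    times = [lap_to_seconds(row.get(col)) for col in ("Q1", "Q2", "Q3", "Time")]
    times = [t for t in times if t is not None]
    return min(times) if times else None


# Two rows copied from the outputs above, written as plain dictionaries.
modern = {"Q1": "1:27.393", "Q2": "1:26.441", "Q3": "1:26.249", "Time": float("nan")}
older = {"Q1": float("nan"), "Q2": float("nan"), "Q3": float("nan"), "Time": "1:33.525"}

print(best_lap_seconds(modern))
print(best_lap_seconds(older))
```

Because `best_lap_seconds` only relies on `.get`, it should also work row by row on the dataframe, for example `df_quali.apply(best_lap_seconds, axis=1)`. Note that taking the minimum can compare laps set in different parts of the session, when track conditions may have changed.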

How Will This Solve the Problem

By calculating the time delta between a driver and each of their current and previous teammates, we can use that measure to compare a driver to their teammates' former teammates, establishing a broad (but limited) range of driver comparisons.
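A rough sketch of the teammate delta might look like the following. It groups results by year, Grand Prix and car, then averages the gap for each pair of drivers who shared a car. The function name `teammate_deltas` is hypothetical, and expressing the gap as a percentage of lap time (so that long and short circuits count comparably) is an assumption; a plain difference in seconds would work the same way. The two sample rows are the Williams times from the 2022 Abu Dhabi output shown earlier.

```python
from collections import defaultdict
from statistics import mean

# (year, grand_prix, car, driver, best_lap_seconds)
# 1:26.028 and 1:26.054 converted to seconds.
results = [
    (2022, "Abu Dhabi", "Williams Mercedes", "Alexander Albon", 86.028),
    (2022, "Abu Dhabi", "Williams Mercedes", "Nicholas Latifi", 86.054),
]


def teammate_deltas(results):
    """Mean percentage gap for each teammate pair, keyed by (driver_a, driver_b).

    A positive value means driver_b was slower than driver_a on average.
    """
    by_session = defaultdict(list)
    for year, grand_prix, car, driver, seconds in results:
        by_session[(year, grand_prix, car)].append((driver, seconds))

    gaps = defaultdict(list)
    for entries in by_session.values():
        entries = sorted(entries)  # alphabetical order gives a stable pair key
        for i, (driver_a, time_a) in enumerate(entries):
            for driver_b, time_b in entries[i + 1:]:
                gaps[(driver_a, driver_b)].append(100 * (time_b - time_a) / time_a)
    return {pair: mean(values) for pair, values in gaps.items()}


deltas = teammate_deltas(results)
print(deltas)
```

With the full dataset, each pair would accumulate one gap per shared qualifying session, so the mean becomes more reliable the longer two drivers were teammates.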

Essentially, this creates a “chain” through which we can establish a comparison between almost every driver on the grid. For example, if a user wanted to compare 2021 and 2022 Formula 1 champion Max Verstappen with the late 1988, 1990, and 1991 champion Ayrton Senna, the program would search through Formula 1 drivers to establish the most robust connection between the two via intermediary teammate comparisons.
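One way to picture the chain is as a path through a graph in which drivers are nodes and an edge joins any two drivers who were once teammates. The sketch below uses a breadth-first search to find the shortest such path. Treating "shortest" as "most robust" is an assumption; a fuller version might instead weight each link by how many sessions the two teammates shared. The function name `find_chain` and the placeholder drivers are hypothetical.

```python
from collections import deque


def find_chain(teammates, start, goal):
    """Return the shortest list of drivers linking start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # sorted() keeps the search order, and so the result, deterministic
        for nxt in sorted(teammates.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the two drivers are not connected


# Hypothetical teammate graph; the names are placeholders, not real pairings.
teammates = {
    "Driver A": {"Driver B"},
    "Driver B": {"Driver A", "Driver C"},
    "Driver C": {"Driver B", "Driver D"},
    "Driver D": {"Driver C"},
    "Driver E": set(),
}

print(find_chain(teammates, "Driver A", "Driver D"))
print(find_chain(teammates, "Driver A", "Driver E"))
```

In practice the graph would be built from the dataset by linking every pair of drivers who appear with the same car at the same Grand Prix in the same year.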

Additionally, because drivers perform better on certain types of circuits, the program would ask the user for a specific circuit on which to compare the drivers--say the famous Spa-Francorchamps in Belgium. The program would then establish a connection between the two inputted drivers using their current and former teammates before calculating an estimated time difference between the two drivers on the specific track in question.
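The final estimate could then be obtained by adding up the teammate gaps along the chain. The sketch below assumes the gaps are percentages already computed from sessions at the chosen circuit only, and that adding them is an acceptable approximation when each gap is small. The function name `chain_gap`, the gap values and the reference lap time are all hypothetical placeholders.

```python
def chain_gap(chain, deltas):
    """Sum signed teammate gaps along chain; positive means the last driver is slower."""
    total = 0.0
    for a, b in zip(chain, chain[1:]):
        if (a, b) in deltas:
            total += deltas[(a, b)]
        elif (b, a) in deltas:
            total -= deltas[(b, a)]  # stored the other way round, so flip the sign
        else:
            raise KeyError(f"no teammate comparison for {a} and {b}")
    return total


# Hypothetical per-circuit gaps in percent; (a, b) means b was slower than a.
deltas = {
    ("Driver A", "Driver B"): 0.30,
    ("Driver C", "Driver B"): 0.10,
}
chain = ["Driver A", "Driver B", "Driver C"]

gap_percent = chain_gap(chain, deltas)
reference_lap = 105.0  # assumed reference lap in seconds for the chosen circuit
print(gap_percent, reference_lap * gap_percent / 100)
```

Each extra link in the chain adds its own uncertainty, so an estimate built from a long chain should be read with more caution than one built from a direct teammate pairing.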

For Formula 1 fans, this project serves as a way to predict how different drivers would compare if they were operating the same machinery. Additionally (and more practically), it could provide Formula 1 teams with an invaluable resource for determining which drivers are faster and therefore which they should sign come the Formula 1 "silly season".