Formula 1 Adjusted Driver Qualifying Comparison

DS 2500 Project Proposal - Page Lootsma

Description of Problem

The purpose of this project is to create a means of quantifiably comparing the qualifying times of Formula 1 drivers who did not drive in the same machinery at the same time.

It is notoriously difficult to compare drivers from different teams and different eras in Formula 1, as each team (consisting of only two drivers) has a different car that changes on a race-to-race basis. Thus, a driver can only be reasonably compared to their teammate, as this is the only individual in the same machinery and the same set of circumstances. This creates the need for a data-driven method of quantifiably comparing drivers from different teams and different eras.

Typically, two major skills are considered when assessing a driver's ability: their racecraft and their pace. Racecraft refers to a driver's ability to maneuver around and overtake other drivers over the course of an entire race. Quantifying a driver's racecraft in a manner that can be compared to other drivers is extremely difficult, as every driver experiences a unique set of circumstances throughout a race that cannot be repeated or emulated.

A more effective basis for quantifiably comparing drivers is their pace. Pace is often regarded as the antithesis of racecraft; rather than reflecting a driver's ability to 'work their way through the pack', pace refers to a driver's raw speed over the course of a single lap, uninhibited by other drivers and pushing their machinery to the absolute limit.

Unlike racecraft, pace is much more feasibly quantified. To determine their position at the start of the race, Formula 1 drivers participate in a qualifying session. The rules for qualifying are complicated, but essentially the drivers are attempting to complete the fastest single lap they can, as the starting order of the race is determined by their qualifying time. This means that drivers are going flat-out under essentially the same track conditions without taking strategic elements such as tyre wear or traffic into consideration. As a result, drivers' qualifying times provide an invaluable window into a driver's pure pace, which can be directly compared to that of their teammate.

A link to the Kaggle dataset can be found here.

Some Sample Code

Import pandas and read in the CSV file.

In [1]:
import pandas as pd

df_quali = pd.read_csv('qualifying.csv')
df_quali.tail()
Out[1]:
       Car                 Detail      Driver           DriverCode  Grand Prix  Laps  No  Pos  Q1        Q2   Q3   Time  Year
17231  Haas Ferrari        Qualifying  Kevin Magnussen  MAG         Abu Dhabi   9.0   20  16   1:25.834  NaN  NaN  NaN   2022
17232  AlphaTauri RBPT     Qualifying  Pierre Gasly     GAS         Abu Dhabi   9.0   10  17   1:25.859  NaN  NaN  NaN   2022
17233  Alfa Romeo Ferrari  Qualifying  Valtteri Bottas  BOT         Abu Dhabi   6.0   77  18   1:25.892  NaN  NaN  NaN   2022
17234  Williams Mercedes   Qualifying  Alexander Albon  ALB         Abu Dhabi   9.0   23  19   1:26.028  NaN  NaN  NaN   2022
17235  Williams Mercedes   Qualifying  Nicholas Latifi  LAT         Abu Dhabi   9.0   6   20   1:26.054  NaN  NaN  NaN   2022
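
The lap times in this output are stored as strings such as `1:25.834`, so they would need to be converted to numbers before any time delta can be computed. Below is a minimal sketch of one way to do that, assuming every time follows the `minutes:seconds.milliseconds` pattern seen in the output (or plain seconds with no colon). The helper name `lap_time_to_seconds` is hypothetical and is not part of the dataset or of pandas.

```python
import math


def lap_time_to_seconds(lap_time):
    """Convert a lap time string such as '1:25.834' to seconds as a float.

    Missing values (None or NaN) come back as NaN. Assumes the
    'minutes:seconds.milliseconds' format shown in the sample output.
    """
    # pandas represents missing cells as float NaN, so pass those through
    if lap_time is None or (isinstance(lap_time, float) and math.isnan(lap_time)):
        return math.nan

    text = str(lap_time).strip()
    # rpartition leaves `minutes` empty when the string contains no colon
    minutes, _, seconds = text.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


print(lap_time_to_seconds("1:25.834"))
```

Applied to a whole column, this could look like `df_quali['Q1'].map(lap_time_to_seconds)`.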

Create a dictionary of all drivers, with each key being a driver's name and each value a dataframe of that driver's qualifying results.

In [2]:
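# Every distinct driver name in the dataset
driver_list = df_quali['Driver'].unique()

# Maps driver name -> dataframe of that driver's qualifying results
all_drivers = {}

for driver in driver_list:
    # Boolean mask selecting only this driver's rows
    bool_driver = df_quali['Driver'] == driver
    df_driver = df_quali.loc[bool_driver, :]
    all_drivers[driver] = df_driver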

Showcase how different drivers can be accessed through the dictionary.

In [3]:
all_drivers["Max Verstappen"]
Out[3]:
       Car                   Detail      Driver          DriverCode  Grand Prix     Laps  No   Pos  Q1        Q2        Q3        Time  Year
13625  STR Renault           Qualifying  Max Verstappen  VER         Australia      15.0  33   12   1:29.248  1:28.868  NaN       NaN   2015
13637  STR Renault           Qualifying  Max Verstappen  VER         Malaysia       16.0  33   6    1:40.793  1:41.430  1:51.981  NaN   2015
13663  STR Renault           Qualifying  Max Verstappen  VER         China          14.0  33   13   1:38.387  1:38.393  NaN       NaN   2015
13685  STR Renault           Qualifying  Max Verstappen  VER         Bahrain        14.0  33   15   1:35.611  1:35.103  NaN       NaN   2015
13696  STR Renault           Qualifying  Max Verstappen  VER         Spain          20.0  33   6    1:27.393  1:26.441  1:26.249  NaN   2015
...    ...                   ...         ...             ...         ...            ...   ...  ...  ...       ...       ...       ...   ...
17136  Red Bull Racing RBPT  Qualifying  Max Verstappen  VER         Japan          13.0  1    1    1:30.224  1:30.346  1:29.304  NaN   2022
17158  Red Bull Racing RBPT  Qualifying  Max Verstappen  VER         United States  15.0  1    3    1:35.864  1:35.294  1:34.448  NaN   2022
17176  Red Bull Racing RBPT  Qualifying  Max Verstappen  VER         Mexico         16.0  1    1    1:19.222  1:18.566  1:17.775  NaN   2022
17197  Red Bull Racing RBPT  Qualifying  Max Verstappen  VER         Brazil         23.0  1    2    1:13.625  1:10.881  1:11.877  NaN   2022
17216  Red Bull Racing RBPT  Qualifying  Max Verstappen  VER         Abu Dhabi      17.0  1    1    1:24.754  1:24.622  1:23.824  NaN   2022

179 rows × 13 columns

In [4]:
all_drivers["Ayrton Senna"]
Out[4]:
      Car               Detail      Driver        DriverCode  Grand Prix    Laps  No   Pos  Q1   Q2   Q3   Time      Year
807   Toleman Hart      Qualifying  Ayrton Senna  SEN         Brazil        NaN   19   16   NaN  NaN  NaN  1:33.525  1984
828   Toleman Hart      Qualifying  Ayrton Senna  SEN         South Africa  NaN   19   13   NaN  NaN  NaN  1:06.981  1984
859   Toleman Hart      Qualifying  Ayrton Senna  SEN         Belgium       NaN   19   19   NaN  NaN  NaN  1:18.876  1984
891   Toleman Hart      Qualifying  Ayrton Senna  SEN         San Marino    NaN   19   26   NaN  NaN  NaN  1:41.585  1984
904   Toleman Hart      Qualifying  Ayrton Senna  SEN         France        NaN   19   13   NaN  NaN  NaN  1:05.744  1984
...   ...               ...         ...           ...         ...           ...   ...  ...  ...  ...  ...  ...       ...
5448  McLaren Ford      Qualifying  Ayrton Senna  SEN         Japan         NaN   8    2    NaN  NaN  NaN  1:37.284  1993
5471  McLaren Ford      Qualifying  Ayrton Senna  SEN         Australia     NaN   8    1    NaN  NaN  NaN  1:13.371  1993
5495  Williams Renault  Qualifying  Ayrton Senna  SEN         Brazil        22.0  2    1    NaN  NaN  NaN  1:15.962  1994
5522  Williams Renault  Qualifying  Ayrton Senna  SEN         Pacific       15.0  2    1    NaN  NaN  NaN  1:10.218  1994
5550  Williams Renault  Qualifying  Ayrton Senna  SEN         San Marino    10.0  2    1    NaN  NaN  NaN  1:21.548  1994

162 rows × 13 columns

From here, the same approach can be extended to link drivers with specific races.
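
One way to link drivers and races, sketched here under stated assumptions, is to pair each driver with whoever shares their car at the same race and take the difference in lap times. The sample rows are copied from the `tail()` output earlier in this notebook. Treating matching `Car`, `Grand Prix`, and `Year` values as "teammates" is an assumption about the dataset, and the `Q1_seconds` column and `to_seconds` helper are hypothetical names.

```python
import pandas as pd

# Rows copied from the tail() output, trimmed to the relevant columns
df_sample = pd.DataFrame({
    "Car": ["Haas Ferrari", "Williams Mercedes", "Williams Mercedes"],
    "Driver": ["Kevin Magnussen", "Alexander Albon", "Nicholas Latifi"],
    "Grand Prix": ["Abu Dhabi", "Abu Dhabi", "Abu Dhabi"],
    "Q1": ["1:25.834", "1:26.028", "1:26.054"],
    "Year": [2022, 2022, 2022],
})


def to_seconds(lap_time):
    """Convert 'm:ss.mmm' to seconds (assumes exactly that format)."""
    minutes, seconds = lap_time.split(":")
    return int(minutes) * 60 + float(seconds)


df_sample["Q1_seconds"] = df_sample["Q1"].map(to_seconds)

# Self-merge on car and session so every driver is matched to their teammate
session_keys = ["Car", "Grand Prix", "Year"]
pairs = df_sample.merge(df_sample, on=session_keys, suffixes=("", "_teammate"))

# Drop rows where a driver was paired with themselves
pairs = pairs[pairs["Driver"] != pairs["Driver_teammate"]].copy()

# Negative delta: the driver was faster than the teammate in Q1
pairs["delta"] = pairs["Q1_seconds"] - pairs["Q1_seconds_teammate"]

print(pairs[["Driver", "Driver_teammate", "Grand Prix", "Year", "delta"]])
```

Only the two Williams drivers appear in the result, because the sample contains no second Haas driver. For seasons before the Q1/Q2/Q3 format, the `Time` column would presumably play the same role, as the Ayrton Senna rows suggest.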

How Will This Solve the Problem

By calculating the qualifying time delta between a driver and each of their current and previous teammates, we can use that measure to compare a driver to their teammates' former teammates, establishing a broad (but limited) range of driver comparisons.

Essentially, this creates a “chain” through which we can establish a comparison between almost every driver on the grid. For example, if a user were seeking to compare 2021 and 2022 Formula 1 champion Max Verstappen with the late 1988, 1990, and 1991 champion Ayrton Senna, the program would run through Formula 1 drivers to establish the most robust connection between the two via intermediary teammate comparisons.
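
The chain can be thought of as a path through a graph whose nodes are drivers and whose edges join drivers who have been teammates. The sketch below finds a shortest such path with a breadth-first search. Reading "most robust connection" as "fewest intermediaries" is only one possible interpretation, not a settled design. The driver labels `A` to `E`, the `teammate_pairs` list, and the function `find_chain` are placeholders for illustration and do not reflect real teammate history.

```python
from collections import deque


def find_chain(teammate_pairs, start, goal):
    """Return the shortest list of drivers linking start to goal, or None."""
    # Build an undirected adjacency map: driver -> set of teammates
    graph = {}
    for first, second in teammate_pairs:
        graph.setdefault(first, set()).add(second)
        graph.setdefault(second, set()).add(first)

    if start not in graph or goal not in graph:
        return None

    # Breadth-first search, remembering how each driver was reached
    previous = {start: None}
    queue = deque([start])
    while queue:
        driver = queue.popleft()
        if driver == goal:
            # Walk back from goal to start to rebuild the chain
            chain = []
            while driver is not None:
                chain.append(driver)
                driver = previous[driver]
            return chain[::-1]
        # Sorted so that ties are broken the same way on every run
        for teammate in sorted(graph[driver]):
            if teammate not in previous:
                previous[teammate] = driver
                queue.append(teammate)
    return None


# Placeholder drivers, not real teammate history
teammate_pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
print(find_chain(teammate_pairs, "A", "D"))
```

In this toy graph there are two routes from `A` to `D`, and the search returns the one with a single intermediary. A fuller version might instead weight each edge by how many sessions the two teammates shared.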

Additionally, because drivers perform better on certain types of circuits, the program would ask the user for a specific circuit on which to compare the drivers, say the famous Spa-Francorchamps in Belgium. The program would then establish a connection between the two inputted drivers using their current and former teammates before calculating an estimated time difference between the two drivers on the specific track in question.
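
Once a chain of teammates exists, one simple way to turn it into an estimate is to add up the average teammate deltas along it. The sketch below assumes those averages have already been computed for the chosen circuit and that gaps combine additively, which is a strong simplification. The names `pairwise_delta` and `estimate_gap`, the driver labels, and the numbers are all hypothetical.

```python
def estimate_gap(chain, pairwise_delta):
    """Sum signed deltas (in seconds) along a chain of teammates.

    pairwise_delta[(x, y)] is the average of (x's time - y's time), so a
    negative value means x was faster than y.
    """
    total = 0.0
    for first, second in zip(chain, chain[1:]):
        if (first, second) in pairwise_delta:
            total += pairwise_delta[(first, second)]
        elif (second, first) in pairwise_delta:
            # The delta was stored in the opposite direction, so flip its sign
            total -= pairwise_delta[(second, first)]
        else:
            raise KeyError(f"No teammate data for {first} and {second}")
    return total


# Hypothetical averages in seconds, invented purely for illustration
pairwise_delta = {("A", "B"): -0.2, ("C", "B"): 0.1}
gap = estimate_gap(["A", "B", "C"], pairwise_delta)
print(f"Estimated gap from A to C: {gap:+.3f} s")
```

Here `A` is 0.2 s faster than `B` and `B` is 0.1 s faster than `C`, so the estimate places `A` about 0.3 s ahead of `C`. Because car performance and lap lengths change between eras, a percentage-based delta may be a better fit than raw seconds.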

For Formula 1 fans, this project serves as a way to predict how different drivers would compare if they were operating the same machinery. Additionally (and more practically), it could provide Formula 1 teams with an invaluable resource for determining which drivers are faster, and therefore which they should sign come the Formula 1 "silly season".