Predicting NBA awards

Motivation:

Problem

Determining NBA award winners can be controversial. With so many great talents in the league, it can be hard for voters to choose. The voters are members of the media (sportswriters and broadcasters) from the US and Canada. Not everyone will share the same opinion, so disagreement is inevitable.

Solution

A system based on stats can reduce voter bias and help give each award to the player who most deserves it. The goal of this project is to identify and use a relationship between the stats of this season's players (e.g. RAPTOR statistics, wins above replacement) and the stats of award winners from previous years.

Impact

If we are able to find such a relationship, we would be able to predict award winners based on their RAPTOR stats. In addition, if these predictions turn out to be correct, that would lend a lot of credibility to FiveThirtyEight's RAPTOR statistic.

People who would be interested in this data and information include sports bettors. Sports betting is a huge multibillion-dollar industry, and if you are putting money on the line, you want a good chance of winning. Those who bet on NBA awards could benefit from this.

Dataset

Detail

I will use a FiveThirtyEight dataset of NBA player stats to observe the following features for each player:

Column	Description
`player_name`	Player name
`player_id`	Basketball-Reference.com player ID
`season`	Season
`season_type`	Regular season (RS) or playoff (PO)
`team`	Basketball-Reference ID of team
`poss`	Possessions played
`mp`	Minutes played
`raptor_box_offense`	Points above average per 100 possessions added by player on offense, based only on box score estimate
`raptor_box_defense`	Points above average per 100 possessions added by player on defense, based only on box score estimate
`raptor_box_total`	Points above average per 100 possessions added by player, based only on box score estimate
`raptor_onoff_offense`	Points above average per 100 possessions added by player on offense, based only on plus-minus data
`raptor_onoff_defense`	Points above average per 100 possessions added by player on defense, based only on plus-minus data
`raptor_onoff_total`	Points above average per 100 possessions added by player, based only on plus-minus data
`raptor_offense`	Points above average per 100 possessions added by player on offense, using both box and on-off components
`raptor_defense`	Points above average per 100 possessions added by player on defense, using both box and on-off components
`raptor_total`	Points above average per 100 possessions added by player on both offense and defense, using both box and on-off components
`war_total`	Wins Above Replacement across regular season and playoffs combined
`war_reg_season`	Wins Above Replacement for regular season
`war_playoffs`	Wins Above Replacement for playoffs
`predator_offense`	Predictive points above average per 100 possessions added by player on offense
`predator_defense`	Predictive points above average per 100 possessions added by player on defense
`predator_total`	Predictive points above average per 100 possessions added by player on both offense and defense
`pace_impact`	Player impact on team possessions per 48 minutes

RAPTOR is FiveThirtyEight's new NBA statistic. It stands for 'Robust Algorithm (using) Player Tracking (and) On/off Ratings.' This statistic takes advantage of modern NBA player tracking and play-by-play data. RAPTOR is a plus/minus stat that measures a player's contribution on offense and defense per 100 possessions relative to the average NBA player.
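To make the "per 100 possessions" unit concrete, here is a small worked example. It is only a rough reading of the definition above, not FiveThirtyEight's own calculation: it scales a RAPTOR rating by the number of possessions played to estimate total points added relative to an average player. The function name `points_above_average` is hypothetical, and the inputs are the `raptor_total` and `poss` values from Steven Adams's 2023 regular-season row in this notebook's data preview.

```python
def points_above_average(raptor_per_100, possessions):
    """Convert a per-100-possession rating into estimated total points added."""
    return raptor_per_100 * possessions / 100


# raptor_total = 3.444203 and poss = 2391 (Steven Adams, 2023 regular season)
adams_points = points_above_average(3.444203, 2391)
print(round(adams_points, 1))  # 82.4
```

Under this reading, a rating of about +3.4 over roughly 2,400 possessions corresponds to about 82 points more than an average player would have contributed in the same possessions.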

In [6]:

import pandas as pd
# Load modern-era RAPTOR data (2014 onward)
df_nba = pd.read_csv('modern_RAPTOR_by_team.csv')
df_nba.head()

Out[6]:

	player_name	player_id	season	season_type	team	poss	mp	raptor_box_offense	raptor_box_defense	raptor_box_total	...	raptor_offense	raptor_defense	raptor_total	war_total	war_reg_season	war_playoffs	predator_offense	predator_defense	predator_total	pace_impact
0	Alex Abrines	abrinal01	2017	PO	OKC	172	80	0.420828	-2.862454	-2.441626	...	-0.892617	-6.561258	-7.453875	-0.198700	0.000000	-0.198700	-3.298178	-6.535113	-9.833292	0.334678
1	Alex Abrines	abrinal01	2017	RS	OKC	2215	1055	0.770717	-0.179621	0.591096	...	0.654933	-0.724233	-0.069300	1.447708	1.447708	0.000000	0.339201	-0.611866	-0.272665	0.325771
2	Alex Abrines	abrinal01	2018	PO	OKC	233	110	1.123761	-1.807486	-0.683725	...	1.875157	0.740292	2.615450	0.311392	0.000000	0.311392	2.877519	-0.520954	2.356566	0.260479
3	Alex Abrines	abrinal01	2018	RS	OKC	2313	1134	0.236335	-1.717049	-1.480714	...	-0.211818	-1.728584	-1.940401	0.465912	0.465912	0.000000	-0.482078	-1.172227	-1.654306	-0.528330
4	Alex Abrines	abrinal01	2019	RS	OKC	1279	588	-3.215683	1.078399	-2.137285	...	-4.040157	1.885618	-2.154538	0.178167	0.178167	0.000000	-4.577678	1.543282	-3.034396	-0.268013

5 rows × 23 columns

In [5]:

# 2022-2023 season
df_nba2023 = pd.read_csv('latest_RAPTOR_by_team.csv')
df_nba2023.head()

Out[5]:

	player_name	player_id	season	season_type	team	poss	mp	raptor_box_offense	raptor_box_defense	raptor_box_total	...	raptor_offense	raptor_defense	raptor_total	war_total	war_reg_season	predator_offense	predator_defense	predator_total	pace_impact
0	Precious Achiuwa	achiupr01	2023	RS	TOR	1784	868	-1.925091	0.742752	-1.182340	...	-1.263367	-0.003950	-1.267317	0.650937	0.650937	-1.726975	0.248174	-1.478801	-0.872994
1	Steven Adams	adamsst01	2023	RS	MEM	2391	1133	-0.783499	3.910489	3.126990	...	0.382174	3.062029	3.444203	3.586879	3.586879	-0.004884	3.554965	3.550081	0.177138
2	Bam Adebayo	adebaba01	2023	RS	MIA	3988	1965	-1.600716	3.206394	1.605678	...	-0.600442	3.188671	2.588229	5.327369	5.327369	-0.387635	3.187949	2.800314	-0.461195
3	Ochai Agbaji	agbajoc01	2023	RS	UTA	1308	610	-1.102455	-1.333894	-2.436349	...	-0.934363	-1.079027	-2.013390	0.228766	0.228766	-1.207554	-2.200173	-3.407727	-0.212811
4	Santi Aldama	aldamsa01	2023	RS	MEM	2658	1232	-0.923960	0.684572	-0.239389	...	-0.939673	0.214642	-0.725031	1.278124	1.278124	-0.731312	1.248535	0.517223	0.420301

5 rows × 23 columns

Potential Problems

Since the RAPTOR statistic is relatively new, I am not sure how closely it relates to award voting. Also, awards are not determined by numbers alone; other factors include the eye test, the media, and popularity.

Also, awards are much harder to predict at the beginning of the season, when few games have been played. Each team plays 82 games in a season, so the further into the season we are, the more accurate the prediction would likely be.

Method:

The method I believe we would use is clustering. Given the RAPTOR stats and other features above, we are trying to find the current player whose stats are most similar to those of a previous award winner, in order to predict this year's winner. Finding the award winners from previous years will require some additional research. I'm not sure this will be the best method, because the stats of award winners vary from year to year.
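One way the similarity idea could look in code is sketched below. Note that this is a nearest-neighbor comparison rather than a full clustering algorithm: it standardizes a few features and ranks each current player by Euclidean distance to the closest past winner. The feature list, the function name `rank_by_similarity`, and all of the numbers in the toy tables are assumptions made up for illustration; real past-winner rows would still have to be collected.

```python
import numpy as np
import pandas as pd

# Assumed feature set; the column names come from the dataset table above.
FEATURES = ["raptor_offense", "raptor_defense", "raptor_total", "war_reg_season"]


def rank_by_similarity(candidates, past_winners, features=FEATURES):
    """Rank candidates by distance to their closest past award winner."""
    # Standardize with the candidates' mean and standard deviation so that
    # features on different scales contribute comparably.
    mean = candidates[features].mean()
    std = candidates[features].std(ddof=0).replace(0, 1.0)
    cand_z = ((candidates[features] - mean) / std).to_numpy()
    win_z = ((past_winners[features] - mean) / std).to_numpy()

    # Pairwise Euclidean distances: rows are candidates, columns are winners.
    diffs = cand_z[:, None, :] - win_z[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    ranked = candidates[["player_name"]].copy()
    ranked["distance"] = dists.min(axis=1)
    return ranked.sort_values("distance").reset_index(drop=True)


# Toy data with invented values, NOT real player stats.
candidates = pd.DataFrame({
    "player_name": ["Player A", "Player B", "Player C"],
    "raptor_offense": [8.0, 1.0, -2.0],
    "raptor_defense": [3.0, 0.5, -1.0],
    "raptor_total": [11.0, 1.5, -3.0],
    "war_reg_season": [18.0, 4.0, 0.5],
})
past_winners = pd.DataFrame({
    "player_name": ["Past Winner"],
    "raptor_offense": [7.5],
    "raptor_defense": [2.5],
    "raptor_total": [10.0],
    "war_reg_season": [17.0],
})

ranked = rank_by_similarity(candidates, past_winners)
print(ranked)
```

In this toy example Player A ends up at the top of the ranking because its invented stats sit closest to the invented past winner. Standardizing against the current season is one possible choice; since winners' stats vary from year to year, standardizing within each season might be worth trying instead.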