Determining awards for NBA players can be controversial. With so many great talents in the league, it can be hard for voters to choose. The voters are media members (sportswriters and broadcasters) from the US and Canada. Not everyone will share the same opinion, so disagreement is inevitable.
A system based on stats could reduce voter bias and give the award to the player who most deserves it. The goal of this project is to identify and use a relationship between the stats of this season's NBA players (e.g. RAPTOR statistics, wins above replacement) and the stats of award winners from previous years.
If we can find such a relationship, we would be able to predict award winners from their RAPTOR stats. In addition, if the predictions turn out to be correct, that would lend a lot of credibility to FiveThirtyEight's RAPTOR statistic.
People who would be interested in this data include sports bettors. Sports betting is a huge multibillion-dollar industry, and if you are putting money on the line, you want a good chance of winning. Those who bet on NBA awards could benefit from this.
I will use a FiveThirtyEight dataset of NBA player stats to observe the following features for each player:
Column | Description |
---|---|
player_name | Player name |
player_id | Basketball-Reference.com player ID |
season | Season |
season_type | Regular season (RS) or playoff (PO) |
team | Basketball-Reference ID of team |
poss | Possessions played |
mp | Minutes played |
raptor_box_offense | Points above average per 100 possessions added by player on offense, based only on box score estimate |
raptor_box_defense | Points above average per 100 possessions added by player on defense, based only on box score estimate |
raptor_box_total | Points above average per 100 possessions added by player, based only on box score estimate |
raptor_onoff_offense | Points above average per 100 possessions added by player on offense, based only on plus-minus data |
raptor_onoff_defense | Points above average per 100 possessions added by player on defense, based only on plus-minus data |
raptor_onoff_total | Points above average per 100 possessions added by player, based only on plus-minus data |
raptor_offense | Points above average per 100 possessions added by player on offense, using both box and on-off components |
raptor_defense | Points above average per 100 possessions added by player on defense, using both box and on-off components |
raptor_total | Points above average per 100 possessions added by player on both offense and defense, using both box and on-off components |
war_total | Wins Above Replacement between regular season and playoffs |
war_reg_season | Wins Above Replacement for regular season |
war_playoffs | Wins Above Replacement for playoffs |
predator_offense | Predictive points above average per 100 possessions added by player on offense |
predator_defense | Predictive points above average per 100 possessions added by player on defense |
predator_total | Predictive points above average per 100 possessions added by player on both offense and defense |
pace_impact | Player impact on team possessions per 48 minutes |
RAPTOR is FiveThirtyEight's new NBA statistic. It stands for 'Robust Algorithm (using) Player Tracking (and) On/off Ratings.' This statistic takes advantage of modern NBA player-tracking and play-by-play data. RAPTOR is a plus/minus stat that measures the contribution of a player's offense and defense per 100 possessions relative to the average NBA player.
import pandas as pd
# load modern-era RAPTOR data (2014 onward), split by team
df_nba = pd.read_csv('modern_RAPTOR_by_team.csv')
df_nba.head()
| | player_name | player_id | season | season_type | team | poss | mp | raptor_box_offense | raptor_box_defense | raptor_box_total | ... | raptor_offense | raptor_defense | raptor_total | war_total | war_reg_season | war_playoffs | predator_offense | predator_defense | predator_total | pace_impact |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Alex Abrines | abrinal01 | 2017 | PO | OKC | 172 | 80 | 0.420828 | -2.862454 | -2.441626 | ... | -0.892617 | -6.561258 | -7.453875 | -0.198700 | 0.000000 | -0.198700 | -3.298178 | -6.535113 | -9.833292 | 0.334678 |
1 | Alex Abrines | abrinal01 | 2017 | RS | OKC | 2215 | 1055 | 0.770717 | -0.179621 | 0.591096 | ... | 0.654933 | -0.724233 | -0.069300 | 1.447708 | 1.447708 | 0.000000 | 0.339201 | -0.611866 | -0.272665 | 0.325771 |
2 | Alex Abrines | abrinal01 | 2018 | PO | OKC | 233 | 110 | 1.123761 | -1.807486 | -0.683725 | ... | 1.875157 | 0.740292 | 2.615450 | 0.311392 | 0.000000 | 0.311392 | 2.877519 | -0.520954 | 2.356566 | 0.260479 |
3 | Alex Abrines | abrinal01 | 2018 | RS | OKC | 2313 | 1134 | 0.236335 | -1.717049 | -1.480714 | ... | -0.211818 | -1.728584 | -1.940401 | 0.465912 | 0.465912 | 0.000000 | -0.482078 | -1.172227 | -1.654306 | -0.528330 |
4 | Alex Abrines | abrinal01 | 2019 | RS | OKC | 1279 | 588 | -3.215683 | 1.078399 | -2.137285 | ... | -4.040157 | 1.885618 | -2.154538 | 0.178167 | 0.178167 | 0.000000 | -4.577678 | 1.543282 | -3.034396 | -0.268013 |
5 rows × 23 columns
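The column descriptions suggest that the totals are simply the sum of their parts: `raptor_total` from `raptor_offense` and `raptor_defense`, and `war_total` from `war_reg_season` and `war_playoffs`. The sketch below checks that assumption on two rows copied from the output above. It is only a sanity check on those two rows, not proof that the relationship holds for every row in the file.

```python
import numpy as np
import pandas as pd

# Two rows copied from the df_nba.head() output above (Alex Abrines, 2017).
sample = pd.DataFrame({
    "season_type": ["PO", "RS"],
    "raptor_offense": [-0.892617, 0.654933],
    "raptor_defense": [-6.561258, -0.724233],
    "raptor_total": [-7.453875, -0.069300],
    "war_total": [-0.198700, 1.447708],
    "war_reg_season": [0.000000, 1.447708],
    "war_playoffs": [-0.198700, 0.000000],
})

# The tolerance allows for the six-decimal rounding of the displayed values.
raptor_adds_up = np.isclose(
    sample["raptor_offense"] + sample["raptor_defense"],
    sample["raptor_total"],
    atol=1e-5,
)
war_adds_up = np.isclose(
    sample["war_reg_season"] + sample["war_playoffs"],
    sample["war_total"],
    atol=1e-5,
)

print("raptor_total = offense + defense:", raptor_adds_up.all())
print("war_total = regular season + playoffs:", war_adds_up.all())
```

The same two comparisons could be run on the full `df_nba` frame to see whether any rows break the pattern.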
# load the latest season (2022-2023)
df_nba2023 = pd.read_csv('latest_RAPTOR_by_team.csv')
df_nba2023.head()
| | player_name | player_id | season | season_type | team | poss | mp | raptor_box_offense | raptor_box_defense | raptor_box_total | ... | raptor_offense | raptor_defense | raptor_total | war_total | war_reg_season | war_playoffs | predator_offense | predator_defense | predator_total | pace_impact |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Precious Achiuwa | achiupr01 | 2023 | RS | TOR | 1784 | 868 | -1.925091 | 0.742752 | -1.182340 | ... | -1.263367 | -0.003950 | -1.267317 | 0.650937 | 0.650937 | 0 | -1.726975 | 0.248174 | -1.478801 | -0.872994 |
1 | Steven Adams | adamsst01 | 2023 | RS | MEM | 2391 | 1133 | -0.783499 | 3.910489 | 3.126990 | ... | 0.382174 | 3.062029 | 3.444203 | 3.586879 | 3.586879 | 0 | -0.004884 | 3.554965 | 3.550081 | 0.177138 |
2 | Bam Adebayo | adebaba01 | 2023 | RS | MIA | 3988 | 1965 | -1.600716 | 3.206394 | 1.605678 | ... | -0.600442 | 3.188671 | 2.588229 | 5.327369 | 5.327369 | 0 | -0.387635 | 3.187949 | 2.800314 | -0.461195 |
3 | Ochai Agbaji | agbajoc01 | 2023 | RS | UTA | 1308 | 610 | -1.102455 | -1.333894 | -2.436349 | ... | -0.934363 | -1.079027 | -2.013390 | 0.228766 | 0.228766 | 0 | -1.207554 | -2.200173 | -3.407727 | -0.212811 |
4 | Santi Aldama | aldamsa01 | 2023 | RS | MEM | 2658 | 1232 | -0.923960 | 0.684572 | -0.239389 | ... | -0.939673 | 0.214642 | -0.725031 | 1.278124 | 1.278124 | 0 | -0.731312 | 1.248535 | 0.517223 | 0.420301 |
5 rows × 23 columns
Since the RAPTOR statistic is relatively new, I am not sure how directly it relates to predicting awards. Also, awards are not determined by numbers alone; other factors matter too, including the eye test, the media, and popularity.
Also, awards are much harder to predict at the beginning of the season, when few games have been played. Each team plays 82 games in a season, so the further into the season we are, the more accurate the prediction would likely be.
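One possible way to guard against small samples is to drop players who have not yet played many minutes before ranking anyone. The sketch below uses five rows copied from the `df_nba2023.head()` output above. The cutoff `MIN_MINUTES` is a hypothetical value chosen only for illustration, and ranking by `war_reg_season` is just one reasonable choice, not a settled part of the method.

```python
import pandas as pd

# Rows copied from the df_nba2023.head() output above.
sample_2023 = pd.DataFrame({
    "player_name": ["Precious Achiuwa", "Steven Adams", "Bam Adebayo",
                    "Ochai Agbaji", "Santi Aldama"],
    "season_type": ["RS"] * 5,
    "mp": [868, 1133, 1965, 610, 1232],
    "war_reg_season": [0.650937, 3.586879, 5.327369, 0.228766, 1.278124],
})

# Hypothetical cutoff: the right threshold depends on how far into the season we are.
MIN_MINUTES = 1000

# Keep regular-season rows with enough minutes, then rank by regular-season WAR.
eligible = sample_2023[
    (sample_2023["season_type"] == "RS") & (sample_2023["mp"] >= MIN_MINUTES)
]
ranked_by_war = eligible.sort_values("war_reg_season", ascending=False)

print(ranked_by_war[["player_name", "mp", "war_reg_season"]])
```

With this cutoff, the two players under 1,000 minutes are excluded and the remaining three are ordered by WAR.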
The method I believe we will use is clustering. Given the RAPTOR stats and other features above, we are trying to find the current player whose stats are most similar to those of previous award winners, in order to predict this year's winner. The dataset does not say who won each award, so we will have to research previous winners separately. I'm not sure this will be the best method, because the stats of award winners vary from year to year.
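As a rough illustration of the "most similar to a previous winner" idea, the sketch below standardizes a few features and ranks candidates by Euclidean distance to the average profile of past winners. Strictly speaking, this is a nearest-neighbor style similarity search rather than a full clustering algorithm. The function name `rank_by_similarity`, the choice of features, and every number in the two frames are assumptions made up for illustration; none of them are real player stats or real award winners.

```python
import numpy as np
import pandas as pd

# Hypothetical feature choice; any numeric columns from the table above could be used.
FEATURES = ["raptor_offense", "raptor_defense", "war_reg_season"]

# Placeholder numbers for illustration only. These are NOT real player stats.
past_winners = pd.DataFrame({
    "player_name": ["Past Winner 1", "Past Winner 2"],
    "raptor_offense": [7.0, 8.0],
    "raptor_defense": [2.0, 3.0],
    "war_reg_season": [18.0, 20.0],
})
candidates = pd.DataFrame({
    "player_name": ["Candidate A", "Candidate B", "Candidate C"],
    "raptor_offense": [7.5, 1.0, 4.0],
    "raptor_defense": [2.5, 4.0, -1.0],
    "war_reg_season": [17.0, 8.0, 10.0],
})


def rank_by_similarity(candidates, past_winners, features):
    """Rank candidates by distance to the average past-winner profile."""
    # Standardize with the candidates' mean and std so no feature dominates by scale.
    mean = candidates[features].mean()
    std = candidates[features].std(ddof=0)
    cand_z = (candidates[features] - mean) / std
    # Average the past winners into one profile, placed on the same scale.
    profile_z = (past_winners[features].mean() - mean) / std
    # Euclidean distance from each candidate to that profile.
    distance = np.sqrt(((cand_z - profile_z) ** 2).sum(axis=1))
    return candidates.assign(distance=distance).sort_values("distance")


ranked = rank_by_similarity(candidates, past_winners, FEATURES)
print(ranked[["player_name", "distance"]])
```

Averaging the winners into a single profile is the simplest option. Because winners' stats vary from year to year, comparing against each past winner separately, or within each season, might work better.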