NBA 2023 Draft Class Success Prediction¶

Motivation:¶

Problem¶

Predicting the success of players in the NBA draft is a problem where data science can provide helpful insights. By analyzing performance data from college players, classification algorithms can help identify players with the potential to become stars. This matters to NBA teams because they invest heavily in draft picks, and their results depend on picking the right players. According to a study by DraftExpress, over the last 20 years NBA teams have spent over $1 billion on draft picks, and only around 50% of those picks have succeeded (1). Using data science to improve the accuracy of draft predictions could therefore bring significant benefits to NBA teams.

(1) Schlosser, K. (2019). An NBA team's draft mistake can cost them $8.3 million, and the league is using AI to help make better decisions. Business Insider.

Solution¶

The goal of this project is to use statistical features of basketball players, such as points per game, field goal percentage, and player efficiency rating, to predict how well draft prospects are likely to perform in the NBA. By analyzing the relationships between these features and a player's later performance, we can identify key factors that contribute to success in the NBA.

Impact¶

If successful, this work on predicting the potential success of NBA draft picks from their performance data could help NBA teams make more informed decisions when selecting players in the draft. This could lead to more successful draft classes, which may translate into more successful seasons for the teams.

However, there could be some negative impacts as well. For example, relying too heavily on machine learning algorithms to make player selections could lead to a lack of diversity in the types of players chosen. This could stifle innovation and creativity in the sport, as teams focus solely on picking players who fit the mold of what the algorithm identifies as successful.

Dataset¶

Detail¶

We will use the 2023 NBA Draft Prospects Stats dataset to observe the following features for each player:

Team: College or international team
GP: Games played
MPG: Minutes per game
PPG: Points per game
FGM: Field goals made
FGA: Field goals attempted
FG%: Field goal percentage
3PM: 3-pointers made
3PA: 3-pointers attempted
3P%: 3-point percentage
FTM: Free throws made
FTA: Free throws attempted
FT%: Free throw percentage
ORB: Offensive rebounds
DRB: Defensive rebounds
RPG: Rebounds per game
APG: Assists per game
SPG: Steals per game
BPG: Blocks per game
TS%: True shooting percentage
eFG%: Effective field goal percentage
ORB%: Offensive rebound percentage
DRB%: Defensive rebound percentage
TRB%: Total rebound percentage
AST%: Assist percentage
TOV%: Turnover percentage
STL%: Steal percentage
BLK%: Block percentage
USG%: Usage percentage
PPR: Pure point rating
PPS: Points per shot
ORtg: Offensive rating
DRtg: Defensive rating
PER: Player efficiency rating
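
Several of the shooting features above are derived from the raw box-score numbers, and a small worked example may make that relationship concrete. The block below is a minimal sketch, assuming the dataset follows the commonly used definitions: eFG% counts a made three as 1.5 made field goals, TS% uses the conventional 0.44 free-throw coefficient, and PPS is points divided by field goal attempts. The function names are illustrative and not part of the dataset; the sample inputs are Adam Flagler's per-game averages from the data preview shown later in this section.

```python
def fg_pct(fgm: float, fga: float) -> float:
    """Field goal percentage: makes divided by attempts."""
    return fgm / fga if fga else 0.0


def efg_pct(fgm: float, tpm: float, fga: float) -> float:
    """Effective FG%: a made three counts as 1.5 made field goals."""
    return (fgm + 0.5 * tpm) / fga if fga else 0.0


def points_per_shot(points: float, fga: float) -> float:
    """Points per shot: points divided by field goal attempts."""
    return points / fga if fga else 0.0


def true_shooting_pct(points: float, fga: float, fta: float) -> float:
    """True shooting %: points / (2 * (FGA + 0.44 * FTA))."""
    denom = 2 * (fga + 0.44 * fta)
    return points / denom if denom else 0.0


# Per-game averages from the preview row: PPG 15.5, FGM 5.2, FGA 12.5, 3PM 2.5
flagler_fg = fg_pct(5.2, 12.5)
flagler_efg = efg_pct(5.2, 2.5, 12.5)
flagler_pps = points_per_shot(15.5, 12.5)

print(f"FG%  {flagler_fg:.3f}")
print(f"eFG% {flagler_efg:.3f}")
print(f"PPS  {flagler_pps:.2f}")
```

Because the inputs are per-game averages rounded to one decimal place, recomputed values can differ slightly from the listed columns. The recomputed FG% here is 0.416 against a listed 0.417, and the recomputed PPS of 1.24 rounds to the listed 1.2.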

Sufficient Data¶

These statistics can provide insight into a player's strengths and weaknesses, allowing us to assess their potential for success in the NBA. For example, a player with a high PPG and a high FG% may be more likely to become a successful scorer in the NBA. Similarly, a player with a high RPG and a high BPG may be more likely to become a successful rebounder and shot blocker. By analyzing these statistics and identifying patterns and trends, we can build models that predict a draft prospect's potential for success in the NBA. The dataset also includes advanced stats, such as PER and TS%, that can support more complex decisions.

In [18]:

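import pandas as pd
# Load the 2023 draft prospect stats (the path is local to the author's environment)
df = pd.read_csv('~/courses/ds2500/nba2023draftclass1.csv')
# Preview the first five rows
df.head()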

Out[18]:

	Player	Team	GP	MPG	PPG	FGM	FGA	FG%	3PM	3PA	...	AST%	TOV%	STL%	BLK%	USG%	PPR	PPS	ORtg	DRtg	PER
0	Adam Flagler	BU	27	33.3	15.5	5.2	12.5	0.417	2.5	6.3	...	29.0	10.9	2.1	0.3	23.5	4.5	1.2	121.6	106.0	19.9
1	Adem Bona	UCLA	27	23.1	8.0	3.3	4.9	0.672	0.0	0.0	...	5.4	17.2	1.7	8.6	16.1	-3.5	1.6	122.0	88.3	20.8
2	Alex Fudge	UF	27	20.0	6.0	2.3	5.4	0.418	0.4	1.4	...	3.7	15.1	1.4	3.5	19.0	-4.3	1.1	94.1	97.1	12.4
3	Amari Bailey	UCLA	21	25.7	10.1	4.3	8.9	0.489	0.7	1.8	...	13.4	19.4	2.4	2.1	23.4	-3.8	1.1	97.4	91.4	15.2
4	Andre Jackson, Jr.	UConn	26	28.6	6.5	2.4	6.1	0.390	0.7	2.6	...	23.8	22.1	2.2	2.2	14.9	2.8	1.1	108.3	94.7	14.2

5 rows × 35 columns

How Data will Solve this Problem¶

To create a classification model, we would first need to define what we mean by "success". This could be based on various criteria such as the number of games played, awards received, or statistics achieved at the professional level. We would then use the data available in our dataset (such as GP, MPG, PPG, etc.) as input features and the success label as the output class label.
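
One way to make the definition of "success" concrete is to write it as a labeling function. The block below is only a sketch under stated assumptions: the `CareerOutcome` record, its fields, and the thresholds `MIN_GAMES` and `MIN_PPG` are hypothetical and do not come from the dataset, and the three example players are invented. Since the 2023 prospects have no professional record yet, labels of this kind would presumably have to come from earlier draft classes.

```python
from dataclasses import dataclass


@dataclass
class CareerOutcome:
    """Hypothetical professional-level record for one drafted player."""
    player: str
    nba_games: int   # NBA games played in the evaluation window
    nba_ppg: float   # NBA points per game in that window
    awards: int      # number of awards received in that window


# Illustrative thresholds only; they are assumptions, not values from the source.
MIN_GAMES = 150
MIN_PPG = 10.0


def success_label(outcome: CareerOutcome) -> int:
    """Return 1 for 'success' and 0 otherwise.

    A player counts as a success if they received any award, or if they
    cleared both the games-played and the scoring threshold.
    """
    if outcome.awards > 0:
        return 1
    return int(outcome.nba_games >= MIN_GAMES and outcome.nba_ppg >= MIN_PPG)


# Invented examples, used only to exercise the function
outcomes = [
    CareerOutcome("Player A", nba_games=220, nba_ppg=14.3, awards=0),
    CareerOutcome("Player B", nba_games=40, nba_ppg=3.1, awards=0),
    CareerOutcome("Player C", nba_games=120, nba_ppg=8.0, awards=1),
]
labels = {o.player: success_label(o) for o in outcomes}
print(labels)
```

Keeping the criteria in one small function makes it easy to try alternative definitions of success and see how sensitive the resulting labels are to the chosen thresholds.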

Next, we would split the data into training and testing sets and train a model on the training data using classification algorithms such as logistic regression or decision trees. We would then evaluate the model on the testing data by computing metrics such as accuracy, precision, and recall.
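
The split, train, and evaluate steps can be sketched with the standard library alone. This is a simplified illustration, not the project's model: it uses a single feature (PER), a one-split decision tree (a "stump"), and synthetic (PER, label) pairs invented for the example. All function names are hypothetical; in practice a library such as scikit-learn would be the more typical choice for each step.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_stump(train):
    """Pick the threshold on the feature that maximizes training accuracy."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in train}):
        correct = sum((x >= threshold) == bool(y) for x, y in train)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def predict(threshold, x):
    """Predict 1 (success) when the feature is at or above the threshold."""
    return int(x >= threshold)


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Synthetic (PER, success label) pairs, invented purely for illustration
data = [
    (11.0, 0), (12.5, 0), (13.7, 0), (14.1, 0), (15.3, 0), (16.0, 0),
    (17.5, 1), (18.4, 1), (19.2, 1), (20.6, 1), (21.1, 1), (22.3, 1),
]

train, test = train_test_split(data, test_fraction=0.25, seed=42)
threshold = fit_stump(train)
y_true = [y for _, y in test]
y_pred = [predict(threshold, x) for x, _ in test]
metrics = evaluate(y_true, y_pred)
print(threshold, metrics)
```

Precision and recall are reported alongside accuracy because they answer different questions: precision measures how many predicted successes were real, while recall measures how many real successes the model found.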

The end result would be a classification model that can predict the success label of a given draft prospect based on their performance data. This model could be used by NBA teams or scouts to identify promising players and make better decisions in the draft.