NBA 2023 Draft Class Success Prediction¶

Motivation¶

Problem¶

Predicting which NBA draft prospects will succeed is a problem where data science can offer useful insight. By analyzing college performance data, classification algorithms can flag players whose statistics suggest star potential. This matters to NBA teams because they invest heavily in drafted players, and their results depend on choosing well. According to a study by DraftExpress, NBA teams have spent over $1 billion on draft picks over the last 20 years, while only around 50% of those picks have succeeded (1). More accurate draft predictions could therefore bring significant benefits to NBA teams.

(1) Schlosser, K. (2019). An NBA team's draft mistake can cost them $8.3 million, and the league is using AI to help make better decisions. Business Insider.

Solution¶

The goal of this project is to use players' statistical features, such as points per game, field goal percentage, and player efficiency rating, to predict how successful they will be in the NBA after the draft. By analyzing the relationships between these features and a player's subsequent performance, we can identify the key factors that contribute to success in the NBA.

Impact¶

If successful, this work could help NBA teams make more informed decisions when selecting players in the draft. Better selections could lead to more successful draft classes, which may translate into more successful seasons for those teams.

However, there could be some negative impacts as well. For example, relying too heavily on machine learning algorithms to make player selections could lead to a lack of diversity in the types of players chosen. This could stifle innovation and creativity in the sport, as teams focus solely on picking players who fit the mold of what the algorithm identifies as successful.

Dataset¶

Detail¶

We will use the 2023 NBA Draft Prospects Stats dataset, which records the following features for each player:

  • Team: College or international team
  • GP: Games played
  • MPG: Minutes per game
  • PPG: Points per game
  • FGM: Field goals made
  • FGA: Field goals attempted
  • FG%: Field goal percentage
  • 3PM: 3-pointers made
  • 3PA: 3-pointers attempted
  • 3P%: 3-point percentage
  • FTM: Free throws made
  • FTA: Free throws attempted
  • FT%: Free throw percentage
  • ORB: Offensive rebounds
  • DRB: Defensive rebounds
  • RPG: Rebounds per game
  • APG: Assists per game
  • SPG: Steals per game
  • BPG: Blocks per game
  • TS%: True shooting percentage
  • eFG%: Effective field goal percentage
  • ORB%: Offensive rebound percentage
  • DRB%: Defensive rebound percentage
  • TRB%: Total rebound percentage
  • AST%: Assist percentage
  • TOV%: Turnover percentage
  • STL%: Steal percentage
  • BLK%: Block percentage
  • USG%: Usage percentage
  • PPR: Pure point rating
  • PPS: Points per shot
  • ORtg: Offensive rating
  • DRtg: Defensive rating
  • PER: Player efficiency rating
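Several of the shooting metrics in this list are simple functions of the box-score columns. As a rough illustration, the sketch below computes eFG%, TS%, and PPS from their commonly published definitions. Whether the dataset's provider used exactly these formulas (for example, the 0.44 free-throw weight in TS%) is an assumption, and the function names are ours rather than anything defined in the dataset.

```python
def effective_fg_pct(fgm: float, fg3m: float, fga: float) -> float:
    """eFG% = (FGM + 0.5 * 3PM) / FGA; a made three counts as 1.5 field goals."""
    return (fgm + 0.5 * fg3m) / fga if fga else 0.0


def true_shooting_pct(pts: float, fga: float, fta: float) -> float:
    """TS% = PTS / (2 * (FGA + 0.44 * FTA)); 0.44 is the conventional weight."""
    attempts = 2 * (fga + 0.44 * fta)
    return pts / attempts if attempts else 0.0


def points_per_shot(pts: float, fga: float) -> float:
    """PPS = PTS / FGA."""
    return pts / fga if fga else 0.0


# Per-game figures from the Adam Flagler row of the dataset:
# PPG = 15.5, FGM = 5.2, FGA = 12.5, 3PM = 2.5
print(round(points_per_shot(15.5, 12.5), 1))       # 1.2, the PPS listed for him
print(round(effective_fg_pct(5.2, 2.5, 12.5), 3))  # 0.516
```

Each function returns 0.0 when a player has no attempts, which avoids a division by zero for prospects with very few shots.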

Sufficient Data¶

These statistics can provide us with insights into a player's strengths and weaknesses, allowing us to assess their potential success in the NBA. For example, a player who has a high PPG and a high FG% may be more likely to become a successful scorer in the NBA. Similarly, a player who has a high RPG and a high BPG may be more likely to become a successful rebounder and shot blocker. By analyzing these statistics and identifying patterns and trends, we can build models that predict a prospect's potential success in the NBA. The dataset also includes advanced statistics, such as PER and TS%, that can support more complex decisions.

In [18]:
import pandas as pd
df = pd.read_csv('~/courses/ds2500/nba2023draftclass1.csv')
df.head()
Out[18]:
   Player              Team   GP   MPG   PPG  FGM   FGA    FG%  3PM  3PA  ...  AST%  TOV%  STL%  BLK%  USG%   PPR  PPS   ORtg   DRtg   PER
0  Adam Flagler        BU     27  33.3  15.5  5.2  12.5  0.417  2.5  6.3  ...  29.0  10.9   2.1   0.3  23.5   4.5  1.2  121.6  106.0  19.9
1  Adem Bona           UCLA   27  23.1   8.0  3.3   4.9  0.672  0.0  0.0  ...   5.4  17.2   1.7   8.6  16.1  -3.5  1.6  122.0   88.3  20.8
2  Alex Fudge          UF     27  20.0   6.0  2.3   5.4  0.418  0.4  1.4  ...   3.7  15.1   1.4   3.5  19.0  -4.3  1.1   94.1   97.1  12.4
3  Amari Bailey        UCLA   21  25.7  10.1  4.3   8.9  0.489  0.7  1.8  ...  13.4  19.4   2.4   2.1  23.4  -3.8  1.1   97.4   91.4  15.2
4  Andre Jackson, Jr.  UConn  26  28.6   6.5  2.4   6.1  0.390  0.7  2.6  ...  23.8  22.1   2.2   2.2  14.9   2.8  1.1  108.3   94.7  14.2

5 rows × 35 columns

How Data will Solve this Problem¶

To create a classification model, we would first need to define what we mean by "success". This could be based on various criteria such as the number of games played, awards received, or statistics achieved at the professional level. We would then use the data available in our dataset (such as GP, MPG, PPG, etc.) as input features and the success label as the output class label.
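One possible way to make "success" concrete is sketched below. The prospects file has no outcome column, so the example uses a tiny made-up table of already-drafted players. The column `nba_games_played`, the 100-game threshold, and every value in the table are hypothetical placeholders, not data or definitions from this project.

```python
import pandas as pd

# Made-up historical rows: college stats plus a hypothetical NBA outcome column.
history = pd.DataFrame({
    "Player": ["Prospect A", "Prospect B", "Prospect C", "Prospect D"],
    "PPG": [16.0, 8.5, 6.2, 11.0],
    "PER": [21.0, 18.5, 12.0, 16.3],
    "nba_games_played": [250, 40, 0, 130],  # hypothetical outcome
})

MIN_GAMES = 100  # assumed threshold for calling a career a "success"

# Binary class label: 1 = success, 0 = otherwise.
history["success"] = (history["nba_games_played"] >= MIN_GAMES).astype(int)

feature_cols = ["PPG", "PER"]  # input features
X = history[feature_cols]
y = history["success"]         # output class label

print(y.tolist())  # [1, 0, 0, 1]
```

A different criterion (awards, professional statistics) would only change how the `success` column is computed; the split into `X` and `y` stays the same.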

Next, we would split the data into training and testing sets and train a model on the training data using classification algorithms such as logistic regression or decision trees. We would then evaluate the model on the testing data by computing evaluation metrics such as accuracy, precision, and recall.
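The split, train, and evaluate steps might look like the following minimal sketch, which uses scikit-learn with logistic regression and a decision tree as two example classifiers. The feature matrix and labels here are randomly generated stand-ins, not the draft data, and the 75/25 split and tree depth are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for (player statistics, success label).
n_players = 200
X = rng.normal(size=(n_players, 3))
noise = rng.normal(scale=0.5, size=n_players)
y = (X[:, 0] + 0.5 * X[:, 2] + noise > 0).astype(int)

# Hold out 25% of the rows for testing, preserving the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "logistic regression": LogisticRegression(),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)     # train on the training set only
    y_pred = model.predict(X_test)  # predict on unseen rows
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
    }
    print(name, results[name])
```

Reporting precision and recall alongside accuracy is useful when successful players are rare, since a model that predicts "not successful" for everyone can still score a high accuracy.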

The end result would be a classification model that can predict the success label of a given draft prospect based on their performance data. This model could be used by NBA teams or scouts to identify promising players and make better decisions in the draft.