Using Machine Learning to Project NBA Three Point Percentage for 2020 NBA Draft Prospects

With the draft less than a week away, I wanted to share a recent project on projecting NBA three point percentage using college statistics.  While the results are by no means perfect or comprehensive, I think they provide an interesting, unbiased perspective on each prospect’s shooting profile.


Questions about how a college player will translate to the NBA often start and end with that player’s ability to shoot at the NBA level.  Some players pan out as NBA shooters despite a thin college shooting resume, while others light it up from beyond the three point line in college but struggle to shoot in the NBA.  Donovan Mitchell shot only 32.9% from three in college, but has gone on to shoot 35.5% from three in the NBA despite a farther line and more difficult shots.  Jayson Tatum shot only 34.2% from three in college but has shown elite shooting ability in the NBA at 40.1% from three.  Malik Monk shot 39.7% from three in college on a lot of attempts, but has shot only 32.2% in the NBA.


The question isn’t just whether fringe shooters will be able to shoot at the NBA level, but also which elite college shooters will become elite NBA shooters.  While Duncan Robinson was certainly a very good college shooter, few argued that he would become the top 10 NBA shooter that he is today.  Someone like Ian Clark, on the other hand, had very impressive college shooting stats but has not shot all that well in the NBA.  Shooting should be possible to predict at least somewhat accurately from college statistics.  Unlike other aspects of basketball, shooting stays fairly similar between college and the NBA.  While the three point line does move back, the step up from college to NBA shooting is not nearly as big as the step up in areas such as athleticism, rim protection or perimeter defense.


Major draft questions this year hinge on shooting ability for a number of highly regarded prospects.  Can Isaac Okoro develop a shot to complement his defensive and passing ability?  Will Tyrese Haliburton’s elite shooting percentages hold up despite his form? Can Onyeka Okongwu move his shooting range out to the three point line?


As an NBA draft fanatic, I’ve always been excited by the possibility of projecting NBA shooting using college stats.  Despite the potential uses of a three point percentage predictor, there is little public work on projecting college shooting to the NBA.  Other articles on this topic have used a limited set of variables and have mainly compared the importance of basic statistics such as college three point percentage, college three point attempts, and college free throw percentage.  There doesn’t seem to be a public attempt to predict NBA three point percentage using any significant amount of past college data.  I collected data from Sports Reference and barttorvik.com through a combination of scraping and data that was sent to me directly (thanks so much to Bart Torvik for sending me a lot of data from his website!).


After all the data was collected (a very painstaking process), I had over 150 columns of variables in RStudio for each NBA player between 2010 and 2019 (as far back as the statistics went).  I’m not going to explain much of my process of transforming the data and finding the best possible regression (if you are interested, I have a more “mathy” explanation here).  The end result was a weighted least squares regression with 9 variables that proved to be consistently statistically significant after data transformations, and a usable regression equation to predict NBA three point percentage.  The weights used in the regression were the z scores of combined NBA and college minutes for each player (meant to give players with a larger sample size more weight in the model).  The adjusted R-squared value for the regression was 0.4704 and the scaled average residual was 0.0355.  This average residual means that the model’s predictions were, on average, within roughly 3.5 to 4 percentage points of a player’s NBA three point percentage.  Because this regression was built with a “prediction” mindset rather than an “inference” mindset, not much can be inferred about the impact of each individual variable on the final three point percentage prediction.  However, I’m confident that the regression has useful predictive value.
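To make the idea of a weighted least squares fit and an average residual concrete, here is a minimal Python sketch.  It is not the code behind this model (that work was done in R with 9 transformed variables).  The two predictors, the player rows, and the minute totals are made-up numbers, and the sketch uses raw minute totals as weights rather than z scores, simply because the weights in this toy solver need to be positive.

```python
def weighted_least_squares(X, y, w):
    """Return [intercept, b1, b2, ...] minimizing sum(w_i * residual_i**2)."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Build the augmented normal equations [X'WX | X'Wy].
    A = []
    for i in range(k):
        lhs = [sum(wi * r[i] * r[j] for r, wi in zip(rows, w)) for j in range(k)]
        rhs = sum(wi * r[i] * yi for r, yi, wi in zip(rows, y, w))
        A.append(lhs + [rhs])
    # Solve them with Gauss-Jordan elimination and partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(A[idx][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def predict(coefs, row):
    """Apply the fitted equation to one player's predictor values."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


def mean_abs_residual(coefs, X, y):
    """Average absolute gap between actual and predicted percentages."""
    return sum(abs(yi - predict(coefs, r)) for r, yi in zip(X, y)) / len(y)


# Hypothetical players: [college FT%, college 3PA in hundreds]
X = [[0.78, 1.2], [0.85, 2.5], [0.70, 0.8], [0.90, 3.1], [0.74, 1.9], [0.82, 2.2]]
y = [0.34, 0.38, 0.31, 0.41, 0.33, 0.37]        # hypothetical NBA 3P%
minutes = [3500, 9000, 1200, 11000, 2500, 6000]  # hypothetical combined minutes

coefs = weighted_least_squares(X, y, minutes)
print("coefficients:", coefs)
print("mean absolute residual:", mean_abs_residual(coefs, X, y))
```

A mean absolute residual of 0.0355 on this scale would correspond to predictions that miss by about 3.5 percentage points on average.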


The 9 statistically significant variables are listed below:

  1. Years played in college (min. 20% of team’s minutes for year to count)

  2. Age in June of draft year

  3. College far two FG percentage (basically mid range FG%, from barttorvik.com)

  4. College three point attempts

  5. College free throw percentage

  6. Bayesian college three point percentage (a modified version of college three point percentage)

  7. College assisted three percentage (percentage of college threes that were assisted)

  8. College dunks made (surprising, but consistently statistically significant; possibly a proxy for a player’s preference for jump shots versus shots near the rim)

  9. College decimal min % (percentage of team’s minutes played while in college)
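The “Bayesian” three point percentage in item 6 is described here only as a modified version of the raw percentage.  One common way to build a statistic like that is to shrink each player’s percentage toward a baseline, so that a hot streak on very few attempts doesn’t look like elite shooting.  The Python sketch below shows that general idea; the baseline of 35% and the 100 pseudo-attempts are placeholder assumptions, not the values used in the model.

```python
def shrunk_three_pct(makes, attempts, prior_pct=0.35, prior_attempts=100):
    """Blend observed shooting with `prior_attempts` imaginary attempts
    taken at `prior_pct` (both values are illustrative assumptions)."""
    return (makes + prior_pct * prior_attempts) / (attempts + prior_attempts)


# Two hypothetical players who both shot 45% from three in college.
low_volume = shrunk_three_pct(makes=9, attempts=20)
high_volume = shrunk_three_pct(makes=270, attempts=600)

# The low-volume shooter is pulled much closer to the 35% baseline.
print("low volume: ", round(low_volume, 3))
print("high volume:", round(high_volume, 3))
```

The more attempts a player has, the less the baseline matters, which is why this kind of adjustment pairs naturally with a separate three point attempts variable.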


Only seasons in which a player logged at least 20% of his team’s total minutes were counted (for example, playing 8 minutes per game in every game of the season would qualify).
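As a rough illustration of that cutoff, the Python sketch below counts qualifying seasons for a made-up player.  It assumes a 40 minute college game and ignores overtime, so a player’s share of minutes is his total minutes divided by 40 times his team’s games; the field names and season lines are hypothetical.

```python
GAME_MINUTES = 40  # regulation length of a college game; overtime is ignored


def minutes_share(season):
    """Fraction of available minutes the player was on the floor."""
    return season["minutes"] / (season["team_games"] * GAME_MINUTES)


def qualifying_seasons(seasons, threshold=0.20):
    """Keep only seasons at or above the minutes-share cutoff."""
    return [s for s in seasons if minutes_share(s) >= threshold]


# Hypothetical three-year college career.
career = [
    {"year": "Fr", "team_games": 30, "minutes": 150},   # 5 mpg, does not count
    {"year": "So", "team_games": 30, "minutes": 240},   # 8 mpg, just qualifies
    {"year": "Jr", "team_games": 32, "minutes": 1000},  # clear rotation player
]

counted = qualifying_seasons(career)
print("years played:", len(counted))
print("seasons counted:", [s["year"] for s in counted])
```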


The full results can be seen here (sorted alphabetically, sorted by height, sorted by age).  The list includes every player on the top 100 draft boards of ESPN and nbadraft.net (I might have missed someone though!)


Some Thoughts on the Results:


Top 5 Most Underrated Shooters


Markus Howard (47.58%)


Howard is not just number 1 in the rankings, but is significantly ahead of everyone else.  Howard has a historic shooting profile, especially for someone so young (he is a full year or more younger than many of the other seniors in the draft).  While his size will certainly be an issue in the NBA, Howard is one of the surest bets as a shooter in draft history.


Justinian Jessup (43.85%)


In the search for the next Duncan Robinson, Jessup might be the most appealing of any 2020 option.  Jessup is 6-7 and can shoot the lights out.  He could be the type of player to slip on draft night but make his name as an undrafted role player in the years to come.


Desmond Bane (41.52%)


3 and D is one of the most common phrases that you will hear in NBA team building, and Bane is arguably the best mix of the two in the draft.  He combines size and defensive versatility with a fantastic shooting profile that could allow him to become an elite NBA role player.


Killian Tillie (40.96%)


Tillie has struggled with injuries over the past few years, but he could be the best bet in this draft to become a stretch 4.  Not only can he shoot, but Tillie also moves very well for his size and has defensive potential in the NBA.


Patrick Williams (34.62%)


While 34.62% might not seem that high, it’s number 1 among all freshmen in this model.  Williams may not shoot that well in his first few years in the league, but there are a lot of indicators that his shot will develop eventually, which pairs well with his small ball 4 skillset.  Williams’s ability to switch onto multiple positions is especially valuable in the modern NBA.



Top 5 Most Overrated Shooters


Saddiq Bey (34.17%)


Bey is thought of by many to be an elite shooting prospect, but the model isn’t that high on his shooting potential.  Personally, I think Williams and Bane are much better bets to be 3 and D role players at the next level.


Devon Dotson (32.63%)


While Dotson starred for Kansas last year, he struggled to shoot consistently throughout his time there.  He definitely has the skillset to run an NBA offense, but will he be able to stay in an NBA rotation if his shooting doesn’t pan out?


Anthony Edwards (29.91%)

Edwards is considered to be one of the top candidates for the first pick in this year’s draft, largely due to his top tier athleticism.  However, his three point projection is very concerning considering his love of off-the-dribble jump shots.  Edwards has bust potential unless he starts replacing some of these contested jump shots with drives to the rim. 


Obi Toppin (28.21%)


Toppin stretched it out to the three point line during his time in college, but the model is not high at all on his chances of continuing that in the NBA.  A big concern is that Toppin is already 22 despite only playing 2 years in college and therefore seems unlikely to improve his shot all that much.  He might end up putting up stats in the NBA due to a lot of shots near the rim, but a PF without a shot or much defensive versatility is not a great fit in the modern NBA.


Isaac Okoro (27.26%)


Okoro is known for his defense and ball handling, so this projection shouldn’t be a surprise to many.  Still, his projection as one of the worst shooters in the draft is definitely a concern for his NBA prospects.  At least he is young and has a long time to improve his skillset.



To be fully transparent, one of the biggest limitations of this model is my limited knowledge as a college Economics major.  Much of the statistical knowledge behind this project was learned through the internet, and I do not profess to be an expert on the methods of Bayesian statistics or machine learning data transformations that were used.


One of the most noticeable problems with this model is that it outputs a predicted career three point percentage rather than a player’s peak three point percentage.  For this reason, a lot of the younger players in the model may seem underrated.  Players who are 19 or 20 years old very rarely shoot well in their early years in the NBA, so those poor early percentages drag down the career percentages of players who enter the league young.  This led to players such as Tyrell Terry or Nico Mannion having quite muted NBA projections.  That’s why the best way to view this data is probably to look at each projection alongside the player’s age or year in college for context.
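A quick worked example shows how much early seasons can weigh on a career number.  The season lines in this Python sketch are invented for a hypothetical player who enters the league at 19 and improves each year; they are not taken from any real player or from the model’s data.

```python
def career_pct(seasons):
    """Career percentage: total makes divided by total attempts."""
    makes = sum(m for m, _ in seasons)
    attempts = sum(a for _, a in seasons)
    return makes / attempts


def peak_pct(seasons):
    """Best single-season percentage."""
    return max(m / a for m, a in seasons)


# (makes, attempts) for four hypothetical seasons, ages 19 through 22.
seasons = [(30, 100), (60, 180), (100, 260), (110, 275)]

print("career:", round(career_pct(seasons), 3))
print("peak:  ", round(peak_pct(seasons), 3))
```

The career figure sits below the player’s best season because the first two years are still in the denominator, which is the same effect that mutes the projections for the youngest prospects.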


The model also does not account well for shot difficulty.  If a player’s shot difficulty changes drastically from college to the NBA, then their three point percentage projection will likely be off.


Basically, the model should be taken less as a literal projection of a player’s NBA three point percentage, and more as a ranking of shooting indicators relative to a player’s age.


Comment if you want to see the model’s projection for a prospect who is not on the list!  I’ll try to reply as quickly as I can run their stats through.

