Posts

Showing posts from November, 2020

Using Machine Learning to Project NBA Three Point Percentage for 2020 NBA Draft Prospects

With the draft less than a week away, I wanted to share a recent project on projecting NBA three-point percentage using college statistics. While the results are by no means perfect or comprehensive, I think they provide an interesting, unbiased perspective on each prospect’s shooting profile. Questions about whether a college player will translate to the NBA often start and end with that player’s ability to shoot at the NBA level. Some players pan out as shooters in the NBA despite a thin college shooting resume, while others light it up from beyond the three-point line in college but struggle to shoot in the NBA. Donovan Mitchell shot only 32.9% from three in college, but has gone on to shoot 35.5% from three in the NBA despite a farther line and more difficult shots. Jayson Tatum shot only 34.2% from three in college but has shown elite shooting ability in the NBA at 40.1% from three. Malik Monk shot 39.7% from three in college on a lot of attempts, but has only shot

NBA Three Point Percentage Model "Mathy" Explanations

Data Transformations

Because the variance of some variables is not constant across the data, I was immediately worried about heteroskedasticity and how it would affect my ability to get accurate coefficients for a regression. I ran a Breusch-Pagan test for heteroskedasticity and got a p-value of 0.0003155, meaning there was significant evidence of heteroskedasticity in the data. Because of this, I had to apply a transformation to the data. I used a Box-Cox transformation, applied with the machine learning function “preProcess”, to try to fix the heteroskedasticity. After the transformation, the Breusch-Pagan p-value was greater than 0.05, so the test no longer found significant evidence of heteroskedasticity. While heteroskedasticity and collinearity are both problems in this data, I was not too worried about their impact on the final regression, because these problems affect the standard errors much more than the coefficients, so they should have little effect on the model’s predictive value. Ba
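To make the test-transform-retest workflow described above concrete, here is a minimal sketch in Python. It is not the code used for this model: the post only names a “preProcess” function, so everything below is a stand-in written for illustration. The data is synthetic, the helper names (ols_fit, breusch_pagan, box_cox) are hypothetical, the test is the studentized (Koenker) form of Breusch-Pagan with a single predictor, and the Box-Cox parameter is fixed by hand at 0 (a log transform) rather than estimated from the data.

```python
import math
import random


def ols_fit(x, y):
    """Least-squares fit of y = a + b*x with one predictor; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(x, y):
    """R^2 of the one-predictor least-squares fit of y on x."""
    a, b = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan(x, y):
    """Studentized Breusch-Pagan test for one predictor.

    Regress the squared residuals on x; LM = n * R^2 is compared to a
    chi-square distribution with 1 degree of freedom.
    """
    a, b = ols_fit(x, y)
    squared_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    lm = max(0.0, len(x) * r_squared(x, squared_resid))
    # Upper tail of chi-square(1): P(Z^2 > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Synthetic data (an assumption, not the draft data): multiplicative noise,
# so the spread of y grows with x and a log transform should stabilize it.
random.seed(0)
x = [random.uniform(0.0, 10.0) for _ in range(1000)]
y = [math.exp(1.0 + 0.2 * xi + random.gauss(0.0, 0.3)) for xi in x]

lm_raw, p_raw = breusch_pagan(x, y)
lm_log, p_log = breusch_pagan(x, box_cox(y, 0.0))  # lambda = 0 chosen by hand

print("raw y:         LM = %.2f, p = %.4g" % (lm_raw, p_raw))
print("Box-Cox (log): LM = %.2f, p = %.4g" % (lm_log, p_log))
```

A small p-value on the raw response followed by a larger one after the transform mirrors the pattern described above. In real use, the Box-Cox parameter would normally be estimated from the data rather than fixed in advance.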

NBA Draft Three Point Percentage Predictions Sorted By Age

Player               Prediction     Age           Years Played   Height
Patrick Williams     0.3462421528   19.25         1              6-8
Anthony Edwards      0.2990822472   19.25         1              6-5
Jahmi'us Ramsey      0.3264036985   19.41666667   1              6-4
Isaiah Stewart       0.3044316999   19.5          1              6-9
Kira Lewis Jr.       0.3458737419   19.58333333   2              6-3
Nico Mannion         0.3218205474   19.66666667   1              6-3
Vernon Carey Jr.     0.3054471794   19.75         1              6-10
Zeke Nnaji           0.3077487821   19.83333333   1              6-11
Isaac Okoro          0.2725895832   19.83333333   1              6-6
Onyeka Okongwu       0.2744041249   19.91666667   1              6-9
Josh Green           0.3130930479   20            1              6-6
Tyrese Maxey         0.3128102623   20            1              6-3
Jaden McDaniels      0.3167327942   20.16666667   1              6-9
Tyrell Terry         0.3401935999   20.16666667   1              6-1
Devin Vassell        0.3757779094   20.25         2              6-7
CJ Elleby            0.3389837428   20.41666667   2              6-6
Cole Anthony         0.3311657447   20.5          1              6-3
Filip Petrusev       0.3497668719   20.58333333   2              6-11
Jalen Smith          0.3089498459   20.66666667   2              6-10
Reggie Perry         0.3267347292   20.66666667   2              6-10
Tyrese Haliburton    0.3637994155   20.75         2              6-5
Tre Jones            0.3344652403   20.83333333   2              6-3
Aaron Nesmith        0.386530