claire

live from Augusta National

Last week I was relishing my last spring break ever and decided to spend some time looking at PGA data, inspired by the upcoming Masters tournament. Thanks to advancedsportsanalytics.com, I was able to download a wealth of PGA data from 2015-2022. The data contains a lot of information, but I was most interested in the strokes gained metric. Strokes gained is a relatively recent statistic that measures a golfer's performance relative to the average (see https://mygolfspy.com/news-opinion/what-is-strokes-gained/ for more!).


If a golfer is gaining strokes off the tee, they are performing better than average off the tee, so their tee shots give them an advantage over what would be expected at baseline. The commonly studied strokes gained categories are off the tee, on approach, around the green and putting.
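To make the idea concrete, here is a minimal Python sketch of strokes gained for a single shot (an illustration, not code from the write-up). It uses the usual per-shot definition: the baseline expected strokes before the shot, minus the baseline expected strokes after it, minus the one stroke just played. The function name and the baseline numbers are hypothetical.

```python
def strokes_gained(expected_before: float, expected_after: float) -> float:
    """Strokes gained on one shot against a baseline.

    expected_before: baseline average strokes to hole out from where the shot was hit.
    expected_after:  baseline average strokes to hole out from where the ball finished
                     (0.0 if the shot was holed).
    The shot itself costs one stroke.
    """
    return expected_before - expected_after - 1.0


# Hypothetical baseline values, for illustration only (not real PGA numbers).
tee_shot = strokes_gained(expected_before=4.1, expected_after=2.8)
approach = strokes_gained(expected_before=2.8, expected_after=1.9)

print(f"off the tee: {tee_shot:+.2f}")
print(f"on approach: {approach:+.2f}")
```

With these made-up numbers the tee shot gains about 0.3 strokes on the baseline and the approach loses about 0.1. Summing shots by category gives the off the tee, approach, around the green and putting totals.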




I looked at a number of metrics (full code and write up here: https://rpubs.com/clairemo/masters) but was mostly interested in assessing who might perform well in this year's Masters. Interestingly, among everyone who has won at least once since 2015, Dustin Johnson has been the most efficient golfer in terms of wins divided by number of appearances (12 wins, 8.4% efficiency) and also has one of the top average total strokes gained. This makes sense, and serves as a proof of concept that strokes gained is a decent benchmark for performance.
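The efficiency figure is just wins divided by appearances. A small Python sketch of that calculation follows (not the code from the write-up); the golfers and counts in `records` are hypothetical placeholders, and only the 12 wins and 8.4% figures come from the post.

```python
def win_efficiency(wins: int, appearances: int) -> float:
    """Share of tournament appearances that ended in a win."""
    if appearances <= 0:
        raise ValueError("appearances must be positive")
    return wins / appearances


# Hypothetical golfers and counts, for illustration only.
records = {
    "Golfer A": (5, 100),
    "Golfer B": (3, 40),
    "Golfer C": (1, 120),
}

# Rank golfers from most to least efficient.
ranked = sorted(records, key=lambda name: win_efficiency(*records[name]), reverse=True)
for name in ranked:
    wins, starts = records[name]
    print(f"{name}: {wins} wins / {starts} starts = {win_efficiency(wins, starts):.1%}")

# Back-of-envelope check on the quoted figure: 12 wins at 8.4% efficiency
# implies roughly this many appearances (8.4% is itself rounded).
implied_appearances = round(12 / 0.084)
print(f"implied appearances: about {implied_appearances}")
```

The last lines work backwards from the quoted numbers: 12 wins at 8.4% corresponds to roughly 143 appearances over the period.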



Next, I ran a simple linear mixed model to assess which strokes gained variable is most predictive of finish, specifically for the Masters.


The model shows that strokes gained in every category is indicative of a better finish (more strokes gained predicts a lower, i.e. better, finishing position), but that for every additional stroke gained on approach (sg app) and off the tee (sg ott), your predicted finish decreases more than for the other categories.
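For readers who want the model written out, one plausible form is sketched below. The post does not give the specification, so the random intercept per golfer $u_i$, the indexing (golfer $i$, observation $j$) and the labels `arg` (around the green) and `putt` are all assumptions made for illustration.

```latex
\text{finish}_{ij} = \beta_0
  + \beta_{\text{ott}}\, sg^{\text{ott}}_{ij}
  + \beta_{\text{app}}\, sg^{\text{app}}_{ij}
  + \beta_{\text{arg}}\, sg^{\text{arg}}_{ij}
  + \beta_{\text{putt}}\, sg^{\text{putt}}_{ij}
  + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this form, a negative coefficient means more strokes gained in that category predicts a lower finishing position, and the result above amounts to $\beta_{\text{app}}$ and $\beta_{\text{ott}}$ being the most negative of the four.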


To assess how this holds up with Masters winners, I looked at 2015-2022 strokes gained for the Masters winners from the last 8 years (Jordan Spieth, Danny Willett, Sergio Garcia, Patrick Reed, Tiger Woods, Dustin Johnson, Hideki Matsuyama and Scottie Scheffler). As you might expect (based on some of the lesser-known winners!), not all of them had great strokes gained when averaged across all 8 years. So instead, I looked at each winner's strokes gained for the year they won.











Basically, what these graphs tell us is that, for the most part, winners were gaining strokes in multiple categories the year they won the Masters. However, Dustin Johnson and Danny Willett had negative strokes gained in at least 2 categories, which suggests that, given how much play varies from tournament to tournament, strokes gained over the year does not necessarily predict the winner. It is worth noting, though, that every one of those winners except Reed and Dustin Johnson had positive strokes gained off the tee and on approach.


To probe this further for this year, I decided to look at who has been gaining the most strokes off the tee and on approach shots. These two plots show the top 30 golfers from the 2022 season who gained the most strokes on approach (along with their strokes gained off the tee), and similarly the top 30 golfers from the 2022 season who gained the most strokes off the tee (along with their strokes gained on approach).
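The top-30 selection behind those two plots can be sketched in a few lines of Python (an illustration, not the code from the write-up). The `SeasonSG` type, the `sg_app` and `sg_ott` field names, and the player rows are hypothetical; the toy example uses a top 2 instead of a top 30 to stay small.

```python
from typing import NamedTuple


class SeasonSG(NamedTuple):
    player: str
    sg_app: float  # season average strokes gained on approach
    sg_ott: float  # season average strokes gained off the tee


def top_n(rows, key, n=30):
    """Return the n rows with the highest value of the given field."""
    return sorted(rows, key=lambda row: getattr(row, key), reverse=True)[:n]


# Hypothetical season averages, for illustration only.
rows = [
    SeasonSG("Player A", sg_app=0.9, sg_ott=0.2),
    SeasonSG("Player B", sg_app=0.4, sg_ott=0.8),
    SeasonSG("Player C", sg_app=0.7, sg_ott=0.6),
    SeasonSG("Player D", sg_app=-0.1, sg_ott=0.1),
]

top_app = top_n(rows, "sg_app", n=2)
top_ott = top_n(rows, "sg_ott", n=2)

# Players who make both lists are the ones strong in both categories.
both = sorted({r.player for r in top_app} & {r.player for r in top_ott})
print("top on approach:", [r.player for r in top_app])
print("top off the tee:", [r.player for r in top_ott])
print("in both lists:  ", both)
```

Intersecting the two lists is one simple way to pick out golfers who rank highly in both categories.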


These graphs show Zalatoris, Rahm, Scheffler, Hovland, Conners and Finau all with high strokes gained off the tee and on approach -- promising for Augusta National. Other notables to watch out for might be Matt Fitzpatrick, Collin Morikawa, Joaquin Niemann or Justin Thomas.


Finally, I wanted to run a machine learning prediction model to try to predict the winner. However, the dataset I downloaded was not ideal for this! It only had strokes gained for the 2022 Masters, with no other years. So essentially I predicted the 2022 Masters results. I used the 2022 US Open, British Open and PGA Championship (only those who made the cut) as a training dataset. To train the model, I had all the strokes gained metrics predict finish (as I did above), and then used the fitted model to predict finish from the 2022 Masters strokes gained.
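The train-then-predict workflow can be sketched as follows. This is a minimal illustration that swaps in plain least-squares regression, which is not necessarily the model used in the write-up. All rows are synthetic: the training finishes are generated from made-up coefficients (`TRUE_COEF`) so the fit can be checked, and the player names and strokes gained values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(features, target):
    """Least-squares coefficients (intercept first) via the normal equations."""
    x = [[1.0] + list(row) for row in features]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, target)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    """Predicted finish for one row of (sg_ott, sg_app, sg_arg, sg_putt)."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Made-up coefficients used only to generate synthetic training finishes.
TRUE_COEF = [30.0, -6.0, -8.0, -3.0, -4.0]

# Synthetic "other majors" rows: (sg_ott, sg_app, sg_arg, sg_putt).
train_x = [
    (1.0, 0.5, 0.2, 0.3), (0.2, 1.2, -0.1, 0.5), (-0.5, 0.3, 0.4, -0.2),
    (0.8, -0.4, 0.1, 0.9), (0.0, 0.9, -0.3, -0.6), (-0.3, -0.7, 0.6, 0.1),
    (0.6, 0.1, -0.5, 0.4),
]
train_y = [predict(TRUE_COEF, row) for row in train_x]

coef = fit_ols(train_x, train_y)

# Synthetic "Masters" rows to score with the fitted model.
masters = {
    "Player X": (0.9, 1.1, 0.3, 0.4),
    "Player Y": (0.2, 0.4, 0.1, 0.0),
    "Player Z": (0.7, 0.8, -0.2, 0.6),
}
predicted = {name: predict(coef, row) for name, row in masters.items()}

# Lowest predicted finish first, i.e. the predicted winner leads the list.
ranking = sorted(predicted, key=predicted.get)
for name in ranking:
    print(f"{name}: predicted finish {predicted[name]:.1f}")
```

Because the synthetic training data has no noise, the fit recovers the made-up coefficients, which is only a sanity check on the code. Real tournament data would be noisy, and, as in the post, the strokes gained being fed in come from the very tournament being predicted.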



Pretty accurate! The model predicted Scottie to go lowest, Collin to go second lowest, and Cam Smith to go third lowest. Too bad we could not do this with this year's data, but hopefully these analyses are somewhat accurate for the 2023 Masters.
