What if Squiggle Used xScore?
Over the past few blogs (here and here) I’ve been investigating different methods for untangling skill from luck in forecasting game margins. In this blog, we’ll try another approach, this time using what are called xScores.
One source of randomness in the AFL is how well a team converts the scoring opportunities it creates into goals versus behinds. Given enough data, analysts far cleverer than I can estimate how often a shot of a particular type taken from a particular point of the field under particular conditions should result in a goal, a behind, or no score at all.
So, we can adjust for that randomness in conversion by replacing the result of every scoring opportunity with the expected score an average player would generate from that opportunity, given its specific characteristics. Summing these expected scores across all of a team’s scoring opportunities in a given game yields an expected score, or xScore, for that team.
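To make that calculation concrete, here’s a minimal sketch in Python of the summation just described. The probability fields (p_goal, p_behind) and their values are purely illustrative assumptions on my part; the actual AFLxScore and Wheelo models are, of course, far more sophisticated.

```python
# A minimal sketch of the xScore calculation described above. The field names
# (p_goal, p_behind) and the probabilities are illustrative assumptions, not
# the actual AFLxScore or Wheelo models.

def shot_xscore(p_goal: float, p_behind: float) -> float:
    """Expected points from a single scoring opportunity."""
    return 6 * p_goal + 1 * p_behind  # a no-score outcome contributes 0 points

def team_xscore(shots: list[dict]) -> float:
    """Sum the expected score over every scoring opportunity for a team."""
    return sum(shot_xscore(s["p_goal"], s["p_behind"]) for s in shots)

# Example: three shots of varying difficulty
shots = [
    {"p_goal": 0.85, "p_behind": 0.10},  # set shot from directly in front
    {"p_goal": 0.30, "p_behind": 0.40},  # snap from the pocket
    {"p_goal": 0.15, "p_behind": 0.25},  # long-range shot under pressure
]
print(round(team_xscore(shots), 2))  # 8.55 expected points
```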
For this blog, I’ll be using the xScores created by Twitter’s @AFLxScore for the years 2017 to 2020, and those created by Twitter’s @WheeloRatings for the years 2021 to 2024.
Let’s look first at the season-by-season Squiggle results when we use the xScore margin, rather than the actual margin, as a game’s margin.
The clear macro takeout from this data is that, without exception, forecasters’ MAEs are smaller when xScores are used to determine game margins than when actual scores are used for the same purpose. The percentage reductions in MAE range from roughly 6% to 15%, and the magnitude of the improvement is less variable across forecasters within a season than it is across seasons for an average forecaster.
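For anyone wanting to replicate that kind of comparison, here’s a minimal sketch of the two MAE calculations, assuming a hypothetical table of one forecaster’s predicted margins alongside the actual and xScore margins for three games. The numbers are invented purely for illustration.

```python
import pandas as pd

# A minimal sketch of the MAE comparison, using a hypothetical per-game table
# of one forecaster's predicted margins plus the actual and xScore margins.
games = pd.DataFrame({
    "predicted_margin": [12.0, -8.0, 25.0],
    "actual_margin":    [20.0, -1.0, 33.0],
    "xscore_margin":    [19.0, -1.5, 32.0],
})

mae_actual = (games["predicted_margin"] - games["actual_margin"]).abs().mean()
mae_xscore = (games["predicted_margin"] - games["xscore_margin"]).abs().mean()

print(f"MAE vs actual margins: {mae_actual:.2f}")           # 7.67
print(f"MAE vs xScore margins: {mae_xscore:.2f}")           # 6.83
print(f"Reduction in MAE: {1 - mae_xscore / mae_actual:.0%}")  # 11%
```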
For the most part, and especially for 2017 to 2020 where we are using AFLxScore data, forecasters’ ranks within a season are fairly similar whether we use the actual margin or an AFLxScore-based margin, but there are some large outliers in the rankings when moving from actual margins to Wheelo xScore-based margins:
2021: Graft moves from 1st to 11th, PlusSixOne from 14th to 9th, and Massey Ratings from 15th to 6th
2022: A number of forecasters move by four spots, The Flag falls five spots, AFL Scorigami rises nine spots, and PlusSixOne rises 10 spots
2023: Again, a number of forecasters move by four spots; Cheap Stats, Massey Ratings, and Stattraction each fall five spots, while AFL Scorigami falls six. Also, FMI climbs nine spots, The Flag 10, and ZaphBot 11 spots.
2024: A year when switching from actual to xScore-based margins would have had huge implications. For example, ZaphBot would’ve fallen from 6th to 21st, Massey Ratings from 8th to 20th, Elo Predicts! from 9th to 15th, The Footycast from 11th to 24th, The Wooden Finger from 16th to 27th, FMI from 17th to 29th, Hyperion from 18th to 23rd, and Graft from 23rd to 30th. Also, Drop Kick Data would have risen from 13th to 4th, AFLalytics from 19th to 12th, Winnable from 21st to 16th, Glicko Ratings from 22nd to 14th, Stattraction from 27th to 17th, footycharts from 28th to 13th, and, most importantly, Matter of Stats from 25th to 9th.
In the final chart we review forecaster ratings using MAEs based on actual margins versus MAEs based on xScore margins.
Using average rank under either metric, Punters, Aggregate, and s10 all do very well, as we would expect, but so too do Wheelo Ratings, Live Ladders, AFLalytics, Matter of Stats, The Arc, and Figuring Footy (though some records are for only one or a few seasons).
CONCLUSION
The broad conclusion of this analysis is that using xScore-based margins rather than actual margins would have resulted in lower MAEs for all of the Squiggle forecasters in every season, but would have had mostly small impacts on overall rankings, though larger impacts in more recent seasons.
The year-by-year average ranking differences are as follows:
2017: 0.9 ladder spots per forecaster
2018: 0.4 ladder spots per forecaster
2019: 0.9 ladder spots per forecaster
2020: 0.9 ladder spots per forecaster
2021: 2.8 ladder spots per forecaster
2022: 2.7 ladder spots per forecaster
2023: 3.2 ladder spots per forecaster
2024: 6.1 ladder spots per forecaster
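As an aside, here’s a minimal sketch of one way a figure like those above might be computed, assuming hypothetical season MAEs for four forecasters under both margin definitions, and interpreting an average ranking difference as the average absolute movement in rank; both the data and that interpretation are my own illustrative assumptions rather than the actual Squiggle figures.

```python
import pandas as pd

# A minimal sketch of an average ranking difference, using hypothetical
# season MAEs for each forecaster under both margin definitions.
season = pd.DataFrame({
    "forecaster": ["A", "B", "C", "D"],
    "mae_actual": [26.1, 26.4, 26.9, 27.3],
    "mae_xscore": [24.5, 23.8, 24.9, 24.2],
})

# Rank 1 = lowest (best) MAE under each margin definition
rank_actual = season["mae_actual"].rank(method="min")
rank_xscore = season["mae_xscore"].rank(method="min")

# Average absolute movement in ladder spots per forecaster
avg_shift = (rank_actual - rank_xscore).abs().mean()
print(f"{avg_shift:.1f} ladder spots per forecaster")  # 1.5
```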
It’ll be fascinating to see what 2025 brings.
As to what the results here mean for individual forecasters, it’s hard to say. My thinking at this point is that, unless you can forecast which teams are more likely to produce actual scores well above or well below their xScores, the results here merely reflect how much the stochastic nature of scoring shot conversion has affected the MAE of a given forecaster.