The Draw's Unbalanced: So What?
In an earlier blog we looked at how each team had fared in the 2010 draw and assessed the relative difficulty of each team's draw by, somewhat crudely, estimating the (weighted) average MARS Rating of the teams they play.
From the point of view of the competition ladder, however, what matters is not the average MARS Rating of the teams played, but how the game-by-game differences in ratings translate into expected competition points.
For example, if team A is rated 1000 and plays teams rated 960 and 1040, both at home, should it expect to earn more or fewer competition points from these games than team B, which plays teams rated 980 and 1020, also both at home? The average MARS Rating of team A's opponents is 1000, the same as the average MARS Rating of team B's opponents but, it turns out, team A can expect to win 1.12 games while team B can expect to win 1.17 games, a team's expected wins being simply the sum of its game-by-game victory probabilities. Not a huge difference, but a difference nonetheless.
To come up with these figures I've needed to create a model that converts ratings differences and venue type into an estimate of the probability of victory. This I did using data on MARS ratings, venue type and game outcomes for all matches across seasons 2006 to 2009 to build what is called a 'binary logit' model.
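For readers who'd like a feel for the mechanics, here's a minimal sketch of how a binary logit model of this kind might be fitted. Everything in it - the file name, the column names and the use of Python's statsmodels package - is my illustrative assumption rather than the actual MAFL code; all it presumes is one row per game carrying each team's MARS Rating, the venue type and the result.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per game from seasons 2006 to 2009 with
#   home_mars, away_mars - each team's MARS Rating going into the game
#                          ("home" meaning the first-named, notional home team)
#   venue_type           - 'home', 'away' or 'neutral' from that team's viewpoint
#   home_win             - 1 if the first-named team won, 0 otherwise
games = pd.read_csv("games_2006_2009.csv")
games["rating_diff"] = games["home_mars"] - games["away_mars"]

# Binary logit: the probability of a win for the first-named team as a
# function of the ratings difference and the venue type.
model = smf.logit("home_win ~ rating_diff + C(venue_type)", data=games).fit()
print(model.summary())

# Convert a ratings difference and a venue type into a victory probability.
example = pd.DataFrame({"rating_diff": [0, 20, 40],
                        "venue_type": ["home", "home", "neutral"]})
print(model.predict(example))
```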
Perhaps the best way to illustrate what this model can provide is to tabulate the victory probabilities it produces across a range of different team ratings and across all three venue types.
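By way of illustration, a grid of this kind could be generated from a fitted model along the lines sketched above; the ratings shown below and the model object are, again, hypothetical.

```python
import itertools

# Score every combination of own rating, opponent rating and venue type.
ratings_range = [940, 960, 980, 1000, 1020, 1040]
venues = ["home", "away", "neutral"]

grid = pd.DataFrame(list(itertools.product(ratings_range, ratings_range, venues)),
                    columns=["own_mars", "opp_mars", "venue_type"])
grid["rating_diff"] = grid["own_mars"] - grid["opp_mars"]
grid["win_prob"] = model.predict(grid)

# One table per venue type: rows are the team's own rating, columns the opponent's.
home_table = grid[grid["venue_type"] == "home"].pivot(
    index="own_mars", columns="opp_mars", values="win_prob")
print(home_table.round(3))
```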
So, according to this model, a team rated 960 playing another team rated 960 at home can expect to win about 62.2% of the time.
Notice that not all 20-point ratings differences are alike. When playing a team rated 960 at home, a team rated 980 has a win probability 14 percentage points higher than one rated 960 (76.2% versus 62.2%). Comparing instead a team rated 1000 with one rated 980, each playing that same 960-rated team at home, the gap in win probability is just 10 percentage points (86.2% versus 76.2%).
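That diminishing return is simply the S-shape of the logistic curve at work: the further a matchup sits from 50%, the less each additional ratings point buys. The snippet below makes the point with an inverse logit whose intercept and slope are purely illustrative values I've back-fitted to roughly reproduce the tabulated home-game figures; they're not the model's actual coefficients.

```python
import math

def win_prob(rating_diff, intercept=0.50, slope=0.0333):
    # Inverse logit: turns a linear score into a victory probability.
    # The intercept (home advantage) and slope are illustrative values only.
    return 1 / (1 + math.exp(-(intercept + slope * rating_diff)))

probs = {diff: win_prob(diff) for diff in (0, 20, 40)}
print({d: round(p, 3) for d, p in probs.items()})  # {0: 0.622, 20: 0.762, 40: 0.862}
print(round(probs[20] - probs[0], 3))   # 0.14 - the first 20 ratings points
print(round(probs[40] - probs[20], 3))  # 0.1  - the next 20 ratings points
```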
There is one superficially odd result lurking in this table: teams with equal ratings playing at a neutral venue don't have the same chance of victory. For example, when two teams rated 960 meet at a neutral venue, the first-named team in the fixture - which is the AFL's notional home team - has a 67.7% probability of victory. It turns out that notional home status is a very important advantage in matches played on neutral grounds, a finding that a few of the MAFL Funds already exploit.
A good question to ask about any statistical model is: how well does it fit the data on which it was built? In this case, one logical measure of fit is the accuracy with which it "predicts" game results. The model I've constructed using the 2006 to 2009 seasons correctly predicts the results of 67.4% of those games.
An even better question to ask of a model is how well it predicts the outcomes of games that were not used in its construction. To answer this I've used the model to retrospectively predict the outcomes of all 185 games in the 2005 season, using each team's MARS Rating at the time the game was played and the venue type. The model correctly predicts 64.1% of those games, a slightly poorer performance than it recorded for the games on which it was constructed, but still very acceptable. It's typical that, much like a fortune-teller, a model will perform less well prognosticating about things it doesn't already know than about things it does; the key issues are how much worse it does and whether that "worse" is still any good. Predicting around 64% of matches is my kind of worse.
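In code terms, that holdout check amounts to something like the following, again assuming the hypothetical model object and column names from the earlier sketch:

```python
# Score the model, fitted on 2006-2009, against the 2005 season, which
# played no part in its construction. Column names remain hypothetical.
games_2005 = pd.read_csv("games_2005.csv")
games_2005["rating_diff"] = games_2005["home_mars"] - games_2005["away_mars"]

predicted_win = model.predict(games_2005) > 0.5
accuracy = (predicted_win == games_2005["home_win"]).mean()
print(f"Out-of-sample accuracy on 2005: {accuracy:.1%}")
```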
So, the model seems to do a reasonable job at predicting outcomes.
Knowing that, we can turn our attention to the upcoming season and answer two far more interesting questions:
1. How many games do we expect each team to win in season 2010 given the actual 2010 draw?
2. How would this differ if each team had a fully balanced (ie all-play-all home and away) draw?
The answers to both questions are in this table:
From it we can find out that the Roos are expected to win 8.40 games this year, which is 0.37 wins more than they could expect if the draw were perfectly balanced, making them the team most favoured by the draw in 2010. This is despite the fact that, based purely on the (weighted) average MARS Rating of their opponents, they have only the 8th-easiest draw. As I alluded to earlier, what really matters is not the average strength of the teams that you meet, it's how you can expect to convert differences in strength into competition points.
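To make that concrete, a team's expected wins under any draw is just the sum of its modelled victory probabilities over the fixtures that draw hands it, so the balanced-draw figure amounts to re-running the same sum over an all-play-all, home-and-away set of fixtures. Here's a rough sketch, with model, ratings, teams and the fixture lists all hypothetical carry-overs from the earlier snippets:

```python
def expected_wins(team, fixtures, ratings):
    """Expected wins for `team`: the sum of its modelled victory probabilities
    over `fixtures`, each a (first_named, opponent, venue_type) tuple."""
    total = 0.0
    for first_named, opponent, venue in fixtures:
        game = pd.DataFrame({"rating_diff": [ratings[first_named] - ratings[opponent]],
                             "venue_type": [venue]})
        p_first = float(model.predict(game)[0])
        if first_named == team:
            total += p_first
        elif opponent == team:
            total += 1 - p_first
    return total

# A fully balanced draw has every team hosting every other team exactly once,
# so each pair meets home and away. With 16 teams that's 30 games per team,
# so the balanced totals presumably need rescaling to a 22-game season before
# being compared with the actual draw.
balanced_draw = [(h, a, "home") for h in teams for a in teams if h != a]
actual_total = expected_wins("Roos", actual_2010_fixtures, ratings)
balanced_total = expected_wins("Roos", balanced_draw, ratings)
```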
Based on MARS' projections using the actual draw, the Roos are predicted to finish 11th. If, instead, a balanced draw was employed, they'd be expected to finish 12th.
What's surprising to me about this table is how few teams are significantly affected by the imbalance in the draw in terms of ladder positions. The Roos are gifted one spot, as are Port Adelaide and West Coast, and the only team to drop position is Essendon, who drop 3 spots. The final eight is unaffected both in make-up and in ordering.
This is because, it seems, the abilities of the teams are so evenly spread that the random imposition of a half-game penalty or the windfall of a half-game bonus here and there, courtesy of the draw, is insufficient to upset the underlying order. Indeed, if you look at the teams' MARS Ratings, the smallest gap is about half a ratings point and it's between Port Adelaide and the Roos, both of which are advantaged by the draw to the tune of about 0.37 games. As such, the underlying team ratings prevail and the higher-rated Roos are projected to finish above Port Adelaide even without a balanced draw.
Essendon's draw is so bad and the Roos' is so good that, combined, they swamp the 5-point MARS Rating difference between the two teams, dropping Essendon's projected ladder finish below the Roos'.
I still don't like the imbalanced draw but I have to admit that, for this season at least, it doesn't appear that the draw will have a significant effect on who plays in September.