The Bias in Line Betting Revisited

Some blogs almost write themselves. This hasn't been one of them.

It all started when I read a journal article - to which I'd now link if I could find the darned thing again - that suggested a bias in NFL (or maybe it was College Football) spread betting markets arising from bookmakers' tendency to over-correct when a team won on line betting. The authors found that after a team won on line betting one week it was less likely to win on line betting next week because it was forced to overcome too large a handicap.

Naturally, I wondered if this was also true of AFL spread betting.

What Makes a Team's Start Vary from Week-to-Week?

In the process of investigating that question, I wound up wondering about the process of setting handicaps in the first place and what causes a team's handicap to change from one week to the next.

Logically, I reckoned, the start that a team receives could be described by this equation:

Start received by Team A (playing Team B) = (Quality of Team B - Quality of Team A) - Home Status for Team A

In words, the start that a team gets is a function of its quality relative to its opponent's (measured in points) and whether or not it's playing at home. The stronger a team's opponent, the larger will be the start, and there'll be a deduction if the team's playing at home. This formulation of start assumes that game venue only ever has a positive effect on one team and not an additional, negative effect on the other: it excludes the possibility that a side might be a P-point worse side whenever it plays away from home.

With that as the equation for the start that a team receives, the change in that start from one week to the next can be written as:

Change in Start received by Team A = (Quality of this week's opponent - Quality of last week's opponent) - Change in Quality of Team A - Change in Home Status for Team A

To use this equation we need to come up with proxies for as many of the terms as we can. Firstly then, what might a bookie use to reassess the quality of a particular team? An obvious choice is the performance of that team in the previous week relative to the bookie's expectations - which is exactly what the handicap-adjusted margin (HAM) for the previous week measures.

Next, we could define the change in home status as follows:

  • Change in home status = +1 if a team is playing at home this week and played away or at a neutral venue in the previous week
  • Change in home status = -1 if a team is playing away or at a neutral venue this week and played at home in the previous week
  • Change in home status = 0 otherwise

This formulation implies that there's no difference between playing away and playing at a neutral venue. Varying this assumption is something that I might explore in a future blog.
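For concreteness, here's a minimal R sketch of that encoding (the venue codes are hypothetical stand-ins; the actual data preparation isn't shown in this blog):

```r
# Minimal sketch of the change-in-home-status encoding described above.
# The venue codes ("home", "away", "neutral") are hypothetical stand-ins.
change_in_home_status <- function(this_week, last_week) {
  ifelse(this_week == "home" & last_week != "home", +1,
         ifelse(this_week != "home" & last_week == "home", -1, 0))
}

change_in_home_status("home", "away")     # +1
change_in_home_status("neutral", "home")  # -1
change_in_home_status("away", "neutral")  #  0
```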

From Theory to Practicality: Fitting a Model

(well, actually there's a bit more theory too ...)

Having identified a way to quantify the change in a team's quality and the change in its home status we can now run a linear regression in which, for simplicity, I've further assumed that home ground advantage is the same for every team.

We get the following result using all home-and-away data for seasons 2006 to 2010:

For a team (designated to be) playing at home in the current week:

Change in start = -2.453 - 0.072 x HAM in Previous Week - 8.241 x Change in Home Status

For a team (designated to be) playing away in the current week:

Change in start = 3.035 - 0.155 x HAM in Previous Week - 8.241 x Change in Home Status

These equations explain about 15.7% of the variability in the change in start and all of the coefficients (except the intercept) are statistically significant at the 1% level or higher.
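If you're curious about the mechanics, a model of this form needs only a single call to R's lm function. Below is a sketch with simulated stand-in data (the real dataset isn't reproduced here): interacting the home-team indicator with the previous week's HAM is what yields separate home and away equations, while the lone change-in-home-status term yields the common coefficient.

```r
# Sketch only: toy data standing in for the real 2006 to 2010 dataset.
set.seed(1)
starts <- data.frame(
  change_in_start       = rnorm(200, 0, 12),
  prev_ham              = rnorm(200, 0, 37),
  change_in_home_status = sample(c(-1, 0, 1), 200, replace = TRUE),
  is_home               = rep(c(1, 0), 100)  # 1 = designated home team this week
)

# is_home * prev_ham gives separate intercepts and HAM slopes for home and
# away teams; change_in_home_status enters with a single, common coefficient.
fit <- lm(change_in_start ~ is_home * prev_ham + change_in_home_status,
          data = starts)
summary(fit)  # on the real data: R-squared of about 15.7%, coefficients as quoted
```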

(You might notice that I've not included any variable to capture the change in opponent quality. Doubtless that variable would explain a significant proportion of the otherwise unexplained variability in change in start, but it suffers from the incurable defect of being unmeasurable, for previous and for future games alike. That renders it not especially useful for model fitting or for forecasting purposes.

Whilst that's a shame from the point of view of better modelling the change in teams' start from week to week, the good news is that leaving this variable out almost certainly doesn't distort the coefficients for the variables that we have included. Technically, the potential problem we have in leaving out a measure of the change in opponent quality is what's called omitted variable bias, but such bias disappears if the variables we have included are uncorrelated with the one we've omitted. I think we can make a reasonable case that the difference in the quality of successive opponents is unlikely to be correlated with a team's HAM in the previous week, and is also unlikely to be correlated with the change in a team's home status.)

Using these equations and historical home status and HAM data, we can calculate that the average (designated) home team receives 8 fewer points start than it did in the previous week, and the average (designated) away team receives 8 points more.

All of which Means Exactly What Now?

Okay, so what do these equations tell us?

Firstly, let's consider teams playing at home in the current week. The nature of the AFL draw is such that it's more likely than not that a team playing at home in one week played away in the previous week, in which case the Change in Home Status for that team will be +1 and their equation can be rewritten as

Change in Start = -10.694 - 0.072 x HAM in Previous Week

So, the start for a home team will tend to drop by about 10.7 points relative to the start it received in the previous week (because they're at home this week), plus about another 1 point for every 14 points higher their HAM was in the previous week (1/0.072 being about 13.9). Remember: the more positive the HAM, the larger the margin by which the spread was covered.

Next, let's look at teams playing away in the current week. For them it's more likely than not that they played at home in the previous week, in which case the Change in Home Status will be -1 for them and their equation can be rewritten as

Change in Start = 11.276 - 0.155 x HAM in Previous Week

Their start, therefore, will tend to increase by about 11.3 points relative to the previous week (because they're away this week), less 1 point for every 6.5 points higher their HAM was in the previous week.

Away teams, therefore, are penalised more heavily for larger HAMs than are home teams.

This I offer as one source of potential bias, similar to the bias that was found in the original article I read.

Proving the Bias

As a simple way of quantifying any bias I've fitted what's called a binary logit to estimate the following model:

Probability of Winning on Line Betting = f(Result on Line Betting in Previous Week, Start Received, Home Team Status)

This model will detect any bias in line betting results that's due to an over-reaction to the previous week's line betting results, a tendency for teams receiving starts of a particular size to win or lose too often, or a team's home team status.

The result is as follows:

logit(Probability of Winning on Line Betting) = -0.0269 + 0.054 x Previous Line Result + 0.001 x Start Received + 0.124 x Home Team Status

The only coefficient that's statistically significant in that equation is the one on Home Team Status and it's significant at the 5% level. This coefficient is positive, which implies that home teams win on line betting more often than they should.

Using this equation we can quantify how much more often. An away team, we find, has about a 46% probability of winning on line betting, a team playing at a neutral venue has about a 49% probability, and a team playing at home has about a 52% probability.
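If you'd like to replicate the arithmetic, converting the logit into probabilities is a one-liner in R (plogis is the inverse logit; here I'm holding the other two predictors at zero):

```r
# Probability of winning on line betting by home team status
# (-1 = away, 0 = neutral venue, +1 = home), other predictors at zero.
plogis(-0.0269 + 0.124 * c(-1, 0, 1))
# ~0.46 (away), ~0.49 (neutral), ~0.52 (home), as quoted above

# The model itself would be fitted with something like this (the data frame
# and its column names are hypothetical):
# glm(line_win ~ prev_line_result + start_received + home_status,
#     family = binomial, data = games)
```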

That is undoubtedly a bias, but I have two pieces of bad news about it. Firstly, it's not large enough to overcome the vig on line betting at $1.90 and secondly, it disappeared in 2010.

Do Margins Behave Like Starts?

We now know something about how the points start given by the TAB Sportsbet bookie responds to a team's change in estimated quality and to a change in a team's home status. Do the actual game margins respond similarly?

One way to find this out is to use exactly the same equation as we used above, replacing Change in Start with Change in Margin and defining the change in a team's margin as its margin of victory this week less its margin of victory last week (where victory margins are negative for losses).

If we do that and run the new regression model, we get the following:

For a team (designated to be) playing at home in the current week:

Change in Margin = 4.058 - 0.865 x HAM in Previous Week + 8.801 x Change in Home Status

For a team (designated to be) playing away in the current week:

Change in Margin = -4.571 - 0.865 x HAM in Previous Week + 8.801 x Change in Home Status

These equations explain an impressive 38.7% of the variability in the change in margin. We can simplify them, as we did for the regression equations for Change in Start, by using the fact that the draw tends to alternate teams' home and away status from one week to the next.
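As a quick check of the arithmetic:

```r
# Recovering the simplified intercepts quoted below by substituting
# Change in Home Status = +1 (home this week) or -1 (away this week):
4.058 + 8.801 * +1   # home-team intercept: 12.859
-4.571 + 8.801 * -1  # away-team intercept: -13.372
```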

So, for home teams:

Change in Margin = 12.859 - 0.865 x HAM in Previous Week

While, for away teams:

Change in Margin = -13.372 - 0.865 x HAM in Previous Week

At first blush it seems a little surprising that a team's HAM in the previous week is negatively correlated with its change in margin. Why should that be the case?

It's a phenomenon that we've discussed before: regression to the mean. What these equations are saying is that teams that perform better than expected in one week - where expectation is measured relative to the line betting handicap - are likely to win by slightly less than they did in the previous week, or lose by slightly more.

What's particularly interesting is that home teams and away teams show mean regression to the same degree. The TAB Sportsbet bookie, however, hasn't always behaved as if this was the case.

Another Approach to the Source of the Bias

Bringing the Change in Start and Change in Margin equations together provides another window into the home team bias.

The simplified equations for Change in Start were:

Home Teams: Change in Start = -10.694 - 0.072 x HAM in Previous Week

Away Teams: Change in Start = 11.276 - 0.155 x HAM in Previous Week

So, for teams whose previous HAM was around zero (which is what the average HAM should be), the typical change in start will be around 11 points - a reduction for home teams, and an increase for away teams.

The simplified equations for Change in Margin were:

Home Teams: Change in Margin = 12.859 - 0.865 x HAM in Previous Week

Away Teams: Change in Margin = -13.372 - 0.865 x HAM in Previous Week

So, for teams whose previous HAM was around zero, the typical change in margin will be around 13 points - an increase for home teams, and a decrease for away teams.

Overall the 11-point v 13-point disparity favours home teams since they enjoy the larger margin increase relative to the smaller decrease in start, and it disfavours away teams since they suffer a larger margin decrease relative to the smaller increase in start.

To Conclude

Historically, home teams win on line betting more often than away teams. That means home teams tend to receive too much start and away teams too little.

I've offered two possible reasons for this:

  1. Away teams suffer larger reductions in their handicaps for a given previous week's HAM.
  2. For teams with near-zero previous week HAMs, starts only adjust by about 11 points when a team's home status changes but margins change by about 13 points. This favours home teams because the increase in their expected margin exceeds the expected decrease in their start, and works against away teams for the opposite reason.

If you've made it this far, my sincere thanks. I reckon your brain's earned a spell; mine certainly has.

Grand Final History: A Look at Ladder Positions

Across the 111 Grand Finals in VFL/AFL history - excluding the two replays - only 18 of them, or about 1-in-6, have seen the team finishing 1st on the home-and-away ladder play the team finishing 3rd.

This year, of course, will be the nineteenth.

Far more common, as you'd expect, has been a matchup between the teams from 1st and 2nd on the ladder. This pairing accounts for 56 Grand Finals, which is a smidgeon over half, and has been so frequent partly because of the benefits accorded to teams finishing in these positions by the various finals systems that have been in use, and partly no doubt because these two teams have tended to be the best two teams.

2010 - Grand Final Results by Ladder Position.png

In the 18 Grand Finals to date that have involved the teams from 1st and 3rd, the minor premier has an 11-7 record, which represents a 61% success rate. This is only slightly better than the minor premiers' record against teams coming 2nd, which is 33-23 or about 59%.

Overall, the minor premiers have missed only 13 of the Grand Finals and have won 62% of those they've been in.

By comparison, teams finishing 2nd have appeared in 68 Grand Finals (61%) and won 44% of them. In only 12 of those 68 appearances have they faced a team from lower on the ladder; their record for these games is 7-5, or 58%.

Teams from 3rd and 4th positions have each made about the same number of appearances, winning a spot about 1 year in 4. Whilst their rates of appearance are very similar, their success rates are vastly different, with teams from 3rd winning 46% of the Grand Finals they've made, and those from 4th winning only 27% of them.

That means that teams from 3rd have a better record than teams from 2nd, largely because teams from 3rd have faced teams other than the minor premier in 25% of their Grand Final appearances whereas teams from 2nd have found themselves in this situation for only 18% of their Grand Final appearances.

Ladder positions 5 and 6 have provided only 6 Grand Finalists between them, and only 2 Flags. Surprisingly, both wins have been against minor premiers - in 1998, when 5th-placed Adelaide beat North Melbourne, and in 1900 when 6th-placed Melbourne defeated Fitzroy. (Note that the finals systems have, especially in the early days of footy, been fairly complex, so not all 6ths are created equal.)

One conclusion I'd draw from the table above is that ladder position is important, but only mildly so, in predicting the winner of the Grand Final. For example, only 69 of the 111 Grand Finals, or about 62%, have been won by the team finishing higher on the ladder.

It turns out that ladder position - or, more correctly, the difference in ladder position between the two grand finalists - is also a very poor predictor of the margin in the Grand Final.

2010 - Grand Final Results by Ladder Position - Chart.png

This chart shows that there is a slight increase in the difference between the expected number of points that the higher-placed team will score relative to the lower-placed team as the gap in their respective ladder positions increases, but it's only half a goal per ladder position.

What's more, this difference explains only about half of one per cent of the variability in that margin.

Perhaps, I thought, more recent history would show a stronger link between ladder position difference and margin.

2010 - Grand Final Results by Ladder Position - Chart 2.png

Quite the contrary, it transpires. Looking just at the last 20 years, an increase in the difference of 1 ladder position has been worth only 1.7 points in increased expected margin.

Come the Grand Final, it seems, some of your pedigree follows you onto the park, but much of it wanders off for a good bark and a long lie down.

Just Because You're Stable, Doesn't Mean You're Normal

As so many traders discovered to their individual and often, regrettably, our collective cost over the past few years, betting against longshots, deliberately or implicitly, can be a very lucrative gig until an event you thought was a once-in-a-virtually-never affair crops up a couple of times in a week. And then a few more times again after that.

To put a footballing context on the topic, let's imagine that a friend puts the following proposition bet to you: if none of the first 100 home-and-away games next season includes one with a handicap-adjusted margin (HAM) for the home team of -150 or less he'll pay you $100; if there is one or more games with a HAM of -150 or less, however, you pay him $10,000.

For clarity, by "handicap-adjusted margin" I mean the number that you get if you subtract the away team's score from the home team's score and then add the home team's handicap. So, for example, if the home team was a 10.5 point favourite but lost 100-75, then the handicap adjusted margin would be 75-100-10.5, or -35.5 points.

A First Assessment

At first blush, does the bet seem fair?

We might start by relying on the availability heuristic and ask ourselves how often we can recall a game that might have produced a HAM of -150 or less. To make that a tad more tangible, how often can you recall a team losing by more than 150 points when it was roughly an equal favourite or by, say, 175 points when it was a 25-point underdog?

Almost never, I'd venture. So, offering 100/1 odds about this outcome occurring once or more in 100 games probably seems attractive.

Ahem ... the data?

Maybe you're a little more empirical than that and you'd like to know something about the history of HAMs. Well, since 2006, which is a period covering just under 1,000 games and that spans the entire extent - the whole hog, if you will - of my HAM data, there's never been a HAM under -150.

One game produced a -143.5 HAM; the next lowest after that was -113.5. Clearly then, the HAM of -143.5 was an outlier, and we'd need to see another couple of scoring shots on top of that effort in order to crack the -150 mark. That seems unlikely.

In short, we've never witnessed a HAM of -150 or less in about 1,000 games. On that basis, the bet's still looking good.

But didn't you once tell me that HAMs were Normal?

Before we commit ourselves to the bet, let's consider what else we know about HAMs.

Previously, I've claimed that HAMs seemed to follow a normal distribution and, in fact, the HAM data comfortably passes the Kolmogorov-Smirnov test of Normality (one of the few statistical tests I can think of that shares at least part of its name with the founder of a distillery).

Now technically the HAM data's passing this test means only that we can't reject the null hypothesis that it follows a Normal distribution, not that we can positively assert that it does. But given the ubiquity of the Normal distribution, that's enough prima facie evidence to proceed down this path of enquiry.

To do that we need to calculate a couple of summary statistics for the HAM data. Firstly, we need to calculate the mean, which is +2.32 points, and then we need to calculate the standard deviation, which is 36.97 points. A HAM of -150 therefore represents an event approximately 4.12 standard deviations from the mean.

If HAMs are Normal, that's certainly a once-in-a-very-long-time event. Specifically, it's an event we should expect to see only about every 52,788 games, which, to put it in some context, is almost exactly 300 times the length of the 2010 home-and-away season.

With a numerical estimate of the likelihood of seeing one such game we can proceed to calculate the likelihood of seeing one or more such game within the span of 100 games. The calculation is 1-(1-1/52,788)^100 or 0.19%, which is about 525/1 odds. At those odds you should expect to pay out that $10,000 about 1 time in 526, and collect that $100 on the 525 other occasions, which gives you an expected profit of $80.81 every time you take the bet.
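For anyone wanting to check my sums, the entire calculation takes a few lines of R (small differences from the figures above are just rounding):

```r
p_game <- pnorm(-150, mean = 2.32, sd = 36.97)  # ~1.9e-05: about 1 in 53,000
p_bet  <- 1 - (1 - p_game)^100                  # >= 1 such HAM in 100 games
profit <- 100 * (1 - p_bet) - 10000 * p_bet     # expected profit on the bet
c(p_game = p_game, p_bet = p_bet, profit = profit)
```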

That still looks like a good deal.

Does my tail look fat in this?

This latest estimate carries all the trappings of statistical soundness, but it does hinge on the faith we're putting in that 1-in-52,788 estimate, which, in turn, hinges on our faith that HAMs are Normal. In the current instance this faith needs to hold not just in the range of HAMs that we see for most games - somewhere in the -30 to +30 range - but way out in the arctic regions of the distribution rarely seen by man, the part of the distribution that is technically called the 'tails'.

There are a variety of phenomena that can be perfectly adequately modelled by a Normal distribution for most of their range - financial returns are a good example - but that exhibit what are called 'fat tails', which means that extreme values occur more often than we would expect if the phenomenon faithfully followed a Normal distribution across its entire range of potential values. For most purposes 'fat tails' are statistically vestigial in their effect - they're an irrelevance. But when you're worried about extreme events, as we are in our proposition bet, they matter a great deal.

A class of distributions that don't get a lot of press - probably because the branding committee that named them clearly had no idea - but that are ideal for modelling data that might have fat tails are the Stable Distributions. They include the Normal Distribution as a special case - Normal by name, but abnormal within its family.

If we fit (using Maximum Likelihood Estimation if you're curious) a Stable Distribution to the HAM data we find that the best fit corresponds to a distribution that's almost Normal, but isn't quite. The apparently small difference in the distributional assumption - so small that I abandoned any hope of illustrating the difference with a chart - makes a huge difference in our estimate of the probability of losing the bet. Using the best fitted Stable Distribution, we'd now expect to see a HAM of -150 or lower about 1 game in every 1,578 which makes the likelihood of paying out that $10,000 about 7%.
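For the technically inclined, here's roughly how that fitting step might look in R, assuming the fBasics package for the estimation and stabledist for tail probabilities. The hams vector below is simulated purely so the sketch runs; the real calculation, of course, needs the actual HAM history.

```r
library(fBasics)     # stableFit(): quantile-based and ML fitting of stable laws
library(stabledist)  # pstable(): stable distribution CDF

# Simulated stand-in for the ~1,000 historical HAMs (illustration only).
set.seed(1)
hams <- rnorm(1000, mean = 2.32, sd = 36.97)

fit <- stableFit(hams, type = "mle", doplot = FALSE)
print(fit)  # read off the fitted alpha, beta, gamma and delta

# Then evaluate the lower tail at -150 under the fitted law, for example:
# pstable(-150, alpha = ..., beta = ..., gamma = ..., delta = ...)
```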

Suddenly, our seemingly attractive wager has a -$607 expectation.

Since we almost saw - if that makes any sense - a HAM of -150 in our sample of just under 1,000 games, there's some intuitive appeal in the Stable Distribution's estimate, which is only a bit smaller than 1 in 1,000, and rather less appeal in the Normal approximation's, which is more than fifty times smaller again.

Is there any practically robust way to decide whether HAMs truly follow a Normal distribution or a Stable Distribution? Given the sample that we have, not in the part of the distribution that matters to us in this instance: the tails. We'd need a sample many times larger than the one we have in order to estimate the true probability to an acceptably high level of certainty, and by then would we still trust what we'd learned from games that were decades, possibly centuries old?

Is There a Lesson in There Somewhere?

The issue here, and what inspired me to write this blog, is the oft-neglected truism - an observation that I've read and heard Nassim Taleb of "Black Swan" fame make on a number of occasions - that rare events are, well, rare, and so estimating their likelihood is inherently difficult and, if you've a significant interest in the outcome, dangerous - financially or otherwise.

For many very rare events we simply don't have sufficiently large or lengthy datasets on which to base robust probability estimates for those events. Even where we do have large datasets we still need to justify a belief that the past can serve as a reasonable indicator of the future.

What if, for example, the Gold Coast team prove to be particularly awful next year and get thumped regularly and mercilessly by teams of the Cats' and the Pies' pedigrees? How good would you feel then about betting against a -150 HAM?

So when some group or other tells you that a potential catastrophe is a 1-in-100,000 year event, ask them what empirical basis they have for claiming this. And don't bet too much on the fact that they're right.

Which Teams Are Most Likely to Make Next Year's Finals?

I had a little time on a flight back to Sydney from Melbourne last Friday night to contemplate life's abiding truths. So naturally I wondered: how likely is it that a team finishing in ladder position X at the end of one season makes the finals in the subsequent season?

Here's the result for seasons 2000 to 2010, during which the AFL has always had a final 8:

2010 - Probability of Making the Finals by Ladder Position.png

When you bear in mind that half of the 16 teams have played finals in each season since 2000 this table is pretty eye-opening. It suggests that the only teams that can legitimately feel themselves to be better-than-random chances for a finals berth in the subsequent year are those that have finished in the top 4 ladder positions in the immediately preceding season. Historically, top 4 teams have made the 8 in the next year about 70% of the time - 100% of the time in the case of the team that takes the minor premiership.

In comparison, teams finishing 5th through 14th have, empirically, had roughly a 50% chance of making the finals in the subsequent year (actually, a tick under this, which makes them all slightly less than random chances to make the 8).

Teams occupying 15th and 16th have had very remote chances of playing finals in the subsequent season. Only one team from those positions - Collingwood, who finished 15th in 2005 and played finals in 2006 - has made the subsequent year's top 8.

Of course, next year we have another team, so that's even worse news for those teams that finished out of the top 4 this year.

Coast-to-Coast Blowouts: Who's Responsible and When Do They Strike?

Previously, I created a Game Typology for home-and-away fixtures and then went on to use that typology to characterise whole seasons and eras.

In this blog we'll use that typology to investigate the winning and losing tendencies of individual teams and to consider how the mix of different game types varies as the home-and-away season progresses.

First, let's look at the game type profile of each team's victories and losses in season 2010.

2010 - Game Type by Team 2010.png

Five teams made a habit of recording Coast-to-Coast Comfortably victories this season - Carlton, Collingwood, Geelong, Sydney and the Western Bulldogs - all of them finalists, and all of them winning in this fashion at least 5 times during the season.

Two other finalists, Hawthorn and the Saints, were masters of the Coast-to-Coast Nail-Biter. They, along with Port Adelaide, registered four or more of this type of win.

Of the six other game types there were only two that any single team recorded on 4 occasions. The Roos managed four Quarter 2 Press Light victories, and Geelong had four wins categorised as Quarter 3 Press victories.

Looking next at loss typology, we find six teams specialising in Coast-to-Coast Comfortably losses. One of them is Carlton, who also appeared on the list of teams specialising in wins of this variety, reinforcing the point that I made in an earlier blog about the Blues' fate often being determined in 2010 by their 1st quarter performance.

The other teams on the list of frequent Coast-to-Coast Comfortably losers are, unsurprisingly, those from positions 13 through 16 on the final ladder, and the Roos. They finished 9th on the ladder but recorded a paltry 87.4 percentage, this the logical consequence of all those Coast-to-Coast Comfortably losses.

Collingwood and Hawthorn each managed four losses labelled Coast-to-Coast Nail-Biters, and West Coast lost four encounters that were Quarter 2 Press Lights, and four more that were 2nd-Half Revivals where they weren't doing the reviving.

With only 22 games to consider for each team it's hard to get much of a read on general tendencies. So let's increase the sample by an order of magnitude and go back over the previous 10 seasons.

2010 - Game Type by Team 2001-2010.png

Adelaide's wins have come disproportionately often from presses in the 1st or 2nd quarters and relatively rarely from 2nd-Half Revivals or Coast-to-Coast results. They've had more than their expected share of losses of type Q2 Press Light, but less than their share of Q1 Press and Coast-to-Coast losses. In particular, they've suffered few Coast-to-Coast Blowout losses.

Brisbane have recorded an excess of Coast-to-Coast Comfortably and Blowout victories and less Q1 Press, Q3 Press and Coast-to-Coast Nail-Biters than might be expected. No game type has featured disproportionately more often amongst their losses, but they have had relatively few Q2 Press and Q3 Press losses.

Carlton has specialised in the Q2 Press victory type and has, relatively speaking, shunned Q3 Press and Coast-to-Coast Blowout victories. Their losses also include a disproportionately high number of Q2 Press losses, which suggests that, over the broader time horizon of a decade, Carlton's fate has been more about how they've performed in the 2nd term. Carlton have also suffered a disproportionately high share of Coast-to-Coast Blowouts - which is, I suppose, what a Q2 Press loss might become if it gets ugly - yet have racked up fewer than the expected number of Coast-to-Coast Nail-Biters and Coast-to-Coast Comfortablys. If you're going to lose Coast-to-Coast, might as well make it a big one.

Collingwood's victories have been disproportionately often 2nd-Half Revivals or Coast-to-Coast Blowouts and not Q1 Presses or Coast-to-Coast Nail-Biters. Their pattern of losses has been partly a mirror image of their pattern of wins, with a preponderance of Q1 Presses and Coast-to-Coast Nail-Biters and a scarcity of 2nd-Half Revivals. They've also, however, had few losses that were Q2 or Q3 Presses or that were Coast-to-Coast Comfortablys.

Wins for Essendon have been Q1 Presses or Coast-to-Coast Nail-Biters unexpectedly often, but have been Q2 Press Lights or 2nd-Half Revivals significantly less often than for the average team. The only game type overrepresented amongst their losses has been the Coast-to-Coast Comfortably type, while Coast-to-Coast Blowouts, Q1 Presses and, especially, Q2 Presses have been significantly underrepresented.

Fremantle's had a penchant for leaving their runs late. Amongst their victories, Q3 Presses and 2nd-Half Revivals occur more often than for the average team, while Coast-to-Coast Blowouts are relatively rare. Their losses also have a disproportionately high showing of 2nd-Half Revivals and an underrepresentation of Coast-to-Coast Blowouts and Coast-to-Coast Nail-Biters. It's fair to say that Freo don't do Coast-to-Coast results.

Geelong have tended to either dominate throughout a game or to leave their surge until later. Their victories are disproportionately of the Coast-to-Coast Blowout and Q3 Press varieties and are less likely to be Q2 Presses (Regular or Light) or 2nd-Half Revivals. Losses have been Q2 Press Lights more often than expected, and Q1 Presses, Q3 Presses or Coast-to-Coast Nail-Biters less often than expected.

Hawthorn have won with Q2 Press Lights disproportionately often, but have recorded 2nd-Half Revivals relatively infrequently and Q2 Presses very infrequently. Q2 Press Lights are also overrepresented amongst their losses, while Q2 Presses and Coast-to-Coast Nail-Biters appear less often than would be expected.

The Roos specialise in Coast-to-Coast Nail-Biter and Q2 Press Light victories and tend to avoid Q2 and Q3 Presses, as well as Coast-to-Coast Comfortably and Blowout victories. Losses have come disproportionately from the Q3 Press bucket and relatively rarely from the Q2 Press (Regular or Light) categories. The Roos generally make their supporters wait until late in the game to find out how it's going to end.

Melbourne heavily favour the Q2 Press Light style of victory and have tended to avoid any of the Coast-to-Coast varieties, especially the Blowout variant. They have, however, suffered more than their share of Coast-to-Coast Comfortably losses, but less than their share of Coast-to-Coast Blowout and Q2 Press Light losses.

Port Adelaide's pattern of victories has been a bit like Geelong's. They too have won disproportionately often via Q3 Presses or Coast-to-Coast Blowouts and their wins have been underrepresented in the Q2 Press Light category. They've also been particularly prone to Q2 and Q3 Press losses, but not to Q1 Presses or 2nd-Half Revivals.

Richmond wins have been disproportionately 2nd-Half Revivals or Coast-to-Coast Nail-Biters, and rarely Q1 or Q3 Presses. Their losses have been Coast-to-Coast Blowouts disproportionately often, but Coast-to-Coast Nail-Biters and Q2 Press Lights relatively less often than expected.

St Kilda have been masters of the foot-to-the-floor style of victory. They're overrepresented amongst Q1 and Q2 Presses, as well as Coast-to-Coast Blowouts, and underrepresented amongst Q3 Presses and Coast-to-Coast Comfortablys. Their losses include more Coast-to-Coast Nail-Biters than the average team, and fewer Q1 and Q3 Presses, and 2nd-Half Revivals.

Sydney's victory profile almost mirrors the average team's, the sole exception being a relative abundance of Q3 Presses. Their profile of losses, however, differs significantly from the average and shows an excess of Q1 Presses, 2nd-Half Revivals and Coast-to-Coast Nail-Biters, a relative scarcity of Q3 Presses and Coast-to-Coast Comfortablys, and a virtual absence of Coast-to-Coast Blowouts.

West Coast victories have come disproportionately as Q2 Press Lights and have rarely been of any other of the Press varieties. In particular, Q2 Presses have been relatively rare. Their losses have all too often been Coast-to-Coast Blowouts or Q2 Presses, and have come as Coast-to-Coast Nail-Biters relatively infrequently.

The Western Bulldogs have won with Coast-to-Coast Comfortablys far more often than the average team, and with the other two varieties of Coast-to-Coast victories far less often. Their profile of losses mirrors that of the average team excepting that Q1 Presses are somewhat underrepresented.

We move now from associating teams with various game types to associating rounds of the season with various game types.

You might wonder, as I did, whether different parts of the season tend to produce a greater or lesser proportion of games of particular types. Do we, for example, see more Coast-to-Coast Blowouts early in the season when teams are still establishing routines and disciplines, or later on in the season when teams with no chance meet teams vying for preferred finals berths?

2010 - Game Type by Round 2001-2010.png

For this chart, I've divided the seasons from 2001 to 2010 into rough quadrants, each spanning 5 or 6 rounds.

The Coast-to-Coast Comfortably game type occurs most often in the early rounds of the season, then falls away a little through the next two quadrants before spiking a little in the run up to the finals.

The pattern for the Coast-to-Coast Nail-Biter game type is almost the exact opposite. It's relatively rare early in the season and becomes more prevalent as the season progresses through its middle stages, before tapering off in the final quadrant.

Coast-to-Coast Blowouts occur relatively infrequently during the first half of the season, but then blossom, like weeds, in the second half, especially during the last 5 rounds when they reach near-plague proportions.

Quarter 1 and Quarter 2 Presses occur with similar frequencies across the season, though they both show up slightly more often as the season progresses. Quarter 2 Press Lights, however, predominate in the first 5 rounds of the season and then decline in frequency across rounds 6 to 16 before tapering dramatically in the season's final quadrant.

Quarter 3 Presses occur least often in the early rounds, show a mild spike in Rounds 6 to 11, and then taper off in frequency across the remainder of the season. 2nd-Half Revivals show a broadly similar pattern.

2010: Just How Different Was It?

Last season I looked at Grand Final Typology. In this blog I'll start by presenting a similar typology for home-and-away games.

In creating the typology I used the same clustering technique that I used for Grand Finals - what's called Partitioning Around Medoids, or PAM - and I used similar data. Each of the 13,144 home-and-away season games was characterised by four numbers: the winning team's lead at quarter time, at half-time, at three-quarter time, and at full time.

With these four numbers we can calculate a measure of distance between any pair of games and then use the matrix of all these distances to form clusters or types of games.
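For the technically minded, here's a sketch of the clustering step in R, with simulated data standing in for the real 13,144 games and a deliberately small sample so the pairwise-distance matrix stays manageable:

```r
library(cluster)  # pam(): Partitioning Around Medoids

# Simulated stand-in: each game is described by the winner's lead at the
# four breaks (quarter-time, half-time, three-quarter time, full time).
set.seed(1)
games <- matrix(rnorm(500 * 4, mean = 15, sd = 20), ncol = 4,
                dimnames = list(NULL, c("QT", "HT", "3QT", "FT")))

clusters <- pam(games, k = 8)  # 8 game types, as settled on below
table(clusters$clustering)     # games assigned to each type
clusters$medoids               # the archetypal game of each type
```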

After a lot of toing, froing, re-toing and re-froing, I settled on a typology of 8 game types:

2010 - Types of Home and Away Game.png

Typically, in the Quarter 1 Press game type, the eventual winning team "presses" in the first term and leads by about 4 goals at quarter-time. At each subsequent change and at the final siren, the winning team typically leads by a little less than the margin it established at quarter-time. Generally the final margin is about 3 goals. This game type occurs about 8% of the time.

In a Quarter 2 Press game type the press is deferred, and the eventual winning team typically trails by a little over a goal at quarter-time but surges in the second term to lead by four-and-a-half goals at the main break. They then cruise in the third term and extend their lead by a little in the fourth and ultimately win quite comfortably, by about six and a half goals. About 7% of all home-and-away games are of this type.

The Quarter 2 Press Light game type is similar to a Quarter 2 Press game type, but the surge in the second term is not as great, so the eventual winning team leads at half-time by only about 2 goals. In the second half of a Quarter 2 Press Light game the winning team provides no assurances for its supporters and continues to lead narrowly at three-quarter time and at the final siren. This is one of the two most common game types, and describes almost 1 in 5 contests.

Quarter 3 Press games are broadly similar to Quarter 1 Press games up until half-time, though the eventual winning team typically has a smaller lead at that point in a Quarter 3 Press game type. The surge comes in the third term where the winners typically stretch their advantage to around 7 goals and then preserve this margin until the final siren. Games of this type comprise about 10% of home-and-away fixtures.

2nd-Half Revival games are particularly closely fought in the first two terms with the game's eventual losers typically having slightly the better of it. The eventual winning team typically trails by less than a goal at quarter-time and at half-time before establishing about a 3-goal lead at the final change. This lead is then preserved until the final siren. This game type occurs about 13% of the time.

A Coast-to-Coast Nail-Biter is the game type that's the most fun to watch - provided it doesn't involve your team, especially if your team's on the losing end of one of these contests. In this game type the same team typically leads at every change, but by less than a goal to a goal and a half. Across history, this game type has made up about one game in six.

The Coast-to-Coast Comfortably game type is fun to watch as a supporter when it's your team generating the comfort. Teams that win these games typically lead by about two and a half goals at quarter-time, four and a half goals at half-time, six goals at three-quarter time, and seven and a half goals at the final siren. This is another common game type - expect to see it about 1 game in 5 (more often if you're a Geelong or a West Coast fan, though with vastly differing levels of pleasure depending on which of these two you support).

Coast-to-Coast Blowouts are hard to love and not much fun to watch for any but the most partial observer. They start in the manner of a Coast-to-Coast Comfortably game, with the eventual winner leading by about 2 goals at quarter time. This lead is extended to six and a half goals by half-time - at which point the word "contest" no longer applies - and then further extended in each of the remaining quarters. The final margin in a game of this type is typically around 14 goals and it is the least common of all game types. Throughout history, about one contest in 14 has been spoiled by being of this type.

Unfortunately, in more recent history the spoilage rate has been higher, as you can see in the following chart (for the purposes of which I've grouped the history of the AFL into eras each of 12 seasons, excepting the most recent era, which contains only 6 seasons. I've also shown the profile of results by game type for season 2010 alone).

2010 - Profile of Game Types by Era.png

The pies in the bottom-most row show the progressive growth in the Coast-to-Coast Blowout commencing around the 1969-1980 era and reaching its apex in the 1981-1992 era where it described about 12% of games.

In the two most-recent eras we've seen a smaller proportion of Coast-to-Coast Blowouts, but they've still occurred at historically high rates of about 8-10%.

We've also witnessed a proliferation of Coast-to-Coast Comfortably and Coast-to-Coast Nail-Biter games in this same period, not least in the current season, where these game type descriptions attached to about 27% and 18% of contests respectively.

In total, almost 50% of the games this season were Coast-to-Coast contests - that's about 8 percentage points higher than the historical average.

Of the five non Coast-to-Coast game types, three - Quarter 2 Press, Quarter 3 Press and 2nd-Half Revival - occurred at about their historical rates this season, while Quarter 1 Press and Quarter 2 Press Light game types both occurred at about 75-80% of their historical rates.

The proportion of games of each type in a season can be thought of as a signature of that season. Being numeric, these signatures provide a ready basis on which to measure how much one season is more or less like another. In fact, using a technique called principal components analysis, we can use each season's signature to plot that season in two-dimensional space (using the first two principal components).
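Sketched in R, that calculation might look like this (the signature matrix below is random filler; in practice each row would hold a season's actual mix of the 8 game types):

```r
# Each row: one season's proportions of the 8 game types (random filler here).
set.seed(1)
signatures <- prop.table(matrix(runif(114 * 8), ncol = 8), margin = 1)
rownames(signatures) <- 1897:2010

pca <- prcomp(signatures)
plot(pca$x[, 1], pca$x[, 2], type = "n", xlab = "PC1", ylab = "PC2")
text(pca$x[, 1], pca$x[, 2], labels = rownames(signatures), cex = 0.6)
```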

Here's what we get:

2010 - Home and Away Season Similarity.png

I've circled the point labelled "2010", which represents the current season. The further away another season's label is, the more different that season's profile of game types is from 2010's.

So, for example, 2009, 1999 and 2005 are all seasons that were quite similar to 2010, and 1924, 1916 and 1958 are all seasons that were quite different. The table below provides the profile for each of the seasons just listed; you can judge the similarity for yourself.

2010 - Seasons Similar to 2010.png

Signatures can also be created for eras and these signatures used to represent the profile of game results from each era. If you do this using the eras as I've defined them, you get the chart shown below.

One way to interpret this chart is that there have been 3 super-eras in VFL/AFL history, the first spanning the seasons from 1897 to 1920, the second from 1921 to 1980, and the third from 1981 to 2010. In this latest era we seem to be returning to the profiles of the earliest eras, a time when 50% or more of all results were Coast-to-Coast game types.

2010 - Home and Away Era Similarity.png

Season 2010: An Assessment of Competitiveness

For many, the allure of sport lies in its uncertainty. It's this instinct, surely, that motivated the creation of the annual player drafts and salary caps - the desire to ensure that teams don't become unbeatable, that "either team can win on the day".

Objective measures of the competitiveness of AFL can be made at any of three levels: teams' competition wins and losses, the outcome of a game, or the in-game trading of the lead.

With just a little pondering, I came up with the following measures of competitiveness at the three levels; I'm sure there are more.

2010 - Measures of Competitiveness.png

We've looked at most - maybe all - of the Competition and Game level measures I've listed here in blogs or newsletters of previous seasons. I'll leave any revisiting of these measures for season 2010 as a topic for a future blog.

The in-game measures, though, are ones we've not explicitly explored, though I think I have commented on at least one occasion this year about the surprisingly high proportion of winning teams that have won 1st quarters and the low proportion of teams that have rallied to win after trailing at the final change.

As ever, history provides some context for my comments.

2010 - Number of Lead Changes.png

The red line in this chart records the season-by-season proportion of games in which the same team has led at every change. You can see that there's been a general rise in the proportion of such games from about 50% in the late seventies to the 61% we saw this year.

In recent history there have only been two seasons where the proportion of games led by the same team at every change has been higher: in 1995, when it was almost 64%, and in 1985 when it was a little over 62%. Before that you need to go back to 1925 to find a proportion that's higher than what we've seen in 2010.

The green, purple and blue lines track the proportion of games for which there were one, two, and the maximum possible three lead changes respectively. It's also interesting to note how the lead-change-at-every-change contest type has progressively disappeared into virtual non-existence over the last 50 seasons. This year we saw only three such contests, one of them (Fremantle v Geelong) in Round 3, and then no more until a pair of them (Fremantle v Geelong and Brisbane v Adelaide) in Round 20.

So we're getting fewer lead changes in games. When, exactly, are these lead changes not happening?

2010 - Lead Changes from One Quarter to the Next.png

Pretty much everywhere, it seems, but especially between the ends of quarters 1 and 2.

The top line shows the proportion of games in which the team leading at half-time differs from the team leading at quarter-time (a statistic that, as for all the others in this chart, I've averaged over the preceding 10 years to iron out the fluctuations and better show the trend). It's been generally falling since the 1960s, excepting a brief period of stability through the 1990s; recent seasons have resumed the decline, the current season especially, in which the proportion has been just 23%.

Next, the red line, which shows the proportion of games in which the team leading at three-quarter time differs from the team leading at half time. This statistic has declined across the period roughly covering the 1980s through to 2000, since which it has stabilised at about 20%.

The navy blue line shows the proportion of games in which the winning team differs from the team leading at three-quarter time. Its trajectory is similar to that of the red line, though it doesn't show the jaunty uptick in recent seasons that the red line does.

Finally, the dotted, light-blue line, which shows the overall proportion of quarters for which the team leading at one break was different from the team leading at the previous break. Its trend has been downwards since the 1960s though the rate of decline has slowed markedly since about 1990.

All told then, if your measure of AFL competitiveness is how often the lead changes from the end of one quarter to the next, you'd have to conclude that AFL games are gradually becoming less competitive.

It'll be interesting to see how the introduction of new teams over the next few seasons affects this measure of competitiveness.

A Competition of Two Halves

In the previous blog I suggested that, based on winning percentages when facing finalists, the top 8 teams (well, actually the top 7) were of a different class to the other teams in the competition.

Current MARS Ratings provide further evidence for this schism. To put the size of the difference in an historical perspective, I thought it might be instructive to review the MARS Ratings of teams at a similar point in the season for each of the years 1999 to 2010.

(This also provides me an opportunity to showcase one of the capabilities - strip-charts - of a sparklines tool that can be downloaded for free and used with Excel.)

2010 - Spread of MARS Ratings by Year.png

In the chart, each row relates the MARS Ratings that the 16 teams had as at the end of Round 22 in a particular season. Every strip in the chart corresponds to the Rating of a single team, and the relative position of that strip is based on the team's Rating - the further to the right the strip is, the higher the Rating.

The red strip in each row corresponds to a Rating of 1,000, which is always the average team Rating.

While the strips provide a visual guide to the spread of MARS Ratings for a particular season, the data in the columns at right offer another, more quantitative view. The first column is the average Rating of the 8 highest-rated teams, the middle column the average Rating of the 8 lowest-rated teams, and the right column is the difference between the two averages. Larger values in this right column indicate bigger differences in the MARS Ratings of teams rated highest compared to those rated lowest.

(I should note that the 8 highest-rated teams will not always be the 8 finalists, but the differences in the composition of these two sets of eight teams don't appear to be material enough to prevent us from talking about them as if they were interchangeable.)

What we see immediately is that the difference in the average Rating of the top and bottom teams this year is the greatest that it's been during the period I've covered. Furthermore, the difference has come about because this year's top 8 has the highest-ever average Rating and this year's bottom 8 has the lowest-ever average Rating.

The season that produced the smallest difference in average Ratings was 1999, which was the year in which 3 teams finished just one game out of the eight and another finished just two games out. That season also produced the all-time lowest rated top 8 and highest rated bottom 8.

While we're on MARS Ratings and adopting an historical perspective (and creating sparklines), here's another chart, this one mapping the ladder and MARS performances of the 16 teams as at the end of the home-and-away seasons of 1999 to 2010.

2010 - MARS and Ladder History - 1999-2010.png

One feature of this chart that's immediately obvious is the strong relationship between the trajectory of each team's MARS Rating history and its ladder fortunes, which is as it should be if the MARS Ratings mean anything at all.

Other aspects that I find interesting are the long-term decline of the Dons, the emergence of Collingwood, Geelong and St Kilda, and the precipitous rise and fall of the Eagles.

I'll finish this blog with one last chart, this one showing the MARS Ratings of the teams finishing in each of the 16 ladder positions across seasons 1999 to 2010.

2010 - MARS Ratings Spread by Ladder Position.png

As you'd expect - and as we saw in the previous chart on a team-by-team basis - lower ladder positions are generally associated with lower MARS Ratings.

But the "weather" (ie the results for any single year) is different from the "climate" (ie the overall correlation pattern). Put another way, for some teams in some years, ladder position and MARS Rating are measuring something different. Whether either, or neither, is measuring what it purports to - relative team quality - is a judgement I'll leave in the reader's hands.

Why Sydney Won't Finish Fourth


As the ladder now stands, Sydney trail the Dogs by 4 competition points but they have a significantly inferior percentage. The Dogs have scored 2,067 points and conceded 1,656, giving them a percentage of 124.8, while Sydney have scored 1,911 points and conceded 1,795, giving them a percentage of 106.5, some 18.3 percentage points lower.


If Sydney were to win next week against the Lions, and the Dogs were to roll over and play dead against the Dons, then fourth place would be awarded to the team with the better percentage. Barring something apocalyptic, that'll be the Dogs.

Here's why. A few blogs back I noted that you could calculate the change in a team's percentage resulting from the outcome of a single game by using the following expression:

(1) Change in Percentage = (%S - %C)/(1 + %C) * Old Percentage

where %S = the points scored by the team in the current game as a percentage of the points it had already scored in the season,

and %C = the points conceded by the team in the current game as a percentage of the points it had already conceded in the season.
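Expression (1) translates directly into a short R function, which we can sanity-check against the worked examples that follow (its exact outputs run slightly larger than the rounded in-text arithmetic):

```r
# Change in percentage from a single game, per expression (1).
# S, C: season points scored and conceded so far; s, c: this game's scores.
change_in_percentage <- function(S, C, s, c) {
  pS <- s / S
  pC <- c / C
  old_pct <- 100 * S / C
  (pS - pC) / (1 + pC) * old_pct
}

change_in_percentage(2067, 1656, 30, 150)  # Dogs lose 30-150: about -8.7
change_in_percentage(1911, 1795, 150, 30)  # Swans win 150-30: about +6.5
```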

Now at this stage of the season, a big win or loss for a team will be one where the difference between %S and %C is in the 6-8% range, bearing in mind that a single game now represents about 1/20th or 5% of the season, so a 'typical' %S or %C would be about 5%. Scoring twice as many points as 'expected' then would give a %S of 10%, and conceding half as many as 'expected' would give a %C of 2.5%, a difference of 7.5%.

Okay, so consider a big loss for the Dogs, say 30-150. That gives a (%S - %C) of around -7.5%, which (1) tells us means the Dogs' percentage will change by about -7.5%/1.1 x 1.25, a drop of roughly 8.5 percentage points. That takes the Dogs' percentage down to about 116.

Next consider a big win for the Swans, again say 150-30. For them, that's a (%S - %C) of 6%, which gives them a percentage boost of 6%/1.02 x 1.06, which is about 6 percentage points. That lifts their percentage to about 112.5, still 3.5 percentage points short of the Dogs'.

To completely close the gap, Sydney needs its percentage change plus the Dogs' to exceed 18.3 percentage points, the percentage chasm it currently faces. Using this fact and the expression in (1) above for both teams, you can derive the fact that, to lift its percentage above the Dogs', Sydney needs the following to be true:

Sydney's (%S - %C) > 18.3% - 1.15 times the Dogs' (%S - %C)

Now my worst case 30-150 loss for the Dogs gives them a (%S - %C) of -7.6%. That means Sydney needs its (%S - %C) to be about 9.5%. So even if Sydney were to concede no points at all to the Lions - making %C equal to 0 - they'd need to score about 180 points to achieve this.

More generally still, Sydney need the sum of their victory margin and the Dogs' margin of defeat to be around 300 points if they're to grab fourth.

Sydney won't finish fourth.

Letting the Computer Do (Most of) the Work

Around this time of year it's traditional to work through the remaining matches for each team and attempt to codify what each needs to do in order to secure a particular finish - minor premiership, top 4, top 8 or Spoon.

This year, rather than work through all the combinations manually, I've decided to be lazy - purely for instructional purposes, I should add - and enlist the help of rule induction, a mathematical technique for deducing from a dataset statements in the form If A and B then C that describe key variables in that data.

So, for example, if you were to apply the technique to help describe the use of heating and cooling appliances by a household over the course of a few years you might collect information several times each day about who was home, what the outside temperature was, what day of the week and time of day it was, and whether or not a heating or a cooling appliance was turned on.

Using a rule induction algorithm, you'd be able to come up with statements such as this one: 

  • If Number of People Home is greater than 0 AND Outside Temperature is less than 15 degrees AND Time of Day is between 5:30pm and 11:30pm AND Day of Week is not Saturday or Sunday then Heating = ON (Probability 92%)

For this blog I provided a rule induction algorithm (the JRip Weka algorithm running in R, if you're curious) with the outputs from 10,000 of the simulations I used in my earlier blog, which included for each simulation:

  • The results of each of the remaining 16 games
  • The final ladder positions of each team if these were the actual results of each game

To simplify matters a little, and recognising that the main interest is not in exact ladder position finishes, I summarised each team's finishing position as either "1st", "2nd to 4th", "5th to 8th", "9th to 15th", or "16th".

The goal was that the rule induction algorithm would output rules of the form:

  • If X beats Y AND X beats Z AND ... then X finishes 5th to 8th
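If you'd like to try this at home, here's roughly what the call looks like using the RWeka package, with simulated results standing in for my 10,000 simulations (margins are couched in terms of the first-named team):

```r
library(RWeka)  # JRip: Weka's RIPPER rule learner (requires Java)

# Simulated stand-in for the simulation outputs: one row per simulation.
set.seed(1)
sims <- data.frame(
  Collingwood..v..Adelaide = rnorm(1000, 20, 30),
  Hawthorn..v..Collingwood = rnorm(1000, -15, 30),
  Carlton..v..Geelong      = rnorm(1000, -25, 30)
)
sims$Collingwood_finish <- factor(
  ifelse(sims$Collingwood..v..Adelaide <= 0 &
         sims$Hawthorn..v..Collingwood >= 0 &
         sims$Carlton..v..Geelong <= 0,
         "2nd to 4th", "1st"))

print(JRip(Collingwood_finish ~ ., data = sims))  # rules like those below
```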

Rule induction worked remarkably well. Here are a few real examples of the rules that the algorithm offered up for Collingwood's fate:

  • Rule 1: (Collingwood..v..Adelaide <= 0) and (Hawthorn..v..Collingwood >= 0) and (Carlton..v..Geelong <= 0) => Collingwood = 2nd to 4th (168.0/2.0)
  • Rule 2: => Collingwood = 1st (9832.0/0.0)

Rule 1 can be interpreted as follows: 

  • If Collingwood loses to or draws with Adelaide (ie the margin in that game, couched in terms of Collingwood, is less than or equal to zero) AND Collingwood loses to or draws with Hawthorn AND Geelong beats or draws with Carlton then Collingwood finish 2nd to 4th.

What's implicit here is that Geelong also beats West Coast but since, in the simulations, this always occurred when the other conditions in the rule were met, the algorithm didn't realise that this was an additional required condition.

As well, Collingwood can't be allowed to draw both its games, otherwise Geelong can't overhaul them. Again, this situation didn't occur in the simulations I provided the algorithm, and not even the smartest algorithm can intuit instances that it's never seen.

I could probably have fixed both of these shortcomings by providing the algorithm with more than 10,000 simulations, though I'd pay a price in terms of computation time. Note though the (168.0 / 2.0) annotation at the end of this rule. That tells you that the rule could be applied to 168 of the simulations, but that it was wrong for 2 of them. Maybe the two simulations for which the rule applied but was incorrect included a Geelong loss to the Eagles or two draws for Collingwood.

Rule creation algorithms include what's called a "stopping rule" to prevent them from creating a unique rule for every simulation result, which would make the rules highly accurate but completely impractical.

Rule 2 is the "otherwise" rule and is interpreted as the predicted outcome when no earlier rule's full set of conditions is met. For Collingwood, "otherwise" is that they finish 1st.

The rules provided for other teams were generally quite similar, although they became more complex for teams when percentages were required to determine crucial ladder positions. Here, for example, are a few of the rules where the algorithm is attempting to model Hawthorn getting bumped into 9th by Melbourne: 

  • (Hawthorn..v..Fremantle <= -7) and (Port.Adelaide..v..Melbourne <= -14) and (Melbourne..v..Kangaroos >= 20) and (Hawthorn..v..Collingwood <= -39) => Hawthorn = 9th to 15th (54.0/2.0)
     
  • (Hawthorn..v..Fremantle <= -4) and (Port.Adelaide..v..Melbourne <= -7) and (Melbourne..v..Kangaroos >= 11) and (Hawthorn..v..Collingwood <= -59) and (Port.Adelaide..v..Melbourne <= -32) => Hawthorn = 9th to 15th (41.0/3.0)

Granted that's a mite convoluted, but it's nothing that a human can't decipher fairly quickly, which nicely illustrates my experience with this type of algorithm: its output almost always contains some useful insight, but extracting that insight requires human interpretation.

What follows, then, are the rules that man and machine have crafted for each team (note that I've chosen to ignore the possibility of draws to reduce complexity).

Collingwood 

  • Finish 2nd to 4th if Collingwood lose to Adelaide and Hawthorn AND Geelong beat Carlton and West Coast
  • Otherwise finish 1st

Geelong

  • Finish 1st if Collingwood lose to Adelaide and Hawthorn AND Geelong beat Carlton and West Coast
  • Otherwise finish 2nd to 4th

St Kilda

  • Finish 2nd to 4th

Western Bulldogs

  • Finish 5th to 8th if Dogs lose to Essendon and to Sydney AND Fremantle beat Hawthorn and Carlton
  • Otherwise finish 2nd to 4th

Fremantle

  • Finish 2nd to 4th if Dogs lose to Essendon and to Sydney AND Fremantle beat Hawthorn and Carlton
  • Otherwise finish 5th to 8th 

Carlton and Sydney

  • Finish 5th to 8th

Hawthorn 

  • Finish 9th to 15th if Hawthorn lose to Fremantle and Collingwood AND Roos beat West Coast and Melbourne
  • Also Finish 9th to 15th if Hawthorn lose to Fremantle and Collingwood AND Melbourne beat Port and Roos sufficient to raise Melbourne's percentage above Hawthorn's
  • Otherwise finish 5th to 8th

Kangaroos

  • Finish 5th to 8th if Hawthorn lose to Fremantle and Collingwood AND Roos beat West Coast and Melbourne
  • Otherwise finish 9th to 15th 

Melbourne 

  • Finish 5th to 8th if Hawthorn lose to Fremantle and Collingwood AND Melbourne beat Port and Roos sufficient to raise Melbourne's percentage above Hawthorn's
  • Otherwise finish 9th to 15th 

Adelaide, Port Adelaide and Essendon

  • Finish 9th to 15th 

Brisbane Lions 

  • Finish 16th if Lions lose to Essendon and Sydney AND West Coast beat Geelong and Roos sufficient to lift West Coast's percentage above the Lions' AND Richmond beat St Kilda or Port (or both)
  • Otherwise finish 9th to 15th 

Richmond 

  • Finish 16th if West Coast beat Geelong and Roos AND Richmond lose to St Kilda and Port
  • Otherwise finish 9th to 15th

West Coast 

  • Finish 9th to 15th if West Coast beat Geelong and Roos AND Richmond lose to St Kilda and Port
  • Also finish 9th to 15th if West Coast beat Geelong and Roos AND Lions lose to Essendon and Sydney sufficient to lift West Coast's percentage above the Lions'
  • Otherwise finish 16th

As a final comment I'll note that the rules don't allow for the possibility of Sydney or Carlton slipping into 4th. Although this is mathematically possible, it's so unlikely that it didn't occur in the simulations provided to the algorithm. (Actually, it didn't occur in any of the 100,000 simulations from which the 10,000 were chosen either.)

A quick bit of probability shows why.

Consider what's needed for Sydney to finish fourth.
1. The Dogs lose to Essendon and Sydney
2. Sydney also beat the Lions
3. Fremantle don't win both their games

Furthermore, combined, Sydney and the Dogs' results have to close the percentage gap between the two teams, which currently stands at over 25 percentage points.

Individually, none of those three conditions is prohibitive - as rough figures, you might put the probability of 1 at about 15%, of 2 at about 60%, and of 3 at about 80%. But the 15% and 60% figures just relate to the probability of the required result, not the probability that the wins and losses will be big enough to lift Sydney's percentage above the Dogs'. If Sydney were to trounce the Lions by 100 points and Essendon were to do likewise to the Dogs, then Sydney would still need to beat the Dogs by about 91 points to achieve such a lift.

So let's revise the probability of 1 down to 0.01% (which is probably generous) and the probability of 2 down to 5% (which is also generous). Then the overall probability is 0.01% x 5% x 80%, or about 1 in 250,000. Not gonna happen.

(For similar reasons there are also no rules for Fremantle dropping a game but still grabbing 4th from the Dogs on the basis of a superior percentage.)

Playing the Percentages


It seems very likely that some ladder positions will be decided on percentage this season, so I thought it might be helpful to give you a heuristic for estimating the effect of a game result on a team's percentage.

A little maths produces the following exact result for the change in a team's percentage:

(1) New Percentage = Old Percentage + (%S - %C)/(1 + %C) * Old Percentage

where

%S = the points scored by the team in the game in question as a percentage of the points it has scored all season, excluding this game, and

%C = the points conceded by the team in the game in question as a percentage of the points it has conceded all season, excluding this game.

(In passing, I'll note that this equation makes it obvious that the only way for a team to increase its percentage on the basis of a single result is for %S to be greater than %C or, equivalently, for %S/%C to be greater than 1. Put another way, the team's percentage in the game itself needs to exceed its pre-game percentage.

This equation also puts a practical cap on the extent to which a team's percentage can alter based on the result of any one game at this stage of the season. For a team with a high percentage the term (%S - %C) will rarely exceed 5%, so a team with, for example, an existing percentage of 140 will find it hard to move that percentage by more than about 7 percentage points. Alternatively, a team with an existing percentage of just 70, which might at the extremes produce a (%S - %C) of 7%, will find it hard to move its percentage by more than about 5 percentage points in any one game.)

As an example of the use of equation (1) consider Sydney, who have scored 1,701 points this season and conceded 1,638, giving them a 103.8 percentage. If we assume, since this is Round 20, that they'll rack up a score this week that's about 5% of what they've previously scored all season and that they'll concede about 4%, then the formula tells us that their percentage will change by (5% - 4%)/(104%) * 103.8 = 1 percentage point.

Now 5% x 1,701 is about 85, and 4% x 1,638 is about 66, so we've implicitly assumed an 85-66 victory by the Swans in the previous paragraph. Recalculating Sydney's percentage the long way we get (1,701+85)/(1,638+66), which gives a 104.8 percentage and is, indeed, a 1 percentage point increase.
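Here's that check as a few lines of R - a minimal sketch of equation (1), with the function name my own invention and the season figures as quoted above:

    # Equation (1): new percentage from season totals and a single game result
    new_percentage <- function(season_for, season_against, game_for, game_against) {
      pS  <- game_for / season_for            # %S
      pC  <- game_against / season_against    # %C
      old <- 100 * season_for / season_against
      old + (pS - pC) / (1 + pC) * old
    }

    new_percentage(1701, 1638, 85, 66)        # about 104.8, matching the long way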

So we know that the formula works, which is nice, but not especially helpful.

To make equation (1) more helpful, we first need to note that at this stage of the season the points that a team concedes in a game are unlikely to be a large proportion of the points it's already conceded over the entire season. So the (1 + %C) term in equation (1) is going to be very close to 1. That allows us to rewrite the equation as:

(2) Change in Percentage = (%S - %C) * Old Percentage

Now this equation makes it a little easier to play some what-if games.

For example we can ask what it would take for Sydney, who are currently equal with Carlton on competition points, to lift their percentage above Carlton's this weekend. Sydney's percentage stands now at 103.8 and Carlton's at 107.0, so Sydney needs a 3.2 percentage point lift.

Using a rearranged version of Equation (2) we know that achieving a lift of 3.2 percentage points from a current percentage of 103.8 requires that (%S - %C) be greater than 3.2/103.8, or about 3%. Now, if we assume that Sydney will concede points roughly equal to its season-long average then %C will be 1/19 or a bit over 5%.

So, to get the necessary lift in percentage, Sydney will need %S to be a bit over 5% + 3%, or 8%. To turn that into an actual score we take 8% x 1,701 (the number of points Sydney has scored in the season so far), which gives us a score of about 136. That's how many points Sydney will need to score to lift its percentage to around 107, assuming that its opponent this week (Fremantle) scores 5% x 1,638, which is approximately 82 points.

Within reasonable limits you can generalise this and say that Sydney needs to beat Fremantle by 54 points or more to lift its percentage to 107, regardless of the number of points Freo score. In reality, as Fremantle's score increases - and so %C rises - the margin of victory required by Sydney also rises, but only by a few points. A 60-point margin of victory will be enough to lift Sydney's percentage over Carlton's even in the unlikely event that the score in the Sydney v Freo game is as high as 170-110.
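If you'd like to play what-if games of your own, here's a rough R sketch of that logic, rearranging equation (2) to find the score needed for a given percentage lift (again, the function name and inputs are mine, using the Sydney and Carlton figures above):

    # Rearranged equation (2): %S = target_lift/old_pct + %C
    required_score <- function(season_for, season_against, old_pct, target_lift, opp_score) {
      pC <- opp_score / season_against
      pS <- target_lift / old_pct + pC
      pS * season_for                  # points needed to achieve the lift
    }

    # Sydney (103.8) chasing Carlton (107.0), with Freo scoring about 82
    required_score(1701, 1638, 103.8, 3.2, 82)   # about 138, close to the 136 above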

Okay, let's do one more what-if, this one a bit more complex.

What would it take for Melbourne to grab 8th spot this weekend? Well the Roos and Hawthorn would need to lose and the combined effect of Hawthorn's loss and Melbourne's win would need to drag Melbourne's percentage above Hawthorn's. Conveniently for us, Hawthorn and Melbourne meet this weekend. Even more conveniently, their respective points for and points against are all quite close: Hawthorn's scored 1,692 points and conceded 1,635; Melbourne's scored 1,599 and conceded 1,647.

The beauty of this fact is that, for both teams, the Old Percentage in equation (2) is approximately 1 (that is, about 100 expressed the usual way) and, for any score, Hawthorn's %S will be approximately Melbourne's %C and vice versa. This means that any increase in percentage achieved by either team will be mirrored by an equivalent decrease in the percentage of the other.

All Melbourne needs do then to lift its percentage above Hawthorn's is to lift its percentage by one half the current difference. Melbourne's percentage stands at 97.1 and Hawthorn's at 103.5, so the difference is 6.4 and the target for Melbourne is an increase of 3.2 percentage points.

Melbourne then needs (%S-%C) to be a bit bigger than 3%. Since the divisors for both %S and %C are about the same we can re-express this by saying that Melbourne's margin of victory needs to be around 3% of the points it's conceded so far this season, which is 3% of 1,647 or around 50 points. Let's add on a few points to account for the fact that we need the margin to be a little over 3% and call the required margin 53 points.

So how good is our approximation? Well, if Melbourne wins 123-70, Hawthorn's new percentage would be (1,692+70)/(1,635+123) = 1.002, and Melbourne's would be (1,599+123)/(1,647+70) = 1.003. Score 1 for the approximation. If, instead, it were a high-scoring game and Melbourne won 163-110, then Hawthorn's new percentage would be (1,692+110)/(1,635+163) = 1.002, and Melbourne's would be (1,599+163)/(1,647+110) = 1.003. So that works too.
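Those long-hand checks are easy to reproduce in R, should you be so inclined (the helper function is just for illustration):

    # New percentage, computed the long way, as a ratio
    exact_pct <- function(season_for, season_against, game_for, game_against)
      (season_for + game_for) / (season_against + game_against)

    exact_pct(1692, 1635, 70, 123)    # Hawthorn after a 123-70 loss:  about 1.002
    exact_pct(1599, 1647, 123, 70)    # Melbourne after a 123-70 win:  about 1.003
    exact_pct(1692, 1635, 110, 163)   # Hawthorn after a 163-110 loss: about 1.002
    exact_pct(1599, 1647, 163, 110)   # Melbourne after a 163-110 win: about 1.003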

In summary, a victory by the Dees over the Hawks by around nine goals or more would, assuming the Roos lose to West Coast, propel Melbourne into the eight - not a confluence of events I'd be willing to wager large sums on, but a mathematical possibility nonetheless.

A Line Betting Enigma

The TAB Sportsbet bookmaker is, as you know, a man to be revered and feared in equal measure. Historically, his head-to-head prices have been so exquisitely well-calibrated that I instinctively compare any model I construct with the forecasts he produces. Finding that a model has historically outperformed him leads me to scuttle off to determine what error I've made in constructing the model, what piece of information I've used that, in truth, was only available with the benefit of hindsight.

Trialling The Super Smart Model

The best way to trial a potential Fund algorithm, I'm beginning to appreciate, is to publish each week the forecasts that it makes. This forces me to work through the mechanics of how it would be used in practice and, importantly, to set down what restrictions should be applied to its wagering - for example, whether it should, like most of the current Funds, only bet on Home Teams, and in which round of the season it should start wagering.

The Relationship Between Head-to-Head Price and Points Start

I've found yet another MAFL-related use for the Eureqa tool, this time to determine the precise relationship between a team's head-to-head price and the start it's giving or receiving on line betting. A simple plot of the history of a team's head-to-head price (or the probability that can be inferred from it) versus its start on line betting makes it obvious that there's a relationship between the two and that it's a non-linear one, but in the past I've been constrained by my own (lack of) ingenuity and persistence in generating sufficient possibilities to find its exact nature.

In-Running Wagering: What's the Best Strategy?

With services such as Betfair now offering in-running wagering opportunities, the ability to accurately assess a team's chances of victory at any given point in a game is now of considerable commercial value. Imagine, for example, that your team, who are at home, lead by 18 points at the first change. Would a wager on them at $1.40 be advised?