Tipping Without Market Price Information

In a previous blog I looked at the notion of momentum and found that Richmond, St Kilda, Melbourne and Geelong all seemed to be "momentum" teams in that their likelihood of winning a game seemed to be disproportionately affected by whether they'd won or lost their previous match.

A Friendly Wager on the Margin

You're watching the footy with a mate who leans over and says he reckons the Cats will win by 15 points. How much leeway should you give him to make it a fair even money bet? Surprisingly - to me anyway - the answer is 24 points either way. So, if the Cats were to record any result between a loss by 9 points and a win by 39 points you should pay out.

Introducing MAFL's First Neural Network

I've been leery of neural networks for some time because of their perhaps undeserved reputation for overfitting data and because of the practical difficulties that have existed in using them for prediction. Phil Brierley's Tiberius software includes an implementation of neural networks that has, at least for now, converted me. As a consequence, I'm adding one final margin predictor to the mix for 2011.

Margin Prediction for 2011

We've fresh tipsters for 2011, fresh Funds for 2011, so now we need fresh margin predictors for 2011. This year, all of the margin predictors are based on models that produce probability forecasts, which includes the algorithms powering ProPred, WinPred and the Head-to-Head Fund and the "model" that is the TAB Sportsbet bookmaker. The process for creating the margin predictors was to let Eureqa loose on the historical data for seasons 2007 to 2010 to produce equations that fitted previous home team margins of victory as a function of these models' probabilities.

The Calibration of the Head-to-Head Fund Algorithm

In the previous blog we considered the logarithmic probability score on ProPred, WinPred and the TAB bookie and found that the TAB bookie was the best calibrated of the three and that relative tipping performance was somewhat unrelated to relative probability scores. For the Head-to-Head Fund, whose job in life is to make money, the key question is to what extent do its probability scores relative to the TAB bookie's shed light on its money-making prowess.

Assessing ProPred's, WinPred's and the Bookie's Probability Forecasts

Almost 12 months ago, in this blog, I introduced the topic of probability scoring as a basis on which to assess the forecasting performance of a probabilistic tipster. Unfortunately, I used it for the remainder of last season as a means of assessing the ill-fated HELP algorithm, which didn't so much need a probability score to measure its awfulness as it did a stenchometer. As a consequence I think I'd mentally tainted the measure, but it deserves another run with another algorithm.

Home Team Wagering: Rumours of Its Death Have Been Greatly Exaggerated

I should probably have noticed this sooner, but last year was quite a profitable year for blindly wagering on Home Teams. A gambler who level-staked the AFL Designated Home Team in every game in the head-to-head and in the line market would have recorded an 8.4% ROI on his or her head-to-head wagers and a 4.1% ROI on his or her line wagers.

Why You Should Have Genes in Your Ensemble

Over on the MAFL Wagers & Tips blog I've been introducing the updated versions of the Heuristics across two recent posts. I've shown there that these heuristics are, individually, at least moderately adept at predicting historical AFL outcomes. All told, there are eleven heuristics, comfortably enough to form an ensemble, so in the spirit of the previous entry in MAFL Statistical Analyses, the question must be asked: can I find a subset of the heuristics which, collectively, using a majority voting scheme, tips better than any one of them alone?

Ensemble Models for Predicting Binary Events

I've been following the development of prediction markets with considerable interest over the past few years. These are markets in which the opinions of many engaged experts are combined, the notion being that their combined opinion will be a better predictor of a future outcome than the opinion of any one of them. It's a notion that has proved right on many occasions.

A First Look at the 2011 Draw

In this blog I'll be reviewing the 2011 draw in terms of Venue Experience, a term that I defined and explored in an earlier blog. A team's Venue Experience for a given game is defined as the number of times that the team has played at that game's venue during the immediately preceding 12 calendar months, including finals.

Home Ground Advantage: Fans and Familiarity

In AFL, playing at home is a distinct advantage, albeit perhaps a little less of an advantage than it once was. So, around this time of year, I usually spend a few days agonising over the allocation of home team status for each game in the upcoming season.

Picking Winners - A Deeper Dive

Last blog I identified a banker's dozen of algorithms that I thought were worthy of further consideration for Fund honours next season.

Experience has taught me that, behind the attractive veneer of some models with impressive historical ROIs often lurk troubling pathologies. One form of that pathology is exhibited by models with returns that come mostly from a handful of bets, one or two of them especially fortuitous. Another manifests as a 'bet large, bet often' approach that would subject any human on the business end of such wagering to the punting equivalent of a ride on The Big Dipper that's just as likely to end with you 100 metres above the ground as 200 metres below it. The question to be answered in this blog then is: do any of the 11 algorithms I've identified this time show any such characteristics?

Can We Do Better Than The Binary Logit?

To say that there's a 'bit in this blog' is like declaring the Hundred Years' War 'a bit of a skirmish'.

I'll start by broadly explaining what I've done. In a previous blog I constructed 12 models, each attempting to predict the winner of an AFL game. The 12 models varied in two ways, firstly in terms of how the winning team was described ...

Why It Matters Which Team Wins

In conversation - and in interrogation, come to think of it - the key to getting a good answer is often in the framing of the question.

So too in statistical modelling, where one common method for asking a slightly different question of the data is to take the variables you have and transform them.

Consider for example the following results for four binary logits, each built to provide an answer to the question 'Under what circumstances does the team with the higher MARS Rating tend to win?'.

Grand Final Margins Through History and a Last Look at the 2010 Home-and-Away Season

A couple of final charts before GF 2.0.

The first chart looks at the history of Grand Finals, again. Each point in the chart reflects four things about the Grand Final to which it pertains ...

Drawing On Hindsight

When sports journos wait until after a contest has been decided before declaring a group of winning punters to be "savvy", I find it hard not to be at least a little cynical about the aptness of the label.

So when, on Sunday, I read in the online version of the SMH that a posse of said savvy punters had foxed the bookies and cleaned up on the draw, collectively winning as I recall about $1m at prices ranging from $34 to $51, I did wonder how many column-inches would have been devoted to those same punters had the margin been anything different when the final siren sounded on Saturday. I'm fairly certain it would have been the number that has '1' as its next-door, up the road neighbour on Integer Street.

The Bias in Line Betting Revisited

Some blogs almost write themselves. This hasn't been one of them.

It all started when I read a journal article - to which I'd now link if I could find the darned thing again - that suggested a bias in NFL (or maybe it was College Football) spread betting markets arising from bookmakers' tendency to over-correct when a team won on line betting. The authors found that after a team won on line betting one week it was less likely to win on line betting next week because it was forced to overcome too large a handicap.

Naturally, I wondered if this was also true of AFL spread betting.

What Makes a Team's Start Vary from Week-to-Week?

In the process of investigating that question, I wound up wondering about the process of setting handicaps in the first place and what causes a team's handicap to change from one week to the next.

Logically, I reckoned, the start that a team receives could be described by this equation:

Start received by Team A (playing Team B) = (Quality of Team B - Quality of Team A) - Home Status for Team A

In words, the start that a team gets is a function of its quality relative to its opponent's (measured in points) and whether or not it's playing at home. The stronger a team's opponent the larger will be the start, and there'll be a deduction if the team's playing at home. This formulation of start assumes that game venue only ever has a positive effect on one team and not an additional, negative effect on the other. It excludes the possibility that a side might be a P point worse side whenever it plays away from home.
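As a minimal sketch of that equation, here it is as a small Python function. The function name, the argument names and the ratings used in the example are all made up for illustration; only the form of the equation comes from the text above.

```python
def start_received(quality_team: float, quality_opponent: float,
                   home_advantage: float) -> float:
    """Start (in points) given to a team.

    It equals the opponent's quality edge over the team, less a deduction
    that applies only when the team is playing at home.
    """
    return (quality_opponent - quality_team) - home_advantage


# Hypothetical ratings: Team A is 6 points worse than Team B.
# Hypothetical home deduction: 8 points when Team A is at home, else 0.
start_when_away = start_received(quality_team=10.0, quality_opponent=16.0,
                                 home_advantage=0.0)
start_when_home = start_received(quality_team=10.0, quality_opponent=16.0,
                                 home_advantage=8.0)

print(start_when_away)  # 6.0
print(start_when_home)  # -2.0
```

With these illustrative numbers Team A receives 6 points start away from home but gives 2 points start when it plays the same opponent at home.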

With that as the equation for the start that a team receives, the change in that start from one week to the next can be written as:

Change in Start received by Team A = Difference in Quality of Teams played in successive weeks - Change in Quality of Team A - Change in Home Status for Team A

To use this equation we need to come up with proxies for as many of the terms as we can. Firstly then, what might a bookie use to reassess the quality of a particular team? An obvious choice is the performance of that team in the previous week relative to the bookie's expectations - which is exactly what the handicap adjusted margin for the previous week measures.

Next, we could define the change in home status as follows:

  • Change in home status = +1 if a team is playing at home this week and played away or at a neutral venue in the previous week
  • Change in home status = -1 if a team is playing away or at a neutral venue this week and played at home in the previous week
  • Change in home status = 0 otherwise

This formulation implies that there's no difference between playing away and playing at a neutral venue. Varying this assumption is something that I might explore in a future blog.
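The three-way definition above can be sketched in a couple of lines of Python. The function and argument names are my own; the only thing taken from the text is the rule itself, including the treatment of away and neutral venues as the same "not at home" state.

```python
def change_in_home_status(home_this_week: bool, home_last_week: bool) -> int:
    """Return +1 when moving to a home game, -1 when moving away from one,
    and 0 otherwise. Away and neutral venues are both 'not at home'."""
    return int(home_this_week) - int(home_last_week)


print(change_in_home_status(True, False))   # 1
print(change_in_home_status(False, True))   # -1
print(change_in_home_status(True, True))    # 0
print(change_in_home_status(False, False))  # 0
```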

From Theory to Practicality: Fitting a Model

(well, actually there's a bit more theory too ...)

Having identified a way to quantify the change in a team's quality and the change in its home status we can now run a linear regression in which, for simplicity, I've further assumed that home ground advantage is the same for every team.

We get the following result using all home-and-away data for seasons 2006 to 2010:

For a team (designated to be) playing at home in the current week:

Change in start = -2.453 - 0.072 x HAM in Previous Week - 8.241 x Change in Home Status

For a team (designated to be) playing away in the current week:

Change in start = 3.035 - 0.155 x HAM in Previous Week - 8.241 x Change in Home Status

These equations explain about 15.7% of the variability in the change in start and all of the coefficients (except the intercept) are statistically significant at the 1% level or higher.

(You might notice that I've not included any variable to capture the change in opponent quality. Doubtless that variable would explain a significant proportion of the otherwise unexplained variability in change in start but it suffers from the incurable defect of being unmeasurable for previous and for future games. That renders it not especially useful for model fitting or for forecasting purposes.

Whilst that's a shame from the point of view of better modelling the change in teams' start from week-to-week, the good news is that leaving this variable out almost certainly doesn't distort the coefficients for the variables that we have included. Technically, the potential problem we have in leaving out a measure of the change in opponent quality is what's called an omitted variable bias, but such bias disappears if the variables we have included are uncorrelated with the one we've omitted. I think we can make a reasonable case that the difference in the quality of successive opponents is unlikely to be correlated with a team's HAM in the previous week, and is also unlikely to be correlated with the change in a team's home status.)
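The omitted variable point can be illustrated with a small simulation. Everything in the sketch below is synthetic: the slopes, the spreads and the sample size are invented purely to show the mechanism, and none of it uses AFL data. It fits a one-variable least-squares line twice, once when the left-out variable is independent of the included one and once when the two are correlated.

```python
import random


def ols_slope(xs, ys):
    """Slope from a least-squares regression of ys on a single variable xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20000
true_slope = -0.1        # invented effect of the included variable
omitted_effect = 1.0     # invented effect of the omitted variable

included = [rng.gauss(0, 30) for _ in range(n)]
noise = [rng.gauss(0, 10) for _ in range(n)]

# Case 1: the omitted variable is independent of the included one.
omitted_indep = [rng.gauss(0, 10) for _ in range(n)]
y_indep = [true_slope * x + omitted_effect * z + e
           for x, z, e in zip(included, omitted_indep, noise)]

# Case 2: the omitted variable moves with the included one.
omitted_corr = [0.2 * x + rng.gauss(0, 10) for x in included]
y_corr = [true_slope * x + omitted_effect * z + e
          for x, z, e in zip(included, omitted_corr, noise)]

slope_indep = ols_slope(included, y_indep)  # stays close to true_slope
slope_corr = ols_slope(included, y_corr)    # drifts towards -0.1 + 1.0 * 0.2

print(round(slope_indep, 2), round(slope_corr, 2))
```

In the first case the fitted slope lands near the true value even though a relevant variable has been left out; in the second it absorbs part of the omitted variable's effect, which is the bias described above.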

Using these equations and historical home status and HAM data, we can calculate that the average (designated) home team receives 8 fewer points start than it did in the previous week, and the average (designated) away team receives 8 points more.

All of which Means Exactly What Now?

Okay, so what do these equations tell us?

Firstly let's consider teams playing at home in the current week. The nature of the AFL draw is such that it's more likely than not that a team playing at home in one week played away in the previous week in which case the Change in Home Status for that team will be +1 and their equation can be rewritten as

Change in Start = -10.694 - 0.072 x HAM in Previous Week

So, the start for a home team will tend to drop by about 10.7 points relative to the start it received in the previous week (because they're at home this week) plus about another 1 point for every 14 points higher their HAM was in the previous week. Remember: the more positive the HAM, the larger the margin by which the spread was covered.

Next, let's look at teams playing away in the current week. For them it's more likely than not that they played at home in the previous week in which case the Change in Home Status will be -1 for them and their equation can be rewritten as

Change in Start = 11.276 - 0.155 x HAM in Previous Week

Their start, therefore, will tend to increase by about 11.3 points relative to the previous week (because they're away this week) less 1 point for every 6.5 points higher their HAM was in the previous week.

Away teams, therefore, are penalised more heavily for larger HAMs than are home teams.
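For anyone wanting to check the arithmetic, here is a rough sketch that plugs the fitted Change in Start coefficients quoted earlier into a Python function. The names are my own, and the +20 HAM used at the end is just an illustrative input.

```python
# Fitted Change in Start coefficients as quoted in the text (2006 to 2010)
START_INTERCEPT = {"home": -2.453, "away": 3.035}
START_HAM_COEF = {"home": -0.072, "away": -0.155}
START_HOME_STATUS_COEF = -8.241


def change_in_start(status_this_week: str, previous_ham: float,
                    home_status_change: int) -> float:
    """Predicted change in start for a team that is 'home' or 'away' this week."""
    return (START_INTERCEPT[status_this_week]
            + START_HAM_COEF[status_this_week] * previous_ham
            + START_HOME_STATUS_COEF * home_status_change)


# Simplified intercepts: home teams that were away last week (+1),
# away teams that were at home last week (-1), both with a zero HAM.
home_base = change_in_start("home", 0.0, +1)
away_base = change_in_start("away", 0.0, -1)
print(round(home_base, 3), round(away_base, 3))  # -10.694 11.276

# HAM points needed to move the start by one point: 1 / |coefficient|
ham_per_point_home = 1 / abs(START_HAM_COEF["home"])
ham_per_point_away = 1 / abs(START_HAM_COEF["away"])
print(round(ham_per_point_home, 1), round(ham_per_point_away, 1))  # 13.9 6.5

# Illustrative input: both teams covered the spread by 20 points last week.
print(round(change_in_start("home", 20.0, +1), 3))  # -12.134
print(round(change_in_start("away", 20.0, -1), 3))  # 8.176
```

The same 20-point HAM trims about 1.4 points from the home team's start but about 3.1 points from the away team's, which is the asymmetry noted above.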

This I offer as one source of potential bias, similar to the bias that was found in the original article I read.

Proving the Bias

As a simple way of quantifying any bias I've fitted what's called a binary logit to estimate the following model:

Probability of Winning on Line Betting = f(Result on Line Betting in Previous Week, Start Received, Home Team Status)

This model will detect any bias in line betting results that's due to an over-reaction to the previous week's line betting results, a tendency for teams receiving particular sized starts to win or lose too often, or to a team's home team status.

The result is as follows:

logit(Probability of Winning on Line Betting) = -0.0269 + 0.054 x Previous Line Result + 0.001 x Start Received + 0.124 x Home Team Status

The only coefficient that's statistically significant in that equation is the one on Home Team Status and it's significant at the 5% level. This coefficient is positive, which implies that home teams win on line betting more often than they should.

Using this equation we can quantify how much more often. An away team, we find, has about a 46% probability of winning on line betting, a team playing at a neutral venue has about a 49% probability, and a team playing at home has about a 52% probability.
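Those probabilities can be recovered from the fitted logit with a few lines of Python. Two assumptions are baked into this sketch and neither is stated in the text: Home Team Status is coded -1 for away, 0 for neutral and +1 for home, and the other two variables are held at zero. That coding is simply the one that reproduces the quoted figures.

```python
import math


def line_win_probability(home_status: int,
                         previous_line_result: float = 0.0,
                         start_received: float = 0.0) -> float:
    """Probability of winning on line betting from the fitted logit.

    home_status is ASSUMED to be -1 (away), 0 (neutral) or +1 (home).
    """
    logit = (-0.0269
             + 0.054 * previous_line_result
             + 0.001 * start_received
             + 0.124 * home_status)
    return 1.0 / (1.0 + math.exp(-logit))


for label, status in [("away", -1), ("neutral", 0), ("home", 1)]:
    print(label, round(line_win_probability(status), 3))
# away 0.462
# neutral 0.493
# home 0.524
```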

That is undoubtedly a bias, but I have two pieces of bad news about it. Firstly, it's not large enough to overcome the vig on line betting at $1.90 and secondly, it disappeared in 2010.
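The first piece of bad news is easy to verify. As a quick sketch (function names are mine), a level-stake bet at a price of $1.90 needs to win more often than 1/1.90 of the time to break even, and a roughly 52% strike rate falls just short of that.

```python
def break_even_probability(price: float) -> float:
    """Win probability at which a bet at the given decimal price returns zero."""
    return 1.0 / price


def expected_roi(win_probability: float, price: float) -> float:
    """Expected return per unit staked: collect `price` on a win, lose the stake otherwise."""
    return win_probability * price - 1.0


required = break_even_probability(1.90)
roi_home = expected_roi(0.52, 1.90)

print(round(required, 4))  # 0.5263
print(round(roi_home, 3))  # -0.012
```

So a home team winning on line betting about 52% of the time still loses a little over 1 cent per dollar wagered at $1.90.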

Do Margins Behave Like Starts?

We now know something about how the points start given by the TAB Sportsbet bookie responds to a team's change in estimated quality and to a change in a team's home status. Do the actual game margins respond similarly?

One way to find this out is to use exactly the same equation as we used above, replacing Change in Start with Change in Margin and defining the change in a team's margin as its margin of victory this week less its margin of victory last week (where victory margins are negative for losses).

If we do that and run the new regression model, we get the following:

For a team (designated to be) playing at home in the current week:

Change in Margin = 4.058 - 0.865 x HAM in Previous Week + 8.801 x Change in Home Status

For a team (designated to be) playing away in the current week:

Change in Margin = -4.571 - 0.865 x HAM in Previous Week + 8.801 x Change in Home Status

These equations explain an impressive 38.7% of the variability in the change in margin. We can simplify them, as we did for the regression equations for Change in Start, by using the fact that the draw tends to alternate teams' home and away status from one week to the next.

So, for home teams:

Change in Margin = 12.859 - 0.865 x HAM in Previous Week

While, for away teams:

Change in Margin = -13.372 - 0.865 x HAM in Previous Week
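The same simplification can be checked in code. This sketch uses the fitted Change in Margin coefficients quoted above; the function name and the +20 HAM example are mine.

```python
# Fitted Change in Margin coefficients as quoted in the text
MARGIN_INTERCEPT = {"home": 4.058, "away": -4.571}
MARGIN_HAM_COEF = -0.865
MARGIN_HOME_STATUS_COEF = 8.801


def change_in_margin(status_this_week: str, previous_ham: float,
                     home_status_change: int) -> float:
    """Predicted change in margin for a team that is 'home' or 'away' this week."""
    return (MARGIN_INTERCEPT[status_this_week]
            + MARGIN_HAM_COEF * previous_ham
            + MARGIN_HOME_STATUS_COEF * home_status_change)


# Teams that swapped home status from last week, with a zero previous HAM
print(round(change_in_margin("home", 0.0, +1), 3))  # 12.859
print(round(change_in_margin("away", 0.0, -1), 3))  # -13.372

# Illustrative input: a home team that covered the spread by 20 points last week
print(round(change_in_margin("home", 20.0, +1), 3))  # -4.441
```

In the last case the 20-point HAM more than wipes out the home-status boost, so the predicted margin is lower than last week's despite the move to a home game.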

At first blush it seems a little surprising that a team's HAM in the previous week is negatively correlated with its change in margin. Why should that be the case?

It's a phenomenon that we've discussed before: regression to the mean. What these equations are saying is that teams that perform better than expected in one week - where expectation is measured relative to the line betting handicap - are likely to win by slightly less than they did in the previous week or lose by slightly more.

What's particularly interesting is that home teams and away teams show mean regression to the same degree. The TAB Sportsbet bookie, however, hasn't always behaved as if this was the case.

Another Approach to the Source of the Bias

Bringing the Change in Start and Change in Margin equations together provides another window into the home team bias.

The simplified equations for Change in Start were:

Home Teams: Change in Start = -10.694 - 0.072 x HAM in Previous Week

Away Teams: Change in Start = 11.276 - 0.155 x HAM in Previous Week

So, for teams whose previous HAM was around zero (which is what the average HAM should be), the typical change in start will be around 11 points - a reduction for home teams, and an increase for away teams.

The simplified equations for Change in Margin were:

Home Teams: Change in Margin = 12.859 - 0.865 x HAM in Previous Week

Away Teams: Change in Margin = -13.372 - 0.865 x HAM in Previous Week

So, for teams whose previous HAM was around zero, the typical change in margin will be around 13 points - an increase for home teams, and a decrease for away teams.

Overall the 11-point v 13-point disparity favours home teams since they enjoy the larger margin increase relative to the smaller decrease in start, and it disfavours away teams since they suffer a larger margin decrease relative to the smaller increase in start.
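One way to put a number on that disparity is to add the two changes together. The sketch below assumes that a team's handicap-adjusted margin is its actual margin plus its start, which is my reading of HAM rather than a definition given in this post, so that the expected change in HAM is the change in margin plus the change in start. It uses the simplified intercepts for teams with a zero previous HAM.

```python
# Simplified intercepts quoted above, for teams with a zero previous HAM
CHANGE_IN_START = {"home": -10.694, "away": 11.276}
CHANGE_IN_MARGIN = {"home": 12.859, "away": -13.372}

# ASSUMPTION: HAM = margin + start, so the two changes simply add.
net_shift = {
    status: CHANGE_IN_MARGIN[status] + CHANGE_IN_START[status]
    for status in ("home", "away")
}

for status, shift in net_shift.items():
    print(status, round(shift, 3))
# home 2.165
# away -2.096
```

On that reading, the handicap moves by about 2 points too little in each direction, in the home team's favour and to the away team's cost.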

To Conclude

Historically, home teams win on line betting more often than away teams. That means home teams tend to receive too much start and away teams too little.

I've offered two possible reasons for this:

  1. Away teams suffer larger reductions in their handicaps for a given previous week's HAM
  2. For teams with near-zero previous week HAMs, starts only adjust by about 11 points when a team's home status changes but margins change by about 13 points. This favours home teams because the increase in their expected margin exceeds the expected decrease in their start, and works against away teams for the opposite reason.

If you've made it this far, my sincere thanks. I reckon your brain's earned a spell; mine certainly has.