Matter of Stats

Using the In-Running Models

In the previous blog I provided three models to predict, in-running, the outcome of an AFL game. Each model was designed to be used at the end of a particular quarter and uses as inputs only the pre-game TAB Bookmaker prices and the Home team leads at the end of all previous quarters, if any. These models were: 

  1. logit(Predicted Probability of Home Team Win at End of Q1) = 0.08 + 0.42 x Pre-Game Log Odds Ratio for Home Team + 0.05 x Home Team Lead at End of Q1
  2. logit(Predicted Probability of Home Team Win at End of Q2) = 0.06 + 0.36 x Pre-Game Log Odds Ratio for Home Team - 0.02 x Home Team Lead at End of Q1 + 0.09 x Home Team Lead at End of Q2
  3. logit(Predicted Probability of Home Team Win at End of Q3) = 0.01 + 0.28 x Pre-Game Log Odds Ratio for Home Team - 0.02 x Home Team Lead at End of Q1 + 0.02 x Home Team Lead at End of Q2 + 0.11 x Home Team Lead at End of Q3
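As a rough illustration, the three models above can be written as small Python functions. The function names are hypothetical, leads are measured in points (negative when the Home team trails), and the sketch assumes that the log odds ratio is the natural log of the Home team's odds.

```python
import math

def inv_logit(x):
    """Convert a logit (log odds) value into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def p_home_q1(log_odds, lead_q1):
    """Model 1: Home team victory probability at the end of Q1."""
    return inv_logit(0.08 + 0.42 * log_odds + 0.05 * lead_q1)

def p_home_q2(log_odds, lead_q1, lead_q2):
    """Model 2: Home team victory probability at the end of Q2."""
    return inv_logit(0.06 + 0.36 * log_odds - 0.02 * lead_q1 + 0.09 * lead_q2)

def p_home_q3(log_odds, lead_q1, lead_q2, lead_q3):
    """Model 3: Home team victory probability at the end of Q3."""
    return inv_logit(0.01 + 0.28 * log_odds - 0.02 * lead_q1
                     + 0.02 * lead_q2 + 0.11 * lead_q3)

# Example: an equal favourite (log odds of 0) that leads by a goal at every change
print(f"Q1: {p_home_q1(0.0, 6):.1%}")
print(f"Q2: {p_home_q2(0.0, 6, 6):.1%}")
print(f"Q3: {p_home_q3(0.0, 6, 6, 6):.1%}")
```

Each function simply evaluates the right-hand side of the corresponding equation and then inverts the logit to recover a probability.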

In this blog we'll explore the practical application of these models, in particular by investigating how the Home team's estimated victory probability varies depending on the lead it had established - which will be negative if it trails - at the end of the just-finished quarter.

We'll calculate these probabilities for Home teams with varying levels of pre-game Bookmaker enthusiasm, ranging from rampant favouritism to significant underdoggedness. To be used in the models these levels of enthusiasm must be expressed as log odds ratios which, though statistically most-predictive when expressed in this form, are not humanly most-intuitive, so here's a table showing the conversion of pre-game prices to probabilities and log odds ratios under the assumption of a fixed 5% vig.

For the vast majority of contests, the Home team price will range between about $1.10 and $6.00, meaning that the log odds ratio will be in the +1.80 to -1.60 range. Certainly it will be very rare to find a game where the Home team's log odds ratio will be outside the -3 to +3 range. Roughly speaking, at a log odds ratio of +3 the Home team is about 20 times as likely to win as the Away team, and at a log odds ratio of -3 the Away team is about 20 times as likely to win as the Home team. (Note that I'm using natural logs, that is logs to base e, here.)
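The conversion from a price to a log odds ratio can be sketched as below. The exact way the 5% vig is removed isn't spelled out here, so this sketch assumes the simplest approach: divide the raw implied probability (1/price) by 1.05. It also assumes natural logs. The function names are hypothetical.

```python
import math

VIG = 0.05  # fixed 5% vig, as assumed in the text

def price_to_probability(price, vig=VIG):
    """Implied Home team probability from its price.

    Assumption: the vig is removed by dividing 1/price by (1 + vig).
    """
    return 1.0 / (price * (1.0 + vig))

def price_to_log_odds(price, vig=VIG):
    """Natural log of the Home team's odds, p / (1 - p)."""
    p = price_to_probability(price, vig)
    return math.log(p / (1.0 - p))

for price in (1.10, 1.50, 1.90, 3.00, 6.00):
    p = price_to_probability(price)
    print(f"${price:.2f}: probability {p:.1%}, log odds {price_to_log_odds(price):+.2f}")
```

Under these assumptions a $1.10 price maps to a log odds ratio a little under +1.9 and a $6.00 price to a little under -1.6, which is broadly in line with the range quoted above.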

A few other log odds values are useful anchor points: a log odds ratio of zero denotes equal favouritism, a ratio of +1 represents about a 75% probability of victory for the Home team, and a ratio of +2 represents about a 90% probability of victory. 

Home team underdogs all carry negative log odds ratios, with a value of -1 representing about a 25% victory probability for the Home team, and a value of -2 representing about a 10% probability.

In this blog I'll be considering Home teams with log odds ratios of +0.90, +0.35, 0.00, -0.25, -1.08 and -1.94, which roughly correspond to Home teams with victory probabilities of 73%, 60%, 50%, 45%, 25% and 12%.
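Going the other way, from a log odds ratio back to a probability, is just the inverse logit. A minimal sketch, assuming natural logs and using a hypothetical function name:

```python
import math

def log_odds_to_probability(log_odds):
    """Inverse logit: probability implied by a (natural) log odds ratio."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# The six pre-game log odds ratios considered in this blog
for log_odds in (0.90, 0.35, 0.00, -0.25, -1.08, -1.94):
    print(f"{log_odds:+.2f} -> {log_odds_to_probability(log_odds):.0%}")
```

The results land close to, though not always exactly on, the rounded percentages quoted above.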

QUARTER TIME MODEL

Firstly then, let's look at the first model, which gives us the estimated Home team victory probability as at the end of the 1st quarter given, as inputs, the Home team's pre-game log odds ratio and the lead it had established at the first change.

The chart shows how this estimated victory probability varies depending on the lead, and demonstrates the relative importance of the Home team's pre-game log odds in determining its estimated victory probability. In Bayesian terms, it shows how significantly the Home team's posterior probability, conditional on its lead at the end of Q1, is dependent on the Home team's prior probability as reflected in the Bookmaker's pre-game prices (and hence log odds assessment).

This dependence on the pre-game Bookmaker prices is reflected in the vertical distance between the lines in this chart, each of which traces the estimated victory probability of a Home team with a specified pre-game price. So, for example, we find that a Home team that was the pre-game heavy underdog (with a log odds ratio of -1.94) is estimated as having only about a 1-in-3 probability of victory if it finds itself tied at Quarter time, while a Home team that was a pre-game comfortable favourite (with a log odds ratio of +0.9) is assessed as a slightly better than 60% chance of winning.

Whilst there is a general tendency towards loyalty for Home team favourites, this loyalty has its limits. For example, Home teams that were comfortable pre-game favourites (ie with a +0.9 log odds ratio or about a 75% chance of victory) are assessed as less than even money chances if they find themselves trailing by as little as a couple of goals at Quarter time. The empirical data bears this out - though the sample is small - with Home teams priced between $1.20 and $1.40 and trailing by 2 goals or more at Quarter time winning only 14 of 29 games across the period 2006 to 2012.
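The Quarter time figures quoted above can be reproduced directly from the first model. The sketch below also solves for the Quarter time lead at which the model rates the Home team an exactly even-money chance. The function names are hypothetical and natural logs are assumed.

```python
import math

def p_home_q1(log_odds, lead_q1):
    """Quarter time model: Home team victory probability."""
    logit = 0.08 + 0.42 * log_odds + 0.05 * lead_q1
    return 1.0 / (1.0 + math.exp(-logit))

def break_even_lead_q1(log_odds):
    """Quarter time lead (in points) at which the model gives exactly 50%.

    Found by setting the logit to zero and solving for the lead.
    """
    return -(0.08 + 0.42 * log_odds) / 0.05

print(f"Heavy underdog, tied:       {p_home_q1(-1.94, 0):.1%}")
print(f"Comfortable favourite, tied: {p_home_q1(0.90, 0):.1%}")
print(f"Favourite, 2 goals down:     {p_home_q1(0.90, -12):.1%}")
print(f"Favourite's break-even lead: {break_even_lead_q1(0.90):+.1f} points")
```

For the +0.9 favourite the break-even point is a deficit of a little over 9 points, so a 2 goal deficit is already enough to push it below 50%.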

HALF TIME MODEL

Scenario modelling using the second model is a little more complicated, because now we need to consider the Home team's lead at Quarter and at Half time, as well as its pre-game log odds ratio. Firstly then, let's consider Home teams that trailed by 2 goals at Quarter time.

The Home team's pre-game log odds, while still important, are somewhat less so, again as evidenced by the spacing of the lines in the chart. Now a Home team that finds itself tied at Half time, having trailed by 2 goals at Quarter time, is assessed as having about a 40% chance of victory if its pre-game log odds ratio was -1.94 and as about a 65% chance of victory if its pre-game log odds ratio was +0.90.

Again too it doesn't take much by way of a points deficit to plunge the most pre-game likely of Home team victors into underdog status. A Home team that had an estimated pre-game victory probability of 75% is estimated as having only about a 30% chance of victory should it trail by 15 points at Half time having trailed by 12 at Quarter time.

On the flip-side, even a small lead is viewed positively by the models, so much so that, for example, a pre-game favourite with a +0.35 log odds ratio (and hence about a 60% pre-game victory probability) would be assessed as about a 70% chance should it lead by as little as 1 goal at the main break having trailed by 2 goals at Quarter time.
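The three Half time scenarios just described, all for Home teams that trailed by 2 goals at Quarter time, can be checked against the second model. This is a minimal sketch with a hypothetical function name, assuming natural logs.

```python
import math

def p_home_q2(log_odds, lead_q1, lead_q2):
    """Half time model: Home team victory probability."""
    logit = 0.06 + 0.36 * log_odds - 0.02 * lead_q1 + 0.09 * lead_q2
    return 1.0 / (1.0 + math.exp(-logit))

Q1_LEAD = -12  # trailed by 2 goals at Quarter time

# (description, pre-game log odds, Half time lead in points)
scenarios = [
    ("Heavy underdog, tied at Half time", -1.94, 0),
    ("Comfortable favourite, tied at Half time", 0.90, 0),
    ("Comfortable favourite, 15 points down", 0.90, -15),
    ("Mild favourite, 1 goal up", 0.35, 6),
]

for label, log_odds, lead_q2 in scenarios:
    print(f"{label}: {p_home_q2(log_odds, Q1_LEAD, lead_q2):.1%}")
```

The outputs fall within a few percentage points of the rounded figures in the text (about 40%, 65%, 30% and 70% respectively).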

It's important to note, however, how unlikely are scenarios in which the lead changes substantially from one quarter-end to the next. The change in the Home team's lead from the end of the 1st quarter until the end of the 2nd is distributed approximately as a Normal random variable (p = 0.4 using the Shapiro-Wilk test statistic) with a mean of 2 points and a standard deviation of about 18 points. As a rough guide then, for about 70% of all games, the change in the Home team's lead between Quarter time and Half time is about plus or minus 3 goals or less. In other words, if the Home team led by 6 points at Quarter time, about 70% of the time the Half time Home team lead will be between about a 2 goal deficit and a 4 goal lead.
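The "about 70%" figure can be sanity-checked from the Normal approximation. The sketch below uses the mean of 2 points and standard deviation of 18 points quoted above; the function names are hypothetical.

```python
import math

MEAN_CHANGE = 2.0   # mean change in Home team lead during Q2 (points)
SD_CHANGE = 18.0    # standard deviation of that change (points)

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a Normal(mean, sd) variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def prob_change_within(limit, mean=MEAN_CHANGE, sd=SD_CHANGE):
    """Probability that the lead changes by no more than +/- limit points."""
    return normal_cdf(limit, mean, sd) - normal_cdf(-limit, mean, sd)

def likely_half_time_range(lead_q1, limit=18):
    """Half time leads reachable with a change of at most +/- limit points."""
    return lead_q1 - limit, lead_q1 + limit

print(f"P(change within 3 goals): {prob_change_within(18):.1%}")
print(f"Likely Half time range from a 6-point lead: {likely_half_time_range(6)}")
```

A change of plus or minus 3 goals (18 points) is roughly one standard deviation either side of the mean, which is why the probability comes out near 70%.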

Caution should be exercised in using the Half time model for games where the change in the Home team lead from what it was at Quarter time is greater than plus or minus 3 goals.

Next, consider Home teams that lead by 2 goals at Quarter time.

Bearing in mind the admonition to avoid using estimates from the Half time model for Half time leads too distant from the Quarter time lead, here we should focus our attention on those parts of the curves between Home team leads of about -6 to +30 points. 

One notable aspect of the probability estimates in this range is that they're below 50% for all Home teams that trail by even a few points at Half time, regardless of the Home team's pre-game log odds ratios. Again this is borne out by (relatively sparse) empirical data which shows that, amongst the 34 Home teams that have led by 2 goals or more at Quarter time and then trailed by any margin at Half time, only 11 have gone on to win.
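One way to see why every curve dips below 50% so quickly is to solve the Half time model for the Half time lead at which each Home team is rated exactly even money, given a 2 goal lead at Quarter time. This is a sketch with hypothetical function names, assuming natural logs.

```python
import math

def p_home_q2(log_odds, lead_q1, lead_q2):
    """Half time model: Home team victory probability."""
    logit = 0.06 + 0.36 * log_odds - 0.02 * lead_q1 + 0.09 * lead_q2
    return 1.0 / (1.0 + math.exp(-logit))

def break_even_lead_q2(log_odds, lead_q1):
    """Half time lead (in points) at which the model gives exactly 50%."""
    return -(0.06 + 0.36 * log_odds - 0.02 * lead_q1) / 0.09

Q1_LEAD = 12  # led by 2 goals at Quarter time

for log_odds in (0.90, 0.35, 0.00, -0.25, -1.08, -1.94):
    lead = break_even_lead_q2(log_odds, Q1_LEAD)
    print(f"log odds {log_odds:+.2f}: break-even Half time lead {lead:+.1f} points")
```

Even for the strongest favourite considered here (+0.90), the break-even point is a deficit of under 2 points, so any Half time deficit of a few points leaves every one of these Home teams below 50%.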

THREE-QUARTER TIME MODEL

Lastly we turn to the third and most complex model, which requires four inputs: three Home team leads and the pre-game log odds ratio.

Six different scenarios are presented below: 

  • Home trail of 12 points at Quarter time and 24 points at Half time
  • Home trail of 12 points at Quarter time and 12 points at Half time
  • Home trail of 12 points at Quarter time and level scores at Half time
  • Home lead of 12 points at Quarter time and level scores at Half time
  • Home lead of 12 points at Quarter time and 12 points at Half time
  • Home lead of 12 points at Quarter time and 24 points at Half time

I'll leave a more detailed review of these charts to the reader but note that the change in the Home team's lead between Half time and Three-Quarter time is also distributed approximately as a Normal random variable, here with a mean of about +2 points and a standard deviation of about 18.6 points. So, as for the Half time model, caution should be exercised in using this model to explore probabilities for games where the change in the Home team's lead between Half time and Three-Quarter time exceeds about plus or minus 3 goals.

(The one thing I will note is the evidence for the reduced importance of the Home team's pre-game pricing, which manifests in the relatively more compressed appearance of the lines in each chart. Note that these charts, and all others in this blog, can be clicked to access larger versions.)
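The compression of the lines can also be seen numerically. The sketch below evaluates the third model for each of the six scenarios and reports the gap between the strongest favourite (+0.90) and the heaviest underdog (-1.94). For illustration only, it assumes the Three-Quarter time lead equals the Half time lead, and compares against the same gap from the Quarter time model with scores level. The function names are hypothetical.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def p_home_q1(log_odds, lead_q1):
    """Quarter time model."""
    return inv_logit(0.08 + 0.42 * log_odds + 0.05 * lead_q1)

def p_home_q3(log_odds, lead_q1, lead_q2, lead_q3):
    """Three-Quarter time model."""
    return inv_logit(0.01 + 0.28 * log_odds - 0.02 * lead_q1
                     + 0.02 * lead_q2 + 0.11 * lead_q3)

FAVOURITE, UNDERDOG = 0.90, -1.94

# (Quarter time lead, Half time lead) for the six scenarios, in points
scenarios = [(-12, -24), (-12, -12), (-12, 0), (12, 0), (12, 12), (12, 24)]

q1_gap = p_home_q1(FAVOURITE, 0) - p_home_q1(UNDERDOG, 0)
print(f"Quarter time model, scores level: gap of {q1_gap:.1%}")

for lead_q1, lead_q2 in scenarios:
    lead_q3 = lead_q2  # illustrative assumption: no change in lead during Q3
    gap = (p_home_q3(FAVOURITE, lead_q1, lead_q2, lead_q3)
           - p_home_q3(UNDERDOG, lead_q1, lead_q2, lead_q3))
    print(f"Q1 {lead_q1:+d}, Q2 {lead_q2:+d}, Q3 {lead_q3:+d}: gap of {gap:.1%}")
```

Because the coefficient on the pre-game log odds ratio falls from 0.42 in the first model to 0.28 in the third, the gap between favourite and underdog is smaller in every one of these scenarios than it is at Quarter time with scores level.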