Scoring Catenation: An Alternative Measure of Momentum
Almost two years ago, in a post-GF funk, I recall painstakingly cutting-and-pasting the scoring progression from the afltables site for 100 randomly-selected games from 2012. I used that data to search for evidence of in-game momentum, there characterising it as the tendency for a team that's just scored to be the team that's more likely to score next.
That blog suffered from a shortage of data and, partly as a consequence, reached only tentative conclusions. Today, armed with the data for over 1,300 games from the period 2008 to 2014 thanks to Paul from that same website, I want to revisit that analysis.
I should note first that the characterisation of momentum used in that earlier blog (and, later, in this one) differs from the one I used much more recently, when I explored the tendency for one or both teams to score at different rates during a game, a phenomenon I referred to as the "burstiness" of scoring. A team was said to have bursty scoring if the "interarrival" times between its successive scoring shots tended to form runs, with a succession of relatively short interarrival times being followed by a succession of longer ones.
For the current blog I'm instead exploring whether momentum manifests in the turn-taking behaviour of scoring between the teams, no matter how far apart in time any successive scores might be. What I care about here is whether scoring tends to alternate between the teams consistent with the total proportion of Scoring Shots that they each ultimately produce, or whether scoring tends to occur in runs, first one team and then the other. Let's call this an exploration of whether or not scoring is "catenative" - a word used in grammar and in organic chemistry to describe a tendency to form chains.
Whereas the notion of "burstiness" can be applied to each team in a given contest separately - so that one, both or neither can exhibit it in a single game - "catenation" is a joint characteristic of the two teams' scoring in a given game and so only applies at the level of an entire game. A team can have "bursty" scoring but only a game can have "catenative" scoring.
THE METHODOLOGY
For each of the 1,364 games in the sample we form a chain representing the sequence of scoring that took place during the game, paying attention only to whether the Home or the Away team scored, and ignoring both when the scoring occurred and whether it was a Goal, Behind or Rushed Behind.
So, for example, the sequence for the first game of 2008 was the following, HAHAAHHHAAAHAHHAHHHHHAHAAHAAHHHHAHAAAAHHAAAHHAAA, reflecting that the Home team scored first, then the Away team, then the Home team, and then the Away team twice in a row, and so on. In forming these sequences, Quarter breaks are ignored.
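As a rough illustration of how such a chain breaks down into runs, here is a minimal Python sketch. The original analysis was done in R; Python is used here purely for illustration, and the function name run_lengths is my own, hypothetical choice.

```python
from itertools import groupby


def run_lengths(sequence):
    """Collapse a scoring chain such as 'HAHAA' into (team, run length) pairs."""
    return [(team, len(list(group))) for team, group in groupby(sequence)]


# Scoring chain for the first game of 2008, as quoted in the text above.
first_game_2008 = "HAHAAHHHAAAHAHHAHHHHHAHAAHAAHHHHAHAAAAHHAAAHHAAA"

runs = run_lengths(first_game_2008)

# First four runs: Home once, Away once, Home once, then Away twice in a row.
print(runs[:4])
# Total number of runs in the game, the quantity the runs test works with.
print(len(runs))
```

Because Quarter breaks are ignored, the whole game is treated as a single string and a run can span a break.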
Having formed the sequence for a game, we then use the pruns.exact function from the randomizeBE package in R to determine whether the number of scoring runs is small enough to be unlikely under the assumption (technically, the null hypothesis) that a team's likelihood of scoring next is unrelated to whether or not it scored last. The intuition here is that a scoring sequence such as "HHHHAAAAHHHAAAHHHAAA" is less likely to have been produced at random by two teams each with a 50% chance of recording the next score, regardless of which scored last, than is, say, a scoring sequence such as "HAAAHAAHHHHAAHHAHAHH". The first sequence has only 6 runs, which is too few to be plausible under that assumption, while the second has 11 runs.
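For readers who'd like to see the mechanics, the following is a minimal Python sketch of a lower-tailed exact runs test applied to the two example sequences. It is my own illustrative implementation, not the pruns.exact code from randomizeBE, and the function names are hypothetical. It assumes, as exact runs tests conventionally do, that the number of scoring shots by each team is taken as given and that every ordering of those shots is equally likely under the null hypothesis.

```python
from itertools import groupby
from math import comb


def count_runs(sequence):
    """Number of maximal blocks of identical symbols in the sequence."""
    return sum(1 for _ in groupby(sequence))


def runs_pmf(r, n1, n2):
    """P(exactly r runs) for n1 symbols of one type and n2 of the other (both >= 1)."""
    k, odd = divmod(r, 2)
    if k < 1:
        return 0.0  # with both teams scoring there must be at least 2 runs
    if odd:
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
    else:
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    return ways / comb(n1 + n2, n1)


def lower_tail_p(sequence, home="H"):
    """P(observed number of runs or fewer) under the null hypothesis."""
    n1 = sequence.count(home)
    n2 = len(sequence) - n1
    if min(n1, n2) == 0:
        return 1.0  # only one team scored, so a single run is the only outcome
    observed = count_runs(sequence)
    return sum(runs_pmf(r, n1, n2) for r in range(2, observed + 1))


p_chained = lower_tail_p("HHHHAAAAHHHAAAHHHAAA")   # 6 runs
p_ordinary = lower_tail_p("HAAAHAAHHHHAAHHAHAHH")  # 11 runs
print(p_chained, p_ordinary)
```

On this sketch the 6-run sequence returns a p-value of roughly 0.019, comfortably below a 10% threshold, while the 11-run sequence returns a p-value well above it.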
More formally, I performed a one-tailed exact runs test using a 10% significance level. The key thing to recognise is that, even if the actual scoring sequences were generated by a process consistent with my null hypothesis, under which no catenation exists, I should still expect to find statistically significant levels of catenation in roughly 10% of games.
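That point about false positives can be checked with a small simulation. The sketch below is illustrative only: it assumes 2,000 hypothetical games of 48 scoring shots each in which every shot is equally likely to belong to either team, and it uses my own compact lower-tailed exact runs test rather than pruns.exact. The seed, game count and shot count are arbitrary choices of mine. Because the exact test is discrete, the proportion of games flagged should sit at or a little below 10% rather than exactly on it.

```python
import random
from itertools import groupby
from math import comb


def lower_tail_p(seq):
    """Exact P(observed runs or fewer), given each team's number of scoring shots."""
    n1 = seq.count("H")
    n2 = len(seq) - n1
    if min(n1, n2) == 0:
        return 1.0  # only one team scored, so there is nothing to test
    observed = sum(1 for _ in groupby(seq))

    def ways(r):
        # Number of orderings of the shots that produce exactly r runs.
        k, odd = divmod(r, 2)
        if odd:
            return (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                    + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
        return 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)

    return sum(ways(r) for r in range(2, observed + 1)) / comb(n1 + n2, n1)


rng = random.Random(2014)  # fixed seed so that the sketch is repeatable
n_games, shots_per_game, alpha = 2000, 48, 0.10

# Simulate games with no catenation at all and count how many are flagged anyway.
flagged = sum(
    lower_tail_p("".join(rng.choice("HA") for _ in range(shots_per_game))) <= alpha
    for _ in range(n_games)
)
false_positive_rate = flagged / n_games
print(f"{false_positive_rate:.3f}")
```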
THE RESULTS
Enriched as I am with such a large sample of data for this current investigation, I'm able to analyse meaningful subsets of it. Specifically, I've looked at the evidence for catenation on a season-by-season, and on a team-by-team basis.
Firstly then, let's look at the season-by-season analysis, which reveals that, in each of the seasons 2011 to 2014, more than 10% of games produced scoring patterns that were statistically significant at the 10% level.
Season 2012, the season from which I drew the 100-game sample for the earlier analysis, was not one riddled with catenative scoring behaviour, however. Only 14% of games in that season produced scoring patterns with few enough runs that they were statistically significant at the 10% level, which is consistent with the 12% figure I estimated in that earlier blog. Both of the seasons either side of 2012 - more commonly referred to as 2011 and 2013 - included much larger proportions of games with evidence for catenation. The home-and-away season of 2014, however, has seen a return to more 2012-like patterns.
Overall, across the 1,364 games, 13.6% produced scoring patterns that were statistically significant at the 10% level. That's almost 4.5 standard deviations away from the 10% we'd expect if no catenation were present at all, so there is definite evidence for its existence. But it's not an overwhelmingly common feature of the game.
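The standard deviation figure can be reproduced with a short calculation. This sketch assumes that each game is treated as an independent trial with a 10% chance of being flagged under the null hypothesis, so that the proportion flagged has a binomial standard error. The variable names are my own.

```python
from math import sqrt

n_games = 1364        # games in the sample
observed_rate = 0.136 # proportion of games flagged as catenative
null_rate = 0.10      # proportion expected if no catenation exists

# Standard deviation of the flagged proportion under the null hypothesis.
standard_error = sqrt(null_rate * (1 - null_rate) / n_games)

# Distance of the observed proportion from the null, in standard deviations.
z = (observed_rate - null_rate) / standard_error
print(round(standard_error, 4), round(z, 2))
```

The result is a little over 4.4 standard deviations, in line with the "almost 4.5" quoted above.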
Finally, a look at the team-by-team data, which reveals that, when playing at Home, Fremantle, Port Adelaide, Adelaide and Hawthorn all have a much greater than average tendency to be involved in games where the scoring is catenative, and Melbourne and the Brisbane Lions have a slightly smaller than average tendency to do likewise.
When playing Away, the Kangaroos, Port Adelaide and Carlton are the teams most commonly involved in catenative scoring and Hawthorn is the team least commonly so involved.
THE CONCLUSION
Some evidence does exist for the notion of in-game momentum characterised as catenative scoring, that is, scoring that comprises fewer runs than would be expected if a team's having scored last did not influence its probability of scoring next.
Such scoring behaviour is not particularly prevalent, however, with fewer than 1 in 7 games displaying it across the period from 2008 to 2014. Even in those seasons where it has been more common, it's been absent from 80% of games. As well, only a handful of teams have contributed to catenative scoring more often than 1 game in 6 when playing at Home or Away.