Matter of Stats

How Many Disposals Do You Need to Get the Coaches' Attention?

In the previous blog we investigated the differences between coaches and umpires in the player statistics they appear to take most notice of when casting their respective player-related votes.

We found some similarities (both are very influenced by disposal counts), and some differences (coaches are more influenced by whether the player is on the winning or losing team), but one thing we didn’t investigate was the specific nature of the relationships between individual player metrics and voting behaviour. For example, we know that disposals are an important metric in determining Brownlow and Coaches’ votes, but we don’t know exactly how the number of votes that a player receives varies as the disposal count changes.

COACHES’ VOTES AND PLAYER METRICS

In this blog we’re going to address that question for coaches’ votes, and look at the relationship between individual metrics and those votes (using the same data as in the previous blog). We’ll start by using the ranger package to fit a random forest to a balanced sample of coaches’ votes data. Specifically, we’ll use dplyr’s slice_sample() function to create a sample (with replacement) of 500 observations for each coaches’ vote value from 0 to 10. We therefore end up with a dataset of 500 x 11 = 5,500 rows: 500 with coaches’ votes of 0, 500 with coaches’ votes of 1, and so on up to 10.

To estimate the potential impact, if any, of sample size, we’ll also create a dataset with 2,000 rows for each coaches’ vote value (so 11 x 2,000 = 22,000 rows in total).
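As a rough sketch of this sampling step (the data frame, column names, and vote distribution below are hypothetical stand-ins for the real player-game data, not the original code):

```r
library(dplyr)

set.seed(2024)  # the with-replacement sample is random, so fix the seed

# Hypothetical stand-in for the player-game data used in the blog
player_stats <- tibble(
  coaches_votes = sample(0:10, 50000, replace = TRUE,
                         prob = c(0.85, rep(0.015, 10))),
  disposals     = rpois(50000, 18)
)

# 500 rows, sampled with replacement, for each vote value from 0 to 10
balanced_sample <- player_stats %>%
  group_by(coaches_votes) %>%
  slice_sample(n = 500, replace = TRUE) %>%
  ungroup()

nrow(balanced_sample)  # 11 vote values x 500 rows = 5,500
```

Swapping n = 500 for n = 2000 in the slice_sample() call gives the larger 22,000-row sample.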

In the table at right we look at the variable importance from these two new models compared to what we had for the ordinal forest from the previous blog.

(I should note that, for this blog, I’ve treated coaches’ votes as a continuous variable rather than an ordered factor. Also, we’ve again used the permutation metric rather than the Gini impurity metric on the basis that it is more trustworthy in cases where the model might be prone to overfitting.)
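A minimal sketch of that model fit, on synthetic stand-in data (the metric names and the rule generating the votes are assumptions for illustration, not the blog's actual inputs):

```r
library(ranger)

set.seed(2024)
# Synthetic stand-in for the balanced sample, with a few illustrative metrics
df <- data.frame(
  disposals  = rpois(5500, 18),
  goals      = rpois(5500, 1),
  clearances = rpois(5500, 3)   # pure noise in this toy data
)
df$coaches_votes <- pmin(10, pmax(0,
  round((df$disposals - 15) / 4 + df$goals + rnorm(5500))))

rf_fit <- ranger(
  coaches_votes ~ .,            # votes as a continuous response, per the note above
  data       = df,
  num.trees  = 500,
  importance = "permutation"    # permutation rather than impurity importance
)

# Metrics ranked by permutation importance, largest first
sort(rf_fit$variable.importance, decreasing = TRUE)
```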

The main thing to notice is the broad similarity between the ranking of the various metrics, especially between those of the models we’ve built especially for this blog. Across the 31 metrics, the maximum ranking difference is just 4 spots and the average only 1.2 spots, due partly to the 11 metrics that have identical rankings in both models.

That gives us some comfort that we can use either model for any further analysis, and we’ll opt to use the model with the smaller sample size because it speeds up the creation of the new outputs considerably.

USING AN ICEBOX

One way of estimating the conditional effect of a metric on a target variable is via what are called Individual Conditional Expectations (ICE). For a single row of the data, an ICE curve tracks the model’s prediction of the target as we vary the selected predictor from its minimum value to its maximum, and all values in between, while holding every other predictor at its observed value. Averaging those curves across all rows gives the partial dependence.

So, for example, to investigate the relationship between Disposals and Coaches’ Votes, we use the ranger random forest, take the first row of the data and replace the Disposal count with the smallest count seen in the data, keeping all other metrics constant, and calculate what the model estimates the Coaches’ Votes value would be. We then repeat this exercise for a grid of Disposal values up to the maximum value seen in the data. We therefore end up with 5,500 estimates of Coaches’ Votes for each of the Disposal count values in our grid, which we average to come up with what are also called Partial Dependence values.
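That averaging recipe can be sketched by hand. The snippet below reuses a toy stand-in dataset and model (names and the vote-generating rule are illustrative assumptions), not the blog's own 5,500-row sample:

```r
library(ranger)

set.seed(1)
# Toy stand-in data: votes loosely driven by disposals and goals
df <- data.frame(disposals = rpois(2000, 18), goals = rpois(2000, 1))
df$coaches_votes <- pmin(10, pmax(0,
  round((df$disposals - 15) / 4 + df$goals + rnorm(2000))))

rf_fit <- ranger(coaches_votes ~ disposals + goals, data = df, num.trees = 200)

# For each Disposal value in the grid, overwrite every row's disposal count,
# predict, and average the predictions: one Partial Dependence value per grid point
disposal_grid <- seq(min(df$disposals), max(df$disposals))
pdp <- sapply(disposal_grid, function(d) {
  df_mod <- df
  df_mod$disposals <- d
  mean(predict(rf_fit, data = df_mod)$predictions)
})
```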

To calculate these expectations we’ll use the wonderfully named ICEbox package in R.
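A sketch of the ICEbox call, again on toy stand-in data rather than the blog's own. Because ranger isn't one of ICEbox's built-in model types, a custom predictfcn is supplied:

```r
library(ranger)
library(ICEbox)

set.seed(1)
df <- data.frame(disposals = rpois(500, 18), goals = rpois(500, 1))
df$coaches_votes <- pmin(10, pmax(0,
  round((df$disposals - 15) / 4 + df$goals)))

rf_fit <- ranger(coaches_votes ~ disposals + goals, data = df, num.trees = 100)

# Build one ICE curve per row, varying the Disposals predictor
X <- df[, c("disposals", "goals")]
ice_obj <- ice(
  object     = rf_fit,
  X          = X,
  y          = df$coaches_votes,
  predictor  = "disposals",
  predictfcn = function(object, newdata) predict(object, data = newdata)$predictions
)

plot(ice_obj, frac_to_plot = 0.2)  # thin the plotted lines; the PDP is overlaid by default
```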

On the right we have the results for the Disposals metric, which we can see comprises a series of lines each of which tracks the predicted Coaches’ Votes values as we vary the Disposal count for that particular observation. The point on each line is where Coaches’ Votes have been estimated at the actual Disposal count for that observation.

The yellow line tracks the average expected Coaches’ Votes across the 5,500 estimates, and shows little impact on the expected Coaches’ Vote figure until a player gets to about 25 disposals.

After that there is a near linear increase in the expectation for Coaches’ Votes until we get to the rarefied atmosphere of about 38 disposals where we reach the maximum expected value of just under 6 Coaches’ Votes.

The Disposals metric is ranked as the most important metric by all of the models we’ve built so far, and that importance is partly reflected in how much the expectation for Coaches’ Votes varies across the range of Disposal values, especially across the most common values. You can get a feel for where those common values lie from the “rug” shown at the bottom of the chart, which marks the minimum, 10th percentile, 20th percentile, and so on for the Disposals metric.

The second-most important variable according to the ranger model is Goals, and the plot for it appears below left.

Here we see a curvilinear relationship between expected Coaches’ Votes and Goals from 0 to about 5 goals, which represents about 95% of the sample. We see little to no change in the expected Coaches’ Votes for goal hauls above 6.

The third-most important metric is the Team Result, the plot for which appears below right. It shows that the relationship between expected Coaches’ Votes and Team Result is fairly flat for losses of up to about 50 points, and then shows a progressive increase in expected Coaches’ Votes up to a rough maximum at around the level of a draw.

In short, players on losing teams can expect to poll fewest Coaches’ Votes when their team loses by 8 goals or more, and can expect to poll slightly more, on average, for losses of smaller magnitude. Players on winning teams, on the other hand, start from about the same expected Coaches’ Vote count, regardless of the size of the victory.

The gallery below contains the plots for the metrics ranked 4th to 10th in terms of importance.

The broad highlights are:

  • Score Involvements: flat for fewer than 4 then a gentle increase from 4 to about 9 or 10. Flat thereafter. Only a small range from lowest expected Coaches’ Votes to highest.

  • Contested Possessions: flat for fewer than 6 then a gentle increase from 6 to about 16. Flat thereafter. Smallish range from lowest expected Coaches’ Votes to highest.

  • Kicks: flat for fewer than 8 then a gentle increase from 8 to about 20. Flat thereafter. Smallish range from lowest expected Coaches’ Votes to highest.

  • Metres Gained: flat for less than 200 then a gentle increase from 200 to about 700 (which is above the 90th percentile). Flat thereafter. Smallish range from lowest expected Coaches’ Votes to highest.

  • Effective Disposals: flat for fewer than 12 then a gentle increase from 12 to about 30 (which is above the 90th percentile). Flat thereafter. Very small range from lowest expected Coaches’ Votes to highest.

  • Clearances: nearly flat for fewer than 5 then a tiny increase from 5 to about 10 (which is above the 90th percentile). Flat thereafter. Very, very small range from lowest expected Coaches’ Votes to highest.

  • Uncontested Possessions: pretty much flat across the entire range of Uncontested Possession counts.

It’s interesting to note that by the time we’ve reached just the 10th-most important variable, the practical importance seems to be close to zero.

We get an even flatter curve if we plot the results for one of the metrics that is consistently ranked in the bottom few: Goal Assists.

So, a metric that would appear to bear some relationship to how well a player might be assessed as having played has essentially zero effect on the voting behaviour of those who’ve watched the game intently.

SUMMARY AND CONCLUSION

We can see that expected Coaches’ Votes increase generally with the metrics that we’ve found to be important, but that they tend to do so in ranges, starting out flat for smaller values of the metric then rising for a time as that metric count increases, only to plateau for extreme values of the metric.

From this, we can get a rough idea of how many actions of a particular type a player needs to make before he will catch the attention of a voting coach.