Soccer analysts are big fans of correlations. Whenever one of us brings out a new metric, we usually quote some correlations to show how “good” it is. But these correlations can be meaningless or, worse, misleading. Here’s what to look for.
As I’ve written before, most soccer metrics ought to be correlated with winning or with some other objective – perhaps a style of play or avoidance of injury – that is important to the team. Here, for simplicity, I’ll stick with one version of winning: attaining the highest possible position in the league table at the end of the season.
What correlations would describe a useful metric in this situation? For starters, we might want a metric that predicts final positions early in the season. This kind of metric will let us know how our team is doing after only a handful of matches, so that we can make changes if necessary. We can call this Criterion 1.
But we might also want a metric that helps us to play better – a way of measuring performances so we can identify positives worth repeating and negatives to be discarded. So we might be looking for a metric where high values represent a good performance and low values a bad one.
Of course, in this case we’d have to define good and bad performances. Given our objective stated above, a good performance will be one that, if repeated, would deliver a high position in the table. We’ll call this Criterion 2.
For both criteria, it will be important to gauge correlations over several seasons. Correlations can bounce around from season to season; one metric may look crummy in one season, but the same metric might be the winner over the long term. We typically want to measure correlations over as many seasons as possible, as long as all of them represent similar conditions – that is, the same underlying system. That’s why using seasons from different leagues or from many years ago can present a problem.
Here I’m going to use the past five seasons of the English Premier League. The league hasn’t changed too much in fundamental ways since 2010-11, and, critically, data collection hasn’t changed too much, either. So how do some popular metrics fare in this sample?
The most obvious metric – goals – isn’t so hot at fulfilling Criterion 1. After five matches, goal difference was correlated at 0.65 with final positions in the table in this sample. (The correlations are actually negative, because lower-numbered positions are better; here the magnitude is what interests us.) Total shots ratio (TSR) was also correlated at 0.65, but shots on target difference (SoTD) came in at 0.71. The most basic version of NYA’s shots-based expected goals model (“shot creation”) was 0.72.
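As a concrete illustration, here is how such a correlation would be computed. The numbers below are invented for the sketch – they are not the actual EPL figures – but they show why the coefficient comes out negative:

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical goal difference after five matches for six teams...
gd_after_five = [8, 5, 3, 0, -2, -6]
# ...and their (also hypothetical) final positions, where 1 = champions.
final_position = [1, 3, 2, 4, 6, 5]

r = pearson(gd_after_five, final_position)
print(round(r, 2))       # -0.87: negative, because position 1 is best
print(round(abs(r), 2))  # 0.87: the magnitude is what we compare
```

The sign is just an artifact of how positions are numbered; ranking teams from worst to best instead would make the same correlations positive.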
What about Criterion 2? Final goal difference had a correlation of 0.93 with final positions in the table, but goals don’t always mark a performance that’s worth repeating. For example, consider PAOK’s win against Borussia Dortmund in the Europa League last week. PAOK had one shot. It was enough to win the match, but was there anything the club could really take away from the victory except a deeper appreciation of their goalkeeper?
Shots and even shots on target have a similar problem. Presumably, every player already tries to get every shot on target. There’s probably no shot that the shooter believes has zero chance of hitting the goal. But there may be some situations where shots have a greater likelihood of going on frame, and yet not of scoring. For example, a skilled player may regularly be able to put a direct free kick on target from 35 yards, but the chance of scoring from that distance is still relatively low; the goalkeeper has plenty of time to see the ball.
What teams really want are shots with a high chance of scoring; one of the primary purposes of shots-based expected goals models is to identify these shots. So emulating a performance with a lot of shots on target is better than nothing, but emulating one with a lot of expected goals is likely preferable. The lesson here is that a high correlation with positions doesn’t automatically make for a useful metric, either. In any case, here are the rest of the correlations for metrics at the end of the season with positions in the table: TSR 0.80, SoTD 0.83, NYA shot creation 0.87.
We also see correlations of expected goals with actual goals at the team level. A low correlation suggests a lot of the stuff that leads to goals isn’t being captured by the expected goals model. But we should be suspicious of correlations that are extremely high, too, since the models that generate them may include factors that players and coaches can’t control.
A final correlation that we often see is a season-to-season correlation for the same metric. I’m not sure how useful this one is for teams. If no team ever made any changes between seasons, then this correlation might give us some idea of how likely a team was to duplicate its results from the previous season. Yet teams make tons of changes between seasons – players, coaches, even owners – so the overall correlation across the league may have little meaning for individual teams. It’s probably most useful to bettors.
An important caveat to all of this is that correlation coefficients are not always the best way to measure the strength of an association between two variables. The coefficient usually quoted (Pearson’s) only shows how well the data fit a straight line – whether high values of one variable go with consistently high (or consistently low) values of the other – so even a perfect but non-linear relationship can register a correlation near zero. Correlations are also limited to relationships between just two variables; ultimately, we’d want to use more than one variable to predict another.
So take correlation coefficients with a grain of salt – if an analyst can’t explain what they mean for his or her metric in a way that makes intuitive sense, then they probably don’t mean much.