Soccer analytics is still in its infancy. I say that because it lacks a central metric – a way of measuring player performance that is widely accepted by the analytics community, club directors, coaches, players, and fans alike. Yet every day, soccer wonks from around the world come a little bit closer to devising such a metric. Here are some of the traits I think it should have:
1. Accessible. Most of the people who work in professional soccer are not experts in mathematics and statistics, so trying to understand the roots of a complex metric may be too time-consuming. For a metric to gain wide acceptance, it has to be easy to use and understand. We need to choose simplicity whenever possible – perhaps even at a small cost to accuracy and precision – if we want our work to become part of the mainstream in the industry.
2. Incentive compatible. Let’s say I’m a coach, and I start evaluating players using a new metric. Eventually, the players find out what the metric is and how it works. To make themselves look good, they change their playing styles to maximize their scores on the new metric. This is great, as long as their behavior will help the team to win more games. But what if it won’t? A metric that gauges, say, the distance the ball travels towards the opposing goal after being touched by a given player is clearly susceptible to gaming; players could just hoof the ball up the field whenever they received it. By contrast, a metric that is compatible with players’ incentives will encourage them to change in a way that is better for the team.
3. Aggregable. Useful metrics are meaningful for individual players, but the most useful metrics are also meaningful when they are aggregated for entire teams. The most basic aggregable metric would satisfy at least the following relationship: if Team A’s players have a higher score on the metric than Team B’s players at every position, we should expect Team A to beat Team B. As such, an aggregable metric has some capacity for prediction.
4. Fungible. Here we take aggregation a step further. With a fungible metric, if the sum of the scores for Team A’s players is greater than the sum for Team B, then we should expect Team A to win regardless of the distribution of scores among individual players. In other words, if I replace two players on Team A who score 0.6 and 0.4 on a fungible metric with two players who score 0.8 and 0.2, the team’s overall performance should be unchanged*. Because of the interdependence of players in a game, we may never find a completely fungible metric.
5. Non-context-dependent. A question that often faces club directors is how a player from another team would perform if he joined the squad. There is no easy answer, since his performance at the other team depended not just on himself but on his teammates as well. For example, an amazing midfielder who plays behind a terrible shooter won’t get a lot of assists despite his innate quality. By the same token, a mediocre striker who sits in front of a stunning playmaker will bag a ton of goals. Only a metric that can be adjusted for the quality of each player’s teammates would be able to reveal his true ability.
6. Isomorphic. This is a fancy way of saying that the metric offers a one-to-one relationship between its scores and levels of player quality. For instance, take two players who touch the ball rarely. One is terrible, and his teammates don’t pass to him whenever they can avoid it. The other is fantastic, and the opposition always tries to stop him from receiving the ball. The most useful metrics will be able to distinguish between these two players. Moreover, they will also have a score for a player who never touches the ball at all. It’s important to note that this trait does not require players of the same quality to be identical; they can supply the same level of quality in different ways.
7. Parsimonious. Simplicity saves time in both the calculation and interpretation of metrics. From a practical perspective, the most useful metrics are based on the smallest amount of data capable of carrying the information required. The collection of data in soccer is increasing by leaps and bounds, but that doesn’t mean we have to use all of it all the time.
8. Individually robust. As a corollary to the previous trait, we don’t want to use too little data, either. A bench player may have the good fortune to come on as a substitute for only a few minutes in a season and be on the pitch when a couple of goals are scored. At first glance, it may look like his participation was pivotal in improving his team’s goal difference. But he may have played too few minutes for us to accept his remarkable record as anything more than a fluke.
9. Systematically robust. Beware the metric that becomes more or less informative depending on shifts in the way the game is played! Essentially, such a metric needs to be recalibrated every time something about the overall system changes**. If an influx of foreign players or a new offside rule renders a metric obsolete, then scores from the past will not be comparable to scores from the future. An ideal metric will convey useful information as long as the core laws of the game remain the same, allowing players from all eras to be compared against the same standard.
10. Well named. It may sound silly, but metrics need branding just like anything else. A catchy or rhythmic title that is also informative – that is, it tells you the essence of the metric – will help to encourage take-up and usage in daily interactions. Being able to name a metric this way is a good check for accessibility, too. If there’s no easy name for the metric, then the concept behind it may be too complex.
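The difference between traits 3 and 4 is easy to blur, so here is a minimal sketch in Python that spells it out. Everything in it is an assumption made for illustration: the function names, the position labels, and the scores for the opposing team are hypothetical. Only the 0.6/0.4 and 0.8/0.2 pairs come from the example in trait 4.

```python
import math
from typing import Dict

# Hypothetical scores on some metric, keyed by position (illustrative only).
Scores = Dict[str, float]


def dominates(team_a: Scores, team_b: Scores) -> bool:
    """Trait 3 (aggregable): True if A outscores B at every position."""
    if team_a.keys() != team_b.keys():
        raise ValueError("both teams must be scored at the same positions")
    return all(team_a[pos] > team_b[pos] for pos in team_a)


def fungible_favourite(team_a: Scores, team_b: Scores) -> str:
    """Trait 4 (fungible): compare teams by the sum of their scores alone."""
    total_a = sum(team_a.values())
    total_b = sum(team_b.values())
    if math.isclose(total_a, total_b):
        return "even"
    return "A" if total_a > total_b else "B"


# The swap described in trait 4: 0.6 and 0.4 replaced by 0.8 and 0.2.
before_swap = {"midfielder": 0.6, "striker": 0.4}
after_swap = {"midfielder": 0.8, "striker": 0.2}

# A hypothetical opponent that is weaker at both positions.
opponent = {"midfielder": 0.5, "striker": 0.3}

print(dominates(before_swap, opponent))             # True
print(dominates(after_swap, before_swap))           # False
print(dominates(before_swap, after_swap))           # False
print(fungible_favourite(before_swap, after_swap))  # even
print(fungible_favourite(after_swap, opponent))     # A
```

Note what the swap does. Under trait 3 alone, the two line-ups cannot be ranked against each other, because neither is better at every position. Under trait 4, they are declared equal because their sums match. Fungibility is therefore the much stronger claim, which is why it is so hard to satisfy.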
I hope that these traits resonate with the people at the vanguard of the analytics community, even though it may be impossible to capture all of them completely in a single metric. I will be presenting some of my own metrics here in the weeks to come. In the meantime, please feel free to add new traits to my list.
* This is a particular challenge for those who believe in the “weak link” theory of soccer propounded by Anderson and Sally in The Numbers Game.
** This is often true of metrics based on coefficients from neural networks, regressions, and factor analysis.