“Past performance is no guarantee of future results.” This seemingly obvious statement is a mantra in the world of finance, yet it doesn’t seem to have filtered into professional soccer. Today the analytics community shares some of the blame for the omission – hopefully not for long.
Before virtually every game, soccer commentators faithfully recite teams’ head-to-head records as though they could somehow sway the day’s events: “This [insert name of town] derby is finely balanced, with exactly 126 wins for each team and 126 draws in the past 189 years” or “Manchester United haven’t lost to [insert name of team] at Old Trafford since 683 A.D.” Are these statistics relevant to what will actually happen on the field? Often, they are much less relevant than we might think.
If we could measure every possible factor that affected the outcome of a soccer game, then we wouldn’t need the past at all. We’d simply total up the factors for each side, and we’d have the correct prediction of the final score. In reality, of course, we can’t quantify all the factors that might be important; we don’t even know what all the relevant factors are. Still, this approach – building predictions from the bottom up – is the ideal towards which we should strive.
When we fall short of the ideal, the past can help. If we don’t know enough about a given game, looking back at historical statistics can allow us to say, “In this sort of game, what usually happens?” And if playing soccer games were part of a reliably stable process, like flipping coins, we’d have a pretty good idea.
But soccer games aren’t. As a consequence, we can’t rely on the past entirely. For example, until this weekend Manchester United had never lost to West Bromwich Albion in the Premier League. If we had relied completely on the past, we never would have predicted the outcome; precedent suggested that West Brom had zero chance of winning. To allow for any chance of a victory, we would have had to include other factors: West Brom’s good performance against Sunderland in the previous week, Manchester United’s lopsided loss against Manchester City, injuries in the squads, weather, luck, what Morgan Amalfitano ate for breakfast, etc.
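To make the zero-chance problem concrete, here is a minimal sketch in Python. It assumes a made-up head-to-head record (0 wins in 10 meetings, not the real figures) and a made-up outside estimate of 15% drawn from the other factors. The function names and the blending rule, which treats the outside estimate as a number of "pseudo-games", are illustrative assumptions of ours and not an established model.

```python
def raw_frequency(wins, games):
    """Win probability taken from the head-to-head record alone."""
    return wins / games


def blended_estimate(wins, games, prior_prob, prior_weight):
    """Pull the historical frequency toward an outside estimate.

    prior_prob   -- hypothetical win probability from other factors
                    (recent form, injuries, and so on)
    prior_weight -- how many "pseudo-games" that outside estimate is worth
    """
    return (wins + prior_prob * prior_weight) / (games + prior_weight)


# Hypothetical record: the visitors have never won in 10 meetings.
history_wins, history_games = 0, 10

# History alone leaves no room for an upset.
print(raw_frequency(history_wins, history_games))

# Giving other information the same weight as the 10 historical games
# produces a small but nonzero chance.
print(blended_estimate(history_wins, history_games, prior_prob=0.15, prior_weight=10))
```

The larger `prior_weight` is, the more the estimate leans on the present rather than on precedent. With a weight of zero it collapses back to the raw record.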
This isn’t to say that we should ignore the past completely. Perhaps West Brom’s miserable record at Old Trafford really does tell us something useful about the intimidating atmosphere of the stadium, the comfort of the visiting dressing room, or the potholes on the road from the West Midlands. But what we know about the players and tactics on game day can trump all of this information. After all, if West Brom had shown up at Old Trafford with the Brazilian national squad wearing their shirts and Sir Alex Ferguson clutching the team sheet, few pundits would have predicted another loss.
The key in prediction is how we balance history with what we know about the present. Today, we are starting to understand how players fit together as the raw materials that build the potential to win on a given day. We can finally begin to quantify the innate potential that players carry from game to game, and the synthesis that occurs when different groups of players take the field together.
But there is a limit to progress in this direction, too. To obtain the most pertinent information for prediction, we would want to administer some kind of test to the players just before a game kicked off in order to gauge their individual and collective aptitude. The results, considered alongside the manager’s tactics and the conditions in the stadium, would give the best assessment of the team’s potential on the day. Yet it’s unlikely that we’ll ever be able to create such a test, so our evaluation of players will inevitably be linked to their performance in the past.
The question then becomes, “How far back should we go?” The answer is probably, “A lot less far than we go today.” Are the results from five, ten, or twenty years ago really relevant to the Manchester United versus West Bromwich Albion rivalry? I doubt it. If they were, the league table today would look a lot more like this all-time table for the first division of English football.
The same goes for players. Plenty of bona fide stars have won the Professional Footballers’ Association’s Young Player of the Year award, but so have Jermaine Jenas and Harry Kewell. So does it make sense to consider a 29-year-old player’s performance as a 17-year-old, when we also have access to his performance as a 28-year-old? Maybe not so much.
More likely, the relevant information for making a prediction is contained in the past five, ten, or twenty games played by each player and each team. Our task is to analyze those games in a way that reveals more about the aptitude of the players and teams today. When we do that, the commentators will finally have some new numbers to discuss.
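One simple way to favor recent games over distant ones is a recency-weighted average. The sketch below is only an illustration: the function name, the window of ten games, and the decay factor of 0.8 are assumptions chosen for the example, not values drawn from any analysis.

```python
def recent_form(points, window=10, decay=0.8):
    """Recency-weighted average of points per game.

    points -- points earned per game, oldest first (3 = win, 1 = draw, 0 = loss)
    window -- how many of the most recent games to consider (assumed value)
    decay  -- weight multiplier applied for each game further back (assumed value)
    """
    recent = points[-window:]
    if not recent:
        raise ValueError("need at least one game")
    n = len(recent)
    # The newest game gets weight 1; each older game is discounted by `decay`.
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_total = sum(w * p for w, p in zip(weights, recent))
    return weighted_total / sum(weights)


# Hypothetical run of results: three losses followed by three wins.
print(recent_form([0, 0, 0, 3, 3, 3], window=6))
```

A team that has just turned its results around scores higher here than a plain average of the same games would suggest, which is the point: the older a game is, the less it counts.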