Pretty much everyone in the soccer/football world has heard by now that Marcelo Bielsa sent staff to spy on the training sessions of all of Leeds United’s opponents, that he admitted as much in a press conference, and that he then unveiled a substantial amount of his opposition analysis on Derby County. He also revealed that as many as 20 other people work on opposition analysis at Leeds United, and they spent 360 hours analyzing old Derby matches. And this is where I think I can save them some time.
In his own words, here are the key points that Bielsa wanted to derive from those 15 days of video analysis: “the starting 11, the tactics, the approach on set pieces.” More specifically, he also wanted to see “the chances to score, the half chances to score and which team dominates every five minutes” and “to see what were the positions of the players still in Derby from last year.” And don’t forget the “structures Derby play against.”
Well, have I got news for you, Marcelo. Any data analyst worth his or her salt has a shot-creation xG model and can write code that will compile these data for you, on any team in your division, within a few minutes. It doesn’t matter whether your data come from Opta, Ortec, StatsBomb, or any of several other decent providers. For a bit more detail, let me put the things you mentioned in analyst-y language:
- starting 11: listed at the beginning of each match by most data providers
- tactics: formations, directness, use of aerials and long balls, use of width, passing networks, etc. are all tagged explicitly or easily measured
- approach on set pieces: how would you like to see a map of every corner showing where the ball landed, who received it, and what happened next – and then the same for indirect free kicks and throw-ins?
- the chances to score: look for those big bubbles on pretty much any analyst’s xG shot map, or if you prefer a table separated by player and/or location, no problem!
- the half-chances to score: look for the smaller bubbles that are still in the penalty area or the six-yard box
- the positions of the players still in the squad from last season: you can have average location, average location when playing a specific position or executing a specific action, most common positions in different formations, you name it!
- the structures Derby play against: we can do all the same things for opponents, and we can even categorize the opponents by playing style and measure who performed best against Derby in terms of shots, xG, ball progression…
- which team dominates every five minutes: as with the previous item, just pick how you want to measure domination (shots, xG, ball progression…) and we can tally it in five-minute windows
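To make that last item concrete, here is a minimal sketch of "who dominates every five minutes," assuming we choose xG as the measure of domination. The shot list, its values, and the function names are all made up for illustration; a real feed from any of the providers above would need its own parsing step first.

```python
from collections import defaultdict

# Hypothetical shot events: (match minute, team, xG). Values are invented.
shots = [
    (3, "Derby", 0.08),
    (4, "Leeds", 0.31),
    (12, "Leeds", 0.05),
    (13, "Derby", 0.42),
    (14, "Derby", 0.10),
    (27, "Leeds", 0.12),
]

def xg_by_window(shots, window=5):
    """Sum xG per team in consecutive windows of `window` minutes."""
    totals = defaultdict(lambda: defaultdict(float))
    for minute, team, xg in shots:
        start = (minute // window) * window  # e.g. minute 13 -> window starting at 10
        totals[start][team] += xg
    return totals

def dominant_team(totals):
    """Map each window start to the team with the most xG (None on a tie)."""
    result = {}
    for start in sorted(totals):
        ranked = sorted(totals[start].items(), key=lambda kv: kv[1], reverse=True)
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result[start] = None
        else:
            result[start] = ranked[0][0]
    return result

dominance = dominant_team(xg_by_window(shots))
for start, team in dominance.items():
    print(f"window starting at minute {start}: {team}")
```

Windows with no shots simply don't appear, which is one reason you might prefer a measure with more events per window, such as passes into the final third.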
Of course, Bielsa didn’t just want these simple data. He also wanted to know more subtle things, such as “to understand why Derby changes the system and when” and “if the changes are to give an offensive profile to the team or to give defensive strength and if these changes work or not.” Here we do need the help of the video analysts. But we can save them a lot of time by using data to point to the notable moments in each match.
It’s pretty straightforward to queue up video with tagging software (e.g. Sportscode) based on a sync file of events from match data. In other words, the software can read an extract of the match data that highlights specific kinds of events – like a change in attacking formation – so that the video analyst can click through them quickly, only watching a few minutes at a time rather than the entire match. The same goes for the signals players send before corners and indirect free kicks and even their tells when taking penalties.
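As a rough sketch of the idea, here is how a list of event timestamps could be turned into a short playlist of clips. This is not the Sportscode sync format, which I'm not reproducing here; the timestamps, the padding values, and the `clip_windows` function are assumptions for illustration only.

```python
def clip_windows(event_times, before=20.0, after=40.0):
    """Turn event timestamps (seconds from kickoff) into merged (start, end) clips.

    Each event gets `before` seconds of lead-in and `after` seconds of follow-up;
    clips that overlap are merged so the analyst doesn't watch the same footage twice.
    """
    windows = []
    for t in sorted(event_times):
        start, end = max(0.0, t - before), t + after
        if windows and start <= windows[-1][1]:
            # Overlaps the previous clip: extend it instead of adding a new one
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows

# Hypothetical timestamps of tagged formation changes in one match
formation_changes = [10.0, 1815.0, 1840.0, 4020.0]
clips = clip_windows(formation_changes)
total_seconds = sum(end - start for start, end in clips)
print(clips, total_seconds)
```

With these invented numbers, four tagged events collapse into three clips totalling a little over three minutes of video.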
And we can go further with the data. Want to know the “best headers of the opponents” as Marcos Abad added in his portion of the press conference? Many analysts will have an algorithm for that, with a metric based on hundreds of headers – and not just from one season.
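One simple way such a metric might work, and I stress this is my own illustrative assumption rather than any particular analyst's algorithm, is an aerial-duel win rate that is pulled toward the league average when a player has few attempts. The player names, counts, league rate, and the weight `k` below are all invented.

```python
def shrunk_win_rate(wins, attempts, league_rate=0.5, k=20):
    """Aerial win rate shrunk toward `league_rate`.

    `k` acts like that many extra 'average' duels, so small samples
    cannot top the ranking on a handful of lucky headers.
    """
    return (wins + k * league_rate) / (attempts + k)

# Hypothetical aerial duels: player -> (won, attempted)
aerials = {
    "Player A": (45, 60),
    "Player B": (4, 4),
    "Player C": (120, 200),
}

ranked = sorted(
    aerials,
    key=lambda name: shrunk_win_rate(*aerials[name]),
    reverse=True,
)
print(ranked)
```

On raw win rate, Player B's four-from-four would come out on top; after shrinkage, Player A's 45 from 60 ranks first, which is closer to what a coach would want to know.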
Bielsa said several times that he demanded each trove of analysis for his own peace of mind: to feel that he had done everything possible to increase his squad’s chances of winning, and to allay his own feelings of stupidity. I’d suggest that using match data to speed up and enrich his staff’s work could engender even greater peace of mind, as well as allowing his staff to spend a bit more time with their families. I don’t know how many data analysts currently work at Leeds United, but they’re hiring a data manager now, supposedly to work across all departments (though it sounds like mostly commercial). So perhaps help is already on the way!