Diversity in analytics

For the past several months, I’ve been thinking hard about how to correct the diversity deficit in English-speaking soccer analytics. This is overwhelmingly a white male field, and – to my eye at least – poorer for it. Can a white man do anything to improve matters?

There is some irony to this problem, as diversity in disciplines is actually one of the great assets of soccer analytics. Climate scientists, evolutionary biologists, computer scientists, marketing experts, and hobbyists with any day job you can imagine have all brought their unique abilities to the field. They’ve also imported ideas from many different sports: baseball, ice hockey, basketball, rugby, cricket, etc. But almost all of these analysts are white men, sharing fairly similar cultural norms. A bit more variety in the demographics, too, would undoubtedly enrich the field.

As an economist, I’m tempted to think about the issue in terms of the market: labor supply and labor demand. As far as I know, the demand for analysts does not overtly depend on ethnicity, gender, or sexual orientation. However, there may be subtle discrimination related to any or all of those traits. When it’s white men doing the recruiting and white men doing the hiring, perhaps it’s not surprising – albeit lamentable – if they accord some preference to white men.

On the supply side, the people offering their labor appear overwhelmingly to be white men. Just look at the photos (where available) in any list of analysts on Twitter, like this one. Or cast your eyes around the room at an analytics conference. There is usually a reasonable minority of Asian men but very few women or people of other backgrounds. In the course I recently offered with Philip Maymin at New York University’s Stern School of Business, one woman registered and then dropped out after a couple of classes.

But of course, the supply side isn’t as simple as it seems. There are barriers to entry for those who would like to supply their labor in this market. You need money and time to work with soccer data. And if you’re going to do anything mathematically intensive, you probably need some formal training as well – or at least more money and time to teach yourself. Some software packages and texts are freely available, but time and training may still be a constraint. Certainly these factors, as well as the encouragement of one’s peers, have been cited by people considering other related fields.

Data may also be an issue. They aren’t as easily available in soccer as in several other major sports. You either have to record them manually, pay for them, or figure out how to scrape them – time, money, and training once again. Data for women’s soccer are even harder to come by, notwithstanding some noble efforts by Danny Page, Christopher Long, and others. I may catch some flak for suggesting that wider availability of data on women’s soccer would bring more women into soccer analytics, but I have a feeling it would.

I think a few other simple changes could make an even bigger difference. First, the conveners in the field might give members of underrepresented groups more prominence. I haven’t seen Lucy Rushton or Sarah Rudd on any panels or podcasts lately; that may be by their choice, but I know they could hold their own with any of the men in the field. Second, providers could make extracts of their data available to the broader world, perhaps as part of a competition. Then more prospective analysts would have a chance to play with the numbers, and one barrier to entry would fall. Third, people who participate actively in the analytics community – online or otherwise – could make a special effort to engage with people from underrepresented groups who want to participate.

I’ll admit that I haven’t always done a good job of this. So this spring, I looked into the possibility of offering NYA internships aimed at underrepresented groups. In the United States, this is difficult to do without incurring legal liability. An organization offering such positions must be able to prove that it has suffered a diversity deficit that can only be repaired in this way; otherwise, it leaves itself open to lawsuits from white men who are highly qualified but ineligible or overlooked.

Still, I don’t think this prevents an organization from using demographic criteria as a tiebreaker in the event that two candidates are equally qualified. And there’s nothing to stop an organization from recruiting heavily in underrepresented communities. So, when NYA is ready to hire later this year, I will do my best to get the word out – especially to people who don’t look like me. In the meantime, I welcome more ideas and suggestions on this subject.