Undersampling: When You Want to Improve Diversity at Work (but you’re too lazy to actually do it)

sirius_li · Mar 18, 2017

Roughly a year ago a blog post from Airbnb was circulating through my company’s Slack channels. The article was titled “Beginning with Ourselves” and it described how Airbnb drastically improved the gender balance of their data team through a combination of analysis, process changes, and investment in the community of female data scientists.

To give a quick summary, Airbnb divided their approach into two parts:

1. Increasing the “top of the funnel” (i.e., the proportion of female applicants) by hosting a series of talks where women in the industry could discuss their work, and
2. Identifying the existence of bias in their interview process (female applicants were noticeably less likely to be hired) and reducing it.

In my company the response to this article was unanimously positive… as it should be! It takes a lot of time, effort, and coordination to pull off Airbnb’s accomplishment so the recognition is well-deserved. However, a year has passed and one thing is apparent: despite the initial buzz and calls to action, we didn’t follow any of the article’s advice. I remember doing a quick search for local data science groups then quickly giving up at the thought of having to write an introduction email.

This should sound alarming: a system with demonstrated results for improving diversity was researched, implemented, and spelled out for me and I couldn’t write a simple email. The same must have been true for everyone else as well because no one made it to the first step. So what’s wrong with us? Well, speaking just for myself I can say that I am too lazy. I’m not motivated enough to spend my spare time hosting talks and networking until inequality is abolished. I’m betting a lot of you can relate.

Alright, so how can anything change if most of us are too lazy to do the right thing? My approach is to make the right thing easy to do instead. Rather than increasing the number of minority candidates (still a good idea, just a lot of work), I propose a simpler balancing system: decrease the number of majority candidates through a process known as undersampling.

From Wikipedia:

Undersampling in data analysis is a technique used to adjust the class distribution of a data set (i.e. the ratio between the different classes/categories represented). The usual reason for undersampling is to correct for a bias in the original dataset. For example, suppose we have a sample of 1000 people of which 66.7% are male… Simple undersampling will drop some of the male samples at random to give a balanced dataset of 667 samples, with 50% female.

Typically undersampling occurs after a dataset has already been collected and the collection of new data points is either expensive or time-consuming.

How would this method apply to recruiting? Let’s use Airbnb as an example. The article mentioned that 30% of their applicants were female. They knew there was a bias in the applicant pool since women make up roughly 50% of the global population. To correct it, they put a significant amount of work into increasing the number of female applicants so that eventually women and men would balance out. That’s cool, but there’s actually a simple way to reach parity right now: randomly select a subset of the male applicants until you have the same number in both groups. In this example, sampling about 43% of the male submissions (30/70) and 100% of the female submissions would offset the initial bias, after which interviewing and hiring can proceed as normal.
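The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `undersample` function, the `gender` field, and the 1,000-applicant pool are hypothetical names and numbers chosen to match the 30/70 split, not anything taken from Airbnb’s process.

```python
import random


def undersample(applicants, key, seed=None):
    """Randomly drop applicants from larger groups until every group
    is the same size as the smallest one (simple random undersampling)."""
    rng = random.Random(seed)

    # Bucket applicants by the category returned by `key`.
    groups = {}
    for applicant in applicants:
        groups.setdefault(key(applicant), []).append(applicant)

    # Every group is cut down to the size of the smallest group.
    target = min(len(members) for members in groups.values())

    kept = []
    for members in groups.values():
        kept.extend(rng.sample(members, target))
    return kept


# Hypothetical pool: 300 female and 700 male applicants (a 30/70 split).
pool = [{"id": i, "gender": "F" if i < 300 else "M"} for i in range(1000)]
balanced = undersample(pool, key=lambda a: a["gender"], seed=0)

men_total = sum(a["gender"] == "M" for a in pool)
men_kept = sum(a["gender"] == "M" for a in balanced)
print(f"kept {men_kept} of {men_total} male applicants ({men_kept / men_total:.0%})")
```

All 300 female applicants survive the sampling step, and 300 of the 700 male applicants are drawn at random. If the pool is already balanced, the smallest group is as large as every other group, so nobody is dropped.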

There are a couple of advantages to this approach.

1. The existing recruiting process is left almost entirely intact, and the few parts that change can be easily automated. Human effort is kept to a minimum.
2. Undersampling naturally extends to any categories that you want to define. Let’s say you want to improve the gender and race balance. If you were to try to increase the number of minority applicants you would need to start networking with each subgroup. Undersampling requires nothing from your personal time. In technical jargon, Airbnb’s approach takes O(n) time in the number of subgroups n, whereas undersampling is O(1).
3. It avoids the pitfalls of diversity quotas and similar systems in that candidates are not hired based on physical characteristics; they simply receive the same shot at a job as they would have if there were no systematic bias for or against them.
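To illustrate how the idea might extend to more than one category, here is a rough sketch that balances on the combination of several attributes at once. Everything in it is an assumption made for illustration: the `undersample_by` function, the `gender` and `group` fields, and the counts are hypothetical. It is one possible way to generalize, not an established recipe.

```python
import random
from collections import defaultdict


def undersample_by(applicants, attributes, seed=None):
    """Balance a pool across every observed combination of `attributes`
    by randomly sampling each combination down to the smallest one."""
    rng = random.Random(seed)

    # One bucket per combination, e.g. ("F", "X") or ("M", "Y").
    cells = defaultdict(list)
    for applicant in applicants:
        cells[tuple(applicant[attr] for attr in attributes)].append(applicant)

    target = min(len(members) for members in cells.values())
    return [a for members in cells.values() for a in rng.sample(members, target)]


# Hypothetical pool with two attributes and uneven combinations.
counts = {("F", "X"): 20, ("F", "Y"): 10, ("M", "X"): 50, ("M", "Y"): 20}
pool = [{"gender": g, "group": r} for (g, r), n in counts.items() for _ in range(n)]

balanced = undersample_by(pool, ["gender", "group"], seed=1)
print(len(pool), "applicants in,", len(balanced), "applicants out")
```

The code is the same no matter how many attributes are listed, which is the sense in which the effort stays constant. One trade-off of this sketch is that the balanced pool can be no larger than the smallest combination times the number of combinations, so rare combinations shrink the pool quickly.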

Of course not everything is rosy and perfect. There are legitimate questions about whether this system would constitute affirmative action and/or reverse discrimination, making it highly controversial both legally and ethically. While I can’t provide a bulletproof defense, I will point out that undersampling is different in some key ways. As already mentioned, a candidate’s physical characteristics would not affect their chances in the interview or hiring stage since sampling occurs beforehand. In contrast, affirmative action factors demographics into the hiring decision. And though undersampling hurts the majority during the balancing process, I would argue that it does not fall under reverse discrimination since the penalty merely offsets an already present advantage. To illustrate, if there were no systematic imbalance in women applying for data science jobs then undersampling would have no effect, whereas reverse discrimination would continue to penalize male applicants.

The debate doesn’t end there, and I encourage you to leave a comment. Hopefully the idea of undersampling will inspire you to come up with a more elegant solution, just as I was inspired after reading Airbnb’s article. I sincerely hope that one day you will implement a system so deviously lazy you won’t even bother to write about it!