
Arvind Narayanan, "Language necessarily contains human biases, and so will machines trained on language corpora", Freedom to Tinker 8/24/2016:

We show empirically that natural language necessarily contains human biases, and the paradigm of training machine learning on language corpora means that AI will inevitably imbibe these biases as well.

This all started in the 1960s, with Gerard Salton and the "vector space model". The idea was to represent a document as a vector of word (or "term") counts — which, like any vector, represents a point in a multi-dimensional space. Then the similarity between two documents can be calculated by correlation-like methods, basically as some simple function of the inner product of the two term vectors. And natural-language queries are also a sort of document, though usually a rather short one, so you can use this general approach for document retrieval by looking for documents that are (vector-space) similar to the query. It helps if you weight the document vectors by inverse document frequency, and maybe use thesaurus-based term extension, and relevance feedback, and …
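In modern terms, Salton's model reduces to a few lines. Here's a minimal sketch in Python (the documents and query are invented for illustration; idf weighting and the other refinements mentioned above are omitted):

```python
from collections import Counter
import math

def term_vector(text):
    # A document as a bag-of-words count vector, stored sparsely as a
    # dict, so a 100,000-word vocabulary poses no practical problem.
    return Counter(text.lower().split())

def cosine(u, v):
    # Similarity as a normalized inner product of the two term vectors.
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(c * c for c in u.values())) *
            math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "stock markets fell sharply today"]
vecs = [term_vector(d) for d in docs]

# A query is just a short document; retrieve by vector-space similarity.
query = term_vector("cat on a mat")
ranked = sorted(range(len(docs)),
                key=lambda i: cosine(query, vecs[i]), reverse=True)
```

The query shares "cat", "on", and "mat" with the first document, so that one comes back first.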

A vocabulary of 100,000 wordforms results in a 100,000-dimensional vector, but there's no conceptual problem with that, and sparse-vector coding techniques mean that there's no practical problem either. Except in the 1960s, digital "documents" were basically stacks of punched cards, and the market for digital document retrieval was therefore pretty small. Also, those were the days when people thought that artificial intelligence was applied logic — one of Marvin Minsky's students once told me that Minsky warned him "If you're counting higher than one, you're doing it wrong". Still, Salton's students (like Mike Lesk and Donna Harman) kept the flame alive.

Then came the world-wide web, and the Google guys' development of "PageRank", which extends the vector-space model with an eigenanalysis of the web's citation graph, and the growth of the idea that artificial intelligence might be applied statistics. Also out there was the idea of using various dimensionality-reduction techniques to cut the order of those document vectors down from hundreds of thousands to hundreds.

The first example was "latent semantic analysis", based on the singular value decomposition of a term-by-document matrix. The initial idea was to make document storage and comparison more efficient — but this turned out not to be necessary. Another benefit was to create a sort of soft thesaurus, so that a query might fetch documents that don't feature the queried words, but do contain lots of words that often co-occur with those words. But LSA, interesting as it was, never really became a big thing.
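The LSA machinery fits in a few lines of NumPy. A toy sketch (the term-by-document counts below are invented; real systems factor matrices built from large corpora):

```python
import numpy as np

# Toy term-by-document matrix: rows are terms, columns are documents.
terms = ["cat", "dog", "pet", "stock", "market"]
A = np.array([[2, 0, 1, 0],
              [0, 2, 1, 0],
              [1, 1, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 2]], dtype=float)

# Truncated SVD: keep the top k singular values to get low-dimensional
# "latent semantic" representations of the terms.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]   # each term as a point in k dimensions

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# "cat" and "dog" never appear in the same document, but both co-occur
# with "pet", so they land close together in the latent space -- the
# soft-thesaurus effect.
cat, dog, stock = (term_vecs[terms.index(w)] for w in ("cat", "dog", "stock"))
```

Here `cos(cat, dog)` comes out near 1 while `cos(cat, stock)` stays near 0, even though "cat" and "dog" share no documents.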

Then people began to explore small vector-space models based on other ways of doing dimensionality reduction on other kinds of word-cooccurrence statistics, especially looking at relationships among nearby words. It didn't escape notice that this puts into effect the old idea of "distributional semantics", especially associated with Zellig Harris and John Firth, summarized in Firth's dictum that "you shall know a word by the company it keeps". Some examples are word2vec, eigenwords, and GloVe. These techniques let you produce approximate solutions to what might seem like hard problems, like London:England::Paris:?, using nothing but vector-space geometry. And it's easy to experiment with these techniques, as here and here.
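The analogy trick is nothing but vector arithmetic: take the offset from London to England and add it to Paris, then look for the nearest word. A toy sketch with invented three-dimensional "embeddings" (real models like word2vec or GloVe learn hundreds of dimensions from corpora):

```python
import numpy as np

# Hand-made vectors purely for illustration; trained embeddings would
# place words this way automatically, based on co-occurrence statistics.
emb = {
    "london":  np.array([1.0, 0.0, 0.2]),
    "england": np.array([1.0, 1.0, 0.2]),
    "paris":   np.array([0.0, 0.1, 1.0]),
    "france":  np.array([0.0, 1.1, 1.0]),
    "banana":  np.array([0.5, 0.5, 0.5]),
}

def analogy(a, b, c):
    # a:b :: c:?  -- apply the offset b - a to c, return the nearest
    # remaining word by cosine similarity.
    target = emb[b] - emb[a] + emb[c]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates,
               key=lambda w: emb[w] @ target /
                             (np.linalg.norm(emb[w]) * np.linalg.norm(target)))

answer = analogy("london", "england", "paris")
```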

Continuing with Arvind Narayanan's blog post:

Specifically, we look at “word embeddings”, a state-of-the-art language representation used in machine learning. Each word is mapped to a point in a 300-dimensional vector space so that semantically similar words map to nearby points.

We show that a wide variety of results from psychology on human bias can be replicated using nothing but these word embeddings. We primarily look at the Implicit Association Test (IAT), a widely used and accepted test of implicit bias. The IAT asks subjects to pair concepts together (e.g., white/black-sounding names with pleasant or unpleasant words) and measures reaction times as an indicator of bias. In place of reaction times, we use the semantic closeness between pairs of words. In short, we were able to replicate every single result that we tested, with high effect sizes and low p-values.

[…]

We show that information about the real world is recoverable from word embeddings to a striking degree. The figure below shows that for 50 occupation words (doctor, engineer, …), we can accurately predict the percentage of U.S. workers in that occupation who are women using nothing but the semantic closeness of the occupation word to feminine words!

The paper is Aylin Caliskan-Islam, Joanna J. Bryson, and Arvind Narayanan, "Semantics derived automatically from language corpora necessarily contain human biases". It uses the pre-trained GloVe embeddings available here, and you can read it to be convinced that many sorts of bias reliably emerge from the patterns of word co-occurrence in such material.
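The core move can be sketched in miniature: replace the IAT's reaction times with cosine similarity, and measure how much closer a target word sits to one attribute set than to the other. The two-dimensional vectors below are invented for illustration; the paper's actual tests run over trained GloVe embeddings and add effect sizes and permutation-test p-values:

```python
import numpy as np

# Made-up vectors standing in for trained word embeddings.
vecs = {
    "flower":     np.array([0.9, 0.1]),
    "insect":     np.array([0.1, 0.9]),
    "pleasant":   np.array([1.0, 0.0]),
    "unpleasant": np.array([0.0, 1.0]),
}

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, attr_a, attr_b):
    # How much closer is word w to attribute set A than to set B?
    # This difference of mean similarities plays the role that a
    # difference of reaction times plays in the IAT.
    return (np.mean([cos(vecs[w], vecs[a]) for a in attr_a]) -
            np.mean([cos(vecs[w], vecs[b]) for b in attr_b]))
```

With these toy vectors, `association("flower", ["pleasant"], ["unpleasant"])` is positive and the same measure for "insect" is negative, mirroring the familiar IAT finding.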

I'm pretty sure that Zellig Harris would not have found this surprising — one of his original motivations for developing distributional methods was to find the latent political content of texts by completely objective means.

And for a few differently-embedded words on what those marvelous word-embedding vectors leave out, see the discussion of the "cookbook problem" in these lecture notes.

Update — I should note that I'm strongly in favor of word-embedding models and similar things, but I feel that people should understand what they are and how they work (or don't work), rather than seeing them as a magic algorithmic black box that does magical algorithmic things.
