The Stanford Natural Language Inference (SNLI) Corpus

The Corpus

The new MultiGenre NLI (MultiNLI) Corpus is now available here. The corpus is in the same format as SNLI and is comparable in size, but it includes a more diverse range of text, as well as an auxiliary test set for cross-genre transfer evaluation.

The SNLI corpus (version 1.0) is a collection of 570k human-written English sentence pairs manually labeled for balanced classification with the labels entailment, contradiction, and neutral, supporting the task of natural language inference (NLI), also known as recognizing textual entailment (RTE). We aim for it to serve both as a benchmark for evaluating representational systems for text, especially those induced by representation learning methods, and as a resource for developing NLP models of any kind.

The following paper introduces the corpus in detail. If you use the corpus in published work, please cite it:

Here are a few example pairs taken from the development portion of the corpus. Each has the judgments of five Mechanical Turk workers and a consensus judgment.

Text | Judgments | Hypothesis
A man inspects the uniform of a figure in some East Asian country. | contradiction (C C C C C) | The man is sleeping
An older and younger man smiling. | neutral (N N E N N) | Two men are smiling and laughing at the cats playing on the floor.
A black race car starts up in front of a crowd of people. | contradiction (C C C C C) | A man is driving down a lonely road.
A soccer game with multiple males playing. | entailment (E E E E E) | Some men are playing a sport.
A smiling costumed woman is holding an umbrella. | neutral (N N E C N) | A happy woman in a fairy costume holds an umbrella.
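
The relationship between the five individual judgments and the consensus judgment in the examples above can be sketched in a few lines of Python. This is only an illustration: it assumes the consensus is a simple majority (at least three of five annotators agreeing), which this page does not state explicitly, and the function name `consensus_label` and the "-" placeholder for "no consensus" are choices made for this sketch.

```python
from collections import Counter


def consensus_label(judgments, min_votes=3):
    """Return the label picked by at least `min_votes` annotators.

    Assumption for this sketch: consensus means a simple majority.
    Returns "-" when no label reaches the threshold.
    """
    if not judgments:
        return "-"
    label, votes = Counter(judgments).most_common(1)[0]
    return label if votes >= min_votes else "-"


if __name__ == "__main__":
    # Judgments copied from two of the example pairs above
    # (E = entailment, N = neutral, C = contradiction).
    print(consensus_label(["N", "N", "E", "N", "N"]))
    print(consensus_label(["N", "N", "E", "C", "N"]))
```

With the judgments listed above, both calls agree with the consensus shown for those pairs (neutral).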

The corpus is distributed in both JSON Lines and tab-separated value files, which are packaged together (with a readme) here:
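
As a rough illustration of working with the JSON Lines distribution, the sketch below parses one JSON object per line and tallies the labels. The field names `gold_label`, `sentence1`, and `sentence2` are assumptions made for this example; check them against the readme shipped with the corpus. The sample lines reuse two of the example pairs shown earlier rather than reading a real file.

```python
import json
from collections import Counter


def parse_jsonl(lines):
    """Yield one dict per non-empty line of JSON Lines input."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def label_counts(records, label_key="gold_label"):
    """Count records per label; `label_key` is an assumed field name."""
    return Counter(record.get(label_key, "-") for record in records)


# Two sample lines built from example pairs on this page (field names assumed).
SAMPLE_LINES = [
    '{"gold_label": "contradiction", "sentence1": "A man inspects the uniform '
    'of a figure in some East Asian country.", "sentence2": "The man is sleeping"}',
    '{"gold_label": "entailment", "sentence1": "A soccer game with multiple '
    'males playing.", "sentence2": "Some men are playing a sport."}',
]

if __name__ == "__main__":
    records = list(parse_jsonl(SAMPLE_LINES))
    for label, count in sorted(label_counts(records).items()):
        print(label, count)
```

To read an actual file from the package, pass an open file handle instead of the sample list, for example `parse_jsonl(open(path, encoding="utf-8"))`, where `path` points at one of the extracted `.jsonl` files.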

The corpus includes content from the Flickr 30k corpus (also released under an Attribution-ShareAlike licence), which can be cited by way of this paper:

About 4k sentences in the training set have captionIDs and pairIDs beginning with 'vg_'. These come from a pilot data collection effort that used data from the VisualGenome corpus, which is still under construction as of the release of SNLI. For more information on VisualGenome, see: https://visualgenome.org/
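
If you want to treat the VisualGenome-derived pilot examples separately, a filter on the 'vg_' prefix is enough. The sketch below assumes each record is a dict with `captionID` and `pairID` keys (the exact key spelling is an assumption), and the IDs in the sample records are hypothetical placeholders, not real corpus entries.

```python
def is_visual_genome(record, prefix="vg_"):
    """True if the record's captionID or pairID starts with the prefix."""
    return any(
        str(record.get(key, "")).startswith(prefix)
        for key in ("captionID", "pairID")  # assumed key names
    )


def split_by_source(records):
    """Split records into (VisualGenome pilot, everything else)."""
    pilot, rest = [], []
    for record in records:
        (pilot if is_visual_genome(record) else rest).append(record)
    return pilot, rest


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    sample = [
        {"captionID": "vg_example1", "pairID": "vg_example1_a"},
        {"captionID": "example2.jpg#0", "pairID": "example2.jpg#0_a"},
    ]
    pilot, rest = split_by_source(sample)
    print(len(pilot), len(rest))
```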

The hard subset of the test set used in Gururangan et al. '18 is available in JSONL format here.

Published results

Three-way classification

The following table reflects our informal attempt to catalog published 3-class classification results on the SNLI test set. We define sentence vector-based models as those which perform classification on the sole basis of a pair of fixed-size sentence representations that are computed independently of one another. Reported parameter counts do not include word embeddings. If you would like to add a paper that reports a number at or above the current state of the art, email Sam.

Related Resources

Contact Information

For any comments or questions, please email Sam and Gabor.