Improving Distributional Similarity with Lessons Learned from Word Embeddings.

Omer Levy, Yoav Goldberg, and Ido Dagan. TACL 2015. [pdf] [errata] [slides]

We reveal that much of the performance gain of word embeddings is due to certain system design choices and hyperparameter optimizations, rather than to the embedding algorithms themselves.

Code

The word representations used in this work were created with hyperwords – a collection of scripts and programs for creating word representations, designed to facilitate academic research and prototyping. It lets you tune many hyperparameters that are pre-set or ignored in other word representation packages. In addition, hyperwords contains evaluation scripts for word similarity and analogy detection tasks.
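To make the word similarity task concrete, here is a minimal sketch of how such an evaluation is commonly set up: score each word pair by the cosine similarity of its vectors, then compare the model's ranking with human ratings using Spearman's rank correlation. This is not the hyperwords code or its interface. The function names (cosine, ranks, spearman, evaluate_similarity) and the toy vectors and ratings are hypothetical and exist only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks of the values; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0.0 or dev_y == 0.0:
        return 0.0
    return cov / (dev_x * dev_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed lists."""
    return pearson(ranks(x), ranks(y))


def evaluate_similarity(vectors, gold):
    """Correlate model cosine scores with human ratings.

    vectors: dict mapping word -> list of floats (hypothetical toy format).
    gold: list of (word1, word2, human_score) triples.
    Pairs containing a word without a vector are skipped.
    """
    model_scores, human_scores = [], []
    for word1, word2, score in gold:
        if word1 in vectors and word2 in vectors:
            model_scores.append(cosine(vectors[word1], vectors[word2]))
            human_scores.append(score)
    if len(model_scores) < 2:
        raise ValueError("need at least two covered word pairs")
    return spearman(model_scores, human_scores)


# Toy data, invented for this example only.
toy_vectors = {
    "cat": [1.0, 0.9, 0.0],
    "dog": [0.9, 1.0, 0.1],
    "car": [0.0, 0.1, 1.0],
    "truck": [0.1, 0.0, 0.9],
}
toy_gold = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("cat", "car", 1.5),
    ("dog", "truck", 2.0),
]

if __name__ == "__main__":
    print(evaluate_similarity(toy_vectors, toy_gold))
```

Because Spearman's correlation depends only on rank order, the score rewards a model for ordering pairs the way human raters do, regardless of the absolute scale of its similarity values.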
