Models

Select one of the available models

Finnish 4B wordforms skipgram; Suomi24 wordforms skipgram; Suomi24 lemmas skipgram; English GoogleNews Negative300

Nearest words

Given a word, this demo shows a list of other words that are similar to it, i.e. nearby in the vector space.

Options: case-sensitive matching; Top N: 5, 10, 20 or 100.
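As a rough illustration of what a nearest-words lookup does, here is a minimal sketch in Python. It assumes cosine similarity as the measure, which is the usual choice for word2vec vectors but is not confirmed for this demo. The vectors and the names `TOY_VECTORS`, `cosine` and `nearest` are invented for the example.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real word2vec vectors typically have hundreds of dimensions.
TOY_VECTORS = {
    "kissa": [0.9, 0.1, 0.0],  # cat
    "koira": [0.8, 0.2, 0.1],  # dog
    "auto": [0.0, 0.9, 0.4],   # car
    "bussi": [0.1, 0.8, 0.5],  # bus
}


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, top_n=5):
    """Return up to top_n (word, similarity) pairs, most similar first."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # the query word itself is not a useful neighbour
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for neighbour, score in nearest("kissa", TOY_VECTORS, top_n=2):
        print(f"{neighbour}\t{score:.3f}")
```

With these toy vectors, koira (dog) ranks as the closest neighbour of kissa (cat). A real model ranks the whole vocabulary in the same way, which is why the Top N setting only limits how many results are shown.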

Similarity of two words

Given two words, this demo returns their similarity as a value between -1 and 1; higher values mean the words are more similar.
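The range from -1 to 1 is what cosine similarity produces, so the sketch below assumes that measure; the demo's source would need to be checked to confirm it. The function name `similarity` and the example vectors are hypothetical.

```python
import math


def similarity(u, v):
    """Cosine similarity of two non-zero vectors, always within [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Same direction (only the length differs): similarity close to 1.
    print(similarity([1.0, 2.0], [2.0, 4.0]))
    # Perpendicular directions: similarity 0.
    print(similarity([1.0, 0.0], [0.0, 1.0]))
    # Opposite directions: similarity -1.
    print(similarity([1.0, 0.0], [-1.0, 0.0]))
```

Because the measure depends only on the direction of the vectors, not their length, two words can score close to 1 even if one of them is far more frequent in the training data.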


Word analogy

This demo computes word analogies: the first word is to the second word as the third word is to which word? Try for example ilma - lintu - vesi (air - bird - water), which is expected to return kala (fish), because fish is to water as bird is to air. Another example is sammakko - hyppää - kala (frog - jumps - fish). This is, however, only a toy to show what is possible; most of the time the analogy does not work particularly well (at least for the Finnish data).

Top N: 2, 5, 10, 20 or 100.
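One common way to compute such analogies with word2vec vectors is the vector-offset method: take the second word's vector, subtract the first, add the third, and look for the words nearest to the result. Whether this demo uses exactly that method is an assumption. The sketch below uses invented toy vectors in which the analogy works by construction, and the names `TOY_VECTORS`, `cosine` and `analogy` are hypothetical.

```python
import math

# Toy vectors invented so that the analogy works by construction.
# Dimensions, loosely: [air, water, animal].
TOY_VECTORS = {
    "ilma": [1.0, 0.0, 0.0],   # air
    "vesi": [0.0, 1.0, 0.0],   # water
    "lintu": [1.0, 0.0, 1.0],  # bird
    "kala": [0.0, 1.0, 1.0],   # fish
    "kivi": [0.5, 0.5, 0.0],   # stone
}


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(first, second, third, vectors, top_n=2):
    """Answer 'first is to second as third is to ?' by vector offset."""
    a, b, c = vectors[first], vectors[second], vectors[third]
    # Target point: second - first + third.
    target = [bi - ai + ci for ai, bi, ci in zip(a, b, c)]
    scored = [
        (word, cosine(target, vec))
        for word, vec in vectors.items()
        if word not in (first, second, third)  # skip the input words
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # With these toy vectors, kala (fish) is ranked first.
    for word, score in analogy("ilma", "lintu", "vesi", TOY_VECTORS):
        print(f"{word}\t{score:.3f}")
```

In real embeddings the offset between two words is noisy, which is one plausible reason the analogy often fails in practice: the target point may simply land closer to some unrelated word.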

About

The demo is based on word embeddings induced using the word2vec method, trained on 4.5B words of Finnish from the Finnish Internet Parsebank project and over 2B words of Finnish from Suomi24. On the Parsebank project page you can also download the vectors in binary form. The software behind the demo is open-source, available on GitHub. The demo is maintained by the Turku NLP group.