Can you imagine going to any country in the world and being able to communicate more accurately and fluently than ever before? Well now, thanks to a group of MIT researchers, you can.

In a paper presented not long ago at the Conference on Empirical Methods in Natural Language Processing, the MIT researchers demonstrate a model that can run faster and more efficiently than leading translation models, including those from Facebook, Amazon, and Google.

Using a statistical measure called Gromov-Wasserstein distance, the model measures the distances between points in one computational space and matches them to points in another space whose relative distances are similar.

When this technique is applied to the word embeddings of two languages, the model aligns the words in each that are most alike in terms of relative distances, since those words are likely to be direct translations.
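
To make the idea concrete, here is a minimal sketch of that kind of relational alignment using the open-source POT (Python Optimal Transport) library; the random matrices below are hypothetical stand-ins for real word vectors, not the researchers' actual data or code.

```python
# Sketch of relational alignment with Gromov-Wasserstein (POT library).
# The random matrices stand in for word embeddings of two languages.
import numpy as np
import ot  # pip install pot

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 300))  # 100 "English" word vectors, 300-dim
Y = rng.normal(size=(100, 300))  # 100 "French" word vectors, 300-dim

# Intra-language pairwise distance matrices: only the relative geometry
# of each space is used, so the two spaces never need shared coordinates.
C1 = ot.dist(X, X)
C2 = ot.dist(Y, Y)
C1 /= C1.max()
C2 /= C2.max()

# Uniform weight over the words in each vocabulary.
p = ot.unif(X.shape[0])
q = ot.unif(Y.shape[0])

# The coupling matrix: entry (i, j) is how much "mass" matches word i in
# one language to word j in the other, based on similar relative distances.
coupling = ot.gromov.gromov_wasserstein(C1, C2, p, q, loss_fun="square_loss")
print(coupling.shape)  # (100, 100)
```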

Results from the study showed that the model ran as quickly as, and in some cases more accurately than, other state-of-the-art translation tools. This is a big step toward better machine translation, because even when there is not enough paired data to match two languages directly, their distance measurements can still be used to align them.

The idea of aligning word embeddings in order to translate from one language to another is not a new concept; it is one that has evolved. Training neural networks to match vectors in two languages directly does work, but it takes a lot of tweaking to get the alignment just right. Measuring and matching vectors based on their relational distances is a far more accurate method.
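
For contrast, one common form of that direct-matching approach learns a linear (often orthogonal) map from one space into the other using a seed dictionary of known translation pairs, and its quality depends heavily on that dictionary and tuning. The arrays below are hypothetical placeholders, not the researchers' setup.

```python
# Sketch of the older "direct alignment" idea: fit an orthogonal map W so
# that source-language seed vectors land near their known translations.
# X_seed and Y_seed are hypothetical paired seed-dictionary vectors.
import numpy as np

def fit_orthogonal_map(X_seed, Y_seed):
    """Orthogonal Procrustes solution: W = argmin_W ||X_seed @ W - Y_seed||."""
    u, _, vt = np.linalg.svd(X_seed.T @ Y_seed)
    return u @ vt

rng = np.random.default_rng(0)
X_seed = rng.normal(size=(500, 300))  # seed source-language vectors
Y_seed = rng.normal(size=(500, 300))  # their known target-language translations

W = fit_orthogonal_map(X_seed, Y_seed)
projected = X_seed @ W  # source vectors mapped into the target space
```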

That’s where Gromov-Wasserstein becomes useful. Because the technique has been used to align image pixels in computer science, it carries over naturally to word embeddings. “If there are points, or words, that are close together in one space, Gromov-Wasserstein is automatically going to try to find the corresponding cluster of points in the other space,” says David Alvarez-Melis, a Computer Science and Artificial Intelligence Laboratory (CSAIL) Ph.D. student.

The team trained and tested the system on a publicly available dataset of word embeddings called fastText, which covers 110 language pairs. Within these embeddings, words that appear more frequently in similar contexts have closely matching vectors. So, for example, “mother” and “father” will be close to one another but farther away from a word such as “house.”
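
As a rough illustration of that geometry, one can load a fastText .vec file and compare cosine similarities directly; the file name below is a placeholder for whichever pre-trained vectors you download.

```python
# Toy check of embedding geometry: related words should have higher cosine
# similarity. The file path is a hypothetical local copy of fastText vectors.
import numpy as np

def load_vectors(path, limit=50000):
    """Parse a fastText .vec text file into a {word: vector} dictionary."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        next(f)  # first line holds vocabulary size and dimension
        for i, line in enumerate(f):
            if i >= limit:
                break
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

emb = load_vectors("wiki.en.vec")  # hypothetical path to English fastText vectors
print(cosine(emb["mother"], emb["father"]))  # expected to be relatively high
print(cosine(emb["mother"], emb["house"]))   # expected to be lower
```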

Alvarez-Melis describes it as a kind of ‘soft translation’: instead of simply returning a single one-word translation, the system tells you which words in the target language a given word or vector is strongly related to. To understand this better, imagine the names of the months of the year. Across different languages, the vectors for the months form similar, tightly packed clusters.

The way the system works is that it sees a cluster of vectors in one embedding that looks very similar to a cluster in the other embedding. It doesn’t know these words are months; it just knows that a cluster of points in one language lines up with a cluster of points in the other. “By finding these correspondences for each word, it then aligns the whole space simultaneously.”
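
Continuing the earlier POT sketch, those ‘soft translations’ can be read directly off the rows of the coupling matrix: each source word gets a distribution of mass over candidate target words rather than a single answer. The word lists here are hypothetical placeholders.

```python
# Reading soft translations off a Gromov-Wasserstein coupling matrix.
# `coupling` comes from the earlier sketch; the word lists are hypothetical.
import numpy as np

def soft_translations(coupling, source_words, target_words, word, top_k=3):
    """Return the top-k target words sharing the most mass with `word`."""
    row = coupling[source_words.index(word)]
    best = np.argsort(row)[::-1][:top_k]
    return [(target_words[j], float(row[j])) for j in best]

# Hypothetical usage, e.g. the strongest candidates for one month name:
# print(soft_translations(coupling, en_words, fr_words, "march"))
```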

On top of all that, the model automatically produces a single numerical value that quantifies how far apart the vectors of the two embeddings are from one another, which may be helpful for linguists. According to Alvarez-Melis, this “can be used to draw insights about the relationships between the languages.”
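
In the POT sketch above, a scalar of this kind can be computed with `gromov_wasserstein2`, which returns the Gromov-Wasserstein discrepancy between the two distance matrices; this is a rough stand-in for the value the researchers describe, not their exact metric.

```python
# Scalar Gromov-Wasserstein discrepancy between the two embedding geometries,
# reusing C1, C2, p, q from the earlier sketch. A smaller value suggests the
# two languages' embedding spaces are shaped more alike.
import ot

gw_value = ot.gromov.gromov_wasserstein2(C1, C2, p, q, loss_fun="square_loss")
print(gw_value)
```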
