Conneau and Lample 2018 – Word Translation without Parallel Data

Notes for conneau17_word_trans_without_paral_data

1 Background

This builds on an older idea from Mikolov et al. (2013)
- The idea goes something like this: we have word embeddings for two languages, each trained separately. Maybe we can align the two embedding spaces to produce a dictionary
- How should we align the spaces? Pick 5000 anchor pairs (words known to translate to each other) and find a linear mapping \(W\) from source \(X\) to target \(Y\) that minimizes \(\|WX-Y\|_F\) across the anchor pairs
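
To make the anchor-point alignment concrete, here is a minimal sketch using plain least squares in NumPy. The function names `fit_linear_map` and `translate` are hypothetical, the data is random toy data rather than real embeddings, and embeddings are stored as rows (so the code fits \(XW^T \approx Y\), which is the same objective as \(WX \approx Y\) with column vectors). Translation by cosine nearest neighbour is an assumption for illustration.

```python
import numpy as np

def fit_linear_map(X, Y):
    """Least-squares W minimizing ||X @ W.T - Y||_F (embeddings stored as rows)."""
    # lstsq solves X @ B ~= Y for B; the mapping in the notes is W = B.T
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B.T

def translate(W, x, Y_vocab):
    """Return the index of the target row most cosine-similar to W @ x."""
    mapped = W @ x
    norms = np.linalg.norm(Y_vocab, axis=1) * np.linalg.norm(mapped) + 1e-12
    return int(np.argmax((Y_vocab @ mapped) / norms))

# Toy check: generate "target" vectors from a known map, then recover it.
rng = np.random.default_rng(0)
d, n_anchor = 4, 50              # toy sizes; the notes mention 5000 anchors
X = rng.normal(size=(n_anchor, d))
W_true = rng.normal(size=(d, d))
Y = X @ W_true.T                 # noiseless, so the fit should be exact
W = fit_linear_map(X, Y)
```

With real embeddings the fit is only approximate, and the dictionary comes from calling `translate` on source words outside the anchor set.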

2 What's the innovation in this paper?

Let's build on that idea, but do away with the anchor points.
Instead, let's learn \(W\) with an adversarial approach. A discriminator tries to distinguish between points sampled from \(WX\) and points sampled from \(Y\).
A generator tries to find a \(W\) that fools the discriminator.
The paper reports that this works and, for some language pairs, even outperforms supervised aligners.
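
The adversarial game above can be sketched as one alternating update step. This is a simplified illustration, not the paper's implementation: the discriminator here is a plain logistic regression (an assumption made to keep the gradients short enough to write by hand), the names `adversarial_step`, `w`, `b` and `lr` are hypothetical, and the toy data only exercises the update without making any claim about convergence.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def adversarial_step(W, w, b, X_batch, Y_batch, lr=0.1):
    """One alternating update: discriminator first, then the mapping W."""
    mapped = X_batch @ W.T                      # rows are W @ x
    # Discriminator D(z) = sigmoid(z @ w + b): push D(Wx) -> 1 and D(y) -> 0.
    p_src = sigmoid(mapped @ w + b)
    p_tgt = sigmoid(Y_batch @ w + b)
    grad_w = ((p_src - 1.0)[:, None] * mapped).mean(0) + (p_tgt[:, None] * Y_batch).mean(0)
    grad_b = (p_src - 1.0).mean() + p_tgt.mean()
    w = w - lr * grad_w
    b = b - lr * grad_b
    # Generator: minimize -mean(log(1 - D(Wx))), i.e. make mapped source
    # points look like target points to the updated discriminator.
    p_src = sigmoid(mapped @ w + b)
    grad_W = np.outer(w, (p_src[:, None] * X_batch).mean(0))
    W = W - lr * grad_W
    return W, w, b

# Toy run on random data, with no anchor pairs used anywhere.
rng = np.random.default_rng(0)
d = 3
X = rng.normal(size=(200, d))                   # toy "source embeddings"
Y = X @ rng.normal(size=(d, d)).T               # toy "target embeddings"
W, w, b = np.eye(d), np.zeros(d), 0.0
for _ in range(50):
    xb = X[rng.choice(len(X), size=32)]
    yb = Y[rng.choice(len(Y), size=32)]
    W, w, b = adversarial_step(W, w, b, xb, yb)
```

Note that the two batches are sampled independently: nothing tells the model which source point corresponds to which target point, which is the whole point of dropping the anchors.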

3 bib

Bibliography

[conneau17_word_trans_without_paral_data] Conneau, A., Lample, G., Ranzato, M., Denoyer, L. & Jégou, H., Word Translation Without Parallel Data, CoRR, (2017).

Created: 2021-09-14 Tue 21:44