
PhD Thesis: Unsupervised Machine Translation (Mikel Artetxe, 2020/07/29)

Title: Unsupervised Machine Translation / Itzulpen automatiko gainbegiratu gabea

Where: Teleconference; Faculty of Informatics (UPV/EHU), Ada Lovelace room
Date: Wednesday, July 29, 2020, 11:00
Author: Mikel Artetxe Zurutuza
Supervisors: Eneko Agirre & Gorka Labaka
Languages: Basque (motivation, state of the art) and English (second half, papers, conclusions, ~11:30…)


The advent of neural sequence-to-sequence models has led to impressive progress in machine translation, with large improvements in standard benchmarks and the first solid claims of human parity in certain settings. Nevertheless, existing systems require strong supervision in the form of parallel corpora, typically consisting of several million sentence pairs. Such a requirement greatly departs from the way in which humans acquire language, and poses a major practical problem for the vast majority of low-resource language pairs.

The goal of this thesis is to remove the dependency on parallel data altogether, relying on nothing but monolingual corpora to train unsupervised machine translation systems. For that purpose, our approach first aligns separately trained word representations in different languages based on their structural similarity, and uses them to initialize either a neural or a statistical machine translation system, which is further trained through back-translation.
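The back-translation step of the pipeline above can be illustrated with a toy sketch of a single round. This is a minimal illustration, not the thesis implementation: the word-for-word lexicons and function names below are hypothetical stand-ins for the cross-lingual embeddings and the neural or statistical models used in the actual systems.

```python
def translate(sentence, lexicon):
    """Word-by-word translation; unknown words pass through unchanged."""
    return " ".join(lexicon.get(w, w) for w in sentence.split())

def back_translation_round(mono_tgt, tgt2src):
    """Back-translate target-language monolingual sentences into synthetic
    source sentences, yielding (synthetic source, real target) pairs that
    can serve as training data for a source-to-target model."""
    return [(translate(tgt, tgt2src), tgt) for tgt in mono_tgt]

# Hypothetical seed lexicon, standing in for one induced from
# structurally aligned word embeddings.
eu2en = {"etxea": "house", "handia": "big"}

pairs = back_translation_round(["etxea handia"], eu2en)
# pairs holds one synthetic (English, Basque) training pair
```

In the real systems, the pairs produced this way are used to retrain the translation model, and the procedure is iterated in both directions so that each round yields better synthetic data for the next.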

Mikel Artetxe's publications related to his PhD work:
