Seminar by Ms. Do Ngoc Diep, jointly supervised PhD student, Centre MICA/LIG - Date: Thursday, 22 April 2010, 4:00 pm - Venue: Centre MICA

Speaker: Ms. DO Ngoc Diep, PhD student jointly supervised by the Centre MICA (Hanoi) and the LIG (Grenoble)

Date: Thursday, 22 April 2010, 4:00 pm
Venue: multipurpose room, Centre MICA, C10, Institut Polytechnique de Hanoi, 1 Dai Co Viet, Hanoi
Interpretation/translation: the seminar will be given in English

Abstract: We present an unsupervised method for extracting parallel sentence pairs from a comparable corpus. A translation system is used to mine the comparable corpus and to extract the parallel sentence pairs. An iterative process is implemented, not only to increase the number of extracted parallel sentence pairs but also to improve the quality of the translation system. A comparison between this unsupervised method and a semi-supervised method is also presented. The unsupervised extraction method was tested under difficult conditions: no parallel corpus existed, and the comparable corpus contained up to 50% non-parallel sentence pairs. Nevertheless, the results show that the unsupervised method can be applied in cases where parallel data are lacking.
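
The iterative loop described in the abstract (translate, keep the pairs that match, retrain, repeat) can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the abstract does not specify the translation system, the filtering metric, or the stopping rule. Everything named here is an assumption made for illustration: the toy trainer `train_toy_system` (position-based word alignment with a `min_count` cutoff), the Jaccard word `overlap` score, the `threshold` value, the choice to bootstrap the first system on the noisy candidates themselves, and the stop-when-unchanged rule.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]              # (source sentence, target sentence)
Translator = Callable[[str], str]


def train_toy_system(pairs: List[Pair], min_count: int = 2) -> Translator:
    """Toy stand-in for training a translation system (an assumption).

    Words are aligned by position; each source word keeps its most frequent
    target word, provided that alignment was seen at least min_count times.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for src, tgt in pairs:
        for s_word, t_word in zip(src.split(), tgt.split()):
            counts[s_word][t_word] += 1
    lexicon: Dict[str, str] = {}
    for s_word, seen in counts.items():
        t_word, n = seen.most_common(1)[0]
        if n >= min_count:
            lexicon[s_word] = t_word

    def translate(sentence: str) -> str:
        # Words missing from the lexicon are copied through unchanged.
        return " ".join(lexicon.get(w, w) for w in sentence.split())

    return translate


def overlap(hypothesis: str, reference: str) -> float:
    """Jaccard word overlap, standing in for a real translation metric."""
    h, r = set(hypothesis.split()), set(reference.split())
    return len(h & r) / len(h | r) if h | r else 0.0


def mine_parallel_pairs(
    candidates: List[Pair],
    train: Callable[[List[Pair]], Translator] = train_toy_system,
    threshold: float = 0.5,
    max_iterations: int = 10,
) -> List[Pair]:
    """Alternate between extracting pairs and retraining on the extracted pairs."""
    system = train(candidates)      # assumption: bootstrap on the noisy candidates
    extracted: List[Pair] = []
    for _ in range(max_iterations):
        current = [(s, t) for s, t in candidates
                   if overlap(system(s), t) >= threshold]
        if current == extracted:    # nothing changed, so stop iterating
            break
        extracted = current
        system = train(extracted)   # retrain on the cleaner data
    return extracted


# Made-up candidate pairs: the first four are parallel, the last three are not.
candidates = [
    ("a b c", "x y z"),
    ("a b", "x y"),
    ("b a c", "y x z"),
    ("a c", "z x"),
    ("c d", "q r"),
    ("c e", "q s"),
    ("c f", "q t"),
]
print(len(mine_parallel_pairs(candidates, max_iterations=1)))
print(len(mine_parallel_pairs(candidates)))
```

On this made-up data, a single pass keeps three pairs, because the non-parallel pairs initially teach the toy system a wrong translation for `c`. Retraining on the extracted pairs corrects it, and the next pass also recovers `("a c", "z x")`. This mirrors, in miniature, the abstract's claim that iterating improves both the extracted set and the translation system.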