talnarchives

A French-language digital archive of research articles in Natural Language Processing.

Detecting context-dependent sentences in parallel corpora

Rachel Bawden, Thomas Lavergne, Sophie Rosset

Abstract: In this article, we present several approaches to automatically identifying parallel sentences that require sentence-external linguistic context to be translated correctly. Our long-term goal is to automatically construct a test set of context-dependent sentences for evaluating machine translation models designed to improve the translation of contextual, discursive phenomena. Our discussion and critique show that current approaches do not allow us to achieve this goal, and we suggest that, for now, evaluating individual phenomena is likely the best solution.

Keywords: machine translation, context, evaluation, discourse.