talnarchives

A French-language digital archive of research papers in Natural Language Processing.

Machine Translation of Speech-Like Texts: Strategies for the Inclusion of Context

Rachel Bawden

Abstract: Whilst Machine Translation (MT) has long focused on the translation of planned, written texts, more and more research is being dedicated to translating speech-like texts (informal or spontaneous discourse or dialogue). To achieve high-quality, natural translation of speech-like texts, the integration of context is needed, whether it is extra-linguistic (speaker identity, the interaction between speaker and interlocutor) or linguistic (coreference and stylistic phenomena linked to the spontaneous and informal nature of the texts). However, the integration of contextual information remains limited in most current MT systems. In this paper, we present and critique three experiments on the integration of context into an MT system, each focusing on a different type of context and exploiting a different method: adaptation to speaker gender, cross-lingual pronoun prediction, and the generation of tag questions when translating from French into English.

Keywords: machine translation, context, speech-like texts, gender, pronouns, tag questions.