talnarchives

Une archive numérique francophone des articles de recherche en Traitement Automatique de la Langue.

Multilinguality at Your Fingertips : BabelNet, Babelfy and Beyond !

Roberto Navigli

Abstract : Multilinguality is a key feature of today's Web, and it is this feature that we leverage and exploit in our research work at the Sapienza University of Rome's Linguistic Computing Laboratory, which I am going to overview and showcase in this talk. I will start by presenting BabelNet 3.0, available at http://babelnet.org, a very large multilingual encyclopedic dictionary and semantic network, which covers 271 languages and provides both lexicographic and encyclopedic knowledge for all the open-class parts of speech, thanks to the seamless integration of WordNet, Wikipedia, Wiktionary, OmegaWiki, Wikidata and the Open Multilingual WordNet. Next, I will present Babelfy, available at http://babelfy.org, a unified approach that leverages BabelNet to jointly perform word sense disambiguation and entity linking in arbitrary languages, with performance on both tasks on a par with, or surpassing, those of task-specific state-of-the-art supervised systems. Finally I will describe the Wikipedia Bitaxonomy, available at http://wibitaxonomy.org, a new approach to the construction of a Wikipedia bitaxonomy, that is, the largest and most accurate currently available taxonomy of Wikipedia pages and taxonomy of categories, aligned to each other. I will also give an outline of future work on multilingual resources and processing, including state-of-the-art semantic similarity with sense embeddings.

Mots clés : multilingualité, dictionnaire, sémantique, désambiguïsation, taxonomie

Keywords : multilinguality, dictionary, semantic, disambiguation, taxonomy