talnarchives

A French-language digital archive of research articles in Natural Language Processing.

Multilingual and Multitarget Hate Speech Detection in Tweets

Patricia Chiril, Farah Benamara Zitoune, Véronique Moriceau, Marlène Coulomb-Gully, Abhishek Kumar

Abstract: Social media networks have become a space where users freely express their opinions and sentiments, which can lead to the wide spread of hateful or abusive messages that must be moderated. This paper proposes a supervised approach to hate speech detection from a multilingual perspective. We focus in particular on hateful messages directed at two targets (immigrants and women) in English tweets, as well as sexist messages in both English and French. We developed several models, ranging from feature-engineering approaches to neural ones. Our experiments show very encouraging results in both languages.

Keywords: Social media, Hate speech detection, Sexism, Supervised learning.