talnarchives

A French-language digital archive of research articles in Natural Language Processing.

Polarity analysis of non figurative tweets: Tw-StAR participation on DEFT 2017

Hala Mulki, Hatem Haddad, Mourad Gridach

Abstract: In this paper, we present our contribution to the DEFT 2017 international workshop. We tackled task 1, entitled “Polarity analysis of non figurative tweets”. We propose three sentiment classification models implemented using lexicon-based, supervised, and document embedding-based methods. For the first model, we introduce a novel strategy in which Named Entities (NEs) are involved in the Sentiment Analysis task. The first two models adopt bag-of-N-grams features, while the third model's features are extracted automatically from the data itself in the form of document vectors. The official evaluation of the three models indicated that the supervised learning-based model achieved the best performance. Nevertheless, the results obtained by the document embedding-based model are promising and could be further improved if pretrained French word vectors were used to initialize the model’s features.

Keywords: Sentiment analysis, supervised learning, lexicon-based model, document embeddings, named entities.
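
Illustration: the abstract mentions bag-of-N-grams features for the first two models without detailing how they are built. The following is a minimal sketch of what such a representation could look like, assuming whitespace tokenization, lowercasing, and unigrams plus bigrams. The function names `ngrams` and `bag_of_ngrams`, the `max_n` parameter, and the sample sentence are hypothetical and are not taken from the paper, whose actual preprocessing may differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, max_n=2):
    """Count every 1-gram up to max_n-gram in a text.

    Assumption: lowercasing and whitespace splitting stand in for
    whatever tokenization the authors actually used.
    """
    tokens = text.lower().split()
    bag = Counter()
    for n in range(1, max_n + 1):
        bag.update(ngrams(tokens, n))
    return bag


# Hypothetical example tweet (not from the DEFT 2017 data).
features = bag_of_ngrams("ce film est très bon")
```

In a sketch like this, each tweet becomes a sparse count vector over unigrams and bigrams, which either a lexicon lookup or a supervised classifier could then consume.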