talnarchives

A French-language digital archive of research articles in Natural Language Processing.

Information extraction from social media : A linguistically motivated approach

Nelleke Oostdijk, Ali Hürriyetoglu, Marco Puts, Piet Daas, Antal van den Bosch

Abstract : We propose a flexible method for extracting traffic information from social media. The abundance of microposts on Twitter makes it possible to tap into what is going on, as users report what they are actually observing. This information is highly relevant because it can help traffic security organizations and drivers to be better prepared and to take appropriate action. Distinguishing 22 information categories deemed relevant to the traffic domain, we achieve a success rate of 74% when individual tweets are considered. We judge this performance to be satisfactory: there are usually multiple tweets about a given event, so relevant information missed in one tweet is likely to be picked up from another.

Keywords : social media mining, information extraction, traffic information, traffic safety.
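
The abstract argues that a per-tweet success rate of 74% is acceptable because an event is usually reported in several tweets. The sketch below illustrates that reasoning with simple arithmetic. It is not the authors' evaluation: the function name `prob_at_least_one_hit` is hypothetical, and the calculation assumes that tweets about the same event succeed or fail independently and with the same rate, which real tweets are unlikely to satisfy exactly.

```python
def prob_at_least_one_hit(per_tweet_rate: float, n_tweets: int) -> float:
    """Probability that at least one of n_tweets is processed successfully.

    Assumption (not from the paper): each tweet succeeds independently
    with the same probability per_tweet_rate.
    """
    if not 0.0 <= per_tweet_rate <= 1.0:
        raise ValueError("per_tweet_rate must lie in [0, 1]")
    if n_tweets < 0:
        raise ValueError("n_tweets must be non-negative")
    # Complement rule: 1 - P(every tweet fails).
    return 1.0 - (1.0 - per_tweet_rate) ** n_tweets


if __name__ == "__main__":
    # 0.74 is the per-tweet success rate reported in the abstract;
    # the tweet counts are illustrative only.
    for n in (1, 2, 3, 5):
        print(f"{n} tweet(s): {prob_at_least_one_hit(0.74, n):.4f}")
```

Under these assumptions the chance of missing an event entirely shrinks geometrically with the number of tweets, which is the intuition behind judging the single-tweet figure satisfactory. Correlated errors, for example retweets that share the same wording, would weaken this effect.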