talnarchives

Une archive numérique francophone des articles de recherche en Traitement Automatique de la Langue.

A Hybrid Approach to Utilize Rhetorical Relations for Blog Summarization

Shamima Mithun, Leila Kosseim

Abstract : The availability of huge amounts of online opinions has created a new need to develop effective query-based opinion summarizers to analyze this information in order to facilitate decision making at every level. To develop an effective opinion summarization approach, we have targeted to resolve specifically Question Irrelevancy and Discourse Incoherency problems which have been found to be the most frequently occurring problems for opinion summarization. To address these problems, we have introduced a hybrid approach by combining text schema and rhetorical relations to exploit intra-sentential rhetorical relations. To evaluate our approach, we have built a system called BlogSum and have compared BlogSum-generated summaries after applying rhetorical structuring to BlogSum-generated candidate sentences without utilizing rhetorical relations using the Text Analysis Conference (TAC) 2008 data for summary contents. Evaluation results show that our approach improves summary contents by reducing question irrelevant sentences.

Keywords : Blog Summarization, Rhetorical Relations, Text Schema