talnarchives

A French-language digital archive of research articles in Natural Language Processing.

Summarization for Generative Relation Extraction in the Microbiome Domain

Oumaima El Khettari, Solen Quiniou, Samuel Chaffron

Abstract: We explore a generative relation extraction (RE) pipeline tailored to the study of interactions in the intestinal microbiome, a complex and low-resource biomedical domain. Our method leverages summarization with large language models (LLMs) to refine context before extracting relations via instruction-tuned generation. Preliminary results on a dedicated corpus show that summarization improves generative RE performance by reducing noise and guiding the model. However, BERT-based RE approaches still outperform generative models. This ongoing work demonstrates the potential of generative methods to support the study of specialized domains in low-resource settings.

Keywords: Generative Relation Extraction, Instruction-tuning, Low-Resource Domain, Microbiome.
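
The two-stage idea described in the abstract (summarize the context first, then generate relations from the refined context) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the prompt wording, the `head | relation | tail` output format, and the names `build_summary_prompt`, `build_extraction_prompt`, `parse_triples`, and `extract_relations` are all assumptions. The `generate` argument stands in for any LLM call that maps a prompt string to generated text.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


def build_summary_prompt(text: str) -> str:
    # Hypothetical prompt wording; the abstract does not specify the prompt used.
    return (
        "Summarize the following passage, keeping only statements about "
        "interactions between entities:\n\n" + text
    )


def build_extraction_prompt(context: str) -> str:
    # Hypothetical instruction and output format (one triple per line).
    return (
        "Extract the relations from the text below. "
        "Write one relation per line as: head | relation | tail\n\n" + context
    )


def parse_triples(generated: str) -> List[Triple]:
    """Keep only lines that split into exactly three non-empty fields."""
    triples: List[Triple] = []
    for line in generated.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def extract_relations(
    text: str,
    generate: Callable[[str], str],
    summarize_first: bool = True,
) -> List[Triple]:
    """Optionally summarize the input, then generate and parse relations."""
    # Stage 1: refine the context with a summary (can be switched off
    # to compare against extraction from the raw text).
    context = generate(build_summary_prompt(text)) if summarize_first else text
    # Stage 2: generative relation extraction on the refined context.
    return parse_triples(generate(build_extraction_prompt(context)))
```

Passing `summarize_first=False` runs extraction on the raw text, which gives a simple way to compare the two settings on the same input.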