On the definition of a prosodically balanced corpus: combining greedy algorithms with expert guided manipulation
- Escudero Mancebo, David
- Aguilar, Lourdes
- Bonafonte Cávez, Antonio
- Garrido Almiñana, Juan María
ISSN: 1135-5948
Year of publication: 2009
Issue: 43
Pages: 93-101
Type: Article
More publications in: Procesamiento del lenguaje natural
Abstract
This article reports on the process of building a text corpus that is balanced with respect to prosodic features. We formalize the application of greedy algorithms to text selection and discuss their limitations. We also argue for an expert-defined guideline for text manipulation that significantly improves the performance of the algorithms. Applying this methodology to a radio news corpus provides empirical support for the proposed strategy.
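The abstract does not spell out the article's formalization, so the following is only a generic illustration of greedy text selection, not the authors' method. It assumes each candidate sentence has already been annotated with a list of prosodic unit labels and that the target corpus is described by how many instances of each unit it should contain. The function name `greedy_select`, its parameters, and the example labels are all hypothetical.

```python
from collections import Counter


def greedy_select(sentences, target_counts, max_sentences):
    """Generic greedy coverage heuristic (an assumption, not the paper's algorithm).

    sentences:     dict mapping a sentence id to its list of unit labels
    target_counts: dict mapping a unit label to the number of instances wanted
    max_sentences: upper bound on how many sentences may be selected
    Returns (selected ids in pick order, units still missing).
    """
    remaining = Counter(target_counts)  # instances still needed per unit
    pool = dict(sentences)              # candidates not yet selected
    selected = []

    def gain(sid):
        # Number of still-needed unit instances this sentence would supply.
        have = Counter(pool[sid])
        return sum(min(have[u], need) for u, need in remaining.items())

    while pool and len(selected) < max_sentences:
        # Sorting the ids makes tie-breaking deterministic.
        best = max(sorted(pool), key=gain)
        if gain(best) == 0:
            break  # no remaining candidate reduces the deficit
        selected.append(best)
        for unit in pool.pop(best):
            if remaining[unit] > 0:
                remaining[unit] -= 1

    missing = {u: n for u, n in remaining.items() if n > 0}
    return selected, missing


# Hypothetical toy input: three annotated sentences and a small target.
toy_sentences = {
    "s1": ["H*", "L%"],
    "s2": ["L*", "L%"],
    "s3": ["H*", "H*", "H%"],
}
toy_target = {"H*": 2, "L*": 1, "H%": 1, "L%": 1}
print(greedy_select(toy_sentences, toy_target, max_sentences=2))
```

A non-empty `missing` result marks units that no available sentence supplies. This is the kind of limitation a purely greedy pass runs into, and it is where manual manipulation of the texts could be applied.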