On the definition of a prosodically balanced corpus: combining greedy algorithms with expert guided manipulation

  1. Escudero Mancebo, David
  2. Aguilar, Lourdes
  3. Bonafonte Cávez, Antonio
  4. Garrido Almiñana, Juan María
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2009

Issue: 43

Pages: 93-101

Type: Article

Abstract

This article reports on the process of building a text corpus that is balanced with respect to prosodic features. We formalize the application of greedy algorithms to text selection and discuss their limitations. We also argue for an expert guideline for text manipulation that significantly improves the performance of the algorithms. Applying this methodology to a radio news corpus provides empirical support for the proposed strategy.
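
As an illustration of the kind of greedy text selection the abstract refers to, the following is a minimal sketch and not the authors' formalization. It assumes each candidate sentence is annotated with a list of prosodic feature labels and that the goal is a target count per label. The function name `greedy_select`, the labels `A`, `B`, `C`, and the sentence identifiers are all hypothetical.

```python
from collections import Counter


def greedy_select(sentences, target, max_sentences):
    """Pick sentences one at a time, each time taking the one that
    covers the most still-missing feature occurrences.

    sentences: dict mapping a sentence id to a list of feature labels
               (hypothetical annotation scheme).
    target:    dict mapping a feature label to the desired count.
    """
    remaining = Counter(target)   # occurrences still needed per label
    pool = dict(sentences)        # candidates not yet selected
    chosen = []

    def gain(sid):
        # Useful occurrences this sentence would add, capped by what is
        # still missing for each label.
        counts = Counter(pool[sid])
        return sum(min(n, remaining[label]) for label, n in counts.items())

    while pool and len(chosen) < max_sentences and sum(remaining.values()) > 0:
        # Sorting the ids makes tie-breaking deterministic.
        best = max(sorted(pool), key=gain)
        if gain(best) == 0:
            break  # no remaining sentence helps: a limitation of the greedy pass
        chosen.append(best)
        for label, n in Counter(pool.pop(best)).items():
            remaining[label] = max(0, remaining[label] - n)

    return chosen


if __name__ == "__main__":
    candidates = {
        "s1": ["A", "A", "B"],
        "s2": ["B", "C"],
        "s3": ["A"],
        "s4": ["C", "C"],
    }
    print(greedy_select(candidates, {"A": 2, "B": 1, "C": 2}, max_sentences=3))
```

When the loop stops early because no candidate has a positive gain, the unmet targets mark features the available text cannot supply. In this sketch that is the point at which manual text manipulation by an expert, as the abstract proposes, would have to take over.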