On Building Phonetically and Prosodically Rich Speech Corpus for Text-to-Speech Synthesis

J. Matoušek and J. Romportl (Czech Republic)


natural language processing, text-to-speech, speech synthesis, sentence selection, speech corpus, prosody


This paper proposes a way of preparing and recording a speech corpus for unit selection text-to-speech synthesis driven by symbolic prosody. The research focuses on an algorithm for selecting phonetically and prosodically rich sentences. A symbolic description at the deep prosody level is used to enrich the phonetic representation of sentences by distinguishing the prosodeme types in which phones appear. The resulting algorithm then selects sentences with respect to both phonetic and prosodic criteria. To cover supra-sentential prosodic phenomena, randomly selected paragraphs were recorded as well. The new speech corpus can be utilised in unit selection speech synthesis and also for training a data-driven prosodic parser.
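
The idea of selecting sentences by joint phonetic and prosodic criteria can be illustrated with a minimal sketch. It assumes a simple greedy coverage strategy, which the abstract does not specify, and treats each unit as a (phone, prosodeme type) pair. The function name `select_sentences`, the sentence identifiers, and the prosodeme labels `P0` and `P1` are hypothetical and serve only for illustration.

```python
from typing import Dict, List, Set, Tuple

# A unit is a phone tagged with the prosodeme type it appears in.
# The labels used below ("P0", "P1") are illustrative, not taken from the paper.
Unit = Tuple[str, str]


def select_sentences(candidates: Dict[str, List[Unit]],
                     max_sentences: int) -> List[str]:
    """Greedily pick sentences that add the most not-yet-covered units.

    This is an assumed strategy for illustration only; the selection
    criteria actually used by the authors may differ.
    """
    covered: Set[Unit] = set()
    selected: List[str] = []
    remaining = dict(candidates)

    while remaining and len(selected) < max_sentences:
        # Iterate in sorted order so that ties go to the smallest sentence id.
        best_id = max(sorted(remaining),
                      key=lambda sid: len(set(remaining[sid]) - covered))
        gain = set(remaining[best_id]) - covered
        if not gain:
            break  # no remaining sentence contributes a new unit
        selected.append(best_id)
        covered |= gain
        del remaining[best_id]

    return selected


# Toy candidate set: "s2" alone covers the phones a, b, c, but the phone "a"
# in prosodeme "P0" is only available from "s1" or "s3".
demo = {
    "s1": [("a", "P0"), ("b", "P0")],
    "s2": [("a", "P1"), ("b", "P0"), ("c", "P0")],
    "s3": [("a", "P0")],
}
print(select_sentences(demo, max_sentences=10))
```

In this toy input, a purely phonetic criterion would be satisfied by `s2` alone, whereas the prosodically enriched units require a second sentence to cover the phone `a` in prosodeme `P0`.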

