Modeling segmental duration in German text-to-speech synthesis

Bernd Mobius, Jan van Santen

Research output: Contribution to conference › Paper › peer-review

40 Scopus citations

Abstract

This paper reports on the construction of a model for segmental duration in German. The model predicts the durations of speech sounds in various textual, prosodic, and segmental contexts. It has been implemented in the German version of the Bell Labs text-to-speech system. The construction of the duration system was made efficient by the use of an interactive statistical analysis package that incorporates the approach outlined in [23]. The results are stored in tables in a format that can be directly interpreted by the TTS duration module. Tables are constructed in two phases: inferential-statistical analysis of the speech corpus, and parameter estimation. The overall correlation between observed and predicted segmental durations is .896.

Original language: English (US)
Pages: 2395-2398
Number of pages: 4
State: Published - 1996
Event: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA
Duration: Oct 3 1996 to Oct 6 1996

ASJC Scopus subject areas

  • General Computer Science
