Modeling segmental duration in German text-to-speech synthesis

Bernd Mobius; Jan van Santen

Institute on Development and Disability

Research output: Contribution to conference › Paper › peer-review

40 Scopus citations

Abstract

This paper reports on the construction of a model for segmental duration in German. The model predicts the durations of speech sounds in various textual, prosodic, and segmental contexts. It has been implemented in the German version of the Bell Labs text-to-speech system. The construction of the duration system was made efficient by the use of an interactive statistical analysis package that incorporates the approach outlined in [23]. The results are stored in tables in a format that can be directly interpreted by the TTS duration module. Tables are constructed in two phases: inferential-statistical analysis of the speech corpus, and parameter estimation. The overall correlation between observed and predicted segmental durations is .896.
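The abstract's closing figure is a correlation between observed and predicted segmental durations. As a minimal sketch of how such a figure can be computed, the snippet below assumes it is a Pearson product-moment correlation; the abstract does not name the coefficient. The function name and the duration values are hypothetical and are not taken from the paper.

```python
import math


def pearson_correlation(observed, predicted):
    """Return Pearson's r for two equal-length sequences of durations."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    # Sums of squared deviations for each sequence.
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_obs * var_pred)


# Made-up durations in milliseconds, for illustration only (not from the paper).
observed_ms = [62.0, 85.0, 110.0, 47.0, 132.0]
predicted_ms = [58.0, 90.0, 101.0, 55.0, 125.0]
r = pearson_correlation(observed_ms, predicted_ms)
print(f"r = {r:.3f}")
```

A value near 1 means the predicted durations rise and fall with the observed ones. It does not by itself show that the predictions are close in absolute terms.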

Original language	English (US)
Pages	2395-2398
Number of pages	4
State	Published - 1996
Event	Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA Duration: Oct 3 1996 → Oct 6 1996

ASJC Scopus subject areas

General Computer Science

Cite this

@conference{eb48493739ee481e9fb98b8d98eb79c6,

title = "Modeling segmental duration in German text-to-speech synthesis",

abstract = "This paper reports on the construction of a model for segmental duration in German. The model predicts the durations of speech sounds in various textual, prosodic, and segmental contexts. It has been implemented in the German version of the Bell Labs text-to-speech system. The construction of the duration system was made efficient by the use of an interactive statistical analysis package that incorporates the approach outlined in [23]. The results are stored in tables in a format that can be directly interpreted by the TTS duration module. Tables are constructed in two phases: inferential-statistical analysis of the speech corpus, and parameter estimation. The overall correlation between observed and predicted segmental durations is .896.",

author = "Mobius, Bernd and {van Santen}, Jan",

year = "1996",

language = "English (US)",

pages = "2395--2398",

note = "Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4); Conference date: 3 October 1996 through 6 October 1996",

}

TY - CONF

T1 - Modeling segmental duration in German text-to-speech synthesis

AU - Mobius, Bernd

AU - van Santen, Jan

PY - 1996

Y1 - 1996

N2 - This paper reports on the construction of a model for segmental duration in German. The model predicts the durations of speech sounds in various textual, prosodic, and segmental contexts. It has been implemented in the German version of the Bell Labs text-to-speech system. The construction of the duration system was made efficient by the use of an interactive statistical analysis package that incorporates the approach outlined in [23]. The results are stored in tables in a format that can be directly interpreted by the TTS duration module. Tables are constructed in two phases: inferential-statistical analysis of the speech corpus, and parameter estimation. The overall correlation between observed and predicted segmental durations is .896.

UR - http://www.scopus.com/inward/record.url?scp=0030366723&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0030366723&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:0030366723

SP - 2395

EP - 2398

T2 - Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4)

Y2 - 3 October 1996 through 6 October 1996

ER -
