Effects of prosodic factors on spectral balance: Analysis and synthesis

Qi Miao; Xiaochuan Niu; Esther Klabbers; Jan Van Santen

Institute on Development and Disability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

In natural speech, prosodic factors such as accent, stress, phrasal position and speaking style play important roles in controlling several acoustic features, including segmental duration, pitch, and spectral balance, i.e., the amplitude pattern across different frequency ranges of the power spectrum. To synthesize speech that sounds natural, these effects need to be accurately modeled. In this study we describe and evaluate a synthesis method that mimics the effects of prosodic factors on spectral balance. We measure spectral balance by using the energy in four broad frequency bands that correspond to formant frequency ranges. An additive model is used to capture the effects of prosodic factors on spectral balance. A new sinusoidal synthesis module is implemented under Festival to predict the target spectral balance value for each band from analysis results and apply it to the amplitude parameters of the sinusoidal model during synthesis. In this study we evaluate an important strength of this system, which is its ability to reduce spectral discontinuities in unit concatenation.
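As a rough illustration of the measurement and model described in the abstract, the sketch below computes per-band energy for one speech frame and combines factor effects additively. It is a minimal sketch under stated assumptions, not the authors' implementation: the band edges in `BANDS_HZ`, the Hann window, the dB scale, and the function names `band_energies_db` and `additive_prediction` are all hypothetical, since the abstract only specifies "four broad frequency bands that correspond to formant frequency ranges" and "an additive model".

```python
import numpy as np

# Hypothetical band edges in Hz (an assumption for illustration only; the
# abstract does not give the actual ranges used in the paper).
BANDS_HZ = [(100, 300), (300, 800), (800, 2500), (2500, 3500)]


def band_energies_db(frame, sample_rate, bands=BANDS_HZ):
    """Return the energy in dB of one speech frame in each frequency band."""
    # Taper the frame to limit spectral leakage, then take the power spectrum.
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    energies = []
    for low, high in bands:
        in_band = (freqs >= low) & (freqs < high)
        # A small floor avoids log(0) for silent or empty bands.
        energies.append(10.0 * np.log10(power[in_band].sum() + 1e-12))
    return np.array(energies)


def additive_prediction(grand_mean, effects, levels):
    """Additive model sketch: band values = grand mean + sum of factor effects.

    grand_mean: array with one value per band.
    effects: dict mapping factor name -> {level name -> per-band offset array}.
    levels: dict mapping factor name -> the level observed for this unit.
    """
    return grand_mean + sum(effects[factor][level] for factor, level in levels.items())
```

In use, `band_energies_db` would be applied to analysis frames to obtain the observed spectral balance, and `additive_prediction` would supply the target value for each band given a unit's prosodic context (for example stress and accent levels); how those targets are then applied to the amplitude parameters of the sinusoidal model is not shown here.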

Original language	English (US)
Title of host publication	3rd International Conference on Speech Prosody 2006
Editors	R. Hoffmann, H. Mixdorff
Publisher	International Speech Communication Association
ISBN (Electronic)	9780000000002
State	Published - 2006
Event	3rd International Conference on Speech Prosody, SP 2006 - Dresden, Germany Duration: May 2 2006 → May 5 2006

Publication series

Name	Proceedings of the International Conference on Speech Prosody
ISSN (Print)	2333-2042

Conference

Conference	3rd International Conference on Speech Prosody, SP 2006
Country/Territory	Germany
City	Dresden
Period	5/2/06 → 5/5/06

ASJC Scopus subject areas

Language and Linguistics
Linguistics and Language

Cite this

Effects of prosodic factors on spectral balance: Analysis and synthesis. / Miao, Qi; Niu, Xiaochuan; Klabbers, Esther et al.
3rd International Conference on Speech Prosody 2006. ed. / R. Hoffmann; H. Mixdorff. International Speech Communication Association, 2006. (Proceedings of the International Conference on Speech Prosody).

Miao, Q, Niu, X, Klabbers, E & Van Santen, J 2006, Effects of prosodic factors on spectral balance: Analysis and synthesis. in R Hoffmann & H Mixdorff (eds), 3rd International Conference on Speech Prosody 2006. Proceedings of the International Conference on Speech Prosody, International Speech Communication Association, 3rd International Conference on Speech Prosody, SP 2006, Dresden, Germany, 5/2/06.

@inproceedings{4b1287d7b1134bdabcb2bbd6c1d7e2e0,

title = "Effects of prosodic factors on spectral balance: Analysis and synthesis",

abstract = "In natural speech, prosodic factors such as accent, stress, phrasal position and speaking style play important roles in controlling several acoustic features, including segmental duration, pitch, and spectral balance, i.e., the amplitude pattern across different frequency ranges of the power spectrum. To synthesize speech that sounds natural, these effects need to be accurately modeled. In this study we describe and evaluate a synthesis method that mimics the effects of prosodic factors on spectral balance. We measure spectral balance by using the energy in four broad frequency bands that correspond to formant frequency ranges. An additive model is used to capture the effects of prosodic factors on spectral balance. A new sinusoidal synthesis module is implemented under Festival to predict the target spectral balance value for each band from analysis results and apply it to the amplitude parameters of the sinusoidal model during synthesis. In this study we evaluate an important strength of this system, which is its ability to reduce spectral discontinuities in unit concatenation.",

author = "Qi Miao and Xiaochuan Niu and Esther Klabbers and {Van Santen}, Jan",

note = "Publisher Copyright: {\textcopyright} 2006 Proceedings of the International Conference on Speech Prosody.; 3rd International Conference on Speech Prosody, SP 2006 ; Conference date: 02-05-2006 Through 05-05-2006",

year = "2006",

language = "English (US)",

series = "Proceedings of the International Conference on Speech Prosody",

publisher = "International Speech Communication Association",

editor = "R. Hoffmann and H. Mixdorff",

booktitle = "3rd International Conference on Speech Prosody 2006",

}

TY - GEN

T1 - Effects of prosodic factors on spectral balance

T2 - 3rd International Conference on Speech Prosody, SP 2006

AU - Miao, Qi

AU - Niu, Xiaochuan

AU - Klabbers, Esther

AU - Van Santen, Jan

PY - 2006

Y1 - 2006

N2 - In natural speech, prosodic factors such as accent, stress, phrasal position and speaking style play important roles in controlling several acoustic features, including segmental duration, pitch, and spectral balance, i.e., the amplitude pattern across different frequency ranges of the power spectrum. To synthesize speech that sounds natural, these effects need to be accurately modeled. In this study we describe and evaluate a synthesis method that mimics the effects of prosodic factors on spectral balance. We measure spectral balance by using the energy in four broad frequency bands that correspond to formant frequency ranges. An additive model is used to capture the effects of prosodic factors on spectral balance. A new sinusoidal synthesis module is implemented under Festival to predict the target spectral balance value for each band from analysis results and apply it to the amplitude parameters of the sinusoidal model during synthesis. In this study we evaluate an important strength of this system, which is its ability to reduce spectral discontinuities in unit concatenation.

UR - http://www.scopus.com/inward/record.url?scp=77954366438&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77954366438&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:77954366438

T3 - Proceedings of the International Conference on Speech Prosody

BT - 3rd International Conference on Speech Prosody 2006

A2 - Hoffmann, R.

A2 - Mixdorff, H.

PB - International Speech Communication Association

Y2 - 2 May 2006 through 5 May 2006

ER -
