Frequency-domain delexicalization using surrogate vowels

Alexander Kain; Jan Van Santen

Institute on Development and Disability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

We propose a delexicalization algorithm that renders the lexical content of an utterance unintelligible, while preserving important acoustic prosodic cues, as well as naturalness and speaker identity. This is achieved by replacing voiced regions by spectral slices from a surrogate vowel, and by averaging the magnitude spectrum during unvoiced regions. Perceptual tests were carried out comparing sentences that were either unprocessed or delexicalized, using a baseline or the proposed method. An intelligibility test resulted in a keyword recall rate of 92% for the unprocessed sentences, and near complete unintelligibility for both delexicalization methods. Affect recognition was at 65% for unprocessed sentences, and 46% and 49% for the baseline and the proposed method, respectively. Preference tests showed that the proposed method preserved drastically more speaker identity, and sounded more natural than the baseline.
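The two spectral operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a precomputed magnitude spectrogram, a per-frame voicing decision, and a single surrogate-vowel magnitude slice. The function name `delexicalize_magnitudes`, the per-frame energy matching, and the choice to average unvoiced frames over time are all assumptions made for illustration. Pitch and phase handling, which resynthesis would also need, are left out.

```python
import numpy as np

def delexicalize_magnitudes(mag, voiced, surrogate):
    """Illustrative sketch only (hypothetical helper, not the published algorithm).

    mag:       (frames, bins) magnitude spectrogram
    voiced:    (frames,) boolean voicing decision per frame
    surrogate: (bins,) magnitude slice taken from a surrogate vowel
    """
    mag = np.asarray(mag, dtype=float)
    voiced = np.asarray(voiced, dtype=bool)
    surrogate = np.asarray(surrogate, dtype=float)
    if mag.ndim != 2 or voiced.shape != (mag.shape[0],) or surrogate.shape != (mag.shape[1],):
        raise ValueError("shape mismatch between mag, voiced and surrogate")

    surrogate_norm = np.sqrt(np.sum(surrogate ** 2))
    if surrogate_norm == 0.0:
        raise ValueError("surrogate slice must not be all zeros")
    surrogate_unit = surrogate / surrogate_norm

    out = np.empty_like(mag)

    # Voiced frames: swap in the surrogate vowel slice.
    # Assumption: scale it so each frame keeps its original energy contour.
    frame_energy = np.sqrt(np.sum(mag ** 2, axis=1))
    out[voiced] = frame_energy[voiced][:, None] * surrogate_unit[None, :]

    # Unvoiced frames: replace each frame with the average unvoiced spectrum.
    # Assumption: "averaging" is taken across all unvoiced frames in time.
    unvoiced = ~voiced
    if unvoiced.any():
        out[unvoiced] = mag[unvoiced].mean(axis=0)

    return out
```

In this sketch the voiced frames all share one spectral shape, so vowel identity is removed while the energy contour survives; the unvoiced frames lose the spectral detail that would distinguish individual consonants.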

Original language	English (US)
Title of host publication	Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher	International Speech Communication Association
Pages	474-477
Number of pages	4
State	Published - 2010

Publication series

Name	Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

Keywords

Affect
Delexicalization
Intelligibility

ASJC Scopus subject areas

Language and Linguistics
Speech and Hearing
Human-Computer Interaction
Signal Processing
Software
Modeling and Simulation

Cite this

Kain, A., & Van Santen, J. (2010). Frequency-domain delexicalization using surrogate vowels. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 (pp. 474-477). (Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010). International Speech Communication Association.

Frequency-domain delexicalization using surrogate vowels. / Kain, Alexander; Van Santen, Jan. Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. International Speech Communication Association, 2010. p. 474-477 (Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010).

Kain, A & Van Santen, J 2010, Frequency-domain delexicalization using surrogate vowels. in Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, International Speech Communication Association, pp. 474-477.

Kain A, Van Santen J. Frequency-domain delexicalization using surrogate vowels. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. International Speech Communication Association. 2010. p. 474-477. (Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010).

Kain, Alexander ; Van Santen, Jan. / Frequency-domain delexicalization using surrogate vowels. Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. International Speech Communication Association, 2010. pp. 474-477 (Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010).

@inproceedings{2624cae27f0848e5858917d1fdf8d2a7,
title = "Frequency-domain delexicalization using surrogate vowels",
abstract = "We propose a delexicalization algorithm that renders the lexical content of an utterance unintelligible, while preserving important acoustic prosodic cues, as well as naturalness and speaker identity. This is achieved by replacing voiced regions by spectral slices from a surrogate vowel, and by averaging the magnitude spectrum during unvoiced regions. Perceptual tests were carried out comparing sentences that were either unprocessed or delexicalized, using a baseline or the proposed method. An intelligibility test resulted in a keyword recall rate of 92% for the unprocessed sentences, and near complete unintelligibility for both delexicalization methods. Affect recognition was at 65% for unprocessed sentences, and 46% and 49% for the baseline and the proposed method, respectively. Preference tests showed that the proposed method preserved drastically more speaker identity, and sounded more natural than the baseline.",
keywords = "Affect, Delexicalization, Intelligibility",
author = "Alexander Kain and {Van Santen}, Jan",
note = "Funding Information: This research was supported by grants from the National Institute on Deafness and Other Communication Disorders, 1R21DC010239 (Lois Black, PI) and from the National Science Foundation, 0905095 (Jan van Santen, PI). The views herein are those of the authors and do not reflect the views of the funding agencies.",
year = "2010",
language = "English (US)",
series = "Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010",
publisher = "International Speech Communication Association",
pages = "474--477",
booktitle = "Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010",
}

TY - GEN

T1 - Frequency-domain delexicalization using surrogate vowels

AU - Kain, Alexander

AU - Van Santen, Jan

N1 - Funding Information: This research was supported by grants from the National Institute on Deafness and Other Communication Disorders, 1R21DC010239 (Lois Black, PI) and from the National Science Foundation, 0905095 (Jan van Santen, PI). The views herein are those of the authors and do not reflect the views of the funding agencies.

PY - 2010

Y1 - 2010

N2 - We propose a delexicalization algorithm that renders the lexical content of an utterance unintelligible, while preserving important acoustic prosodic cues, as well as naturalness and speaker identity. This is achieved by replacing voiced regions by spectral slices from a surrogate vowel, and by averaging the magnitude spectrum during unvoiced regions. Perceptual tests were carried out comparing sentences that were either unprocessed or delexicalized, using a baseline or the proposed method. An intelligibility test resulted in a keyword recall rate of 92% for the unprocessed sentences, and near complete unintelligibility for both delexicalization methods. Affect recognition was at 65% for unprocessed sentences, and 46% and 49% for the baseline and the proposed method, respectively. Preference tests showed that the proposed method preserved drastically more speaker identity, and sounded more natural than the baseline.

AB - We propose a delexicalization algorithm that renders the lexical content of an utterance unintelligible, while preserving important acoustic prosodic cues, as well as naturalness and speaker identity. This is achieved by replacing voiced regions by spectral slices from a surrogate vowel, and by averaging the magnitude spectrum during unvoiced regions. Perceptual tests were carried out comparing sentences that were either unprocessed or delexicalized, using a baseline or the proposed method. An intelligibility test resulted in a keyword recall rate of 92% for the unprocessed sentences, and near complete unintelligibility for both delexicalization methods. Affect recognition was at 65% for unprocessed sentences, and 46% and 49% for the baseline and the proposed method, respectively. Preference tests showed that the proposed method preserved drastically more speaker identity, and sounded more natural than the baseline.

KW - Affect

KW - Delexicalization

KW - Intelligibility

UR - http://www.scopus.com/inward/record.url?scp=79959825126&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959825126&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:79959825126

T3 - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

SP - 474

EP - 477

BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

PB - International Speech Communication Association

ER -
