Improving the intelligibility of dysarthric speech

Alexander B. Kain; John Paul Hosom; Xiaochuan Niu; Jan P.H. van Santen; Melanie Fried-Oken; Janice Staehely

doi:10.1016/j.specom.2007.05.001

Institute on Development and Disability

Research output: Contribution to journal › Article › peer-review

119 Scopus citations

Abstract

Dysarthria is a speech motor disorder usually resulting in a substantive decrease in speech intelligibility by the general population. In this study, we have significantly improved the intelligibility of dysarthric vowels of one speaker from 48% to 54%, as evaluated by a vowel identification task using 64 CVC stimuli judged by 24 listeners. Improvement was obtained by transforming the vowels of a speaker with dysarthria to more closely match the vowel space of a non-dysarthric (target) speaker. The optimal mapping feature set, from a list of 21 candidate feature sets, proved to be one utilizing vowel duration and F1-F3 stable points, which were calculated using shape-constrained isotonic regression. The choice of speaker-specific or speaker-independent vowel formant targets appeared to be insignificant. Comparisons with "oracle" conditions were performed in order to evaluate the analysis/re-synthesis system independently of the transformation function.
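The abstract names isotonic regression as the basis for locating formant stable points. As a rough illustration only, the sketch below fits a plain non-decreasing least-squares curve with the pool-adjacent-violators algorithm. It does not reproduce the paper's shape constraint or its stable-point definition; the function name `isotonic_fit` and the F1 values are hypothetical, chosen purely for demonstration.

```python
def isotonic_fit(values):
    """Least-squares non-decreasing fit via pool-adjacent-violators.

    Illustrative only: a generic monotone fit, not the shape-constrained
    variant described in the abstract.
    """
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical F1 trajectory in Hz (made-up numbers, not data from the study).
f1_track = [520, 610, 590, 640, 700, 690]
print(isotonic_fit(f1_track))
```

Each out-of-order pair is pooled into its mean, so the fitted trajectory never decreases. A flat run in the fit is one plausible candidate for a "stable" region, although the criterion used in the study may differ.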

Original language	English (US)
Pages (from-to)	743-759
Number of pages	17
Journal	Speech Communication
Volume	49
Issue number	9
DOIs	https://doi.org/10.1016/j.specom.2007.05.001
State	Published - Sep 2007

Keywords

Dysarthria
Intelligibility
Speech modification
Speech processing
Speech transformation

ASJC Scopus subject areas

Software
Modeling and Simulation
Communication
Language and Linguistics
Linguistics and Language
Computer Vision and Pattern Recognition
Computer Science Applications

Cite this

@article{87aaa5a5b2ce484ea4528fc9d2ab375b,

title = "Improving the intelligibility of dysarthric speech",

abstract = "Dysarthria is a speech motor disorder usually resulting in a substantive decrease in speech intelligibility by the general population. In this study, we have significantly improved the intelligibility of dysarthric vowels of one speaker from 48\% to 54\%, as evaluated by a vowel identification task using 64 CVC stimuli judged by 24 listeners. Improvement was obtained by transforming the vowels of a speaker with dysarthria to more closely match the vowel space of a non-dysarthric (target) speaker. The optimal mapping feature set, from a list of 21 candidate feature sets, proved to be one utilizing vowel duration and F1-F3 stable points, which were calculated using shape-constrained isotonic regression. The choice of speaker-specific or speaker-independent vowel formant targets appeared to be insignificant. Comparisons with {"}oracle{"} conditions were performed in order to evaluate the analysis/re-synthesis system independently of the transformation function.",

keywords = "Dysarthria, Intelligibility, Speech modification, Speech processing, Speech transformation",

author = "Kain, {Alexander B.} and Hosom, {John Paul} and Niu, Xiaochuan and {van Santen}, {Jan P.H.} and Fried-Oken, Melanie and Staehely, Janice",

note = "Funding Information: This research was conducted with support from NSF Grant 0117911 “Making Dysarthric Speech Intelligible”. Oregon Health \& Science University (OHSU), Dr. Kain, Dr. Hosom, and Dr. Jan van Santen have a significant financial interest in BioSpeech, Inc., a company that may have a commercial interest in the results of this research and technology. This potential conflict was reviewed and a management plan approved by the OHSU Conflict of Interest in Research Committee and the Integrity Program Oversight Council was implemented.",

year = "2007",

month = sep,

doi = "10.1016/j.specom.2007.05.001",

language = "English (US)",

volume = "49",

pages = "743--759",

journal = "Speech Communication",

issn = "0167-6393",

publisher = "Elsevier",

number = "9",

}

TY - JOUR

T1 - Improving the intelligibility of dysarthric speech

AU - Kain, Alexander B.

AU - Hosom, John Paul

AU - Niu, Xiaochuan

AU - van Santen, Jan P.H.

AU - Fried-Oken, Melanie

AU - Staehely, Janice

N1 - Funding Information: This research was conducted with support from NSF Grant 0117911 “Making Dysarthric Speech Intelligible”. Oregon Health & Science University (OHSU), Dr. Kain, Dr. Hosom, and Dr. Jan van Santen have a significant financial interest in BioSpeech, Inc., a company that may have a commercial interest in the results of this research and technology. This potential conflict was reviewed and a management plan approved by the OHSU Conflict of Interest in Research Committee and the Integrity Program Oversight Council was implemented.

PY - 2007/9

Y1 - 2007/9

AB - Dysarthria is a speech motor disorder usually resulting in a substantive decrease in speech intelligibility by the general population. In this study, we have significantly improved the intelligibility of dysarthric vowels of one speaker from 48% to 54%, as evaluated by a vowel identification task using 64 CVC stimuli judged by 24 listeners. Improvement was obtained by transforming the vowels of a speaker with dysarthria to more closely match the vowel space of a non-dysarthric (target) speaker. The optimal mapping feature set, from a list of 21 candidate feature sets, proved to be one utilizing vowel duration and F1-F3 stable points, which were calculated using shape-constrained isotonic regression. The choice of speaker-specific or speaker-independent vowel formant targets appeared to be insignificant. Comparisons with "oracle" conditions were performed in order to evaluate the analysis/re-synthesis system independently of the transformation function.

KW - Dysarthria

KW - Intelligibility

KW - Speech modification

KW - Speech processing

KW - Speech transformation

UR - http://www.scopus.com/inward/record.url?scp=34447635527&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34447635527&partnerID=8YFLogxK

U2 - 10.1016/j.specom.2007.05.001

DO - 10.1016/j.specom.2007.05.001

M3 - Article

AN - SCOPUS:34447635527

SN - 0167-6393

VL - 49

SP - 743

EP - 759

JO - Speech Communication

JF - Speech Communication

IS - 9

ER -
