A speech model of acoustic inventories based on asynchronous interpolation

Alexander Kain, Jan Van Santen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

We propose a speech model that describes acoustic inventories of concatenative synthesizers. The model has the following characteristics: (i) very compact representations and thus high compression ratios are possible, (ii) re-synthesized speech is free of concatenation errors, (iii) the degree of articulation can be controlled explicitly, and (iv) voice transformation is feasible with relatively few additional recordings of a target speaker. The model represents a speech unit as a synthesis of several types of features, each of which has been computed using non-linear, asynchronous interpolation of neighboring basis vectors associated with known phonemic identities. During analysis, basis vectors and transition weights are estimated under a strict diphone assumption using a dynamic time warping approach. During synthesis, the estimated transition weight values are modified to produce changes in duration and articulation effort.
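As a rough illustration of what "non-linear, asynchronous interpolation of neighboring basis vectors" could look like, the sketch below blends two basis vectors with one transition weight per feature. The abstract does not give the model's equations, so everything here is an assumption made for illustration: the sigmoid weight shape, the names `transition_weight` and `interpolate_unit`, the `midpoints` and `slopes` parameters, and the two-feature example values. It is not the authors' implementation.

```python
import math


def transition_weight(t, midpoint, slope):
    """Sigmoid weight in (0, 1). The sigmoid is an assumed non-linear shape,
    not one taken from the paper."""
    return 1.0 / (1.0 + math.exp(-slope * (t - midpoint)))


def interpolate_unit(left, right, midpoints, slopes, n_frames):
    """Return n_frames feature vectors moving from basis vector `left`
    to basis vector `right` (n_frames must be at least 2).

    Each feature k has its own (midpoint, slope), so the features cross
    over at different times. That per-feature timing is what makes the
    interpolation asynchronous in this sketch.
    """
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # normalized time in [0, 1]
        frame = []
        for a, b, m, s in zip(left, right, midpoints, slopes):
            w = transition_weight(t, m, s)
            frame.append((1.0 - w) * a + w * b)
        frames.append(frame)
    return frames


# Hypothetical two-feature basis vectors for two neighboring phonemes.
left = [500.0, 1500.0]
right = [300.0, 2200.0]

# Feature 0 crosses over early (midpoint 0.4), feature 1 late (midpoint 0.6).
frames = interpolate_unit(left, right,
                          midpoints=[0.4, 0.6],
                          slopes=[12.0, 12.0],
                          n_frames=11)
```

In this toy setup, changing `n_frames` stretches or shortens the unit, and scaling `slopes` changes how sharply each feature moves toward its target. That is one possible reading of how modified transition weights could affect duration and articulation effort, offered only as an analogy to the behavior the abstract describes.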

Original language: English (US)
Title of host publication: EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
Publisher: International Speech Communication Association
Pages: 329-332
Number of pages: 4
State: Published - 2003
Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
Duration: Sep 1 2003 → Sep 4 2003

Other

Other: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003
Country/Territory: Switzerland
City: Geneva
Period: 9/1/03 → 9/4/03

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Linguistics and Language
  • Communication
