Including dynamic and phonetic information in voice conversion systems

Helenca Duxans, Antonio Bonafonte, Alexander Kain, Jan Van Santen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

42 Scopus citations

Abstract

Voice Conversion (VC) systems modify a speaker's voice (source speaker) so that it is perceived as if another speaker (target speaker) had uttered it. Previously published VC approaches using Gaussian Mixture Models (GMMs) [1] perform the conversion on a frame-by-frame basis using only spectral information. In this paper, two new approaches are studied in order to extend GMM-based VC systems. First, dynamic information is used to build the speaker acoustic model, so that the transformation is carried out according to sequences of frames. Second, phonetic information is introduced in the training of the VC system. Objective and perceptual results compare the performance of the proposed systems.

Original language: English (US)
Title of host publication: 8th International Conference on Spoken Language Processing, ICSLP 2004
Publisher: International Speech Communication Association
Pages: 1193-1196
Number of pages: 4
State: Published - 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: Oct 4 2004 → Oct 8 2004

Other

Other: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 10/4/04 → 10/8/04

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
