Stability and accuracy in Incremental Speech Recognition

Ethan O. Selfridge, Iker Arizmendi, Peter A. Heeman, Jason D. Williams

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

35 Scopus citations

Abstract

Conventional speech recognition approaches usually wait until the user has finished talking before returning a recognition hypothesis. This results in spoken dialogue systems that are unable to react while the user is still speaking. Incremental Speech Recognition (ISR), where partial phrase results are returned during user speech, has been used to create more reactive systems. However, ISR output is unstable and so prone to revision as more speech is decoded. This paper tackles the problem of stability in ISR. We first present a method that increases the stability and accuracy of ISR output, without adding delay. Given that some revisions are unavoidable, we next present a pair of methods for predicting the stability and accuracy of ISR results. Taken together, we believe these approaches give ISR more utility for real spoken dialogue systems.

Original language: English (US)
Title of host publication: Proceedings of the SIGDIAL 2011 Conference
Subtitle of host publication: 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Pages: 110-119
Number of pages: 10
State: Published - 2011
Event: 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2011 - Portland, OR, United States
Duration: Jun 17 2011 to Jun 18 2011

Publication series

Name: Proceedings of the SIGDIAL 2011 Conference: 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Other

Other: 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2011
Country/Territory: United States
City: Portland, OR
Period: 6/17/11 to 6/18/11

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Modeling and Simulation
