TY - GEN
T1 - Stability and Accuracy in Incremental Speech Recognition
AU - Selfridge, Ethan O.
AU - Arizmendi, Iker
AU - Heeman, Peter A.
AU - Williams, Jason D.
PY - 2011
Y1 - 2011
AB - Conventional speech recognition approaches usually wait until the user has finished talking before returning a recognition hypothesis. This results in spoken dialogue systems that are unable to react while the user is still speaking. Incremental Speech Recognition (ISR), where partial phrase results are returned during user speech, has been used to create more reactive systems. However, ISR output is unstable and so prone to revision as more speech is decoded. This paper tackles the problem of stability in ISR. We first present a method that increases the stability and accuracy of ISR output, without adding delay. Given that some revisions are unavoidable, we next present a pair of methods for predicting the stability and accuracy of ISR results. Taken together, we believe these approaches give ISR more utility for real spoken dialogue systems.
UR - http://www.scopus.com/inward/record.url?scp=84863229187&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863229187&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84863229187
SN - 9781937284107
T3 - Proceedings of the SIGDIAL 2011 Conference: 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue
SP - 110
EP - 119
BT - Proceedings of the SIGDIAL 2011 Conference
T2 - 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2011
Y2 - 17 June 2011 through 18 June 2011
ER -