TY - CONF
T1 - POS Tags and Decision Trees for Language Modeling
AU - Heeman, Peter A.
N1 - Funding Information:
We wish to thank James Allen, Geraldine Damnati, Chaojun Liu, Xintian Wu, and Yonghong Yan. This research work was partially supported by NSF under grant IRI-9623665, by the Intel Research Council, and by CNET France Télécom, while the author was visiting there.
Publisher Copyright:
© 1999 Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, EMNLP 1999. All rights reserved.
PY - 1999
Y1 - 1999
N2 - Language models for speech recognition concentrate solely on recognizing the words that were spoken. In this paper, we advocate redefining the speech recognition problem so that its goal is to find both the best sequence of words and their POS tags, and thus incorporate POS tagging. To use POS tags effectively, we use clustering and decision tree algorithms, which allow generalizations between POS tags and words to be effectively used in estimating the probability distributions. We show that our POS model gives a reduction in word error rate and perplexity for the Trains corpus in comparison to word and class-based approaches. By using the Wall Street Journal corpus, we show that this approach scales up when more training data is available.
AB - Language models for speech recognition concentrate solely on recognizing the words that were spoken. In this paper, we advocate redefining the speech recognition problem so that its goal is to find both the best sequence of words and their POS tags, and thus incorporate POS tagging. To use POS tags effectively, we use clustering and decision tree algorithms, which allow generalizations between POS tags and words to be effectively used in estimating the probability distributions. We show that our POS model gives a reduction in word error rate and perplexity for the Trains corpus in comparison to word and class-based approaches. By using the Wall Street Journal corpus, we show that this approach scales up when more training data is available.
UR - http://www.scopus.com/inward/record.url?scp=0039623602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0039623602&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:0039623602
SP - 129
EP - 137
T2 - 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, EMNLP 1999
Y2 - 21 June 1999 through 22 June 1999
ER -