TY - GEN
T1 - Automated scoring of clinical expressive language evaluation tasks
AU - Wang, Yiyi
AU - Prud'Hommeaux, Emily
AU - Asgari, Meysam
AU - Dolata, Jill
N1 - Funding Information:
We thank Beth Calamé, Julie Bird, Kristin Hinton, Christine Yang, and Emily Fabius for their contributions to data collection and annotation. This work was supported in part by NIH NIDCD awards R01DC012033 and R21DC017000. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the NIH or NIDCD.
Publisher Copyright:
© 2020 Association for Computational Linguistics.
PY - 2020
Y1 - 2020
N2 - Many clinical assessment instruments used to diagnose language impairments in children include a task in which the subject must formulate a sentence to describe an image using a specific target word. Because producing sentences in this way requires the speaker to integrate syntactic and semantic knowledge in a complex manner, responses are typically evaluated on several different dimensions of appropriateness, yielding a single composite score for each response. In this paper, we present a dataset consisting of non-clinically elicited responses for three related sentence formulation tasks, and we propose an approach for automatically evaluating their appropriateness. Using neural machine translation, we generate correct-incorrect sentence pairs to serve as synthetic data, increasing the amount and diversity of training data for our scoring model. Our scoring model uses transfer learning to facilitate automatic sentence appropriateness evaluation. We further compare custom word embeddings with pre-trained contextualized embeddings as features for our scoring model. We find that transfer learning improves scoring accuracy, particularly when using pre-trained contextualized embeddings.
UR - http://www.scopus.com/inward/record.url?scp=85112673131&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112673131&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85112673131
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 177
EP - 185
BT - Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications
PB - Association for Computational Linguistics (ACL)
T2 - 15th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2020 at the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
Y2 - 10 July 2020
ER -