Refining Semantic Similarity of Paraphasias Using a Contextual Language Model

Alexandra C. Salem, Steven Bedrick, Robert Gale, Marianne Casilio, Mikala Fleegle, Gerasimos Fergadiotis

Research output: Contribution to journal › Article › peer-review

Purpose: ParAlg (Paraphasia Algorithms) is software that automatically categorizes a person with aphasia’s naming error (paraphasia) in relation to its intended target on a picture-naming test. These classifications (based on lexicality as well as semantic, phonological, and morphological similarity to the target) are important for characterizing an individual’s word-finding deficits, or anomia. In this study, we applied a modern language model called BERT (Bidirectional Encoder Representations from Transformers) as a semantic classifier and evaluated its performance against ParAlg’s original word2vec model.

Method: We used a set of 11,999 paraphasias produced during the Philadelphia Naming Test. We trained ParAlg with either word2vec or BERT and compared the performance of each version with that of humans. Finally, we evaluated BERT’s performance in terms of word-sense selection and conducted an item-level discrepancy analysis to identify which aspects of semantic similarity are most challenging to classify.

Results: Compared with word2vec, BERT qualitatively reduced word-sense issues and quantitatively reduced semantic classification errors by almost half. A large percentage of errors were attributable to semantic ambiguity. Of the possible semantic similarity subtypes, responses that were associated with the intended target or were category coordinates of it were most likely to be misclassified by both models and humans alike.

Conclusions: BERT outperforms word2vec as a semantic classifier, partially due to its superior handling of polysemy. This work is an important step for further establishing ParAlg as an accurate assessment tool.
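As an illustration only, the general idea of an embedding-based semantic classifier can be sketched as follows. This is not ParAlg's implementation: the abstract does not describe how the classifier is built, so the cosine-similarity rule, the 0.5 threshold, the function names, and the three-dimensional toy vectors are all assumptions made for this example. In practice the vectors would come from a model such as word2vec or BERT.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_semantically_related(target_vec, response_vec, threshold=0.5):
    """Hypothetical decision rule: related if similarity meets the threshold.

    The threshold value is an assumption for illustration, not a value
    reported in the study.
    """
    return cosine_similarity(target_vec, response_vec) >= threshold


# Made-up toy embeddings; real ones would have hundreds of dimensions.
toy_vectors = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.3, 0.1],
    "hammer": [0.0, 0.2, 0.9],
}

# Target "dog" with responses "cat" and "hammer".
dog_cat_related = is_semantically_related(toy_vectors["dog"], toy_vectors["cat"])
dog_hammer_related = is_semantically_related(toy_vectors["dog"], toy_vectors["hammer"])
```

With a static model such as word2vec, each word has a single vector regardless of sense, whereas a contextual model such as BERT produces a vector that depends on the surrounding text, which is one reason the two can differ on polysemous words.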

Original language: English (US)
Pages (from-to): 206-220
Number of pages: 15
Journal: Journal of Speech, Language, and Hearing Research
Issue number: 1
State: Published - Jan 2023

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Speech and Hearing

