TY - JOUR
T1 - Optimizing feature representation for automated systematic review work prioritization.
AU - Cohen, Aaron M.
N1 - Copyright: This record is sourced from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
PY - 2008
Y1 - 2008
AB - Automated document classification can be a valuable tool for enhancing the efficiency of creating and updating systematic reviews (SRs) for evidence-based medicine. One way document classification can help is in performing work prioritization: given a set of documents, order them such that the documents most likely to be useful appear first. We evaluated several alternative classification feature systems, including unigram, n-gram, MeSH, and natural language processing (NLP) feature sets, for their usefulness on 15 SR tasks, using the area under the receiver operating characteristic (ROC) curve as a measure of goodness. We also examined the impact of topic-specific training data compared to general SR inclusion data. The best feature set used a combination of n-gram and MeSH features. NLP-based features were not found to improve performance. Furthermore, topic-specific training data usually provided a significant performance gain over more general SR training.
UR - http://www.scopus.com/inward/record.url?scp=69549134271&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=69549134271&partnerID=8YFLogxK
M3 - Article
C2 - 18998798
AN - SCOPUS:69549134271
SN - 1559-4076
SP - 121
EP - 125
JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
ER -