TY - JOUR
T1 - Tasks, topics and relevance judging for the TREC Genomics Track
T2 - five years of experience evaluating biomedical text information retrieval systems
AU - Roberts, Phoebe M.
AU - Cohen, Aaron M.
AU - Hersh, William R.
N1 - Funding Information:
The TREC Genomics Track was funded by grant ITR-0325160 to W.R.H. from the U.S. National Science Foundation. The authors would like to thank the Genomics Track steering committee, especially Kevin Bretonnel Cohen and Anna Divoli, for helpful discussions about relevance judgments and guidelines.
PY - 2009/2
Y1 - 2009/2
AB - With the help of a team of expert biologist judges, the TREC Genomics Track has generated four large sets of "gold standard" test collections, comprising over a hundred unique topics, two kinds of ad hoc retrieval tasks, and their corresponding relevance judgments. Over the years of the track, increasingly complex tasks necessitated the creation of judging tools and training guidelines to accommodate teams of part-time, short-term workers from a variety of specialized backgrounds in the biological sciences, and to address the consistency and reproducibility of the assessment process. Important lessons were learned about factors that influenced the utility of the test collections, including topic design, the annotations provided by judges, the methods used for identifying and training judges, and the provision of a central moderator "meta-judge".
KW - Evaluation
KW - Information retrieval
KW - Inter-annotator agreement
KW - Reference standards
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=58149218476&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58149218476&partnerID=8YFLogxK
DO - 10.1007/s10791-008-9072-x
M3 - Article
AN - SCOPUS:58149218476
SN - 1386-4564
VL - 12
SP - 81
EP - 97
JO - Information Retrieval
JF - Information Retrieval
IS - 1
ER -