TY - GEN
T1 - Analyzing web log files of the Health on the Net HONmedia search engine to define typical image search tasks for image retrieval evaluation
AU - Müller, Henning
AU - Boyer, Célia
AU - Gaudinat, Arnaud
AU - Hersh, William
AU - Geissbuhler, Antoine
PY - 2007
Y1 - 2007
AB - Medical institutions produce an ever-increasing amount of diverse information. In digital form, these data become available for use beyond a single patient, and images are no exception. However, little is known about how medical professionals search for visual medical information and how they want to use it outside the context of a single patient. This article analyzes ten months of usage log files of the Health on the Net (HON) medical media search engine. Keywords were extracted from all queries, and the most frequent terms and subjects were identified. The dataset required substantial preprocessing; problems included national character sets, spelling errors, and the use of terms in several languages. The results show that media search, particularly for images, was frequently used. The most common queries were for general concepts (e.g., heart, lung). To define realistic information needs for the ImageCLEFmed challenge evaluation (Cross Language Evaluation Forum medical image retrieval), we used frequent queries that were still specific enough to cover at least two of the three axes of modality, anatomic region, and pathology. Several research groups evaluated their image retrieval algorithms based on these defined topics.
KW - image retrieval evaluation
KW - log files analysis
UR - http://www.scopus.com/inward/record.url?scp=35748980893&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35748980893&partnerID=8YFLogxK
M3 - Conference contribution
C2 - 17911928
AN - SCOPUS:35748980893
SN - 9781586037741
T3 - Studies in Health Technology and Informatics
SP - 1319
EP - 1323
BT - MEDINFO 2007 - Proceedings of the 12th World Congress on Health (Medical) Informatics
PB - IOS Press
T2 - 12th World Congress on Medical Informatics, MEDINFO 2007
Y2 - 20 August 2007 through 24 August 2007
ER -