
Using graded relevance assessments in IR evaluation

✍ Scribed by Jaana Kekäläinen; Kalervo Järvelin


Publisher: John Wiley and Sons
Year: 2002
Tongue: English
Weight: 134 KB
Volume: 53
Category: Article
ISSN: 1532-2882


✦ Synopsis


Abstract

This article proposes evaluation methods based on the use of nondichotomous relevance judgements in IR experiments. It is argued that evaluation methods should credit IR methods for their ability to retrieve highly relevant documents. This is desirable from the user's point of view in modern large IR environments. The proposed methods are (1) a novel application of precision-recall (P‐R) curves and average precision computations based on separate recall bases for documents of different degrees of relevance, and (2) generalized recall and precision based directly on multiple-grade relevance assessments (i.e., without dichotomizing the assessments). We demonstrate the use of the traditional and the novel evaluation measures in a case study on the effectiveness of query types, based on combinations of query structures and expansion, in retrieving documents of various degrees of relevance. The test was run with a best-match retrieval system (InQuery) on a text database consisting of newspaper articles. To gain insight into the retrieval process, one should use both graded relevance assessments and effectiveness measures that make it possible to observe the differences, if any, between retrieval methods in retrieving documents of different levels of relevance. In modern times of information overload, one should pay particular attention to the capability of retrieval methods to retrieve highly relevant documents.
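
As an illustration of method (2), the following is a minimal sketch of generalized precision and recall, not the authors' implementation. It assumes that each graded assessment has been rescaled to a score between 0 and 1, that generalized precision is the summed score of the retrieved documents divided by the number retrieved, and that generalized recall is the same sum divided by the total score available in the collection. The function name, the document ids, and the example grades are hypothetical.

```python
from typing import Dict, List, Tuple


def generalized_precision_recall(
    retrieved: List[str], grades: Dict[str, float]
) -> Tuple[float, float]:
    """Return (generalized precision, generalized recall) for one result list.

    retrieved: document ids in the result list (assumed to contain no duplicates).
    grades:    assumed mapping from document id to a relevance score in [0, 1];
               documents missing from the mapping count as non-relevant (0.0).
    """
    if not retrieved:
        return 0.0, 0.0

    # Relevance mass actually collected by the result list.
    gained = sum(grades.get(doc, 0.0) for doc in retrieved)
    # Relevance mass available in the whole collection.
    available = sum(grades.values())

    g_precision = gained / len(retrieved)
    g_recall = gained / available if available > 0 else 0.0
    return g_precision, g_recall


# Hypothetical four-level assessments rescaled to [0, 1].
example_grades = {"d1": 1.0, "d2": 0.5, "d3": 0.5, "d4": 0.0}
gp, gr = generalized_precision_recall(["d1", "d4", "d2"], example_grades)
print(gp, gr)
```

With binary scores (0 or 1) these quantities reduce to ordinary precision and recall, which is why a graded variant can credit a method for retrieving highly relevant documents rather than marginally relevant ones.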


📜 SIMILAR VOLUMES


The Disease Activity Score is not suitab
✍ Frederick Wolfe; Kaleb Michaud; Theodore Pincus; Daniel Furst; Edward Keystone 📂 Article 📅 2005 🏛 John Wiley and Sons 🌐 English ⚖ 218 KB 👁 2 views

Abstract. Objective: The Disease Activity Score (DAS) is widely used in clinical trials. A DAS of 5.1 defines the level of severe rheumatoid arthritis (RA) and is the criterion for the initiation of anti–tumor necrosis factor therapy in the UK and The Netherlands. In North America, similar rul