𝔖 Bobbio Scriptorium
✦   LIBER   ✦

A model for quantitative evaluation of an end-to-end question-answering system

✍ Scribed by Nina Wacholder; Diane Kelly; Paul Kantor; Robert Rittman; Ying Sun; Bing Bai; Sharon Small; Boris Yamrom; Tomek Strzalkowski


Publisher
John Wiley and Sons
Year
2007
Tongue
English
Weight
430 KB
Volume
58
Category
Article
ISSN
1532-2882


✦ Synopsis


Abstract

We describe a procedure for quantitative evaluation of interactive question‐answering systems and illustrate it with application to the High‐Quality Interactive Question‐Answering (HITIQA) system. Our objectives were (a) to design a method to realistically and reliably assess interactive question‐answering systems by comparing the quality of reports produced using different systems, (b) to conduct a pilot test of this method, and (c) to perform a formative evaluation of the HITIQA system. Far more important than the specific information gathered from this pilot evaluation are (a) the development of a protocol for evaluating an emerging technology, (b) the creation of reusable assessment instruments, and (c) the knowledge gained in conducting the evaluation. We conclude that this method, which uses a surprisingly small number of subjects and does not rely on predetermined relevance judgments, measures the impact of system change on work produced by users. Therefore, this method can be used to compare the product of interactive systems that use different underlying technologies.


📜 SIMILAR VOLUMES


Impact of model for end-stage liver dise
✍ Urmila Khettry; Gissou Azabdaftari; Mary Ann Simpson; Elizabeth A. Pomfret; Jame 📂 Article 📅 2006 🏛 John Wiley and Sons 🌐 English ⚖ 160 KB

The Model for End-Stage Liver Disease (MELD) scoring system, a validated objective liver disease severity scale, was adopted in February 2002 to allocate cadaveric organs for liver transplantation (LT). To improve transplantability before succumbing to advanced disease, patients with low-stage hepat

An evaluation of a traditional and a neu
✍ David Cameron; Pauline Kneale; Linda See 📂 Article 📅 2002 🏛 John Wiley and Sons 🌐 English ⚖ 154 KB

This study evaluates two (of the many) modelling approaches to flood forecasting for an upland catchment (the River South Tyne at Haydon Bridge, England). The first modelling approach utilizes ‘traditional’ hydrological models. It consists of a rainfall–runoff model (the probability dis

Quality-by-design (QbD): An integrated a
✍ Huiquan Wu; Mansoor A. Khan 📂 Article 📅 2009 🏛 John Wiley and Sons 🌐 English ⚖ 234 KB

The objective of this study was to develop an integrated process monitoring approach for evaluating powder blending process kinetics and determining blending process end-point. A mixture design was created to include 26 powder formulations consisting of ibuprofen as the model drug and three excipien