𝔖 Bobbio Scriptorium
✦   LIBER   ✦

On the complexity of Rocchio's similarity-based relevance feedback algorithm

✍ Scribed by Zhixiang Chen; Bin Fu


Publisher
John Wiley and Sons
Year
2007
Tongue
English
Weight
271 KB
Volume
58
Category
Article
ISSN
1532-2882

No coin nor oath required. For personal study only.

✦ Synopsis


Abstract

Rocchio's similarity‐based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive learning algorithm from examples in searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d^2^(log d + log n)) over the discretized vector space {0,…, n βˆ’ 1}^d^, when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier $\left( {\overrightarrow q ,0} \right)$ over {0,…, n βˆ’ 1}^d^ can be improved to, at most, 1 + 2**k** (n βˆ’ 1) (log d βˆ’ log(n βˆ’ 1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound $\Omega \left( {\left( {_2^d } \right){\rm{log}},n} \right)$ on its learning complexity over the Boolean vector space {0, 1}^d^.


πŸ“œ SIMILAR VOLUMES


The MoT-ICE: A New High-Resolution Wave-
✍ Sebastian Noelle πŸ“‚ Article πŸ“… 2000 πŸ› Elsevier Science 🌐 English βš– 351 KB

Fey's Method of Transport (MoT ) is a multidimensional flux-vector-splitting scheme for systems of conservation laws. Similarly to its one-dimensional forerunner, the Steger-Warming scheme, and several other upwind finite-difference schemes, the MoT suffers from an inconsistency at sonic points when