[ACM Press the third ACM international conference - New York, New York, USA (2010.02.04-2010.02.06)] Proceedings of the third ACM international conference on Web search and data mining - WSDM '10 - A model to estimate intrinsic document relevance from the clickthrough logs of a web search engine

✍ Scribed by Dupret, Georges; Liao, Ciya


Book ID: 118053388
Publisher: ACM Press
Year: 2010
Weight: 519 KB
Volume: 0
Category: Article
ISBN: 160558889X

✦ Synopsis


We propose a new model to interpret the clickthrough logs of a web search engine. This model is based on explicit assumptions about user behavior. In particular, we draw conclusions about a document's relevance by observing the user's behavior after they have examined the document, not from whether or not the user clicks on the document URL. This results in a model based on intrinsic relevance, as opposed to perceived relevance. We use the model to predict document relevance and then use this prediction as a feature for a "Learning to Rank" machine learning algorithm. Comparing the ranking functions obtained by training the algorithm with and without the new feature, we observe surprisingly good results. This is particularly notable given that the baseline we use is the heavily optimized ranking function of a leading commercial search engine. A deeper analysis shows that the new feature is particularly helpful for non-navigational queries and for queries with a large abandonment rate or a large average number of queries per session. This is important because these types of queries are considered to be the most difficult to solve.


📜 SIMILAR VOLUMES


[ACM Press the third ACM international c
✍ Dang, Van; Croft, Bruce W. 📂 Article 📅 2010 🏛 ACM Press ⚖ 551 KB

Query reformulation techniques based on query logs have been studied as a method of capturing user intent and improving retrieval effectiveness. The evaluation of these techniques has primarily, however, focused on proprietary query logs and selected samples of queries. In this paper, we suggest tha