Images of similarity: A visual exploration of optimal similarity metrics and scaling properties of TREC topic-document sets

✍ Scribed by Rorvig, Mark


Publisher
John Wiley and Sons
Year
1999
Language
English
File size
458 KB
Volume
50
Category
Article
ISSN
0002-8231

✦ Synopsis


Multiple similarity measures for five TREC topic-document sets from the LDC TREC Collection Disk 1 are derived from the full text of the documents. Each measure on each set is scaled using SAS MDS under ordinal, interval, and MLE assumptions, and the resulting 75 permutations are plotted. The results suggest that the cosine-vector and overlap similarity measures recover optimal data relationships among the documents of the five sets. MLE assumptions appear to be required to model the data adequately.
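
As a rough illustration of the two measures the synopsis singles out, the sketch below computes cosine and overlap similarity over raw term counts and turns them into a dissimilarity matrix of the kind a multidimensional scaling routine takes as input. The synopsis does not give the article's exact definitions, so everything here is an assumption: the whitespace tokenizer, the count-based overlap formula (shared term mass divided by the smaller document's total), and the hypothetical function names `term_counts`, `cosine_similarity`, `overlap_similarity`, and `dissimilarity_matrix`. The scaling step itself (SAS MDS in the article) is not reproduced.

```python
import math
from collections import Counter


def term_counts(text):
    """Assumed tokenizer: lowercase the text and split on whitespace."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def overlap_similarity(a, b):
    """Assumed overlap form: shared term mass over the smaller document's mass."""
    shared = sum(min(a[t], b[t]) for t in a.keys() & b.keys())
    smaller = min(sum(a.values()), sum(b.values()))
    return shared / smaller if smaller else 0.0


def dissimilarity_matrix(docs, measure):
    """Pairwise 1 - similarity, with an exact zero diagonal."""
    counts = [term_counts(d) for d in docs]
    n = len(counts)
    return [
        [0.0 if i == j else 1.0 - measure(counts[i], counts[j]) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy documents, invented for illustration only (not TREC data).
    docs = [
        "oil prices rise",
        "oil prices fall",
        "court rules on patent",
    ]
    for name, measure in [("cosine", cosine_similarity),
                          ("overlap", overlap_similarity)]:
        print(name)
        for row in dissimilarity_matrix(docs, measure):
            print("  ", [round(x, 3) for x in row])
```

Either matrix could then be passed to a scaling procedure under ordinal, interval, or maximum-likelihood assumptions; the synopsis reports only that the maximum-likelihood assumptions appeared necessary for the TREC sets, not how the other fits compared numerically.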