MetaSpider: Meta-searching and categorization on the Web
Written by Hsinchun Chen; Haiyan Fan; Michael Chau; Daniel Zeng
- Publisher: John Wiley and Sons
- Year: 2001
- Language: English
- File size: 622 KB
- Volume: 52
- Category: Article
- ISSN: 1532-2882
- DOI: 10.1002/asi.1180
Abstract
It has become increasingly difficult to locate relevant information on the Web, even with the help of Web search engines. Two approaches to addressing the low precision and poor presentation of search results of current search tools are studied: meta-search and document categorization. Meta-search engines improve precision by selecting and integrating search results from generic or domain-specific Web search engines or other resources. Document categorization promises better organization and presentation of retrieved results. This article introduces MetaSpider, a meta-search engine that has real-time indexing and categorizing functions. We report in this paper the major components of MetaSpider and discuss related technical approaches. Initial results of a user evaluation study comparing MetaSpider, NorthernLight, and MetaCrawler in terms of clustering performance and of time and effort expended show that MetaSpider performed best in precision rate, but disclose no statistically significant differences in recall rate and time requirements. Our experimental study also reveals that MetaSpider exhibited a higher level of automation than the other two systems and facilitated efficient searching by providing the user with an organized, comprehensive view of the retrieved documents.
SIMILAR VOLUMES
Although social tagging has received attention in the LIS field as a promising information organization mechanism, there is little research comparing user-supplied tags and search queries. For using user-supplied tags as an alternative or supplementary tool for existing image representa
Most of the existing techniques for page classification on the World Wide Web are based on text only analysis. Recently, several hypertext clustering algorithms have been proposed. These provide promising results when the clustering is based on combined term-similarity and hyperlink-similarity measu
While pages on the Web contain more and more multimedia information, such as images, videos and audio, today's search engines are mostly based on textual information. There is an emerging need for a new generation of search engines that try to exploit the full multimedia information present on the W