How effective is suffixing?
Scribed by Harman, Donna
- Publisher
- John Wiley and Sons
- Year
- 1991
- Tongue
- English
- Weight
- 859 KB
- Volume
- 42
- Category
- Article
- ISSN
- 0002-8231
Synopsis
The interaction of suffixing algorithms and ranking techniques in retrieval performance, particularly in an online environment, was investigated. Three general purpose suffixing algorithms were used for retrieval on the Cranfield 1400, Medlars, and CACM test collections, with no significant improvement in performance shown for any of the algorithms. A failure analysis suggested three modifications to ranking techniques: variable weighting of term variants, selective stemming depending on query length, and selective stemming depending on term importance. None of these modifications improved performance. Recommendations are made regarding the uses of suffixing in an online environment.

Introduction

Traditional statistically based keyword retrieval systems have been the subject of experiments for over 30 years. The use of simple keyword matching as a basis for retrieval can produce acceptable results, and the addition of ranking techniques based on the frequency of a given matching term within a document collection and/or within a given document adds considerable improvement (Sparck Jones, 1972; Salton, 1983).
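To make the idea of frequency-based ranking concrete, here is a minimal sketch that scores documents by within-document term frequency multiplied by an inverse collection frequency weight. The function name `rank`, the whitespace tokenization, and the exact weighting formula are assumptions chosen for illustration; they are not the ranking scheme used in the article.

```python
import math
from collections import Counter


def rank(query, docs):
    """Return (doc_id, score) pairs, best first, using a simple tf * idf sum.

    Illustrative only: tokenization is lowercase whitespace splitting and
    the weight is tf * log(N / df), a common textbook form.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(query.lower().split())

    scores = []
    for doc_id, tokens in enumerate(tokenized):
        tf = Counter(tokens)  # frequency of each term within this document
        score = 0.0
        for term in query_terms:
            if term in tf:
                idf = math.log(n_docs / df[term])  # rarer terms weigh more
                score += tf[term] * idf
        scores.append((doc_id, score))
    # Sort by descending score; Python's stable sort keeps ties in doc order.
    return sorted(scores, key=lambda pair: -pair[1])


docs = [
    "suffix stripping algorithms",
    "ranking algorithms for retrieval",
    "retrieval retrieval evaluation",
]
ranking = rank("retrieval suffix", docs)
```

In this toy collection the rare term "suffix" outweighs the more common "retrieval", so the first document ranks highest even though it matches only one query term.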
The conflation of word variants using suffixing algorithms was one of the earliest enhancements to statistical keyword retrieval systems (Salton, 1971), and has become so standard a part of most systems that many system descriptions neglect to mention the use of suffixing, or to identify the algorithm that was used. Suffixing was originally done for two principal reasons: the large reduction in storage required by a retrieval dictionary (Bell, 1979), and the increase in performance due to the use of word variants. Recent research has been more concerned with performance improvement than with storage reduction.
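As a rough illustration of how suffixing conflates word variants and shrinks the retrieval dictionary, the sketch below strips simple English plural endings before indexing. The rule set and the names `strip_plural` and `build_index` are hypothetical and deliberately tiny; this is not a reproduction of any of the algorithms evaluated in the article.

```python
def strip_plural(word):
    """Toy suffix stripper that conflates simple English plurals only."""
    w = word.lower()
    if w.endswith("ies") and not w.endswith(("eies", "aies")):
        return w[:-3] + "y"   # queries -> query
    if w.endswith("es") and not w.endswith(("aes", "ees", "oes")):
        return w[:-1]         # files -> file
    if w.endswith("s") and not w.endswith(("us", "ss")):
        return w[:-1]         # documents -> document
    return w                  # class, corpus, and non-plurals are unchanged


def build_index(docs):
    """Map each stripped term to the set of document ids containing it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for token in text.split():
            index.setdefault(strip_plural(token), set()).add(doc_id)
    return index


index = build_index(["query terms", "queries term"])
```

Here the four surface forms collapse to two dictionary entries, and a search for either variant of a word reaches both documents. The same collapsing is what can hurt precision when variants with different meanings are merged.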
The NLM IRX (Information Retrieval Experiment) project (Benson,
SIMILAR VOLUMES
Abstract: To implement Section 404 of the Sarbanes–Oxley Act (SOX), management must report on the “effectiveness” of the company's internal control. But how can you gauge the level of effectiveness? And how can you tell if that level is acceptable? The author offers some solid answers. © 2004 W
The rate of decline of plasma HIV RNA in patients treated with anti-retroviral drugs has been postulated to reflect the half-lives of previously HIV-infected cells. Here, Zvi Grossman and colleagues argue that the observed decline is explained by the kinetics of ongoing infection cycles. Residual ce
This calculation is derived as follows. Working through the 43 suffixes in Table , we find that 9 suffixes select for adjectives, and 16 form adjectives (hence 9 x 16 = 144 combinations are allowed); 21 suffixes select for nouns and 21 form nouns (hence 21 x 21 = 441 allowed); and 13 suffixes select