𝔖 Bobbio Scriptorium
✦   LIBER   ✦

An efficient document retrieval method using n-gram indexing

✍ Scribed by Yasushi Ogawa; Toru Matsuda


Book ID
104591078
Publisher
John Wiley and Sons
Year
2002
Tongue
English
Weight
169 KB
Volume
33
Category
Article
ISSN
0882-1666


✦ Synopsis


Abstract

In Japanese, the border between words is not explicitly indicated. Consequently, n‐gram (a tuple of n characters) indexing is usually applied to document retrieval. Retrieval based on the n‐gram indexing is performed for long retrieval words as follows. After dividing the long retrieval word into n‐grams, two‐stage processing is applied, consisting of determination of the retrieval candidate document containing the divided n‐gram, and noise elimination in the candidate document by matching the position of the n‐gram. This paper proposes two methods to speed up the above retrieval processing. One selects the combination that minimizes the processing cost for noise elimination from the n‐gram extracted from the long retrieval word. The other applies additionally an n‐gram other than the one minimizing the cost, extracted from the retrieval word, to the determination of the candidate document, which improves efficiency in narrowing the range of the candidate documents. An evaluation experiment was performed, using 4 years of newspaper articles, and the effectiveness of the proposed methods was demonstrated. © 2002 Scripta Technica, Syst Comp Jpn, 33(2): 54–63, 2002
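
The two-stage processing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_index`, `cover`, `candidate_docs`, `search`), the fixed n = 2, and the sample documents are assumptions made for illustration. In particular, `cover` picks a simple stride-based covering set of n-grams, standing in for the paper's cost-minimizing selection, whose cost model the abstract does not give.

```python
from collections import defaultdict


def ngrams(text, n=2):
    """Return (offset, gram) pairs for every n-gram in text."""
    return [(i, text[i:i + n]) for i in range(len(text) - n + 1)]


def build_index(docs, n=2):
    """Positional index: gram -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, gram in ngrams(text, n):
            index[gram][doc_id].append(pos)
    return index


def cover(query, n=2):
    """Hypothetical selection: a small set of n-grams covering every
    character of the query (stride n, plus the final n-gram).
    The paper instead chooses the combination with minimum cost."""
    if len(query) < n:
        return []
    offsets = list(range(0, len(query) - n + 1, n))
    last = len(query) - n
    if offsets[-1] != last:
        offsets.append(last)
    return [(o, query[o:o + n]) for o in offsets]


def candidate_docs(index, query, n=2):
    """Stage 1: documents containing every n-gram of the query.
    Using all n-grams here (not only the covering set) narrows the
    candidates further, in the spirit of the second proposed method."""
    grams = ngrams(query, n)
    if not grams:
        return set()
    result = None
    for _, gram in grams:
        found = set(index.get(gram, {}))
        result = found if result is None else result & found
    return result


def search(index, query, n=2):
    """Stage 2: eliminate noise by matching n-gram positions."""
    selected = cover(query, n)
    if not selected:
        return set()
    _, first_gram = selected[0]  # its offset in the query is 0
    hits = set()
    for doc_id in candidate_docs(index, query, n):
        for start in index[first_gram][doc_id]:
            # Every selected n-gram must occur at its offset from start.
            if all(start + off in index[gram][doc_id]
                   for off, gram in selected[1:]):
                hits.add(doc_id)
                break
    return hits


# Assumed sample data: document 2 contains both bigrams of the query
# but not adjacent to each other, so it is noise removed in stage 2.
docs = {1: "東京都庁に行く", 2: "京都から東京へ", 3: "大阪城"}
index = build_index(docs)
print(candidate_docs(index, "東京都"))  # {1, 2}
print(search(index, "東京都"))          # {1}
```

Because the selected n-grams cover every character of the query, checking their relative positions is enough to confirm that the whole query string occurs in the document.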


📜 SIMILAR VOLUMES


An efficient implementation of the full-
✍ Robert J. Harrison; Sohrab Zarrabian 📂 Article 📅 1989 🏛 Elsevier Science 🌐 English ⚖ 484 KB

The determinant full-CI algorithm of Zarrabian, Sarma and Paldus (Chem. Phys. Letters 158 (1989) 183) has been implemented for efficient operation on parallel vector computers. For few electrons (n) in many orbitals (m) and N_CI determinants, the floating point operation count is O(N_CI m^2 n^2), dominat

An efficient solution method for incompr
✍ D. Ghosh Roychowdhury; Sarit Kumar Das; T. Sundararajan 📂 Article 📅 1999 🏛 John Wiley and Sons 🌐 English ⚖ 311 KB 👁 1 views

An efficient strategy for the solution of N-S Equations using collocated, non-orthogonal grids is presented. The governing equations have been discretized in the physical plane itself without co-ordinate transformation, thereby retaining the lucidity of the basic finite volume method. The non-orthogona

An Efficient Method for the Preparation
✍ Hidenori Aoki; Teruaki Mukaiyama 📂 Article 📅 2006 🏛 John Wiley and Sons ⚖ 26 KB 👁 1 views

ChemInform is a weekly abstracting service, delivering concise information at a glance that was extracted from about 200 leading journals.