An evaluation of retrieval effectiveness using spelling-correction and string-similarity matching methods on Malay texts

✍ Scribed by Bakar, Zainab Abu; Sembok, Tengku Mohd T.; Yusoff, Mohammed


Publisher: John Wiley and Sons
Year: 2000
Language: English
File size: 103 KB
Volume: 51
Category: Article
ISSN: 0002-8231

✦ Synopsis


This article evaluates the effectiveness of spelling-correction and string-similarity matching methods in retrieving words from a Malay dictionary that are similar to a set of query words. The spelling-correction techniques used are SPEEDCOP, Soundex, Davidson, Phonix, and Hartlib. Two dynamic-programming methods are also used, one measuring the longest common subsequence and the other the edit-cost distance. Several search combinations of query and dictionary words are performed in the experiments, the best being one that stems both query and dictionary words using an existing Malay stemming algorithm. The retrieval effectiveness (E) and retrieved and relevant (R&R) mean measures are calculated from a weighted combination of recall and precision values. Results from these experiments are then compared with those of digram, an available string-similarity method. The best R&R and E results are given by using digram. Edit-cost distances produce the best E results, and both dynamic-programming methods rank second in finding R&R mean measures.
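
The synopsis names the matching methods without defining them, so the following is a minimal Python sketch of the two dynamic-programming measures, a digram similarity, and an E measure. It is not the authors' implementation, and every function name here is hypothetical. Several details are assumptions the synopsis does not confirm: unit costs for insertion, deletion, and substitution in the edit distance; a Dice coefficient over sets of adjacent letter pairs for the digram method; and van Rijsbergen's formula for E, with `beta` as the recall/precision weight.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Edit distance with unit costs (an assumption; the paper's costs may differ)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ch_a == ch_b else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def digrams(word: str) -> set:
    """Set of adjacent letter pairs in word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def digram_similarity(a: str, b: str) -> float:
    """Dice coefficient over digram sets (assumed form of the digram method)."""
    da, db = digrams(a), digrams(b)
    if not da and not db:
        # Words shorter than two letters have no digrams; fall back to equality.
        return 1.0 if a == b else 0.0
    return 2 * len(da & db) / (len(da) + len(db))


def e_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Van Rijsbergen's E (assumed); lower is better, 0 is perfect retrieval."""
    if precision == 0 or recall == 0:
        return 1.0
    b2 = beta * beta
    return 1 - (1 + b2) * precision * recall / (b2 * precision + recall)
```

For the Malay pair "makan" and "makanan", this sketch gives a longest common subsequence of 5, an edit distance of 2, and a digram similarity of 8/9. A dictionary search in this style would score every dictionary word against the query and keep those above a chosen threshold.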