𝔖 Bobbio Scriptorium
✦   LIBER   ✦

On the development of name search techniques for Arabic

✍ Scribed by Syed Uzair Aqeel; Steve Beitzel; Eric Jensen; David Grossman; Ophir Frieder


Publisher: John Wiley and Sons
Year: 2006
Tongue: English
Weight: 189 KB
Volume: 57
Category: Article
ISSN: 1532-2882


✦ Synopsis


Abstract

The need for effective identity matching systems has led to extensive research in the area of name search. For the most part, such work has been limited to English and other Latin‐based languages. Consequently, algorithms such as Soundex and n‐gram matching are of limited utility for languages such as Arabic, which has vastly different morphologic features that rely heavily on phonetic information. The dearth of work in this field is partly caused by the lack of standardized test data. Consequently, we have built a collection of 7,939 Arabic names, along with 50 training queries and 111 test queries. We use this collection to evaluate a variety of algorithms, including a derivative of Soundex tailored to Arabic (ASOUNDEX), measuring effectiveness by using standard information retrieval measures. Our results show an improvement of 70% over existing approaches.


📜 SIMILAR VOLUMES


✍ Stewart Johnson, Sarah 📂 Fiction 📅 2020 🏛 Allen Lane; Crown 🌐 English ⚖ 384 KB

Into the silent sea -- The light that shifts -- Red smoke -- The gates of the wonder world -- Stone from the sky -- Traversing -- Periapsis -- The acid flats -- In aeternum -- Sweet water -- Form from a formless thing. Mars was once similar to Earth, but today there are no rivers, no lakes, no ocean

On the distribution of search cost for t
✍ James Allen Fill; Lars Holst 📂 Article 📅 1996 🏛 John Wiley and Sons 🌐 English ⚖ 393 KB

A file of records, each with an associated request probability, is dynamically maintained as a serial list. Successive requests are mutually independent. The list is reordered according to the move-to-front (MTF) rule: The requested record is moved to the front of the list. We derive the stationary