Fastest Pattern Matching in Strings
β Scribed by L. Colussi
- Publisher
- Elsevier Science
- Year
- 1994
- Tongue
- English
- Weight
- 850 KB
- Volume
- 16
- Category
- Article
- ISSN
- 0196-6774
No coin nor oath required. For personal study only.
β¦ Synopsis
An algorithm is presented that substantially improves the algorithm of Boyer and Moore for pattern matching in strings, both in the worst case and in the average. Both the Boyer and Moore algorithm and the new algorithm assume that the characters in the pattern and in the text are taken from a given alphabet (\Sigma) of finite size. The new algorithm performs (2 n) character comparisons in the worst case while the Boyer and Moore algorithm requires (3 n) comparisons [4]; the new algorithm requires fewer comparisons than Boyer and Moore on the average (but for the case of a binary alphabet, where the two algorithms perform roughly the same). For large patterns the ratio between the average number of comparisons for Boyer and Moore algorithm and the average number of comparisons for the new algorithm is close to the size (|\Sigma|) of the alphabet. As a shortcoming of the new algorithm, the preprocessing of the pattern requires (O(m)) time on the average but (O\left(m^{2}\right)) in the worst case. A mixed strategy between the two algorithms is suggested in order to make the preprocessing linear, at the expense of a slightly less efficient performance of the algorithm. The new algorithm has been obtained from the Boyer and Moore algorithm using a correctness proof (in the style of Hoare's axiomatic semantics) as a tool to improve algorithms. 1994 Academic Press, Inc.
π SIMILAR VOLUMES
The importance of hypertext has been steadily growing over the past decade. The Internet and other information systems use hypertext format, with data organized associatively rather than sequentially or relationally. A myriad of textual problems have been considered in the pattern matching field wit
Searching for long DNA strings is studied. A q-gram variation of the Boyer-Moore algorithm is considered. An alphabet transformation with precomputed tables is utilized to reduce the processing time. Experimental results show that the new algorithm is efficient in practice.
Various string matching algorithms have been designed and some experimental work on string matching over bounded alphabets has been performed, but string matching over unbounded alphabets has been little investigated. We present here experimental results where symbols are taken among potentially inf