𝔖 Bobbio Scriptorium
✦   LIBER   ✦

A new regular grammar pattern matching algorithm

✍ Scribed by Bruce W. Watson


Publisher
Elsevier Science
Year
2003
Tongue
English
Weight
155 KB
Volume
299
Category
Article
ISSN
0304-3975

No coin nor oath required. For personal study only.

✦ Synopsis


This paper presents a Boyer-Moore type algorithm for regular grammar pattern matching, answering a variant of an open problem posed by Aho (Pattern Matching in Strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns speciΓΏed by regular (left linear) grammars-a generalization of the Boyer-Moore (single keyword) and Commentz-Walter (multiple keyword) algorithms.

Like the Boyer-Moore and Commentz-Walter algorithms, the new algorithm makes use of shift functions which can be precomputed and tabulated. The precomputation functions are derived, and it is shown that they can be obtained from Commentz-Walter's d1 and d2 shift functions.

In most cases, the Boyer-Moore (respectively, Commentz-Walter) algorithm has greatly outperformed the Knuth-Morris-Pratt (respectively, Aho-Corasick) algorithm. In practice, an earlier version of the algorithm presented in this paper also frequently outperforms the regular grammar generalization of the Aho-Corasick algorithm.


πŸ“œ SIMILAR VOLUMES


A new algorithm for linear regular tree
✍ Priti Shankar; Amitranjan Gantait; A.R. Yuvaraj; Maya Madhavan πŸ“‚ Article πŸ“… 2000 πŸ› Elsevier Science 🌐 English βš– 129 KB

We consider the problem of linear regular tree pattern matching and describe a new solution based on a bottom up technique. Current bottom up techniques preprocess the patterns and construct a ΓΏnite state tree pattern matching automaton for the purpose. Though matching time is linear in the size of

A Subquadratic Algorithm for Approximate
✍ S. Wu; U. Manber; E. Myers πŸ“‚ Article πŸ“… 1995 πŸ› Elsevier Science 🌐 English βš– 673 KB

The main result of this paper is an algorithm for approximate matching of a regular expression of size \(m\) in a text of size \(n\) in time \(O\left(n m / \log _{d+2} n\right)\), where \(d\) is the number of allowed errors. This algorithm is the first \(o(m n)\) algorithm for approximate matching t