๐”– Bobbio Scriptorium
โœฆ   LIBER   โœฆ

Pattern Matching in Hypertext

โœ Scribed by Amihood Amir; Moshe Lewenstein; Noa Lewenstein


Publisher
Elsevier Science
Year
2000
Tongue
English
Weight
142 KB
Volume
35
Category
Article
ISSN
0196-6774

No coin nor oath required. For personal study only.

โœฆ Synopsis


The importance of hypertext has been steadily growing over the past decade. The Internet and other information systems use hypertext format, with data organized associatively rather than sequentially or relationally. A myriad of textual problems have been considered in the pattern matching field with many nontrivial results. Nevertheless, surprisingly little work has been done on the natural combination of pattern matching and hypertext. In contrast to regular text, hypertext has a nonlinear structure and the techniques of pattern matching for text cannot be ลฝ directly applied to hypertext. Manber and Wu 1992, ''IAPR Workshop on Struc-. tural and Syntactic Pattern Recognition, Bern, Switzerland'' pioneered the study of pattern matching in hypertext and defined a hypertext model for pattern ลฝ matching. Akutsu 1993, ''Procedures of the 4th Symposium on Combinatorial . Pattern Matching, Podova, Italy,'' pp. 1แސ10 developed an algorithm that can be used for exact pattern matching in a tree-structured hypertext. Park and Kim ลฝ . 1995, ''6th Symposium on Combinatorial Pattern Matching, Helsinki, Finland'' considered regular pattern matching in hypertext. They developed a complex algorithm that works for hypertext with an underlying structure of a DAG. In this paper we present a much simpler algorithm achieving the same complexity which runs on any hypertext graph. We then extend the problem to approximate pattern matching in hypertext, first considering hamming distance and then edit distance. We show that in contrast to regular text, it does make a difference whether the errors occur in the hypertext or the pattern. The approximate pattern matching problem in hypertext with errors in the hypertext turns out to be N N P P-complete and the approximate pattern matching problem in hypertext with errors in the pattern has a polynomial time solution.


๐Ÿ“œ SIMILAR VOLUMES


Inverse Pattern Matching
โœ Amihood Amir; Alberto Apostolico; Moshe Lewenstein ๐Ÿ“‚ Article ๐Ÿ“… 1997 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 181 KB

Let a textstring T of n symbols from some alphabet โŒบ and an integer mn be given. A pattern P of length m over โŒบ is sought such that P minimizes ลฝ . alternatively, maximizes the total number of pairwise character mismatches generated when P is compared with all m-character substrings of T. Two additi

Fastest Pattern Matching in Strings
โœ L. Colussi ๐Ÿ“‚ Article ๐Ÿ“… 1994 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 850 KB

An algorithm is presented that substantially improves the algorithm of Boyer and Moore for pattern matching in strings, both in the worst case and in the average. Both the Boyer and Moore algorithm and the new algorithm assume that the characters in the pattern and in the text are taken from a given

Pattern matching in historical data
โœ Michael C. Johannesmeyer; Ashish Singhal; Dale E. Seborg ๐Ÿ“‚ Article ๐Ÿ“… 2002 ๐Ÿ› American Institute of Chemical Engineers ๐ŸŒ English โš– 581 KB
Pattern Matching with Swaps
โœ Amihood Amir; Yonatan Aumann; Gad M. Landau; Moshe Lewenstein; Noa Lewenstein ๐Ÿ“‚ Article ๐Ÿ“… 2000 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 146 KB

Let a text string T of n symbols and a pattern string P of m symbols from alphabet be given. A swapped version T of T is a length n string derived from T by a series of local swaps (i.e., t โ† t +1 and t +1 โ† t ), where each element can participate in no more than one swap. The pattern matching with

Trends, fashions, patterns, norms, conve
โœ Einat Amitay ๐Ÿ“‚ Article ๐Ÿ“… 2000 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 207 KB

This article describes the theoretical approach behind the InCommonSense system. This approach makes use of writing conventions on the Web. The theory behind InCommonSense is based on research findings from the fields of linguistics, psychology, HCI, sociology, and information retrieval. The theoret