String Processing and Information Retrieval: 9th International Symposium, SPIRE 2002, Lisbon, Portugal, September 11-13, 2002 Proceedings (Lecture Notes in Computer Science, 2476)
Edited by Alberto H.F. Laender, Arlindo L. Oliveira
- Publisher: Springer
- Year: 2002
- Language: English
- Pages: 351
- Category: Library
✦ Synopsis
This volume of the Lecture Notes in Computer Science series provides a comprehensive, state-of-the-art survey of recent advances in string processing and information retrieval. It includes invited and research papers presented at the 9th International Symposium on String Processing and Information Retrieval, SPIRE 2002, held in Lisbon, Portugal. SPIRE has its origins in the South American Workshop on String Processing, which was first held in Belo Horizonte, Brazil, in 1993. Starting in 1998, the focus of the workshop was broadened to include the area of information retrieval due to its increasing relevance and its inter-relationship with the area of string processing. The call for papers for SPIRE 2002 resulted in the submission of 54 papers from researchers around the world. Of these, 19 were selected for inclusion in the program (an acceptance rate of 35%). In addition, the Program Committee decided to accept six other papers, considered as describing interesting ongoing research, in the form of short papers. The authors of these 25 papers came from 18 different countries (Argentina, Australia, Brazil, Canada, Czech Republic, Chile, Colombia, Finland, France, Germany, Japan, Italy, Mexico, Saudi Arabia, Switzerland, Spain, United Kingdom, and USA).
✦ Table of Contents
String Processing and Information Retrieval
Preface
SPIRE 2002 Organization
Table of Contents
The DBLP Computer Science Bibliography: Evolution, Research Issues, Perspectives
History
The Beginning
Two Very Early Decisions
Early Recognition
SIGMOD Anthology
Sponsor Found
Undergraduate Software Labs
Technical Background
HTML TOCs
Mirrors
Simple Search
XML Records
BHT (Bibliography HyperText)
MG
Anthology Full Text Search
Research Issues
Person Names
Person Search
Normalization
Rating Functions
DBLP Browser
Perspectives
References
From Searching Text to Querying XML Streams
Databases, Text, and XML
XML Stream Processing
XML Stream Processing Techniques
Processing with NFAs
Processing with DFAs
Analyzing the Size of the DFA
Processing XML Streams with Lazy DFAs
Validation of the Size of the Lazy DFA
The Throughput of Lazy DFAs
Conclusions
References
String Matching Problems from Bioinformatics Which Still Need Better Solutions (Extended Abstract)
Introduction
Identifying α-Helices
Matching Profiles or Probabilistic Sequences
Optimal Exact String Matching Based on Suffix Arrays
Introduction
Basic Notions
The lcp-Intervals of a Suffix Array
The Enhanced Suffix Array
Construction of the Child-Table
Determining Child Intervals in Constant Time
Answering Queries in Optimal Time
Implementation Details
Experimental Results
References
Faster String Matching with Super-Alphabets
Introduction
Preliminaries
Super-Alphabet Simulation of SA
Applicability
Approximate Matching
Pattern Matching in Compressed Text
Shift-Or
Experimental Results
Conclusions
References
On the Size of DASG for Multiple Texts
Introduction
DASG for One Text
DASG for a Set of Texts
Lower Bound for the Number of States
Building DASG
Conclusion
References
Sorting by Prefix Transpositions
Introduction
Definitions
Approximation Algorithms
Approximation Algorithm with Factor 3
Approximation Algorithm with Factor 2
The Diameter of Prefix Transpositions
Tests
Permutations that Satisfy the Breakpoint Lower-Bound
Conclusions
References
Efficient Computation of Long Similar Subsequences
Introduction
Background
Finding Long Alignments with High Ordinary Score
Finding Long Alignments with High Normalized Score
Implementation and Test Results
Conclusion
References
Stemming Galician Texts
Introduction
Stemming Strategies
Stemming for Documents Previous to 1977
Galician Stemming Algorithm
The Exploitation of the Galician Stemming Algorithm
Conclusions and Future Work
References
Firing Policies for an Arabic Rule-Based Stemmer
Introduction
Rule-Based Arabic Stemmer
Firing Policies
Conclusions
References
Enhancing the Set-Based Model Using Proximity Information
Introduction
Set-Based Model Revisited
Termset Weights
Similarity Calculation
Query Mechanisms
Accounting for Proximity Information in Closed Termsets
Closed Termsets
Proximity Information
Algorithm Description
Experimental Results
Retrieval Performance
Computational Efficiency
Final Remarks
References
Web Structure, Dynamics and Page Quality
Introduction
Previous Work
Relations to the Web Structure
Link-Based Ranking and Age
An Age Based Pagerank
Conclusions
References
A Theoretical Analysis of Google's PageRank
Introduction
Related Works
The Intuitive Definition
Mathematical Aspects
A General Formula for PageRank
Deducing Classic PageRank
Resolving Minor Problems
Building a Personalized PageRank
A Comparison between Two Different Formulations of Classic PageRank
On the Dependence of Classic PageRank on the Damping Factor
Conclusion
References
Machine Learning Approach for Homepage Finding Task
Introduction
Related Work
Link Analysis
Combining Different Sources of Evidence
Homepage Finding Task
Problem Context
Baseline IR System, Collection Analysis, and Initial Results
Decision Tree Model
Logistic Regression Model
Testing Query Results and Discussion
Conclusion and Future Research Work
Acknowledgements
References
Tree Pattern Matching for Linear Static Terms
Tree Pattern Matching and Linear Terms
The Algorithm
Experimental Results
Conclusion
References
Processing Text Files as Is: Pattern Matching over Compressed Texts, Multi-byte Character Texts, and Semi-structured Texts
Introduction
Preliminaries
Prefix Code
Aho-Corasick Pattern Matching Machine
Pattern Matching over Variable-Length Encoded Texts
PMM Construction Algorithm
Correctness of the Algorithm
Applications
Comparing with BM Algorithm Followed by Quick Verification
Generalization of Prefix Codes
Generalized Prefix Codes
Applications
Applications to Query Processing for XML Documents
XML Documents and What They Represent
Processing XML Documents as Is
Experimental Results
Conclusion
References
Pattern Matching over Multi-attribute Data Streams
Introduction
Concepts and Notation
Our Pattern Matching Algorithms
Precalculation for Algorithm R2L
Precalculation for Algorithm L2R
Comparison Results
Conclusion
References
Java MARIAN: From an OPAC to a Modern Digital Library System
Introduction
Java MARIAN Design Principles
MARIAN Digital Library API
Java MARIAN Architecture and Implementations
The Database Layer
The Search Layer
Webgate and User Information Layer
Data Analysis, Collection Builders & Loading Tools
Conclusions and Future Work
References
A Framework for Generating Attribute Extractors for Web Data Sources
Introduction
Attribute Extractors
Prefix-Data-Suffix Extractors
PDS Tree
Guiders
Putting All Together
Experimental Results
Related Work
Conclusions and Future Work
References
Multiple Example Queries in Content-Based Image Retrieval
Introduction
Background
Image Features and Distance Measures
Multiple-Example Queries
Combining Functions
Experiments
Results
Conclusion and Future Work
References
Focussed Structured Document Retrieval
Introduction
The Approach
Structured Test Collection
Experiments and Results
Conclusions
Acknowledgements
References
Towards a More Comprehensive Comparison of Collaborative Filtering Algorithms
Introduction
Collaborative Filtering Algorithms
Experimental Procedure
Results
Conclusions
References
Fully Dynamic Spatial Approximation Trees
Introduction
The Spatial Approximation Tree
Construction
Searching
Incremental Construction
Insertion
Searching
Deletions
Fake Nodes
Reinserting Subtrees
Combining Both Methods
Experimental Comparison
Conclusions
References
String Matching with Metric Trees Using an Approximate Distance
Introduction
Background
The M-tree
Proposed Solution
How Much Do We Save?
Generalizing the Approach
Experimental Results
Conclusions
References
Probabilistic Proximity Searching Algorithms Based on Compact Partitions
Introduction
Basic Concepts
Pivot-Based Algorithms
Algorithms Based on Compact Partitions
Spatial Approximation Tree
List of Clusters
Probabilistic Algorithms for Proximity Searching
Our Approach
Probabilistic Incremental Search
Ranking of Zones
Experimental Results
Conclusions
References
t-Spanners as a Data Structure for Metric Space Searching
Introduction
Previous Work
Our Proposal
Experimental Results
Strings under Edit Distance
Documents under Cosine Distance
Conclusions
References
Compact Directed Acyclic Word Graphs for a Sliding Window
Introduction
Compact Directed Acyclic Word Graphs
Definitions
On-Line Algorithm to Construct CDAWGs
Suffix Trees for a Sliding Window
CDAWGs for a Sliding Window
Edge Deletion
Maintaining the Structure of CDAWG
Detecting DelPoint(u)
On Buffer Size
Keeping Edge Labels Valid
Conclusion
References
Indexing Text Using the Ziv-Lempel Trie
Introduction
Ziv-Lempel Compression
Basic Technique
Data Structures
Search Algorithm
A Succinct Index Representation
Space and Time Complexity
Conclusions
References
Author Index
SIMILAR VOLUMES
This volume contains the papers presented at the 13th International Symposium on String Processing and Information Retrieval (SPIRE), held October 11-13, 2006, in Glasgow, Scotland. The SPIRE annual symposium provides an opportunity for both new and established researchers to present original
This volume of the Lecture Notes in Computer Science series provides a comprehensive, state-of-the-art survey of recent advances in string processing and information retrieval. It includes invited and research papers presented at the 10th International Symposium on String Processing and Inform
This book constitutes the refereed proceedings of the 29th International Symposium on String Processing and Information Retrieval, SPIRE 2022, held in Concepción, Chile, in November 2022. The 23 full papers presented in this volume were carefully reviewed and selected from 4
This volume LNCS 14240 constitutes the refereed proceedings of the 30th International Symposium on String Processing and Information Retrieval, SPIRE 2023, held in Pisa, Italy, during September 26-28, 2023. The 31 full papers presented were carefully reviewed and selecte
The papers contained in this volume were presented at the 11th Conference on String Processing and Information Retrieval (SPIRE), held Oct. 5-8, 2004 at the Department of Information Engineering of the University of Padova, Italy. They were selected from 123 papers submitted in response to the ca