𝔖 Scriptorium
✦   LIBER   ✦

πŸ“

String Processing and Information Retrieval: 30th International Symposium, SPIRE 2023, Pisa, Italy, September 26–28, 2023, Proceedings (Lecture Notes in Computer Science)

✍ Scribed by Franco Maria Nardini (editor), Nadia Pisanti (editor), Rossano Venturini (editor)


Publisher
Springer
Year
2023
Tongue
English
Leaves
409
Category
Library

⬇  Acquire This Volume

No coin nor oath required. For personal study only.

✦ Synopsis


This volume LNCS 14240 constitutes the refereed proceedings of the 30th International Symposium on String Processing and Information Retrieval, SPIRE 2023, held in Pisa, Italy, during September 26–28, 2023.

The 31 full papers presented were carefully reviewed and selected from 47 submissions. They cover topics such as: data structures; algorithms; constrained Substring complexity; data compression codes; succinct k-spectra; and LCP array of wheeler DFAs.

✦ Table of Contents


Preface
Organization
Abstracts ofΒ Invited Talks
Information Retrieval Needs More Theoreticians
Regular Expression Matching
Recent Results onΒ theΒ Longest Common Substring Problem
Contents
Longest Common Prefix Arrays for Succinct k-Spectra
1 Introduction
2 Preliminaries
3 Basic O(nk)-Time LCS Array Construction
4 Faster Construction via Super-Alphabet Techniques
5 Construction in Linear Time
5.1 Correctness
6 Experimental Evaluation
7 Concluding Remarks
References
On Suffix Tree Detection
1 Introduction
2 Preliminaries
3 Periodic Strings Suffix Tree Detection
4 Necessary Conditions on a Binary String Suffix Tree
5 Conclusion and Open Problems
References
Optimally Computing Compressed Indexing Arrays Based on the Compact Directed Acyclic Word Graph
1 Introduction
2 Preliminaries
3 Techniques
4 Computing Run-Length BWT
5 Computing Irreducible GLPF Arrays
References
Evaluating Regular Path Queries on Compressed Adjacency Matrices
1 Introduction and Related Work
2 Basic Concepts
2.1 Labeled Graphs and Regular Path Queries (RPQs)
2.2 An Algebra on Boolean Matrices
2.3 K2-Trees
3 Evaluating RPQs Through the Boolean Matrix Algebra
4 Implementation of the Boolean Matrix Algebra
4.1 Transposition
4.2 Boolean Sum
4.3 Boolean Multiplication
4.4 Closure
4.5 Restrictions
4.6 Query Plan
5 Experimental Results
5.1 A Baseline
5.2 Benchmark
6 Conclusions
References
Approximate Cartesian Tree Matching: An Approach Using Swaps
1 Introduction
2 Preliminaries
2.1 Basic Notations
2.2 Cartesian Tree Matching
2.3 Approximate Cartesian Tree Matching
3 Characterization of the Parent-Distance Tables When a Swap Occurs
4 Swap Graph of Cartesian Trees
5 Algorithms
5.1 An Algorithm Using the Parent-Distance Tables
5.2 An Aho-Corasick Based Algorithm
6 Perspectives
References
Optimal Wheeler Language Recognition
1 Introduction
2 Preliminaries
2.1 Infima and Suprema Strings
3 Recognizing Wheeler Languages
4 The Algorithm
4.1 A Parameterized Algorithm
5 A Matching Conditional Lower Bound
References
Approximation and Fixed Parameter Algorithms for the Approximate Cover Problem
1 Introduction
2 Preliminaries and Problem Definitions
3 NP-Hardness of the ACP
4 A Polynomial-Time Approximation Algorithm for the ACP
5 An FPT Algorithm for the ACP
6 Conclusions and Future Work
References
Data Structures for SMEM-Finding in the PBWT
1 Introduction
2 Preliminaries
3 Building Blocks for a PBWT Data Structure
3.1 Queries Needed to Support SMEM-Finding
3.2 Top Level: Mapping Structure
3.3 Second Level: Start Support
3.4 Third Level: Extend Support
3.5 -Encoded Divergence Array
4 Composite Data Structures for the PBWT
5 Experiments and Discussion
5.1 Comparison of Data Structures
5.2 Comparison of Methods on 1000 Genomes Project Data
6 Conclusions
References
Compressibility Measures for Two-Dimensional Data
1 Introduction
2 Two-Dimensional Compressibility Measures
2.1 The Measure 2D
3 Space Bounds for Two-Dimensional Block Trees
References
From de Bruijn Graphs to Variation Graphs – Relationships Between Pangenome Models
1 Introduction
2 Representing String Collections with Graphs
2.1 String Graphs
2.2 Representations of Collections of Strings
2.3 De Bruijn Graphs
2.4 Variation Graphs
3 Graph Transformation
3.1 Transition Graphs
3.2 Transformation 1: Split
3.3 Transformation 2: Merge
3.4 Transformation 3: Collapse
3.5 Correctness of the Algorithm
4 Conclusion
References
CAGE: Cache-Aware Graphlet Enumeration
1 Introduction
1.1 Results
2 Baseline Algorithm
3 The CAGE Algorithm
3.1 Addressing a Hard Combinatorial Limit
3.2 Pseudocode
3.3 Exploiting What is Already Cached
3.4 Remarks
4 Experimental Results
4.1 Environment and Dataset
4.2 CAGE Implementation and Comparison Methodology
4.3 Cache Analysis
4.4 Running Time Analysis
5 Conclusions
References
Space-Time Trade-Offs for the LCP Array of Wheeler DFAs
1 Introduction
2 Definitions
3 Wheeler DFAs
4 A Space-Time Trade-Off for the LCP Array
5 Applications
References
Computing All-vs-All MEMs in Grammar-Compressed Text
1 Introduction
2 Preliminaries
3 Definitions
4 Overview of Our Algorithm
5 Building the Fix-Free Grammar
6 Computing prMEMs in the Fix-Free Grammar
7 Positioning MEMs in the Text
8 Concluding Remarks
References
Sublinear Time Lempel-Ziv (LZ77) Factorization
1 Introduction
2 Preliminaries
3 Algorithm for 3-Approximate LZ-Like Factorization
3.1 Computing Longest Previous Factors of Sample Positions
3.2 Computing a Gapped Factorization
4 Algorithm for Exact LZ Factorization
4.1 Computing the Exact LZ Factorization
5 Computing the Non-overlapping LZ Factorization
References
New Advances in Rightmost Lempel-Ziv
1 Introduction
2 Preliminaries
3 Computing Rightmost LZ-End Parsings
4 Partially Solving Rightmost LZ-Like Parsings
4.1 Long Phrases
4.2 Arbitrary Subsets of Phrases
4.3 Infrequent Phrases
4.4 Close Phrases
References
Engineering a Textbook Approach to Index Massive String Dictionaries
1 Introduction
2 Background
3 Our Two-Level Approach
3.1 Storage Level
3.2 Indexing Level
4 Experiments
4.1 Indexing Level Evaluation
4.2 Two-Level Approach Evaluation
5 Conclusions and Future Work
References
Count-Min Sketch with Variable Number of Hash Functions: An Experimental Study
1 Introduction
2 Background and Related Work
2.1 Conservative Count-Min: Definitions
2.2 Analysis of Conservative Count-Min: Prior Works
2.3 Hash Hypergraph
2.4 Hypergraph Peelability and Phase Transition of Error
2.5 Variable Number of Hash Functions: Mixed Hypergraphs
3 Results
3.1 Uniform Distribution
3.2 Step Distribution
3.3 Zipf's Distribution
4 Conclusions
References
Dynamic Compact Planar Embeddings
1 Introduction
1.1 Our Contribution
2 Related Work
3 Preliminaries
3.1 Notation
3.2 Dynamic Bitvectors
3.3 Planar Graph Traversal
3.4 Dynamic Succinct Euler-Tour Trees
4 Data Structure and the Marker Model
4.1 Data Structure
4.2 The Marker Model
4.3 Navigation
5 Dynamization
5.1 Inserting an Edge
5.2 Deleting an Edge
References
A Simple Grammar-Based Index for Finding Approximately Longest Common Substrings
1 Introduction
2 Preliminaries
3 Data Structure
4 Queries
5 Faster Queries
References
On the Number of Factors in the LZ-End Factorization
1 Introduction
2 Preliminaries
3 Our Result
References
Non-overlapping Indexing in BWT-Runs Bounded Space
1 Introduction and Related Work
2 Preliminaries
2.1 Rank and Select
2.2 Suffix Array
2.3 Burrows–Wheeler Transform
2.4 The r-Index and Some Related Results
3 The Data Structures
3.1 An O(rlog(n/r)) Space Solution
3.2 An O(r+rR) Space Solution
3.3 Our Final O(r) Space Solution
4 Open Problems
References
Efficient Parameterized Pattern Matching in Sublinear Space
1 Introduction
2 Preliminaries
2.1 Parameterized Matching Problem
2.2 Periodicity of Parameterized Strings
3 Properties of Parameterized Periods
3.1 Alternative Periodicity Lemma
3.2 Prefix Periods
4 Proposed Algorithm
4.1 Pattern Preprocessing
4.2 Searching for Parameterized Matches
5 Conclusion and Future Work
References
Largest Repetition Factorization of Fibonacci Words
1 Introduction
2 Preliminaries
3 Candidate of Largest Factorizations
4 Maximality of Candidate Repetition Factorizations
4.1 Upper Bound of the Length of Factors
4.2 Analysis of Candidate Factorizations
References
String Covers of a Tree Revisited
1 Introduction
2 Preliminaries
3 Directed Tree Cover
3.1 Gaps
3.2 Binarisation and Path Compaction
3.3 Updates
4 Undirected Tree Cover
4.1 Match Tables
4.2 Complexity and Correctness
References
Compacting Massive Public Transport Data
1 Introduction
2 Background
3 Previous Concepts on Transport Networks
4 Our Proposal
5 Improving Our Proposal
6 Supported Queries
6.1 Obtaining the Types of Trips to Travel from the Origin Stop So to the Destination Stop Sd (getTrips)
6.2 Obtaining the Total Number of People Who Made a Trip Starting at So and Ending at Sd (getPeople)
6.3 Obtaining All the Origin and Destination Stops that Uses St as Transfer Stop (getOriginDestinations)
7 Experimental Evaluation
7.1 Input Data
7.2 Used Methods
7.3 Results
8 Conclusions and Future Work
References
Constant Time and Space Updates for the Sigma-Tau Problem
1 Introduction
2 Constant Time Successor Rule for Hamilton Paths
3 Constant Time Successor Rule for Hamilton Cycles
4 Simpler Rule for Hamilton Paths and Termination
References
Linear-Time Computation of Generalized Minimal Absent Words for Multiple Strings
1 Introduction
2 Preliminaries
3 The DAWG Data Structure
4 Algorithm Overview for k = 2
5 Skip Links for k = 2
5.1 When B= 10
5.2 When B= 11
5.3 Our Main Result for k = 2
6 Algorithm for Arbitrary k > 2
7 Discussions
References
Frequency-Constrained Substring Complexity
1 Introduction
2 The Data Structure
References
Chaining of Maximal Exact Matches in Graphs
1 Introduction
2 Preliminaries
3 Finding MEMs in Labeled DAGs
4 Symmetric Co-Linear Chaining in Labeled DAGs
4.1 DAG Chaining with Node MEMs
4.2 Revisiting Symmetric String-to-string Chaining with MEMs
4.3 Integration of Symmetry to DAG Chaining
5 Discussion
References
Algorithms and Hardness for the Longest Common Subsequence of Three Strings and Related Problems
1 Introduction
2 Preliminaries
3 Improved Algorithms for Longest Cubic Subsequence
3.1 The Complementary Problem
3.2 Extending the LCS-2 Algorithm by Nakatsu et al. to LCS-3
4 LCS-3 Parameterized by the Number of Matching Triples
5 Algorithm and Hardness Result for Longest Majority t-Pseudo-Subsequence and Its Relative
5.1 LMtPS+ is NP-Complete
5.2 LMtPS is Polynomially Solvable
6 Concluding Remarks
References
Binary Mixed-Digit Data Compression Codes
1 Introduction
2 Codes Definition. Straightforward Decoding and Encoding
3 Searching the Optimal Code
4 Fast Decoding
5 Experimental Results
6 Conclusions
References
Author Index


πŸ“œ SIMILAR VOLUMES


String Processing and Information Retrie
✍ Mario A. Nascimento (editor), Edleno S. de Moura (editor), Arlindo L. Oliveira ( πŸ“‚ Library πŸ“… 2003 πŸ› Springer 🌐 English

<span>This volume of the Lecture Notes in Computer Science series provides a c- prehensive, state-of-the-art survey of recent advances in string processing and information retrieval. It includes invited and research papers presented at the 10th International Symposium on String Processing and Inform

String Processing and Information Retrie
✍ Diego Arroyuelo (editor), Barbara Poblete (editor) πŸ“‚ Library πŸ“… 2022 πŸ› Springer 🌐 English

<span>This book constitutes the refereed proceedings of the 29th International Symposium on String Processing and Information Retrieval, SPIRE 2022, held in ConcepciΓ³n, Chile, in November 2022.<br></span><p><span>The 23 full papers presented in this volume were carefully reviewed and selected from 4

String Processing and Information Retrie
✍ Alberto H.F. Laender (editor), Arlindo L. Oliveira (editor) πŸ“‚ Library πŸ“… 2002 πŸ› Springer 🌐 English

<span>This volume of the Lecture Notes in Computer Science series provides a c- prehensive, state-of-the-art survey of recent advances in string processing and information retrieval. It includes invited and research papers presented at the 9th International Symposium on String Processing and Informa

String processing and information retrie
✍ Fici, Gabriele; Sciortino, Marinella; Venturini, Rossano πŸ“‚ Library πŸ“… 2017 πŸ› Springer 🌐 English

<p><p>This book constitutes the proceedings of the 24th International Symposium on String Processing and Information Retrieval, SPIRE 2017, held in Palermo, Italy, in September 2017. <br/>The 26 papers presented in this volume were carefully reviewed and selected from 71 submissions. They focus on f

String Processing and Information Retrie
✍ Erik D. Demaine (auth.), Roberto Grossi, Fabrizio Sebastiani, Fabrizio Silvestri πŸ“‚ Library πŸ“… 2011 πŸ› Springer-Verlag Berlin Heidelberg 🌐 English

<p>This book constitutes the proceedings of the 18th International Symposium on String Processing and Information Retrieval, SPIRE 2011, held in Pisa, Italy, in October 2011. <br>The 30 long and 10 short papers together with 1 keynote presented were carefully reviewed and selected from 102 submissio

String Processing and Information Retrie
✍ Erik D. Demaine (auth.), Roberto Grossi, Fabrizio Sebastiani, Fabrizio Silvestri πŸ“‚ Library πŸ“… 2011 πŸ› Springer-Verlag Berlin Heidelberg 🌐 English

<p>This book constitutes the proceedings of the 18th International Symposium on String Processing and Information Retrieval, SPIRE 2011, held in Pisa, Italy, in October 2011. <br>The 30 long and 10 short papers together with 1 keynote presented were carefully reviewed and selected from 102 submissio