<p><span>This volume LNCS 14240 constitutes the refereed proceedings of the 30th International Symposium on String Processing and Information Retrieval, SPIRE 2023, held in Pisa, Italy, during September 26–28, 2023. </span></p><p><span>The 31 full papers presented were carefully reviewed and selecte
String Processing and Information Retrieval: 29th International Symposium, SPIRE 2022, Concepción, Chile, November 8–10, 2022, Proceedings (Lecture Notes in Computer Science)
✍ Scribed by Diego Arroyuelo (editor), Barbara Poblete (editor)
- Publisher
- Springer
- Year
- 2022
- Tongue
- English
- Leaves
- 337
- Category
- Library
No coin nor oath required. For personal study only.
✦ Synopsis
This book constitutes the refereed proceedings of the 29th International Symposium on String Processing and Information Retrieval, SPIRE 2022, held in Concepción, Chile, in November 2022.
The 23 full papers presented in this volume were carefully reviewed and selected from 43 submissions. They cover topics such as: data structures; algorithms; information retrieval; compression; combinatorics on words; and computational biology.
✦ Table of Contents
Preface
Organization
Abstracts of Invited Talks
De Bruijn Graphs: Solving Biological Problems in Small Space
LZ-End Parsing: Upper Bounds and Algorithmic Techniques
Contents
String Algorithms
Subsequence Covers of Words
1 Introduction
2 Testing if a Word is an s-Cover
3 Maximal Lengths of s-Primitive Words
3.1 Ternary Alphabet
3.2 General Alphabet
3.3 Behaviour of the Function (k) for Small k
4 Computing s-Covers
5 The Number of Distinct Shortest s-Covers
6 Final Remarks
References
Maximal Closed Substrings
1 Introduction
2 Preliminaries
3 A Bound on the Number of MCS
4 An Algorithm for Locating All MCS
References
Online Algorithms for Finding Distinct Substrings with Length and Multiple Prefix and Suffix Conditions
1 Introduction
2 Preliminaries and Definitions
2.1 Strings
2.2 Suffix Array and LCP Array
2.3 The Problems
3 Algorithm
3.1 Sketch of Algorithm
3.2 Removing Redundant Elements
3.3 Detecting P and S Occurrences
3.4 Maintaining pList
3.5 Computing excludeLeft, excludeRight, and start
3.6 Summarizing the Algorithm
4 Applying the Algorithm for Traffic Classification
5 Conclusion and Future Work
A Appendix
References
The Complexity of the Co-occurrence Problem
1 Introduction
1.1 Our Results
1.2 Techniques
2 The Left-Minimal Co-occurrence Problem
2.1 Main Idea
2.2 Data Structure
3 The Co-occurrence Problem
4 Lower Bounds
4.1 The Increment Gadget
4.2 Lower Bound on Space
4.3 Lower Bound on Space in Terms of and
A Preprocessing
B Lower Bound on Time
References
String Data Structures
Reconstructing Parameterized Strings from Parameterized Suffix and LCP Arrays
1 Introduction
2 Preliminaries
3 Algorithm
3.1 Step 1: Left-Table
3.2 Step 2: Reconstruct Point of Mismatch
3.3 Step 3: Right-Table
3.4 Step 4 Reconstruct P-String
3.5 Step 5 Verify Output
3.6 Proofs of Correctness and Efficiency
A Appendix
References
Computing the Parameterized Burrows–Wheeler Transform Online
1 Introduction
2 Preliminaries
2.1 Parameterized Burrows–Wheeler Transform
3 Computing pBWT Online
3.1 Step 1: UpdateLF Computes LT[i] and FT[i]
3.2 Step 2: InsertRow Computes LT and FT
3.3 Step 3: Updating LCP by UpdateLCP
A Proofs
References
Accessing the Suffix Array via -1-Forest
1 Introduction
2 Preliminaries
3 Access Data Structures to SA
3.1 Access via -1-Graph
3.2 Access via -1-Forest
4 Experiments
5 Conclusion
References
On the Optimisation of the GSACA Suffix Array Construction Algorithm
1 Introduction
2 Preliminaries
3 GSACA
3.1 Phase II
3.2 Phase I
4 Experiments
A Proofs
References
String Compression
Balancing Run-Length Straight-Line Programs
1 Introduction
2 Terminology
2.1 Strings
2.2 Straight-Line Programs
2.3 Directed Acyclic Graph of an SLP
2.4 Run-Length Straight-Line Programs
3 Balancing Run-Length Straight-Line Programs
4 Substring Range Operations in O(grl) Space
4.1 Karp-Rabin Fingerprints
4.2 Range Minimum Queries
4.3 More General Functions
5 Conclusion
A PSV and NSV Queries
References
Substring Complexities on Run-Length Compressed Strings
1 Introduction
2 Preliminaries
3 Connection Between ST(k) and the Runs-Suffix-Trie
4 Algorithm
References
Information Retrieval
How Train–Test Leakage Affects Zero-Shot Retrieval
1 Introduction
2 Background and Related Work
3 Identifying Leaking Queries
4 Experimental Analysis
5 Conclusion
References
Computational Biology
Genome Comparison on Succinct Colored de Bruijn Graphs
1 Introduction
2 Definitions and Notation
2.1 Burrows-Wheeler Similarity Distribution
2.2 Succinct de Bruijn Graphs
3 gcBB – Genome Comparison Using BOSS and BWSD
4 Experiments
5 Conclusions and Future Work
References
Sorting Genomes by Prefix Double-Cut-and-Joins
1 Introduction
1.1 Permutations, Genomes, and Rearrangements
1.2 Problems
2 A Generic Lower Bounding Technique
2.1 The Signed Case
2.2 The Unsigned Case
3 Prefix DCJs
3.1 Signed Prefix DCJs
3.2 Unsigned Prefix DCJs
4 Conclusions and Future Work
References
KATKA: A KRAKEN-Like Tool with k Given at Query Time
1 Introduction
2 Design
3 Queries
4 Future Work
References
Computing All-vs-All MEMs in Run-Length-Encoded Collections of HiFi Reads
1 Introduction
2 Preliminaries
3 Our Contribution
3.1 Definitions
3.2 Description of the Problem
4 Bi-directional BWT and DNA Reverse Complements
4.1 Homopolymer Errors and MEM Sequences
4.2 Computing MEMs in Compressed Space
4.3 Improving the Time Complexity for Reporting MEMs
5 Concluding Remarks
References
Space-Efficient Data Structures
Internal Masked Prefix Sums and Its Connection to Fully Internal Measurement Queries
1 Introduction
2 Preliminaries
3 Data Structures for Masked Prefix Sum
3.1 Time-Space Trade-off
3.2 A Conditional Lower Bound
3.3 Dynamic Masked Prefix Sum
3.4 Approximate Masked Prefix Sum
3.5 Parallel Algorithms
4 Data Structures for Sparse Internal Inner Product
4.1 Static Sparse Internal Inner Product
4.2 Dynamic Sparse Internal Inner Product
5 The Connections Between the Problems and the Internal Measurements
A Details Omitted from Sect.3
B Details Omitted from Sect.5
References
Compressed String Dictionaries via Data-Aware Subtrie Compaction
1 Introduction
2 A Motivating Example
3 CoCo-trie: Compressed Collapsed Trie
3.1 Compressed Encoding of Collapsed Subtries
3.2 On the Choice of the Subtries to Collapse
3.3 A Pool of Succinct Encoding Schemes
3.4 Squeezing the Universe of Branching Labels
3.5 On the Space-Time Trade-Off
3.6 Trie Operations
4 Experiments
5 Conclusions and Future Work
A Calculations for the Motivating Example of Sect.2
B Proof of Theorem 1
References
.26em plus .1em minus .1emOn Representing the Degree Sequences of Sublogarithmic-Degree Wheeler Graphs
1 Introduction
2 Intuition
3 Proof of Lemma 2
4 Postscript
References
Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors
1 Introduction and Related Work
2 Preliminaries
3 Space Efficient Rank and Select Data Structures
3.1 CS-Poppy: Rank-Based Rank and Select Data Structure
3.2 Flat-Popcount: Storing More Information Wasting No Bits
3.3 Wide-Popcount: Faster Rank
4 Experimental Evaluation
5 Conclusion
A Additional Experimental Results
A.1 Comparison of Our Implementations Only
A.2 Construction Times
A.3 Answering Both Queries
References
Pattern Matching on Strings, Graphs, and Trees
Matching Patterns with Variables Under Edit Distance
1 Introduction
2 Preliminaries
3 Our Results
4 Conclusion
References
On the Hardness of Computing the Edit Distance of Shallow Trees
1 Introduction
2 Preliminaries
3 Lower Bound Conditioned on the APSP Hypothesis
4 Lower Bound for Decomposition Algorithms
References
Quantum Time Complexity and Algorithms for Pattern Matching on Labeled Graphs
1 Introduction
1.1 Quantum Computing and Input Model
1.2 NC-QSETH
1.3 Our Results
2 Reduction from LCS to PMLG
2.1 Hardness of PMLG over Binary Alphabet
3 Quantum Algorithm for PMLG
4 Discussion
References
Pattern Matching Under DTW Distance
1 Introduction
2 Preliminaries
3 Main Result: O(kmn)-Time Algorithm
4 Approximation Algorithm
5 Experiments
A
B
References
Author Index
📜 SIMILAR VOLUMES
<span>This volume of the Lecture Notes in Computer Science series provides a c- prehensive, state-of-the-art survey of recent advances in string processing and information retrieval. It includes invited and research papers presented at the 9th International Symposium on String Processing and Informa
<span>This volume of the Lecture Notes in Computer Science series provides a c- prehensive, state-of-the-art survey of recent advances in string processing and information retrieval. It includes invited and research papers presented at the 10th International Symposium on String Processing and Inform
<span>This book constitutes the proceedings of the 17th International Symposium on String Processing and Information Retrieval, SPIRE 2010, held in Los Cabos, Mexico, in October 2010.<br>The 26 long and 13 short papers presented were carefully reviewed and selected from 109 submissions. The volume a