✦ LIBER ✦

On spaced seeds for similarity search

✍ Scribed by Uri Keich; Ming Li; Bin Ma; John Tromp

Book ID: 104294282
Publisher: Elsevier Science
Year: 2004
Tongue: English
Weight: 235 KB
Volume: 138
Category: Article
ISSN: 0166-218X
DOI: 10.1016/s0166-218x(03)00382-2


✦ Synopsis

Genomics studies routinely depend on similarity searches based on the strategy of finding short seed matches (contiguous k bases) which are then extended. The particular choice of the seed length, k, is determined by the tradeoff between search speed (larger k reduces chance hits) and sensitivity (smaller k finds weaker similarities). A novel idea of using a single deterministic optimized spaced seed was introduced to the above similarity search process in Ma et al. (Bioinformatics (2002) 18), and it was empirically demonstrated that the optimal spaced seed quadruples the search speed without sacrificing sensitivity. Multiple, randomly spaced patterns, spaced q-grams, and spaced probes were also studied in Califano and Rigoutsos (Technical Report, IBM T.J. Watson Research Center (1995)), Burkhardt and Kärkkäinen (CPM (2001)), and Buhler (Bioinformatics 17 (2001) 419), and in other applications (RECOMB (1999) 295, RECOMB (2000) 245). They were all found to be better than their contiguous counterparts. In this paper we study some of the theoretical and practical aspects of optimal seeds. In particular, we demonstrate that the commonly used contiguous seed is in some sense the worst one, and we offer an algorithmic solution to the problem of finding the optimal seed.
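To make the contrast between contiguous and spaced seeds concrete, here is a minimal sketch that estimates seed sensitivity by simulation. It is not the algorithm from the paper. It assumes a simple model in which each position of an aligned region matches independently with a fixed probability, and it counts a "hit" when every required position of the seed (marked `1`) falls on a match at some offset. The function names, the region length of 64, the similarity of 0.7, and the trial count are illustrative assumptions. The spaced pattern is the weight-11 seed commonly attributed to the PatternHunter work of Ma et al.

```python
import random


def seed_positions(seed):
    """Return the offsets of the required-match positions ('1') in a seed."""
    return [j for j, c in enumerate(seed) if c == "1"]


def has_hit(region, positions, span):
    """True if the seed fits at some offset of the match/mismatch region.

    `region` is a list of booleans (True = the two sequences match there).
    """
    return any(
        all(region[i + j] for j in positions)
        for i in range(len(region) - span + 1)
    )


def estimate_sensitivity(seed, length=64, similarity=0.7, trials=5000, rng=None):
    """Monte Carlo estimate of the probability of at least one seed hit.

    Assumption: positions match independently with probability `similarity`.
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    positions = seed_positions(seed)
    hits = 0
    for _ in range(trials):
        region = [rng.random() < similarity for _ in range(length)]
        if has_hit(region, positions, len(seed)):
            hits += 1
    return hits / trials


# Both seeds have weight 11 (eleven required matches).
CONTIGUOUS = "1" * 11
SPACED = "111010010100110111"  # assumed here: the PatternHunter weight-11 seed

if __name__ == "__main__":
    for name, seed in (("contiguous", CONTIGUOUS), ("spaced", SPACED)):
        print(name, estimate_sensitivity(seed))
```

Because both seeds have the same weight, they produce chance hits at a similar rate, so any difference in the estimates reflects sensitivity alone. A simulation like this only approximates the hit probability; an exact computation would need a dynamic program over the match string.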

📜 SIMILAR VOLUMES

Hit integration for identifying optimal spaced seeds

✍ Won-Hyoung Chung; Seong-Bae Park 📂 Article 📅 2010 🏛 BioMed Central 🌐 English ⚖ 579 KB

Rank hash similarity for fast similarity

✍ Lu, Min; Huang, YaLou; Xie, MaoQiang; Liu, Jie 📂 Article 📅 2013 🏛 Elsevier Science 🌐 English ⚖ 760 KB

Similarity searching in large combinator

✍ Matthias Rarey; Martin Stahl 📂 Article 📅 2001 🏛 Springer Netherlands 🌐 English ⚖ 751 KB

Properties of embedding methods for similarity searching in metric spaces

✍ Hjaltason, G.R.; Samet, H. 📂 Article 📅 2003 🏛 IEEE 🌐 English ⚖ 498 KB

Efficient Algorithms for Similarity Search

✍ S. Rajasekaran; Y. Hu; J. Luo; H. Nick; P.M. Pardalos; S. Sahni; G. Shaw 📂 Article 📅 2001 🏛 Springer US 🌐 English ⚖ 217 KB

Statistical quantization for similarity search

✍ Wang, Qi; Zhu, Guokang; Yuan, Yuan 📂 Article 📅 2014 🏛 Elsevier Science 🌐 English ⚖ 813 KB