✦ LIBER ✦

Algorithmic complexity of protein identification: combinatorics of weighted strings

✍ Scribed by Mark Cieliebak; Thomas Erlebach; Zsuzsanna Lipták; Jens Stoye; Emo Welzl

Publisher: Elsevier Science
Year: 2004
Tongue: English
Weight: 319 KB
Volume: 137
Category: Article
ISSN: 0166-218X
DOI: 10.1016/s0166-218x(03)00187-2

No coin nor oath required. For personal study only.

✦ Synopsis

We investigate a problem which arises in computational biology: Given a constant-size alphabet A with a weight function : A → N, ÿnd an e cient data structure and query algorithm solving the following problem: For a string over A and a weight M ∈ N, decide whether contains a substring with weight M , where the weight of a string is the sum of the weights of its letters (ONE-STRING MASS FINDING PROBLEM). If the answer is yes, then we may in addition require a witness, i.e., indices i 6 j such that the substring beginning at position i and ending at position j has weight M . We allow preprocessing of the string and measure e ciency in two parameters: storage space required for the preprocessed data and running time of the query algorithm for given M . We are interested in data structures and algorithms requiring subquadratic storage space and sublinear query time, where we measure the input size as the length n of the input string . Among others, we present two non-trivial e cient algorithms: LOOKUP solves the problem with O(n) storage space and O(n=log n) time; INTERVAL solves the problem for binary alphabets with O(n) storage space in O(log n) query time. We introduce other variants of the problem and sketch how our algorithms may be extended for these variants. Finally, we discuss combinatorial properties of weighted strings.

📜 SIMILAR VOLUMES

Topology identification of weighted comp

Topology identification of weighted complex dynamical networks

✍ Jin Zhou; Jun-an Lu 📂 Article 📅 2007 🏛 Elsevier Science 🌐 English ⚖ 841 KB

Identification of low molecular weight p

Identification of low molecular weight proteins isolated by 2-D liquid separations

✍ Kan Zhu; Fred R. Miller; Timothy J. Barder; David M. Lubman 📂 Article 📅 2004 🏛 John Wiley and Sons 🌐 English ⚖ 580 KB

## Abstract Proteins with molecular mass (__M__~r~) <20 kDa are often poorly separated in 2‐D sodium dodecyl sulfate polyacrylamide gel electrophoresis. In addition, low‐__M__~r~ proteins may not be readily identified using peptide mass fingerprinting (PMF) owing to the small number of peptides gen

An effective algorithm for string correc

An effective algorithm for string correction using generalized edit distance—II. Computational complexity of the algorithm and some applications

✍ R.L. Kashyap; B.J. Oommen 📂 Article 📅 1981 🏛 Elsevier Science 🌐 English ⚖ 981 KB

Immunochemical identification of novel h

Immunochemical identification of novel high-molecular-weight protein isoforms of the adenomatous polyposis coli (APC) gene

✍ Cornelia Kraus; Julio Reina-Sanchez; Zora Šuleková; Wolfgang G. Ballhausen 📂 Article 📅 1996 🏛 John Wiley and Sons 🌐 French ⚖ 786 KB

Identification of novel homologues of th

Identification of novel homologues of three low molecular weight subunits of the mitochondrial bc1complex

✍ Hans-Peter Braun 📂 Article 📅 1996 🏛 Springer 🌐 English ⚖ 471 KB

Large-scale random cDNA sequencing projects have been started for several organisms and are a valuable tool for the analysis of quantitative and qualitative aspects of gene expression. However, the reliability of the obtained data is limited as most of the clones are only partially analysed on one s

Identification of a high molecular weigh

Identification of a high molecular weight protein on the surface of murine thymus and thymus-dependent cells

✍ Vilma Santana; Nina Wedderburn; Erika R. Abney; R. M. E. Parkhouse 📂 Article 📅 1976 🏛 John Wiley and Sons 🌐 English ⚖ 749 KB