𝔖 Scriptorium
✦   LIBER   ✦

πŸ“

Pattern Discovery in Bioinformatics: Theory & Algorithms

✍ Scribed by Laxmi Parida


Publisher
Chapman and Hall/CRC
Year
2007
Tongue
English
Leaves
515
Series
Chapman & Hall/CRC Mathematical & Computational Biology
Edition
1
Category
Library

⬇  Acquire This Volume

No coin nor oath required. For personal study only.

✦ Synopsis


The computational methods of bioinformatics are being used more and more to process the large volume of current biological data. Promoting an understanding of the underlying biology that produces this data, Pattern Discovery in Bioinformatics: Theory and Algorithms provides the tools to study regularities in biological data. Taking a systematic approach to pattern discovery, the book supplies sound mathematical definitions and efficient algorithms to explain vital information about biological data. It explores various data patterns, including strings, clusters, permutations, topology, partial orders, and boolean expressions. Each of these classes captures a different form of regularity in the data, providing possible answers to a wide range of questions. The book also reviews basic statistics, including probability, information theory, and the central limit theorem. This self-contained book provides a solid foundation in computational methods, enabling the solution of difficult biological questions.

✦ Table of Contents


Cover......Page 1
c5491fm.pdf......Page 2
Pattern Discovery in Bioinformatics: Theory & Algorithms......Page 5
Contents......Page 8
Acknowledgments......Page 14
Table of Contents......Page 0
1.1 Ubiquity of Patterns......Page 15
1.3 The Need for Rigor......Page 16
1.4 Who is a Reader of this Book?......Page 17
1.4.1 About this book......Page 18
Part I: The Fundamentals......Page 20
2.2 Graphs......Page 21
2.3 Tree Problem 1: Minimum Spanning Tree......Page 26
2.3.1 Prim’s algorithm......Page 29
2.4 Tree Problem 2: Steiner Tree......Page 33
2.5 Tree Problem 3: Minimum Mutation Labeling......Page 34
2.5.1 Fitch’s algorithm......Page 35
2.6 Storing & Retrieving Elements......Page 39
2.7 Asymptotic Functions......Page 42
2.8 Recurrence Equations......Page 44
2.8.1 Counting binary trees......Page 46
2.8.2 Enumerating unrooted trees (PrΓΌfer sequence)......Page 48
2.9 NP-Complete Class of Problems......Page 52
2.10 Exercises......Page 53
3.1 Introduction......Page 59
3.2.1.1 Defining omega and F......Page 60
3.2.1.3 Defining probability P......Page 61
3.2.2 Multiple events (Bayes’ theorem)......Page 62
3.2.3 Inclusion-exclusion principle......Page 63
3.2.4 Discrete probability space......Page 66
3.2.5 Algebra of random variables......Page 69
3.2.6.1 Properties of expectations......Page 70
3.2.6.2 Back to the concrete example......Page 71
3.2.7 Discrete probability distribution (binomial, Poisson)......Page 72
3.2.8 Continuous probability distribution (normal)......Page 76
3.2.9 Continuous probability space (omega is R)......Page 78
3.3 The Bare Truth about Inferential Statistics......Page 81
3.3.1 Probability distribution invariants......Page 82
3.3.2 Samples & summary statistics......Page 84
3.3.3 The central limit theorem......Page 89
3.3.4 Statistical significance (p-value)......Page 92
3.5 Exercises......Page 94
Comments......Page 100
4.1 Introduction......Page 101
4.3 Pattern Duality......Page 102
4.4 Irredundant Patterns......Page 104
4.4.1 Special case: maximality......Page 105
4.4.3 Uniqueness property......Page 107
4.4.4 Case studies......Page 108
4.6 When is a Pattern Specification Nontrivial?......Page 111
4.7 Classes of Patterns......Page 112
4.8 Exercises......Page 115
Comments......Page 121
Part II: Patterns on Linear Strings......Page 122
5.2 Modeling a Biopolymer......Page 123
5.2.1 Repeats in DNA......Page 124
5.2.2 Directionality of biopolymers......Page 125
5.2.2.1 Random number generator......Page 126
5.2.3 Modeling a random permutation......Page 127
5.2.4 Modeling a random string......Page 129
5.3 Bernoulli Scheme......Page 130
5.4 Markov Chain......Page 131
5.4.1 Stationary distribution......Page 133
5.4.2 Computing probabilities......Page 137
5.5 Hidden Markov Model (HMM)......Page 138
5.5.1 The decoding problem (Viterbi algorithm)......Page 140
5.7 Conclusion......Page 143
5.8 Exercises......Page 144
6.1 Introduction......Page 148
6.2 Notation......Page 149
6.3 Solid Patterns......Page 151
6.3.2 Counting the maximal patterns......Page 153
6.4 Rigid Patterns......Page 158
6.4.1 Maximal rigid patterns......Page 159
6.4.2 Enumerating maximal rigid patterns......Page 161
6.4.3 Density-constrained patterns......Page 165
6.4.4 Quorum-constrained patterns......Page 166
6.4.5 Large-|sigma| input......Page 167
6.4.6 Irredundant patterns......Page 169
6.4.6.1 Density-constrained patterns......Page 171
6.4.6.2 Quorum-constrained patterns......Page 172
6.5 Extensible Patterns......Page 173
6.6.1 Homologous sets......Page 174
6.6.2 Sequence on reals......Page 176
6.6.2.1 Mapping to a discrete instance......Page 177
6.7 Exercises......Page 179
Comments......Page 190
7.2 Discovery Algorithm......Page 191
7.4 Rigid Patterns......Page 199
Markov Chain......Page 200
7.5 Extensible Patterns......Page 201
7.5.1 Nondegenerate extensible patterns......Page 202
7.5.2 Degenerate extensible patterns......Page 204
7.5.3 Correction factor for the dot character......Page 205
7.6 Measure of Surprise......Page 206
7.6.2 chi-square ratio......Page 207
7.6.3 Interplay of combinatorics & statistics......Page 208
7.7 Applications......Page 209
7.8 Exercises......Page 211
8.1 Introduction: Local Multiple Alignment......Page 221
8.2 Probabilistic Model: Motif Profile......Page 222
8.3 The Learning Problem......Page 223
8.4.1 Statistical significance......Page 224
8.4.2 Information content......Page 227
8.5 Algorithms to Learn a Motif Profile......Page 228
8.6.1 The initial estimate rho0......Page 230
8.6.2 Estimating z given rho......Page 231
8.6.3 Estimating rho given z......Page 232
8.7.1 Estimating rho given an alignment......Page 235
8.7.3 Estimating Z given rho......Page 236
8.8 Interpreting the Motif Profile in Terms of......Page 237
8.9 Exercises......Page 238
9.1 Introduction: Consensus Motif......Page 242
9.2 Combinatorial Model: Subtle Motif......Page 243
9.3 Distance between Motifs......Page 245
9.4 Statistics of Subtle Motifs......Page 247
9.5 Performance Score......Page 252
9.6.1 Neighbor enumeration (exact)......Page 253
9.6.2 Submotif enumeration (inexact)......Page 256
9.7 A Combinatorial Algorithm......Page 259
9.8 A Probabilistic Algorithm......Page 262
9.9 A Modular Solution......Page 264
9.10 Conclusion......Page 266
9.11 Exercises......Page 267
Comments......Page 268
Part III: Patterns on Meta-Data......Page 269
10.1 Introduction......Page 270
10.1.1 Notation......Page 271
10.2 How Many Permutation Patterns?......Page 272
10.3 Maximality......Page 273
10.3.1 P=1: Linear notation & PQ trees......Page 274
10.3.2 P>1: Linear notation?......Page 276
10.4 Parikh Mapping-based Algorithm......Page 278
10.4.2 Time complexity analysis......Page 280
10.5 Intervals......Page 283
10.5.1 The naive algorithm......Page 285
10.5.2 The Uno-Yagiura RC algorithm......Page 286
10.6 Intervals to PQ Trees......Page 299
10.6.1 Irreducible intervals......Page 300
10.6.2 Encoding intervals as a PQ tree......Page 302
10.6.2.1 Time complexity of algorithms (10) and (11)......Page 311
10.7 Applications......Page 312
10.7.1 Case study I: Human and rat......Page 313
10.7.2 Case study II: E. Coli K-12 and B. Subtilis......Page 314
10.8 Conclusion......Page 316
10.9 Exercises......Page 317
Comments......Page 326
11.2 Unstructured Permutations......Page 327
11.2.1 Multinomial coefficients......Page 329
11.2.2 Patterns with multiplicities......Page 332
11.3 Structured Permutations......Page 333
11.3.1 P-arrangement......Page 334
11.3.2 An incremental method......Page 335
11.3.3 An upper bound on P-arrangements......Page 340
11.3.3.1 A dynamic programming solution......Page 343
11.3.4 A lower bound on P-arrangements......Page 345
11.3.5 Estimating the number of frontiers......Page 346
11.3.6 Combinatorics to probabilities......Page 349
11.4 Exercises......Page 350
12.1.1 Graph notation......Page 358
12.2 What Are Topological Motifs?......Page 359
12.2.1 Combinatorics in topologies......Page 360
12.2.2 Input with self-isomorphisms......Page 361
12.3 The Topological Motif......Page 362
12.3.1 Maximality......Page 370
12.4.1 Occurrence-isomorphisms......Page 372
12.4.2 Vertex indistinguishability......Page 375
12.4.4 Compact vertex, edge & motif......Page 376
12.4.6 Conjugates of compact lists......Page 377
12.4.7 Characteristics of compact lists......Page 381
12.4.8 Maximal operations on compact lists......Page 383
12.4.9 Maximal subsets of location lists......Page 384
12.4.11 Compact motifs from compact lists......Page 387
12.5 The Discovery Method......Page 395
12.5.1.1 Initialization: generating Linit......Page 396
12.5.1.2 The iterative step......Page 399
12.6 Related Classical Problems......Page 402
12.7 Applications......Page 403
12.9 Exercises......Page 405
Comments......Page 418
13.1 Introduction......Page 419
13.2 Some Basic Properties of Finite Sets......Page 420
13.3 Partial Order Graph G(S, E) of Sets......Page 421
13.3.1 Reduced partial order graph......Page 422
13.3.2 Straddle graph......Page 423
13.4.1 Intersection closure......Page 425
13.4.2 Union closure......Page 426
13.5.1 PQ trees......Page 428
13.5.2 Straddling sets......Page 431
13.6 Maximal Set Intersection Problem (maxSIP)......Page 436
13.6.1 Ordered enumeration trie......Page 437
13.6.2 Depth first traversal of the trie......Page 438
13.7.1 Algorithm......Page 449
13.7.2 Minimal from maximal sets......Page 450
13.8 Multi-Sets......Page 452
13.8.1 Ordered enumeration trie of multi-sets......Page 453
13.8.2 Enumeration algorithm......Page 455
13.9 Adapting the Enumeration Scheme......Page 457
13.10 Exercises......Page 460
Comments......Page 470
14.1 Introduction......Page 471
14.1.1 Motivation......Page 472
14.2 Extracting (Monotone CNF) Boolean Expressions......Page 473
14.2.1 Extracting biclusters......Page 477
14.2.2 Extracting patterns in microarrays......Page 480
14.3.1 Partial orders......Page 482
14.3.2 Partial order construction problem......Page 483
14.3.3 Excess in partial orders......Page 485
14.4 Statistics of Partial Orders......Page 487
14.4.1 Computing Cex(B)......Page 491
14.5 Redescriptions......Page 495
14.6 Application: Partial Order of Expressions......Page 496
14.7 Summary......Page 497
14.8 Exercises......Page 498
References......Page 505


πŸ“œ SIMILAR VOLUMES


Pattern Discovery in Bioinformatics: The
✍ Parida L. πŸ“‚ Library πŸ“… 2007 πŸ› CRC 🌐 English

The computational methods of bioinformatics are being used more and more to process the large volume of current biological data. Promoting an understanding of the underlying biology that produces this data, Pattern Discovery in Bioinformatics: Theory and Algorithms provides the tools to study regula

Pattern Discovery in Bioinformatics: The
✍ Laxmi Parida πŸ“‚ Library πŸ“… 2007 🌐 English

The computational methods of bioinformatics are being used more and more to process the large volume of current biological data. Promoting an understanding of the underlying biology that produces this data, Pattern Discovery in Bioinformatics: Theory and Algorithms provides the tools to study regula

Algorithms in Bioinformatics: Theory and
✍ Paul A. Gagniuc πŸ“‚ Library πŸ“… 2021 πŸ› Wiley 🌐 English

Explore a comprehensive and insightful treatment of the practical application of bioinformatic algorithms in a variety of fields Algorithms in Bioinformatics: Theory and Implementation delivers a fulsome treatment of some of the main algorithms used to explain biological functions and relationshi

Algorithms in Bioinformatics: Theory and
✍ Paul A. Gagniuc πŸ“‚ Library πŸ“… 2021 πŸ› Wiley 🌐 English

<b>ALGORITHMS IN BIOINFORMATICS</b> <p><b>Explore a comprehensive and insightful treatment of the practical application of bioinformatic algorithms in a variety of fields</b> </p><p><i>Algorithms in Bioinformatics: Theory and Implementation</i> delivers a fulsome treatment of some of the main algori

Algorithms in Bioinformatics: Theory and
✍ Paul A. Gagniuc πŸ“‚ Library πŸ“… 2021 πŸ› Wiley 🌐 English

Explore a comprehensive and insightful treatment of the practical application of bioinformatic algorithms in a variety of fields Algorithms in Bioinformatics: Theory and Implementation delivers a fulsome treatment of some of the main algorithms used to explain biological functions and relationshi

Scalable Pattern Recognition Algorithms:
✍ Pradipta Maji, Sushmita Paul (auth.) πŸ“‚ Library πŸ“… 2014 πŸ› Springer International Publishing 🌐 English

<p>This book addresses the need for a unified framework describing how soft computing and machine learning techniques can be judiciously formulated and used in building efficient pattern recognition models. The text reviews both established and cutting-edge research, providing a careful balance of t