Generalized affine gap costs for protein sequence alignment
Written by Stephen F. Altschul
- Book ID: 101229013
- Publisher: John Wiley and Sons
- Year: 1998
- Language: English
- File size: 125 KB
- Volume: 32
- Category: Article
- ISSN: 0887-3585
Synopsis
Based on the observation that a single mutational event can delete or insert multiple residues, affine gap costs for sequence alignment charge a penalty for the existence of a gap, and a further length-dependent penalty. From structural or multiple alignments of distantly related proteins, it has been observed that conserved residues frequently fall into ungapped blocks separated by relatively nonconserved regions. To take advantage of this structure, a simple generalization of affine gap costs is proposed that allows nonconserved regions to be effectively ignored. The distribution of scores from local alignments using these generalized gap costs is shown empirically to follow an extreme value distribution. Examples are presented for which generalized affine gap costs yield superior alignments from the standpoints both of statistical significance and of alignment accuracy. Guidelines for selecting generalized affine gap costs are discussed, as is their possible application to multiple alignment. Proteins 32:88-96, 1998.
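As a rough illustration of the ordinary affine scheme the synopsis starts from, the sketch below charges one fixed cost for the existence of a gap plus a cost that grows linearly with its length. The function name, the parameter names `gap_open` and `gap_extend`, the convention that every gapped residue (including the first) pays the extension cost, and the values 11 and 1 are all assumptions chosen for illustration; none of them come from the article, and the generalized costs the article proposes are not reproduced here.

```python
def affine_gap_cost(length: int, gap_open: float, gap_extend: float) -> float:
    """Cost of one gap spanning `length` residues under affine gap costs.

    Assumed convention (not taken from the article): cost = gap_open + gap_extend * length.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0  # no gap, so no cost
    return gap_open + gap_extend * length


# Hypothetical parameter values, for illustration only.
one_long_gap = affine_gap_cost(5, gap_open=11, gap_extend=1)
five_short_gaps = 5 * affine_gap_cost(1, gap_open=11, gap_extend=1)
print(one_long_gap, five_short_gaps)
```

Under these assumed values, a single five-residue gap costs far less than five separate one-residue gaps, which reflects the motivating observation that one mutational event can insert or delete several residues at once.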
SIMILAR VOLUMES
It is shown how to normalize the costs of an alignment algorithm that employs affine or linear gap costs. The normalized costs are interpreted as the -log probabilities of the instructions of a finite-state edit-machine. This gives an explicit model relating sequences that can be linked to processes
Alignment algorithms can be used to infer a relationship between sequences when the true relationship is unknown. Simple alignment algorithms use a cost function that gives a fixed cost to each possible point mutation: mismatch, deletion, insertion. These algorithms tend to find optimal alignments tha