Probabilistic analysis of the frequencies of amino acid pairs within characterized protein sequences
Scribed by Shiyi Shen; Bo Kai; Jishou Ruan; J. Torin Huzil; Eric Carpenter; Jack A. Tuszynski
- Publisher
- Elsevier Science
- Year
- 2006
- Language
- English
- File size
- 174 KB
- Volume
- 370
- Category
- Article
- ISSN
- 0378-4371
Synopsis
Here, we describe a unique probabilistic evaluation of the 20 naturally occurring amino acids and their distributions within the Swiss-Prot and Complete Human Genebank databases. We have developed a computational technique that imposes both directionality and length constraints on searches for unique combinations of amino acids within protein sequences. Using statistical approaches, we have searched for all possible two- and three-residue motifs contained within these databases. The technique relies on the unusually high occurrence of a small number of these motifs compared with the expected probability of finding a specific residue grouping within a given database. Subsequent filtering of this search to identify such unique combinations has provided several examples that can be used as markers to identify particular proteins within or across databases. We focus on three of these motifs that were of greatest interest to us: CC, CM, and their combination, CCM. Each occurs either more or less frequently than would be predicted from standard amino acid distributions within the entire human proteome.
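The comparison of observed motif counts against their expected probability can be illustrated with a minimal Python sketch. This is not the authors' method: the function name `motif_ratios`, the adjacency-only window, and the independence-based expectation (the product of single-residue frequencies) are all assumptions made for illustration. Motifs are treated as ordered, so CM and MC are counted separately, which is one simple way to reflect directionality.

```python
from collections import Counter


def motif_ratios(sequences, k=2):
    """Return observed/expected ratios for every ordered k-residue motif.

    Hypothetical helper: 'expected' assumes residues occur independently,
    with probabilities estimated from the same set of sequences.
    """
    residue_counts = Counter()
    motif_counts = Counter()
    total_positions = 0  # number of k-residue windows across all sequences

    for seq in sequences:
        residue_counts.update(seq)
        windows = len(seq) - k + 1
        if windows <= 0:
            continue  # sequence too short to contain a k-residue motif
        total_positions += windows
        for i in range(windows):
            motif_counts[seq[i:i + k]] += 1

    total_residues = sum(residue_counts.values())
    freqs = {aa: n / total_residues for aa, n in residue_counts.items()}

    ratios = {}
    for motif, observed in motif_counts.items():
        probability = 1.0
        for aa in motif:
            probability *= freqs[aa]
        expected = probability * total_positions
        ratios[motif] = observed / expected
    return ratios


if __name__ == "__main__":
    # Toy sequences, not real database entries.
    toy = ["CCMA", "ACCM"]
    for motif, ratio in sorted(motif_ratios(toy, k=2).items()):
        print(motif, round(ratio, 3))
```

A ratio well above 1 marks a motif that occurs more often than the independence assumption predicts, and a ratio well below 1 marks one that occurs less often. Real use would need a significance test and a much larger sequence set than this toy input.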
SIMILAR VOLUMES
We analyse for each of 20 amino acids X the statistics of spacings between consecutive occurrences of X within the well-characterized Saccharomyces cerevisiae genome. The occurrences of amino acids may exhibit near random, clustered or smoothed out behaviour, like one-dimensional stochastic processe
The multidimensional statistical technique of discriminant analysis is used to allocate amino acid sequences to one of four secondary structural classes: high α content, high β content, mixed α and β, low content of ordered structure. Discrimination is based on four attributes: estimates of percen
Various blotting membranes were evaluated and correlated with the efficiency of electroblotting and the performance in the sequencing process. Structural parameters including specific surface area, pore size distribution, pore volumes, and permeabilities of different solvents lead to di