A new representation of protein sequences is presented in this paper, in which each protein is represented by a 20-dimensional (20D) vector of unit length. Inspired by the principle of superposition of states in quantum mechanics, the squares of the 20 components of the vector correspond to the amin
Persistent biases in the amino acid composition of prokaryotic proteins
✍ Scribed by Géraldine Pascal; Claudine Médigue; Antoine Danchin
- Publisher: John Wiley and Sons
- Year: 2006
- Language: English
- File size: 379 KB
- Volume: 28
- Category: Article
- ISSN: 0265-9247
✦ Synopsis
Correspondence analysis of 28 proteomes selected to span the entire realm of prokaryotes revealed universal biases in the proteins' amino acid distribution. Integral Inner Membrane Proteins always form an individual cluster, which can then be used to predict protein localisation in unknown proteomes, independently of the organism's biotope or kingdom. Orphan proteins are consistently rich in aromatic residues. Another bias is also ubiquitous: the amino acid composition is driven by the G + C content of the first codon position. An unexpected bias is driven, in many proteomes, by the AAN box of the genetic code, suggesting some functional biochemical relationship between asparagine and lysine. Less‐significant biases are driven by the rare amino acids, cysteine and tryptophan. Some allow identification of species‐specific functions or localisation such as surface or exported proteins. Errors in genome annotations are also revealed by correspondence analysis, making it useful for quality control and correction. BioEssays 28: 726–738, 2006. © 2006 Wiley Periodicals, Inc.
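The abstract does not say how the correspondence analysis was computed, so the following is only a minimal sketch of the textbook method (a singular value decomposition of standardized residuals) applied to amino acid counts. The function names `composition_counts` and `correspondence_analysis` and the four toy sequences are hypothetical illustrations, not data or code from the article.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_counts(sequences):
    """Return a (proteins x 20) matrix of amino acid counts."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    counts = np.zeros((len(sequences), len(AMINO_ACIDS)))
    for row, seq in enumerate(sequences):
        for aa in seq.upper():
            if aa in index:  # skip ambiguous codes such as X or B
                counts[row, index[aa]] += 1
    return counts


def correspondence_analysis(counts, n_axes=2):
    """Textbook correspondence analysis of a contingency table.

    Returns the row (protein) principal coordinates and the inertia
    (squared singular value) of the first n_axes axes.
    """
    if np.any(counts.sum(axis=1) == 0):
        raise ValueError("every protein needs at least one counted residue")
    # Drop amino acids that never occur, to avoid dividing by zero.
    counts = counts[:, counts.sum(axis=0) > 0]
    P = counts / counts.sum()      # correspondence matrix
    r = P.sum(axis=1)              # row masses
    c = P.sum(axis=0)              # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sing) / np.sqrt(r)[:, None]
    return row_coords[:, :n_axes], (sing ** 2)[:n_axes]


# Made-up toy data: two hydrophobic, membrane-like sequences and
# two charged, soluble-like sequences (assumptions for illustration only).
sequences = [
    "LLIVFAVLLGIAFW",
    "LIVVAFLLGGIIFL",
    "KDEERKDNSKEEQR",
    "DKEENRKKDSEEQT",
]
coords, inertia = correspondence_analysis(composition_counts(sequences))
```

In this toy case the two groups share no residues, so the first axis places the hydrophobic pair and the charged pair on opposite sides. This is the same kind of separation the abstract reports for Integral Inner Membrane Proteins, though real proteomes overlap far more and need many more proteins per group.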
📜 SIMILAR VOLUMES
Knowledge of amino acid composition, alone, is verified here to be sufficient for recognizing the structural class, alpha, beta, alpha + beta, or alpha/beta of a given protein with an accuracy of 81%. This is supported by results from exhaustive enumerations of all conformations for all sequences of
Despite the fact that several studies have reported the concentrations of various free amino acids in tobacco, their enantiomeric composition is unknown. Both the absolute and enantiomeric compositions of proline, alanine, asparagine, aspartic acid, valine, methionine, leucine, and phenylalanine were
Scheme 3.1 Enantiospecific hydrolysis of N-acetyl-D,L-amino acids (9) by A. oryzae acylase I.