## Abstract Recent advances in large‐scale genome sequencing have led to the rapid accumulation of amino acid sequences of proteins whose functions are unknown. Since the functions of these proteins are closely correlated with their subcellular localizations, many efforts have been made to develop
Prediction and classification of protein subcellular location—sequence-order effect and pseudo amino acid composition
✍ Scribed by Kuo-Chen Chou; Yu-Dong Cai
- Publisher: John Wiley and Sons
- Year: 2003
- Language: English
- File size: 206 KB
- Volume: 90
- Category: Article
- ISSN: 0730-2312
✦ Synopsis
Given a protein sequence, how can its subcellular location be identified? With the rapid increase in newly found protein sequences entering databanks, the problem has become increasingly important because the function of a protein is closely correlated with its localization. To deal with the challenge in practice, a dataset has been established that allows identification among the following 14 subcellular locations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cytoplasm, (5) cytoskeleton, (6) endoplasmic reticulum, (7) extracellular, (8) Golgi apparatus, (9) lysosome, (10) mitochondria, (11) nucleus, (12) peroxisome, (13) plasma membrane, and (14) vacuole. Compared with the datasets constructed by previous investigators, the current one covers the widest range of localizations, and hence many proteins that were entirely outside the scope of previous treatments can now be investigated. Meanwhile, to enhance the potential and flexibility in taking the sequence-order effect into account, the series-mode pseudo-amino-acid composition has been introduced as a representation for a protein. High success rates are obtained by the re-substitution test, the jackknife test, and the independent dataset test. It is anticipated that the current automated method can be developed into a high-throughput tool for practical use in both basic research and the pharmaceutical industry.
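To make the idea of a sequence-order-aware representation concrete, the following is a minimal sketch of a generic pseudo-amino-acid composition vector. It is not the series-mode formulation used in the article, whose details are not given in this synopsis. The function name `pseudo_aac`, the parameters `lam` and `weight`, the example sequence, and the use of the Kyte-Doolittle hydropathy scale as the single residue property are all assumptions made for illustration.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used only as a stand-in property scale
# (assumption: the article may rely on different physicochemical properties).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalize(scale):
    """Standardize a property scale to zero mean and unit variance over the 20 residues."""
    mu = mean(scale.values())
    sd = pstdev(scale.values())
    return {aa: (value - mu) / sd for aa, value in scale.items()}


def pseudo_aac(seq, lam=3, weight=0.05, scale=KD_HYDROPATHY):
    """Return a (20 + lam)-dimensional pseudo-amino-acid composition vector.

    The first 20 components are the usual amino acid frequencies; the last
    `lam` components are sequence-order correlation factors for residue
    pairs separated by 1..lam positions. `lam` and `weight` are hypothetical
    defaults, not values taken from the article.
    """
    seq = seq.upper()
    if any(aa not in AMINO_ACIDS for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    n = len(seq)
    if lam >= n:
        raise ValueError("lam must be smaller than the sequence length")

    h = normalize(scale)
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]

    # theta_k: mean squared property difference between residues k apart.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))

    # The frequencies sum to 1, so this denominator makes the vector sum to 1.
    denom = 1.0 + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical sequence for illustration
    vector = pseudo_aac(example, lam=3)
    print(len(vector))
```

With `lam=0` the vector reduces to the conventional amino acid composition, so the extra components are what carry the sequence-order information. A classifier would then be trained on such vectors and evaluated with the re-substitution, jackknife, and independent dataset tests mentioned above.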
📜 SIMILAR VOLUMES
## Abstract Support Vector Machine (SVM), which is one class of learning machines, was applied to predict the subcellular location of proteins by incorporating the quasi‐sequence‐order effect (Chou [2000] Biochem. Biophys. Res. Commun. 278:477–483). In this study, the proteins are classified into t
In order to find alternative protein sources in African regions where protein deficiency in nutrition is prevailing, solubility, in-vitro digestibility, amino acid composition and chemical score of Balanites aegyptiaca Del. kernel proteins were investigated as a function of different processing step