A new dataset of 396 protein domains is developed and used to evaluate the performance of the protein secondary structure prediction algorithms DSC, PHD, NNSSP, and PREDATOR. The maximum theoretical Q3 accuracy for combination of these methods is shown to be 78%. A simple consensus prediction on th
Improving protein structural class prediction using novel combined sequence information and predicted secondary structural features
Authored by Qi Dai; Li Wu; Lihua Li
- Publisher
- John Wiley and Sons
- Year
- 2011
- Language
- English
- File size
- 116 KB
- Volume
- 32
- Category
- Article
- ISSN
- 0192-8651
Synopsis
Abstract
Predicting protein structural class solely from protein sequences is a challenging problem in bioinformatics. Numerous efficient methods have been proposed for protein structural class prediction, but challenges remain. Using novel combined sequence information coupled with predicted secondary structural features (PSSF), we proposed a novel scheme to improve the prediction of protein structural classes. Given an amino acid sequence, we first transformed it into a reduced amino acid sequence and calculated its word frequencies and word position features, which together form the combined sequence information. We then added the PSSF to the combined sequence information to predict protein structural classes. The proposed method was tested on four low-homology benchmark datasets and achieved overall prediction accuracies of 83.1%, 87.0%, 94.5%, and 85.2%, respectively. Comparison with existing methods shows overall improvements ranging from 2.3% to 27.5%, which indicates that the proposed method is more efficient, especially for low-homology amino acid sequences. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011
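The sequence-encoding step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the reduced alphabet, the word length, or the exact definition of the position feature, so everything below is an assumption made for illustration: the six-group alphabet in `GROUPS`, the function names `reduce_sequence` and `word_features`, the default word length `k=2`, and the choice of "mean normalized position" as the position feature. The PSSF part of the method is not covered here.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical six-group reduced alphabet (an assumption, not taken from the paper).
GROUPS = {
    "h": "AVLIMC",  # hydrophobic / aliphatic
    "r": "FWYH",    # aromatic
    "p": "STNQ",    # polar, uncharged
    "k": "KR",      # positively charged
    "d": "DE",      # negatively charged
    "g": "GP",      # glycine and proline
}
REDUCE = {aa: label for label, members in GROUPS.items() for aa in members}


def reduce_sequence(seq):
    """Map each residue to its group label; unknown residues (e.g. X) are dropped."""
    return "".join(REDUCE[aa] for aa in seq.upper() if aa in REDUCE)


def word_features(reduced, k=2):
    """Return word frequencies followed by mean normalized word positions.

    Both halves are ordered by the sorted list of all possible k-letter words,
    so the vector length is 2 * len(GROUPS) ** k.
    """
    words = ["".join(w) for w in product(sorted(GROUPS), repeat=k)]
    n_windows = len(reduced) - k + 1
    counts = Counter()
    pos_sum = defaultdict(float)
    for i in range(n_windows):
        w = reduced[i:i + k]
        counts[w] += 1
        pos_sum[w] += (i + 1) / n_windows  # position scaled into (0, 1]
    freqs = [counts[w] / n_windows if n_windows > 0 else 0.0 for w in words]
    positions = [pos_sum[w] / counts[w] if counts[w] else 0.0 for w in words]
    return freqs + positions


if __name__ == "__main__":
    reduced = reduce_sequence("MKVLFDEGST")
    features = word_features(reduced, k=2)
    print(reduced, len(features))
```

In a full pipeline, a vector like this would be concatenated with features derived from a predicted secondary structure string and passed to a classifier; which features and which classifier the authors used is not stated in the abstract.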
SIMILAR VOLUMES
Knowledge of structural classes is useful in understanding of folding patterns in proteins. Although existing structural class prediction methods applied virtually all state-of-the-art classifiers, many of them use a relatively simple protein sequence representation that often includes
We present an analysis of the blind predictions submitted to the fold recognition category for the second meeting on the Critical Assessment of techniques for protein Structure Prediction. Our method achieves fold recognition from predicted secondary structure sequences using hidden Markov models (H
We describe the development of a scoring function based on the decomposition $P(\text{structure} \mid \text{sequence}) \propto P(\text{sequence} \mid \text{structure}) \, P(\text{structure})$, which outperforms previous scoring functions in correctly identifying native-like protein structures in large ensembles of compact decoys. The first term cap
Analysis of our fold recognition results in the 3rd Critical Assessment in Structure Prediction (CASP3) experiment, using the programs THREADER 2 and GenTHREADER, shows an encouraging level of overall success. Of the 23 submitted predictions, 20 targets showed no clear sequence similarity to protein
A protein energy surface is constructed. Validation is through applications of global energy minimization to surface loops of protein crystal structures. For 9 of 10 predictions, the native backbone conformation is identified correctly. Electrostatic energy is modeled as a pairwise sum of interactio