𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Multiple classifier integration for the prediction of protein structural classes

✍ Scribed by Lei Chen; Lin Lu; Kairui Feng; Wenjin Li; Jie Song; Lulu Zheng; Youlang Yuan; Zhenbin Zeng; Kaiyan Feng; Wencong Lu; Yudong Cai


Publisher
John Wiley and Sons
Year
2009
Tongue
English
Weight
112 KB
Volume
30
Category
Article
ISSN
0192-8651

No coin nor oath required. For personal study only.

✦ Synopsis


Abstract

Supervised classifiers, such as artificial neural networks, partition trees, and support vector machines, are often used for the prediction and analysis of biological data. However, choosing an appropriate classifier is not straightforward, because each classifier has its own strengths and weaknesses, and each biological dataset has its own characteristics. By integrating many classifiers together, one can avoid the dilemma of choosing an individual classifier out of many and achieve optimized classification results (Rahman et al., Multiple Classifier Combination for Character Recognition: Revisiting the Majority Voting System and Its Variation, Springer, Berlin, 2002, 167–178). The classification algorithms come from Weka (Witten and Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, 2005), a collection of software tools for machine learning algorithms. By integrating many predictors (classifiers) through simple voting, the correct prediction (classification) rates are 65.21% and 65.63% for a basic training dataset and an independent test set, respectively. These results are better than those of any single machine learning algorithm collected in Weka when exactly the same data are used. Furthermore, we introduce an integration strategy that accounts for both classifier weightings and classifier redundancy. A feature selection strategy, called minimum redundancy maximum relevance (mRMR), is transferred to algorithm selection to deal with classifier redundancy in this research, and the weightings are based on the performance of each classifier. The best classification results are obtained when 11 algorithms selected by the mRMR method are integrated through majority voting with weightings. As a result, the correct prediction rates are 68.56% and 69.29% for the basic training dataset and the independent test dataset, respectively. The web‐server is available at http://chemdata.shu.edu.cn/protein_st/.
© 2009 Wiley Periodicals, Inc. J Comput Chem, 2009
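The weighted majority voting described above can be illustrated with a minimal sketch (this is not the authors' code; the classifier labels and accuracy weights below are hypothetical). Each classifier casts a vote for a structural class, and votes are weighted by that classifier's measured performance:

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Combine per-classifier predictions using performance-based weights.

    predictions: list of class labels, one per classifier
    weights: list of floats, e.g. each classifier's training accuracy
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight          # accumulate weighted votes per class
    return max(scores, key=scores.get)   # class with the largest total weight

# Hypothetical example: three classifiers predict a protein structural class.
preds = ["all-alpha", "all-beta", "all-alpha"]
weights = [0.65, 0.70, 0.60]             # assumed per-classifier accuracies
print(weighted_majority_vote(preds, weights))  # "all-alpha" (0.65 + 0.60 > 0.70)
```

Setting all weights to 1.0 recovers the simple (unweighted) voting scheme that the paper reports as its baseline; the mRMR step would additionally prune redundant classifiers before voting.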


πŸ“œ SIMILAR VOLUMES


Prediction of protein structure: The pro
✍ Andrei L. Lomize; Irina D. Pogozheva; Henry I. Mosberg πŸ“‚ Article πŸ“… 1999 πŸ› John Wiley and Sons 🌐 English βš– 154 KB πŸ‘ 2 views

Three-dimensional (3D) models of four CASP3 targets were calculated using a simple modeling procedure that includes prediction of regular secondary structure, analysis of possible β-sheet topologies, assembly of amphiphilic helices and β-sheets to bury their nonpolar surfaces, and adjustment of side

Support Vector Machines for Prediction o
✍ YU-DONG CAI; XIAO-JUN LIU; XUE-BIAO XU; KUO-CHEN CHOU πŸ“‚ Article πŸ“… 2003 πŸ› Elsevier Science 🌐 English βš– 126 KB

The support vector machines (SVMs) method was introduced for predicting the structural class of protein domains. The results obtained through the self-consistency test, jack-knife test, and independent dataset test have indicated that the current method and the elegant component-coupled algorithm de

Using support vector machines for predic
✍ Jian-Ding Qiu; San-Hua Luo; Jian-Hua Huang; Ru-Ping Liang πŸ“‚ Article πŸ“… 2009 πŸ› John Wiley and Sons 🌐 English βš– 211 KB

The prediction of secondary structure is a fundamental and important component in the analytical study of protein structure and functions. How to improve the predictive accuracy of protein structural classification by effectively incorporating the sequence‐order effects is an important

Prediction of protein structural class f
✍ Petr Klein; Charles Delisi πŸ“‚ Article πŸ“… 1986 πŸ› Wiley (John Wiley & Sons) 🌐 English βš– 781 KB

The multidimensional statistical technique of discriminant analysis is used to allocate amino acid sequences to one of four secondary structural classes: high α content, high β content, mixed α and β, low content of ordered structure. Discrimination is based on four attributes: estimates of percen

An information-theoretic approach to the
✍ Xiaoqi Zheng; Chun Li; Jun Wang πŸ“‚ Article πŸ“… 2009 πŸ› John Wiley and Sons 🌐 English βš– 143 KB

An information‐theoretical approach, which combines a sequence decomposition technique and a fuzzy clustering algorithm, is proposed for prediction of protein structural class. This approach could bypass the process of selecting and comparing sequence features as done previously. First,

Evaluation and improvement of multiple s
✍ James A. Cuff; Geoffrey J. Barton πŸ“‚ Article πŸ“… 1999 πŸ› John Wiley and Sons 🌐 English βš– 166 KB πŸ‘ 2 views

A new dataset of 396 protein domains is developed and used to evaluate the performance of the protein secondary structure prediction algorithms DSC, PHD, NNSSP, and PREDATOR. The maximum theoretical Q 3 accuracy for combination of these methods is shown to be 78%. A simple consensus prediction on th