ProtoMap: Automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space

✍ Scribed by Golan Yona; Nathan Linial; Michal Linial


Publisher: John Wiley and Sons
Year: 1999
Tongue: English
Weight: 384 KB
Volume: 37
Category: Article
ISSN: 0887-3585

✦ Synopsis


We investigate the space of all protein sequences in search of clusters of related proteins. Our aim is to automatically detect these sets, and thus obtain a classification of all protein sequences. Our analysis, which uses standard measures of sequence similarity as applied to an all-vs.-all comparison of SWISSPROT, gives a very conservative initial classification based on the highest scoring pairs. The many classes in this classification correspond to protein subfamilies. Subsequently, we merge the subclasses using the weaker pairs in a two-phase clustering algorithm. The algorithm makes use of transitivity to identify homologous proteins; however, transitivity is applied restrictively in an attempt to prevent unrelated proteins from clustering together. This process is repeated at varying levels of statistical significance. Consequently, a hierarchical organization of all proteins is obtained.
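The procedure outlined above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of e-values as the similarity measure, and the `min_support` rule (merge two clusters only when a minimum fraction of their cross pairs passes the current threshold) are assumptions chosen to show one possible form of "restricted transitivity". The synopsis does not specify the actual merging criterion.

```python
from itertools import combinations


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def initial_classes(proteins, evalues, strict):
    """Conservative first pass: link only pairs with e-value <= strict."""
    parent = {p: p for p in proteins}
    for (a, b), e in evalues.items():
        if e <= strict:
            parent[find(parent, a)] = find(parent, b)
    groups = {}
    for p in proteins:
        groups.setdefault(find(parent, p), set()).add(p)
    return list(groups.values())


def merge_level(clusters, evalues, threshold, min_support):
    """Merge two clusters only when enough of their cross pairs score
    at or below `threshold` (hypothetical restriction on transitivity)."""
    clusters = [set(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            ci, cj = clusters[i], clusters[j]
            hits = sum(
                1
                for a in ci
                for b in cj
                if evalues.get((a, b), evalues.get((b, a), float("inf")))
                <= threshold
            )
            if hits / (len(ci) * len(cj)) >= min_support:
                clusters[i] = ci | cj
                del clusters[j]
                merged = True
                break  # cluster list changed; restart the pair scan
    return clusters


def hierarchy(proteins, evalues, thresholds, min_support=0.5):
    """Repeat the merge step at increasingly permissive thresholds."""
    levels = [initial_classes(proteins, evalues, thresholds[0])]
    for t in thresholds[1:]:
        levels.append(merge_level(levels[-1], evalues, t, min_support))
    return levels


if __name__ == "__main__":
    # Toy data: pair -> e-value (all values invented for illustration).
    proteins = ["A", "B", "C", "D", "E"]
    evalues = {
        ("A", "B"): 1e-100, ("C", "D"): 1e-80,
        ("B", "C"): 1e-10, ("A", "C"): 1e-8, ("A", "D"): 1e-9,
        ("D", "E"): 1e-3,
    }
    for level in hierarchy(proteins, evalues, [1e-50, 1e-5, 1e-2]):
        print(sorted(sorted(c) for c in level))
```

In the toy data, the classes {A, B} and {C, D} merge at the second threshold because three of their four cross pairs are significant, whereas E stays separate at every level: a single weak link to D is not enough support under the assumed rule. Each entry of the returned list is one level of the resulting hierarchy.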

The resulting classification splits the protein space into well-defined groups of proteins, which are closely correlated with natural biological families and superfamilies. Different indices of validity were applied to assess the quality of our classification and compare it with the protein families in the PROSITE and Pfam databases. Our classification agrees with these domain-based classifications for between 64.8% and 88.5% of the proteins. It also finds many new clusters of protein sequences which were not classified by these databases. The hierarchical organization suggested by our analysis reveals finer subfamilies in families of known proteins as well as many novel relations between protein families. Pro-


📜 SIMILAR VOLUMES


Recognition of a protein fold in the con
✍ Inna Dubchak; Ilya Muchnik; Christopher Mayor; Igor Dralyuk; Sung-Hou Kim 📂 Article 📅 1999 🏛 John Wiley and Sons 🌐 English ⚖ 73 KB 👁 2 views

A computational method has been developed for the assignment of a protein sequence to a folding class in the Structural Classification of Proteins (SCOP). This method uses global descriptors of a primary protein sequence in terms of the physical, chemical, and structural properties of the constituen

Local expression and distribution of a s
✍ Hyang-Mi Cheon; Hong-Ja Kim; Duck-Hwa Chung; Myeong-Ok Kim; Joong-Suk Park; Chi- 📂 Article 📅 2001 🏛 John Wiley and Sons 🌐 English ⚖ 409 KB 👁 2 views

Storage protein‐1 (HcSP‐1) is a major storage protein found in the hemolymph and fat body of Hyphantria cunea. HcSP‐1 has a high methionine (6.0%) and low aromatic amino acid content (8.5%) (Cheon et al., 1998). In this study, the accumulation and expression of HcSP‐1 in ovary was i

DNA sequence analysis of a 10 624 bp fra
✍ Lafuente, María J.; Gamo, Francisco-Javier; Gancedo, Carlos 📂 Article 📅 1996 🏛 John Wiley and Sons 🌐 English ⚖ 353 KB 👁 2 views

We have determined the sequence of a 10 624 bp DNA segment located in the left arm of chromosome XV of Saccharomyces cerevisiae. The sequence contains eight open reading frames (ORFs) longer than 100 amino acids. Two of them do not present significant homology with sequences found in the databases.

Properties and cellular localization of
✍ Morse, D.; Fritz, L.; Pappenheimer, A. M.; Hastings, J. W. 📂 Article 📅 1989 🏛 John Wiley and Sons ⚖ 305 KB

A luciferin binding protein (LBP) involved in the bioluminescence reaction of Gonyaulax polyedra was purified and used for antibody production. Luciferin bound to LBP is fluorescent and can be used as a marker in living cells, allowing the localization of LBP in cortical organelles to be visualized. I

Exploring the conformational space of pr
✍ Andrew R. Leach; Andrew P. Lemon 📂 Article 📅 1998 🏛 John Wiley and Sons 🌐 English ⚖ 228 KB 👁 1 views

We describe an algorithm which enables us to search the conformational space of the side chains of a protein to identify the global minimum energy combination of side chain conformations as well as all other conformations within a specified energy cutoff of the global energy minimum. The program is