
Multiple parameter cross-species protein identification using MultiIdent - a world-wide web accessible tool

Scribed by Dr. Marc R. Wilkins; Elisabeth Gasteiger; Colin H. Wheeler; Ingrid Lindskog; Jean-Charles Sanchez; Amos Bairoch; Ron D. Appel; Michael J. Dunn; Denis F. Hochstrasser


Book ID
102832345
Publisher
John Wiley and Sons
Year
1998
Tongue
English
Weight
849 KB
Volume
19
Category
Article
ISSN
0173-0835


Synopsis



Recent increases in the number of genome sequencing projects mean that the amount of protein sequence in databases is increasing at an astonishing pace. In proteome studies, this is facilitating the identification of proteins from molecularly well-defined organisms. However, in studies of proteins from the majority of organisms, proteins must be identified by comparing analytical data to sequences in databases from other species. This process is known as cross-species protein identification. Here we present a new program, MultiIdent, which uses multiple protein parameters, such as amino acid composition, peptide masses, sequence tags, and estimated protein pI and mass, to achieve cross-species protein identification. The program is structured so that protein amino acid composition, which is highly conserved across species boundaries, first generates a set of candidate proteins. These proteins are then queried with other protein parameters such as sequence tags and peptide masses. A final list of database entries, which considers all analytical parameters, is presented, ranked by an integrated score. We illustrate the power of the approach with the identification of a set of standard proteins, and the identification of proteins from dog heart separated by two-dimensional gel electrophoresis. The MultiIdent program is available on the world-wide web at: http://www.expasy.ch/sprot/multiident.html.
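The two-stage structure described in the synopsis (composition first, then the remaining parameters, then a combined ranking) can be illustrated with a minimal Python sketch. This is not MultiIdent's actual algorithm or scoring function, neither of which is given here. The `Entry` record, the squared-difference composition distance, the mass tolerance, and the additive score are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Entry:
    """Hypothetical database record; field names are illustrative assumptions."""
    name: str
    composition: Dict[str, float]   # amino acid -> mol percent
    peptide_masses: List[float]     # theoretical peptide masses (Da)
    sequence: str


def composition_distance(query: Dict[str, float], entry: Dict[str, float]) -> float:
    """Sum of squared mol% differences over the amino acids in the query (assumed metric)."""
    return sum((query[aa] - entry.get(aa, 0.0)) ** 2 for aa in query)


def count_mass_matches(query: List[float], entry: List[float], tol: float) -> int:
    """Number of query masses lying within +/- tol Da of any entry mass."""
    return sum(any(abs(q - m) <= tol for m in entry) for q in query)


def identify(query_comp: Dict[str, float], query_masses: List[float], tag: str,
             database: List[Entry], n_candidates: int = 2,
             tol: float = 0.5) -> List[Tuple[str, int]]:
    # Stage 1: amino acid composition selects a set of candidate entries.
    ranked = sorted(database,
                    key=lambda e: composition_distance(query_comp, e.composition))
    candidates = ranked[:n_candidates]

    # Stage 2: query only the candidates with peptide masses and a sequence tag.
    results = []
    for e in candidates:
        hits = count_mass_matches(query_masses, e.peptide_masses, tol)
        tag_hit = 1 if tag and tag in e.sequence else 0
        # Assumed integrated score: mass hits plus one point for a tag match.
        results.append((e.name, hits + tag_hit))

    # Final list, best integrated score first.
    results.sort(key=lambda r: r[1], reverse=True)
    return results


# Toy data, invented purely to exercise the sketch.
database = [
    Entry("P1", {"A": 10, "G": 8, "L": 9}, [1000.5, 1500.7], "MKTAYIAK"),
    Entry("P2", {"A": 11, "G": 8, "L": 9}, [1000.5, 1500.7, 2000.9], "MKTAYLLG"),
    Entry("P3", {"A": 2, "G": 20, "L": 1}, [1000.5, 1500.7, 2000.9], "MKTAYLLG"),
]
ranking = identify({"A": 10, "G": 8, "L": 9}, [1000.6, 1500.8, 2000.8], "AYLL", database)
print(ranking)
```

In this toy run, `P3` matches every peptide mass and the tag but is never scored, because its composition excludes it at the first stage. That is the design point the synopsis makes: composition acts as the cross-species filter, and the other parameters refine the candidate list.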

Proteome projects involve the identification and characterisation of large numbers of proteins in an organism [1-4].

Frequently, proteins are separated by two-dimensional gel electrophoresis, followed by the application of protein identification techniques such as microsequencing, "tag" sequencing, amino acid composition, peptide mass fingerprinting, or mass spectrometry sequencing (reviewed in [5]). There is an impressive array of computer programs available to assist in making these identifications, many of which are available on the world-wide web. As the sequencing of genes and genomes is advancing at an astonishing pace, one might expect that it is becoming increasingly easy to identify a protein, or large numbers of proteins, with high confidence. This is certainly true for organisms whose genomes are sequenced and available in public databases, such as Haemophilus influenzae and Saccharomyces cerevisiae [6-7]. However, it is not widely appreciated that the bulk of information in protein sequence databases comes from a small number of species. For example, 48% of entries in release 34 of the SWISS-PROT database [8] come from just 20 organisms. Thus the researcher working with a less popular (or a poorly molecularly characterised) organism may face difficulties if undertaking large-scale protein identifications, as identi-

