Key residues approach to the definition of protein families and analysis of sparse family signatures
Scribed by Jon C. Ison; Matthew J. Blades; Alan J. Bleasby; Stephen C. Daniel; J. Howard Parish; John B.C. Findlay
- Book ID
- 102648841
- Publisher
- John Wiley and Sons
- Year
- 2000
- Tongue
- English
- Weight
- 564 KB
- Volume
- 40
- Category
- Article
- ISSN
- 0887-3585
Synopsis
We extend the concept of the motif as a tool for characterizing protein families and explore the feasibility of a sparse "motif" that is the length of the protein sequence itself. The type of motif discussed is a sparse family signature consisting of a set of N key residue positions (A1, A2, ..., AN), each preceded by a gap (G), thus G1A1G2A2...GNAN. Both the residue and the gap can be variable. A signature is matched to a protein sequence and scored using a dynamic programming algorithm that permits variability in gap distance and residue type. Generating a signature involves identifying residues associated with points of contact in interactions between secondary structure elements. A raw signature consists of a set of positions with potential key structural roles, sampled from a sequence alignment constructed with reference to this contact data. Raw signatures are refined by sampling different gap-residue pairs until the specificity of a signature for the family cannot be further improved. We summarize signatures for nine protein families of diverse fold and function and present results of scans against the OWL protein sequence database. The implications of such signatures are discussed.
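The matching step described in the synopsis can be illustrated with a minimal sketch. This is not the authors' algorithm or scoring scheme, which the synopsis does not specify. It assumes each signature element is a pair of hypothetical score tables (allowed gap lengths and allowed residues), that the first gap is measured from the start of the sequence, and that the total score is a simple sum. The name `score_signature` and all score values are invented for illustration.

```python
from typing import Dict, List, Tuple

NEG_INF = float("-inf")

# One signature element G_i A_i (assumed representation):
# a table of allowed gap lengths and a table of allowed residues, each with a score.
Element = Tuple[Dict[int, float], Dict[str, float]]


def score_signature(signature: List[Element], sequence: str) -> float:
    """Return the best summed score for placing all elements in order,
    or -inf if the signature cannot be placed on the sequence."""
    n = len(sequence)
    # prev[k] = best score with the previous key residue at 1-based position k;
    # prev[0] stands for the start of the sequence (nothing placed yet).
    prev = [0.0] + [NEG_INF] * n
    for gap_scores, residue_scores in signature:
        curr = [NEG_INF] * (n + 1)
        for j in range(1, n + 1):  # try the key residue at sequence[j - 1]
            res = residue_scores.get(sequence[j - 1])
            if res is None:
                continue  # residue type not allowed at this key position
            for gap, gap_score in gap_scores.items():
                k = j - 1 - gap  # position of the previous key residue
                if k >= 0 and prev[k] > NEG_INF:
                    curr[j] = max(curr[j], prev[k] + gap_score + res)
        prev = curr
    return max(prev)


# Hypothetical two-element signature: G1 A1 G2 A2.
signature = [
    ({0: 1.0, 1: 0.5}, {"C": 2.0}),
    ({2: 1.0, 3: 0.5}, {"H": 2.0, "Y": 1.0}),
]
print(score_signature(signature, "ACDEH"))  # 5.5
```

In this example, C is matched at position 2 after a gap of 1 (0.5 + 2.0) and H at position 5 after a gap of 2 (1.0 + 2.0). Keeping only the previous row of the table is enough because each element depends solely on where the preceding key residue was placed.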
SIMILAR VOLUMES