
Segmentation of Heteropolymer Sequences Specifying Subsequences with Different Composition and Statistical Properties

✍ Scribed by Leonid V. Gusev; Valentina V. Vasilevskaya; Vsevolod Ju. Makeev; Pavel G. Khalatur; Alexei R. Khokhlov


Publisher
John Wiley and Sons
Year
2003
Tongue
English
Weight
230 KB
Volume
12
Category
Article
ISSN
1022-1344


✦ Synopsis


Abstract

We have studied the segmentation of two‐letter AB heterosequences composed of subsequences that differ in composition and in the distribution of A and B monomer units along the chain. Our approach is based on the segmentation function S(k), introduced in the present work, and on the Jensen–Shannon divergence measure determined with respect to the probabilities of the lengths of uniform blocks of A and B monomer units. It is shown that the function S(k) is extremely sensitive to the sequence statistics. Even visual inspection of S(k) reveals some features of the sequence statistics. In particular, S(k) is constant for random copolymers, oscillates for random block copolymers, and grows monotonically up to some constant value for proteinlike copolymers. However, because of the significant fluctuations observed for short sequences, S(k) can be used effectively only to segment a heterosequence composed of very long subsequences. On the other hand, we find that the Jensen–Shannon divergence measure does not allow one to judge the type of statistics, but is extremely efficient for segmentation of a heterosequence. Therefore, the two introduced functions, being mutually complementary, provide an effective approach for recognizing and segmenting heterosequences. As an example, the methods developed are applied to concatenated sequences of different proteins.
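As an illustration of the second ingredient, the following is a minimal sketch of a Jensen–Shannon divergence computed over block-length distributions and used to locate a single boundary in an AB sequence. It is not the authors' implementation: the abstract does not define S(k) or the exact form of the divergence, so the function names (`block_length_distribution`, `jensen_shannon`, `best_split`), the length-proportional weights, the `min_len` parameter, and the single-cut scan are all assumptions made for this example.

```python
from collections import Counter
from itertools import groupby
from math import log2


def block_length_distribution(seq):
    """Return {(monomer, block_length): probability} over runs of identical letters."""
    counts = Counter((letter, len(list(run))) for letter, run in groupby(seq))
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def shannon_entropy(dist):
    """Shannon entropy in bits of a discrete distribution given as a dict."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def jensen_shannon(p, q, w_p=0.5, w_q=0.5):
    """Weighted divergence H(w_p*p + w_q*q) - w_p*H(p) - w_q*H(q)."""
    keys = set(p) | set(q)
    mix = {k: w_p * p.get(k, 0.0) + w_q * q.get(k, 0.0) for k in keys}
    return shannon_entropy(mix) - w_p * shannon_entropy(p) - w_q * shannon_entropy(q)


def best_split(seq, min_len=20):
    """Scan all cut points and return (position, divergence) with maximal divergence.

    Assumption: each side is weighted by its share of the total length.
    """
    best_pos, best_d = None, -1.0
    n = len(seq)
    for cut in range(min_len, n - min_len + 1):
        left, right = seq[:cut], seq[cut:]
        d = jensen_shannon(
            block_length_distribution(left),
            block_length_distribution(right),
            len(left) / n,
            len(right) / n,
        )
        if d > best_d:
            best_pos, best_d = cut, d
    return best_pos, best_d


# Toy heterosequence: strictly alternating part followed by blocks of length 4.
seq = "AB" * 100 + "AAAABBBB" * 25
position, divergence = best_split(seq)
print(position, divergence)
```

For this toy sequence the scan places the cut at position 200, the junction between the two parts, because the block-length distributions on either side do not overlap there. Real sequences with several boundaries would need the scan applied recursively or within a sliding window, which this sketch does not attempt.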

Segmentation function S(k, l, x) as a function of parameter k and starting number x of “window” for a sequence composed of elastin and ribonuclease sequences.



📜 SIMILAR VOLUMES


Microstructure and Thermal Properties of
✍ Hiroshi Urayama; Sung-Il Moon; Yoshiharu Kimura 📂 Article 📅 2003 🏛 John Wiley and Sons 🌐 English ⚖ 132 KB

## Abstract A series of polylactides (PLA) with different stereo sequences are prepared by the copolymerization of L‐lactide and DL‐lactide. It is confirmed that the glass transition temperature (T_g) of the PLA decreases with decreasing optical purity of the lactate units (%ee) according to t