𝔖 Bobbio Scriptorium
✦   LIBER   ✦

High Statistics Block Entropy Measures of DNA Sequences

✍ Scribed by Pietro Liò; Antonio Politi; Marcello Buiatti; Stefano Ruffo


Publisher: Elsevier Science
Year: 1996
Tongue: English
Weight: 487 KB
Volume: 180
Category: Article
ISSN: 0022-5193


✦ Synopsis


We have used an improved block-entropy measure in order to gain some further insights into the short-range correlations present in whole chromosomes of S. cerevisiae, viruses and organelles and very large genomic regions of E. coli. Although DNA sequences are largely inhomogeneous and word frequencies are unevenly distributed, the comparison of entire chromosomes and large genomic regions shows a "bulk" composition homogeneity. This property suggests that biases in selection, directional mutational pressure and recombination processes act to homogenize the base composition of the DNA molecules within a genome, but their mode of action, relative impact and direction may vary in different organisms. The most interesting results appear to be the differences between the SW (C,G/A,T) and RY (A,G/C,T) two-letter alphabet entropies. Deviations from randomness in E. coli and S. cerevisiae sequences particularly concern SW dinucleotide frequencies and RY tetranucleotide frequencies.
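To make the SW and RY comparison concrete, the following is a minimal sketch of a block-entropy calculation on the two reduced alphabets. It uses the naive plug-in (frequency-count) estimator over overlapping words, which is an assumption made for illustration and not the improved measure the synopsis refers to. The function names `recode` and `block_entropy` and the toy sequence are hypothetical.

```python
from collections import Counter
from math import log2

# Two-letter reductions named in the synopsis.
SW = {"C": "S", "G": "S", "A": "W", "T": "W"}  # C,G versus A,T
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}  # A,G versus C,T


def recode(seq, mapping):
    """Map a DNA string onto a two-letter alphabet.

    Raises KeyError on any symbol other than A, C, G, T.
    """
    return "".join(mapping[base] for base in seq.upper())


def block_entropy(seq, n):
    """Naive plug-in Shannon entropy, in bits, of overlapping words of length n.

    Assumption: no finite-sample correction is applied, so the estimate
    is biased downward when 'seq' is short relative to the number of
    possible words.
    """
    total = len(seq) - n + 1
    if n < 1 or total < 1:
        raise ValueError("block length must be between 1 and len(seq)")
    counts = Counter(seq[i:i + n] for i in range(total))
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    toy = "ACGT" * 8  # hypothetical toy sequence, not real genomic data
    for name, mapping in (("SW", SW), ("RY", RY)):
        reduced = recode(toy, mapping)
        # n = 2 and n = 4 correspond to dinucleotides and tetranucleotides.
        for n in (2, 4):
            print(name, n, block_entropy(reduced, n))
```

On real chromosomes the two alphabets can give different entropies for the same word length, which is the kind of contrast the synopsis highlights. A serious analysis would replace the plug-in estimator with one that corrects for finite sample size.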


📜 SIMILAR VOLUMES


Estimating the Entropy of DNA Sequences
✍ Armin O. Schmitt; Hanspeter Herzel 📂 Article 📅 1997 🏛 Elsevier Science 🌐 English ⚖ 231 KB

The Shannon entropy is a standard measure for the order state of symbol sequences, such as, for example, DNA sequences. In order to incorporate correlations between symbols, the entropy of n-mers (consecutive strands of n symbols) has to be determined. Here, an assay is presented to estimate such hi

Statistical analysis of DNA sequences. I
✍ M. Y. Azbel; Y. Kantor; L. Verkh; A. Vilenkin 📂 Article 📅 1982 🏛 Wiley (John Wiley & Sons) 🌐 English ⚖ 203 KB

Statistical analysis of DNA sequences. I
✍ Alexander Vilenkin; Lev Verkh 📂 Article 📅 1982 🏛 Wiley (John Wiley & Sons) 🌐 English ⚖ 163 KB

A DNA molecule can be viewed as a text written in four letters: A, T, G, and C. As we know, this text contains the genetic message of a living organism. The sequence of letters in the text cannot be very regular (otherwise it would carry very little information). To an observer who does not know the

Similarity analysis of DNA sequences bas
✍ Chun Li; Hong Ma; Yang Zhou; Xiaolei Wang; Xiaoqi Zheng 📂 Article 📅 2010 🏛 John Wiley and Sons 🌐 English ⚖ 91 KB 👁 2 views

A DNA primary sequence is a string of letters over the alphabet X = {a, c, g, t}. Based on all of the 2-combinations of the set X, with repetition allowed, we transform a DNA primary sequence into a special sequence over a set with cardinality 10. With the 10-letter sequence, we assoc